Networking, Service Discovery, and Ingress
Learning Objectives
- You have a high-level understanding of how Kubernetes networking and service discovery work.
- You can create services and deployments in Kubernetes that communicate with each other.
- You know the terms ingress and ingress controller.
Kubernetes networking
Kubernetes uses a flat network model where all pods can communicate with each other directly, without network address translation (NAT). Every pod in a cluster has a unique IP address, and pods use these IPs to communicate with each other as if they were on the same network. As each pod has its own IP, there are no port conflicts.
As pods are ephemeral, their IP addresses can change, so communication between pods should not be implemented with hardcoded IP addresses. Instead, services are used to provide stable endpoints for a group of pods.
A service defines a set of pods and a policy by which to access them. Services abstract away the details of how to reach a set of pods, providing a stable IP address and DNS name for clients to use.
Under the hood, Kubernetes maintains a dynamic list of endpoints (EndpointSlices) for each service. When a service is created, Kubernetes automatically creates an endpoint object that contains the IP addresses of the pods matching the service’s selector. When a pod is added or removed, the endpoint object is updated automatically.
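For reference, the service for the demo server was created in the last part. Assuming it mirrors the fetcher’s service shown later in this section, minikube-demo-server-service.yaml looks roughly like this (a sketch, not necessarily the exact file):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: minikube-demo-server-service
spec:
  type: LoadBalancer
  ports:
    - port: 8000
      targetPort: 8000
      protocol: TCP
  selector:
    app: minikube-demo-server   # pods with this label become endpoints
```

The selector is what ties the service to its endpoints: every pod whose labels match app: minikube-demo-server is added to the service’s endpoint list.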
To concretely check the list of endpoints for an application, modify the deployment configuration minikube-demo-server-deployment.yaml so that we deploy two replicas instead of one:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: minikube-demo-server-deployment
  labels:
    app: minikube-demo-server
spec:
  replicas: 2
  selector:
    matchLabels:
      app: minikube-demo-server
  template:
    metadata:
      labels:
        app: minikube-demo-server
    spec:
      containers:
        - name: minikube-demo-server
          image: minikube-demo-server:1.0
          imagePullPolicy: Never
          ports:
            - containerPort: 8000
Then, apply the configuration:
kubectl apply -f k8s/minikube-demo-server-service.yaml,k8s/minikube-demo-server-deployment.yaml
With this change, there will be two pods running the application, and a service that exposes them. You can check the pods using kubectl get pods.
NAME READY STATUS
minikube-demo-server-deployment-694d46996b-h2gds 1/1 Running
minikube-demo-server-deployment-694d46996b-tjt6v 1/1 Running
The endpoints are available using kubectl get endpoints. As there are two pods, there should also be two IP addresses listed for the service:
NAME ENDPOINTS
minikube-demo-server-service 10.244.0.12:8000,10.244.0.13:8000
Concretely, when a request is made to the service, it will be forwarded to one of the pods listed in the endpoints. The service acts as a load balancer, distributing the requests among the pods. This is illustrated in Figure 1 below.
Kubernetes has a component called kube-proxy that is responsible for implementing the traffic forwarding.
When you change the number of replicas to 1 and reapply the configuration, you’ll notice that the list of endpoints will contain only one IP address.
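As a side note, editing the file is not the only way to change the replica count; kubectl also has an imperative scale command. A quick sketch using the names from this example:

```shell
# Scale the deployment down to one replica without editing the YAML
kubectl scale deployment/minikube-demo-server-deployment --replicas=1

# The service's endpoint list shrinks to a single IP accordingly
kubectl get endpoints minikube-demo-server-service
```

Note that a later kubectl apply will reset the replica count back to whatever the configuration file says.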
Service discovery
Kubernetes uses DNS-based service discovery for finding services within a cluster. When you create a service, the DNS system (e.g. CoreDNS) automatically generates DNS records for that service. By default, each service gets a DNS name of the form <service-name>.<namespace>.svc.cluster.local. For example, the service minikube-demo-server-service in the default namespace would get the DNS entry minikube-demo-server-service.default.svc.cluster.local.
Kubernetes also configures a DNS search domain, so within the same namespace you can refer to the service simply as minikube-demo-server-service.
Because of the DNS-based service discovery, pods can refer to each other by service name rather than by IP.
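To make the naming scheme concrete, the snippet below builds the fully qualified in-cluster URL for a service. The helper name and defaults are illustrative, not part of any Kubernetes API:

```javascript
// Build the fully qualified in-cluster URL for a Kubernetes service.
// Inside the same namespace, the short form "http://<name>:<port>" also
// works thanks to the DNS search domain.
function serviceUrl(name, { namespace = "default", port = 8000 } = {}) {
  return `http://${name}.${namespace}.svc.cluster.local:${port}`;
}

console.log(serviceUrl("minikube-demo-server-service"));
// http://minikube-demo-server-service.default.svc.cluster.local:8000
```

This is also why the fetch call later in this section can use the plain service name: the app and the service live in the same namespace, so the search domain completes the rest.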
Service types
There is a set of service types that determine how services can be reached and by whom. The main service types are:
- ClusterIP — exposes the service on an internal cluster IP that is reachable only from within the cluster.
- NodePort — exposes the service on a static port on each node’s IP, making it reachable from outside the cluster without a cloud load balancer.
- LoadBalancer — builds on NodePort and exposes the service through a load balancer; on cloud platforms like Google Cloud, creating a LoadBalancer service automatically sets up a vendor-specific cloud load balancer and assigns an IP or DNS name to it.
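ClusterIP is also the default: a service manifest that omits the type field gets a cluster-internal IP only. As a sketch, an internal-only variant of the demo service would look like this (the name is hypothetical):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: minikube-demo-server-internal   # hypothetical name for illustration
spec:
  # ClusterIP is the default, so this line could be omitted entirely
  type: ClusterIP
  ports:
    - port: 8000
      targetPort: 8000
  selector:
    app: minikube-demo-server
```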
Example: service-to-service communication
Going back to the Minikube example from the last part, create a copy of the folder server called server-fetcher and modify the app.js file to make a request to the minikube-demo-server-service service. The code should look like this:
import { Hono } from "@hono/hono";
import { cors } from "@hono/hono/cors";
import { logger } from "@hono/hono/logger";

const app = new Hono();

app.use("/*", cors());
app.use("/*", logger());

app.get("/", async (c) => {
  const res = await fetch("http://minikube-demo-server-service:8000");
  const data = await res.json();
  return c.json({
    message: `Fetched: ${data.message}`,
  });
});

export default app;
In this code, the server fetcher makes a request to port 8000 of the minikube-demo-server-service service. The service name is resolved by the DNS system to the IP addresses of the pods that are part of the service. The response is then returned to the client.
Build the image as minikube-demo-server-fetcher:1.0:
minikube image build -t minikube-demo-server-fetcher:1.0 .
Then, add a new deployment and service configuration for the server fetcher. The deployment configuration — minikube-demo-server-fetcher-deployment.yaml — should look like this:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: minikube-demo-server-fetcher-deployment
  labels:
    app: minikube-demo-server-fetcher
spec:
  replicas: 1
  selector:
    matchLabels:
      app: minikube-demo-server-fetcher
  template:
    metadata:
      labels:
        app: minikube-demo-server-fetcher
    spec:
      containers:
        - name: minikube-demo-server-fetcher
          image: minikube-demo-server-fetcher:1.0
          imagePullPolicy: Never
          ports:
            - containerPort: 8000
And the service configuration — minikube-demo-server-fetcher-service.yaml — should look like this:
apiVersion: v1
kind: Service
metadata:
  name: minikube-demo-server-fetcher-service
spec:
  type: LoadBalancer
  ports:
    - port: 8000
      targetPort: 8000
      protocol: TCP
  selector:
    app: minikube-demo-server-fetcher
Next, apply the configurations. As there are already quite a few configuration files, we can just use the folder name as a parameter.
$ kubectl apply -f k8s/
deployment.apps/minikube-demo-server-deployment created
deployment.apps/minikube-demo-server-fetcher-deployment created
service/minikube-demo-server-fetcher-service created
service/minikube-demo-server-service created
Now, when we run kubectl get all, we should see the services and their respective pods.
kubectl get all
NAME READY STATUS RESTARTS AGE
pod/minikube-demo-server-deployment-694d46996b-pbqkg 1/1 Running 0 21s
pod/minikube-demo-server-fetcher-deployment-6548f75dd4-gm27q 1/1 Running 0 21s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 23h
service/minikube-demo-server-fetcher-service LoadBalancer 10.102.146.105 <pending> 8000:32708/TCP 21s
service/minikube-demo-server-service LoadBalancer 10.104.164.32 <pending> 8000:31843/TCP 21s
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/minikube-demo-server-deployment 1/1 1 1 21s
deployment.apps/minikube-demo-server-fetcher-deployment 1/1 1 1 21s
NAME DESIRED CURRENT READY AGE
replicaset.apps/minikube-demo-server-deployment-694d46996b 1 1 1 21s
replicaset.apps/minikube-demo-server-fetcher-deployment-6548f75dd4 1 1 1 21s
Then, we can ask for the URL of the service minikube-demo-server-fetcher-service using minikube service.
$ minikube service minikube-demo-server-fetcher-service --url
http://192.168.49.2:32708
Now, when we make a request to the service, we should get a response from the server fetcher that contains the message from the server.
curl http://192.168.49.2:32708
{"message":"Fetched: Hello world!"}
The concrete flow of the request from the “server-fetcher” to the server service and subsequently a pod of the server service, including retrieving the name of the server service from DNS, is illustrated in Figure 2 below.
Even if a pod is restarted and gets a new IP address, the service will still forward the request to the new IP address. This is because the service maintains a list of endpoints that are updated automatically when pods are added or removed.
Kubernetes networking is permissive by default, which means that any pod can talk to any other pod in the cluster. It is possible to also restrict who can talk to whom. For additional details, see Kubernetes Network Policies.
Ingress and ingress controller
So far, we’ve looked into communication between services and how a LoadBalancer service can be used to expose services to the world. Kubernetes also has a resource called Ingress that can be used to manage external access to services in a more flexible way.
Ingress is an API object that defines rules for routing external HTTP/HTTPS traffic to internal services. Instead of giving every service its own LoadBalancer (or NodePort), you can have a single entry point that routes requests to many services based on the URL or hostname. For example, you might have one Ingress that says: send requests for api.mycompany.com to the “api-service”, and requests for www.mycompany.com to the “web-service”.
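The hostname-based rules described above could be written as an Ingress manifest along these lines; the hostnames and service names are the illustrative ones from the example, and the metadata name and port number are assumptions:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: mycompany-ingress   # hypothetical name
spec:
  rules:
    - host: api.mycompany.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: api-service
                port:
                  number: 8000    # assumed port
    - host: www.mycompany.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: web-service
                port:
                  number: 8000    # assumed port
```

On its own, this object does nothing; a controller must be installed to act on it.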
While an Ingress is a set of rules, an Ingress Controller is the component that watches Ingress resources and configures an underlying proxy or load balancer to satisfy them. Kubernetes itself doesn’t handle ingress traffic; it relies on controllers (usually third-party) to do that.
While the traffic is mapped to services, the services themselves are not exposed to the outside world; within the cluster, kube-proxy (or similar) still handles the traffic.
We won’t go into detail on how to set up an ingress controller. For additional details, see e.g. Traefik’s Kubernetes Quick Start.