Scaling applications manually
Learning objectives
- You know how to manually scale applications with Kubernetes.
Kubernetes supports both manual and automatic scaling of applications. Manual scaling is done by adjusting the number of replicas in the deployment (similar to what we previously did with Docker Compose). Automatic scaling, on the other hand, requires collecting metrics from the application and using them to adjust the number of replicas.
We continue here from our first Kubernetes deployment -- to start it up again, we launch minikube and apply the two configurations.
minikube start
😄 minikube v1.29.0 on Ubuntu 22.04
...
🏄 Done! kubectl is now configured to use "minikube" cluster and "default" namespace by default
kubectl apply -f kubernetes/my-app-deployment.yaml,kubernetes/my-app-service.yaml
deployment.apps/my-app-deployment created
service/my-app-service created
When we list the pods, we see one pod.
kubectl get pods
NAME READY STATUS RESTARTS AGE
my-app-deployment-85bcb74bcd-fpwr4 1/1 Running 0 38s
And, when we ask minikube for the URL of the service, we receive an address that responds to requests.
minikube service my-app-service --url
http://192.168.49.2:32512
curl http://192.168.49.2:32512
Hello from server: 3746
curl http://192.168.49.2:32512
Hello from server: 3746
Manual scaling of applications with Kubernetes is easy. We can either use the kubectl scale command to set a number of replicas, or adjust the configuration file to specify a number of replicas.
Using kubectl scale
Let's first try the kubectl scale command. The kubectl scale command takes the number of replicas and the configuration file of the deployment (or the name of the deployment) as arguments. In our case, we wish to adjust the deployment that was created from the my-app-deployment.yaml file -- let's create two replicas for the deployment.
kubectl scale --replicas=2 -f kubernetes/my-app-deployment.yaml
deployment.apps/my-app-deployment scaled
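As noted above, kubectl scale also accepts the name of the deployment instead of its configuration file. A sketch of the equivalent command, assuming the deployment name used in this chapter:

```shell
# Scale the deployment to two replicas by referencing it by name
# rather than by its configuration file.
kubectl scale deployment/my-app-deployment --replicas=2
```

Both forms result in the same change; the name-based form is convenient when the configuration file is not at hand.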
Now, when we list the pods, we see two pods instead of one. As you may notice, one of them is older, while the other has been started recently.
kubectl get pods
NAME READY STATUS RESTARTS AGE
my-app-deployment-85bcb74bcd-fpwr4 1/1 Running 0 10m
my-app-deployment-85bcb74bcd-wtbnc 1/1 Running 0 33s
As our service has been created as a load balancer, Kubernetes takes care of balancing the load between the pods for us. When we send requests to the server, we notice that the response is coming from both pods (our application creates a random number that it always includes in the responses).
curl http://192.168.49.2:32512
Hello from server: 8320
curl http://192.168.49.2:32512
Hello from server: 8320
curl http://192.168.49.2:32512
Hello from server: 3746
curl http://192.168.49.2:32512
Hello from server: 8320
curl http://192.168.49.2:32512
Hello from server: 3746
curl http://192.168.49.2:32512
Hello from server: 3746
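Sending requests one at a time works, but a short loop makes the distribution between pods easier to see. A sketch, assuming the service URL that minikube printed above:

```shell
# Send ten requests to the service and count how many responses
# came from each pod (each pod replies with its own random number).
URL=http://192.168.49.2:32512
for i in $(seq 1 10); do
  curl -s "$URL"
  echo
done | sort | uniq -c
```

With two replicas, the output should show two distinct response lines, each with roughly half of the requests.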
With this approach, however, our configuration file has not changed, and our deployment configuration is still as follows.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app-deployment
  labels:
    app: my-app
spec:
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: my-app
          image: my-app:latest
          imagePullPolicy: Never
          ports:
            - containerPort: 7777
Defining replicas in configuration
An alternative approach to manual scaling is adjusting the configuration file to include a number of replicas, and then applying the configuration file again. The number of replicas is defined in the spec section of the configuration file -- if the value is not set, a default of one replica is used. Below, we adjust the deployment configuration, setting the number of replicas to three.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app-deployment
  labels:
    app: my-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: my-app
          image: my-app:latest
          imagePullPolicy: Never
          ports:
            - containerPort: 7777
Now, if we apply the above configuration and list the pods again, we see that there are now three pods.
kubectl apply -f kubernetes/my-app-deployment.yaml
deployment.apps/my-app-deployment configured
kubectl get pods
NAME READY STATUS RESTARTS AGE
my-app-deployment-85bcb74bcd-fpwr4 1/1 Running 0 21m
my-app-deployment-85bcb74bcd-g7d6j 1/1 Running 0 5s
my-app-deployment-85bcb74bcd-wtbnc 1/1 Running 0 12m
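Besides listing the pods, the deployment itself reports its desired and ready replica counts. Assuming the deployment name used above, they can be checked with:

```shell
# Show the deployment's replica status; the READY column reports
# ready replicas out of the desired count (e.g. 3/3 once all pods run).
kubectl get deployment my-app-deployment
```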
Similarly to before, the requests to the pods are balanced. Now, when we make requests to the servers, we see that there is a third server that is responding to the requests.
curl http://192.168.49.2:32512
Hello from server: 3746
curl http://192.168.49.2:32512
Hello from server: 8320
curl http://192.168.49.2:32512
Hello from server: 8320
curl http://192.168.49.2:32512
Hello from server: 5831
curl http://192.168.49.2:32512
Hello from server: 3746
curl http://192.168.49.2:32512
Hello from server: 3746
Configuration files over command line use
When we scale the deployment using the kubectl scale command, the changes are not reflected in the configuration file. To maintain a specific scale, it makes sense to modify the configuration file and apply it. This way, we can always use the same configuration file to create deployments, and the file can be stored in e.g. a Git repository.
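As a sketch of this configuration-file-first workflow, assuming the file path used above and that the file already contains a replicas line, one could change the replica count in the file and re-apply it (the sed expression is just one way to make the edit; editing the file by hand works equally well):

```shell
# Update the replica count in the configuration file, then apply it,
# so the file in version control always matches the cluster state.
sed -i 's/^  replicas: .*/  replicas: 3/' kubernetes/my-app-deployment.yaml
kubectl apply -f kubernetes/my-app-deployment.yaml
```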