Storage and Operators
Learning Objectives
- You know how to use PersistentVolumes and PersistentVolumeClaims.
- You know of Kubernetes operators.
Persistent storage
As pods are ephemeral, any files written to the container filesystem of a pod are lost when the pod is deleted or when a new pod is scheduled. For applications that need to store data, Kubernetes provides abstractions for persistent storage.
The idea is similar to Docker volumes. With Kubernetes, however, the abstraction is stronger and more configuration-oriented.
There are two main abstractions for persistent storage: PersistentVolumes and PersistentVolumeClaims. A PersistentVolume is a piece of storage in the cluster that has been provisioned by an administrator of the cluster. A PersistentVolumeClaim is a request for storage by a user.
PersistentVolumes are resources, PersistentVolumeClaims are resource requests.
Creating a PersistentVolume
To provision a PersistentVolume, we need to create a PersistentVolume configuration file. Create a file called `minikube-demo-persistentvolume.yaml` in the `k8s` directory with the following content:
```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: minikube-demo-local-persistentvolume
spec:
  storageClassName: "standard"
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/mnt/data"
```
The above configuration declares a PersistentVolume named `minikube-demo-local-persistentvolume` with 1Gi of storage capacity. The storage uses the StorageClass "standard" (the default in Minikube) and has the access mode ReadWriteOnce, meaning the volume can be mounted read-write by a single node at a time. The storage is backed by a hostPath, which is a directory on the Minikube VM at `/mnt/data`.
While hostPath is useful for local development, it is not recommended for production use. In production, one would use a cloud provider’s storage solution or a network storage solution.
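Relatedly, a PersistentVolume also has a reclaim policy that controls what happens to the underlying storage once its claim is deleted. As a sketch, adding the following field to the PersistentVolume spec would keep the data around for manual cleanup instead of letting it be erased:

```yaml
# Fragment of a PersistentVolume spec: the Retain policy keeps the
# backing storage (and its data) when the bound claim is deleted,
# instead of the storage being recycled or deleted.
spec:
  persistentVolumeReclaimPolicy: Retain
```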
Next, apply the configuration to create the PersistentVolume:
```shell
kubectl apply -f k8s/minikube-demo-persistentvolume.yaml
```
You can check the status of the PersistentVolume with the following command:
```shell
kubectl get pv
```
The output is something like the following:

```shell
NAME                                   CAPACITY   ...
minikube-demo-local-persistentvolume   1Gi        ...
```
Now, the cluster has a PersistentVolume available for use.
Creating a PersistentVolumeClaim
Next, we need to create a PersistentVolumeClaim to request storage. Create a file called `minikube-demo-persistentvolume-claim.yaml` in the `k8s` directory with the following content:
```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: minikube-demo-local-persistentvolume-claim
spec:
  storageClassName: "standard"
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 100Mi
```
The above configuration declares a PersistentVolumeClaim named `minikube-demo-local-persistentvolume-claim` requesting 100Mi of storage capacity. The specification is similar to the PersistentVolume configuration, but this time it is a request for storage. As the claim's StorageClass and access mode match our PersistentVolume and the requested 100Mi fits within its 1Gi capacity, Kubernetes can bind the claim to that volume.
Next, apply the configuration to create the PersistentVolumeClaim:
```shell
kubectl apply -f k8s/minikube-demo-persistentvolume-claim.yaml
```
You can check the status of the PersistentVolumeClaim with the following command:
```shell
kubectl get pvc
```
The output is something like the following:

```shell
NAME                                         STATUS   VOLUME                                     ...
minikube-demo-local-persistentvolume-claim   Bound    pvc-51875849-29d7-4118-b541-47f98ad16eb2   ...
```
Using the PersistentVolumeClaim
To use a PersistentVolumeClaim in a pod, we need to mount the claim as a volume in the deployment specification. Let's modify the file `minikube-demo-server-deployment.yaml` to include a volume mount for the PersistentVolumeClaim:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: minikube-demo-server-deployment
  labels:
    app: minikube-demo-server
spec:
  replicas: 1
  selector:
    matchLabels:
      app: minikube-demo-server
  template:
    metadata:
      labels:
        app: minikube-demo-server
    spec:
      containers:
        - name: minikube-demo-server
          image: minikube-demo-server:1.1
          imagePullPolicy: Never
          ports:
            - containerPort: 8000
          envFrom:
            - configMapRef:
                name: minikube-demo-configmap
            - secretRef:
                name: minikube-demo-secret
          volumeMounts:
            - name: data-storage
              mountPath: "/app/data"
      volumes:
        - name: data-storage
          persistentVolumeClaim:
            claimName: minikube-demo-local-persistentvolume-claim
```
Now, the configuration states that the container in the pod should have a volume mounted at `/app/data`, which is backed by the PersistentVolumeClaim `minikube-demo-local-persistentvolume-claim`.
Let’s apply the modified deployment configuration to see whether this holds.
```shell
kubectl apply -f k8s/minikube-demo-server-deployment.yaml
```
Now, when we access the pod, we should see a directory `/app/data` which is backed by the PersistentVolumeClaim.
To concretely test this, we can first get the name of the pod, and then exec into the pod to check the directory:
```shell
$ kubectl get pods
NAME                                                       READY   STATUS    RESTARTS   AGE
minikube-demo-server-deployment-554f9fcf65-vmxtp           1/1     Running   0          104s
minikube-demo-server-fetcher-deployment-6548f75dd4-pcjsb   1/1     Running   0          6m4s
$ kubectl exec -it minikube-demo-server-deployment-554f9fcf65-vmxtp -- /bin/sh
/app # ls -lt
total 24
drwxrwxrwx    2 root     root          4096 Mar 13 15:57 data
-rw-r--r--    1 root     root           176 Mar 12 16:14 Dockerfile
-rw-r--r--    1 root     root           325 Mar 12 16:13 app.js
-rw-r--r--    1 root     root           293 Mar 11 13:57 deno.lock
-rw-r--r--    1 root     root            64 Mar 11 11:39 deno.json
-rw-r--r--    1 root     root            52 Mar 11 11:39 app-run.js
```
As we can see from the above output, the `/app/data` directory is present in the pod, backed by the persistent volume. Any files the app writes there will persist even if the pod is rescheduled or restarted.
Dynamic provisioning
Above, we manually created a PersistentVolume and a PersistentVolumeClaim. In practice, especially when using Kubernetes on the cloud, dynamic provisioning is used. This means that we do not manually create persistent volumes, but instead, we create a persistent volume claim, and Kubernetes automatically creates a persistent volume to satisfy it.
The main change is setting the StorageClass in the PersistentVolumeClaim configuration to one that dynamically provisions persistent volumes; the available classes are cloud vendor specific.
Often, when working with cloud vendors, the storage is also not local to the cluster but is network storage provided by the vendor. This is a more robust and scalable solution.
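As a sketch, a dynamically provisioned claim differs from our earlier one mainly in its StorageClass. The class name below is a made-up placeholder, as the real names depend on the vendor and cluster setup:

```yaml
# Hypothetical claim on a cloud cluster; "gp3-example" stands in for
# whatever dynamically provisioning StorageClass the vendor offers.
# Applying this would create a matching PersistentVolume on demand,
# with no manually created PersistentVolume needed.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: minikube-demo-dynamic-claim
spec:
  storageClassName: "gp3-example"
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
```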
Operators
In much of web development, the goal is to keep the application as stateless as possible. However, there are cases where stateful applications are necessary. For example, databases like PostgreSQL or Redis are stateful services that need to maintain data across restarts. Managing such services in Kubernetes can, however, be complex. This is where operators come in.
Kubernetes operators are Kubernetes extensions that work with custom resources to manage applications. They allow automating configuration, deployment, and maintenance of software, and in general help in defining deployable software components for Kubernetes.
For additional information on operators, see the Cloud Native Computing Foundation’s Operator White Paper.
There exists a variety of Kubernetes operators for setting up databases. For PostgreSQL, for example, there exist multiple operators, including CloudNativePG, Zalando's Postgres Operator, Kubegres, Stolon, and Crunchy Data's PGO. Similarly, for Redis, there is a handful of options to choose from, including the official (non-free) Redis Enterprise version, Spotahome's redis operator, and a redis operator from Opstree Solutions.
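To illustrate, with an operator installed, a database is typically declared through a custom resource rather than through deployments and claims written by hand. The following sketch follows the shape of a CloudNativePG Cluster resource; verify the fields against the documentation of the operator version you install:

```yaml
# A minimal three-instance PostgreSQL cluster declared as a
# CloudNativePG custom resource. The operator watches for such
# resources and creates and manages the pods, persistent volume
# claims, and services on our behalf.
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: minikube-demo-postgres
spec:
  instances: 3
  storage:
    size: 1Gi
```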
When using operators, one is typically bound to a specific configuration and way of doing things, which may not always align with project requirements. Furthermore, like with any external dependency, using an operator in an unorthodox way may lead to issues. Regardless, using an operator is sensible when it aligns with the project requirements.
As an example of potential challenges, read the Palark blog post Our failure story with Redis operator for K8s.