To make a piece of software stateful (i.e. to keep its contents and information when it is shut down and started up again), you will need Persistent Volumes in Kubernetes.

This is part 3 of the 3-part article; links to parts 1 and 2 below:

  1. Part 1 - Basic OS setup and master node.
  2. Part 2 - Worker node and creating a new service.
  3. Part 3 - Make your service stateful via a persistent volume.

Since Kubernetes is designed to be cloud-first, storage is not confined to traditional file systems (like NTFS, or ext4 in Linux) or block devices (like raw hard drives). It can also be backed by cloud object storage such as Amazon S3.

Obviously you cannot simply mix and match, as different kinds of storage have different characteristics. Some are better at database workloads, some at file storage, and some at streaming. And in the cloud, you can also choose to use a database as a service directly, without bothering with storage at all. But that is another topic for another day.

Let's discuss Persistent Volumes in this session. In the previous session, we wanted to run Transmission, a BitTorrent client, in Kubernetes. It needs somewhere to store its configuration files and the files it has downloaded. We will use an NFS server as the backend storage of the persistent volumes for now, because:

  1. An NFS volume can be mounted by multiple containers at the same time, and
  2. It is not confined to a particular worker node. If you use a local file system on worker node A, it will not be visible if the container (pod) is spawned on worker node B.

We will need 2 persistent volumes: one for the config folder, and the other for the downloads folder.

Defining a Persistent Volume in NFS

Assuming you already have a working NFS server (in this example - nfsserver), the persistent volume is defined, again, using a YAML manifest file:

# pv-transmission.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-transmission-config            # <-- name can be changed
spec:
  capacity:
    storage: 10Mi
  accessModes:
    - ReadWriteMany
  nfs:
    server: nfsserver
    path: "/exports/transmission/config"  # <-- This is the server path

And the one for the downloads folder is similar.
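In case it helps, a sketch of the downloads volume might look like the following. Note that the export path and size here are assumptions; adjust them to your own NFS layout and expected download size:

```yaml
# pv-transmission-downloads.yaml (a sketch; path and size are assumptions)
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-transmission-downloads
spec:
  capacity:
    storage: 10Gi                             # downloads need far more than config
  accessModes:
    - ReadWriteMany
  nfs:
    server: nfsserver
    path: "/exports/transmission/downloads"   # <-- assumed export path
```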

You will also need the application (transmission) to "claim" the persistent volume. This is done by defining a Persistent Volume Claim manifest like the one below:

# pvc-transmission.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-transmission-config        # <-- In transmission.yaml
spec:
  accessModes:
    - ReadWriteMany                    # <-- For NFS
  storageClassName: ""
  volumeName: pv-transmission-config   # <-- In pv-transmission.yaml
  resources:
    requests:
      storage: 10Mi                    # <-- Size of the volume
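The claim for the downloads volume follows the same pattern. A sketch, assuming a matching PersistentVolume named pv-transmission-downloads with at least 10Gi of capacity exists:

```yaml
# pvc-transmission-downloads.yaml (a sketch)
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-transmission-downloads     # <-- referenced in transmission.yaml
spec:
  accessModes:
    - ReadWriteMany                    # <-- For NFS
  storageClassName: ""
  volumeName: pv-transmission-downloads
  resources:
    requests:
      storage: 10Gi                    # <-- must fit within the volume's capacity
```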

Apply each of the manifest files above with:

kubectl apply -f <manifest.yaml>

You can then confirm that each claim is bound to its volume with kubectl get pvc; the STATUS column should show Bound.

The above, together with the transmission manifest file (recap below), allows you to run the software on any of the worker nodes that are eligible.

# transmission.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: transmission
spec:
  selector:
    matchLabels:
      app: transmission
  replicas: 1
  template:
    metadata:
      labels:
        app: transmission
    spec:
      volumes:
      - name: pv-transmission-config             # <-- defined PV
        persistentVolumeClaim:
          claimName: pvc-transmission-config     # <-- defined PVC
      - name: pv-transmission-downloads          # <-- defined PV
        persistentVolumeClaim:
          claimName: pvc-transmission-downloads  # <-- defined PVC
      hostNetwork: false
      containers:
      - name: transmission
        image: ghcr.io/linuxserver/transmission
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 9091
        - containerPort: 51413
        - containerPort: 51413
          protocol: UDP
        env:
        - name: name
          value: "transmission"
        - name: PUID
          value: "997"
        - name: PGID
          value: "997"
        - name: TRANSMISSION_WEB_HOME
          value: "/combustion-release/"
        - name: WHITELIST
          value: "<IP address range>"             # e.g. 10.0.1.1/24
        volumeMounts:
        - mountPath: "/config"                    # <-- Mount point
          name: pv-transmission-config            # volume defined above
        - mountPath: "/downloads"                 # <-- Mount point
          name: pv-transmission-downloads         # volume defined above

Accessing the application

At this moment, you will probably wonder how you can test out the application. Typically, Transmission is used through its web interface on port 9091, like below:

http://<transmission_host>:9091

However, it is not that easy in Kubernetes: when you set up the cluster, it creates another network address range and runs everything within its own network. Assuming that your network is in the range of 192.168.0.0/24, your Kubernetes cluster may be running in 10.0.0.0/8. If you run the following, you will see that the cluster network is not the same as your private network:

kubectl get svc

And the output will look like:

NAME            TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)              AGE
ghost-service   ClusterIP   10.110.155.206   <none>        2368/TCP             32h
homebridge      ClusterIP   10.102.134.197   <none>        51826/TCP,8080/TCP   21h
kubernetes      ClusterIP   10.96.0.1        <none>        443/TCP              45h
transmission    NodePort    10.111.85.219    <none>        9091:30991/TCP       22h

You will need to expose the application so that you can access it, by defining a Service in Kubernetes, again using a manifest file:

# svc-transmission.yaml
apiVersion: v1
kind: Service
metadata:
  name: transmission
spec:
  type: NodePort
  selector:
    app: transmission
  ports:
  - protocol: TCP
    port: 9091
    nodePort: 30991
    name: http

There are different kinds of services; the most common ones are:

  1. ClusterIP
  2. NodePort
  3. LoadBalancer
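For comparison, a LoadBalancer variant of the same service would be a sketch like the one below. Note that this type only receives an external IP on clusters with a load balancer integration (most cloud providers, or bare metal with an add-on such as MetalLB); on a plain home-lab cluster the external IP stays pending:

```yaml
# svc-transmission-lb.yaml (a sketch; requires a load balancer integration)
apiVersion: v1
kind: Service
metadata:
  name: transmission
spec:
  type: LoadBalancer
  selector:
    app: transmission
  ports:
  - protocol: TCP
    port: 9091
    name: http
```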

In the above example, we are using NodePort. After the manifest is applied, the application will be accessible at:

http://<any one of the node IP>:30991

Please note that, from outside the cluster, the port number is 30991 (the nodePort) rather than 9091 (the service port), as defined in the following snippet of the manifest:

  ports:
  - protocol: TCP
    port: 9091
    nodePort: 30991
    name: http

And you can start trying to download something using your application in the cluster!

Wrapping it up

This concludes the 3-part initial write-up of how to set up a Kubernetes cluster using a simple setup that is very cheap to try out; I am using a pair of Raspberry Pi 4 machines. Stay tuned for future articles that will dig deeper into terminology, architecture, and a real-life example of a website setup using Kubernetes!