To allow software to be stateful (i.e. to keep its contents and information when it is shut down and started up again), you will need Persistent Volumes in Kubernetes.
This is part 3 of the 3-part article; links to all parts below:
- Part 1 - Basic OS setup and master node.
- Part 2 - Worker node and creating a new service.
- Part 3 - Make your service stateful via persistent volumes.
Since Kubernetes is designed to be cloud-first, storage is not confined to traditional file systems (like NTFS on Windows, or ext4 on Linux) or block devices (like raw hard drives). It can also be backed by cloud object storage such as Amazon S3.
Obviously you cannot simply mix and match, as different kinds of storage have different characteristics: some are better for database workloads, some for file storage, and some for streaming. In the cloud, you can also choose to use a database as a service directly, without bothering with storage at all. But that is another topic for another day.
Let's discuss Persistent Volumes in this session. In the previous session, we wanted to run Transmission, a BitTorrent client, in Kubernetes. It needs somewhere to store its configuration files and the files it has downloaded. We will use an NFS server as the backend storage for the persistent volumes for now, because:
- An NFS volume can be mounted by multiple containers at the same time, and/or
- It is not confined to a particular worker node. If you use a local file system on worker node A, it will not be visible if the container (pod) is spawned on worker node B.

We will need 2 persistent volumes: one for the config folder, and the other for the downloads folder.
Defining a Persistent Volume in NFS
Assuming you already have a working NFS server (in this example, nfsserver), the persistent volume is defined using, once again, a YAML manifest file:
```yaml
# pv-transmission.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-transmission-config          # <-- name can be changed
spec:
  capacity:
    storage: 10Mi
  accessModes:
    - ReadWriteMany
  nfs:
    server: nfsserver
    path: "/exports/transmission/config"   # <-- This is the server path
```
And the one for the downloads folder is similar.
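For completeness, a sketch of the downloads volume might look like the following. The export path `/exports/transmission/downloads` and the 10Gi size are assumptions here; adjust them to match your NFS server:

```yaml
# pv-transmission-downloads.yaml (sketch; path and size are assumptions)
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-transmission-downloads
spec:
  capacity:
    storage: 10Gi            # assumed size; downloads need more room than config
  accessModes:
    - ReadWriteMany
  nfs:
    server: nfsserver
    path: "/exports/transmission/downloads"   # assumed export path
```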
Also, the application (Transmission) needs to "claim" the persistent volume it is using. This is done by defining a Persistent Volume Claim manifest like the one below:
```yaml
# pvc-transmission.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-transmission-config        # <-- In transmission.yaml
spec:
  accessModes:
    - ReadWriteMany                    # <-- For NFS
  storageClassName: ""
  volumeName: pv-transmission-config   # <-- In pv-transmission.yaml
  resources:
    requests:
      storage: 10Mi                    # <-- Size of the volume
```
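The claim for the downloads volume follows the same pattern; a sketch, assuming the 10Gi size used for the downloads PV:

```yaml
# pvc-transmission-downloads.yaml (sketch; size is an assumption)
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-transmission-downloads
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: ""
  volumeName: pv-transmission-downloads
  resources:
    requests:
      storage: 10Gi   # assumed; must not exceed the PV's capacity
```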
Simply apply the above manifest files with:
```
kubectl apply -f <manifest.yaml>
```
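After applying, it is worth checking that each claim has actually bound to its volume before deploying the application. A quick sanity check, assuming the names used above:

```shell
# List persistent volumes and claims; STATUS should read "Bound"
kubectl get pv
kubectl get pvc

# Inspect a claim in detail if it is stuck in "Pending"
kubectl describe pvc pvc-transmission-config
```

A claim stuck in "Pending" usually means the access mode, size, or `volumeName` does not match the PV.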
The above, together with the Transmission manifest file (recap below), allows you to run the software on any of the eligible worker nodes.
```yaml
# transmission.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: transmission
spec:
  selector:
    matchLabels:
      app: transmission
  replicas: 1
  template:
    metadata:
      labels:
        app: transmission
    spec:
      volumes:
        - name: pv-transmission-config               # <-- defined PV
          persistentVolumeClaim:
            claimName: pvc-transmission-config       # <-- defined PVC
        - name: pv-transmission-downloads            # <-- defined PV
          persistentVolumeClaim:
            claimName: pvc-transmission-downloads    # <-- defined PVC
      hostNetwork: false
      containers:
        - name: transmission
          image: ghcr.io/linuxserver/transmission
          imagePullPolicy: IfNotPresent
          ports:
            - containerPort: 9091
            - containerPort: 51413
            - containerPort: 51413
              protocol: UDP
          env:
            - name: name
              value: "transmission"
            - name: PUID
              value: "997"
            - name: PGID
              value: "997"
            - name: TRANSMISSION_WEB_HOME
              value: "/combustion-release/"
            - name: WHITELIST
              value: "<IP address range>"            # e.g. 10.0.1.1/24
          volumeMounts:
            - mountPath: "/config"                   # <-- Mount point
              name: pv-transmission-config           # volume defined above
            - mountPath: "/downloads"                # <-- Mount point
              name: pv-transmission-downloads        # volume defined above
```
Accessing the application
At this point, you are probably wondering how you can test out the application. Typically, Transmission is used through its web interface on port 9091.
However, it is not that easy in Kubernetes: when you set up the cluster, it creates a separate network address range and runs everything within its own network. Assuming your home network is in the range 192.168.0.0/24, your Kubernetes cluster may be running in, say, the 10.96.0.0/12 range. If you run the following, you will see that the cluster network is not the same as your private network:
```
kubectl get svc
```
And the output will look like:
```
NAME            TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)              AGE
ghost-service   ClusterIP   10.110.155.206   <none>        2368/TCP             32h
homebridge      ClusterIP   10.102.134.197   <none>        51826/TCP,8080/TCP   21h
kubernetes      ClusterIP   10.96.0.1        <none>        443/TCP              45h
transmission    NodePort    10.111.85.219    <none>        9091:30991/TCP       22h
```
You will need to expose the application so that you can access it, by defining a Service in Kubernetes, again using a manifest file:
```yaml
# svc-transmission.yaml
apiVersion: v1
kind: Service
metadata:
  name: transmission
spec:
  type: NodePort
  selector:
    app: transmission
  ports:
    - protocol: TCP
      port: 9091
      nodePort: 30991
      name: http
```
There are different kinds of Services; the most common ones are ClusterIP (the default, reachable only from within the cluster), NodePort (exposes the Service on a port of every node), and LoadBalancer (provisions an external load balancer, typically in a cloud environment).
In the above example, we are using NodePort. After this is applied, the application will be accessible at:
```
http://<any one of the node IPs>:30991
```
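If you just want a quick test from your own machine, `kubectl port-forward` is an alternative to NodePort; it opens a temporary tunnel to the Service rather than exposing it permanently:

```shell
# Forward local port 9091 to the transmission Service's port 9091
# (runs in the foreground until interrupted with Ctrl-C)
kubectl port-forward service/transmission 9091:9091
# Then browse to http://localhost:9091
```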
Please note that the externally exposed port (30991, the nodePort) differs from the Service's internal port 9091, as defined in the following snippet of the manifest:
```yaml
ports:
  - protocol: TCP
    port: 9091
    nodePort: 30991
    name: http
```
And you can start trying to download something using your application in the cluster!
Wrapping it up
This concludes the 3-part initial write-up on how to set up a Kubernetes cluster using a simple setup that is very cheap to try out; I am using a pair of Raspberry Pi 4 machines. Stay tuned for future articles that will dig deeper into terminology, architecture, and a real-life example of a website setup using Kubernetes!