When I worked for large corporates, I saw so many production issues that were entirely avoidable if things had been designed, planned and configured properly.

I have seen load balancing hardware that was not set up properly (relying on a simple ping test instead of properly probing the services), extended downtime due to poorly planned changes, and, most often of all, expiring SSL certificates.

The good thing is, with Kubernetes, these can easily become a thing of the past. With simple configuration, you can set up a stateless web service that is:

  1. Properly load balanced: if you bring down one of the resilient servers (or add another one), only minimal configuration is needed to make the change effective. No manual deployment is required.
  2. Protected against hangs: the load balancer automatically detects a hung web service on one of the resilient nodes and stops directing traffic to it, so you have far less downtime.
  3. Free of SSL certificate expiration embarrassments, like the one that hit HSBC several years ago. (I can tell you it happens far more often than you think, and in a modern architecture it is not just an embarrassment but an outage, because you have a lot of nested web service calls.)

These are all handled by four things in a cloud Kubernetes setup: LoadBalancer, Ingress, a certificate manager, and making sure Kubernetes knows how to verify that your service is up and running (health probes). We will discuss the first three in this article.

We will look at how all of these are set up on bare metal (our Raspberry Pi rig). But before we start, let's dig into Kubernetes networking a bit.

Networking in Kubernetes

Kubernetes was originally designed with cloud services in mind. Imagine you have a cloud service from Google Cloud Platform (GCP): you will have a server set up with a public IP address (in the IPv4 world) assigned.

However, Kubernetes nodes are a bunch of inter-connected network services constantly probing and instructing each other. If Kubernetes were designed to use your public IP address directly, it would occupy a lot of TCP/IP ports, and you would also have to painfully configure the access control lists (similar to configuring a firewall, but on the cloud) of your cloud service provider.

Even on bare metal within a private LAN, all Kubernetes communications are done via its own internal network configuration, as shown below.

Here is the private LAN configuration on one of my node machines:

$ ifconfig eth0

eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.x.y  netmask 255.255.255.0  broadcast 192.168.x.255
        inet6 fe80::dea6:32ff:fe9d:c80a  prefixlen 64  scopeid 0x20<link>
        ether dc:a6:32:9d:c8:0a  txqueuelen 1000  (Ethernet)
        RX packets 583064  bytes 219906226 (219.9 MB)
        RX errors 0  dropped 23776  overruns 0  frame 0
        TX packets 511216  bytes 96980142 (96.9 MB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
My node's IP address is in the private Class C range - 192.168.x.x

However, if you look at what services are running in the cluster, you will see something like below:

$ kubectl get services

NAME              TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)              AGE
ghost-service     ClusterIP   10.110.155.206   <none>        2368/TCP             13d
homebridge        ClusterIP   10.110.251.22    <none>        51826/TCP,8080/TCP   3d22h
kubernetes        ClusterIP   10.96.0.1        <none>        443/TCP              13d
schnack-service   ClusterIP   10.97.69.97      <none>        3000/TCP             6d21h
transmission      NodePort    10.111.85.219    <none>        9091:30991/TCP       12d
The CLUSTER-IP addresses are in a Class A private subnet - 10.y.y.y

Obviously, the external world (in my case, my private subnet) cannot reach the services via their Cluster IPs.

There are a couple of ways to "expose" a Cluster IP to the external world:

  • NodePort (as shown in the above example - "transmission")
  • LoadBalancer (which we will discuss here), because it is the easiest to understand and the most useful

The LoadBalancer

In a cloud-native environment (like AWS or GCP), the LoadBalancer is provided natively; there is no need to set anything up. You only need to configure the service to expose itself as a LoadBalancer.
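For illustration, a minimal sketch of such a Service manifest might look like the one below (the names my-web-service / my-web-app and the ports are hypothetical placeholders):

apiVersion: v1
kind: Service
metadata:
  name: my-web-service        # hypothetical name, for illustration only
spec:
  type: LoadBalancer          # ask the platform for an external IP
  selector:
    app: my-web-app           # must match the labels on your pods
  ports:
  - port: 80                  # port exposed on the external IP
    targetPort: 8080          # port your container actually listens on

A sketch of a Service exposed as a LoadBalancer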

In the bare-metal world, there is essentially one implementation of LoadBalancer, called MetalLB, which you will need to deploy and configure in your bare-metal cluster yourself.

The MetalLB website will humbly tell you that it is still beta software; however, I find it reliable enough for my own normal website use. It also mimics the behaviour of AWS / GCP, so it is a good way to learn about the real cloud environment without having to pay for a subscription.

Installation of MetalLB

Installation is very straightforward: you can simply follow the official guide, with one catch - MetalLB may not be compatible with all Kubernetes network addons. I used Flannel, which is compatible and is the easiest to set up, as described in my previous how-tos. Also, if you are using any of the cloud platforms, you generally will not need MetalLB at all.
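For reference, the manifest-based installation at the time of writing boils down to the commands below (v0.9.6 is pinned purely as an example; check the official guide for the version current to you):

kubectl apply -f https://raw.githubusercontent.com/metallb/metallb/v0.9.6/manifests/namespace.yaml
kubectl apply -f https://raw.githubusercontent.com/metallb/metallb/v0.9.6/manifests/metallb.yaml
# On first install only: create the secret the speakers use to encrypt their communications
kubectl create secret generic -n metallb-system memberlist --from-literal=secretkey="$(openssl rand -base64 128)"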

Configuration of MetalLB

Configuration for small-scale testing / deployment is also very simple. We will use the Layer 2 configuration, as the other type (BGP configuration) assumes you have real network routing and Border Gateway Protocol knowledge. Again, just follow the configuration procedure to assign some unused IP addresses in your private network range to act as load balancing IP addresses, and you are done.
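As a sketch, a Layer 2 configuration is just a ConfigMap in the metallb-system namespace; the address range below is an assumption you must adapt to unused addresses in your own LAN:

apiVersion: v1
kind: ConfigMap
metadata:
  namespace: metallb-system
  name: config
data:
  config: |
    address-pools:
    - name: default
      protocol: layer2
      addresses:
      - 192.168.x.240-192.168.x.250

A sample Layer 2 address pool for MetalLB (adapt the range to your network)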

Once you have set it up with the default configuration, you will be able to find its deployment via the following command:

$ kubectl get deployment -n metallb-system

NAME         READY   UP-TO-DATE   AVAILABLE   AGE
controller   1/1     1            1           13d

A new namespace called metallb-system will have been created, containing a controller deployment that manages the load balancer's IP address assignments.
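Note that the Layer 2 announcements themselves are done by a speaker component, which runs on every node as a DaemonSet; you can verify it with:

$ kubectl get daemonset -n metallb-system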

Ingress

Ingress is a very specific term in Kubernetes, applicable to HTTP and HTTPS traffic. It is useful when you have only one public IP address and want to host different domains / web services based on the Fully Qualified Domain Name (e.g. www.abc.com, comments.abc.com) or based on the path of the URI.

The most common Ingress is based on NGINX, a very powerful and popular web / proxy server implementation. In Kubernetes, the ingress-nginx implementation makes use of NGINX's reverse proxying capabilities to route different FQDNs / URI paths to different services in the Kubernetes cluster.

Installation of ingress-nginx

For bare-metal setups like mine, use the bare-metal section of the installation guide from Kubernetes to install ingress-nginx.
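At the time of writing, that boiled down to a single kubectl apply of the bare-metal deployment manifest, along the lines of the command below (the controller version is pinned purely as an example; use whatever the guide currently points to):

kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/controller-v0.44.0/deploy/static/provider/baremetal/deploy.yaml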

This will configure ingress-nginx to use NodePort after installation. If you try to review the setup afterwards, you will see something like below:

$ kubectl get svc -n ingress-nginx

NAME                                 TYPE           CLUSTER-IP      EXTERNAL-IP       PORT(S)                      AGE
ingress-nginx-controller             NodePort       10.99.241.178   <none>            80:31125/TCP,443:30820/TCP   13d
ingress-nginx-controller-admission   ClusterIP      10.106.81.124   <none>            443/TCP                      13d
Initial configuration of ingress-nginx after bare-metal installation

But wait: we have MetalLB, so we can change this from NodePort to LoadBalancer. It can be done by issuing this command:

kubectl edit service ingress-nginx-controller -n ingress-nginx

It will open your favourite editor in Linux (in my case, vi) so that you can modify the configuration as below: find the type field in the YAML file and change it from NodePort to LoadBalancer.

  selector:
    app.kubernetes.io/component: controller
    app.kubernetes.io/instance: ingress-nginx
    app.kubernetes.io/name: ingress-nginx
  sessionAffinity: None
  type: LoadBalancer

Then save the file. kubectl will detect the changes and apply the configuration. You will see the service type changed to LoadBalancer if you review the services again:

$ kubectl get svc -n ingress-nginx

NAME                                 TYPE           CLUSTER-IP      EXTERNAL-IP       PORT(S)                      AGE
ingress-nginx-controller             LoadBalancer   10.99.241.178   192.168.zzz.aaa   80:31125/TCP,443:30820/TCP   13d
ingress-nginx-controller-admission   ClusterIP      10.106.81.124   <none>            443/TCP                      13d
The service became a load balancer and an EXTERNAL-IP is assigned by MetalLB
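As an aside, if you prefer a non-interactive approach (e.g. for scripting), the same change can be made with kubectl patch:

kubectl patch service ingress-nginx-controller -n ingress-nginx -p '{"spec": {"type": "LoadBalancer"}}'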

Once you have a working Ingress, you can also test it via your web browser: just enter the external IP from the above example over plain http, and you will see a page like below:

Working ingress-nginx but with no service exposed.

In theory, you can now expose your web service, or web server, by creating an Ingress resource using the sample configuration below:

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: your-web-service-name
  annotations:
    kubernetes.io/ingress.class: "nginx"

spec:
  rules:
  - host: www.example.domain
    http:
      paths:
      - path: /
        backend:
          serviceName: ghost-service
          servicePort: 2368
Exposing Ghost, a popular blogging platform, to Ingress
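Assuming you saved the manifest above as ghost-ingress.yaml (a hypothetical filename), applying and checking it is just:

$ kubectl apply -f ghost-ingress.yaml
$ kubectl get ingress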

However, I will show you a more elaborate example with HTTPS enabled, to complete the full picture.

Automatic SSL enablement and renewal via cert-manager and LetsEncrypt

cert-manager is a Kubernetes add-on that helps you manage external certificates. It supports a number of external certificate authorities, including common ones such as Venafi and, more importantly, LetsEncrypt.

LetsEncrypt is a free Certificate Authority which aims to give everyone a low cost way to properly issue SSL certificates that can be used in web servers and web services. It really lowers the barrier for self-hosting and for startups. Obviously, you need a working domain and a DNS service provider in order to use LetsEncrypt. I will use example.com throughout the steps.

Combining cert-manager and LetsEncrypt, you get low cost, fully automated certificate management, and you never have to worry about certificate renewals again.

Installing cert-manager

Again, cert-manager provides an excellent and simple installation guide, which you can follow directly. It consists of a single kubectl one-liner like below:

kubectl apply -f https://github.com/jetstack/cert-manager/releases/download/v1.2.0/cert-manager.yaml

After it is installed, you can verify it as below:

$ kubectl get deployments -n cert-manager

NAME                      READY   UP-TO-DATE   AVAILABLE   AGE
cert-manager              1/1     1            1           13d
cert-manager-cainjector   1/1     1            1           13d
cert-manager-webhook      1/1     1            1           13d

Testing cert-manager

It is very important to test cert-manager, because most certificate authorities rate-limit your SSL certificate requests. If you fail too many times within the time limit, you will be blocked for a period of at least a couple of days.

To test cert-manager, create and apply an Issuer manifest file like the one below:

apiVersion: cert-manager.io/v1
kind: Issuer
metadata:
  name: letsencrypt-staging
spec:
  acme:
    # You must replace this email address with your own.
    # Let's Encrypt will use this to contact you about expiring
    # certificates, and issues related to your account.
    email: michael@example.com
    server: https://acme-staging-v02.api.letsencrypt.org/directory
    privateKeySecretRef:
      # Secret resource that will be used to store the account's private key.
      name: letsencrypt-staging-account-key
    # Add a single challenge solver, HTTP01 using nginx
    solvers:
    - http01:
        ingress:
          class: nginx

The above instructs cert-manager to use LetsEncrypt's staging environment to issue SSL certificates, and to work with ingress-nginx, using the HTTP protocol to validate ownership of the domain name, which is the easiest method. The other way is to use DNS to validate ownership of the domain name, but that requires you to have full control over the DNS administration.
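Save the manifest (say, as issuer-staging.yaml, a hypothetical filename), apply it, and check that cert-manager has registered the ACME account; a READY value of True means you are good to go:

$ kubectl apply -f issuer-staging.yaml
$ kubectl get issuer letsencrypt-staging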

Once it is set up, you can try to issue a testing certificate for your web server through its Ingress definition YAML:

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: example-com
  annotations:
    kubernetes.io/ingress.class: "nginx"
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
    nginx.ingress.kubernetes.io/force-ssl-redirect: "false"
    nginx.ingress.kubernetes.io/from-to-www-redirect: "true"
    nginx.ingress.kubernetes.io/server-alias: www.example.com
    cert-manager.io/issuer: "letsencrypt-staging"

spec:
  tls:
  - hosts:
    - example.com
    - www.example.com
    secretName: example-com-tls
  rules:
  - host: example.com
    http:
      paths:
      - path: /
        backend:
          serviceName: example-service
          servicePort: 2368

The above example instructs cert-manager to try to issue a certificate using the LetsEncrypt staging environment (see the cert-manager.io/issuer annotation and the tls section above). It also asks ingress-nginx to redirect the following traffic:

  1. http://example.com to https://example.com
  2. http://www.example.com to https://example.com
  3. https://www.example.com to https://example.com

This is a very common pattern in web serving nowadays. The website itself is served by the Kubernetes service called example-service (on the port specified).

Once applied, cert-manager will detect that you need a certificate (through the cert-manager.io/issuer annotation) and start requesting an SSL certificate.

Before it requests anything from LetsEncrypt, it will probe the web server to ensure that a proper challenge/response token can be served, by accessing the web server with the domain name specified above (https://example.com). All of this is gracefully handled by ingress-nginx, so normally you do not need to worry about anything. In case of issues, there are a couple of areas you may want to check:

  • Whether the certificate request was issued properly, by running:
$ kubectl get certificaterequest

NAME                              READY   AGE
example-com-tls-g2hfb             True    6d22h

# You can also use
# kubectl describe certificaterequest example-com-tls-g2hfb
# to see what details are there, and why it failed (if anything)
Check if certificate request is generated properly
  • Whether the domain challenge is being properly handled, by running:
$ kubectl get challenges

NAME                                 STATE     DOMAIN            REASON                                     AGE
example-com-<some number>            pending   example.com       Waiting for dns-01 challenge propagation   22s
  • Whether the certificate has been issued properly, by running:
$ kubectl get certificate

NAME                        READY   SECRET                AGE
example-com-tls             True    example-com-tls       6d22h

# You can also use
# kubectl describe certificate example-com-tls to see the details
# and why it may have failed.
Check if certificate is issued.

Once all of the above is verified, you can then issue the production certificate.

Issuing Production Certificate

It is really simple to issue a production certificate once you have everything tested. You will need a production Issuer set up in cert-manager, like below:

apiVersion: cert-manager.io/v1
kind: Issuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    email: michael@example.com
    privateKeySecretRef:
      name: letsencrypt-prod-account-key
    server: https://acme-v02.api.letsencrypt.org/directory
    solvers:
    - http01:
        ingress:
          class: nginx
Note the difference in the "server" field and in the name.

Then, modify your ingress yaml above to point to the letsencrypt-prod issuer like below:

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: example-com
  annotations:
    kubernetes.io/ingress.class: "nginx"
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
    nginx.ingress.kubernetes.io/force-ssl-redirect: "false"
    nginx.ingress.kubernetes.io/from-to-www-redirect: "true"
    nginx.ingress.kubernetes.io/server-alias: www.example.com
    cert-manager.io/issuer: "letsencrypt-prod"

spec:
  tls:
  - hosts:
    - example.com
    - www.example.com
    secretName: example-com-tls
  rules:
  - host: example.com
    http:
      paths:
      - path: /
        backend:
          serviceName: example-service
          servicePort: 2368
cert-manager.io/issuer now points to letsencrypt-prod

cert-manager will now detect this change and request another, production, SSL certificate. Once it is ready (and verified), use your web browser to visit your own site and see everything properly configured (like mine).
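If you prefer the command line, openssl can show you the issuer and the validity dates of the certificate actually being served:

$ openssl s_client -connect example.com:443 -servername example.com </dev/null 2>/dev/null | openssl x509 -noout -issuer -dates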

Wrapping it up

With the above configuration, you will have a properly set up web server (or web service), load balanced (if you have more than one instance) and SSL-secured for external traffic from the Internet, which is sufficient in most web serving scenarios. Obviously, in the commercial world, not just the external traffic needs to be encrypted, but the internal traffic as well. This can be done with Layer 3 encryption (IPSec) or by having cert-manager issue SSL certificates for all web services (which can be complex to set up, but is fully automated once done). It really depends on the use case and what kind of team you have in your organisation.

Obviously, in our homelab setup, we have only one public IP address available. Hence all external communications (like web traffic, or the LetsEncrypt validation) need to arrive there and be managed by ingress-nginx. To do so, you will need to configure your firewall properly to direct all traffic from the external public IP address to the LoadBalancer IP address assigned to ingress-nginx-controller. This is out of scope for this article, but stay tuned... I will probably post an article soon on using a very secure operating system (OpenBSD) to set up a full-fledged home firewall.