Proof of concept Kubernetes cluster on Raspberry Pi using K3s

The project

The plan is fairly simple: we’ll try to set up a proof-of-concept Kubernetes cluster in a homelab environment, with a twist: service announcement over BGP.

The hardware

You’ll need several pieces of hardware at this stage. I’ve tried this with a Raspberry Pi version 3 model B and feel that the Pi didn’t handle the load well at times - version 4 is probably a better idea.

The MikroTik RB4011 router is a great choice for this project as it comes with ten Gigabit Ethernet RJ45 ports (split across 2 separate switch chips), a great OS (RouterOS) and more capabilities than one can even dream of utilizing. That being said, it’s overkill. Any router that speaks BGP is sufficient, but the console snippets in this post are from MikroTik’s RouterOS.

  • MikroTik RB4011, an Ethernet router that, among other features, speaks BGP
  • Raspberry Pi 3 Model B
  • MicroSD card (8 GiB seems enough)
  • a device capable of writing MicroSD cards
  • power adapter suitable for the Pi
  • Ethernet cable

Setting up the network infrastructure

We’ll bridge ports 6 through 10 (1 is used for WAN, 2-5 for home network) and create a new subnet where the project will take place. Why 5 ports for 1 Pi? At this point, it’s just a convenience of being able to plug it into any of the 5 ports and be on the correct network. Starting at layer 2:

/interface bridge
add name=bridge2
/interface bridge port
add bridge=bridge2 interface=ether6
add bridge=bridge2 interface=ether7
add bridge=bridge2 interface=ether8
add bridge=bridge2 interface=ether9
add bridge=bridge2 interface=ether10

Next, we move on to L3 tasks. The home lab is initially assigned its own subnet, with the router holding an address inside it. The router’s address is an important piece of information that we’ll need when setting up BGP.

/ip address
add address= interface=bridge2 network=

Then we probably want a DHCP server. A portion of the subnet is reserved for static IPs, and DHCP only hands out addresses from a dedicated range. The split exists because it scales to more complex scenarios in the future; for now, we won’t need to touch the static space except for the router.

/ip pool
add name=lab-pool ranges=
/ip dhcp-server network
add address= dns-server=, gateway=
/ip dhcp-server
add address-pool=lab-pool disabled=no interface=bridge2 name=dhcp2
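
The static/DHCP split described above can be sanity-checked before typing it into the router. The following is only a sketch: the subnet 192.0.2.0/24 (a documentation range) and the size of the static block are made-up values, not the ones used in this lab, and split_subnet is a hypothetical helper.

```python
import ipaddress


def split_subnet(cidr, static_count):
    """Split the usable hosts of a subnet into a static block and a DHCP range.

    Returns two (first, last) tuples: one for static IPs, one for DHCP.
    """
    hosts = list(ipaddress.ip_network(cidr).hosts())
    if not 0 < static_count < len(hosts):
        raise ValueError("static_count must leave room for both ranges")
    static = hosts[:static_count]
    dynamic = hosts[static_count:]
    return (static[0], static[-1]), (dynamic[0], dynamic[-1])


# Hypothetical values: a /24 with the first 99 hosts kept for static use.
static_range, dhcp_range = split_subnet("192.0.2.0/24", 99)
print("static:", static_range[0], "-", static_range[1])
print("dhcp:  ", dhcp_range[0], "-", dhcp_range[1])
```

The second tuple is what would go into the ranges= parameter of the pool, while the router’s own address lives in the first one.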

Preparing the Raspberry Pi (on macOS)

We need an OS. At this point, experimenting with ARM64 is needless overhead, and the same applies to any non-standard OS. The path of least resistance seems to be Raspberry Pi OS (previously called Raspbian).

$ wget \
    --trust-server-names \
$ unzip

Since the DHCP server is under our control (running on the RB4011), it is acceptable for the Pi to obtain a DHCP lease. We’ll then find the IP directly in the DHCP server’s lease table. There is no expectation that a screen will ever be connected to the Pi, and Raspberry Pi OS does not ship with SSH enabled by default. Can we somehow enable it at this stage, before the first boot?

According to the docs, the OS looks for a file named ssh in the boot directory. If it is found, the OS boots up with SSH enabled (using the default pi username and raspberry password).

$ open 2020-08-20-raspios-buster-armhf-lite.img
$ cd /Volumes/boot
$ touch ssh
$ diskutil unMount /Volumes/boot
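
The marker-file trick can also be scripted. This is a minimal sketch, assuming the boot partition is already mounted somewhere; enable_ssh is a made-up helper name, and the demo runs against a temporary directory standing in for the real boot volume.

```python
import tempfile
from pathlib import Path


def enable_ssh(boot_dir):
    """Create the empty 'ssh' marker file that Raspberry Pi OS checks at boot."""
    boot = Path(boot_dir)
    if not boot.is_dir():
        raise FileNotFoundError(f"boot partition not mounted at {boot}")
    marker = boot / "ssh"
    marker.touch(exist_ok=True)  # the content does not matter, only the name
    return marker


# Demo against a throwaway directory instead of a real boot partition.
with tempfile.TemporaryDirectory() as fake_boot:
    marker = enable_ssh(fake_boot)
    print(marker.name, marker.stat().st_size)
```

On macOS, the call matching the commands above would be enable_ssh("/Volumes/boot").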

We then need to write the modified image onto the MicroSD card used as our main system disk. Since Apple hardware no longer ships with a (Micro)SD card reader, an external reader is our only choice.

First, let’s figure out which device is the SD card using the diskutil command:

$ diskutil list

/dev/disk3 (external, physical):
   #:                       TYPE NAME                    SIZE       IDENTIFIER
   0:     FDisk_partition_scheme                        *7.9 GB     disk3
   1:             Windows_FAT_32 boot                    268.4 MB   disk3s1
   2:                      Linux                         7.7 GB     disk3s2

Double-check the parameters of the device - writing to the wrong device would lead to catastrophic consequences. To speed up the process, we want to write to the raw device (/dev/rdiskN) instead of the buffered /dev/diskN variant.

$ sudo diskutil unmountDisk /dev/disk3
$ sudo dd if=~/2020-08-20-raspios-buster-armhf-lite.img of=/dev/rdisk3 bs=4m
$ sudo diskutil eject /dev/rdisk3
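
Picking the target device by eye is error-prone, so the check can be automated. The sketch below only parses text in the shape of the diskutil list output shown above; the internal disk0 entry in the sample is invented for illustration, and both helper names are hypothetical.

```python
import re


def external_disks(diskutil_output):
    """Return device nodes that diskutil marks as external and physical."""
    pattern = re.compile(r"^(/dev/disk\d+) \(external, physical\):", re.MULTILINE)
    return pattern.findall(diskutil_output)


def raw_device(device):
    """Map the buffered /dev/diskN node to its raw /dev/rdiskN counterpart."""
    return device.replace("/dev/disk", "/dev/rdisk", 1)


# Illustrative sample; the disk0 entry is made up, disk3 mirrors the output above.
sample = """\
/dev/disk0 (internal, physical):
   #:                       TYPE NAME                    SIZE       IDENTIFIER
   0:      GUID_partition_scheme                        *500.3 GB   disk0

/dev/disk3 (external, physical):
   #:                       TYPE NAME                    SIZE       IDENTIFIER
   0:     FDisk_partition_scheme                        *7.9 GB     disk3
   1:             Windows_FAT_32 boot                    268.4 MB   disk3s1
   2:                      Linux                         7.7 GB     disk3s2
"""

candidates = external_disks(sample)
print(candidates, [raw_device(d) for d in candidates])
```

If the list contains anything other than exactly one device, it is safer to stop and inspect the output manually before running dd.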

After ejecting the logical device, it’s time to get on the hardware level and plug the card into the Raspberry Pi. Continuing on the hardware front, we plug the Pi into any of the bridged router ports (6-10) and connect a power adapter.

First test!

The Pi should boot up and obtain a DHCP lease. Let’s consult the router. The output is modified to hide any other leases and hwaddr.

/ip dhcp-server lease p
 1 D B8:27:EB:00:00:00 rpi dhcp2 bound

That means we should be able to reach our Pi via SSH:

$ ssh pi@
pi@'s password:
Linux k8s-master 5.4.51-v7+ #1333 SMP Mon Aug 10 16:45:19 BST 2020 armv7l


pi@rpi:~ $

Success! We’re in, and it’s time to install Kubernetes. The Kubernetes flavor of choice is k3s, a pretty popular (at least according to its 15k GitHub stars as of Sep. 2020) minimal Kubernetes distribution from Rancher. For this project, the main value proposition of k3s is minimal resource usage together with high-quality developer tools.

Naively reading the k3s docs, it seems that all we need is to run a single curl command and pipe it into a shell. As root. Sounds very safe, but it’s an experiment - so why not. :) Also note that the Pi was renamed to k8s-master.

root@k8s-master:/home/pi# curl -sfL | sh -
[INFO]  Finding release for channel stable
[INFO]  Using v1.18.9+k3s1 as release
[INFO]  Downloading hash
[INFO]  Downloading binary
[INFO]  Verifying binary download
[INFO]  Installing k3s to /usr/local/bin/k3s
[INFO]  Creating /usr/local/bin/kubectl symlink to k3s
[INFO]  Creating /usr/local/bin/crictl symlink to k3s
[INFO]  Creating /usr/local/bin/ctr symlink to k3s
[INFO]  Creating killall script /usr/local/bin/
[INFO]  Creating uninstall script /usr/local/bin/
[INFO]  env: Creating environment file /etc/systemd/system/k3s.service.env
[INFO]  systemd: Creating service file /etc/systemd/system/k3s.service
[INFO]  systemd: Enabling k3s unit
Created symlink /etc/systemd/system/ → /etc/systemd/system/k3s.service.
[INFO]  systemd: Starting k3s

Wow. At this point it’s worth noting that the k3s project kept its promise of a “convenient way to download K3s and add a service to systemd or openrc”. Did we really manage to install Kubernetes (on ARM, even!) with one command?!

root@k8s-master:/home/pi# kubectl get pods --all-namespaces
NAMESPACE     NAME                                     READY   STATUS              RESTARTS   AGE
kube-system   helm-install-traefik-jstv5               0/1     ContainerCreating   0          65s
kube-system   local-path-provisioner-6d59f47c7-7r5kr   0/1     ContainerCreating   0          63s
kube-system   metrics-server-7566d596c8-xb8mg          0/1     ContainerCreating   0          63s
kube-system   coredns-7944c66d8d-9w9lk                 0/1     ContainerCreating   0          63s

It does seem to be the case. That being said, the Pi isn’t handling the load particularly well. It’s time to order a few Pi 4s while the cluster slowly boots up.

top - 16:45:07 up  4:57,  1 user,  load average: 7.37, 3.57, 1.82
Tasks: 168 total,   2 running, 166 sleeping,   0 stopped,   0 zombie
%Cpu0  :  25.6/8.4    34[||||||||||||||||||||||||||||||||||                                                                  ]
%Cpu1  :  37.3/6.0    43[|||||||||||||||||||||||||||||||||||||||||||                                                         ]
%Cpu2  :  54.5/2.4    57[|||||||||||||||||||||||||||||||||||||||||||||||||||||||||                                           ]
%Cpu3  :  68.1/4.0    72[||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||                            ]
MiB Mem :    925.9 total,     21.9 free,    474.7 used,    429.2 buff/cache
MiB Swap:    100.0 total,     79.2 free,     20.8 used.    394.3 avail Mem

After a few minutes, this is what we get:

root@k8s-master:/home/pi# kubectl get pods --all-namespaces
NAMESPACE     NAME                                     READY   STATUS      RESTARTS   AGE
kube-system   local-path-provisioner-6d59f47c7-7r5kr   1/1     Running     0          2m53s
kube-system   metrics-server-7566d596c8-xb8mg          1/1     Running     0          2m53s
kube-system   coredns-7944c66d8d-9w9lk                 1/1     Running     0          2m53s
kube-system   helm-install-traefik-jstv5               0/1     Completed   0          2m55s
kube-system   svclb-traefik-7wwmr                      2/2     Running     0          62s
kube-system   traefik-758cd5fc85-kbbnr                 1/1     Running     0          64s

I’m not exactly happy about the choice of Traefik (that’s almost material for another post), but let’s consider it acceptable for now. It’s time to get MetalLB up and try to advertise a service over BGP. Since we lack any automation on the router side, we need to prepare the router manually to peer with our MetalLB instance:

/routing bgp peer
add name=peer1 remote-address= remote-as=64500 ttl=default

Let’s see how MetalLB setup goes!

kubectl apply -f
kubectl apply -f
kubectl create secret generic -n metallb-system memberlist --from-literal=secretkey="$(openssl rand -base64 128)"

MetalLB also requires a ConfigMap that configures the BGP peers and the address pools used for service allocation. The pool should lie outside the DHCP range to avoid possible conflicts, so we pick a subnet that happens to be unused in our network.

$ cat << EOF > metallb-cm.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: metallb-system
  name: config
data:
  config: |
    peers:
    - peer-address:
      peer-asn: 65530
      my-asn: 64500
    address-pools:
    - name: default
      protocol: bgp
      avoid-buggy-ips: true
      addresses:
      -
EOF
$ kubectl apply -f metallb-cm.yaml
configmap/config created
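
The rule that the MetalLB pool must stay clear of the DHCP range is easy to verify mechanically. A small sketch follows, using Python’s ipaddress module; all addresses are hypothetical documentation ranges rather than this lab’s values, and pool_conflicts is a made-up helper name.

```python
import ipaddress


def pool_conflicts(pool_cidr, dhcp_first, dhcp_last):
    """Return True if the MetalLB pool overlaps the DHCP range first..last."""
    pool = ipaddress.ip_network(pool_cidr)
    # A DHCP range is rarely a clean CIDR block, so break it into networks.
    dhcp_blocks = ipaddress.summarize_address_range(
        ipaddress.ip_address(dhcp_first), ipaddress.ip_address(dhcp_last)
    )
    return any(pool.overlaps(block) for block in dhcp_blocks)


# Hypothetical DHCP range 192.0.2.100-192.0.2.254 and two candidate pools.
print(pool_conflicts("198.51.100.0/24", "192.0.2.100", "192.0.2.254"))
print(pool_conflicts("192.0.2.128/25", "192.0.2.100", "192.0.2.254"))
```

The first candidate lives in a separate subnet and is safe; the second sits inside the DHCP range and would risk handing the same address to a service and a DHCP client.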

After setting everything up, it’s time to create a Service of type LoadBalancer and see if MetalLB is able to advertise the IP. Actually, wait a second. Since Traefik is already in the cluster, don’t we already have one LoadBalancer service?

root@k8s-master:/home/pi# kubectl get svc --all-namespaces
NAMESPACE     NAME                 TYPE           CLUSTER-IP      EXTERNAL-IP   PORT(S)                      AGE
default       kubernetes           ClusterIP       <none>        443/TCP                      9m55s
kube-system   kube-dns             ClusterIP      <none>        53/UDP,53/TCP,9153/TCP       9m51s
kube-system   metrics-server       ClusterIP   <none>        443/TCP                      9m44s
kube-system   traefik-prometheus   ClusterIP    <none>        9100/TCP                     7m45s
kube-system   traefik              LoadBalancer    80:31932/TCP,443:30621/TCP   7m44s

Uh-oh. On the positive side, we do happen to have a LoadBalancer service. On the other hand, its external IP certainly doesn’t belong to our pool. What went wrong?

$ /routing bgp peer p
Flags: X - disabled, E - established
 #   INSTANCE                                               REMOTE-ADDRESS                                                                         REMOTE-AS
 0 E default                                                                                                                   64500

MetalLB managed to peer with the router successfully. Checking the speaker logs…

$ kubectl logs -n metallb-system speaker-96mxn --tail 100
{"caller":"main.go:267","event":"startUpdate","msg":"start of service update","service":"kube-system/traefik","ts":"2020-09-27T15:54:55.90341073Z"}
{"caller":"main.go:293","error":"assigned IP not allowed by config","ip":"","msg":"IP allocated by controller not allowed by config","op":"setBalancer","service":"kube-system/traefik","ts":"2020-09-27T15:54:55.903652343Z"}
{"caller":"main.go:369","event":"serviceWithdrawn","ip":"","msg":"withdrawing service announcement","reason":"ipNotAllowed","service":"kube-system/traefik","ts":"2020-09-27T15:54:55.903869425Z"}
{"caller":"main.go:294","event":"endUpdate","msg":"end of service update","service":"kube-system/traefik","ts":"2020-09-27T15:54:55.903962654Z"}
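
Since the speaker emits one JSON object per line, the interesting entries can be filtered out instead of being read by eye. This is a rough sketch that relies only on the fields visible in the log lines above; failed_updates is a hypothetical helper, and the sample reuses two of those lines verbatim.

```python
import json


def failed_updates(log_text):
    """Collect (service, error) pairs from JSON-per-line speaker logs."""
    failures = []
    for line in log_text.splitlines():
        line = line.strip()
        if not line.startswith("{"):
            continue  # skip anything that is not a JSON log entry
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue
        if "error" in entry:
            failures.append((entry.get("service"), entry["error"]))
    return failures


sample = '''
{"caller":"main.go:267","event":"startUpdate","msg":"start of service update","service":"kube-system/traefik","ts":"2020-09-27T15:54:55.90341073Z"}
{"caller":"main.go:293","error":"assigned IP not allowed by config","ip":"","msg":"IP allocated by controller not allowed by config","op":"setBalancer","service":"kube-system/traefik","ts":"2020-09-27T15:54:55.903652343Z"}
'''

for service, error in failed_updates(sample):
    print(service, "->", error)
```

The function takes plain text, so the output of the kubectl logs command above can be saved to a file and passed in unchanged.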

There seems to be a process interfering with what the MetalLB controller attempts to do, and the external IP toggles between the node IP and our advertised IP.

$ /routing prefix-lists> /ip route p
Flags: X - disabled, A - active, D - dynamic, C - connect, S - static, r - rip, b - bgp, o - ospf, m - mme, B - blackhole, U - unreachable, P - prohibit
 #      DST-ADDRESS        PREF-SRC        GATEWAY            DISTANCE
 3 ADb                             20

The router also accepted the route, so what went wrong? Digging into the k3s docs, it’s apparent that k3s ships with some form of LoadBalancer controller of its own. Luckily, the docs mention that this is an optional component that can be disabled with --disable servicelb, perfect! The docs also mention that it’s possible to disable Traefik with --disable traefik, so let’s combine the two and see where we get. It’s again time to appreciate how powerful the tools shipped with k3s are: there’s a script to uninstall everything and start from scratch!


Second attempt, now without Traefik and ServiceLB

With our newly obtained knowledge, let’s get k3s up and running without the components we don’t want:

curl -sfL | sh -s - --disable traefik --disable servicelb

And after a few minutes, this is what we get:

NAMESPACE     NAME                                     READY   STATUS    RESTARTS   AGE
kube-system   metrics-server-7566d596c8-pdz8w          1/1     Running   0          73s
kube-system   local-path-provisioner-6d59f47c7-t5c8v   1/1     Running   0          73s
kube-system   coredns-7944c66d8d-wj4pw                 1/1     Running   0          73s

Perfect, a minimal cluster! The lack of Traefik also means that there is no pre-allocated LoadBalancer service. Let’s start by deploying the NGINX ingress controller with a Service of type LoadBalancer to see if we’re able to obtain an external IP for it. Since the bare-metal NGINX controller manifest ships with a NodePort service by default, we need to change the service type from NodePort to LoadBalancer.

$ wget
$ sed -i 's/NodePort/LoadBalancer/' deploy.yaml
$ kubectl apply -f deploy.yaml
namespace/ingress-nginx created
serviceaccount/ingress-nginx created
configmap/ingress-nginx-controller created created created created created
service/ingress-nginx-controller-admission created
service/ingress-nginx-controller created
deployment.apps/ingress-nginx-controller created created
serviceaccount/ingress-nginx-admission created created created created created
job.batch/ingress-nginx-admission-create created
job.batch/ingress-nginx-admission-patch created

It’s time to check the state of the LoadBalancer service:

$ kubectl get svc -n ingress-nginx
NAME                                 TYPE           CLUSTER-IP      EXTERNAL-IP   PORT(S)                      AGE
ingress-nginx-controller-admission   ClusterIP   <none>        443/TCP                      18s
ingress-nginx-controller             LoadBalancer     <pending>     80:31657/TCP,443:31381/TCP   18s

Repeating this a few times while also checking the logs, it doesn’t seem that there is anything to assign the IP. That is expected - we disabled the ServiceLB component of k3s. Time to try MetalLB again!

$ kubectl apply -f
$ kubectl apply -f
$ kubectl create secret generic -n metallb-system memberlist --from-literal=secretkey="$(openssl rand -base64 128)"
$ cat << EOF > metallb-cm.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: metallb-system
  name: config
data:
  config: |
    peers:
    - peer-address:
      peer-asn: 65530
      my-asn: 64500
    address-pools:
    - name: default
      protocol: bgp
      avoid-buggy-ips: true
      addresses:
      -
EOF
$ kubectl apply -f metallb-cm.yaml

After a while, the BGP peering is seen as established on the router side.

$ /routing bgp peer p
Flags: X - disabled, E - established
 #   INSTANCE                                               REMOTE-ADDRESS                                                                         REMOTE-AS
 0   default                                                                                                                   64500

And the MetalLB controller has assigned an IP address to the service!

$ kubectl get svc -n ingress-nginx
NAME                                 TYPE           CLUSTER-IP      EXTERNAL-IP   PORT(S)                      AGE
ingress-nginx-controller-admission   ClusterIP   <none>        443/TCP                      10m
ingress-nginx-controller             LoadBalancer      80:30579/TCP,443:31078/TCP   100s

Hello world!

It’s time to see if the BGP route advertisement really worked and we can reach the service from the home network (assuming that the firewall configuration permits that).

$ curl -vvv
*   Trying
* Connected to ( port 80 (#0)
> GET / HTTP/1.1
> Host:
> User-Agent: curl/7.64.1
> Accept: */*
< HTTP/1.1 404 Not Found
< Server: nginx/1.19.2
< Date: Sun, 27 Sep 2020 16:21:20 GMT
< Content-Type: text/html
< Content-Length: 153
< Connection: keep-alive
<head><title>404 Not Found</title></head>
<center><h1>404 Not Found</h1></center>
* Connection #0 to host left intact
* Closing connection 0
$ /ip route> /ip route p
Flags: X - disabled, A - active, D - dynamic, C - connect, S - static, r - rip, b - bgp, o - ospf, m - mme, B - blackhole, U - unreachable, P - prohibit
 #      DST-ADDRESS        PREF-SRC        GATEWAY            DISTANCE
 3 ADb                             20

Perfect! Although the server returns 404, the Server header hints that we’ve reached NGINX. Any cluster service can now be exposed via an Ingress resource.

And that’s it for the day! The Raspberry Pi 3 B is somewhat overloaded, so it’s time to wait for the Pi 4 to arrive before experimenting further. All in all, this has been a great experience with the K3s distribution - the ease of setup for a project like this is perfect.