Kubernetes

This was a bad (good) idea

So I recently came to own seven (7) Raspberry Pi 4s and three (3) 5-terabyte hard drives. This new compute power, all on the aarch64 architecture, got me thinking about service deployments. In the past, I had chosen a specific Raspberry Pi or Rock64 and put effort into configuring that specific device to run a specific service. This was fine, but it became harder to maintain as I spun up new services on new devices. I didn't have a central place for storing configurations, and everything was unique to each system.

I first tried solving this with Ansible. I created Ansible playbooks to deploy Gitea and Mastodon, but this ended up being a lot of work, and each deployment took forever (especially Mastodon, which compiles its own Ruby). Eventually I began transitioning my services to containers. First, I built containers for Mastodon using Podman and deployed them to a couple of my systems. I used Dockerfiles to construct the containers, because they're both easy to reproduce and serve as documentation for how Mastodon is put together. After that, I built containers for Plume.

Deploying containers was fine, but I ran into a couple of issues.

  • Deploying on Podman only makes sense if your containers are small or you have a lot of storage. Podman makes a full copy of each image to generate a container based on it, so a 1.2GB Mastodon image would end up taking 2.4GB to run one instance, and 3.6GB to run two instances.
  • Containers deployed with Docker are difficult to manage as systemd services. While I could start and manage a Podman container easily with systemd (a minimal unit sketch follows this list), Docker didn't play as nicely, and I ended up forgoing systemd entirely and manually kicking the container.
  • I was still manually putting specific software on specific hosts and hard-coding rules about their communication and storage mechanisms.
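
For reference, this is roughly what I mean by managing a Podman container with systemd. The unit below is a minimal sketch with a made-up service name, image, and port, not my actual Mastodon unit (newer Podman releases can also generate something similar with podman generate systemd):

[Unit]
Description=Example container run with Podman
After=network-online.target

[Service]
# container name, image, and port mapping are placeholders
ExecStartPre=-/usr/bin/podman rm -f example-service
ExecStart=/usr/bin/podman run --name example-service -p 3000:3000 example/image:latest
ExecStop=/usr/bin/podman stop example-service
Restart=on-failure

[Install]
WantedBy=multi-user.target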

So... Kubernetes?

It made sense to me that a platform for deploying containers to [some host on my network] while simultaneously centralizing my configuration was a net win, so I went about figuring out how to do that.

First, I decided I needed some storage. I had heard about GlusterFS in a chat sometime last year, so I went to check it out. GlusterFS is a network filesystem that can combine drives from multiple hosts on a network and present them as single volumes. It has the benefit of scaling performance almost linearly with each added node, since reads and writes can be split between the devices. Since I have three large hard drives, I can plug each drive into a different Raspberry Pi and then replicate storage across them.

Installing GlusterFS on Ubuntu is incredibly simple. It's just an apt install glusterfs-server away. I opted to add the PPA (ppa:gluster/glusterfs-6), since it lets me add hosts running different versions of Ubuntu while keeping the same GlusterFS version across them. GlusterFS is a FUSE filesystem, i.e. not built into the kernel, so each of my Kubernetes nodes also needs glusterfs-client installed to mount volumes: apt install glusterfs-client.
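
For reference, the install steps look roughly like this (glusterfs-server on the storage nodes, glusterfs-client on the Kubernetes nodes):

$ sudo add-apt-repository ppa:gluster/glusterfs-6
$ sudo apt update
$ sudo apt install glusterfs-server    # storage nodes
$ sudo apt install glusterfs-client    # kubernetes nodes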

In order for my nodes to share storage securely, I opted to have them all talk WireGuard to each other. This is another simple installation: add ppa:wireguard/wireguard, then apt install wireguard (also install python and linux-kernel-headers if you don't already have them). Once it was installed, I set up interfaces on 192.168.3.0/24 for each node, and shared around a peers-wg.conf file containing every other node's public key and address.
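
The WireGuard install itself is just as short (the PPA is only needed on Ubuntu releases that don't already ship WireGuard):

$ sudo add-apt-repository ppa:wireguard/wireguard
$ sudo apt update
$ sudo apt install wireguard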

Generating WireGuard keys:

$ umask 077
$ wg genkey > privatekey
$ cat privatekey | wg pubkey > publickey

Example /etc/wireguard/wg0.conf

[Interface]
PrivateKey = ...
Address = 192.168.3.21/32
ListenPort = 51820
SaveConfig = true

Example Peer entry

[Peer]
PublicKey = ...
AllowedIPs = 192.168.3.21/32
Endpoint = 192.168.1.123:51820

Adding peers to an existing WireGuard interface

$ sudo systemctl start wg-quick@wg0
$ sudo wg addconf wg0 peers-wg.conf
$ sudo systemctl restart wg-quick@wg0

Kubernetes Time? Not quite

Now that WireGuard is configured, I can continue setting up GlusterFS:

$ sudo gluster peer probe 192.168.3.4
$ sudo gluster peer probe 192.168.3.5
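
Before moving on, it's worth confirming the peers actually connected:

$ sudo gluster peer status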

Now we have the Gluster nodes talking to each other over WireGuard, but we don't have any volumes yet. Luckily, we can create a few on those external hard drives. Oh, they should probably be encrypted, too.

On each storage node, create and open an encrypted LUKS container. Also add an entry to /etc/crypttab (not shown)

$ sudo dd if=/dev/urandom of=/etc/keyfile bs=512 count=4
$ sudo cryptsetup luksFormat /dev/sda -d /etc/keyfile
$ sudo cryptsetup luksOpen /dev/sda cryptdrive -d /etc/keyfile

Create a btrfs filesystem, mount it, and carve out a subvolume for Mastodon mounted at its own path. Also add an entry to /etc/fstab (not shown)

$ sudo mkfs.btrfs /dev/mapper/cryptdrive
$ sudo mount /dev/mapper/cryptdrive /mnt
$ sudo btrfs subvolume create /mnt/@mastodon
$ sudo mkdir -p /glusterfs/mastodon
$ sudo mount -o subvol=@mastodon /dev/mapper/cryptdrive /glusterfs/mastodon
$ sudo mkdir /glusterfs/mastodon/live

On a single node, create and start the GlusterFS volume. This volume replicates data across all three drives.

$ sudo gluster volume create mastodon replica 3 192.168.3.{3,4,5}:/glusterfs/mastodon/live
$ sudo gluster volume start mastodon
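
To sanity-check the volume, it can be mounted manually from any machine with glusterfs-client installed (the mount point here is just an example):

$ sudo mkdir -p /mnt/gluster-test
$ sudo mount -t glusterfs 192.168.3.3:/mastodon /mnt/gluster-test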

Alright, so Kubernetes now? Still no.

Some services we're going to deploy to Kubernetes will need storage, but we don't want to manually provision a volume ahead of time like we did for Mastodon. In those cases we'll want to use NFS, since there's an easy automatic NFS provisioner we can use later.

On each storage node, also add entries to /etc/fstab and /etc/exports (not shown)

$ sudo apt install nfs-kernel-server
$ sudo btrfs subvolume create /mnt/@nfs
$ sudo mkdir -p /storage/nfs
$ sudo mount -o subvol=@nfs /dev/mapper/cryptdrive /storage/nfs

Kubernetes? well...

On a Raspberry Pi, there are a couple of things we need to do before Kubernetes will actually work. The first is enabling cgroups in the kernel. This can be done by adding cgroup_enable=memory cgroup_memory=1 to the kernel arguments in /boot/firmware/btcmd.txt and /boot/firmware/nobtcmd.txt. Next, and this is common across most Ubuntu installs, we'll need to enable IP forwarding: the line net.ipv4.ip_forward=1 in /etc/sysctl.conf should be uncommented.
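
Something like this should cover both changes, assuming the cmdline files each contain a single line of kernel arguments and /etc/sysctl.conf still has the default commented-out line:

$ sudo sed -i 's/$/ cgroup_enable=memory cgroup_memory=1/' /boot/firmware/btcmd.txt /boot/firmware/nobtcmd.txt
$ sudo sed -i 's/^#net.ipv4.ip_forward=1/net.ipv4.ip_forward=1/' /etc/sysctl.conf
$ sudo sysctl -p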

Kubernetes??? yes

$ sudo snap install microk8s --classic

I've used microk8s since it's easy to set up on Ubuntu systems. Unfortunately, microk8s seems to take between 200 and 300 MB of RAM, which is a scarce resource on a Raspberry Pi. For this reason I'd recommend the 4GB Pi 4 model, though I did not use that model myself. Once Kubernetes is installed on all the nodes, you can try executing a kubectl command.

Example:

$ microk8s.kubectl get no

This produces an error like /snap/bin is not in $PATH, so log out, log back in, and try again:

$ microk8s.kubectl get no

This produces an error like ubuntu is not in the kubernetes group, so add ubuntu to that group, then log out and log back in:
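
On the microk8s snaps I've seen, the group is actually named microk8s, so the fix looks something like this (use whatever group name the error message asks for):

$ sudo usermod -a -G microk8s ubuntu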

$ microk8s.kubectl get no

This works now, but typing microk8s.kubectl every time gets tiring. On the node deemed the controller, add alias kubectl=microk8s.kubectl to .bashrc for a better time.
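
One way to do that, assuming the default ubuntu user:

$ echo "alias kubectl='microk8s.kubectl'" >> ~/.bashrc
$ source ~/.bashrc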

Now, on the controller node, start connecting things:

$ microk8s.add-node

This produces output with a command to copy and paste on another node, so do that for each of the other nodes. Note that add-node must be run on the controller each time a node is added, since it generates a unique join URL for each one.
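
The join command it prints looks roughly like this, with a fresh address and token each time:

$ microk8s.join 192.168.3.2:25000/<token>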

Check the connected nodes.

$ kubectl get no
NAME           STATUS   ROLES    AGE     VERSION
192.168.3.11   Ready    <none>   2d4h    v1.17.0
192.168.3.12   Ready    <none>   32h     v1.17.0
192.168.3.2    Ready    <none>   2d4h    v1.17.0
192.168.3.22   Ready    <none>   6d10h   v1.17.0
192.168.3.23   Ready    <none>   7h      v1.17.0
192.168.3.3    Ready    <none>   5h33m   v1.17.0
192.168.3.4    Ready    <none>   13h     v1.17.0
192.168.3.5    Ready    <none>   6h22m   v1.17.0
pi4b1          Ready    <none>   6d11h   v1.17.0

Now that's a lot of nodes!

Alright, but they're not doing anything except using 200-300MB of RAM each just to connect to each other. And that ends part 1 of Kubernetes????