Virtualization in Minikube

... or how I went too deep to build a custom minikube image because I can't download more RAM

Introduction

Minikube is a very useful tool when we are actively developing Helm charts or running tests in Kubernetes, as it mimics a k8s environment pretty well. It also does a good job of mimicking an OpenShift environment. OpenShift is a Kubernetes distribution developed by Red Hat: k8s with more security restrictions by default, a built-in UI, higher resource requirements and a closed-source code base (there is an open-source version called OKD which mimics it almost perfectly).

OpenShift uses its own CLI tool (oc) to perform all operations (even though we can still use regular kubectl) and comes with an operator store, which is basically a store for installing plugins in a cluster. One of these operators is specifically for virtualization, OpenShift Virtualization, meaning we can run virtual machines inside a cluster and define them with YAML files. This is pretty cool as it follows the GitOps approach, but there are multiple questions we need to answer before going for it:

  1. How different is it from a virtual machine on another virtualization platform?
  2. How can I interact with the virtual machine?
  3. OpenShift is a paid platform, so how can I access it for development? Do I need to pay?

How different is it from a virtual machine on another virtualization platform?

The difference shouldn't be much if we are comparing to another type 1 hypervisor, as the OpenShift operator uses the KubeVirt operator as its upstream, which in the end is just a wrapper around KVM, the best virtualization platform in my opinion. KVM, although running on a host OS, accesses hardware directly, making it a type 1 hypervisor. Compared with a type 2 hypervisor, it will perform better. The main difference between this hypervisor and others is that it is open-source and built into the Linux kernel, so it is widely tested and can be easily customized. When creating a VM, if we select the VirtIO option for the drivers of the VM components, it should perform even better, as the machine components use paravirtualized drivers instead of being fully emulated.

How can I interact with the virtual machine?

We can connect to the virtual machines via VNC using virtctl, a kubectl plugin that handles all the virtualization commands: start, stop, delete, ssh, etc. If the VM also has a public IP (or a private one reachable from our machine), we can access it regularly via SSH or RDP.
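
For example, a few typical virtctl operations (a sketch; the VM name my-vm is a placeholder):

virtctl start my-vm   # boot the virtual machine
virtctl vnc my-vm     # open a VNC session to its graphical console
virtctl stop my-vm    # shut it down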

In OpenShift, there is even a UI for almost all of these operations, so it is easy to change configurations and start/stop machines through it.

OpenShift is a paid platform, so how can I access it for development? Do I need to pay?

You can get access to OpenShift via OpenShift Local, a program that you can install on your computer and that will deploy a single-node OpenShift cluster with minimal functionality. To get access to it, you just need to create a free Red Hat account and download it. The problem with OpenShift Local is that, even when idling with virtualization enabled, it requires around 20GB of RAM and a lot of disk space.

There is also minishift, but it is no longer maintained and it ships the equivalent of OpenShift 3 (the current version is 4).

The other option is going with OKD, the upstream project of OpenShift, and for this we do not need a Red Hat account. However, installing OKD is the same as installing OpenShift: it will refuse to install on a node which doesn't have the minimum amount of RAM/CPU, and it also requires some domain handling for the installation to work. The installation steps for OKD are here, as it was hard to find the best way to install it as a Single Node OpenShift (SNO) cluster.

The final option is minikube, which can install KubeVirt through its addons system; however, the addon ships quite an outdated version and I wasn't able to make it work. So, we need to make it work in minikube ourselves, as its resource requirements are very minimal and it seems like the obvious approach.

Installing kubevirt in minikube

To support virtualization, minikube needs to create a node which has the /dev/kvm device. We can check that by using minikube ssh and trying to list the device; it is present by default, so there is no need to change anything. We can then proceed with the installation of KubeVirt in our cluster, which will complete successfully.
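
For example:

minikube ssh -- ls -l /dev/kvm # should show the kvm character device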

The problem appears when trying to start a virtual machine: it won't start, and it isn't easy to debug. The error we obtain is caused by the nftables module not being present in the minikube node (and, as a side note to save some time, it also needs JSON support). So the questions are: what is nftables, and how can I add it?

Nftables is the Linux kernel packet filter and the successor of iptables. To answer the "how can I add it?" part, we need to understand what minikube really is.

What really is minikube?

Minikube is a lightweight Kubernetes distribution, meaning some configuration possibilities are not supported and many of the choices that a full k8s installation leaves open have already been made for us.

When using minikube, it spawns either a VM or a container, depending on the driver, that already has some packages present and comes with a Kubernetes cluster ready to use. So, basically, minikube ships an immutable Linux distribution with its own kernel configuration and packages.

Being immutable means that we cannot add anything to it after it is deployed, so if we want a different configuration, we first need to build an image and then specify it when creating the cluster. The second part (the easiest) is possible through the CLI flag --iso-url.

Focusing now on the first part, let's check how to build a custom minikube image.

Building a custom image for minikube

There are instructions on building the minikube ISO in the official documentation; however, I wasn't completely successful following them, so I am describing the steps I took. I used the container build as it doesn't require us to have all the tools installed on our machine.

To build the ISO, we need:

  • x86_64 CPU
  • Linux distribution
  • Docker (or podman, but needs a small trick)
  • Go
  • 4GB of RAM (I took this from the documentation; it should be enough, and it may even work with less)

Let's start by cloning the source code:

git clone --depth 1 --branch v1.36.0 https://github.com/kubernetes/minikube.git # Change the version to your minikube version for best compatibility
cd minikube

If we are using podman rather than docker, we need to change one of the lines in the Makefile that runs a docker container; otherwise this step can be skipped. We remove the flag --user and add the flag --env FORCE_UNSAFE_CONFIGURE=1 so it compiles as the root user. Additionally, if we don't have podman-docker installed or docker aliased to podman, we need to change the command from docker to podman. So, in the end we have a change similar to this (if in doubt, change all of the docker run commands, as that didn't break anything in my case):

@@ -341,8 +341,8 @@ out/minikube-%.iso: $(shell find "deploy/iso/minikube-iso" -type f)
 ifeq ($(IN_DOCKER),1)
        $(MAKE) minikube-iso-$*
 else
-       docker run --rm --workdir /mnt --volume $(CURDIR):/mnt:Z $(ISO_DOCKER_EXTRA_ARGS) \
-               --user $(shell id -u):$(shell id -g) --env HOME=/tmp --env IN_DOCKER=1 \
+       podman run --rm --workdir /mnt --volume $(CURDIR):/mnt:Z $(ISO_DOCKER_EXTRA_ARGS) \
+               --env HOME=/tmp --env IN_DOCKER=1 --env FORCE_UNSAFE_CONFIGURE=1 \
                $(ISO_BUILD_IMAGE) /bin/bash -lc '/usr/bin/make minikube-iso-$*'
 endif

Since configuring the image entirely from inside a container is not fully supported by the Makefile, we need to append the following lines at the bottom of the file; they run the menuconfig commands inside the container, preventing compatibility issues:

container-iso-menuconfig-%:
    podman run --rm -it -w /mnt -v $(CURDIR):/mnt:Z $(ISO_DOCKER_EXTRA_ARGS) \
        -e HOME=/tmp $(ISO_BUILD_IMAGE) /bin/bash -lc '/usr/bin/make iso-menuconfig-$*'

container-linux-menuconfig-%:
    podman run --rm -it --workdir /mnt -v $(CURDIR):/mnt:Z $(ISO_DOCKER_EXTRA_ARGS) \
        -e HOME=/tmp $(ISO_BUILD_IMAGE) /bin/bash -lc '/usr/bin/make linux-menuconfig-$*'

The only remaining step before starting the build is to add the package libncurses-dev to the container image. To do that, we just need to edit deploy/iso/minikube-iso/Dockerfile and insert it in the RUN step where the packages are installed.
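
The change looks roughly like this (a sketch; I am assuming the image installs its packages via apt-get, and the surrounding lines stand in for whatever the Dockerfile already lists):

@@ deploy/iso/minikube-iso/Dockerfile @@
 RUN apt-get update && apt-get install -y \
     <existing packages> \
+    libncurses-dev \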

We can now build the container image that will build the actual minikube distribution image. We only need to run this once:

GOSUMDB=sum.golang.org make buildroot-image

The next step will take a very long time (around 2 hours), as it compiles the Linux kernel and builds the minikube ISO. This is also the command we need to re-run to apply changes to the ISO, but subsequent runs take a lot less time since most things are already built. (I will use amd64, which is what I used for my test, but the steps are valid for an ARM image for an ARM minikube cluster as well; just replace amd64 with arm64 and x86_64 with aarch64 in the next commands.)

GOSUMDB=sum.golang.org make out/minikube-amd64.iso

This command creates a folder called out/buildroot and, in the out folder, an ISO file named minikube-amd64.iso, which is the ISO we will use in the cluster after we have modified it. Note that the command mentioned in the minikube documentation builds the image outside the container, which would make the previous step useless.

After the command finishes, we can start playing around with adding kernel modules and packages to the minikube distribution. But let's stop here for a moment to understand what this buildroot thing is.

Buildroot is a simple way to package a Linux system for embedded devices. It simplifies kernel compilation and package shipping through a set of makefiles and scripts. We can configure our distribution with make menuconfig, which opens a TUI where we can select all our configuration options (other interfaces, like a graphical UI, are also available), and then save them to a configuration file that will be used when building the image.

Changing kernel configuration

To change kernel build parameters, for example to enable modules, we can use the command

GOSUMDB=sum.golang.org make container-linux-menuconfig-x86_64

We then select the configuration parameters by category. In the end, we just need to save the configuration to the default file, .config. When we exit, the minikube Makefile will copy the changes into its own configuration file.

Adding/Removing an existing package

To change the packages included, we can use the command

GOSUMDB=sum.golang.org make container-iso-menuconfig-x86_64

Again, the packages are organized by category. We save the configuration to the default .config file and after exiting, once again, the Makefile from minikube will save it to the correct place (deploy/iso/minikube-iso/configs/minikube_x86_64_defconfig).

Adding a new package

For minikube, if we want to create a new package, we should put it in the deploy/iso/minikube-iso/package/<package name> folder. In this folder, following the usual buildroot package layout, we have 3 mandatory files:

  • Config.in - the configuration entry that makes the package selectable
  • <package name>.mk - the makefile describing how to fetch and build the package
  • <package name>.hash - checksums for the downloaded sources

Finally, we just need to add our new package to the configuration options of minikube in the deploy/iso/minikube-iso/Config.in file, similar to the others.
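
As an illustration, a minimal package definition could look like this (a sketch following buildroot conventions; the package name "mypkg" and its download URL are hypothetical):

deploy/iso/minikube-iso/package/mypkg/Config.in:

config BR2_PACKAGE_MYPKG
	bool "mypkg"
	help
	  Example package added to the minikube image.

deploy/iso/minikube-iso/package/mypkg/mypkg.mk:

MYPKG_VERSION = 1.0.0
MYPKG_SITE = https://example.com/downloads
MYPKG_SOURCE = mypkg-$(MYPKG_VERSION).tar.gz
MYPKG_LICENSE = MIT

$(eval $(generic-package))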


When I first tried to build the image, I didn't succeed using the make commands from minikube because they are undocumented, so I ended up setting the configuration options by hand. For the kernel configuration, the file is located in deploy/iso/minikube-iso/board/minikube/x86_64/linux_x86_64_defconfig and the options all start with CONFIG_<PACKAGE>_<OPTION>, so I had to dig into the netfilter code in the Linux kernel source and check which options were available.

There are only 3 valid values for the configuration options:

  • m - Load as module
  • n - Do not build
  • y - Built-in to kernel

Now, looking back at the result, I am still happy with setting the options by hand, as through the TUI I was only able to set nftables as a kernel module instead of having it built into the kernel: the difference is that a module needs to be loaded manually or through a configuration file, while a built-in option is always available. However, we can always manually change the values we saved in the TUI from m to y. I enabled basically every option for nftables. While this increases build time and resource consumption, it has a minimal impact on runtime performance and makes debugging easier than enabling each option individually and testing it.

To help with checking that everything is fine, I added the nftables package (which provides the CLI tool) and jansson, due to the JSON requirement I mentioned before. I haven't tried without them, as the whole process of building, deploying and testing takes a while, but they are probably not both needed.

Once we are happy with our changes, we can issue the rebuild command again, and this time the image should be built with the options we set.

Finally, we can deploy our custom image to be used in the cluster with:

minikube start --iso-url file://<path to ISO>
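
Before installing anything on top, we can quickly check that the node actually has what we added (a sketch; these commands assume the nftables CLI was included in the image):

minikube ssh -- sudo nft list tables          # fails if nftables support is missing
minikube ssh -- sudo nft --json list ruleset  # exercises the JSON support provided by jansson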

Then we can proceed with KubeVirt installation following the official steps:

export VERSION=$(curl -s https://storage.googleapis.com/kubevirt-prow/release/kubevirt/kubevirt/stable.txt)
kubectl apply -f "https://github.com/kubevirt/kubevirt/releases/download/${VERSION}/kubevirt-operator.yaml"
kubectl apply -f "https://github.com/kubevirt/kubevirt/releases/download/${VERSION}/kubevirt-cr.yaml"

To deploy an OS image for our VM, we can use the CDI operator, and we can install it with:

export TAG=$(curl -s -w %{redirect_url} https://github.com/kubevirt/containerized-data-importer/releases/latest)
export VERSION=$(echo ${TAG##*/})
kubectl apply -f https://github.com/kubevirt/containerized-data-importer/releases/download/$VERSION/cdi-operator.yaml
kubectl apply -f https://github.com/kubevirt/containerized-data-importer/releases/download/$VERSION/cdi-cr.yaml

Finally, we can upload our OS image to a DataVolume:

kubectl rollout status -n cdi deployment cdi-uploadproxy
kubectl port-forward -n cdi services/cdi-uploadproxy 8000:443 &
kubectl virt image-upload dv <VOLUME NAME> --size <SIZE OF IMAGE OR BIGGER> --image-path <PATH TO OS IMAGE> --access-mode ReadWriteOnce --uploadproxy-url https://localhost:8000 --insecure

After this, we have a DataVolume that we can attach to our VM, and the VM is able to start.
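
As a reference, a minimal VirtualMachine that boots from the uploaded DataVolume could look like this (a sketch; the VM name, memory size and volume name are placeholders):

apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  name: my-vm
spec:
  running: false
  template:
    spec:
      domain:
        devices:
          disks:
            - name: rootdisk
              disk:
                bus: virtio     # use the VirtIO drivers mentioned earlier
        resources:
          requests:
            memory: 2Gi
      volumes:
        - name: rootdisk
          dataVolume:
            name: <VOLUME NAME>

We can then start it with virtctl start my-vm (or through the UI in OpenShift).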

Virtual machines with multiple network interfaces

By default, in a Kubernetes environment, each pod gets a single network interface. However, when we have VMs, it is common for them to have multiple network interfaces, and here we can already see an issue. Luckily, someone already thought the same and we have Multus, a CNI plugin that allows a pod (and therefore a VM) to have multiple network interfaces.

The problem arises once again when trying to deploy a virtual machine with a Multus network interface, so we need to modify our minikube image again. This time, since everything is already built, it only takes a few minutes rather than hours.

Modifying the custom image

The process for adding new packages and modules is exactly the same. In this case the error is about IPv6, which is not enabled by default in minikube. We just need to enable kernel support for it, as well as ip6tables and nftables support for IPv6.

After rebuilding the image once more, everything should be set up to deploy the cluster again.
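
A quick way to confirm that IPv6 is now enabled in the node (a sketch):

minikube ssh -- cat /proc/sys/net/ipv6/conf/all/disable_ipv6   # should print 0
minikube ssh -- ip -6 addr show                                # should list IPv6 addresses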


We can follow the same steps as before to install KubeVirt and CDI. To install Multus, we just run

export TAG=$(curl -s -w %{redirect_url} https://github.com/k8snetworkplumbingwg/multus-cni/releases/latest)
export VERSION=$(echo ${TAG##*/})
kubectl apply -f https://raw.githubusercontent.com/k8snetworkplumbingwg/multus-cni/${VERSION}/deployments/multus-daemonset-thick.yml

An example of a network attachment is:

apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
  name: my-network
spec:
  config: |-
    { "cniVersion": "1.1.0", "name": "my-network", "type": "bridge", "bridge": "mybridge0", "ipam": { "type": "host-local", "ranges": [ [{ "subnet": "192.168.100.0/24", "gateway": "192.168.100.1", "rangeStart": "192.168.100.3", "rangeEnd": "192.168.100.220" }] ] } }

And in the VM spec we just add a network entry similar to:

- name: my-interface
  multus:
    networkName: <namespace>/my-network
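
Note that this entry goes under spec.template.spec.networks; for the guest to actually see the NIC, a matching interface is also needed under domain.devices.interfaces (a minimal sketch, using the bridge binding as one possible choice):

      domain:
        devices:
          interfaces:
            - name: my-interface
              bridge: {}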

Final thoughts

Now we have a complete development setup to mimic an OpenShift environment. I was able to run it with 6GiB of RAM dedicated to the cluster, far below the 20GB needed by OpenShift Local (less than a third of the usage). We also learned how to compile the Linux kernel and build an embedded Linux distribution.

In the end, we have an OpenShift-like environment which is reproducible and lightweight, making it suitable for all kinds of experiments. The main thing missing is the UI, but as a suggestion we can use Headlamp. While it is not a replacement for the OpenShift console, it makes it easier to check the cluster state and perform some operations.

Maybe it is a bit too much, but in my opinion it is completely worth it, as I got a lot of insight into how the whole minikube system works. The only thing I regret is the 2 hours of the first full build, which I will never get back, but that time can be spent doing other things ;)

Configuration files

For reference, here are the configuration options I used:

deploy/iso/minikube-iso/board/minikube/x86_64/linux_x86_64_defconfig:

...
- CONFIG_NF_CONNTRACK=m
+ CONFIG_NF_CONNTRACK=y
...
- CONFIG_IP_NF_NAT=m
+ CONFIG_IP_NF_NAT=y
...
+ # ip6tables
+ CONFIG_IP6_NF_NAT=y
+ CONFIG_IP6_NF_IPTABLES=y
+ CONFIG_IPV6=y
+ CONFIG_NF_NAT=y
+ CONFIG_NF_TABLES_IPV6=y
+ CONFIG_NETFILTER_ADVANCED=y

+ # Nftables
+ CONFIG_NF_TABLES=y
+ CONFIG_NF_TABLES_INET=y
+ CONFIG_NFT_EXTHDR=y
+ CONFIG_NFT_META=y
+ CONFIG_NFT_CT=y
+ CONFIG_NFT_RBTREE=y
+ CONFIG_NFT_HASH=y
+ CONFIG_NFT_COUNTER=y
+ CONFIG_NFT_LOG=y
+ CONFIG_NFT_LIMIT=y
+ CONFIG_NFT_MASQ=y
+ CONFIG_NFT_REDIR=y
+ CONFIG_NFT_NAT=y
+ CONFIG_NFT_QUEUE=y
+ CONFIG_NFT_REJECT=y
+ CONFIG_NFT_REJECT_INET=y
+ CONFIG_NFT_COMPAT=y
+ CONFIG_NFT_CHAIN_ROUTE_IPV4=y
+ CONFIG_NFT_REJECT_IPV4=y
+ CONFIG_NFT_CHAIN_NAT_IPV4=y
+ CONFIG_NFT_MASQ_IPV4=y
+ CONFIG_NFT_REDIR_IPV4=y
+ CONFIG_NFT_CHAIN_ROUTE_IPV6=y
+ CONFIG_NFT_REJECT_IPV6=y
+ CONFIG_NFT_CHAIN_NAT_IPV6=y
+ CONFIG_NFT_MASQ_IPV6=y
+ CONFIG_NFT_REDIR_IPV6=y
+ CONFIG_NFT_BRIDGE_META=y
+ CONFIG_NFT_BRIDGE_REJECT=y
...

deploy/iso/minikube-iso/configs/minikube_x86_64_defconfig:

...
+ BR2_PACKAGE_JANSSON=y
+ BR2_PACKAGE_NFTABLES=y
...