Running QEMU-ARM Debian Guests with libvirt
Like most embedded devices on the market, our Blickwerk sensors are ARM-based, powered by an NXP (formerly Freescale Semiconductor) i.MX283 processor. ARM cores are great for low-power devices or devices under thermal constraints, but they are not exactly powerful. Even though we use cross-compile toolchains whenever and wherever we can, we sometimes need to resort to an actual ARM environment for building libraries and applications.
Until now we've used actual hardware for this, namely an ODROID-XU4 and a Banana Pi BPI-M3, because they have relatively large amounts of RAM and enough processing power to complete builds in a reasonable amount of time. Each of these SBCs runs a Buildbot worker.
With the release of Debian Buster we wanted to revisit the potential of ARM virtualization with QEMU. Peter Maydell wrote an excellent article on QEMU's virt board back in 2016. Unlike the often-used versatilepb board, the virt board doesn't try to emulate specific ARM hardware and therefore supports large amounts of RAM and a configurable number of CPUs. The article was a great starting point for our endeavor, but one of the stumbling blocks was proper libvirt integration. libvirt is a virtualization daemon and API that provides an abstraction around common virtualization and process isolation techniques like LXC, KVM, or Xen, and we use it basically everywhere.
From QEMU arguments to libvirt XML
Peter Maydell's article closes with the following qemu-system-arm call:
qemu-system-arm -M virt -m 1024 \
-kernel vmlinuz-3.16.0-4-armmp-lpae \
-initrd initrd.img-3.16.0-4-armmp-lpae \
-append 'root=/dev/vda2' \
-drive if=none,file=hda.qcow2,format=qcow2,id=hd \
-device virtio-blk-device,drive=hd \
-netdev user,id=mynet \
-device virtio-net-device,netdev=mynet \
-nographic
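The hda.qcow2 image referenced in the call has to exist beforehand; it can be created with qemu-img (the 16G size here is an arbitrary choice of ours):

# create an empty qcow2 image to serve as the guest's disk
qemu-img create -f qcow2 hda.qcow2 16G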
The specific problem we encountered was translating the -device arguments into something that libvirt understands. virsh, the command line management tool that comes with libvirt, even has a domxml-from-native command that was designed to convert QEMU arguments into the appropriate libvirt format. Unfortunately it isn't exactly well maintained, as outlined in this mail from Cole Robinson. Our second approach was to start the machine with the known working QEMU arguments and inspect which buses and drivers are used inside the system, in order to replicate the environment in libvirt. Et voilà: the crucial hint is found in the /dev/disk/by-path/ directory, which exposes the shortest physical path to each disk in the system. In our case a simple ls /dev/disk/by-path/ reveals the following output:
platform-a003c00.virtio_mmio platform-a003e00.virtio_mmio
According to the libvirt documentation, virtio-mmio is a valid type attribute for the <address/> element that each device can carry. In contrast, libvirt generates pci address types by default, which don't seem to be supported in current versions of Debian. In the end, the only thing we needed to do to turn the default libvirt configuration for our ARM guest into something that actually works was to replace all <address type="pci" .../> lines with <address type="virtio-mmio"/>. This worked for both drives and network devices; a sketch of the substitution follows below.
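A minimal sketch of that substitution, assuming the domain is named usain as in our configuration below; the sed pattern is our own and should be double-checked against the addresses libvirt actually generated:

# dump the generated domain definition, rewrite the address types, redefine
virsh dumpxml usain > usain.xml
sed -i -E "s|<address type='pci'[^/]*/>|<address type='virtio-mmio'/>|g" usain.xml
virsh define usain.xml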
This is what our working libvirt configuration looks like:
<domain type='qemu' id='38'>
  <name>usain</name>
  <uuid>3bf6e58f-e513-47b4-9b64-e00b32d9d9f4</uuid>
  <memory unit='KiB'>3145728</memory>
  <currentMemory unit='KiB'>3145728</currentMemory>
  <vcpu placement='static'>3</vcpu>
  <resource>
    <partition>/machine</partition>
  </resource>
  <os>
    <type arch='armv7l' machine='virt-3.1'>hvm</type>
    <kernel>/var/lib/libvirt/boot/usain/vmlinuz</kernel>
    <initrd>/var/lib/libvirt/boot/usain/initrd.img</initrd>
    <cmdline>root=UUID=7a7f1855-2536-4342-a481-4853a125560f</cmdline>
    <boot dev='hd'/>
  </os>
  <features>
    <gic version='2'/>
  </features>
  <clock offset='utc'/>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>restart</on_crash>
  <devices>
    <emulator>/usr/bin/qemu-system-arm</emulator>
    <disk type='block' device='disk'>
      <driver name='qemu' type='raw'/>
      <source dev='/dev/lvm-uhura/usain-boot'/>
      <backingStore/>
      <target dev='vda' bus='virtio'/>
      <alias name='virtio-disk0'/>
      <address type='virtio-mmio'/>
    </disk>
    <disk type='block' device='disk'>
      <driver name='qemu' type='raw'/>
      <source dev='/dev/lvm-uhura/usain-root'/>
      <backingStore/>
      <target dev='vdb' bus='virtio'/>
      <alias name='virtio-disk1'/>
      <address type='virtio-mmio'/>
    </disk>
    <controller type='pci' index='0' model='pcie-root'>
      <alias name='pcie.0'/>
    </controller>
    <interface type='bridge'>
      <mac address='52:54:00:79:39:16'/>
      <source bridge='br-virt'/>
      <target dev='vnet0'/>
      <model type='virtio'/>
      <alias name='net0'/>
      <address type='virtio-mmio'/>
    </interface>
    <serial type='pty'>
      <source path='/dev/pts/1'/>
      <target type='system-serial' port='0'>
        <model name='pl011'/>
      </target>
      <alias name='serial0'/>
    </serial>
    <console type='pty' tty='/dev/pts/1'>
      <source path='/dev/pts/1'/>
      <target type='serial' port='0'/>
      <alias name='serial0'/>
    </console>
  </devices>
  <seclabel type='dynamic' model='apparmor' relabel='yes'>
    <label>libvirt-3bf6e58f-e513-47b4-9b64-e00b32d9d9f4</label>
    <imagelabel>libvirt-3bf6e58f-e513-47b4-9b64-e00b32d9d9f4</imagelabel>
  </seclabel>
  <seclabel type='dynamic' model='dac' relabel='yes'>
    <label>+64055:+64055</label>
    <imagelabel>+64055:+64055</imagelabel>
  </seclabel>
</domain>
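Once the XML above is defined, the guest starts like any other libvirt domain, and the pl011 serial console is reachable through virsh (leave the console with Ctrl+]):

# boot the guest and attach to its serial console
virsh start usain
virsh console usain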
Automatic Kernel Updates
Another complication with ARM virtualization is that we currently have to use the direct kernel boot feature to start the system. libvirt supports direct kernel boot out of the box, but the kernel and initrd must be accessible from within the virtualization host's filesystem. This boils down to a simple problem: the kernel and initrd are updated inside the virtual machine but booted from the outside. So whenever either one is updated, we would have to mount the boot partition and copy both files to a location libvirt can access.
Fortunately libvirt also supports hooks that are executed whenever a qemu guest is started. We usually use LVM as the storage backend for libvirt, and each ARM guest has a $name-boot and a $name-root volume associated with it. So whenever we start an ARM guest we can automatically mount the appropriate LVM volume, copy the kernel and initrd, and let libvirt handle the rest.
This has worked great so far and lightens our maintenance burden when it comes to automatic system updates of our ARM guests. This is the hook we use for that:
#!/bin/sh

set -eu

GUEST="$1"
ACTION="$2"
PHASE="$3"

BOOT_IMAGE_BASE_PATH=/var/lib/libvirt/boot

_is_mounted() {
    grep -qwF "$1" /proc/mounts
}

_is_host_running() {
    # calling virsh domstate here will cause the process to hang so we use ps instead
    ps --no-headers -u libvirt-qemu -o cmd | grep -q -- "-name guest=$1"
}

_get_boot_volume() {
    local guest="$1"
    local configured_volume
    local dm_path
    # looks for a volume whose name ends with "-boot"
    configured_volume="$(
        xmllint --xpath 'string(/domain/devices/disk[@type="block"]/source[substring(@dev, string-length(@dev) - string-length("-boot") + 1) = "-boot"]/@dev)' \
            "/etc/libvirt/qemu/$guest.xml"
    )"
    # the configured volume might contain any path that refers to a volume but /proc/mounts
    # will contain paths from /dev/mapper so we need to find the path of the actual device
    # and then find the corresponding symlink in /dev/mapper
    dm_path="$(realpath "$configured_volume")"
    find /dev/mapper -type l | while read -r mapper_path; do
        if [ "$(readlink -f "$mapper_path")" = "$dm_path" ]; then
            echo "$mapper_path"
            break
        fi
    done
}

update_guest_kernel_and_initrd() {
    # ARM guests cannot be booted like x86_64 guests.
    # Instead we need libvirt to boot the kernel directly along with the guest's generated initrd.
    # We update the kernel and initrd on every guest startup, so that a system update will behave
    # as expected on the next reboot.
    local guest="$1"
    local boot_image_path="$BOOT_IMAGE_BASE_PATH/$guest"
    local boot_volume
    local tmp_mount_path
    boot_volume="${2:-$(_get_boot_volume "$guest")}"
    if [ -z "$boot_volume" ]; then
        echo "Boot volume for guest $guest not found." >&2
        return 1
    fi
    if [ ! -e "$boot_volume" ]; then
        echo "Boot volume '$boot_volume' for guest $guest does not exist. Cannot extract kernel and initrd." >&2
        return 1
    fi
    if _is_host_running "$guest"; then
        # this should not happen, but maybe someone is calling this script manually
        echo "Guest $guest is not shut down. Refusing to mount volumes." >&2
        return 1
    fi
    if _is_mounted "$boot_volume"; then
        echo "Boot volume '$boot_volume' is already mounted in the system. Mounting the volume twice may cause data loss." >&2
        return 1
    fi
    mkdir -p --mode 750 "$boot_image_path"
    chgrp libvirt-qemu "$boot_image_path"
    tmp_mount_path="$(mktemp -d)"
    trap "umount '$tmp_mount_path'; rmdir '$tmp_mount_path'" EXIT
    mount -o ro "$boot_volume" "$tmp_mount_path"
    cp "$tmp_mount_path/vmlinuz" "$tmp_mount_path/initrd.img" "$boot_image_path/"
}

if [ "$ACTION" = prepare ] && [ "$PHASE" = begin ]; then
    # kernel and initrd of guests that use ARM emulation should be updated before they are started
    if grep -qwF /usr/bin/qemu-system-arm "/etc/libvirt/qemu/$GUEST.xml"; then
        update_guest_kernel_and_initrd "$GUEST"
    fi
fi
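libvirt only picks up hook scripts from a fixed path and after a daemon restart, so installing the hook boils down to the following (the source filename qemu-hook.sh is just an example):

# libvirt runs /etc/libvirt/hooks/qemu for every qemu guest lifecycle event
install -D -m 755 qemu-hook.sh /etc/libvirt/hooks/qemu
systemctl restart libvirtd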
Conclusion
ARM virtualization works great on Debian Buster, and even though the performance is not on par with actual ARM hardware, it's fast enough to support our use cases. Virtualized environments are much easier to scale, manage, and replicate than the hardware boards we used before, which often rely on ancient kernels due to non-free firmware.