Intro
microvm.nix is a Flake for running lightweight NixOS virtual machines on NixOS. Starting with the reasons why in the remainder of this chapter, this handbook guides you through provisioning MicroVMs on your NixOS machine.
Compartmentalization
NixOS makes running services a breeze. Being able to quickly roll back configuration is a life-saver. That is less true, however, on systems that are shared by multiple services, where maintenance of one affects the others.
Increase stability by partitioning services into virtual NixOS systems that can be updated individually.
microvm.nix can isolate your /nix/store down to exactly what is required for the guest's NixOS: the root filesystem is a read-only erofs/squashfs filesystem that includes only the binaries of your configuration. Of course, that only holds true until you mount the host's /nix/store as a share for faster build times, or mount the store with a writable overlay for Nix builds inside the VM.
The Case Against Containers
Linux containers are not a single technology but a plethora of kernel features that serve to isolate various system resources so that the running system appears as one. It is still one shared Linux kernel with a huge attack surface.
Virtual machines on the other hand run their own OS kernel, reducing the attack surface to the hypervisor and its device drivers. The resource usage however incurs some overhead when compared with containers, with memory allocation being especially inflexible.
microvm.nix is a tool that helps you build the guest's OS and run it in ways that are easier than writing a Dockerfile, once you know how to put a NixOS config into a flake.nix file.
Just Virtual Machines?
Full virtualization has been available for a long time with QEMU and VirtualBox. The MicroVM machine type highlights that virtualization overhead has been reduced a lot by replacing emulated devices with virtio interfaces that have been optimized for this environment.
This Flake lets you run your MicroVMs not only on QEMU but also on other hypervisors that have been written explicitly for virtio. Some of them are written in Rust, a programming language that is renowned for being safer than C.
Declaring NixOS MicroVMs
microvm.nix creates virtual machine disk images and runner script packages for the entries of the nixosConfigurations section of a flake.nix file.
The microvm module

To add MicroVM functionality, a NixOS system configuration is augmented by importing this flake's nixosModules.microvm:
# Example flake.nix
{
  inputs.nixpkgs.url = "github:nixos/nixpkgs/nixos-unstable";
  inputs.microvm.url = "github:astro/microvm.nix";
  inputs.microvm.inputs.nixpkgs.follows = "nixpkgs";

  outputs = { self, nixpkgs, microvm }: {
    # Example nixosConfigurations entry
    nixosConfigurations.my-microvm = nixpkgs.lib.nixosSystem {
      system = "x86_64-linux";
      modules = [
        # Include the microvm module
        microvm.nixosModules.microvm

        # Add more modules here
        {
          networking.hostName = "my-microvm";
          microvm.hypervisor = "cloud-hypervisor";
        }
      ];
    };
  };
}
To get you started quickly, a Flake template is included. Run nix flake init -t github:astro/microvm.nix in a new project directory.
Configuration options
By including the microvm module, a set of NixOS options is made available for customization. These are the most important ones:
Option | Purpose |
---|---|
microvm.hypervisor | Hypervisor to use by default in microvm.declaredRunner |
microvm.vcpu | Number of Virtual CPU cores |
microvm.mem | RAM allocation in MB |
microvm.interfaces | Network interfaces |
microvm.volumes | Block device images |
microvm.shares | Shared filesystem directories |
microvm.devices | PCI/USB devices for host-to-vm passthrough |
microvm.socket | Control socket for the Hypervisor so that a MicroVM can be shut down cleanly |
microvm.user | (qemu only) User account which Qemu will switch to when started as root |
microvm.forwardPorts | (qemu user-networking only) TCP/UDP port forwarding |
microvm.kernelParams | Like boot.kernelParams but will not end up in system.build.toplevel, saving you rebuilds |
microvm.storeOnDisk | Enables the store on the boot squashfs even in the presence of a share with the host's /nix/store |
microvm.writableStoreOverlay | Optional string of the path where all writes to /nix/store should go |
See the options declarations for a full reference.
Network interfaces
Declare a MicroVM's virtual network interfaces like this in its NixOS configuration:
{
  microvm.interfaces = [ {
    type = "tap";

    # interface name on the host
    id = "vm-a1";

    # Ethernet address of the MicroVM's interface, not the host's
    #
    # Locally administered addresses have one of 2/6/A/E as the second nibble.
    mac = "02:00:00:00:00:01";
  } ];
}
type = "user"
User-mode networking is provided only by qemu and kvmtool. It gives your MicroVM outgoing connectivity without any further setup.
As kvmtool seems to lack a built-in DHCP server, additional static IP configuration is necessary inside the MicroVM.
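For kvmtool, a minimal sketch of such a static guest configuration could look like this; the addresses are placeholders that you must adjust to the subnet provided by the hypervisor's user-mode network:

# Static addressing inside the MicroVM (placeholder addresses)
{
  networking.useDHCP = false;
  systemd.network.enable = true;
  systemd.network.networks."10-user" = {
    matchConfig.Type = "ether";
    networkConfig = {
      Address = [ "192.168.33.2/24" ];
      Gateway = "192.168.33.1";
      DNS = [ "192.168.33.1" ];
    };
  };
}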
type = "tap"
Use a virtual tuntap Ethernet interface. Its name is the value of id.
Some Hypervisors may be able to automatically create these interfaces when running as root, which we advise against. Instead, create the interfaces before starting a MicroVM:
sudo ip tuntap add $IFACE_NAME mode tap user $USER
Note: add multi_queue to this command line if the VM is configured with more than one CPU core.
When running MicroVMs through the host module, the tap network interfaces are created through a systemd service dependency.
type = "macvtap"
MACVTAP interfaces attach to a host's physical network interface, joining the same Ethernet segment with a separate MAC address.
Before running a MicroVM interactively from a package, do the following steps manually:
# Parent interface:
LINK=eth0
# MACVTAP interface, as specified under microvm.interfaces.*.id:
ID=microvm1
# Create the interface
sudo ip l add link $LINK name $ID type macvtap mode bridge
# Obtain the interface index number
IFINDEX=$(cat /sys/class/net/$ID/ifindex)
# Grant yourself permission
sudo chown $USER /dev/tap$IFINDEX
When running MicroVMs through the host module, the macvtap network interfaces are created through a systemd service dependency. For each interface with type = "macvtap", a link attribute with the parent interface and a mode attribute for the MACVTAP filtering mode must be specified.
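As a sketch, such an interface declaration could look like this; the parent interface eno1 is an assumption, pick your host's actual NIC:

{
  microvm.interfaces = [ {
    type = "macvtap";
    # interface name on the host
    id = "vm-a1";
    mac = "02:00:00:00:00:01";
    macvtap = {
      # parent interface on the host
      link = "eno1";
      # MACVTAP filtering mode
      mode = "bridge";
    };
  } ];
}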
type = "bridge"
This mode lets qemu create a tap interface and attach it to a bridge.
The qemu-bridge-helper binary needs to be set up with the proper permissions; see the host module for that. qemu will be run without -sandbox on in order for this contraption to work.
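A sketch of the corresponding interface declaration, assuming a host bridge named br0 as in the networking examples later in this handbook:

{
  microvm.hypervisor = "qemu";
  microvm.interfaces = [ {
    type = "bridge";
    # interface name on the host
    id = "vm-a1";
    mac = "02:00:00:00:00:01";
    # the host bridge to attach the tap interface to
    bridge = "br0";
  } ];
}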
Shares
In microvm.shares elements the proto field allows either of two values:

- 9p (default) is built into many hypervisors, allowing you to quickly share a directory tree
- virtiofs requires a separate virtiofsd service, which is started as a prerequisite only when you start MicroVMs through a systemd service that comes with the microvm.nixosModules.host module.

Expect virtiofs to yield better performance over 9p.
Note: if a virtiofs share's source directory lives on ZFS, create the dataset with -o xattr=sa -o acltype=posixacl so that extended attributes and POSIX ACLs work across the share.
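As a sketch, a virtiofs share could be declared like this; the tag, socket, and paths are arbitrary examples:

{
  microvm.shares = [ {
    proto = "virtiofs";
    tag = "home";
    # directory on the host
    source = "/var/lib/microvms/my-microvm/home";
    # mount point inside the MicroVM
    mountPoint = "/home";
    socket = "home.sock";
  } ];
}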
Sharing a host's /nix/store
If a share with source = "/nix/store" is defined, the size and build time of the stage1 squashfs for /dev/vda will be reduced drastically.
microvm.shares = [ {
  tag = "ro-store";
  source = "/nix/store";
  mountPoint = "/nix/.ro-store";
} ];
Writable /nix/store overlay

An optional writable layer will be mounted if the path microvm.writableStoreOverlay is set. Make sure that the path is located on a writable filesystem.
Caveat: The Linux overlay filesystem is very picky about the filesystems that can be the upper (writable) layer. 9p/virtiofs shares don't work currently, so resort to using a volume for that:
{ config, ... }:
{
  microvm.writableStoreOverlay = "/nix/.rw-store";

  microvm.volumes = [ {
    image = "nix-store-overlay.img";
    mountPoint = config.microvm.writableStoreOverlay;
    size = 2048;
  } ];
}
CPU emulation
It's possible to emulate a CPU if desired. This feature is only supported by the qemu hypervisor.
Note: this feature has a significant performance impact.
Defining an emulated NixOS system
Call nixpkgs.lib.nixosSystem with the following key settings:

- Set the system attribute to the host system.
- Add a module that sets nixpkgs.crossSystem.config to the guest system. This lets microvm.nix know that it's a cross-system environment.
- Set microvm.hypervisor to qemu, given this is the only hypervisor that supports this feature.
- Set microvm.cpu to the desired emulated CPU. You can find a list of the available systems here.
# Example flake.nix
{
  inputs = {
    nixpkgs.url = "github:nixos/nixpkgs/nixos-unstable";
    microvm = {
      url = "github:astro/microvm.nix";
      inputs.nixpkgs.follows = "nixpkgs";
    };
  };

  outputs = { self, nixpkgs, microvm }: {
    emulated-dev = nixpkgs.lib.nixosSystem {
      # host system
      system = "x86_64-linux";
      modules = let
        guestSystem = "aarch64-unknown-linux-gnu";
        # you can use packages in the guest machine with cross-system configuration
        pkgs = import nixpkgs {
          system = "x86_64-linux";
          crossSystem.config = guestSystem;
        };
      in [
        { nixpkgs.crossSystem.config = guestSystem; }
        microvm.nixosModules.microvm
        {
          microvm = {
            # you can choose what CPU will be emulated by qemu
            cpu = "cortex-a53";
            hypervisor = "qemu";
          };
          environment.systemPackages = with pkgs; [ cowsay htop ];
          services.getty.autologinUser = "root";
          system.stateVersion = "23.11";
        }
      ];
    };
  };
}
You can run the example with nix run .#emulated-dev.config.microvm.declaredRunner.

As shown in this example, you can use system packages on the guest system by using nixpkgs with a proper crossSystem configuration.
MicroVM output options
Hypervisor runners are provided in the config generated by a nixosSystem for you to use inside and outside your configuration.
Option | Purpose |
---|---|
microvm.declaredRunner | Runner package selected according to microvm.hypervisor |
microvm.runners | Attribute set of runner packages per known Hypervisor. |
The microvm.declaredRunner selects the hypervisor according to the configured microvm.hypervisor.

nix run .#nixosConfigurations.my-microvm.config.microvm.declaredRunner

The microvm.runners option provides a runner for each known Hypervisor regardless of the microvm.hypervisor config setting. To build my-microvm for Firecracker, for example:
nix run .#nixosConfigurations.my-microvm.config.microvm.runners.firecracker
Configure microvm.hypervisor, use microvm.declaredRunner!

One of the microvm.runners is picked by microvm.declaredRunner by evaluating microvm.hypervisor.

You may switch the Hypervisor quickly, but use declaredRunner in production. Any other NixOS configuration that evaluates the microvm.hypervisor option can be wrong when you pick from microvm.runners directly. One example would be the defaults set by microvm.optimize.
microvm.balloonMem
Amount of balloon memory in megabytes
The way virtio-balloon works is that this is the memory size that the host can request to be freed by the VM. Initial booting of the VM allocates mem+balloonMem megabytes of RAM.
Type: signed integer
Default:
0
Declared by:
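For illustration (the values below are arbitrary): with this sketch the MicroVM boots with 1024 MB of RAM, of which the host may later reclaim up to 512 MB through the balloon device.

{
  microvm.mem = 512;
  microvm.balloonMem = 512;
}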
microvm.cloud-hypervisor.extraArgs
Extra arguments to pass to cloud-hypervisor.
Type: list of string
Default:
[ ]
Declared by:
microvm.cpu
What CPU to emulate, if any. If different from the host architecture, it will have a serious performance hit.
Note: Only supported with qemu.
Type: null or string
Default:
null
Declared by:
microvm.crosvm.extraArgs
Extra arguments to pass to crosvm.
Type: list of string
Default:
[ ]
Declared by:
microvm.crosvm.pivotRoot
A Hypervisor’s sandbox directory
Type: null or string
Default:
null
Declared by:
microvm.declaredRunner
Generated Hypervisor declared by config.microvm.hypervisor
Type: package
Default:
"config.microvm.runner.${config.microvm.hypervisor}"
Declared by:
microvm.devices
PCI/USB devices that are passed from the host to the MicroVM
Type: list of (submodule)
Default:
[ ]
Example:
[ {
bus = "pci";
path = "0000:01:00.0";
} {
bus = "pci";
path = "0000:01:01.0";
} {
# QEMU only
bus = "usb";
path = "vendorid=0xabcd,productid=0x0123";
} ]
Declared by:
microvm.devices.*.bus
Device is either on the pci
or the usb
bus
Type: one of “pci”, “usb”
Declared by:
microvm.devices.*.path
Identification of the device on its bus
Type: string
Declared by:
microvm.forwardPorts
When using the SLiRP user networking (default), this option allows forwarding ports to/from the host/guest.
Warning: If the NixOS firewall on the virtual machine is enabled, you also have to open the guest ports to enable the traffic between host and guest.
Note: Currently QEMU supports only IPv4 forwarding.
Type: list of (submodule)
Default:
[ ]
Example:
[ # forward local port 2222 -> 22, to ssh into the VM
{ from = "host"; host.port = 2222; guest.port = 22; }
# forward local port 80 -> 10.0.2.10:80 in the VLAN
{ from = "guest";
guest.address = "10.0.2.10"; guest.port = 80;
host.address = "127.0.0.1"; host.port = 80;
}
]
Declared by:
microvm.forwardPorts.*.from
Controls the direction in which the ports are mapped:

- “host” means traffic from the host ports is forwarded to the given guest port.
- “guest” means traffic from the guest ports is forwarded to the given host port.
Type: one of “host”, “guest”
Default:
"host"
Declared by:
microvm.forwardPorts.*.guest.address
The IPv4 address on the guest VLAN.
Type: string
Default:
""
Declared by:
microvm.forwardPorts.*.guest.port
The guest port to be mapped.
Type: 16 bit unsigned integer; between 0 and 65535 (both inclusive)
Declared by:
microvm.forwardPorts.*.host.address
The IPv4 address of the host.
Type: string
Default:
""
Declared by:
microvm.forwardPorts.*.host.port
The host port to be mapped.
Type: 16 bit unsigned integer; between 0 and 65535 (both inclusive)
Declared by:
microvm.forwardPorts.*.proto
The protocol to forward.
Type: one of “tcp”, “udp”
Default:
"tcp"
Declared by:
microvm.graphics.enable
Enable GUI support.
MicroVMs with graphics are intended for the interactive use-case. They cannot be started through systemd jobs.
Support in Hypervisors:
qemu
starts a Gtk window with the framebuffer of the virtio-gpu
Type: boolean
Default:
false
Declared by:
microvm.graphics.socket
Path of vhost-user socket
Type: string
Default:
"$HOSTNAME-gpu.sock"
Declared by:
microvm.guest.enable
Whether to enable the microvm.nix guest module at all.
Type: boolean
Default:
true
Declared by:
microvm.hugepageMem
Whether to use hugepages as memory backend. (Currently only respected if using cloud-hypervisor)
Type: boolean
Default:
false
Declared by:
microvm.hypervisor
Which hypervisor to use for this MicroVM
Choose one of: qemu, cloud-hypervisor, firecracker, crosvm, kvmtool, stratovirt, alioth
Type: one of “qemu”, “cloud-hypervisor”, “firecracker”, “crosvm”, “kvmtool”, “stratovirt”, “alioth”
Default:
"qemu"
Declared by:
microvm.initrdPath
Path to the initrd file in the initrd package
Type: path
Default:
"${config.system.build.initialRamdisk}/${config.system.boot.loader.initrdFile}"
Declared by:
microvm.interfaces
Network interfaces
Type: list of (submodule)
Default:
[ ]
Declared by:
microvm.interfaces.*.bridge
Attach network interface to host bridge interface for type = “bridge”
Type: null or string
Default:
null
Declared by:
microvm.interfaces.*.id
Interface name on the host
Type: string
Declared by:
microvm.interfaces.*.mac
MAC address of the guest’s network interface
Type: string
Declared by:
microvm.interfaces.*.macvtap.link
Attach network interface to host interface for type = “macvtap”
Type: null or string
Default:
null
Declared by:
microvm.interfaces.*.macvtap.mode
The MACVLAN mode to use
Type: null or one of “private”, “vepa”, “bridge”, “passthru”, “source”
Default:
null
Declared by:
microvm.interfaces.*.type
Interface type
Type: one of “user”, “tap”, “macvtap”, “bridge”
Declared by:
microvm.kernel
Kernel package to use for MicroVM runners. Better set boot.kernelPackages instead.
Type: package
Default:
"${config.boot.kernelPackages.kernel}"
Declared by:
microvm.kernelParams
Includes boot.kernelParams but doesn’t end up in toplevel, thereby allowing references to toplevel
Type: list of string
Declared by:
microvm.mem
Amount of RAM in megabytes
Type: signed integer
Default:
512
Declared by:
microvm.preStart
Commands to run before starting the hypervisor
Type: strings concatenated with “\n”
Default:
""
Declared by:
microvm.prettyProcnames
Set a recognizable process name right before executing the Hypervisor.
Type: boolean
Default:
true
Declared by:
microvm.qemu.extraArgs
Extra arguments to pass to qemu.
Type: list of string
Default:
[ ]
Declared by:
microvm.qemu.machine
QEMU machine model, eg. microvm or q35. Get a full list with qemu-system-x86_64 -M help.

This has a default declared with lib.mkDefault because it depends on ${pkgs.system}.
Type: string
Declared by:
microvm.qemu.machineOpts
Overwrite the default machine model options.
Type: null or (attribute set of string)
Default:
null
Declared by:
microvm.qemu.serialConsole
Whether to enable the virtual serial console on qemu.
Type: boolean
Default:
true
Declared by:
microvm.runner
Generated Hypervisor runner for this NixOS
Type: attribute set of package
Declared by:
microvm.shares
Shared directory trees
Type: list of (submodule)
Default:
[ ]
Declared by:
microvm.shares.*.mountPoint
Where to mount the share inside the container
Type: path
Declared by:
microvm.shares.*.proto
Protocol for this share
Type: one of “9p”, “virtiofs”
Default:
"9p"
Declared by:
microvm.shares.*.securityModel
What security model to use for the shared directory
Type: one of “passthrough”, “none”, “mapped”, “mapped-file”
Default:
"none"
Declared by:
microvm.shares.*.socket
Socket for communication with virtiofs daemon
Type: null or string
Default:
null
Declared by:
microvm.shares.*.source
Path to shared directory tree
Type: non-empty string
Declared by:
microvm.shares.*.tag
Unique virtiofs daemon tag
Type: string
Declared by:
microvm.socket
Hypervisor control socket path
Type: null or string
Default:
"${hostName}.sock"
Declared by:
microvm.storeOnDisk
Whether to boot with the storeDisk, that is, unless the host’s /nix/store is a microvm.share.
Type: boolean
Default:
true
Declared by:
microvm.user
User to switch to when started as root
Type: null or string
Default:
null
Declared by:
microvm.vcpu
Number of virtual CPU cores
Type: signed integer
Default:
1
Declared by:
microvm.volumes
Disk images
Type: list of (submodule)
Default:
[ ]
Declared by:
microvm.volumes.*.autoCreate
Created image on host automatically before start?
Type: boolean
Default:
true
Declared by:
microvm.volumes.*.direct
Whether to set O_DIRECT on the disk. (Currently only respected if using cloud-hypervisor)
Type: null or boolean
Default:
null
Declared by:
microvm.volumes.*.fsType
Filesystem for automatic creation and mounting
Type: string
Default:
"ext4"
Declared by:
microvm.volumes.*.image
Path to disk image on the host
Type: string
Declared by:
microvm.volumes.*.label
Label of the volume, if any. Only applicable if autoCreate is true; otherwise labeling of the volume must be done manually.
Type: null or string
Default:
null
Declared by:
microvm.volumes.*.mkfsExtraArgs
Set extra Filesystem creation parameters
Type: list of string
Default:
[ ]
Declared by:
microvm.volumes.*.mountPoint
If and where to mount the volume inside the container
Type: null or path
Declared by:
microvm.volumes.*.serial
User-configured serial number for the disk (Currently only respected if using cloud-hypervisor)
Type: null or string
Default:
null
Declared by:
microvm.volumes.*.size
Volume size if created automatically
Type: signed integer
Declared by:
microvm.vsock.cid
Virtual Machine address; setting it enables AF_VSOCK
The following are reserved:
- 0: Hypervisor
- 1: Loopback
- 2: Host
Type: null or signed integer
Default:
null
Declared by:
microvm.writableStoreOverlay
Path to the writable /nix/store overlay.
If set to a filesystem path, the initrd will mount /nix/store as an overlay filesystem consisting of the read-only part as a host share or from the built storeDisk, and this configuration option as the writable overlay part. This allows you to build nix derivations inside the VM.
Make sure that the path points to a writable filesystem (tmpfs, volume, or share).
Type: null or string
Default:
null
Example:
"/nix/.rw-store"
Declared by:
Running a MicroVM as a package
Quickly running a MicroVM interactively is great for testing. You get to interact with its console.

There are drawbacks: no preparation of TAP network interfaces is done and no virtiofsd is started. These can be worked around by relying on 9p shares and using qemu's user-mode networking.
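A minimal sketch of a configuration suited to such interactive runs, combining a 9p store share with qemu's user-mode networking so that neither TAP setup nor virtiofsd is required (interface id and MAC are arbitrary examples):

{
  microvm.hypervisor = "qemu";
  microvm.interfaces = [ {
    type = "user";
    id = "user0";
    mac = "02:00:00:00:00:02";
  } ];
  microvm.shares = [ {
    proto = "9p";
    tag = "ro-store";
    source = "/nix/store";
    mountPoint = "/nix/.ro-store";
  } ];
}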
Immediately running a nixosConfiguration
To run a nixosConfiguration off your Flake directly, use:
nix run .#nixosConfigurations.my-microvm.config.microvm.declaredRunner
Add a runner package to your Flake
To add this runner permanently, add a package like this to the outputs of your flake.nix:
packages.x86_64-linux.my-microvm = self.nixosConfigurations.my-microvm.config.microvm.declaredRunner;
You can then run the MicroVM with a simple nix run .#my-microvm
Preparing a NixOS host for declarative MicroVMs
microvm.nix adds the following configuration for servers to host MicroVMs reliably:
- a /var/lib/microvms state directory with one subdirectory per MicroVM
- systemd services microvm-tap-interfaces@ to set up TAP network interfaces
- systemd services microvm-virtiofsd@ to start virtiofsd instances
- systemd services microvm@ to start a MicroVM
- configuration options to declaratively build MicroVMs with the host system
- tools to manage MicroVMs imperatively
Prepare your host by including the microvm.nix host nixosModule:
# Your server's flake.nix
{
  inputs.nixpkgs.url = "github:nixos/nixpkgs/nixos-unstable";
  inputs.microvm.url = "github:astro/microvm.nix";
  inputs.microvm.inputs.nixpkgs.follows = "nixpkgs";

  outputs = { self, nixpkgs, microvm }: {
    # Example nixosConfigurations entry
    nixosConfigurations.server1 = nixpkgs.lib.nixosSystem {
      system = "x86_64-linux";
      modules = [
        # Include the microvm host module
        microvm.nixosModules.host

        # Add more modules here
        {
          networking.hostName = "server1";

          # try to automatically start these MicroVMs on bootup
          microvm.autostart = [
            "my-microvm"
            "your-microvm"
            "their-microvm"
          ];
        }
      ];
    };
  };
}
Preparing a non-Flakes host
If you really cannot migrate to Flakes easily, just import the host module directly in your NixOS configuration:
imports = [ (builtins.fetchGit {
url = "https://github.com/astro/microvm.nix";
} + "/nixos-modules/host") ];
A simple network setup
While networking infrastructure is out of scope for the microvm.nix flake, here is some guidance for providing the MicroVMs on your NixOS machine with internet access.
Use this for your local LAN where IP addresses are free and plentiful. If not, head over to the advanced networking page.
Because we already use systemd for MicroVM startup, let's pick systemd-networkd:
networking.useNetworkd = true;
A bridge to link TAP interfaces
To make your MicroVM reachable, the host will place its Ethernet port (eno1) on a bridge (br0). This bridge will have the MicroVM's TAP interface attached to it, directly placing the MicroVM on your local network.

Note that the addresses provided below are examples and you must adjust them for your network settings. Also note that eno1 must be attached to the bridge together with the vm-* TAP interfaces that you will specify in the MicroVM definition.
systemd.network.enable = true;
systemd.network.networks."10-lan" = {
  matchConfig.Name = ["eno1" "vm-*"];
  networkConfig = {
    Bridge = "br0";
  };
};
systemd.network.netdevs."br0" = {
  netdevConfig = {
    Name = "br0";
    Kind = "bridge";
  };
};
systemd.network.networks."10-lan-bridge" = {
  matchConfig.Name = "br0";
  networkConfig = {
    Address = ["192.168.1.2/24" "2001:db8::a/64"];
    Gateway = "192.168.1.1";
    DNS = ["192.168.1.1"];
    IPv6AcceptRA = true;
  };
  linkConfig.RequiredForOnline = "routable";
};
Now that the host is configured, you can define a MicroVM to have a static IP address with:
microvm = {
  #...add additional MicroVM configuration here
  interfaces = [
    {
      type = "tap";
      id = "vm-test1";
      mac = "02:00:00:00:00:01";
    }
  ];
};
systemd.network.enable = true;
systemd.network.networks."20-lan" = {
  matchConfig.Type = "ether";
  networkConfig = {
    Address = ["192.168.1.3/24" "2001:db8::b/64"];
    Gateway = "192.168.1.1";
    DNS = ["192.168.1.1"];
    IPv6AcceptRA = true;
    DHCP = "no";
  };
};
Docker and systemd-network
If you use the above systemd.network bridge config and wish to run Docker containers inside your MicroVMs using virtualisation.docker, you may need to add the following snippet to stop systemd-networkd from managing the bridged veth* interfaces Docker creates for each container. Without this, network access inside the containers will be broken.
systemd.network.networks."19-docker" = {
  matchConfig.Name = "veth*";
  linkConfig = {
    Unmanaged = true;
  };
};
Advanced: Improving Performance
If you prioritize network performance over inter-VM communication on the virtual bridge, have a look into these alternatives:
- Network interfaces with type = "macvtap" are supported in microvm.nix. While they're technically tap devices, they attach to an external Ethernet port, eliminating the br0 bridge.
- Server Ethernet cards support SR-IOV: setup Virtual Function devices for PCI passthru into MicroVMs.
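As a hedged example of the latter, a Virtual Function (or any other PCI device) can be handed to a MicroVM through microvm.devices; the PCI address below is a placeholder, find yours with lspci -D:

{
  microvm.devices = [ {
    bus = "pci";
    # PCI address of the Virtual Function on the host (placeholder)
    path = "0000:01:10.0";
  } ];
}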
Advanced network setup
Renting a server in a datacenter usually gets you one IP address. You should not bridge your local VM traffic together with the physical Ethernet uplink port. Instead, setup a bridge only for the Virtual Machines, and provide them with Internet through NAT just like your plastic ADSL router at home.
A bridge to link TAP interfaces
Instead of placing MicroVMs directly on a LAN, one can also use a TAP interface to get a virtual Ethernet interface on the host. Although it is possible to assign individual IP configuration to these individual interfaces, let us avoid the additional configuration effort and create a bridge instead:
systemd.network = {
  netdevs."10-microvm".netdevConfig = {
    Kind = "bridge";
    Name = "microvm";
  };
  networks."10-microvm" = {
    matchConfig.Name = "microvm";
    networkConfig = {
      DHCPServer = true;
      IPv6SendRA = true;
    };
    addresses = [ {
      addressConfig.Address = "10.0.0.1/24";
    } {
      addressConfig.Address = "fd12:3456:789a::1/64";
    } ];
    ipv6Prefixes = [ {
      ipv6PrefixConfig.Prefix = "fd12:3456:789a::/64";
    } ];
  };
};
# Allow inbound traffic for the DHCP server
networking.firewall.allowedUDPPorts = [ 67 ];
This configuration will hand out IP addresses to clients on the bridge. In practice, it is better to leave out the DHCP server and its state by opting for declarative, versioned configuration instead.
Last, the TAP interfaces of MicroVMs shall be attached to this central bridge. Make sure your matchConfig matches just the interfaces you want!
systemd.network = {
  networks."11-microvm" = {
    matchConfig.Name = "vm-*";
    # Attach to the bridge that was configured above
    networkConfig.Bridge = "microvm";
  };
};
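If you follow the advice above and skip the DHCP server, a guest on this bridge can be addressed declaratively instead. A minimal sketch using example addresses from the prefixes configured above (the DNS entry assumes the host also runs a resolver; point it elsewhere if not):

# Inside a MicroVM's NixOS configuration
{
  systemd.network.enable = true;
  systemd.network.networks."20-lan" = {
    matchConfig.Type = "ether";
    networkConfig = {
      Address = [ "10.0.0.2/24" "fd12:3456:789a::2/64" ];
      Gateway = "10.0.0.1";
      DNS = [ "10.0.0.1" ];
      DHCP = "no";
    };
  };
}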
Provide Internet Access with NAT
IPv4 addresses are exhausted. It is a very common case that you get one public IPv4 address for your machine. The solution is to route your internal virtual machines with Network Address Translation.
You might not get a dedicated /64 IPv6 prefix to route to your MicroVMs. NAT works for this address family, too!
networking.nat = {
  enable = true;
  enableIPv6 = true;
  # Change this to the interface with upstream Internet access
  externalInterface = "eth0";
  internalInterfaces = [ "microvm" ];
};
Check out networking.nat.forwardPorts to make your MicroVM's services available to networks outside your host!
Port forwarding
Isolating your public Internet services is a great use-case for virtualization. But how does traffic get to you when your MicroVMs have private IP addresses behind NAT?
NixOS has got you covered with the networking.nat.forwardPorts option! This example forwards TCP ports 80 (HTTP) and 443 (HTTPS) to other hosts:
networking.nat = {
  enable = true;
  forwardPorts = [ {
    proto = "tcp";
    sourcePort = 80;
    destination = my-addresses.http-reverse-proxy.ip4;
  } {
    proto = "tcp";
    sourcePort = 443;
    destination = my-addresses.https-reverse-proxy.ip4;
  } ];
};
systemd services on a MicroVM host
The host nixosModule provides a few systemd services for additional bringup which is not available when running a MicroVM interactively from a package.
install-microvm-${name}.service
Creates and prepares a subdirectory under /var/lib/microvms for declarative MicroVMs according to the microvm.vms option.

If the MicroVM subdirectory under /var/lib/microvms already exists, and the MicroVM is configured to be built from a flake's nixosConfigurations, this systemd unit will be skipped. The reason for this behavior is that it is easier to update with the microvm command instead of restarting all virtual machines on a host when doing nixos-rebuild switch.
microvm-tap-interfaces@.service
Creates TAP virtual network interfaces for the user that will run MicroVMs.
microvm-macvtap-interfaces@.service
Creates MACVTAP virtual network interfaces for the user that will run MicroVMs.
microvm-pci-devices@.service
Prepares PCI devices for passthrough (VFIO).
microvm-virtiofsd@.service
Starts a fleet of virtiofsd servers, one for each virtiofs mountpoint in microvm.shares.
microvm@.service
Runs the actual MicroVM through /var/lib/microvms/%i/current/bin/microvm-run where %i is the MicroVM name.
microvms.target
Depends on the microvm@.service instances for all configured microvm.autostart.
microvm.autostart
MicroVMs to start by default.
This includes declarative config.microvm.vms as well as MicroVMs that are managed through the microvm command.
Type: list of string
Default:
[ ]
Declared by:
microvm.host.enable
Whether to enable the microvm.nix host module.
Type: boolean
Default:
true
Declared by:
microvm.host.tapScript
Commands to run after creating a tap interface. Defaults to bringing the interface up.

If you do not want the interface to be automatically created at all, just set systemd.services."microvm-tap-interfaces@%i.service".enable = false
Type: strings concatenated with “\n”
Default:
"''/nix/store/p6c92mx3kmnqcr482d16qqz67yqam72p-iproute2-6.10.0/bin/ip link set \"$id\" up"
Example:
''
# Attach tap interface to bridge br0, and bring it up
''${pkgs.iproute2}/bin/ip link set "$id" master br0 up
''
Declared by:
microvm.host.useNotifySockets
Enable if all your MicroVMs run with a Hypervisor that sends readiness notification over a VSOCK.
Danger! If one of your MicroVMs doesn’t do this, its systemd service will not start up successfully!
Type: boolean
Default:
false
Declared by:
microvm.stateDir
Directory that contains the MicroVMs
Type: path
Default:
"/var/lib/microvms"
Declared by:
microvm.virtiofsd.extraArgs
Extra command-line switch to pass to virtiofsd.
Type: list of string
Default:
[ ]
Declared by:
microvm.virtiofsd.inodeFileHandles
When to use file handles to reference inodes instead of O_PATH file descriptors (never, prefer, mandatory)
Allows you to overwrite default behavior in case you hit “too many open files” on eg. ZFS. https://gitlab.com/virtio-fs/virtiofsd/-/issues/121
Type: null or one of “never”, “prefer”, “mandatory”
Default:
null
Declared by:
microvm.virtiofsd.threadPoolSize
The number of threads virtiofsd should spawn. This option also takes the special string "`nproc`", which spawns as many threads as the host has cores.
Type: string or (unsigned integer, meaning >=0)
Default:
"`nproc`"
Declared by:
microvm.vms
The MicroVMs that shall be built declaratively with the host NixOS.
Type: attribute set of (submodule)
Default:
{ }
Declared by:
microvm.vms.<name>.autostart
Add this MicroVM to config.microvm.autostart?
Type: boolean
Default:
true
Declared by:
microvm.vms.<name>.config
A specification of the desired configuration of this MicroVM, as a NixOS module, for building without a flake.
Type: null or (Toplevel NixOS config)
Default:
null
Declared by:
microvm.vms.<name>.flake
Source flake for declarative build
Type: null or path
Default:
flakeInputs.my-infra
Declared by:
microvm.vms.<name>.nixpkgs
This option is only respected when config is specified.
The nixpkgs path to use for the MicroVM. Defaults to the host’s nixpkgs.
Type: path
Default:
pkgs.path
Declared by:
microvm.vms.<name>.pkgs
This option is only respected when config is specified.
The package set to use for the MicroVM. Must be a nixpkgs package set with the microvm overlay. Determines the system of the MicroVM.
If set to null, a new package set will be instantiated.
Type: null or unspecified value
Default:
pkgs
Declared by:
microvm.vms.<name>.restartIfChanged
Restart this MicroVM’s services if the systemd units are changed, i.e. if it has been updated by rebuilding the host.
Defaults to true for fully-declarative MicroVMs.
Type: boolean
Default:
false
Declared by:
microvm.vms.<name>.specialArgs
This option is only respected when config is specified.

A set of special arguments to be passed to NixOS modules. This will be merged into the specialArgs used to evaluate the NixOS configurations.
Type: attribute set of unspecified value
Default:
{ }
Declared by:
microvm.vms.<name>.updateFlake
Source flakeref to store for later imperative update
Type: null or string
Default:
"git+file:///home/user/my-infra"
Declared by:
Declarative MicroVMs
Provided your NixOS host includes the host nixosModule, options are declared to build a MicroVM together with the host. You can choose whether your MicroVMs should be managed in a fully-declarative way, or whether only the initial deployment should be declarative (with subsequent imperative updates using the microvm command).

microvm.nix distinguishes between fully-declarative configurations and declarative deployment by allowing you to specify either a full config or just a flake, respectively.
Fully declarative
You can create fully declarative VMs by directly defining their nixos system configuration in-place. This is very similar to how nixos-containers work if you are familiar with those.
# microvm refers to microvm.nixosModules
{ microvm, ... }: {
  imports = [ microvm.host ];

  microvm.vms = {
    my-microvm = {
      # The package set to use for the microvm. This also determines the microvm's architecture.
      # Defaults to the host system's package set if not given.
      pkgs = import nixpkgs { system = "x86_64-linux"; };

      # (Optional) A set of special arguments to be passed to the MicroVM's NixOS modules.
      #specialArgs = {};

      # The configuration for the MicroVM.
      # Multiple definitions will be merged as expected.
      config = {
        # It is highly recommended to share the host's nix-store
        # with the VMs to prevent building huge images.
        microvm.shares = [{
          source = "/nix/store";
          mountPoint = "/nix/.ro-store";
          tag = "ro-store";
          proto = "virtiofs";
        }];

        # Any other configuration for your MicroVM
        # [...]
      };
    };
  };
}
Declarative deployment
Why deployed? The per-MicroVM subdirectory under /var/lib/microvms is only created if it did not exist before. This behavior is intended to ensure the existence of MicroVMs that are critical to operation. To update them later, you will have to use the imperative microvm command.
microvm.vms = {
  my-microvm = {
    # Host build-time reference to where the MicroVM NixOS is defined
    # under nixosConfigurations
    flake = self;
    # Specify from where to let `microvm -u` update later on
    updateFlake = "git+file:///etc/nixos";
  };
};
Note that building MicroVMs with the host increases build time and closure size of the host's system.
Imperative MicroVM management with the microvm command

Compartmentalizing services in an infrastructure landscape allows us to conduct maintenance individually and without affecting unrelated MicroVMs. The microvm command helps with that.
Create a MicroVM
You can specify this MicroVM's source flake with -f. If omitted, the tool will assume git+file:///etc/nixos. The source flakeref will be kept in /var/lib/microvms/*/flake for updating the MicroVM later.
microvm -f git+https://... -c my-microvm
Enabling MicroVM autostart
Extension of the host's systemd units must happen declaratively in the host's NixOS configuration:
microvm.autostart = [
"myvm1"
"myvm2"
"myvm3"
];
Update a MicroVM
Updating does not refresh your packages but simply rebuilds the MicroVM. Use nix flake update to get new package versions.
microvm -u my-microvm
Until ways have been found to safely transfer the profile into the target /nix/store, and subsequently activate it, you must restart the MicroVM for the update to take effect.
Use the -R flag to automatically restart if an update was built.
List MicroVMs
Listing your MicroVMs is as trivial as ls -1 /var/lib/microvms
For more insight, the following command will read the current system version of all MicroVMs and compare them to what the corresponding flake evaluates. It is therefore quite slow to run, yet very useful for an updatable VM overview.
microvm -l
If you want a faster overview of booted and current versions, run this instead:
ls -l /var/lib/microvms/*/{current,booted}/share/microvm/system
Removing MicroVMs
First, stop the MicroVM:
systemctl stop microvm@$NAME
If you don't use absolute filesystem paths for sockets, volumes, or shares, all MicroVM state is kept under /var/lib/microvms/$NAME/. The microvm@.service systemd service template depends on the existence of this directory.
rm -rf /var/lib/microvms/$NAME
Deploying via SSH
By running either from packages or through systemd services microvm.nix tries to support a wholesome Nix workflow: develop and test on your local laptop, then deploy to staging and later to production.
Let's explore the alternative approaches before detailing ours:

- You could build .#nixosConfigurations.my-microvm.config.microvm.declaredRunner locally, then nix copy it to the target host for installation. This comes at the expense of your laptop's battery time and it can also become quite network-heavy.
- You may transfer each change to the remote host to build entirely remotely. There you're going to have a repository state that is going to confuse fellow operators. Also, your local --override-input parameters will become meaningless on the remote filesystem.
microvm.deploy.rebuild
The easy interface that is named after nixos-rebuild combines the two scripts that are described below:

- First, we evaluate locally and build remotely with microvm.deploy.installOnHost
- Then, depending on whether the host's /nix/store is mounted and SSH is running in the MicroVM:
  - we either run microvm.deploy.sshSwitch as described below,
  - or we restart the MicroVM's systemd service on the host.

Because it needs to know about both the host and the MicroVM, these SSH addresses must come before the actual switch argument:
nix run .#nixosConfigurations.my-microvm.config.microvm.deploy.rebuild root@example.com root@my-microvm.example.com switch
microvm.deploy.installOnHost
This script will evaluate only the system's derivations locally. It then transfers these and their dependencies to the remote system so that the actual build can be performed there.

Just like the microvm command, it then installs the MicroVM under /var/lib/microvms/$NAME so that the systemd services of the host module can pick it up.
It is irrelevant whether you create a new MicroVM or update an existing one.
microvm.deploy.sshSwitch
Once the host has an updated MicroVM in its /nix/store (see above), the new system must be activated. For a proper state, this script does a bit more in the MicroVM than just switch-to-configuration:

- First, the config.networking.hostName is compared to the running system for safety reasons.
- The Nix database registration will be imported, which is important if you build packages into a microvm.writableStoreOverlay.
- The new system is installed into /nix/var/nix/profiles/system, which is optional but expected by some Nix tooling.
- Finally, switch-to-configuration is run with the provided parameter (eg. switch).
Conventions between MicroVM packages and the host
This section describes the interface that is used to run MicroVM packages with the flake's host module. While the microvm.nix flake was designed for single-server usage, you can build different MicroVM deployments using the information on this page.
nixosModule.microvm option | MicroVM package file | nixosModules.host systemd service | Description |
---|---|---|---|
microvm.hypervisor | bin/microvm-run | microvm@.service | Start script for the main MicroVM process |
microvm.hypervisor | bin/microvm-shutdown | microvm@.service | Script for graceful shutdown of the MicroVM (i.e. triggering the power button) |
microvm.interfaces.*.id | share/microvm/tap-interfaces | microvm-tap-interfaces@.service | Names of the tap network interfaces to setup for the proper user |
microvm.devices.*.path | share/microvm/pci-devices | microvm-pci-devices@.service | PCI devices that must be bound to the vfio-pci driver on the host |
microvm.shares.*.source | share/microvm/virtiofs/${tag}/source | microvm-virtiofsd@.service | Source directory of a virtiofs instance by tag |
microvm.shares.*.socket | share/microvm/virtiofs/${tag}/socket | microvm-virtiofsd@.service | virtiofsd socket path by tag |
(none) | share/microvm/system | (none) | config.system.build.toplevel symlink, used for comparing versions when running microvm -l |
Generating custom operating system hypervisor packages
Because a microvm.nix runner package completely defines how to run the Hypervisor, it is possible to define independent packages that virtualize other operating systems than NixOS.
- Your NixOS configurations should export their runner package as config.microvm.declaredRunner so that it can be picked up either as declarative MicroVMs or by the microvm command.
- The runner package must have a file layout as described in the table above.
microvm-solo5-spt is an example of a Flake that can run on a microvm.nix host.
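A hedged sketch of such an independent runner package, given a pkgs from nixpkgs; the package name, the TAP interface name, and the placeholder run script are assumptions, and a real runner would exec its hypervisor with the guest's kernel and disks in bin/microvm-run:

pkgs.runCommand "my-custom-microvm" { } ''
  mkdir -p $out/bin $out/share/microvm

  # bin/microvm-run is what microvm@.service executes;
  # replace the placeholder with the actual hypervisor invocation
  cat > $out/bin/microvm-run <<EOF
  #!${pkgs.runtimeShell}
  echo "exec your hypervisor here"
  EOF
  chmod +x $out/bin/microvm-run

  # one TAP interface name per line for microvm-tap-interfaces@.service
  echo "vm-custom" > $out/share/microvm/tap-interfaces
''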
Frequently Asked Questions
Can I support the development and maintenance of this project?
How to centralize logging with journald?
That is possible without even requiring a network transport by just making the journals available to the host as a share. Because journald identifies hosts by their /etc/machine-id, we propose to use static content for that file. Add a NixOS module like the following to your MicroVM configuration:
environment.etc."machine-id" = {
  mode = "0644";
  text =
    # change this to suit your flake's interface
    self.lib.addresses.machineId.${config.networking.hostName} + "\n";
};
microvm.shares = [ {
  # On the host
  source = "/var/lib/microvms/${config.networking.hostName}/journal";
  # In the MicroVM
  mountPoint = "/var/log/journal";
  tag = "journal";
  proto = "virtiofs";
  socket = "journal.sock";
} ];
Last, make the MicroVM journals available to your host. The machine-id must be available.
systemd.tmpfiles.rules = map (vmHost:
  let
    machineId = self.lib.addresses.machineId.${vmHost};
  in
    # creates a symlink of each MicroVM's journal under the host's /var/log/journal
    "L+ /var/log/journal/${machineId} - - - - /var/lib/microvms/${vmHost}/journal/${machineId}"
) (builtins.attrNames self.lib.addresses.machineId);
Once your MicroVM's journal data is visible in the /var/log/journal/$machineId/ directories, journalctl can pick it up using the -m/--merge switch.
Can I build with hypervisors from the host's nixpkgs instead of the MicroVM's?
Yes. This scenario is enabled through the flake's lib.buildRunner function. See the nix run microvm#build-microvm script that you will need to customize to fit your deployment scenario.
How can I deploy imperatively from Continuous Integration?
Do this by integrating into your automation what the microvm command does.
environment.systemPackages = [ (
  # Provide a manual updating script that fetches the latest
  # updated+built system from Hydra
  pkgs.writeShellScriptBin "update-microvm" ''
    if [ $# -lt 1 ]; then
      NAMES="$(ls -1 /var/lib/microvms)"
    else
      NAMES="$@"
    fi

    for NAME in $NAMES; do
      echo MicroVM $NAME
      cd /var/lib/microvms/$NAME
      # Reset for each iteration so a previous update does not
      # trigger a restart of an unchanged MicroVM
      CHANGED=no

      # Is this truly the flake that is being built on Hydra?
      if [ "$(cat flake)" = "git+https://gitea.example.org/org/nix-config?ref=flake-update" ]; then
        NEW=$(curl -sLH "Accept: application/json" https://hydra.example.org/job/org/nix-config/$NAME/latest | ${pkgs.jq}/bin/jq -er .buildoutputs.out.path)
        nix copy --from https://nix-cache.example.org $NEW

        if [ -e booted ]; then
          nix store diff-closures $(readlink booted) $NEW
        elif [ -e current ]; then
          echo "NOT BOOTED! Diffing to old current:"
          nix store diff-closures $(readlink current) $NEW
        else
          echo "NOT BOOTED?"
        fi

        if ! [ -e current ]; then
          ln -s $NEW current
          CHANGED=yes
        elif [ "$(readlink current)" != $NEW ]; then
          rm -f old
          cp --no-dereference current old
          rm -f current
          ln -s $NEW current
          CHANGED=yes
        fi
      fi

      if [ "$CHANGED" = "yes" ]; then
        systemctl restart microvm@$NAME
      fi
      echo
    done
  ''
) ];
Can I include my host's <nixpkgs> channel when building the VM?

Use the following configuration if you build your MicroVM with --impure from channels, not Flakes:
nix.nixPath = [
"nixpkgs=${builtins.storePath <nixpkgs>}"
];
How do I let the microvm user access block devices?

You can re-add the following line, which was removed from microvm.nix, to your host's NixOS configuration:
users.users.microvm.extraGroups = [ "disk" ];
The more secure solution would be writing custom services.udev.extraRules that assign ownership/permissions to the individually used block devices.
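A hedged sketch of such a rule; the ID_SERIAL value is a placeholder that you would replace with the output of udevadm info for your device:

{
  services.udev.extraRules = ''
    # give the microvm user ownership of one specific block device
    SUBSYSTEM=="block", ENV{ID_SERIAL}=="Example_Disk_123456", OWNER="microvm"
  '';
}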
My virtiofs-shared sops-nix /run/secrets disappears when the host is updated!
A workaround may be setting sops.keepGenerations = 0;, effectively stopping sops-nix from ever removing old generations in /run/secrets.d/.
That means that you still must reboot all MicroVMs to adapt any updated secrets.