VMM

ktstr includes a purpose-built VMM (virtual machine monitor) that boots Linux kernels in KVM for testing.

KtstrVm builder

let result = vmm::KtstrVm::builder()
    .kernel(&kernel_path)
    .init_binary(&ktstr_binary)
    .topology(numa_nodes, llcs, cores_per_llc, threads_per_core)
    .memory_mb(4096)
    .run_args(&["run".into(), "--ktstr-test-fn".into(), "my_test".into()])
    .build()?
    .run()?;

Topology

The VM topology is specified as (numa_nodes, llcs, cores_per_llc, threads_per_core). On x86_64, the VMM creates ACPI tables (MADT, SRAT, SLIT, and HMAT when numa_nodes > 1) and MP tables. On aarch64, topology is expressed via FDT cpu nodes with MPIDR-derived reg properties.

pub struct Topology {
    pub llcs: u32,
    pub cores_per_llc: u32,
    pub threads_per_core: u32,
    pub numa_nodes: u32,
    pub nodes: Option<&'static [NumaNode]>,
    pub distances: Option<&'static NumaDistance>,
}

total_cpus() returns llcs * cores_per_llc * threads_per_core; num_llcs() returns llcs.

When nodes is None (the default), memory and LLCs are distributed uniformly across NUMA nodes with default 10/20 distances. When Some, each NumaNode specifies its LLC count, memory size, and optional HMAT attributes (latency_ns, bandwidth_mbs, mem_side_cache). A NumaNode with llcs = 0 models a CXL memory-only node.

NumaDistance is an NxN inter-node distance matrix. Diagonal entries must be 10, off-diagonal > 10, and the matrix must be symmetric (ACPI SLIT requirements).
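
The SLIT constraints are easy to check; a minimal sketch over a plain u8 matrix (the actual NumaDistance representation in ktstr may differ):

fn valid_slit(d: &[&[u8]]) -> bool {
    let n = d.len();
    d.iter().all(|row| row.len() == n)            // square matrix
        && (0..n).all(|i| d[i][i] == 10)          // diagonal entries are exactly 10
        && (0..n).all(|i| {
            // off-diagonal entries are > 10 and the matrix is symmetric
            (0..n).all(|j| i == j || (d[i][j] > 10 && d[i][j] == d[j][i]))
        })
}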

Use Topology::new(numa_nodes, llcs, cores, threads) for uniform topologies, or Topology::with_nodes(cores, threads, &nodes) for explicit per-node configuration.
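
For example, a uniform topology built with Topology::new (the values here are illustrative):

// 2 NUMA nodes, 4 LLCs, 2 cores per LLC, 2 threads per core.
let topo = Topology::new(2, 4, 2, 2);
assert_eq!(topo.total_cpus(), 16); // 4 * 2 * 2
assert_eq!(topo.num_llcs(), 4);
// Explicit per-node configuration goes through Topology::with_nodes(cores, threads, &nodes).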

initramfs

The VMM builds a cpio initramfs containing:

  • The test binary (as /init)
  • Optional scheduler binary (as /scheduler)
  • Shared library dependencies (resolved via ELF DT_NEEDED parsing)

The initramfs is cached under a key derived from the binary contents. A compressed SHM segment is overlaid copy-on-write (COW) into guest memory, so concurrent VMs share physical pages.
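
Shared-library resolution follows the DT_NEEDED entries in the binary's dynamic section; a hedged sketch of that lookup using the goblin crate (ktstr's actual ELF parsing may use different machinery):

use goblin::elf::Elf;

// Return the DT_NEEDED entries (shared-object names) of an ELF binary.
fn needed_libs(path: &std::path::Path) -> Result<Vec<String>, Box<dyn std::error::Error>> {
    let bytes = std::fs::read(path)?;
    let elf = Elf::parse(&bytes)?;
    Ok(elf.libraries.iter().map(|s| s.to_string()).collect())
}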

Guest-host communication

Serial console – COM2 carries guest stdout/stderr and serves as both the canonical crash diagnostic transport and a fallback result transport. The guest panic hook writes PANIC: <info>\n<bt>\n to COM2; the host parses it via extract_panic_message and surfaces the backtrace in test failure output. Delimited test results (between ===KTSTR_TEST_RESULT_START=== / ===KTSTR_TEST_RESULT_END=== sentinels) and exit codes (KTSTR_EXIT=N) are also written to COM2 as a fallback when the TLV stream is unavailable.
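
A minimal sketch of such a guest-side panic hook, assuming COM2 is exposed to the guest as /dev/ttyS1 (the function name and error handling are illustrative):

use std::backtrace::Backtrace;
use std::io::Write;

fn install_panic_hook() {
    std::panic::set_hook(Box::new(|info| {
        let bt = Backtrace::force_capture();
        // Best-effort write; the host scans COM2 output for the PANIC: marker.
        if let Ok(mut com2) = std::fs::OpenOptions::new().write(true).open("/dev/ttyS1") {
            let _ = write!(com2, "PANIC: {info}\n{bt}\n");
        }
    }));
}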

Virtio-console port 1 TLV stream – the primary guest-to-host data channel. Carries scenario markers (MSG_TYPE_SCENARIO_START, MSG_TYPE_SCENARIO_END), test results (MSG_TYPE_TEST_RESULT), exit codes (MSG_TYPE_EXIT), stimulus events (MSG_TYPE_STIMULUS), scheduler exit notifications (MSG_TYPE_SCHED_EXIT), profraw coverage data (MSG_TYPE_PROFRAW), per-payload-invocation metrics (MSG_TYPE_PAYLOAD_METRICS), and raw LlmExtract output (MSG_TYPE_RAW_PAYLOAD_OUTPUT). Each TLV frame has a CRC32 for integrity checking.
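
The exact wire layout is not documented here; as an illustration only, a TLV frame with a trailing CRC32 might be encoded like this (field widths, byte order, and the crc32fast dependency are assumptions):

use crc32fast::Hasher;

// Hypothetical frame layout: type (u32 LE) | length (u32 LE) | value | crc32 (u32 LE).
fn encode_frame(msg_type: u32, value: &[u8]) -> Vec<u8> {
    let mut frame = Vec::with_capacity(12 + value.len());
    frame.extend_from_slice(&msg_type.to_le_bytes());
    frame.extend_from_slice(&(value.len() as u32).to_le_bytes());
    frame.extend_from_slice(value);

    let mut hasher = Hasher::new();
    hasher.update(&frame);
    frame.extend_from_slice(&hasher.finalize().to_le_bytes());
    frame
}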

Virtio devices

The VMM implements three virtio-MMIO devices in addition to the serial console above. All three speak the virtio 1.x MMIO transport (virtio-v1.2 §4.2.2) with VIRTIO_F_VERSION_1 and use irqfd (eventfd → KVM GSI) for interrupt delivery.
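
The irqfd mechanism in one small sketch, using the kvm-ioctls and vmm-sys-util crates (the VM must already have an in-kernel irqchip; the GSI assignment here is illustrative):

use kvm_ioctls::VmFd;
use vmm_sys_util::eventfd::{EventFd, EFD_NONBLOCK};

// Writing to the returned eventfd injects the interrupt on `gsi`
// without a round trip through the VMM's vCPU threads.
fn wire_irqfd(vm: &VmFd, gsi: u32) -> EventFd {
    let irqfd = EventFd::new(EFD_NONBLOCK).expect("eventfd creation failed");
    vm.register_irqfd(&irqfd, gsi).expect("KVM_IRQFD registration failed");
    irqfd
}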

  • virtio-blk (vmm::virtio_blk) – file-backed block device with a single request virtqueue and a token-bucket throttle. Used to give workloads real on-disk filesystems (per-test images cloned from a btrfs template). Advertises VIRTIO_BLK_F_BLK_SIZE, VIRTIO_BLK_F_SEG_MAX, VIRTIO_BLK_F_SIZE_MAX, VIRTIO_BLK_F_FLUSH, and VIRTIO_RING_F_EVENT_IDX, plus VIRTIO_BLK_F_RO when configured read-only.
  • virtio-net (vmm::virtio_net) – two-virtqueue (RX, TX) NIC with an in-VMM L2 loopback backend. Used by network-shaped workloads (TCP/UDP throughput, latency) without depending on the host’s network stack. Advertises VIRTIO_NET_F_MAC so the guest binds a deterministic MAC.
  • virtio-console (vmm::virtio_console) – three-port multiport console with eight virtqueues (per virtio-v1.2 §5.3.5: two control queues plus an in/out pair per port, three ports → 2 + 2·3 = 8). Port 0 carries the interactive /dev/hvc0 console alongside the COM1/COM2 16550 serial ports; port 1 carries the guest-to-host TLV stream that delivers exit code, test result, per-payload metrics, raw payload outputs, profraw, and scheduler exit notifications; port 2 is a transparent byte-pipe relay carrying scx_stats request bytes from the host to the in-guest relay thread and the scheduler’s responses back. Advertises VIRTIO_CONSOLE_F_MULTIPORT with max_nr_ports = 3.

Performance mode

When performance_mode is enabled, the VMM applies host-side isolation (vCPU pinning, hugepages, NUMA mbind, RT scheduling), guest-visible hints (KVM_HINTS_REALTIME CPUID), and KVM exit suppression. Non-performance-mode VMs set KVM_CAP_HALT_POLL to 200us; overcommitted topologies set it to 0.
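
A hedged sketch of setting the halt-poll capability with the kvm-ioctls / kvm-bindings crates (ktstr's actual plumbing may differ; KVM_CAP_HALT_POLL takes the poll window in nanoseconds):

use kvm_bindings::{kvm_enable_cap, KVM_CAP_HALT_POLL};
use kvm_ioctls::VmFd;

fn set_halt_poll(vm: &VmFd, poll_ns: u64) -> kvm_ioctls::Result<()> {
    let mut cap = kvm_enable_cap::default();
    cap.cap = KVM_CAP_HALT_POLL;
    cap.args[0] = poll_ns; // 200_000 ns normally, 0 when vCPUs are overcommitted
    vm.enable_cap(&cap)
}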

See Performance Mode for the full optimization list, prerequisites, and validation.

Dual-role architecture

The same test binary serves two roles:

Host side – manages the VM lifecycle: builds the initramfs, boots the kernel, runs the monitor, and evaluates results.

Guest side – runs inside the VM as /init (PID 1). The Rust init code (vmm::rust_init) mounts filesystems, starts the scheduler, dispatches the test function, then reboots.
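
The early mounts in a PID 1 init typically look like this; a minimal sketch with the nix crate (rust_init's actual mount set is not shown here):

use nix::mount::{mount, MsFlags};

// Mount the pseudo-filesystems a test environment needs before dispatching.
fn mount_basics() -> nix::Result<()> {
    mount(Some("proc"), "/proc", Some("proc"), MsFlags::empty(), None::<&str>)?;
    mount(Some("sysfs"), "/sys", Some("sysfs"), MsFlags::empty(), None::<&str>)?;
    mount(Some("devtmpfs"), "/dev", Some("devtmpfs"), MsFlags::empty(), None::<&str>)?;
    Ok(())
}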

The role is determined at runtime:

  • PID 1 detection: when running as PID 1, the #[ctor] function ktstr_test_early_dispatch() runs the guest init path, which handles the full guest lifecycle.
  • #[ktstr_test] host dispatch: a #[ctor::ctor] function (ktstr_test_early_dispatch) runs before main() in any binary that links against ktstr. When both --ktstr-test-fn and --ktstr-topo are present, it boots a VM and runs the test inside it.
  • #[ktstr_test] guest dispatch: when only --ktstr-test-fn is present (no --ktstr-topo), the ctor runs the test function directly – the binary is already inside a VM.
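
Put together, the branching looks roughly like this (a simplified sketch; the real ktstr_test_early_dispatch parses flag values and calls internal entry points not shown here):

#[ctor::ctor]
fn ktstr_test_early_dispatch() {
    let args: Vec<String> = std::env::args().collect();
    let has_test_fn = args.iter().any(|a| a == "--ktstr-test-fn");
    let has_topo = args.iter().any(|a| a == "--ktstr-topo");

    if std::process::id() == 1 {
        // Guest init path: we are /init inside the VM.
        // run_guest_init();
    } else if has_test_fn && has_topo {
        // Host dispatch: boot a VM and run the named test inside it.
        // boot_vm_and_run();
    } else if has_test_fn {
        // Guest dispatch: already inside a VM, call the test function directly.
        // run_test_fn();
    }
    // Otherwise: fall through to main() as an ordinary binary.
}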

This design means one cargo build produces everything needed for both host and guest execution. The initramfs embeds the same binary that built it.

Boot process

  1. Load kernel (bzImage on x86_64, Image on aarch64) via linux-loader.
  2. Set up KVM vCPUs with the specified topology.
  3. Build and load initramfs.
  4. Set up serial devices (COM1 for console, COM2 for results).
  5. Boot the kernel.
  6. Kernel starts /init (the test binary).
  7. PID 1 detected: the guest init path mounts filesystems, starts the scheduler, dispatches the test function, and reboots.
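
Step 1 on x86_64 with the linux-loader crate, as a hedged sketch (the load address and error handling are illustrative; the aarch64 Image path uses a different loader):

use std::fs::File;
use linux_loader::loader::{bzimage::BzImage, KernelLoader};
use vm_memory::{GuestAddress, GuestMemoryMmap};

// Parse the bzImage setup header and copy the kernel into guest memory.
fn load_kernel(guest_mem: &GuestMemoryMmap, path: &str) -> GuestAddress {
    let mut kernel = File::open(path).expect("kernel image not found");
    let loaded = BzImage::load(guest_mem, None, &mut kernel, Some(GuestAddress(0x0100_0000)))
        .expect("bzImage load failed");
    loaded.kernel_load
}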