sidecar/main.rs
// Copyright (c) Microsoft Corporation.
// Licensed under the MIT License.

#![cfg_attr(minimal_rt, no_std, no_main)]

//! This crate implements the OpenHCL sidecar kernel. This is a kernel that runs
//! alongside the OpenHCL Linux kernel, operating on a subset of the virtual
//! machine's CPUs.
//!
//! This is done to avoid needing to boot all CPUs into Linux, since this is
//! very expensive for large VMs. Instead, most of the CPUs are run in the
//! sidecar kernel, where they run a minimal dispatch loop. If a sidecar CPU
//! hits a condition that it cannot handle locally (e.g., the guest OS attempts
//! to access an emulated device), it will send a message to the main Linux
//! kernel. One of the Linux CPUs can then handle the exit remotely, and/or
//! convert the sidecar CPU to a Linux CPU.
//!
//! Similarly, if a Linux CPU needs to run code on a sidecar CPU (e.g., to run
//! it as a target for device interrupts from the host), it can convert the
//! sidecar CPU to a Linux CPU.
//!
//! Sidecar is modeled to Linux as a set of devices, one per node (a contiguous
//! set of CPUs; this may or may not correspond to a NUMA node or CPU package).
//! Each device has a single control page, used to communicate with the sidecar
//! CPUs. Each CPU additionally has a command page, which is used to specify
//! sidecar commands (e.g., run the VP, or get or set VP registers). These
//! commands are in separate pages at least partially so that they can be
//! operated on independently; the Linux kernel communicates with sidecar via
//! the control page, and the user-mode VMM communicates with the individual
//! sidecar CPUs via the command pages.
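//!
//! As a rough illustration of that split, the two page types could look
//! something like the sketch below. The field names and sizes here are
//! hypothetical, not the actual shared definitions; they only show which side
//! touches which page.
//!
//! ```ignore
//! use core::sync::atomic::AtomicU32;
//!
//! /// One per node; shared between the Linux sidecar driver and the sidecar
//! /// CPUs of that node (hypothetical sketch).
//! #[repr(C)]
//! struct ControlPage {
//!     /// Per-CPU status slots, updated atomically by the sidecar CPUs and
//!     /// read by the driver.
//!     cpu_status: [AtomicU32; 64],
//!     /// Set by a sidecar CPU when it needs the driver's attention.
//!     needs_attention: AtomicU32,
//! }
//!
//! /// One per sidecar CPU; shared with the user-mode VMM (hypothetical sketch).
//! #[repr(C)]
//! struct CommandPage {
//!     /// The command to execute (e.g., run the VP, get or set registers).
//!     command: u32,
//!     /// Command-specific input/output data.
//!     payload: [u8; 4092],
//! }
//! ```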
//!
//! The sidecar kernel is a very simple kernel. It runs at a fixed virtual
//! address (although it is still built with dynamic relocations). Each CPU has
//! its own set of page tables (sharing some portion of them) so that it only
//! maps what it uses. Each CPU is independent after boot; sidecar CPUs never
//! communicate with each other and only communicate with Linux CPUs, via the
//! Linux sidecar driver.
//!
//! The sidecar CPU runs a simple dispatch loop. It halts the processor, waiting
//! for the control page to indicate that it should run (the sidecar driver
//! sends an IPI when the control page is updated). It then reads a command from
//! the command page and executes the command; if the command can run for an
//! unbounded amount of time (e.g., the command to run the VP), then the driver
//! can interrupt the command via another request on the control page (and
//! another IPI).
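//!
//! A minimal sketch of that loop, using hypothetical helper and type names:
//!
//! ```ignore
//! /// Per-CPU dispatch loop (a sketch; all helpers here are hypothetical).
//! fn dispatch_loop(cpu_index: usize) -> ! {
//!     loop {
//!         // Sleep until the sidecar driver updates the control page and
//!         // sends an IPI to wake this CPU.
//!         while !control_page_requests_run(cpu_index) {
//!             halt();
//!         }
//!         // Read this CPU's command page and execute the requested command.
//!         match read_command(cpu_index) {
//!             Command::Run => run_vp_until_exit_or_interrupt(cpu_index),
//!             Command::GetRegs => write_registers_to_command_page(cpu_index),
//!             Command::SetRegs => apply_registers_from_command_page(cpu_index),
//!         }
//!         // Let the driver and VMM observe that the command has finished.
//!         signal_command_complete(cpu_index);
//!     }
//! }
//! ```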
//!
//! # Processor Startup
//!
//! The sidecar kernel is initialized by a single bootstrap processor (BSP),
//! which is typically VP 0. This initialization happens during the boot shim
//! phase, before the Linux kernel starts. The BSP (which will later become a
//! Linux CPU) calls into the sidecar kernel to perform all global initialization
//! tasks: copying the hypercall page, setting up the IDT, initializing control
//! pages for each node, and preparing page tables and per-CPU state for all
//! application processors (APs).
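//!
//! In rough pseudocode, the BSP's portion of boot looks like this (signatures
//! and most names are hypothetical; `init_ap` is discussed further below):
//!
//! ```ignore
//! fn bsp_init(params: &SidecarParams) {
//!     copy_hypercall_page();
//!     init_idt();
//!     for node in params.nodes() {
//!         init_control_page(node);
//!         for vp in node.vps() {
//!             // Build the per-CPU page tables and per-CPU state.
//!             init_ap(node, vp);
//!         }
//!     }
//!     // Kick off the AP fan-out described below.
//!     start_first_aps();
//! }
//! ```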
//!
//! After the BSP completes its initialization work, it begins starting APs. The
//! startup process uses a fan-out pattern to minimize total boot time: the BSP
//! starts the first few APs, and then each newly-started AP immediately helps
//! start additional APs. This creates an exponential growth in the number of
//! CPUs actively participating in the boot process.
//!
//! Concurrency during startup is managed through atomic operations on a
//! per-node `next_vp` counter. Each CPU (whether BSP or AP) atomically
//! increments this counter to claim the next VP index to start within that
//! node. This ensures that each VP is started exactly once without requiring
//! locks or complex coordination. The startup fan-out continues until all VPs
//! in all nodes have been started (or skipped if marked as REMOVED).
//!
//! Note that the first VP in each node is typically reserved for the Linux
//! kernel and does not run the sidecar kernel. The sidecar startup logic
//! accounts for this by initializing the `next_vp` counter to 1 for each node,
//! effectively skipping the base VP (index 0) of that node.
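//!
//! A sketch of the claim loop that drives this fan-out (names are
//! hypothetical; `next_vp` starts at 1 so the node's base VP is never
//! claimed):
//!
//! ```ignore
//! use core::sync::atomic::{AtomicU32, Ordering};
//!
//! /// Runs on the BSP and on every newly started AP: keep claiming and
//! /// starting VPs in this node until none remain (hypothetical sketch).
//! fn start_remaining_vps(next_vp: &AtomicU32, node_vp_count: u32) {
//!     loop {
//!         // Atomically claim the next VP index; each index is handed out to
//!         // exactly one starter, so no locks are needed.
//!         let vp = next_vp.fetch_add(1, Ordering::Relaxed);
//!         if vp >= node_vp_count {
//!             break;
//!         }
//!         if is_removed(vp) {
//!             // VP marked REMOVED; skip it.
//!             continue;
//!         }
//!         // The started AP runs this same loop itself, which is what makes
//!         // the number of active starters grow exponentially.
//!         start_ap(vp);
//!     }
//! }
//! ```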
//!
//! Each CPU's page tables include a mapping for its node's control page at a
//! fixed virtual address (`PTE_CONTROL_PAGE`). This is set up during AP
//! initialization via the `init_ap` function, which builds the per-CPU page
//! table hierarchy and maps the control page at the same virtual address for
//! all CPUs in the node. This allows each CPU to access its control page
//! without knowing its physical address, and ensures that all CPUs in a node
//! see the same control page data (since they all map the same physical page).
//! The control page mapping is read-write from the sidecar's perspective, as
//! the sidecar needs to update status fields (like `cpu_status` and
//! `needs_attention`) using atomic operations.
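//!
//! For example, a sidecar CPU can report its state through that fixed mapping
//! roughly as follows. This is a hypothetical sketch: `CONTROL_PAGE_VA` stands
//! for the fixed virtual address mentioned above, and `ControlPage` and
//! `STATUS_STOPPED` are illustrative, not the real definitions.
//!
//! ```ignore
//! use core::sync::atomic::Ordering;
//!
//! /// View the node's control page through its fixed per-CPU mapping; every
//! /// CPU in the node maps the same physical page at this address, so no CPU
//! /// needs to know the physical address.
//! fn control_page() -> &'static ControlPage {
//!     unsafe { &*(CONTROL_PAGE_VA as *const ControlPage) }
//! }
//!
//! fn report_stopped(cpu_index: usize) {
//!     let control = control_page();
//!     // Atomic stores, since the Linux sidecar driver reads these fields
//!     // concurrently from another CPU.
//!     control.cpu_status[cpu_index].store(STATUS_STOPPED, Ordering::Release);
//!     control.needs_attention.store(1, Ordering::Release);
//! }
//! ```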
//!
//! As of this writing, sidecar only supports x86_64, without hardware
//! isolation.

mod arch;

// Stub entry point for non-minimal_rt builds; the real sidecar kernel requires
// building with MINIMAL_RT_BUILD=1.
#[cfg(not(minimal_rt))]
fn main() {
    panic!("must build with MINIMAL_RT_BUILD=1")
}