PCIe Emulation
OpenVMM emulates a PCIe topology that presents root complexes, root ports, and optional switches to the guest. Endpoint devices (NVMe, virtio, etc.) attach to root ports via a generic bus interface.
Topology
Root Complex (GenericPcieRootComplex)
├── Root Port 0 (PcieDownstreamPort) → endpoint device
├── Root Port 1 (PcieDownstreamPort) → endpoint device or switch
└── Root Port N (PcieDownstreamPort) → ...
The root complex owns the ECAM MMIO region. When the guest reads or writes a config space address, the root complex decodes the bus/device/function from the ECAM offset and routes the access to the correct port. Each port has a Type 1 (bridge) configuration space with PCI Express and MSI capabilities.
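The ECAM decode step can be sketched as follows. This is a minimal illustration of the standard ECAM layout (bus in bits 27:20, device in bits 19:15, function in bits 14:12, register offset in bits 11:0), not OpenVMM's actual code. The names EcamAddress and decode_ecam_offset are hypothetical, and the sketch assumes the region starts at bus 0.

```rust
/// Decoded target of a config space access (hypothetical type for illustration).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct EcamAddress {
    pub bus: u8,
    pub device: u8,    // 0..=31
    pub function: u8,  // 0..=7
    pub register: u16, // byte offset within the function's 4 KiB config space
}

/// Largest possible ECAM region: 256 buses x 1 MiB per bus.
pub const ECAM_MAX_SIZE: u64 = 256 << 20;

/// Decode an offset relative to the ECAM base address.
/// Assumes the region begins at bus 0; returns None if the offset is out of range.
pub fn decode_ecam_offset(offset: u64) -> Option<EcamAddress> {
    if offset >= ECAM_MAX_SIZE {
        return None;
    }
    Some(EcamAddress {
        bus: (offset >> 20) as u8,
        device: ((offset >> 15) & 0x1f) as u8,
        function: ((offset >> 12) & 0x7) as u8,
        register: (offset & 0xfff) as u16,
    })
}
```

Under this layout, each bus occupies 1 MiB of the region and each function occupies 4 KiB, which is why a root complex can route an access using only shifts and masks.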
Ports may optionally be hotplug-capable. Devices behind non-hotplug ports are attached at VM construction time. Hotplug-capable ports start empty and devices can be added or removed at runtime.
PCIe Hotplug
Native PCIe hotplug follows the same interrupt-driven model as
real hardware (PCIe Base Spec §6.7). No ACPI GPE, SCI, or custom
protocol is needed — the guest's pciehp driver handles
everything via config space registers and MSI.
How it works
- VMM sets port state: When a device is hot-added, the VMM atomically updates the port's Slot Status (presence_detect_state, presence_detect_changed, data_link_layer_state_changed) and Link Status (data_link_layer_link_active).
- MSI fires: If the guest has enabled hot_plug_interrupt_enable in the port's Slot Control register, the VMM fires the port's MSI.
- Guest handles the event: The guest's pciehp driver receives the MSI, reads Slot Status to see what changed, programs the bridge's bus numbers, scans the secondary bus for new devices, and clears the RW1C status bits.

Device removal follows the same flow in reverse: presence and link active are cleared, changed bits are set, and MSI fires.
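The register interaction described above can be modeled roughly as follows. This is a simplified sketch rather than OpenVMM's implementation: HotplugPortModel and its methods are hypothetical names, the bit positions follow the PCIe Base Spec register layouts, and the interrupt decision checks only hot_plug_interrupt_enable (real hardware also consults per-event enable bits in Slot Control).

```rust
// Bit positions from the PCIe Base Spec register layouts.
pub const SLOT_STATUS_PRESENCE_DETECT_CHANGED: u16 = 1 << 3;
pub const SLOT_STATUS_PRESENCE_DETECT_STATE: u16 = 1 << 6;
pub const SLOT_STATUS_DLL_STATE_CHANGED: u16 = 1 << 8;
pub const SLOT_CONTROL_HOT_PLUG_INTERRUPT_ENABLE: u16 = 1 << 5;
pub const LINK_STATUS_DLL_LINK_ACTIVE: u16 = 1 << 13;

/// Hypothetical model of the hotplug-related registers of one port.
#[derive(Default)]
pub struct HotplugPortModel {
    slot_control: u16,
    slot_status: u16,
    link_status: u16,
}

impl HotplugPortModel {
    /// VMM side: record a hot-add (true) or hot-remove (false).
    /// Returns true if the port's MSI should fire.
    pub fn set_device_present(&mut self, present: bool) -> bool {
        if present {
            self.slot_status |= SLOT_STATUS_PRESENCE_DETECT_STATE;
            self.link_status |= LINK_STATUS_DLL_LINK_ACTIVE;
        } else {
            self.slot_status &= !SLOT_STATUS_PRESENCE_DETECT_STATE;
            self.link_status &= !LINK_STATUS_DLL_LINK_ACTIVE;
        }
        // Both "changed" bits are set in either direction.
        self.slot_status |=
            SLOT_STATUS_PRESENCE_DETECT_CHANGED | SLOT_STATUS_DLL_STATE_CHANGED;
        self.slot_control & SLOT_CONTROL_HOT_PLUG_INTERRUPT_ENABLE != 0
    }

    /// Guest side: write Slot Control (applied immediately in this model).
    pub fn write_slot_control(&mut self, value: u16) {
        self.slot_control = value;
    }

    /// Guest side: write Slot Status. The changed bits are RW1C, so writing 1
    /// clears them; other bits are read-only in this model.
    pub fn write_slot_status(&mut self, value: u16) {
        let rw1c = SLOT_STATUS_PRESENCE_DETECT_CHANGED | SLOT_STATUS_DLL_STATE_CHANGED;
        self.slot_status &= !(value & rw1c);
    }

    pub fn slot_status(&self) -> u16 {
        self.slot_status
    }

    pub fn link_status(&self) -> u16 {
        self.link_status
    }
}
```

In this model, a guest that has not yet set hot_plug_interrupt_enable still sees the updated status bits when it reads the registers; it just receives no MSI for the change.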
Runtime API
Hot-add and hot-remove are triggered via
VmRpc::AddPcieDevice and VmRpc::RemovePcieDevice messages.
These resolve a device resource, create the device with MMIO
registration, attach it to the named port, and fire the hotplug
notification.
See the
PetriVmRuntime
trait for the test API (add_pcie_device / remove_pcie_device)
and the pcie_hotplug test in
vmm_tests/vmm_tests/tests/tests/multiarch/pcie.rs for a
working example.
ACPI _OSC
The SSDT includes an _OSC method on each PCIe root complex
that grants native PCIe control to the OS (ACPI spec §6.2.11,
PCI Firmware Spec §4.5.1). This tells the OS it can use native
hotplug, PME, AER, and other PCIe features rather than
ACPI-based fallbacks. Linux assumes native control regardless,
but Windows requires _OSC to enable native hotplug.
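The control negotiation that _OSC performs can be illustrated with a small sketch. The real method is AML in the SSDT, so the Rust below only models the masking logic. The bit positions are those of the _OSC Control field for PCI host bridges; the set of granted features (GRANTED_CONTROL) and the function name grant_osc_control are assumptions for illustration and do not necessarily match what OpenVMM grants.

```rust
// Control field bits for the PCI host bridge _OSC interface.
pub const OSC_PCIE_NATIVE_HOTPLUG: u32 = 1 << 0;
pub const OSC_SHPC_NATIVE_HOTPLUG: u32 = 1 << 1;
pub const OSC_PCIE_NATIVE_PME: u32 = 1 << 2;
pub const OSC_PCIE_AER: u32 = 1 << 3;
pub const OSC_PCIE_CAPABILITY_STRUCTURE: u32 = 1 << 4;

/// Features the platform is willing to hand to the OS.
/// Assumed set for this sketch: everything except SHPC hotplug.
pub const GRANTED_CONTROL: u32 = OSC_PCIE_NATIVE_HOTPLUG
    | OSC_PCIE_NATIVE_PME
    | OSC_PCIE_AER
    | OSC_PCIE_CAPABILITY_STRUCTURE;

/// The OS passes the features it wants to control; the platform returns the
/// subset it grants. Bits cleared in the result stay under firmware control.
pub fn grant_osc_control(requested: u32) -> u32 {
    requested & GRANTED_CONTROL
}
```

An OS that requests native hotplug and finds bit 0 still set in the returned value knows it may drive hotplug through the port's Slot registers directly.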
MSI Interrupt Routing (aarch64)
On aarch64, PCIe MSI/MSI-X interrupts are routed through either a GICv3 ITS or a GICv2m MSI frame, depending on the platform:
- GICv3 ITS (default on KVM with GICv3): The VMM creates a KVM in-kernel ITS device. Each MSI is delivered with a 32-bit ITS device ID of the form (segment << 16) | BDF. The BDF is resolved lazily at interrupt-delivery time by the device's MsiTarget, which combines the parent port's AssignedBusRange (updated by the guest as it programs secondary/subordinate bus numbers) with the device's devfn. Multi-function devices and root/switch ports that need a specific RID pass it explicitly. A single ItsSignalMsi/ItsIrqFd wrapper per segment then prepends the segment to produce the final ITS device ID, so no per-device wrappers and no push-based RID synchronization are needed. ACPI boots emit an IORT with an ITS Group node and per-root-complex ID mappings; the device tree includes an its child node under the GIC with msi-controller.
- GICv2m: MSI writes map to a fixed pool of 64 SPIs via a v2m doorbell register. The MADT includes a GICv2m MSI frame entry.
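The ITS device ID arithmetic can be sketched as follows. The formula (segment << 16) | BDF comes from the description above, and the BDF layout (bus in bits 15:8, devfn in bits 7:0) is the standard requester ID encoding. The BusRange type and all function names are hypothetical stand-ins, not the real AssignedBusRange or MsiTarget APIs, and the sketch assumes the device sits on the parent port's secondary bus.

```rust
/// Hypothetical stand-in for the bus range the guest programs into a port.
#[derive(Debug, Clone, Copy)]
pub struct BusRange {
    pub secondary: u8,
    pub subordinate: u8,
}

/// 16-bit requester ID (BDF): bus in bits 15:8, devfn in bits 7:0.
pub fn requester_id(bus: u8, devfn: u8) -> u16 {
    ((bus as u16) << 8) | devfn as u16
}

/// 32-bit ITS device ID: segment in the upper half, requester ID in the lower half.
pub fn its_device_id(segment: u16, rid: u16) -> u32 {
    ((segment as u32) << 16) | rid as u32
}

/// Resolve the ITS device ID from the bus range as it is *now*.
/// Assumes the device is directly behind the port, on its secondary bus.
pub fn resolve_its_device_id(segment: u16, range: BusRange, devfn: u8) -> u32 {
    its_device_id(segment, requester_id(range.secondary, devfn))
}
```

Calling resolve_its_device_id at delivery time, rather than caching its result, is what makes the lookup lazy: if the guest reprograms the port's bus numbers, the next interrupt picks up the new requester ID without any synchronization step.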
The MSI controller can be overridden with the --gic-msi
CLI option (auto, its, or v2m).
Implementation notes
Hotplug ports advertise no_command_completed_support in Slot
Capabilities. Our emulation applies Slot Control changes
instantly, so the guest does not need to wait for command
completion. This avoids an interrupt storm that would occur if
command_completed were set on every Slot Control write — see
the comment in with_hotplug_support() in pci_express.rs
for details.
The pcieport driver's PME service shares the same MSI vector
as pciehp. When a hotplug MSI fires, the PME handler also
runs and may log "Spurious native interrupt!" since there is no
PME event. These warnings are cosmetic and do not affect
functionality. A future improvement could use MSI-X (multiple
vectors) to give each service its own interrupt.