.. SPDX-License-Identifier: GPL-2.0

Overview
========
The Linux kernel contains a variety of code for running as a fully
enlightened guest on Microsoft's Hyper-V hypervisor.  Hyper-V
consists primarily of a bare-metal hypervisor plus a virtual machine
management service running in the parent partition (roughly
equivalent to KVM and QEMU, for example).  Guest VMs run in child
partitions.  In this documentation, references to Hyper-V usually
encompass both the hypervisor and the VMM service without making a
distinction about which functionality is provided by which
component.

Hyper-V runs on x86/x64 and arm64 architectures, and Linux guests
are supported on both.  The functionality and behavior of Hyper-V are
generally the same on both architectures unless noted otherwise.

Linux Guest Communication with Hyper-V
--------------------------------------
Linux guests communicate with Hyper-V in four different ways:

* Implicit traps: As defined by the x86/x64 or arm64 architecture,
  some guest actions trap to Hyper-V.  Hyper-V emulates the action and
  returns control to the guest.  This behavior is generally invisible
  to the Linux kernel.

* Explicit hypercalls: Linux makes an explicit function call to
  Hyper-V, passing parameters.  Hyper-V performs the requested action
  and returns control to the caller.  Parameters are passed in
  processor registers or in memory shared between the Linux guest and
  Hyper-V.  On x86/x64, hypercalls use a Hyper-V specific calling
  sequence.  On arm64, hypercalls use the ARM standard SMCCC calling
  sequence.

* Synthetic register access: Hyper-V implements a variety of
  synthetic registers.  On x86/x64 these registers appear as MSRs in
  the guest, and the Linux kernel can read or write these MSRs using
  the normal mechanisms defined by the x86/x64 architecture.  On
  arm64, these synthetic registers must be accessed using explicit
  hypercalls.

* VMbus: VMbus is a higher-level software construct that is built on
  the other three mechanisms.  It is a message passing interface
  between the Hyper-V host and the Linux guest.  It uses memory that
  is shared between Hyper-V and the guest, along with various
  signaling mechanisms.

The first three communication mechanisms are documented in the
`Hyper-V Top Level Functional Spec (TLFS)`_.  The TLFS describes
general Hyper-V functionality and provides details on the hypercalls
and synthetic registers.  The TLFS is currently written for the
x86/x64 architecture only.

.. _Hyper-V Top Level Functional Spec (TLFS): https://docs.microsoft.com/en-us/virtualization/hyper-v-on-windows/tlfs/tlfs

VMbus is not documented.  This documentation provides a high-level
overview of VMbus and how it works, but the details can be discerned
only from the code.

Sharing Memory
--------------
Many aspects of communication between Hyper-V and Linux are based
on sharing memory.  Such sharing is generally accomplished as
follows:

* Linux allocates memory from its physical address space using
  standard Linux mechanisms.

* Linux tells Hyper-V the guest physical address (GPA) of the
  allocated memory.  Many shared areas are kept to 1 page so that a
  single GPA is sufficient.  Larger shared areas require a list of
  GPAs, which usually do not need to be contiguous in the guest
  physical address space.  How Hyper-V is told about the GPA or list
  of GPAs varies.  In some cases, a single GPA is written to a
  synthetic register.  In other cases, a GPA or list of GPAs is sent
  in a VMbus message.

* Hyper-V translates the GPAs into "real" physical memory addresses,
  and creates a virtual mapping that it can use to access the memory.

* Linux can later revoke sharing it has previously established by
  telling Hyper-V to set the shared GPA to zero.
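
The single-GPA case above can be sketched in plain C.  This is a
minimal user-space model under stated assumptions, not kernel code:
the register layout (an enable flag in bit 0 and the page number in
the upper bits) is assumed for illustration, and ``struct
fake_synreg``, ``share_page()`` and ``revoke_page()`` are hypothetical
names that do not exist in the kernel.

.. code-block:: c

    #include <stdint.h>

    #define HV_PAGE_SHIFT  12      /* Hyper-V uses 4 Kbyte pages */
    #define SYNREG_ENABLE  1ULL    /* assumed: bit 0 enables the sharing */

    /* Hypothetical stand-in for one synthetic register. */
    struct fake_synreg {
        uint64_t value;
    };

    /* Establish sharing: publish the 4 Kbyte aligned GPA plus enable bit. */
    static void share_page(struct fake_synreg *reg, uint64_t gpa)
    {
        reg->value = ((gpa >> HV_PAGE_SHIFT) << HV_PAGE_SHIFT) | SYNREG_ENABLE;
    }

    /* Revoke sharing: set the shared GPA to zero and clear the enable bit. */
    static void revoke_page(struct fake_synreg *reg)
    {
        reg->value = 0;
    }

In this model each shared page has its own register value, so every
sharing that was established must be revoked individually.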

Hyper-V operates with a page size of 4 Kbytes.  GPAs communicated to
Hyper-V may be in the form of page numbers, and always describe a
range of 4 Kbytes.  Since the Linux guest page size on x86/x64 is
also 4 Kbytes, the mapping from guest page to Hyper-V page is 1-to-1.
On arm64, Hyper-V supports guests with 4/16/64 Kbyte pages as
defined by the arm64 architecture.  If Linux is using 16 or 64
Kbyte pages, Linux code must be careful to communicate with Hyper-V
only in terms of 4 Kbyte pages.  HV_HYP_PAGE_SIZE and related macros
are used in code that communicates with Hyper-V so that it works
correctly in all configurations.
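
The page-size arithmetic can be illustrated with a short user-space
sketch.  The macros below are redefined locally with the 4 Kbyte
values described above so that the example is self-contained, and
``guest_page_to_hv_pfns()`` is a hypothetical helper, not a kernel
function.

.. code-block:: c

    #include <stddef.h>
    #include <stdint.h>

    /* Local definitions for this sketch: Hyper-V pages are 4 Kbytes. */
    #define HV_HYP_PAGE_SHIFT  12
    #define HV_HYP_PAGE_SIZE   (1UL << HV_HYP_PAGE_SHIFT)

    /*
     * Fill pfns[] with the 4 Kbyte Hyper-V page numbers that cover one
     * guest page starting at guest physical address gpa.  The guest
     * page size is 4096, 16384 or 65536.  Returns the number of entries
     * written, or 0 if the array is too small.
     */
    static size_t guest_page_to_hv_pfns(uint64_t gpa, size_t guest_page_size,
                                        uint64_t *pfns, size_t max)
    {
        size_t n = guest_page_size / HV_HYP_PAGE_SIZE;
        size_t i;

        if (n > max)
            return 0;
        for (i = 0; i < n; i++)
            pfns[i] = (gpa >> HV_HYP_PAGE_SHIFT) + i;
        return n;
    }

With 4 Kbyte guest pages the helper yields a single page number, which
is the 1-to-1 case.  A 64 Kbyte guest page expands to 16 consecutive
Hyper-V page numbers.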

As described in the TLFS, a few memory pages shared between Hyper-V
and the Linux guest are "overlay" pages.  With overlay pages, Linux
uses the usual approach of allocating guest memory and telling
Hyper-V the GPA of the allocated memory.  But Hyper-V then replaces
that physical memory page with a page it has allocated, and the
original physical memory page is no longer accessible in the guest
VM.  Linux may access the memory normally as if it were the memory
that it originally allocated.  The "overlay" behavior is visible
only because the contents of the page (as seen by Linux) change at
the time that Linux originally establishes the sharing and the
overlay page is inserted.  Similarly, the contents change if Linux
revokes the sharing, in which case Hyper-V removes the overlay page,
and the guest page originally allocated by Linux becomes visible
again.

Before Linux does a kexec to a kdump kernel or any other kernel,
memory shared with Hyper-V should be revoked.  Hyper-V could modify
a shared page or remove an overlay page after the new kernel is
using the page for a different purpose, corrupting the new kernel.
Hyper-V does not provide a single "revoke everything" operation to
guest VMs, so Linux code must individually revoke all sharing before
doing kexec.  See hv_kexec_handler() and hv_crash_handler().  But
the crash/panic path still has holes in cleanup because some shared
pages are set using per-CPU synthetic registers and there's no
mechanism to revoke the shared pages for CPUs other than the CPU
running the panic path.

CPU Management
--------------
Hyper-V does not have the ability to hot-add or hot-remove a CPU
from a running VM.  However, Windows Server 2019 Hyper-V and
earlier versions may provide guests with ACPI tables that indicate
more CPUs than are actually present in the VM.  As is normal, Linux
treats these additional CPUs as potential hot-add CPUs, and reports
them as such even though Hyper-V will never actually hot-add them.
Starting in Windows Server 2022 Hyper-V, the ACPI tables reflect
only the CPUs actually present in the VM, so Linux does not report
any hot-add CPUs.

A Linux guest CPU may be taken offline using the normal Linux
mechanisms, provided no VMbus channel interrupts are assigned to
the CPU.  See the section on VMbus Interrupts for more details
on how VMbus channel interrupts can be re-assigned to permit
taking a CPU offline.

32-bit and 64-bit
-----------------
On x86/x64, Hyper-V supports 32-bit and 64-bit guests, and Linux
will build and run in either version.  While the 32-bit version is
expected to work, it is used rarely and may suffer from undetected
regressions.

On arm64, Hyper-V supports only 64-bit guests.

Endian-ness
-----------
All communication between Hyper-V and guest VMs uses little-endian
format on both x86/x64 and arm64.  Big-endian format on arm64 is not
supported by Hyper-V, and Linux code does not use endian-ness macros
when accessing data shared with Hyper-V.

Versioning
----------
Current Linux kernels operate correctly with older versions of
Hyper-V back to Windows Server 2012 Hyper-V.  Support for running
on the original Hyper-V release in Windows Server 2008/2008 R2
has been removed.

A Linux guest on Hyper-V outputs in dmesg the version of Hyper-V
it is running on.  This version is in the form of a Windows build
number and is for display purposes only.  Linux code does not
test this version number at runtime to determine available features
and functionality.  Hyper-V indicates feature/function availability
via flags in synthetic MSRs that Hyper-V provides to the guest,
and the guest code tests these flags.
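
The flag-based approach can be sketched as follows.  The flag names
and bit positions are hypothetical and chosen only for illustration;
the real definitions are given in the TLFS.

.. code-block:: c

    #include <stdbool.h>
    #include <stdint.h>

    /* Hypothetical feature bits; real names and positions are in the TLFS. */
    #define FEAT_EXAMPLE_TIMER  (1u << 0)
    #define FEAT_EXAMPLE_CLOCK  (1u << 1)

    /* Decide on an advertised feature flag, never on the build number. */
    static bool feature_available(uint32_t advertised, uint32_t flag)
    {
        return (advertised & flag) == flag;
    }

A caller would pass the flags word it read from Hyper-V and enable a
code path only when the corresponding bit is set.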

VMbus has its own protocol version that is negotiated during the
initial VMbus connection from the guest to Hyper-V.  This version
number is also output to dmesg during boot.  This version number
is checked in a few places in the code to determine if specific
functionality is present.
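
One plausible shape for such a negotiation is sketched below as a
user-space model.  It assumes that a version packs a major number in
the upper 16 bits and a minor number in the lower 16 bits, and that
the guest offers versions from newest to oldest until the host accepts
one.  All names here, including ``example_host_accepts()``, are
hypothetical and stand in for the real message exchange.

.. code-block:: c

    #include <stddef.h>
    #include <stdint.h>

    /* Assumed encoding: major in the upper 16 bits, minor in the lower. */
    #define VMBUS_VER(major, minor) (((uint32_t)(major) << 16) | (minor))

    /* Hypothetical callback: nonzero if the host accepts the version. */
    typedef int (*offer_fn)(uint32_t version);

    /* Stand-in host that accepts nothing newer than version 4.0. */
    static int example_host_accepts(uint32_t version)
    {
        return version <= VMBUS_VER(4, 0);
    }

    /*
     * Offer versions in array order (newest first) and return the first
     * one the host accepts, or 0 if none is accepted.
     */
    static uint32_t negotiate_version(const uint32_t *versions, size_t count,
                                      offer_fn offer)
    {
        size_t i;

        for (i = 0; i < count; i++)
            if (offer(versions[i]))
                return versions[i];
        return 0;
    }

    /* Later feature test: compare the negotiated version to a minimum. */
    static int has_feature_since(uint32_t negotiated, uint32_t minimum)
    {
        return negotiated >= minimum;
    }

Because the encoding keeps the major number in the high bits, a plain
integer comparison orders versions correctly in this model.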

Furthermore, each synthetic device on VMbus also has a protocol
version that is separate from the VMbus protocol version.  Device
drivers for these synthetic devices typically negotiate the device
protocol version, and may test that protocol version to determine
if specific device functionality is present.

Code Packaging
--------------
Hyper-V related code appears in the Linux kernel code tree in three
main areas:

1. drivers/hv

2. arch/x86/hyperv and arch/arm64/hyperv

3. individual device driver areas such as drivers/scsi, drivers/net,
   drivers/clocksource, etc.

A few miscellaneous files appear elsewhere.  See the full list under
"Hyper-V/Azure CORE AND DRIVERS" and "DRM DRIVER FOR HYPERV
SYNTHETIC VIDEO DEVICE" in the MAINTAINERS file.

The code in #1 and #2 is built only when CONFIG_HYPERV is set.
Similarly, the code for most Hyper-V related drivers is built only
when CONFIG_HYPERV is set.

Most Hyper-V related code in #1 and #3 can be built as a module.
The architecture specific code in #2 must be built-in.  Also,
drivers/hv/hv_common.c is low-level code that is common across
architectures and must be built-in.