===================================
pSeries family boards (``pseries``)
===================================

The Power machine para-virtualized environment described by the Linux on Power
Architecture Reference ([LoPAR]_) document is called pSeries. This environment
is also known as sPAPR, System p guests, or simply Power Linux guests (although
it is capable of running other operating systems, such as AIX).

Even though pSeries is designed to behave as a guest environment, it is also
capable of acting as a hypervisor OS, providing, in that role, nested
virtualization capabilities.

Supported devices
=================

 * Multiprocessor support for many Power processor generations: POWER7,
   POWER7+, POWER8, POWER8NVL, POWER9, and Power10. Support for POWER5+ exists,
   but its state is unknown.
 * Interrupt controllers: XICS (POWER8) and XIVE (POWER9 and Power10).
 * vPHB PCIe host bridge.
 * vscsi and vnet devices, compatible with the same devices available on a
   PowerVM hypervisor with VIOS managing LPARs.
 * Virtio-based devices.
 * PCIe device passthrough.

Missing devices
===============

 * SPICE support.

Firmware
========

The pSeries platform in QEMU comes with two firmware implementations:

`SLOF <https://github.com/aik/SLOF>`_ (Slimline Open Firmware) is an
implementation of the `IEEE 1275-1994, Standard for Boot (Initialization
Configuration) Firmware: Core Requirements and Practices
<https://standards.ieee.org/standard/1275-1994.html>`_.

SLOF performs bus scanning and PCI resource allocation, and provides the client
interface used to boot from block devices and the network.

QEMU includes a prebuilt image of SLOF, which is updated when a more recent
version is required.

VOF (Virtual Open Firmware) is a minimalistic firmware that is enabled with
``-machine pseries,x-vof=on``. When enabled, the firmware acts as a slim
shim and QEMU implements parts of the IEEE 1275 Open Firmware interface.

VOF does not have device drivers and does not do PCI resource allocation. It
relies on ``-kernel`` being used with a Linux kernel recent enough (v5.4+) to
do its own PCI resource assignment. It is well suited for use with petitboot.

Booting via ``-kernel`` supports the following:

+-------------------+-------------------+------------------+
| kernel            | pseries,x-vof=off | pseries,x-vof=on |
+===================+===================+==================+
| vmlinux BE        |     ✓             |     ✓            |
+-------------------+-------------------+------------------+
| vmlinux LE        |     ✓             |     ✓            |
+-------------------+-------------------+------------------+
| zImage.pseries BE |     ✓¹            |     ✓¹           |
+-------------------+-------------------+------------------+
| zImage.pseries LE |     ✓             |     ✓            |
+-------------------+-------------------+------------------+

¹ ``kernel-addr=0`` must be set.
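
The VOF column of the table above can be exercised with a command line along
the following lines. This is only a sketch: the ``vof_cmdline`` helper and the
kernel file names are hypothetical, and the command is printed rather than
executed so that it can be inspected first.

.. code-block:: bash

  # Hypothetical helper: print a pseries command line that boots a kernel
  # through VOF. $1 is the kernel image, $2 is "be-zimage" when the image is
  # a big-endian zImage.pseries, which needs kernel-addr=0 (footnote 1).
  vof_cmdline() {
    machine="pseries,x-vof=on"
    if [ "$2" = "be-zimage" ]; then
      machine="$machine,kernel-addr=0"
    fi
    echo "qemu-system-ppc64 -machine $machine -kernel $1 -nographic"
  }

  vof_cmdline ./zImage.pseries be-zimage
  vof_cmdline ./vmlinux other

Passing ``kernel-addr`` as a ``-machine`` property and adding ``-nographic``
are assumptions of this sketch; adjust them to the guest being booted.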

Build directions
================

.. code-block:: bash

  ./configure --target-list=ppc64-softmmu && make

Running instructions
====================

The pSeries machine type is selected by running QEMU with the following
options:

.. code-block:: bash

  qemu-system-ppc64 -M pseries <other QEMU arguments>

sPAPR devices
=============

The sPAPR specification defines a set of para-virtualized devices, which are
also supported by the pSeries machine in QEMU and can be instantiated with the
``-device`` option:

* ``spapr-vlan``: a virtual network interface.
* ``spapr-vscsi``: a virtual SCSI disk interface.
* ``spapr-rng``: a pseudo-device for passing random number generator data to the
  guest (see the `H_RANDOM hypercall feature
  <https://wiki.qemu.org/Features/HRandomHypercall>`_ for details).
* ``spapr-vty``: a virtual teletype.
* ``spapr-pci-host-bridge``: a PCI host bridge.
* ``tpm-spapr``: a Trusted Platform Module (TPM).
* ``spapr-tpm-proxy``: a TPM proxy.

These are compatible with the devices historically available for use when
running the IBM PowerVM hypervisor with LPARs.

However, since these devices were originally specified with another
hypervisor and non-Linux guests in mind, you should use the virtio counterparts
(virtio-net, virtio-blk/scsi and virtio-rng, for instance) instead if possible,
since they will most probably give you better performance with Linux guests in a
QEMU environment.
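
As a rough illustration of the advice above, the following sketch maps each
para-virtualized sPAPR device to a virtio PCI device that plays the same role.
The ``virtio_counterpart`` helper is hypothetical, and the choice of the
``-pci`` variants of the virtio devices is an assumption; the commands are
printed rather than executed.

.. code-block:: bash

  # Hypothetical helper: print the virtio device suggested in place of a
  # given sPAPR device; anything without a listed counterpart is unchanged.
  virtio_counterpart() {
    case "$1" in
      spapr-vlan)  echo virtio-net-pci ;;
      spapr-vscsi) echo virtio-scsi-pci ;;
      spapr-rng)   echo virtio-rng-pci ;;
      *)           echo "$1" ;;
    esac
  }

  # Same user-mode network backend, attached through either front end.
  for nic in spapr-vlan "$(virtio_counterpart spapr-vlan)"; do
    echo "qemu-system-ppc64 -M pseries -netdev user,id=net0 -device $nic,netdev=net0"
  done

Only the ``-device`` argument differs between the two printed commands, so a
guest configuration can be switched from one front end to the other without
touching the backend definition.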

The pSeries machine in QEMU is always instantiated with the following devices:

* An NVRAM device (``spapr-nvram``).
* A virtual teletype (``spapr-vty``).
* A PCI host bridge (``spapr-pci-host-bridge``).

Hence, there is no need to add them manually, unless you use the ``-nodefaults``
command line option in QEMU.

To make the contents of the default ``spapr-nvram`` device persistent, specify
a PFLASH device when starting QEMU: either use
``-drive if=pflash,file=<filename>,format=raw`` to set the default PFLASH
device, or specify one with an ID
(``-drive if=none,file=<filename>,format=raw,id=pfid``) and pass that ID to the
NVRAM device with ``-global spapr-nvram.drive=pfid``.
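
A minimal sketch of the first variant follows. The file name and the 64 KiB
size are assumptions (check the NVRAM size limits of the QEMU version in use),
and the QEMU command is printed rather than executed.

.. code-block:: bash

  # Backing file for the NVRAM contents; name and size are assumptions.
  NVRAM_FILE="${NVRAM_FILE:-./pseries-nvram.img}"

  # Create a zero-filled 64 KiB raw image the first time only, so that
  # later runs keep whatever the guest stored in NVRAM.
  if [ ! -f "$NVRAM_FILE" ]; then
    dd if=/dev/zero of="$NVRAM_FILE" bs=1024 count=64 2>/dev/null
  fi

  echo "qemu-system-ppc64 -M pseries -drive if=pflash,file=$NVRAM_FILE,format=raw"

Reusing the same file on every start is what makes the NVRAM contents survive
across guest runs.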

sPAPR specification
-------------------

The main source of documentation on the sPAPR standard is the [LoPAR]_ document.
However, documentation specific to QEMU's implementation of the specification
can also be found in the QEMU documentation:

.. toctree::
   :maxdepth: 1

   ../../specs/ppc-spapr-hotplug.rst
   ../../specs/ppc-spapr-hcalls.rst
   ../../specs/ppc-spapr-numa.rst
   ../../specs/ppc-spapr-uv-hcalls.rst
   ../../specs/ppc-spapr-xive.rst

Switching between the KVM-PR and KVM-HV kernel modules
======================================================

Currently, there are two implementations of KVM on Power, ``kvm_hv.ko`` and
``kvm_pr.ko``.

If a host supports both KVM modes, and both KVM kernel modules are loaded, it is
possible to switch between the two modes with the ``kvm-type`` parameter:

* Use ``qemu-system-ppc64 -M pseries,accel=kvm,kvm-type=PR`` to use the
  ``kvm_pr.ko`` kernel module.
* Use ``qemu-system-ppc64 -M pseries,accel=kvm,kvm-type=HV`` to use ``kvm_hv.ko``
  instead.
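
Which of the two modules is loaded can be checked before choosing a
``kvm-type`` value. The sketch below assumes that loaded modules show up as
directories under ``/sys/module``; the ``pick_kvm_type`` helper is
hypothetical, and its preference for HV when both modules are present is an
arbitrary choice.

.. code-block:: bash

  # Hypothetical helper: print HV, PR or none depending on which KVM module
  # directory exists under $1 (default: /sys/module).
  pick_kvm_type() {
    moddir="${1:-/sys/module}"
    if [ -d "$moddir/kvm_hv" ]; then
      echo HV
    elif [ -d "$moddir/kvm_pr" ]; then
      echo PR
    else
      echo none
    fi
  }

  kvm_type="$(pick_kvm_type)"
  if [ "$kvm_type" = none ]; then
    echo "neither kvm_hv nor kvm_pr is loaded"
  else
    echo "qemu-system-ppc64 -M pseries,accel=kvm,kvm-type=$kvm_type"
  fi

The directory argument exists only so that the helper can be tried against a
scratch directory instead of the live ``/sys/module`` tree.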

KVM-PR
------

KVM-PR uses the so-called **PR**\ oblem state of the PPC CPUs to run the guests,
i.e. the virtual machine is run in user mode and all privileged instructions
trap and have to be emulated by the host. That means you can run KVM-PR inside
a pSeries guest (or a PowerVM LPAR for that matter), and that is where it
originated, as historically (prior to POWER7) it was not possible to run Linux
in hypervisor mode on a Power processor (this function was restricted to
PowerVM, the IBM proprietary hypervisor).

Because all privileged instructions are trapped, guests that use a lot of
privileged instructions run quite slowly with KVM-PR. On the other hand, because
of that, this kernel module can run on pretty much any PPC hardware, and is
able to emulate a lot of guest CPUs. This module can even be used to run other
PowerPC guests like an emulated PowerMac.

As KVM-PR can be run inside a pSeries guest, it can also provide nested
virtualization capabilities (i.e. running a guest from within a guest).

It is important to note that, as KVM-HV provides much better execution
performance, maintenance work has been much more focused on it in the past
years. Maintenance for KVM-PR has been minimal.

In order to run KVM-PR guests with POWER9 processors, QEMU must be started
with the ``kernel_irqchip=off`` command line option.

KVM-HV
------

KVM-HV uses the hypervisor mode of more recent Power processors, which allows
access to the bare metal hardware directly. Although POWER7 had this capability,
it was only starting with POWER8 that this was officially supported by IBM.

Originally, KVM-HV was only available when running on a PowerNV platform (a.k.a.
Power bare metal). Although it runs on a PowerNV platform, it can only be used
to start pSeries guests. As the pSeries guest doesn't have access to the
hypervisor mode of the Power CPU, it wasn't possible to run KVM-HV in a guest.
This limitation has been lifted, and now it is possible to run KVM-HV inside
pSeries guests as well, making nested virtualization possible with KVM-HV.

As KVM-HV has access to privileged instructions, guests that use a lot of these
can run much faster than with KVM-PR. On the other hand, the guest CPU has to be
of the same type as the host CPU this way, e.g. it is not possible to specify an
embedded PPC CPU for the guest with KVM-HV. However, there is at least the
possibility to run the guest in a backward-compatibility mode of the previous
CPU generations, e.g. you can run a POWER7 guest on a POWER8 host by using
``-cpu POWER8,compat=power7`` as parameter to QEMU.

Modules support
===============

As noted in the sections above, each module can run in a different
environment. The following table shows the environments in which each module
can run. As long as you are in a supported environment, you can run KVM-PR or
KVM-HV nested. Combinations not shown in the table are not available.

+--------------+------------+------+-------------------+----------+--------+
| Platform     | Host type  | Bits | Page table format | KVM-HV   | KVM-PR |
+==============+============+======+===================+==========+========+
| PowerNV      | bare metal | 32   | hash              | no       | yes    |
|              |            |      +-------------------+----------+--------+
|              |            |      | radix             | N/A      | N/A    |
|              |            +------+-------------------+----------+--------+
|              |            | 64   | hash              | yes      | yes    |
|              |            |      +-------------------+----------+--------+
|              |            |      | radix             | yes      | no     |
+--------------+------------+------+-------------------+----------+--------+
| pSeries [1]_ | PowerNV    | 32   | hash              | no       | yes    |
|              |            |      +-------------------+----------+--------+
|              |            |      | radix             | N/A      | N/A    |
|              |            +------+-------------------+----------+--------+
|              |            | 64   | hash              | no       | yes    |
|              |            |      +-------------------+----------+--------+
|              |            |      | radix             | yes [2]_ | no     |
|              +------------+------+-------------------+----------+--------+
|              | PowerVM    | 32   | hash              | no       | yes    |
|              |            |      +-------------------+----------+--------+
|              |            |      | radix             | N/A      | N/A    |
|              |            +------+-------------------+----------+--------+
|              |            | 64   | hash              | no       | yes    |
|              |            |      +-------------------+----------+--------+
|              |            |      | radix [3]_        | no       | yes    |
+--------------+------------+------+-------------------+----------+--------+

.. [1] On POWER9 DD2.1 processors, the page table format on the host and guest
   must be the same.

.. [2] KVM-HV cannot run nested on POWER8 machines.

.. [3] Introduced on Power10 machines.
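
To find the row of the table that applies to a given Linux host, the page
table format in use can be read from ``/proc/cpuinfo``. The sketch below
assumes that the host kernel reports it on an ``MMU`` line (for example
``MMU : Radix``); the ``host_mmu`` helper is hypothetical.

.. code-block:: bash

  # Hypothetical helper: print "radix", "hash" or "unknown" from the MMU line
  # of a cpuinfo-style file ($1, default: /proc/cpuinfo).
  host_mmu() {
    awk -F': *' '
      /^MMU/ { print tolower($2); found = 1; exit }
      END    { if (!found) print "unknown" }
    ' "${1:-/proc/cpuinfo}"
  }

  if [ -r /proc/cpuinfo ]; then
    host_mmu
  fi

A result of ``unknown`` only means that no ``MMU`` line was found, which is
expected on kernels or architectures that do not report it.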


.. _power-papr-protected-execution-facility-pef:

POWER (PAPR) Protected Execution Facility (PEF)
-----------------------------------------------

Protected Execution Facility (PEF), also known as Secure Guest support,
is a feature found on IBM POWER9 and Power10 processors.

If a suitable firmware including an ultravisor is installed, it adds
an extra memory protection mode to the CPU.  The ultravisor manages a
pool of secure memory which cannot be accessed by the hypervisor.

When this feature is enabled in QEMU, a guest can use ultracalls to
enter "secure mode".  This transfers most of its memory to secure
memory, where it cannot be eavesdropped on by a compromised hypervisor.

Launching
^^^^^^^^^

To launch a guest which will be permitted to enter PEF secure mode::

  $ qemu-system-ppc64 \
      -object pef-guest,id=pef0 \
      -machine confidential-guest-support=pef0 \
      ...

Live Migration
^^^^^^^^^^^^^^

Live migration is not yet implemented for PEF guests.  For
consistency, QEMU currently prevents migration if the PEF feature is
enabled, whether or not the guest has actually entered secure mode.


Maintainer contact information
==============================

Cédric Le Goater <clg@kaod.org>

Daniel Henrique Barboza <danielhb413@gmail.com>

.. [LoPAR] `Linux on Power Architecture Reference document (LoPAR) revision
   2.9 <https://openpowerfoundation.org/wp-content/uploads/2020/07/LoPAR-20200812.pdf>`_.