Compute Express Link (CXL)
==========================
From the view of a single host, CXL is an interconnect standard that
targets accelerators and memory devices attached to a CXL host.
This description will focus on those aspects visible either to
software running on a QEMU emulated host or to the internals of
functional emulation. As such, it will skip over many of the
electrical and protocol elements that would be more of interest
for real hardware and will dominate more general introductions to CXL.
It will also completely ignore the fabric management aspects of CXL
by considering only a single host and a static configuration.

CXL shares many concepts and much of the infrastructure of PCI Express,
with CXL Host Bridges, which have CXL Root Ports which may be directly
attached to CXL or PCI End Points. Alternatively there may be CXL Switches
with CXL and PCI Endpoints attached below them. In many cases additional
control and capabilities are exposed via PCI Express interfaces.
This sharing of interfaces and hence emulation code is reflected
in how the devices are emulated in QEMU. In most cases the various
CXL elements are built upon equivalent PCIe devices.

CXL devices support the following interfaces:

* Most conventional PCIe interfaces

  - Configuration space access
  - BAR mapped memory accesses used for registers and mailboxes.
  - MSI/MSI-X
  - AER
  - DOE mailboxes
  - IDE
  - Many other PCI Express defined interfaces.

* Memory operations

  - Equivalent of accessing DRAM / NVDIMMs. Any access / feature
    supported by the host for normal memory should also work for
    CXL attached memory devices.

* Cache operations. These are mostly irrelevant to QEMU emulation as
  QEMU is not emulating a coherency protocol. Any emulation related
  to these will be device specific and is out of the scope of this
  document.

CXL 2.0 Device Types
--------------------
CXL 2.0 End Points are often categorized into three types.

**Type 1:** These support coherent caching of host memory. An example might
be a crypto accelerator. They may also have device private memory accessible
via means such as PCI memory reads and writes to BARs.

**Type 2:** These support coherent caching of host memory and host
managed device memory (HDM) for which the coherency protocol is managed
by the host. This is a complex topic, so for more information on CXL
coherency see the CXL 2.0 specification.

**Type 3 Memory devices:** These devices act as a means of attaching
additional memory (HDM) to a CXL host including both volatile and
persistent memory. The CXL topology may support interleaving across a
number of Type 3 memory devices using HDM Decoders in the host, host
bridge, switch upstream port and endpoints.

Scope of CXL emulation in QEMU
------------------------------
The focus of CXL emulation is CXL revision 2.0 and later. Earlier CXL
revisions defined a smaller set of features, leaving much of the control
interface as implementation defined or device specific. This made generic
emulation challenging, with host specific firmware being responsible
for setup and the Endpoints being presented to operating systems
as Root Complex Integrated End Points. CXL rev 2.0 looks a lot
more like PCI Express, with fully specified discoverability
of the CXL topology.

CXL System components
---------------------
A CXL system is made up of a Host with a number of 'standard components'
the control and capabilities of which are discoverable by system software
using means described in the CXL 2.0 specification.

CXL Fixed Memory Windows (CFMW)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
A CFMW consists of a particular range of Host Physical Address space
which is routed to particular CXL Host Bridges. At time of generic
software initialization it will have a particular interleaving
configuration and associated Quality of Service Throttling Group (QTG).
This information is available to system software, when making
decisions about how to configure interleave across available CXL
memory devices. It is provided as CFMW Structures (CFMWS) in
the CXL Early Discovery Table (CEDT), an ACPI table.

Note: QTG 0 is the only one currently supported in QEMU.

CXL Host Bridge (CXL HB)
~~~~~~~~~~~~~~~~~~~~~~~~
A CXL host bridge is similar to the PCIe equivalent, but with a
specification defined register interface called CXL Host Bridge
Component Registers (CHBCR). The location of this CHBCR MMIO
space is described to system software via a CXL Host Bridge
Structure (CHBS) in the CEDT ACPI table.
The actual interfaces
are identical to those used for other parts of the CXL hierarchy
as CXL Component Registers in PCI BARs.

Interfaces provided include:

* Configuration of HDM Decoders to route CXL Memory accesses with
  a particular Host Physical Address range to the target port
  below which the CXL device servicing that address lies. This
  may be a mapping to a single Root Port (RP) or across a set of
  target RPs.

CXL Root Ports (CXL RP)
~~~~~~~~~~~~~~~~~~~~~~~
A CXL Root Port serves the same purpose as a PCIe Root Port.
There are a number of CXL specific Designated Vendor Specific
Extended Capabilities (DVSEC) in PCIe Configuration Space
and associated component register access via PCI BARs.

CXL Switch
~~~~~~~~~~
Here we consider a simple CXL switch with only a single
virtual hierarchy. Whilst more complex devices exist, their
visibility to a particular host is generally the same as for
a simple switch design. Hosts often have no awareness
of complex rerouting and device pooling; they simply see
devices being hot added or hot removed.
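
As a rough illustration of the HDM decoder routing described for the host
bridge above (a decoder claims a Host Physical Address range and forwards
accesses either to one target port or across a set of them), the following
Python sketch models the decision. It is a simplified model under stated
assumptions, not QEMU's implementation and not the register layout from the
CXL specification: the names ``HdmDecoder`` and ``route_access`` are
hypothetical, the address ranges are made up, and the decoder base is assumed
to be aligned to the granularity multiplied by the number of targets.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class HdmDecoder:
    """Hypothetical, simplified model of one routing HDM decoder."""
    base: int            # first Host Physical Address (HPA) claimed
    size: int            # length of the claimed HPA range in bytes
    granularity: int     # interleave granularity in bytes
    targets: List[str]   # target ports; a single entry means no interleave

    def route(self, hpa: int) -> Optional[str]:
        """Return the target port for hpa, or None if hpa is out of range."""
        if not (self.base <= hpa < self.base + self.size):
            return None
        # Assumption: base is aligned to granularity * len(targets), so
        # counting granules from base picks the same way as HPA bits would.
        way = ((hpa - self.base) // self.granularity) % len(self.targets)
        return self.targets[way]


def route_access(decoders: List[HdmDecoder], hpa: int) -> Optional[str]:
    """Return the target of the first decoder that claims hpa, else None."""
    for decoder in decoders:
        target = decoder.route(hpa)
        if target is not None:
            return target
    return None


# Two decoders in one host bridge: a plain mapping and a 2 way interleave.
# The address ranges are illustrative values only.
hb0 = [
    HdmDecoder(0x1_0000_0000, 0x1000_0000, 256, ["rp0"]),
    HdmDecoder(0x1_1000_0000, 0x1000_0000, 256, ["rp0", "rp1"]),
]
print(route_access(hb0, 0x1_0000_0100))  # rp0 (single target)
print(route_access(hb0, 0x1_1000_0100))  # rp1 (second 256 byte granule)
```

An address outside every decoder range returns ``None`` in this model, which
stands in for an access that the component does not claim.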

A CXL switch has a similar architecture to those in PCIe,
with a single upstream port, internal PCI bus and multiple
downstream ports.

Both the CXL upstream and downstream ports have CXL specific
DVSECs in configuration space, and component registers in PCI
BARs. The Upstream Port has the configuration interfaces for
the HDM decoders which route incoming memory accesses to the
appropriate downstream port.

A CXL switch is created in a similar fashion to PCI switches
by creating an upstream port (cxl-upstream) and a number of
downstream ports on the internal switch bus (cxl-downstream).

CXL Memory Devices - Type 3
~~~~~~~~~~~~~~~~~~~~~~~~~~~
CXL type 3 devices use a PCI class code and are intended to be supported
by a generic operating system driver. They have HDM decoders,
though in these EP devices the decoder is responsible not for
routing but for translation of the incoming host physical address (HPA)
into a Device Physical Address (DPA).

CXL Memory Interleave
---------------------
To understand the interaction of different CXL hardware components which
are emulated in QEMU, let us consider a memory read in a fully configured
CXL topology.
Note that system software is responsible for configuration
of all components with the exception of the CFMWs. System software is
responsible for allocating appropriate ranges from within the CFMWs
and exposing those via normal memory configurations as would be done
for system RAM.

Example system topology. x marks the match in each decoder level::

  |<------------------SYSTEM PHYSICAL ADDRESS MAP (1)----------------->|
  |    __________   __________________________________   __________    |
  |   |          | |                                  | |          |   |
  |   | CFMW 0   | |  CXL Fixed Memory Window 1       | | CFMW 2   |   |
  |   | HB0 only | |  Configured to interleave memory | | HB1 only |   |
  |   |          | |  accesses across HB0/HB1         | |          |   |
  |   |__________| |_____x____________________________| |__________|   |
            |            |                     |             |
            |            |                     |             |
            |            |                     |             |
            |     Interleave Decoder           |             |
            |     Matches this HB              |             |
            \____________|                     |_____________/
               __________|__________      _____|_______________
              |                     |    |                     |
       (2)    | CXL HB 0            |    | CXL HB 1            |
              | HB IntLv Decoders   |    | HB IntLv Decoders   |
              | PCI/CXL Root Bus 0c |    | PCI/CXL Root Bus de |
              |                     |    |                     |
              |___x_________________|    |_____________________|
                  |                |       |               |
                  |                |       |               |
         A HB 0 HDM Decoder        |       |               |
         matches this Port         |       |               |
                  |                |       |               |
       ___________|___   __________|__   __|_________   ___|_________
   (3)|  Root Port 0  | | Root Port 1 | | Root Port 2| | Root Port 3 |
      |  Appears in   | | Appears in  | | Appears in | | Appears in  |
      |  PCI topology | | PCI topology| | PCI topo   | | PCI topo    |
      |  as 0c:00.0   | | as 0c:01.0  | | as de:00.0 | | as de:01.0  |
      |_______________| |_____________| |____________| |_____________|
            |                  |               |              |
            |                  |               |              |
       _____|_________   ______|______   ______|_____   ______|_______
   (4)|     x         | |             | |            | |              |
      | CXL Type3 0   | | CXL Type3 1 | | CXL Type3 2| | CXL Type3 3  |
      |               | |             | |            | |              |
      | PMEM0(Vol LSA)| | PMEM1 (...) | | PMEM2 (...)| | PMEM3 (...)  |
      | Decoder to go | |             | |            | |              |
      | from host PA  | | PCI 0e:00.0 | | PCI df:00.0| | PCI e0:00.0  |
      | to device PA  | |             | |            | |              |
      | PCI as 0d:00.0| |             | |            | |              |
      |_______________| |_____________| |____________| |______________|

Notes:

(1) **3 CXL Fixed Memory Windows (CFMW)** corresponding to different
    ranges of the system physical address map. Each CFMW has a
    particular interleave setup across the CXL Host Bridges (HB).
    CFMW0 provides uninterleaved access to HB0, CFMW2 provides
    uninterleaved access to HB1. CFMW1 provides interleaved memory access
    across HB0 and HB1.

(2) **Two CXL Host Bridges**. Each of these has 2 CXL Root Ports and
    programmable HDM decoders to route memory accesses either to
    a single port or interleave them across multiple ports.
    A complex configuration here might be to use the following HDM
    decoders in HB0. HDM0 routes CFMW0 requests to RP0 and hence
    part of CXL Type3 0. HDM1 routes CFMW0 requests from a
    different region of the CFMW0 PA range to RP1 and hence part
    of CXL Type3 1. HDM2 routes yet another PA range from within
    CFMW0 to be interleaved across RP0 and RP1, providing 2 way
    interleave of part of the memory provided by CXL Type3 0 and
    CXL Type3 1. HDM3 routes those interleaved accesses from
    CFMW1 that target HB0 to RP0 and another part of the memory of
    CXL Type3 0 (as part of a 2 way interleave at the system level
    across, for example, CXL Type3 0 and CXL Type3 2).
    HDM4 is used to enable system wide 4 way interleave across all
    the present CXL Type3 devices, by interleaving those (interleaved)
    requests that HB0 receives from CFMW1 across RP0 and
    RP1 and hence to yet more regions of the memory of the
    attached Type3 devices. Note this is a representative subset
    of the full range of possible HDM decoder configurations in this
    topology.

(3) **Four CXL Root Ports.** In this case the CXL Type 3 devices are
    directly attached to these ports.

(4) **Four CXL Type3 memory expansion devices.** These will each have
    HDM decoders, but in this case rather than performing interleave
    they will take the Host Physical Addresses of accesses and map
    them to their own local Device Physical Address Space (DPA).

Example topology involving a switch::

  |<------------------SYSTEM PHYSICAL ADDRESS MAP (1)----------------->|
  |    __________   __________________________________   __________    |
  |   |          | |                                  | |          |   |
  |   | CFMW 0   | |  CXL Fixed Memory Window 1       | | CFMW 2   |   |
  |   | HB0 only | |  Configured to interleave memory | | HB1 only |   |
  |   |          | |  accesses across HB0/HB1         | |          |   |
  |   |_____x____| |__________________________________| |__________|   |
            |            |                     |             |
            |            |                     |             |
                         |                     |             |
   Interleave Decoder    |                     |             |
   Matches this HB       |                     |             |
            \____________|                     |_____________/
               __________|__________      _____|_______________
              |                     |    |                     |
              | CXL HB 0            |    | CXL HB 1            |
              | HB IntLv Decoders   |    | HB IntLv Decoders   |
              | PCI/CXL Root Bus 0c |    | PCI/CXL Root Bus de |
              |                     |    |                     |
              |___x_________________|    |_____________________|
                  |                |       |               |
                  |
         A HB 0 HDM Decoder
         matches this Port
       ___________|___
      |  Root Port 0  |
      |  Appears in   |
      |  PCI topology |
      |  as 0c:00.0   |
      |___________x___|
                  |
                  |
                  \_____________________
                                        |
                                        |
       ---------------------------------------------------------------
      | Switch 0 USP as PCI 0d:00.0                                   |
      | USP has HDM decoders which direct traffic to                  |
      | appropriate downstream port                                   |
      | Switch BUS appears as 0e                                      |
      |_____x_________________________________________________________|
            |                  |               |              |
            |                  |               |              |
       _____|_________   ______|______   ______|_____   ______|_______
   (4)|     x         | |             | |            | |              |
      | CXL Type3 0   | | CXL Type3 1 | | CXL Type3 2| | CXL Type3 3  |
      |               | |             | |            | |              |
      | PMEM0(Vol LSA)| | PMEM1 (...) | | PMEM2 (...)| | PMEM3 (...)  |
      | Decoder to go | |             | |            | |              |
      | from host PA  | | PCI 10:00.0 | | PCI 11:00.0| | PCI 12:00.0  |
      | to device PA  | |             | |            | |              |
      | PCI as 0f:00.0| |             | |            | |              |
      |_______________| |_____________| |____________| |______________|

Example command lines
---------------------
A very simple setup with just one directly attached CXL Type 3 Persistent Memory device::

  qemu-system-x86_64 -M q35,cxl=on -m 4G,maxmem=8G,slots=8 -smp 4 \
  ...
  -object memory-backend-file,id=cxl-mem1,share=on,mem-path=/tmp/cxltest.raw,size=256M \
  -object memory-backend-file,id=cxl-lsa1,share=on,mem-path=/tmp/lsa.raw,size=256M \
  -device pxb-cxl,bus_nr=12,bus=pcie.0,id=cxl.1 \
  -device cxl-rp,port=0,bus=cxl.1,id=root_port13,chassis=0,slot=2 \
  -device cxl-type3,bus=root_port13,persistent-memdev=cxl-mem1,lsa=cxl-lsa1,id=cxl-pmem0 \
  -M cxl-fmw.0.targets.0=cxl.1,cxl-fmw.0.size=4G

A very simple setup with just one directly attached CXL Type 3 Volatile Memory device::

  qemu-system-x86_64 -M q35,cxl=on -m 4G,maxmem=8G,slots=8 -smp 4 \
  ...
  -object memory-backend-ram,id=vmem0,share=on,size=256M \
  -device pxb-cxl,bus_nr=12,bus=pcie.0,id=cxl.1 \
  -device cxl-rp,port=0,bus=cxl.1,id=root_port13,chassis=0,slot=2 \
  -device cxl-type3,bus=root_port13,volatile-memdev=vmem0,id=cxl-vmem0 \
  -M cxl-fmw.0.targets.0=cxl.1,cxl-fmw.0.size=4G

The same volatile setup may optionally include an LSA region::

  qemu-system-x86_64 -M q35,cxl=on -m 4G,maxmem=8G,slots=8 -smp 4 \
  ...
  -object memory-backend-ram,id=vmem0,share=on,size=256M \
  -object memory-backend-file,id=cxl-lsa0,share=on,mem-path=/tmp/lsa.raw,size=256M \
  -device pxb-cxl,bus_nr=12,bus=pcie.0,id=cxl.1 \
  -device cxl-rp,port=0,bus=cxl.1,id=root_port13,chassis=0,slot=2 \
  -device cxl-type3,bus=root_port13,volatile-memdev=vmem0,lsa=cxl-lsa0,id=cxl-vmem0 \
  -M cxl-fmw.0.targets.0=cxl.1,cxl-fmw.0.size=4G

A setup suitable for 4 way interleave. Only one fixed window provided, to enable 2 way
interleave across 2 CXL host bridges. Each host bridge has 2 CXL Root Ports, with
the CXL Type3 device directly attached (no switches)::

  qemu-system-x86_64 -M q35,cxl=on -m 4G,maxmem=8G,slots=8 -smp 4 \
  ...
  -object memory-backend-file,id=cxl-mem1,share=on,mem-path=/tmp/cxltest.raw,size=256M \
  -object memory-backend-file,id=cxl-mem2,share=on,mem-path=/tmp/cxltest2.raw,size=256M \
  -object memory-backend-file,id=cxl-mem3,share=on,mem-path=/tmp/cxltest3.raw,size=256M \
  -object memory-backend-file,id=cxl-mem4,share=on,mem-path=/tmp/cxltest4.raw,size=256M \
  -object memory-backend-file,id=cxl-lsa1,share=on,mem-path=/tmp/lsa.raw,size=256M \
  -object memory-backend-file,id=cxl-lsa2,share=on,mem-path=/tmp/lsa2.raw,size=256M \
  -object memory-backend-file,id=cxl-lsa3,share=on,mem-path=/tmp/lsa3.raw,size=256M \
  -object memory-backend-file,id=cxl-lsa4,share=on,mem-path=/tmp/lsa4.raw,size=256M \
  -device pxb-cxl,bus_nr=12,bus=pcie.0,id=cxl.1 \
  -device pxb-cxl,bus_nr=222,bus=pcie.0,id=cxl.2 \
  -device cxl-rp,port=0,bus=cxl.1,id=root_port13,chassis=0,slot=2 \
  -device cxl-type3,bus=root_port13,persistent-memdev=cxl-mem1,lsa=cxl-lsa1,id=cxl-pmem0 \
  -device cxl-rp,port=1,bus=cxl.1,id=root_port14,chassis=0,slot=3 \
  -device cxl-type3,bus=root_port14,persistent-memdev=cxl-mem2,lsa=cxl-lsa2,id=cxl-pmem1 \
  -device cxl-rp,port=0,bus=cxl.2,id=root_port15,chassis=0,slot=5 \
  -device cxl-type3,bus=root_port15,persistent-memdev=cxl-mem3,lsa=cxl-lsa3,id=cxl-pmem2 \
  -device cxl-rp,port=1,bus=cxl.2,id=root_port16,chassis=0,slot=6 \
  -device cxl-type3,bus=root_port16,persistent-memdev=cxl-mem4,lsa=cxl-lsa4,id=cxl-pmem3 \
  -M cxl-fmw.0.targets.0=cxl.1,cxl-fmw.0.targets.1=cxl.2,cxl-fmw.0.size=4G,cxl-fmw.0.interleave-granularity=8k

An example of 4 devices below a switch suitable for 1, 2 or 4 way interleave::

  qemu-system-x86_64 -M q35,cxl=on -m 4G,maxmem=8G,slots=8 -smp 4 \
  ...
  -object memory-backend-file,id=cxl-mem0,share=on,mem-path=/tmp/cxltest.raw,size=256M \
  -object memory-backend-file,id=cxl-mem1,share=on,mem-path=/tmp/cxltest1.raw,size=256M \
  -object memory-backend-file,id=cxl-mem2,share=on,mem-path=/tmp/cxltest2.raw,size=256M \
  -object memory-backend-file,id=cxl-mem3,share=on,mem-path=/tmp/cxltest3.raw,size=256M \
  -object memory-backend-file,id=cxl-lsa0,share=on,mem-path=/tmp/lsa0.raw,size=256M \
  -object memory-backend-file,id=cxl-lsa1,share=on,mem-path=/tmp/lsa1.raw,size=256M \
  -object memory-backend-file,id=cxl-lsa2,share=on,mem-path=/tmp/lsa2.raw,size=256M \
  -object memory-backend-file,id=cxl-lsa3,share=on,mem-path=/tmp/lsa3.raw,size=256M \
  -device pxb-cxl,bus_nr=12,bus=pcie.0,id=cxl.1 \
  -device cxl-rp,port=0,bus=cxl.1,id=root_port0,chassis=0,slot=0 \
  -device cxl-rp,port=1,bus=cxl.1,id=root_port1,chassis=0,slot=1 \
  -device cxl-upstream,bus=root_port0,id=us0 \
  -device cxl-downstream,port=0,bus=us0,id=swport0,chassis=0,slot=4 \
  -device cxl-type3,bus=swport0,persistent-memdev=cxl-mem0,lsa=cxl-lsa0,id=cxl-pmem0 \
  -device cxl-downstream,port=1,bus=us0,id=swport1,chassis=0,slot=5 \
  -device cxl-type3,bus=swport1,persistent-memdev=cxl-mem1,lsa=cxl-lsa1,id=cxl-pmem1 \
  -device cxl-downstream,port=2,bus=us0,id=swport2,chassis=0,slot=6 \
  -device cxl-type3,bus=swport2,persistent-memdev=cxl-mem2,lsa=cxl-lsa2,id=cxl-pmem2 \
  -device cxl-downstream,port=3,bus=us0,id=swport3,chassis=0,slot=7 \
  -device cxl-type3,bus=swport3,persistent-memdev=cxl-mem3,lsa=cxl-lsa3,id=cxl-pmem3 \
  -M cxl-fmw.0.targets.0=cxl.1,cxl-fmw.0.size=4G,cxl-fmw.0.interleave-granularity=4k

Deprecations
------------

The Type 3 device [memdev] attribute has been deprecated in favor of the
[persistent-memdev] attribute. [memdev] will default to a persistent memory
device for backward compatibility and cannot be used in combination
with [persistent-memdev].

Kernel Configuration Options
----------------------------

In Linux 5.18 the following options are necessary to make use of
OS management of CXL memory devices as described here.

* CONFIG_CXL_BUS
* CONFIG_CXL_PCI
* CONFIG_CXL_ACPI
* CONFIG_CXL_PMEM
* CONFIG_CXL_MEM
* CONFIG_CXL_PORT
* CONFIG_CXL_REGION

References
----------

- Consortium website for specifications etc:
  http://www.computeexpresslink.org
- Compute Express Link (CXL) Specification, Revision 3.1, August 2023