P9 XIVE Exploitation
====================
3
4.. _xive-device-tree:
5
I - Device-tree updates
-----------------------
8
9 1) The existing OPAL ``/interrupt-controller@0`` node remains
10
11    This node represents both the emulated XICS source controller and
12    an abstraction of the virtualization engine. This represents the
    fact that OPAL set_xive/get_xive functions are still supported
14    though they don't provide access to the full functionality.
15
16    It is still the parent of all interrupts in the device-tree.
17
18    New or modified properties:
19
20    - ``compatible`` : This is extended with a new value ``ibm,opal-xive-vc``
21
22
23 2) The new ``/interrupt-controller@<addr>`` node
24
25    This node represents both the emulated XICS presentation controller
26    and the new XIVE presentation layer.
27
28    Unlike the traditional XICS, there is only one such node for the whole
29    system.
30
31    New or modified properties:
32
33    - ``compatible`` : This contains at least the following strings:
34
35      - ``ibm,opal-intc`` : This represents the emulated XICS presentation
        facility and might be the only string present if the version of
        OPAL doesn't support XIVE exploitation.
38      - ``ibm,opal-xive-pe`` : This represents the XIVE presentation
39        engine.
40
41    - ``ibm,xive-eq-sizes`` : One cell per size supported, contains log2
42      of size, in ascending order.
43
44    - ``ibm,xive-#priorities`` : One cell, the number of supported priorities
45      (the priorities will be 0...n)
46
47    - ``ibm,xive-provision-page-size`` : Page size (in bytes) of the pages to
48      pass to OPAL for provisioning internal structures
49      (see opal_xive_donate_page). If this is absent, OPAL will never require
50      additional provisioning. The page must be naturally aligned.
51
52    - ``ibm,xive-provision-chips`` : The list of chip IDs for which provisioning
      is required. Typically, if a VP allocation returns OPAL_XIVE_PROVISIONING,
54      opal_xive_donate_page() will need to be called to donate a page to
55      *each* of these chips before trying again.
56
57    - ``reg`` property contains the addresses & sizes for the register
58      ranges corresponding respectively to the 4 rings:
59
60      - Ultravisor level
61      - Hypervisor level
62      - Guest OS level
63      - User level
64
65      For any of these, a size of 0 means this level is not supported.
66
    - ``single-escalation-support`` (optional). When present, indicates that
68      the "single escalation" feature is supported, thus enabling the use
69      of the OPAL_XIVE_VP_SINGLE_ESCALATION flag.
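
    As a minimal sketch of how an OS might consume the log2 values in
    ``ibm,xive-eq-sizes`` and check the natural alignment required for
    provisioning pages (the helper names below are hypothetical and are
    not part of the OPAL API):

    .. code-block:: c

       #include <stdbool.h>
       #include <stdint.h>

       /* Convert one "ibm,xive-eq-sizes" cell (log2 of the size) to bytes. */
       static inline uint64_t xive_eq_size_bytes(uint32_t log2_size)
       {
               return 1ull << log2_size;
       }

       /* A page is naturally aligned when its address is a multiple of
        * its size. "size" must be a power of two. */
       static inline bool xive_naturally_aligned(uint64_t addr, uint64_t size)
       {
               return (addr & (size - 1)) == 0;
       }

    For instance, a cell value of 16 describes a 64k queue, and a 64k
    provisioning page at 0x11000 would be rejected by the alignment check.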
70
 3) Interrupt descriptors
72
73    The interrupt descriptors (aka "interrupts" properties and parts
74    of "interrupt-map" properties) remain 2 cells. The first cell is
75    a global interrupt number which represents a unique interrupt
76    source in the system and is an abstraction provided by OPAL.
77
78    The default configuration for all sources in the IVT/EAS is to
79    issue that number (it's internally a combination of the source
80    chip and per-chip interrupt number but the details of that
81    combination are not exposed and subject to change).
82
    The second cell remains as usual "0" for an edge interrupt and
    "1" for a level interrupt.
85
86 4) IPIs
87
88    Each ``cpu`` node now contains an ``interrupts`` property which has
89    one entry (2 cells per entry) for each thread on that core
90    containing the interrupt number for the IPI targeted at that
91    thread.
92
93 5) Interrupt targets
94
    Targeting of interrupts uses processor targets and priority
96    numbers. The processor target encoding depends on which API is
97    used:
98
99     - The legacy opal_set/get_xive() APIs only support the old
100       "mangled" (ie. shifted by 2) HW processor numbers.
101
102     - The new opal_xive_set/get_irq_config API (and other
103       exploitation mode APIs) use a "token" VP number which is
104       described in II-2. Unmodified HW processor numbers are valid
105       VP numbers for those APIs.
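
    The difference between the two encodings can be sketched as follows.
    This assumes the "shift by 2" is a left shift of the HW processor
    number, as in the usual XICS server-number convention. The helper
    names are hypothetical:

    .. code-block:: c

       #include <stdint.h>

       /* Legacy opal_set_xive()/opal_get_xive() encoding: HW number << 2. */
       static inline uint32_t xics_mangle_server(uint32_t hw_cpu)
       {
               return hw_cpu << 2;
       }

       /* Recover the HW processor number from a legacy "mangled" value. */
       static inline uint32_t xics_unmangle_server(uint32_t server)
       {
               return server >> 2;
       }

    The exploitation mode APIs take the unmodified HW processor number, so
    no such conversion is needed there.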
106
II - General operations
-----------------------
109
Most configuration operations are abstracted via OPAL calls; there is
no direct access to or exposure of such things as real HW interrupt or VP
numbers.

OPAL sets up all the physical interrupts and assigns them numbers. It
also allocates enough virtual interrupts to provide an IPI per physical
thread in the system.
117
118All interrupts are pre-configured masked and must be set to an explicit
119target before first use. The default interrupt number is programmed
in the EAS and will remain unchanged if the targeting/unmasking is
done using the legacy set_xive() interface.
122
123An interrupt "target" is a combination of a target processor number
124and a priority.
125
126Processor numbers are in a single domain that represents both the
127physical processors and any virtual processor or group allocated
128using the interfaces defined in this specification. These numbers
129are an OPAL maintained abstraction and are only partially related
130to the real VP numbers:
131
132In order to maintain the grouping ability, when VPs are allocated
133in blocks of naturally aligned powers of 2, the underlying HW
134numbers will respect this alignment.
135
136  .. note:: The block group mode extension makes the numbering scheme
137   	    a bit more tricky than simple powers of two however, see below.
138
139
1401) Interrupt numbering and allocation
141
142   As specified in the device-tree definition, interrupt numbers
143   are abstracted by OPAL to be a 30-bit number. All HW interrupts
144   are "allocated" and configured at boot time along with enough
145   IPIs for all processor threads.
146
147   Additionally, in order to be compatible with the XICS emulation,
148   all interrupt numbers present in the device-tree (ie all physical
149   sources or pre-allocated IPIs) will fit within a 24-bit number
150   space.
151
152   Interrupt sources that are only usable in exploitation mode, such
153   as escalation interrupts, can have numbers covering the full 30-bit
154   range. The same is true of interrupts allocated dynamically.
155
156   The hypervisor can allocate additional blocks of interrupts,
157   in which case OPAL will return the resulting abstracted global
158   numbers. They will have to be individually configured to map
159   to a given number at the target and be routed to a given target
160   and priority using opal_xive_set_irq_config(). This call is
161   semantically equivalent to the old opal_set_xive() which is
162   still supported with the addition that opal_xive_set_irq_config()
163   can also specify the logical interrupt number.
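
   The two number ranges described above can be checked with a pair of
   small predicates. This is only a sketch; the macro and function names
   are hypothetical:

   .. code-block:: c

      #include <stdbool.h>
      #include <stdint.h>

      #define XIVE_GIRQ_BITS      30  /* abstracted interrupt numbers */
      #define XIVE_XICS_GIRQ_BITS 24  /* numbers usable with XICS emulation */

      /* true if "girq" fits the 30-bit abstracted number space */
      static inline bool xive_girq_valid(uint32_t girq)
      {
              return girq < (1u << XIVE_GIRQ_BITS);
      }

      /* true if "girq" also fits the 24-bit XICS-compatible space */
      static inline bool xive_girq_fits_xics(uint32_t girq)
      {
              return girq < (1u << XIVE_XICS_GIRQ_BITS);
      }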
164
1652) VP numbering and allocation
166
   A VP number is a 64-bit number. The internal make-up of that number
   is opaque to the OS. However, it is a discrete integer: when a chunk
   of VPs is allocated, the number returned is the "base" of that chunk
   and is naturally aligned on the (power of two) size of the chunk, so
   the OS can do basic arithmetic to get to all the VPs in the range.
172
173   Groups, when supported, will also be numbers in that space.
174
175   The physical processors numbering uses the same number space.
176
   The underlying HW VP numbering is hidden from the OS. The APIs
   use the system processor numbers as presented in the
   ``ibm,ppc-interrupt-server#s`` property, which correspond to the PIR
   register content, to represent physical processors within the same
   number space as dynamically allocated VPs.
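
   The "basic arithmetic" on a block of VPs can be sketched as below,
   assuming a block of 2^order VPs whose base is aligned on that size.
   The helper name is hypothetical:

   .. code-block:: c

      #include <stdbool.h>
      #include <stdint.h>

      /* Store in *out_vp the VP number at "index" inside a block of
       * 2^order VPs starting at "base". Returns false if the base is
       * not aligned on the block size or the index is out of range. */
      static inline bool xive_vp_in_block(uint64_t base, uint32_t order,
                                          uint64_t index, uint64_t *out_vp)
      {
              uint64_t count = 1ull << order;

              if ((base & (count - 1)) || index >= count)
                      return false;
              *out_vp = base + index;
              return true;
      }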
182
183   .. note:: Note about block group mode:
184
185	     The block group mode shall as much as possible be handled
186	     transparently by OPAL.
187
             For example, on a 2-chip machine, a request to allocate
             2^n VPs might result in an allocation of 2^(n-1) VPs per
             chip allocated across 2 chips. The resulting VP numbers
191	     will encode the order of the allocation allowing OPAL to
192	     reconstitute which bits are the block ID bits and which bits
193	     are the index bits in a way transparent to the OS. The overall
194	     range of numbers passed to Linux will still be contiguous.
195
196	     That implies however a limitation: We can only allocate within
197	     power-of-two number of blocks. Thus the VP allocator will limit
198	     itself to the largest power of two that can fit in the number
199	     of available chips in the machine: A machine with 3 good chips
200	     will only be able to allocate VPs from 2 of them.
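
   The power-of-two limitation amounts to rounding the number of good
   chips down to a power of two. A minimal sketch, with a hypothetical
   helper name:

   .. code-block:: c

      #include <stdint.h>

      /* Largest power of two that is <= nr_chips (0 if nr_chips is 0). */
      static inline uint32_t xive_usable_chips(uint32_t nr_chips)
      {
              uint32_t n = 1;

              if (nr_chips == 0)
                      return 0;
              while (n <= nr_chips / 2)
                      n *= 2;
              return n;
      }

   With 3 good chips this yields 2, matching the example in the note.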
201
2023) Group numbering and allocation
203
204   The group numbers are in the *same* number space as the VP
205   numbers. OPAL will internally use some bits of the VP number
206   to encode the group geometry.
207
208   [TBD] OPAL may or may not allocate a default group of all physical
209   processors, per-chip groups or per-core groups. This will be
210   represented in the device-tree somewhat...
211
212   [TBD] OPAL will provide interfaces for allocating groups
213
214
215   .. note:: Note about P/Q bit operation on sources:
216
217	     opal_xive_get_irq_info() returns a certain number of flags
218	     which define the type of operation supported. The following
219	     rules apply based on what those flags say:
220
221             - The Q bit isn't functional on an LSI interrupt. There is no
               guarantee that the special combination "01" will work for an
223               LSI (and in fact it will not work on the PHB LSIs). However
224               just setting P to 1 is sufficient to mask an LSI (just don't
225               EOI it while masked).
226
             - The recommended setting for an interrupt that is
               temporarily masked by a driver is "10". This means a new
               occurrence while masked will be recorded and a "StoreEOI"
               will replay it appropriately.
231
232
III - Event queues
------------------
235
236Each virtual processor or group has a certain number of event queues
associated with it. Each corresponds to a given priority. The number
238of supported priorities is provided in the device-tree
239(``ibm,xive-#priorities`` property of the xive node).
240
241By default, OPAL populates at least one queue for every physical thread
242in the system. The number of queues and the size used is implementation
243specific. If the OS wants to re-use these to save memory, it can query
244the VP configuration.
245
The opal_xive_get_queue_info() and opal_xive_set_queue_info() calls can
be used to query and modify a queue configuration (ie, to obtain the
current page and size for the queue itself, but also to collect some
configuration flags for that queue such as whether it coalesces
notifications etc...) and to obtain the MMIO address of the queue EOI
page (in the case where coalescing is enabled).
252
IV - OPAL APIs
--------------
255
256.. warning:: *All* the calls listed below may return OPAL_BUSY unless
             explicitly documented not to. In that case, the call
             should be performed again. The OS is allowed to insert a
             delay though no minimum nor maximum delay is specified.
260             This will typically happen when performing cache update
261             operations in the XIVE, if they result in a collision.
262
.. warning:: Calls that are expected to be made simultaneously at
             runtime, such as getting/setting IRQ info or queue info,
             can safely be made concurrently.
266
267             However, there is no internal locking to prevent races
268             between things such as freeing a VP block and getting/setting
269             queue infos on that block.
270
271             These aren't fully specified (yet) but common sense shall
272             apply.
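
The OPAL_BUSY handling described above can be sketched as a small retry
helper. The numeric value of OPAL_BUSY, the bounded number of attempts
and the ``fake_opal_call()`` stand-in are assumptions made for
illustration; a real caller would invoke the actual OPAL entry point
and take the constants from the OPAL API header:

.. code-block:: c

   #include <stdint.h>

   #define OPAL_SUCCESS  0
   #define OPAL_BUSY    (-2)   /* assumed value, check the OPAL API header */

   typedef int64_t (*opal_call_fn)(void *arg);

   /* Repeat "call" while it returns OPAL_BUSY, up to max_tries attempts. */
   static int64_t opal_retry_busy(opal_call_fn call, void *arg,
                                  unsigned int max_tries)
   {
           int64_t rc = OPAL_BUSY;

           while (max_tries-- > 0) {
                   rc = call(arg);
                   if (rc != OPAL_BUSY)
                           break;
                   /* An OS may insert a short delay here. */
           }
           return rc;
   }

   /* Stand-in for an OPAL call: busy until the counter reaches zero. */
   static int64_t fake_opal_call(void *arg)
   {
           int *busy_left = arg;

           return (*busy_left)-- > 0 ? OPAL_BUSY : OPAL_SUCCESS;
   }

Bounding the number of attempts is a choice of this sketch; the
specification itself places no limit on how often a call may be retried.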
273
274.. _OPAL_XIVE_RESET:
275
276OPAL_XIVE_RESET
277^^^^^^^^^^^^^^^
278.. code-block:: c
279
280   int64_t opal_xive_reset(uint64_t version)
281
282The OS should call this once when starting up to re-initialize the
283XIVE hardware and the OPAL XIVE related state back to all defaults.
284
It can call it a second time before handing over to another kernel
(ie. kexec) to re-enable XICS emulation.
287
288The "version" argument should be set to 1 to enable the XIVE
289exploitation mode APIs or 0 to switch back to the default XICS
290emulation mode.
291
292Future versions of OPAL might allow higher versions than 1 to
293represent newer versions of this API. OPAL will return an error
294if it doesn't recognize the requested version.
295
296Any page of memory that the OS has "donated" to OPAL, either backing
297store for EQDs or VPDs or actual queue buffers will be removed from
298the various HW maps and can be re-used by the OS or freed after this
299call regardless of the version information. The HW will be reset to
300a (mostly) clean state.
301
302It is the responsibility of the caller to ensure that no other
303XIVE or XICS emulation call happens simultaneously to this. This
basically should happen on an otherwise quiescent system. In the
case of kexec, it is recommended that the CPPR of all processors is
lowered first.
307
308.. note:: This call always executes fully synchronously, never returns
309	  OPAL_BUSY and will work regardless of whether VPs and EQs are left
310	  enabled or disabled. It *will* spend a significant amount of time
311	  inside OPAL and as such is not suitable to be performed during normal
312	  runtime.
313
314.. _OPAL_XIVE_GET_IRQ_INFO:
315
316OPAL_XIVE_GET_IRQ_INFO
317^^^^^^^^^^^^^^^^^^^^^^
318.. code-block:: c
319
320   int64_t opal_xive_get_irq_info(uint32_t girq,
321                                  uint64_t *out_flags,
322                                  uint64_t *out_eoi_page,
323                                  uint64_t *out_trig_page,
324				  uint32_t *out_esb_shift,
325                                  uint32_t *out_src_chip);
326
327Returns info about an interrupt source. This call never returns
328OPAL_BUSY.
329
330* out_flags returns a set of flags. The following flags
331  are defined in the API (some bits are reserved, so any bit
332  not defined here should be ignored):
333
334  - OPAL_XIVE_IRQ_TRIGGER_PAGE
335
    Indicates that the trigger page is a separate page. If that
337    bit is clear, there is either no trigger page or the trigger
338    can be done in the same page as the EOI, see below.
339
340  - OPAL_XIVE_IRQ_STORE_EOI
341
342    Indicates that the interrupt supports the "Store EOI" option,
343    ie a store to the EOI page will move Q into P and retrigger
344    if the resulting P bit is 1. If this flag is 0, then a store
345    to the EOI page will do a trigger if OPAL_XIVE_IRQ_TRIGGER_PAGE
346    is also 0.
347
348  - OPAL_XIVE_IRQ_LSI
349
350    Indicates that the source is a level sensitive source and thus
351    doesn't have a functional Q bit. The Q bit may or may not be
352    implemented in HW but SW shouldn't rely on it doing anything.
353
354  - OPAL_XIVE_IRQ_SHIFT_BUG
355
356    Indicates that the source has a HW bug that shifts the bits
357    of the "offset" inside the EOI page left by 4 bits. So when
    this is set, use 0xc000, 0xd000... instead of 0xc00, 0xd00...
    as offsets in the EOI page.
360
361  - OPAL_XIVE_IRQ_MASK_VIA_FW
362
363    Indicates that a FW call is needed (either opal_set_xive()
    or opal_xive_set_irq_config()) to successfully mask and unmask
365    the interrupt. The operations via the ESB page aren't fully
366    functional.
367
368  - OPAL_XIVE_IRQ_EOI_VIA_FW
369
370    Indicates that a FW call to opal_xive_eoi() is needed to
371    successfully EOI the interrupt. The operation via the ESB page
372    isn't fully functional.
373
* out_eoi_page and out_trig_page outputs will be set to the
  EOI page physical address (always) and the trigger page address
  (if it exists).
  The trigger page may exist even if OPAL_XIVE_IRQ_TRIGGER_PAGE
  is not set. In that case out_trig_page is equal to out_eoi_page.
  If the trigger page doesn't exist, out_trig_page is set to 0.

* out_esb_shift contains the size (as an order, ie 2^n) of the
  EOI and trigger pages. Current supported values are 12 (4k)
  and 16 (64k). Those cannot be configured by the OS and are set
  by firmware but can be different for different interrupt sources.

* out_src_chip will be set to the chip ID of the HW entity this
  interrupt is sourced from. It's meant to be informative only
  and thus isn't guaranteed to be 100% accurate. The idea is for
  the OS to use that to pick up a default target processor on
  the same chip.
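
As an illustration of how the returned flags and ``out_esb_shift`` might
be consumed, the sketch below computes the ESB page size and the offset
to use inside the EOI page. The flag value is an assumption standing in
for the definition in the OPAL API header, and the helper names are
hypothetical:

.. code-block:: c

   #include <stdint.h>

   /* Assumed value, the real definition lives in the OPAL API header. */
   #define OPAL_XIVE_IRQ_SHIFT_BUG 0x08

   /* Size in bytes of the EOI/trigger pages, from out_esb_shift. */
   static inline uint64_t xive_esb_page_size(uint32_t esb_shift)
   {
           return 1ull << esb_shift;
   }

   /* Offset to use inside the EOI page: shifted left by 4 bits when
    * the source reports OPAL_XIVE_IRQ_SHIFT_BUG. */
   static inline uint64_t xive_esb_offset(uint64_t flags, uint64_t offset)
   {
           return (flags & OPAL_XIVE_IRQ_SHIFT_BUG) ? offset << 4 : offset;
   }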
391
392.. _OPAL_XIVE_EOI:
393
394OPAL_XIVE_EOI
395^^^^^^^^^^^^^
396
397.. code-block:: c
398
399   int64_t opal_xive_eoi(uint32_t girq);
400
401Performs an EOI on the interrupt. This should only be called if
402OPAL_XIVE_IRQ_EOI_VIA_FW is set as otherwise direct ESB access
403is preferred.
404
405.. note:: This is the *same* opal_xive_eoi() call used by OPAL XICS
406	  emulation. However the XIRR parameter is re-purposed as "GIRQ".
407
408	  The call will perform the appropriate function depending on
409	  whether OPAL is in XICS emulation mode  or native XIVE exploitation
410	  mode.
411
412.. _OPAL_XIVE_GET_IRQ_CONFIG:
413
414OPAL_XIVE_GET_IRQ_CONFIG
415^^^^^^^^^^^^^^^^^^^^^^^^
416.. code-block:: c
417
418 int64_t opal_xive_get_irq_config(uint32_t girq, uint64_t *out_vp,
419                                  uint8_t *out_prio, uint32_t *out_lirq);
420
Returns the current configuration of an interrupt source. This is
422the equivalent of opal_get_xive() with the addition of the logical
423interrupt number (the number that will be presented in the queue).
424
425* girq: The interrupt number to get the configuration of as
426  provided by the device-tree.
427
428* out_vp: Will contain the target virtual processor where the
429  interrupt is currently routed to. This can return 0xffffffff
430  if the interrupt isn't routed to a valid virtual processor.
431
432* out_prio: Will contain the priority of the interrupt or 0xff
433  if masked
434
435* out_lirq: Will contain the logical interrupt assigned to the
436  interrupt. By default this will be the same as girq.
437
438.. _OPAL_XIVE_SET_IRQ_CONFIG:
439
440OPAL_XIVE_SET_IRQ_CONFIG
441^^^^^^^^^^^^^^^^^^^^^^^^
442.. code-block:: c
443
444 int64_t opal_xive_set_irq_config(uint32_t girq, uint64_t vp, uint8_t prio,
445                                  uint32_t lirq);
446
447This allows configuration and routing of a hardware interrupt. This is
448equivalent to opal_set_xive() with the addition of the ability to
449configure the logical IRQ number (the number that will be presented
450in the target queue).
451
* girq: The interrupt number to configure, as provided by the
453  device-tree.
454
455* vp: The target virtual processor. The target VP/Prio combination
456  must already exist, be enabled and populated (ie, a queue page must
457  be provisioned for that queue).
458
459* prio: The priority of the interrupt.
460
461* lirq: The logical interrupt number assigned to that interrupt
462
463  .. note:: Note about masking:
464
465	    If the prio is set to 0xff, this call will cause the interrupt to
466	    be masked (*). This function will not clobber the source P/Q bits (**).
467	    It will however set the IVT/EAS "mask" bit if the prio passed
468	    is 0xff which means that interrupt events from the ESB will be
469	    discarded, potentially leaving the ESB in a stale state. Thus
470	    care must be taken by the caller to "cleanup" the ESB state
471	    appropriately before enabling an interrupt with this.
472
473	    (*) Escalation interrupts cannot be masked via this function
474
475	    (**) The exception to this rule is interrupt sources that have
476	    the OPAL_XIVE_IRQ_MASK_VIA_FW flag set. For such sources, the OS
477	    should make no assumption as to the state of the ESB and this
478	    function *will* perform all the necessary masking and unmasking.
479
480  .. note:: This call contains an implicit opal_xive_sync() of the interrupt
481	    source (see OPAL_XIVE_SYNC below)
482
483  It is recommended for an OS exploiting the XIVE directly to not use
484  this function for temporary driver-initiated masking of interrupts
485  but to directly mask using the P/Q bits of the source instead.
486
487  Masking using this function is intended for the case where the OS has
488  no handler registered for a given interrupt anymore or when registering
  a new handler for an interrupt that had none. In these cases, losing
490  interrupts happening while no handler was attached is considered fine.
491
492.. _OPAL_XIVE_GET_QUEUE_INFO:
493
494OPAL_XIVE_GET_QUEUE_INFO
495^^^^^^^^^^^^^^^^^^^^^^^^
496.. code-block:: c
497
498 int64_t opal_xive_get_queue_info(uint64_t vp, uint32_t prio,
499                                  uint64_t *out_qpage,
500                                  uint64_t *out_qsize,
501                                  uint64_t *out_qeoi_page,
502                                  uint32_t *out_escalate_irq,
503                                  uint64_t *out_qflags);
504
This returns information about a given interrupt queue associated
506with a virtual processor and a priority.
507
508* out_qpage: will contain the physical address of the page where the
509  interrupt events will be posted or 0 if none has been configured
510  yet.
511
512* out_qsize: will contain the log2 of the size of the queue buffer
513  or 0 if the queue hasn't been populated. Example: 12 for a 4k page.
514
515* out_qeoi_page: will contain the physical address of the MMIO page
516  used to perform EOIs for the queue notifications.
517
518* out_escalate_irq: will contain a girq number for the escalation
519  interrupt associated with that queue.
520
521  .. warning:: The "escalate_irq" is a special interrupt number, depending
522	       on the implementation it may or may not correspond to a normal
523	       XIVE source. Those interrupts have no triggers, and will not
               be masked by opal_xive_set_irq_config() with a prio of 0xff.
525
  .. note:: The state of the OPAL_XIVE_VP_SINGLE_ESCALATION flag passed to
            opal_xive_set_vp_info() can change the escalation irq number,
            so make sure you only retrieve this after having set the flag
            to the desired value. When set, all priorities will have the
            same escalation interrupt.
531
* out_qflags: will contain flags defined as follows:
533
534  - OPAL_XIVE_EQ_ENABLED
535
536    This must be set for the queue to be enabled and thus a valid
537    target for interrupts. Newly allocated queues are disabled by
538    default and must be disabled again before being freed (allocating
539    and freeing of queues currently only happens along with their
540    owner VP).
541
542    .. note:: A newly enabled queue will have the generation set to 1
543              and the queue pointer to 0. If the OS wants to "reset" a queue
544              generation and pointer, it thus must disable and re-enable
545              the queue.
546
547  - OPAL_XIVE_EQ_ALWAYS_NOTIFY
548
549    When this is set, the HW will always notify the VP on any new
    entry in the queue, thus the queue's own P/Q bits won't be relevant
551    and using the EOI page will be unnecessary.
552
553  - OPAL_XIVE_EQ_ESCALATE
554
555    When this is set, the EQ will escalate to the escalation interrupt
556    when failing to notify.
557
558.. _OPAL_XIVE_SET_QUEUE_INFO:
559
560OPAL_XIVE_SET_QUEUE_INFO
561^^^^^^^^^^^^^^^^^^^^^^^^
562.. code-block:: c
563
564 int64_t opal_xive_set_queue_info(uint64_t vp, uint32_t prio,
565                                  uint64_t qpage,
566                                  uint64_t qsize,
567                                  uint64_t qflags);
568
569This allows the OS to configure the queue page for a given processor
570and priority and adjust the behaviour of the queue via flags.
571
572* qpage: physical address of the page where the interrupt events will
573  be posted. This has to be naturally aligned.
574
575* qsize: log2 of the size of the above page. A 0 here will disable
576  the queue.
577
578* qflags: Flags (see definitions in opal_xive_get_queue_info)
579
580  .. note:: This call will reset the generation bit to 1 and the queue
581	    production pointer to 0.
582
583  .. note:: The PQ bits of the escalation interrupts and of the queue
584            notification will be set to 00 when OPAL_XIVE_EQ_ENABLED is
585	    set, and to 01 (masked) when disabling it.
586
587  .. note:: This must be called at least once on a queue with the flag
588	    OPAL_XIVE_EQ_ENABLED in order to enable it after it has been
589	    allocated (along with its owner VP).
590
591  .. note:: When the queue is disabled (flag OPAL_XIVE_EQ_ENABLED cleared)
592	    all other flags and arguments are ignored and the queue
593	    configuration is wiped.
594
595.. _OPAL_XIVE_DONATE_PAGE:
596
597OPAL_XIVE_DONATE_PAGE
598^^^^^^^^^^^^^^^^^^^^^
599.. code-block:: c
600
601 int64_t opal_xive_donate_page(uint32_t chip_id, uint64_t addr);
602
603This call is used to donate pages to OPAL for use by VP/EQ provisioning.
604
605The pages must be of the size specified by the "ibm,xive-provision-page-size"
606property and naturally aligned.
607
608All donated pages are forgotten by OPAL (and thus returned to the OS)
609on any call to opal_xive_reset().
610
611The chip_id should be the chip on which the pages were allocated or -1
612if unspecified. Ideally, when a VP allocation request fails with the
613OPAL_XIVE_PROVISIONING error, the OS should allocate one such page
614for each chip in the system and hand it to OPAL before trying again.
615
616.. note:: It is possible that the provisioning ends up requiring more than
617	  one page per chip. OPAL will keep returning the above error until
618	  enough pages have been provided.
619
620.. _OPAL_XIVE_ALLOCATE_VP_BLOCK:
621
622OPAL_XIVE_ALLOCATE_VP_BLOCK
623^^^^^^^^^^^^^^^^^^^^^^^^^^^
624.. code-block:: c
625
626 int64_t opal_xive_alloc_vp_block(uint32_t alloc_order);
627
628This call is used to allocate a block of VPs. It will return a number
629representing the base of the block which will be aligned on the alloc
630order, allowing the OS to do basic arithmetic to index VPs in the block.
631
632The VPs will have queue structures reserved (but not initialized nor
633provisioned) for all the priorities defined in the "ibm,xive-#priorities"
property.
635
636This call might return OPAL_XIVE_PROVISIONING. In this case, the OS
637must allocate pages and provision OPAL using opal_xive_donate_page(),
638see the documentation for opal_xive_donate_page() for details.
639
The resulting VPs must be individually enabled with opal_xive_set_vp_info
641below with the OPAL_XIVE_VP_ENABLED flag set before use.
642
643For all priorities, the corresponding queues must also be individually
644provisioned and enabled with opal_xive_set_queue_info.
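
The allocate/donate/retry sequence can be sketched as below. The
firmware is replaced by a small fake so that the control flow can be
exercised on its own: ``struct fake_fw``, ``fake_alloc_vp_block()`` and
``fake_donate_page()`` are hypothetical stand-ins for
opal_xive_alloc_vp_block() and opal_xive_donate_page(), and the error
value is an assumption to be checked against the OPAL API header:

.. code-block:: c

   #include <stdint.h>

   #define OPAL_SUCCESS            0
   #define OPAL_XIVE_PROVISIONING (-31)  /* assumed value */

   struct fake_fw {
           int pages_missing;  /* pages the fake firmware still wants */
           int donated;        /* number of donations received */
   };

   static int64_t fake_alloc_vp_block(struct fake_fw *fw, uint32_t order)
   {
           if (fw->pages_missing > 0)
                   return OPAL_XIVE_PROVISIONING;
           return (int64_t)0x100 << order;  /* arbitrary aligned base */
   }

   static int64_t fake_donate_page(struct fake_fw *fw, uint32_t chip_id,
                                   uint64_t addr)
   {
           (void)chip_id;
           (void)addr;
           fw->pages_missing--;
           fw->donated++;
           return OPAL_SUCCESS;
   }

   /* Allocate 2^order VPs, donating one page to each listed chip and
    * retrying for as long as the firmware asks for provisioning. */
   static int64_t xive_alloc_vps(struct fake_fw *fw, uint32_t order,
                                 const uint32_t *chips, unsigned int nr_chips)
   {
           for (;;) {
                   int64_t rc = fake_alloc_vp_block(fw, order);

                   if (rc != OPAL_XIVE_PROVISIONING || nr_chips == 0)
                           return rc;
                   for (unsigned int i = 0; i < nr_chips; i++) {
                           /* A real OS would allocate an aligned page here. */
                           rc = fake_donate_page(fw, chips[i], 0);
                           if (rc != OPAL_SUCCESS)
                                   return rc;
                   }
           }
   }

In a real OS the chip list would typically come from the
``ibm,xive-provision-chips`` property and each donated page would have
the size given by ``ibm,xive-provision-page-size``.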
645
646.. _OPAL_XIVE_FREE_VP_BLOCK:
647
648OPAL_XIVE_FREE_VP_BLOCK
649^^^^^^^^^^^^^^^^^^^^^^^
650.. code-block:: c
651
652 int64_t opal_xive_free_vp_block(uint64_t vp);
653
654This call is used to free a block of VPs. It must be called with the same
*base* number as was returned by opal_xive_alloc_vp_block() (any index into the
656block will result in an OPAL_PARAMETER error).
657
The VPs must all have been previously disabled with opal_xive_set_vp_info
below with the OPAL_XIVE_VP_ENABLED flag cleared before being freed.
660
661All the queues must also have been disabled.
662
663Failure to do any of the above will result in an OPAL_XIVE_FREE_ACTIVE error.
664
665.. _OPAL_XIVE_GET_VP_INFO:
666
667OPAL_XIVE_GET_VP_INFO
668^^^^^^^^^^^^^^^^^^^^^
669.. code-block:: c
670
671 int64_t opal_xive_get_vp_info(uint64_t vp,
672                               uint64_t *flags,
673                               uint64_t *cam_value,
674                               uint64_t *report_cl_pair,
675			       uint32_t *chip_id);
676
677This call returns information about a VP:
678
679* flags:
680
681  - OPAL_XIVE_VP_ENABLED
682
683    Returns the enabled state of the VP
684
685  - OPAL_XIVE_VP_SINGLE_ESCALATION (if available)
686
687    Returns whether single escalation mode is enabled for this VP
688    (see opal_xive_set_vp_info()).
689
690* cam_value: This is the value to program into the thread management
691  area to dispatch that VP (ie, an encoding of the block + index).
692
693* report_cl_pair:  This is the real address of the reporting cache line
694  pair for that VP (defaults to 0, ie disabled)
695
* chip_id: The chip that the VP was allocated on
697
698.. _OPAL_XIVE_SET_VP_INFO:
699
700OPAL_XIVE_SET_VP_INFO
701^^^^^^^^^^^^^^^^^^^^^
702.. code-block:: c
703
704 int64_t opal_xive_set_vp_info(uint64_t vp,
705                               uint64_t flags,
706                               uint64_t report_cl_pair);
707
708This call configures a VP:
709
710* flags:
711
712  - OPAL_XIVE_VP_ENABLED
713
714    This must be set for the VP to be usable and cleared before freeing it.
715
716    .. note:: This can be used to disable the boot time VPs though this
717	      isn't recommended. This must be used to enable allocated VPs.
718
719  - OPAL_XIVE_VP_SINGLE_ESCALATION (if available)
720
721    If this is set, the queues are configured such that all priorities
722    turn into a single escalation interrupt. This results in the loss of
    priority 7 which can no longer be used. This needs to be set
724    before any interrupt is routed to that priority and queue 7 must not
725    have been already enabled.
726
    This feature is available if the "single-escalation-support" property
    is present in the xive device-tree node.
729
    .. warning:: When enabling single escalation, any pre-existing routing
                 and configuration of the individual queues' escalation
                 is lost (except queue 7 which is the new merged escalation).
                 When subsequently disabling it, the previous value is not
                 restored and the field is cleared: escalation is disabled
                 on all the queues.
736
737* report_cl_pair: This is the real address of the reporting cache line
738  pair for that VP or 0 to disable.
739
740    .. note:: When disabling a VP, all other VP settings are lost.
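
The effect of single escalation on the usable priorities can be
expressed as a small predicate. This is a sketch only: it assumes that
``ibm,xive-#priorities`` gives a count n with valid priorities 0 to n-1,
and the helper name is hypothetical:

.. code-block:: c

   #include <stdbool.h>
   #include <stdint.h>

   /* true if "prio" may be used as a target priority on a VP */
   static inline bool xive_prio_usable(uint8_t prio, uint32_t nr_prios,
                                       bool single_escalation)
   {
           if (prio >= nr_prios)
                   return false;
           /* Priority 7 is lost when single escalation is enabled. */
           if (single_escalation && prio == 7)
                   return false;
           return true;
   }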
741
742.. _OPAL_XIVE_ALLOCATE_IRQ:
743
744OPAL_XIVE_ALLOCATE_IRQ
745^^^^^^^^^^^^^^^^^^^^^^
746.. code-block:: c
747
748 int64_t opal_xive_allocate_irq(uint32_t chip_id);
749
750This call allocates a software IRQ on a given chip. It returns the
751interrupt number or a negative error code.
752
753.. _OPAL_XIVE_FREE_IRQ:
754
755OPAL_XIVE_FREE_IRQ
756^^^^^^^^^^^^^^^^^^
757.. code-block:: c
758
759 int64_t opal_xive_free_irq(uint32_t girq);
760
761This call frees a software IRQ that was allocated by
762opal_xive_allocate_irq. Passing any other interrupt number
763will result in an OPAL_PARAMETER error.
764
765.. _OPAL_XIVE_SYNC:
766
767OPAL_XIVE_SYNC
768^^^^^^^^^^^^^^
769.. code-block:: c
770
771 int64_t opal_xive_sync(uint32_t type, uint32_t id);
772
This call is used to synchronize some HW queues to ensure various changes
774have taken effect to the point where their effects are visible to the
775processor.
776
777* type: Type of synchronization:
778
779  - XIVE_SYNC_EAS: Synchronize a source. "id" is the girq number of the
780    interrupt. This will ensure that any change to the PQ bits or the
    interrupt targeting has taken effect.
782
783  - XIVE_SYNC_QUEUE: Synchronize a target queue. "id" is the girq number
784    of the interrupt. This will ensure that any previous occurrence of the
785    interrupt has reached the in-memory queue and is visible to the processor.
786
787    .. note:: XIVE_SYNC_EAS and XIVE_SYNC_QUEUE can be used together
788	      (ie. XIVE_SYNC_EAS | XIVE_SYNC_QUEUE) to completely synchronize
789	      the path of an interrupt to its queue.
790
791* id: Depends on the synchronization type, see above
792
793.. _OPAL_XIVE_DUMP:
794
795OPAL_XIVE_DUMP
796^^^^^^^^^^^^^^
797.. code-block:: c
798
799  int64_t opal_xive_dump(uint32_t type, uint32_t id);
800
801This is a debugging call that will dump in the OPAL console various
802state information about the XIVE.
803
804* type: Type of info to dump:
805
806  - XIVE_DUMP_TM_HYP:  Dump the TIMA area for hypervisor physical thread
807                       "id" is the PIR value of the thread
808
809  - XIVE_DUMP_TM_POOL: Dump the TIMA area for the hypervisor pool
810		       "id" is the PIR value of the thread
811
812  - XIVE_DUMP_TM_OS:   Dump the TIMA area for the OS
813		       "id" is the PIR value of the thread
814
815  - XIVE_DUMP_TM_USER: Dump the TIMA area for the "user" area (unsupported)
816		       "id" is the PIR value of the thread
817
818  - XIVE_DUMP_VP:      Dump the state of a VP structure
819                       "id" is the VP id
820
821  - XIVE_DUMP_EMU:     Dump the state of the XICS emulation for a thread
822		       "id" is the PIR value of the thread
823
824.. _OPAL_XIVE_GET_QUEUE_STATE:
825
826OPAL_XIVE_GET_QUEUE_STATE
827^^^^^^^^^^^^^^^^^^^^^^^^^
828.. code-block:: c
829
830 int64_t opal_xive_get_queue_state(uint64_t vp, uint32_t prio,
831				   uint32_t *out_qtoggle,
832				   uint32_t *out_qindex);
833
834This call saves the queue toggle bit and index. This must be called on
835an enabled queue.
836
837* vp, prio: The target queue
838
839* out_qtoggle: toggle bit of the queue
840
841* out_qindex: index of the queue
842
843.. _OPAL_XIVE_SET_QUEUE_STATE:
844
845OPAL_XIVE_SET_QUEUE_STATE
846^^^^^^^^^^^^^^^^^^^^^^^^^
847.. code-block:: c
848
849 int64_t opal_xive_set_queue_state(uint64_t vp, uint32_t prio,
850				   uint32_t qtoggle,
851				   uint32_t qindex);
852
853This call restores the queue toggle bit and index that was previously
854saved by a call to opal_xive_get_queue_state(). This must be called on
855an enabled queue.
856
857* vp, prio: The target queue
858
859* qtoggle: toggle bit of the queue
860
861* qindex: index of the queue
862
863
864.. _OPAL_XIVE_GET_VP_STATE:
865
866OPAL_XIVE_GET_VP_STATE
867^^^^^^^^^^^^^^^^^^^^^^
868.. code-block:: c
869
870 int64_t opal_xive_get_vp_state(uint64_t vp_id,
871				uint64_t *out_state);
872
873This call saves the VP HW state in "out_state". The format matches the
874XIVE NVT word 4 and word 5. This must be called on an enabled VP.
875
876* vp_id: The target VP
877
878* out_state: Location where the state is to be stored
879