.. Copyright 2021 Simon Ser

.. contents::


linux-dmabuf feedback introduction
==================================

linux-dmabuf feedback allows compositors and clients to negotiate optimal buffer
allocation parameters. This document assumes that the compositor is using a
rendering API such as OpenGL or Vulkan, and KMS as the presentation API.
Although linux-dmabuf feedback isn't restricted to this use-case, it's the most
common one.

linux-dmabuf feedback introduces the following concepts:

1. A main device. This is the render device that the compositor is using to
   perform composition. Compositors should always be able to display a buffer
   submitted by a client, so this device can be used as a fallback in case none
   of the more optimized code-paths work. Clients should allocate buffers such
   that they can be imported and textured from the main device.

2. One or more tranches. Each tranche consists of a target device, allocation
   flags and a set of format/modifier pairs. A tranche can be seen as a set of
   format/modifier pairs that are compatible with the target device.

   A tranche can have the ``scanout`` flag. It means that the target device is
   a KMS device, and that buffers allocated with one of the format/modifier
   pairs in the tranche are eligible for direct scan-out.

   Clients should use the tranches in order to allocate buffers with the most
   appropriate format/modifier and also to avoid allocating in private device
   memory when cross-device operations are going to happen.

linux-dmabuf feedback implementation notes
==========================================

This section contains recommendations for client and compositor implementations.

For clients
-----------

Clients are expected to either pick a fixed DRM format beforehand, or
perform the following steps repeatedly until they find a suitable format.

Basic clients may only support static buffer allocation on startup.
These clients should do the following:

1. Send a ``get_default_feedback`` request to get global feedback.
2. Select the device indicated by ``main_device`` for allocation.
3. For each tranche:

   1. If ``tranche_target_device`` doesn't match the allocation device, ignore
      the tranche.
   2. Accumulate allocation flags from ``tranche_flags``.
   3. Accumulate format/modifier pairs received via ``tranche_formats`` in a
      list.
   4. When the ``tranche_done`` event is received, try to allocate the buffer
      with the accumulated list of modifiers and allocation flags. If that
      fails, proceed with the next tranche. If that succeeds, stop the loop.

4. Destroy the feedback object.

Tranches are ordered by preference: the more optimized tranches come first. As
such, clients should use the first tranche that happens to work.

Some clients may have already selected the device they want to use beforehand.
These clients can ignore the ``main_device`` event, and ignore tranches whose
``tranche_target_device`` doesn't match the selected device. Such clients need
to be prepared for the ``wp_linux_buffer_params.create`` request to potentially
fail.

If the client allocates a buffer without specifying explicit modifiers on a
device different from the one indicated by ``main_device``, then the client
must force a linear layout.

Some clients might support re-negotiating the buffer format/modifier on the
fly. These clients should send a ``get_surface_feedback`` request and keep the
feedback object alive after the initial allocation. Each time a new set of
feedback parameters is received (ended by the ``done`` event), they should
perform the same steps as basic clients described above. They should detect
when the optimal allocation parameters didn't change (same
format/modifier/flags) to avoid needlessly re-allocating their buffers.

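
The basic client steps above can be sketched in C as follows. This is a minimal
illustration under stated assumptions rather than a reference implementation:
``struct tranche``, ``alloc_fn``, ``same_device``, ``pick_tranche`` and
``demo_alloc_linear_only`` are hypothetical names that are not part of the
protocol or of any library. The tranche contents are assumed to have already
been collected from the ``tranche_target_device``, ``tranche_flags`` and
``tranche_formats`` events, the client is assumed to want a single fixed
format, and device comparison is reduced to plain ``dev_t`` equality where a
real client would use ``drmDevicesEqual``.

.. code-block:: c

    #include <assert.h>
    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>
    #include <sys/types.h>

    /* Hypothetical record of one tranche, filled in by the client from the
     * tranche_target_device, tranche_flags and tranche_formats events. For
     * brevity this sketch assumes the client wants a single fixed format. */
    struct tranche {
        dev_t target_device;
        uint32_t flags;
        uint32_t format;
        const uint64_t *modifiers;
        size_t n_modifiers;
    };

    /* Hypothetical allocator hook: returns true if a buffer could be
     * allocated on dev with one of the given modifiers. */
    typedef bool (*alloc_fn)(dev_t dev, uint32_t format,
                             const uint64_t *modifiers, size_t n_modifiers,
                             uint32_t flags);

    /* Simplification: a real client would compare devices with
     * drmDevicesEqual, since two nodes of one device have different dev_t. */
    static bool same_device(dev_t a, dev_t b)
    {
        return a == b;
    }

    /* Walks the tranches in order of preference and returns the index of the
     * first one that targets alloc_device and can be allocated from, or -1. */
    static int pick_tranche(dev_t alloc_device, const struct tranche *tranches,
                            size_t n_tranches, alloc_fn try_alloc)
    {
        assert(tranches != NULL || n_tranches == 0);
        for (size_t i = 0; i < n_tranches; i++) {
            const struct tranche *t = &tranches[i];
            if (!same_device(t->target_device, alloc_device))
                continue; /* tranche is for another device: ignore it */
            if (try_alloc(alloc_device, t->format, t->modifiers,
                          t->n_modifiers, t->flags))
                return (int)i; /* first working tranche wins */
        }
        return -1;
    }

    /* Stand-in allocator for demonstration only: pretends that the device can
     * only allocate linear buffers (modifier 0, i.e. DRM_FORMAT_MOD_LINEAR). */
    static bool demo_alloc_linear_only(dev_t dev, uint32_t format,
                                       const uint64_t *modifiers,
                                       size_t n_modifiers, uint32_t flags)
    {
        (void)dev;
        (void)format;
        (void)flags;
        for (size_t i = 0; i < n_modifiers; i++) {
            if (modifiers[i] == 0)
                return true;
        }
        return false;
    }

Because tranches are ordered by preference, returning the first index that
allocates matches the rule of using the first tranche that happens to work. A
client that re-negotiates on the fly could run the same function after each
``done`` event and compare the chosen parameters against the ones already in
use before re-allocating.
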

Some clients might additionally support switching the device used for
allocations on the fly. Such clients should send a ``get_surface_feedback``
request. For each tranche, select the device indicated by
``tranche_target_device`` for allocation. Accumulate allocation flags (received
via ``tranche_flags``) and format/modifier pairs (received via
``tranche_formats``) as usual. When the ``tranche_done`` event is received, try
to allocate the buffer with the accumulated list of modifiers and the
allocation flags. Try to import the resulting buffer by sending a
``wp_linux_buffer_params.create`` request (this might fail). Repeat with each
tranche until both allocation and import succeed. Each time a new set of
feedback parameters is received, they should perform these steps again. They
should detect when the optimal allocation parameters didn't change (same
device/format/modifier/flags) to avoid needlessly re-allocating their buffers.

For compositors
---------------

Basic compositors may only support texturing the DMA-BUFs via a rendering API
such as OpenGL or Vulkan. Such compositors can send a single tranche as a reply
to both ``get_default_feedback`` and ``get_surface_feedback``. Set the
``main_device`` to the rendering device. Send the tranche with
``tranche_target_device`` set to the rendering device and all of the DRM
format/modifier pairs supported by the rendering API. Do not set the
``scanout`` flag in the ``tranche_flags`` event.

Some compositors may support direct scan-out for full-screen surfaces. These
compositors can re-send the feedback parameters when a surface becomes
full-screen or leaves full-screen mode if the client has used the
``get_surface_feedback`` request. The non-full-screen feedback parameters are
the same as for the basic compositors described above.
The full-screen feedback
parameters have two tranches: one with the format/modifier pairs supported by
the KMS plane, with the ``scanout`` flag set in the ``tranche_flags`` event and
with ``tranche_target_device`` set to the KMS scan-out device; the other with
the rest of the format/modifier pairs (supported for texturing, but not for
scan-out), without the ``scanout`` flag set in the ``tranche_flags`` event, and
with the ``tranche_target_device`` set to the rendering device.

Some compositors may support direct scan-out for all surfaces. These
compositors can send two tranches for surfaces that become candidates for
direct scan-out, similarly to compositors supporting direct scan-out for
full-screen surfaces. When a surface stops being a candidate for direct
scan-out, compositors should re-send the feedback parameters optimized for
texturing only. The way candidates for direct scan-out are selected is
compositor policy; a possible implementation is to select as many surfaces as
there are available hardware planes, starting from surfaces closer to the eye.

Some compositors may support multiple devices at the same time. If the
compositor supports rendering with a fixed device and direct scan-out on a
secondary device, it may send a separate tranche for surfaces displayed on
the secondary device that are candidates for direct scan-out. The
``tranche_target_device`` for this tranche will be the secondary device and
will not match the ``main_device``.

Some compositors may support switching their rendering device at runtime or
changing their rendering device depending on the surface. When the rendering
device changes for a surface, such compositors may re-send the feedback
parameters with a different ``main_device``. However, there is a risk that
clients don't support switching their device at runtime and continue using the
previous device.
For this reason, compositors should always have a fallback
rendering device that they initially send as ``main_device``, such that these
clients use said fallback device.

Compositors should not change the ``main_device`` on-the-fly when explicit
modifiers are not supported, because there's a risk of importing buffers
with an implicit non-linear modifier as a linear buffer, resulting in
misinterpreted buffer contents.

Compositors should not send feedback parameters if they don't have a fallback
path. For instance, compositors shouldn't send a format/modifier supported for
direct scan-out but not supported by the rendering API for texturing.

Compositors can decide to use multiple tranches to describe the allocation
parameters optimized for texturing. For example, if there are formats which
have a fast texturing path and formats which have a slower texturing path, the
compositor can decide to expose two separate tranches.

Compositors can decide to use intermediate tranches to describe code-paths
slower than direct scan-out but faster than texturing. For instance, a
compositor could insert an intermediate tranche if it's possible to use a
mem2mem device to convert buffers to be able to use scan-out.

``dev_t`` encoding
==================

The protocol carries ``dev_t`` values on the wire using arrays. A compositor
written in C can encode the values as follows:

.. code-block:: c

    /* drm_node_stat is assumed to have been filled in by stat() on the
     * DRM node beforehand. */
    struct stat drm_node_stat;
    struct wl_array dev_array = {
        .size = sizeof(drm_node_stat.st_rdev),
        .data = &drm_node_stat.st_rdev,
    };

A client can decode the values as follows:

.. code-block:: c

    dev_t dev;
    assert(dev_array->size == sizeof(dev));
    memcpy(&dev, dev_array->data, sizeof(dev));

Because two DRM nodes can refer to the same DRM device while having different
``dev_t`` values, clients should use ``drmDevicesEqual`` to compare two
devices.

``format_table`` encoding
=========================

The ``format_table`` event carries a file descriptor containing a list of
format + modifier pairs. The list is an array of pairs which can be accessed
with this C structure definition:

.. code-block:: c

    struct dmabuf_format_modifier {
        uint32_t format;
        uint32_t pad; /* unused */
        uint64_t modifier;
    };

Integration with other APIs
===========================

- libdrm: ``drmGetDeviceFromDevId`` returns a ``drmDevice`` from a device ID.
- EGL: the `EGL_EXT_device_drm_render_node`_ extension may be used to query the
  DRM device render node used by a given EGL display. When unavailable, the
  older `EGL_EXT_device_drm`_ extension may be used as a fallback.
- Vulkan: the `VK_EXT_physical_device_drm`_ extension may be used to query the
  DRM device used by a given ``VkPhysicalDevice``.

.. _EGL_EXT_device_drm: https://www.khronos.org/registry/EGL/extensions/EXT/EGL_EXT_device_drm.txt
.. _EGL_EXT_device_drm_render_node: https://www.khronos.org/registry/EGL/extensions/EXT/EGL_EXT_device_drm_render_node.txt
.. _VK_EXT_physical_device_drm: https://www.khronos.org/registry/vulkan/specs/1.2-extensions/man/html/VK_EXT_physical_device_drm.html
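
To round off the ``format_table`` encoding described earlier, the following
sketch walks a table that has already been mapped into memory. It is an
illustration under assumptions, not a definitive implementation:
``format_table_len`` and ``format_table_has`` are hypothetical helper names,
the size of the table in bytes is assumed to be known to the client, and the
mapping of the file descriptor itself (typically a read-only ``mmap``) is left
out.

.. code-block:: c

    #include <assert.h>
    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    struct dmabuf_format_modifier {
        uint32_t format;
        uint32_t pad; /* unused */
        uint64_t modifier;
    };

    /* The explicit pad field leaves no implicit padding, so each entry is
     * expected to be exactly 16 bytes. */
    _Static_assert(sizeof(struct dmabuf_format_modifier) == 16,
                   "unexpected format table entry size");

    /* Hypothetical helper: number of whole entries in a table of
     * table_size bytes. */
    static size_t format_table_len(size_t table_size)
    {
        return table_size / sizeof(struct dmabuf_format_modifier);
    }

    /* Hypothetical helper: returns true if the mapped table contains the
     * given format/modifier pair. */
    static bool format_table_has(const struct dmabuf_format_modifier *table,
                                 size_t table_size, uint32_t format,
                                 uint64_t modifier)
    {
        assert(table != NULL || table_size == 0);
        size_t n = format_table_len(table_size);
        for (size_t i = 0; i < n; i++) {
            if (table[i].format == format && table[i].modifier == modifier)
                return true;
        }
        return false;
    }

A linear scan is enough for a sketch; a client that performs many lookups
might prefer to copy the pairs it cares about into its own data structure.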