.. Copyright 2021 Simon Ser

.. contents::


linux-dmabuf feedback introduction
==================================

linux-dmabuf feedback allows compositors and clients to negotiate optimal buffer
allocation parameters. This document will assume that the compositor is using a
rendering API such as OpenGL or Vulkan and KMS as the presentation API: even if
linux-dmabuf feedback isn't restricted to this use-case, it's the most common.

linux-dmabuf feedback introduces the following concepts:

1. A main device. This is the render device that the compositor is using to
   perform composition. Compositors should always be able to display a buffer
   submitted by a client, so this device can be used as a fallback in case none
   of the more optimized code-paths work. Clients should allocate buffers such
   that they can be imported and textured from the main device.

2. One or more tranches. Each tranche consists of a target device, allocation
   flags and a set of format/modifier pairs. A tranche can be seen as a set of
   format/modifier pairs that are compatible with the target device.

   A tranche can have the ``scanout`` flag. It means that the target device is
   a KMS device, and that buffers allocated with one of the format/modifier
   pairs in the tranche are eligible for direct scan-out.

   Clients should use the tranches in order to allocate buffers with the most
   appropriate format/modifier and also to avoid allocating in private device
   memory when cross-device operations are going to happen.

linux-dmabuf feedback implementation notes
==========================================

This section contains recommendations for client and compositor implementations.

For clients
-----------

Clients are expected to either pick a fixed DRM format beforehand, or
perform the following steps repeatedly until they find a suitable format.

Basic clients may only support static buffer allocation on startup. These
clients should do the following:

1. Send a ``get_default_feedback`` request to get global feedback.
2. Select the device indicated by ``main_device`` for allocation.
3. For each tranche:

   1. If ``tranche_target_device`` doesn't match the allocation device, ignore
      the tranche.
   2. Accumulate allocation flags from ``tranche_flags``.
   3. Accumulate format/modifier pairs received via ``tranche_formats`` in a
      list.
   4. When the ``tranche_done`` event is received, try to allocate the buffer
      with the accumulated list of modifiers and allocation flags. If that
      fails, proceed with the next tranche. If that succeeds, stop the loop.

4. Destroy the feedback object.

Tranches are ordered by preference: the more optimized tranches come first. As
such, clients should use the first tranche that happens to work.
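
The tranche selection described above could be sketched in C as follows. This
is only an illustration: ``struct tranche`` and ``next_candidate`` are
hypothetical names, the structure is assumed to have been filled in from the
``tranche_*`` events beforehand, and comparing raw ``dev_t`` values is a
simplification.

.. code-block:: c

    #include <assert.h>
    #include <stddef.h>
    #include <stdint.h>
    #include <sys/types.h>

    /* Hypothetical record of one tranche, filled in while processing the
     * tranche_target_device, tranche_flags and tranche_formats events. */
    struct tranche {
        dev_t target_device;
        uint32_t flags;
        const uint64_t *modifiers; /* modifiers for the client's chosen format */
        size_t modifiers_len;
    };

    /* Returns the index of the first tranche at or after start whose target
     * device is alloc_device, or -1 if there is none. Tranches are assumed
     * to be stored in the order they were received (most optimized first). */
    static ssize_t next_candidate(const struct tranche *tranches, size_t len,
            dev_t alloc_device, size_t start) {
        assert(tranches != NULL || len == 0);
        for (size_t i = start; i < len; i++) {
            /* Simplification: a raw dev_t comparison stands in for a proper
             * DRM device comparison. */
            if (tranches[i].target_device == alloc_device) {
                return (ssize_t)i;
            }
        }
        return -1;
    }

A client would start at index 0, try to allocate with the returned tranche,
and on failure call ``next_candidate`` again with the following index until it
returns -1.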

Some clients may have already selected the device they want to use beforehand.
These clients can ignore the ``main_device`` event, and ignore tranches whose
``tranche_target_device`` doesn't match the selected device. Such clients need
to be prepared for the ``wp_linux_buffer_params.create`` request to potentially
fail.

If the client allocates a buffer without specifying explicit modifiers on a
device different from the one indicated by ``main_device``, then the client
must force a linear layout.

Some clients might support re-negotiating the buffer format/modifier on the
fly. These clients should send a ``get_surface_feedback`` request and keep the
feedback object alive after the initial allocation. Each time a new set of
feedback parameters is received (ended by the ``done`` event), they should
perform the same steps as basic clients described above. They should detect
when the optimal allocation parameters didn't change (same
format/modifier/flags) to avoid needlessly re-allocating their buffers.

Some clients might additionally support switching the device used for
allocations on the fly. Such clients should send a ``get_surface_feedback``
request. For each tranche, select the device indicated by
``tranche_target_device`` for allocation. Accumulate allocation flags (received
via ``tranche_flags``) and format/modifier pairs (received via
``tranche_formats``) as usual. When the ``tranche_done`` event is received, try
to allocate the buffer with the accumulated list of modifiers and the
allocation flags. Try to import the resulting buffer by sending a
``wp_linux_buffer_params.create`` request (this might fail). Repeat with each
tranche until an allocation and import succeeds. Each time a new set of
feedback parameters is received, they should perform these steps again. They
should detect when the optimal allocation parameters didn't change (same
device/format/modifier/flags) to avoid needlessly re-allocating their buffers.
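
One possible way to detect unchanged parameters is to remember the parameters
used for the current buffers and compare them with the newly selected ones.
The sketch below assumes a hypothetical ``struct alloc_params`` holding the
device, format, modifier and flags picked by the client; clients that never
switch devices could simply keep the device field constant.

.. code-block:: c

    #include <assert.h>
    #include <stdbool.h>
    #include <stdint.h>
    #include <sys/types.h>

    /* Hypothetical snapshot of the parameters a client allocated with. */
    struct alloc_params {
        dev_t device;
        uint32_t format;
        uint64_t modifier;
        uint32_t flags;
    };

    /* Returns true if both snapshots describe the same allocation, in which
     * case the existing buffers can be kept. Fields are compared one by one
     * because the structure may contain padding bytes. */
    static bool alloc_params_equal(const struct alloc_params *a,
            const struct alloc_params *b) {
        assert(a != NULL && b != NULL);
        return a->device == b->device && a->format == b->format &&
            a->modifier == b->modifier && a->flags == b->flags;
    }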

For compositors
---------------

Basic compositors may only support texturing the DMA-BUFs via a rendering API
such as OpenGL or Vulkan. Such compositors can send a single tranche as a reply
to both ``get_default_feedback`` and ``get_surface_feedback``. Set the
``main_device`` to the rendering device. Send the tranche with
``tranche_target_device`` set to the rendering device and all of the DRM
format/modifier pairs supported by the rendering API. Do not set the
``scanout`` flag in the ``tranche_flags`` event.

Some compositors may support direct scan-out for full-screen surfaces. These
compositors can re-send the feedback parameters when a surface becomes
full-screen or leaves full-screen mode if the client has used the
``get_surface_feedback`` request. The non-full-screen feedback parameters are
the same as basic compositors described above. The full-screen feedback
parameters have two tranches: one with the format/modifier pairs supported by
the KMS plane, with the ``scanout`` flag set in the ``tranche_flags`` event and
with ``tranche_target_device`` set to the KMS scan-out device; the other with
the rest of the format/modifier pairs (supported for texturing, but not for
scan-out), without the ``scanout`` flag set in the ``tranche_flags`` event, and
with the ``tranche_target_device`` set to the rendering device.

Some compositors may support direct scan-out for all surfaces. These
compositors can send two tranches for surfaces that become candidates for
direct scan-out, similarly to compositors supporting direct scan-out for
full-screen surfaces. When a surface stops being a candidate for direct
scan-out, compositors should re-send the feedback parameters optimized for
texturing only. The way candidates for direct scan-out are selected is
compositor policy; a possible implementation is to select as many surfaces as
there are available hardware planes, starting from surfaces closer to the eye.

Some compositors may support multiple devices at the same time. If the
compositor supports rendering with a fixed device and direct scan-out on a
secondary device, it may send a separate tranche for surfaces displayed on
the secondary device that are candidates for direct scan-out. The
``tranche_target_device`` for this tranche will be the secondary device and
will not match the ``main_device``.

Some compositors may support switching their rendering device at runtime or
changing their rendering device depending on the surface. When the rendering
device changes for a surface, such compositors may re-send the feedback
parameters with a different ``main_device``. However there is a risk that
clients don't support switching their device at runtime and continue using the
previous device. For this reason, compositors should always have a fallback
rendering device that they initially send as ``main_device``, such that these
clients use said fallback device.

Compositors should not change the ``main_device`` on-the-fly when explicit
modifiers are not supported, because there's a risk of importing buffers
with an implicit non-linear modifier as a linear buffer, resulting in
misinterpreted buffer contents.

Compositors should not send feedback parameters if they don't have a fallback
path. For instance, compositors shouldn't send a format/modifier supported for
direct scan-out but not supported by the rendering API for texturing.
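
As an illustration of the fallback rule above, a compositor could build its
scan-out tranche by keeping only the pairs that the rendering API can also
texture from. The following is a minimal sketch; ``struct format_modifier``,
``pair_in_set`` and ``filter_scanout_pairs`` are hypothetical names, and a
real compositor would likely use its own format set data structure.

.. code-block:: c

    #include <assert.h>
    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    /* Hypothetical in-memory representation of a format/modifier pair. */
    struct format_modifier {
        uint32_t format;
        uint64_t modifier;
    };

    /* Returns true if pair is present in the given set. */
    static bool pair_in_set(const struct format_modifier *set, size_t len,
            struct format_modifier pair) {
        for (size_t i = 0; i < len; i++) {
            if (set[i].format == pair.format &&
                    set[i].modifier == pair.modifier) {
                return true;
            }
        }
        return false;
    }

    /* Copies into out the scan-out pairs that are also supported for
     * texturing, preserving their order. out must have room for scanout_len
     * entries. Returns the number of pairs written. */
    static size_t filter_scanout_pairs(const struct format_modifier *scanout,
            size_t scanout_len, const struct format_modifier *render,
            size_t render_len, struct format_modifier *out) {
        assert(out != NULL || scanout_len == 0);
        size_t n = 0;
        for (size_t i = 0; i < scanout_len; i++) {
            if (pair_in_set(render, render_len, scanout[i])) {
                out[n++] = scanout[i];
            }
        }
        return n;
    }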

Compositors can decide to use multiple tranches to describe the allocation
parameters optimized for texturing. For example, if there are formats which
have a fast texturing path and formats which have a slower texturing path, the
compositor can decide to expose two separate tranches.

Compositors can decide to use intermediate tranches to describe code-paths
slower than direct scan-out but faster than texturing. For instance, a
compositor could insert an intermediate tranche if it's possible to use a
mem2mem device to convert buffers to be able to use scan-out.

``dev_t`` encoding
==================

The protocol carries ``dev_t`` values on the wire using arrays. A compositor
written in C can encode the values as follows:

.. code-block:: c

    /* drm_node_stat is assumed to have been filled in by stat() on the DRM
     * node beforehand. */
    struct stat drm_node_stat;
    struct wl_array dev_array = {
        .size = sizeof(drm_node_stat.st_rdev),
        .data = &drm_node_stat.st_rdev,
    };

A client can decode the values as follows:

.. code-block:: c

    dev_t dev;
    assert(dev_array->size == sizeof(dev));
    memcpy(&dev, dev_array->data, sizeof(dev));
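
A client that prefers to handle a malformed array without aborting could wrap
the decoding in a small helper. The sketch below is one possible variant;
``decode_dev`` is a hypothetical name, and ``data`` and ``size`` are meant to
be the fields of the same name of the received ``wl_array``.

.. code-block:: c

    #include <assert.h>
    #include <stdbool.h>
    #include <stddef.h>
    #include <string.h>
    #include <sys/types.h>

    /* Copies a dev_t out of the bytes received on the wire. Returns false
     * and leaves *out untouched if the array doesn't have the size of a
     * dev_t. */
    static bool decode_dev(const void *data, size_t size, dev_t *out) {
        assert(out != NULL);
        if (data == NULL || size != sizeof(*out)) {
            return false;
        }
        memcpy(out, data, sizeof(*out));
        return true;
    }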

Because two DRM nodes can refer to the same DRM device while having different
``dev_t`` values, clients should use ``drmDevicesEqual`` to compare two
devices.

``format_table`` encoding
=========================

The ``format_table`` event carries a file descriptor containing a list of
format + modifier pairs. The list is an array of pairs which can be accessed
with this C structure definition:

.. code-block:: c

    struct dmabuf_format_modifier {
        uint32_t format;
        uint32_t pad; /* unused */
        uint64_t modifier;
    };
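
A client could access the table by mapping the file descriptor, as in the
following sketch. It assumes that the size of the table in bytes is known to
the client and that a read-only private mapping is appropriate;
``map_format_table`` and ``format_table_get`` are hypothetical helper names.

.. code-block:: c

    #include <assert.h>
    #include <stddef.h>
    #include <stdint.h>
    #include <sys/mman.h>

    struct dmabuf_format_modifier {
        uint32_t format;
        uint32_t pad; /* unused */
        uint64_t modifier;
    };

    _Static_assert(sizeof(struct dmabuf_format_modifier) == 16,
        "each table entry is expected to be 16 bytes");

    /* Maps the table read-only. Returns NULL if size isn't a non-zero
     * multiple of the entry size or if mmap() fails. On success, stores the
     * number of entries in *len; the caller is responsible for calling
     * munmap() with the same size when done. */
    static const struct dmabuf_format_modifier *map_format_table(int fd,
            size_t size, size_t *len) {
        assert(len != NULL);
        if (size == 0 || size % sizeof(struct dmabuf_format_modifier) != 0) {
            return NULL;
        }
        void *map = mmap(NULL, size, PROT_READ, MAP_PRIVATE, fd, 0);
        if (map == MAP_FAILED) {
            return NULL;
        }
        *len = size / sizeof(struct dmabuf_format_modifier);
        return map;
    }

    /* Bounds-checked lookup: returns NULL if index is out of range. */
    static const struct dmabuf_format_modifier *format_table_get(
            const struct dmabuf_format_modifier *table, size_t len,
            size_t index) {
        return index < len ? &table[index] : NULL;
    }

Checking indices against the table length before dereferencing avoids reading
past the end of the mapping if an unexpected index is received.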

Integration with other APIs
===========================

- libdrm: ``drmGetDeviceFromDevId`` returns a ``drmDevice`` from a device ID.
- EGL: the `EGL_EXT_device_drm_render_node`_ extension may be used to query the
  DRM device render node used by a given EGL display. When unavailable, the
  older `EGL_EXT_device_drm`_ extension may be used as a fallback.
- Vulkan: the `VK_EXT_physical_device_drm`_ extension may be used to query the
  DRM device used by a given ``VkPhysicalDevice``.

.. _EGL_EXT_device_drm: https://www.khronos.org/registry/EGL/extensions/EXT/EGL_EXT_device_drm.txt
.. _EGL_EXT_device_drm_render_node: https://www.khronos.org/registry/EGL/extensions/EXT/EGL_EXT_device_drm_render_node.txt
.. _VK_EXT_physical_device_drm: https://www.khronos.org/registry/vulkan/specs/1.2-extensions/man/html/VK_EXT_physical_device_drm.html