Off Main Thread Painting
========================

OMTP, or ‘off main thread painting’, is the component of Gecko that
allows us to perform painting of web content off of the main thread.
This gives us more time on the main thread for JavaScript, layout,
display list building, and other tasks, which allows us to increase our
responsiveness.

Take a look at this `blog
post <https://mozillagfx.wordpress.com/2017/12/05/off-main-thread-painting/>`__
for an introduction.

Background
----------

Painting (or rasterization) is the last operation that happens in a
layer transaction before we forward it to the compositor. At this point,
all display items have been assigned to a layer and invalid regions have
been calculated and assigned to each layer.

The painted layer uses a content client to acquire a buffer for
painting. The main purpose of the content client is to allow us to
retain already painted content when we are scrolling a layer. We have
two main strategies for this: rotated buffer and tiling.

This is implemented with two class hierarchies: ``ContentClient`` for
rotated buffer and ``TiledContentClient`` for tiling. Additionally, we
have two different painted layer implementations, ``ClientPaintedLayer``
and ``ClientTiledPaintedLayer``.

The main distinction between rotated buffer and tiling is the number of
graphics surfaces required. Rotated buffer uses just a single buffer
for a frame but potentially requires painting into it multiple times.
Tiling uses multiple buffers but doesn’t require painting into the
buffers multiple times.

Once the painted layer has a surface (or surfaces with tiling) to paint
into, each surface is wrapped in a ``DrawTarget`` of some form and a
callback to ``FrameLayerBuilder`` is called. This callback uses the assigned
display items and invalid regions to trigger rasterization.
Each ``nsDisplayItem`` has its ``Paint`` method called with the provided
``DrawTarget`` that represents the surface, and it paints into it.

High level
----------

The key abstraction that allows us to paint off of the main thread is
``DrawTargetCapture`` [1]_. ``DrawTargetCapture`` is a special
``DrawTarget`` which records all draw commands for replaying to another
draw target in the local process. This is similar to
``DrawTargetRecording``, but it only holds a reference to resources instead
of copying them into the command stream. This allows the command stream
to be much more lightweight than that of ``DrawTargetRecording``.

OMTP works by instrumenting the content clients to use a capture target
for all painting [2]_ [3]_ [4]_ [5]_. This capture draw target records all
the operations that would normally be performed directly on the
surface’s draw target. Once we have all of the commands, we send the
capture and surface draw target to the ``PaintThread`` [6]_ where the
commands are replayed onto the surface. Once the rasterization is done,
we forward the layer transaction to the compositor.

Tiling and parallel painting
----------------------------

We can make one additional improvement if we are using tiling as our
content client backend.

When we are tiling, the screen is subdivided into a grid of equally
sized surfaces and draw commands are performed on the tiles they affect.
Each tile is independent of the others, so we’re able to parallelize
painting by using a worker thread pool and dispatching a task for each
tile individually.

This is commonly referred to as P-OMTP or parallel painting.

Main thread rasterization
-------------------------

Even with OMTP it’s still possible for the main thread to perform
rasterization.
A common pattern for painting code is to create a
temporary draw target, perform drawing with it, take a snapshot, and
then draw the snapshot onto the main draw target. This is done for
blurs, box shadows, text shadows, and with the basic layer manager
fallback.

If the temporary draw target is not a draw target capture, then this
will perform rasterization on the main thread. This can be bad as it
lowers our parallelism and can cause contention with content backends,
like Direct2D, that use locking around shared resources.

To work around this, we changed the main thread painting code to use a
draw target capture for these operations and added a source surface
capture [7]_ which only resolves the painting of the draw commands when
needed on the paint thread.

There are still cases where we can end up performing main thread
rasterization, but we try to address them as they come up.

Out of memory issues
--------------------

The web is very complex, and so we can sometimes have a large number of
draw commands for a content paint. We’ve observed OOM errors for capture
command lists that have grown to 200 MiB.

We initially tried to mitigate this by lowering the overhead of capture
command lists. We do this by filtering commands that don’t actually
change the draw target state and folding consecutive transform changes,
but that was not always enough. So we added the ability for our draw
target captures to flush their command lists to the surface draw target
while we are capturing on the main thread [8]_.

This is triggered by a configurable memory limit. Because this
introduces a new source of main thread rasterization, we have to strike a
balance: setting the limit too low hurts performance, while setting it too
high risks crashes.
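
To make the record, replay, and flush-on-limit behaviour described in this
section concrete, here is a minimal C++ sketch. It is not the Moz2D
implementation: ``ToySurface``, ``ToyCommand``, ``ToyCapture``, and the
caller-supplied byte counts are hypothetical names and simplifications
invented for illustration, and everything runs on a single thread.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a surface draw target. It only logs the labels
// of the commands rasterized into it. This is not the Moz2D DrawTarget API.
struct ToySurface {
  std::vector<std::string> rasterized;
};

// One recorded draw command: an assumed size in bytes plus a closure that
// replays the command onto a surface.
struct ToyCommand {
  std::size_t bytes;
  std::function<void(ToySurface&)> replay;
};

// Hypothetical capture target. It records commands instead of drawing and
// flushes them to the destination surface once the list exceeds a limit.
class ToyCapture {
 public:
  ToyCapture(ToySurface& aDest, std::size_t aFlushLimitBytes)
      : mDest(aDest), mFlushLimitBytes(aFlushLimitBytes) {}

  // Record a command. aBytes is the caller's estimate of its memory cost.
  void Record(const std::string& aLabel, std::size_t aBytes) {
    mUsedBytes += aBytes;
    mCommands.push_back(ToyCommand{
        aBytes,
        [aLabel](ToySurface& aSurface) {
          aSurface.rasterized.push_back(aLabel);
        }});
    if (mUsedBytes > mFlushLimitBytes) {
      // Over budget: rasterize now on the recording thread, trading
      // parallelism for bounded memory use.
      ReplayAndClear();
      ++mSyncFlushes;
    }
  }

  // Replay everything recorded so far, in order, then drop the commands.
  // With OMTP this step would normally happen on the paint thread.
  void ReplayAndClear() {
    for (const ToyCommand& command : mCommands) {
      command.replay(mDest);
    }
    mCommands.clear();
    mUsedBytes = 0;
  }

  std::size_t PendingCommands() const { return mCommands.size(); }
  std::size_t SyncFlushes() const { return mSyncFlushes; }

 private:
  ToySurface& mDest;
  std::size_t mFlushLimitBytes;
  std::size_t mUsedBytes = 0;
  std::size_t mSyncFlushes = 0;
  std::vector<ToyCommand> mCommands;
};
```

With a limit of 100 bytes, recording three 40-byte commands triggers one
synchronous flush on the third command, after which the pending list is
empty. A lower limit makes such flushes more frequent, which is the
performance cost the paragraph above refers to.
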

Synchronization
---------------

OMTP is conceptually simple, but in practice it relies on subtle code to
ensure thread safety. This was arguably the most difficult part
of the project.

There are roughly four areas that are critical.

1. Compositor message ordering

   Immediately after we queue the async paints, we have a problem. We
   need to forward the layer transaction at some point, but the
   compositor cannot process the transaction until all async paints have
   finished. If it did, it could access unfinished painted content.

   We obviously can’t block on the async paints completing as that would
   defeat the whole point of OMTP. We also can’t hold off on sending the
   layer transaction to ``IPDL``, as we’d trigger race conditions for
   messages sent after the layer transaction is built but before it is
   forwarded. Reftests and other code assume that messages sent after a
   layer transaction to the compositor are processed after that layer
   transaction is processed.

   The solution is to forward the layer transaction to the compositor
   over ``IPDL``, but flag the message channel to start postponing
   messages [9]_. Then once all async paints have completed, we unflag
   the message channel and all postponed messages are sent [10]_. This
   allows us to keep our message ordering guarantees and not have to
   worry about scheduling a runnable in the future.

2. Texture clients

   The backing store for content surfaces is managed by texture clients.
   While async paints are executing, it’s possible for shutdown or any
   number of things to happen that could cause the layer manager, all
   layers, all content clients, and therefore all texture clients to be
   destroyed. Therefore it’s important that we keep these texture
   clients alive throughout async painting. Texture clients also manage
   IPC resources and must be destroyed on the main thread, so we are
   careful to do that [11]_.

3. Double buffering

   We currently double buffer our content painting: our content clients
   only ever have zero or one texture that is available to be painted
   into at any moment.

   This implies that we cannot start async painting a layer tree while
   previous async paints are still active, as this would lead to
   races. We also don’t support multiple nested sets of postponed IPC
   messages, which would be needed to send the first layer transaction
   to the compositor but not the second.

   To prevent issues with this, we flush all active async paints before
   we begin to paint a new layer transaction [12]_.

   There was some initial debate about implementing triple buffering for
   content painting, but we have not seen evidence it would help us
   significantly.

4. Moz2D thread safety

   Finally, most Moz2D objects were not thread safe. We had to insert
   special locking into draw target and source surface as they have a
   special copy on write relationship that must be consistent even if
   they are on different threads.

   Some platform specific resources like fonts needed locking added in
   order to be thread safe. We also did some work to make filter nodes
   work with multiple threads executing them at the same time.

Browser process
---------------

Currently only content processes are able to use OMTP.

This restriction was added because of concern about message ordering
between ``APZ`` and OMTP. It might be possible to lift it in the future.

Important bugs
--------------

1. `OMTP Meta <https://bugzilla.mozilla.org/show_bug.cgi?id=omtp>`__
2. `Enable on
   Windows <https://bugzilla.mozilla.org/show_bug.cgi?id=1403935>`__
3. `Enable on
   OSX <https://bugzilla.mozilla.org/show_bug.cgi?id=1422392>`__
4. `Enable on
   Linux <https://bugzilla.mozilla.org/show_bug.cgi?id=1432531>`__
5. `Parallel
   painting <https://bugzilla.mozilla.org/show_bug.cgi?id=1425056>`__

Code links
----------

.. [1] `DrawTargetCapture <https://searchfox.org/mozilla-central/rev/dd965445ec47fbf3cee566eff93b301666bda0e1/gfx/2d/DrawTargetCapture.h#22>`__
.. [2] `Creating DrawTargetCapture for rotated
   buffer <https://searchfox.org/mozilla-central/rev/dd965445ec47fbf3cee566eff93b301666bda0e1/gfx/layers/client/ContentClient.cpp#185>`__
.. [3] `Dispatch DrawTargetCapture for rotated
   buffer <https://searchfox.org/mozilla-central/rev/dd965445ec47fbf3cee566eff93b301666bda0e1/gfx/layers/client/ClientPaintedLayer.cpp#99>`__
.. [4] `Creating DrawTargetCapture for
   tiling <https://searchfox.org/mozilla-central/rev/dd965445ec47fbf3cee566eff93b301666bda0e1/gfx/layers/client/TiledContentClient.cpp#714>`__
.. [5] `Dispatch DrawTargetCapture for
   tiling <https://searchfox.org/mozilla-central/rev/dd965445ec47fbf3cee566eff93b301666bda0e1/gfx/layers/client/MultiTiledContentClient.cpp#288>`__
.. [6] `PaintThread <https://searchfox.org/mozilla-central/rev/dd965445ec47fbf3cee566eff93b301666bda0e1/gfx/layers/PaintThread.h#53>`__
.. [7] `SourceSurfaceCapture <https://searchfox.org/mozilla-central/rev/dd965445ec47fbf3cee566eff93b301666bda0e1/gfx/2d/SourceSurfaceCapture.h#19>`__
.. [8] `Sync flushing draw
   commands <https://searchfox.org/mozilla-central/rev/dd965445ec47fbf3cee566eff93b301666bda0e1/gfx/2d/DrawTargetCapture.h#165>`__
.. [9] `Postponing messages for
   PCompositorBridge <https://searchfox.org/mozilla-central/rev/dd965445ec47fbf3cee566eff93b301666bda0e1/gfx/layers/ipc/CompositorBridgeChild.cpp#1319>`__
.. [10] `Releasing messages for
   PCompositorBridge <https://searchfox.org/mozilla-central/rev/dd965445ec47fbf3cee566eff93b301666bda0e1/gfx/layers/ipc/CompositorBridgeChild.cpp#1303>`__
.. [11] `Releasing texture clients on main
   thread <https://searchfox.org/mozilla-central/rev/dd965445ec47fbf3cee566eff93b301666bda0e1/gfx/layers/ipc/CompositorBridgeChild.cpp#1170>`__
.. [12] `Flushing async
   paints <https://searchfox.org/mozilla-central/rev/dd965445ec47fbf3cee566eff93b301666bda0e1/gfx/layers/client/ClientLayerManager.cpp#289>`__