# media/mojo

This folder contains mojo interfaces, clients and implementations that extend
the core "media" target to support most out-of-process use cases, including
Media Player, Metrics (WatchTime), etc.

Currently the “media” target does not depend on mojo, so that other applications
can use the “media” target without having to pull in the mojo dependency.

[TOC]

## Media Player

### Media Components

Media Player (`WebMediaPlayer`) supports HTML5 \<video\> playback in Chromium.
Internally, it depends on many **media components** to perform specific tasks,
e.g. **media renderer**, **audio decoder**, **video decoder**, and
**content decryption module** (CDM). A CDM is required for a *media renderer*,
*audio decoder* or *video decoder* to handle protected content. See more details
in the general [media documentation](/media).

While most of the HTML5 media player stack and the Encrypted Media Extensions
(EME) stack live in the sandboxed render process (e.g. for security reasons),
there are cases where some media components must live in a different process.
For example:

* A hardware-based media renderer, where all audio/video decoding and rendering
  happens in hardware, which is not accessible in the sandboxed render process.
* A hardware-based video decoder, where the hardware decoding libraries are not
  accessible in the sandboxed render process.
* On Android, a media component may depend on Android Java APIs, which are not
  accessible in the sandboxed render process.
* A CDM contains third-party code and should run in its own sandboxed process.

Here we provide a generic framework to support most out-of-process (OOP) media
component use cases in Chromium.

### Media Player Mojo Interfaces

We use mojom interfaces as the transport layer of each media component to
support hosting them remotely. These interfaces are called **media player mojo
interfaces**. They are very similar to their C++ counterparts:

* `media::Renderer` -> `media::mojom::Renderer`
* `media::AudioDecoder` -> `media::mojom::AudioDecoder`
* `media::VideoDecoder` -> `media::mojom::VideoDecoder`
* `media::ContentDecryptionModule` -> `media::mojom::ContentDecryptionModule`
* `media::Decryptor` -> `media::mojom::Decryptor`

### Enable Remote Media Components

Standard clients and implementations of these interfaces are provided. For
example, for `media::Renderer`, `MojoRenderer` implements `media::Renderer` and
forwards calls to a `media::mojom::RendererPtr`. `MojoRendererService`
implements `media::mojom::Renderer` and can host any `media::Renderer`
implementation.
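
For illustration, here is a minimal sketch of this client-side forwarding
pattern. It is not the actual `MojoRenderer` code: only a few representative
`media::Renderer` methods are shown, the include paths are indicative, and the
newer `mojo::Remote` bindings types are used rather than the `RendererPtr`
style mentioned above.

```
#include <utility>

#include "base/time/time.h"
#include "media/base/renderer.h"
#include "media/mojo/mojom/renderer.mojom.h"  // path varies by Chromium version
#include "mojo/public/cpp/bindings/pending_remote.h"
#include "mojo/public/cpp/bindings/remote.h"

// Sketch of a media::Renderer that simply forwards calls over a mojo pipe to
// a MojoRendererService in another process. Initialization, flushing, error
// handling, etc. are omitted, so this class stays abstract.
class ForwardingRendererSketch : public media::Renderer {
 public:
  explicit ForwardingRendererSketch(
      mojo::PendingRemote<media::mojom::Renderer> remote_renderer)
      : remote_renderer_(std::move(remote_renderer)) {}

  // media::Renderer implementation (subset): each call is forwarded over the
  // message pipe to the remote implementation.
  void SetPlaybackRate(double playback_rate) override {
    remote_renderer_->SetPlaybackRate(playback_rate);
  }
  void SetVolume(float volume) override {
    remote_renderer_->SetVolume(volume);
  }
  void StartPlayingFrom(base::TimeDelta time) override {
    remote_renderer_->StartPlayingFrom(time);
  }

 private:
  mojo::Remote<media::mojom::Renderer> remote_renderer_;
};
```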

Remote media components can be easily enabled and seamlessly integrated into
the current media pipeline. Simply set the gn argument `mojo_media_services` to
specify which remote media components you want to enable. For example, with the
following gn arguments, the media pipeline will enable `MojoRenderer` and
`MojoCdm`:
```
enable_mojo_media = true
mojo_media_services = ["renderer", "cdm"]
```
Note that `enable_mojo_media` must also be set for `mojo_media_services` to
take effect.

### Media Mojo Interface Factory

`media::mojom::InterfaceFactory` has factory methods like `CreateRenderer()`,
`CreateCdm()`, etc. It is used to request media player mojo interfaces.
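
For example, requesting a remote renderer through the factory follows the usual
mojo pattern of creating a pipe and passing one end across. This is a rough
sketch only: the real `CreateRenderer()`/`CreateCdm()` methods take additional
arguments that are omitted here.

```
// `interface_factory` is assumed to be a
// mojo::Remote<media::mojom::InterfaceFactory> that is already connected
// (in the render process it is obtained from the browser, see below).
mojo::PendingRemote<media::mojom::Renderer> renderer_remote;

// Ask the remote InterfaceFactory implementation to bind a new
// media::mojom::Renderer pipe (extra parameters omitted in this sketch).
interface_factory->CreateRenderer(
    renderer_remote.InitWithNewPipeAndPassReceiver());

// `renderer_remote` can now back a MojoRenderer, which forwards
// media::Renderer calls over the new pipe.
```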

In the render process, each `RenderFrameImpl` has a
`mojo::PendingRemote<media::mojom::InterfaceFactory>` which is used to request
all media player mojo interfaces for that frame from the browser process. In
the browser process, each `RenderFrameHostImpl` owns a `MediaInterfaceProxy`,
which implements `media::mojom::InterfaceFactory`.

`MediaInterfaceProxy` is a central hub for handling media player mojo interface
requests. By default it will forward all the requests to the
[`MediaService`](#MediaService). But it also has the flexibility to handle some
special or more complicated use cases. For example:

* On desktop platforms, when the library CDM is enabled, the
  `media::mojom::ContentDecryptionModule` request will be forwarded to the
  [`CdmService`](#CdmService) running in its own CDM (utility) process.
* On Android, the `media::mojom::Renderer` request is handled in the
  `RenderFrameHostImpl` context directly by creating `MediaPlayerRenderer` in
  the browser process, even though the `MediaService` is configured to run in
  the GPU process.
* On Chromecast, the `media::mojom::Renderer` and
  `media::mojom::ContentDecryptionModule` requests are handled by
  [`MediaRendererService`](#MediaRendererService), which runs in the browser
  process. The `media::mojom::VideoDecoder` request is handled by the default
  `MediaService`, which runs in the GPU process.

Note that the `media::mojom::InterfaceFactory` interface is reused in the
communication between `MediaInterfaceProxy` and `MediaService` (see
[below](#Site-Isolation)).
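
This per-interface routing can be pictured roughly as follows. The sketch is
purely illustrative: `GetMediaInterfaceFactory()` and
`CreateMediaPlayerRenderer()` are hypothetical helpers, not the actual
`MediaInterfaceProxy` methods, and parameters are simplified.

```
// Hypothetical sketch of MediaInterfaceProxy-style routing; not real code.
void MediaInterfaceProxySketch::CreateRenderer(
    mojo::PendingReceiver<media::mojom::Renderer> receiver) {
#if defined(OS_ANDROID)
  // Handled directly in the RenderFrameHostImpl context by hosting a
  // MediaPlayerRenderer in the browser process.
  CreateMediaPlayerRenderer(std::move(receiver));
#else
  // Default: forward the request to the InterfaceFactory exposed by the
  // MediaService (or MediaRendererService on Chromecast).
  GetMediaInterfaceFactory()->CreateRenderer(std::move(receiver));
#endif
}
```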

### MediaService

`MediaService` is a mojo `service_manager::Service` that provides media player
mojo interface implementations. It comes with several benefits, described
below.

#### Flexible Process Model

Different platforms or products have different requirements on where the remote
media components should run. For example, a hardware decoder typically should
run in the GPU process. The `ServiceManagerContext` provides the ability to run
a `service_manager::Service` in-process (browser), out-of-process (utility) or
in the GPU process. Therefore, by using a `MediaService`, it’s very easy to
support hosting remote media component interfaces in the most common Chromium
process types (Browser/Utility/GPU). This can be set using the gn argument
`mojo_media_host`, e.g.
```
mojo_media_host = "browser"  # or "gpu" or "utility"
```

`MediaService` is registered in `ServiceManagerContext` using
`kMediaServiceName`. `mojo_media_host` is checked to decide in which process
the service is registered to run.

#### Connects Different Media Components

Some remote media components depend on other components to work. For example, a
Renderer, an audio decoder or a video decoder needs a CDM to be able to handle
encrypted streams. Typically there's a `SetCdm()` call to connect the renderer
or decoder with the CDM. If, for example, a Renderer interface and a CDM
interface are hosted separately, it would be hard to implement the `SetCdm()`
call: it would require an object or entity that is aware of both sides to
connect them. `MediaService` handles this internally and actually serves as
such an object, so you don’t have to reinvent the wheel. See more details
[below](#Using-CdmContext).

#### Customization through MojoMediaClient

`MediaService` provides everything needed to host an OOP media component, but
it doesn’t provide the media component itself. It’s up to the client of
`MediaService` to provide the concrete media component implementations.

The `MojoMediaClient` interface provides a way for `MediaService` clients to
supply concrete media component implementations. When `MediaService` is
created, a `MojoMediaClient` must be passed in so that `MediaService` knows how
to create the media components.
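
A minimal sketch of such a client is shown below. `ProductMojoMediaClient`,
`ProductRenderer` and `ProductCdmFactory` are hypothetical, and the hook
signatures are simplified relative to the real `MojoMediaClient` interface
(see `media/mojo/services/mojo_media_client.h` for the actual ones).

```
// Sketch only: a product-specific MojoMediaClient that tells MediaService
// which concrete components to host. The hook signatures are simplified
// (no `override`), and ProductRenderer/ProductCdmFactory are hypothetical.
class ProductMojoMediaClient : public media::MojoMediaClient {
 public:
  // Called (roughly) when a MojoRendererService needs a media::Renderer to
  // host; e.g. Chromecast's CastMojoMediaClient returns a CastRenderer.
  std::unique_ptr<media::Renderer> CreateRenderer(
      scoped_refptr<base::SingleThreadTaskRunner> media_task_runner) {
    return std::make_unique<ProductRenderer>(std::move(media_task_runner));
  }

  // Called when a CDM is requested; the returned CdmFactory lets
  // MojoCdmService create the product-specific CDM (e.g. CastCdm).
  std::unique_ptr<media::CdmFactory> CreateCdmFactory() {
    return std::make_unique<ProductCdmFactory>();
  }
};
```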

For example, Chromecast uses `MediaService` to host a media Renderer and a CDM
in the browser process, and it provides the `CastRenderer` and `CastCdm` through
`CastMojoMediaClient`, a `MojoMediaClient` implementation. Note that this
overriding mechanism is not implemented everywhere; it’s trivial to add and
we’ll only add it when we need it.

#### Site Isolation

In Blink, both the media element and EME MediaKeys belong to a `WebLocalFrame`.
In Chromium, this translates to the media player and CDM belonging to a
`RenderFrame`. In the render process, this is clear. However, when hosting all
remote media components in a single `MediaService` (the service manager only
supports one service instance per process), the frame boundary could get fuzzy.
This is especially dangerous for media components that interact with each
other. For example, if a Renderer from foo.com lives in the same `MediaService`
instance as a CDM from bar.net, it would be wrong for the bar.net CDM to be set
on the foo.com Renderer to handle decryption.

To prevent this from happening, we introduce an additional layer to simulate
the `RenderFrame` boundary. A `MediaService` hosts multiple `InterfaceFactory`
instances (one per `RenderFrame`), and each instance creates and manages the
media components for its frame.

For this reason, the `media::mojom::InterfaceFactory` interface is reused in
the communication between `MediaInterfaceProxy` and `MediaService`.

> Note: there are plans to split apart the responsibilities of
`media::mojom::InterfaceFactory` to make it clear which interfaces are used
where.

#### Specialized Out-of-Process media::Renderers

The `media::Renderer` interface is a simple API, which is general enough to
capture the essence of high-level media playback commands. This allows us to
extend the functionality of the `WebMediaPlayer` via **specialized renderers**.
Specifically, we can build a sub-component that encapsulates the complexities of
an advanced scenario, write a small adapter layer that exposes the component as
a `media::Renderer`, and embed it within the existing `media::Pipeline` state
machine. Specialized Renderers reduce technical complexity costs: by limiting
the scope of details to the files and classes that need them, by requiring
little control flow boilerplate, and by generally having little impact on the
default paths that `WebMediaPlayer` uses most of the time.

Two examples of complex scenarios enabled by specialized renderers are: handling
HLS playback on Android by delegating it to the Android Media Player (see
`MediaPlayerRenderer`) and casting "src=" media from an Android phone to a cast
device (see `FlingingRenderer`). Both of these examples have sub-components that
need to live in the Browser process. We therefore proxy the
`MediaPlayerRenderer` and `FlingingRenderer` to the Browser process, using the
Mojo interfaces defined in `renderer.mojom` and `renderer_extensions.mojom`.
This idea can be generalized to handle any special case *Foo scenario* as a
**specialized OOP FooRenderer**.

The classes required to create a *specialized OOP FooRenderer* come in pairs,
serving similar purposes in their respective processes. By convention, the
`FooRenderer` lives in the target process and the `FooRendererClient` lives in
the Renderer process. The `MojoRenderer` and `MojoRendererService` proxy
`media::Renderer` and `media::RendererClient` calls to/from the
`FooRenderer[Client]`. One-off commands and events that can't be expressed as a
`media::Renderer[Client]` call are carried across process boundaries by
*renderer extensions* instead (see `renderer_extension.mojom`). The
`FooRenderer[Client]Extension` mojo interfaces are implemented by their
respective `FooRenderer[Client]` instances directly. The
`FooRenderer[Client]Factory` sets up the scenario-specific boilerplate, and all
of the mojo interface pointers/requests needed to talk to the other process.
Interface pointers and requests are connected across process boundaries when
`mojom::InterfaceFactory::CreateFooRenderer()` is called. The end result is
illustrated below:

![Communication diagram for an OOP Renderer](./renderer_extension_diagram.png)

To allow the creation and use of a `FooRenderer` within `WebMediaPlayer`, a
`FooRendererClientFactory` must be built and passed to the
`RendererFactorySelector`. The `RendererFactorySelector` must also be given a
way to query whether we are currently in a scenario that requires the use of
the `FooRenderer`. When we enter a *Foo scenario*, cycling the `media::Pipeline`
via `suspend()`/`resume()` should be enough for the next call to
`RendererFactorySelector::GetCurrentFactory()` to return the
`FooRendererClientFactory`. When `RendererFactory::CreateRenderer()` is called,
the pipeline will receive a `FooRendererClient` as an opaque `media::Renderer`.
The default pipeline state machine will control the OOP `FooRenderer`.
When we exit the *Foo scenario*, cycling the pipeline once more should bring us
back into the right state.
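
A rough wiring sketch of this setup is shown below. The names follow the *Foo*
placeholder convention used above; `FactoryType::kFoo` and
`SetIsFooScenarioCallback()` are hypothetical, and the real
`RendererFactorySelector` API differs in detail.

```
// WebMediaPlayer-side setup (sketch): register the specialized factory next
// to the default one, plus a predicate that says when the Foo scenario is
// active.
auto foo_factory = std::make_unique<FooRendererClientFactory>(
    GetMediaInterfaceFactory());  // later used for CreateFooRenderer().
renderer_factory_selector->AddFactory(FactoryType::kFoo,
                                      std::move(foo_factory));
renderer_factory_selector->SetIsFooScenarioCallback(  // hypothetical hook
    base::BindRepeating(&IsFooScenarioActive));

// Later, when the pipeline is cycled via suspend()/resume(), it calls
// GetCurrentFactory(); in a Foo scenario this returns the
// FooRendererClientFactory, whose CreateRenderer() builds a FooRendererClient
// bound to a browser-side FooRenderer through
// mojom::InterfaceFactory::CreateFooRenderer().
```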

#### Support Other Clients

`MediaService`, as a `service_manager::Service`, can be used by clients other
than the media player in the render process. For example, we could have another
(mojo) service that handles audio data and wants to play it in a media Renderer.
Since `MediaService` is a mojo service, it’s very convenient for any other mojo
service to connect to it through a `service_manager::mojom::Connector` and use
the remote media Renderer it hosts.
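
Roughly, assuming the `service_manager`-era `Connector` API this document
describes, such a client would do something like the following (the top-level
`media::mojom::MediaService` methods are not shown):

```
// Connect to the MediaService by name via the service manager.
media::mojom::MediaServicePtr media_service;
connector->BindInterface(media::mojom::kMediaServiceName,
                         mojo::MakeRequest(&media_service));

// From here, request a media::mojom::InterfaceFactory (see "Media Mojo
// Interface Factory" above) and then create a remote Renderer through it.
```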

### CdmService

Although `MediaService` supports `media::mojom::ContentDecryptionModule`, in
some cases (e.g. the library CDM on desktop) the remote CDM needs to run in its
own process, typically for security reasons. `CdmService` is introduced to
handle this. It also implements `service_manager::Service`, and is registered
in `ServiceManagerContext` using `kCdmServiceName`. Currently it’s always
registered to run in the utility process (with the CDM sandbox type).
`CdmService` also has additional support for the library CDM, e.g. loading the
library CDM. Note that `CdmService` only supports
`media::mojom::ContentDecryptionModule` and does NOT support other media player
mojo interfaces.

### MediaRendererService

`MediaRendererService` supports `media::mojom::Renderer` and
`media::mojom::ContentDecryptionModule`. It's hosted in a different process
than the default `MediaService`, and is registered in `ServiceManagerContext`
using `kMediaRendererServiceName`. This makes it possible to run
`media::mojom::VideoDecoder` and `media::mojom::Renderer` in two different
processes. Currently Chromecast uses this to support the `CastRenderer` and CDM
in the browser process and the GPU-accelerated video decoder in the GPU
process. The main goals are:

1. Allow two pages to hold their own video pipelines simultaneously, because
   `CastRenderer` only supports one video pipeline at a time.
2. Support the GPU-accelerated video decoder for the RTC path.

### Mojo CDM and Mojo Decryptor

Mojo CDM is special among all media player mojo interfaces because it is needed
by all local/remote media components to handle encrypted buffers:

1. Local media components like `DecryptingDemuxerStream`,
   `DecryptingAudioDecoder` and `DecryptingVideoDecoder`.
2. Remote media components hosted in `MediaService`, e.g. by
   `MojoRendererService`, `MojoAudioDecoderService` and
   `MojoVideoDecoderService`.

At the JavaScript layer, the media player and MediaKeys are connected via
[`setMediaKeys()`](https://w3c.github.io/encrypted-media/#dom-htmlmediaelement-setmediakeys).
This is implemented by `SetCdm()` in the render process.

A media component can use a CDM in two ways.

#### Using a Decryptor (via CdmContext)

Some CDMs provide a `Decryptor` implementation, which supports decryption
methods directly, e.g. `Decrypt()`, `DecryptAndDecode()`, etc. Both the
`AesDecryptor` and the library CDM support the `Decryptor` interface.
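
A minimal sketch of how a component checks for `Decryptor` support, assuming
the `CdmContext::GetDecryptor()` accessor; the `GetAttachedCdmContext()` helper
is hypothetical and the follow-up steps are simplified into comments:

```
// Ask the attached CDM (via its CdmContext) whether it exposes a Decryptor.
media::CdmContext* cdm_context = GetAttachedCdmContext();  // hypothetical
media::Decryptor* decryptor =
    cdm_context ? cdm_context->GetDecryptor() : nullptr;

if (decryptor) {
  // The component (e.g. DecryptingDemuxerStream) can call Decrypt() /
  // DecryptAndDecode() on this Decryptor directly.
} else {
  // No Decryptor available: fall back to the CdmContext route described in
  // the next subsection (e.g. MediaCrypto/MediaCryptoContext on Android).
}
```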

In the case of a remote CDM, e.g. one hosted by `MojoCdmService` in
`MediaService` or `CdmService`, if the remote CDM supports the `Decryptor`
interface, the `MojoCdm` will also support the `Decryptor` interface,
implemented by `MojoDecryptor`, which sets up a new message pipe to forward all
`Decryptor` calls to the `Decryptor` in the remote CDM.

#### Using CdmContext

In some cases the media component only works with a specific type of CDM. For
example, on Android, MediaCodec-based decoders (e.g. `MediaCodecAudioDecoder`)
can only use a MediaDrm-based CDM via `MediaCryptoContext`. The media component
and the CDM must live in the same process because the interaction between the
two typically happens deep at the OS level. In theory, they could both live in
the render process, but in practice both the CDM and the media component are
typically hosted by the `MediaService` in a remote (e.g. GPU) process.

To be able to attach a remote CDM to a remote media component, each
`InterfaceFactoryImpl` instance (corresponding to one `RenderFrame`) in the
`MediaService` maintains a `MojoCdmServiceContext` that keeps track of all
remote CDMs created for the `RenderFrame`. Each remote CDM is assigned a unique
CDM ID, which is sent back to the `MojoCdm` in the render process. In the render
process, when `SetCdm()` is called, the CDM ID is passed to the local media
component (e.g. `MojoRenderer`), which forwards it to the remote media component
(e.g. `MojoRendererService`). The remote media component then talks to
`MojoCdmServiceContext` to get the `CdmContext` associated with the CDM ID, and
completes the connection.
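
The handshake can be sketched roughly as follows. The helper names
(`GetRemoteCdmId()`, `FindCdmContext()`) are placeholders rather than the exact
`MojoCdm`/`MojoCdmServiceContext` API, and the callback plumbing is simplified.

```
// Render process (sketch): MojoRenderer receives SetCdm() like any other
// media::Renderer, but instead of a CdmContext pointer it sends the remote
// CDM's ID over the Renderer message pipe.
void MojoRendererSketch::SetCdm(media::CdmContext* cdm_context,
                                media::CdmAttachedCB cdm_attached_cb) {
  remote_renderer_->SetCdm(GetRemoteCdmId(cdm_context),  // placeholder lookup
                           std::move(cdm_attached_cb));
}

// Remote process (sketch): MojoRendererService resolves the ID against the
// per-frame MojoCdmServiceContext and attaches the resulting CdmContext to
// the concrete media::Renderer it hosts.
void MojoRendererServiceSketch::SetCdm(int cdm_id, SetCdmCallback callback) {
  media::CdmContext* cdm_context =
      cdm_service_context_->FindCdmContext(cdm_id);  // placeholder lookup
  renderer_->SetCdm(cdm_context, std::move(callback));
}
```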

### Secure Auxiliary Services

Media components often need other services to work. In the render process, the
local media components get services from the content layer through the
`MediaClient` interface. In `MediaService` and `CdmService`, remote media
components get services from the browser through **secure auxiliary services**.

Note that as `service_manager::Service`s, `MediaService` and `CdmService` can
always connect to other `service_manager::Service`s hosted by the service
manager through the `Connector` interface. However, those are generic services
that don’t belong to any individual `RenderFrame`, or even any user profile.

Some services do require `RenderFrame` or user profile identity, e.g. the file
system. Since media components all belong to a given `RenderFrame`, we must
maintain the frame identity when accessing these services for security reasons.
These services are called secure auxiliary services. `FrameServiceBase` is a
base class for all secure auxiliary services that helps manage their lifetime
(e.g. to handle navigation).

In `MediaInterfaceProxy`, when we request a `media::mojom::InterfaceFactory` in
the `MediaService` or `CdmService`, we call `GetFrameServices()` to configure
which secure auxiliary services are exposed to the remote components over a
separate `blink::mojom::BrowserInterfaceBroker`.

Currently only the remote CDM needs secure auxiliary services. The supported
services are:

* `OutputProtection`: to check output protection status
* `PlatformVerification`: to check whether the platform is secure
* `CdmFileIO`: for the CDM to store persistent data
* `ProvisionFetcher`: for Android MediaDrm device provisioning
* `CdmProxy`: (in progress)

### Security

In most cases, the client side runs in the renderer process, which is the least
trusted. Always assume the client-side code may be compromised, e.g. making
calls in random order or passing in garbage parameters.

Due to the [Flexible Process Model](#Flexible-Process-Model), it's sometimes
hard to know in which process the service side runs. As a rule of thumb, assume
all service-side code may run in a privileged process (e.g. the browser
process), including the common supporting code like `MojoVideoDecoderService`,
as well as the concrete [Media Component](#Media-Components), e.g.
`MediaCodecVideoDecoder` on Android. To know exactly which
[Media Component](#Media-Components) runs in which process in production, see
[Adoption](#Adoption) below.

Also note that all the [Secure Auxiliary Services](#Secure-Auxiliary-Services)
run in a more privileged process than the media components that use them. For
example, all of the existing services run in the browser process except for
`CdmProxy`, which runs in the GPU process. They must defend against compromised
media components.

### Adoption

#### Android

* `MediaService` in the GPU process (registered in `GpuServiceFactory` with
  `GpuMojoMediaClient`)
* `MojoCdm` + `MediaDrmBridge` (CDM)
* `MediaDrmBridge` uses the mojo `ProvisionFetcher` service for CDM provisioning
* `MojoAudioDecoder` + `MediaCodecAudioDecoder`
* `MojoVideoDecoder` + `MediaCodecVideoDecoder` (in progress)
* HLS support:
    * `MojoRenderer` + `MediaPlayerRenderer`
    * NOT using `MediaService`. Instead, `MojoRendererService` is hosted by
      `RenderFrameHostImpl`/`MediaInterfaceProxy` in the browser process
      directly.
* Flinging media to cast devices (RemotePlayback API):
    * `MojoRenderer` + `FlingingRenderer`
    * NOT using `MediaService`. Instead, `MojoRendererService` is hosted by
      `RenderFrameHostImpl`/`MediaInterfaceProxy` in the browser process
      directly.

#### Chromecast

* `MediaService` in the Browser process (registered in
  `CastContentBrowserClient` with `CastMojoMediaClient`)
* `MojoRenderer` + `CastRenderer`
* `MojoCdm` + `CastCdm`

#### Desktop (ChromeOS/Linux/Mac/Windows)

* CdmService
    * `CdmService` in the utility process (registered in `UtilityServiceFactory`
      with `ContentCdmServiceClient`)
    * `MojoCdm` + `CdmAdapter` + library CDM implementation
    * `CdmAdapter` uses various secure auxiliary services
* MediaService (in progress)
    * `MediaService` in the GPU process (registered in `GpuServiceFactory` with
      `GpuMojoMediaClient`)
    * `MojoVideoDecoder` + hardware video decoders such as `D3D11VideoDecoder`
    * Provides `CdmProxy` to the `CdmService`

## Other Services

> TODO(xhwang): Add documentation on other mojo services, e.g. remoting, etc.