# media/mojo

This folder contains mojo interfaces, clients and implementations that extend
the core "media" target to support most out-of-process use cases, including
Media Player, Metrics (WatchTime), etc.

Currently the "media" target does not depend on mojo, so that other applications
can use the "media" target without having to pull in the mojo dependency.

[TOC]

## Media Player

### Media Components

Media Player (`WebMediaPlayer`) supports HTML5 \<video\> playback in Chromium.
Internally, it depends on many **media components** to perform specific
tasks, e.g. the **media renderer**, **audio decoder**, **video decoder**, and
**content decryption module** (CDM). A CDM is required for a *media renderer*,
*audio decoder* or *video decoder* to handle protected content. See the general
[media documentation](/media) for more details.

While most of the HTML5 media player stack and the Encrypted Media Extensions
(EME) stack live in the sandboxed render process (e.g. for security reasons),
there are cases where some media components must live in a different process.
For example:

* A hardware-based media renderer, where all audio/video decoding and rendering
  happens in hardware, which is not accessible in the sandboxed render process.
* A hardware-based video decoder, where the hardware decoding libraries are not
  accessible in the sandboxed render process.
* On Android, a media component that depends on the Android Java API, which is
  not accessible in the sandboxed render process.
* A CDM, which contains third-party code and should run in its own sandboxed
  process.

Here we provide a generic framework to support most out-of-process (OOP) media
component use cases in Chromium.

### Media Player Mojo Interfaces

We use mojom interfaces as the transport layer of each media component to
support hosting them remotely. These interfaces are called **media player mojo
interfaces**.
They are very similar to their C++ counterparts:

* `media::Renderer` -> `media::mojom::Renderer`
* `media::AudioDecoder` -> `media::mojom::AudioDecoder`
* `media::VideoDecoder` -> `media::mojom::VideoDecoder`
* `media::ContentDecryptionModule` -> `media::mojom::ContentDecryptionModule`
* `media::Decryptor` -> `media::mojom::Decryptor`

### Enable Remote Media Components

Standard clients and implementations of these interfaces are provided. For
example, for `media::Renderer`, `MojoRenderer` implements `media::Renderer` and
forwards calls to a `media::mojom::RendererPtr`. `MojoRendererService`
implements `media::mojom::Renderer` and can host any `media::Renderer`
implementation.

Remote media components can be easily enabled and seamlessly integrated into
the current media pipeline. Simply set the gn argument `mojo_media_services` to
specify which remote media components you want to enable. For example, with the
following gn arguments, the media pipeline will enable `MojoRenderer` and
`MojoCdm`:
```
enable_mojo_media = true
mojo_media_services = ["renderer", "cdm"]
```
Note that you must set `enable_mojo_media` first.

### Media Mojo Interface Factory

`media::mojom::InterfaceFactory` has factory methods like `CreateRenderer()`,
`CreateCdm()` etc. It is used to request media player mojo interfaces.

In the render process, each `RenderFrameImpl` has a
`mojo::PendingRemote<media::mojom::InterfaceFactory>` which is used to request
all media player mojo interfaces for that frame from the browser process. In
the browser process, each `RenderFrameHostImpl` owns a `MediaInterfaceProxy`,
which implements `media::mojom::InterfaceFactory`.

`MediaInterfaceProxy` is a central hub for handling media player mojo interface
requests. By default it forwards all requests to the
[`MediaService`](#MediaService).
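
The client/service split described above (e.g. `MojoRenderer` forwarding to a
`MojoRendererService` that hosts a concrete renderer) can be sketched in plain
C++. Everything below is a simplified stand-in, not the real API: there is no
mojo involved, the real `media::Renderer` interface is asynchronous and much
larger, and the `Dispatch` callback merely models a message pipe.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <utility>

// Stand-in for media::Renderer. The real interface is asynchronous and
// has many more methods (Initialize(), SetCdm(), SetVolume(), ...).
class Renderer {
 public:
  virtual ~Renderer() = default;
  virtual void StartPlayingFrom(double time_s) = 0;
  virtual double GetMediaTime() = 0;
};

// A concrete renderer implementation that would normally live in the
// remote (service-side) process.
class FakePlatformRenderer : public Renderer {
 public:
  void StartPlayingFrom(double time_s) override { media_time_s_ = time_s; }
  double GetMediaTime() override { return media_time_s_; }

 private:
  double media_time_s_ = 0;
};

// Stand-in for MojoRendererService: hosts any Renderer implementation
// and executes incoming "messages" against it.
class RendererService {
 public:
  explicit RendererService(std::unique_ptr<Renderer> renderer)
      : renderer_(std::move(renderer)) {}

  // In real code each call arrives over a mojo message pipe; here a
  // plain callback models the dispatch.
  void Dispatch(const std::function<void(Renderer&)>& message) {
    message(*renderer_);
  }

 private:
  std::unique_ptr<Renderer> renderer_;
};

// Stand-in for MojoRenderer: implements Renderer on the client side and
// forwards every call to the remote service.
class RendererProxy : public Renderer {
 public:
  explicit RendererProxy(RendererService* service) : service_(service) {}

  void StartPlayingFrom(double time_s) override {
    service_->Dispatch([time_s](Renderer& r) { r.StartPlayingFrom(time_s); });
  }

  double GetMediaTime() override {
    double t = 0;
    service_->Dispatch([&t](Renderer& r) { t = r.GetMediaTime(); });
    return t;
  }

 private:
  RendererService* service_;  // Not owned.
};
```

The key property this models is that the pipeline only ever sees a
`Renderer`; whether it is local or a proxy to another process is invisible to
the caller.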
It also has the flexibility to handle some special or more complicated use
cases. For example:

* On desktop platforms, when the library CDM is enabled, the
  `media::mojom::ContentDecryptionModule` request will be forwarded to the
  [`CdmService`](#CdmService) running in its own CDM (utility) process.
* On Android, the `media::mojom::Renderer` request is handled in the
  `RenderFrameHostImpl` context directly by creating `MediaPlayerRenderer` in
  the browser process, even though the `MediaService` is configured to run in
  the GPU process.
* On Chromecast, the `media::mojom::Renderer` and
  `media::mojom::ContentDecryptionModule` requests are handled by
  [`MediaRendererService`](#MediaRendererService), which runs in the browser
  process. The `media::mojom::VideoDecoder` request is handled by the default
  `MediaService`, which runs in the GPU process.

Note that the `media::mojom::InterfaceFactory` interface is reused in the
communication between `MediaInterfaceProxy` and `MediaService` (see
[below](#Site-Isolation)).

### MediaService

The MediaService is a mojo `service_manager::Service` that provides media player
mojo interface implementations. It comes with some nice benefits.

#### Flexible Process Model

Different platforms or products have different requirements on where the remote
media components should run. For example, a hardware decoder typically should
run in the GPU process. The `ServiceManagerContext` provides the ability to run
a `service_manager::Service` in-process (browser), out-of-process (utility) or
in the GPU process. Therefore, by using a `MediaService`, it's very easy to
support hosting remote media component interfaces in most common Chromium
process types (Browser/Utility/GPU). This can be set using the gn argument
`mojo_media_host`, e.g.
```
mojo_media_host = "browser" or "gpu" or "utility"
```

MediaService is registered in `ServiceManagerContext` using `kMediaServiceName`.
`mojo_media_host` is checked to decide in which process the service is
registered to run.

#### Connects Different Media Components

Some remote media components depend on other components to work. For example, a
Renderer, an audio decoder or a video decoder needs a CDM to be able to handle
encrypted streams. Typically there's a `SetCdm()` call to connect the renderer
or decoder with the CDM. If, for example, a Renderer interface and a CDM
interface are hosted separately, it will be hard to implement the `SetCdm()`
call: it would require an object or entity that is aware of both sides to
connect them. `MediaService` handles this internally, actually serving as such
an object or entity, so you don't have to reinvent the wheel. See more details
[below](#Using-CdmContext).

#### Customization through MojoMediaClient

`MediaService` provides everything needed to host an OOP media component, but
it doesn't provide the media component itself. It's up to the client of
`MediaService` to provide the concrete media component implementations.

The `MojoMediaClient` interface provides a way for `MediaService` clients to
provide concrete media component implementations. When a `MediaService` is
created, a `MojoMediaClient` must be passed in so that the `MediaService` knows
how to create the media components.

For example, Chromecast uses `MediaService` to host a media Renderer and a CDM
in the browser process, and it provides the `CastRenderer` and `CastCdm` through
`CastMojoMediaClient`, a `MojoMediaClient` implementation. Note that this
overriding mechanism is not implemented everywhere. It's trivial to add the
support and we'll only add it when we need it.
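
This injection pattern can be sketched in plain C++. All of the classes below
are simplified stand-ins: the real `MojoMediaClient` has several factory
methods (`CreateRenderer()`, `CreateAudioDecoder()`, `CreateCdm()`, ...) with
much richer signatures, and the real `MediaService` does far more plumbing.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Simplified stand-in for a media component.
class Renderer {
 public:
  virtual ~Renderer() = default;
  virtual std::string Name() const = 0;
};

// The customization point: embedders implement this to supply concrete
// media components.
class MojoMediaClient {
 public:
  virtual ~MojoMediaClient() = default;
  virtual std::unique_ptr<Renderer> CreateRenderer() = 0;
};

// Stand-in for MediaService: it owns the plumbing but delegates the
// concrete component construction to the injected client.
class MediaService {
 public:
  explicit MediaService(std::unique_ptr<MojoMediaClient> client)
      : client_(std::move(client)) {}

  std::unique_ptr<Renderer> HandleCreateRendererRequest() {
    return client_->CreateRenderer();
  }

 private:
  std::unique_ptr<MojoMediaClient> client_;
};

// What a Cast-style embedder might plug in.
class CastRenderer : public Renderer {
 public:
  std::string Name() const override { return "CastRenderer"; }
};

class CastMojoMediaClient : public MojoMediaClient {
 public:
  std::unique_ptr<Renderer> CreateRenderer() override {
    return std::make_unique<CastRenderer>();
  }
};
```

The design choice here is classic dependency injection: the service code stays
generic and embedder-agnostic, while each embedder decides what actually gets
constructed.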

#### Site Isolation

In Blink, both the media element and EME MediaKeys belong to a `WebLocalFrame`.
In Chromium, this translates to the media player and CDM belonging to a
`RenderFrame`. In the render process, this is clear. However, when hosting all
remote media components in a single `MediaService` (the service manager only
supports one service instance per process), the frame boundary could get fuzzy.
This is especially dangerous for media components that interact with each
other. For example, suppose a Renderer from foo.com lives in the same
MediaService instance as a CDM from bar.net. It would be wrong if the bar.net
CDM were set on the foo.com Renderer to handle decryption.

To prevent this from happening, we introduce an additional layer to simulate
the `RenderFrame` boundary. A MediaService hosts multiple `InterfaceFactory`
instances (one per `RenderFrame`), and each `InterfaceFactory` creates and
manages the media components for its frame.

For this reason, the `media::mojom::InterfaceFactory` interface is reused in
the communication between `MediaInterfaceProxy` and `MediaService`.

> Note: there are plans to split apart the responsibilities of
> `media::mojom::InterfaceFactory` to make it clear which interfaces are used
> where.

#### Specialized Out-of-Process media::Renderers

The `media::Renderer` interface is a simple API, which is general enough to
capture the essence of high-level media playback commands. This allows us to
extend the functionality of `WebMediaPlayer` via **specialized renderers**.
Specifically, we can build a sub-component that encapsulates the complexities of
an advanced scenario, write a small adapter layer that exposes the component as
a `media::Renderer`, and embed it within the existing `media::Pipeline` state
machine.
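
This adapter idea can be sketched in plain C++. All names below (`HlsPlayer`,
`HlsRendererAdapter`) are hypothetical stand-ins, and the real
`media::Renderer` interface is larger and asynchronous; the point is only the
shape: a thin layer translating the generic pipeline API into the
sub-component's scenario-specific API.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Stand-in for the small, generic media::Renderer interface.
class Renderer {
 public:
  virtual ~Renderer() = default;
  virtual void StartPlayingFrom(double time_s) = 0;
  virtual std::string Describe() const = 0;
};

// A complex sub-component with its own, scenario-specific API.
class HlsPlayer {
 public:
  void LoadManifestAndSeek(double time_s) { position_s_ = time_s; }
  double position_s() const { return position_s_; }

 private:
  double position_s_ = 0;
};

// The thin adapter: exposes HlsPlayer as an opaque Renderer so the
// existing pipeline state machine can drive it unchanged.
class HlsRendererAdapter : public Renderer {
 public:
  void StartPlayingFrom(double time_s) override {
    // Translate the generic command into the scenario-specific one.
    player_.LoadManifestAndSeek(time_s);
  }
  std::string Describe() const override { return "hls"; }
  const HlsPlayer& player() const { return player_; }

 private:
  HlsPlayer player_;
};
```

Because the pipeline holds only a `Renderer`, it needs no knowledge of HLS at
all; the scenario-specific details stay confined to the adapter and its
sub-component.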
Specialized renderers reduce technical complexity costs by limiting the scope
of details to the files and classes that need them, by requiring little
control-flow boilerplate, and by generally having little impact on the default
paths that `WebMediaPlayer` uses most of the time.

Two examples of complex scenarios enabled by specialized renderers are: handling
HLS playback on Android by delegating it to the Android MediaPlayer (see
`MediaPlayerRenderer`) and casting "src=" media from an Android phone to a cast
device (see `FlingingRenderer`). Both of these examples have sub-components that
need to live in the browser process. We therefore proxy the
`MediaPlayerRenderer` and `FlingingRenderer` to the browser process, using the
mojo interfaces defined in `renderer.mojom` and `renderer_extensions.mojom`.
This idea can be generalized to handle any special-case *Foo scenario* as a
**specialized OOP FooRenderer**.

The classes required to create a *specialized OOP FooRenderer* come in pairs,
serving similar purposes in their respective processes. By convention, the
`FooRenderer` lives in the target process and the `FooRendererClient` lives in
the render process. The `MojoRenderer` and `MojoRendererService` proxy
`media::Renderer` and `media::RendererClient` calls to/from the
`FooRenderer[Client]`. One-off commands and events that can't be expressed as a
`media::Renderer[Client]` call are carried across process boundaries by
*renderer extensions* instead (see `renderer_extensions.mojom`). The
`FooRenderer[Client]Extension` mojo interfaces are implemented directly by
their respective `FooRenderer[Client]` instances. The
`FooRenderer[Client]Factory` sets up the scenario-specific boilerplate, and all
of the mojo interface pointers/requests needed to talk to the other process.
Interface pointers and requests are connected across process boundaries when
`mojom::InterfaceFactory::CreateFooRenderer()` is called. The end result is
illustrated below:

![Communication diagram for an OOP Renderer](./renderer_extension_diagram.png)

To allow the creation and use of a FooRenderer within WebMediaPlayer, a
`FooRendererClientFactory` must be built and passed to the
`RendererFactorySelector`. The `RendererFactorySelector` must also be given a
way to query whether we are currently in a scenario that requires the use of
the `FooRenderer`. When we enter a *Foo scenario*, cycling the `media::Pipeline`
via suspend()/resume() should be enough for the next call to
`RendererFactorySelector::GetCurrentFactory()` to return the
`FooRendererClientFactory`. When `RendererFactory::CreateRenderer()` is called,
the pipeline will receive a `FooRendererClient` as an opaque `media::Renderer`.
The default pipeline state machine will control the OOP `FooRenderer`.
When we exit the *Foo scenario*, cycling the pipeline once more should bring us
back into the right state.

#### Support Other Clients

`MediaService`, as a `service_manager::Service`, can be used by clients other
than the media player in the render process. For example, we could have another
(mojo) service that handles audio data and wants to play it in a media Renderer.
Since `MediaService` is a mojo service, it's very convenient for any other mojo
service to connect to it through a `service_manager::mojom::Connector` and use
the remote media Renderer it hosts.

### CdmService

Although `MediaService` supports `media::mojom::CDM`, in some cases (e.g. the
library CDM on desktop) the remote CDM needs to run in its own process,
typically for security reasons. `CdmService` is introduced to handle this. It
also implements `service_manager::Service`, and is registered in
`ServiceManagerContext` using `kCdmServiceName`.
Currently it's always registered to run in the utility process (with the CDM
sandbox type). `CdmService` also has additional support for the library CDM,
e.g. loading the library CDM. Note that `CdmService` only supports
`media::mojom::CDM` and does NOT support other media player mojo interfaces.

### MediaRendererService

`MediaRendererService` supports `media::mojom::Renderer` and
`media::mojom::CDM`. It's hosted in a different process than the default
`MediaService`. It's registered in `ServiceManagerContext` using
`kMediaRendererServiceName`. This allows `media::mojom::VideoDecoder` and
`media::mojom::Renderer` to run in two different processes. Currently Chromecast
uses this to support the `CastRenderer` and CDM in the browser process and the
GPU-accelerated video decoder in the GPU process. The main goals are:

1. Allow two pages to hold their own video pipelines simultaneously, because
   `CastRenderer` only supports one video pipeline at a time.
2. Support a GPU-accelerated video decoder for the RTC path.

### Mojo CDM and Mojo Decryptor

The Mojo CDM is special among all media player mojo interfaces because it is
needed by all local/remote media components to handle encrypted buffers:

1. Local media components like `DecryptingDemuxerStream`,
   `DecryptingAudioDecoder` and `DecryptingVideoDecoder`.
2. Remote media components hosted in `MediaService`, e.g. by
   `MojoRendererService`, `MojoAudioDecoderService` and
   `MojoVideoDecoderService`.

At the JavaScript layer, the media player and MediaKeys are connected via
[`setMediaKeys()`](https://w3c.github.io/encrypted-media/#dom-htmlmediaelement-setmediakeys).
This is implemented by `SetCdm()` in the render process.

A media component can use a CDM in two ways.

#### Using a Decryptor (via CdmContext)

Some CDMs provide a `Decryptor` implementation, which supports decrypting
methods directly, e.g. `Decrypt()`, `DecryptAndDecode()` etc.
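
The division of labor (a CDM-provided `Decryptor` handles decryption, a plain
decoder handles clear data) can be sketched in plain C++. This is a toy model
under loud assumptions: the real `media::Decryptor` is asynchronous, operates
on `DecoderBuffer`s rather than strings, and the XOR "cipher" below stands in
for real decryption purely for illustration.

```cpp
#include <cassert>
#include <string>

// Illustrative stand-in for media::Decryptor.
class Decryptor {
 public:
  virtual ~Decryptor() = default;
  virtual std::string Decrypt(const std::string& encrypted) = 0;
};

// A toy "CDM-provided" decryptor: XORs every byte with a fixed key,
// loosely mimicking how a CDM such as AesDecryptor exposes decryption
// directly. XOR is its own inverse, so the same object also "encrypts".
class XorDecryptor : public Decryptor {
 public:
  explicit XorDecryptor(char key) : key_(key) {}

  std::string Decrypt(const std::string& encrypted) override {
    std::string out = encrypted;
    for (char& c : out) c ^= key_;
    return out;
  }

 private:
  char key_;
};

// A decoder that cannot handle encrypted input itself asks the
// Decryptor for clear data first (the moral equivalent of what
// DecryptingVideoDecoder does before decoding).
std::string DecodeEncryptedFrame(Decryptor& decryptor,
                                 const std::string& encrypted) {
  std::string clear = decryptor.Decrypt(encrypted);
  return "decoded(" + clear + ")";
}
```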
Both the `AesDecryptor` and the library CDM support the `Decryptor` interface.

In the case of a remote CDM, e.g. hosted by `MojoCdmService` in `MediaService`
or `CdmService`, if the remote CDM supports the `Decryptor` interface, the
`MojoCdm` will also support the `Decryptor` interface, implemented by
`MojoDecryptor`, which sets up a new message pipe to forward all `Decryptor`
calls to the `Decryptor` in the remote CDM.

#### Using CdmContext

In some cases the media component is set to work with a specific CDM. For
example, on Android, MediaCodec-based decoders (e.g. `MediaCodecAudioDecoder`)
can only use a MediaDrm-based CDM via `MediaCryptoContext`. The media component
and the CDM must live in the same process because the interaction between the
two typically happens deep at the OS level. In theory, they could both live in
the render process. But in practice, typically both the CDM and the media
component are hosted by the MediaService in a remote (e.g. GPU) process.

To be able to attach a remote CDM to a remote media component, each
`InterfaceFactoryImpl` instance (corresponding to one `RenderFrame`) in the
`MediaService` maintains a `MojoCdmServiceContext` that keeps track of all
remote CDMs created for the `RenderFrame`. Each remote CDM is assigned a unique
CDM ID, which is sent back to the `MojoCdm` in the render process. In the render
process, when `SetCdm()` is called, the CDM ID is passed to the local media
component (e.g. `MojoRenderer`), which forwards it to the remote media component
(e.g. `MojoRendererService`). The remote media component will talk to the
`MojoCdmServiceContext` to get the `CdmContext` associated with the CDM ID, and
complete the connection.

### Secure Auxiliary Services

Media components often need other services to work. In the render process, the
local media components get services from the content layer through the
`MediaClient` interface.
In `MediaService` and `CdmService`, remote media components get services
through **secure auxiliary services**.

Note that as `service_manager::Service`s, `MediaService` and `CdmService` can
always connect to other `service_manager::Service`s hosted by the
service_manager through the `Connector` interface. However, these are generic
services that don't belong to any individual `RenderFrame`, or even user
profile.

Some services do require a `RenderFrame` or user profile identity, e.g. the
file system. Since media components all belong to a given `RenderFrame`, we
must maintain the frame identity when accessing these services for security
reasons. These services are called secure auxiliary services. `FrameServiceBase`
is a base class for all secure auxiliary services that helps manage their
lifetime (e.g. to handle navigation).

In `MediaInterfaceProxy`, when we request the `media::mojom::InterfaceFactory`
in the `MediaService` or `CdmService`, we call `GetFrameServices()` to configure
which secure auxiliary services are exposed to the remote components over a
separate `blink::mojom::BrowserInterfaceBroker`.

Currently only the remote CDM needs secure auxiliary services. This is a list of
currently supported services:

* `OutputProtection`: to check output protection status
* `PlatformVerification`: to check whether the platform is secure
* `CdmFileIO`: for the CDM to store persistent data
* `ProvisionFetcher`: for Android MediaDrm device provisioning
* `CdmProxy`: (in progress)

### Security

In most cases, the client side runs in the render process, which is the least
trusted. Always assume that the client-side code may be compromised, e.g.
making calls in random order or passing in garbage parameters.

Due to the [Flexible Process Model](#Flexible-Process-Model), it's sometimes
hard to know in which process the service side runs.
As a rule of thumb, assume all service-side code may run in a privileged
process (e.g. the browser process), including the common supporting code like
`MojoVideoDecoderService`, as well as the concrete
[Media Component](#Media-Components), e.g. `MediaCodecVideoDecoder` on Android.
To know exactly which [Media Component](#Media-Components) runs in which
process in production, see [Adoption](#Adoption) below.

Also note that all the [Secure Auxiliary Services](#Secure-Auxiliary-Services)
run in a more privileged process than the process where the media components
that use them run. For example, all of the existing services run in the browser
process except for `CdmProxy`, which runs in the GPU process. They must defend
against compromised media components.

### Adoption

#### Android

* `MediaService` in the GPU process (registered in `GpuServiceFactory` with
  `GpuMojoMediaClient`)
* `MojoCdm` + `MediaDrmBridge` (CDM)
* `MediaDrmBridge` uses the mojo `ProvisionFetcher` service for CDM
  provisioning
* `MojoAudioDecoder` + `MediaCodecAudioDecoder`
* `MojoVideoDecoder` + `MediaCodecVideoDecoder` (in progress)
* HLS support:
  * `MojoRenderer` + `MediaPlayerRenderer`
  * NOT using `MediaService`. Instead, `MojoRendererService` is hosted by
    `RenderFrameHostImpl`/`MediaInterfaceProxy` in the browser process
    directly.
* Flinging media to cast devices (RemotePlayback API):
  * `MojoRenderer` + `FlingingRenderer`
  * NOT using `MediaService`. Instead, `MojoRendererService` is hosted by
    `RenderFrameHostImpl`/`MediaInterfaceProxy` in the browser process
    directly.

#### Chromecast

* `MediaService` in the browser process (registered in
  `CastContentBrowserClient` with `CastMojoMediaClient`)
* `MojoRenderer` + `CastRenderer`
* `MojoCdm` + `CastCdm`

#### Desktop (ChromeOS/Linux/Mac/Windows)

* CdmService
  * `CdmService` in the utility process (registered in `UtilityServiceFactory`
    with `ContentCdmServiceClient`)
  * `MojoCdm` + `CdmAdapter` + library CDM implementation
  * `CdmAdapter` uses various secure auxiliary services
* MediaService (in progress)
  * `MediaService` in the GPU process (registered in `GpuServiceFactory` with
    `GpuMojoMediaClient`)
  * `MojoVideoDecoder` + hardware video decoders such as `D3D11VideoDecoder`
  * Provides `CdmProxy` to the `CdmService`

## Other Services

> TODO(xhwang): Add documentation on other mojo services, e.g. remoting, etc.