# Image Capture API

This folder contains the implementation of the [W3C Image Capture API].
Image Capture was shipped in Chrome M59; please consult the
[Implementation Status] if you think a feature should be available and isn't.

[W3C Image Capture API]: https://w3c.github.io/mediacapture-image/
[Implementation Status]: https://github.com/w3c/mediacapture-image/blob/master/implementation-status.md

This API is structured around the [ImageCapture class] _and_ a number of
[extensions] to the `MediaStreamTrack` feeding it (let's call them
`theImageCapturer` and `theTrack`, respectively).

[ImageCapture class]: https://w3c.github.io/mediacapture-image/#imagecaptureapi
[extensions]: https://w3c.github.io/mediacapture-image/#extensions

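As a quick orientation, here is a minimal sketch of that structure (not part of this directory; it assumes a page in a secure context where the user grants camera access, and runs inside an `async` function or a module):

```js
// theTrack is the camera's MediaStreamTrack; theImageCapturer is the
// ImageCapture object constructed on top of it.
const stream = await navigator.mediaDevices.getUserMedia({ video: true });
const theTrack = stream.getVideoTracks()[0];
const theImageCapturer = new ImageCapture(theTrack);
```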

## API Mechanics

### `takePhoto()` and `grabFrame()`

* `takePhoto()` returns the result of a single photographic exposure as a
  `Blob` which can be downloaded, stored by the browser or displayed in an
  `img` element. This method uses the highest available photographic camera
  resolution.

* `grabFrame()` returns a snapshot of the live video in `theTrack` as an
  `ImageBitmap` object which could (for example) be drawn on a `canvas` and
  then post-processed to selectively change color values. Note that the
  `ImageBitmap` will only have the resolution of the video track — which
  will generally be lower than the camera's still-image resolution.

(_Adapted from the [Origin Trials Web Update post](
https://developers.google.com/web/updates/2016/12/imagecapture); a usage sketch
of both methods follows below._)

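A rough usage sketch of the two methods, assuming `theTrack`/`theImageCapturer` from the snippet above plus hypothetical `<img id="photo">` and `<canvas id="frame">` elements on the page:

```js
async function captureBoth(theImageCapturer) {
  // takePhoto(): a single full-resolution exposure, delivered as a Blob.
  const photoBlob = await theImageCapturer.takePhoto();
  document.querySelector('#photo').src = URL.createObjectURL(photoBlob);

  // grabFrame(): a snapshot of the live video, delivered as an ImageBitmap
  // at the video track's (typically lower) resolution.
  const frameBitmap = await theImageCapturer.grabFrame();
  const canvas = document.querySelector('#frame');
  canvas.width = frameBitmap.width;
  canvas.height = frameBitmap.height;
  canvas.getContext('2d').drawImage(frameBitmap, 0, 0);
}
```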

### Photo settings and capabilities

The photo-specific options and settings are associated with `theImageCapturer`
or `theTrack` depending on whether a given capability/setting has an immediately
recognizable effect on `theTrack`, in other words whether it is "live" or not.
For example, changing the zoom level is instantly reflected on `theTrack`, while
enabling red-eye reduction, if available, is not. A sketch of both cases follows
the table.

| Object                     | Type                  | Example                                     |
|:---------------------------|:----------------------|--------------------------------------------:|
| [`PhotoCapabilities`]      | non-live capabilities | `theImageCapturer.getPhotoCapabilities()`   |
| [`MediaTrackCapabilities`] | live capabilities     | `theTrack.getCapabilities()`                |
|                            |                       |                                             |
| [`PhotoSettings`]          | non-live settings     | `theImageCapturer.takePhoto(photoSettings)` |
| [`MediaTrackSettings`]     | live settings         | `theTrack.getSettings()`                    |

[`PhotoCapabilities`]: https://w3c.github.io/mediacapture-image/#photocapabilities-section
[`MediaTrackCapabilities`]: https://w3c.github.io/mediacapture-image/#mediatrackcapabilities-section
[`PhotoSettings`]: https://w3c.github.io/mediacapture-image/#photosettings-section
[`MediaTrackSettings`]: https://w3c.github.io/mediacapture-image/#mediatracksettings-section

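As an illustration, here is a sketch of the zoom / red-eye example from the paragraph above (assuming `theTrack` and `theImageCapturer` from the earlier snippets and a camera that actually reports both capabilities):

```js
async function zoomThenShoot(theTrack, theImageCapturer) {
  // Live: zoom is a MediaStreamTrack capability/setting; applying it via
  // constraints changes theTrack (and any preview fed by it) right away.
  const trackCapabilities = theTrack.getCapabilities();
  if ('zoom' in trackCapabilities) {
    await theTrack.applyConstraints({
      advanced: [{ zoom: trackCapabilities.zoom.max }]
    });
    console.log('zoom is now', theTrack.getSettings().zoom);
  }

  // Non-live: red-eye reduction only affects the next photographic exposure,
  // so it is requested through the PhotoSettings passed to takePhoto().
  const photoCapabilities = await theImageCapturer.getPhotoCapabilities();
  const photoSettings = photoCapabilities.redEyeReduction === 'controllable'
      ? { redEyeReduction: true }
      : {};
  return theImageCapturer.takePhoto(photoSettings);
}
```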
## Other topics

### Are `takePhoto()` and `grabFrame()` the same?

These methods do not produce the same results, as explained in
[this issue comment](
https://bugs.chromium.org/p/chromium/issues/detail?id=655107#c8):

> Let me reconstruct the conversion steps each image goes through in CrOS/Linux;
> [...]
>
> a) Live video capture produces frames via `V4L2CaptureDelegate::DoCapture()` [1].
> The original data (from the WebCam) comes in either YUY2 (a 4:2:2 format) or
> MJPEG, depending on whether the capture is smaller than 1280x720 or not,
> respectively.

> b) This `V4L2CaptureDelegate` sends the captured frame to a conversion stage to
> I420 [2]. I420 is a 4:2:0 format, so it has lost some information
> irretrievably. This I420 format is the one used for transporting video frames
> to the renderer.

> c) This I420 is the input to `grabFrame()`, which produces a JS ImageBitmap,
> unencoded, after converting the I420 into RGBA [3] of the appropriate
> endianness.

> What happens to `takePhoto()`? It takes the data from the WebCam in a) and
> either returns a JPEG Blob [4] or converts the YUY2 [5] and encodes it to PNG
> using the default compression value (6 in a 0-10 scale IIRC) [6].

> IOW:

```
 - for smaller video resolutions:

    OS -> YUY2 ---> I420 --> RGBA --> ImageBitmap    grabFrame()
            |
            +-----> RGBA --> PNG ----> Blob          takePhoto()

 - and for larger video resolutions:

    OS -> MJPEG ---> I420 --> RGBA --> ImageBitmap   grabFrame()
            |
            +------> JPG ------------> Blob          takePhoto()
```

> Where every conversion to-I420 loses information and so does the encoding to
> PNG. Even a conversion `RGBA --> I420 --> RGBA` would not produce the original
> image. (Plus, when you show `ImageBitmap` and/or Blob on an `<img>` or `<canvas>`
> there are more stages of decoding and even colour correction involved!)

> With all that, I'm not surprised at all that the images are not pixel
> accurate! :-)

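The difference is easy to observe from a page. A small illustrative sketch (hypothetical helper; it assumes `theImageCapturer` from the earlier snippets and `OffscreenCanvas` support) that decodes both captures and logs the top-left pixel of each; per the pipeline above, neither the pixel values nor the resolutions are expected to match exactly:

```js
async function comparePixels(theImageCapturer) {
  const frameBitmap = await theImageCapturer.grabFrame();
  const photoBlob = await theImageCapturer.takePhoto();
  const photoBitmap = await createImageBitmap(photoBlob);

  // Draw a bitmap into a canvas and read back its top-left pixel as RGBA.
  const topLeftPixel = (bitmap) => {
    const canvas = new OffscreenCanvas(bitmap.width, bitmap.height);
    const context = canvas.getContext('2d');
    context.drawImage(bitmap, 0, 0);
    return context.getImageData(0, 0, 1, 1).data;
  };

  console.log('grabFrame():', frameBitmap.width, 'x', frameBitmap.height,
              topLeftPixel(frameBitmap));
  console.log('takePhoto():', photoBitmap.width, 'x', photoBitmap.height,
              topLeftPixel(photoBitmap));
}
```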

### Why are `PhotoCapabilities.fillLightMode` and `MediaTrackCapabilities.torch` separated?

Because they are different things: `torch` means flash constantly on/off whereas
`fillLightMode` means flash always-on/always-off/auto _when taking a
photographic exposure_.

`torch` lives in `theTrack` because the effect can be seen "live" on it,
whereas `fillLightMode` lives in `theImageCapturer` because the effect of
modifying it can only be seen after taking a picture.

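A sketch of that split (assuming, as before, a camera that reports both capabilities):

```js
async function lightItUp(theTrack, theImageCapturer) {
  // Live: turning the torch on is immediately visible in any preview fed by
  // theTrack.
  if (theTrack.getCapabilities().torch) {
    await theTrack.applyConstraints({ advanced: [{ torch: true }] });
  }

  // Non-live: fillLightMode only affects the exposure taken by takePhoto().
  const photoCapabilities = await theImageCapturer.getPhotoCapabilities();
  const fillLightMode =
      photoCapabilities.fillLightMode.includes('auto') ? 'auto' : 'off';
  return theImageCapturer.takePhoto({ fillLightMode });
}
```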


## Testing

Image Capture web tests are located in [web_tests/imagecapture],
[web_tests/fast/imagecapture] and [web_tests/external/mediacapture-image].

[web_tests/imagecapture]: https://chromium.googlesource.com/chromium/src/+/master/third_party/blink/web_tests/imagecapture
[web_tests/fast/imagecapture]: https://chromium.googlesource.com/chromium/src/+/master/third_party/blink/web_tests/fast/imagecapture/
[web_tests/external/mediacapture-image]: https://chromium.googlesource.com/chromium/src/+/master/third_party/blink/web_tests/external/wpt/mediacapture-image/
133