.. _origintelemetry:

================
Origin Telemetry
================

*Origin Telemetry* is an experimental Firefox Telemetry mechanism that allows us to privately report origin-specific information in aggregate.
In short, it allows us to get exact counts of how *many* Firefox clients do certain things on specific origins without us being able to know *which* clients were doing which things on which origins.

As an example, Content Blocking would like to know which trackers Firefox blocked most frequently.
Origin Telemetry allows us to count how many times a given tracker is blocked without being able to find out which clients were visiting pages that had those trackers on them.

.. important::

    This mechanism is experimental and is a prototype.
    Please do not try to use this without explicit permission from the Firefox Telemetry Team, as it's really only been designed to work for Content Blocking right now.

Adding or removing Origins or Metrics is not supported in artifact builds and build faster workflows. A non-artifact Firefox build is necessary to change these lists.

This mechanism is enabled on Firefox Nightly only at present.

.. important::

    Every new or changed data collection in Firefox needs a `data collection review <https://wiki.mozilla.org/Firefox/Data_Collection>`__ from a Data Steward.

Privacy
=======

To achieve the necessary goal of getting accurate counts without being able to learn which clients contributed to the counts we use a mechanism called `Prio (pdf) <https://www.usenix.org/system/files/conference/nsdi17/nsdi17-corrigan-gibbs.pdf>`_.

Prio uses cryptographic techniques to encrypt information and a proof that the information is correct, only sending the encrypted information on to be aggregated.
Only after aggregation do we learn the information we want (aggregated counts), and at no point do we learn the information we don't want (which clients contributed to the counts).

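
As a rough intuition for how aggregation can reveal totals without revealing any single client's report, the sketch below uses plain additive secret sharing between two hypothetical servers.
It is a deliberate simplification and not the Prio protocol itself: it leaves out the encryption and the correctness proofs mentioned above, and the names ``MODULUS``, ``splitIntoShares``, ``sumShares`` and ``aggregate`` are assumptions invented for this example.
It also assumes a Node.js environment for ``crypto.randomInt``.

.. code-block:: js

    // Toy additive secret sharing: NOT the Prio protocol, only the core intuition.
    const { randomInt } = require("crypto");

    // Hypothetical modulus for this example; all arithmetic is done modulo it.
    const MODULUS = 2147483647;

    // Split a client's 0/1 report into two shares that sum to it modulo MODULUS.
    // Each share on its own is a uniformly random number and reveals nothing.
    function splitIntoShares(bit) {
      const a = randomInt(MODULUS);
      const b = (((bit - a) % MODULUS) + MODULUS) % MODULUS;
      return { a, b };
    }

    // Each server adds up only the shares it received.
    function sumShares(shares) {
      return shares.reduce((total, share) => (total + share) % MODULUS, 0);
    }

    // Combining the two per-server totals yields the aggregate count.
    function aggregate(totalA, totalB) {
      return (totalA + totalB) % MODULUS;
    }

    // Three hypothetical clients; two of them saw the tracker blocked.
    const reports = [1, 0, 1].map(splitIntoShares);
    const totalA = sumShares(reports.map(r => r.a));
    const totalB = sumShares(reports.map(r => r.b));
    console.log(aggregate(totalA, totalB)); // 2

Neither per-server total says anything about an individual report; only their combination gives the count, which is the property the real mechanism is after.
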

.. _origin.usage:

Using Origin Telemetry
======================

To record that something happened on a given origin, three things must happen:

1. The origin must be in the fixed, known list of origins. ("Where" something happened)
2. The metric must be in the fixed, known list of metrics. ("What" happened)
3. A call must be made to the Origin Telemetry API. (To let Origin Telemetry know "that" happened "there")

At present the lists of origins and metrics are hardcoded in C++.
Please consult the Firefox Telemetry Team before changing these lists.

Origins can be arbitrary byte sequences of any length.
Do not add duplicate origins to the list.

If an attempt is made to record to an unknown origin, a meta-origin ``__UNKNOWN__`` captures that it happened.
Unlike other origins, where multiple recordings are considered additive, ``__UNKNOWN__`` only accumulates a single value.
This is to avoid inflating the ping size in case the caller submits a lot of unknown origins for a given unit (e.g. pageload).

Metrics should be of the form ``categoryname.metric_name``.
Neither ``categoryname`` nor ``metric_name`` should exceed 40 bytes (UTF-8 encoded) in length, and each should contain only alphanumeric characters and infix underscores.

.. _origin.API:

API
===

Origin Telemetry supplies APIs for recording information into and snapshotting information out of storage.

Recording
---------

``Telemetry::RecordOrigin(aOriginMetricID, aOrigin);``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This C++ API records that a metric was true for a given origin.
For instance, maybe the user visited a page in which content from ``example.net`` was blocked.
That call might look like ``Telemetry::RecordOrigin(OriginMetricID::ContentBlocking_Blocked, "example.net"_ns)``.

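
The recording rules described under "Using Origin Telemetry" can be sketched as a small in-memory model in JavaScript.
This is a hypothetical illustration, not the C++ implementation: ``KNOWN_ORIGINS``, ``isValidMetricName``, ``recordOrigin`` and the layout of ``storage`` are assumptions made for this example, as is the reading of "alphanumeric" as ASCII letters and digits with single underscores between them.

.. code-block:: js

    // Hypothetical in-memory model of the recording rules; not the real C++ code.
    const KNOWN_ORIGINS = new Set(["example.net", "example.com"]);
    const UNKNOWN = "__UNKNOWN__";

    // One name part: ASCII alphanumerics with underscores only between them
    // (infix), at most 40 bytes. ASCII-only means bytes equal characters.
    function isValidPart(part) {
      return part.length <= 40 && /^[A-Za-z0-9]+(_[A-Za-z0-9]+)*$/.test(part);
    }

    // Metric names have the form "categoryname.metric_name".
    function isValidMetricName(name) {
      const parts = name.split(".");
      return parts.length === 2 && parts.every(isValidPart);
    }

    // storage maps a metric name to a Map of origin -> count.
    function recordOrigin(storage, metric, origin) {
      if (!isValidMetricName(metric)) {
        throw new Error(`invalid metric name: ${metric}`);
      }
      if (!storage.has(metric)) {
        storage.set(metric, new Map());
      }
      const counts = storage.get(metric);
      if (KNOWN_ORIGINS.has(origin)) {
        // Known origins are additive.
        counts.set(origin, (counts.get(origin) || 0) + 1);
      } else {
        // Unknown origins collapse into a single, non-accumulating value.
        counts.set(UNKNOWN, 1);
      }
    }

    const storage = new Map();
    recordOrigin(storage, "contentblocking.blocked", "example.net");
    recordOrigin(storage, "contentblocking.blocked", "example.net");
    recordOrigin(storage, "contentblocking.blocked", "tracker.invalid");
    recordOrigin(storage, "contentblocking.blocked", "other.invalid");
    // example.net is counted twice; both unknown origins share one entry.
    console.log(Object.fromEntries(storage.get("contentblocking.blocked")));

In this model two unknown origins leave ``__UNKNOWN__`` at 1, which mirrors the stated goal of keeping the ping small when a caller submits many unknown origins.
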

Snapshotting
------------

``let snapshot = await Telemetry.getEncodedOriginSnapshot(aClear);``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This JS API provides a snapshot of the prio-encoded payload and is intended to be used only to assemble the :doc:`"prio" ping's <../data/prio-ping>` payload.
It returns a Promise which resolves to an object of the form:

.. code-block:: js

    {
      a: <base64-encoded, prio-encoded data>,
      b: <base64-encoded, prio-encoded data>,
    }

``let snapshot = Telemetry.getOriginSnapshot(aClear);``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This JS API provides a snapshot of the unencrypted storage of unsent Origin Telemetry, optionally clearing that storage.
It returns a structure of the form:

.. code-block:: js

    {
      "categoryname.metric_name": {
        "origin1": count1,
        "origin2": count2,
        ...
      },
      ...
    }

.. important::

    This API is only intended to be used by ``about:telemetry`` and tests.

.. _origin.example:

Example
=======

Firefox Content Blocking blocks web content from certain origins present on a list.
Users can exempt certain origins from being blocked.
To improve Content Blocking's effectiveness we need to know these two "whats" about that list of "wheres".

This means we need two metrics, ``contentblocking.blocked`` and ``contentblocking.exempt`` (the "whats"), and a list of origins (the "wheres").

Say "example.net" was blocked and "example.com" was exempted from blocking.
Content Blocking calls ``Telemetry::RecordOrigin(OriginMetricID::ContentBlocking_Blocked, "example.net"_ns)`` and ``Telemetry::RecordOrigin(OriginMetricID::ContentBlocking_Exempt, "example.com"_ns)``.

At this time a call to ``Telemetry.getOriginSnapshot()`` would return:

.. code-block:: js

    {
      "contentblocking.blocked": {"example.net": 1},
      "contentblocking.exempt": {"example.com": 1},
    }

Later, Origin Telemetry will get the encoded snapshot (clearing the storage) and assemble it with other information into a :doc:`"prio" ping <../data/prio-ping>` which will then be submitted.

.. _origin.encoding:

Encoding
========

.. note::

    This section is provided to help you understand the client implementation's architecture.
    If how we arranged our code doesn't matter to you, feel free to skip it.

There are three levels of encoding in Origin Telemetry: App Encoding, Prio Encoding, and Base64 Encoding.

*App Encoding* is the process by which we turn the Metrics and Origins into data structures that Prio can encrypt for us.
Prio, at time of writing, only supports counting up to 2046 "true/false" values at a time.
Thus, from the example, we need to turn "example.net was blocked" into "the boolean at index 11 of chunk 2 is true".
This encoding can be done any way we like so long as we don't change it without informing the aggregation servers (by sending them a new :ref:`encoding name <prio-ping.encoding>`).
This encoding provides no privacy benefit and is just a matter of transforming the data into a format Prio can process.

*Prio Encoding* is the process by which those ordered true/false values that result from App Encoding are turned into an encrypted series of bytes.
You can `read the paper (pdf) <https://www.usenix.org/system/files/conference/nsdi17/nsdi17-corrigan-gibbs.pdf>`_ to learn more about that.
This encoding, together with the overall system architecture, is what provides the privacy quality to Origin Telemetry.

*Base64 Encoding* is how we turn those encrypted bytes into a string of characters we can send over the network.
You can learn more about Base64 encoding `on Wikipedia <https://wikipedia.org/wiki/Base64>`_.
This encoding provides no privacy benefit and is just used to make Data Engineers' lives a little easier.

Version History
===============

- Firefox 68: Initial Origin Telemetry support (Nightly Only) (`bug 1536565 <https://bugzilla.mozilla.org/show_bug.cgi?id=1536565>`_).