
# Measurement Kit API

Measurement Kit only exposes a simple C-like API suitable for use via
[FFI](https://en.wikipedia.org/wiki/Foreign_function_interface).

We implemented the following higher-level, easier-to-use APIs on top of
this basic C-like FFI-friendly API:

- [Objective-C API](https://github.com/measurement-kit/mkall-ios);

- [Android API](https://github.com/measurement-kit/android-libs).

We encourage you to avoid using the FFI API directly when a more user-friendly
API is available because, for each specific platform, we will strive to maintain
backwards compatibility with the highest-level API available on that platform.

## Synopsis

```C++
#include <measurement_kit/ffi.h>

typedef          struct mk_event_   mk_event_t;
typedef          struct mk_task_    mk_task_t;

mk_task_t       *mk_task_start(const char *settings);
mk_event_t      *mk_task_wait_for_next_event(mk_task_t *task);
int              mk_task_is_done(mk_task_t *task);
void             mk_task_interrupt(mk_task_t *task);

const char      *mk_event_serialization(mk_event_t *event);
void             mk_event_destroy(mk_event_t *event);

void             mk_task_destroy(mk_task_t *task);
```

## Introduction

Measurement Kit is a network measurement engine. It runs _tasks_ (e.g. a
specific network measurement). To start a task, you need to configure it with
specific _settings_. The most important setting is the task name (e.g. "Ndt"
is the task name of the NDT network performance test). While running, a task
emits _events_ (e.g. a log line, a JSON describing the result of the
measurement, and other intermediate results).

Each task runs in its own thread. Measurement Kit implements a
simple mechanism, based on a shared semaphore, to guarantee that
tasks do not run concurrently. This mechanism does not necessarily
guarantee that tasks run in FIFO order. Yet, it is enough to prevent
one task from creating network noise that affects another task's
measurements.

The thread running a task posts the events generated by the task
to a shared, thread-safe queue. Your code should loop, extracting
and processing events from this queue until the task has
finished running.
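
The shared queue described above can be pictured with the following minimal
sketch. This is not Measurement Kit's implementation: `EventQueue`, `push`,
and `pop` are hypothetical names chosen for illustration, and events are plain
strings here rather than `mk_event_t` objects.

```cpp
#include <condition_variable>
#include <deque>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical thread-safe FIFO queue; NOT Measurement Kit's actual code.
class EventQueue {
 public:
  // Called by the thread running the task to post an event.
  void push(std::string event) {
    {
      std::lock_guard<std::mutex> lock{mutex_};
      events_.push_back(std::move(event));
    }
    ready_.notify_one();
  }

  // Called by the consumer; blocks until an event is available.
  std::string pop() {
    std::unique_lock<std::mutex> lock{mutex_};
    ready_.wait(lock, [this] { return !events_.empty(); });
    std::string event = std::move(events_.front());
    events_.pop_front();
    return event;
  }

 private:
  std::mutex mutex_;
  std::condition_variable ready_;
  std::deque<std::string> events_;
};
```

In this sketch the producer never blocks, while the consumer sleeps inside
`pop` until there is something to process, which is the same shape as looping
on `mk_task_wait_for_next_event`.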

## API documentation

`mk_task_start` starts a task (generally a nettest) with the settings provided
as a serialized JSON. Returns `NULL` if `settings` is `NULL`, or in
case of parse error. You own (and must destroy) the returned task
pointer.

`mk_task_wait_for_next_event` blocks waiting for the `task` to emit the next
event. Returns `NULL` if `task` is `NULL` or on error. If the task is
terminated, it immediately returns a `task_terminated` event. You own (and
must destroy) the returned event pointer.

`mk_task_is_done` returns zero when the task is running, nonzero otherwise. If
the `task` is `NULL`, nonzero is returned.

`mk_task_interrupt` interrupts a running `task`. Interrupting a `NULL` task
has no effect.

`mk_event_serialization` obtains the JSON serialization of `event`. Returns
`NULL` if either the `event` is `NULL` or there is an error.

`mk_event_destroy` destroys the memory associated with `event`. Attempting to
destroy a `NULL` event has no effect.

`mk_task_destroy` destroys the memory associated with `task`. Attempting to
destroy a `NULL` task has no effect. Attempting to destroy a running `task` will
wait for the task to complete before releasing memory.
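
Because you own every task and event pointer returned by this API, C++ callers
may find it convenient to wrap them in `std::unique_ptr` with a custom deleter.
The following is a sketch under assumptions, not part of Measurement Kit:
`HandleDeleter`, `OwnedHandle`, `FakeHandle`, and `fake_destroy` are
hypothetical names, and the fake type only exists so that the sketch compiles
without the library.

```cpp
#include <memory>

// Generic deleter: calls Destroy(ptr) when the owning handle goes away.
template <typename T, void (*Destroy)(T *)>
struct HandleDeleter {
  void operator()(T *ptr) const noexcept { Destroy(ptr); }
};

template <typename T, void (*Destroy)(T *)>
using OwnedHandle = std::unique_ptr<T, HandleDeleter<T, Destroy>>;

// Stand-ins so the sketch compiles without Measurement Kit. With the real
// header you would presumably write something like:
//   using TaskPtr  = OwnedHandle<mk_task_t, mk_task_destroy>;
//   using EventPtr = OwnedHandle<mk_event_t, mk_event_destroy>;
struct FakeHandle {};
int fake_destroy_calls = 0;
void fake_destroy(FakeHandle *handle) {
  ++fake_destroy_calls;
  delete handle;
}
using FakeHandlePtr = OwnedHandle<FakeHandle, fake_destroy>;
```

A `std::unique_ptr` never invokes its deleter on a null pointer, which is
compatible with the documented behavior that destroying `NULL` has no effect.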

## Example

The following C++ example runs the "Ndt" test with "INFO" verbosity.

```C++
#include <iostream>

#include <measurement_kit/ffi.h>

int main() {
  const char *settings = R"({
    "name": "Ndt",
    "log_level": "INFO"
  })";
  mk_task_t *task = mk_task_start(settings);
  if (!task) {
    std::clog << "ERROR: cannot start task" << std::endl;
    return 1;
  }
  while (!mk_task_is_done(task)) {
    mk_event_t *event = mk_task_wait_for_next_event(task);
    if (!event) {
      std::clog << "ERROR: cannot wait for next event" << std::endl;
      break;
    }
    const char *json_serialized_event = mk_event_serialization(event);
    if (!json_serialized_event) {
      std::clog << "ERROR: cannot get event serialization" << std::endl;
      mk_event_destroy(event);  // do not leak the event on early exit
      break;
    }
    {
      // TODO: process the JSON-serialized event
    }
    mk_event_destroy(event);
  }
  mk_task_destroy(task);  // waits for a running task before freeing it
  return 0;
}
```

## Nettest tasks

The following nettest tasks are defined (case matters):

- `"Dash"`: Neubot's DASH test.
- `"CaptivePortal"`: OONI's captive portal test.
- `"DnsInjection"`: OONI's DNS injection test.
- `"FacebookMessenger"`: OONI's Facebook Messenger test.
- `"HttpHeaderFieldManipulation"`: OONI's HTTP header field manipulation test.
- `"HttpInvalidRequestLine"`: OONI's HTTP invalid request line test.
- `"MeekFrontedRequests"`: OONI's meek fronted requests test.
- `"Ndt"`: the NDT network performance test.
- `"TcpConnect"`: OONI's TCP connect test.
- `"Telegram"`: OONI's Telegram test.
- `"WebConnectivity"`: OONI's Web Connectivity test.
- `"Whatsapp"`: OONI's WhatsApp test.

## Settings

The nettest task settings object is a JSON object like:

```JSON
{
  "annotations": {
    "optional_annotation_1": "value_1",
    "another_annotation": "with_its_value"
  },
  "disabled_events": [
    "status.queued",
    "status.started"
  ],
  "inputs": [
    "www.google.com",
    "www.x.org"
  ],
  "input_filepaths": [
    "/path/to/file",
    "/path/to/another/file"
  ],
  "log_filepath": "logfile.txt",
  "log_level": "INFO",
  "name": "WebConnectivity",
  "options": {
    "all_endpoints": false,
    "backend": "",
    "bouncer_base_url": "",
    "collector_base_url": "",
    "constant_bitrate": 0,
    "dns/nameserver": "",
    "dns/engine": "system",
    "expected_body": "",
    "geoip_asn_path": "",
    "geoip_country_path": "",
    "hostname": "",
    "ignore_bouncer_error": true,
    "ignore_open_report_error": true,
    "max_runtime": -1,
    "mlabns/address_family": "ipv4",
    "mlabns/base_url": "https://locate.measurementlab.net/",
    "mlabns/country": "IT",
    "mlabns/metro": "trn",
    "mlabns/policy": "random",
    "mlabns_tool_name": "",
    "net/ca_bundle_path": "",
    "net/timeout": 10.0,
    "no_bouncer": false,
    "no_collector": false,
    "no_file_report": false,
    "no_geoip": false,
    "no_resolver_lookup": false,
    "port": 1234,
    "probe_ip": "1.2.3.4",
    "probe_asn": "AS30722",
    "probe_cc": "IT",
    "probe_network_name": "Network name",
    "randomize_input": true,
    "save_real_probe_asn": true,
    "save_real_probe_cc": true,
    "save_real_probe_ip": false,
    "save_real_resolver_ip": true,
    "server": "neubot.mlab.mlab1.trn01.measurement-lab.org",
    "software_name": "measurement_kit",
    "software_version": "<current-mk-version>",
    "test_suite": 0,
    "uuid": ""
  },
  "output_filepath": "results.njson"
}
```

The only mandatory key is `name`, which identifies the task. All the other
keys are optional. Above we have shown the most commonly used `options`, which
are described in greater detail below. The value shown for each option
is its default value (_however_, the values of non-`options` settings _are not_
defaults, but rather meaningful examples). The following keys
are available:

- `"annotations"`: (object; optional) JSON object containing key, value string
  mappings that are copied verbatim in the measurement result file;

- `"disabled_events"`: (array; optional) array of strings containing the names
  of the events that you are not interested in. All the available event
  names are described below. By default all events are enabled;

- `"inputs"`: (array; optional) array of strings to be passed to the nettest as
  input. If the nettest does not take any input, this is ignored. If the nettest
  requires input and you provide neither `"inputs"` nor `"input_filepaths"`,
  the nettest will fail;

- `"input_filepaths"`: (array; optional) array of files containing input
  strings, one per line, to be passed to the nettest. These files are read and
  their content is merged with the content of the `inputs` key;

- `"log_filepath"`: (string; optional) name of the file where to
  write logs. By default logs are written on `stderr`;

- `"log_level"`: (string; optional) how much information you want to see
  written in the log file and emitted by log-related events. The default log
  level is "WARNING" (see below);

- `"name"`: (string; mandatory) name of the nettest to run. The available
  nettest names have been described above;

- `"options"`: (object; optional) options modifying the nettest behavior, as
  an object mapping string keys to string, integer, double, or boolean values;

- `"output_filepath"`: (string; optional) file where you want MK to
  write measurement results, as a sequence of lines, each line being
  the result of a measurement serialized as JSON.
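
As an illustration of how the settings string passed to `mk_task_start` might
be assembled, here is a minimal sketch that serializes only the mandatory
`name` key plus `log_level`. The helpers `json_escape` and `make_settings` are
hypothetical and are not part of Measurement Kit; in real code a JSON library
is likely a better choice.

```cpp
#include <string>

// Escapes the characters that must not appear raw inside a JSON string.
// Only quotes, backslashes, and ASCII control characters are handled.
std::string json_escape(const std::string &input) {
  static const char hex[] = "0123456789abcdef";
  std::string out;
  for (unsigned char ch : input) {
    if (ch == '"' || ch == '\\') {
      out += '\\';
      out += static_cast<char>(ch);
    } else if (ch < 0x20) {
      out += "\\u00";
      out += hex[ch >> 4];
      out += hex[ch & 0x0f];
    } else {
      out += static_cast<char>(ch);
    }
  }
  return out;
}

// Hypothetical helper: builds a settings object with "log_level" and "name".
std::string make_settings(const std::string &name,
                          const std::string &log_level) {
  return "{\"log_level\":\"" + json_escape(log_level) +
         "\",\"name\":\"" + json_escape(name) + "\"}";
}
```

The result of `make_settings("Ndt", "INFO").c_str()` could then be handed to
`mk_task_start`.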

## Log levels

The available log levels are:

- `"ERR"`: an error message

- `"WARNING"`: a warning message

- `"INFO"`: an informational message

- `"DEBUG"`: a debugging message

- `"DEBUG2"`: a really specific debugging message

When you specify a log level in the settings, only messages that are at least
as severe as the specified level are emitted. For example, if you
specify `"INFO"`, you will only see `"ERR"`, `"WARNING"`, and `"INFO"` logs.

The default log level is "WARNING".
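
The filtering rule can be sketched as follows. This is only an illustration of
the rule stated above, not Measurement Kit's code; `log_level_rank` and
`should_emit` are hypothetical names.

```cpp
#include <string>

// Returns the verbosity rank of a log level (higher means more verbose),
// or -1 if the name is not one of the levels listed above.
int log_level_rank(const std::string &level) {
  static const char *const levels[] = {"ERR", "WARNING", "INFO", "DEBUG",
                                       "DEBUG2"};
  for (int i = 0; i < 5; ++i) {
    if (level == levels[i]) return i;
  }
  return -1;
}

// True when a message logged at `message_level` passes the `configured`
// level, i.e. when it is no more verbose than the configured level.
bool should_emit(const std::string &configured,
                 const std::string &message_level) {
  int limit = log_level_rank(configured);
  int rank = log_level_rank(message_level);
  return limit >= 0 && rank >= 0 && rank <= limit;
}
```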

## Options

Options can be `string`, `double`, `int`, or `boolean`. There used to be no
boolean type, but we later added support for it (between v0.9.0-alpha.9 and
v0.9.0-alpha.10). The code will currently warn you if the type of a variable
is not correct. In future versions, we will enforce this restriction more
strictly and only accept options with the correct type.

These are the available options:

- `"all_endpoints"`: (boolean) whether to check just a few or all the
  available endpoints in tests that work with specific endpoints, such
  as, the "WhatsApp" test;

- `"backend"`: (string) pass specific backend to OONI tests requiring it,
  e.g., WebConnectivity, HTTP Invalid Request Line;

- `"bouncer_base_url"`: (string) base URL of OONI bouncer, by default set to
  the empty string. If empty, the default OONI bouncer will be used;

- `"collector_base_url"`: (string) base URL of OONI collector, by default set
  to the empty string. If empty, the default OONI collector will be used;

- `"constant_bitrate"`: (int) force DASH to run at the specified
  constant bitrate;

- `"dns/nameserver"`: (string) nameserver to be used with non-`system` DNS
  engines. May include an optional port number. By default, set
  to the empty string;

- `"dns/engine"`: (string) what DNS engine to use. By default, set to
  `"system"`, meaning that `getaddrinfo()` will be used to resolve domain
  names. Can also be set to `"libevent"`, to use libevent's DNS engine.
  In such case, you must provide a `"dns/nameserver"` as well;

- `"expected_body"`: (string) body expected by Meek Fronted Requests;

- `"geoip_asn_path"`: (string) path to the GeoLite2 `.mmdb` ASN database
  file. By default not set;

- `"geoip_country_path"`: (string) path to the GeoLite2 `.mmdb` Country
  database file. By default not set;

- `"hostname"`: (string) hostname to be used by the DASH test;

- `"ignore_bouncer_error"`: (boolean) whether to ignore an error in contacting
  the OONI bouncer. By default set to `true` so that bouncer errors will
  be ignored;

- `"ignore_open_report_error"`: (boolean) whether to ignore an error when opening
  the report with the OONI collector. By default set to `true` so that errors
  will be ignored;

- `"max_runtime"`: (int) number of seconds after which the test will
  be stopped. Works _only_ for tests taking input. By default set to `-1`
  so that there is no maximum runtime for tests with input;

- `"mlabns/address_family"`: (string) set to `"ipv4"` or `"ipv6"` to force
   M-Lab NS to only return IPv4 or IPv6 addresses (you don't normally
   need to set this option and it only has effect for NDT and DASH anyway);

- `"mlabns/base_url"`: (string) base URL of the M-Lab NS service (you don't
  normally need to set this option and it only has effect for NDT and
  DASH anyway);

- `"mlabns/country"`: (string) tells M-Lab NS the country in which you are
  rather than letting it guess for you, so that it returns results that
  are meaningful within that country (again, normally you don't need this
  option, and it only impacts on DASH and NDT);

- `"mlabns/metro"`: (string) this restricts the results returned by M-Lab
  NS to a specific metro area; for example, setting this to `"ord"` will
  only return M-Lab servers in the Chicago area (again, you normally don't
  need this option, and it only impacts on DASH and NDT);

- `"mlabns/policy"`: (string) overrides the default M-Lab NS policy; for
  example, setting this to `"random"` will return a random server (as stated
  above, you normally don't need this variable, and it only impacts on
  the NDT and DASH tests);

- `"mlabns_tool_name"`: (string) force NDT to use an mlab-ns tool
  name different from the default (`"ndt"`);

- `"net/ca_bundle_path"`: (string) path to the CA bundle to be used
  to validate SSL certificates. Required on mobile;

- `"net/timeout"`: (double) number of seconds after which network I/O
  operations will timeout. By default set to `10.0` seconds;

- `"no_bouncer"`: (boolean) whether to skip using a bouncer. By default set to
  `false`, meaning that a bouncer will be used;

- `"no_collector"`: (boolean) whether to skip using a collector. By default set
  to `false`, meaning that a collector will be used;

- `"no_file_report"`: (boolean) whether to skip writing a report (i.e.
  measurement result) file on disk. By default set to `false`, meaning that
  we'll try to write it;

- `"no_geoip"`: (boolean) whether to skip the GeoIP lookup. By default set to
  `false`, meaning that we'll try to perform it;

- `"no_resolver_lookup"`: (boolean) whether to skip looking up the IP address
  of the resolver. By default `false`, meaning that we'll try to look it up.
  When `true`, we will set the resolver IP address to `127.0.0.1`;

- `"probe_asn"`: (string) sets the `probe_asn` to be included into the
  report, thus skipping the ASN resolution;

- `"probe_cc"`: (string) like `probe_asn` but for country code;

- `"probe_ip"`: (string) like `probe_asn` but for IP address;

- `"probe_network_name"`: (string) like `probe_asn` but for the
  name associated to the ASN;

- `"port"`: (int) allows you to override the port for tests that connect to a
  specific port, such as NDT and DASH;

- `"randomize_input"`: (boolean) whether to randomize input. By default set to
  `true`, meaning that we'll randomize input;

- `"save_real_probe_asn"`: (boolean) whether to save the ASN. By default set
  to `true`, meaning that we will save it;

- `"save_real_probe_cc"`: (boolean) whether to save the CC. By
  default set to `true`, meaning that we will save it;

- `"save_real_probe_ip"`: (boolean) whether to save our IP. By
  default set to `false`, meaning that we will not save it;

- `"save_real_resolver_ip"`: (boolean) whether to save the resolver
  IP. By default set to `true`, meaning that we'll save it;

- `"server"`: (string) allows you to override the server hostname for tests
  that connect to a specific server, such as NDT and DASH;

- `"software_name"`: (string) name of the app. By default set to
  `"measurement_kit"`. This string will be included in the user-agent
  header when contacting mlab-ns;

- `"software_version"`: (string) version of the app. By default set to the
  current version of Measurement Kit. As for `software_name` this string
  will be included in the user-agent header when contacting mlab-ns;

- `"test_suite"`: (int) force NDT to use a specific test suite;

- `"uuid"`: (string) force DASH to use a specific UUID.
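
Since the library currently only warns about options of the wrong type, you may
want to validate types on your side before starting a task. The sketch below
checks a value's type against a small table built from a few of the options
listed above. The names `OptionType`, `expected_option_types`, and
`option_type_matches` are hypothetical, and the table is deliberately
incomplete.

```cpp
#include <map>
#include <string>

enum class OptionType { String, Double, Int, Boolean };

// Expected types for a few of the options documented in this section.
const std::map<std::string, OptionType> &expected_option_types() {
  static const std::map<std::string, OptionType> types = {
      {"backend", OptionType::String},
      {"max_runtime", OptionType::Int},
      {"net/timeout", OptionType::Double},
      {"no_collector", OptionType::Boolean},
  };
  return types;
}

// Returns true when `name` is in the table and `actual` matches its type.
bool option_type_matches(const std::string &name, OptionType actual) {
  auto it = expected_option_types().find(name);
  return it != expected_option_types().end() && it->second == actual;
}
```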

## Events

An event is a JSON object like:

```JSON
  {
    "key": "<key>",
    "value": {}
  }
```

Where `"value"` is a JSON object with an event-specific structure, and `"key"`
is a string. Below we describe all the possible event keys, along with the
"value" JSON structure. Unless otherwise specified, an event key can be emitted
an arbitrary number of times during the lifecycle of a task. Unless otherwise
specified, all the keys introduced below were added in MK v0.9.0.

- `"bug.json_dump"`: (object) There was a failure in serialising an event
  to JSON and we let you know about this Measurement Kit bug. Please, open an
  issue on GitHub if you notice this kind of bug. This event has been added
  in MK v0.9.2. The JSON returned by this event is like:

```JSON
{
  "key": "bug.json_dump",
  "value": {
    "failure": "<failure_string>",
    "orig_key": "<orig_key>"
  }
}
```

Where `<orig_key>` is the key of the event that failed to serialize and
`<failure_string>` is an error string providing some additional information.
Note that both fields MAY be base64 encoded if they're not JSON serialisable.

- `"failure.asn_lookup"`: (object) There was a failure attempting to lookup the
  user autonomous system number. The JSON returned by this event is like:

```JSON
{
  "key": "failure.asn_lookup",
  "value": {
    "failure": "<failure_string>"
  }
}
```

Where all the possible values of `<failure_string>` are described below.

- `"failure.cc_lookup"`: (object) There was a failure attempting to lookup the
  user country code. The JSON returned by this event is like:

```JSON
{
  "key": "failure.cc_lookup",
  "value": {
    "failure": "<failure_string>"
  }
}
```

Where all the possible values of `<failure_string>` are described below.

- `"failure.ip_lookup"`: (object) There was a failure attempting to lookup the
  user IP address. The JSON returned by this event is like:

```JSON
{
  "key": "failure.ip_lookup",
  "value": {
    "failure": "<failure_string>"
  }
}
```

Where all the possible values of `<failure_string>` are described below.

- `"failure.measurement"`: (object) There was a failure running the
  measurement. The complete JSON returned by this event is like:

```JSON
{
  "key": "failure.measurement",
  "value": {
    "idx": 0,
    "failure": "<failure_string>"
  }
}
```

Where all the possible values of `<failure_string>` are described below.

- `"failure.measurement_submission"`: (object) There was a failure in
submitting the measurement result to the configured collector (if any). The
complete JSON returned by this event is like:

```JSON
{
  "key": "failure.measurement_submission",
  "value": {
    "idx": 0,
    "json_str": "<serialized_result>",
    "failure": "<failure_string>"
  }
}
```

Where `idx` is the index of the current measurement, which is relevant for the
tests that run over an input list; `json_str` is the measurement that we failed
to submit, serialized as JSON; `failure` is the error that occurred.

- `"failure.report_create"`: (object) There was a failure in creating the
report with the configured collector (if any). The complete JSON
returned by this event is like:

```JSON
{
  "key": "failure.report_create",
  "value": {
    "failure": "<failure_string>"
  }
}
```

Where `failure` is the error that occurred.

- `"failure.report_close"`: (object) There was a failure in closing the
report with the configured collector (if any). The complete JSON
returned by this event is like:

```JSON
{
  "key": "failure.report_close",
  "value": {
    "failure": "<failure_string>"
  }
}
```

Where `failure` is the error that occurred.

- `"failure.resolver_lookup"`: (object) There was a failure attempting to
  lookup the user DNS resolver. The JSON returned by this event is like:

```JSON
{
  "key": "failure.resolver_lookup",
  "value": {
    "failure": "<failure_string>"
  }
}
```

Where all the possible values of `<failure_string>` are described below.

- `"failure.startup"`: (object) There was a failure starting the test, most
likely because you passed in incorrect options. The complete JSON returned by
this event is like:

```JSON
{
  "key": "failure.startup",
  "value": {
    "failure": "<failure_string>"
  }
}
```

Where `<failure_string>` is the error that occurred.

- `"log"`: (object) A log line was emitted. The complete JSON is like:

```JSON
{
  "key": "log",
  "value": {
    "log_level": "<a_log_level>",
    "message": "<string>"
  }
}
```

Where `log_level` is one of the log levels described above, and `message`
is the log message emitted by Measurement Kit.

- `"measurement"`: (object) The result of a measurement. The complete JSON
is like:

```JSON
{
  "key": "measurement",
  "value": {
    "idx": 0,
    "json_str": "<serialized_result>"
  }
}
```

Where `json_str` is the measurement result serialized as JSON. The schema of
a measurement result depends on the type of nettest, as described below. And
where `idx` is the index of the current measurement (relevant only for nettests
that iterate over an input list).

- `"status.end"`: (object) This event is emitted just once at the end of the
test. The complete JSON is like:

```JSON
{
  "key": "status.end",
  "value": {
    "downloaded_kb": 0.0,
    "uploaded_kb": 0.0,
    "failure": "<failure_string>"
  }
}
```

Where `downloaded_kb` and `uploaded_kb` are the amounts of downloaded and
uploaded kilobytes, and `failure` is the overall failure that occurred during
the test (or the empty string, if no error occurred).

- `"status.geoip_lookup"`: (object) This event is emitted only once at the
beginning of the nettest, and provides information about the user's IP address,
country and autonomous system. In detail, the JSON is like:

```JSON
{
  "key": "status.geoip_lookup",
  "value": {
    "probe_ip": "<ip_address>",
    "probe_asn": "<asn>",
    "probe_cc": "<cc>",
    "probe_network_name": "<network_name>"
  }
}
```

Where `<ip_address>` is the user's IP address, `<asn>` is the autonomous
system number, `<cc>` is the country code, and `<network_name>` is the
commercial name associated with the autonomous system number.

- `"status.progress"`: (object) This is emitted during the nettest lifecycle to
inform you about the nettest progress towards completion. In detail, the JSON is
like:

```JSON
{
  "key": "status.progress",
  "value": {
    "percentage": 0.0,
    "message": "<string>"
  }
}
```

Where `percentage` is the percentage of completion of the nettest, as a real
number between 0.0 and 1.0, and `message` indicates the operation that the
nettest just completed.
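
Since `percentage` is a fraction rather than a value between 0 and 100, code
that drives a progress bar typically needs a conversion step. A minimal sketch,
assuming a hypothetical helper named `progress_to_percent`:

```cpp
#include <algorithm>
#include <cmath>

// Converts the `percentage` field (a fraction in [0.0, 1.0]) to an integer
// percent in [0, 100], clamping out-of-range values.
int progress_to_percent(double fraction) {
  double clamped = std::min(1.0, std::max(0.0, fraction));
  return static_cast<int>(std::lround(clamped * 100.0));
}
```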

- `"status.queued"`: (object) Indicates that the nettest has been accepted. If
other nettests are already running, this nettest waits for its turn because, as
mentioned above, nettests are prevented from running concurrently. The JSON is like:

```JSON
{
  "key": "status.queued",
  "value": {
  }
}
```

Where `value` is empty.

- `"status.measurement_start"`: (object) Indicates that a measurement inside
a nettest has started. The JSON is like:

```JSON
{
  "key": "status.measurement_start",
  "value": {
    "idx": 0,
    "input": "<input>"
  }
}
```

Where `idx` is the index of the current input and `input` is the current
input. For tests that take no input, this event MAY be emitted with
`idx` equal to `0` and `input` equal to the empty string.

- `"status.measurement_submission"`: (object) The specific measurement has
been uploaded successfully. The JSON is like:

```JSON
{
  "key": "status.measurement_submission",
  "value": {
    "idx": 0
  }
}
```

Where `idx` is the index of the measurement input.

- `"status.measurement_done"`: (object) Measurement Kit has finished processing
the specified input. The JSON is like:

```JSON
{
  "key": "status.measurement_done",
  "value": {
    "idx": 0
  }
}
```

Where `idx` is the index of the measurement input.

- `"status.report_close"`: (object) Measurement Kit has closed a report for the
current nettest, and tells you the report-ID. The report-ID is the identifier of
the measurement result(s), which have been submitted. The JSON is like:

```JSON
{
  "key": "status.report_close",
  "value": {
    "report_id": "string"
  }
}
```

Where `report_id` is the report identifier.

- `"status.report_create"`: (object) Measurement Kit has created a report for
the current nettest, and tells you the report-ID. The report-ID is the identifier
of the measurement result(s), which will be later submitted. The JSON is like:

```JSON
{
  "key": "status.report_create",
  "value": {
    "report_id": "string"
  }
}
```

Where `report_id` is the report identifier.

- `"status.resolver_lookup"`: (object) This event is emitted only once at the
beginning of the nettest, when the IP address of the resolver is discovered. The
JSON is like:

```JSON
{
  "key": "status.resolver_lookup",
  "value": {
    "resolver_ip": "<ip_address>"
  }
}
```

Where `<ip_address>` is the resolver's IP address.

- `"status.started"`: (object) The nettest has started, and the JSON is like:

```JSON
{
  "key": "status.started",
  "value": {
  }
}
```

Where `value` is empty.

- `"status.update.performance"`: (object) This is an event emitted by tests that
measure network performance. The JSON is like:

```JSON
{
  "key": "status.update.performance",
  "value": {
    "direction": "<direction>",
    "elapsed": 0.0,
    "num_streams": 0,
    "speed_kbps": 0.0
  }
}
```

Where `direction` is either "download" or "upload", `elapsed` is the elapsed
time (in seconds) since the measurement started, `num_streams` is the number of
streams we are using to measure performance, `speed_kbps` is the speed, in
kbit/s, measured since the previous performance measurement.

- `"status.update.websites"`: (object) This is an event emitted by tests that
measure the reachability of websites. The JSON is like:

```JSON
{
  "key": "status.update.websites",
  "value": {
    "url": "<url>",
    "status": "<status>"
  }
}
```

Where `url` is the URL we're measuring and `status` is either `accessible`
or `blocking`.

- `"task_terminated"`: (object) This event is emitted when you attempt to
extract events from the task queue, but the task is not running anymore (i.e.
it's the equivalent of `EOF` for the task queue). The related JSON is like:

```JSON
{
  "key": "task_terminated",
  "value": {
  }
}
```

Where `value` is empty.
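
When processing events, a common first step is to dispatch on the `"key"`
field. The sketch below groups the keys listed in this section by their prefix.
It assumes that you have already extracted the key from the JSON with a parser
of your choice; `EventKind`, `has_prefix`, and `classify_event_key` are
hypothetical names, not part of the Measurement Kit API.

```cpp
#include <string>

enum class EventKind {
  Bug, Failure, Log, Measurement, Status, Terminated, Unknown
};

// True when `s` starts with `prefix`.
static bool has_prefix(const std::string &s, const std::string &prefix) {
  return s.compare(0, prefix.size(), prefix) == 0;
}

// Maps an event key to a coarse category, using the key names as they
// appear in the list above.
EventKind classify_event_key(const std::string &key) {
  if (key == "log") return EventKind::Log;
  if (key == "measurement") return EventKind::Measurement;
  if (key == "task_terminated") return EventKind::Terminated;
  if (has_prefix(key, "bug.")) return EventKind::Bug;
  if (has_prefix(key, "failure.")) return EventKind::Failure;
  if (has_prefix(key, "status.")) return EventKind::Status;
  return EventKind::Unknown;
}
```

Returning `Unknown` for unrecognized keys keeps the caller robust should new
event keys appear in later versions.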

## Task pseudocode

The following illustrates in pseudocode the operations performed
by a nettest once you call `mk_task_start`. It is not 100% accurate;
in particular, we have omitted the code that generates most log
messages. This pseudocode is meant to help you understand how
Measurement Kit works internally, and specifically how all the
settings described above interact together when you specify them
for running Measurement Kit nettests. We are using pseudo JavaScript
because that is the easiest language to show manipulation of JSON
objects such as the `settings` object.

As of MK v0.9.0-alpha.9, there are some misalignments between the pseudocode
and the implementation, which we'll fix during the v0.12.0 releases.

As mentioned, a nettest runs in its own thread. It first validates its
settings, then it opens the logfile (if needed), and finally it
waits in queue until other possibly running nettests terminate. The
`finish` function will be called when the nettest is done, and will
emit the events that mark the end of a nettest.

```JavaScript
function taskThread(settings) {
  emitEvent("status.queued", {})
  semaphore.Acquire()                 // blocked until my turn

  let finish = function(error) {
    semaphore.Release()               // allow another test to run
    emitEvent("status.end", {
      downloaded_kb: countDownloadedKb(),
      uploaded_kb: countUploadedKb(),
      failure: error.AsString()
    })
  }

  if (!settingsAreValid(settings)) {
    emitEvent("failure.startup", {
      failure: "value_error",
    })
    finish(Error("value_error"))
    return
  }

  if (settings.log_filepath != "") {
    // TODO(bassosimone): we should decide whether we want to deal with the
    // case where we cannot write into the log file. Currently we don't.
    openLogFile(settings.log_filepath)
  }

  let task = makeNettestTask(settings.name)

  emitEvent("status.started", {})
```

After all this setup, a nettest contacts the OONI bouncer and looks up
the IP address, the country code, the autonomous system number, and
the resolver IP address. All this information ends up in the JSON
measurement. Also, all these operations can be explicitly disabled
through the appropriate settings.

```JavaScript
  let test_helpers = task.defaultTestHelpers()
  if (!settings.options.no_bouncer) {
    if (settings.options.bouncer_base_url == "") {
      settings.options.bouncer_base_url = defaultBouncerBaseURL()
    }
    let error
    [test_helpers, error] = queryOONIBouncer(settings)
    if (error) {
      emitWarning(settings, "cannot query OONI bouncer")
      if (!settings.options.ignore_bouncer_error) {
        finish(error)
        return
      }
    }
    // TODO(bassosimone): we should make sure the progress described here
    // is consistent with the one emitted by the real code.
    emitProgress(0.1, "contacted bouncer")
  }

  let probe_ip = "127.0.0.1"
  if (settings.options.probe_ip != "") {
    probe_ip = settings.options.probe_ip
  } else if (!settings.options.no_geoip) {
    let error
    [probe_ip, error] = lookupIP(settings)
    if (error) {
      emitEvent("failure.ip_lookup", {
        failure: error.AsString()
      })
      emitWarning(settings, "cannot lookup probe IP")
    }
  }

  let probe_asn = "AS0",
      probe_network_name = "Unknown"
  if (settings.options.probe_asn != "") {
    probe_asn = settings.options.probe_asn
  } else if (!settings.options.no_geoip) {
    let error
    [probe_asn, probe_network_name, error] = lookupASN(settings)
    if (error) {
      emitEvent("failure.asn_lookup", {
        failure: error.AsString()
      })
      emitWarning(settings, "cannot lookup probe ASN")
    }
  }

  let probe_cc = "ZZ"
  if (settings.options.probe_cc != "") {
    probe_cc = settings.options.probe_cc
  } else if (!settings.options.no_geoip) {
    let error
    [probe_cc, error] = lookupCC(settings)
    if (error) {
      emitEvent("failure.cc_lookup", {
        failure: error.AsString()
      })
      emitWarning(settings, "cannot lookup probe CC")
    }
  }

  if (!settings.options.no_geoip) {
    emitEvent("status.geoip_lookup", {
      probe_ip: probe_ip,
      probe_asn: probe_asn,
      probe_network_name: probe_network_name,
      probe_cc: probe_cc
    })
    emitProgress(0.2, "geoip lookup")
  }

  let resolver_ip = "127.0.0.1"
  if (!settings.options.no_resolver_lookup) {
    let error
    [resolver_ip, error] = lookupResolver(settings)
    if (error) {
      emitEvent("failure.resolver_lookup", {
        failure: error.AsString()
      })
      emitWarning(settings, "cannot lookup resolver IP")
    }
  }
  emitEvent("status.resolver_lookup", {
    resolver_ip: resolver_ip
  })
  emitProgress(0.3, "resolver lookup")
```
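
The precedence used in the pseudocode above for `probe_ip`, `probe_asn`, and
`probe_cc` (explicit setting first, then a lookup unless `no_geoip` is set,
otherwise a default) can be sketched in C++ as follows. The function
`choose_probe_value` is hypothetical, and keeping the default when the lookup
fails is an assumption of this sketch, since the pseudocode does not say which
value is used in that case.

```cpp
#include <functional>
#include <string>

// Picks the value to report for a probe property (IP, ASN, or country code).
// `lookup` stores its result in the argument and returns true on success.
std::string choose_probe_value(
    const std::string &configured, bool no_geoip,
    const std::function<bool(std::string &)> &lookup,
    const std::string &fallback) {
  if (!configured.empty()) return configured;  // explicit setting wins
  if (no_geoip) return fallback;               // lookups are disabled
  std::string found;
  if (lookup(found)) return found;
  return fallback;  // assumption: keep the default when the lookup fails
}
```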

Then, Measurement Kit opens the report file on disk, which will contain
the measurements, each serialized as JSON on its own line. It will also
contact the OONI collector and obtain a report-ID for the report.

```JavaScript
  if (!settings.options.no_file_report) {
    if (settings.output_filepath == "") {
      settings.output_filepath = makeDefaultOutputFilepath(settings);
    }
    let error = openFileReport(settings.output_filepath)
    if (error) {
      emitWarning(settings, "cannot open file report")
      finish(error)
      return
    }
  }

  let report_id
  if (!settings.options.no_collector) {
    if (settings.options.collector_base_url == "") {
      settings.options.collector_base_url = defaultCollectorBaseURL();
    }
    let error
    [report_id, error] = collectorOpenReport(settings)
    if (error) {
      emitWarning(settings, "cannot open report with OONI collector")
      emitEvent("failure.report_create", {
        failure: error.AsString()
      })
      if (!settings.options.ignore_open_report_error) {
        finish(error)
        return
      }
    } else {
      emitEvent("status.report_create", {
        report_id: report_id
      })
    }
  }

  emitProgress(0.4, "open report")
```

Then comes input processing. Measurement Kit assembles a list of inputs to
process. If the test does not take any input, we fake a single input entry
consisting of the empty string, implying that this test needs to perform just
a single iteration. (This is a somewhat internal detail, but it explains
some events with `idx` equal to `0` and `input` equal to an empty string.)

```JavaScript
  for (let i = 0; i < settings.input_filepaths.length; ++i) {
    let [inputs, error] = readInputFile(settings.input_filepaths[i])
    if (error) {
      emitWarning(settings, "cannot read input file")
      finish(error)
      return
    }
    settings.inputs = settings.inputs.concat(inputs)
  }
  if (settings.inputs.length <= 0) {
    if (task.needs_input) {
      emitWarning(settings, "no input provided")
      finish(Error("value_error"))
      return
    }
    settings.inputs.push("") // empty input for input-less tests
  }
  if (settings.options.randomize_input) {
    shuffle(settings.inputs)
  }
```
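The input-assembly rules above can be condensed into a small runnable sketch. It rests on assumptions: `assembleInputs` is a hypothetical function that receives already-read file contents instead of file paths, it reports errors as plain strings, and `shuffle` is implemented here as a Fisher-Yates shuffle, which may differ from what Measurement Kit does internally.

```JavaScript
// Hypothetical sketch of the input-assembly step; not Measurement Kit code.

// Fisher-Yates shuffle, in place. `random` is injectable to ease testing.
function shuffle(items, random = Math.random) {
  for (let i = items.length - 1; i > 0; --i) {
    const j = Math.floor(random() * (i + 1))
    const tmp = items[i]
    items[i] = items[j]
    items[j] = tmp
  }
  return items
}

// Returns [inputs, error]. `fileInputs` is an array of arrays, one per
// input file, holding the entries already read from that file.
function assembleInputs(inputs, fileInputs, needsInput, randomize) {
  const all = inputs.concat(...fileInputs)
  if (all.length <= 0) {
    if (needsInput) {
      return [null, "value_error"]
    }
    all.push("") // empty input for input-less tests
  }
  if (randomize) {
    shuffle(all)
  }
  return [all, null]
}
```

For example, `assembleInputs([], [], false, false)` yields a single empty-string input, which is why input-less tests emit events with `idx` equal to `0` and an empty `input`.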

Then, Measurement Kit iterates over all the inputs and runs the function
implementing the specified nettest on each input.

```JavaScript
  let begin = timeNow()
  for (let i = 0; i < settings.inputs.length; ++i) {
    let currentTime = timeNow()
    if (settings.options.max_runtime >= 0 &&
        currentTime - begin > settings.options.max_runtime) {
      emitWarning(settings, "exceeded maximum runtime")
      break
    }
    emitEvent("status.measurement_start", {
      idx: i,
      input: settings.inputs[i]
    })
    let measurement = Measurement()
    measurement.annotations = settings.annotations
    measurement.annotations.engine_name = "libmeasurement_kit"
    measurement.annotations.engine_version = mkVersion()
    measurement.annotations.engine_version_full = mkVersionFull()
    measurement.annotations.platform = platformName()
    measurement.annotations.probe_network_name = settings.options.save_real_probe_asn
                                                  ? probe_network_name : "Unknown"
    measurement.id = UUID4()
    measurement.input = settings.inputs[i]
    measurement.input_hashes = []
    measurement.measurement_start_time = currentTime
    measurement.options = []
    measurement.probe_asn = settings.options.save_real_probe_asn ? probe_asn : "AS0"
    measurement.probe_cc = settings.options.save_real_probe_cc ? probe_cc : "ZZ"
    measurement.probe_ip = settings.options.save_real_probe_ip
                              ? probe_ip : "127.0.0.1"
    measurement.report_id = report_id
    measurement.software_name = settings.options.software_name
    measurement.software_version = settings.options.software_version
    measurement.test_helpers = test_helpers
    measurement.test_name = test.AsOONITestName()
    measurement.test_start_time = begin
    measurement.test_version = test.Version()
    let [test_keys, error] = task.Run(
          settings.inputs[i], settings, test_helpers)
    measurement.test_runtime = timeNow() - currentTime
    measurement.test_keys = test_keys
    measurement.test_keys.resolver_ip = settings.options.save_resolver_ip
                                          ? resolver_ip : "127.0.0.1"
    if (error) {
      emitEvent("failure.measurement", {
        failure: error.AsString(),
        idx: i,
        input: settings.inputs[i]
      })
    }
    emitEvent("measurement", {
      json_str: measurement.asJSON(),
      idx: i,
      input: settings.inputs[i]
    })
    if (!settings.options.no_file_report) {
      let error = writeReportToFile(measurement)
      if (error) {
        emitWarning(settings, "cannot write report to file")
        finish(error)
        return
      }
    }
    if (!settings.options.no_collector) {
      let error = submitMeasurementToOONICollector(measurement)
      if (error) {
        emitEvent("failure.measurement_submission", {
          idx: i,
          input: settings.inputs[i],
          json_str: measurement.asJSON(),
          failure: error.AsString()
        })
      } else {
        emitEvent("status.measurement_submission", {
          idx: i,
          input: settings.inputs[i],
        })
      }
    }
    emitEvent("status.measurement_done", {
      idx: i
    })
  }
```
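The loop above replaces the probe's real ASN, country code, IP, and network name with placeholders unless the corresponding `save_real_*` option is set. A minimal runnable sketch of that substitution follows. `redactProbeInfo` is a hypothetical helper, and the probe values in the example are documentation-only placeholders, not real lookup results.

```JavaScript
// Hypothetical helper mirroring the placeholder values used in the loop.
function redactProbeInfo(options, probe) {
  return {
    probe_asn: options.save_real_probe_asn ? probe.probe_asn : "AS0",
    probe_cc: options.save_real_probe_cc ? probe.probe_cc : "ZZ",
    probe_ip: options.save_real_probe_ip ? probe.probe_ip : "127.0.0.1",
    // The network name follows the same option as the ASN.
    probe_network_name: options.save_real_probe_asn
      ? probe.probe_network_name : "Unknown"
  }
}

// Example: keep the ASN and country code, hide the IP address.
const redacted = redactProbeInfo(
  { save_real_probe_asn: true, save_real_probe_cc: true },
  {
    probe_asn: "AS64496",
    probe_cc: "IT",
    probe_ip: "192.0.2.1",
    probe_network_name: "Example Network"
  }
)
```

Here `redacted.probe_ip` is the placeholder `"127.0.0.1"` because `save_real_probe_ip` is not set, while the ASN, country code, and network name are kept.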

Finally, Measurement Kit ends the test by closing the local results file
and the remote report managed by the OONI collector.

```JavaScript
  emitProgress(0.9, "ending the test")

  if (!settings.options.no_file_report) {
    let error = closeFileReport()
    if (error) {
      emitWarning(settings, "cannot close file report")
      finish(error)
      return
    }
  }
  if (!settings.options.no_collector) {
    let error = closeRemoteReport()
    if (error) {
      emitEvent("failure.report_close", {
        failure: error.AsString()
      })
      emitWarning(settings, "cannot close remote report with OONI collector")
    } else {
      emitEvent("status.report_close", {
        report_id: report_id
      })
    }
  }

  finish()
}
```
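Seen from the application side, the per-input events emitted by the pseudocode above all carry an `idx` field, so a consumer can group them by input. The following is a sketch under assumptions: `summarize` is a hypothetical function, and it assumes each event has already been parsed into an object of the form `{ key: "...", value: { ... } }`.

```JavaScript
// Hypothetical consumer-side sketch. It assumes each event has already been
// parsed into an object of the form { key: "...", value: { ... } }.
function summarize(events) {
  const byIdx = new Map()
  for (const ev of events) {
    if (ev.value === undefined || ev.value.idx === undefined) {
      continue // not a per-input event (e.g. report open or close)
    }
    if (!byIdx.has(ev.value.idx)) {
      byIdx.set(ev.value.idx, { failed: false, submitted: false, done: false })
    }
    const entry = byIdx.get(ev.value.idx)
    if (ev.key === "failure.measurement") entry.failed = true
    if (ev.key === "status.measurement_submission") entry.submitted = true
    if (ev.key === "status.measurement_done") entry.done = true
  }
  return byIdx
}
```

An input whose entry has `done` set but `submitted` unset was measured but not accepted by the collector (or the collector was disabled), which an application may want to surface to the user.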