# Measurement Kit API

Measurement Kit only exposes a simple C-like API suitable to be used via
[FFI](https://en.wikipedia.org/wiki/Foreign_function_interface).

We implemented the following higher-level, easier-to-use APIs on top of
this basic C-like FFI-friendly API:

- [ObjectiveC API](https://github.com/measurement-kit/mkall-ios);

- [Android API](https://github.com/measurement-kit/android-libs).

We encourage you to avoid using the FFI API when a more user-friendly API is
available because, for each specific platform, we will strive to maintain
backwards compatibility with the highest-level API available on that platform.

## Synopsis

```C++
#include <measurement_kit/ffi.h>

typedef struct mk_event_ mk_event_t;
typedef struct mk_task_ mk_task_t;

mk_task_t *mk_task_start(const char *settings);
mk_event_t *mk_task_wait_for_next_event(mk_task_t *task);
int mk_task_is_done(mk_task_t *task);
void mk_task_interrupt(mk_task_t *task);

const char *mk_event_serialization(mk_event_t *event);
void mk_event_destroy(mk_event_t *event);

void mk_task_destroy(mk_task_t *task);
```

## Introduction

Measurement Kit is a network measurement engine. It runs _tasks_ (e.g. a
specific network measurement). To start a task, you need to configure it with
specific _settings_. The most important of these settings is the
task name (e.g. "Ndt" is the task name of the NDT network performance test).
While running, a task emits _events_ (e.g. a log line, a JSON describing the
result of the measurement, and other intermediate results).

Each task runs in its own thread. Measurement Kit implements a
simple mechanism, based on a shared semaphore, to guarantee that
tasks do not run concurrently. This mechanism prevents tasks from
running concurrently, but does not necessarily guarantee that tasks
are run in FIFO order.
Yet, this is enough to prevent a task from
creating network noise that affects another task's measurements.

The thread running a task will post events generated by the task
on a shared, thread-safe queue. Your code should loop by extracting
and processing events from this shared queue until the task has
finished running.

## API documentation

`mk_task_start` starts a task (generally a nettest) with the settings provided
as a serialized JSON. Returns `NULL` if `settings` was `NULL`, or in
case of parse error. You own (and must destroy) the returned task
pointer.

`mk_task_wait_for_next_event` blocks waiting for the `task` to emit the next
event. Returns `NULL` if `task` is `NULL` or on error. If the task is
terminated, it immediately returns a `task_terminated` event. You own (and
must destroy) the returned event pointer.

`mk_task_is_done` returns zero when the task is running, nonzero otherwise. If
the `task` is `NULL`, nonzero is returned.

`mk_task_interrupt` interrupts a running `task`. Interrupting a `NULL` task
has no effect.

`mk_event_serialization` obtains the JSON serialization of `event`. Returns
`NULL` if either the `event` is `NULL` or there is an error.

`mk_event_destroy` destroys the memory associated with `event`. Attempting to
destroy a `NULL` event has no effect.

`mk_task_destroy` destroys the memory associated with `task`. Attempting to
destroy a `NULL` task has no effect. Attempting to destroy a running `task` will
wait for the task to complete before releasing memory.

## Example

The following C++ example runs the "Ndt" test with "INFO" verbosity.

```C++
#include <iostream>

#include <measurement_kit/ffi.h>

int main() {
  const char *settings = R"({
    "name": "Ndt",
    "log_level": "INFO"
  })";
  mk_task_t *task = mk_task_start(settings);
  if (!task) {
    std::clog << "ERROR: cannot start task" << std::endl;
    return 1;
  }
  while (!mk_task_is_done(task)) {
    mk_event_t *event = mk_task_wait_for_next_event(task);
    if (!event) {
      std::clog << "ERROR: cannot wait for next event" << std::endl;
      break;
    }
    const char *json_serialized_event = mk_event_serialization(event);
    if (!json_serialized_event) {
      std::clog << "ERROR: cannot get event serialization" << std::endl;
      mk_event_destroy(event);  // do not leak the event on this error path
      break;
    }
    {
      // TODO: process the JSON-serialized event
    }
    mk_event_destroy(event);
  }
  mk_task_destroy(task);
}
```

## Nettest tasks

The following nettest tasks are defined (case matters):

- `"Dash"`: Neubot's DASH test.
- `"CaptivePortal"`: OONI's captive portal test.
- `"DnsInjection"`: OONI's DNS injection test.
- `"FacebookMessenger"`: OONI's Facebook Messenger test.
- `"HttpHeaderFieldManipulation"`: OONI's HTTP header field manipulation test.
- `"HttpInvalidRequestLine"`: OONI's HTTP invalid request line test.
- `"MeekFrontedRequests"`: OONI's meek fronted requests test.
- `"Ndt"`: the NDT network performance test.
- `"TcpConnect"`: OONI's TCP connect test.
- `"Telegram"`: OONI's Telegram test.
- `"WebConnectivity"`: OONI's Web Connectivity test.
- `"Whatsapp"`: OONI's WhatsApp test.

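
Because task names are case sensitive, it can help to check a name before building the settings. The following is a minimal sketch, not part of Measurement Kit: `is_known_nettest` and `make_minimal_settings` are hypothetical helpers introduced here for illustration, and the name table is simply copied from the list above.

```cpp
#include <string>

// Nettest task names copied from the list above (matching is case sensitive).
static const char *const kNettestNames[] = {
    "Dash", "CaptivePortal", "DnsInjection", "FacebookMessenger",
    "HttpHeaderFieldManipulation", "HttpInvalidRequestLine",
    "MeekFrontedRequests", "Ndt", "TcpConnect", "Telegram",
    "WebConnectivity", "Whatsapp"};

// Hypothetical helper: true if `name` exactly matches a documented name.
bool is_known_nettest(const std::string &name) {
  for (const char *known : kNettestNames) {
    if (name == known) return true;
  }
  return false;
}

// Hypothetical helper: builds a settings JSON containing only the mandatory
// "name" key, or returns an empty string when the name is not documented.
std::string make_minimal_settings(const std::string &name) {
  if (!is_known_nettest(name)) return "";
  // No escaping is needed: every known name consists of ASCII letters only.
  return "{\"name\": \"" + name + "\"}";
}
```

Assuming the helpers above, the result of `make_minimal_settings("Ndt")` could be passed to `mk_task_start` via `c_str()`, while a misspelled name such as `"ndt"` is rejected before any task is started.
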

## Settings

The nettest task settings object is a JSON object like:

```JSON
{
  "annotations": {
    "optional_annotation_1": "value_1",
    "another_annotation": "with_its_value"
  },
  "disabled_events": [
    "status.queued",
    "status.started"
  ],
  "inputs": [
    "www.google.com",
    "www.x.org"
  ],
  "input_filepaths": [
    "/path/to/file",
    "/path/to/another/file"
  ],
  "log_filepath": "logfile.txt",
  "log_level": "INFO",
  "name": "WebConnectivity",
  "options": {
    "all_endpoints": false,
    "backend": "",
    "bouncer_base_url": "",
    "collector_base_url": "",
    "constant_bitrate": 0,
    "dns/nameserver": "",
    "dns/engine": "system",
    "expected_body": "",
    "geoip_asn_path": "",
    "geoip_country_path": "",
    "hostname": "",
    "ignore_bouncer_error": true,
    "ignore_open_report_error": true,
    "max_runtime": -1,
    "mlabns/address_family": "ipv4",
    "mlabns/base_url": "https://locate.measurementlab.net/",
    "mlabns/country": "IT",
    "mlabns/metro": "trn",
    "mlabns/policy": "random",
    "mlabns_tool_name": "",
    "net/ca_bundle_path": "",
    "net/timeout": 10.0,
    "no_bouncer": false,
    "no_collector": false,
    "no_file_report": false,
    "no_geoip": false,
    "no_resolver_lookup": false,
    "port": 1234,
    "probe_ip": "1.2.3.4",
    "probe_asn": "AS30722",
    "probe_cc": "IT",
    "probe_network_name": "Network name",
    "randomize_input": true,
    "save_real_probe_asn": true,
    "save_real_probe_cc": true,
    "save_real_probe_ip": false,
    "save_real_resolver_ip": true,
    "server": "neubot.mlab.mlab1.trn01.measurement-lab.org",
    "software_name": "measurement_kit",
    "software_version": "<current-mk-version>",
    "test_suite": 0,
    "uuid": ""
  },
  "output_filepath": "results.njson"
}
```

The only mandatory key is `name`, which identifies the task. All the other
keys are optional.
Above we have shown the most commonly used `options`, which
are described in greater detail below. The values shown for `options`
are their defaults (_however_, the values of non-`options` settings _are not_
defaults, but rather meaningful examples). The following keys
are available:

- `"annotations"`: (object; optional) JSON object containing key, value string
  mappings that are copied verbatim in the measurement result file;

- `"disabled_events"`: (array; optional) array of strings containing the names
  of the events that you are not interested in. All the available event
  names are described below. By default all events are enabled;

- `"inputs"`: (array; optional) array of strings to be passed to the nettest as
  input. If the nettest does not take any input, this is ignored. If the nettest
  requires input and you provide neither `"inputs"` nor `"input_filepaths"`,
  the nettest will fail;

- `"input_filepaths"`: (array; optional) array of files containing input
  strings, one per line, to be passed to the nettest. These files are read and
  their content is merged with the content of the `inputs` key;

- `"log_filepath"`: (string; optional) name of the file where to
  write logs. By default logs are written on `stderr`;

- `"log_level"`: (string; optional) how much information you want to see
  written in the log file and emitted by log-related events. The default log
  level is "WARNING" (see below);

- `"name"`: (string; mandatory) name of the nettest to run.
  The available nettest names have been described above;

- `"options"`: (object; optional) options modifying the nettest behavior, as
  an object mapping string keys to string, integer, double, or boolean values;

- `"output_filepath"`: (string; optional) file where you want MK to
  write measurement results, as a sequence of lines, each line being
  the result of a measurement serialized as JSON.

## Log levels

The available log levels are:

- `"ERR"`: an error message

- `"WARNING"`: a warning message

- `"INFO"`: an informational message

- `"DEBUG"`: a debugging message

- `"DEBUG2"`: a really specific debugging message

When you specify a log level in the settings, only messages whose severity is
equal to or greater than the specified level are emitted. For example, if you
specify `"INFO"`, you will only see `"ERR"`, `"WARNING"`, and `"INFO"` logs.

The default log level is "WARNING".

## Options

Options can be `string`, `double`, `int`, or `boolean`. There used to be no
boolean type, but we later added support for it (between v0.9.0-alpha.9 and
v0.9.0-alpha.10). The code will currently warn you if the type of a variable
is not correct. In future versions, we will enforce this restriction more
strictly and only accept options with the correct type.

These are the available options:

- `"all_endpoints"`: (boolean) whether to check just a few or all the
  available endpoints in tests that work with specific endpoints, such
  as the "WhatsApp" test;

- `"backend"`: (string) pass a specific backend to OONI tests requiring it,
  e.g., WebConnectivity, HTTP Invalid Request Line;

- `"bouncer_base_url"`: (string) base URL of the OONI bouncer, by default set
  to the empty string. If empty, the default OONI bouncer will be used;

- `"collector_base_url"`: (string) base URL of the OONI collector, by default
  set to the empty string.
  If empty, the default OONI collector will be used;

- `"constant_bitrate"`: (int) force DASH to run at the specified
  constant bitrate;

- `"dns/nameserver"`: (string) nameserver to be used with non-`system` DNS
  engines. May include an optional port number. By default, set
  to the empty string;

- `"dns/engine"`: (string) what DNS engine to use. By default, set to
  `"system"`, meaning that `getaddrinfo()` will be used to resolve domain
  names. Can also be set to `"libevent"`, to use libevent's DNS engine.
  In such case, you must provide a `"dns/nameserver"` as well;

- `"expected_body"`: (string) body expected by Meek Fronted Requests;

- `"geoip_asn_path"`: (string) path to the GeoLite2 `.mmdb` ASN database
  file. By default not set;

- `"geoip_country_path"`: (string) path to the GeoLite2 `.mmdb` Country
  database file. By default not set;

- `"hostname"`: (string) hostname to be used by the DASH test;

- `"ignore_bouncer_error"`: (boolean) whether to ignore an error in contacting
  the OONI bouncer. By default set to `true` so that bouncer errors will
  be ignored;

- `"ignore_open_report_error"`: (boolean) whether to ignore an error when
  opening the report with the OONI collector. By default set to `true` so that
  errors will be ignored;

- `"max_runtime"`: (int) number of seconds after which the test will
  be stopped. Works _only_ for tests taking input.
  By default set to `-1`
  so that there is no maximum runtime for tests with input;

- `"mlabns/address_family"`: (string) set to `"ipv4"` or `"ipv6"` to force
  M-Lab NS to only return IPv4 or IPv6 addresses (you don't normally
  need to set this option and it only has effect for NDT and DASH anyway);

- `"mlabns/base_url"`: (string) base URL of the M-Lab NS service (you don't
  normally need to set this option and it only has effect for NDT and
  DASH anyway);

- `"mlabns/country"`: (string) tells M-Lab NS the country in which you are
  rather than letting it guess for you, so that it returns results that
  are meaningful within that country (again, normally you don't need this
  option, and it only affects DASH and NDT);

- `"mlabns/metro"`: (string) this restricts the results returned by M-Lab
  NS to a specific metro area; for example, setting this to `"ord"` will
  only return M-Lab servers in the Chicago area (again, you normally don't
  need this option, and it only affects DASH and NDT);

- `"mlabns/policy"`: (string) overrides the default M-Lab NS policy; for
  example, setting this to `"random"` will return a random server (as stated
  above, you normally don't need this option, and it only affects
  the NDT and DASH tests);

- `"mlabns_tool_name"`: (string) force NDT to use an mlab-ns tool
  name different from the default (`"ndt"`);

- `"net/ca_bundle_path"`: (string) path to the CA bundle to be used
  to validate SSL certificates. Required on mobile;

- `"net/timeout"`: (double) number of seconds after which network I/O
  operations will time out. By default set to `10.0` seconds;

- `"no_bouncer"`: (boolean) whether to avoid using a bouncer. By default set to
  `false`, meaning that a bouncer will be used;

- `"no_collector"`: (boolean) whether to avoid using a collector.
  By default set to
  `false`, meaning that a collector will be used;

- `"no_file_report"`: (boolean) whether to skip writing a report (i.e.
  measurement result) file on disk. By default set to `false`, meaning that
  we'll try to write it;

- `"no_geoip"`: (boolean) whether to skip the GeoIP lookup. By default set to
  `false`, meaning that we'll try to perform it;

- `"no_resolver_lookup"`: (boolean) whether to skip looking up the IP address
  of the resolver. By default `false`, meaning that we'll try. When `true` we
  will set the resolver IP address to `127.0.0.1`;

- `"probe_asn"`: (string) sets the `probe_asn` to be included in the
  report, thus skipping the ASN resolution;

- `"probe_cc"`: (string) like `probe_asn` but for country code;

- `"probe_ip"`: (string) like `probe_asn` but for IP address;

- `"probe_network_name"`: (string) like `probe_asn` but for the
  name associated with the ASN;

- `"port"`: (int) overrides the port for tests that connect to a
  specific port, such as NDT and DASH;

- `"randomize_input"`: (boolean) whether to randomize input. By default set to
  `true`, meaning that we'll randomize input;

- `"save_real_probe_asn"`: (boolean) whether to save the ASN. By default set
  to `true`, meaning that we will save it;

- `"save_real_probe_cc"`: (boolean) whether to save the CC. By
  default set to `true`, meaning that we will save it;

- `"save_real_probe_ip"`: (boolean) whether to save our IP. By
  default set to `false`, meaning that we will not save it;

- `"save_real_resolver_ip"`: (boolean) whether to save the resolver
  IP. By default set to `true`, meaning that we'll save it;

- `"server"`: (string) overrides the server hostname for tests that
  connect to a specific server, such as NDT and DASH;

- `"software_name"`: (string) name of the app. By default set to
  `"measurement_kit"`.
  This string will be included in the user-agent
  header when contacting mlab-ns;

- `"software_version"`: (string) version of the app. By default set to the
  current version of Measurement Kit. As for `software_name`, this string
  will be included in the user-agent header when contacting mlab-ns;

- `"test_suite"`: (int) force NDT to use a specific test suite;

- `"uuid"`: (string) force DASH to use a specific UUID.

## Events

An event is a JSON object like:

```JSON
{
  "key": "<key>",
  "value": {}
}
```

Where `"value"` is a JSON object with an event-specific structure, and `"key"`
is a string. Below we describe all the possible event keys, along with the
"value" JSON structure. Unless otherwise specified, an event key can be emitted
an arbitrary number of times during the lifecycle of a task. Unless otherwise
specified, all the keys introduced below were added in MK v0.9.0.

- `"bug.json_dump"`: (object) There was a failure in serialising an event
  to JSON and we let you know about this Measurement Kit bug. Please, open an
  issue on GitHub if you notice this kind of bug. This event has been added
  in MK v0.9.2. The JSON returned by this event is like:

```JSON
{
  "key": "bug.json_dump",
  "value": {
    "failure": "<failure_string>",
    "orig_key": "<orig_key>"
  }
}
```

Where `<orig_key>` is the key of the event that failed to serialise and
`<failure_string>` is an error providing some additional information. Note that
both fields MAY be base64 encoded if they're not JSON serialisable.

- `"failure.asn_lookup"`: (object) There was a failure attempting to look up the
  user's autonomous system number. The JSON returned by this event is like:

```JSON
{
  "key": "failure.asn_lookup",
  "value": {
    "failure": "<failure_string>"
  }
}
```

Where all the possible values of `<failure_string>` are described below.

- `"failure.cc_lookup"`: (object) There was a failure attempting to look up the
  user's country code. The JSON returned by this event is like:

```JSON
{
  "key": "failure.cc_lookup",
  "value": {
    "failure": "<failure_string>"
  }
}
```

Where all the possible values of `<failure_string>` are described below.

- `"failure.ip_lookup"`: (object) There was a failure attempting to look up the
  user's IP address. The JSON returned by this event is like:

```JSON
{
  "key": "failure.ip_lookup",
  "value": {
    "failure": "<failure_string>"
  }
}
```

Where all the possible values of `<failure_string>` are described below.

- `"failure.measurement"`: (object) There was a failure running the
  measurement. The complete JSON returned by this event is like:

```JSON
{
  "key": "failure.measurement",
  "value": {
    "idx": 0,
    "failure": "<failure_string>"
  }
}
```

Where all the possible values of `<failure_string>` are described below.

- `"failure.measurement_submission"`: (object) There was a failure in
submitting the measurement result to the configured collector (if any). The
complete JSON returned by this event is like:

```JSON
{
  "key": "failure.measurement_submission",
  "value": {
    "idx": 0,
    "json_str": "<serialized_result>",
    "failure": "<failure_string>"
  }
}
```

Where `idx` is the index of the current measurement, which is relevant for the
tests that run over an input list; `json_str` is the measurement that we failed
to submit, serialized as JSON; `failure` is the error that occurred.

- `"failure.report_create"`: (object) There was a failure in creating the
report with the configured collector (if any).
The complete JSON
returned by this event is like:

```JSON
{
  "key": "failure.report_create",
  "value": {
    "failure": "<failure_string>"
  }
}
```

Where `failure` is the error that occurred.

- `"failure.report_close"`: (object) There was a failure in closing the
report with the configured collector (if any). The complete JSON
returned by this event is like:

```JSON
{
  "key": "failure.report_close",
  "value": {
    "failure": "<failure_string>"
  }
}
```

Where `failure` is the error that occurred.

- `"failure.resolver_lookup"`: (object) There was a failure attempting to
  look up the user's DNS resolver. The JSON returned by this event is like:

```JSON
{
  "key": "failure.resolver_lookup",
  "value": {
    "failure": "<failure_string>"
  }
}
```

Where all the possible values of `<failure_string>` are described below.

- `"failure.startup"`: (object) There was a failure starting the test, most
likely because you passed in incorrect options. The complete JSON returned by
this event is like:

```JSON
{
  "key": "failure.startup",
  "value": {
    "failure": "<failure_string>"
  }
}
```

Where `<failure_string>` is the error that occurred.

- `"log"`: (object) A log line was emitted. The complete JSON is like:

```JSON
{
  "key": "log",
  "value": {
    "log_level": "<a_log_level>",
    "message": "<string>"
  }
}
```

Where `log_level` is one of the log levels described above, and `message`
is the log message emitted by Measurement Kit.

- `"measurement"`: (object) The result of a measurement. The complete JSON
is like:

```JSON
{
  "key": "measurement",
  "value": {
    "idx": 0,
    "json_str": "<serialized_result>"
  }
}
```

Where `json_str` is the measurement result serialized as JSON.
The schema of
a measurement result depends on the type of nettest, as described below. And
where `idx` is the index of the current measurement (relevant only for nettests
that iterate over an input list).

- `"status.end"`: (object) This event is emitted just once at the end of the
test. The complete JSON is like:

```JSON
{
  "key": "status.end",
  "value": {
    "downloaded_kb": 0.0,
    "uploaded_kb": 0.0,
    "failure": "<failure_string>"
  }
}
```

Where `downloaded_kb` and `uploaded_kb` are the amounts of downloaded and
uploaded kilobytes, and `failure` is the overall failure that occurred during
the test (or the empty string, if no error occurred).

- `"status.geoip_lookup"`: (object) This event is emitted only once at the
beginning of the nettest, and provides information about the user's IP address,
country and autonomous system. In detail, the JSON is like:

```JSON
{
  "key": "status.geoip_lookup",
  "value": {
    "probe_ip": "<ip_address>",
    "probe_asn": "<asn>",
    "probe_cc": "<cc>",
    "probe_network_name": "<network_name>"
  }
}
```

Where `<ip_address>` is the user's IP address, `<asn>` is the autonomous
system number, `<cc>` is the country code, and `<network_name>` is the
commercial name associated with the autonomous system number.

- `"status.progress"`: (object) This is emitted during the nettest lifecycle to
inform you about the nettest progress towards completion. In detail, the JSON is
like:

```JSON
{
  "key": "status.progress",
  "value": {
    "percentage": 0.0,
    "message": "<string>"
  }
}
```

Where `percentage` is the percentage of completion of the nettest, as a real
number between 0.0 and 1.0, and `message` indicates the operation that the
nettest just completed.

- `"status.queued"`: (object) Indicates that the nettest has been accepted.
If other nettests are already running, this nettest waits because, as mentioned
above, nettests are prevented from running concurrently. The JSON is like:

```JSON
{
  "key": "status.queued",
  "value": {
  }
}
```

Where `value` is empty.

- `"status.measurement_start"`: (object) Indicates that a measurement inside
a nettest has started. The JSON is like:

```JSON
{
  "key": "status.measurement_start",
  "value": {
    "idx": 0,
    "input": "<input>"
  }
}
```

Where `idx` is the index of the current input and `input` is the current
input. For tests that take no input, this event MAY be emitted with
`idx` equal to `0` and `input` equal to the empty string.

- `"status.measurement_submission"`: (object) The specific measurement has
been uploaded successfully. The JSON is like:

```JSON
{
  "key": "status.measurement_submission",
  "value": {
    "idx": 0
  }
}
```

Where `idx` is the index of the measurement input.

- `"status.measurement_done"`: (object) Measurement Kit has finished processing
the specified input. The JSON is like:

```JSON
{
  "key": "status.measurement_done",
  "value": {
    "idx": 0
  }
}
```

Where `idx` is the index of the measurement input.

- `"status.report_close"`: (object) Measurement Kit has closed a report for the
current nettest, and tells you the report-ID. The report-ID is the identifier of
the measurement result(s), which have been submitted. The JSON is like:

```JSON
{
  "key": "status.report_close",
  "value": {
    "report_id": "string"
  }
}
```

Where `report_id` is the report identifier.

- `"status.report_create"`: (object) Measurement Kit has created a report for
the current nettest, and tells you the report-ID. The report-ID is the identifier
of the measurement result(s), which will be later submitted.
The JSON is like:

```JSON
{
  "key": "status.report_create",
  "value": {
    "report_id": "string"
  }
}
```

Where `report_id` is the report identifier.

- `"status.resolver_lookup"`: (object) This event is emitted only once at the
beginning of the nettest, when the IP address of the resolver is discovered. The
JSON is like:

```JSON
{
  "key": "status.resolver_lookup",
  "value": {
    "resolver_ip": "<ip_address>"
  }
}
```

Where `<ip_address>` is the resolver's IP address.

- `"status.started"`: (object) The nettest has started, and the JSON is like:

```JSON
{
  "key": "status.started",
  "value": {
  }
}
```

Where `value` is empty.

- `"status.update.performance"`: (object) This is an event emitted by tests that
measure network performance. The JSON is like:

```JSON
{
  "key": "status.update.performance",
  "value": {
    "direction": "<direction>",
    "elapsed": 0.0,
    "num_streams": 0,
    "speed_kbps": 0.0
  }
}
```

Where `direction` is either "download" or "upload", `elapsed` is the elapsed
time (in seconds) since the measurement started, `num_streams` is the number of
streams we are using to measure performance, and `speed_kbps` is the speed, in
kbit/s, measured since the previous performance measurement.

- `"status.update.websites"`: (object) This is an event emitted by tests that
measure the reachability of websites. The JSON is like:

```JSON
{
  "key": "status.update.websites",
  "value": {
    "url": "<url>",
    "status": "<status>"
  }
}
```

Where `url` is the URL we're measuring and `status` is either `accessible`
or `blocking`.

- `"task_terminated"`: (object) This event is emitted when you attempt to
extract events from the task queue, but the task is not running anymore (i.e.
it's the equivalent of `EOF` for the task queue).
The related JSON is like:

```JSON
{
  "key": "task_terminated",
  "value": {
  }
}
```

Where `value` is empty.

## Task pseudocode

The following illustrates in pseudocode the operations performed
by a nettest once you call `mk_task_start`. It is not 100% accurate;
in particular, we have omitted the code that generates most log
messages. This pseudocode is meant to help you understand how
Measurement Kit works internally, and specifically how all the
settings described above interact when you specify them
for running Measurement Kit nettests. We are using pseudo JavaScript
because that is the easiest language to show manipulation of JSON
objects such as the `settings` object.

As of MK v0.9.0-alpha.9, there are some misalignments between the pseudocode
and the implementation, which we'll fix during the v0.12.0 releases.

As mentioned, a nettest runs in its own thread. It first validates
settings, then it opens the logfile (if needed), and finally it
waits in queue until other possibly running nettests terminate. The
`finish` function will be called when the nettest is done, and will
emit all the events emitted at the end of a nettest.

```JavaScript
function taskThread(settings) {
  emitEvent("status.queued", {})
  semaphore.Acquire() // blocked until my turn

  let finish = function(error) {
    semaphore.Release() // allow another test to run
    emitEvent("status.end", {
      downloaded_kb: countDownloadedKb(),
      uploaded_kb: countUploadedKb(),
      failure: error.AsString()
    })
  }

  if (!settingsAreValid(settings)) {
    emitEvent("failure.startup", {
      failure: "value_error",
    })
    finish(Error("value_error"))
    return
  }

  if (settings.log_filepath != "") {
    // TODO(bassosimone): we should decide whether we want to deal with the
    // case where we cannot write into the log file. Currently we don't.
    openLogFile(settings.log_filepath)
  }

  let task = makeNettestTask(settings.name)

  emitEvent("status.started", {})
```

After all this setup, a nettest contacts the OONI bouncer, then looks up
the IP address, the country code, the autonomous system number, and
the resolver IP address. All this information ends up in the JSON
measurement. Also, all these operations can be explicitly disabled
by setting the appropriate settings.

```JavaScript
  let test_helpers = task.defaultTestHelpers()
  if (!settings.options.no_bouncer) {
    if (settings.options.bouncer_base_url == "") {
      settings.options.bouncer_base_url = defaultBouncerBaseURL()
    }
    let error
    [test_helpers, error] = queryOONIBouncer(settings)
    if (error) {
      emitWarning(settings, "cannot query OONI bouncer")
      if (!settings.options.ignore_bouncer_error) {
        finish(error)
        return
      }
    }
    // TODO(bassosimone): we should make sure the progress described here
    // is consistent with the one emitted by the real code.
    emitProgress(0.1, "contacted bouncer")
  }

  let probe_ip = "127.0.0.1"
  if (settings.options.probe_ip != "") {
    probe_ip = settings.options.probe_ip
  } else if (!settings.options.no_geoip) {
    let error
    [probe_ip, error] = lookupIP(settings)
    if (error) {
      emitEvent("failure.ip_lookup", {
        failure: error.AsString()
      })
      emitWarning(settings, "cannot lookup probe IP")
    }
  }

  let probe_asn = "AS0",
      probe_network_name = "Unknown"
  if (settings.options.probe_asn != "") {
    probe_asn = settings.options.probe_asn
  } else if (!settings.options.no_geoip) {
    let error
    [probe_asn, probe_network_name, error] = lookupASN(settings)
    if (error) {
      emitEvent("failure.asn_lookup", {
        failure: error.AsString()
      })
      emitWarning(settings, "cannot lookup probe ASN")
    }
  }

  let probe_cc = "ZZ"
  if (settings.options.probe_cc != "") {
    probe_cc = settings.options.probe_cc
  } else if (!settings.options.no_geoip) {
    let error
    [probe_cc, error] = lookupCC(settings)
    if (error) {
      emitEvent("failure.cc_lookup", {
        failure: error.AsString()
      })
      emitWarning(settings, "cannot lookup probe CC")
    }
  }

  if (!settings.options.no_geoip) {
    emitEvent("status.geoip_lookup", {
      probe_ip: probe_ip,
      probe_asn: probe_asn,
      probe_network_name: probe_network_name,
      probe_cc: probe_cc
    })
    emitProgress(0.2, "geoip lookup")
  }

  let resolver_ip = "127.0.0.1"
  if (!settings.options.no_resolver_lookup) {
    let error
    [resolver_ip, error] = lookupResolver(settings)
    if (error) {
      emitEvent("failure.resolver_lookup", {
        failure: error.AsString()
      })
      emitWarning(settings, "cannot lookup resolver IP")
    }
  }
  emitEvent("status.resolver_lookup", {
    resolver_ip: resolver_ip
  })
  emitProgress(0.3, "resolver lookup")
```

Then, Measurement Kit opens the report file on disk, which will contain
995the measurements, each serialized on a JSON on its own line. It will also 996contact the OONI bouncer and obtain a report-ID for the report. 997 998```JavaScript 999 if (!settings.options.no_file_report) { 1000 if (settings.output_filepath == "") { 1001 settings.output_filepath = makeDefaultOutputFilepath(settings); 1002 } 1003 let error = openFileReport(settings.output_filepath) 1004 if (error) { 1005 emitWarning(settings, "cannot open file report") 1006 finish(error) 1007 return 1008 } 1009 } 1010 1011 let report_id 1012 if (!settings.options.no_collector) { 1013 if (settings.options.collector_base_url == "") { 1014 settings.options.collector_base_url = defaultCollectorBaseURL(); 1015 } 1016 let error 1017 [report_id, error] = collectorOpenReport(settings) 1018 if (error) { 1019 emitWarning("cannot open report with OONI collector") 1020 emitEvent("failure.report_create", { 1021 failure: error.AsString() 1022 }) 1023 if (!settings.options.ignore_open_report_error) { 1024 finish(error) 1025 return 1026 } 1027 } else { 1028 emitEvent("status.report_create", { 1029 report_id: report_id 1030 }) 1031 } 1032 } 1033 1034 emitProgress(0.4, "open report") 1035``` 1036 1037Then comes input processing. Measurement Kit assembles a list of inputs to 1038process. If the test do not take any input, we fake a single input entry 1039consisting of the empty string, implying that this test needs to perform just 1040a single iteration. (This is a somewhat internal detail, but it explains 1041some events with `idx` equal to `0` and `input` equal to an empty string.) 

```JavaScript
  for (let i = 0; i < settings.input_filepaths.length; ++i) {
    let [inputs, error] = readInputFile(settings.input_filepaths[i])
    if (error) {
      emitWarning(settings, "cannot read input file")
      finish(error)
      return
    }
    settings.inputs = settings.inputs.concat(inputs)
  }
  if (settings.inputs.length <= 0) {
    if (task.needs_input) {
      emitWarning(settings, "no input provided")
      finish(Error("value_error"))
      return
    }
    settings.inputs.push("") // empty input for input-less tests
  }
  if (settings.options.randomize_input) {
    shuffle(settings.inputs)
  }
```

Then, Measurement Kit iterates over all the inputs and runs the function
implementing the specified nettest on each input.

```JavaScript
  let begin = timeNow()
  for (let i = 0; i < settings.inputs.length; ++i) {
    let currentTime = timeNow()
    if (settings.options.max_runtime >= 0 &&
        currentTime - begin > settings.options.max_runtime) {
      emitWarning(settings, "exceeded maximum runtime")
      break
    }
    emitEvent("status.measurement_start", {
      idx: i,
      input: settings.inputs[i]
    })
    let measurement = Measurement()
    measurement.annotations = settings.annotations
    measurement.annotations.engine_name = "libmeasurement_kit"
    measurement.annotations.engine_version = mkVersion()
    measurement.annotations.engine_version_full = mkVersionFull()
    measurement.annotations.platform = platformName()
    measurement.annotations.probe_network_name = settings.options.save_real_probe_asn
        ? probe_network_name : "Unknown"
    measurement.id = UUID4()
    measurement.input = settings.inputs[i]
    measurement.input_hashes = []
    measurement.measurement_start_time = currentTime
    measurement.options = []
    measurement.probe_asn = settings.options.save_real_probe_asn ? probe_asn : "AS0"
    measurement.probe_cc = settings.options.save_real_probe_cc ? probe_cc : "ZZ"
    measurement.probe_ip = settings.options.save_real_probe_ip
        ? probe_ip : "127.0.0.1"
    measurement.report_id = report_id
    measurement.software_name = settings.options.software_name
    measurement.software_version = settings.options.software_version
    measurement.test_helpers = test_helpers
    measurement.test_name = task.AsOONITestName()
    measurement.test_start_time = begin
    measurement.test_version = task.Version()
    let [test_keys, error] = task.Run(
          settings.inputs[i], settings, test_helpers)
    measurement.test_runtime = timeNow() - currentTime
    measurement.test_keys = test_keys
    measurement.test_keys.resolver_ip = settings.options.save_resolver_ip
        ? resolver_ip : "127.0.0.1"
    if (error) {
      emitEvent("failure.measurement", {
        failure: error.AsString(),
        idx: i,
        input: settings.inputs[i]
      })
    }
    emitEvent("measurement", {
      json_str: measurement.asJSON(),
      idx: i,
      input: settings.inputs[i]
    })
    if (!settings.options.no_file_report) {
      let error = writeReportToFile(measurement)
      if (error) {
        emitWarning(settings, "cannot write report to file")
        finish(error)
        return
      }
    }
    if (!settings.options.no_collector) {
      let error = submitMeasurementToOONICollector(measurement)
      if (error) {
        emitEvent("failure.measurement_submission", {
          idx: i,
          input: settings.inputs[i],
          json_str: measurement.asJSON(),
          failure: error.AsString()
        })
      } else {
        emitEvent("status.measurement_submission", {
          idx: i,
          input: settings.inputs[i]
        })
      }
    }
    emitEvent("status.measurement_done", {
      idx: i
    })
  }
```

Finally, Measurement Kit ends the test by closing the local results file
and the remote report managed by the OONI collector.

```JavaScript
  emitProgress(0.9, "ending the test")

  if (!settings.options.no_file_report) {
    let error = closeFileReport()
    if (error) {
      emitWarning(settings, "cannot close file report")
      finish(error)
      return
    }
  }
  if (!settings.options.no_collector) {
    let error = closeRemoteReport()
    if (error) {
      emitEvent("failure.report_close", {
        failure: error.AsString()
      })
      emitWarning(settings, "cannot close remote report with OONI collector")
    } else {
      emitEvent("status.report_close", {
        report_id: report_id
      })
    }
  }

  finish()
}
```
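
The measurement loop in the pseudocode replaces each looked-up value with a
placeholder unless the matching `save_real_*` (or `save_resolver_ip`) option
is set. The following runnable sketch isolates that rule. The helper name
`redactProbeInfo`, the flat return object, and the sample values are
assumptions made for illustration only; they are not part of Measurement Kit,
where `resolver_ip` is stored under `test_keys`.

```JavaScript
// Hypothetical helper (not a Measurement Kit function): applies the
// save_real_* options to the values obtained from the lookups, using the
// same placeholders as the pseudocode ("127.0.0.1", "AS0", "ZZ", "Unknown").
function redactProbeInfo(options, lookedUp) {
  return {
    probe_ip: options.save_real_probe_ip ? lookedUp.probe_ip : "127.0.0.1",
    probe_asn: options.save_real_probe_asn ? lookedUp.probe_asn : "AS0",
    probe_cc: options.save_real_probe_cc ? lookedUp.probe_cc : "ZZ",
    // The network name follows the ASN option, as in the pseudocode.
    probe_network_name: options.save_real_probe_asn
        ? lookedUp.probe_network_name : "Unknown",
    resolver_ip: options.save_resolver_ip
        ? lookedUp.resolver_ip : "127.0.0.1"
  }
}

// Sample values are made up (documentation address and ASN ranges).
const lookedUp = {
  probe_ip: "203.0.113.7",
  probe_asn: "AS64496",
  probe_cc: "IT",
  probe_network_name: "Example Network",
  resolver_ip: "198.51.100.53"
}

// Keep the ASN and the country code, hide the two IP addresses.
const redacted = redactProbeInfo(
  { save_real_probe_asn: true, save_real_probe_cc: true }, lookedUp)
console.log(JSON.stringify(redacted, null, 2))
```

With these options, the sketch keeps the ASN, network name, and country code,
while `probe_ip` and `resolver_ip` fall back to `127.0.0.1`.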