=================
SanitizerCoverage
=================

.. contents::
   :local:

Introduction
============

LLVM has a simple code coverage instrumentation built in (SanitizerCoverage).
It inserts calls to user-defined functions at the function, basic-block, and edge levels.
Default implementations of those callbacks are provided and implement
simple coverage reporting and visualization.
However, if you need *just* coverage visualization, you may want to use
:doc:`SourceBasedCodeCoverage <SourceBasedCodeCoverage>` instead.

Tracing PCs with guards
=======================

With ``-fsanitize-coverage=trace-pc-guard`` the compiler will insert the following code
on every edge:

.. code-block:: none

   __sanitizer_cov_trace_pc_guard(&guard_variable)

Every edge will have its own ``guard_variable`` (``uint32_t``).

The compiler will also insert calls to a module constructor:

.. code-block:: c++

   // The guards are [start, stop).
   // This function will be called at least once per DSO and may be called
   // more than once with the same values of start/stop.
   __sanitizer_cov_trace_pc_guard_init(uint32_t *start, uint32_t *stop);

With an additional ``...=trace-pc,indirect-calls`` flag
``__sanitizer_cov_trace_pc_indirect(void *callee)`` will be inserted on every indirect call.

The functions ``__sanitizer_cov_trace_pc_*`` should be defined by the user.

Example:

.. code-block:: c++

    // trace-pc-guard-cb.cc
    #include <stdint.h>
    #include <stdio.h>
    #include <sanitizer/coverage_interface.h>

    // This callback is inserted by the compiler as a module constructor
    // into every DSO. 'start' and 'stop' correspond to the
    // beginning and end of the section with the guards for the entire
    // binary (executable or DSO). The callback will be called at least
    // once per DSO and may be called multiple times with the same parameters.
    extern "C" void __sanitizer_cov_trace_pc_guard_init(uint32_t *start,
                                                        uint32_t *stop) {
      static uint64_t N;  // Counter for the guards.
      if (start == stop || *start) return;  // Initialize only once.
      printf("INIT: %p %p\n", start, stop);
      for (uint32_t *x = start; x < stop; x++)
        *x = ++N;  // Guards should start from 1.
    }

    // This callback is inserted by the compiler on every edge in the
    // control flow (some optimizations apply).
    // Typically, the compiler will emit the code like this:
    //    if(*guard)
    //      __sanitizer_cov_trace_pc_guard(guard);
    // But for large functions it will emit a simple call:
    //    __sanitizer_cov_trace_pc_guard(guard);
    extern "C" void __sanitizer_cov_trace_pc_guard(uint32_t *guard) {
      if (!*guard) return;  // Duplicate the guard check.
      // If you set *guard to 0 this code will not be called again for this edge.
      // Now you can get the PC and do whatever you want:
      //   store it somewhere or symbolize it and print right away.
      // The values of `*guard` are as you set them in
      // __sanitizer_cov_trace_pc_guard_init and so you can make them consecutive
      // and use them to dereference an array or a bit vector.
      void *PC = __builtin_return_address(0);
      char PcDescr[1024];
      // This function is a part of the sanitizer run-time.
      // To use it, link with AddressSanitizer or other sanitizer.
      __sanitizer_symbolize_pc(PC, "%p %F %L", PcDescr, sizeof(PcDescr));
      printf("guard: %p %x PC %s\n", guard, *guard, PcDescr);
    }

.. code-block:: c++

    // trace-pc-guard-example.cc
    void foo() { }
    int main(int argc, char **argv) {
      if (argc > 1) foo();
    }

.. code-block:: console

    clang++ -g  -fsanitize-coverage=trace-pc-guard trace-pc-guard-example.cc -c
    clang++ trace-pc-guard-cb.cc trace-pc-guard-example.o -fsanitize=address
    ASAN_OPTIONS=strip_path_prefix=`pwd`/ ./a.out

.. code-block:: console

    INIT: 0x71bcd0 0x71bce0
    guard: 0x71bcd4 2 PC 0x4ecd5b in main trace-pc-guard-example.cc:2
    guard: 0x71bcd8 3 PC 0x4ecd9e in main trace-pc-guard-example.cc:3:7

.. code-block:: console

    ASAN_OPTIONS=strip_path_prefix=`pwd`/ ./a.out with-foo


.. code-block:: console

    INIT: 0x71bcd0 0x71bce0
    guard: 0x71bcd4 2 PC 0x4ecd5b in main trace-pc-guard-example.cc:3
    guard: 0x71bcdc 4 PC 0x4ecdc7 in main trace-pc-guard-example.cc:4:17
    guard: 0x71bcd0 1 PC 0x4ecd20 in foo() trace-pc-guard-example.cc:2:14

Inline 8bit-counters
====================

**Experimental, may change or disappear in future**

With ``-fsanitize-coverage=inline-8bit-counters`` the compiler will insert
inline counter increments on every edge.
This is similar to ``-fsanitize-coverage=trace-pc-guard`` but instead of a
callback the instrumentation simply increments a counter.

Users need to implement a single function to capture the counters at startup.

.. code-block:: c++

    extern "C"
    void __sanitizer_cov_8bit_counters_init(char *start, char *end) {
      // [start,end) is the array of 8-bit counters created for the current DSO.
      // Capture this array in order to read/modify the counters.
    }

PC-Table
========

**Experimental, may change or disappear in future**

**Note:** this instrumentation might be incompatible with dead code stripping
(``-Wl,-gc-sections``) for linkers other than LLD, thus resulting in a
significant binary size overhead. For more information, see
`Bug 34636 <https://bugs.llvm.org/show_bug.cgi?id=34636>`_.

With ``-fsanitize-coverage=pc-table`` the compiler will create a table of
instrumented PCs. Requires either ``-fsanitize-coverage=inline-8bit-counters`` or
``-fsanitize-coverage=trace-pc-guard``.

Users need to implement a single function to capture the PC table at startup:

.. code-block:: c++

    extern "C"
    void __sanitizer_cov_pcs_init(const uintptr_t *pcs_beg,
                                  const uintptr_t *pcs_end) {
      // [pcs_beg,pcs_end) is the array of ptr-sized integers representing
      // pairs [PC,PCFlags] for every instrumented block in the current DSO.
      // Capture this array in order to read the PCs and their Flags.
      // The number of PCs and PCFlags for a given DSO is the same as the number
      // of 8-bit counters (-fsanitize-coverage=inline-8bit-counters) or
      // trace_pc_guard callbacks (-fsanitize-coverage=trace-pc-guard).
      // A PCFlags describes the basic block:
      //  * bit0: 1 if the block is the function entry block, 0 otherwise.
    }


Tracing PCs
===========

With ``-fsanitize-coverage=trace-pc`` the compiler will insert
``__sanitizer_cov_trace_pc()`` on every edge.
With an additional ``...=trace-pc,indirect-calls`` flag
``__sanitizer_cov_trace_pc_indirect(void *callee)`` will be inserted on every indirect call.
These callbacks are not implemented in the Sanitizer run-time and should be defined
by the user.
This mechanism is used for fuzzing the Linux kernel
(https://github.com/google/syzkaller).

Instrumentation points
======================

Sanitizer Coverage offers different levels of instrumentation.

* ``edge`` (default): edges are instrumented (see below).
* ``bb``: basic blocks are instrumented.
* ``func``: only the entry block of every function will be instrumented.

Use these flags together with ``trace-pc-guard`` or ``trace-pc``,
like this: ``-fsanitize-coverage=func,trace-pc-guard``.

When ``edge`` or ``bb`` is used, some of the edges/blocks may still be left
uninstrumented (pruned) if such instrumentation is considered redundant.
Use ``no-prune`` (e.g. ``-fsanitize-coverage=bb,no-prune,trace-pc-guard``)
to disable pruning. This could be useful for better coverage visualization.
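
One way to see how the levels above differ on a given program is to count
callback invocations. The following is a minimal sketch, assuming the
``trace-pc`` mode; the file name and the helpers ``CallbackCount`` and
``ReportCallbackCount`` are hypothetical, and the counter is not thread-safe:

.. code-block:: c++

    // count-callbacks.cc (hypothetical file name).
    // Build this file without -fsanitize-coverage, the same way the
    // trace-pc-guard example builds its callback file.
    #include <stdint.h>
    #include <stdio.h>

    static uint64_t NumCallbacks;  // Plain counter; not thread-safe.

    // Called at every instrumentation point when the program is built
    // with -fsanitize-coverage=trace-pc.
    extern "C" void __sanitizer_cov_trace_pc() { ++NumCallbacks; }

    // Hypothetical helper: returns the number of callbacks seen so far.
    uint64_t CallbackCount() { return NumCallbacks; }

    // Hypothetical helper: prints the counter, e.g. just before exit.
    void ReportCallbackCount() {
      fprintf(stderr, "callbacks executed: %llu\n",
              (unsigned long long)CallbackCount());
    }

Building the same program once with ``-fsanitize-coverage=func,trace-pc`` and
once with ``-fsanitize-coverage=bb,no-prune,trace-pc``, and calling
``ReportCallbackCount()`` before exit, gives a rough picture of how many more
instrumentation points the finer level adds.
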


Edge coverage
-------------

Consider this code:

.. code-block:: c++

    void foo(int *a) {
      if (a)
        *a = 0;
    }

It contains 3 basic blocks, let's name them A, B, C:

.. code-block:: none

    A
    |\
    | \
    |  B
    | /
    |/
    C

If blocks A, B, and C are all covered we know for certain that the edges A=>B
and B=>C were executed, but we still don't know if the edge A=>C was executed.
Such edges of the control flow graph are called
`critical <https://en.wikipedia.org/wiki/Control_flow_graph#Special_edges>`_.
The edge-level coverage simply splits all critical edges by introducing new
dummy blocks and then instruments those blocks:

.. code-block:: none

    A
    |\
    | \
    D  B
    | /
    |/
    C

Tracing data flow
=================

Support for data-flow-guided fuzzing.
With ``-fsanitize-coverage=trace-cmp`` the compiler will insert extra instrumentation
around comparison instructions and switch statements.
Similarly, with ``-fsanitize-coverage=trace-div`` the compiler will instrument
integer division instructions (to capture the right argument of division)
and with ``-fsanitize-coverage=trace-gep`` it will instrument
`LLVM GEP instructions <https://llvm.org/docs/GetElementPtr.html>`_
(to capture array indices).

Unless the ``no-prune`` option is provided, some of the comparison instructions
will not be instrumented.

.. code-block:: c++

  // Called before a comparison instruction.
  // Arg1 and Arg2 are arguments of the comparison.
  void __sanitizer_cov_trace_cmp1(uint8_t Arg1, uint8_t Arg2);
  void __sanitizer_cov_trace_cmp2(uint16_t Arg1, uint16_t Arg2);
  void __sanitizer_cov_trace_cmp4(uint32_t Arg1, uint32_t Arg2);
  void __sanitizer_cov_trace_cmp8(uint64_t Arg1, uint64_t Arg2);

  // Called before a comparison instruction if exactly one of the arguments is constant.
  // Arg1 and Arg2 are arguments of the comparison, Arg1 is a compile-time constant.
  // These callbacks are emitted by -fsanitize-coverage=trace-cmp since 2017-08-11
  void __sanitizer_cov_trace_const_cmp1(uint8_t Arg1, uint8_t Arg2);
  void __sanitizer_cov_trace_const_cmp2(uint16_t Arg1, uint16_t Arg2);
  void __sanitizer_cov_trace_const_cmp4(uint32_t Arg1, uint32_t Arg2);
  void __sanitizer_cov_trace_const_cmp8(uint64_t Arg1, uint64_t Arg2);

  // Called before a switch statement.
  // Val is the switch operand.
  // Cases[0] is the number of case constants.
  // Cases[1] is the size of Val in bits.
  // Cases[2:] are the case constants.
  void __sanitizer_cov_trace_switch(uint64_t Val, uint64_t *Cases);

  // Called before a division statement.
  // Val is the second argument of division.
  void __sanitizer_cov_trace_div4(uint32_t Val);
  void __sanitizer_cov_trace_div8(uint64_t Val);

  // Called before a GetElementPtr (GEP) instruction
  // for every non-constant array index.
  void __sanitizer_cov_trace_gep(uintptr_t Idx);

Default implementation
======================

The sanitizer run-time (AddressSanitizer, MemorySanitizer, etc.) provides
default implementations of some of the coverage callbacks.
You may use these implementations to dump the coverage to disk at process
exit.

Example:

.. code-block:: console

    % cat -n cov.cc
         1  #include <stdio.h>
         2  __attribute__((noinline))
         3  void foo() { printf("foo\n"); }
         4
         5  int main(int argc, char **argv) {
         6    if (argc == 2)
         7      foo();
         8    printf("main\n");
         9  }
    % clang++ -g cov.cc -fsanitize=address -fsanitize-coverage=trace-pc-guard
    % ASAN_OPTIONS=coverage=1 ./a.out; wc -c *.sancov
    main
    SanitizerCoverage: ./a.out.7312.sancov 2 PCs written
    24 a.out.7312.sancov
    % ASAN_OPTIONS=coverage=1 ./a.out foo ; wc -c *.sancov
    foo
    main
    SanitizerCoverage: ./a.out.7316.sancov 3 PCs written
    24 a.out.7312.sancov
    32 a.out.7316.sancov

Every time you run an executable instrumented with SanitizerCoverage,
one ``*.sancov`` file is created during process shutdown.
If the executable is dynamically linked against instrumented DSOs,
one ``*.sancov`` file will also be created for every DSO.

Sancov data format
------------------

The format of ``*.sancov`` files is very simple: the first 8 bytes are the magic,
either ``0xC0BFFFFFFFFFFF64`` or ``0xC0BFFFFFFFFFFF32``. The last byte of the
magic defines the size of the following offsets (``0x64`` for 64-bit offsets,
``0x32`` for 32-bit offsets). The rest of the data is the
offsets in the corresponding binary/DSO that were executed during the run.

Sancov Tool
-----------

A simple ``sancov`` tool is provided to process coverage files.
The tool is part of the LLVM project and is currently supported only on Linux.
It can handle symbolization tasks autonomously without any extra support
from the environment. You need to pass ``.sancov`` files (named
``<module_name>.<pid>.sancov``) and paths to all corresponding binary ELF files.
Sancov matches these files using module names and binary file names.

.. code-block:: console

    USAGE: sancov [options] <action> (<binary file>|<.sancov file>)...

    Action (required)
      -print                    - Print coverage addresses
      -covered-functions        - Print all covered functions.
      -not-covered-functions    - Print all not covered functions.
      -symbolize                - Symbolizes the report.

    Options
      -blacklist=<string>         - Blacklist file (sanitizer blacklist format).
      -demangle                   - Print demangled function name.
      -strip_path_prefix=<string> - Strip this prefix from file paths in reports


Coverage Reports
----------------

**Experimental**

``.sancov`` files do not contain enough information to generate a source-level
coverage report. The missing information is contained
in the debug info of the binary. Thus the ``.sancov`` file has to be symbolized
to produce a ``.symcov`` file first:

.. code-block:: console

    sancov -symbolize my_program.123.sancov my_program > my_program.123.symcov

The ``.symcov`` file can be browsed overlaid on the source code by
running the ``tools/sancov/coverage-report-server.py`` script, which starts
an HTTP server.

Output directory
----------------

By default, ``.sancov`` files are created in the current working directory.
This can be changed with ``ASAN_OPTIONS=coverage_dir=/path``:

.. code-block:: console

    % ASAN_OPTIONS="coverage=1:coverage_dir=/tmp/cov" ./a.out foo
    % ls -l /tmp/cov/*sancov
    -rw-r----- 1 kcc eng 4 Nov 27 12:21 a.out.22673.sancov
    -rw-r----- 1 kcc eng 8 Nov 27 12:21 a.out.22679.sancov
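
The ``.sancov`` files written to the coverage directory can also be read
without the ``sancov`` tool. The following is a minimal sketch of a reader for
the layout given in the Sancov data format section. It assumes the file
contents are already in memory and were written on a machine with the same
byte order as the reader; ``ParseSancov`` is a hypothetical helper, not part of
the sanitizer run-time or of ``sancov``:

.. code-block:: c++

    #include <stddef.h>
    #include <stdint.h>
    #include <string.h>
    #include <vector>

    // Hypothetical helper: decodes the bytes of a .sancov file into offsets.
    // Returns false if the magic is unknown or the payload size is not a
    // multiple of the offset size.
    // Assumption: the data is in the reader's native byte order.
    bool ParseSancov(const uint8_t *data, size_t size,
                     std::vector<uint64_t> *offsets) {
      const uint64_t kMagic64 = 0xC0BFFFFFFFFFFF64ULL;
      const uint64_t kMagic32 = 0xC0BFFFFFFFFFFF32ULL;
      if (size < sizeof(uint64_t)) return false;

      uint64_t magic;
      memcpy(&magic, data, sizeof(magic));
      size_t width;  // Size in bytes of each offset that follows the magic.
      if (magic == kMagic64)
        width = 8;
      else if (magic == kMagic32)
        width = 4;
      else
        return false;

      if ((size - sizeof(magic)) % width != 0) return false;

      offsets->clear();
      for (size_t pos = sizeof(magic); pos < size; pos += width) {
        if (width == 8) {
          uint64_t v;
          memcpy(&v, data + pos, sizeof(v));
          offsets->push_back(v);
        } else {
          uint32_t v;
          memcpy(&v, data + pos, sizeof(v));
          offsets->push_back(v);
        }
      }
      return true;
    }

Under these assumptions, a 24-byte file with the 64-bit magic decodes to two
offsets, which matches the "2 PCs written" run in the Default implementation
example.
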