=======================================================
libFuzzer – a library for coverage-guided fuzz testing.
=======================================================
.. contents::
   :local:
   :depth: 1

Introduction
============

LibFuzzer is an in-process, coverage-guided, evolutionary fuzzing engine.

LibFuzzer is linked with the library under test, and feeds fuzzed inputs to the
library via a specific fuzzing entrypoint (aka "target function"); the fuzzer
then tracks which areas of the code are reached, and generates mutations on the
corpus of input data in order to maximize the code coverage.
The code coverage
information for libFuzzer is provided by LLVM's SanitizerCoverage_
instrumentation.

Contact: libfuzzer(#)googlegroups.com

Versions
========

LibFuzzer is under active development, so you will need the current
(or at least a very recent) version of the Clang compiler (see `building Clang from trunk`_).

Refer to https://releases.llvm.org/5.0.0/docs/LibFuzzer.html for documentation on the older version.


Getting Started
===============

.. contents::
   :local:
   :depth: 1

Fuzz Target
-----------

The first step in using libFuzzer on a library is to implement a
*fuzz target* -- a function that accepts an array of bytes and
does something interesting with these bytes using the API under test.
Like this:

.. code-block:: c++

  // fuzz_target.cc
  extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
    DoSomethingInterestingWithMyAPI(Data, Size);
    return 0;  // Non-zero return values are reserved for future use.
  }

Note that this fuzz target does not depend on libFuzzer in any way,
so it is possible and even desirable to use it with other fuzzing engines,
e.g. AFL_ and/or Radamsa_.
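
Because the target is an ordinary function, it can also be called without any
fuzzing engine at all. The following is a minimal sketch of that idea, not part
of libFuzzer: ``ParseKeyValue`` stands in for a hypothetical API under test, and
``RunTargetOnFile`` is a hypothetical helper that replays one saved input
through the target.

.. code-block:: c++

  #include <stddef.h>
  #include <stdint.h>

  #include <fstream>
  #include <iterator>
  #include <string>
  #include <vector>

  // Hypothetical API under test: accepts text of the form "key=value".
  static bool ParseKeyValue(const std::string &Text) {
    const std::size_t Eq = Text.find('=');
    return Eq != std::string::npos && Eq > 0 && Eq + 1 < Text.size();
  }

  // The fuzz target. It uses no libFuzzer headers or symbols.
  extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
    std::string Text;
    if (Size > 0)  // Data may be null when the input is empty.
      Text.assign(reinterpret_cast<const char *>(Data), Size);
    ParseKeyValue(Text);
    return 0;
  }

  // Hypothetical replay helper: feeds the bytes of one file to the target.
  // Returns -1 if the file cannot be opened, else the target's return value.
  int RunTargetOnFile(const char *Path) {
    std::ifstream In(Path, std::ios::binary);
    if (!In)
      return -1;
    const std::vector<uint8_t> Bytes((std::istreambuf_iterator<char>(In)),
                                     std::istreambuf_iterator<char>());
    return LLVMFuzzerTestOneInput(Bytes.data(), Bytes.size());
  }

A driver for another engine, or a plain regression test, could call
``RunTargetOnFile`` once per saved input.
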

Some important things to remember about fuzz targets:

* The fuzzing engine will execute the fuzz target many times with different inputs in the same process.
* It must tolerate any kind of input (empty, huge, malformed, etc).
* It must not ``exit()`` on any input.
* It may use threads but ideally all threads should be joined at the end of the function.
* It must be as deterministic as possible. Non-determinism (e.g. random decisions not based on the input bytes) will make fuzzing inefficient.
* It must be fast. Try avoiding cubic or greater complexity, logging, or excessive memory consumption.
* Ideally, it should not modify any global state (although that's not strict).
* Usually, the narrower the target the better. E.g. if your target can parse several data formats, split it into several targets, one per format.


Fuzzer Usage
------------

Recent versions of Clang (starting from 6.0) include libFuzzer, and no extra installation is necessary.

In order to build your fuzzer binary, use the ``-fsanitize=fuzzer`` flag during the
compilation and linking. In most cases you may want to combine libFuzzer with
AddressSanitizer_ (ASAN), UndefinedBehaviorSanitizer_ (UBSAN), or both. You can
also build with MemorySanitizer_ (MSAN), but support is experimental::

   clang -g -O1 -fsanitize=fuzzer                         mytarget.c # Builds the fuzz target w/o sanitizers
   clang -g -O1 -fsanitize=fuzzer,address                 mytarget.c # Builds the fuzz target with ASAN
   clang -g -O1 -fsanitize=fuzzer,signed-integer-overflow mytarget.c # Builds the fuzz target with a part of UBSAN
   clang -g -O1 -fsanitize=fuzzer,memory                  mytarget.c # Builds the fuzz target with MSAN

This will perform the necessary instrumentation, as well as linking with the libFuzzer library.
Note that ``-fsanitize=fuzzer`` links in the libFuzzer's ``main()`` symbol.

If modifying ``CFLAGS`` of a large project, which also compiles executables
requiring their own ``main`` symbol, it may be desirable to request just the
instrumentation without linking::

   clang -fsanitize=fuzzer-no-link mytarget.c

Then libFuzzer can be linked to the desired driver by passing in
``-fsanitize=fuzzer`` during the linking stage.

.. _libfuzzer-corpus:

Corpus
------

Coverage-guided fuzzers like libFuzzer rely on a corpus of sample inputs for the
code under test. This corpus should ideally be seeded with a varied collection
of valid and invalid inputs for the code under test; for example, for a graphics
library the initial corpus might hold a variety of different small PNG/JPG/GIF
files. The fuzzer generates random mutations based around the sample inputs in
the current corpus. If a mutation triggers execution of a previously-uncovered
path in the code under test, then that mutation is saved to the corpus for
future variations.

LibFuzzer will work without any initial seeds, but will be less
efficient if the library under test accepts complex,
structured inputs.

The corpus can also act as a sanity/regression check, to confirm that the
fuzzing entrypoint still works and that all of the sample inputs run through
the code under test without problems.

If you have a large corpus (either generated by fuzzing or acquired by other means)
you may want to minimize it while still preserving the full coverage. One way to do that
is to use the ``-merge=1`` flag:

.. code-block:: console

  mkdir NEW_CORPUS_DIR  # Store minimized corpus here.
  ./my_fuzzer -merge=1 NEW_CORPUS_DIR FULL_CORPUS_DIR

You may use the same flag to add more interesting items to an existing corpus.
Only the inputs that trigger new coverage will be added to the first corpus.

.. code-block:: console

  ./my_fuzzer -merge=1 CURRENT_CORPUS_DIR NEW_POTENTIALLY_INTERESTING_INPUTS_DIR

Running
-------

To run the fuzzer, first create a Corpus_ directory that holds the
initial "seed" sample inputs:

.. code-block:: console

  mkdir CORPUS_DIR
  cp /some/input/samples/* CORPUS_DIR

Then run the fuzzer on the corpus directory:

.. code-block:: console

  ./my_fuzzer CORPUS_DIR  # -max_len=1000 -jobs=20 ...

As the fuzzer discovers new interesting test cases (i.e. test cases that
trigger coverage of new paths through the code under test), those test cases
will be added to the corpus directory.

By default, the fuzzing process will continue indefinitely – at least until
a bug is found. Any crashes or sanitizer failures will be reported as usual,
stopping the fuzzing process, and the particular input that triggered the bug
will be written to disk (typically as ``crash-<sha1>``, ``leak-<sha1>``,
or ``timeout-<sha1>``).


Parallel Fuzzing
----------------

Each libFuzzer process is single-threaded, unless the library under test starts
its own threads. However, it is possible to run multiple libFuzzer processes in
parallel with a shared corpus directory; this has the advantage that any new
inputs found by one fuzzer process will be available to the other fuzzer
processes (unless you disable this with the ``-reload=0`` option).

This is primarily controlled by the ``-jobs=N`` option, which indicates
that ``N`` fuzzing jobs should be run to completion (i.e. until a bug is found or
time/iteration limits are reached). These jobs will be run across a set of
worker processes, by default using half of the available CPU cores; the count of
worker processes can be overridden by the ``-workers=N`` option.
For example,
running with ``-jobs=30`` on a 12-core machine would run 6 workers by default,
with each worker handling 5 jobs on average (and so reporting about 5 bugs)
by completion of the entire process.

Fork mode
---------

**Experimental** mode ``-fork=N`` (where ``N`` is the number of parallel jobs)
enables oom-, timeout-, and crash-resistant
fuzzing with separate processes (using ``fork-exec``, not just ``fork``).

The top libFuzzer process will not do any fuzzing itself, but will
spawn up to ``N`` concurrent child processes, providing them with
small random subsets of the corpus. After a child exits, the top process
merges the corpus generated by the child back to the main corpus.

Related flags:

``-ignore_ooms``
  True by default. If an OOM happens during fuzzing in one of the child processes,
  the reproducer is saved on disk, and fuzzing continues.
``-ignore_timeouts``
  True by default, same as ``-ignore_ooms``, but for timeouts.
``-ignore_crashes``
  False by default, same as ``-ignore_ooms``, but for all other crashes.

The plan is to eventually replace ``-jobs=N`` and ``-workers=N`` with ``-fork=N``.

Resuming merge
--------------

Merging large corpora may be time consuming, and it is often desirable to do it
on preemptable VMs, where the process may be killed at any time.
In order to seamlessly resume the merge, use the ``-merge_control_file`` flag
and use ``killall -SIGUSR1 /path/to/fuzzer/binary`` to stop the merge gracefully. Example:

.. code-block:: console

  % rm -f SomeLocalPath
  % ./my_fuzzer CORPUS1 CORPUS2 -merge=1 -merge_control_file=SomeLocalPath
  ...
  MERGE-INNER: using the control file 'SomeLocalPath'
  ...
  # While this is running, do `killall -SIGUSR1 my_fuzzer` in another console
  ==9015== INFO: libFuzzer: exiting as requested

  # This will leave the file SomeLocalPath with the partial state of the merge.
  # Now, you can continue the merge by executing the same command. The merge
  # will continue from where it has been interrupted.
  % ./my_fuzzer CORPUS1 CORPUS2 -merge=1 -merge_control_file=SomeLocalPath
  ...
  MERGE-OUTER: non-empty control file provided: 'SomeLocalPath'
  MERGE-OUTER: control file ok, 32 files total, first not processed file 20
  ...

Options
=======

To run the fuzzer, pass zero or more corpus directories as command line
arguments. The fuzzer will read test inputs from each of these corpus
directories, and any new test inputs that are generated will be written
back to the first corpus directory:

.. code-block:: console

  ./fuzzer [-flag1=val1 [-flag2=val2 ...] ] [dir1 [dir2 ...] ]

If a list of files (rather than directories) is passed to the fuzzer program,
then it will re-run those files as test inputs but will not perform any fuzzing.
In this mode the fuzzer binary can be used as a regression test (e.g. on a
continuous integration system) to check that the target function and saved inputs
still work.

The most important command line options are:

``-help``
  Print help message (``-help=1``).
``-seed``
  Random seed. If 0 (the default), the seed is generated.
``-runs``
  Number of individual test runs, -1 (the default) to run indefinitely.
``-max_len``
  Maximum length of a test input. If 0 (the default), libFuzzer tries to guess
  a good value based on the corpus (and reports it).
``-len_control``
  Try generating small inputs first, then try larger inputs over time.
  Specifies the rate at which the length limit is increased (smaller == faster).
  Default is 100. If 0, immediately try inputs with size up to max_len.
``-timeout``
  Timeout in seconds, default 1200. If an input takes longer than this timeout,
  the process is treated as a failure case.
``-rss_limit_mb``
  Memory usage limit in Mb, default 2048.
  Use 0 to disable the limit.
  If an input requires more than this amount of RSS memory to execute,
  the process is treated as a failure case.
  The limit is checked in a separate thread every second.
  If running w/o ASAN/MSAN, you may use 'ulimit -v' instead.
``-malloc_limit_mb``
  If non-zero, the fuzzer will exit if the target tries to allocate this
  number of Mb with one malloc call.
  If zero (default) same limit as rss_limit_mb is applied.
``-timeout_exitcode``
  Exit code (default 77) used if libFuzzer reports a timeout.
``-error_exitcode``
  Exit code (default 77) used if libFuzzer itself (not a sanitizer) reports a bug (leak, OOM, etc).
``-max_total_time``
  If positive, indicates the maximum total time in seconds to run the fuzzer.
  If 0 (the default), run indefinitely.
``-merge``
  If set to 1, any corpus inputs from the 2nd, 3rd etc. corpus directories
  that trigger new code coverage will be merged into the first corpus
  directory. Defaults to 0. This flag can be used to minimize a corpus.
``-merge_control_file``
  Specify a control file used for the merge process.
  If a merge process gets killed it tries to leave this file in a state
  suitable for resuming the merge. By default a temporary file will be used.
``-minimize_crash``
  If 1, minimizes the provided crash input.
  Use with -runs=N or -max_total_time=N to limit the number of attempts.
``-reload``
  If set to 1 (the default), the corpus directory is re-read periodically to
  check for new inputs; this allows detection of new inputs that were discovered
  by other fuzzing processes.
``-jobs``
  Number of fuzzing jobs to run to completion. Default value is 0, which runs a
  single fuzzing process until completion.
  If the value is >= 1, then this
  number of jobs performing fuzzing are run, in a collection of parallel
  separate worker processes; each such worker process has its
  ``stdout``/``stderr`` redirected to ``fuzz-<JOB>.log``.
``-workers``
  Number of simultaneous worker processes to run the fuzzing jobs to completion
  in. If 0 (the default), ``min(jobs, NumberOfCpuCores()/2)`` is used.
``-dict``
  Provide a dictionary of input keywords; see Dictionaries_.
``-use_counters``
  Use `coverage counters`_ to generate approximate counts of how often code
  blocks are hit; defaults to 1.
``-reduce_inputs``
  Try to reduce the size of inputs while preserving their full feature sets;
  defaults to 1.
``-use_value_profile``
  Use `value profile`_ to guide corpus expansion; defaults to 0.
``-only_ascii``
  If 1, generate only ASCII (``isprint``+``isspace``) inputs. Defaults to 0.
``-artifact_prefix``
  Provide a prefix to use when saving fuzzing artifacts (crash, timeout, or
  slow inputs) as ``$(artifact_prefix)file``. Defaults to empty.
``-exact_artifact_path``
  Ignored if empty (the default). If non-empty, write the single artifact on
  failure (crash, timeout) as ``$(exact_artifact_path)``. This overrides
  ``-artifact_prefix`` and will not use checksum in the file name. Do not use
  the same path for several parallel processes.
``-print_pcs``
  If 1, print out newly covered PCs. Defaults to 0.
``-print_final_stats``
  If 1, print statistics at exit. Defaults to 0.
``-detect_leaks``
  If 1 (default) and if LeakSanitizer is enabled
  try to detect memory leaks during fuzzing (i.e. not only at shut down).
``-close_fd_mask``
  Indicate output streams to close at startup. Be careful, this will
  remove diagnostic output from target code (e.g. messages on assert failure).

   - 0 (default): close neither ``stdout`` nor ``stderr``
   - 1 : close ``stdout``
   - 2 : close ``stderr``
   - 3 : close both ``stdout`` and ``stderr``.

For the full list of flags run the fuzzer binary with ``-help=1``.

Output
======

During operation the fuzzer prints information to ``stderr``, for example::

  INFO: Seed: 1523017872
  INFO: Loaded 1 modules (16 guards): [0x744e60, 0x744ea0),
  INFO: -max_len is not provided, using 64
  INFO: A corpus is not provided, starting from an empty corpus
  #0      READ units: 1
  #1      INITED cov: 3 ft: 2 corp: 1/1b exec/s: 0 rss: 24Mb
  #3811   NEW    cov: 4 ft: 3 corp: 2/2b exec/s: 0 rss: 25Mb L: 1 MS: 5 ChangeBit-ChangeByte-ChangeBit-ShuffleBytes-ChangeByte-
  #3827   NEW    cov: 5 ft: 4 corp: 3/4b exec/s: 0 rss: 25Mb L: 2 MS: 1 CopyPart-
  #3963   NEW    cov: 6 ft: 5 corp: 4/6b exec/s: 0 rss: 25Mb L: 2 MS: 2 ShuffleBytes-ChangeBit-
  #4167   NEW    cov: 7 ft: 6 corp: 5/9b exec/s: 0 rss: 25Mb L: 3 MS: 1 InsertByte-
  ...

The early parts of the output include information about the fuzzer options and
configuration, including the current random seed (in the ``Seed:`` line; this
can be overridden with the ``-seed=N`` flag).

Further output lines have the form of an event code and statistics. The
possible event codes are:

``READ``
  The fuzzer has read in all of the provided input samples from the corpus
  directories.
``INITED``
  The fuzzer has completed initialization, which includes running each of
  the initial input samples through the code under test.
``NEW``
  The fuzzer has created a test input that covers new areas of the code
  under test. This input will be saved to the primary corpus directory.
``REDUCE``
  The fuzzer has found a better (smaller) input that triggers previously
  discovered features (set ``-reduce_inputs=0`` to disable).
``pulse``
  The fuzzer has generated 2\ :sup:`n` inputs (generated periodically to reassure
  the user that the fuzzer is still working).
``DONE``
  The fuzzer has completed operation because it has reached the specified
  iteration limit (``-runs``) or time limit (``-max_total_time``).
``RELOAD``
  The fuzzer is performing a periodic reload of inputs from the corpus
  directory; this allows it to discover any inputs discovered by other
  fuzzer processes (see `Parallel Fuzzing`_).

Each output line also reports the following statistics (when non-zero):

``cov:``
  Total number of code blocks or edges covered by executing the current corpus.
``ft:``
  libFuzzer uses different signals to evaluate the code coverage:
  edge coverage, edge counters, value profiles, indirect caller/callee pairs, etc.
  These signals combined are called *features* (``ft:``).
``corp:``
  Number of entries in the current in-memory test corpus and its size in bytes.
``lim:``
  Current limit on the length of new entries in the corpus. Increases over time
  until the max length (``-max_len``) is reached.
``exec/s:``
  Number of fuzzer iterations per second.
``rss:``
  Current memory consumption.

For ``NEW`` and ``REDUCE`` events, the output line also includes information
about the mutation operation that produced the new input:

``L:``
  Size of the new input in bytes.
``MS: <n> <operations>``
  Count and list of the mutation operations used to generate the input.


Examples
========
.. contents::
   :local:
   :depth: 1

Toy example
-----------

A simple function that does something interesting if it receives the input
"HI!"::

  cat << EOF > test_fuzzer.cc
  #include <stdint.h>
  #include <stddef.h>
  extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    if (size > 0 && data[0] == 'H')
      if (size > 1 && data[1] == 'I')
         if (size > 2 && data[2] == '!')
         __builtin_trap();
    return 0;
  }
  EOF
  # Build test_fuzzer.cc with asan and link against libFuzzer.
  clang++ -fsanitize=address,fuzzer test_fuzzer.cc
  # Run the fuzzer with no corpus.
  ./a.out

You should get an error pretty quickly::

  INFO: Seed: 1523017872
  INFO: Loaded 1 modules (16 guards): [0x744e60, 0x744ea0),
  INFO: -max_len is not provided, using 64
  INFO: A corpus is not provided, starting from an empty corpus
  #0      READ units: 1
  #1      INITED cov: 3 ft: 2 corp: 1/1b exec/s: 0 rss: 24Mb
  #3811   NEW    cov: 4 ft: 3 corp: 2/2b exec/s: 0 rss: 25Mb L: 1 MS: 5 ChangeBit-ChangeByte-ChangeBit-ShuffleBytes-ChangeByte-
  #3827   NEW    cov: 5 ft: 4 corp: 3/4b exec/s: 0 rss: 25Mb L: 2 MS: 1 CopyPart-
  #3963   NEW    cov: 6 ft: 5 corp: 4/6b exec/s: 0 rss: 25Mb L: 2 MS: 2 ShuffleBytes-ChangeBit-
  #4167   NEW    cov: 7 ft: 6 corp: 5/9b exec/s: 0 rss: 25Mb L: 3 MS: 1 InsertByte-
  ==31511== ERROR: libFuzzer: deadly signal
  ...
  artifact_prefix='./'; Test unit written to ./crash-b13e8756b13a00cf168300179061fb4b91fefbed


More examples
-------------

Examples of real-life fuzz targets and the bugs they find can be found
at http://tutorial.libfuzzer.info. Among other things you can learn how
to detect Heartbleed_ in one second.


Advanced features
=================
.. contents::
   :local:
   :depth: 1

Dictionaries
------------
LibFuzzer supports user-supplied dictionaries with input language keywords
or other interesting byte sequences (e.g. multi-byte magic values).
Use ``-dict=DICTIONARY_FILE``. For some input languages using a dictionary
may significantly improve the search speed.
The dictionary syntax is similar to that used by AFL_ for its ``-x`` option::

  # Lines starting with '#' and empty lines are ignored.

  # Adds "blah" (w/o quotes) to the dictionary.
  kw1="blah"
  # Use \\ for backslash and \" for quotes.
  kw2="\"ac\\dc\""
  # Use \xAB for hex values
  kw3="\xF7\xF8"
  # the name of the keyword followed by '=' may be omitted:
  "foo\x0Abar"



Tracing CMP instructions
------------------------

With an additional compiler flag ``-fsanitize-coverage=trace-cmp``
(on by default as part of ``-fsanitize=fuzzer``, see SanitizerCoverageTraceDataFlow_)
libFuzzer will intercept CMP instructions and guide mutations based
on the arguments of intercepted CMP instructions. This may slow down
the fuzzing but is very likely to improve the results.

Value Profile
-------------

With ``-fsanitize-coverage=trace-cmp`` (default with ``-fsanitize=fuzzer``)
and extra run-time flag ``-use_value_profile=1`` the fuzzer will
collect value profiles for the parameters of compare instructions
and treat some new values as new coverage.

The current implementation does roughly the following:

* The compiler instruments all CMP instructions with a callback that receives both CMP arguments.
* The callback computes ``(caller_pc&4095) | (popcnt(Arg1 ^ Arg2) << 12)`` and uses this value to set a bit in a bitset.
* Every new observed bit in the bitset is treated as new coverage.


This feature has the potential to discover many interesting inputs,
but there are two downsides.
First, the extra instrumentation may bring up to 2x additional slowdown.
Second, the corpus may grow by several times.
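
The bit computation described above can be sketched in a few lines. This is
only an illustration of the formula, not libFuzzer's actual code: the names
``RecordCmp``, ``PopCount``, ``ValueProfile`` and the bitset size are
assumptions made for this example.

.. code-block:: c++

  #include <bitset>
  #include <cstddef>
  #include <cstdint>

  // 12 bits of PC plus a popcount in [0, 64] shifted left by 12 fits in 19 bits.
  // (Size chosen for this sketch only.)
  constexpr std::size_t kValueProfileBits = std::size_t{1} << 19;
  static std::bitset<kValueProfileBits> ValueProfile;

  // Portable population count: number of set bits in X.
  static unsigned PopCount(std::uint64_t X) {
    unsigned N = 0;
    for (; X != 0; X &= X - 1)  // Clears the lowest set bit each iteration.
      ++N;
    return N;
  }

  // Hypothetical CMP callback. Returns true if this comparison set a bit that
  // had not been observed before, i.e. it would count as new coverage.
  bool RecordCmp(std::uintptr_t CallerPC, std::uint64_t Arg1, std::uint64_t Arg2) {
    const std::size_t Idx =
        (static_cast<std::size_t>(CallerPC) & 4095) |
        (static_cast<std::size_t>(PopCount(Arg1 ^ Arg2)) << 12);
    const bool IsNew = !ValueProfile.test(Idx);
    ValueProfile.set(Idx);
    return IsNew;
  }

Note that the index depends only on how many bits differ between the two
operands, so a mutation that brings one operand "closer" to the other (fewer
differing bits) at the same comparison site registers as a new feature.
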

Fuzzer-friendly build mode
---------------------------
Sometimes the code under test is not fuzzing-friendly. Examples:

  - The target code uses a PRNG seeded e.g. by system time and
    thus two consecutive invocations may potentially execute different code paths
    even if the end result will be the same. This will cause a fuzzer to treat
    two similar inputs as significantly different and it will blow up the test corpus.
    E.g. libxml uses ``rand()`` inside its hash table.
  - The target code uses checksums to protect from invalid inputs.
    E.g. png checks CRC for every chunk.

In many cases it makes sense to build a special fuzzing-friendly build
with certain fuzzing-unfriendly features disabled. We propose to use a common build macro
for all such cases for consistency: ``FUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION``.

.. code-block:: c++

  void MyInitPRNG() {
  #ifdef FUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION
    // In fuzzing mode the behavior of the code should be deterministic.
    srand(0);
  #else
    srand(time(0));
  #endif
  }



AFL compatibility
-----------------
LibFuzzer can be used together with AFL_ on the same test corpus.
Both fuzzers expect the test corpus to reside in a directory, one file per input.
You can run both fuzzers on the same corpus, one after another:

.. code-block:: console

  ./afl-fuzz -i testcase_dir -o findings_dir /path/to/program @@
  ./llvm-fuzz testcase_dir findings_dir  # Will write new tests to testcase_dir

Periodically restart both fuzzers so that they can use each other's findings.
Currently, there is no simple way to run both fuzzing engines in parallel while sharing the same corpus dir.

You may also use AFL on your target function ``LLVMFuzzerTestOneInput``:
see an example `here <https://github.com/llvm/llvm-project/tree/main/compiler-rt/lib/fuzzer/afl>`__.

How good is my fuzzer?
----------------------

Once you implement your target function ``LLVMFuzzerTestOneInput`` and fuzz it to death,
you will want to know whether the function or the corpus can be improved further.
One easy-to-use metric is, of course, code coverage.

We recommend using
`Clang Coverage <https://clang.llvm.org/docs/SourceBasedCodeCoverage.html>`_
to visualize and study your code coverage
(`example <https://github.com/google/fuzzer-test-suite/blob/master/tutorial/libFuzzerTutorial.md#visualizing-coverage>`_).


User-supplied mutators
----------------------

LibFuzzer allows the use of custom (user-supplied) mutators; see
`Structure-Aware Fuzzing <https://github.com/google/fuzzing/blob/master/docs/structure-aware-fuzzing.md>`_
for more details.

Startup initialization
----------------------
If the library being tested needs to be initialized, there are several options.

The simplest way is to have a statically initialized global object inside
``LLVMFuzzerTestOneInput`` (or in global scope if that works for you):

.. code-block:: c++

  extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
    static bool Initialized = DoInitialization();
    ...

Alternatively, you may define an optional init function and it will receive
the program arguments that you can read and modify. Do this **only** if you
really need to access ``argv``/``argc``.

.. code-block:: c++

   extern "C" int LLVMFuzzerInitialize(int *argc, char ***argv) {
    ReadAndMaybeModify(argc, argv);
    return 0;
   }

Using libFuzzer as a library
----------------------------
If the code being fuzzed must provide its own ``main``, it's possible to
invoke libFuzzer as a library. Be sure to pass ``-fsanitize=fuzzer-no-link``
during compilation, and link your binary against the no-main version of
libFuzzer. On Linux installations, this is typically located at:

.. code-block:: bash

  /usr/lib/<llvm-version>/lib/clang/<clang-version>/lib/linux/libclang_rt.fuzzer_no_main-<architecture>.a

If building libFuzzer from source, this is located at the following path
in the build output directory:

.. code-block:: bash

  lib/linux/libclang_rt.fuzzer_no_main-<architecture>.a

From here, the code can do whatever setup it requires, and when it's ready
to start fuzzing, it can call ``LLVMFuzzerRunDriver``, passing in the program
arguments and a callback. This callback is invoked just like
``LLVMFuzzerTestOneInput``, and has the same signature.

.. code-block:: c++

  extern "C" int LLVMFuzzerRunDriver(int *argc, char ***argv,
                    int (*UserCb)(const uint8_t *Data, size_t Size));



Leaks
-----

Binaries built with AddressSanitizer_ or LeakSanitizer_ will try to detect
memory leaks at the process shutdown.
For in-process fuzzing this is inconvenient
since the fuzzer needs to report a leak with a reproducer as soon as the leaky
mutation is found. However, running full leak detection after every mutation
is expensive.

By default (``-detect_leaks=1``) libFuzzer will count the number of
``malloc`` and ``free`` calls when executing every mutation.
If the numbers don't match (which by itself doesn't mean there is a leak)
libFuzzer will invoke the more expensive LeakSanitizer_
pass and if the actual leak is found, it will be reported with the reproducer
and the process will exit.

If your target has massive leaks and the leak detection is disabled
you will eventually run out of RAM (see the ``-rss_limit_mb`` flag).


Developing libFuzzer
====================

LibFuzzer is built as a part of the LLVM project by default on macOS and Linux.
Users of other operating systems can explicitly request compilation using
the ``-DCOMPILER_RT_BUILD_LIBFUZZER=ON`` flag.
Tests are run using the ``check-fuzzer`` target from a build directory
which was configured with the ``-DCOMPILER_RT_INCLUDE_TESTS=ON`` flag.

.. code-block:: console

    ninja check-fuzzer


FAQ
=========================

Q. Why doesn't libFuzzer use any of the LLVM support?
-----------------------------------------------------

There are two reasons.

First, we want this library to be used outside of LLVM without users having to
build the rest of LLVM. This may sound unconvincing for many LLVM folks,
but in practice the need for building the whole LLVM frightens many potential
users -- and we want more users to use this code.

Second, there is a subtle technical reason not to rely on the rest of LLVM, or
any other large body of code (maybe not even STL). When coverage instrumentation
is enabled, it will also instrument the LLVM support code which will blow up the
coverage set of the process (since the fuzzer is in-process). In other words, by
using more external dependencies we will slow down the fuzzer while the main
reason for it to exist is extreme speed.

Q. Does libFuzzer Support Windows?
------------------------------------------------------------------------------------

Yes, libFuzzer now supports Windows. Initial support was added in r341082.
Any build of Clang 9 supports it. You can download a build of Clang for Windows
that has libFuzzer from
`LLVM Snapshot Builds <https://llvm.org/builds/>`_.

Using libFuzzer on Windows without ASAN is unsupported. Building fuzzers with the
``/MD`` (dynamic runtime library) compile option is unsupported. Support for these
may be added in the future. Linking fuzzers with the ``/INCREMENTAL`` link option
(or the ``/DEBUG`` option which implies it) is also unsupported.

Send any questions or comments to the mailing list: libfuzzer(#)googlegroups.com

Q. When is libFuzzer not a good solution for a problem?
---------------------------------------------------------

* If the test inputs are validated by the target library and the validator
  asserts/crashes on invalid inputs, in-process fuzzing is not applicable.
* Bugs in the target library may accumulate without being detected. E.g. a memory
  corruption that goes undetected at first and then leads to a crash while
  testing another input. This is why it is highly recommended to run this
  in-process fuzzer with all sanitizers to detect most bugs on the spot.
* It is harder to protect the in-process fuzzer from excessive memory
  consumption and infinite loops in the target library (still possible).
* The target library should not have significant global state that is not
  reset between the runs.
* Many interesting target libraries are not designed in a way that supports
  the in-process fuzzer interface (e.g. require a file path instead of a
  byte array).
* If a single test run takes a considerable fraction of a second (or
  more) the speed benefit from the in-process fuzzer is negligible.
* If the target library runs persistent threads (that outlive
  execution of one test) the fuzzing results will be unreliable.

Q. So, what exactly is this fuzzer good for?
--------------------------------------------

This fuzzer might be a good choice for testing libraries that have relatively
small inputs, where each input takes < 10ms to run and the library code is not expected
to crash on invalid inputs.
Examples: regular expression matchers, text or binary format parsers, compression,
network, crypto.

Q. LibFuzzer crashes on my complicated fuzz target (but works fine for me on smaller targets).
----------------------------------------------------------------------------------------------

Check if your fuzz target uses ``dlclose``.
Currently, libFuzzer doesn't support targets that call ``dlclose``;
this may be fixed in the future.


Trophies
========

* Thousands of bugs found on OSS-Fuzz: https://opensource.googleblog.com/2017/05/oss-fuzz-five-months-later-and.html

* GLIBC: https://sourceware.org/glibc/wiki/FuzzingLibc

* MUSL LIBC: `[1] <http://git.musl-libc.org/cgit/musl/commit/?id=39dfd58417ef642307d90306e1c7e50aaec5a35c>`__ `[2] <http://www.openwall.com/lists/oss-security/2015/03/30/3>`__

* `pugixml <https://github.com/zeux/pugixml/issues/39>`_

* PCRE: Search for "LLVM fuzzer" in http://vcs.pcre.org/pcre2/code/trunk/ChangeLog?view=markup;
  also in `bugzilla <https://bugs.exim.org/buglist.cgi?bug_status=__all__&content=libfuzzer&no_redirect=1&order=Importance&product=PCRE&query_format=specific>`_

* `ICU <http://bugs.icu-project.org/trac/ticket/11838>`_

* `Freetype <https://savannah.nongnu.org/search/?words=LibFuzzer&type_of_search=bugs&Search=Search&exact=1#options>`_

* `Harfbuzz <https://github.com/behdad/harfbuzz/issues/139>`_

* `SQLite <http://www3.sqlite.org/cgi/src/info/088009efdd56160b>`_

* `Python <http://bugs.python.org/issue25388>`_

* OpenSSL/BoringSSL: `[1] <https://boringssl.googlesource.com/boringssl/+/cb852981cd61733a7a1ae4fd8755b7ff950e857d>`_ `[2] <https://openssl.org/news/secadv/20160301.txt>`_ `[3] <https://boringssl.googlesource.com/boringssl/+/2b07fa4b22198ac02e0cee8f37f3337c3dba91bc>`_ `[4] <https://boringssl.googlesource.com/boringssl/+/6b6e0b20893e2be0e68af605a60ffa2cbb0ffa64>`_ `[5] <https://github.com/openssl/openssl/pull/931/commits/dd5ac557f052cc2b7f718ac44a8cb7ac6f77dca8>`_ `[6] <https://github.com/openssl/openssl/pull/931/commits/19b5b9194071d1d84e38ac9a952e715afbc85a81>`_

* `Libxml2
  <https://bugzilla.gnome.org/buglist.cgi?bug_status=__all__&content=libFuzzer&list_id=68957&order=Importance&product=libxml2&query_format=specific>`_ and `[HT206167]
  <https://support.apple.com/en-gb/HT206167>`_ (CVE-2015-5312, CVE-2015-7500, CVE-2015-7942)

* `Linux Kernel's BPF verifier <https://github.com/iovisor/bpf-fuzzer>`_

* `Linux Kernel's Crypto code <https://www.spinics.net/lists/stable/msg199712.html>`_

* Capstone: `[1] <https://github.com/aquynh/capstone/issues/600>`__ `[2] <https://github.com/aquynh/capstone/commit/6b88d1d51eadf7175a8f8a11b690684443b11359>`__

* file: `[1] <http://bugs.gw.com/view.php?id=550>`__ `[2] <http://bugs.gw.com/view.php?id=551>`__ `[3] <http://bugs.gw.com/view.php?id=553>`__ `[4] <http://bugs.gw.com/view.php?id=554>`__

* Radare2: `[1] <https://github.com/revskills?tab=contributions&from=2016-04-09>`__

* gRPC: `[1] <https://github.com/grpc/grpc/pull/6071/commits/df04c1f7f6aec6e95722ec0b023a6b29b6ea871c>`__ `[2] <https://github.com/grpc/grpc/pull/6071/commits/22a3dfd95468daa0db7245a4e8e6679a52847579>`__ `[3] <https://github.com/grpc/grpc/pull/6071/commits/9cac2a12d9e181d130841092e9d40fa3309d7aa7>`__ `[4] <https://github.com/grpc/grpc/pull/6012/commits/82a91c91d01ce9b999c8821ed13515883468e203>`__ `[5] <https://github.com/grpc/grpc/pull/6202/commits/2e3e0039b30edaf89fb93bfb2c1d0909098519fa>`__ `[6] <https://github.com/grpc/grpc/pull/6106/files>`__

* WOFF2: `[1] <https://github.com/google/woff2/commit/a15a8ab>`__

* LLVM: `Clang <https://bugs.llvm.org/show_bug.cgi?id=23057>`_, `Clang-format <https://bugs.llvm.org/show_bug.cgi?id=23052>`_, `libc++ <https://bugs.llvm.org/show_bug.cgi?id=24411>`_, `llvm-as <https://bugs.llvm.org/show_bug.cgi?id=24639>`_, `Demangler <https://bugs.chromium.org/p/chromium/issues/detail?id=606626>`_, Disassembler: http://reviews.llvm.org/rL247405, http://reviews.llvm.org/rL247414, http://reviews.llvm.org/rL247416, http://reviews.llvm.org/rL247417, http://reviews.llvm.org/rL247420, http://reviews.llvm.org/rL247422.

* TensorFlow: `[1] <https://da-data.blogspot.com/2017/01/finding-bugs-in-tensorflow-with.html>`__

* FFmpeg: `[1] <https://github.com/FFmpeg/FFmpeg/commit/c92f55847a3d9cd12db60bfcd0831ff7f089c37c>`__ `[2] <https://github.com/FFmpeg/FFmpeg/commit/25ab1a65f3acb5ec67b53fb7a2463a7368f1ad16>`__ `[3] <https://github.com/FFmpeg/FFmpeg/commit/85d23e5cbc9ad6835eef870a5b4247de78febe56>`__ `[4] <https://github.com/FFmpeg/FFmpeg/commit/04bd1b38ee6b8df410d0ab8d4949546b6c4af26a>`__

* `Wireshark <https://bugs.wireshark.org/bugzilla/buglist.cgi?bug_status=UNCONFIRMED&bug_status=CONFIRMED&bug_status=IN_PROGRESS&bug_status=INCOMPLETE&bug_status=RESOLVED&bug_status=VERIFIED&f0=OP&f1=OP&f2=product&f3=component&f4=alias&f5=short_desc&f7=content&f8=CP&f9=CP&j1=OR&o2=substring&o3=substring&o4=substring&o5=substring&o6=substring&o7=matches&order=bug_id%20DESC&query_format=advanced&v2=libfuzzer&v3=libfuzzer&v4=libfuzzer&v5=libfuzzer&v6=libfuzzer&v7=%22libfuzzer%22>`_

* `QEMU <https://researchcenter.paloaltonetworks.com/2017/09/unit42-palo-alto-networks-discovers-new-qemu-vulnerability/>`_

.. _pcre2: http://www.pcre.org/
.. _AFL: http://lcamtuf.coredump.cx/afl/
.. _Radamsa: https://github.com/aoh/radamsa
.. _SanitizerCoverage: https://clang.llvm.org/docs/SanitizerCoverage.html
.. _SanitizerCoverageTraceDataFlow: https://clang.llvm.org/docs/SanitizerCoverage.html#tracing-data-flow
.. _AddressSanitizer: https://clang.llvm.org/docs/AddressSanitizer.html
.. _LeakSanitizer: https://clang.llvm.org/docs/LeakSanitizer.html
.. _Heartbleed: http://en.wikipedia.org/wiki/Heartbleed
.. _FuzzerInterface.h: https://github.com/llvm/llvm-project/blob/main/compiler-rt/lib/fuzzer/FuzzerInterface.h
.. _3.7.0: https://llvm.org/releases/3.7.0/docs/LibFuzzer.html
.. _building Clang from trunk: https://clang.llvm.org/get_started.html
.. _MemorySanitizer: https://clang.llvm.org/docs/MemorySanitizer.html
.. _UndefinedBehaviorSanitizer: https://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html
.. _`coverage counters`: https://clang.llvm.org/docs/SanitizerCoverage.html#coverage-counters
.. _`value profile`: #value-profile
.. _`caller-callee pairs`: https://clang.llvm.org/docs/SanitizerCoverage.html#caller-callee-coverage
.. _BoringSSL: https://boringssl.googlesource.com/boringssl/