================================
Fuzzing LLVM libraries and tools
================================

.. contents::
   :local:
   :depth: 2

Introduction
============

The LLVM tree includes a number of fuzzers for various components. These are
built on top of :doc:`LibFuzzer <LibFuzzer>`.


Available Fuzzers
=================

clang-fuzzer
------------

A |generic fuzzer| that tries to compile textual input as C++ code. Some of the
bugs this fuzzer has reported are `on bugzilla`__ and `on OSS Fuzz's
tracker`__.

__ https://llvm.org/pr23057
__ https://bugs.chromium.org/p/oss-fuzz/issues/list?q=proj-llvm+clang-fuzzer

clang-proto-fuzzer
------------------

A |protobuf fuzzer| that compiles valid C++ programs generated from a protobuf
class that describes a subset of the C++ language.

This fuzzer accepts clang command line options after
``-ignore_remaining_args=1``. For example, the following command will fuzz
clang with a higher optimization level:

.. code-block:: shell

   % bin/clang-proto-fuzzer <corpus-dir> -ignore_remaining_args=1 -O3

clang-format-fuzzer
-------------------

A |generic fuzzer| that runs clang-format_ on C++ text fragments. Some of the
bugs this fuzzer has reported are `on bugzilla`__
and `on OSS Fuzz's tracker`__.

.. _clang-format: https://clang.llvm.org/docs/ClangFormat.html
__ https://llvm.org/pr23052
__ https://bugs.chromium.org/p/oss-fuzz/issues/list?q=proj-llvm+clang-format-fuzzer

llvm-as-fuzzer
--------------

A |generic fuzzer| that tries to parse text as :doc:`LLVM assembly <LangRef>`.
Some of the bugs this fuzzer has reported are `on bugzilla`__.

__ https://llvm.org/pr24639

llvm-dwarfdump-fuzzer
---------------------

A |generic fuzzer| that interprets inputs as object files and runs
:doc:`llvm-dwarfdump <CommandGuide/llvm-dwarfdump>` on them. Some of the bugs
this fuzzer has reported are `on OSS Fuzz's tracker`__.

__ https://bugs.chromium.org/p/oss-fuzz/issues/list?q=proj-llvm+llvm-dwarfdump-fuzzer

llvm-demangle-fuzzer
--------------------

A |generic fuzzer| for the Itanium demangler used in various LLVM tools. We've
fuzzed ``__cxa_demangle`` to death, so why not fuzz LLVM's implementation of
the same function?

llvm-isel-fuzzer
----------------

A |LLVM IR fuzzer| aimed at finding bugs in instruction selection.

This fuzzer accepts flags after ``-ignore_remaining_args=1``. The flags match
those of :doc:`llc <CommandGuide/llc>` and the triple is required. For example,
the following command would fuzz AArch64 with :doc:`GlobalISel`:

.. code-block:: shell

   % bin/llvm-isel-fuzzer <corpus-dir> -ignore_remaining_args=1 -mtriple aarch64 -global-isel -O0

Some flags can also be specified in the binary name itself in order to support
OSS Fuzz, which has trouble with required arguments. To do this, you can copy
or move ``llvm-isel-fuzzer`` to ``llvm-isel-fuzzer--x-y-z``, separating options
from the binary name using "--". The valid options are architecture names
(``aarch64``, ``x86_64``), optimization levels (``O0``, ``O2``), or specific
keywords, like ``gisel`` for enabling global instruction selection. In this
mode, the same example could be run like so:

.. code-block:: shell

   % bin/llvm-isel-fuzzer--aarch64-O0-gisel <corpus-dir>

llvm-opt-fuzzer
---------------

A |LLVM IR fuzzer| aimed at finding bugs in optimization passes.

It receives an optimization pipeline and runs it on each fuzzer input.

The interface of this fuzzer almost directly mirrors that of
``llvm-isel-fuzzer``. Both the ``mtriple`` and ``passes`` arguments are
required. Passes are specified in a format suitable for the new pass manager.

.. code-block:: shell

   % bin/llvm-opt-fuzzer <corpus-dir> -ignore_remaining_args=1 -mtriple x86_64 -passes instcombine

As with ``llvm-isel-fuzzer``, the arguments for some predefined configurations
can be embedded directly in the binary file name:

.. code-block:: shell

   % bin/llvm-opt-fuzzer--x86_64-instcombine <corpus-dir>

llvm-mc-assemble-fuzzer
-----------------------

A |generic fuzzer| that fuzzes the MC layer's assemblers by treating inputs as
target-specific assembly.

Note that this fuzzer has an unusual command line interface which is not fully
compatible with all of libFuzzer's features. Fuzzer arguments must be passed
after ``--fuzzer-args``, and any ``llc`` flags must use two dashes. For
example, to fuzz the AArch64 assembler you might use the following command:

.. code-block:: console

   llvm-mc-assemble-fuzzer --triple=aarch64-linux-gnu --fuzzer-args -max_len=4

This scheme will likely change in the future.

llvm-mc-disassemble-fuzzer
--------------------------

A |generic fuzzer| that fuzzes the MC layer's disassemblers by treating inputs
as assembled binary data.

Note that this fuzzer has an unusual command line interface which is not fully
compatible with all of libFuzzer's features. See the notes above about
``llvm-mc-assemble-fuzzer`` for details.


.. |generic fuzzer| replace:: :ref:`generic fuzzer <fuzzing-llvm-generic>`
.. |protobuf fuzzer|
   replace:: :ref:`libprotobuf-mutator based fuzzer <fuzzing-llvm-protobuf>`
.. |LLVM IR fuzzer|
   replace:: :ref:`structured LLVM IR fuzzer <fuzzing-llvm-ir>`


Mutators and Input Generators
=============================

The inputs for a fuzz target are generated via random mutations of a
:ref:`corpus <libfuzzer-corpus>`. There are a few options for the kinds of
mutations that a fuzzer in LLVM might want.

.. _fuzzing-llvm-generic:

Generic Random Fuzzing
----------------------

The most basic form of input mutation is to use the built-in mutators of
LibFuzzer. These simply treat the input corpus as a bag of bits and make random
mutations. This type of fuzzer is good at stressing the surface layers of a
program, such as lexers, parsers, or binary protocols.

Some of the in-tree fuzzers that use this type of mutator are `clang-fuzzer`_,
`clang-format-fuzzer`_, `llvm-as-fuzzer`_, `llvm-dwarfdump-fuzzer`_,
`llvm-demangle-fuzzer`_, `llvm-mc-assemble-fuzzer`_, and
`llvm-mc-disassemble-fuzzer`_.

.. _fuzzing-llvm-protobuf:

Structured Fuzzing using ``libprotobuf-mutator``
------------------------------------------------

We can use libprotobuf-mutator_ to perform structured fuzzing and stress deeper
layers of programs. This works by defining a protobuf class that translates
arbitrary data into structurally interesting input. Specifically, we use this
to work with a subset of the C++ language and perform mutations that produce
valid C++ programs, in order to exercise parts of clang that are more
interesting than parser error handling.

To build this kind of fuzzer you need `protobuf`_ and its dependencies
installed, and you need to specify some extra flags when configuring the build
with :doc:`CMake <CMake>`. For example, `clang-proto-fuzzer`_ can be enabled by
adding ``-DCLANG_ENABLE_PROTO_FUZZER=ON`` to the flags described in
:ref:`building-fuzzers`.

The only in-tree fuzzer that uses ``libprotobuf-mutator`` today is
`clang-proto-fuzzer`_.

.. _libprotobuf-mutator: https://github.com/google/libprotobuf-mutator
.. _protobuf: https://github.com/google/protobuf

.. _fuzzing-llvm-ir:

Structured Fuzzing of LLVM IR
-----------------------------

We also use a more direct form of structured fuzzing for fuzzers that take
:doc:`LLVM IR <LangRef>` as input. This is achieved through the ``FuzzMutate``
library, which was `discussed at EuroLLVM 2017`_.

The ``FuzzMutate`` library is used to structurally fuzz backends in
`llvm-isel-fuzzer`_ and optimization passes in `llvm-opt-fuzzer`_.

.. _discussed at EuroLLVM 2017: https://www.youtube.com/watch?v=UBbQ_s6hNgg


Building and Running
====================

.. _building-fuzzers:

Configuring LLVM to Build Fuzzers
---------------------------------

Fuzzers will be built and linked to libFuzzer by default as long as you build
LLVM with sanitizer coverage enabled. You would typically also enable at least
one sanitizer to find bugs faster. The most common way to build the fuzzers is
by adding the following two flags to your CMake invocation:
``-DLLVM_USE_SANITIZER=Address -DLLVM_USE_SANITIZE_COVERAGE=On``.

.. note:: If you have ``compiler-rt`` checked out in an LLVM tree when building
          with sanitizers, you'll want to specify ``-DLLVM_BUILD_RUNTIME=Off``
          to avoid building the sanitizers themselves with sanitizers enabled.

Continuously Running and Finding Bugs
-------------------------------------

There used to be a public buildbot running LLVM fuzzers continuously. While
this did find issues, it had no good way to report problems in an actionable
form. Because of this, we're moving towards using `OSS Fuzz`_ more instead.

You can browse the `LLVM project issue list`_ for the bugs found by
`LLVM on OSS Fuzz`_. These are also mailed to the `llvm-bugs mailing
list`_.

.. _OSS Fuzz: https://github.com/google/oss-fuzz
.. _LLVM project issue list:
   https://bugs.chromium.org/p/oss-fuzz/issues/list?q=Proj-llvm
.. _LLVM on OSS Fuzz:
   https://github.com/google/oss-fuzz/blob/master/projects/llvm
.. _llvm-bugs mailing list:
   http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs


Utilities for Writing Fuzzers
=============================

There are some utilities available for writing fuzzers in LLVM.

Some helpers for handling the command line interface are available in
``include/llvm/FuzzMutate/FuzzerCLI.h``, including functions to parse command
line options in a consistent way and to implement standalone main functions so
your fuzzer can be built and tested when not built against libFuzzer.

There is also some handling of the CMake config for fuzzers: you should use the
``add_llvm_fuzzer`` function to set up fuzzer targets. This function works
similarly to functions such as ``add_llvm_tool``, but it takes care of linking
to LibFuzzer when appropriate and can be passed the ``DUMMY_MAIN`` argument to
enable standalone testing.
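
To illustrate what standalone testing means in practice, the following is a
minimal sketch of a fuzz target together with a driver that replays input
files without libFuzzer. It is not the code in ``FuzzerCLI.h``: the helper
``runTargetOnFiles`` and the counter ``TotalBytesSeen`` are hypothetical names
used only for this example, and the target body is a placeholder. Only the
``LLVMFuzzerTestOneInput`` entry point is the interface libFuzzer expects.

.. code-block:: c++

   #include <cassert>
   #include <cstddef>
   #include <cstdint>
   #include <fstream>
   #include <iterator>
   #include <string>
   #include <vector>

   // Hypothetical counter so the effect of a run is observable in this sketch.
   static std::size_t TotalBytesSeen = 0;

   // The entry point libFuzzer calls for every generated input. A real fuzzer
   // would hand Data to the component under test; this placeholder only
   // records how many bytes it was given.
   extern "C" int LLVMFuzzerTestOneInput(const std::uint8_t *Data,
                                         std::size_t Size) {
     assert((Data != nullptr || Size == 0) && "non-empty input needs a buffer");
     TotalBytesSeen += Size;
     return 0;
   }

   // Hypothetical standalone driver: reads each file and passes its contents
   // to the fuzz target once. Files that cannot be opened are skipped.
   // Returns the number of inputs that were actually run.
   int runTargetOnFiles(const std::vector<std::string> &Paths) {
     int NumRun = 0;
     for (const std::string &Path : Paths) {
       std::ifstream In(Path, std::ios::binary);
       if (!In)
         continue;
       std::vector<std::uint8_t> Bytes((std::istreambuf_iterator<char>(In)),
                                       std::istreambuf_iterator<char>());
       LLVMFuzzerTestOneInput(Bytes.data(), Bytes.size());
       ++NumRun;
     }
     return NumRun;
   }

A standalone ``main`` would collect the file names from ``argv`` and forward
them to a driver like this one, which makes it possible to replay a crashing
input under a debugger even when the binary is not linked against libFuzzer.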