================================
Fuzzing LLVM libraries and tools
================================

.. contents::
   :local:
   :depth: 2

Introduction
============

The LLVM tree includes a number of fuzzers for various components. These are
built on top of :doc:`LibFuzzer <LibFuzzer>`.


Available Fuzzers
=================

clang-fuzzer
------------

A |generic fuzzer| that tries to compile textual input as C++ code. Some of the
bugs this fuzzer has reported are `on bugzilla`__ and `on OSS Fuzz's
tracker`__.

__ https://llvm.org/pr23057
__ https://bugs.chromium.org/p/oss-fuzz/issues/list?q=proj-llvm+clang-fuzzer

clang-proto-fuzzer
------------------

A |protobuf fuzzer| that compiles valid C++ programs generated from a protobuf
class that describes a subset of the C++ language.

This fuzzer accepts clang command line options after
``-ignore_remaining_args=1``. For example, the following command will fuzz
clang with a higher optimization level:

.. code-block:: shell

   % bin/clang-proto-fuzzer <corpus-dir> -ignore_remaining_args=1 -O3

clang-format-fuzzer
-------------------

A |generic fuzzer| that runs clang-format_ on C++ text fragments. Some of the
bugs this fuzzer has reported are `on bugzilla`__
and `on OSS Fuzz's tracker`__.

.. _clang-format: https://clang.llvm.org/docs/ClangFormat.html
__ https://llvm.org/pr23052
__ https://bugs.chromium.org/p/oss-fuzz/issues/list?q=proj-llvm+clang-format-fuzzer

llvm-as-fuzzer
--------------

A |generic fuzzer| that tries to parse text as :doc:`LLVM assembly <LangRef>`.
Some of the bugs this fuzzer has reported are `on bugzilla`__.

__ https://llvm.org/pr24639

llvm-dwarfdump-fuzzer
---------------------

A |generic fuzzer| that interprets inputs as object files and runs
:doc:`llvm-dwarfdump <CommandGuide/llvm-dwarfdump>` on them. Some of the bugs
this fuzzer has reported are `on OSS Fuzz's tracker`__.

__ https://bugs.chromium.org/p/oss-fuzz/issues/list?q=proj-llvm+llvm-dwarfdump-fuzzer

llvm-demangle-fuzzer
--------------------

A |generic fuzzer| for the Itanium demangler used in various LLVM tools. We've
fuzzed ``__cxa_demangle`` to death, so why not fuzz LLVM's implementation of
the same function?

llvm-isel-fuzzer
----------------

A |LLVM IR fuzzer| aimed at finding bugs in instruction selection.

This fuzzer accepts flags after ``-ignore_remaining_args=1``. The flags match
those of :doc:`llc <CommandGuide/llc>` and the triple is required. For example,
the following command would fuzz AArch64 with :doc:`GlobalISel`:

.. code-block:: shell

   % bin/llvm-isel-fuzzer <corpus-dir> -ignore_remaining_args=1 -mtriple aarch64 -global-isel -O0

Some flags can also be specified in the binary name itself in order to support
OSS Fuzz, which has trouble with required arguments. To do this, you can copy
or move ``llvm-isel-fuzzer`` to ``llvm-isel-fuzzer--x-y-z``, separating options
from the binary name using "--". The valid options are architecture names
(``aarch64``, ``x86_64``), optimization levels (``O0``, ``O2``), or specific
keywords, like ``gisel`` for enabling global instruction selection. In this
mode, the same example could be run like so:

.. code-block:: shell

   % bin/llvm-isel-fuzzer--aarch64-O0-gisel <corpus-dir>
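
The naming convention can be illustrated with a small, self-contained sketch.
It is not the parsing code LLVM uses: the helper name ``splitNameOptions`` is
hypothetical, and the sketch only assumes that options follow the first "--"
and are separated by single dashes.

.. code-block:: c++

   #include <string>
   #include <vector>

   // Hypothetical helper, for illustration only: returns the option tokens
   // embedded after the first "--" in a fuzzer binary name, split on '-'.
   std::vector<std::string> splitNameOptions(const std::string &ExecName) {
     std::vector<std::string> Opts;
     std::size_t Pos = ExecName.find("--");
     if (Pos == std::string::npos)
       return Opts; // No embedded options in this name.
     std::string Rest = ExecName.substr(Pos + 2);
     std::size_t Start = 0;
     while (Start <= Rest.size()) {
       std::size_t End = Rest.find('-', Start);
       if (End == std::string::npos)
         End = Rest.size();
       if (End > Start) // Skip empty tokens.
         Opts.push_back(Rest.substr(Start, End - Start));
       Start = End + 1;
     }
     return Opts;
   }

For ``llvm-isel-fuzzer--aarch64-O0-gisel`` this sketch produces the tokens
``aarch64``, ``O0``, and ``gisel``. Mapping each token to an actual flag is
left out here.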

llvm-opt-fuzzer
---------------

A |LLVM IR fuzzer| aimed at finding bugs in optimization passes.

It receives an optimization pipeline and runs it on each fuzzer input.

The interface of this fuzzer almost directly mirrors that of
``llvm-isel-fuzzer``. Both the ``mtriple`` and ``passes`` arguments are
required. Passes are specified in a format suitable for the new pass manager.

.. code-block:: shell

   % bin/llvm-opt-fuzzer <corpus-dir> -ignore_remaining_args=1 -mtriple x86_64 -passes instcombine

As with ``llvm-isel-fuzzer``, the arguments for some predefined configurations
can be embedded directly into the binary file name:

.. code-block:: shell

   % bin/llvm-opt-fuzzer--x86_64-instcombine <corpus-dir>

llvm-mc-assemble-fuzzer
-----------------------

A |generic fuzzer| that fuzzes the MC layer's assemblers by treating inputs as
target-specific assembly.

Note that this fuzzer has an unusual command line interface which is not fully
compatible with all of libFuzzer's features. Fuzzer arguments must be passed
after ``--fuzzer-args``, and any ``llc`` flags must use two dashes. For
example, to fuzz the AArch64 assembler you might use the following command:

.. code-block:: console

  llvm-mc-assemble-fuzzer --triple=aarch64-linux-gnu --fuzzer-args -max_len=4

This scheme will likely change in the future.

llvm-mc-disassemble-fuzzer
--------------------------

A |generic fuzzer| that fuzzes the MC layer's disassemblers by treating inputs
as assembled binary data.

Note that this fuzzer has an unusual command line interface which is not fully
compatible with all of libFuzzer's features. See the notes above about
``llvm-mc-assemble-fuzzer`` for details.


.. |generic fuzzer| replace:: :ref:`generic fuzzer <fuzzing-llvm-generic>`
.. |protobuf fuzzer|
   replace:: :ref:`libprotobuf-mutator based fuzzer <fuzzing-llvm-protobuf>`
.. |LLVM IR fuzzer|
   replace:: :ref:`structured LLVM IR fuzzer <fuzzing-llvm-ir>`


Mutators and Input Generators
=============================

The inputs for a fuzz target are generated via random mutations of a
:ref:`corpus <libfuzzer-corpus>`. There are a few options for the kinds of
mutations that a fuzzer in LLVM might want.

.. _fuzzing-llvm-generic:

Generic Random Fuzzing
----------------------

The most basic form of input mutation is to use the built-in mutators of
LibFuzzer. These simply treat the input corpus as a bag of bits and make random
mutations. This type of fuzzer is good for stressing the surface layers of a
program, such as lexers, parsers, or binary protocols.

Some of the in-tree fuzzers that use this type of mutator are `clang-fuzzer`_,
`clang-format-fuzzer`_, `llvm-as-fuzzer`_, `llvm-dwarfdump-fuzzer`_,
`llvm-mc-assemble-fuzzer`_, and `llvm-mc-disassemble-fuzzer`_.
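
As a rough sketch of what a generic fuzz target looks like, the following
defines libFuzzer's ``LLVMFuzzerTestOneInput`` entry point and hands the raw
bytes to a toy checker. ``isBalanced`` is a hypothetical stand-in for real
code under test, such as a lexer or parser, and is not part of LLVM.

.. code-block:: c++

   #include <cstddef>
   #include <cstdint>

   // Hypothetical code under test: reports whether parentheses in the input
   // are balanced. A real target would call into a lexer, parser, etc.
   static bool isBalanced(const uint8_t *Data, size_t Size) {
     long Depth = 0;
     for (size_t I = 0; I < Size; ++I) {
       if (Data[I] == '(')
         ++Depth;
       else if (Data[I] == ')' && --Depth < 0)
         return false; // Closed more than was opened.
     }
     return Depth == 0;
   }

   // libFuzzer calls this once per generated input. The bytes are arbitrary,
   // so the code under test must tolerate any content and any length.
   extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
     (void)isBalanced(Data, Size);
     return 0;
   }

The target itself knows nothing about the input format. Any structure in the
inputs comes from the corpus and from the mutations libFuzzer happens to keep.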

.. _fuzzing-llvm-protobuf:

Structured Fuzzing using ``libprotobuf-mutator``
------------------------------------------------

We can use libprotobuf-mutator_ in order to perform structured fuzzing and
stress deeper layers of programs. This works by defining a protobuf class that
translates arbitrary data into structurally interesting input. Specifically, we
use this to work with a subset of the C++ language and perform mutations that
produce valid C++ programs in order to exercise parts of clang that are more
interesting than parser error handling.

To build this kind of fuzzer you need `protobuf`_ and its dependencies
installed, and you need to specify some extra flags when configuring the build
with :doc:`CMake <CMake>`. For example, `clang-proto-fuzzer`_ can be enabled by
adding ``-DCLANG_ENABLE_PROTO_FUZZER=ON`` to the flags described in
:ref:`building-fuzzers`.

The only in-tree fuzzer that uses ``libprotobuf-mutator`` today is
`clang-proto-fuzzer`_.

.. _libprotobuf-mutator: https://github.com/google/libprotobuf-mutator
.. _protobuf: https://github.com/google/protobuf

.. _fuzzing-llvm-ir:

Structured Fuzzing of LLVM IR
-----------------------------

We also use a more direct form of structured fuzzing for fuzzers that take
:doc:`LLVM IR <LangRef>` as input. This is achieved through the ``FuzzMutate``
library, which was `discussed at EuroLLVM 2017`_.

The ``FuzzMutate`` library is used to structurally fuzz backends in
`llvm-isel-fuzzer`_ and optimization passes in `llvm-opt-fuzzer`_.

.. _discussed at EuroLLVM 2017: https://www.youtube.com/watch?v=UBbQ_s6hNgg


Building and Running
====================

.. _building-fuzzers:

Configuring LLVM to Build Fuzzers
---------------------------------

Fuzzers will be built and linked to libFuzzer by default as long as you build
LLVM with sanitizer coverage enabled. You would typically also enable at least
one sanitizer to find bugs faster. The most common way to build the fuzzers is
by adding the following two flags to your CMake invocation:
``-DLLVM_USE_SANITIZER=Address -DLLVM_USE_SANITIZE_COVERAGE=On``.

.. note:: If you have ``compiler-rt`` checked out in an LLVM tree when building
          with sanitizers, you'll want to specify ``-DLLVM_BUILD_RUNTIME=Off``
          to avoid building the sanitizers themselves with sanitizers enabled.

Continuously Running and Finding Bugs
-------------------------------------

There used to be a public buildbot running LLVM fuzzers continuously, and while
this did find issues, it didn't have a very good way to report problems in an
actionable way. Because of this, we're moving towards using `OSS Fuzz`_ more
instead.

You can browse the `LLVM project issue list`_ for the bugs found by
`LLVM on OSS Fuzz`_. These are also mailed to the `llvm-bugs mailing
list`_.

.. _OSS Fuzz: https://github.com/google/oss-fuzz
.. _LLVM project issue list:
   https://bugs.chromium.org/p/oss-fuzz/issues/list?q=Proj-llvm
.. _LLVM on OSS Fuzz:
   https://github.com/google/oss-fuzz/blob/master/projects/llvm
.. _llvm-bugs mailing list:
   http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs


Utilities for Writing Fuzzers
=============================

There are some utilities available for writing fuzzers in LLVM.

Some helpers for handling the command line interface are available in
``include/llvm/FuzzMutate/FuzzerCLI.h``, including functions to parse command
line options in a consistent way and to implement standalone main functions so
your fuzzer can be built and tested when not built against libFuzzer.

There is also some handling of the CMake config for fuzzers: use the
``add_llvm_fuzzer`` function to set up fuzzer targets. It works similarly to
functions such as ``add_llvm_tool``, but it takes care of linking to LibFuzzer
when appropriate and can be passed the ``DUMMY_MAIN`` argument to enable
standalone testing.
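
To make the idea of a standalone main concrete, here is a minimal sketch of a
driver step that reads one file and passes its bytes to the fuzz target. It is
an illustration only and does not reproduce the helpers in ``FuzzerCLI.h``;
``runTargetOnFile`` and the trivial target body are hypothetical.

.. code-block:: c++

   #include <cstddef>
   #include <cstdint>
   #include <fstream>
   #include <iterator>
   #include <string>
   #include <vector>

   // Trivial placeholder target; a real fuzzer would exercise LLVM code here.
   extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
     (void)Data;
     (void)Size;
     return 0;
   }

   // Hypothetical helper: reads Path as binary and runs the target on it
   // once. Returns the target's result, or -1 if the file cannot be opened.
   int runTargetOnFile(const std::string &Path) {
     std::ifstream In(Path, std::ios::binary);
     if (!In)
       return -1;
     std::vector<char> Bytes((std::istreambuf_iterator<char>(In)),
                             std::istreambuf_iterator<char>());
     return LLVMFuzzerTestOneInput(
         reinterpret_cast<const uint8_t *>(Bytes.data()), Bytes.size());
   }

A ``main`` function built this way would call such a helper for each path in
``argv``, which lets a crashing input be replayed without libFuzzer.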