================================
Fuzzing LLVM libraries and tools
================================

.. contents::
   :local:
   :depth: 2

Introduction
============

The LLVM tree includes a number of fuzzers for various components. These are
built on top of :doc:`LibFuzzer <LibFuzzer>`. In order to build and run these
fuzzers, see :ref:`building-fuzzers`.


Available Fuzzers
=================

clang-fuzzer
------------

A |generic fuzzer| that tries to compile textual input as C++ code. Some of the
bugs this fuzzer has reported are `on bugzilla`__ and `on OSS Fuzz's
tracker`__.

__ https://llvm.org/pr23057
__ https://bugs.chromium.org/p/oss-fuzz/issues/list?q=proj-llvm+clang-fuzzer

clang-proto-fuzzer
------------------

A |protobuf fuzzer| that compiles valid C++ programs generated from a protobuf
class that describes a subset of the C++ language.

This fuzzer accepts clang command line options after
``-ignore_remaining_args=1``. For example, the following command will fuzz
clang with a higher optimization level:

.. code-block:: shell

   % bin/clang-proto-fuzzer <corpus-dir> -ignore_remaining_args=1 -O3

clang-format-fuzzer
-------------------

A |generic fuzzer| that runs clang-format_ on C++ text fragments. Some of the
bugs this fuzzer has reported are `on bugzilla`__
and `on OSS Fuzz's tracker`__.

.. _clang-format: https://clang.llvm.org/docs/ClangFormat.html
__ https://llvm.org/pr23052
__ https://bugs.chromium.org/p/oss-fuzz/issues/list?q=proj-llvm+clang-format-fuzzer

llvm-as-fuzzer
--------------

A |generic fuzzer| that tries to parse text as :doc:`LLVM assembly <LangRef>`.
Some of the bugs this fuzzer has reported are `on bugzilla`__.

__ https://llvm.org/pr24639

llvm-dwarfdump-fuzzer
---------------------

A |generic fuzzer| that interprets inputs as object files and runs
:doc:`llvm-dwarfdump <CommandGuide/llvm-dwarfdump>` on them. Some of the bugs
this fuzzer has reported are `on OSS Fuzz's tracker`__.

__ https://bugs.chromium.org/p/oss-fuzz/issues/list?q=proj-llvm+llvm-dwarfdump-fuzzer

llvm-demangle-fuzzer
--------------------

A |generic fuzzer| for the Itanium demangler used in various LLVM tools. We've
fuzzed ``__cxa_demangle`` to death, so why not fuzz LLVM's implementation of
the same function?

llvm-isel-fuzzer
----------------

A |LLVM IR fuzzer| aimed at finding bugs in instruction selection.

This fuzzer accepts flags after ``-ignore_remaining_args=1``. The flags match
those of :doc:`llc <CommandGuide/llc>` and the triple is required. For example,
the following command would fuzz AArch64 with :doc:`GlobalISel/index`:

.. code-block:: shell

   % bin/llvm-isel-fuzzer <corpus-dir> -ignore_remaining_args=1 -mtriple aarch64 -global-isel -O0

Some flags can also be specified in the binary name itself in order to support
OSS Fuzz, which has trouble with required arguments. To do this, you can copy
or move ``llvm-isel-fuzzer`` to ``llvm-isel-fuzzer--x-y-z``, separating options
from the binary name using "--". The valid options are architecture names
(``aarch64``, ``x86_64``), optimization levels (``O0``, ``O2``), or specific
keywords, like ``gisel`` for enabling global instruction selection. In this
mode, the same example could be run like so:

.. code-block:: shell

   % bin/llvm-isel-fuzzer--aarch64-O0-gisel <corpus-dir>

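The naming scheme above can be illustrated with a small shell sketch. The
``decode_fuzzer_name`` helper below is hypothetical and is not part of LLVM;
the fuzzer binary does its own parsing internally. The sketch only shows how
an option-encoded name decomposes, assuming options are separated from each
other by single dashes:

.. code-block:: shell

   # Hypothetical helper, for illustration only: split an option-encoded
   # fuzzer name into the base binary name and its embedded options.
   decode_fuzzer_name() {
     name=$(basename "$1")
     case "$name" in
       *--*) ;;                              # options are embedded
       *) echo "base=$name"; return 0 ;;     # plain name, nothing to decode
     esac
     base=${name%%--*}   # everything before the first "--"
     opts=${name#*--}    # everything after the first "--"
     echo "base=$base"
     # Split the option string on single dashes.
     old_ifs=$IFS
     IFS=-
     for opt in $opts; do
       echo "option=$opt"
     done
     IFS=$old_ifs
   }

   # Prints base=llvm-isel-fuzzer, then option=aarch64, option=O0 and
   # option=gisel, one per line.
   decode_fuzzer_name bin/llvm-isel-fuzzer--aarch64-O0-gisel

To use the scheme for real, copy or move the built binary to the
option-encoded name as described above.
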
llvm-opt-fuzzer
---------------

A |LLVM IR fuzzer| aimed at finding bugs in optimization passes.

It receives an optimization pipeline and runs it on each fuzzer input.

The interface of this fuzzer almost directly mirrors ``llvm-isel-fuzzer``. Both
the ``mtriple`` and ``passes`` arguments are required. Passes are specified in
a format suitable for the new pass manager. You can find some documentation
about this format in the doxygen for ``PassBuilder::parsePassPipeline``.

.. code-block:: shell

   % bin/llvm-opt-fuzzer <corpus-dir> -ignore_remaining_args=1 -mtriple x86_64 -passes instcombine

As with ``llvm-isel-fuzzer``, the arguments for some predefined configurations
can be embedded directly into the binary file name:

.. code-block:: shell

   % bin/llvm-opt-fuzzer--x86_64-instcombine <corpus-dir>

llvm-mc-assemble-fuzzer
-----------------------

A |generic fuzzer| that fuzzes the MC layer's assemblers by treating inputs as
target-specific assembly.

Note that this fuzzer has an unusual command line interface which is not fully
compatible with all of libFuzzer's features. Fuzzer arguments must be passed
after ``--fuzzer-args``, and any ``llc`` flags must use two dashes. For
example, to fuzz the AArch64 assembler you might use the following command:

.. code-block:: console

  llvm-mc-assemble-fuzzer --triple=aarch64-linux-gnu --fuzzer-args -max_len=4

This scheme will likely change in the future.

llvm-mc-disassemble-fuzzer
--------------------------

A |generic fuzzer| that fuzzes the MC layer's disassemblers by treating inputs
as assembled binary data.

Note that this fuzzer has an unusual command line interface which is not fully
compatible with all of libFuzzer's features. See the notes above about
``llvm-mc-assemble-fuzzer`` for details.


.. |generic fuzzer| replace:: :ref:`generic fuzzer <fuzzing-llvm-generic>`
.. |protobuf fuzzer|
   replace:: :ref:`libprotobuf-mutator based fuzzer <fuzzing-llvm-protobuf>`
.. |LLVM IR fuzzer|
   replace:: :ref:`structured LLVM IR fuzzer <fuzzing-llvm-ir>`

lldb-target-fuzzer
------------------

A |generic fuzzer| that interprets inputs as object files and uses them to
create a target in lldb.

Mutators and Input Generators
=============================

The inputs for a fuzz target are generated via random mutations of a
:ref:`corpus <libfuzzer-corpus>`. There are a few options for the kinds of
mutations that a fuzzer in LLVM might want.

.. _fuzzing-llvm-generic:

Generic Random Fuzzing
----------------------

The most basic form of input mutation is to use the built-in mutators of
LibFuzzer. These simply treat the input corpus as a bag of bits and make random
mutations. This type of fuzzer stresses the surface layers of a program well
and is good at testing things like lexers, parsers, or binary protocols.

Some of the in-tree fuzzers that use this type of mutator are `clang-fuzzer`_,
`clang-format-fuzzer`_, `llvm-as-fuzzer`_, `llvm-dwarfdump-fuzzer`_,
`llvm-mc-assemble-fuzzer`_, and `llvm-mc-disassemble-fuzzer`_.

.. _fuzzing-llvm-protobuf:

Structured Fuzzing using ``libprotobuf-mutator``
------------------------------------------------

We can use libprotobuf-mutator_ in order to perform structured fuzzing and
stress deeper layers of programs. This works by defining a protobuf class that
translates arbitrary data into structurally interesting input. Specifically, we
use this to work with a subset of the C++ language and perform mutations that
produce valid C++ programs in order to exercise parts of clang that are more
interesting than parser error handling.

To build this kind of fuzzer you need `protobuf`_ and its dependencies
installed, and you need to specify some extra flags when configuring the build
with :doc:`CMake <CMake>`. For example, `clang-proto-fuzzer`_ can be enabled by
adding ``-DCLANG_ENABLE_PROTO_FUZZER=ON`` to the flags described in
:ref:`building-fuzzers`.

The only in-tree fuzzer that uses ``libprotobuf-mutator`` today is
`clang-proto-fuzzer`_.

.. _libprotobuf-mutator: https://github.com/google/libprotobuf-mutator
.. _protobuf: https://github.com/google/protobuf

.. _fuzzing-llvm-ir:

Structured Fuzzing of LLVM IR
-----------------------------

We also use a more direct form of structured fuzzing for fuzzers that take
:doc:`LLVM IR <LangRef>` as input. This is achieved through the ``FuzzMutate``
library, which was `discussed at EuroLLVM 2017`_.

The ``FuzzMutate`` library is used to structurally fuzz backends in
`llvm-isel-fuzzer`_ and optimization passes in `llvm-opt-fuzzer`_.

.. _discussed at EuroLLVM 2017: https://www.youtube.com/watch?v=UBbQ_s6hNgg


Building and Running
====================

.. _building-fuzzers:

Configuring LLVM to Build Fuzzers
---------------------------------

Fuzzers will be built and linked to libFuzzer by default as long as you build
LLVM with sanitizer coverage enabled. You would typically also enable at least
one sanitizer to find bugs faster. The most common way to build the fuzzers is
by adding the following two flags to your CMake invocation:
``-DLLVM_USE_SANITIZER=Address -DLLVM_USE_SANITIZE_COVERAGE=On``.

.. note:: If you have ``compiler-rt`` checked out in an LLVM tree when building
          with sanitizers, you'll want to specify ``-DLLVM_BUILD_RUNTIME=Off``
          to avoid building the sanitizers themselves with sanitizers enabled.

.. note:: You may run into issues if you build with BFD ld, which is the
          default linker on many unix systems. These issues are being tracked
          in https://llvm.org/PR34636.

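Putting these flags together, a configure-and-run session might look like the
following sketch. It assumes a build directory next to the ``llvm`` source
directory, Ninja as the generator, and ``clang`` on the ``PATH``. The corpus
directory name and the ``-max_total_time`` limit (a libFuzzer option) are
arbitrary choices for illustration:

.. code-block:: shell

   # Configure with AddressSanitizer and sanitizer coverage (flags from above).
   # The generator, compiler choice, and relative source path are assumptions.
   cmake -G Ninja ../llvm \
     -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ \
     -DLLVM_USE_SANITIZER=Address \
     -DLLVM_USE_SANITIZE_COVERAGE=On

   # Build a single fuzzer rather than the whole tree.
   ninja llvm-as-fuzzer

   # Run it against a (possibly empty) corpus directory for about one minute.
   mkdir -p corpus
   bin/llvm-as-fuzzer corpus -max_total_time=60
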
Continuously Running and Finding Bugs
-------------------------------------

There used to be a public buildbot running LLVM fuzzers continuously, and while
this did find issues, it didn't have a very good way to report problems in an
actionable way. Because of this, we're moving towards relying on `OSS Fuzz`_
instead.

You can browse the `LLVM project issue list`_ for the bugs found by
`LLVM on OSS Fuzz`_. These are also mailed to the `llvm-bugs mailing
list`_.

.. _OSS Fuzz: https://github.com/google/oss-fuzz
.. _LLVM project issue list:
   https://bugs.chromium.org/p/oss-fuzz/issues/list?q=Proj-llvm
.. _LLVM on OSS Fuzz:
   https://github.com/google/oss-fuzz/blob/master/projects/llvm
.. _llvm-bugs mailing list:
   http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs


Utilities for Writing Fuzzers
=============================

There are some utilities available for writing fuzzers in LLVM.

Some helpers for handling the command line interface are available in
``include/llvm/FuzzMutate/FuzzerCLI.h``, including functions to parse command
line options in a consistent way and to implement standalone main functions so
your fuzzer can be built and tested when not built against libFuzzer.

There is also some handling of the CMake config for fuzzers, where you should
use ``add_llvm_fuzzer`` to set up fuzzer targets. This function works
similarly to functions such as ``add_llvm_tool``, but it takes care of linking
to LibFuzzer when appropriate and can be passed the ``DUMMY_MAIN`` argument to
enable standalone testing.
