README.qemu

=========================================================
High-performance binary-only instrumentation for afl-fuzz
=========================================================

  (See ../docs/README for the general instruction manual.)

1) Introduction
---------------

The code in this directory allows you to build a standalone feature that
leverages the QEMU "user emulation" mode and allows callers to obtain
instrumentation output for black-box, closed-source binaries. This mechanism
can then be used by afl-fuzz to stress-test targets that couldn't be built
with afl-gcc.

The usual performance cost is 2-5x, which is considerably better than
seen so far in experiments with tools such as DynamoRIO and PIN.

The idea and much of the implementation come from Andrew Griffiths.

2) How to use
-------------

The feature is implemented with a fairly simple patch to QEMU 2.10.0. The
simplest way to build it is to run ./build_qemu_support.sh. The script will
download, configure, and compile the QEMU binary for you.

QEMU is a big project, so this will take a while, and you may have to
resolve a couple of dependencies (most notably, you will definitely need
libtool and glib2-devel).
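
As a rough pre-flight check, the sketch below reports whether the two
dependencies named above appear to be installed. It is not part of the build
script: the have_cmd helper is hypothetical, pkg-config is assumed to be a
reasonable way to probe for the glib2 development files, and package names
differ between distributions (glib2-devel on Fedora-style systems,
libglib2.0-dev on Debian-style ones).

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the named command is on PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

MISSING=""

# libtool is named explicitly in the text above.
have_cmd libtool || MISSING="$MISSING libtool"

# glib2-devel is a package, not a command, so probe for its pkg-config
# metadata instead (assumption: pkg-config is how you would detect it).
if have_cmd pkg-config; then
  pkg-config --exists glib-2.0 || MISSING="$MISSING glib2-devel"
else
  echo "pkg-config not found; skipping the glib2-devel check" >&2
fi

if [ -n "$MISSING" ]; then
  echo "Possibly missing:$MISSING"
else
  echo "No missing prerequisites detected; try ./build_qemu_support.sh"
fi
```

The build script may check for more tools than this; treat a clean result
here as a hint, not a guarantee.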
31
32Once the binaries are compiled, you can leverage the QEMU tool by calling
33afl-fuzz and all the related utilities with -Q in the command line.
34
35Note that QEMU requires a generous memory limit to run; somewhere around
36200 MB is a good starting point, but considerably more may be needed for
37more complex programs. The default -m limit will be automatically bumped up
38to 200 MB when specifying -Q to afl-fuzz; be careful when overriding this.
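
For illustration, a QEMU-mode invocation with a raised memory limit might
look like the sketch below. The directory names, the target path, and the
400 MB figure are all hypothetical, and the qemu_fuzz_cmd helper only prints
the command so that it can be inspected before being run.

```shell
#!/bin/sh
# Hypothetical locations; adjust to your setup.
IN_DIR=testcases          # directory with initial test cases
OUT_DIR=findings          # directory for fuzzer output
TARGET=./target_binary    # closed-source binary to fuzz
MEM_LIMIT=400             # -m value in MB; with -Q the default is 200

# Print the afl-fuzz command line for QEMU mode instead of running it.
# "@@" is afl-fuzz's placeholder for the input file name.
qemu_fuzz_cmd() {
  echo "afl-fuzz -Q -m $MEM_LIMIT -i $IN_DIR -o $OUT_DIR -- $TARGET @@"
}

qemu_fuzz_cmd
```

If the target reads its input from stdin rather than from a file, drop the
trailing @@.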

In principle, if you set CPU_TARGET before calling ./build_qemu_support.sh,
you should get a build capable of running non-native binaries (say, you
can try CPU_TARGET=arm). This is also necessary for running 32-bit binaries
on a 64-bit system (CPU_TARGET=i386).
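
One way to tell whether a target is a 32-bit binary is to read the class
byte of its ELF header. The sketch below does that with od. The elf_class
helper is hypothetical, it does not verify the ELF magic, and it reports
only the word size, not the CPU architecture (a 32-bit ARM binary also
reports 32), so it cannot pick CPU_TARGET for you by itself.

```shell
#!/bin/sh
# Hypothetical helper: print 32, 64, or "unknown" for the given file.
# EI_CLASS is the byte at offset 4 of the ELF header:
#   1 = ELFCLASS32, 2 = ELFCLASS64.
elf_class() {
  cls=$(od -An -tu1 -j4 -N1 "$1" 2>/dev/null | tr -d ' \n')
  case "$cls" in
    1) echo 32 ;;
    2) echo 64 ;;
    *) echo unknown ;;
  esac
}

# Usage (hypothetical target path), on a 64-bit x86 host:
#   if [ "$(elf_class ./target_binary)" = 32 ]; then
#     CPU_TARGET=i386 ./build_qemu_support.sh
#   fi
```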

Note: if you want the QEMU helper to be installed on your system for all
users, you need to build it before issuing 'make install' in the parent
directory.

3) Notes on linking
-------------------

The feature is supported only on Linux. Supporting BSD may amount to porting
the changes made to linux-user/elfload.c and applying them to
bsd-user/elfload.c, but I have not looked into this yet.

The instrumentation follows only the .text section of the first ELF binary
encountered in the linking process. It does not trace shared libraries. In
practice, this means two things:

  - Any libraries you want to analyze *must* be linked statically into the
    executed ELF file (this will usually be the case for closed-source
    apps).

  - Standard C libraries and other stuff that is wasteful to instrument
    should be linked dynamically - otherwise, AFL will have no way to avoid
    peeking into them.

Setting AFL_INST_LIBS=1 can be used to circumvent the .text detection logic
and instrument every basic block encountered.
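
AFL_INST_LIBS is an ordinary environment variable, so it can be set for a
single run by prefixing the command. The sketch below demonstrates that
scoping with a stand-in child shell in place of afl-fuzz, so that it runs
anywhere. The PROBE string and the paths in the trailing comment are
hypothetical.

```shell
#!/bin/sh
# Stand-in for afl-fuzz: a child shell that reports whether it can see
# AFL_INST_LIBS in its environment.
PROBE='echo "AFL_INST_LIBS=${AFL_INST_LIBS:-unset}"'

# A VAR=value prefix sets the variable for that one command only.
AFL_INST_LIBS=1 sh -c "$PROBE"    # prints AFL_INST_LIBS=1

# Without the prefix, the child sees whatever the parent has exported.
sh -c "$PROBE"

# The real invocation would follow the same pattern, for example:
#   AFL_INST_LIBS=1 afl-fuzz -Q -i testcases -o findings -- ./target_binary @@
```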

4) Benchmarking
---------------

If you want to compare the performance of the QEMU instrumentation with that of
afl-gcc compiled code against the same target, you need to build the
non-instrumented binary with the same optimization flags that are normally
injected by afl-gcc, and make sure that the bits to be tested are statically
linked into the binary. A common way to do this would be:

$ CFLAGS="-O3 -funroll-loops" ./configure --disable-shared
$ make clean all

Comparative measurements of execution speed or instrumentation coverage will be
fairly meaningless if the optimization levels or instrumentation scopes don't
match.

5) Gotchas, feedback, bugs
--------------------------

If you need to fix up checksums or do other cleanup on mutated test cases, see
experimental/post_library/ for a viable solution.

Do not mix QEMU mode with ASAN, MSAN, or the like; QEMU doesn't appreciate
the "shadow VM" trick employed by the sanitizers and will probably just
run out of memory.

Compared to fully-fledged virtualization, the user emulation mode is *NOT* a
security boundary. The binaries can freely interact with the host OS. If you
somehow need to fuzz an untrusted binary, put everything in a sandbox first.

QEMU does not necessarily support all CPU or hardware features that your
target program may be utilizing. In particular, it does not appear to have
full support for AVX2 / FMA3. Using binaries for older CPUs, or recompiling them
with -march=core2, can help.

Beyond that, this is an early-stage mechanism, so field reports are welcome.
You can send them to <afl-users@googlegroups.com>.

6) Alternatives: static rewriting
---------------------------------

Statically rewriting binaries just once, instead of attempting to translate
them at run time, can be a faster alternative. That said, static rewriting is
fraught with peril, because it depends on being able to properly and fully model
program control flow without actually executing each and every code path.

If you want to experiment with this mode of operation, there is a module
contributed by Aleksandar Nikolich:

  https://github.com/vrtadmin/moflow/tree/master/afl-dyninst
  https://groups.google.com/forum/#!topic/afl-users/HlSQdbOTlpg

At this point, the author reports the possibility of hiccups with stripped
binaries. If we can get it to be comparably reliable to QEMU, we may decide
to switch to this mode, but I have not had time to play with it yet.