# Specification for the fuzz testing tool
#
# Copyright (C) 2014 Maria Kustova <maria.k@catit.be>
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program.  If not, see <http://www.gnu.org/licenses/>.


Image fuzzer
============

Description
-----------

The goal of the image fuzzer is to catch crashes of qemu-io/qemu-img
by feeding them randomly corrupted images.
Test images are generated from scratch and have a valid inner structure in
which some elements, e.g. L1/L2 tables, hold random invalid values.


Test runner
-----------

The test runner generates test images, executes tests using the generated
images, reports their results and collects all test-related artifacts (logs,
core dumps, test images, backing files).
A test is the execution of all available commands under test with the same
generated test image.
By default, the test runner generates new tests and executes them until
interrupted from the keyboard. If a test seed is specified via the '--seed'
runner parameter, only the one test with this seed is executed, and the runner
exits when it finishes.

The runner uses an external image fuzzer to generate test images. An image
generator must be specified as a mandatory parameter of the test runner.
For details about interactions between the runner and fuzzers, see "Module
interfaces".

The runner enables generation of core dumps during test execution, but it
assumes that core dumps are written to the current working directory.
For comprehensive test results, please set up your test environment
accordingly.

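As a rough illustration of the core dump setup described above, the sketch
below raises the soft core-file size limit to the hard limit using Python's
'resource' module. The function name 'enable_core_dumps' is hypothetical. The
sketch does not control where dumps are written: on Linux that is governed by
the kernel's core_pattern setting, which is part of the environment setup left
to the user.

```python
import resource


def enable_core_dumps():
    """Raise the soft core-file size limit to the current hard limit.

    Hypothetical helper, not taken from the runner. Returns the resulting
    (soft, hard) pair so the caller can log it.
    """
    _soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    # An unprivileged process may raise its soft limit up to the hard limit.
    resource.setrlimit(resource.RLIMIT_CORE, (hard, hard))
    return resource.getrlimit(resource.RLIMIT_CORE)
```
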
Paths to the binaries under test (SUTs), ``qemu-img`` and ``qemu-io``, are
retrieved from environment variables. If the environment check fails, the
runner will use SUTs installed in system paths.
``qemu-img`` is required for creation of backing files, so it's mandatory to set
the related environment variable if it's not installed in the system path.
For details about environment variables see qemu-iotests/check.

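The lookup order described above (environment first, system path as a
fallback) could be sketched as follows. The variable names QEMU_IMG and
QEMU_IO used in the comments are assumptions made for illustration only;
qemu-iotests/check remains the authoritative source for the real names. The
helper 'resolve_sut' is likewise hypothetical.

```python
import os
import shutil


def resolve_sut(env_var, default_name, environ=None):
    """Return (command, found) for one SUT.

    The command comes from the environment variable if it is set and
    non-empty, otherwise the default binary name is used. 'found' tells
    whether that command resolves to an executable.
    """
    environ = os.environ if environ is None else environ
    command = environ.get(env_var, '').strip() or default_name
    return command, shutil.which(command) is not None


# Assumed variable names, for illustration only.
qemu_img, img_found = resolve_sut('QEMU_IMG', 'qemu-img')
qemu_io, io_found = resolve_sut('QEMU_IO', 'qemu-io')
```
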
The runner accepts a JSON array of fields expected to be fuzzed via the
'--config' argument, e.g.

       '[["feature_name_table"], ["header", "l1_table_offset"]]'

Each sublist can have one or two strings defining image structure elements.
In the latter case the parent element comes first and the field name second.

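A minimal sketch of how such a '--config' string might be parsed and checked
against the one-or-two-strings rule is shown below. The function name
'parse_fuzz_config' and the error messages are hypothetical; the actual
runner may validate its input differently.

```python
import json


def parse_fuzz_config(text):
    """Parse a '--config' style JSON string and check its shape.

    Every entry must be a list holding one string (a parent element) or
    two strings (a parent element followed by a field name).
    """
    config = json.loads(text)  # raises ValueError on malformed JSON
    if not isinstance(config, list):
        raise ValueError("fuzzer configuration must be a JSON array")
    for entry in config:
        if not isinstance(entry, list) or len(entry) not in (1, 2):
            raise ValueError("bad configuration entry: %r" % (entry,))
        if not all(isinstance(item, str) for item in entry):
            raise ValueError("entry must contain strings only: %r" % (entry,))
    return config
```
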
The runner accepts a list of commands under test as a JSON array via
the '--command' argument. Each command is a list containing a SUT and all its
arguments, e.g.

       runner.py -c '[["qemu-io", "$test_img", "-c", "write $off $len"]]'
     /tmp/test ../qcow2

The following aliases can be used for variable arguments:
    - $test_img for a fuzzed image
    - $off for an offset in the fuzzed image
    - $len for a data size

Values for the last two aliases are generated based on the size of the virtual
disk of the generated image.
If no commands are specified, the runner executes commands from
the default list:
    - qemu-img check
    - qemu-img info
    - qemu-img convert
    - qemu-io -c read
    - qemu-io -c write
    - qemu-io -c aio_read
    - qemu-io -c aio_write
    - qemu-io -c flush
    - qemu-io -c discard
    - qemu-io -c truncate

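The alias substitution described above can be sketched with 'string.Template'
from the Python standard library. The helper name 'expand_command' is
hypothetical, and the way $off and $len are drawn here (an offset inside the
virtual disk and a length that stays within it) is only one plausible reading
of "based on the size of the virtual disk"; the runner may pick them
differently.

```python
import random
from string import Template


def expand_command(command, test_img, disk_size, rng=random):
    """Substitute $test_img, $off and $len in each argument of a command.

    'command' is one entry of the '--command' array, e.g.
    ["qemu-io", "$test_img", "-c", "write $off $len"].
    """
    if disk_size < 1:
        raise ValueError("virtual disk size must be positive")
    off = rng.randint(0, disk_size - 1)        # assumed range for $off
    length = rng.randint(1, disk_size - off)   # assumed range for $len
    values = {'test_img': test_img, 'off': str(off), 'len': str(length)}
    # safe_substitute leaves unknown '$name' tokens untouched.
    return [Template(arg).safe_substitute(values) for arg in command]
```

Calling expand_command(["qemu-io", "$test_img", "-c", "write $off $len"],
"/tmp/test/test.img", 1024) returns an argument list in which the second
element is the image path and the last element has the form
"write <off> <len>".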

Qcow2 image generator
---------------------

The 'qcow2' generator is a Python package providing the 'create_image' method
as its single public API. See details in the 'Test runner/image fuzzer' chapter
of 'Module interfaces'.

The qcow2 package contains two submodules: fuzz.py and layout.py.

'fuzz.py' contains all fuzzing functions, one per image field. It's assumed
that after code analysis every field will have its own constraints for its
value. For now only universal potentially dangerous values are used, e.g. type
limits for integers or unsafe symbols such as '%s' for strings. For bitmasks a
random number of bits are set to ones. All fuzzed values are checked for
inequality to the current valid value of the field. In case of equality the
value is regenerated.

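The regenerate-on-equality rule can be illustrated with the sketch below. The
names 'UINT32_LIMITS', 'fuzz_value' and 'fuzz_bitmask' are hypothetical, and
the list of boundary values is an illustrative choice, not the set used by
'fuzz.py'.

```python
import random

# Illustrative boundary values for an unsigned 32-bit field (assumption).
UINT32_LIMITS = [0, 1, 0x7FFFFFFF, 0x80000000, 0xFFFFFFFF]


def fuzz_value(current, candidates, rng=random):
    """Pick a candidate value, regenerating while it equals 'current'."""
    if all(value == current for value in candidates):
        raise ValueError("no candidate differs from the current value")
    while True:
        value = rng.choice(candidates)
        if value != current:
            return value


def fuzz_bitmask(current, width, rng=random):
    """Set a random number of randomly chosen bits in a 'width'-bit mask."""
    if width < 1:
        raise ValueError("width must be at least 1")
    while True:
        nbits = rng.randint(0, width)
        value = 0
        for bit in rng.sample(range(width), nbits):
            value |= 1 << bit
        if value != current:  # regenerate on equality
            return value
```
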
'layout.py' creates a random valid image, fuzzes a random subset of the image
fields using the 'fuzz.py' module and writes the fuzzed image to the specified
file. If a fuzzer configuration is specified, it is interpreted as follows:

    1. If a list contains a parent image element only, then some random portion
    of the fields of this element will be fuzzed in every test.
    The same behavior is applied to the entire image if no configuration is
    used. This case is useful for test specialization.

    2. If a list contains a parent element and a field name, then that field
    will always be fuzzed in every test. This case is useful for regression
    testing.

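The two cases above might be expressed as a field-selection step like the
following. This is a sketch under stated assumptions: the function name
'select_fields', the dictionary layout of 'image_fields' and the rule that a
"random portion" contains at least one field are all hypothetical, and every
parent is assumed to have at least one field.

```python
import random


def select_fields(image_fields, fuzz_config=None, rng=random):
    """Return the set of (parent, field) pairs to fuzz in one test.

    'image_fields' maps a parent element name to the list of its field
    names. 'fuzz_config' is a list of one- or two-element lists.
    """
    if not fuzz_config:
        # No configuration: a random portion of the whole image.
        pool = [(parent, field)
                for parent, fields in image_fields.items()
                for field in fields]
        return set(rng.sample(pool, rng.randint(1, len(pool))))

    selected = set()
    for entry in fuzz_config:
        parent = entry[0]
        if len(entry) == 2:
            # Case 2: the named field is always fuzzed.
            selected.add((parent, entry[1]))
        else:
            # Case 1: a random portion of the parent's fields.
            fields = image_fields[parent]
            count = rng.randint(1, len(fields))
            selected.update((parent, f) for f in rng.sample(fields, count))
    return selected
```
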
The generator can create header fields, header extensions, L1/L2 tables, and
the refcount table and blocks.

Module interfaces
-----------------

* Test runner/image fuzzer

The runner calls an image generator, specifying the path to a test image file,
the path to a backing file and its format, and a fuzzer configuration.
An image generator is expected to provide a

   'create_image(test_img_path, backing_file_path=None,
                 backing_file_format=None, fuzz_config=None)'

method that creates a test image, writes it to the specified file and returns
the size of the virtual disk.
The file should be created if it doesn't exist or overwritten otherwise.
fuzz_config is a list of lists. Every sublist can have one
or two elements: the first element is the name of a parent image element, and
the second one, if present, is the name of a field in this element.
For example,
        [['header', 'l1_table_offset'],
         ['header', 'nb_snapshots'],
         ['feature_name_table']]

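To make the contract concrete, here is a toy generator that satisfies the
'create_image' signature. It is only a sketch of the interface: it writes a
placeholder payload rather than a real qcow2 image, ignores the backing file
and the fuzzer configuration, and the size range it draws from is an arbitrary
assumption.

```python
import random

SECTOR_SIZE = 512  # assumption used only to pick a plausible disk size


def create_image(test_img_path, backing_file_path=None,
                 backing_file_format=None, fuzz_config=None):
    """Toy stand-in for an image generator (NOT a qcow2 writer).

    Creates or overwrites 'test_img_path' and returns the size of the
    virtual disk in bytes.
    """
    # Draw from the global RNG so that a seed set by the caller
    # determines the result.
    virtual_size = random.randint(1, 64) * SECTOR_SIZE
    # Mode 'wb' creates the file if missing and truncates it otherwise.
    with open(test_img_path, 'wb') as img:
        img.write(b'\0' * SECTOR_SIZE)  # placeholder payload
    return virtual_size
```
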
The random seed is set by the runner at every test execution for regression
purposes, so an image generator should not modify it internally.

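The reason a generator should leave the seed alone can be seen in the sketch
below, where the caller seeds Python's global random number generator and a
generator that merely consumes it becomes reproducible. The helper 'run_test'
and the way a missing seed is chosen are hypothetical.

```python
import random
import sys


def run_test(generate, seed=None):
    """Seed the global RNG, call the generator, return (seed, result).

    'generate' is any callable that draws from the 'random' module
    without reseeding it.
    """
    if seed is None:
        seed = random.randrange(sys.maxsize)  # assumed seed range
    random.seed(seed)
    return seed, generate()
```

Running run_test(gen, 42) twice yields the same result for any 'gen' that
relies only on the global generator. A 'gen' that called random.seed() itself
would break this property, which is why reseeding inside a generator is
discouraged.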

Overall fuzzer requirements
===========================

Input data:
----------

 - image template (generator)
 - work directory
 - action vector (optional)
 - seed (optional)
 - SUT and its arguments (optional)


Fuzzer requirements:
-------------------

1.  Should be able to inject random data
2.  Should be able to select a random value from the manually pregenerated
    vector (boundary values, e.g. max/min cluster size)
3.  Image template should describe a general structure invariant for all
    test images (image format description)
4.  Image template should be autonomous and other fuzzer parts should not
    rely on it
5.  Image template should contain reference rules (not only block+size
    description)
6.  Should generate the test image with the correct structure based on an image
    template
7.  Should accept a seed as an argument (for regression purposes)
8.  Should generate a seed if it is not specified as an input parameter.
9.  The same seed should generate the same image for the same action vector,
    specified or generated.
10. Should accept a vector of actions as an argument (for test reproduction and
    for test case specification, e.g. group of tests for header structure,
    group of tests for snapshots, etc.)
11. Action vector should be randomly generated from the pool of available
    actions, if it is not specified as an input parameter
12. Pool of actions should be defined automatically based on an image template
13. Should accept a SUT and its call parameters as an argument or select them
    randomly otherwise. Since it is expected to change rarely, the list
    of all possible test commands can be available in the test runner
    internally.
14. Should support an external cancellation of a test run
15. Seed should be logged (for regression purposes)
16. All files related to a test result should be collected: a test image,
    SUT logs, fuzzer logs and crash dumps
17. Should be compatible with Python version 2.4-2.7
18. Usage of external libraries should be limited as much as possible.


Image formats:
-------------

The main target image format is qcow2, but support for image templates should
make it possible to add any other image format.


Effectiveness:
-------------

The fuzzer can be controlled via template, seed and action vector;
this makes the fuzzer itself invariant to image format and test logic.
It should be able to perform rather complex and precise tests that can be
specified via an action vector. Otherwise, knowledge about an image structure
allows the fuzzer to generate the pool of all available areas that can be
fuzzed, randomly select some of them, and so compose its own action vector.
Also, the complexity of a template defines the complexity of the fuzzer, so its
functionality can vary from simple model-independent fuzzing to smart
model-based fuzzing.


Glossary:
--------

Action vector is a sequence of structure elements retrieved from an image
format, each of which will be fuzzed for the test image. It's a subset of
elements of the action pool. Example: header, refcount table, etc.
Action pool is the set of all available elements of an image structure,
generated automatically from an image template.
Image template is a formal description of an image structure and relations
between image blocks.
Test image is an output image of the fuzzer defined by the current seed and
action vector.