.xz Test Files
----------------

0. Introduction

    This directory contains a bunch of files for testing the handling of
    .xz files in .xz decoder implementations. Many of the files were
    created by hand with a hex editor, so there is no better "source
    code" than the files themselves. All the test files (*.xz) and this
    README have been put into the public domain.


1. File Types

    Good files (good-*.xz) must decode successfully without requiring
    a lot of CPU time or RAM.

    Unsupported files (unsupported-*.xz) are good files, but their
    headers indicate features not supported by the current file format
    specification.

    Bad files (bad-*.xz) must cause the decoder to give an error. As
    with the good files, these files must not require a lot of CPU time
    or RAM before they are detected as broken.
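    The naming convention maps directly onto a test harness. The
    following is a minimal sketch in Python, assuming the decoder under
    test is exposed as a callable that returns the decoded bytes and
    raises an exception on any error. The names expected_outcome and
    run_case are hypothetical and not part of any existing tool.

```python
def expected_outcome(filename):
    """Map a test file name to the outcome its prefix promises."""
    for prefix in ("good", "bad", "unsupported"):
        if filename.startswith(prefix + "-") and filename.endswith(".xz"):
            return prefix
    raise ValueError("not a test file name: " + filename)


def run_case(filename, data, decode):
    """Return True if `decode` behaves as the file name requires.

    `decode` is an assumed interface: it takes the file contents and
    either returns the decoded bytes or raises an exception.
    """
    outcome = expected_outcome(filename)
    try:
        decode(data)
        decoded_ok = True
    except Exception:
        decoded_ok = False
    if outcome == "good":
        return decoded_ok
    if outcome == "bad":
        return not decoded_ok
    # Unsupported files: rejecting them is acceptable, and for some of
    # them decoding is acceptable too, so this sketch accepts both.
    return True
```

    A real harness would also bound CPU time and memory use, since both
    good and bad files must be handled cheaply.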


2. Descriptions of Individual Files

2.1. Good Files

    good-0-empty.xz has one Stream with no Blocks.

    good-0pad-empty.xz has one Stream with no Blocks followed by
    four-byte Stream Padding.

    good-0cat-empty.xz has two zero-Block Streams concatenated without
    Stream Padding.

    good-0catpad-empty.xz has two zero-Block Streams concatenated with
    four-byte Stream Padding between the Streams.

    good-1-check-none.xz has one Stream with one Block with two
    uncompressed LZMA2 chunks and no integrity check.

    good-1-check-crc32.xz has one Stream with one Block with two
    uncompressed LZMA2 chunks and CRC32 check.

    good-1-check-crc64.xz is like good-1-check-crc32.xz but with CRC64.

    good-1-check-sha256.xz is like good-1-check-crc32.xz but with
    SHA-256.
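    The four check types can be observed with Python's standard lzma
    module, which wraps liblzma. This sketch (the helper names
    check_id_of and demo are hypothetical) compresses a small payload
    with each check and reads the Check ID back both from the decoder
    and from the second byte of the Stream Flags field, which follows
    the six Header Magic Bytes. It generates fresh data and does not
    reproduce the test files themselves.

```python
import lzma

CHECKS = {
    "none": lzma.CHECK_NONE,
    "crc32": lzma.CHECK_CRC32,
    "crc64": lzma.CHECK_CRC64,
    "sha256": lzma.CHECK_SHA256,
}


def check_id_of(xz_data):
    """Decode one Stream; return (payload, Check ID reported by liblzma)."""
    decoder = lzma.LZMADecompressor(format=lzma.FORMAT_XZ)
    payload = decoder.decompress(xz_data)
    return payload, decoder.check


def demo(payload=b"example payload\n"):
    results = {}
    for name, check in CHECKS.items():
        xz_data = lzma.compress(payload, check=check)
        decoded, seen = check_id_of(xz_data)
        # Byte 7 is the second Stream Flags byte; its low four bits
        # hold the Check ID.
        results[name] = (decoded, seen, xz_data[7] & 0x0F)
    return results
```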

    good-2-lzma2.xz has one Stream with two Blocks with one uncompressed
    LZMA2 chunk in each Block.

    good-1-block_header-1.xz has both Compressed Size and Uncompressed
    Size in the Block Header. It also has four extra bytes of Header
    Padding.

    good-1-block_header-2.xz has known Compressed Size.

    good-1-block_header-3.xz has known Uncompressed Size.

    good-1-delta-lzma2.tiff.xz is an image file that compresses
    better with Delta+LZMA2 than with plain LZMA2.

    good-1-x86-lzma2.xz uses the x86 filter (BCJ) and LZMA2. The
    uncompressed file is compress_prepared_bcj_x86 found in the tests
    directory.

    good-1-sparc-lzma2.xz uses the SPARC filter and LZMA2. The
    uncompressed file is compress_prepared_bcj_sparc found in the tests
    directory.

    good-1-lzma2-1.xz has two LZMA2 chunks, of which the second sets
    new properties.

    good-1-lzma2-2.xz has two LZMA2 chunks, of which the second resets
    the state without specifying new properties.

    good-1-lzma2-3.xz has two LZMA2 chunks, of which the first is
    uncompressed and the second is LZMA. The first chunk resets the
    dictionary and the second sets new properties.

    good-1-lzma2-4.xz has three LZMA2 chunks: the first is LZMA, the
    second is uncompressed with dictionary reset, and the third is LZMA
    with new properties but without dictionary reset.

    good-1-lzma2-5.xz has an empty LZMA2 stream with only the end of
    payload marker. XZ Utils 5.0.1 and older incorrectly see this file
    as corrupt.

    good-1-3delta-lzma2.xz has three Delta filters and LZMA2.


2.2. Unsupported Files

    unsupported-check.xz uses Check ID 0x02, which isn't supported by
    the current version of the file format. It is implementation-defined
    how this file is handled (the decoder may reject it, or decode it,
    possibly with a warning).

    unsupported-block_header.xz has a non-null byte in Header Padding,
    which may indicate the presence of a new unsupported field.

    unsupported-filter_flags-1.xz has unsupported Filter ID 0x7F.

    unsupported-filter_flags-2.xz specifies only the Delta filter in the
    List of Filter Flags, but Delta isn't allowed as the last filter in
    the chain. It would be slightly more correct to detect this file as
    corrupt instead of unsupported, but saying it is unsupported is
    simpler in the case of liblzma.

    unsupported-filter_flags-3.xz specifies two LZMA2 filters in the
    List of Filter Flags. LZMA2 is allowed only as the last filter in the
    chain. It would be slightly more correct to detect this file as
    corrupt instead of unsupported, but saying it is unsupported is
    simpler in the case of liblzma.
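    The filter chain rules behind the three filter_flags files can be
    sketched as a small validator. This is an illustration only: the
    function name classify_chain is hypothetical, only the filters
    mentioned in this README are listed, and the Filter IDs are taken
    from the constants in Python's standard lzma module. Following
    the liblzma behavior described above, every violation is reported
    as "unsupported".

```python
import lzma

# Filters that may only appear before the last position in the chain.
NON_LAST_ONLY = {lzma.FILTER_DELTA, lzma.FILTER_X86, lzma.FILTER_SPARC}
# Filters that may only appear as the last filter in the chain.
LAST_ONLY = {lzma.FILTER_LZMA2}


def classify_chain(filter_ids):
    """Return "ok" or "unsupported" for a list of Filter IDs."""
    if not 1 <= len(filter_ids) <= 4:
        raise ValueError("a Block has one to four filters")
    last = len(filter_ids) - 1
    for position, filter_id in enumerate(filter_ids):
        if filter_id in LAST_ONLY:
            if position != last:
                return "unsupported"
        elif filter_id in NON_LAST_ONLY:
            if position == last:
                return "unsupported"
        else:
            # Unknown Filter ID, such as 0x7F.
            return "unsupported"
    return "ok"
```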


2.3. Bad Files

    bad-0pad-empty.xz has one Stream with no Blocks followed by
    five-byte Stream Padding. Stream Padding must be a multiple of four
    bytes, thus this file is corrupt.

    bad-0catpad-empty.xz has two zero-Block Streams concatenated with
    five-byte Stream Padding between the Streams.
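    The Stream Padding rule can be checked while decoding concatenated
    Streams. The following sketch relies on lzma.LZMADecompressor from
    Python's standard library, which decodes a single Stream and
    leaves any following bytes in unused_data. The function name
    decode_concatenated is hypothetical. Note that the convenience
    function lzma.decompress is more lenient about trailing data, so it
    is not used here.

```python
import lzma


def decode_concatenated(data):
    """Decode every Stream in `data`, validating Stream Padding."""
    output = []
    while True:
        decoder = lzma.LZMADecompressor(format=lzma.FORMAT_XZ)
        output.append(decoder.decompress(data))
        if not decoder.eof:
            raise lzma.LZMAError("truncated Stream")
        rest = decoder.unused_data
        # Stream Padding consists of null bytes only. A Stream Header
        # never starts with a null byte, so stripping them is safe.
        stripped = rest.lstrip(b"\x00")
        padding_size = len(rest) - len(stripped)
        if padding_size % 4 != 0:
            raise lzma.LZMAError(
                "Stream Padding is not a multiple of four bytes")
        if not stripped:
            return b"".join(output)
        data = stripped
```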

    bad-0cat-alone.xz is good-0-empty.xz concatenated with an empty
    LZMA_Alone file.

    bad-0cat-header_magic.xz is good-0cat-empty.xz but with one byte
    wrong in the Header Magic Bytes field of the second Stream. liblzma
    gives LZMA_DATA_ERROR for this. (LZMA_FORMAT_ERROR is used only if
    the first Stream of a file has invalid Header Magic Bytes.)

    bad-0-header_magic.xz is good-0-empty.xz but with one byte wrong
    in the Header Magic Bytes field. liblzma gives LZMA_FORMAT_ERROR for
    this.

    bad-0-footer_magic.xz is good-0-empty.xz but with one byte wrong
    in the Footer Magic Bytes field. liblzma gives LZMA_DATA_ERROR for
    this.

    bad-0-empty-truncated.xz is good-0-empty.xz without the last byte
    of the file.

    bad-0-nonempty_index.xz has no Blocks but Index claims that there is
    one Block.

    bad-0-backward_size.xz has wrong Backward Size in Stream Footer.

    bad-1-stream_flags-1.xz has different Stream Flags in Stream Header
    and Stream Footer.

    bad-1-stream_flags-2.xz has wrong CRC32 in Stream Header.

    bad-1-stream_flags-3.xz has wrong CRC32 in Stream Footer.
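    The magic byte, CRC32, and Stream Flags checks exercised by the
    files above can be sketched as follows. The layout assumed here
    is the 12-byte Stream Header (magic, Stream Flags, CRC32) and
    the 12-byte Stream Footer (CRC32, Backward Size, Stream Flags,
    magic) with little-endian integers. The function names are
    hypothetical, and every failure is reported as a plain ValueError
    instead of distinguishing format errors from data errors the way
    liblzma does.

```python
import struct
import zlib

HEADER_MAGIC = b"\xfd7zXZ\x00"
FOOTER_MAGIC = b"YZ"


def parse_stream_header(header):
    """Validate a 12-byte Stream Header and return its Stream Flags."""
    if len(header) != 12:
        raise ValueError("Stream Header must be 12 bytes")
    if header[:6] != HEADER_MAGIC:
        raise ValueError("wrong Header Magic Bytes")
    flags = header[6:8]
    (stored_crc,) = struct.unpack("<I", header[8:12])
    if zlib.crc32(flags) != stored_crc:
        raise ValueError("wrong CRC32 in Stream Header")
    return flags


def parse_stream_footer(footer):
    """Validate a 12-byte Stream Footer.

    Returns (Stream Flags, Backward Size in bytes).
    """
    if len(footer) != 12:
        raise ValueError("Stream Footer must be 12 bytes")
    if footer[10:12] != FOOTER_MAGIC:
        raise ValueError("wrong Footer Magic Bytes")
    stored_crc, stored_backward_size = struct.unpack("<II", footer[:8])
    # The CRC32 covers the Backward Size and Stream Flags fields.
    if zlib.crc32(footer[4:10]) != stored_crc:
        raise ValueError("wrong CRC32 in Stream Footer")
    # The stored value counts four-byte units, minus one.
    return footer[8:10], (stored_backward_size + 1) * 4
```

    A decoder would additionally compare the two Stream Flags values
    with each other and check Backward Size against the actual size of
    the Index.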

    bad-1-vli-1.xz has a two-byte variable-length integer in the
    Uncompressed Size field in Block Header while one byte would be
    enough for that value. It's important that the file gets rejected
    because the integer encoding is too long, not because the actual
    uncompressed size doesn't match the value stored in the Block
    Header. That is, the decoder must not try to decode the Compressed
    Data field.

    bad-1-vli-2.xz has a ten-byte variable-length integer as
    Uncompressed Size in Block Header. It's important that the file gets
    rejected because the integer encoding is too long, not because the
    actual uncompressed size doesn't match the value stored in the
    Block Header. That is, the decoder must not try to decode the
    Compressed Data field.
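    Both rejections happen inside the variable-length integer decoder
    itself. A minimal Python sketch is given here, assuming the usual
    .xz encoding of seven bits per byte, least significant group
    first, with the high bit set on every byte except the last, and
    at most nine bytes. The function names are hypothetical.

```python
def encode_vli(value):
    """Encode a non-negative integer below 2**63 in the shortest form."""
    if not 0 <= value < 2 ** 63:
        raise ValueError("value out of range")
    out = bytearray()
    while value >= 0x80:
        out.append((value & 0x7F) | 0x80)
        value >>= 7
    out.append(value)
    return bytes(out)


def decode_vli(buf):
    """Decode one integer from the start of `buf`.

    Returns (value, number of bytes used). Raises ValueError if the
    encoding is truncated, longer than nine bytes, or not minimal.
    """
    size_max = min(len(buf), 9)
    if size_max == 0:
        raise ValueError("no input")
    value = buf[0] & 0x7F
    i = 0
    while buf[i] & 0x80:
        i += 1
        if i >= size_max:
            raise ValueError("integer is truncated or too long")
        if buf[i] == 0x00:
            # A final null byte adds nothing to the value, so a
            # shorter encoding exists (the bad-1-vli-1.xz case).
            raise ValueError("integer encoding is not minimal")
        value |= (buf[i] & 0x7F) << (7 * i)
    return value, i + 1
```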

    bad-1-block_header-1.xz has a Block Header that ends in the middle
    of the Filter Flags field.

    bad-1-block_header-2.xz has a Block Header that has Compressed Size
    and Uncompressed Size but no List of Filter Flags field.

    bad-1-block_header-3.xz has wrong CRC32 in Block Header.

    bad-1-block_header-4.xz has a too big Compressed Size in Block
    Header (2^63 - 1 bytes while the maximum is a little less, because
    the whole Block must stay smaller than 2^63). It's important that
    the file gets rejected due to the invalid Compressed Size value; the
    decoder must not try decoding the Compressed Data field.

    bad-1-block_header-5.xz has zero as Compressed Size in Block Header.

    bad-1-block_header-6.xz has a corrupt Block Header which may crash
    xz -lvv in XZ Utils 5.0.3 and earlier. It was fixed in the commit
    c0297445064951807803457dca1611b3c47e7f0f.

    bad-2-index-1.xz has wrong Unpadded Sizes in Index.

    bad-2-index-2.xz has wrong Uncompressed Sizes in Index.

    bad-2-index-3.xz has a non-null byte in Index Padding.

    bad-2-index-4.xz has wrong CRC32 in Index.

    bad-2-index-5.xz has zero as Unpadded Size. It is important that the
    file gets rejected specifically due to Unpadded Size having an invalid
    value.

    bad-2-compressed_data_padding.xz has a non-null byte in the padding
    of the Compressed Data field of the first Block.

    bad-1-check-crc32.xz has wrong Check (CRC32).

    bad-1-check-crc64.xz has wrong Check (CRC64).

    bad-1-check-sha256.xz has wrong Check (SHA-256).

    bad-1-lzma2-1.xz has an LZMA2 stream whose first chunk (uncompressed)
    doesn't reset the dictionary.

    bad-1-lzma2-2.xz has two LZMA2 chunks, of which the second chunk
    indicates dictionary reset, but the LZMA compressed data tries to
    repeat data from the previous chunk.

    bad-1-lzma2-3.xz sets new invalid properties (lc=8, lp=0, pb=0) in
    the middle of Block.
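    The properties in bad-1-lzma2-3.xz would be acceptable for plain
    LZMA but not for LZMA2, which additionally requires lc + lp <= 4.
    A sketch of decoding the properties byte, assuming the usual
    packing (pb * 5 + lp) * 9 + lc; the function name is hypothetical:

```python
def decode_lzma2_props(byte):
    """Split an LZMA properties byte into (lc, lp, pb) for LZMA2."""
    if not 0 <= byte <= (4 * 5 + 4) * 9 + 8:
        raise ValueError("properties byte out of range")
    pb, rest = divmod(byte, 9 * 5)
    lp, lc = divmod(rest, 9)
    if lc + lp > 4:
        raise ValueError("LZMA2 requires lc + lp <= 4")
    return lc, lp, pb
```

    With this packing, lc=8, lp=0, pb=0 is the byte value 8, which
    passes the range check and fails only the LZMA2-specific one.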

    bad-1-lzma2-4.xz has two LZMA2 chunks, of which the first is
    uncompressed and the second is LZMA. The first chunk resets the
    dictionary as it should, but the second chunk tries to reset the
    state without specifying properties for LZMA.

    bad-1-lzma2-5.xz is like bad-1-lzma2-4.xz but doesn't try to reset
    anything in the header of the second chunk.

    bad-1-lzma2-6.xz has a reserved LZMA2 control byte value (0x03).

    bad-1-lzma2-7.xz has an end of payload marker (EOPM) at the LZMA
    level.

    bad-1-lzma2-8.xz is like good-1-lzma2-4.xz but doesn't set new
    properties in the third LZMA2 chunk.
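    Several of the LZMA2 cases above can be detected from the chunk
    control bytes alone. The sketch below tracks two flags in the
    manner of a typical decoder. It assumes the following control byte
    meanings: 0x00 ends the LZMA2 stream, 0x01 and 0x02 are
    uncompressed chunks with and without dictionary reset, 0x03 to
    0x7F are reserved, and 0x80 to 0xFF are LZMA chunks where values
    from 0xA0 reset the state, from 0xC0 also set new properties, and
    from 0xE0 also reset the dictionary. The function name is
    hypothetical, and the control byte sequences in the usage note are
    inferred from the descriptions above, not read from the files.

```python
def validate_lzma2_controls(controls):
    """Check a sequence of LZMA2 control bytes for header-level errors."""
    need_dict_reset = True  # the first chunk must reset the dictionary
    need_props = True       # the first LZMA chunk after a dictionary
                            # reset must set new properties
    for control in controls:
        if control == 0x00:
            return True  # end of the LZMA2 stream
        if control >= 0xE0 or control == 0x01:
            need_dict_reset = False
            need_props = True
        elif need_dict_reset:
            raise ValueError("first chunk doesn't reset the dictionary")
        if control >= 0x80:
            if control >= 0xC0:
                need_props = False
            elif need_props:
                raise ValueError("LZMA chunk without required properties")
        elif control > 0x02:
            raise ValueError("reserved control byte")
    raise ValueError("missing end marker")
```

    Under these assumptions, a sequence such as 0x01, 0xC0, 0x00
    (the shape of good-1-lzma2-3.xz) is accepted, while 0x02, 0x00
    (bad-1-lzma2-1.xz), 0x01, 0xA0, 0x00 (bad-1-lzma2-4.xz), and
    0x01, 0x80, 0x00 (bad-1-lzma2-5.xz) are rejected. Errors inside
    the LZMA data, as in bad-1-lzma2-2.xz and bad-1-lzma2-7.xz, are
    outside the scope of this check.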