.xz Test Files
----------------

0. Introduction

    This directory contains a bunch of files to test handling of .xz
    files in .xz decoder implementations. Many of the files have been
    created by hand with a hex editor, thus there is no better "source
    code" than the files themselves. All the test files (*.xz) and this
    README have been put into the public domain.


1. File Types

    Good files (good-*.xz) must decode successfully without requiring
    a lot of CPU time or RAM.

    Unsupported files (unsupported-*.xz) are good files, but headers
    indicate features not supported by the current file format
    specification.

    Bad files (bad-*.xz) must cause the decoder to give an error. Like
    with the good files, these files must not require a lot of CPU time
    or RAM before they get detected to be broken.


2. Descriptions of Individual Files

2.1. Good Files

    good-0-empty.xz has one Stream with no Blocks.

    good-0pad-empty.xz has one Stream with no Blocks followed by
    four-byte Stream Padding.

    good-0cat-empty.xz has two zero-Block Streams concatenated without
    Stream Padding.

    good-0catpad-empty.xz has two zero-Block Streams concatenated with
    four-byte Stream Padding between the Streams.

    good-1-check-none.xz has one Stream with one Block with two
    uncompressed LZMA2 chunks and no integrity check.

    good-1-check-crc32.xz has one Stream with one Block with two
    uncompressed LZMA2 chunks and CRC32 check.

    good-1-check-crc64.xz is like good-1-check-crc32.xz but with CRC64.

    good-1-check-sha256.xz is like good-1-check-crc32.xz but with
    SHA-256.

    good-2-lzma2.xz has one Stream with two Blocks with one uncompressed
    LZMA2 chunk in each Block.

    good-1-block_header-1.xz has both Compressed Size and Uncompressed
    Size in the Block Header. It also has four extra bytes of Header
    Padding.

    good-1-block_header-2.xz has known Compressed Size.
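
    As an illustration of the zero-Block Streams described earlier in
    this section (the shape of good-0-empty.xz and good-0cat-empty.xz),
    the following Python sketch assembles such a Stream by hand and
    decodes it with the standard lzma module. The helper names
    (crc32_le, empty_stream) and the choice of Check ID 0x01 are
    assumptions made for this example; the actual test files may use
    different Stream Flags, so the bytes are not claimed to be
    identical to the files in this directory.

```python
import lzma
import struct
import zlib

HEADER_MAGIC = b"\xfd7zXZ\x00"   # Header Magic Bytes
FOOTER_MAGIC = b"YZ"             # Footer Magic Bytes


def crc32_le(data: bytes) -> bytes:
    """Return the CRC32 of data as four little-endian bytes."""
    return struct.pack("<I", zlib.crc32(data) & 0xFFFFFFFF)


def empty_stream(check_id: int = 0x01) -> bytes:
    """Build one Stream with no Blocks (hypothetical helper)."""
    # Stream Flags: one null byte followed by the Check ID.
    stream_flags = bytes([0x00, check_id])
    header = HEADER_MAGIC + stream_flags + crc32_le(stream_flags)

    # Index: Index Indicator (0x00), Number of Records (0 as a one-byte
    # integer), two bytes of Index Padding, then CRC32 of those bytes.
    index_body = b"\x00\x00" + b"\x00\x00"
    index = index_body + crc32_le(index_body)

    # Backward Size is stored as (size of Index / 4) - 1.
    backward_size = struct.pack("<I", len(index) // 4 - 1)
    footer = (crc32_le(backward_size + stream_flags)
              + backward_size + stream_flags + FOOTER_MAGIC)
    return header + index + footer


one = empty_stream()
two = one + one  # two zero-Block Streams without Stream Padding

print(len(one))
print(lzma.decompress(one, format=lzma.FORMAT_XZ))
print(lzma.decompress(two, format=lzma.FORMAT_XZ))
```

    Stream Padding is left out on purpose: how a given binding treats
    trailing null bytes depends on how it drives liblzma, so that case
    is better checked against the real good-0pad-empty.xz and
    good-0catpad-empty.xz files.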

    good-1-block_header-3.xz has known Uncompressed Size.

    good-1-delta-lzma2.tiff.xz is an image file that compresses
    better with Delta+LZMA2 than with plain LZMA2.

    good-1-x86-lzma2.xz uses the x86 filter (BCJ) and LZMA2. The
    uncompressed file is compress_prepared_bcj_x86 found in the tests
    directory.

    good-1-sparc-lzma2.xz uses the SPARC filter and LZMA2. The
    uncompressed file is compress_prepared_bcj_sparc found in the tests
    directory.

    good-1-lzma2-1.xz has two LZMA2 chunks, of which the second sets
    new properties.

    good-1-lzma2-2.xz has two LZMA2 chunks, of which the second resets
    the state without specifying new properties.

    good-1-lzma2-3.xz has two LZMA2 chunks, of which the first is
    uncompressed and the second is LZMA. The first chunk resets
    dictionary and the second sets new properties.

    good-1-lzma2-4.xz has three LZMA2 chunks: First is LZMA, second is
    uncompressed with dictionary reset, and third is LZMA with new
    properties but without dictionary reset.

    good-1-lzma2-5.xz has an empty LZMA2 stream with only the end of
    payload marker. XZ Utils 5.0.1 and older incorrectly see this file
    as corrupt.

    good-1-3delta-lzma2.xz has three Delta filters and LZMA2.


2.2. Unsupported Files

    unsupported-check.xz uses Check ID 0x02 which isn't supported by
    the current version of the file format. It is implementation-defined
    how this file is handled (a decoder may reject it, or decode it,
    possibly with a warning).

    unsupported-block_header.xz has a non-null byte in Header Padding,
    which may indicate presence of a new unsupported field.

    unsupported-filter_flags-1.xz has unsupported Filter ID 0x7F.

    unsupported-filter_flags-2.xz specifies only Delta filter in the
    List of Filter Flags, but Delta isn't allowed as the last filter in
    the chain.
    It could be a little more correct to detect this file as
    corrupt instead of unsupported, but saying it is unsupported is
    simpler in case of liblzma.

    unsupported-filter_flags-3.xz specifies two LZMA2 filters in the
    List of Filter Flags. LZMA2 is allowed only as the last filter in
    the chain. It could be a little more correct to detect this file as
    corrupt instead of unsupported, but saying it is unsupported is
    simpler in case of liblzma.


2.3. Bad Files

    bad-0pad-empty.xz has one Stream with no Blocks followed by
    five-byte Stream Padding. Stream Padding must be a multiple of four
    bytes, thus this file is corrupt.

    bad-0catpad-empty.xz has two zero-Block Streams concatenated with
    five-byte Stream Padding between the Streams.

    bad-0cat-alone.xz is good-0-empty.xz concatenated with an empty
    LZMA_Alone file.

    bad-0cat-header_magic.xz is good-0cat-empty.xz but with one byte
    wrong in the Header Magic Bytes field of the second Stream. liblzma
    gives LZMA_DATA_ERROR for this. (LZMA_FORMAT_ERROR is used only if
    the first Stream of a file has invalid Header Magic Bytes.)

    bad-0-header_magic.xz is good-0-empty.xz but with one byte wrong
    in the Header Magic Bytes field. liblzma gives LZMA_FORMAT_ERROR for
    this.

    bad-0-footer_magic.xz is good-0-empty.xz but with one byte wrong
    in the Footer Magic Bytes field. liblzma gives LZMA_DATA_ERROR for
    this.

    bad-0-empty-truncated.xz is good-0-empty.xz without the last byte
    of the file.

    bad-0-nonempty_index.xz has no Blocks but Index claims that there is
    one Block.

    bad-0-backward_size.xz has wrong Backward Size in Stream Footer.

    bad-1-stream_flags-1.xz has different Stream Flags in Stream Header
    and Stream Footer.

    bad-1-stream_flags-2.xz has wrong CRC32 in Stream Header.

    bad-1-stream_flags-3.xz has wrong CRC32 in Stream Footer.
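
    The Stream-level defects listed above can be reproduced on a
    hand-built zero-Block Stream. The Python sketch below is only an
    approximation of the bad-0-* and bad-1-stream_flags-* files: the
    helper names (zero_block_stream, flip, is_rejected), the Check IDs
    and the byte offsets are assumptions of this example, and the real
    files may be corrupted at different positions. Python's lzma module
    reports every liblzma failure as lzma.LZMAError, so the sketch
    cannot tell LZMA_FORMAT_ERROR from LZMA_DATA_ERROR.

```python
import lzma
import struct
import zlib


def crc32_le(data: bytes) -> bytes:
    """Return the CRC32 of data as four little-endian bytes."""
    return struct.pack("<I", zlib.crc32(data) & 0xFFFFFFFF)


def zero_block_stream(header_check=0x01, footer_check=0x01,
                      backward_size=1) -> bytes:
    """Build a Stream with no Blocks; the defaults give a valid Stream."""
    header_flags = bytes([0x00, header_check])
    footer_flags = bytes([0x00, footer_check])
    header = b"\xfd7zXZ\x00" + header_flags + crc32_le(header_flags)
    index_body = b"\x00\x00\x00\x00"       # indicator, 0 records, padding
    index = index_body + crc32_le(index_body)
    backward = struct.pack("<I", backward_size)  # (Index size / 4) - 1
    footer = (crc32_le(backward + footer_flags)
              + backward + footer_flags + b"YZ")
    return header + index + footer


def flip(data: bytes, offset: int) -> bytes:
    """Return a copy of data with every bit of one byte inverted."""
    corrupted = bytearray(data)
    corrupted[offset] ^= 0xFF
    return bytes(corrupted)


def is_rejected(data: bytes) -> bool:
    """True if the decoder refuses the input."""
    try:
        lzma.decompress(data, format=lzma.FORMAT_XZ)
    except lzma.LZMAError:
        return True
    return False


good = zero_block_stream()
cases = {
    "wrong Header Magic Bytes": flip(good, 0),
    "wrong CRC32 in Stream Header": flip(good, 8),
    "wrong CRC32 in Stream Footer": flip(good, 20),
    "wrong Footer Magic Bytes": flip(good, 31),
    "last byte missing": good[:-1],
    "Stream Flags differ": zero_block_stream(footer_check=0x04),
    "wrong Backward Size": zero_block_stream(backward_size=2),
}
for name, data in cases.items():
    print(name, "->", "rejected" if is_rejected(data) else "accepted")
```

    The last two cases recompute the Stream Footer CRC32 so that the
    decoder has to notice the mismatch itself instead of failing on
    the CRC32 check.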

    bad-1-vli-1.xz has a two-byte variable-length integer in the
    Uncompressed Size field in Block Header while one byte would be
    enough for that value. It's important that the file gets rejected
    due to too big integer encoding instead of due to Uncompressed Size
    not matching the value stored in the Block Header. That is, the
    decoder must not try to decode the Compressed Data field.

    bad-1-vli-2.xz has a ten-byte variable-length integer as
    Uncompressed Size in Block Header. It's important that the file gets
    rejected due to too big integer encoding instead of due to
    Uncompressed Size not matching the value stored in the Block Header.
    That is, the decoder must not try to decode the Compressed Data
    field.

    bad-1-block_header-1.xz has Block Header that ends in the middle of
    the Filter Flags field.

    bad-1-block_header-2.xz has Block Header that has Compressed Size
    and Uncompressed Size but no List of Filter Flags field.

    bad-1-block_header-3.xz has wrong CRC32 in Block Header.

    bad-1-block_header-4.xz has too big Compressed Size in Block Header
    (2^63 - 1 bytes while maximum is a little less, because the whole
    Block must stay smaller than 2^63). It's important that the file
    gets rejected due to invalid Compressed Size value; the decoder
    must not try decoding the Compressed Data field.

    bad-1-block_header-5.xz has zero as Compressed Size in Block Header.

    bad-1-block_header-6.xz has corrupt Block Header which may crash
    xz -lvv in XZ Utils 5.0.3 and earlier. It was fixed in the commit
    c0297445064951807803457dca1611b3c47e7f0f.

    bad-2-index-1.xz has wrong Unpadded Sizes in Index.

    bad-2-index-2.xz has wrong Uncompressed Sizes in Index.

    bad-2-index-3.xz has non-null byte in Index Padding.

    bad-2-index-4.xz has wrong CRC32 in Index.

    bad-2-index-5.xz has zero as Unpadded Size.
    It is important that the file gets rejected specifically due to
    Unpadded Size having an invalid value.

    bad-2-compressed_data_padding.xz has non-null byte in the padding of
    the Compressed Data field of the first Block.

    bad-1-check-crc32.xz has wrong Check (CRC32).

    bad-1-check-crc64.xz has wrong Check (CRC64).

    bad-1-check-sha256.xz has wrong Check (SHA-256).

    bad-1-lzma2-1.xz has LZMA2 stream whose first chunk (uncompressed)
    doesn't reset the dictionary.

    bad-1-lzma2-2.xz has two LZMA2 chunks, of which the second chunk
    indicates dictionary reset, but the LZMA compressed data tries to
    repeat data from the previous chunk.

    bad-1-lzma2-3.xz sets new invalid properties (lc=8, lp=0, pb=0) in
    the middle of Block.

    bad-1-lzma2-4.xz has two LZMA2 chunks, of which the first is
    uncompressed and the second is LZMA. The first chunk resets
    dictionary as it should, but the second chunk tries to reset state
    without specifying properties for LZMA.

    bad-1-lzma2-5.xz is like bad-1-lzma2-4.xz but doesn't try to reset
    anything in the header of the second chunk.

    bad-1-lzma2-6.xz has reserved LZMA2 control byte value (0x03).

    bad-1-lzma2-7.xz has an end of payload marker (EOPM) at the LZMA
    level.

    bad-1-lzma2-8.xz is like good-1-lzma2-4.xz but doesn't set new
    properties in the third LZMA2 chunk.
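
    Some of the LZMA2 chunk rules exercised above can be tried without
    a full .xz container by feeding a raw LZMA2 stream to a decoder.
    The Python sketch below builds uncompressed chunks by hand and
    decodes them with lzma.FORMAT_RAW. The helper names
    (uncompressed_chunk, decode_raw, is_rejected), the payloads and the
    64 KiB dictionary size are assumptions of this example; it covers
    only uncompressed chunks, so it mirrors the ideas behind
    good-1-lzma2-5.xz, bad-1-lzma2-1.xz and bad-1-lzma2-6.xz rather
    than the files themselves.

```python
import lzma
import struct

END_MARKER = b"\x00"  # control byte 0x00 ends the LZMA2 stream
FILTERS = [{"id": lzma.FILTER_LZMA2, "dict_size": 1 << 16}]


def uncompressed_chunk(data: bytes, reset_dict: bool) -> bytes:
    """Wrap data in one uncompressed LZMA2 chunk (hypothetical helper).

    Control byte 0x01 resets the dictionary, 0x02 does not. The chunk
    size is stored as (size - 1) in two big-endian bytes.
    """
    if not 1 <= len(data) <= 1 << 16:
        raise ValueError("chunk must hold 1 to 65536 bytes")
    control = 0x01 if reset_dict else 0x02
    return bytes([control]) + struct.pack(">H", len(data) - 1) + data


def decode_raw(stream: bytes) -> bytes:
    """Decode a raw LZMA2 stream with the filter chain assumed above."""
    return lzma.decompress(stream, format=lzma.FORMAT_RAW, filters=FILTERS)


def is_rejected(stream: bytes) -> bool:
    """True if the decoder refuses the raw LZMA2 stream."""
    try:
        decode_raw(stream)
    except lzma.LZMAError:
        return True
    return False


# Two uncompressed chunks; only the first one resets the dictionary.
good = (uncompressed_chunk(b"Hello\n", reset_dict=True)
        + uncompressed_chunk(b"World!\n", reset_dict=False)
        + END_MARKER)
print(decode_raw(good))

# An empty stream that consists of the end marker only.
print(decode_raw(END_MARKER))

# First chunk does not reset the dictionary: must be rejected.
no_reset = uncompressed_chunk(b"x", reset_dict=False) + END_MARKER
print(is_rejected(no_reset))

# Reserved control byte value 0x03 after a valid first chunk.
reserved = uncompressed_chunk(b"x", reset_dict=True) + b"\x03"
print(is_rejected(reserved))
```

    Cases that involve LZMA-compressed chunks (new properties, state
    reset, EOPM at the LZMA level) need real compressed payloads and
    are better tested with the files in this directory.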