# zstd

[Zstandard](https://facebook.github.io/zstd/) is a real-time compression algorithm, providing high compression ratios.
It offers a very wide range of compression / speed trade-offs, while being backed by a very fast decoder.
This package implements a high-performance compression algorithm that is, for now, focused on speed.

This package provides [compression](#Compressor) and [decompression](#Decompressor) of Zstandard content.
Note that custom dictionaries are not supported yet, so if your code relies on them,
you cannot use the package as-is.

This package is pure Go and does not use "unsafe".
If a significant speedup can be achieved using "unsafe", it may be added as an option later.

The `zstd` package is provided as open source software using a Go standard license.

Currently the package is heavily optimized for 64-bit processors and will be significantly slower on 32-bit processors.

## Installation

Install using `go get -u github.com/klauspost/compress`. The package is located in `github.com/klauspost/compress/zstd`.

Godoc documentation: https://godoc.org/github.com/klauspost/compress/zstd

## Compressor

### Status:

STABLE - there may always be subtle bugs, but a wide variety of content has been tested, and the library is actively
used by several projects. This library is being continuously [fuzz-tested](https://github.com/klauspost/compress-fuzz),
kindly supplied by [fuzzit.dev](https://fuzzit.dev/).

There may still be specific combinations of data types/size/settings that could lead to edge cases,
so as always, testing is recommended.

For now, a high-speed ("fastest") and a medium-fast ("default") compressor have been implemented.

The "Fastest" compression ratio is roughly equivalent to zstd level 1.
The "Default" compression ratio is roughly equivalent to zstd level 3 (default).

In terms of speed, it is typically 2x as fast as the stdlib deflate/gzip in its fastest mode.
Compared to stdlib, the compression ratio is around that of level 3, but usually 3x as fast.

Compared to cgo zstd, the speed is around that of level 3 (default), but the compression ratio is slightly worse, between levels 1 and 2.

### Usage

An Encoder can be used for either compressing a stream via the
`io.WriteCloser` interface supported by the Encoder, or for multiple independent
tasks via the `EncodeAll` function.
Smaller encodes are encouraged to use the `EncodeAll` function.
Use `NewWriter` to create a new instance that can be used for both.

To create a writer with default options, do the following:

```Go
import (
	"io"

	"github.com/klauspost/compress/zstd"
)

// Compress input to output.
func Compress(in io.Reader, out io.Writer) error {
	enc, err := zstd.NewWriter(out)
	if err != nil {
		return err
	}
	_, err = io.Copy(enc, in)
	if err != nil {
		enc.Close()
		return err
	}
	return enc.Close()
}
```

Now you can encode by writing data to `enc`. The output will be finished when `Close()` is called.
Even if your encode fails, you should still call `Close()` to release any resources that may be held.

The above is fine for big encodes. However, whenever possible try to *reuse* the writer.

To reuse the encoder, you can use the `Reset(io.Writer)` function to change to another output.
This will allow the encoder to reuse all resources and avoid wasteful allocations.
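For example, a minimal sketch of compressing several inputs with one reused encoder; the `compressTo` helper and its arguments are illustrative, not part of the package:

```Go
// compressTo reuses enc, which is assumed to have been created
// once with zstd.NewWriter, for another input/output pair.
func compressTo(enc *zstd.Encoder, in io.Reader, out io.Writer) error {
	// Point the encoder at the new output; existing resources are reused.
	enc.Reset(out)
	if _, err := io.Copy(enc, in); err != nil {
		enc.Close()
		return err
	}
	return enc.Close()
}
```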

Currently stream encoding has 'light' concurrency, meaning up to 2 goroutines can be working on part
of a stream. This is independent of `WithEncoderConcurrency(n)`, but that is likely to change
in the future. So if you want to limit concurrency for future updates, specify the concurrency
you would like.

You can specify your desired compression level using the `WithEncoderLevel()` option. Currently only predefined
compression settings can be specified.
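
As a sketch, both options are passed when creating the writer; this assumes the package's predefined levels such as `zstd.SpeedFastest`:

```Go
// Create an encoder limited to 2 concurrent goroutines, using the
// predefined "fastest" compression level.
enc, err := zstd.NewWriter(out,
	zstd.WithEncoderConcurrency(2),
	zstd.WithEncoderLevel(zstd.SpeedFastest),
)
if err != nil {
	return err
}
```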

#### Future Compatibility Guarantees

This will be an evolving project. When using this package it is important to note that both the compression efficiency and speed may change.

The goal will be to keep the default efficiency at the default zstd (level 3).
However, the encoding should never be assumed to remain the same,
and you should not use hashes of compressed output for similarity checks.

The Encoder can be assumed to produce the same output from the exact same code version.
However, there may be modes in the future that break this,
although they will not be enabled without an explicit option.

This encoder is not designed to (and will probably never) output the exact same bitstream as the reference encoder.

Also note that the cgo decompressor currently does not [report all errors on invalid input](https://github.com/DataDog/zstd/issues/59),
[omits error checks](https://github.com/DataDog/zstd/issues/61), [ignores checksums](https://github.com/DataDog/zstd/issues/43),
and seems to ignore concatenated streams, even though [it is part of the spec](https://github.com/facebook/zstd/blob/dev/doc/zstd_compression_format.md#frames).

#### Blocks

For compressing small blocks, the returned encoder has a function called `EncodeAll(src, dst []byte) []byte`.

`EncodeAll` will encode all input in `src` and append it to `dst`.
This function can be called concurrently, but each call will only run on a single goroutine.

Encoded blocks can be concatenated, and the result will be the combined input stream.
Data compressed with `EncodeAll` can be decoded with the Decoder, using either a stream or `DecodeAll`.

Especially when encoding blocks, you should take special care to reuse the encoder.
This will effectively make it run without allocations after a warmup period.
To make it run completely without allocations, supply a destination buffer with space for all content.

```Go
import "github.com/klauspost/compress/zstd"

// Create a writer that caches compressors.
// For this operation type we supply a nil Reader.
var encoder, _ = zstd.NewWriter(nil)

// Compress a buffer.
// If you have a destination buffer, the allocation in the call can also be eliminated.
func Compress(src []byte) []byte {
	return encoder.EncodeAll(src, make([]byte, 0, len(src)))
}
```
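
Since `EncodeAll` produces standard Zstandard frames, blocks can be round-tripped and concatenated; a small sketch, assuming a `decoder` created with `zstd.NewReader(nil)` as shown in the [Decompressor](#Decompressor) section:

```Go
// Two independently encoded blocks...
a := encoder.EncodeAll([]byte("hello "), nil)
b := encoder.EncodeAll([]byte("world"), nil)

// ...concatenated, they decode to the combined input stream.
got, err := decoder.DecodeAll(append(a, b...), nil)
// got is now []byte("hello world"), err is nil.
```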

You can control the maximum number of concurrent encodes using the `WithEncoderConcurrency(n)`
option when creating the writer.

Using the Encoder for both a stream and individual blocks concurrently is safe.

### Performance

I have collected some speed examples to compare speed and compression against other compressors.

* `file` is the input file.
* `out` is the compressor used. `zskp` is this package. `gzstd` is the gzip standard library. `zstd` is the Datadog cgo library.
* `level` is the compression level used. For `zskp` level 1 is "fastest", level 2 is "default".
* `insize`/`outsize` is the input/output size.
* `millis` is the number of milliseconds used for compression.
* `mb/s` is megabytes (2^20 bytes) per second.

```
The test data for the Large Text Compression Benchmark is the first
10^9 bytes of the English Wikipedia dump on Mar. 3, 2006.
http://mattmahoney.net/dc/textdata.html

file    out    level  insize      outsize    millis  mb/s
enwik9  zskp   1      1000000000  343833033  5840    163.30
enwik9  zskp   2      1000000000  317822183  8449    112.87
enwik9  gzstd  1      1000000000  382578136  13627   69.98
enwik9  gzstd  3      1000000000  349139651  22344   42.68
enwik9  zstd   1      1000000000  357416379  4838    197.12
enwik9  zstd   3      1000000000  313734522  7556    126.21

GOB stream of binary data. Highly compressible.
https://files.klauspost.com/compress/gob-stream.7z

file        out    level  insize      outsize    millis  mb/s
gob-stream  zskp   1      1911399616  234981983  5100    357.42
gob-stream  zskp   2      1911399616  208674003  6698    272.15
gob-stream  gzstd  1      1911399616  357382641  14727   123.78
gob-stream  gzstd  3      1911399616  327835097  17005   107.19
gob-stream  zstd   1      1911399616  250787165  4075    447.22
gob-stream  zstd   3      1911399616  208191888  5511    330.77

Highly compressible JSON file. Similar to logs in a lot of ways.
https://files.klauspost.com/compress/adresser.001.gz

file          out    level  insize      outsize   millis  mb/s
adresser.001  zskp   1      1073741824  18510122  1477    692.83
adresser.001  zskp   2      1073741824  19831697  1705    600.59
adresser.001  gzstd  1      1073741824  47755503  3079    332.47
adresser.001  gzstd  3      1073741824  40052381  3051    335.63
adresser.001  zstd   1      1073741824  16135896  994     1030.18
adresser.001  zstd   3      1073741824  17794465  905     1131.49

VM Image, Linux mint with a few installed applications:
https://files.klauspost.com/compress/rawstudio-mint14.7z

file                  out    level  insize      outsize     millis  mb/s
rawstudio-mint14.tar  zskp   1      8558382592  3648168838  33398   244.38
rawstudio-mint14.tar  zskp   2      8558382592  3376721436  50962   160.16
rawstudio-mint14.tar  gzstd  1      8558382592  3926257486  84712   96.35
rawstudio-mint14.tar  gzstd  3      8558382592  3740711978  176344  46.28
rawstudio-mint14.tar  zstd   1      8558382592  3607859742  27903   292.51
rawstudio-mint14.tar  zstd   3      8558382592  3341710879  46700   174.77

The test data is designed to test archivers in realistic backup scenarios.
http://mattmahoney.net/dc/10gb.html

file      out    level  insize       outsize     millis  mb/s
10gb.tar  zskp   1      10065157632  4883149814  45715   209.97
10gb.tar  zskp   2      10065157632  4638110010  60970   157.44
10gb.tar  gzstd  1      10065157632  5198296126  97769   98.18
10gb.tar  gzstd  3      10065157632  4932665487  313427  30.63
10gb.tar  zstd   1      10065157632  4940796535  40391   237.65
10gb.tar  zstd   3      10065157632  4638618579  52911   181.42

Silesia Corpus:
http://sun.aei.polsl.pl/~sdeor/corpus/silesia.zip

file         out    level  insize     outsize   millis  mb/s
silesia.tar  zskp   1      211947520  73025800  1108    182.26
silesia.tar  zskp   2      211947520  67674684  1599    126.41
silesia.tar  gzstd  1      211947520  80007735  2515    80.37
silesia.tar  gzstd  3      211947520  73133380  4259    47.45
silesia.tar  zstd   1      211947520  73513991  933     216.64
silesia.tar  zstd   3      211947520  66793301  1377    146.79
```

### Converters

As part of the development process a *Snappy* -> *Zstandard* converter was also built.

This can convert a *framed* [Snappy Stream](https://godoc.org/github.com/golang/snappy#Writer) to a zstd stream.
Note that a single block is not framed.

Conversion is done by converting the stream directly from Snappy without intermediate full decoding.
Therefore the compression ratio is much lower than what can be achieved by a full decompression
and compression, and a faulty Snappy stream may lead to a faulty Zstandard stream without
any errors being generated.
No CRC value is generated, and not all CRC values of the Snappy stream are checked.
However, it provides really fast re-compression of Snappy streams.


```
BenchmarkSnappy_ConvertSilesia-8  1  1156001600 ns/op  183.35 MB/s
Snappy len 103008711 -> zstd len 82687318

BenchmarkSnappy_Enwik9-8          1  6472998400 ns/op  154.49 MB/s
Snappy len 508028601 -> zstd len 390921079
```

```Go
s := zstd.SnappyConverter{}
n, err := s.Convert(input, output)
if err != nil {
	return err
}
fmt.Println("Re-compressed stream to", n, "bytes")
```

The converter `s` can be reused to avoid allocations, even after errors.


## Decompressor

Status: STABLE - there may still be subtle bugs, but a wide variety of content has been tested.

This library is being continuously [fuzz-tested](https://github.com/klauspost/compress-fuzz),
kindly supplied by [fuzzit.dev](https://fuzzit.dev/).
The main purpose of the fuzz testing is to ensure that it is not possible to crash the decoder,
or run it past its limits, with ANY input provided.

### Usage

The package has been designed for two main usages: big streams of data and smaller in-memory buffers.
Both are accessed by creating a `Decoder`.

For streaming, a simple setup could look like this:

```Go
import (
	"io"

	"github.com/klauspost/compress/zstd"
)

func Decompress(in io.Reader, out io.Writer) error {
	d, err := zstd.NewReader(in)
	if err != nil {
		return err
	}
	defer d.Close()

	// Copy content...
	_, err = io.Copy(out, d)
	return err
}
```

It is important to use the "Close" function when you no longer need the Reader, to stop running goroutines.
See "Allocation-less operation" below.

For decoding buffers, it could look something like this:

```Go
import "github.com/klauspost/compress/zstd"

// Create a reader that caches decompressors.
// For this operation type we supply a nil Reader.
var decoder, _ = zstd.NewReader(nil)

// Decompress a buffer. We don't supply a destination buffer,
// so it will be allocated by the decoder.
func Decompress(src []byte) ([]byte, error) {
	return decoder.DecodeAll(src, nil)
}
```

Both of these cases should provide the functionality needed.
The decoder can be used for *concurrent* decompression of multiple buffers.
It will only allow a certain number of concurrent operations to run.
To tweak that yourself, use the `WithDecoderConcurrency(n)` option when creating the decoder.
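
A minimal sketch of setting a custom limit:

```Go
// Allow at most 4 concurrent decodes on this decoder.
decoder, err := zstd.NewReader(nil, zstd.WithDecoderConcurrency(4))
if err != nil {
	return err
}
```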

### Allocation-less operation

The decoder has been designed to operate without allocations after a warmup.

This means that you should *store* the decoder for best performance.
To re-use a stream decoder, use the `Reset(r io.Reader) error` function to switch to another stream.
A decoder can safely be re-used even if the previous stream failed.
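
For example, a minimal sketch of decompressing a series of streams with one decoder; it assumes `d` was created with `zstd.NewReader` as in the streaming example, and `inputs`/`out` are illustrative names:

```Go
// Decompress each input stream to out, reusing a single decoder.
for _, r := range inputs {
	if err := d.Reset(r); err != nil {
		return err
	}
	if _, err := io.Copy(out, d); err != nil {
		return err
	}
}
```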

To release the resources, you must call the `Close()` function on a decoder.
After this it can *no longer be reused*, and all running goroutines will be stopped.
So you *must* use this if you will no longer need the Reader.

For decompressing smaller buffers, a single decoder can be used.
When decoding buffers, you can supply a destination slice with length 0 and your expected capacity.
In this case no unneeded allocations should be made.
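
A sketch of that pattern, assuming the approximate decompressed size is known:

```Go
// Supply a destination with zero length and the expected capacity,
// so DecodeAll can append without further allocations.
dst := make([]byte, 0, 1<<20) // e.g. up to 1 MiB of decompressed data expected
out, err := decoder.DecodeAll(src, dst)
```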

### Concurrency

The buffer decoder does everything on the same goroutine and does nothing concurrently.
It can however decode several buffers concurrently. Use `WithDecoderConcurrency(n)` to limit that.

The stream decoder operates on several goroutines:

* One goroutine reads input and splits it into blocks for several block decoders.
* A number of decoders will decode blocks.
* A goroutine coordinates these blocks and sends history from one to the next.

So effectively this also means the decoder will "read ahead" and prepare data to always be available for output.

Since "blocks" are quite dependent on the output of the previous block, stream decoding will only have limited concurrency.

In practice this means that concurrency is often limited to utilizing about 2 cores effectively.


### Benchmarks

These are some examples of performance compared to the [datadog cgo library](https://github.com/DataDog/zstd).

The first two are streaming decodes and the rest are smaller inputs.

```
BenchmarkDecoderSilesia-8                         20   642550210 ns/op  329.85 MB/s     3101 B/op      8 allocs/op
BenchmarkDecoderSilesiaCgo-8                     100   384930000 ns/op  550.61 MB/s   451878 B/op   9713 allocs/op

BenchmarkDecoderEnwik9-2                          10  3146000080 ns/op  317.86 MB/s     2649 B/op      9 allocs/op
BenchmarkDecoderEnwik9Cgo-2                       20  1905900000 ns/op  524.69 MB/s  1125120 B/op  45785 allocs/op

BenchmarkDecoder_DecodeAll/z000000.zst-8         200     7049994 ns/op  138.26 MB/s       40 B/op      2 allocs/op
BenchmarkDecoder_DecodeAll/z000001.zst-8      100000       19560 ns/op   97.49 MB/s       40 B/op      2 allocs/op
BenchmarkDecoder_DecodeAll/z000002.zst-8        5000      297599 ns/op  236.99 MB/s       40 B/op      2 allocs/op
BenchmarkDecoder_DecodeAll/z000003.zst-8        2000      725502 ns/op  141.17 MB/s       40 B/op      2 allocs/op
BenchmarkDecoder_DecodeAll/z000004.zst-8      200000        9314 ns/op   54.54 MB/s       40 B/op      2 allocs/op
BenchmarkDecoder_DecodeAll/z000005.zst-8       10000      137500 ns/op  104.72 MB/s       40 B/op      2 allocs/op
BenchmarkDecoder_DecodeAll/z000006.zst-8         500     2316009 ns/op  206.06 MB/s       40 B/op      2 allocs/op
BenchmarkDecoder_DecodeAll/z000007.zst-8       20000       64499 ns/op  344.90 MB/s       40 B/op      2 allocs/op
BenchmarkDecoder_DecodeAll/z000008.zst-8       50000       24900 ns/op  219.56 MB/s       40 B/op      2 allocs/op
BenchmarkDecoder_DecodeAll/z000009.zst-8        1000     2348999 ns/op  154.01 MB/s       40 B/op      2 allocs/op

BenchmarkDecoder_DecodeAllCgo/z000000.zst-8      500     4268005 ns/op  228.38 MB/s  1228849 B/op      3 allocs/op
BenchmarkDecoder_DecodeAllCgo/z000001.zst-8   100000       15250 ns/op  125.05 MB/s     2096 B/op      3 allocs/op
BenchmarkDecoder_DecodeAllCgo/z000002.zst-8    10000      147399 ns/op  478.49 MB/s    73776 B/op      3 allocs/op
BenchmarkDecoder_DecodeAllCgo/z000003.zst-8     5000      320798 ns/op  319.27 MB/s   139312 B/op      3 allocs/op
BenchmarkDecoder_DecodeAllCgo/z000004.zst-8   200000       10004 ns/op   50.77 MB/s      560 B/op      3 allocs/op
BenchmarkDecoder_DecodeAllCgo/z000005.zst-8    20000       73599 ns/op  195.64 MB/s    19120 B/op      3 allocs/op
BenchmarkDecoder_DecodeAllCgo/z000006.zst-8     1000     1119003 ns/op  426.48 MB/s   557104 B/op      3 allocs/op
BenchmarkDecoder_DecodeAllCgo/z000007.zst-8    20000      103450 ns/op  215.04 MB/s    71296 B/op      9 allocs/op
BenchmarkDecoder_DecodeAllCgo/z000008.zst-8   100000       20130 ns/op  271.58 MB/s     6192 B/op      3 allocs/op
BenchmarkDecoder_DecodeAllCgo/z000009.zst-8     2000     1123500 ns/op  322.00 MB/s   368688 B/op      3 allocs/op
```

This reflects the performance around May 2019, but may be out of date.

# Contributions

Contributions are always welcome.
For new features/fixes, remember to add tests, and for performance enhancements include benchmarks.

For sending files to reproduce errors, use a service like [goobox](https://goobox.io/#/upload) or similar to share your files.

For general feedback and experience reports, feel free to open an issue or write me on [Twitter](https://twitter.com/sh0dan).

This package includes the excellent [`github.com/cespare/xxhash`](https://github.com/cespare/xxhash) package, Copyright (c) 2016 Caleb Spare.