# zstd

[Zstandard](https://facebook.github.io/zstd/) is a real-time compression algorithm, providing high compression ratios.
It offers a very wide range of compression / speed trade-offs, while being backed by a very fast decoder.
This package implements a high-performance compressor that is, for now, focused on speed.

This package provides [compression](#Compressor) to and [decompression](#Decompressor) of Zstandard content.
Note that custom dictionaries are only supported for decompression.

This package is pure Go and does not use "unsafe".

The `zstd` package is provided as open source software using a Go standard license.

Currently the package is heavily optimized for 64 bit processors and will be significantly slower on 32 bit processors.

## Installation

Install using `go get -u github.com/klauspost/compress`. The package is located in `github.com/klauspost/compress/zstd`.

Godoc Documentation: https://godoc.org/github.com/klauspost/compress/zstd


## Compressor

### Status:

STABLE - there may always be subtle bugs, but a wide variety of content has been tested and the library is actively
used by several projects. This library is being continuously [fuzz-tested](https://github.com/klauspost/compress-fuzz),
kindly supplied by [fuzzit.dev](https://fuzzit.dev/).

There may still be specific combinations of data types/size/settings that could lead to edge cases,
so as always, testing is recommended.

For now, a high speed (fastest) and medium-fast (default) compressor has been implemented.

The "Fastest" compression ratio is roughly equivalent to zstd level 1.
The "Default" compression ratio is roughly equivalent to zstd level 3 (default).

In terms of speed, it is typically 2x as fast as the stdlib deflate/gzip in its fastest mode.
The compression ratio is comparable to stdlib level 3, while compression is usually 3x as fast.

Compared to cgo zstd, the speed is around level 3 (default), but compression is slightly worse, between level 1 and 2.


### Usage

An Encoder can be used for either compressing a stream via the
`io.WriteCloser` interface supported by the Encoder or as multiple independent
tasks via the `EncodeAll` function.
Smaller encodes are encouraged to use the EncodeAll function.
Use `NewWriter` to create a new instance that can be used for both.

To create a writer with default options, do this:

```Go
import (
    "io"

    "github.com/klauspost/compress/zstd"
)

// Compress input to output.
func Compress(in io.Reader, out io.Writer) error {
    // Create an encoder with default options that writes to out.
    enc, err := zstd.NewWriter(out)
    if err != nil {
        return err
    }
    _, err = io.Copy(enc, in)
    if err != nil {
        // Close even on failure to release resources.
        enc.Close()
        return err
    }
    // Close finishes writing the output.
    return enc.Close()
}
```

Now you can encode by writing data to `enc`. The output will be finished writing when `Close()` is called.
Even if your encode fails, you should still call `Close()` to release any resources that may be held up.

The above is fine for big encodes. However, whenever possible try to *reuse* the writer.

To reuse the encoder, you can use the `Reset(io.Writer)` function to change to another output.
This will allow the encoder to reuse all resources and avoid wasteful allocations.

Currently stream encoding has 'light' concurrency, meaning up to 2 goroutines can be working on part
of a stream. This is independent of the `WithEncoderConcurrency(n)` option, but that is likely to change
in the future. So if you want to limit concurrency for future updates, specify the concurrency
you would like.

You can specify your desired compression level using the `WithEncoderLevel()` option. Currently only pre-defined
compression settings can be specified.

#### Future Compatibility Guarantees

This will be an evolving project. When using this package it is important to note that both the compression efficiency and speed may change.

The goal will be to keep the default efficiency at the default zstd (level 3).
However the encoding should never be assumed to remain the same,
and you should not use hashes of compressed output for similarity checks.

The Encoder can be assumed to produce the same output from the exact same code version.
However, there may be modes in the future that break this,
although they will not be enabled without an explicit option.

This encoder is not designed to (and will probably never) output the exact same bitstream as the reference encoder.

Also note that the cgo decompressor currently does not [report all errors on invalid input](https://github.com/DataDog/zstd/issues/59),
[omits error checks](https://github.com/DataDog/zstd/issues/61), [ignores checksums](https://github.com/DataDog/zstd/issues/43)
and seems to ignore concatenated streams, even though [it is part of the spec](https://github.com/facebook/zstd/blob/dev/doc/zstd_compression_format.md#frames).

#### Blocks

For compressing small blocks, the returned encoder has a function called `EncodeAll(src, dst []byte) []byte`.

`EncodeAll` will encode all input in src and append it to dst.
This function can be called concurrently, but each call will only run on a single goroutine.

Encoded blocks can be concatenated and the result will be the combined input stream.
Data compressed with EncodeAll can be decoded with the Decoder, using either a stream or `DecodeAll`.

Especially when encoding blocks you should take special care to reuse the encoder.
This will effectively make it run without allocations after a warmup period.
To make it run completely without allocations, supply a destination buffer with space for all content.

```Go
import "github.com/klauspost/compress/zstd"

// Create a writer that caches compressors.
// For this operation type we supply a nil Writer.
var encoder, _ = zstd.NewWriter(nil)

// Compress a buffer.
// If you have a destination buffer, the allocation in the call can also be eliminated.
func Compress(src []byte) []byte {
    return encoder.EncodeAll(src, make([]byte, 0, len(src)))
}
```

You can control the maximum number of concurrent encodes using the `WithEncoderConcurrency(n)`
option when creating the writer.

Using the Encoder for both a stream and individual blocks concurrently is safe.

### Performance

I have collected some speed examples to compare speed and compression against other compressors.

* `file` is the input file.
* `out` is the compressor used. `zskp` is this package. `zstd` is the Datadog cgo library. `gzstd`/`gzkp` are gzip from the standard library and from this library.
* `level` is the compression level used. For `zskp` level 1 is "fastest", level 2 is "default".
* `insize`/`outsize` is the input/output size.
* `millis` is the number of milliseconds used for compression.
* `mb/s` is megabytes (2^20 bytes) per second.

```
Silesia Corpus:
http://sun.aei.polsl.pl/~sdeor/corpus/silesia.zip

This package:
file         out    level  insize      outsize     millis  mb/s
silesia.tar  zskp   1      211947520   73101992    643     313.87
silesia.tar  zskp   2      211947520   67504318    969     208.38
silesia.tar  zskp   3      211947520   65177448    1899    106.44

cgo zstd:
silesia.tar  zstd   1      211947520   73605392    543     371.56
silesia.tar  zstd   3      211947520   66793289    864     233.68
silesia.tar  zstd   6      211947520   62916450    1913    105.66

gzip, stdlib/this package:
silesia.tar  gzstd  1      211947520   80007735    1654    122.21
silesia.tar  gzkp   1      211947520   80369488    1168    173.06

GOB stream of binary data. Highly compressible.
https://files.klauspost.com/compress/gob-stream.7z

file        out    level  insize      outsize     millis  mb/s
gob-stream  zskp   1      1911399616  235022249   3088    590.30
gob-stream  zskp   2      1911399616  205669791   3786    481.34
gob-stream  zskp   3      1911399616  185792019   9324    195.48
gob-stream  zstd   1      1911399616  249810424   2637    691.26
gob-stream  zstd   3      1911399616  208192146   3490    522.31
gob-stream  zstd   6      1911399616  193632038   6687    272.56
gob-stream  gzstd  1      1911399616  357382641   10251   177.82
gob-stream  gzkp   1      1911399616  362156523   5695    320.08

The test data for the Large Text Compression Benchmark is the first
10^9 bytes of the English Wikipedia dump on Mar. 3, 2006.
http://mattmahoney.net/dc/textdata.html

file    out    level  insize      outsize     millis  mb/s
enwik9  zskp   1      1000000000  343848582   3609    264.18
enwik9  zskp   2      1000000000  317276632   5746    165.97
enwik9  zskp   3      1000000000  294540704   11725   81.34
enwik9  zstd   1      1000000000  358072021   3110    306.65
enwik9  zstd   3      1000000000  313734672   4784    199.35
enwik9  zstd   6      1000000000  295138875   10290   92.68
enwik9  gzstd  1      1000000000  382578136   9604    99.30
enwik9  gzkp   1      1000000000  383825945   6544    145.73

Highly compressible JSON file.
https://files.klauspost.com/compress/github-june-2days-2019.json.zst

file                         out    level  insize      outsize     millis  mb/s
github-june-2days-2019.json  zskp   1      6273951764  699045015   10620   563.40
github-june-2days-2019.json  zskp   2      6273951764  617881763   11687   511.96
github-june-2days-2019.json  zskp   3      6273951764  537511906   29252   204.54
github-june-2days-2019.json  zstd   1      6273951764  766284037   8450    708.00
github-june-2days-2019.json  zstd   3      6273951764  661889476   10927   547.57
github-june-2days-2019.json  zstd   6      6273951764  642756859   22996   260.18
github-june-2days-2019.json  gzstd  1      6273951764  1164400847  29948   199.79
github-june-2days-2019.json  gzkp   1      6273951764  1128755542  19236   311.03

VM Image, Linux mint with a few installed applications:
https://files.klauspost.com/compress/rawstudio-mint14.7z

file                  out    level  insize      outsize     millis  mb/s
rawstudio-mint14.tar  zskp   1      8558382592  3667489370  20210   403.84
rawstudio-mint14.tar  zskp   2      8558382592  3364592300  31873   256.07
rawstudio-mint14.tar  zskp   3      8558382592  3224594213  71751   113.75
rawstudio-mint14.tar  zstd   1      8558382592  3609250104  17136   476.27
rawstudio-mint14.tar  zstd   3      8558382592  3341679997  29262   278.92
rawstudio-mint14.tar  zstd   6      8558382592  3235846406  77904   104.77
rawstudio-mint14.tar  gzstd  1      8558382592  3926257486  57722   141.40
rawstudio-mint14.tar  gzkp   1      8558382592  3970463184  41749   195.49

CSV data:
https://files.klauspost.com/compress/nyc-taxi-data-10M.csv.zst

file                   out    level  insize      outsize     millis  mb/s
nyc-taxi-data-10M.csv  zskp   1      3325605752  641339945   8925    355.35
nyc-taxi-data-10M.csv  zskp   2      3325605752  591748091   11268   281.44
nyc-taxi-data-10M.csv  zskp   3      3325605752  538490114   19880   159.53
nyc-taxi-data-10M.csv  zstd   1      3325605752  687399637   8233    385.18
nyc-taxi-data-10M.csv  zstd   3      3325605752  598514411   10065   315.07
nyc-taxi-data-10M.csv  zstd   6      3325605752  570522953   20038   158.27
nyc-taxi-data-10M.csv  gzstd  1      3325605752  928656485   23876   132.83
nyc-taxi-data-10M.csv  gzkp   1      3325605752  924718719   16388   193.53
```

### Converters

As part of the development process a *Snappy* -> *Zstandard* converter was also built.

This can convert a *framed* [Snappy Stream](https://godoc.org/github.com/golang/snappy#Writer) to a zstd stream.
Note that a single block is not framed.

Conversion is done by converting the stream directly from Snappy without intermediate full decoding.
Therefore the compression ratio is much less than what can be done by a full decompression
and compression, and a faulty Snappy stream may lead to a faulty Zstandard stream without
any errors being generated.
No CRC value is being generated and not all CRC values of the Snappy stream are checked.
However, it provides really fast re-compression of Snappy streams.


```
BenchmarkSnappy_ConvertSilesia-8  1  1156001600 ns/op  183.35 MB/s
Snappy len 103008711 -> zstd len 82687318

BenchmarkSnappy_Enwik9-8          1  6472998400 ns/op  154.49 MB/s
Snappy len 508028601 -> zstd len 390921079
```


```Go
    // Convert a framed Snappy stream read from input to a zstd stream written to output.
    s := zstd.SnappyConverter{}
    n, err := s.Convert(input, output)
    if err == nil {
        fmt.Println("Re-compressed stream to", n, "bytes")
    }
```

The converter `s` can be reused to avoid allocations, even after errors.


## Decompressor

Status: STABLE - there may still be subtle bugs, but a wide variety of content has been tested.

This library is being continuously [fuzz-tested](https://github.com/klauspost/compress-fuzz),
kindly supplied by [fuzzit.dev](https://fuzzit.dev/).
The main purpose of the fuzz testing is to ensure that it is not possible to crash the decoder,
or run it past its limits with ANY input provided.

### Usage

The package has been designed for two main usages: big streams of data and smaller in-memory buffers.
Both of them are accessed by creating a `Decoder`.

For streaming use a simple setup could look like this:

```Go
import (
    "io"

    "github.com/klauspost/compress/zstd"
)

func Decompress(in io.Reader, out io.Writer) error {
    d, err := zstd.NewReader(in)
    if err != nil {
        return err
    }
    // Close stops the goroutines started by the decoder.
    defer d.Close()

    // Copy content...
    _, err = io.Copy(out, d)
    return err
}
```

It is important to use the "Close" function when you no longer need the Reader to stop running goroutines.
See "Allocation-less operation" below.

For decoding buffers, it could look something like this:

```Go
import "github.com/klauspost/compress/zstd"

// Create a reader that caches decompressors.
// For this operation type we supply a nil Reader.
var decoder, _ = zstd.NewReader(nil)

// Decompress a buffer. We don't supply a destination buffer,
// so it will be allocated by the decoder.
func Decompress(src []byte) ([]byte, error) {
    return decoder.DecodeAll(src, nil)
}
```

Both of these cases should provide the functionality needed.
The decoder can be used for *concurrent* decompression of multiple buffers.
It will only allow a certain number of concurrent operations to run.
To tweak that yourself use the `WithDecoderConcurrency(n)` option when creating the decoder.

### Dictionaries

Data compressed with [dictionaries](https://github.com/facebook/zstd#the-case-for-small-data-compression) can be decompressed.

Dictionaries are added individually to Decoders.
Dictionaries are generated by the `zstd --train` command and contain an initial state for the decoder.
To add a dictionary use the `WithDecoderDicts(dicts ...[]byte)` option with the dictionary data.
Several dictionaries can be added at once.

A dictionary will be used automatically for the data that specifies it.
A re-used Decoder will still contain the dictionaries registered.

When registering multiple dictionaries with the same ID, the last one will be used.

### Allocation-less operation

The decoder has been designed to operate without allocations after a warmup.

This means that you should *store* the decoder for best performance.
To re-use a stream decoder, use the `Reset(r io.Reader) error` function to switch to another stream.
A decoder can safely be re-used even if the previous stream failed.

To release the resources, you must call the `Close()` function on a decoder.
After this it can *no longer be reused*, but all running goroutines will be stopped.
So you *must* use this if you will no longer need the Reader.

For decompressing smaller buffers a single decoder can be used.
When decoding buffers, you can supply a destination slice with length 0 and your expected capacity.
In this case no unneeded allocations should be made.

### Concurrency

The buffer decoder does everything on the same goroutine and does nothing concurrently.
It can however decode several buffers concurrently. Use `WithDecoderConcurrency(n)` to limit that.

The stream decoder operates on several goroutines:

* One goroutine reads input and splits the input to several block decoders.
* A number of decoders will decode blocks.
* A goroutine coordinates these blocks and sends history from one to the next.

So effectively this also means the decoder will "read ahead" and prepare data to always be available for output.

Since "blocks" are quite dependent on the output of the previous block, stream decoding will only have limited concurrency.

In practice this means that concurrency is often limited to utilizing about 2 cores effectively.


### Benchmarks

These are some examples of performance compared to the [datadog cgo library](https://github.com/DataDog/zstd).

The first two are streaming decodes and the last are smaller inputs.

```
BenchmarkDecoderSilesia-8       3   385000067 ns/op   550.51 MB/s    5498 B/op   8 allocs/op
BenchmarkDecoderSilesiaCgo-8    6   197666567 ns/op  1072.25 MB/s  270672 B/op   8 allocs/op

BenchmarkDecoderEnwik9-8        1  2027001600 ns/op   493.34 MB/s   10496 B/op  18 allocs/op
BenchmarkDecoderEnwik9Cgo-8     2   979499200 ns/op  1020.93 MB/s  270672 B/op   8 allocs/op

Concurrent performance:

BenchmarkDecoder_DecodeAllParallel/kppkn.gtb.zst-16              28915   42469 ns/op    4340.07 MB/s   114 B/op  0 allocs/op
BenchmarkDecoder_DecodeAllParallel/geo.protodata.zst-16         116505    9965 ns/op   11900.16 MB/s    16 B/op  0 allocs/op
BenchmarkDecoder_DecodeAllParallel/plrabn12.txt.zst-16            8952  134272 ns/op    3588.70 MB/s   915 B/op  0 allocs/op
BenchmarkDecoder_DecodeAllParallel/lcet10.txt.zst-16             11820  102538 ns/op    4161.90 MB/s   594 B/op  0 allocs/op
BenchmarkDecoder_DecodeAllParallel/asyoulik.txt.zst-16           34782   34184 ns/op    3661.88 MB/s    60 B/op  0 allocs/op
BenchmarkDecoder_DecodeAllParallel/alice29.txt.zst-16            27712   43447 ns/op    3500.58 MB/s    99 B/op  0 allocs/op
BenchmarkDecoder_DecodeAllParallel/html_x_4.zst-16               62826   18750 ns/op   21845.10 MB/s   104 B/op  0 allocs/op
BenchmarkDecoder_DecodeAllParallel/paper-100k.pdf.zst-16        631545    1794 ns/op   57078.74 MB/s     2 B/op  0 allocs/op
BenchmarkDecoder_DecodeAllParallel/fireworks.jpeg.zst-16       1690140     712 ns/op  172938.13 MB/s     1 B/op  0 allocs/op
BenchmarkDecoder_DecodeAllParallel/urls.10K.zst-16               10432  113593 ns/op    6180.73 MB/s  1143 B/op  0 allocs/op
BenchmarkDecoder_DecodeAllParallel/html.zst-16                  113206   10671 ns/op    9596.27 MB/s    15 B/op  0 allocs/op
BenchmarkDecoder_DecodeAllParallel/comp-data.bin.zst-16        1530615     779 ns/op    5229.49 MB/s     0 B/op  0 allocs/op

BenchmarkDecoder_DecodeAllParallelCgo/kppkn.gtb.zst-16           65217   16192 ns/op   11383.34 MB/s    46 B/op  0 allocs/op
BenchmarkDecoder_DecodeAllParallelCgo/geo.protodata.zst-16      292671    4039 ns/op   29363.19 MB/s     6 B/op  0 allocs/op
BenchmarkDecoder_DecodeAllParallelCgo/plrabn12.txt.zst-16        26314   46021 ns/op   10470.43 MB/s   293 B/op  0 allocs/op
BenchmarkDecoder_DecodeAllParallelCgo/lcet10.txt.zst-16          33897   34900 ns/op   12227.96 MB/s   205 B/op  0 allocs/op
BenchmarkDecoder_DecodeAllParallelCgo/asyoulik.txt.zst-16       104348   11433 ns/op   10949.01 MB/s    20 B/op  0 allocs/op
BenchmarkDecoder_DecodeAllParallelCgo/alice29.txt.zst-16         75949   15510 ns/op    9805.60 MB/s    32 B/op  0 allocs/op
BenchmarkDecoder_DecodeAllParallelCgo/html_x_4.zst-16           173910    6756 ns/op   60624.29 MB/s    37 B/op  0 allocs/op
BenchmarkDecoder_DecodeAllParallelCgo/paper-100k.pdf.zst-16     923076    1339 ns/op   76474.87 MB/s     1 B/op  0 allocs/op
BenchmarkDecoder_DecodeAllParallelCgo/fireworks.jpeg.zst-16     922920    1351 ns/op   91102.57 MB/s     2 B/op  0 allocs/op
BenchmarkDecoder_DecodeAllParallelCgo/urls.10K.zst-16            27649   43618 ns/op   16096.19 MB/s   407 B/op  0 allocs/op
BenchmarkDecoder_DecodeAllParallelCgo/html.zst-16               279073    4160 ns/op   24614.18 MB/s     6 B/op  0 allocs/op
BenchmarkDecoder_DecodeAllParallelCgo/comp-data.bin.zst-16      749938    1579 ns/op    2581.71 MB/s     0 B/op  0 allocs/op
```

This reflects the performance around May 2020, but this may be out of date.

# Contributions

Contributions are always welcome.
For new features/fixes, remember to add tests and for performance enhancements include benchmarks.

For sending files for reproducing errors use a service like [goobox](https://goobox.io/#/upload) or similar to share your files.

For general feedback and experience reports, feel free to open an issue or write me on [Twitter](https://twitter.com/sh0dan).

This package includes the excellent [`github.com/cespare/xxhash`](https://github.com/cespare/xxhash) package Copyright (c) 2016 Caleb Spare.