Snappy, a fast compressor/decompressor.


Introduction
============

Snappy is a compression/decompression library. It does not aim for maximum
compression, or compatibility with any other compression library; instead,
it aims for very high speeds and reasonable compression. For instance,
compared to the fastest mode of zlib, Snappy is an order of magnitude faster
for most inputs, but the resulting compressed files are anywhere from 20% to
100% bigger. (For more information, see "Performance", below.)

Snappy has the following properties:

 * Fast: Compression speeds at 250 MB/sec and beyond, with no assembler code.
   See "Performance" below.
 * Stable: Over the last few years, Snappy has compressed and decompressed
   petabytes of data in Google's production environment. The Snappy bitstream
   format is stable and will not change between versions.
 * Robust: The Snappy decompressor is designed not to crash in the face of
   corrupted or malicious input.
 * Free and open source software: Snappy is licensed under a BSD-type license.
   For more information, see the included COPYING file.

Snappy has previously been called "Zippy" in some Google presentations
and the like.


Performance
===========

Snappy is intended to be fast. On a single core of a Core i7 processor
in 64-bit mode, it compresses at about 250 MB/sec or more and decompresses at
about 500 MB/sec or more. (These numbers are for the slowest inputs in our
benchmark suite; others are much faster.) In our tests, Snappy usually
is faster than algorithms in the same class (e.g. LZO, LZF, FastLZ, QuickLZ,
etc.) while achieving comparable compression ratios.

Typical compression ratios (based on the benchmark suite) are about 1.5-1.7x
for plain text, about 2-4x for HTML, and of course 1.0x for JPEGs, PNGs and
other already-compressed data. Similar numbers for zlib in its fastest mode
are 2.6-2.8x, 3-7x and 1.0x, respectively.
More sophisticated algorithms are
capable of achieving yet higher compression rates, although usually at the
expense of speed. Of course, compression ratio will vary significantly with
the input.

Although Snappy should be fairly portable, it is primarily optimized
for 64-bit x86-compatible processors, and may run slower in other environments.
In particular:

 - Snappy uses 64-bit operations in several places to process more data at
   once than would otherwise be possible.
 - Snappy assumes unaligned 32- and 64-bit loads and stores are cheap.
   On some platforms, these must be emulated with single-byte loads
   and stores, which is much slower.
 - Snappy assumes little-endian throughout, and needs to byte-swap data in
   several places if running on a big-endian platform.

Experience has shown that even heavily tuned code can be improved.
Performance optimizations, whether for 64-bit x86 or other platforms,
are of course most welcome; see "Contact", below.


Usage
=====

Note that Snappy, both the implementation and the main interface,
is written in C++. However, several third-party bindings to other languages
are available; see the Google Code page at http://code.google.com/p/snappy/
for more information. Also, if you want to use Snappy from C code, you can
use the included C bindings in snappy-c.h.

To use Snappy from your own C++ program, include the file "snappy.h" from
your calling file, and link against the compiled library.

There are many ways to call Snappy, but the simplest possible is

  snappy::Compress(input.data(), input.size(), &output);

and similarly

  snappy::Uncompress(input.data(), input.size(), &output);

where "input" and "output" are both instances of std::string.

There are other interfaces that are more flexible in various ways, including
support for custom (non-array) input sources. See the header file for more
information.


Tests and benchmarks
====================

When you compile Snappy, snappy_unittest is compiled in addition to the
library itself. You do not need it to use the compressor from your own library,
but it contains several useful components for Snappy development.

First of all, it contains unit tests, verifying correctness on your machine in
various scenarios. If you want to change or optimize Snappy, please run the
tests to verify you have not broken anything. Note that if you have the
Google Test library installed, unit test behavior (especially failures) will be
significantly more user-friendly. You can find Google Test at

  http://code.google.com/p/googletest/

You probably also want the gflags library for handling of command-line flags;
you can find it at

  http://code.google.com/p/google-gflags/

In addition to the unit tests, snappy_unittest contains microbenchmarks used to
tune compression and decompression performance. These are automatically run
before the unit tests, but you can disable them using the flag
--run_microbenchmarks=false if you have gflags installed (otherwise you will
need to edit the source).

Finally, snappy_unittest can benchmark Snappy against a few other compression
libraries (zlib, LZO, LZF, FastLZ and QuickLZ), if they were detected at
configure time. To benchmark using a given file, give the compression algorithm
you want to test Snappy against (e.g. --zlib) and then a list of one or more
file names on the command line. The testdata/ directory contains the files used
by the microbenchmark, which should provide a reasonably balanced starting
point for benchmarking. (Note that baddata[1-3].snappy are not intended as
benchmarks; they are used to verify correctness in the presence of corrupted
data in the unit test.)


Contact
=======

Snappy is distributed through Google Code.
For the latest version, a bug tracker,
and other information, see

  http://code.google.com/p/snappy/