Lossless Data Compression
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Some lossless data compression algorithms are available in botan, currently all
via third party libraries. These include zlib (including deflate and gzip
formats), bzip2, and lzma. Support for these must be enabled at build time;
you can check for them using the macros ``BOTAN_HAS_ZLIB``, ``BOTAN_HAS_BZIP2``,
and ``BOTAN_HAS_LZMA``.

.. note::
   You should always compress *before* you encrypt, because encryption hides
   the redundancy that compression needs to find and remove.

Compression is done through the ``Compression_Algorithm`` and
``Decompression_Algorithm`` classes, both defined in `compression.h`.

Compression and decompression both work in three stages: starting a
message (``start``), continuing to process it (``update``), and then
finally completing processing the stream (``finish``).

.. cpp:class:: Compression_Algorithm

   .. cpp:function:: void start(size_t level)

      Initialize the compression engine. This must be done before calling
      ``update`` or ``finish``. The meaning of the `level` parameter varies by
      algorithm, but it generally takes a value between 1 and 9, with higher
      values typically giving better compression at the cost of more memory
      and/or CPU time consumed by the compression process. The decompressor
      can always handle input from any compressor.

   .. cpp:function:: void update(secure_vector<uint8_t>& buf, \
                                 size_t offset = 0, bool flush = false)

      Compress the material in the in/out parameter ``buf``. The leading
      ``offset`` bytes of ``buf`` are ignored and remain untouched; this can be
      useful for ignoring packet headers. If ``flush`` is true, the
      compression state is flushed, allowing the decompressor to recover the
      entire message up to this point without having to see the rest of the
      compressed stream.

   .. cpp:function:: void finish(secure_vector<uint8_t>& buf, size_t offset = 0)

      Finish compressing a message. The ``buf`` and ``offset`` parameters are
      treated as in ``update``. It is acceptable to call ``start`` followed by
      ``finish`` with the entire message, without any intervening call to
      ``update``.

.. cpp:class:: Decompression_Algorithm

   .. cpp:function:: void start()

      Initialize the decompression engine. This must be done before calling
      ``update`` or ``finish``. No level is provided here; the decompressor
      can accept input generated by any compression parameters.

   .. cpp:function:: void update(secure_vector<uint8_t>& buf, \
                                 size_t offset = 0)

      Decompress the material in the in/out parameter ``buf``. The leading
      ``offset`` bytes of ``buf`` are ignored and remain untouched; this can be
      useful for ignoring packet headers.

      This function may throw if the data seems to be invalid.

   .. cpp:function:: void finish(secure_vector<uint8_t>& buf, size_t offset = 0)

      Finish decompressing a message. The ``buf`` and ``offset`` parameters are
      treated as in ``update``. It is acceptable to call ``start`` followed by
      ``finish`` with the entire message, without any intervening call to
      ``update``.

      This function may throw if the data seems to be invalid.

The easiest way to get a compressor is via the functions

.. cpp:function:: Compression_Algorithm* make_compressor(std::string type)
.. cpp:function:: Decompression_Algorithm* make_decompressor(std::string type)

Supported values for `type` include `zlib` (the zlib format, with header and
checksum), `deflate` (a raw deflate stream with no header or checksum), `gzip`,
`bz2`, and `lzma`. A null pointer will be returned if the algorithm is
unavailable.

To use a compression algorithm in a `Pipe`, use the adapter types
`Compression_Filter` and `Decompression_Filter` from `comp_filter.h`. The
constructors of both filters take a `std::string` argument (passed to
`make_compressor` or `make_decompressor`); the compression filter also takes a
`level` parameter. Finally, both constructors have a parameter `buf_sz` which
specifies the size of the internal buffer that will be used. Inputs will be
broken into blocks of this size. The default is 4096.
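
As a rough illustration of the ``start``/``finish`` flow described above, the following sketch compresses a string with the `zlib` compressor and immediately decompresses it again. It assumes the interface exactly as documented in this section: `make_compressor` and `make_decompressor` live in the `Botan` namespace, return a pointer the caller owns, and `secure_vector` is available via `botan/secmem.h`. The helper name `zlib_round_trip`, the default level of 6, and the `SKETCH_HAVE_BOTAN` guard are hypothetical and exist only for this example. The guard lets the file compile when the Botan headers are not installed, in which case the helper just returns its input unchanged.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <stdexcept>
#include <string>

// Pull in Botan only when its headers are visible, so the sketch still
// compiles on a machine without the library (hypothetical guard macro).
#if defined(__has_include)
#  if __has_include(<botan/compression.h>)
#    include <botan/compression.h>
#    include <botan/secmem.h>
#    define SKETCH_HAVE_BOTAN 1
#  endif
#endif

// Hypothetical helper: compress `input` with zlib, decompress the result,
// and return what comes out. With a working build this equals `input`.
std::string zlib_round_trip(const std::string& input, std::size_t level = 6) {
#if defined(SKETCH_HAVE_BOTAN)
   // Assumption: the returned raw pointers are owned by the caller.
   std::unique_ptr<Botan::Compression_Algorithm> comp(
      Botan::make_compressor("zlib"));
   std::unique_ptr<Botan::Decompression_Algorithm> decomp(
      Botan::make_decompressor("zlib"));

   // A null pointer means the algorithm was not enabled at build time.
   if(!comp || !decomp) {
      throw std::runtime_error("zlib support is not available in this build");
   }

   // buf is an in/out parameter: plaintext in, compressed bytes out.
   Botan::secure_vector<uint8_t> buf(input.begin(), input.end());

   comp->start(level);
   comp->finish(buf);   // whole message at once, no update() call needed

   decomp->start();
   decomp->finish(buf); // may throw if the data seems to be invalid

   return std::string(buf.begin(), buf.end());
#else
   // Fallback when Botan is absent: nothing is compressed here.
   (void)level;
   return input;
#endif
}
```

For larger inputs the same pattern would call ``update`` on each chunk between ``start`` and ``finish``, passing ``flush = true`` only where the receiver needs to decode everything sent so far.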