Lossless Data Compression
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Some lossless data compression algorithms are available in botan, currently all
via third-party libraries: zlib (including the deflate and gzip formats), bzip2,
and lzma. Support for these must be enabled at build time; you can check for
them using the macros ``BOTAN_HAS_ZLIB``, ``BOTAN_HAS_BZIP2``, and
``BOTAN_HAS_LZMA``.
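
The build-time check can be sketched as follows. The helper name
``available_compressors`` is hypothetical, and the sketch assumes that the
``BOTAN_HAS_*`` macros are defined by ``botan/build.h``. It also assumes that
zlib support implies the ``zlib``, ``deflate``, and ``gzip`` type names.

```cpp
#include <string>
#include <vector>

// Assumption: botan defines its BOTAN_HAS_* feature macros in <botan/build.h>.
// The header is pulled in only if the compiler can find it, so this sketch
// still compiles (and reports nothing available) on a machine without botan.
#if defined(__has_include)
  #if __has_include(<botan/build.h>)
    #include <botan/build.h>
  #endif
#endif

// Hypothetical helper: names of the compression types compiled into this build.
std::vector<std::string> available_compressors() {
   std::vector<std::string> names;
#if defined(BOTAN_HAS_ZLIB)
   // Assumed mapping: zlib support also covers the deflate and gzip formats.
   names.push_back("zlib");
   names.push_back("deflate");
   names.push_back("gzip");
#endif
#if defined(BOTAN_HAS_BZIP2)
   names.push_back("bz2");
#endif
#if defined(BOTAN_HAS_LZMA)
   names.push_back("lzma");
#endif
   return names;
}
```

An empty result means that no compression support was enabled, or that the
botan headers were not found on the include path.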

.. note::
   You should always compress *before* you encrypt, because encryption seeks to
   hide the redundancy that compression tries to find and remove.

Compression is done through the ``Compression_Algorithm`` and
``Decompression_Algorithm`` classes, both defined in `compression.h`.

Compression and decompression both work in three stages: starting a
message (``start``), continuing to process it (``update``), and then
finally completing processing the stream (``finish``).

.. cpp:class:: Compression_Algorithm

  .. cpp:function:: void start(size_t level)

       Initialize the compression engine. This must be done before calling
       ``update`` or ``finish``. The meaning of the `level` parameter varies by
       algorithm, but it generally takes a value between 1 and 9, with higher
       values typically giving better compression at the cost of more memory
       and/or CPU time. The decompressor can always handle input from any
       compressor, whatever level was used.

  .. cpp:function::  void update(secure_vector<uint8_t>& buf, \
                                 size_t offset = 0, bool flush = false)

       Compress the material in the in/out parameter ``buf``. The leading
       ``offset`` bytes of ``buf`` are ignored and remain untouched; this can be
       useful for ignoring packet headers.  If ``flush`` is true, the
       compression state is flushed, allowing the decompressor to recover the
       entire message up to this point without having to see the rest of the
       compressed stream.

  .. cpp:function:: void finish(secure_vector<uint8_t>& buf, size_t offset = 0)

       Finish compressing a message. The ``buf`` and ``offset`` parameters are
       treated as in ``update``. It is acceptable to call ``start`` followed by
       ``finish`` with the entire message, without any intervening call to
       ``update``.

.. cpp:class:: Decompression_Algorithm

  .. cpp:function:: void start()

       Initialize the decompression engine. This must be done before calling
       ``update`` or ``finish``. No level is provided here; the decompressor
       can accept input generated by any compression parameters.

  .. cpp:function::  void update(secure_vector<uint8_t>& buf, \
                                 size_t offset = 0)

       Decompress the material in the in/out parameter ``buf``. The leading
       ``offset`` bytes of ``buf`` are ignored and remain untouched; this can be
       useful for ignoring packet headers.

       This function may throw if the data seems to be invalid.

  .. cpp:function:: void finish(secure_vector<uint8_t>& buf, size_t offset = 0)

       Finish decompressing a message. The ``buf`` and ``offset`` parameters are
       treated as in ``update``. It is acceptable to call ``start`` followed by
       ``finish`` with the entire message, without any intervening call to
       ``update``.

       This function may throw if the data seems to be invalid.
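
The calling convention described above (``start``, any number of ``update``
calls, then ``finish``, with the leading ``offset`` bytes left untouched) can
be illustrated with a stand-in codec. Everything in this sketch is
hypothetical: ``ToyRleCompressor`` and ``ToyRleDecompressor`` are not botan
classes, they use simple run-length encoding, and ``std::vector<uint8_t>``
replaces ``secure_vector<uint8_t>`` so that the example builds without botan.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Stand-in for secure_vector<uint8_t> so the sketch builds without botan.
using Buffer = std::vector<uint8_t>;

// Hypothetical toy codec (byte-wise run-length encoding), NOT a botan class.
class ToyRleCompressor {
public:
   void start(size_t /*level: ignored by this toy*/) { m_count = 0; m_started = true; }
   void update(Buffer& buf, size_t offset = 0, bool flush = false) { process(buf, offset, flush); }
   void finish(Buffer& buf, size_t offset = 0) { process(buf, offset, true); m_started = false; }

private:
   void process(Buffer& buf, size_t offset, bool flush) {
      if(!m_started) throw std::logic_error("start() must be called first");
      if(offset > buf.size()) throw std::invalid_argument("offset past end of buf");
      Buffer out(buf.data(), buf.data() + offset);  // leading bytes copied verbatim
      for(size_t i = offset; i < buf.size(); ++i) {
         // close the pending run when the byte changes or the counter is full
         if(m_count > 0 && (buf[i] != m_byte || m_count == 255)) emit(out);
         m_byte = buf[i];
         ++m_count;
      }
      if(flush) emit(out);  // make everything seen so far decodable
      buf.swap(out);        // in/out parameter: input is replaced by output
   }
   void emit(Buffer& out) {
      if(m_count == 0) return;
      out.push_back(m_count);
      out.push_back(m_byte);
      m_count = 0;
   }
   uint8_t m_byte = 0;
   uint8_t m_count = 0;
   bool m_started = false;
};

class ToyRleDecompressor {
public:
   void start() { m_have_count = false; m_started = true; }
   void update(Buffer& buf, size_t offset = 0) { process(buf, offset); }
   void finish(Buffer& buf, size_t offset = 0) {
      process(buf, offset);
      m_started = false;
      if(m_have_count) throw std::runtime_error("truncated stream");
   }

private:
   void process(Buffer& buf, size_t offset) {
      if(!m_started) throw std::logic_error("start() must be called first");
      if(offset > buf.size()) throw std::invalid_argument("offset past end of buf");
      Buffer out(buf.data(), buf.data() + offset);  // leading bytes copied verbatim
      for(size_t i = offset; i < buf.size(); ++i) {
         if(!m_have_count) {
            if(buf[i] == 0) throw std::runtime_error("invalid run length");
            m_count = buf[i];
            m_have_count = true;
         } else {
            out.insert(out.end(), static_cast<size_t>(m_count), buf[i]);
            m_have_count = false;
         }
      }
      buf.swap(out);
   }
   uint8_t m_count = 0;
   bool m_have_count = false;
   bool m_started = false;
};

// Round trip: a 2-byte header is skipped via offset and the payload is fed in
// two pieces. Returns the header followed by the recovered payload.
Buffer round_trip_demo() {
   const size_t header_len = 2;
   Buffer part1 = {0xAA, 0xBB, 'a', 'a', 'a', 'b'};  // header + start of payload
   Buffer part2 = {'b', 'c'};                        // rest of the payload

   ToyRleCompressor comp;
   comp.start(6);
   comp.update(part1, header_len);  // no flush: the trailing 'b' stays buffered
   comp.finish(part2);

   ToyRleDecompressor decomp;
   decomp.start();
   decomp.update(part1, header_len);
   decomp.finish(part2);

   part1.insert(part1.end(), part2.begin(), part2.end());
   return part1;
}
```

In this toy, the first ``update`` is called without ``flush``, so the last
payload byte of the first buffer is only emitted by ``finish``. The
concatenated output still reproduces the header and payload exactly.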

The easiest way to get a compressor or decompressor is via the functions:

.. cpp:function:: Compression_Algorithm* make_compressor(std::string type)
.. cpp:function:: Decompression_Algorithm* make_decompressor(std::string type)

Supported values for `type` include `zlib` (the zlib format), `deflate` (raw
deflate, with no header or checksum), `gzip`, `bz2`, and `lzma`. A null pointer
will be returned if the algorithm is unavailable.

To use a compression algorithm in a `Pipe`, use the adapter types
`Compression_Filter` and `Decompression_Filter` from `comp_filter.h`. The
constructors of both filters take a `std::string` argument (passed to
`make_compressor` or `make_decompressor`); the compression filter also takes a
`level` parameter. Finally, both constructors have a parameter `buf_sz` which
specifies the size of the internal buffer that will be used; inputs will be
broken into blocks of this size. The default is 4096.