Name | Date | Size | #Lines | LOC
--- | --- | --- | --- | ---
.. | 03-May-2022 | - | |
cmake/ | 17-Sep-2021 | - | 113 | 102
config/ | 17-Sep-2021 | - | 16,613 | 12,358
data/121B2TestData/ | 03-May-2022 | - | |
doc/ | 03-May-2022 | - | |
include/ | 17-Sep-2021 | - | 247 | 87
m4/ | 17-Sep-2021 | - | 9,948 | 9,021
src/ | 03-May-2022 | - | 4,012 | 2,956
tests/ | 03-May-2022 | - | 2,165 | 1,853
AUTHORS | 02-Feb-2021 | 209 | 15 | 12
CHANGELOG.md | 17-Sep-2021 | 2.7 KiB | 125 | 84
INSTALL.md | 16-Jun-2021 | 2.1 KiB | 103 | 70
Makefile.am | 16-Jun-2021 | 621 | 17 | 14
Makefile.in | 17-Sep-2021 | 26.5 KiB | 860 | 753
README.SZIP | 02-Feb-2021 | 638 | 16 | 12
README.md | 16-Jun-2021 | 7.5 KiB | 227 | 167
THANKS | 02-Feb-2021 | 606 | 17 | 11
aclocal.m4 | 17-Sep-2021 | 43.6 KiB | 1,199 | 1,088
configure | 17-Sep-2021 | 440.9 KiB | 15,242 | 11,894
configure.ac | 17-Sep-2021 | 625 | 35 | 26

README.SZIP

**********************************************************************
 SZIP compatibility
**********************************************************************

Libaec includes a free drop-in replacement for the SZIP
library[1]. Just replace SZIP's shared library libsz.so* with
libaec.so* and libsz.so* from libaec. For Windows the DLLs are called
SZIP.DLL and AEC.DLL. Code which is dynamically linked with SZIP such
as HDF5 should continue to work with libaec. No re-compilation
required.

HDF5 files which contain SZIP encoded data can be decoded by HDF5
using libaec and vice versa.

[1] http://www.hdfgroup.org/doc_resource/SZIP/


README.md

# libaec - Adaptive Entropy Coding library

Libaec provides fast lossless compression of 1 to 32 bit wide
signed or unsigned integers (samples). The library achieves its best
results for low entropy data, as often encountered in space imaging
instrument data or numerical model output from weather or climate
simulations. While floating point representations are not directly
supported, they can also be coded efficiently by grouping exponents
and mantissas.
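
One way to read "grouping exponents and mantissa" is to split each float into two integer streams and compress each stream separately. The sketch below assumes IEEE 754 single precision; `split_floats` and `join_floats` are hypothetical helpers for illustration and are not part of libaec.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Split IEEE 754 single-precision values into two integer streams:
 * sign + exponent (upper 9 bits) and mantissa (lower 23 bits).
 * Hypothetical helper, not a libaec function. */
void split_floats(const float *in, size_t n,
                  uint16_t *sign_exp, uint32_t *mantissa)
{
    assert(sizeof(float) == sizeof(uint32_t));
    for (size_t i = 0; i < n; i++) {
        uint32_t bits;
        memcpy(&bits, &in[i], sizeof bits);    /* portable type punning */
        sign_exp[i] = (uint16_t)(bits >> 23);  /* sign and exponent */
        mantissa[i] = bits & 0x7FFFFFu;        /* fraction bits */
    }
}

/* Inverse of split_floats: rebuild the original float values. */
void join_floats(const uint16_t *sign_exp, const uint32_t *mantissa,
                 size_t n, float *out)
{
    assert(sizeof(float) == sizeof(uint32_t));
    for (size_t i = 0; i < n; i++) {
        uint32_t bits = ((uint32_t)sign_exp[i] << 23)
                      | (mantissa[i] & 0x7FFFFFu);
        memcpy(&out[i], &bits, sizeof bits);
    }
}
```

Under this assumption the sign and exponent stream would be coded as 9 bit samples and the mantissa stream as 23 bit samples. Whether this split helps depends on the data.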

## Scope

Libaec implements extended
[Golomb-Rice](http://en.wikipedia.org/wiki/Golomb_coding) coding as
defined in the CCSDS recommended standard [121.0-B-3][1]. The library
covers the adaptive entropy coder and the preprocessor discussed in
sections 1 to 5.2.6 of the [standard][1].
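
To give a feel for what "adaptive" means here, the following simplified sketch computes the cost of Rice-coding one block with split parameter `k` and picks the cheapest `k`. It is an illustration only: the function names are made up, and the other code options and the option identifier bits defined in the standard are ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bits needed to Rice-code one block with split parameter k.
 * Each sample costs a unary-coded quotient (v >> k, plus one
 * terminating bit) and k raw low-order bits. */
uint64_t rice_block_bits(const uint32_t *block, size_t n, unsigned k)
{
    assert(k < 32);
    uint64_t bits = 0;
    for (size_t i = 0; i < n; i++)
        bits += (uint64_t)(block[i] >> k) + 1 + k;
    return bits;
}

/* Return the k in [0, max_k] with the lowest cost for this block.
 * Ties are resolved in favour of the smallest k. */
unsigned best_rice_k(const uint32_t *block, size_t n, unsigned max_k)
{
    assert(max_k < 32);
    unsigned best = 0;
    uint64_t best_bits = rice_block_bits(block, n, 0);
    for (unsigned k = 1; k <= max_k; k++) {
        uint64_t bits = rice_block_bits(block, n, k);
        if (bits < best_bits) {
            best_bits = bits;
            best = k;
        }
    }
    return best;
}
```

Blocks of small values come out cheapest with a small `k`, blocks of larger values with a larger one. Choosing the parameter per block is what lets the coder follow changing source statistics.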

## Downloads

Source code and binary installers can be [downloaded here](https://gitlab.dkrz.de/k202009/libaec/tags) [or here](https://github.com/MathisRosenhauer/libaec).

## Patent considerations

As stated in section A3 of the current [standard][1]:

> At time of publication, the specifications of this Recommended
> Standard are not known to be the subject of patent rights.

## Installation

See [INSTALL.md](INSTALL.md) for details.

## SZIP Compatibility

[Libaec can replace SZIP](README.SZIP).

## Encoding

In this context efficiency refers to the size of the encoded
data. Performance refers to the time it takes to encode data.

Suppose you have an array of 32 bit signed integers you want to
compress. The pointer to the data is called `source` and the
output goes into `dest`.

```c
#include <libaec.h>

...
    struct aec_stream strm;
    int32_t *source;
    unsigned char *dest;

    /* input data is 32 bits wide */
    strm.bits_per_sample = 32;

    /* define a block size of 16 */
    strm.block_size = 16;

    /* the reference sample interval is set to 128 blocks */
    strm.rsi = 128;

    /* input data is signed and needs to be preprocessed */
    strm.flags = AEC_DATA_SIGNED | AEC_DATA_PREPROCESS;

    /* pointer to input */
    strm.next_in = (unsigned char *)source;

    /* length of input in bytes */
    strm.avail_in = source_length * sizeof(int32_t);

    /* pointer to output buffer */
    strm.next_out = dest;

    /* length of output buffer in bytes */
    strm.avail_out = dest_length;

    /* initialize encoding */
    if (aec_encode_init(&strm) != AEC_OK)
        return 1;

    /* Perform encoding in one call and flush output. */
    /* In this example you must be sure that the output */
    /* buffer is large enough for all compressed output */
    if (aec_encode(&strm, AEC_FLUSH) != AEC_OK)
        return 1;

    /* free all resources used by encoder */
    aec_encode_end(&strm);
...
```

`block_size` can vary from 8 to 64 samples. Smaller blocks allow the
compression to adapt more rapidly to changing source
statistics. Larger blocks create less overhead but can be less
efficient if source statistics change across the block.

`rsi` sets the reference sample interval in blocks. A large RSI will
improve performance and efficiency. It will also increase memory
requirements since internal buffering is based on RSI size. A smaller
RSI may be desirable in situations where errors could occur in the
transmission of encoded data and the resulting propagation of errors
in decoded data has to be minimized.
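
The trade-off can be made concrete with a little arithmetic. The helpers below are hypothetical and only count samples; they assume that each reference sample interval starts from its own reference sample, so the number of intervals is also the number of points at which decoding can start afresh.

```c
#include <assert.h>
#include <stddef.h>

/* Samples covered by one reference sample interval (RSI). */
size_t rsi_samples(unsigned block_size, unsigned rsi)
{
    assert(block_size > 0 && rsi > 0);
    return (size_t)block_size * rsi;
}

/* Number of RSIs needed to cover n_samples; the last one may be partial. */
size_t rsi_count(size_t n_samples, unsigned block_size, unsigned rsi)
{
    size_t per_rsi = rsi_samples(block_size, rsi);
    return (n_samples + per_rsi - 1) / per_rsi;
}
```

With `block_size = 16` and `rsi = 128`, one interval covers 2048 samples, which is 8192 bytes of 32 bit input. An array of 1,000,000 samples then spans 489 intervals.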

### Flags:

* `AEC_DATA_SIGNED`: input data are signed integers. Specifying this
  correctly increases compression efficiency. Default is unsigned.

* `AEC_DATA_PREPROCESS`: preprocessing input will improve compression
  efficiency if data samples are correlated. It will only cost
  performance for no gain in efficiency if the data is already
  uncorrelated.

* `AEC_DATA_MSB`: input data is stored most significant byte first,
  i.e. big endian. Default is little endian on all architectures.

* `AEC_DATA_3BYTE`: the 17 to 24 bit input data is stored in three
  bytes. This flag has no effect for other sample sizes.

* `AEC_RESTRICTED`: use a restricted set of code options. This option is
  only valid for `bits_per_sample` <= 4.
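
Why preprocessing helps correlated data can be seen in a small sketch of a unit-delay predictor with residual mapping, written from a reading of the preprocessor stage in the standard. It handles unsigned samples only, `preprocess_unsigned` is a hypothetical name, and libaec's own implementation is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Map unsigned samples of the given bit width to non-negative prediction
 * residuals. The prediction for each sample is the previous sample.
 * out[0] is the reference sample and is stored unchanged. */
void preprocess_unsigned(const uint32_t *in, size_t n, unsigned bits,
                         uint32_t *out)
{
    assert(bits >= 1 && bits <= 32);
    if (n == 0)
        return;
    const uint32_t xmax =
        (bits == 32) ? UINT32_MAX : (((uint32_t)1 << bits) - 1u);
    assert(in[0] <= xmax);
    out[0] = in[0];
    for (size_t i = 1; i < n; i++) {
        uint32_t pred = in[i - 1];
        uint32_t x = in[i];
        assert(x <= xmax);
        /* theta: distance from the prediction to the nearer range limit */
        uint32_t theta = (pred < xmax - pred) ? pred : xmax - pred;
        if (x >= pred) {
            uint32_t d = x - pred;            /* non-negative error */
            out[i] = (d <= theta) ? 2u * d : theta + d;
        } else {
            uint32_t d = pred - x;            /* magnitude of negative error */
            out[i] = (d <= theta) ? 2u * d - 1u : theta + d;
        }
    }
}
```

For the 8 bit input `{10, 12, 11, 11, 200}` this yields `{10, 4, 1, 0, 200}`. Neighbouring samples that are close together map to small values, which the entropy coder can represent in few bits, while the jump to 200 gains nothing.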

### Data size:

The following rules apply for deducing storage size from sample size
(`bits_per_sample`):

 **sample size**  | **storage size**
--- | ---
 1 -  8 bits  | 1 byte
 9 - 16 bits  | 2 bytes
17 - 24 bits  | 3 bytes (only if `AEC_DATA_3BYTE` is set)
25 - 32 bits  | 4 bytes (if `AEC_DATA_3BYTE` is set)
17 - 32 bits  | 4 bytes (if `AEC_DATA_3BYTE` is not set)

If a sample requires fewer bits than the storage size provides, then
you have to make sure that unused bits are not set. Libaec does not
enforce this for performance reasons and will produce undefined output
if unused bits are set. The length of the input data must be a multiple
of the storage size in bytes. Remaining bytes which do not form a
complete sample will be ignored.

Libaec accesses `next_in` and `next_out` buffers only bytewise. There
are no alignment requirements for these buffers.
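
The storage rules can be expressed as a few helper functions. This is a sketch only: the names are hypothetical, and the `three_byte` and `msb_first` arguments merely mirror whether `AEC_DATA_3BYTE` and `AEC_DATA_MSB` would be set; nothing here calls libaec.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Storage size in bytes for a sample size, following the table above.
 * Returns 0 for sample sizes outside 1..32 bits. */
size_t storage_bytes(unsigned bits_per_sample, int three_byte)
{
    if (bits_per_sample < 1 || bits_per_sample > 32)
        return 0;
    if (bits_per_sample <= 8)
        return 1;
    if (bits_per_sample <= 16)
        return 2;
    if (bits_per_sample <= 24 && three_byte)
        return 3;
    return 4;
}

/* Write one sample bytewise into dst, least significant byte first by
 * default or most significant byte first if msb_first is non-zero. */
void put_sample(unsigned char *dst, uint32_t sample, size_t nbytes,
                int msb_first)
{
    assert(nbytes >= 1 && nbytes <= 4);
    for (size_t i = 0; i < nbytes; i++) {
        size_t shift = 8 * (msb_first ? nbytes - 1 - i : i);
        dst[i] = (unsigned char)((sample >> shift) & 0xFFu);
    }
}

/* Bytes of an input buffer that form complete samples. */
size_t usable_bytes(size_t len, size_t nbytes)
{
    assert(nbytes > 0);
    return len - len % nbytes;
}
```

A 20 bit sample `0x123456` packed into three bytes becomes `56 34 12` in the default byte order and `12 34 56` with most significant byte first. Because the buffers are only accessed bytewise, such a packed buffer needs no particular alignment.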

### Flushing:

`aec_encode` can be used in a streaming fashion by chunking input and
output and specifying `AEC_NO_FLUSH`. The function will return if either
the input runs empty or the output buffer is full. The calling
function can check `avail_in` and `avail_out` to see what occurred. The
last call to `aec_encode()` must set `AEC_FLUSH` to drain all
output. [aec.c](src/aec.c) is an example of streaming usage of encoding and
decoding.

### Output:

Encoded data will be written to the buffer submitted with
`next_out`. The length of the compressed data is `total_out`.

See libaec.h for a detailed description of all relevant structure
members and constants.


## Decoding

Decoding works much like encoding; only the roles of input and output
are reversed.

```c
#include <libaec.h>

...
    struct aec_stream strm;
    /* this is now the compressed data */
    unsigned char *source;
    /* here goes the uncompressed result */
    int32_t *dest;

    strm.bits_per_sample = 32;
    strm.block_size = 16;
    strm.rsi = 128;
    strm.flags = AEC_DATA_SIGNED | AEC_DATA_PREPROCESS;
    strm.next_in = source;
    strm.avail_in = source_length;
    strm.next_out = (unsigned char *)dest;
    strm.avail_out = dest_length * sizeof(int32_t);
    if (aec_decode_init(&strm) != AEC_OK)
        return 1;
    if (aec_decode(&strm, AEC_FLUSH) != AEC_OK)
        return 1;
    aec_decode_end(&strm);
...
```

It is strongly recommended that the size of the output buffer
(`next_out`) be a multiple of the storage size in bytes. If the buffer
is not a multiple of the storage size and the buffer gets filled to
the last sample, the error code `AEC_MEM_ERROR` is returned.

It is essential for decoding that parameters like `bits_per_sample`,
`block_size`, `rsi`, and `flags` are exactly the same as they were for
encoding. Libaec does not store these parameters in the coded stream
so it is up to the calling program to keep the correct parameters
between encoding and decoding.

The actual values of coding parameters are in fact only relevant for
efficiency and performance. Data integrity only depends on consistency
of the parameters.

The exact length of the original data is not preserved and must also be
transmitted out of band. The decoder can produce additional output
depending on whether the original data ended on a block boundary or on
zero blocks. The output data must therefore be truncated to the
correct length. This can also be achieved by providing an output
buffer of just the correct length.
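
One possible way to carry the parameters and the original length out of band is a small fixed-size record stored next to the compressed data. The layout below is purely hypothetical: libaec defines no such format, and `coding_params`, `params_write` and `params_read` are names invented for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical side-channel record; not defined by libaec. */
struct coding_params {
    uint32_t bits_per_sample;
    uint32_t block_size;
    uint32_t rsi;
    uint32_t flags;
    uint64_t original_bytes;  /* exact length of the uncompressed data */
};

#define PARAMS_HEADER_SIZE 24

/* Store the n low-order bytes of v, least significant byte first. */
static void put_le(unsigned char *p, uint64_t v, size_t n)
{
    for (size_t i = 0; i < n; i++)
        p[i] = (unsigned char)((v >> (8 * i)) & 0xFFu);
}

/* Read back an n byte value stored by put_le. */
static uint64_t get_le(const unsigned char *p, size_t n)
{
    uint64_t v = 0;
    for (size_t i = 0; i < n; i++)
        v |= (uint64_t)p[i] << (8 * i);
    return v;
}

/* Serialize cp into PARAMS_HEADER_SIZE bytes, independent of host endianness. */
void params_write(const struct coding_params *cp, unsigned char *out)
{
    assert(cp != NULL && out != NULL);
    put_le(out, cp->bits_per_sample, 4);
    put_le(out + 4, cp->block_size, 4);
    put_le(out + 8, cp->rsi, 4);
    put_le(out + 12, cp->flags, 4);
    put_le(out + 16, cp->original_bytes, 8);
}

/* Inverse of params_write. */
void params_read(const unsigned char *in, struct coding_params *cp)
{
    assert(in != NULL && cp != NULL);
    cp->bits_per_sample = (uint32_t)get_le(in, 4);
    cp->block_size = (uint32_t)get_le(in + 4, 4);
    cp->rsi = (uint32_t)get_le(in + 8, 4);
    cp->flags = (uint32_t)get_le(in + 12, 4);
    cp->original_bytes = get_le(in + 16, 8);
}
```

On the decoding side the record would be read first. Its fields fill in the stream parameters, and `original_bytes` gives the size of an output buffer of just the correct length.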

## References

[Lossless Data Compression. Recommendation for Space Data System
Standards, CCSDS 121.0-B-3. Blue Book. Issue 3. Washington, D.C.:
CCSDS, August 2020.][1]

[1]: https://public.ccsds.org/Pubs/121x0b3.pdf
