• Home
  • History
  • Annotate
Name Date Size #Lines LOC

..03-May-2022-

tests/H09-Jun-2020-870711

.coveragercH A D09-Jun-2020208 1615

.gitignoreH A D09-Jun-202069 87

.travis.ymlH A D09-Jun-2020981 3731

LICENSEH A D09-Jun-2020139 22

MANIFEST.inH A D09-Jun-2020103 54

MakefileH A D09-Jun-20202.8 KiB10878

README.rstH A D09-Jun-202015 KiB326227

build.shH A D09-Jun-2020132 42

creedsolo.cH A D09-Jun-20201.7 MiB40,94826,888

creedsolo.pyxH A D09-Jun-202063.4 KiB908763

reedsolo.pyH A D09-Jun-202069.1 KiB975642

setup.pyH A D09-Jun-20203.3 KiB8660

tox.iniH A D09-Jun-2020786 3731

README.rst

1Reed Solomon
2============
3
4|PyPI-Status| |PyPI-Versions| |PyPI-Downloads|
5
6|Build-Status| |Coverage|
7
8A pure-python `universal errors-and-erasures Reed-Solomon Codec <http://en.wikipedia.org/wiki/Reed%E2%80%93Solomon_error_correction>`_
9, based on the wonderful tutorial at `wikiversity <http://en.wikiversity.org/wiki/Reed%E2%80%93Solomon_codes_for_coders>`_,
10written by "Bobmath" and "LRQ3000".
11
12------------------------------------
13
14.. contents:: Table of contents
15   :backlinks: top
16   :local:
17
18
19Installation
20------------
21
22.. code:: sh
23
24    pip install --upgrade reedsolo
25
26.. note::
27
28    When installing from source using ``python setup.py install``, the setup.py will try to build the Cython optimized module ``creedsolo.pyx`` if Cython is installed. You can override this behavior by typing: ``python setup.py install --nocython``.
29
30    A pre-transpiled ``creedsolo.c`` is also available, and can be compiled without Cython by typing: ``python setup.py install --native-compile``.
31
32    The package on ``pip`` includes a pre-compiled ``creedsolo.pyd`` module for Windows 10 x64.
33
34Usage
35-----
36
37Basic usage with high-level RSCodec class
38~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
39
40.. code:: python
41
42    # Initialization
43    >>> from reedsolo import RSCodec, ReedSolomonError
44    >>> rsc = RSCodec(10)  # 10 ecc symbols
45
46    # Encoding
47    # just a list of numbers/symbols:
48    >>> rsc.encode([1,2,3,4])
49    b'\x01\x02\x03\x04,\x9d\x1c+=\xf8h\xfa\x98M'
50    # bytearrays are accepted and the output will be matched:
51    >>> rsc.encode(bytearray([1,2,3,4]))
52    bytearray(b'\x01\x02\x03\x04,\x9d\x1c+=\xf8h\xfa\x98M')
53    # encoding a byte string is as easy:
54    >>> rsc.encode(b'hello world')
55    b'hello world\xed%T\xc4\xfd\xfd\x89\xf3\xa8\xaa'
56    # Note: strings of any length, even if longer than the Galois field, will be encoded as well using transparent chunking.
57
58    # Decoding (repairing)
59    >>> rsc.decode(b'hello world\xed%T\xc4\xfd\xfd\x89\xf3\xa8\xaa')[0]  # original
60    b'hello world'
61    >>> rsc.decode(b'heXlo worXd\xed%T\xc4\xfdX\x89\xf3\xa8\xaa')[0]     # 3 errors
62    b'hello world'
63    >>> rsc.decode(b'hXXXo worXd\xed%T\xc4\xfdX\x89\xf3\xa8\xaa')[0]     # 5 errors
64    b'hello world'
65    >>> rsc.decode(b'hXXXo worXd\xed%T\xc4\xfdXX\xf3\xa8\xaa')[0]        # 6 errors - fail
66    Traceback (most recent call last):
67      ...
68    reedsolo.ReedSolomonError: Too many (or few) errors found by Chien Search for the errata locator polynomial!
69
70**Important upgrade notice for pre-1.0 users:** Note that ``RSCodec.decode()`` returns 3 variables:
71
72    1. the decoded (corrected) message
73    2. the decoded message and error correction code (which is itself also corrected)
74    3. and the list of positions of the errata (errors and erasures)
75
76Here is how to use these outputs:
77
78.. code:: python
79
80    >>> tampered_msg = b'heXlo worXd\xed%T\xc4\xfdX\x89\xf3\xa8\xaa'
81    >>> decoded_msg, decoded_msgecc, errata_pos = rsc.decode(tampered_msg)
82    >>> print(decoded_msg)  # decoded/corrected message
83    bytearray(b'hello world')
84    >>> print(decoded_msgecc)  # decoded/corrected message and ecc symbols
85    bytearray(b'hello world\xed%T\xc4\xfd\xfd\x89\xf3\xa8\xaa')
86    >>> print(errata_pos)  # errata_pos is returned as a bytearray, hardly intelligible
87    bytearray(b'\x10\t\x02')
88    >>> print(list(errata_pos))  # convert to a list to get the errata positions as integer indices
89    [16, 9, 2]
90
91Since we failed to decode with 6 errors with a codec set with 10 error correction code (ecc) symbols, let's try to use a bigger codec, with 12 ecc symbols.
92
93.. code:: python
94
95    >>> rsc = RSCodec(12)  # using 2 more ecc symbols (to correct max 6 errors or 12 erasures)
96    >>> rsc.encode(b'hello world')
97    b'hello world?Ay\xb2\xbc\xdc\x01q\xb9\xe3\xe2='
98    >>> rsc.decode(b'hello worXXXXy\xb2XX\x01q\xb9\xe3\xe2=')[0]         # 6 errors - ok, but any more would fail
99    b'hello world'
100    >>> rsc.decode(b'helXXXXXXXXXXy\xb2XX\x01q\xb9\xe3\xe2=', erase_pos=[3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 15, 16])[0]  # 12 erasures - OK
101    b'hello world'
102
103This shows that we can decode twice as many erasures (where we provide the location of errors ourselves) than errors (with unknown locations). This is the cost of error correction compared to erasure correction.
104
105To get the maximum number of errors *or* erasures that can be independently corrected (ie, not simultaneously):
106
107.. code:: python
108
109    >>> maxerrors, maxerasures = rsc.maxerrata(verbose=True)
110    This codec can correct up to 6 errors and 12 erasures independently
111    >>> print(maxerrors, maxerasures)
112    6 12
113
114To get the maximum number of errors *and* erasures that can be simultaneously corrected, you need to specify the number of errors or erasures you expect:
115
116.. code:: python
117
118    >>> maxerrors, maxerasures = rsc.maxerrata(erasures=6, verbose=True)  # we know the number of erasures, will calculate how many errors we can afford
119    This codec can correct up to 3 errors and 6 erasures simultaneously
120    >>> print(maxerrors, maxerasures)
121    3 6
122    >>> maxerrors, maxerasures = rsc.maxerrata(errors=5, verbose=True)  # we know the number of errors, will calculate how many erasures we can afford
123    This codec can correct up to 5 errors and 2 erasures simultaneously
124    >>> print(maxerrors, maxerasures)
125    5 2
126
127Note that if a chunk has more errors and erasures than the Singleton Bound as calculated by the ``maxerrata()`` method, the codec will try to raise a ``ReedSolomonError`` exception,
128but may very well not detect any error either (this is a theoretical limitation of error correction codes). In other words, error correction codes are unreliable to detect if a chunk of a message
129is corrupted beyond the Singleton Bound. If you want more reliability in errata detection, use a checksum or hash such as SHA or MD5 on your message, these are much more reliable and have no bounds
130on the number of errata (the only potential issue is with collision but the probability is very very low).
131
132Note: to catch a ``ReedSolomonError`` exception, do not forget to import it first with: ``from reedsolo import ReedSolomonError``
133
134To check if a message is tampered given its error correction symbols, without decoding, use the ``check()`` method:
135
136.. code:: python
137
138    # Checking
139    >> rsc.check(b'hello worXXXXy\xb2XX\x01q\xb9\xe3\xe2=')  # Tampered message will return False
140    [False]
141    >> rmes, rmesecc, errata_pos = rsc.decode(b'hello worXXXXy\xb2XX\x01q\xb9\xe3\xe2=')
142    >> rsc.check(rmesecc)  # Corrected or untampered message will return True
143    [True]
144    >> print('Number of detected errors and erasures: %i, their positions: %s' % (len(errata_pos), list(errata_pos)))
145    Number of detected errors and erasures: 6, their positions: [16, 15, 12, 11, 10, 9]
146
147By default, most Reed-Solomon codecs are limited to characters that can be encoded in 256 bits and with a length of maximum 256 characters. But this codec is universal, you can reduce or increase the length and maximum character value by increasing the Galois Field:
148
149.. code:: python
150
151    # To use longer chunks or bigger values than 255 (may be very slow)
152    >> rsc = RSCodec(12, nsize=4095)  # always use a power of 2 minus 1
153    >> rsc = RSCodec(12, c_exp=12)  # alternative way to set nsize=4095
154    >> mes = 'a' * (4095-12)
155    >> mesecc = rsc.encode(mes)
156    >> mesecc[2] = 1
157    >> mesecc[-1] = 1
158    >> rmes, rmesecc, errata_pos = rsc.decode(mesecc)
159    >> rsc.check(mesecc)
160    [False]
161    >> rsc.check(rmesecc)
162    [True]
163
164Note that the ``RSCodec`` class supports transparent chunking, so you don't need to increase the Galois Field to support longer messages, but characters will still be limited to 256 bits (or
165whatever field you set with ``c_exp``).
166
167Low-level usage via direct access to math functions
168~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
169
170If you want full control, you can skip the API and directly use the library as-is. Here's how:
171
172First you need to init the precomputed tables:
173
174.. code:: python
175
176    >> import reedsolo as rs
177    >> rs.init_tables(0x11d)
178
179Pro tip: if you get the error: ValueError: byte must be in range(0, 256), please check that your prime polynomial is correct for your field.
180Pro tip2: by default, you can only encode messages of max length and max symbol value = 256. If you want to encode bigger messages,
181please use the following (where c_exp is the exponent of your Galois Field, eg, 12 = max length 2^12 = 4096):
182
183.. code:: python
184
185    >> prim = rs.find_prime_polys(c_exp=12, fast_primes=True, single=True)
186    >> rs.init_tables(c_exp=12, prim=prim)
187
188Let's define our RS message and ecc size:
189
190.. code:: python
191
192    >> n = 255  # length of total message+ecc
193    >> nsym = 12  # length of ecc
194    >> mes = "a" * (n-nsym)  # generate a sample message
195
196To optimize, you can precompute the generator polynomial:
197
198.. code:: python
199
200    >> gen = rs.rs_generator_poly_all(n)
201
202Then to encode:
203
204.. code:: python
205
206    >> mesecc = rs.rs_encode_msg(mes, nsym, gen=gen[nsym])
207
208Let's tamper our message:
209
210.. code:: python
211
212    >> mesecc[1] = 0
213
214To decode:
215
216.. code:: python
217
218    >> rmes, recc, errata_pos = rs.rs_correct_msg(mesecc, nsym, erase_pos=erase_pos)
219
220Note that both the message and the ecc are corrected (if possible of course).
221Pro tip: if you know a few erasures positions, you can specify them in a list ``erase_pos`` to double the repair power. But you can also just specify an empty list.
222
223You can check how many errors and/or erasures were corrected, which can be useful to design adaptive bitrate algorithms:
224
225.. code:: python
226
227    >> print('A total of %i errata were corrected over all chunks of this message.' % len(errata_pos))
228
229If the decoding fails, it will normally automatically check and raise a ReedSolomonError exception that you can handle.
230However if you want to manually check if the repaired message is correct, you can do so:
231
232.. code:: python
233
234    >> rs.rs_check(rmes + recc, nsym)
235
236Note: if you want to use multiple reedsolomon with different parameters, you need to backup the globals and restore them before calling reedsolo functions:
237
238.. code:: python
239
240    >> rs.init_tables()
241    >> global gf_log, gf_exp, field_charac
242    >> bak_gf_log, bak_gf_exp, bak_field_charac = gf_log, gf_exp, field_charac
243
244
245Then at anytime, you can do:
246
247.. code:: python
248
249    >> global gf_log, gf_exp, field_charac
250    >> gf_log, gf_exp, field_charac = bak_gf_log, bak_gf_exp, bak_field_charac
251    >> mesecc = rs.rs_encode_msg(mes, nsym)
252    >> rmes, recc, errata_pos = rs.rs_correct_msg(mesecc, nsym)
253
254The globals backup is not necessary if you use RSCodec, it will be automatically managed.
255
256Read the sourcecode's comments for more info about how it works, and for the various parameters you can setup if
257you need to interface with other RS codecs.
258
259Extended description
260--------------------
261The code of wikiversity is here consolidated into a nice API with exceptions handling.
262The algorithm can correct up to 2*e+v <= nsym, where e is the number of errors,
263v the number of erasures and nsym = n-k = the number of ECC (error correction code) symbols.
264This means that you can either correct exactly floor(nsym/2) errors, or nsym erasures
265(errors where you know the position), and a combination of both errors and erasures.
266This is called the Singleton Bound, and is the maximum/optimal theoretical number
267of erasures and errors any error correction algorithm can correct (although there
268are experimental approaches to go a bit further, named list decoding, not implemented
269here, but feel free to do pull request!).
270The code should work on pretty much any reasonable version of python (2.4-3.7),
271but I'm only testing on 2.7 and 3.7. Python 3.8 should work except for Cython which is
272currently incompatible with this version.
273
274The codec has quite reasonable performances if you either use PyPy on the pure-python
275implementation (reedsolo.py) or either if you compile the Cython extension creedsolo.pyx
276(which is about 2x faster than PyPy). You can expect encoding rates of several MB/s.
277
278This library is also thoroughly unit tested so that nearly any encoding/decoding case should be covered.
279
280The codec is universal, meaning that it can decode any message encoded by another RS encoder
281as long as you provide the correct parameters.
282Note however that if you use higher fields (ie, bigger c_exp), the algorithms will be slower, first because
283we cannot then use the optimized bytearray() structure but only array.array('i', ...), and also because
284Reed-Solomon's complexity is quadratic (both in encoding and decoding), so this means that the longer
285your messages, the longer it will take to encode/decode (quadratically!).
286
287The algorithm itself can handle messages of a length up to (2^c_exp)-1 symbols per message (or chunk), including the ECC symbols,
288and each symbol can have a value of up to (2^c_exp)-1 (indeed, both the message length and the maximum
289value for one character is constrained by the same mathematical reason). By default, we use the field GF(2^8),
290which means that you are limited to values between 0 and 255 (perfect to represent a single hexadecimal
291symbol on computers, so you can encode any binary stream) and limited to messages+ecc of maximum
292length 255. However, you can "chunk" longer messages to fit them into the message length limit.
293The ``RSCodec`` class will automatically apply chunking, by splitting longer messages into chunks and
294encode/decode them separately; it shouldn't make a difference from an API perspective (ie, from your POV).
295
296
297To use the Cython implementation, you need to ``pip install cython`` and a C++ compiler (Microsoft Visual C++ 14.0 for Windows and Python 3.7). Then you can simply cd to the root of the folder where creedsolo.pyx is, and type ``python setup.py build_ext --inplace``. Alternatively, you can generate just the C++ code by typing `cython -3 creedsolo.pyx`. When building a distributable egg or installing the module from source, the Cython module will be automatically transpiled and compiled if both Cython and a C compiler are installed. This behavior can be modified using the ``--nocython`` and ``--native-compile`` arguments for ``setup.py``.
298
299Authors
300-------
301
302This module was conceived and developed by Tomer Filiba.
303
304It was further extended and is currently maintained by Stephen Karl Larroque.
305
306For a list of all contributors, see `this page <https://github.com/tomerfiliba/reedsolomon/graphs/contributors>`_.
307
308License
309-------
310
311This software is released to the Public Domain.
312
313If the Public Domain is not adequate for your purpose, you can instead consider this module under the MIT License as you prefer.
314
315
316.. |PyPI-Status| image:: https://img.shields.io/pypi/v/reedsolo.svg
317   :target: https://pypi.org/project/reedsolo
318.. |PyPI-Versions| image:: https://img.shields.io/pypi/pyversions/reedsolo.svg?logo=python&logoColor=white
319   :target: https://pypi.org/project/reedsolo
320.. |PyPI-Downloads| image:: https://img.shields.io/pypi/dm/reedsolo.svg?label=pypi%20downloads&logo=python&logoColor=white
321   :target: https://pypi.org/project/reedsolo
322.. |Build-Status| image:: https://travis-ci.org/tomerfiliba/reedsolomon.svg?branch=master
323    :target: https://travis-ci.org/tomerfiliba/reedsolomon
324.. |Coverage| image:: https://coveralls.io/repos/tomerfiliba/reedsolomon/badge.svg?branch=master&service=github
325  :target: https://coveralls.io/github/tomerfiliba/reedsolomon?branch=master
326