1.. _sec-theory:
2
3Concepts
4========
5
6This section describes some of the theory used in arybo/petanque, the choices
7that have been made and the various limitations that exist.
8
9.. _sec-theory-anf:
10
11Representation of boolean expressions and bit-vectors
12-----------------------------------------------------
13
14Boolean expressions are represented in the petanque library using the Algebric
15Normal Form (ANF). The advantages are that it is complete (that is any boolean
16expression can be represented using only XOR and AND operations) and it has an
17algebraic structure (see :ref:`sec-theory-app`).
18
19The ANF form is a series of AND operations xored together, with a final
20potential binary XOR with 1. So, the expression tree has an addition as root,
21and its children are only AND operators, and optionally one 1 in the end. For
22instance, :math:`(x \land a) \oplus (x \land b) \oplus 1` is an ANF, whereas
23:math:`x \land (a \oplus b) \oplus 1` is not. On the last one, the AND
24operation must be expand to get the ANF version. That being said, the petanque
25library is able to consider any boolean expression with OR, AND, XOR and
26:ref:`ESF <sec-theory-esf>` operators (with any depth), and can compute the ANF
27form of these expressions.
28
29The petanque library considers expressions modulo 2, and is then using
30additions and multiplications to respectively represent binary XOR and AND.
31Here are some examples using the ``iarybo`` shell:
32
33.. code::
34
35    $ iarybo 1
36    In [1]: x^y
37    Out[1]:
38    Vec([
39    (x0 + y0)
40    ])
41
42    In [2]: x&y
43    Out[2]:
44    Vec([
45    (x0 * y0)
46    ])
47
48    In [3]: x|y
49    Out[3]:
50    Vec([
51    ((x0 * y0) + x0 + y0)
52    ])
53
54What is called in the various API "simplify" is the process of expanding any supported
55boolean expression into an ANF. For instance, one could create this expression:
56
57.. code:: python
58
59   from arybo.lib import MBA
60   mba = MBA(1)
61   x = mba.var('x')
62   y = mba.var('y')
63   z = mba.var('z')
64   # (x XOR y) AND z
65   e = (x[0]+y[0])*z[0]
66   >>> print(e)
67   ((x0 + y0) * a0)
68
69Using :meth:`arybo.lib.MBAVariable.simplify` will transform this expression
70into its ANF form:
71
72.. code:: python
73
74   from arybo.lib import simplify
75   e_s = simplify(e)
76   >>> print(e_s)
77   ((x0 * a0) + (y0 * a0))
78
79.. _sec-theory-esf:
80
81Elementary symmetric functions
82~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
83
84Elementary symmetric functions (ESF) are functions whose outputs do not depend
85on the order of their input boolean variables, which means their output values
86only depend on the Hamming weight of the input vector. An elementary symmetric
87function of degree :math:`d` with :math:`k` input variables is a boolean
88function defined as:
89
90.. math::
91
92	esf_d:~ \mathbb{F}_2^k & \longrightarrow\mathbb{F}_2^k\\
93	(x_1, \dots, x_k)      & \longmapsto \bigoplus_{1\le  j_1 < j_2 < \ldots < j_d \le k} x_{j_1} \dotsm x_{j_d}
94
95Some examples:
96
97.. math::
98
99	esf_1(a,b,c) &= a \oplus \oplus b \oplus c\\
100	esf_2(a,b) &= ab \\
101	esf_2(a,b,c) &= ab \oplus ac \oplus bc
102
103They are interesting because they occur "naturally" in many parts of arithmetic
104operators expressed at their boolean levels (see an example with the addition below).
105
106Moreover, we can prove that:
107
108.. math::
109
110	x_1 \lor \dotsb \lor x_n = \bigoplus\limits_{d=1}^{n} esf_d(x_1,\dotsc,x_n).
111
112This can be useful to identify OR operations within an ANF expression, as we
113can see in this example (OR identification is still experimental, and thus
114needs to be explicitly imported from pytanque):
115
116.. code::
117
118   $ iarybo 1
119   In [1]: e=a|b|c|d
120
121   In [2]: e
122   Out[2]:
123   Vec([
124   ((a0 * b0) + (a0 * c0) + (a0 * d0) + (b0 * c0) + (b0 * d0) + (c0 * d0) + (a0 * b0 * c0) + (a0 * b0 * d0) + (a0 * c0 * d0) + (b0 * c0 * d0) + (a0 * b0 * c0 * d0) + a0 + b0 + c0 + d0)
125   ])
126
127   In [3]: find_esfs(e[0])
128   Out[3]: [ESF(3, a0, b0, c0, d0), ESF(2, a0, b0, c0, d0)]
129
130   In [4]: e
131   Out[4]:
132   Vec([
133   (ESF(2, a0, b0, c0, d0) + ESF(3, a0, b0, c0, d0) + (a0 * b0 * c0 * d0) + a0 + b0 + c0 + d0)
134   ])
135
136   In [5]: import pytanque
137
138   In [6]: pytanque.identify_ors_inplace(e[0])
139   Out[6]: True
140
141   In [7]: e
142   Out[7]:
143   Vec([
144   (a0 | b0 | c0 | d0)
145   ])
146
147.. _sec-theory-solver:
148
149Boolean expression solver
150~~~~~~~~~~~~~~~~~~~~~~~~~
151
152A naive boolean expression solver has been implemented. It basically takes as
153input a boolean expression containing a given number of symbolic values, and
154produces (potentially symbolic) bit-vectors that make the boolean expression
155true or false (according to the user's demand).
156
157A usage example is described :ref:`here <sec-tutorial-dirac>`.
158
159Integer arithmetic operations
160-----------------------------
161
162Addition/substraction
163~~~~~~~~~~~~~~~~~~~~~
164
165Addition is computed symbolically using the algorithm behind a 1-bit logical
166adder. Basically, for an n-bit addition, :math:`n-1` carry bits are computed
167one after the other, according to the previous results. More formally, :math:`R
168= x+y` is computed like this (with :math:`R`, :math:`x` and :math:`y` n-bit
169variables):
170
171.. math::
172    :nowrap:
173
174    \begin{eqnarray}
175    &R_i = x_i \oplus y_i \oplus c_i\\
176    &\text{with }
177    \begin{cases}
178    c_0 = 0\\
179    c_{i+1} = x_i \cdot y_i \oplus c_i \cdot (x_i \oplus y_i)\\
180    \end{cases} \label{eq:carry}
181    \end{eqnarray}
182
183Using the ESF described above, :math:`c_i` can be rewritten as this:
184
185.. math::
186
187    c_{i+1} = esf_2(x_i, y_i, c_i)
188
189An optimization can be done if :math:`y` is a constant known at runtime. It uses the fact that:
190
191.. math::
192
193    x+y = (x \oplus y) + ((x \land y) \ll 1)
194
195
196By applying recursively this formula and because :math:`x+0 = x`, we can write
197the following recursive algorithm:
198
199.. code:: python
200
201    def add(x,y):
202      if (y == 0): return x
203      return add(x^y, (x&y)<<1)
204
205For instance, if
206
207.. math::
208
209    y = (0 \dots 0~1)^\intercal
210
211then the addition will be reduced to only one XOR in one loop iteration, while
212the original algorithm would have gone through the computation of every carry
213bit.
214
215Multiplication
216~~~~~~~~~~~~~~
217
218The multiplication is using the fact that:
219
220.. math::
221
222	x \times y &= x \times (\sum\limits_{i=0}^n 2^{i}y_{i}) \\
223        &= \sum\limits_{i=0}^n x\times 2^{i}y_{i} \\
224		&= \sum\limits_{i=0}^n (x \ll i) \times y_{i}
225
226An n-bit multiplication is thus performed using :math:`n` multiplication.
227
228Division by a known constant
229~~~~~~~~~~~~~~~~~~~~~~~~~~~~
230
231Only a division by a known constant at runtime is supported in Arybo for
232the moment. The main idea is to transform a division by a :math:`n`-bit constant into
233a multiplication by a :math:`2n`-bit constant and a right logical shift.
234
235The details of the complete algorithm are in the `Hacker's Delight
236<http://www.hackersdelight.org/>`_ book. It also can be found in some
237optimization libraries, for instance in `libdivide
238<http://www.libdivide.org>`_.
239
240.. _sec-theory-app:
241
242Application
243-----------
244
245Applications are functions that take a :math:`m`-bit vector as input and
246produce an `n`-bit vector. They are represented within petanque in two parts:
247
248 * a non-linear part called `NL`
249 * an affine part composed of a :math:`m*n` matrix `M` and a constant vector `V`
250
251This construction is possible because of the ANF form.
252
253In petanque, a process called "vectorial decomposition" allows the creation
254of such application from a bit-vector and list of symbolic inputs to consider.
255Here is an example that creates the application associated with the operation
256:math:`x+1`, for a 4-bit input :math:`x`:
257
258.. code:: python
259
260    from mba.lib import MBA
261    mba = MBA(4)
262    x = mba.var('x')
263    r = x+1
264    F = x.vectorial_decomp([x])
265    >>> print(F)
266    App NL = Vec([
267    0,
268    0,
269    (_0 * _1),
270    (_0 * _1 * _2)
271    ])
272    AffApp matrix = Mat([
273    [1, 0, 0, 0]
274    [1, 1, 0, 0]
275    [0, 0, 1, 0]
276    [0, 0, 0, 1]
277    ])
278    AffApp cst = Vec([
279    1,
280    0,
281    0,
282    0
283    ])
284
285Inverse of an application
286~~~~~~~~~~~~~~~~~~~~~~~~~
287
288Arybo is able to inverse a subset of the invertible applications, without
289computing the whole truth table and inverting it (which can be really memory
290and computation intensive for application over 32-bits input for instance).
291
292The two kinds of invertible application Arybo is able to invert are:
293
294 * affine/linear application with an invertible `M` matrix
295 * application with a non-linear part which is a `T function
296   <http://link.springer.com/chapter/10.1007%2F3-540-36400-5_34>`_. Every
297   arithmetic operation supported by Arybo falls into that category. The main
298   idea is to resolve the non-linear system using a classical substitution
299   technic.
300
301The example above is a good candidate:
302
303.. code:: python
304
305    from mba.lib import MBA, app_inverse, simplify
306    mba = MBA(4)
307    x = mba.var('x')
308    r = x+1
309    F = x.vectorial_decomp([x])
310    Finv = app_inverse(F)
311    >>> print(simplify(F(Finv(x.vec))))
312    Vec([
313    x0,
314    x1,
315    x2,
316    x3
317    ])
318
319A random permutation is a good example of an application Arybo can't invert
320(yet). Indeed, chances are very low to fall into one of the two categories
321mentioned above:
322
323.. code:: python
324
325    import random
326    from mba.lib import MBA, app_inverse, simplify
327    mba = MBA(4)
328    P = list(range(16))
329    random.shuffle(P)
330    E,X = mba.permut2expr(P)
331    F = E.vectorial_decomp([X])
332    >>> print(F)
333    App NL = Vec([
334    ((_0 * _1) + (_0 * _2) + (_1 * _2)),
335    ((_0 * _1) + (_0 * _3) + (_2 * _3) + (_0 * _1 * _2) + (_0 * _1 * _3) + (_0 * _2 * _3)),
336    ((_0 * _1) + (_0 * _2) + (_1 * _2) + (_1 * _3) + (_0 * _2 * _3)),
337    ((_1 * _3) + (_2 * _3) + (_1 * _2 * _3))
338    ])
339    AffApp matrix = Mat([
340    [0, 1, 1, 0]
341    [0, 1, 1, 1]
342    [1, 1, 0, 1]
343    [1, 1, 1, 0]
344    ])
345    AffApp cst = Vec([
346    0,
347    0,
348    1,
349    1
350    ])
351    >>> print(app_inverse(F))
352    None
353
354
355What could be improved
356----------------------
357
358* test different ways of storing boolean expressions within petanque. We are
359  currently using a "sorted vector" (that is a vector whose elements are always
360  sorted), which has the advantage of consuming less memory than a tree but has
361  a more important cost when inserting and removing elements (as we need to
362  move the other elements each time).
363
364* implement a C++ version of the Arybo library (while keeping the Python
365  version for testing/fallback purposes), so that it could be used natively in
366  other libraries, or in other languages through various bindings
367
368Interesting idea/algorithms to implement
369----------------------------------------
370
371* the algorithm described by Alex Biryukov, Christophe De Cannière, An Braeken
372  and Bart Preneel in `this paper
373  <http://www.iacr.org/cryptodb/archive/2003/EUROCRYPT/2059/2059.ps>`_ that
374  allows to find, for two arbitrary permutations :math:`S_1` and :math:`S_2`,
375  two invertible linear functions :math:`L_1` and :math:`L_2` such as
376  :math:`S_2 = L_1 \circ S_1 \circ S_2`.
377
378* find interesting equalities involving ESF that would make the
379  canonicalisation of some MBA much faster and less memory-consuming (as we
380  would simplify ESFs directly, without expanding them)
381