1.. _sec-theory: 2 3Concepts 4======== 5 6This section describes some of the theory used in arybo/petanque, the choices 7that have been made and the various limitations that exist. 8 9.. _sec-theory-anf: 10 11Representation of boolean expressions and bit-vectors 12----------------------------------------------------- 13 14Boolean expressions are represented in the petanque library using the Algebric 15Normal Form (ANF). The advantages are that it is complete (that is any boolean 16expression can be represented using only XOR and AND operations) and it has an 17algebraic structure (see :ref:`sec-theory-app`). 18 19The ANF form is a series of AND operations xored together, with a final 20potential binary XOR with 1. So, the expression tree has an addition as root, 21and its children are only AND operators, and optionally one 1 in the end. For 22instance, :math:`(x \land a) \oplus (x \land b) \oplus 1` is an ANF, whereas 23:math:`x \land (a \oplus b) \oplus 1` is not. On the last one, the AND 24operation must be expand to get the ANF version. That being said, the petanque 25library is able to consider any boolean expression with OR, AND, XOR and 26:ref:`ESF <sec-theory-esf>` operators (with any depth), and can compute the ANF 27form of these expressions. 28 29The petanque library considers expressions modulo 2, and is then using 30additions and multiplications to respectively represent binary XOR and AND. 31Here are some examples using the ``iarybo`` shell: 32 33.. code:: 34 35 $ iarybo 1 36 In [1]: x^y 37 Out[1]: 38 Vec([ 39 (x0 + y0) 40 ]) 41 42 In [2]: x&y 43 Out[2]: 44 Vec([ 45 (x0 * y0) 46 ]) 47 48 In [3]: x|y 49 Out[3]: 50 Vec([ 51 ((x0 * y0) + x0 + y0) 52 ]) 53 54What is called in the various API "simplify" is the process of expanding any supported 55boolean expression into an ANF. For instance, one could create this expression: 56 57.. code:: python 58 59 from arybo.lib import MBA 60 mba = MBA(1) 61 x = mba.var('x') 62 y = mba.var('y') 63 z = mba.var('z') 64 # (x XOR y) AND z 65 e = (x[0]+y[0])*z[0] 66 >>> print(e) 67 ((x0 + y0) * a0) 68 69Using :meth:`arybo.lib.MBAVariable.simplify` will transform this expression 70into its ANF form: 71 72.. code:: python 73 74 from arybo.lib import simplify 75 e_s = simplify(e) 76 >>> print(e_s) 77 ((x0 * a0) + (y0 * a0)) 78 79.. _sec-theory-esf: 80 81Elementary symmetric functions 82~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 83 84Elementary symmetric functions (ESF) are functions whose outputs do not depend 85on the order of their input boolean variables, which means their output values 86only depend on the Hamming weight of the input vector. An elementary symmetric 87function of degree :math:`d` with :math:`k` input variables is a boolean 88function defined as: 89 90.. math:: 91 92 esf_d:~ \mathbb{F}_2^k & \longrightarrow\mathbb{F}_2^k\\ 93 (x_1, \dots, x_k) & \longmapsto \bigoplus_{1\le j_1 < j_2 < \ldots < j_d \le k} x_{j_1} \dotsm x_{j_d} 94 95Some examples: 96 97.. math:: 98 99 esf_1(a,b,c) &= a \oplus \oplus b \oplus c\\ 100 esf_2(a,b) &= ab \\ 101 esf_2(a,b,c) &= ab \oplus ac \oplus bc 102 103They are interesting because they occur "naturally" in many parts of arithmetic 104operators expressed at their boolean levels (see an example with the addition below). 105 106Moreover, we can prove that: 107 108.. math:: 109 110 x_1 \lor \dotsb \lor x_n = \bigoplus\limits_{d=1}^{n} esf_d(x_1,\dotsc,x_n). 111 112This can be useful to identify OR operations within an ANF expression, as we 113can see in this example (OR identification is still experimental, and thus 114needs to be explicitly imported from pytanque): 115 116.. code:: 117 118 $ iarybo 1 119 In [1]: e=a|b|c|d 120 121 In [2]: e 122 Out[2]: 123 Vec([ 124 ((a0 * b0) + (a0 * c0) + (a0 * d0) + (b0 * c0) + (b0 * d0) + (c0 * d0) + (a0 * b0 * c0) + (a0 * b0 * d0) + (a0 * c0 * d0) + (b0 * c0 * d0) + (a0 * b0 * c0 * d0) + a0 + b0 + c0 + d0) 125 ]) 126 127 In [3]: find_esfs(e[0]) 128 Out[3]: [ESF(3, a0, b0, c0, d0), ESF(2, a0, b0, c0, d0)] 129 130 In [4]: e 131 Out[4]: 132 Vec([ 133 (ESF(2, a0, b0, c0, d0) + ESF(3, a0, b0, c0, d0) + (a0 * b0 * c0 * d0) + a0 + b0 + c0 + d0) 134 ]) 135 136 In [5]: import pytanque 137 138 In [6]: pytanque.identify_ors_inplace(e[0]) 139 Out[6]: True 140 141 In [7]: e 142 Out[7]: 143 Vec([ 144 (a0 | b0 | c0 | d0) 145 ]) 146 147.. _sec-theory-solver: 148 149Boolean expression solver 150~~~~~~~~~~~~~~~~~~~~~~~~~ 151 152A naive boolean expression solver has been implemented. It basically takes as 153input a boolean expression containing a given number of symbolic values, and 154produces (potentially symbolic) bit-vectors that make the boolean expression 155true or false (according to the user's demand). 156 157A usage example is described :ref:`here <sec-tutorial-dirac>`. 158 159Integer arithmetic operations 160----------------------------- 161 162Addition/substraction 163~~~~~~~~~~~~~~~~~~~~~ 164 165Addition is computed symbolically using the algorithm behind a 1-bit logical 166adder. Basically, for an n-bit addition, :math:`n-1` carry bits are computed 167one after the other, according to the previous results. More formally, :math:`R 168= x+y` is computed like this (with :math:`R`, :math:`x` and :math:`y` n-bit 169variables): 170 171.. math:: 172 :nowrap: 173 174 \begin{eqnarray} 175 &R_i = x_i \oplus y_i \oplus c_i\\ 176 &\text{with } 177 \begin{cases} 178 c_0 = 0\\ 179 c_{i+1} = x_i \cdot y_i \oplus c_i \cdot (x_i \oplus y_i)\\ 180 \end{cases} \label{eq:carry} 181 \end{eqnarray} 182 183Using the ESF described above, :math:`c_i` can be rewritten as this: 184 185.. math:: 186 187 c_{i+1} = esf_2(x_i, y_i, c_i) 188 189An optimization can be done if :math:`y` is a constant known at runtime. It uses the fact that: 190 191.. math:: 192 193 x+y = (x \oplus y) + ((x \land y) \ll 1) 194 195 196By applying recursively this formula and because :math:`x+0 = x`, we can write 197the following recursive algorithm: 198 199.. code:: python 200 201 def add(x,y): 202 if (y == 0): return x 203 return add(x^y, (x&y)<<1) 204 205For instance, if 206 207.. math:: 208 209 y = (0 \dots 0~1)^\intercal 210 211then the addition will be reduced to only one XOR in one loop iteration, while 212the original algorithm would have gone through the computation of every carry 213bit. 214 215Multiplication 216~~~~~~~~~~~~~~ 217 218The multiplication is using the fact that: 219 220.. math:: 221 222 x \times y &= x \times (\sum\limits_{i=0}^n 2^{i}y_{i}) \\ 223 &= \sum\limits_{i=0}^n x\times 2^{i}y_{i} \\ 224 &= \sum\limits_{i=0}^n (x \ll i) \times y_{i} 225 226An n-bit multiplication is thus performed using :math:`n` multiplication. 227 228Division by a known constant 229~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 230 231Only a division by a known constant at runtime is supported in Arybo for 232the moment. The main idea is to transform a division by a :math:`n`-bit constant into 233a multiplication by a :math:`2n`-bit constant and a right logical shift. 234 235The details of the complete algorithm are in the `Hacker's Delight 236<http://www.hackersdelight.org/>`_ book. It also can be found in some 237optimization libraries, for instance in `libdivide 238<http://www.libdivide.org>`_. 239 240.. _sec-theory-app: 241 242Application 243----------- 244 245Applications are functions that take a :math:`m`-bit vector as input and 246produce an `n`-bit vector. They are represented within petanque in two parts: 247 248 * a non-linear part called `NL` 249 * an affine part composed of a :math:`m*n` matrix `M` and a constant vector `V` 250 251This construction is possible because of the ANF form. 252 253In petanque, a process called "vectorial decomposition" allows the creation 254of such application from a bit-vector and list of symbolic inputs to consider. 255Here is an example that creates the application associated with the operation 256:math:`x+1`, for a 4-bit input :math:`x`: 257 258.. code:: python 259 260 from mba.lib import MBA 261 mba = MBA(4) 262 x = mba.var('x') 263 r = x+1 264 F = x.vectorial_decomp([x]) 265 >>> print(F) 266 App NL = Vec([ 267 0, 268 0, 269 (_0 * _1), 270 (_0 * _1 * _2) 271 ]) 272 AffApp matrix = Mat([ 273 [1, 0, 0, 0] 274 [1, 1, 0, 0] 275 [0, 0, 1, 0] 276 [0, 0, 0, 1] 277 ]) 278 AffApp cst = Vec([ 279 1, 280 0, 281 0, 282 0 283 ]) 284 285Inverse of an application 286~~~~~~~~~~~~~~~~~~~~~~~~~ 287 288Arybo is able to inverse a subset of the invertible applications, without 289computing the whole truth table and inverting it (which can be really memory 290and computation intensive for application over 32-bits input for instance). 291 292The two kinds of invertible application Arybo is able to invert are: 293 294 * affine/linear application with an invertible `M` matrix 295 * application with a non-linear part which is a `T function 296 <http://link.springer.com/chapter/10.1007%2F3-540-36400-5_34>`_. Every 297 arithmetic operation supported by Arybo falls into that category. The main 298 idea is to resolve the non-linear system using a classical substitution 299 technic. 300 301The example above is a good candidate: 302 303.. code:: python 304 305 from mba.lib import MBA, app_inverse, simplify 306 mba = MBA(4) 307 x = mba.var('x') 308 r = x+1 309 F = x.vectorial_decomp([x]) 310 Finv = app_inverse(F) 311 >>> print(simplify(F(Finv(x.vec)))) 312 Vec([ 313 x0, 314 x1, 315 x2, 316 x3 317 ]) 318 319A random permutation is a good example of an application Arybo can't invert 320(yet). Indeed, chances are very low to fall into one of the two categories 321mentioned above: 322 323.. code:: python 324 325 import random 326 from mba.lib import MBA, app_inverse, simplify 327 mba = MBA(4) 328 P = list(range(16)) 329 random.shuffle(P) 330 E,X = mba.permut2expr(P) 331 F = E.vectorial_decomp([X]) 332 >>> print(F) 333 App NL = Vec([ 334 ((_0 * _1) + (_0 * _2) + (_1 * _2)), 335 ((_0 * _1) + (_0 * _3) + (_2 * _3) + (_0 * _1 * _2) + (_0 * _1 * _3) + (_0 * _2 * _3)), 336 ((_0 * _1) + (_0 * _2) + (_1 * _2) + (_1 * _3) + (_0 * _2 * _3)), 337 ((_1 * _3) + (_2 * _3) + (_1 * _2 * _3)) 338 ]) 339 AffApp matrix = Mat([ 340 [0, 1, 1, 0] 341 [0, 1, 1, 1] 342 [1, 1, 0, 1] 343 [1, 1, 1, 0] 344 ]) 345 AffApp cst = Vec([ 346 0, 347 0, 348 1, 349 1 350 ]) 351 >>> print(app_inverse(F)) 352 None 353 354 355What could be improved 356---------------------- 357 358* test different ways of storing boolean expressions within petanque. We are 359 currently using a "sorted vector" (that is a vector whose elements are always 360 sorted), which has the advantage of consuming less memory than a tree but has 361 a more important cost when inserting and removing elements (as we need to 362 move the other elements each time). 363 364* implement a C++ version of the Arybo library (while keeping the Python 365 version for testing/fallback purposes), so that it could be used natively in 366 other libraries, or in other languages through various bindings 367 368Interesting idea/algorithms to implement 369---------------------------------------- 370 371* the algorithm described by Alex Biryukov, Christophe De Cannière, An Braeken 372 and Bart Preneel in `this paper 373 <http://www.iacr.org/cryptodb/archive/2003/EUROCRYPT/2059/2059.ps>`_ that 374 allows to find, for two arbitrary permutations :math:`S_1` and :math:`S_2`, 375 two invertible linear functions :math:`L_1` and :math:`L_2` such as 376 :math:`S_2 = L_1 \circ S_1 \circ S_2`. 377 378* find interesting equalities involving ESF that would make the 379 canonicalisation of some MBA much faster and less memory-consuming (as we 380 would simplify ESFs directly, without expanding them) 381