1Developer Tutorial 2################## 3 4This is a long tutorial to help new Pythran developer discover the Pythran 5architecture. This is *not* a developer documentation, but it aims at giving a 6good overview of Pythran capacity. 7 8It requires you are comfortable with Python, and eventually with C++11. It also 9assumes you have some compilation background, i.e. you know what an AST is and 10you don't try to escape when hearing the words alias analysis, memory effect 11computations and such. 12 13Parsing Python Code 14------------------- 15 16Python ships a standard module, ``ast`` to turn Python code into an AST. For instance:: 17 18 >>> import gast as ast 19 >>> from __future__ import print_function 20 >>> code = "a=1" 21 >>> tree = ast.parse(code) # turn the code into an AST 22 >>> print(ast.dump(tree)) # view it as a string 23 Module(body=[Assign(targets=[Name(id='a', ctx=Store(), annotation=None, type_comment=None)], value=Constant(value=1, kind=None), type_comment=None)], type_ignores=[]) 24 25Deciphering the above line, one learns that the single assignment is parsed as 26a module containing a single statement, which is an assignment to a single 27target, a ``ast.Name`` with the identifier ``a``, of the literal value ``1``. 28 29Eventually, one needs to parse more complex codes, and things get a bit more cryptic, but you get the idea:: 30 31 >>> fib_src = """ 32 ... def fib(n): 33 ... return n if n< 2 else fib(n-1) + fib(n-2)""" 34 >>> tree = ast.parse(fib_src) 35 >>> print(ast.dump(tree)) 36 Module(body=[FunctionDef(name='fib', args=arguments(args=[Name(id='n', ctx=Param(), annotation=None, type_comment=None)], posonlyargs=[], vararg=None, kwonlyargs=[], kw_defaults=[], kwarg=None, defaults=[]), body=[Return(value=IfExp(test=Compare(left=Name(id='n', ctx=Load(), annotation=None, type_comment=None), ops=[Lt()], comparators=[Constant(value=2, kind=None)]), body=Name(id='n', ctx=Load(), annotation=None, type_comment=None), orelse=BinOp(left=Call(func=Name(id='fib', ctx=Load(), annotation=None, type_comment=None), args=[BinOp(left=Name(id='n', ctx=Load(), annotation=None, type_comment=None), op=Sub(), right=Constant(value=1, kind=None))], keywords=[]), op=Add(), right=Call(func=Name(id='fib', ctx=Load(), annotation=None, type_comment=None), args=[BinOp(left=Name(id='n', ctx=Load(), annotation=None, type_comment=None), op=Sub(), right=Constant(value=2, kind=None))], keywords=[]))))], decorator_list=[], returns=None, type_comment=None)], type_ignores=[]) 37 38The idea remains the same. The whole Python syntax is described in 39http://docs.python.org/2/library/ast.html and is worth a glance, otherwise 40you'll be in serious trouble understanding the following. 41 42Pythran Pass Manager 43-------------------- 44 45A pass is a code transformation, i.e. a function that turns an AST node into a 46new AST node with refined behavior. As a compiler infrastructure, Pythran 47proposes a pass manager that (guess what?) manages pass scheduling, that is 48the order in which pass is applied to achieve the ultimate goal, world 49domination. Oooops, efficient C++11 code generation. 50 51One first need to instantiate a pass manager with a module name:: 52 53 >>> from pythran import passmanager 54 >>> pm = passmanager.PassManager("tutorial_module") 55 56The pass manager has 3 methods and two attributes:: 57 58 >>> [x for x in dir(pm) if not x.startswith('_')] 59 ['apply', 'dump', 'gather', 'module_dir', 'module_name'] 60 61``apply`` 62 applies a code transformation 63 64``dump`` 65 dumps a node using a dedicated backend 66 67``gather`` 68 gathers information about the node 69 70Pythran Backends 71---------------- 72 73Pythran currently has two backends. The main one is used to dump Pythran AST (a 74subset of Python AST) into a C++ AST:: 75 76 >>> from pythran import backend 77 >>> cxx = pm.dump(backend.Cxx, tree) 78 >>> str(cxx) 79 '#include <pythonic/include/operator_/add.hpp>\n#include <pythonic/include/operator_/lt.hpp>\n#include <pythonic/include/operator_/sub.hpp>\n#include <pythonic/operator_/add.hpp>\n#include <pythonic/operator_/lt.hpp>\n#include <pythonic/operator_/sub.hpp>\nnamespace __pythran_tutorial_module\n{\n struct fib\n {\n typedef void callable;\n typedef void pure;\n template <typename argument_type0 >\n struct type\n {\n typedef typename std::remove_cv<typename std::remove_reference<argument_type0>::type>::type __type0;\n typedef __type0 __type1;\n typedef typename pythonic::returnable<__type1>::type __type2;\n typedef __type2 result_type;\n } \n ;\n template <typename argument_type0 >\n inline\n typename type<argument_type0>::result_type operator()(argument_type0&& n) const\n ;\n } ;\n template <typename argument_type0 >\n inline\n typename fib::type<argument_type0>::result_type fib::operator()(argument_type0&& n) const\n {\n return (((bool)pythonic::operator_::lt(n, 2L)) ? typename __combined<decltype(n), decltype(pythonic::operator_::add(fib()(pythonic::operator_::sub(n, 1L)), fib()(pythonic::operator_::sub(n, 2L))))>::type(n) : typename __combined<decltype(n), decltype(pythonic::operator_::add(fib()(pythonic::operator_::sub(n, 1L)), fib()(pythonic::operator_::sub(n, 2L))))>::type(pythonic::operator_::add(fib()(pythonic::operator_::sub(n, 1L)), fib()(pythonic::operator_::sub(n, 2L)))));\n }\n}' 80 81The above string is understandable by a C++11 compiler, but it quickly reaches the limit of our developer brain, so most of the time, we are more comfortable with the Python backend:: 82 83 >>> py = pm.dump(backend.Python, tree) 84 >>> print(py) 85 def fib(n): 86 return (n if (n < 2) else (fib((n - 1)) + fib((n - 2)))) 87 88Passes 89------ 90 91There are many code transformations in Pythran. Some of them are used to lower 92the representation from Python AST to the simpler Pythran AST. For instance 93there is no tuple unpacking in Pythran, so Pythran provides an adequate 94transformation:: 95 96 >>> from pythran import transformations 97 >>> tree = ast.parse("def foo(): a,b = 1,3.5") 98 >>> _ = pm.apply(transformations.NormalizeTuples, tree) # in-place 99 >>> print(pm.dump(backend.Python, tree)) 100 def foo(): 101 __tuple0 = (1, 3.5) 102 a = __tuple0[0] 103 b = __tuple0[1] 104 105Pythran transforms the tuple unpacking into an intermediate tuple assignment. 106Note that if the unpacking statement is marked as critical using an OpenMP 107statement, then a temporary variable is used to hold the left hand side 108computation, if any:: 109 110 >>> from pythran import transformations 111 >>> tree = ast.parse(""" 112 ... def foo(x): 113 ... #omp critical 114 ... a,b = 1, x + 1 115 ... return a + b""") 116 >>> _ = pm.apply(transformations.NormalizeTuples, tree) # in-place 117 >>> print(pm.dump(backend.Python, tree)) 118 def foo(x): 119 __tuple0 = (1, (x + 1)) 120 a = __tuple0[0] 121 b = __tuple0[1] 122 return (a + b) 123 124There are many small passes used iteratively to produce the Pythran AST. For instance the implicit return at the end of every function is made explicit:: 125 126 >>> tree = ast.parse('def foo():pass') 127 >>> _ = pm.apply(transformations.NormalizeReturn, tree) 128 >>> print(pm.dump(backend.Python, tree)) 129 def foo(): 130 pass 131 return builtins.None 132 133More complex ones rely on introspection to implement constant folding:: 134 135 >>> from __future__ import print_function 136 >>> code = [fib_src, 'def foo(): return builtins.map(fib, [1,2,3])'] 137 >>> fib_call = '\n'.join(code) 138 >>> tree = ast.parse(fib_call) 139 >>> from pythran import optimizations as optim 140 >>> _ = pm.apply(optim.ConstantFolding, tree) 141 >>> print(pm.dump(backend.Python, tree)) 142 def fib(n): 143 return (n if (n < 2) else (fib((n - 1)) + fib((n - 2)))) 144 def foo(): 145 return [1, 1, 2] 146 147One can also detect some common generator expression patterns to call the itertool module:: 148 149 >>> norm = 'def norm(l): return builtins.sum(n*n for n in l)' 150 >>> tree = ast.parse(norm) 151 >>> _ = pm.apply(optim.ComprehensionPatterns, tree) 152 >>> 'map' in pm.dump(backend.Python, tree) 153 True 154 155 156Analysis 157-------- 158 159All Pythran passes are backed up by analysis. Pythran provides three levels of analysis:: 160 161 >>> passmanager.FunctionAnalysis 162 <class 'pythran.passmanager.FunctionAnalysis'> 163 >>> passmanager.ModuleAnalysis 164 <class 'pythran.passmanager.ModuleAnalysis'> 165 >>> passmanager.NodeAnalysis 166 <class 'pythran.passmanager.NodeAnalysis'> 167 168Lets examine the information Pythran can extract from a Pythran-compatible 169Python code. 170 171A simple analyse gathers informations concerning used identifiers across the 172module. It can be used, for instance, to generate new unique identifiers:: 173 174 >>> from pythran import analyses 175 >>> code = 'a = b = 1' 176 >>> tree = ast.parse(code) 177 >>> sorted(pm.gather(analyses.Identifiers, tree)) 178 ['a', 'b'] 179 180One can also computes the state of ``globals()``:: 181 182 >>> code = 'import math\n' 183 >>> code += 'def foo(a): b = math.cos(a) ; return [b] * 3' 184 >>> tree = ast.parse(code) 185 >>> sorted(list(pm.gather(analyses.Globals, tree))) 186 ['__dispatch__', 'builtins', 'foo', 'math'] 187 188One can also compute the state of ``locals()`` at any point of the program:: 189 190 >>> l = pm.gather(analyses.Locals, tree) 191 >>> fdef = tree.body[-1] 192 >>> freturn = fdef.body[-1] 193 >>> sorted(l[freturn]) 194 ['a', 'b', 'math'] 195 196The ``ConstantFolding`` pass relies on the eponymous analyse that flags all 197constant expressions. In the previous code, there is only two constant 198*expressions* but only one can be evaluate:: 199 200 >>> ce = pm.gather(analyses.ConstantExpressions, tree) 201 >>> sorted(map(ast.dump, ce)) 202 ["Attribute(value=Name(id='math', ctx=Load(), annotation=None, type_comment=None), attr='cos', ctx=Load())", 'Constant(value=3, kind=None)'] 203 204One of the most critical analyse of Pythran is the points-to analysis. There 205are two flavors of this analyse, one that computes an over-set of the aliased 206variable, and one that computes an under set. ``Aliases`` computes an over-set:: 207 208 >>> code = 'def foo(c, d): b= c or d ; return b' 209 >>> tree = ast.parse(code) 210 >>> al = pm.gather(analyses.Aliases, tree) 211 >>> returned = tree.body[-1].body[-1].value 212 >>> print(ast.dump(returned)) 213 Name(id='b', ctx=Load(), annotation=None, type_comment=None) 214 >>> sorted(a.id for a in al[returned]) 215 ['c', 'd'] 216 217Pythran also implements an inter-procedural analyse to compute which arguments 218are updated, for instance using an augmented assign, or the ``append`` method:: 219 220 >>> code = 'def foo(l,a): l+=[a]\ndef bar(g): foo(g, 1)' 221 >>> tree = ast.parse(code) 222 >>> ae = pm.gather(analyses.ArgumentEffects, tree) 223 >>> foo, bar = tree.body[0], tree.body[1] 224 >>> ae[foo] 225 [True, False] 226 >>> ae[bar] 227 [True] 228 229From this analyse and the ``GlobalEffects`` analyse, one can compute the set of 230pure functions, i.e. functions that have no side effects:: 231 232 >>> code = 'import random\ndef f():pass\ndef b(l): random.seed(0)' 233 >>> tree = ast.parse(code) 234 >>> pf = pm.gather(analyses.PureExpressions, tree) 235 >>> f = tree.body[1] 236 >>> b = tree.body[2] 237 >>> f in pf 238 True 239 >>> b in pf 240 False 241 242Pure functions are also interesting in the context of ``map``, as the 243application of a pure functions using a map results in a parallel ``map``:: 244 245 >>> code = 'def foo(x): return x*x\n' 246 >>> code += 'builtins.map(foo, builtins.range(100))' 247 >>> tree = ast.parse(code) 248 >>> pmaps = pm.gather(analyses.ParallelMaps, tree) 249 >>> len(pmaps) 250 1 251