1Developer Tutorial
2##################
3
4This is a long tutorial to help new Pythran developer discover the Pythran
5architecture. This is *not* a developer documentation, but it aims at giving a
6good overview of Pythran capacity.
7
8It requires you are comfortable with Python, and eventually with C++11. It also
9assumes you have some compilation background, i.e. you know what an AST is and
10you don't try to escape when hearing the words alias analysis, memory effect
11computations and such.
12
13Parsing Python Code
14-------------------
15
16Python ships a standard module, ``ast`` to turn Python code into an AST. For instance::
17
18  >>> import gast as ast
19  >>> from __future__ import print_function
20  >>> code = "a=1"
21  >>> tree = ast.parse(code)  # turn the code into an AST
22  >>> print(ast.dump(tree))  # view it as a string
23  Module(body=[Assign(targets=[Name(id='a', ctx=Store(), annotation=None, type_comment=None)], value=Constant(value=1, kind=None), type_comment=None)], type_ignores=[])
24
25Deciphering the above line, one learns that the single assignment is parsed as
26a module containing a single statement, which is an assignment to a single
27target, a ``ast.Name`` with the identifier ``a``, of the literal value ``1``.
28
29Eventually, one needs to parse more complex codes, and things get a bit more cryptic, but you get the idea::
30
31  >>> fib_src = """
32  ... def fib(n):
33  ...     return n if n< 2 else fib(n-1) + fib(n-2)"""
34  >>> tree = ast.parse(fib_src)
35  >>> print(ast.dump(tree))
36  Module(body=[FunctionDef(name='fib', args=arguments(args=[Name(id='n', ctx=Param(), annotation=None, type_comment=None)], posonlyargs=[], vararg=None, kwonlyargs=[], kw_defaults=[], kwarg=None, defaults=[]), body=[Return(value=IfExp(test=Compare(left=Name(id='n', ctx=Load(), annotation=None, type_comment=None), ops=[Lt()], comparators=[Constant(value=2, kind=None)]), body=Name(id='n', ctx=Load(), annotation=None, type_comment=None), orelse=BinOp(left=Call(func=Name(id='fib', ctx=Load(), annotation=None, type_comment=None), args=[BinOp(left=Name(id='n', ctx=Load(), annotation=None, type_comment=None), op=Sub(), right=Constant(value=1, kind=None))], keywords=[]), op=Add(), right=Call(func=Name(id='fib', ctx=Load(), annotation=None, type_comment=None), args=[BinOp(left=Name(id='n', ctx=Load(), annotation=None, type_comment=None), op=Sub(), right=Constant(value=2, kind=None))], keywords=[]))))], decorator_list=[], returns=None, type_comment=None)], type_ignores=[])
37
38The idea remains the same. The whole Python syntax is described in
39http://docs.python.org/2/library/ast.html and is worth a glance, otherwise
40you'll be in serious trouble understanding the following.
41
42Pythran Pass Manager
43--------------------
44
45A pass is a code transformation, i.e. a function that turns an AST node into a
46new AST node with refined behavior. As a compiler infrastructure, Pythran
47proposes a pass manager that (guess what?) manages pass scheduling, that is
48the order in which pass is applied to achieve the ultimate goal, world
49domination. Oooops, efficient C++11 code generation.
50
51One first need to instantiate a pass manager with a module name::
52
53  >>> from pythran import passmanager
54  >>> pm = passmanager.PassManager("tutorial_module")
55
56The pass manager has 3 methods and two attributes::
57
58  >>> [x for x in dir(pm) if not x.startswith('_')]
59  ['apply', 'dump', 'gather', 'module_dir', 'module_name']
60
61``apply``
62    applies a code transformation
63
64``dump``
65    dumps a node using a dedicated backend
66
67``gather``
68    gathers information about the node
69
70Pythran Backends
71----------------
72
73Pythran currently has two backends. The main one is used to dump Pythran AST (a
74subset of Python AST) into a C++ AST::
75
76  >>> from pythran import backend
77  >>> cxx = pm.dump(backend.Cxx, tree)
78  >>> str(cxx)
79  '#include <pythonic/include/operator_/add.hpp>\n#include <pythonic/include/operator_/lt.hpp>\n#include <pythonic/include/operator_/sub.hpp>\n#include <pythonic/operator_/add.hpp>\n#include <pythonic/operator_/lt.hpp>\n#include <pythonic/operator_/sub.hpp>\nnamespace __pythran_tutorial_module\n{\n  struct fib\n  {\n    typedef void callable;\n    typedef void pure;\n    template <typename argument_type0 >\n    struct type\n    {\n      typedef typename std::remove_cv<typename std::remove_reference<argument_type0>::type>::type __type0;\n      typedef __type0 __type1;\n      typedef typename pythonic::returnable<__type1>::type __type2;\n      typedef __type2 result_type;\n    }  \n    ;\n    template <typename argument_type0 >\n    inline\n    typename type<argument_type0>::result_type operator()(argument_type0&& n) const\n    ;\n  }  ;\n  template <typename argument_type0 >\n  inline\n  typename fib::type<argument_type0>::result_type fib::operator()(argument_type0&& n) const\n  {\n    return (((bool)pythonic::operator_::lt(n, 2L)) ? typename __combined<decltype(n), decltype(pythonic::operator_::add(fib()(pythonic::operator_::sub(n, 1L)), fib()(pythonic::operator_::sub(n, 2L))))>::type(n) : typename __combined<decltype(n), decltype(pythonic::operator_::add(fib()(pythonic::operator_::sub(n, 1L)), fib()(pythonic::operator_::sub(n, 2L))))>::type(pythonic::operator_::add(fib()(pythonic::operator_::sub(n, 1L)), fib()(pythonic::operator_::sub(n, 2L)))));\n  }\n}'
80
81The above string is understandable by a C++11 compiler, but it quickly reaches the limit of our developer brain, so most of the time, we are more comfortable with the Python backend::
82
83  >>> py = pm.dump(backend.Python, tree)
84  >>> print(py)
85  def fib(n):
86      return (n if (n < 2) else (fib((n - 1)) + fib((n - 2))))
87
88Passes
89------
90
91There are many code transformations in Pythran. Some of them are used to lower
92the representation from Python AST to the simpler Pythran AST. For instance
93there is no tuple unpacking in Pythran, so Pythran provides an adequate
94transformation::
95
96  >>> from pythran import transformations
97  >>> tree = ast.parse("def foo(): a,b = 1,3.5")
98  >>> _ = pm.apply(transformations.NormalizeTuples, tree)  # in-place
99  >>> print(pm.dump(backend.Python, tree))
100  def foo():
101      __tuple0 = (1, 3.5)
102      a = __tuple0[0]
103      b = __tuple0[1]
104
105Pythran transforms the tuple unpacking into an intermediate tuple assignment.
106Note that if the unpacking statement is marked as critical using an OpenMP
107statement, then a temporary variable is used to hold the left hand side
108computation, if any::
109
110  >>> from pythran import transformations
111  >>> tree = ast.parse("""
112  ... def foo(x):
113  ...     #omp critical
114  ...     a,b = 1, x + 1
115  ...     return a + b""")
116  >>> _ = pm.apply(transformations.NormalizeTuples, tree)  # in-place
117  >>> print(pm.dump(backend.Python, tree))
118  def foo(x):
119      __tuple0 = (1, (x + 1))
120      a = __tuple0[0]
121      b = __tuple0[1]
122      return (a + b)
123
124There are many small passes used iteratively to produce the Pythran AST. For instance the implicit return at the end of every function is made explicit::
125
126  >>> tree = ast.parse('def foo():pass')
127  >>> _ = pm.apply(transformations.NormalizeReturn, tree)
128  >>> print(pm.dump(backend.Python, tree))
129  def foo():
130      pass
131      return builtins.None
132
133More complex ones rely on introspection to implement constant folding::
134
135  >>> from __future__ import print_function
136  >>> code = [fib_src, 'def foo(): return builtins.map(fib, [1,2,3])']
137  >>> fib_call = '\n'.join(code)
138  >>> tree = ast.parse(fib_call)
139  >>> from pythran import optimizations as optim
140  >>> _ = pm.apply(optim.ConstantFolding, tree)
141  >>> print(pm.dump(backend.Python, tree))
142  def fib(n):
143      return (n if (n < 2) else (fib((n - 1)) + fib((n - 2))))
144  def foo():
145      return [1, 1, 2]
146
147One can also detect some common generator expression patterns to call the itertool module::
148
149  >>> norm = 'def norm(l): return builtins.sum(n*n for n in l)'
150  >>> tree = ast.parse(norm)
151  >>> _ = pm.apply(optim.ComprehensionPatterns, tree)
152  >>> 'map' in pm.dump(backend.Python, tree)
153  True
154
155
156Analysis
157--------
158
159All Pythran passes are backed up by analysis. Pythran provides three levels of analysis::
160
161  >>> passmanager.FunctionAnalysis
162  <class 'pythran.passmanager.FunctionAnalysis'>
163  >>> passmanager.ModuleAnalysis
164  <class 'pythran.passmanager.ModuleAnalysis'>
165  >>> passmanager.NodeAnalysis
166  <class 'pythran.passmanager.NodeAnalysis'>
167
168Lets examine the information Pythran can extract from a Pythran-compatible
169Python code.
170
171A simple analyse gathers informations concerning used identifiers across the
172module. It can be used, for instance, to generate new unique identifiers::
173
174  >>> from pythran import analyses
175  >>> code = 'a = b = 1'
176  >>> tree = ast.parse(code)
177  >>> sorted(pm.gather(analyses.Identifiers, tree))
178  ['a', 'b']
179
180One can also computes the state of ``globals()``::
181
182  >>> code = 'import math\n'
183  >>> code += 'def foo(a): b = math.cos(a) ; return [b] * 3'
184  >>> tree = ast.parse(code)
185  >>> sorted(list(pm.gather(analyses.Globals, tree)))
186  ['__dispatch__', 'builtins', 'foo', 'math']
187
188One can also compute the state of ``locals()`` at any point of the program::
189
190  >>> l = pm.gather(analyses.Locals, tree)
191  >>> fdef = tree.body[-1]
192  >>> freturn = fdef.body[-1]
193  >>> sorted(l[freturn])
194  ['a', 'b', 'math']
195
196The ``ConstantFolding`` pass relies on the eponymous analyse that flags all
197constant expressions. In the previous code, there is only two constant
198*expressions* but only one can be evaluate::
199
200  >>> ce = pm.gather(analyses.ConstantExpressions, tree)
201  >>> sorted(map(ast.dump, ce))
202  ["Attribute(value=Name(id='math', ctx=Load(), annotation=None, type_comment=None), attr='cos', ctx=Load())", 'Constant(value=3, kind=None)']
203
204One of the most critical analyse of Pythran is the points-to analysis. There
205are two flavors of this analyse, one that computes an over-set of the aliased
206variable, and one that computes an under set. ``Aliases`` computes an over-set::
207
208  >>> code = 'def foo(c, d): b= c or d ; return b'
209  >>> tree = ast.parse(code)
210  >>> al = pm.gather(analyses.Aliases, tree)
211  >>> returned = tree.body[-1].body[-1].value
212  >>> print(ast.dump(returned))
213  Name(id='b', ctx=Load(), annotation=None, type_comment=None)
214  >>> sorted(a.id for a in al[returned])
215  ['c', 'd']
216
217Pythran also implements an inter-procedural analyse to compute which arguments
218are updated, for instance using an augmented assign, or the ``append`` method::
219
220  >>> code = 'def foo(l,a): l+=[a]\ndef bar(g): foo(g, 1)'
221  >>> tree = ast.parse(code)
222  >>> ae = pm.gather(analyses.ArgumentEffects, tree)
223  >>> foo, bar = tree.body[0], tree.body[1]
224  >>> ae[foo]
225  [True, False]
226  >>> ae[bar]
227  [True]
228
229From this analyse and the ``GlobalEffects`` analyse, one can compute the set of
230pure functions, i.e. functions that have no side effects::
231
232  >>> code = 'import random\ndef f():pass\ndef b(l): random.seed(0)'
233  >>> tree = ast.parse(code)
234  >>> pf = pm.gather(analyses.PureExpressions, tree)
235  >>> f = tree.body[1]
236  >>> b = tree.body[2]
237  >>> f in pf
238  True
239  >>> b in pf
240  False
241
242Pure functions are also interesting in the context of ``map``, as the
243application of a pure functions using a map results in a parallel ``map``::
244
245  >>> code = 'def foo(x): return x*x\n'
246  >>> code += 'builtins.map(foo, builtins.range(100))'
247  >>> tree = ast.parse(code)
248  >>> pmaps = pm.gather(analyses.ParallelMaps, tree)
249  >>> len(pmaps)
250  1
251