===============
pycparser v2.21
===============


.. image:: https://github.com/eliben/pycparser/workflows/pycparser-tests/badge.svg
  :align: center
  :target: https://github.com/eliben/pycparser/actions

----

.. contents::
    :backlinks: none

.. sectnum::


Introduction
============

What is pycparser?
------------------

**pycparser** is a parser for the C language, written in pure Python. It is a
module designed to be easily integrated into applications that need to parse
C source code.

What is it good for?
--------------------

Anything that needs C code to be parsed. The following are some uses for
**pycparser**, taken from real user reports:

* C code obfuscator
* Front-end for various specialized C compilers
* Static code checker
* Automatic unit-test discovery
* Adding specialized extensions to the C language

One of the most popular uses of **pycparser** is in the `cffi
<https://cffi.readthedocs.io/en/latest/>`_ library, which uses it to parse the
declarations of C functions and types in order to auto-generate FFIs.

**pycparser** is unique in the sense that it's written in pure Python, a very
high-level language that's easy to experiment with and tweak. To people familiar
with Lex and Yacc, **pycparser**'s code will be simple to understand. It also
has no external dependencies (except for a Python interpreter), making it very
simple to install and deploy.

Which version of C does pycparser support?
------------------------------------------

**pycparser** aims to support the full C99 language (according to the standard
ISO/IEC 9899). Some features from C11 are also supported, and patches to support
more are welcome.

**pycparser** supports very few GCC extensions, but it's fairly easy to set
things up so that it parses code with a lot of GCC-isms successfully. See the
`FAQ <https://github.com/eliben/pycparser/wiki/FAQ>`_ for more details.

What grammar does pycparser follow?
-----------------------------------

**pycparser** very closely follows the C grammar provided in Annex A of the C99
standard (ISO/IEC 9899).

How is pycparser licensed?
--------------------------

`BSD license <https://github.com/eliben/pycparser/blob/master/LICENSE>`_.

Contact details
---------------

For reporting problems with **pycparser** or submitting feature requests, please
open an `issue <https://github.com/eliben/pycparser/issues>`_, or submit a
pull request.


Installing
==========

Prerequisites
-------------

* **pycparser** was tested on Python 2.7, 3.4-3.6, on both Linux and
  Windows. It should work on any later version (in both the 2.x and 3.x lines)
  as well.

* **pycparser** has no external dependencies. The only non-stdlib library it
  uses is PLY, which is bundled in ``pycparser/ply``. The current PLY version is
  3.10, retrieved from `<http://www.dabeaz.com/ply/>`_.

Note that **pycparser** (and PLY) uses docstrings for grammar specifications.
Python installations that strip docstrings (such as when using the Python
``-OO`` option) will fail to instantiate and use **pycparser**. You can try to
work around this problem by making sure the PLY parsing tables are pre-generated
in normal mode; this isn't an officially supported/tested mode of operation,
though.
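
Since stripped docstrings only surface as a failure once the parser is
instantiated, it can help to check the interpreter mode up front. The following
is a minimal sketch and not part of **pycparser**; the helper name
``docstrings_stripped`` is hypothetical. It relies on ``sys.flags.optimize``,
which is 2 when Python runs with ``-OO``.

.. code-block:: python

    import sys


    def docstrings_stripped():
        # sys.flags.optimize is 0 normally, 1 with -O and 2 with -OO;
        # only -OO discards docstrings.
        return sys.flags.optimize >= 2


    if docstrings_stripped():
        raise RuntimeError(
            "pycparser needs docstrings; do not run Python with -OO")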

Installation process
--------------------

Installing **pycparser** is very simple. Once you download and unzip the
package, you just have to execute the standard ``python setup.py install``. The
setup script will then place the ``pycparser`` module into ``site-packages`` in
your Python's installation library.

Alternatively, since **pycparser** is listed in the `Python Package Index
<https://pypi.org/project/pycparser/>`_ (PyPI), you can install it using your
favorite Python packaging/distribution tool, for example with::

    > pip install pycparser
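
A quick way to confirm that the installation worked is to import the package
and print its version. This is only a sanity check, assuming the package
exposes ``__version__`` at the top level, as this release does.

.. code-block:: python

    import pycparser

    # Prints the version of whichever pycparser copy Python found first.
    print(pycparser.__version__)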

Known problems
--------------

* Some users who've installed a new version of **pycparser** over an existing
  version ran into a problem using the newly installed library. This has to do
  with parse tables staying around as ``.pyc`` files from the older version. If
  you see unexplained errors from **pycparser** after an upgrade, remove it (by
  deleting the ``pycparser`` directory in your Python's ``site-packages``, or
  wherever you installed it) and install again.


Using
=====

Interaction with the C preprocessor
-----------------------------------

In order to be compilable, C code must be preprocessed by the C preprocessor,
``cpp``. ``cpp`` handles preprocessing directives like ``#include`` and
``#define``, removes comments, and performs other minor tasks that prepare the C
code for compilation.

For all but the most trivial snippets of C code, **pycparser**, like a C
compiler, must receive preprocessed C code in order to function correctly. If
you import the top-level ``parse_file`` function from the **pycparser** package,
it will interact with ``cpp`` for you, as long as it's in your PATH, or you
provide a path to it.

Note also that you can use ``gcc -E`` or ``clang -E`` instead of ``cpp``. See
the ``using_gcc_E_libc.py`` example for more details. Windows users can download
and install a binary build of Clang for Windows `from this website
<http://llvm.org/releases/download.html>`_.
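
As a rough illustration of the two options, the sketch below wraps
``parse_file`` so that either ``cpp`` or a compiler driver can be used. The
helper names ``preprocessor_args`` and ``parse_preprocessed`` are hypothetical,
and the rule "anything not called ``cpp`` needs ``-E``" is a simplifying
assumption of this example, not something **pycparser** decides for you.

.. code-block:: python

    import os

    from pycparser import parse_file


    def preprocessor_args(cpp_path):
        # Assumption: an executable named "cpp" already preprocesses only,
        # while anything else is a compiler driver (gcc, clang) needing -E.
        name = os.path.splitext(os.path.basename(cpp_path))[0].lower()
        return [] if name == "cpp" else ["-E"]


    def parse_preprocessed(filename, cpp_path="cpp"):
        # parse_file runs the preprocessor, then parses its output.
        return parse_file(filename, use_cpp=True, cpp_path=cpp_path,
                          cpp_args=preprocessor_args(cpp_path))

For instance, ``parse_preprocessed("example.c", cpp_path="gcc").show()`` would
print the AST of a hypothetical ``example.c``, provided ``gcc`` is in your PATH.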

What about the standard C library headers?
------------------------------------------

C code almost always ``#include``\s various header files from the standard C
library, like ``stdio.h``. While (with some effort) **pycparser** can be made to
parse the standard headers from any C compiler, it's much simpler to use the
provided "fake" standard includes for C11 in ``utils/fake_libc_include``. These
are standard C header files that contain only the bare necessities to allow
valid parsing of the files that use them. As a bonus, since they're minimal,
they can significantly improve the performance of parsing large C files.

The key point to understand here is that **pycparser** doesn't really care about
the semantics of types. It only needs to know whether some token encountered in
the source is a previously defined type. This is essential in order to be able
to parse C correctly.
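
The classic case is the statement ``T * x;``, which is a pointer declaration
when ``T`` names a type and a multiplication otherwise. The sketch below parses
both variants from strings, without a preprocessor; the helper name
``first_statement_kind`` is made up for this example.

.. code-block:: python

    from pycparser import c_parser


    def first_statement_kind(source):
        # Return the AST node type of the first item in the body of the
        # last function defined in source.
        ast = c_parser.CParser().parse(source, filename="<string>")
        func = ast.ext[-1]
        return type(func.body.block_items[0]).__name__


    # T is a typedef name, so "T * x;" declares a pointer.
    WITH_TYPEDEF = "typedef int T; void f(void) { T * x; }"
    # T is an ordinary parameter, so "T * x;" is an expression.
    WITHOUT_TYPEDEF = "void f(int T, int x) { T * x; }"

    print(first_statement_kind(WITH_TYPEDEF))     # Decl
    print(first_statement_kind(WITHOUT_TYPEDEF))  # BinaryOp

The fake headers exist to supply exactly this kind of information: the names
that must be treated as types.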

See `this blog post
<https://eli.thegreenplace.net/2015/on-parsing-c-type-declarations-and-fake-headers>`_
for more details.

Note that the fake headers are neither included in the ``pip`` package nor
installed via ``setup.py`` (`#224 <https://github.com/eliben/pycparser/issues/224>`_).
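
Because of that, code that relies on the fake headers has to be told where a
copy of them lives. A minimal sketch, assuming you have a source checkout of
**pycparser** and pass its root directory yourself; ``fake_libc_cpp_args`` and
``parse_with_fake_libc`` are hypothetical helper names.

.. code-block:: python

    import os

    from pycparser import parse_file


    def fake_libc_cpp_args(source_root):
        # Build the -I argument pointing the preprocessor at the fake headers.
        include_dir = os.path.join(source_root, "utils", "fake_libc_include")
        if not os.path.isdir(include_dir):
            raise FileNotFoundError(
                "fake headers not found in %s; they are not installed by "
                "pip or setup.py" % include_dir)
        return ["-I" + include_dir]


    def parse_with_fake_libc(filename, source_root):
        # Preprocess with cpp (must be in PATH) using the fake headers.
        return parse_file(filename, use_cpp=True,
                          cpp_args=fake_libc_cpp_args(source_root))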

Basic usage
-----------

Take a look at the |examples|_ directory of the distribution for a few examples
of using **pycparser**. These should be enough to get you started. Please note
that most realistic C code samples would require running the C preprocessor
before passing the code to **pycparser**; see the previous sections for more
details.
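
For a first experiment, a snippet that needs no preprocessing can be parsed
straight from a string. This is a small sketch along the lines of the bundled
examples, not a replacement for them; the C source is made up.

.. code-block:: python

    from pycparser import c_parser

    SOURCE = """
    int add(int a, int b)
    {
        return a + b;
    }
    """

    parser = c_parser.CParser()
    ast = parser.parse(SOURCE, filename="<string>")

    # Each top-level declaration or definition is an entry in ast.ext.
    func = ast.ext[0]
    print(type(func).__name__)  # FuncDef
    print(func.decl.name)       # add

    # Dump the whole tree in an indented, readable form.
    ast.show()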

.. |examples| replace:: ``examples``
.. _examples: examples


Advanced usage
--------------

The public interface of **pycparser** is well documented with comments in
``pycparser/c_parser.py``. For a detailed overview of the various AST nodes
created by the parser, see ``pycparser/_c_ast.cfg``.
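
The node names listed in ``_c_ast.cfg`` are also what you hook into when walking
a tree. The sketch below uses ``c_ast.NodeVisitor``, which dispatches to a
``visit_<NodeName>`` method when one is defined; the class name
``FuncDefCollector`` and the C source are made up for illustration.

.. code-block:: python

    from pycparser import c_ast, c_parser


    class FuncDefCollector(c_ast.NodeVisitor):
        # Collects the names of all function definitions in an AST.

        def __init__(self):
            self.names = []

        def visit_FuncDef(self, node):
            # A FuncDef holds its declaration in .decl and its body in .body.
            self.names.append(node.decl.name)


    SOURCE = """
    int one(void) { return 1; }
    int two(void) { return one() + 1; }
    """

    ast = c_parser.CParser().parse(SOURCE, filename="<string>")
    collector = FuncDefCollector()
    collector.visit(ast)
    print(collector.names)  # ['one', 'two']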

There's also a `FAQ available here <https://github.com/eliben/pycparser/wiki/FAQ>`_.
In any case, you can always drop me an `email <eliben@gmail.com>`_ for help.


Modifying
=========

There are a few points to keep in mind when modifying **pycparser**:

* The code for **pycparser**'s AST nodes is automatically generated from a
  configuration file - ``_c_ast.cfg``, by ``_ast_gen.py``. If you modify the AST
  configuration, make sure to re-generate the code. This can be done by running
  the ``_build_tables.py`` script from the ``pycparser`` directory.
* Make sure you understand the optimized mode of **pycparser** - for that you
  must read the docstring in the constructor of the ``CParser`` class. For
  development you should create the parser without optimizations, so that it
  will regenerate the Yacc and Lex tables when you change the grammar.
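
Creating a parser without optimizations could look like the sketch below. It
assumes the ``lex_optimize`` and ``yacc_optimize`` keyword arguments of the
``CParser`` constructor in this release; check the constructor docstring of the
version you are modifying before relying on them.

.. code-block:: python

    from pycparser import c_parser

    # Development-time parser: both optimizations are switched off, so
    # grammar changes are picked up instead of stale cached tables.
    dev_parser = c_parser.CParser(lex_optimize=False, yacc_optimize=False)

    ast = dev_parser.parse("int x;", filename="<string>")
    print(type(ast.ext[0]).__name__)  # Decl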


Package contents
================

Once you unzip the ``pycparser`` package, you'll see the following files and
directories:

README.rst:
  This README file.

LICENSE:
  The pycparser license.

setup.py:
  Installation script.

examples/:
  A directory with some examples of using **pycparser**.

pycparser/:
  The **pycparser** module source code.

tests/:
  Unit tests.

utils/fake_libc_include:
  Minimal standard C library include files that should allow parsing any C code.
  Note that these headers now include C11 code, so they may not work when the
  preprocessor is configured to an earlier C standard (like ``-std=c99``).

utils/internal/:
  Internal utilities for my own use. You probably don't need them.


Contributors
============

Some people have contributed to **pycparser** by opening issues on bugs they've
found and/or submitting patches. The list of contributors is in the CONTRIBUTORS
file in the source distribution. After **pycparser** moved to GitHub I stopped
updating this list because GitHub does a much better job at tracking
contributions.
253