===============
Design overview
===============

pygccxml consists of the following packages:

* :mod:`declarations <pygccxml.declarations>`

  This package defines classes that describe C++ declarations and types.

* :mod:`parser <pygccxml.parser>`

  This package defines classes that parse `GCC-XML`_
  or `CastXML`_ generated files. It also defines a few classes that will help
  you avoid unnecessary parsing of C++ source files.

* :mod:`utils <pygccxml.utils>`

  This package defines a few functions useful for the whole project,
  but which are mainly used internally by pygccxml.

------------------------
``declarations`` package
------------------------

Please take a look at the `UML diagram`_. It describes almost all the
classes defined in the package and the relationships between them. The
``declarations`` package defines two class hierarchies:

1. types hierarchy - used to represent a C++ type

2. declarations hierarchy - used to represent a C++ declaration


Types hierarchy
---------------

The types hierarchy is used to represent an arbitrary C++ type. The ``type_t``
class is its base class.

``type_traits``
~~~~~~~~~~~~~~~

Are you aware of the `boost::type_traits`_ library? The `boost::type_traits`_
library contains a set of very specific traits classes, each of which
encapsulates a single trait from the C++ type system; for example, is a type
a pointer or a reference? Or does a type have a trivial constructor, or a
const-qualifier?

pygccxml implements a lot of functionality from the library:

* a lot of algorithms were implemented

  + ``is_same``

  + ``is_enum``

  + ``is_void``

  + ``is_const``

  + ``is_array``

  + ``is_pointer``

  + ``is_volatile``

  + ``is_integral``

  + ``is_reference``

  + ``is_arithmetic``

  + ``is_convertible``

  + ``is_fundamental``

  + ``is_floating_point``

  + ``is_base_and_derived``

  + ``is_unary_operator``

  + ``is_binary_operator``

  + ``remove_cv``

  + ``remove_const``

  + ``remove_alias``

  + ``remove_pointer``

  + ``remove_volatile``

  + ``remove_reference``

  + ``has_trivial_copy``

  + ``has_trivial_constructor``

  + ``has_any_non_copyconstructor``

  For a full list of implemented algorithms, please consult the API documentation.

* a lot of unit tests were written, based on the unit tests from the
  `boost::type_traits`_ library.


If you are going to build a code generator, you will find ``type_traits`` very handy.
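
To give a feeling for what such traits do, here is a minimal sketch. It does
not use pygccxml: the tiny ``type_t`` hierarchy and the functions below are
simplified stand-ins written for this example, and only their names are
borrowed from the list above. The real implementations handle many more cases.

.. code-block:: python

  from dataclasses import dataclass


  @dataclass(frozen=True)
  class type_t:
      """Base class of the toy types hierarchy (a stand-in, not pygccxml's class)."""


  class int_t(type_t):
      """A fundamental type."""


  @dataclass(frozen=True)
  class compound_t(type_t):
      """A type built on top of another type, for example T* or T const."""
      base: type_t


  class pointer_t(compound_t):
      """Models base*."""


  class const_t(compound_t):
      """Models base const."""


  def remove_const(t):
      """Strip a top-level const qualifier, if there is one."""
      return t.base if isinstance(t, const_t) else t


  def is_pointer(t):
      """True for T* and for T* const."""
      return isinstance(remove_const(t), pointer_t)


  def remove_pointer(t):
      """Return the pointee type; non-pointer types are returned unchanged."""
      inner = remove_const(t)
      return inner.base if isinstance(inner, pointer_t) else t


  def is_same(a, b):
      """Two types are the same if they have the same structure."""
      return a == b


  int_ptr_const = const_t(pointer_t(int_t()))  # models: int* const
  print(is_pointer(int_ptr_const))                        # True
  print(is_same(remove_pointer(int_ptr_const), int_t()))  # True
  print(is_pointer(int_t()))                              # False

Because every trait is a plain function over the types hierarchy, traits
compose freely, which is what makes them convenient in a code generator.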

Declarations hierarchy
----------------------

The declarations hierarchy is used to represent an arbitrary C++ declaration.
Most of the classes defined in this package are just a "set of properties".

``declaration_t`` is the base class of the declarations hierarchy. Every declaration
has a ``parent`` property. This property keeps a reference to the scope declaration
instance in which the declaration is defined.

The ``scopedef_t`` class derives from ``declaration_t``. This class is used to
say "I may have other declarations inside". The "composite" design pattern is
used here. The ``class_t`` and ``namespace_t`` declaration classes derive from the
``scopedef_t`` class.
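
The composite structure can be sketched in a few lines. The classes below are
simplified stand-ins that only reuse the class names and the ``parent``
property mentioned above; the ``adopt`` and ``full_name`` methods and the
``variable_t`` leaf are hypothetical helpers invented for this example.

.. code-block:: python

  class declaration_t:
      """A named declaration that knows the scope it lives in."""

      def __init__(self, name):
          self.name = name
          self.parent = None  # enclosing scope; None for the global namespace

      def full_name(self):
          """Build the qualified name by walking the parent chain."""
          parts = []
          decl = self
          while decl is not None:
              parts.append(decl.name)
              decl = decl.parent
          return "::".join(reversed(parts))


  class scopedef_t(declaration_t):
      """A declaration that may contain other declarations (the composite)."""

      def __init__(self, name):
          super().__init__(name)
          self.declarations = []

      def adopt(self, decl):
          """Store a child declaration and set its parent to this scope."""
          decl.parent = self
          self.declarations.append(decl)
          return decl


  class namespace_t(scopedef_t):
      pass


  class class_t(scopedef_t):
      pass


  class variable_t(declaration_t):
      """A leaf: it cannot contain other declarations."""


  global_ns = namespace_t("")  # the global namespace has an empty name
  ns1 = global_ns.adopt(namespace_t("ns1"))
  point = ns1.adopt(class_t("point"))
  x = point.adopt(variable_t("x"))

  print(x.full_name())  # ::ns1::point::x
  print(x.parent is point)  # True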

------------------
``parser`` package
------------------

Please take a look at the `parser package UML diagram`_. The classes defined in this
package implement parsing and linking functionality. The package defines a few
kinds of classes:

* classes that implement the parsing of `GCC-XML`_ generated XML files

* parser configuration classes

* cache - classes that help you eliminate unnecessary parsing

* patchers - classes that fix `GCC-XML`_ generated declarations. (Yes, sometimes
  GCC-XML generates a wrong description of a C++ declaration.)

Parser classes
--------------

``source_reader_t`` - the only class that has detailed knowledge about `GCC-XML`_.
It has only one responsibility: it calls `GCC-XML`_ with a source file specified
by the user and creates a declarations tree. The implementation of this class is split
into 2 classes:

1. ``scanner_t`` - this class scans the XML file generated by `GCC-XML`_ and
   creates instances of the pygccxml declarations and types classes. After the XML file has
   been processed, the declarations and types class instances refer to
   each other using `GCC-XML`_ generated ids.

2. ``linker_t`` - this class contains the logic for replacing `GCC-XML`_ generated
   ids with references to declarations or types class instances.

Both classes are implementation details and should not be used directly.
Performance note: the ``scanner_t`` class uses the Python ``xml.sax`` package
to parse XML. As a result, the ``scanner_t`` class is able to parse even big XML files
quickly.
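
The scan-then-link idea can be sketched with the standard library alone. This
is not the code of ``scanner_t`` or ``linker_t``: the ``Node`` and ``Scanner``
classes and the ``link`` function are hypothetical, and the XML fragment is
hand-written to imitate the general shape of the generated file (elements that
carry an ``id`` and refer to each other through a ``type`` attribute).

.. code-block:: python

  import xml.sax

  XML = b"""<GCC_XML>
    <Variable id="_1" name="counter" type="_3"/>
    <FundamentalType id="_2" name="int"/>
    <PointerType id="_3" type="_2"/>
  </GCC_XML>"""


  class Node:
      def __init__(self, kind, name, type_ref):
          self.kind = kind
          self.name = name
          # An id string after scanning; a Node (or None) after linking.
          self.type = type_ref


  class Scanner(xml.sax.ContentHandler):
      """Pass 1: create one object per element, keyed by its id."""

      def __init__(self):
          super().__init__()
          self.nodes = {}

      def startElement(self, name, attrs):
          if "id" in attrs:
              self.nodes[attrs["id"]] = Node(
                  name, attrs.get("name"), attrs.get("type"))


  def link(nodes):
      """Pass 2: replace every id string with the object it refers to."""
      for node in nodes.values():
          if node.type is not None:
              node.type = nodes[node.type]


  scanner = Scanner()
  xml.sax.parseString(XML, scanner)
  link(scanner.nodes)

  counter = scanner.nodes["_1"]
  print(counter.type.kind, counter.type.type.name)  # PointerType int

Two passes are needed because an element may refer to an id that appears later
in the file, as ``counter`` does here.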

``project_reader_t`` - think about this class as a linker. In most cases you work
with several source files. GCC-XML does not support this mode of work, so pygccxml
implements all the functionality needed to parse several source files at once.
``project_reader_t`` implements 2 different algorithms that solve the problem:

1. ``project_reader_t`` creates a temporary source file, which includes all the source
   files.

2. ``project_reader_t`` parses every source file separately, using the ``source_reader_t``
   class, and then joins the resulting declarations trees into a single declarations
   tree.

The two approaches have different trade-offs. The first approach does not allow you
to reuse information from already parsed source files, while the second one
allows you to set up a cache.
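
A rough sketch of the two algorithms follows. It is an illustration only: both
function names are hypothetical, the exact text of the temporary file is an
assumption, and a declarations tree is modelled as a nested ``dict`` in which a
namespace maps to another ``dict`` and any other declaration maps to a string.

.. code-block:: python

  def all_at_once_source(headers):
      """Algorithm 1: text of a temporary source file that includes every header."""
      return "".join('#include "%s"\n' % header for header in headers)


  def join_trees(trees):
      """Algorithm 2: merge per-file trees into a single tree.

      Namespaces with the same name are unified; a declaration that shows up
      in several files is kept once.
      """
      joined = {}
      for tree in trees:
          for name, members in tree.items():
              if isinstance(members, dict):  # a namespace: merge recursively
                  joined[name] = join_trees([joined.get(name, {}), members])
              else:  # a leaf declaration: keep the first one seen
                  joined.setdefault(name, members)
      return joined


  print(all_at_once_source(["a.hpp", "b.hpp"]), end="")

  # Both files include the header that declares ns::point.
  tree_a = {"ns": {"f": "function", "point": "class"}}
  tree_b = {"ns": {"g": "function", "point": "class"}, "h": "function"}

  joined = join_trees([tree_a, tree_b])
  print(sorted(joined["ns"]))  # ['f', 'g', 'point']

With the second algorithm each per-file tree can be cached and reused, at the
price of the extra join step.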

Parser configuration classes
----------------------------

``gccxml_configuration_t`` - a class that accumulates all the settings needed to invoke `GCC-XML`_.


``file_configuration_t`` - a class that contains some data and a description of how
to treat the data. ``file_configuration_t`` can contain a reference to the following types
of data:

(1) path to a C++ source file

(2) path to a `GCC-XML`_ generated XML file

(3) path to a C++ source file and path to a `GCC-XML`_ generated XML file

    In this case, if the XML file does not exist, it will be created. The next time
    you ask to parse the source file, the XML file will be used instead.

    Small tip: you can set up your makefile to delete the XML file every time
    the relevant source file changes.

(4) Python string that contains valid C++ code

There are a few functions that will help you construct a ``file_configuration_t``
object:

* ``def create_source_fc( header )``

  ``header`` contains the path to a C++ source file

* ``def create_gccxml_fc( xml_file )``

  ``xml_file`` contains the path to a `GCC-XML`_ generated XML file

* ``def create_cached_source_fc( header, cached_source_file )``

  - ``header`` contains the path to a C++ source file
  - ``cached_source_file`` contains the path to a `GCC-XML`_ generated XML file

* ``def create_text_fc( text )``

  ``text`` - Python string that contains valid C++ code
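
The behaviour described for case (3) amounts to "generate once, then reuse".
The sketch below illustrates that rule without pygccxml: ``xml_for`` and
``fake_generator`` are hypothetical names, and the fake generator only writes a
placeholder file where the real tool would write the XML description.

.. code-block:: python

  import os
  import tempfile


  def xml_for(header, cached_source_file, run_generator):
      """Return the XML file to parse, generating it only if it is missing."""
      if not os.path.exists(cached_source_file):
          run_generator(header, cached_source_file)
      return cached_source_file


  calls = []  # records every (simulated) generator run


  def fake_generator(header, xml_file):
      calls.append(header)
      with open(xml_file, "w") as f:
          f.write("<GCC_XML/>")


  with tempfile.TemporaryDirectory() as tmp:
      xml_file = os.path.join(tmp, "point.xml")
      xml_for("point.hpp", xml_file, fake_generator)  # XML missing: generated
      xml_for("point.hpp", xml_file, fake_generator)  # XML present: reused

  print(len(calls))  # 1

Note that the rule never looks at the source file again, which is why deleting
the XML file when the source changes is left to the build system.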


Cache classes
-------------

There are a few cache classes, which implement different cache strategies.

1. ``file_configuration_t`` class, which keeps the path to a C++ source file and the path to
   the `GCC-XML`_ generated XML file.

2. ``file_cache_t`` class, which saves all declarations from all files within a single
   binary file.

3. ``directory_cache_t`` class, which stores one index file called "index.dat" that
   is always read by the cache when the cache object is created. Each header file
   has a corresponding \*.cache file that stores the declarations found
   in the header file. The index file is used to determine whether a \*.cache file
   is still valid (by checking whether any of the dependent files,
   i.e. the header file itself and all included files, has been modified since
   the last run).

In some cases, the ``directory_cache_t`` class gives much better performance than
``file_cache_t``. Many thanks to Matthias Baas for its implementation.
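
The validity check behind an index file can be sketched as follows. This is
not the code of ``directory_cache_t``: the function names are hypothetical, the
index is a plain ``dict`` instead of a file, and using modification time plus
size as the change detector is an assumption made for this example.

.. code-block:: python

  import os
  import tempfile


  def signature(path):
      """Cheap change detector for one file: modification time and size."""
      st = os.stat(path)
      return (st.st_mtime_ns, st.st_size)


  def record(index, header, dependencies):
      """Remember the current signature of every file the header depends on."""
      index[header] = {path: signature(path) for path in dependencies}


  def is_valid(index, header):
      """A cache entry is valid only if no dependent file has changed."""
      entry = index.get(header)
      if entry is None:
          return False
      return all(os.path.exists(path) and signature(path) == sig
                 for path, sig in entry.items())


  with tempfile.TemporaryDirectory() as tmp:
      header = os.path.join(tmp, "point.hpp")
      included = os.path.join(tmp, "base.hpp")
      for path in (header, included):
          with open(path, "w") as f:
              f.write("// C++ code\n")

      index = {}
      record(index, header, [header, included])
      fresh = is_valid(index, header)

      with open(included, "a") as f:  # modify an included file
          f.write("struct base {};\n")
      stale = not is_valid(index, header)

  print(fresh, stale)  # True True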

**Warning**: when pygccxml writes information to files using the cache classes,
it does not write any version information. This means that when you upgrade
pygccxml, you have to delete all your cache files. Otherwise you will get very
strange errors, for example a missing attribute.


Patchers
--------

`GCC-XML`_ has a few bugs that cannot be fixed within GCC-XML itself. For example:

.. code-block:: c++

  namespace ns1{ namespace ns2{
      enum fruit{ apple, orange };
  } }

.. code-block:: c++

  void fix_enum( ns1::ns2::fruit arg=ns1::ns2::apple );

`GCC-XML`_ will report the default value of ``arg`` as ``apple``. Obviously
this is an error. pygccxml knows how to fix this bug.

This is not the only bug that can be fixed; there are a few of them. pygccxml
introduces a few classes, each of which knows how to deal with a specific bug. Moreover, those
bugs are fixed only if I am 101% sure that this is the right thing to do.
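
For the enum example, the fix boils down to qualifying the reported value with
the scope of the enumeration. The function below is a hypothetical, heavily
simplified sketch of that idea and not the code of the actual patcher classes.

.. code-block:: python

  def fix_enum_default(default_value, enum_scope, enum_values):
      """Qualify an unqualified enumerator that is used as a default argument."""
      if default_value in enum_values:
          return enum_scope + "::" + default_value
      return default_value  # already qualified, or not an enumerator at all


  fruit_values = ["apple", "orange"]
  print(fix_enum_default("apple", "ns1::ns2", fruit_values))
  # ns1::ns2::apple
  print(fix_enum_default("ns1::ns2::apple", "ns1::ns2", fruit_values))
  # ns1::ns2::apple

The value is rewritten only when it matches a known enumerator exactly, in the
spirit of fixing a declaration only when the fix is certain to be right.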

-----------------
``utils`` package
-----------------

This package is used internally by pygccxml.
Some functions and classes may still be useful: loggers, ``find_xml_generator``.

-------
Summary
-------

That's all. I hope I was clear; at least I tried. Anyway, pygccxml is an open
source project. You can always take a look at the source code. If you need more
information, please read the API documentation.


.. _`SourceForge`: http://sourceforge.net/index.php
.. _`Python`: http://www.python.org
.. _`GCC-XML`: http://www.gccxml.org
.. _`CastXML`: https://github.com/CastXML/CastXML
.. _`UML diagram` : declarations_uml.png
.. _`parser package UML diagram` : parser_uml.png
.. _`ReleaseForge` : http://releaseforge.sourceforge.net
.. _`boost::type_traits` : http://www.boost.org/libs/type_traits/index.html