For parsing implicit dependencies, see smatch_scripts/implicit_dependencies.
=======
  sparse (spärs), adj., spars-er, spars-est.
	1. thinly scattered or distributed: "a sparse population"
	2. thin; not thick or dense: "sparse hair"
	3. scanty; meager.
	4. semantic parse
  	[ from Latin: spars(us) scattered, past participle of
	  spargere 'to sparge' ]

	Antonym: abundant

Sparse is a semantic parser of source files: it is neither a compiler
(although it could be used as a front-end for one) nor a preprocessor
(although it contains a preprocessing phase).

It is meant to be a small - and simple - library.  Scanty and meager,
and partly because of that, easy to use.  It has one mission in life:
create a semantic parse tree for some arbitrary user for further
analysis.  It's not a tokenizer, nor is it some generic context-free
parser.  In fact, context (semantics) is what it's all about - figuring
out not just what the groupings of tokens are, but what the _types_ are
that each grouping implies.

And no, it doesn't use lex and yacc (or flex and bison).  In my personal
opinion, using lex/yacc tends to end with you just having to fight the
assumptions the tools make.

The parsing is done in five phases:

 - full-file tokenization
 - preprocessing (which can cause another tokenization phase of another
   file)
 - semantic parsing
 - lazy type evaluation
 - inline function expansion and tree simplification

Note the "full file" part.  Partly for efficiency, but mostly for ease of
use, there are no "partial results".  The library completely parses one
whole source file, and builds up the _complete_ parse tree in memory.

Also note the "lazy" in the type evaluation.  The semantic parsing
itself will know which symbols are typedefs (required for parsing C
correctly), but it will not have calculated what the details of the
different types are.  That will be done only on demand, as the back-end
requires the information.

This means that a user of the library will literally just need to do

  struct string_list *filelist = NULL;
  char *file;

  /* Parse the options; the input file names are collected in filelist. */
  action(sparse_initialize(argc, argv, &filelist));

  /* Parse each input file and hand its symbol list to action(). */
  FOR_EACH_PTR(filelist, file) {
    action(sparse(file));
  } END_FOR_EACH_PTR(file);

and is then done, having a full C parse of every file that was opened.  The
library doesn't need any more setup, and once done does not impose any
more requirements.  The user is free to do whatever they want with the
parse tree that got built up, and need not worry about the library ever
again.  There is no extra state, there are no parser callbacks, there is
only the parse tree that is described by the header files.  The action
function takes a pointer to a symbol_list and does whatever it likes with it.

The library also contains (as example users) a few clients that do the
preprocessing, parsing and type evaluation and just print out the
results.  These clients were written to verify and debug the library, and
also as trivial examples of what you can do with the parse tree once it
is formed, so that users can see how the tree is organized.