1<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
2          "http://www.w3.org/TR/html4/strict.dtd">
3<html>
4<head>
5  <title>Checker Developer Manual</title>
6  <link type="text/css" rel="stylesheet" href="menu.css">
7  <link type="text/css" rel="stylesheet" href="content.css">
8  <script type="text/javascript" src="scripts/menu.js"></script>
9</head>
10<body>
11
12<div id="page">
13<!--#include virtual="menu.html.incl"-->
14
15<div id="content">
16
17<h3 style="color:red">This Page Is Under Construction</h3>
18
19<h1>Checker Developer Manual</h1>
20
<p>The static analyzer engine performs path-sensitive exploration of the program and
relies on a set of checkers to implement the logic for detecting and
constructing specific bug reports. Anyone interested in implementing their own
checker should watch the "Building a Checker in 24 Hours" talk
(<a href="http://llvm.org/devmtg/2012-11/Zaks-Rose-Checker24Hours.pdf">slides</a>,
 <a href="http://llvm.org/devmtg/2012-11/videos/Zaks-Rose-Checker24Hours.mp4">video</a>)
and refer to this page for additional information on writing a checker. The static analyzer is
part of the Clang project, so consult <a href="http://clang.llvm.org/hacking.html">Hacking on Clang</a>
and the <a href="http://llvm.org/docs/ProgrammersManual.html">LLVM Programmer's Manual</a>
for developer guidelines, and send your questions and proposals to the
<a href="http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev">cfe-dev mailing list</a>.
32</p>
33
34    <ul>
35      <li><a href="#start">Getting Started</a></li>
36      <li><a href="#analyzer">Static Analyzer Overview</a>
37      <ul>
38        <li><a href="#interaction">Interaction with Checkers</a></li>
39        <li><a href="#values">Representing Values</a></li>
40      </ul></li>
41      <li><a href="#idea">Idea for a Checker</a></li>
42      <li><a href="#registration">Checker Registration</a></li>
43      <li><a href="#events_callbacks">Events, Callbacks, and Checker Class Structure</a></li>
44      <li><a href="#extendingstates">Custom Program States</a></li>
45      <li><a href="#bugs">Bug Reports</a></li>
46      <li><a href="#ast">AST Visitors</a></li>
47      <li><a href="#testing">Testing</a></li>
48      <li><a href="#commands">Useful Commands/Debugging Hints</a></li>
49      <li><a href="#additioninformation">Additional Sources of Information</a></li>
50    </ul>
51
52<h2 id=start>Getting Started</h2>
53  <ul>
54    <li>To check out the source code and build the project, follow steps 1-4 of
55    the <a href="http://clang.llvm.org/get_started.html">Clang Getting Started</a>
56  page.</li>
57
58    <li>The analyzer source code is located under the Clang source tree:
59    <br><tt>
60    $ <b>cd llvm/tools/clang</b>
61    </tt>
62    <br>See: <tt>include/clang/StaticAnalyzer</tt>, <tt>lib/StaticAnalyzer</tt>,
63     <tt>test/Analysis</tt>.</li>
64
    <li>The analyzer regression tests can be executed from Clang's build
    directory:
67    <br><tt>
68    $ <b>cd ../../../; cd build/tools/clang; TESTDIRS=Analysis make test</b>
69    </tt></li>
70
71    <li>Analyze a file with the specified checker:
72    <br><tt>
73    $ <b>clang -cc1 -analyze -analyzer-checker=core.DivideZero test.c</b>
74    </tt></li>
75
76    <li>List the available checkers:
77    <br><tt>
78    $ <b>clang -cc1 -analyzer-checker-help</b>
79    </tt></li>
80
81    <li>See the analyzer help for different output formats, fine tuning, and
82    debug options:
83    <br><tt>
84    $ <b>clang -cc1 -help | grep "analyzer"</b>
85    </tt></li>
86
87  </ul>
88
89<h2 id=analyzer>Static Analyzer Overview</h2>
90  The analyzer core performs symbolic execution of the given program. All the
91  input values are represented with symbolic values; further, the engine deduces
92  the values of all the expressions in the program based on the input symbols
  and the path. The execution is path-sensitive, and every possible path through
  the program is explored. The explored execution traces are represented with an
  <a href="http://clang.llvm.org/doxygen/classclang_1_1ento_1_1ExplodedGraph.html">ExplodedGraph</a> object.
  Each node of the graph is an
97  <a href="http://clang.llvm.org/doxygen/classclang_1_1ento_1_1ExplodedNode.html">ExplodedNode</a>,
98  which consists of a <tt>ProgramPoint</tt> and a <tt>ProgramState</tt>.
99  <p>
100  <a href="http://clang.llvm.org/doxygen/classclang_1_1ProgramPoint.html">ProgramPoint</a>
  represents the corresponding location in the program (or in the CFG).
  <tt>ProgramPoint</tt> is also used to record additional information on
  when and how the state was added. For example, the <tt>PostPurgeDeadSymbolsKind</tt>
  kind means that the state is the result of purging dead symbols, which is the
  analyzer's equivalent of garbage collection.
106  <p>
  <a href="http://clang.llvm.org/doxygen/classclang_1_1ento_1_1ProgramState.html">ProgramState</a>
  represents the abstract state of the program. It consists of:
  <ul>
    <li><tt>Environment</tt> - a mapping from source code expressions to symbolic
    values
    <li><tt>Store</tt> - a mapping from memory locations to symbolic values
    <li><tt>GenericDataMap</tt> - constraints on symbolic values, together with any
    checker-defined state
  </ul>
115
116  <h3 id=interaction>Interaction with Checkers</h3>
117  Checkers are not merely passive receivers of the analyzer core changes - they
118  actively participate in the <tt>ProgramState</tt> construction through the
119  <tt>GenericDataMap</tt> which can be used to store the checker-defined part
120  of the state. Each time the analyzer engine explores a new statement, it
121  notifies each checker registered to listen for that statement, giving it an
122  opportunity to either report a bug or modify the state. (As a rule of thumb,
  the checker itself should be stateless.) The checkers are called one after another
  in a predefined order; thus, calling all the checkers adds a chain of nodes to the
  <tt>ExplodedGraph</tt>.
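  <p>
  The chaining described above can be illustrated with a small standalone C++17
  sketch. It is a toy model only, not the analyzer's API: <tt>FactList</tt>,
  <tt>ToyCallback</tt>, and <tt>runCheckers</tt> are hypothetical names invented for
  this illustration. Each "checker" is a stateless function that receives the
  incoming state and returns the state after its callback, and every call appends
  one more entry to the chain.

```cpp
#include <functional>
#include <string>
#include <vector>

// Toy stand-in for the checker-defined part of the program state.
using FactList = std::vector<std::string>;

// A stateless "checker": maps the incoming state to the outgoing state.
using ToyCallback = std::function<FactList(const FactList &)>;

// Calls the checkers one after another, in the order given, and returns the
// chain of states: the initial state followed by one state per checker.
std::vector<FactList> runCheckers(const FactList &Initial,
                                  const std::vector<ToyCallback> &Checkers) {
  std::vector<FactList> Chain;
  Chain.push_back(Initial);
  for (const ToyCallback &Check : Checkers)
    Chain.push_back(Check(Chain.back()));
  return Chain;
}
```

  <p>
  Because the callbacks keep no data of their own, everything they learn has to
  travel in the returned state, which mirrors the rule of thumb that a checker
  should be stateless.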
126
127  <h3 id=values>Representing Values</h3>
128  During symbolic execution, <a href="http://clang.llvm.org/doxygen/classclang_1_1ento_1_1SVal.html">SVal</a>
129  objects are used to represent the semantic evaluation of expressions.
130  They can represent things like concrete
131  integers, symbolic values, or memory locations (which are memory regions).
132  They are a discriminated union of "values", symbolic and otherwise.
133  If a value isn't symbolic, usually that means there is no symbolic
134  information to track. For example, if the value was an integer, such as
135  <tt>42</tt>, it would be a <a href="http://clang.llvm.org/doxygen/classclang_1_1ento_1_1nonloc_1_1ConcreteInt.html">ConcreteInt</a>,
136  and the checker doesn't usually need to track any state with the concrete
137  number. In some cases, <tt>SVal</tt> is not a symbol, but it really should be
138  a symbolic value. This happens when the analyzer cannot reason about something
139  (yet). An example is floating point numbers. In such cases, the
140  <tt>SVal</tt> will evaluate to <a href="http://clang.llvm.org/doxygen/classclang_1_1ento_1_1UnknownVal.html">UnknownVal</a>.
141  This represents a case that is outside the realm of the analyzer's reasoning
142  capabilities. <tt>SVals</tt> are value objects and their values can be viewed
143  using the <tt>.dump()</tt> method. Often they wrap persistent objects such as
144  symbols or regions.
145  <p>
  <a href="http://clang.llvm.org/doxygen/classclang_1_1ento_1_1SymExpr.html">SymExpr</a> (symbol)
  is meant to represent an abstract, but named, symbolic value. A symbol represents
  an actual (immutable) value. We might not know what its specific value is, but
  we can associate constraints with that value as we analyze a path. For
  example, we might record that the value of a symbol is greater than
  <tt>0</tt>.
154  <p>
155  <a href="http://clang.llvm.org/doxygen/classclang_1_1ento_1_1MemRegion.html">MemRegion</a> is similar to a symbol.
156  It is used to provide a lexicon of how to describe abstract memory. Regions can
157  layer on top of other regions, providing a layered approach to representing memory.
158  For example, a struct object on the stack might be represented by a <tt>VarRegion</tt>,
159  but a <tt>FieldRegion</tt> which is a subregion of the <tt>VarRegion</tt> could
160  be used to represent the memory associated with a specific field of that object.
161  So how do we represent symbolic memory regions? That's what
162  <a href="http://clang.llvm.org/doxygen/classclang_1_1ento_1_1SymbolicRegion.html">SymbolicRegion</a>
  is for. It is a <tt>MemRegion</tt> that has an associated symbol. Since the
  symbol is unique and has a unique name, that symbol names the region.
165
166  <P>
167  Let's see how the analyzer processes the expressions in the following example:
168  <p>
169  <pre class="code_example">
170  int foo(int x) {
171     int y = x * 2;
172     int z = x;
173     ...
174  }
175  </pre>
176  <p>
177Let's look at how <tt>x*2</tt> gets evaluated. When <tt>x</tt> is evaluated,
178we first construct an <tt>SVal</tt> that represents the lvalue of <tt>x</tt>, in
179this case it is an <tt>SVal</tt> that references the <tt>MemRegion</tt> for <tt>x</tt>.
180Afterwards, when we do the lvalue-to-rvalue conversion, we get a new <tt>SVal</tt>,
181which references the value <b>currently bound</b> to <tt>x</tt>. That value is
182symbolic; it's whatever <tt>x</tt> was bound to at the start of the function.
183Let's call that symbol <tt>$0</tt>. Similarly, we evaluate the expression for <tt>2</tt>,
184and get an <tt>SVal</tt> that references the concrete number <tt>2</tt>. When
185we evaluate <tt>x*2</tt>, we take the two <tt>SVals</tt> of the subexpressions,
186and create a new <tt>SVal</tt> that represents their multiplication (which in
187this case is a new symbolic expression, which we might call <tt>$1</tt>). When we
188evaluate the assignment to <tt>y</tt>, we again compute its lvalue (a <tt>MemRegion</tt>),
189and then bind the <tt>SVal</tt> for the RHS (which references the symbolic value <tt>$1</tt>)
190to the <tt>MemRegion</tt> in the symbolic store.
191<br>
192The second line is similar. When we evaluate <tt>x</tt> again, we do the same
193dance, and create an <tt>SVal</tt> that references the symbol <tt>$0</tt>. Note, two <tt>SVals</tt>
194might reference the same underlying values.
195
196<p>
To summarize, MemRegions are unique names for blocks of memory. Symbols are
unique names for abstract symbolic values. Some MemRegions represent abstract
symbolic chunks of memory, and thus are also based on symbols. SVals are just
references to values, and can reference MemRegions, Symbols, or concrete
values (e.g., the number 1).
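<p>
The evaluation of <tt>x*2</tt> walked through above can be mimicked with a small
standalone C++17 sketch. This is a toy model under stated assumptions, not the
Clang classes: <tt>ToySVal</tt>, <tt>ToyConcreteInt</tt>, <tt>ToySymbol</tt>,
<tt>ToyRegion</tt>, <tt>ToyUnknown</tt>, and <tt>evalMul</tt> are hypothetical names.
It only shows the idea of a discriminated union whose multiplication either folds
two concrete numbers or conjures a fresh symbol.

```cpp
#include <string>
#include <variant>

// Toy stand-ins for the kinds of values an SVal can reference.
struct ToyUnknown {};                      // outside the analyzer's reasoning
struct ToyConcreteInt { long Value; };     // e.g. the literal 2
struct ToySymbol { std::string Name; };    // e.g. "$0"
struct ToyRegion { std::string Name; };    // e.g. the memory of variable x

// A discriminated union of "values", symbolic and otherwise.
using ToySVal =
    std::variant<ToyUnknown, ToyConcreteInt, ToySymbol, ToyRegion>;

// Multiplies two values. Two concrete integers are folded; if at least one
// operand is a symbol, a new symbol is created; anything else is unknown.
ToySVal evalMul(const ToySVal &LHS, const ToySVal &RHS, int &NextSymbolId) {
  auto IsNumeric = [](const ToySVal &V) {
    return std::holds_alternative<ToyConcreteInt>(V) ||
           std::holds_alternative<ToySymbol>(V);
  };
  if (!IsNumeric(LHS) || !IsNumeric(RHS))
    return ToyUnknown{};
  const auto *L = std::get_if<ToyConcreteInt>(&LHS);
  const auto *R = std::get_if<ToyConcreteInt>(&RHS);
  if (L && R)
    return ToyConcreteInt{L->Value * R->Value};
  return ToySymbol{"$" + std::to_string(NextSymbolId++)};
}
```

<p>
With <tt>x</tt> bound to the symbol <tt>$0</tt> and the next free symbol number
being 1, multiplying by the concrete <tt>2</tt> yields the new symbol
<tt>$1</tt>, matching the walkthrough. The real analyzer builds a symbolic
expression that remembers its operands; this sketch deliberately drops that detail.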
202
203  <!--
204  TODO: Add a picture.
205  <br>
206  Symbols<br>
207  FunctionalObjects are used throughout.
208  -->
209
210<h2 id=idea>Idea for a Checker</h2>
211  Here are several questions which you should consider when evaluating your
212  checker idea:
213  <ul>
214    <li>Can the check be effectively implemented without path-sensitive
215    analysis? See <a href="#ast">AST Visitors</a>.</li>
216
    <li>How high is the false positive rate going to be? Looking at the occurrences
    of the issue you want to write a checker for in existing code bases might
    give you some ideas. </li>

    <li>How will the current limitations of the analysis affect the false alarm
    rate? Currently, the analyzer only reasons about one procedure at a time (no
    inter-procedural analysis). Also, it uses a simple range-tracking-based
    solver to model symbolic execution.</li>
225
226    <li>Consult the <a
227    href="http://llvm.org/bugs/buglist.cgi?query_format=advanced&amp;bug_status=NEW&amp;bug_status=REOPENED&amp;version=trunk&amp;component=Static%20Analyzer&amp;product=clang">Bugzilla database</a>
228    to get some ideas for new checkers and consider starting with improving/fixing
229    bugs in the existing checkers.</li>
230  </ul>
231
232<p>Once an idea for a checker has been chosen, there are two key decisions that
233need to be made:
234  <ul>
235    <li> Which events the checker should be tracking. This is discussed in more
236    detail in the section <a href="#events_callbacks">Events, Callbacks, and
237    Checker Class Structure</a>.
238    <li> What checker-specific data needs to be stored as part of the program
239    state (if any). This should be minimized as much as possible. More detail about
240    implementing custom program state is given in section <a
241    href="#extendingstates">Custom Program States</a>.
242  </ul>
243
244
245<h2 id=registration>Checker Registration</h2>
  All checker implementation files are located in the
  <tt>clang/lib/StaticAnalyzer/Checkers</tt> folder. The steps below describe
  how the checker <tt>SimpleStreamChecker</tt>, which checks for misuses of
  stream APIs, was registered with the analyzer.
  Similar steps should be followed for a new checker.
251<ol>
252  <li>A new checker implementation file, <tt>SimpleStreamChecker.cpp</tt>, was
253  created in the directory <tt>lib/StaticAnalyzer/Checkers</tt>.
254  <li>The following registration code was added to the implementation file:
255<pre class="code_example">
256void ento::registerSimpleStreamChecker(CheckerManager &amp;mgr) {
  mgr.registerChecker&lt;SimpleStreamChecker&gt;();
258}
259</pre>
260<li>A package was selected for the checker and the checker was defined in the
261table of checkers at <tt>lib/StaticAnalyzer/Checkers/Checkers.td</tt>. Since all
262checkers should first be developed as "alpha", and the SimpleStreamChecker
263performs UNIX API checks, the correct package is "alpha.unix", and the following
264was added to the corresponding <tt>UnixAlpha</tt> section of <tt>Checkers.td</tt>:
265<pre class="code_example">
let ParentPackage = UnixAlpha in {
...
def SimpleStreamChecker : Checker&lt;"SimpleStream"&gt;,
  HelpText&lt;"Check for misuses of stream APIs"&gt;,
  DescFile&lt;"SimpleStreamChecker.cpp"&gt;;
...
} // end "alpha.unix"
273</pre>
274
275<li>The source code file was made visible to CMake by adding it to
276<tt>lib/StaticAnalyzer/Checkers/CMakeLists.txt</tt>.
277
278</ol>
279
After adding a new checker to the analyzer, one can verify that it was
registered successfully by checking that it appears in the list of available checkers:
<br> <tt>$ <b>clang -cc1 -analyzer-checker-help</b></tt>
283
284<h2 id=events_callbacks>Events, Callbacks, and Checker Class Structure</h2>
285
<p> All checkers inherit from the <tt><a
href="http://clang.llvm.org/doxygen/classclang_1_1ento_1_1Checker.html">
Checker</a></tt> template class; the template parameter(s) describe the types of
events that the checker is interested in processing. The various types of events
that are available are described in the file <a
href="http://clang.llvm.org/doxygen/CheckerDocumentation_8cpp_source.html">
CheckerDocumentation.cpp</a>.
293
294<p> For each event type requested, a corresponding callback function must be
295defined in the checker class (<a
296href="http://clang.llvm.org/doxygen/CheckerDocumentation_8cpp_source.html">
297CheckerDocumentation.cpp</a> shows the
298correct function name and signature for each event type).
299
300<p> As an example, consider <tt>SimpleStreamChecker</tt>. This checker needs to
301take action at the following times:
302
303<ul>
304<li>Before making a call to a function, check if the function is <tt>fclose</tt>.
305If so, check the parameter being passed.
306<li>After making a function call, check if the function is <tt>fopen</tt>. If
307so, process the return value.
308<li>When values go out of scope, check whether they are still-open file
309descriptors, and report a bug if so. In addition, remove any information about
310them from the program state in order to keep the state as small as possible.
311<li>When file pointers "escape" (are used in a way that the analyzer can no longer
312track them), mark them as such. This prevents false positives in the cases where
313the analyzer cannot be sure whether the file was closed or not.
314</ul>
315
<p>The events that will be used for each of these actions are, respectively, <a
317href="http://clang.llvm.org/doxygen/classclang_1_1ento_1_1check_1_1PreCall.html">PreCall</a>,
318<a
319href="http://clang.llvm.org/doxygen/classclang_1_1ento_1_1check_1_1PostCall.html">PostCall</a>,
320<a
321href="http://clang.llvm.org/doxygen/classclang_1_1ento_1_1check_1_1DeadSymbols.html">DeadSymbols</a>,
322and <a
323href="http://clang.llvm.org/doxygen/classclang_1_1ento_1_1check_1_1PointerEscape.html">PointerEscape</a>.
324The high-level structure of the checker's class is thus:
325
326<pre class="code_example">
327class SimpleStreamChecker : public Checker&lt;check::PreCall,
328                                           check::PostCall,
329                                           check::DeadSymbols,
330                                           check::PointerEscape&gt; {
331public:
332
333  void checkPreCall(const CallEvent &amp;Call, CheckerContext &amp;C) const;
334
335  void checkPostCall(const CallEvent &amp;Call, CheckerContext &amp;C) const;
336
337  void checkDeadSymbols(SymbolReaper &amp;SR, CheckerContext &amp;C) const;
338
339  ProgramStateRef checkPointerEscape(ProgramStateRef State,
340                                     const InvalidatedSymbols &amp;Escaped,
341                                     const CallEvent *Call,
342                                     PointerEscapeKind Kind) const;
343};
344</pre>
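
<p>How template parameters can select which callbacks get registered is sketched
below in standalone C++17. This is a simplified toy, not how Clang's
<tt>Checker</tt> template is actually written: <tt>ToyManager</tt>,
<tt>ToyChecker</tt>, the <tt>toycheck</tt> namespace, <tt>subscribeAll</tt>, and
<tt>StreamLogger</tt> are hypothetical names, and function names stand in for real
call events. Each event type knows how to hook the matching callback of the checker
into the manager, and the base template simply asks every listed event type to do so.

```cpp
#include <functional>
#include <string>
#include <vector>

using Log = std::vector<std::string>;
using Callback = std::function<void(const std::string &Callee, Log &Out)>;

// Toy stand-in for the manager: one callback list per event type.
struct ToyManager {
  std::vector<Callback> PreCall;
  std::vector<Callback> PostCall;
};

namespace toycheck {
// Each event type wires the checker's matching callback into the manager.
struct PreCall {
  template <typename CheckerT>
  static void subscribe(const CheckerT &C, ToyManager &M) {
    M.PreCall.push_back([&C](const std::string &Callee, Log &Out) {
      C.checkPreCall(Callee, Out);
    });
  }
};
struct PostCall {
  template <typename CheckerT>
  static void subscribe(const CheckerT &C, ToyManager &M) {
    M.PostCall.push_back([&C](const std::string &Callee, Log &Out) {
      C.checkPostCall(Callee, Out);
    });
  }
};
} // namespace toycheck

// The template parameters list the events the checker wants to receive.
template <typename... Events> struct ToyChecker {
  template <typename CheckerT>
  static void subscribeAll(const CheckerT &C, ToyManager &M) {
    (Events::subscribe(C, M), ...); // C++17 fold over the event list
  }
};

// A stateless toy checker interested in two events.
struct StreamLogger : ToyChecker<toycheck::PreCall, toycheck::PostCall> {
  void checkPreCall(const std::string &Callee, Log &Out) const {
    if (Callee == "fclose")
      Out.push_back("pre:fclose");
  }
  void checkPostCall(const std::string &Callee, Log &Out) const {
    if (Callee == "fopen")
      Out.push_back("post:fopen");
  }
};
```

<p>In this sketch, a checker that lists an event but does not define the matching
callback fails to compile, which is the same contract the text above describes:
one callback per requested event type. The registered callbacks hold a reference
to the checker, so the checker object must outlive the manager.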
345
346<h2 id=extendingstates>Custom Program States</h2>
347
348<p> Checkers often need to keep track of information specific to the checks they
349perform. However, since checkers have no guarantee about the order in which the
350program will be explored, or even that all possible paths will be explored, this
351state information cannot be kept within individual checkers. Therefore, if
352checkers need to store custom information, they need to add new categories of
353data to the <tt>ProgramState</tt>. The preferred way to do so is to use one of
354several macros designed for this purpose. They are:
355
356<ul>
357<li><a
358href="http://clang.llvm.org/doxygen/ProgramStateTrait_8h.html#ae4cddb54383cd702a045d7c61b009147">REGISTER_TRAIT_WITH_PROGRAMSTATE</a>:
359Used when the state information is a single value. The methods available for
360state types declared with this macro are <tt>get</tt>, <tt>set</tt>, and
361<tt>remove</tt>.
362<li><a
363href="http://clang.llvm.org/doxygen/CheckerContext_8h.html#aa27656fa0ce65b0d9ba12eb3c02e8be9">REGISTER_LIST_WITH_PROGRAMSTATE</a>:
364Used when the state information is a list of values. The methods available for
365state types declared with this macro are <tt>add</tt>, <tt>get</tt>,
366<tt>remove</tt>, and <tt>contains</tt>.
367<li><a
368href="http://clang.llvm.org/doxygen/CheckerContext_8h.html#ad90f9387b94b344eaaf499afec05f4d1">REGISTER_SET_WITH_PROGRAMSTATE</a>:
369Used when the state information is a set of values. The methods available for
370state types declared with this macro are <tt>add</tt>, <tt>get</tt>,
371<tt>remove</tt>, and <tt>contains</tt>.
372<li><a
373href="http://clang.llvm.org/doxygen/CheckerContext_8h.html#a6d1893bb8c18543337b6c363c1319fcf">REGISTER_MAP_WITH_PROGRAMSTATE</a>:
374Used when the state information is a map from a key to a value. The methods
375available for state types declared with this macro are <tt>add</tt>,
376<tt>set</tt>, <tt>get</tt>, <tt>remove</tt>, and <tt>contains</tt>.
377</ul>
378
<p>All of these macros take as parameters the name to be used for the custom
category of state information and the data type(s) to be used for storage. The
data type(s) specified will become the parameter type and/or return type of the
methods that manipulate the new category of state information. Each of these
methods is templated with the name of the custom data type.
384
385<p>For example, a common case is the need to track data associated with a
386symbolic expression; a map type is the most logical way to implement this. The
387key for this map will be a pointer to a symbolic expression
388(<tt>SymbolRef</tt>). If the data type to be associated with the symbolic
389expression is an integer, then the custom category of state information would be
390declared as
391
392<pre class="code_example">
393REGISTER_MAP_WITH_PROGRAMSTATE(ExampleDataType, SymbolRef, int)
394</pre>
395
The data would be accessed with the <tt>get</tt> method, which for a map returns a
pointer to the stored value, or a null pointer if the key is not present:

<pre class="code_example">
ProgramStateRef state;
SymbolRef Sym;
...
const int *currentValue = state-&gt;get&lt;ExampleDataType&gt;(Sym);
</pre>
404
405and set with the function
406
407<pre class="code_example">
408ProgramStateRef state;
409SymbolRef Sym;
410int newValue;
411...
412ProgramStateRef newState = state-&gt;set&lt;ExampleDataType&gt;(Sym, newValue);
413</pre>
414
<p>In addition, the macros define a data type used for storing the data of the
new data category; the name of this type is the name of the data category with
"Ty" appended. For <tt>REGISTER_TRAIT_WITH_PROGRAMSTATE</tt>, this will simply
be the data type passed to the macro; for the other three macros, this will be a specialized
419version of the <a
420href="http://llvm.org/doxygen/classllvm_1_1ImmutableList.html">llvm::ImmutableList</a>,
421<a
422href="http://llvm.org/doxygen/classllvm_1_1ImmutableSet.html">llvm::ImmutableSet</a>,
423or <a
424href="http://llvm.org/doxygen/classllvm_1_1ImmutableMap.html">llvm::ImmutableMap</a>
425templated class. For the <tt>ExampleDataType</tt> example above, the type
426created would be equivalent to writing the declaration:
427
428<pre class="code_example">
429typedef llvm::ImmutableMap&lt;SymbolRef, int&gt; ExampleDataTypeTy;
430</pre>
431
432<p>These macros will cover a majority of use cases; however, they still have a
433few limitations. They cannot be used inside namespaces (since they expand to
434contain top-level namespace references), and the data types that they define
435cannot be referenced from more than one file.
436
<p>Note that <tt>ProgramStates</tt> are immutable; instead of modifying an existing
one, functions that modify the state return a copy of the previous state
with the change applied. This updated state must then be provided to the
analyzer core by calling the <tt>CheckerContext::addTransition</tt> function.
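
<p>The immutability rule can be illustrated with a standalone C++ sketch. It is a
toy model, not the <tt>ProgramState</tt> implementation: <tt>ToyState</tt> and
<tt>ToyStateRef</tt> are hypothetical names, and a full copy of a
<tt>std::map</tt> stands in for the persistent data structures the analyzer
actually uses. The point is only that <tt>set</tt> hands back a new state and
leaves the old one untouched.

```cpp
#include <map>
#include <memory>
#include <string>

class ToyState;
using ToyStateRef = std::shared_ptr<const ToyState>;

// Toy immutable state: a map from symbol names to integers.
class ToyState {
  std::map<std::string, int> Data;

public:
  // Returns a new state with Sym bound to Value; this state is not modified.
  ToyStateRef set(const std::string &Sym, int Value) const {
    auto Copy = std::make_shared<ToyState>(*this);
    Copy->Data[Sym] = Value;
    return Copy;
  }

  // Returns a pointer to the bound value, or nullptr if Sym is not tracked.
  const int *get(const std::string &Sym) const {
    auto It = Data.find(Sym);
    return It == Data.end() ? nullptr : &It->second;
  }
};
```

<p>A caller that ignores the return value of <tt>set</tt> loses the update, which
is the toy equivalent of forgetting to pass the new state to
<tt>addTransition</tt>.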
441<h2 id=bugs>Bug Reports</h2>
442
443
444<p> When a checker detects a mistake in the analyzed code, it needs a way to
445report it to the analyzer core so that it can be displayed. The two classes used
446to construct this report are <tt><a
447href="http://clang.llvm.org/doxygen/classclang_1_1ento_1_1BugType.html">BugType</a></tt>
448and <tt><a
449href="http://clang.llvm.org/doxygen/classclang_1_1ento_1_1BugReport.html">
450BugReport</a></tt>.
451
452<p>
453<tt>BugType</tt>, as the name would suggest, represents a type of bug. The
constructor for <tt>BugType</tt> takes two parameters: the name of the bug
type, and the name of the category of the bug. These are used, for example, in the
summary page generated by the scan-build tool.
457
458<P>
459  The <tt>BugReport</tt> class represents a specific occurrence of a bug. In
460  the most common case, three parameters are used to form a <tt>BugReport</tt>:
461<ol>
462<li>The type of bug, specified as an instance of the <tt>BugType</tt> class.
463<li>A short descriptive string. This is placed at the location of the bug in
464the detailed line-by-line output generated by scan-build.
465<li>The context in which the bug occurred. This includes both the location of
466the bug in the program and the program's state when the location is reached. These are
467both encapsulated in an <tt>ExplodedNode</tt>.
468</ol>
469
470<p>In order to obtain the correct <tt>ExplodedNode</tt>, a decision must be made
471as to whether or not analysis can continue along the current path. This decision
472is based on whether the detected bug is one that would prevent the program under
473analysis from continuing. For example, leaking of a resource should not stop
474analysis, as the program can continue to run after the leak. Dereferencing a
475null pointer, on the other hand, should stop analysis, as there is no way for
476the program to meaningfully continue after such an error.
477
478<p>If analysis can continue, then the most recent <tt>ExplodedNode</tt>
479generated by the checker can be passed to the <tt>BugReport</tt> constructor
480without additional modification. This <tt>ExplodedNode</tt> will be the one
481returned by the most recent call to <a
482href="http://clang.llvm.org/doxygen/classclang_1_1ento_1_1CheckerContext.html#a264f48d97809707049689c37aa35af78">CheckerContext::addTransition</a>.
483If no transition has been performed during the current callback, the checker should call <a
484href="http://clang.llvm.org/doxygen/classclang_1_1ento_1_1CheckerContext.html#a264f48d97809707049689c37aa35af78">CheckerContext::addTransition()</a>
485and use the returned node for bug reporting.
486
<p>If analysis cannot continue, then the current state should be transitioned
into a so-called <i>sink node</i>, a node from which no further analysis will be
performed. This is done by calling the <a
href="http://clang.llvm.org/doxygen/classclang_1_1ento_1_1CheckerContext.html#adeea33a5a2bed190210c4a2bb807a6f0">
CheckerContext::generateSink</a> function; this function is the same as the
<tt>addTransition</tt> function, but marks the state as a sink node. Like
<tt>addTransition</tt>, this returns an <tt>ExplodedNode</tt> with the updated
state, which can then be passed to the <tt>BugReport</tt> constructor.
495
496<p>
497After a <tt>BugReport</tt> is created, it should be passed to the analyzer core
498by calling <a href = "http://clang.llvm.org/doxygen/classclang_1_1ento_1_1CheckerContext.html#ae7738af2cbfd1d713edec33d3203dff5">CheckerContext::emitReport</a>.
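
<p>The difference between an ordinary transition and a sink can be sketched in
standalone C++. This is a toy model under stated assumptions, not the
<tt>CheckerContext</tt> API: <tt>ToyNode</tt> and <tt>ToyContext</tt> are
hypothetical names, states are plain strings, and the method names merely echo the
functions discussed above. A leak is reported on a normal node and exploration
goes on; a null dereference is reported on a sink, after which no further node can
be added along this path.

```cpp
#include <memory>
#include <string>
#include <vector>

// Toy stand-in for a node of the exploded graph.
struct ToyNode {
  std::string State;   // stand-in for the program state
  const ToyNode *Pred; // previous node on this path, or nullptr
  bool IsSink;         // true if analysis must stop here
};

// Toy stand-in for the context a checker callback receives.
class ToyContext {
  std::vector<std::unique_ptr<ToyNode>> Nodes;
  const ToyNode *Current = nullptr;

  // Appends a node after the current one; refuses to extend past a sink.
  const ToyNode *extend(const std::string &State, bool IsSink) {
    if (Current && Current->IsSink)
      return nullptr;
    Nodes.push_back(
        std::make_unique<ToyNode>(ToyNode{State, Current, IsSink}));
    Current = Nodes.back().get();
    return Current;
  }

public:
  std::vector<std::string> Reports;

  const ToyNode *addTransition(const std::string &State) {
    return extend(State, /*IsSink=*/false);
  }
  const ToyNode *generateSink(const std::string &State) {
    return extend(State, /*IsSink=*/true);
  }
  // Records a report only if a node is available to attach it to.
  void emitReport(const std::string &Message, const ToyNode *Where) {
    if (Where)
      Reports.push_back(Message + " at '" + Where->State + "'");
  }
};
```

<p>In this sketch both functions return a null pointer once the path has ended in
a sink, so a caller should check the returned node before building a report.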
499
500<h2 id=ast>AST Visitors</h2>
  Some checks might not require path-sensitivity to be effective; a simple AST walk
  might be sufficient. If that is the case, consider implementing a Clang
  compiler warning. On the other hand, a check might not be acceptable as a compiler
  warning, for example, because of a relatively high false positive rate. In this
  situation, the AST callbacks <tt><b>checkASTDecl</b></tt> and
  <tt><b>checkASTCodeBody</b></tt> are your best friends.
507
508<h2 id=testing>Testing</h2>
  Every patch should be well tested with Clang regression tests. The checker tests
  live in the <tt>clang/test/Analysis</tt> folder. To run all of the analyzer tests,
  execute the following from the <tt>clang</tt> build directory:
512    <pre class="code">
513    $ <b>TESTDIRS=Analysis make test</b>
514    </pre>
515
516<h2 id=commands>Useful Commands/Debugging Hints</h2>
517<ul>
518<li>
519While investigating a checker-related issue, instruct the analyzer to only
520execute a single checker:
521<br><tt>
522$ <b>clang -cc1 -analyze -analyzer-checker=osx.KeychainAPI test.c</b>
523</tt>
524</li>
525<li>
To dump the AST:
527<br><tt>
528$ <b>clang -cc1 -ast-dump test.c</b>
529</tt>
530</li>
531<li>
To view or dump the CFG, use the <tt>debug.ViewCFG</tt> or <tt>debug.DumpCFG</tt> checkers:
533<br><tt>
534$ <b>clang -cc1 -analyze -analyzer-checker=debug.ViewCFG test.c</b>
535</tt>
536</li>
537<li>
538To see all available debug checkers:
539<br><tt>
540$ <b>clang -cc1 -analyzer-checker-help | grep "debug"</b>
541</tt>
542</li>
543<li>
To see which function is failing while processing a large file, use the
<tt>-analyzer-display-progress</tt> option.
546</li>
547<li>
While debugging, execute <tt>clang -cc1 -analyze -analyzer-checker=core</tt>
instead of <tt>clang --analyze</tt>, as the latter would call the compiler
in a separate process.
551</li>
552<li>
To view the <tt>ExplodedGraph</tt> (the state graph explored by the analyzer) while
debugging, go to a frame that has a <tt>clang::ento::ExprEngine</tt> object and
execute:
556<br><tt>
557(gdb) <b>p ViewGraph(0)</b>
558</tt>
559</li>
560<li>
To see the <tt>ProgramState</tt> while debugging, use the following command:
562<br><tt>
563(gdb) <b>p State->dump()</b>
564</tt>
565</li>
566<li>
To see a <tt>clang::Expr</tt> while debugging, use the following command. If you
pass in a <tt>SourceManager</tt> object, it will also dump the corresponding line in the
source code.
570<br><tt>
571(gdb) <b>p E->dump()</b>
572</tt>
573</li>
574<li>
To dump the AST of the method that the current <tt>ExplodedNode</tt> belongs to:
<br><tt>
(gdb) <b>p C.getPredecessor()->getCodeDecl().getBody()->dump()</b>
<br>
(gdb) <b>p C.getPredecessor()->getCodeDecl().getBody()->dump(getContext().getSourceManager())</b>
579</tt>
580</li>
581</ul>
582
583<h2 id=additioninformation>Additional Sources of Information</h2>
584
585Here are some additional resources that are useful when working on the Clang
586Static Analyzer:
587
588<ul>
589<li> <a href="http://clang.llvm.org/doxygen">Clang doxygen</a>. Contains
590up-to-date documentation about the APIs available in Clang. Relevant entries
591have been linked throughout this page. Also of use is the
592<a href="http://llvm.org/doxygen">LLVM doxygen</a>, when dealing with classes
593from LLVM.
594<li> The <a href="http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev">
595cfe-dev mailing list</a>. This is the primary mailing list used for
596discussion of Clang development (including static code analysis). The
597<a href="http://lists.cs.uiuc.edu/pipermail/cfe-dev">archive</a> also contains
598a lot of information.
599<li> The "Building a Checker in 24 hours" presentation given at the <a
600href="http://llvm.org/devmtg/2012-11">November 2012 LLVM Developer's
601meeting</a>. Describes the construction of SimpleStreamChecker. <a
602href="http://llvm.org/devmtg/2012-11/Zaks-Rose-Checker24Hours.pdf">Slides</a>
603and <a
604href="http://llvm.org/devmtg/2012-11/videos/Zaks-Rose-Checker24Hours.mp4">video</a>
605are available.
606</ul>
607
608</div>
609</div>
610</body>
611</html>
612