1\input texinfo
2@setfilename ../../info/semantic.info
3@set TITLE  Semantic Manual
4@set AUTHOR Eric M. Ludlam, David Ponce, and Richard Y. Kim
5@settitle @value{TITLE}
6@include docstyle.texi
7
8@c *************************************************************************
9@c @ Header
10@c *************************************************************************
11
12@c Merge all indexes into a single index for now.
13@c We can always separate them later into two or more as needed.
14@syncodeindex vr cp
15@syncodeindex fn cp
16@syncodeindex ky cp
17@syncodeindex pg cp
18@syncodeindex tp cp
19
20@c @footnotestyle separate
21@c @paragraphindent 2
22@c @@smallbook
23@c %**end of header
24
25@copying
26This manual documents the Semantic library and utilities.
27
28Copyright @copyright{} 1999--2005, 2007, 2009--2021 Free Software
29Foundation, Inc.
30
31@quotation
32Permission is granted to copy, distribute and/or modify this document
33under the terms of the GNU Free Documentation License, Version 1.3 or
34any later version published by the Free Software Foundation; with no
35Invariant Sections, with the Front-Cover Texts being ``A GNU Manual,''
36and with the Back-Cover Texts as in (a) below.  A copy of the license
37is included in the section entitled ``GNU Free Documentation License.''
38
39(a) The FSF's Back-Cover Text is: ``You have the freedom to copy and
40modify this GNU manual.''
41@end quotation
42@end copying
43
44@dircategory Emacs misc features
45@direntry
46* Semantic: (semantic).         Source code parser library and utilities.
47@end direntry
48
49@titlepage
50@center @titlefont{Semantic}
51@sp 4
52@center by @value{AUTHOR}
53@page
54@vskip 0pt plus 1filll
55@insertcopying
56@end titlepage
57@page
58
59@macro semantic{}
60@i{Semantic}
61@end macro
62
63@macro keyword{kw}
64@anchor{\kw\}
65@b{\kw\}
66@end macro
67
68@macro obsolete{old,new}
69@sp 1
70@strong{Compatibility}:
71@code{\new\} introduced in @semantic{} version 2.0 supersedes
72@code{\old\} which is now obsolete.
73@end macro
74
75@c *************************************************************************
76@c @ Document
77@c *************************************************************************
78@contents
79
80@node top
81@top @value{TITLE}
82
83@semantic{} is a suite of Emacs libraries and utilities for parsing
84source code.  At its core is a lexical analyzer and two parser
85generators (@code{bovinator} and @code{wisent}) written in Emacs Lisp.
86@semantic{} provides a variety of tools for making use of the parser
87output, including user commands for code navigation and completion, as
88well as enhancements for imenu, speedbar, whichfunc, eldoc,
89hippie-expand, and several other parts of Emacs.
90
To send bug reports or participate in discussions about @semantic{},
use the mailing list cedet-semantic@@sourceforge.net, which you can
reach via the URL
@url{http://lists.sourceforge.net/lists/listinfo/cedet-semantic}.
94
95@ifnottex
96@insertcopying
97@end ifnottex
98
99@menu
100* Introduction::
101* Using Semantic::
102* Semantic Internals::
103* Glossary::
104* GNU Free Documentation License::
105* Index::
106@end menu
107
108@node Introduction
109@chapter Introduction
110
111This chapter gives an overview of @semantic{} and its goals.
112
113Ordinarily, Emacs uses regular expressions (and syntax tables) to
114analyze source code for purposes such as syntax highlighting.  This
115approach, though simple and efficient, has its limitations: roughly
116speaking, it only ``guesses'' the meaning of each piece of source code
117in the context of the programming language, instead of rigorously
118``understanding'' it.
119
120@semantic{} provides a new infrastructure to analyze source code using
121@dfn{parsers} instead of regular expressions.  It contains two
122built-in parser generators (an @acronym{LL} generator named
123@code{Bovine} and an @acronym{LALR} generator named @code{Wisent},
124both written in Emacs Lisp), and parsers for several common
125programming languages.  It can also make use of @dfn{external
126parsers}---programs such as GNU Global and GNU IDUtils.
127
128@semantic{} provides a uniform, language-independent @acronym{API} for
129accessing the parser output.  This output can be used by other Emacs
130Lisp programs to implement ``syntax-aware'' behavior.  @semantic{}
131itself includes several such utilities, including user-level Emacs
132commands for navigating, searching, and completing source code.
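
For example, a Lisp program can ask for the tags of the current buffer
and inspect each one without knowing which language the buffer is
written in.  The following is a minimal sketch, not part of
@semantic{}: the function name @code{my-semantic-tag-summary} is
hypothetical, and it assumes that @code{semantic-mode} is enabled and
that the buffer's major mode has a parser.

@example
(require 'semantic)
(require 'semantic/tag)

(defun my-semantic-tag-summary ()
  "Return a list of (NAME . CLASS) pairs for the current buffer.
NAME is the tag name and CLASS is a symbol such as `function'
or `variable'.  Return nil if the buffer has no parsed tags."
  (mapcar (lambda (tag)
            (cons (semantic-tag-name tag)
                  (semantic-tag-class tag)))
          ;; `semantic-fetch-tags' reparses the buffer if needed.
          (semantic-fetch-tags)))
@end example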
133
134The following diagram illustrates the structure of the @semantic{}
135package:
136
137@table @strong
138@item Please Note:
Words in all capitals are components that @semantic{} itself provides.
The others are current or future languages or applications that are
not distributed along with @semantic{}.
142@end table
143
144@example
145                                                             Applications
146                                                                 and
147                                                              Utilities
148                                                                -------
149                                                               /       \
150               +---------------+    +--------+    +--------+
151         C --->| C      PARSER |--->|        |    |        |
152               +---------------+    |        |    |        |
153               +---------------+    | COMMON |    | COMMON |<--- SPEEDBAR
154      Java --->| JAVA   PARSER |--->| PARSE  |    |        |
155               +---------------+    | TREE   |    | PARSE  |<--- SEMANTICDB
156               +---------------+    | FORMAT |    | API    |
157    Scheme --->| SCHEME PARSER |--->|        |    |        |<--- ecb
158               +---------------+    |        |    |        |
159               +---------------+    |        |    |        |
160   Texinfo --->| TEXI.  PARSER |--->|        |    |        |
161               +---------------+    |        |    |        |
162
163                    ...                ...           ...         ...
164
165               +---------------+    |        |    |        |
166   Lang. Y --->| Y      Parser |--->|        |    |        |<--- app. ?
167               +---------------+    |        |    |        |
168               +---------------+    |        |    |        |<--- app. ?
169   Lang. Z --->| Z      Parser |--->|        |    |        |
170               +---------------+    +--------+    +--------+
171@end example
172
173@menu
174* Semantic Components::
175@end menu
176
177@node Semantic Components
178@section Semantic Components
179
180In this section, we provide a more detailed description of the major
181components of @semantic{}, and how they interact with one another.
182
183The first step in parsing a source code file is to break it up into
184its fundamental components.  This step is called lexical analysis:
185
186@example
187        syntax table, keywords list, and options
188                         |
189                         |
190                         v
191    input file  ---->  Lexer   ----> token stream
192@end example
193
194@noindent
195The output of the lexical analyzer is a list of tokens that make up
196the file.  The next step is the actual parsing, shown below:
197
198@example
199                    parser tables
200                         |
201                         v
202    token stream --->  Parser  ----> parse tree
203@end example
204
205@noindent
The end result, the parse tree, is @semantic{}'s internal
representation of the source code, structured according to the
language grammar.  @semantic{} provides an @acronym{API} for Emacs
Lisp programs to access the parse tree.
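
The token stream produced by the first step can be examined from Lisp.
The sketch below is an illustration only: the function name
@code{my-semantic-token-summary} is hypothetical, and it assumes a
buffer in which @semantic{} has already set up a lexical analyzer.

@example
(require 'semantic)
(require 'semantic/lex)

(defun my-semantic-token-summary (start end)
  "Return (CLASS . TEXT) pairs for the tokens between START and END.
CLASS is the token's lexical class symbol and TEXT is the buffer
text that the token covers."
  (mapcar (lambda (token)
            (cons (semantic-lex-token-class token)
                  (semantic-lex-token-text token)))
          ;; Run the buffer's lexical analyzer over the region.
          (semantic-lex start end)))
@end example

@noindent
Evaluating @code{(my-semantic-token-summary (point-min) (point-max))}
in a source buffer should list every token the lexer finds there.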
209
210Parsing large files can take several seconds or more.  By default,
211@semantic{} automatically caches parse trees by saving them in your
212@file{.emacs.d} directory.  When you revisit a previously-parsed file,
213the parse tree is automatically reloaded from this cache, to save
214time.  @xref{SemanticDB}.
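
As a configuration sketch, the cache location can be set from an init
file before @semantic{} is enabled.  The directory chosen here is only
an assumption made for the example; any writable directory should do.

@example
(require 'semantic)
(require 'semantic/db-file)

;; Assumption: keep cache files in a "semanticdb" subdirectory
;; of the user's Emacs directory.
(setq semanticdb-default-save-directory
      (locate-user-emacs-file "semanticdb"))

;; Make sure the database minor mode is among the submodes that
;; `semantic-mode' turns on, then enable Semantic.
(add-to-list 'semantic-default-submodes
             'global-semanticdb-minor-mode)
(semantic-mode 1)
@end example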
215
216@node Using Semantic
217@chapter Using Semantic
218
219@include sem-user.texi
220
221@node Semantic Internals
222@chapter Semantic Internals
223
224This chapter provides an overview of the internals of @semantic{}.
225This information is usually not needed by application developers or
226grammar developers; it is useful mostly for the hackers who would like
227to learn more about how @semantic{} works.
228
229@menu
230* Parser code::          Code used for the parsers
231* Tag handling::         Code used for manipulating tags
232* Semanticdb Internals:: Code used in the semantic database
233* Analyzer Internals::   Code used in the code analyzer
234* Tools::                Code used in user tools
235@ignore
236* Tests::                Code used for testing
237@end ignore
238@end menu
239
240@node Parser code
241@section Parser code
242
243@semantic{} parsing code is spread across a range of files.
244
245@table @file
246@item semantic.el
247The core infrastructure sets up buffers for parsing, and has all the
248core parsing routines.  Most parsing routines are overloadable, so the
249actual implementation may be somewhere else.
250
251@item semantic/edit.el
252Incremental reparse based on user edits.
253
254@item semantic/grammar.el
255@itemx semantic-grammar.wy
256Parser for the different grammar languages, and a major mode for
257editing grammars in Emacs.
258
259@item semantic/lex.el
260Infrastructure for implementing lexical analyzers.  Provides macros
261for creating individual analyzers for specific features, and a way to
262combine them together.
263
264@item semantic/lex-spp.el
265Infrastructure for a lexical symbolic preprocessor.  This was written
266to implement the C preprocessor, but could be used for other lexical
267preprocessors.
268
@item semantic/bovine.el
270@itemx semantic/bovine/grammar.el
271The ``bovine'' grammar.  This is the first grammar mode written for
272@semantic{} and is useful for creating simple parsers.
273
274@item semantic/wisent.el
275@itemx semantic/wisent/wisent.el
276@itemx semantic/wisent/grammar.el
A port of Bison to Emacs Lisp.  This infrastructure lets you create
LALR-based parsers for @semantic{}.
279
280@item semantic/debug.el
281Infrastructure for debugging grammars.
282
283@item semantic/util.el
Various utilities for manipulating tags, such as describing the tag
under point, adding labels, and the all-important
@code{semantic-something-to-tag-table}.
287
288@end table
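
The overloading mentioned for @file{semantic.el} is provided by the
@file{mode-local.el} library.  The following sketch shows the general
pattern on a made-up function; @code{my-comment-prefix} and its
return values are hypothetical and are not part of @semantic{}.

@example
(require 'mode-local)

;; With an empty body, the overloadable function falls back to
;; the function named `my-comment-prefix-default'.
(define-overloadable-function my-comment-prefix ()
  "Return the string that starts a comment line.")

(defun my-comment-prefix-default ()
  "Default implementation, used when no mode overrides it."
  "# ")

;; Override the behavior for buffers in `emacs-lisp-mode'.
(define-mode-local-override my-comment-prefix emacs-lisp-mode ()
  "Emacs Lisp comment lines start with two semicolons."
  ";; ")
@end example

@noindent
With these definitions, calling @code{(my-comment-prefix)} in an Emacs
Lisp buffer should use the override, while other buffers fall back to
the default implementation.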
289
290@node Tag handling
291@section Tag handling
292
A tag represents an individual item found in a buffer, such as a
function or variable.  Tag handling is implemented in several source
files.
296
297@table @file
298@item semantic/tag.el
299Basic tag creation, queries, cloning, binding, and unbinding.
300
301@item semantic/tag-write.el
Write a tag or tag list to a stream.  These routines are used by
@file{semantic/db-file.el} when saving a list of tags.
304
305@item semantic/tag-file.el
Files associated with tags.  Covers going to a tag, finding the file
for an include, and finding the file for a prototype.
308
309@item semantic/tag-ls.el
Language-dependent features of a tag, such as parent calculation, slot
protection, and other states like abstract, virtual, static, and leaf.
312
313@item semantic/dep.el
314Include file handling.  Contains the include path concepts, and
315routines for looking up file names in the include path.
316
317@item semantic/format.el
318Convert a tag into a nicely formatted and colored string.  Use
319@code{semantic-test-all-format-tag-functions} to test different output
320options.
321
322@item semantic/find.el
Find tags matching different conditions in a tag table.
These routines are used by @file{semantic/db-find.el} once the
database has been converted into a simpler tag table.
326
327@item semantic/sort.el
328Sorting lists of tags in different ways.  Includes sorting a plain
329list of tags forward or backward.  Includes binning tags based on
330attributes (bucketize), and tag adoption for multiple references to
331the same thing.
332
333@item semantic/doc.el
334Capture documentation comments from near a tag.
335
336@end table
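
To show how these files fit together, the sketch below builds a small
tag table by hand and then searches it.  The variable and function
names starting with @code{my-} are hypothetical; real tag tables
normally come from a parser rather than from calls like these.

@example
(require 'semantic)
(require 'semantic/tag)
(require 'semantic/find)
(require 'semantic/util)

(defvar my-sample-tags
  (list (semantic-tag-new-function "main" "int" nil)
        (semantic-tag-new-variable "counter" "int")
        (semantic-tag-new-include "stdio.h" t))
  "A small hand-built tag table, for illustration only.")

(defun my-sample-function-names ()
  "Return the names of all function tags in `my-sample-tags'."
  (mapcar #'semantic-tag-name
          (semantic-find-tags-by-class 'function my-sample-tags)))
@end example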
337
338@node Semanticdb Internals
339@section Semanticdb Internals
340
@acronym{Semanticdb} is a complex subsystem, because the problem it
tries to solve is a rather hairy one.  Its code is spread across the
following files.
343
344@table @file
345@item semantic/db.el
346Defines a @dfn{database} and a @dfn{table} base class.  You can
347instantiate these classes, and use them, but they are not persistent.
348
349This file also provides support for @code{semanticdb-minor-mode},
350which automatically associates files with tables in databases so that
351tags are @emph{saved} while a buffer is not in memory.
352
The database and tables both also provide cache information and a
cache-flushing system.  The semanticdb search routines use caches
to save data structures that are complex to calculate.
356
357Lastly, it provides the concept of @dfn{project root}.  It is a system
358by which a file can be associated with the root of a project, so if
359you have a tree of directories and source files, it can find the root,
360and allow a tag-search to span all available databases in that
361directory hierarchy.
362
363@item semantic/db-file.el
Provides a subclass of the basic table so that it can be saved to
disk.  Implements all the code needed to unbind tags from a buffer,
rebind them to it, and write them to a file.
367
368@item semantic/db-el.el
369Implements a special kind of @dfn{system} database that uses Emacs
370internals to perform queries.
371
372@item semantic/db-ebrowse.el
Implements a system database that uses Ebrowse to parse files into a
table that can be queried for tag names.  Successful tag hits during a
find cause @semantic{} to pick up and parse the referenced files to
get the full details.
377
378@item semantic/db-find.el
Infrastructure for searching groups of @semantic{} databases, and
dealing with the search results format.
381
382@item semantic/db-ref.el
Tracks cross-references.  Cross-references are needed when a buffer is
reparsed and must alert other tables that any dependent caches may
need to be flushed.  References are in the form of include files.
386
387@end table
388
389@node Analyzer Internals
390@section Analyzer Internals
391
392The @semantic{} analyzer is a complex engine which has been broken
393down across several modules.  When the @semantic{} analyzer fails,
394start with @code{semantic-analyze-debug-assist}, then dive into some
395of these files.
396
397@table @file
398@item semantic/analyze.el
399The core analyzer for defining the @dfn{current context}.  The
400current context is an object that contains references to aspects of
401the local context including the current prefix, and a tag list
402defining what the prefix means.
403
404@item semantic/analyze/complete.el
405Provides @code{semantic-analyze-possible-completions}.
406
407@item semantic/analyze/debug.el
408The analyzer debugger.  Useful when attempting to get everything
409configured.
410
411@item semantic/analyze/fcn.el
412Various support functions needed by the analyzer.
413
414@item semantic/ctxt.el
415Local context parser.  Contains overloadable functions used to move
416around through different scopes, get local variables, and collect the
417current prefix used when doing completion.
418
419@item semantic/scope.el
420Calculate @dfn{scope} for a location in a buffer.  The scope includes
421local variables, and tag lists in scope for various reasons, such as
422C++ using statements.
423
424@item semantic/db-typecache.el
425The typecache is part of @code{semanticdb}, but is used primarily by
426the analyzer to look up datatypes and complex names.  The typecache is
427bound across source files and builds a master lookup table for data
428type names.
429
430@item semantic/ia.el
431Interactive Analyzer functions.  Simple routines that do completion or
432lookups based on the results from the Analyzer.  These routines are
433meant as examples for application writers, but are quite useful as
434they are.
435
436@item semantic/ia-sb.el
437Speedbar support for the analyzer, displaying context info, and
438completion lists.
439
440@end table
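
As an illustration of how an application might drive the analyzer, the
sketch below combines the current context with the completion engine.
The function name @code{my-semantic-completion-names} is hypothetical,
and the sketch assumes that the completion results are tags.  It
deliberately swallows errors, so use
@code{semantic-analyze-debug-assist} when you need to know why nothing
was found.

@example
(require 'semantic)
(require 'semantic/analyze)
(require 'semantic/analyze/complete)

(defun my-semantic-completion-names ()
  "Return names of tags that could complete the symbol at point.
Return nil if the analyzer cannot compute a context or finds
nothing to complete."
  (ignore-errors
    (let ((context (semantic-analyze-current-context (point))))
      (when context
        (mapcar #'semantic-tag-name
                (semantic-analyze-possible-completions context))))))
@end example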
441
442@node Tools
443@section Tools
444
445These files contain various tools for users.
446
447@table @file
448@item semantic/idle.el
449Idle scheduler for @semantic{}.  Manages reparsing buffers after
450edits, and large work tasks in idle time.  Includes modes for showing
451summary help and pop-up completion.
452
453@item semantic/senator.el
454The @semantic{} navigator.  Provides many ways to move through a
455buffer based on the active tag table.
456
457@item semantic/decorate.el
A minor mode for decorating tags based on details from the parser.
Examples include overlining functions and coloring class fields based
on protection.
461
462@item semantic/decorate/include.el
463A decoration mode for include files, which assists users in setting up
464parsing for their includes.
465
466@item semantic/complete.el
467Advanced completion prompts for reading tag names in the minibuffer, or
468inline in a buffer.
469
470@item semantic/imenu.el
471Imenu support for using @semantic{} tags in imenu.
472
473@item semantic/mru-bookmark.el
Automatic bookmarking based on tags.  Jump to locations you have
visited before based on tag name.
476
477@item semantic/sb.el
478Support for @semantic{} tag usage in Speedbar.
479
480@item semantic/util-modes.el
A collection of small minor modes that expose aspects of the
@semantic{} parser state.  Includes @code{semantic-stickyfunc-mode}.
483
484@item semantic/chart.el
Draws charts from statistics gathered during parsing.
486
487@end table
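
Several of these tools are global minor modes that can be switched on
together.  The following init-file sketch is one possible selection,
not a recommended default; it assumes the modes are enabled through
@code{semantic-default-submodes}.

@example
(require 'semantic)

;; Each symbol names a global minor mode provided by one of the
;; files described above.
(dolist (mode '(global-semantic-idle-scheduler-mode
                global-semantic-idle-summary-mode
                global-semantic-decoration-mode
                global-semantic-mru-bookmark-mode
                global-semantic-stickyfunc-mode))
  (add-to-list 'semantic-default-submodes mode))

;; Turn on Semantic together with the selected submodes.
(semantic-mode 1)
@end example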
488
489@c These files seem to not have been imported from CEDET.
490@ignore
491@node Tests
492@section Tests
493
494@table @file
495
496@item semantic-utest.el
497Basic testing of parsing and incremental parsing for most supported
498languages.
499
500@item semantic-ia-utest.el
501Test the semantic analyzer's ability to provide smart completions.
502
503@item semantic-utest-c.el
504Tests for the C parser's lexical pre-processor.
505
506@item semantic-regtest.el
507Regression tests from the older Semantic 1.x API.
508
509@end table
510@end ignore
511
512@node Glossary
513@appendix Glossary
514
515@table @asis
516@item BNF
In @semantic{} 1.4, BNF stood for ``Bovine Normal Form'', the format
of the grammar file used by the 1.4 parser generator.  This was a play
on Backus-Naur Form, which proved too confusing.
520
521@item bovinate
522A verb representing what happens when a bovine parser parses a file.
523
524@item bovine lambda
In a bovine, or LL, parser, the bovine lambda is a function to execute
when a specific set of match rules has succeeded in matching text from
the buffer.
528
529@item bovine parser
530A parser using the bovine parser generator.  It is an LL parser
531suitable for small simple languages.
532
533@item context
534
@item LALR
Look-Ahead LR, a class of bottom-up parsers.  The wisent parser
generator produces LALR parsers.
536
537@item lexer
A program that converts text into a stream of tokens by analyzing
it lexically.  Lexers commonly create strings, symbols, keywords, and
punctuation, and strip whitespace and comments.
541
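@item LL
A class of top-down parsers that read their input from left to right
and construct a leftmost derivation.  The bovine parser generator
produces LL parsers.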
543
544@item nonterminal
545A nonterminal symbol or simply a nonterminal stands for a class of
546syntactically equivalent groupings.  A nonterminal symbol name is used
547in writing grammar rules.
548
549@item overloadable
550Some functions are defined via @code{define-overload}.
551These can be overloaded via ....
552
553@item parser
554A program that converts @b{tokens} to @b{tags}.
555
556@item tag
A tag is a representation of some entity in a language file, such as a
function, variable, or include statement.  In @semantic{}, the word
tag is used the same way it is used for the etags or ctags tools.
560
561A tag is usually bound to a buffer region via overlay, or it just
562specifies character locations in a file.
563
564@item token
565A single atomic item returned from a lexer.  It represents some set
566of characters found in a buffer.
567
568@item token stream
569The output of the lexer as well as the input to the parser.
570
571@item wisent parser
572A parser using the wisent parser generator.  It is a port of bison to
573Emacs Lisp.  It is an LALR parser suitable for complex languages.
574@end table
575
576
577@node GNU Free Documentation License
578@appendix GNU Free Documentation License
579@include doclicense.texi
580
581@node Index
582@unnumbered Index
583@printindex cp
584
585@iftex
586@contents
587@summarycontents
588@end iftex
589
590@bye
591
592@c Following comments are for the benefit of ispell.
593
594@c LocalWords: alist API APIs arg argc args argv asis assoc autoload Wisent
595@c LocalWords: bnf bovinate bovinates LALR
596@c LocalWords: bovinating bovination bovinator bucketize
597@c LocalWords: cb cdr charquote checkcache cindex CLOS
598@c LocalWords: concat concocting const ctxt Decl defcustom
599@c LocalWords: deffn deffnx defun defvar destructor's dfn diff dir
600@c LocalWords: doc docstring EDE EIEIO elisp emacsman emph enum
601@c LocalWords: eq Exp EXPANDFULL expression fn foo func funcall
602@c LocalWords: ia ids ifinfo imenu imenus init int isearch itemx java kbd
603@c LocalWords: keymap keywordtable lang languagemode lexer lexing Ludlam
604@c LocalWords: menubar metaparent metaparents min minibuffer Misc mode's
605@c LocalWords: multitable NAvigaTOR noindent nomedian nonterm noselect
606@c LocalWords: nosnarf obarray OLE OO outputfile paren parsetable POINT's
607@c LocalWords: popup positionalonly positiononly positionormarker pre
608@c LocalWords: printf printindex Programmatically pt quotemode
609@c LocalWords: ref regex regexp Regexps reparse resetfile samp sb
610@c LocalWords: scopestart SEmantic semanticdb setfilename setq
611@c LocalWords: settitle setupfunction sexp sp SPC speedbar speedbar's
612@c LocalWords: streamorbuffer struct subalist submenu submenus
613@c LocalWords: subsubsection sw sym texi texinfo titlefont titlepage
614@c LocalWords: tok TOKEN's toplevel typemodifiers uml unset untar
615@c LocalWords: uref usedb var vskip xref yak
616