\input texinfo
@setfilename ../../info/semantic.info
@set TITLE Semantic Manual
@set AUTHOR Eric M. Ludlam, David Ponce, and Richard Y. Kim
@settitle @value{TITLE}
@include docstyle.texi

@c *************************************************************************
@c @ Header
@c *************************************************************************

@c Merge all indexes into a single index for now.
@c We can always separate them later into two or more as needed.
@syncodeindex vr cp
@syncodeindex fn cp
@syncodeindex ky cp
@syncodeindex pg cp
@syncodeindex tp cp

@c @footnotestyle separate
@c @paragraphindent 2
@c @@smallbook
@c %**end of header

@copying
This manual documents the Semantic library and utilities.

Copyright @copyright{} 1999--2005, 2007, 2009--2021 Free Software
Foundation, Inc.

@quotation
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.3 or
any later version published by the Free Software Foundation; with no
Invariant Sections, with the Front-Cover Texts being ``A GNU Manual,''
and with the Back-Cover Texts as in (a) below.  A copy of the license
is included in the section entitled ``GNU Free Documentation License.''

(a) The FSF's Back-Cover Text is: ``You have the freedom to copy and
modify this GNU manual.''
@end quotation
@end copying

@dircategory Emacs misc features
@direntry
* Semantic: (semantic).         Source code parser library and utilities.
@end direntry

@titlepage
@center @titlefont{Semantic}
@sp 4
@center by @value{AUTHOR}
@page
@vskip 0pt plus 1filll
@insertcopying
@end titlepage
@page

@macro semantic{}
@i{Semantic}
@end macro

@macro keyword{kw}
@anchor{\kw\}
@b{\kw\}
@end macro

@macro obsolete{old,new}
@sp 1
@strong{Compatibility}:
@code{\new\} introduced in @semantic{} version 2.0 supersedes
@code{\old\} which is now obsolete.
@end macro

@c *************************************************************************
@c @ Document
@c *************************************************************************
@contents

@node top
@top @value{TITLE}

@semantic{} is a suite of Emacs libraries and utilities for parsing
source code.  At its core is a lexical analyzer and two parser
generators (@code{bovinator} and @code{wisent}) written in Emacs Lisp.
@semantic{} provides a variety of tools for making use of the parser
output, including user commands for code navigation and completion, as
well as enhancements for imenu, speedbar, whichfunc, eldoc,
hippie-expand, and several other parts of Emacs.

To send bug reports, or participate in discussions about semantic,
use the mailing list cedet-semantic@@sourceforge.net via the URL:
@url{http://lists.sourceforge.net/lists/listinfo/cedet-semantic}

@ifnottex
@insertcopying
@end ifnottex

@menu
* Introduction::
* Using Semantic::
* Semantic Internals::
* Glossary::
* GNU Free Documentation License::
* Index::
@end menu

@node Introduction
@chapter Introduction

This chapter gives an overview of @semantic{} and its goals.

Ordinarily, Emacs uses regular expressions (and syntax tables) to
analyze source code for purposes such as syntax highlighting.
This approach, though simple and efficient, has its limitations: roughly
speaking, it only ``guesses'' the meaning of each piece of source code
in the context of the programming language, instead of rigorously
``understanding'' it.

@semantic{} provides a new infrastructure to analyze source code using
@dfn{parsers} instead of regular expressions.  It contains two
built-in parser generators (an @acronym{LL} generator named
@code{Bovine} and an @acronym{LALR} generator named @code{Wisent},
both written in Emacs Lisp), and parsers for several common
programming languages.  It can also make use of @dfn{external
parsers}---programs such as GNU Global and GNU IDUtils.

@semantic{} provides a uniform, language-independent @acronym{API} for
accessing the parser output.  This output can be used by other Emacs
Lisp programs to implement ``syntax-aware'' behavior.  @semantic{}
itself includes several such utilities, including user-level Emacs
commands for navigating, searching, and completing source code.

The following diagram illustrates the structure of the @semantic{}
package:

@table @strong
@item Please Note:
Words in all capitals name components that @semantic{} itself provides.
The others are current or future languages or applications that are not
distributed along with @semantic{}.
@end table

@example
                                                             Applications
                                                                 and
                                                              Utilities
                                                               -------
                                                              /       \
               +---------------+    +--------+    +--------+
         C --->| C      PARSER |--->|        |    |        |
               +---------------+    |        |    |        |
               +---------------+    | COMMON |    | COMMON |<--- SPEEDBAR
      Java --->| JAVA   PARSER |--->| PARSE  |    |        |
               +---------------+    | TREE   |    | PARSE  |<--- SEMANTICDB
               +---------------+    | FORMAT |    | API    |
    Scheme --->| SCHEME PARSER |--->|        |    |        |<--- ecb
               +---------------+    |        |    |        |
               +---------------+    |        |    |        |
   Texinfo --->| TEXI.  PARSER |--->|        |    |        |
               +---------------+    |        |    |        |

                      ...               ...          ...         ...

               +---------------+    |        |    |        |
   Lang. Y --->| Y      Parser |--->|        |    |        |<--- app. ?
               +---------------+    |        |    |        |
               +---------------+    |        |    |        |<--- app. ?
   Lang. Z --->| Z      Parser |--->|        |    |        |
               +---------------+    +--------+    +--------+
@end example

@menu
* Semantic Components::
@end menu

@node Semantic Components
@section Semantic Components

In this section, we provide a more detailed description of the major
components of @semantic{}, and how they interact with one another.

The first step in parsing a source code file is to break it up into
its fundamental components.  This step is called lexical analysis:

@example
     syntax table, keywords list, and options
                         |
                         |
                         v
    input file  ---->  Lexer   ----> token stream
@end example

@noindent
The output of the lexical analyzer is a list of tokens that make up
the file.  The next step is the actual parsing, shown below:

@example
                   parser tables
                         |
                         v
    token stream --->  Parser  ----> parse tree
@end example

@noindent
The end result, the parse tree, is @semantic{}'s internal
representation of the language grammar.  @semantic{} provides an
@acronym{API} for Emacs Lisp programs to access the parse tree.

Parsing large files can take several seconds or more.  By default,
@semantic{} automatically caches parse trees by saving them in your
@file{.emacs.d} directory.  When you revisit a previously-parsed file,
the parse tree is automatically reloaded from this cache, to save
time.  @xref{SemanticDB}.

@node Using Semantic
@chapter Using Semantic

@include sem-user.texi

@node Semantic Internals
@chapter Semantic Internals

This chapter provides an overview of the internals of @semantic{}.
This information is usually not needed by application developers or
grammar developers; it is useful mostly for the hackers who would like
to learn more about how @semantic{} works.

@menu
* Parser code::                 Code used for the parsers
* Tag handling::                Code used for manipulating tags
* Semanticdb Internals::        Code used in the semantic database
* Analyzer Internals::          Code used in the code analyzer
* Tools::                       Code used in user tools
@ignore
* Tests::                       Code used for testing
@end ignore
@end menu

@node Parser code
@section Parser code

@semantic{} parsing code is spread across a range of files.

@table @file
@item semantic.el
The core infrastructure sets up buffers for parsing, and has all the
core parsing routines.  Most parsing routines are overloadable, so the
actual implementation may be somewhere else.

@item semantic/edit.el
Incremental reparse based on user edits.

@item semantic/grammar.el
@itemx semantic-grammar.wy
Parser for the different grammar languages, and a major mode for
editing grammars in Emacs.

@item semantic/lex.el
Infrastructure for implementing lexical analyzers.  Provides macros
for creating individual analyzers for specific features, and a way to
combine them together.

@item semantic/lex-spp.el
Infrastructure for a lexical symbolic preprocessor.  This was written
to implement the C preprocessor, but could be used for other lexical
preprocessors.

@item semantic/bovine.el
@itemx semantic/bovine/grammar.el
The ``bovine'' grammar.  This is the first grammar mode written for
@semantic{} and is useful for creating simple parsers.

@item semantic/wisent.el
@itemx semantic/wisent/wisent.el
@itemx semantic/wisent/grammar.el
A port of bison to Emacs.  This infrastructure lets you create
LALR-based parsers for @semantic{}.

@item semantic/debug.el
Infrastructure for debugging grammars.

@item semantic/util.el
Various utilities for manipulating tags, such as describing the tag
under point, adding labels, and the all-important
@code{semantic-something-to-tag-table}.

@end table

@node Tag handling
@section Tag handling

A tag represents an individual item found in a buffer, such as a
function or variable.  Tag handling is implemented across several
source files.

@table @file
@item semantic/tag.el
Basic tag creation, queries, cloning, binding, and unbinding.

@item semantic/tag-write.el
Write a tag or tag list to a stream.  These routines are used by
@file{semanticdb-file.el} when saving a list of tags.

@item semantic/tag-file.el
Files associated with tags.  Goto-tag, file for include, and file for
a prototype.

@item semantic/tag-ls.el
Language dependent features of a tag, such as parent calculation, slot
protection, and other states like abstract, virtual, static, and leaf.

@item semantic/dep.el
Include file handling.  Contains the include path concepts, and
routines for looking up file names in the include path.

@item semantic/format.el
Convert a tag into a nicely formatted and colored string.  Use
@code{semantic-test-all-format-tag-functions} to test different output
options.

@item semantic/find.el
Find tags matching different conditions in a tag table.
These routines are used by @file{semanticdb-find.el} once the database
has been converted into a simpler tag table.

@item semantic/sort.el
Sorting lists of tags in different ways.  Includes sorting a plain
list of tags forward or backward.  Includes binning tags based on
attributes (bucketize), and tag adoption for multiple references to
the same thing.

@item semantic/doc.el
Capture documentation comments from near a tag.

@end table

@node Semanticdb Internals
@section Semanticdb Internals

@acronym{Semanticdb} complexity is certainly an issue.  It is a rather
hairy problem to try and solve.

@table @file
@item semantic/db.el
Defines a @dfn{database} and a @dfn{table} base class.  You can
instantiate these classes, and use them, but they are not persistent.

This file also provides support for @code{semanticdb-minor-mode},
which automatically associates files with tables in databases so that
tags are @emph{saved} while a buffer is not in memory.

The database and tables both also provide applicable cache information,
and a cache flushing system.  The semanticdb search routines use caches
to save data structures that are complex to calculate.

Lastly, it provides the concept of @dfn{project root}.  It is a system
by which a file can be associated with the root of a project, so if
you have a tree of directories and source files, it can find the root,
and allow a tag-search to span all available databases in that
directory hierarchy.

@item semantic/db-file.el
Provides a subclass of the basic table so that it can be saved to
disk.  Implements all the code needed to unbind/rebind tags to a
buffer and to write them to a file.

@item semantic/db-el.el
Implements a special kind of @dfn{system} database that uses Emacs
internals to perform queries.

@item semantic/db-ebrowse.el
Implements a system database that uses Ebrowse to parse files into a
table that can be queried for tag names.  Successful tag hits during a
find cause @semantic{} to pick up and parse the reference files to
get the full details.

@item semantic/db-find.el
Infrastructure for searching groups of @semantic{} databases, and
dealing with the search results format.

@item semantic/db-ref.el
Tracks cross-references.
Cross-references are needed when a buffer is reparsed, so that other
tables can be alerted that any dependent caches may need to be
flushed.  References are in the form of include files.

@end table

@node Analyzer Internals
@section Analyzer Internals

The @semantic{} analyzer is a complex engine which has been broken
down across several modules.  When the @semantic{} analyzer fails,
start with @code{semantic-analyze-debug-assist}, then dive into some
of these files.

@table @file
@item semantic/analyze.el
The core analyzer for defining the @dfn{current context}.  The
current context is an object that contains references to aspects of
the local context including the current prefix, and a tag list
defining what the prefix means.

@item semantic/analyze/complete.el
Provides @code{semantic-analyze-possible-completions}.

@item semantic/analyze/debug.el
The analyzer debugger.  Useful when attempting to get everything
configured.

@item semantic/analyze/fcn.el
Various support functions needed by the analyzer.

@item semantic/ctxt.el
Local context parser.  Contains overloadable functions used to move
around through different scopes, get local variables, and collect the
current prefix used when doing completion.

@item semantic/scope.el
Calculate @dfn{scope} for a location in a buffer.  The scope includes
local variables, and tag lists in scope for various reasons, such as
C++ using statements.

@item semantic/db-typecache.el
The typecache is part of @code{semanticdb}, but is used primarily by
the analyzer to look up datatypes and complex names.  The typecache is
bound across source files and builds a master lookup table for data
type names.

@item semantic/ia.el
Interactive Analyzer functions.  Simple routines that do completion or
lookups based on the results from the Analyzer.
These routines are meant as examples for application writers, but are
quite useful as they are.

@item semantic/ia-sb.el
Speedbar support for the analyzer, displaying context info, and
completion lists.

@end table

@node Tools
@section Tools

These files contain various tools for users.

@table @file
@item semantic/idle.el
Idle scheduler for @semantic{}.  Manages reparsing buffers after
edits, and large work tasks in idle time.  Includes modes for showing
summary help and pop-up completion.

@item semantic/senator.el
The @semantic{} navigator.  Provides many ways to move through a
buffer based on the active tag table.

@item semantic/decorate.el
A minor mode for decorating tags based on details from the parser.
Includes overlines for functions, and coloring of class fields based
on protection.

@item semantic/decorate/include.el
A decoration mode for include files, which assists users in setting up
parsing for their includes.

@item semantic/complete.el
Advanced completion prompts for reading tag names in the minibuffer, or
inline in a buffer.

@item semantic/imenu.el
Imenu support for using @semantic{} tags in imenu.

@item semantic/mru-bookmark.el
Automatic bookmarking based on tags.  Jump to locations you have
visited before based on tag name.

@item semantic/sb.el
Support for @semantic{} tag usage in Speedbar.

@item semantic/util-modes.el
A collection of small minor modes that expose aspects of the
@semantic{} parser state.  Includes @code{semantic-stickyfunc-mode}.

@item semantic/chart.el
Draw some charts from stats generated from parsing.

@end table

@c These files seem to not have been imported from CEDET.
@ignore
@node Tests
@section Tests

@table @file

@item semantic-utest.el
Basic testing of parsing and incremental parsing for most supported
languages.

@item semantic-ia-utest.el
Test the semantic analyzer's ability to provide smart completions.

@item semantic-utest-c.el
Tests for the C parser's lexical pre-processor.

@item semantic-regtest.el
Regression tests from the older Semantic 1.x API.

@end table
@end ignore

@node Glossary
@appendix Glossary

@table @asis
@item BNF
In semantic 1.4, a BNF file represented ``Bovine Normal Form'', the
grammar file used for the 1.4 parser generator.  This was a play on
Backus-Naur Form which proved too confusing.

@item bovinate
A verb representing what happens when a bovine parser parses a file.

@item bovine lambda
In a bovine, or LL parser, the bovine lambda is a function to execute
when a specific set of match rules has succeeded in matching text from
the buffer.

@item bovine parser
A parser using the bovine parser generator.  It is an LL parser
suitable for small simple languages.

@item context

@item LALR

@item lexer
A program which converts text into a stream of tokens by analyzing
it lexically.  Lexers will commonly create strings, symbols,
keywords and punctuation, and strip whitespace and comments.

@item LL

@item nonterminal
A nonterminal symbol or simply a nonterminal stands for a class of
syntactically equivalent groupings.  A nonterminal symbol name is used
in writing grammar rules.

@item overloadable
Some functions are defined via @code{define-overload}.
These can be overloaded via ....

@item parser
A program that converts @b{tokens} to @b{tags}.

@item tag
A tag is a representation of some entity in a language file, such as a
function, variable, or include statement.  In semantic, the word tag is
used the same way it is used for the etags or ctags tools.

A tag is usually bound to a buffer region via overlay, or it just
specifies character locations in a file.

@item token
A single atomic item returned from a lexer.  It represents some set
of characters found in a buffer.

@item token stream
The output of the lexer as well as the input to the parser.

@item wisent parser
A parser using the wisent parser generator.  It is a port of bison to
Emacs Lisp.  It is an LALR parser suitable for complex languages.
@end table


@node GNU Free Documentation License
@appendix GNU Free Documentation License
@include doclicense.texi

@node Index
@unnumbered Index
@printindex cp

@iftex
@contents
@summarycontents
@end iftex

@bye

@c Following comments are for the benefit of ispell.

@c LocalWords:  alist API APIs arg argc args argv asis assoc autoload Wisent
@c LocalWords:  bnf bovinate bovinates LALR
@c LocalWords:  bovinating bovination bovinator bucketize
@c LocalWords:  cb cdr charquote checkcache cindex CLOS
@c LocalWords:  concat concocting const ctxt Decl defcustom
@c LocalWords:  deffn deffnx defun defvar destructor's dfn diff dir
@c LocalWords:  doc docstring EDE EIEIO elisp emacsman emph enum
@c LocalWords:  eq Exp EXPANDFULL expression fn foo func funcall
@c LocalWords:  ia ids ifinfo imenu imenus init int isearch itemx java kbd
@c LocalWords:  keymap keywordtable lang languagemode lexer lexing Ludlam
@c LocalWords:  menubar metaparent metaparents min minibuffer Misc mode's
@c LocalWords:  multitable NAvigaTOR noindent nomedian nonterm noselect
@c LocalWords:  nosnarf obarray OLE OO outputfile paren parsetable POINT's
@c LocalWords:  popup positionalonly positiononly positionormarker pre
@c LocalWords:  printf printindex Programmatically pt quotemode
@c LocalWords:  ref regex regexp Regexps reparse resetfile samp sb
@c LocalWords:  scopestart SEmantic semanticdb setfilename setq
@c LocalWords:  settitle setupfunction sexp sp SPC speedbar speedbar's
@c LocalWords:  streamorbuffer struct subalist
@c LocalWords:  submenu submenus
@c LocalWords:  subsubsection sw sym texi texinfo titlefont titlepage
@c LocalWords:  tok TOKEN's toplevel typemodifiers uml unset untar
@c LocalWords:  uref usedb var vskip xref yak