README

Introduction
============

LLnextgen is a (partial) reimplementation of the LLgen Extended-LL(1) parser
generator [http://www.cs.vu.nl/~ceriel/LLgen.html] created by D. Grune and
C.J.H. Jacobs which is part of the Amsterdam Compiler Kit (ACK). LLnextgen is
licensed under the GNU General Public License version 3. See the file COPYING
for details. Alternatively, see <http://www.gnu.org/licenses/>.

Note: To add to the confusion, there exists or existed another program called
LLgen, which is an LL(1) parser generator. It was created by Fischer and
LeBlanc.

Motivation
==========

I like the ideas embodied in the LLgen program and I find the way to specify
grammars easy and intuitive. However, it turns out LLgen contains a number of
more and less serious bugs that make it annoying to work with.

One option of course was to fix the LLgen program, but it turned out that it
was not written with maintainability in mind. Furthermore, it was written in
a time when memory was expensive and therefore limited. This results in a
number of hacks that complicate maintenance even further. Thus, I decided to
do a rewrite. The rewrite also allowed several features to be implemented
which LLgen was missing (in my opinion anyway).

Compatibility (issues)
======================

At this time the basic LLgen functionality is implemented. This includes
everything apart from the extended user error-handling with the %onerror
directive and the non-correcting error-recovery.

Although I've tried to copy the behaviour of LLgen accurately, I have
implemented some aspects slightly differently. The following is a list of the
differences in behaviour between LLgen and LLnextgen:
- LLgen generated both K&R style C code and ANSI C code. LLnextgen only
  supports generation of ANSI C code.
- There is a minor difference in the determination of the default choices.
  LLnextgen simply chooses the first production with the shortest possible
  terminal production, while LLgen also takes the complexity in terms of
  non-terminals and terms into account. There is also a minor difference when
  there is more than one shortest alternative and some of them are marked with
  %avoid. Neither difference is very important, as the user can specify
  which alternative should be the default, thereby circumventing the
  differences in the algorithms.
- The default behaviour of generating one output C file per input and Lpars.c
  and Lpars.h has been changed in favour of generating one .c file and one .h
  file. The rationale given for creating multiple output files in the first
  place was that it would reduce the compilation time for the generated
  parser. As computation power has become much more abundant this feature is
  no longer necessary, and the difficult interaction with the make program
  makes it undesirable. The LLgen behaviour is still supported through a
  command-line switch.
- In LLgen one could have a parser and a %first macro with the same name.
  LLnextgen forbids this, as it leads to name collisions in the new file
  naming scheme. For the old LLgen file naming scheme it could also easily
  lead to name collisions, although they could be circumvented by not mentioning
  the parser in any of the C code in the .g files.
- LLgen names the labels it generates L_X, where X is a number. LLnextgen names
  these LL_X.
- LLgen parsers are always reentrant. As this feature is hardly ever used,
  LLnextgen parsers are non-reentrant unless the option --reentrant is used.

Extra features
==============

LLnextgen incorporates a number of features that were not available in the
LLgen program:

- Tracing of conflicts. LLgen can only indicate where a conflict is detected,
  but not where it is caused. As the cause may be in a seemingly unrelated
  rule, conflicts can be very hard to find. LLnextgen can trace the cause of
  conflicts, making it much easier to resolve them.
- Automatic token buffering. LLgen and LLnextgen require that the token last
  retrieved from the lexical analyser is returned again after a parse error
  was detected. Most lexical analysers do not provide this feature, and LLgen
  users are required to do this themselves. As this almost always leads to the
  same code, LLnextgen can provide this code itself, or can be asked to print
  the default code to standard output as a basis for modifications.
- A symbol table can be auto-generated if the needed information is supplied.
- A default LLmessage routine can be generated (if the auto-generated symbol
  table is used), or alternatively sent to the standard output.
- The limitation of the maximum file-name length in LLgen has been removed.
- A command-line switch is provided that makes LLnextgen as compatible with
  LLgen as possible, as well as a number of switches that can turn on separate
  compatibility aspects.
- Separating parameters in non-terminal headers can now be done with commas.
  LLnextgen will issue a warning about using a semi-colon to separate
  parameters. This warning can be suppressed with a command-line switch.
- File inclusion is possible through the %include directive. Dependency
  information can be generated for use in Makefiles.
- Command line options can be set in the grammar itself through %options.
- The parser can be stopped through the LLabort() call, if it has been enabled.
- Thread-safe parsers.
- Return values for non-terminals.
- An extra repetition operator to specify that the last element in a
  repeating term is optional for the last repetition of that term.

Several other features are planned. See the file TODO for details.

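To illustrate the token buffering described in the feature list above, here is
a minimal sketch in C of a one-token buffer placed between a parser and its
lexical analyser. It is not the code that LLnextgen generates. The names
raw_lex, buffered_lex and retry_last_token, the token codes and the fixed
token sequence are all hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical token codes, chosen for this illustration only. */
enum { TOK_EOF = 0, TOK_NUMBER = 256, TOK_PLUS = 257 };

/* Stand-in for a real lexical analyser: yields a fixed token sequence. */
static const int demo_tokens[] = { TOK_NUMBER, TOK_PLUS, TOK_NUMBER };
static size_t demo_pos = 0;

static int raw_lex(void) {
    if (demo_pos < sizeof demo_tokens / sizeof demo_tokens[0])
        return demo_tokens[demo_pos++];
    return TOK_EOF;
}

/* State of the one-token buffer. */
static int last_token = TOK_EOF;
static int have_token = 0; /* set once a token has been read */
static int replay = 0;     /* set when the last token must be returned again */

/* The parser calls this instead of calling raw_lex() directly. */
int buffered_lex(void) {
    if (replay) {
        replay = 0;
        return last_token;
    }
    last_token = raw_lex();
    have_token = 1;
    return last_token;
}

/* An error handler calls this to have the last token returned once more. */
void retry_last_token(void) {
    assert(have_token); /* nothing to replay before the first read */
    replay = 1;
}
```

In this sketch, a call to retry_last_token() after a parse error makes the
next buffered_lex() call return the same token as the previous one, after
which reading continues with the underlying lexical analyser.
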
Prerequisites and installation
==============================

LLnextgen is written in pure ANSI C, so most C compilers should have little
trouble compiling it. From version 0.3.0, LLnextgen has optional support for
regular expression matching. As this is not part of the ANSI C specification,
a mechanism has been introduced to allow automatic testing for POSIX regular
expression availability. Therefore, there are three different ways to compile
LLnextgen:

Using the configure script:
---

$ ./configure
or
$ ./configure --prefix=/usr
(see ./configure --help for more tuning options)
$ make all
$ make install
(assumes working install program)

Manually editing the Makefile to suit your installation:
---

$ cp Makefile.in Makefile
Edit the values for REGEX, REGEXLIBS and prefix
$ make all
$ make install
(assumes working install program)

Manually compiling LLnextgen:
---

$ cd src
$ cp lexer.c.dist lexer.c
$ cp grammar.c.dist grammar.c
$ cp grammar.h.dist grammar.h
$ cc -o LLnextgen *.c
or to compile with regular expression support, add
-DREGEX={POSIX,OLDPOSIX,PCRE} and if required -l{regex,pcreposix} to your
compiler command line (see Makefile.in for details on the values). After this
your LLnextgen executable is done, and all that is left to do is to install
it, and the documentation, into the target directories.

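Before choosing a value for REGEX by hand, it may help to check that the
POSIX regular expression functions are usable with your compiler. The
following is a minimal sketch of such a check, assuming a system that provides
<regex.h>. It is not the test used by the configure script, and the function
name posix_regex_matches is hypothetical. Only the standard POSIX calls
regcomp, regexec and regfree are used.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Returns 1 if the extended regular expression matches the text, 0 if it
   does not, and -1 if the pattern could not be compiled. */
int posix_regex_matches(const char *pattern, const char *text) {
    regex_t re;
    int matched;

    assert(pattern != NULL && text != NULL);

    /* REG_NOSUB: only a yes/no answer is needed, no match offsets. */
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;

    matched = (regexec(&re, text, 0, NULL, 0) == 0);
    regfree(&re);
    return matched;
}
```

If a small program calling this function compiles and links (on some systems
only after adding a library such as -lregex), POSIX regular expressions are
available to the compiler.
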
Remarks:
---

LLnextgen is known to compile and work on several flavours of Un*x, on both
32 and 64 bit platforms and on Windows. For compilation on Windows with MS
Visual C++, the Makefile.win32 file is provided for use with nmake.exe.

Reporting bugs
==============

If you think you have found a bug, please check that you are using the latest
version of LLnextgen [http://os.ghalkes.nl/LLnextgen]. When reporting bugs,
please include a minimal grammar that demonstrates the problem.

Author
======

Gertjan Halkes <llnextgen@ghalkes.nl>