1Version 3.10
2---------------------
301/31/17: beazley
4          Changed grammar signature computation to not involve hashing
5          functions. Parts are just combined into a big string.
6
710/07/16: beazley
8          Fixed Issue #101: Incorrect shift-reduce conflict resolution with
9          precedence specifier.
10
11          PLY was incorrectly resolving shift-reduce conflicts in certain
12          cases.  For example, in the example/calc/calc.py example, you
13          could trigger it doing this:
14
15          calc > -3 - 4
16          1                         (correct answer should be -7)
17          calc >
18
19          Issue and suggested patch contributed by https://github.com/RomaVis
20
21Version 3.9
22---------------------
2308/30/16: beazley
24          Exposed the parser state number as the parser.state attribute
25          in productions and error functions. For example:
26
27          def p_somerule(p):
28              '''
29              rule : A B C
30              '''
31              print('State:', p.parser.state)
32
33          May address issue #65 (publish current state in error callback).
34
3508/30/16: beazley
36          Fixed Issue #88. Python3 compatibility with ply/cpp.
37
3808/30/16: beazley
39          Fixed Issue #93. Ply can crash if SyntaxError is raised inside
40          a production.   Not actually sure if the original implementation
41          worked as documented at all.  Yacc has been modified to follow
42          the spec as outlined in the CHANGES noted for 11/27/07 below.
43
4408/30/16: beazley
45          Fixed Issue #97. Failure with code validation when the original
46          source files aren't present.   Validation step now ignores
47          the missing file.
48
4908/30/16: beazley
50          Minor fixes to version numbers.
51
52Version 3.8
53---------------------
5410/02/15: beazley
55          Fixed issues related to Python 3.5. Patch contributed by Barry Warsaw.
56
57Version 3.7
58---------------------
5908/25/15: beazley
60          Fixed problems when reading table files from pickled data.
61
6205/07/15: beazley
63          Fixed regression in handling of table modules if specified as module
64          objects.   See https://github.com/dabeaz/ply/issues/63
65
66Version 3.6
67---------------------
6804/25/15: beazley
69          If PLY is unable to create the 'parser.out' or 'parsetab.py' files due
70          to permission issues, it now just issues a warning message and
71          continues to operate. This could happen if a module using PLY
72	  is installed in a funny way where tables have to be regenerated, but
73          for whatever reason, the user doesn't have write permission on
74          the directory where PLY wants to put them.
75
7604/24/15: beazley
77          Fixed some issues related to use of packages and table file
78          modules.  Just to emphasize, PLY now generates its special
79          files such as 'parsetab.py' and 'lextab.py' in the *SAME*
80          directory as the source file that uses lex() and yacc().
81
82	  If for some reason, you want to change the name of the table
83          module, use the tabmodule and lextab options:
84
85             lexer = lex.lex(lextab='spamlextab')
86             parser = yacc.yacc(tabmodule='spamparsetab')
87
88          If you specify a simple name as shown, the module will still be
89          created in the same directory as the file invoking lex() or yacc().
90          If you want the table files to be placed into a different package,
91          then give a fully qualified package name.  For example:
92
93             lexer = lex.lex(lextab='pkgname.files.lextab')
94             parser = yacc.yacc(tabmodule='pkgname.files.parsetab')
95
96          For this to work, 'pkgname.files' must already exist as a valid
97          Python package (i.e., the directories must already exist and be
98          set up with the proper __init__.py files, etc.).
99
100Version 3.5
101---------------------
10204/21/15: beazley
103          Added support for defaulted_states in the parser.  A
104          defaulted_state is a state where the only legal action is a
105          reduction of a single grammar rule across all valid input
106          tokens.  For such states, the rule is reduced and the
107          reading of the next lookahead token is delayed until it is
108          actually needed at a later point in time.
109
110	  This delay in consuming the next lookahead token is a
111	  potentially important feature in advanced parsing
112	  applications that require tight interaction between the
	  lexer and the parser.  For example, a grammar rule might
	  modify the lexer state upon reduction and have such changes
	  take effect before the next input token is read.
116
117	  *** POTENTIAL INCOMPATIBILITY ***
118	  One potential danger of defaulted_states is that syntax
	  errors might be deferred to a later point of processing
120	  than where they were detected in past versions of PLY.
121	  Thus, it's possible that your error handling could change
122	  slightly on the same inputs.  defaulted_states do not change
123	  the overall parsing of the input (i.e., the same grammar is
124	  accepted).
125
126	  If for some reason, you need to disable defaulted states,
127	  you can do this:
128
129              parser = yacc.yacc()
130              parser.defaulted_states = {}
131
13204/21/15: beazley
133          Fixed debug logging in the parser.  It wasn't properly reporting goto states
134          on grammar rule reductions.
135
13604/20/15: beazley
          Added the ability for actions to be defined for character literals
          (Issue #32).  For example:
138
139              literals = [ '{', '}' ]
140
141              def t_lbrace(t):
142                  r'\{'
143                  # Some action
144                  t.type = '{'
145                  return t
146
147              def t_rbrace(t):
148                  r'\}'
149                  # Some action
150                  t.type = '}'
151                  return t
152
15304/19/15: beazley
154          Import of the 'parsetab.py' file is now constrained to only consider the
155          directory specified by the outputdir argument to yacc().  If not supplied,
156          the import will only consider the directory in which the grammar is defined.
157          This should greatly reduce problems with the wrong parsetab.py file being
158          imported by mistake. For example, if it's found somewhere else on the path
159          by accident.
160
161	  *** POTENTIAL INCOMPATIBILITY ***  It's possible that this might break some
162          packaging/deployment setup if PLY was instructed to place its parsetab.py
163          in a different location.  You'll have to specify a proper outputdir= argument
164          to yacc() to fix this if needed.
165
16604/19/15: beazley
167          Changed default output directory to be the same as that in which the
168          yacc grammar is defined.  If your grammar is in a file 'calc.py',
169          then the parsetab.py and parser.out files should be generated in the
170          same directory as that file.  The destination directory can be changed
171          using the outputdir= argument to yacc().
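
          For example (a minimal sketch; the directory name is hypothetical):

             parser = yacc.yacc(outputdir='build')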
172
17304/19/15: beazley
174          Changed the parsetab.py file signature slightly so that the parsetab won't
          regenerate if created on a different major version of Python (i.e., a
          parsetab created on Python 2 will work with Python 3).
177
17804/16/15: beazley
          Fixed Issue #44: call_errorfunc() should return the result of errorfunc().
180
18104/16/15: beazley
182          Support for versions of Python <2.7 is officially dropped.  PLY may work, but
          the unit tests require Python 2.7 or newer.
184
18504/16/15: beazley
186          Fixed bug related to calling yacc(start=...).   PLY wasn't regenerating the
187          table file correctly for this case.
188
18904/16/15: beazley
190          Added skipped tests for PyPy and Java.  Related to use of Python's -O option.
191
19205/29/13: beazley
193          Added filter to make unit tests pass under 'python -3'.
194          Reported by Neil Muller.
195
19605/29/13: beazley
197          Fixed CPP_INTEGER regex in ply/cpp.py (Issue 21).
198	  Reported by @vbraun.
199
20005/29/13: beazley
201          Fixed yacc validation bugs when from __future__ import unicode_literals
202          is being used.  Reported by Kenn Knowles.
203
20405/29/13: beazley
205          Added support for Travis-CI.  Contributed by Kenn Knowles.
206
20705/29/13: beazley
208          Added a .gitignore file.  Suggested by Kenn Knowles.
209
21005/29/13: beazley
211	  Fixed validation problems for source files that include a
212          different source code encoding specifier.  Fix relies on
213          the inspect module.  Should work on Python 2.6 and newer.
214          Not sure about older versions of Python.
215          Contributed by Michael Droettboom
216
21705/21/13: beazley
218          Fixed unit tests for yacc to eliminate random failures due to dict hash value
219	  randomization in Python 3.3
220	  Reported by Arfrever
221
22210/15/12: beazley
223          Fixed comment whitespace processing bugs in ply/cpp.py.
224          Reported by Alexei Pososin.
225
22610/15/12: beazley
227          Fixed token names in ply/ctokens.py to match rule names.
228          Reported by Alexei Pososin.
229
23004/26/12: beazley
          Changes to functions available in panic mode error recovery.  In previous versions
232          of PLY, the following global functions were available for use in the p_error() rule:
233
234                 yacc.errok()       # Reset error state
235                 yacc.token()       # Get the next token
236                 yacc.restart()     # Reset the parsing stack
237
238          The use of global variables was problematic for code involving multiple parsers
239          and frankly was a poor design overall.   These functions have been moved to methods
240          of the parser instance created by the yacc() function.   You should write code like
241          this:
242
243                def p_error(p):
244                    ...
245                    parser.errok()
246
247                parser = yacc.yacc()
248
249          *** POTENTIAL INCOMPATIBILITY ***  The original global functions now issue a
250          DeprecationWarning.
251
25204/19/12: beazley
253          Fixed some problems with line and position tracking and the use of error
254          symbols.   If you have a grammar rule involving an error rule like this:
255
256               def p_assignment_bad(p):
257                   '''assignment : location EQUALS error SEMI'''
258                   ...
259
260          You can now do line and position tracking on the error token.  For example:
261
262               def p_assignment_bad(p):
263                   '''assignment : location EQUALS error SEMI'''
264                   start_line = p.lineno(3)
265                   start_pos  = p.lexpos(3)
266
          If the tracking=True option is supplied to parse(), you can additionally get
268          spans:
269
270               def p_assignment_bad(p):
271                   '''assignment : location EQUALS error SEMI'''
272                   start_line, end_line = p.linespan(3)
273                   start_pos, end_pos = p.lexspan(3)
274
275          Note that error handling is still a hairy thing in PLY. This won't work
276          unless your lexer is providing accurate information.   Please report bugs.
277          Suggested by a bug reported by Davis Herring.
278
27904/18/12: beazley
280          Change to doc string handling in lex module.  Regex patterns are now first
281          pulled from a function's .regex attribute.  If that doesn't exist, then
          .__doc__ is checked as a fallback.   The @TOKEN decorator now sets the .regex
          attribute of a function instead of its doc string.
          Change suggested by Kristoffer Ellersgaard Koch.
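
          For example, the new behavior can be seen in a sketch like this
          (the NUMBER token is made up):

              from ply.lex import TOKEN

              @TOKEN(r'\d+')
              def t_NUMBER(t):
                  t.value = int(t.value)
                  return t

              # The pattern is now stored on the function's .regex
              # attribute rather than in its doc string.
              assert t_NUMBER.regex == r'\d+'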
285
28604/18/12: beazley
          Fixed issue #1: _tabversion should use __tabversion__ instead of __version__.
          Reported by Daniele Tricoli.
289
29004/18/12: beazley
291          Fixed issue #8: Literals empty list causes IndexError
292          Reported by Walter Nissen.
293
29404/18/12: beazley
295          Fixed issue #12: Typo in code snippet in documentation
296          Reported by florianschanda.
297
29804/18/12: beazley
299          Fixed issue #10: Correctly escape t_XOREQUAL pattern.
300          Reported by Andy Kittner.
301
302Version 3.4
303---------------------
30402/17/11: beazley
305          Minor patch to make cpp.py compatible with Python 3.  Note: This
306          is an experimental file not currently used by the rest of PLY.
307
30802/17/11: beazley
309          Fixed setup.py trove classifiers to properly list PLY as
310          Python 3 compatible.
311
31201/02/11: beazley
313          Migration of repository to github.
314
315Version 3.3
316-----------------------------
31708/25/09: beazley
318          Fixed issue 15 related to the set_lineno() method in yacc.  Reported by
319	  mdsherry.
320
32108/25/09: beazley
322          Fixed a bug related to regular expression compilation flags not being
323          properly stored in lextab.py files created by the lexer when running
324          in optimize mode.  Reported by Bruce Frederiksen.
325
326
327Version 3.2
328-----------------------------
32903/24/09: beazley
330          Added an extra check to not print duplicated warning messages
331          about reduce/reduce conflicts.
332
33303/24/09: beazley
          Switched PLY over to the BSD license.
335
33603/23/09: beazley
337          Performance optimization.  Discovered a few places to make
338          speedups in LR table generation.
339
34003/23/09: beazley
341          New warning message.  PLY now warns about rules never
342          reduced due to reduce/reduce conflicts.  Suggested by
343          Bruce Frederiksen.
344
34503/23/09: beazley
346          Some clean-up of warning messages related to reduce/reduce errors.
347
34803/23/09: beazley
349          Added a new picklefile option to yacc() to write the parsing
350          tables to a filename using the pickle module.   Here is how
351          it works:
352
353              yacc(picklefile="parsetab.p")
354
355          This option can be used if the normal parsetab.py file is
          extremely large.  For example, on Jython, it is impossible
357          to read parsing tables if the parsetab.py exceeds a certain
358          threshold.
359
360          The filename supplied to the picklefile option is opened
361          relative to the current working directory of the Python
362          interpreter.  If you need to refer to the file elsewhere,
363          you will need to supply an absolute or relative path.
364
365          For maximum portability, the pickle file is written
366          using protocol 0.
367
36803/13/09: beazley
369          Fixed a bug in parser.out generation where the rule numbers
          were off by one.
371
37203/13/09: beazley
373          Fixed a string formatting bug with one of the error messages.
374          Reported by Richard Reitmeyer
375
376Version 3.1
377-----------------------------
37802/28/09: beazley
379          Fixed broken start argument to yacc().  PLY-3.0 broke this
380          feature by accident.
381
38202/28/09: beazley
383          Fixed debugging output. yacc() no longer reports shift/reduce
384          or reduce/reduce conflicts if debugging is turned off.  This
385          restores similar behavior in PLY-2.5.   Reported by Andrew Waters.
386
387Version 3.0
388-----------------------------
38902/03/09: beazley
390          Fixed missing lexer attribute on certain tokens when
391          invoking the parser p_error() function.  Reported by
392          Bart Whiteley.
393
39402/02/09: beazley
          The lex() command now does all error-reporting and diagnostics
396          using the logging module interface.   Pass in a Logger object
397          using the errorlog parameter to specify a different logger.
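
          For example (a minimal sketch):

              import logging
              lexer = lex.lex(errorlog=logging.getLogger('ply.lex'))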
398
39902/02/09: beazley
400          Refactored ply.lex to use a more object-oriented and organized
401          approach to collecting lexer information.
402
40302/01/09: beazley
404          Removed the nowarn option from lex().  All output is controlled
405          by passing in a logger object.   Just pass in a logger with a high
406          level setting to suppress output.   This argument was never
407          documented to begin with so hopefully no one was relying upon it.
408
40902/01/09: beazley
410          Discovered and removed a dead if-statement in the lexer.  This
411          resulted in a 6-7% speedup in lexing when I tested it.
412
41301/13/09: beazley
414          Minor change to the procedure for signalling a syntax error in a
415          production rule.  A normal SyntaxError exception should be raised
416          instead of yacc.SyntaxError.
417
41801/13/09: beazley
419          Added a new method p.set_lineno(n,lineno) that can be used to set the
420          line number of symbol n in grammar rules.   This simplifies manual
421          tracking of line numbers.
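
          For example (a minimal sketch; the rule and tokens are made up):

              def p_expr_plus(p):
                  'expr : expr PLUS expr'
                  p[0] = p[1] + p[3]
                  # Give the result the line number of the left operand
                  p.set_lineno(0, p.lineno(1))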
422
42301/11/09: beazley
424          Vastly improved debugging support for yacc.parse().   Instead of passing
425          debug as an integer, you can supply a Logging object (see the logging
426          module). Messages will be generated at the ERROR, INFO, and DEBUG
427	  logging levels, each level providing progressively more information.
428          The debugging trace also shows states, grammar rule, values passed
429          into grammar rules, and the result of each reduction.
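
          For example, to turn on a full debugging trace (a minimal sketch;
          'data' stands in for the input text):

              import logging
              logging.basicConfig(level=logging.DEBUG)
              yacc.parse(data, debug=logging.getLogger())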
430
43101/09/09: beazley
432          The yacc() command now does all error-reporting and diagnostics using
433          the interface of the logging module.  Use the errorlog parameter to
434          specify a logging object for error messages.  Use the debuglog parameter
435          to specify a logging object for the 'parser.out' output.
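
          For example (a minimal sketch):

              import logging
              log = logging.getLogger('ply')
              parser = yacc.yacc(errorlog=log, debuglog=log)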
436
43701/09/09: beazley
          *HUGE* refactoring of the ply.yacc() implementation.   The high-level
439	  user interface is backwards compatible, but the internals are completely
440          reorganized into classes.  No more global variables.    The internals
441          are also more extensible.  For example, you can use the classes to
442          construct a LALR(1) parser in an entirely different manner than
443          what is currently the case.  Documentation is forthcoming.
444
44501/07/09: beazley
446          Various cleanup and refactoring of yacc internals.
447
44801/06/09: beazley
          Fixed a bug with precedence assignment.  yacc was assigning the precedence
          of each rule based on the left-most token, when in fact, it should have been
451          using the right-most token.  Reported by Bruce Frederiksen.
452
45311/27/08: beazley
454          Numerous changes to support Python 3.0 including removal of deprecated
          statements (e.g., has_key) and the addition of compatibility code
456          to emulate features from Python 2 that have been removed, but which
457          are needed.   Fixed the unit testing suite to work with Python 3.0.
458          The code should be backwards compatible with Python 2.
459
46011/26/08: beazley
461          Loosened the rules on what kind of objects can be passed in as the
462          "module" parameter to lex() and yacc().  Previously, you could only use
463          a module or an instance.  Now, PLY just uses dir() to get a list of
464          symbols on whatever the object is without regard for its type.
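
          For example, a lexer specification can now live on an ordinary
          object instance (a minimal sketch):

              class LexerSpec(object):
                  tokens = ('NUMBER',)
                  t_NUMBER = r'\d+'
                  t_ignore = ' \t'
                  def t_error(self, t):
                      t.lexer.skip(1)

              lexer = lex.lex(module=LexerSpec())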
465
46611/26/08: beazley
467          Changed all except: statements to be compatible with Python2.x/3.x syntax.
468
46911/26/08: beazley
470          Changed all raise Exception, value statements to raise Exception(value) for
471          forward compatibility.
472
47311/26/08: beazley
474          Removed all print statements from lex and yacc, using sys.stdout and sys.stderr
475          directly.  Preparation for Python 3.0 support.
476
47711/04/08: beazley
          Fixed a bug with referring to symbols on the parsing stack using negative
479          indices.
480
48105/29/08: beazley
482          Completely revamped the testing system to use the unittest module for everything.
483          Added additional tests to cover new errors/warnings.
484
485Version 2.5
486-----------------------------
48705/28/08: beazley
488          Fixed a bug with writing lex-tables in optimized mode and start states.
489          Reported by Kevin Henry.
490
491Version 2.4
492-----------------------------
49305/04/08: beazley
494          A version number is now embedded in the table file signature so that
          yacc can more gracefully accommodate changes to the output format
496          in the future.
497
49805/04/08: beazley
499          Removed undocumented .pushback() method on grammar productions.  I'm
500          not sure this ever worked and can't recall ever using it.  Might have
501          been an abandoned idea that never really got fleshed out.  This
502          feature was never described or tested so removing it is hopefully
503          harmless.
504
50505/04/08: beazley
506          Added extra error checking to yacc() to detect precedence rules defined
507          for undefined terminal symbols.   This allows yacc() to detect a potential
508          problem that can be really tricky to debug if no warning message or error
509          message is generated about it.
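
          For example, a specification like the following is now flagged
          ('UMNIUS' being a hypothetical misspelling of a defined UMINUS
          token):

             precedence = (
                 ('left', 'PLUS', 'MINUS'),
                 ('right', 'UMNIUS'),    # typo: no such terminal is defined
             )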
510
51105/04/08: beazley
          lex() now has an outputdir option that can specify the output directory for
513          tables when running in optimize mode.  For example:
514
515             lexer = lex.lex(optimize=True, lextab="ltab", outputdir="foo/bar")
516
517          The behavior of specifying a table module and output directory are
518          more aligned with the behavior of yacc().
519
52005/04/08: beazley
521          [Issue 9]
          Fixed a filename bug when specifying the modulename in lex() and yacc().
523          If you specified options such as the following:
524
525             parser = yacc.yacc(tabmodule="foo.bar.parsetab",outputdir="foo/bar")
526
527          yacc would create a file "foo.bar.parsetab.py" in the given directory.
528          Now, it simply generates a file "parsetab.py" in that directory.
529          Bug reported by cptbinho.
530
53105/04/08: beazley
532          Slight modification to lex() and yacc() to allow their table files
533	  to be loaded from a previously loaded module.   This might make
534	  it easier to load the parsing tables from a complicated package
535          structure.  For example:
536
537	       import foo.bar.spam.parsetab as parsetab
538               parser = yacc.yacc(tabmodule=parsetab)
539
540          Note:  lex and yacc will never regenerate the table file if used
          in this form---you will get a warning message instead.
542          This idea suggested by Brian Clapper.
543
544
54504/28/08: beazley
          Fixed a bug with p_error() functions not being picked up correctly
547          when running in yacc(optimize=1) mode.  Patch contributed by
548          Bart Whiteley.
549
55002/28/08: beazley
551          Fixed a bug with 'nonassoc' precedence rules.   Basically the
          'nonassoc' precedence was being ignored and not producing the correct
553          run-time behavior in the parser.
554
55502/16/08: beazley
556          Slight relaxation of what the input() method to a lexer will
557          accept as a string.   Instead of testing the input to see
558          if the input is a string or unicode string, it checks to see
559          if the input object looks like it contains string data.
560          This change makes it possible to pass string-like objects
561          in as input.  For example, the object returned by mmap.
562
563              import mmap, os
564              data = mmap.mmap(os.open(filename,os.O_RDONLY),
565                               os.path.getsize(filename),
566                               access=mmap.ACCESS_READ)
567              lexer.input(data)
568
569
57011/29/07: beazley
          Modification of ply.lex to allow token functions to be aliased.
572          This is subtle, but it makes it easier to create libraries and
573          to reuse token specifications.  For example, suppose you defined
574          a function like this:
575
576               def number(t):
577                    r'\d+'
578                    t.value = int(t.value)
579                    return t
580
581          This change would allow you to define a token rule as follows:
582
583              t_NUMBER = number
584
585          In this case, the token type will be set to 'NUMBER' and use
586          the associated number() function to process tokens.
587
58811/28/07: beazley
589          Slight modification to lex and yacc to grab symbols from both
590          the local and global dictionaries of the caller.   This
591          modification allows lexers and parsers to be defined using
592          inner functions and closures.
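
          For example, a lexer can now be built entirely inside a function
          (a minimal sketch):

              def make_lexer():
                  tokens = ('NUMBER',)
                  t_NUMBER = r'\d+'
                  t_ignore = ' \t'
                  def t_error(t):
                      t.lexer.skip(1)
                  return lex.lex()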
593
59411/28/07: beazley
595          Performance optimization:  The lexer.lexmatch and t.lexer
596          attributes are no longer set for lexer tokens that are not
597          defined by functions.   The only normal use of these attributes
598          would be in lexer rules that need to perform some kind of
599          special processing.  Thus, it doesn't make any sense to set
600          them on every token.
601
602          *** POTENTIAL INCOMPATIBILITY ***  This might break code
603          that is mucking around with internal lexer state in some
604          sort of magical way.
605
60611/27/07: beazley
607          Added the ability to put the parser into error-handling mode
608          from within a normal production.   To do this, simply raise
609          a yacc.SyntaxError exception like this:
610
611          def p_some_production(p):
612              'some_production : prod1 prod2'
613              ...
614              raise yacc.SyntaxError      # Signal an error
615
616          A number of things happen after this occurs:
617
618          - The last symbol shifted onto the symbol stack is discarded
            and the parser state is backed up to what it was before
            the rule reduction.
621
622          - The current lookahead symbol is saved and replaced by
623            the 'error' symbol.
624
625          - The parser enters error recovery mode where it tries
626            to either reduce the 'error' rule or it starts
627            discarding items off of the stack until the parser
628            resets.
629
630          When an error is manually set, the parser does *not* call
631          the p_error() function (if any is defined).
632          *** NEW FEATURE *** Suggested on the mailing list
633
63411/27/07: beazley
635          Fixed structure bug in examples/ansic.  Reported by Dion Blazakis.
636
63711/27/07: beazley
638          Fixed a bug in the lexer related to start conditions and ignored
639          token rules.  If a rule was defined that changed state, but
640          returned no token, the lexer could be left in an inconsistent
641          state.  Reported by
642
64311/27/07: beazley
644          Modified setup.py to support Python Eggs.   Patch contributed by
645          Simon Cross.
646
11/09/07: beazley
648          Fixed a bug in error handling in yacc.  If a syntax error occurred and the
          parser rolled the entire parse stack back, the parser would be left in an
          inconsistent state that would cause it to trigger incorrect actions on
651          subsequent input.  Reported by Ton Biegstraaten, Justin King, and others.
652
65311/09/07: beazley
654          Fixed a bug when passing empty input strings to yacc.parse().   This
655          would result in an error message about "No input given".  Reported
656          by Andrew Dalke.
657
658Version 2.3
659-----------------------------
66002/20/07: beazley
661          Fixed a bug with character literals if the literal '.' appeared as the
662          last symbol of a grammar rule.  Reported by Ales Smrcka.
663
66402/19/07: beazley
665          Warning messages are now redirected to stderr instead of being printed
666          to standard output.
667
66802/19/07: beazley
669          Added a warning message to lex.py if it detects a literal backslash
          character inside the t_ignore declaration.  This is to help catch
          problems that might occur if someone accidentally defines t_ignore
672          as a Python raw string.  For example:
673
674              t_ignore = r' \t'
675
676          The idea for this is from an email I received from David Cimimi who
677          reported bizarre behavior in lexing as a result of defining t_ignore
678          as a raw string by accident.
679
68002/18/07: beazley
681          Performance improvements.  Made some changes to the internal
682          table organization and LR parser to improve parsing performance.
683
68402/18/07: beazley
685          Automatic tracking of line number and position information must now be
686          enabled by a special flag to parse().  For example:
687
688              yacc.parse(data,tracking=True)
689
690          In many applications, it's just not that important to have the
691          parser automatically track all line numbers.  By making this an
692          optional feature, it allows the parser to run significantly faster
693          (more than a 20% speed increase in many cases).    Note: positional
694          information is always available for raw tokens---this change only
695          applies to positional information associated with nonterminal
696          grammar symbols.
697          *** POTENTIAL INCOMPATIBILITY ***
698
69902/18/07: beazley
700          Yacc no longer supports extended slices of grammar productions.
701          However, it does support regular slices.  For example:
702
703          def p_foo(p):
              '''foo : a b c d e'''
705              p[0] = p[1:3]
706
707          This change is a performance improvement to the parser--it streamlines
708          normal access to the grammar values since slices are now handled in
709          a __getslice__() method as opposed to __getitem__().
710
71102/12/07: beazley
712          Fixed a bug in the handling of token names when combined with
713          start conditions.   Bug reported by Todd O'Bryan.
714
715Version 2.2
716------------------------------
71711/01/06: beazley
          Added lexpos() and lexspan() methods to grammar symbols.  These
          mirror the functionality of lineno() and linespan().  For
720          example:
721
722          def p_expr(p):
723              'expr : expr PLUS expr'
724               p.lexpos(1)     # Lexing position of left-hand-expression
               p.lexpos(2)     # Lexing position of PLUS
726               start,end = p.lexspan(3)  # Lexing range of right hand expression
727
72811/01/06: beazley
729          Minor change to error handling.  The recommended way to skip characters
730          in the input is to use t.lexer.skip() as shown here:
731
732             def t_error(t):
733                 print "Illegal character '%s'" % t.value[0]
734                 t.lexer.skip(1)
735
736          The old approach of just using t.skip(1) will still work, but won't
737          be documented.
738
73910/31/06: beazley
740          Discarded tokens can now be specified as simple strings instead of
741          functions.  To do this, simply include the text "ignore_" in the
742          token declaration.  For example:
743
744              t_ignore_cppcomment = r'//.*'
745
746          Previously, this had to be done with a function.  For example:
747
748              def t_ignore_cppcomment(t):
749                  r'//.*'
750                  pass
751
752          If start conditions/states are being used, state names should appear
753          before the "ignore_" text.
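
          For example, assuming a state named 'foo' has been declared
          (a minimal sketch):

              t_foo_ignore_cppcomment = r'//.*'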
754
75510/19/06: beazley
756          The Lex module now provides support for flex-style start conditions
757          as described at http://www.gnu.org/software/flex/manual/html_chapter/flex_11.html.
758          Please refer to this document to understand this change note.  Refer to
759          the PLY documentation for PLY-specific explanation of how this works.
760
761          To use start conditions, you first need to declare a set of states in
762          your lexer file:
763
764          states = (
765                    ('foo','exclusive'),
766                    ('bar','inclusive')
767          )
768
769          This serves the same role as the %s and %x specifiers in flex.
770
          Once a state has been declared, tokens for that state can be
772          declared by defining rules of the form t_state_TOK.  For example:
773
            t_PLUS = r'\+'          # Rule defined in INITIAL state
            t_foo_NUM = r'\d+'      # Rule defined in foo state
            t_bar_NUM = r'\d+'      # Rule defined in bar state

            t_foo_bar_NUM = r'\d+'  # Rule defined in both foo and bar
            t_ANY_NUM = r'\d+'      # Rule defined in all states
780
781          In addition to defining tokens for each state, the t_ignore and t_error
782          specifications can be customized for specific states.  For example:
783
784            t_foo_ignore = " "     # Ignored characters for foo state
            def t_bar_error(t):
                # Handle errors in bar state
                t.lexer.skip(1)
787
          Within token rules, the following methods can be used to change states:
789
790            def t_TOKNAME(t):
791                t.lexer.begin('foo')        # Begin state 'foo'
792                t.lexer.push_state('foo')   # Begin state 'foo', push old state
793                                            # onto a stack
794                t.lexer.pop_state()         # Restore previous state
795                t.lexer.current_state()     # Returns name of current state
796
797          These methods mirror the BEGIN(), yy_push_state(), yy_pop_state(), and
798          yy_top_state() functions in flex.
799
          Start states can be used as one way to write sub-lexers.
801          For example, the lexer or parser might instruct the lexer to start
802          generating a different set of tokens depending on the context.
803
804          example/yply/ylex.py shows the use of start states to grab C/C++
805          code fragments out of traditional yacc specification files.
806
807          *** NEW FEATURE *** Suggested by Daniel Larraz with whom I also
808          discussed various aspects of the design.
809
81010/19/06: beazley
811          Minor change to the way in which yacc.py was reporting shift/reduce
812          conflicts.  Although the underlying LALR(1) algorithm was correct,
813          PLY was under-reporting the number of conflicts compared to yacc/bison
814          when precedence rules were in effect.  This change should make PLY
815          report the same number of conflicts as yacc.
816
81710/19/06: beazley
818          Modified yacc so that grammar rules could also include the '-'
819          character.  For example:
820
821            def p_expr_list(p):
822                'expression-list : expression-list expression'
823
824          Suggested by Oldrich Jedlicka.
825
82610/18/06: beazley
827          Attribute lexer.lexmatch added so that token rules can access the re
828          match object that was generated.  For example:
829
830          def t_FOO(t):
831              r'some regex'
832              m = t.lexer.lexmatch
833              # Do something with m
834
835
836          This may be useful if you want to access named groups specified within
837          the regex for a specific token. Suggested by Oldrich Jedlicka.
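
          For example, a sketch that pulls a named group out of the match
          (the rule and pattern here are made up):

          def t_DEFINE(t):
              r'define\s+(?P<name>[A-Za-z_][A-Za-z0-9_]*)'
              t.value = t.lexer.lexmatch.group('name')
              return t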
838
83910/16/06: beazley
840          Changed the error message that results if an illegal character
841          is encountered and no default error function is defined in lex.
842          The exception is now more informative about the actual cause of
843          the error.
844
845Version 2.1
846------------------------------
84710/02/06: beazley
848          The last Lexer object built by lex() can be found in lex.lexer.
849          The last Parser object built  by yacc() can be found in yacc.parser.
850
85110/02/06: beazley
852          New example added:  examples/yply
853
854          This example uses PLY to convert Unix-yacc specification files to
855          PLY programs with the same grammar.   This may be useful if you
856          want to convert a grammar from bison/yacc to use with PLY.
857
85810/02/06: beazley
859          Added support for a start symbol to be specified in the yacc
860          input file itself.  Just do this:
861
862               start = 'name'
863
864          where 'name' matches some grammar rule.  For example:
865
866               def p_name(p):
867                   'name : A B C'
868                   ...
869
870          This mirrors the functionality of the yacc %start specifier.
871
87209/30/06: beazley
          Some new examples added:
874
875          examples/GardenSnake : A simple indentation based language similar
876                                 to Python.  Shows how you might handle
877                                 whitespace.  Contributed by Andrew Dalke.
878
879          examples/BASIC       : An implementation of 1964 Dartmouth BASIC.
880                                 Contributed by Dave against his better
881                                 judgement.
882
88309/28/06: beazley
884          Minor patch to allow named groups to be used in lex regular
885          expression rules.  For example:
886
887              t_QSTRING = r'''(?P<quote>['"]).*?(?P=quote)'''
888
889          Patch submitted by Adam Ring.
890
89109/28/06: beazley
892          LALR(1) is now the default parsing method.   To use SLR, use
893          yacc.yacc(method="SLR").  Note: there is no performance impact
894          on parsing when using LALR(1) instead of SLR. However, constructing
895          the parsing tables will take a little longer.
896
89709/26/06: beazley
898          Change to line number tracking.  To modify line numbers, modify
899          the line number of the lexer itself.  For example:
900
901          def t_NEWLINE(t):
902              r'\n'
903              t.lexer.lineno += 1
904
905          This modification is both cleanup and a performance optimization.
906          In past versions, lex was monitoring every token for changes in
907          the line number.  This extra processing is unnecessary for a vast
908          majority of tokens. Thus, this new approach cleans it up a bit.
909
910          *** POTENTIAL INCOMPATIBILITY ***
911          You will need to change code in your lexer that updates the line
912          number. For example, "t.lineno += 1" becomes "t.lexer.lineno += 1"
913
91409/26/06: beazley
915          Added the lexing position to tokens as an attribute lexpos. This
916          is the raw index into the input text at which a token appears.
917          This information can be used to compute column numbers and other
918          details (e.g., scan backwards from lexpos to the first newline
919          to get a column position).
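
          For example, a column number can be computed with a small helper
          (a minimal sketch; 'data' is the string given to lexer.input()):

              def find_column(data, token):
                  last_newline = data.rfind('\n', 0, token.lexpos)
                  return token.lexpos - last_newline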
920
92109/25/06: beazley
922          Changed the name of the __copy__() method on the Lexer class
923          to clone().  This is used to clone a Lexer object (e.g., if
924          you're running different lexers at the same time).
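
          For example (a minimal sketch):

              lexer = lex.lex()
              lexer2 = lexer.clone()      # independent copy with the same rules
              lexer.input("some text")
              lexer2.input("other text")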
925
92609/21/06: beazley
927          Limitations related to the use of the re module have been eliminated.
928          Several users reported problems with regular expressions exceeding
929          more than 100 named groups. To solve this, lex.py is now capable
          of automatically splitting its master regular expression into
931          smaller expressions as needed.   This should, in theory, make it
932          possible to specify an arbitrarily large number of tokens.
933
93409/21/06: beazley
935          Improved error checking in lex.py.  Rules that match the empty string
936          are now rejected (otherwise they cause the lexer to enter an infinite
937          loop).  An extra check for rules containing '#' has also been added.
          Since lex compiles regular expressions in verbose mode, where '#' is
          interpreted as a regex comment, it is critical to use '\#' instead.
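
          For example, to match a literal '#' character (a minimal sketch;
          the token name is made up):

              t_HASH = r'\#'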
940
94109/18/06: beazley
942          Added a @TOKEN decorator function to lex.py that can be used to
943          define token rules where the documentation string might be computed
944          in some way.
945
946          digit            = r'([0-9])'
947          nondigit         = r'([_A-Za-z])'
948          identifier       = r'(' + nondigit + r'(' + digit + r'|' + nondigit + r')*)'
949
950          from ply.lex import TOKEN
951
952          @TOKEN(identifier)
953          def t_ID(t):
               # Do whatever
               return t
955
956          The @TOKEN decorator merely sets the documentation string of the
957          associated token function as needed for lex to work.
958
959          Note: An alternative solution is the following:
960
961          def t_ID(t):
              # Do whatever
              return t
963
964          t_ID.__doc__ = identifier
965
966          Note: Decorators require the use of Python 2.4 or later.  If compatibility
967          with old versions is needed, use the latter solution.
968
969          The need for this feature was suggested by Cem Karan.
970
97109/14/06: beazley
972          Support for single-character literal tokens has been added to yacc.
973          These literals must be enclosed in quotes.  For example:
974
          def p_expr_plus(p):
               "expr : expr '+' expr"
               ...

          def p_expr_minus(p):
               'expr : expr "-" expr'
               ...
982
983          In addition to this, it is necessary to tell the lexer module about
984          literal characters.   This is done by defining the variable 'literals'
985          as a list of characters.  This should  be defined in the module that
986          invokes the lex.lex() function.  For example:
987
988             literals = ['+','-','*','/','(',')','=']
989
990          or simply
991
992             literals = '+=*/()='
993
994          It is important to note that literals can only be a single character.
995          When the lexer fails to match a token using its normal regular expression
996          rules, it will check the current character against the literal list.
997          If found, it will be returned with a token type set to match the literal
998          character.  Otherwise, an illegal character will be signalled.
999
1000
100109/14/06: beazley
1002          Modified PLY to install itself as a proper Python package called 'ply'.
1003          This will make it a little more friendly to other modules.  This
1004          changes the usage of PLY only slightly.  Just do this to import the
1005          modules
1006
1007                import ply.lex as lex
1008                import ply.yacc as yacc
1009
1010          Alternatively, you can do this:
1011
1012                from ply import *
1013
          This imports both the lex and yacc modules.
1015          Change suggested by Lee June.
1016
101709/13/06: beazley
1018          Changed the handling of negative indices when used in production rules.
1019          A negative production index now accesses already parsed symbols on the
1020          parsing stack.  For example,
1021
1022              def p_foo(p):
                   "foo : A B C D"
1024                   print p[1]       # Value of 'A' symbol
1025                   print p[2]       # Value of 'B' symbol
1026                   print p[-1]      # Value of whatever symbol appears before A
1027                                    # on the parsing stack.
1028
                   p[0] = some_val  # Sets the value of the 'foo' grammar symbol
1030
1031          This behavior makes it easier to work with embedded actions within the
1032          parsing rules. For example, in C-yacc, it is possible to write code like
1033          this:
1034
1035               bar:   A { printf("seen an A = %d\n", $1); } B { do_stuff; }
1036
1037          In this example, the printf() code executes immediately after A has been
1038          parsed.  Within the embedded action code, $1 refers to the A symbol on
1039          the stack.
1040
1041          To perform this equivalent action in PLY, you need to write a pair
1042          of rules like this:
1043
1044               def p_bar(p):
1045                     "bar : A seen_A B"
1046                     do_stuff
1047
1048               def p_seen_A(p):
1049                     "seen_A :"
1050                     print "seen an A =", p[-1]
1051
          The second rule "seen_A" is merely an empty production which should be
          reduced as soon as A is parsed in the "bar" rule above.  The
          negative index p[-1] is used to access whatever symbol appeared
          before the seen_A symbol.
1056
1057          This feature also makes it possible to support inherited attributes.
1058          For example:
1059
1060               def p_decl(p):
1061                     "decl : scope name"
1062
1063               def p_scope(p):
1064                     """scope : GLOBAL
1065                              | LOCAL"""
                     p[0] = p[1]
1067
1068               def p_name(p):
1069                     "name : ID"
                     if p[-1] == "GLOBAL":
                          pass  # ...
                     elif p[-1] == "LOCAL":
                          pass  # ...
1074
1075          In this case, the name rule is inheriting an attribute from the
1076          scope declaration that precedes it.
1077
1078          *** POTENTIAL INCOMPATIBILITY ***
1079          If you are currently using negative indices within existing grammar rules,
1080          your code will break.  This should be extremely rare if non-existent in
          most cases.  The argument to various grammar rules is usually not
          processed in the same way as a list of items.
1083
1084Version 2.0
1085------------------------------
108609/07/06: beazley
1087          Major cleanup and refactoring of the LR table generation code.  Both SLR
1088          and LALR(1) table generation is now performed by the same code base with
1089          only minor extensions for extra LALR(1) processing.
1090
109109/07/06: beazley
1092          Completely reimplemented the entire LALR(1) parsing engine to use the
1093          DeRemer and Pennello algorithm for calculating lookahead sets.  This
1094          significantly improves the performance of generating LALR(1) tables
1095          and has the added feature of actually working correctly!  If you
1096          experienced weird behavior with LALR(1) in prior releases, this should
1097          hopefully resolve all of those problems.  Many thanks to
1098          Andrew Waters and Markus Schoepflin for submitting bug reports
1099          and helping me test out the revised LALR(1) support.
1100
1101Version 1.8
1102------------------------------
110308/02/06: beazley
1104          Fixed a problem related to the handling of default actions in LALR(1)
1105          parsing.  If you experienced subtle and/or bizarre behavior when trying
1106          to use the LALR(1) engine, this may correct those problems.  Patch
          contributed by Russ Cox.  Note: This patch has been superseded by
1108          revisions for LALR(1) parsing in Ply-2.0.
1109
111008/02/06: beazley
1111          Added support for slicing of productions in yacc.
1112          Patch contributed by Patrick Mezard.
1113
1114Version 1.7
1115------------------------------
111603/02/06: beazley
          Fixed an infinite recursion problem in the ReduceToTerminals() function that
1118          would sometimes come up in LALR(1) table generation.  Reported by
1119          Markus Schoepflin.
1120
112103/01/06: beazley
1122          Added "reflags" argument to lex().  For example:
1123
1124               lex.lex(reflags=re.UNICODE)
1125
1126          This can be used to specify optional flags to the re.compile() function
1127          used inside the lexer.   This may be necessary for special situations such
1128          as processing Unicode (e.g., if you want escapes like \w and \b to consult
1129          the Unicode character property database).   The need for this suggested by
1130          Andreas Jung.
1131
113203/01/06: beazley
1133          Fixed a bug with an uninitialized variable on repeated instantiations of parser
1134          objects when the write_tables=0 argument was used.   Reported by Michael Brown.
1135
113603/01/06: beazley
1137          Modified lex.py to accept Unicode strings both as the regular expressions for
1138          tokens and as input. Hopefully this is the only change needed for Unicode support.
1139          Patch contributed by Johan Dahl.
1140
114103/01/06: beazley
1142          Modified the class-based interface to work with new-style or old-style classes.
1143          Patch contributed by Michael Brown (although I tweaked it slightly so it would work
1144          with older versions of Python).
1145
1146Version 1.6
1147------------------------------
114805/27/05: beazley
1149          Incorporated patch contributed by Christopher Stawarz to fix an extremely
1150          devious bug in LALR(1) parser generation.   This patch should fix problems
1151          numerous people reported with LALR parsing.
1152
115305/27/05: beazley
1154          Fixed problem with lex.py copy constructor.  Reported by Dave Aitel, Aaron Lav,
1155          and Thad Austin.
1156
115705/27/05: beazley
1158          Added outputdir option to yacc()  to control output directory. Contributed
1159          by Christopher Stawarz.
1160
116105/27/05: beazley
1162          Added rununit.py test script to run tests using the Python unittest module.
1163          Contributed by Miki Tebeka.
1164
1165Version 1.5
1166------------------------------
116705/26/04: beazley
1168          Major enhancement. LALR(1) parsing support is now working.
1169          This feature was implemented by Elias Ioup (ezioup@alumni.uchicago.edu)
1170          and optimized by David Beazley. To use LALR(1) parsing do
1171          the following:
1172
1173               yacc.yacc(method="LALR")
1174
1175          Computing LALR(1) parsing tables takes about twice as long as
1176          the default SLR method.  However, LALR(1) allows you to handle
1177          more complex grammars.  For example, the ANSI C grammar
1178          (in example/ansic) has 13 shift-reduce conflicts with SLR, but
1179          only has 1 shift-reduce conflict with LALR(1).
1180
118105/20/04: beazley
1182          Added a __len__ method to parser production lists.  Can
1183          be used in parser rules like this:
1184
1185             def p_somerule(p):
1186                 """a : B C D
                      | E F"""
                 if (len(p) == 4):
                     pass   # Must have been first rule
                 elif (len(p) == 3):
                     pass   # Must be second rule
1192
1193          Suggested by Joshua Gerth and others.
1194
1195Version 1.4
1196------------------------------
119704/23/04: beazley
1198          Incorporated a variety of patches contributed by Eric Raymond.
1199          These include:
1200
1201           0. Cleans up some comments so they don't wrap on an 80-column display.
1202           1. Directs compiler errors to stderr where they belong.
1203           2. Implements and documents automatic line counting when \n is ignored.
1204           3. Changes the way progress messages are dumped when debugging is on.
1205              The new format is both less verbose and conveys more information than
1206              the old, including shift and reduce actions.
1207
120804/23/04: beazley
          Added a Python setup.py file to simplify installation.  Contributed
1210          by Adam Kerrison.
1211
121204/23/04: beazley
1213          Added patches contributed by Adam Kerrison.
1214
1215          -   Some output is now only shown when debugging is enabled.  This
1216              means that PLY will be completely silent when not in debugging mode.
1217
1218          -   An optional parameter "write_tables" can be passed to yacc() to
1219              control whether or not parsing tables are written.   By default,
1220              it is true, but it can be turned off if you don't want the yacc
1221              table file. Note: disabling this will cause yacc() to regenerate
1222              the parsing table each time.
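
              For example (a minimal sketch):

                 parser = yacc.yacc(write_tables=0)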
1223
122404/23/04: beazley
          Added patches contributed by David McNab.  This patch adds two
1226          features:
1227
1228          -   The parser can be supplied as a class instead of a module.
1229              For an example of this, see the example/classcalc directory.
1230
1231          -   Debugging output can be directed to a filename of the user's
1232              choice.  Use
1233
1234                 yacc(debugfile="somefile.out")
1235
1236
1237Version 1.3
1238------------------------------
123912/10/02: jmdyck
1240          Various minor adjustments to the code that Dave checked in today.
1241          Updated test/yacc_{inf,unused}.exp to reflect today's changes.
1242
124312/10/02: beazley
1244          Incorporated a variety of minor bug fixes to empty production
1245          handling and infinite recursion checking.  Contributed by
1246          Michael Dyck.
1247
124812/10/02: beazley
1249          Removed bogus recover() method call in yacc.restart()
1250
1251Version 1.2
1252------------------------------
125311/27/02: beazley
1254          Lexer and parser objects are now available as an attribute
1255          of tokens and slices respectively. For example:
1256
1257             def t_NUMBER(t):
1258                 r'\d+'
1259                 print t.lexer
1260
1261             def p_expr_plus(t):
                 'expr : expr PLUS expr'
1263                 print t.lexer
1264                 print t.parser
1265
1266          This can be used for state management (if needed).
1267
126810/31/02: beazley
1269          Modified yacc.py to work with Python optimize mode.  To make
1270          this work, you need to use
1271
1272              yacc.yacc(optimize=1)
1273
1274          Furthermore, you need to first run Python in normal mode
1275          to generate the necessary parsetab.py files.  After that,
1276          you can use python -O or python -OO.
1277
1278          Note: optimized mode turns off a lot of error checking.
1279          Only use when you are sure that your grammar is working.
1280          Make sure parsetab.py is up to date!
1281
128210/30/02: beazley
1283          Added cloning of Lexer objects.   For example:
1284
1285              import copy
1286              l = lex.lex()
1287              lc = copy.copy(l)
1288
1289              l.input("Some text")
1290              lc.input("Some other text")
1291              ...
1292
1293          This might be useful if the same "lexer" is meant to
1294          be used in different contexts---or if multiple lexers
1295          are running concurrently.
1296
129710/30/02: beazley
1298          Fixed subtle bug with first set computation and empty productions.
1299          Patch submitted by Michael Dyck.
1300
130110/30/02: beazley
1302          Fixed error messages to use "filename:line: message" instead
1303          of "filename:line. message".  This makes error reporting more
          friendly to emacs. Patch submitted by François Pinard.
1305
130610/30/02: beazley
1307          Improvements to parser.out file.  Terminals and nonterminals
1308          are sorted instead of being printed in random order.
          Patch submitted by François Pinard.
1310
131110/30/02: beazley
1312          Improvements to parser.out file output.  Rules are now printed
1313          in a way that's easier to understand.  Contributed by Russ Cox.
1314
131510/30/02: beazley
1316          Added 'nonassoc' associativity support.    This can be used
1317          to disable the chaining of operators like a < b < c.
1318          To use, simply specify 'nonassoc' in the precedence table
1319
1320          precedence = (
1321            ('nonassoc', 'LESSTHAN', 'GREATERTHAN'),  # Nonassociative operators
1322            ('left', 'PLUS', 'MINUS'),
1323            ('left', 'TIMES', 'DIVIDE'),
1324            ('right', 'UMINUS'),            # Unary minus operator
1325          )
1326
1327          Patch contributed by Russ Cox.
1328
132910/30/02: beazley
1330          Modified the lexer to provide optional support for Python -O and -OO
1331          modes.  To make this work, Python *first* needs to be run in
1332          unoptimized mode.  This reads the lexing information and creates a
1333          file "lextab.py".  Then, run lex like this:
1334
1335                   # module foo.py
1336                   ...
1337                   ...
1338                   lex.lex(optimize=1)
1339
1340          Once the lextab file has been created, subsequent calls to
1341          lex.lex() will read data from the lextab file instead of using
1342          introspection.   In optimized mode (-O, -OO) everything should
1343          work normally despite the loss of doc strings.
1344
1345          To change the name of the file 'lextab.py' use the following:
1346
1347                  lex.lex(lextab="footab")
1348
1349          (this creates a file footab.py)
1350
1351
1352Version 1.1   October 25, 2001
1353------------------------------
1354
135510/25/01: beazley
1356          Modified the table generator to produce much more compact data.
1357          This should greatly reduce the size of the parsetab.py[c] file.
1358          Caveat: the tables still need to be constructed so a little more
1359          work is done in parsetab on import.
1360
136110/25/01: beazley
1362          There may be a possible bug in the cycle detector that reports errors
1363          about infinite recursion.   I'm having a little trouble tracking it
1364          down, but if you get this problem, you can disable the cycle
1365          detector as follows:
1366
1367                 yacc.yacc(check_recursion = 0)
1368
136910/25/01: beazley
1370          Fixed a bug in lex.py that sometimes caused illegal characters to be
          reported incorrectly.  Reported by Sverre Jørgensen.
1372
07/08/01: beazley
1374          Added a reference to the underlying lexer object when tokens are handled by
1375          functions.   The lexer is available as the 'lexer' attribute.   This
1376          was added to provide better lexing support for languages such as Fortran
1377          where certain types of tokens can't be conveniently expressed as regular
1378          expressions (and where the tokenizing function may want to perform a
1379          little backtracking).  Suggested by Pearu Peterson.
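
          For example (a minimal sketch; lexdata and lexpos are the lexer's
          input string and current position):

              def t_NUMBER(t):
                  r'\d+'
                  lexer = t.lexer            # the underlying lexer object
                  upcoming = lexer.lexdata[lexer.lexpos:]   # unconsumed text
                  t.value = int(t.value)
                  return t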
1380
06/20/01: beazley
1382          Modified yacc() function so that an optional starting symbol can be specified.
1383          For example:
1384
1385                 yacc.yacc(start="statement")
1386
1387          Normally yacc always treats the first production rule as the starting symbol.
1388          However, if you are debugging your grammar it may be useful to specify
1389          an alternative starting symbol.  Idea suggested by Rich Salz.
1390
1391Version 1.0  June 18, 2001
1392--------------------------
1393Initial public offering
1394
1395