1[vset VERSION 1.1]
2[comment {-*- text -*- doctools manpage}]
3[manpage_begin pt::rde n [vset VERSION]]
4[include include/module.inc]
5[titledesc {Parsing Runtime Support, PARAM based}]
6[require pt::rde [opt [vset VERSION]]]
7[require snit]
8[require struct::stack 1.5]
9[require pt::ast 1.1]
10[description]
11[include include/ref_intro.inc]
12
13This package provides a class whose instances provide the runtime
14support for recursive descent parsers with backtracking, as is needed
15for the execution of, for example, parsing expression grammars. It
16implements the [manpage {PackRat Machine Specification}], as such that
17document is [emph required] reading to understand both this manpage,
18and the package itself. The description below does make numerous
19shorthand references to the PARAM's instructions and the various parts
20of its architectural state.
21
22[para]
23
24The package resides in the Execution section of the Core Layer of
25Parser Tools.
26[para][image arch_core_transform][para]
27
28[para]
29
30Note: This package not only has the standard Tcl implementation, but
31also an accelerator, i.e. a C implementation, based on Critcl.
32
33[para]
34
35[subsection {Class API}]
36
37The package exports the API described here.
38
39[list_begin definitions]
40
41[call [cmd ::pt::rde] [arg objectName]]
42
43The command creates a new runtime object for a recursive descent
44parser with backtracking and returns the fully qualified name of the
45object command as its result. The API of this object command is
46described in the section [sectref {Object API}]. It may be used to
47invoke various operations on the object.
48
49[list_end]
50
51[subsection {Object API}]
52
53All objects created by this package provide the following 63 methods
54for the manipulation and querying of their state, which is, in essence
55the architectural state of a PARAM.
56
57[para]
58
59First some general methods and the state accessors.
60
61[list_begin definitions]
62
63[call [arg objectName] [method destroy]]
64
65This method destroys the object, releasing all claimed memory, and
66deleting the associated object command.
67
68[call [arg objectName] [method reset] [arg chan]]
69
70This method resets the state of the runtme to its defaults, preparing
71it for the parsing of the character in the channel [arg chan], which
72becomes IN.
73
74[include include/channel_notes.inc]
75
76[call [arg objectName] [method complete]]
77
78This method completes parsing, either returning the AST made from the
79elements of ARS, or throwing an error containing the current ER.
80
81[call [arg objectName] [method chan]]
82
83This method returns the handle of the channel which is IN.
84
85[call [arg objectName] [method line]]
86
87This method returns the line number for the position IN is currently
88at. Note that this may not match with the line number for CL, due to
89backtracking.
90
91[call [arg objectName] [method column]]
92
93This method returns the column for the position IN is currently
94at. Note that this may not match with the column for CL, due to
95backtracking.
96
97[call [arg objectName] [method current]]
98
99This method returns CC.
100
101[call [arg objectName] [method location]]
102
103This method returns CL.
104
105[call [arg objectName] [method locations]]
106
107This method returns the LS. The topmost entry of the stack will be the
108first element of the returned list.
109
110[call [arg objectName] [method ok]]
111
112This method returns ST.
113
114[call [arg objectName] [method value]]
115
116This method returns SV.
117
118[call [arg objectName] [method error]]
119
120This method returns ER. This is either the empty string for an empty
121ER, or a list of 2 elements, the location the error is for, and a set
122of messages which specify which symbols were expected at the
123location. The messages are encoded as one of the possible atomic
124parsing expressions (special operators, terminal, range, and
125nonterminal operator).
126
127[call [arg objectName] [method errors]]
128
129This method returns ES. The topmost entry of the stack will be the
130first element of the returned list. Each entry is encoded as described
131for [method error].
132
133[call [arg objectName] [method tokens] [opt "[arg from] [opt [arg to]]"]]
134
135This method returns the part of TC for the range of locations of IN
136starting at [arg from] and ending at [arg to]. If [arg to] is not
137specified it is taken as identical to [arg from]. If neither argument
138is specified the whole of TC is returned.
139
140[para]
141
142Each token in the returned list is a list of three elements itself,
143containing the character at the location, and the associated line and
144column numbers, in this order.
145
146[call [arg objectName] [method symbols]]
147
148This method returns a dictionary containing NC. Keys are two-element
149lists containing nonterminal symbol and location, in this order. The
150values are 4-tuples containing CL, ST, ER, and SV, in this order. ER
151is encoded as specified for the method [method error].
152
153[call [arg objectName] [method known]]
154
155This method returns a list containing the keys of SC. They are
156encoded in the same manner as is done by method [method symbols].
157
158[call [arg objectName] [method reducible]]
159
160This method returns ARS. The topmost entry of the stack will be the
161first element of the returned list
162
163[call [arg objectName] [method asts]]
164
165This method returns AS. The topmost entry of the stack will be the
166first element of the returned list
167
168[call [arg objectName] [method ast]]
169
170This is a convenience method returning the topmost element of ARS.
171
172[call [arg objectName] [method position] [arg loc]]
173
174This method returns the line and column numbers for the specified
175location of IN, assuming that this location has already been reached
176during the parsing process.
177
178[list_end]
179
180The following methods implement all PARAM instructions. They all have
181the prefix "i_".
182
183[para]
184
185The control flow is mainly provided by Tcl's builtin commands, like
186[cmd if], [cmd while], etc., plus a few guarded variants of PARAM
187instructions and Tcl commands.. That means that these instruction
188variants will do nothing if their guard condition is not
189fulfilled. They can be recognized by the prefix "i:ok_" and "i:fail_",
190which denote the value ST has to have for the instruction to execute.
191
192[para]
193
194The instructions are listed in the same order they occur in the
195[manpage {PackRat Machine Specification}], with the guard variants
196listed after their regular implementation, if any, or in their place.
197
198[list_begin definitions]
199
200[vset INS input_next][vset IA0 msg][include include/rde_1ins.inc]
201
202[vset INS test_alnum][include include/rde_0ins.inc]
203[vset INS test_alpha][include include/rde_0ins.inc]
204[vset INS test_ascii][include include/rde_0ins.inc]
205[vset INS test_char][vset IA0 char][include include/rde_1ins.inc]
206[vset INS test_ddigit][include include/rde_0ins.inc]
207[vset INS test_digit][include include/rde_0ins.inc]
208[vset INS test_graph][include include/rde_0ins.inc]
209[vset INS test_lower][include include/rde_0ins.inc]
210[vset INS test_print][include include/rde_0ins.inc]
211[vset INS test_punct][include include/rde_0ins.inc]
212[vset INS test_range][vset IA0 chars][vset IA1 chare][include include/rde_2ins.inc]
213[vset INS test_space][include include/rde_0ins.inc]
214[vset INS test_upper][include include/rde_0ins.inc]
215[vset INS test_wordchar][include include/rde_0ins.inc]
216[vset INS test_xdigit][include include/rde_0ins.inc]
217
218[vset INS error_clear][include include/rde_0ins.inc]
219[vset INS error_push][include include/rde_0ins.inc]
220[vset INS error_pop_merge][include include/rde_0ins.inc]
221[vset INS error_nonterminal][vset IA0 symbol][include include/rde_1ins.inc]
222
223[vset INS status_ok][include include/rde_0ins.inc]
224[vset INS status_fail][include include/rde_0ins.inc]
225[vset INS status_negate][include include/rde_0ins.inc]
226
227[vset INS loc_push][include include/rde_0ins.inc]
228[vset INS loc_pop_discard][include include/rde_0ins.inc]
229[vset INS loc_pop_rewind][include include/rde_0ins.inc]
230[vset G ok][vset INS loc_pop_rewind][include include/rde_0gins.inc]
231[vset IFAIL loc_pop_rewind]
232[vset IOK   loc_pop_discard]
233[vset IOKX  discard][include include/rde_0cins.inc]
234
235[vset INS symbol_restore][vset IA0 symbol][include include/rde_1ins.inc]
236
237[para] The boolean result of the check is returned as the result of
238the method and can be used with standard Tcl control flow commands.
239
240[vset INS symbol_save][vset IA0 symbol][include include/rde_1ins.inc]
241
242[vset INS value_clear][include include/rde_0ins.inc]
243[vset IFAIL value_clear]
244[vset IOK   value_leaf]
245[vset IOKX  leaf][include include/rde_0cins.inc]
246[vset IFAIL value_clear]
247[vset IOK   value_reduce]
248[vset IOKX  reduce][include include/rde_0cins.inc]
249
250[vset G ok][vset INS ast_value_push][include include/rde_0ginsb.inc]
251[vset INS ast_push][include include/rde_0ins.inc]
252[vset INS ast_pop_rewind][include include/rde_0ins.inc]
253[vset G fail][vset INS ast_pop_rewind][include include/rde_0gins.inc]
254[vset IFAIL ast_pop_rewind]
255[vset IOK   ast_pop_discard]
256[vset IOKX  discard][include include/rde_0cins.inc]
257[vset INS ast_pop_discard][include include/rde_0ins.inc]
258[vset IFAIL ast_pop_discard]
259[vset IOK   ast_pop_rewind]
260[vset IOKX  rewind][include include/rde_0cins.inc]
261
262[call [arg objectName] [method i:ok_continue]]
263
264This guarded method executes only for "ST == ok". Then it aborts the
265current iteration of the innermost loop in the calling Tcl procedure.
266
267[call [arg objectName] [method i:fail_continue]]
268
269This guarded method executes only for "ST == fail". Then it aborts the
270current iteration of the innermost loop in the calling Tcl procedure.
271
272[call [arg objectName] [method i:fail_return]]
273
274This guarded method executes only for "ST == fail". Then it aborts the
275calling Tcl procedure.
276
277[call [arg objectName] [method i:ok_return]]
278
279This guarded method executes only for "ST == ok". Then it aborts the
280calling Tcl procedure.
281
282[list_end]
283[para]
284
285The next set of methods are [term {super instructions}], meaning that
286each implements a longer sequence of instructions commonly used in
287parsers. The combinated instructions of the previous set, i.e. those
288with names matching the pattern "i_*/*", are actually super
289instructions as well, albeit with limited scope, handling 2
290instructions with their control flow. The upcoming set is much broader
291in scope, folding as much as six or more PARAM instructions into a
292single method call.
293
294[para]
295
296In this we can see the reasoning behind their use well:
297
298[list_begin enumerated]
299[enum]
300By using less instructions the generated parsers become smaller, as
301the common parts are now truly part of the common runtime, and not
302explicitly written in the parser's code over and over again.
303
304[enum]
305Using less instructions additionally reduces the overhead associated
306with calls into the runtime, i.e. the cost of method dispatch and of
307setting up the variable context.
308
309[enum]
310Another effect of the super instructions is that their internals can
311be optimized as well, especially regarding control flow, and stack
312use, as the runtime internals are accessible to all instructions
313folded into the sequence.
314[list_end]
315
316[para]
317
318[list_begin definitions]
319[call [arg objectName] [method si:void_state_push]]
320
321This method combines [example {
322i_loc_push
323i_error_clear
324i_error_push
325}]
326
327Parsers use it at the beginning of [term void] sequences and choices
328with a [term void] initial branch.
329
330[call [arg objectName] [method si:void2_state_push]]
331
332This method combines [example {
333i_loc_push
334i_error_clear
335i_error_push
336}]
337
338Parsers use it at the beginning of optional and repeated expressions.
339
340[call [arg objectName] [method si:value_state_push]]
341
342This method combines [example {
343i_ast_push
344i_loc_push
345i_error_clear
346i_error_push
347}]
348
349Parsers use it at the beginning of sequences generating an AST and
350choices with an initial branch generating an AST.
351
352[call [arg objectName] [method si:void_state_merge]]
353
354This method combines [example {
355i_error_pop_merge
356i_loc_pop_rewind/discard
357}]
358
359Parsers use it at the end of void sequences and choices whose last
360branch is void.
361
362[call [arg objectName] [method si:void_state_merge_ok]]
363
364This method combines [example {
365i_error_pop_merge
366i_loc_pop_rewind/discard
367i_status_ok
368}]
369
370Parsers use it at the end of optional expressions
371
372[call [arg objectName] [method si:value_state_merge]]
373
374This method combines [example {
375i_error_pop_merge
376i_ast_pop_rewind/discard
377i_loc_pop_rewind/discard
378}]
379
380Parsers use it at the end of sequences generating ASTs and choices
381whose last branch generates an AST
382
383[call [arg objectName] [method si:value_notahead_start]]
384
385This method combines [example {
386i_loc_push
387i_ast_push
388}]
389
390Parsers use it at the beginning of negative lookahead predicates which
391generate ASTs.
392
393[call [arg objectName] [method si:void_notahead_exit]]
394
395This method combines [example {
396i_loc_pop_rewind
397i_status_negate
398}]
399
400Parsers use it at the end of void negative lookahead predicates.
401
402[call [arg objectName] [method si:value_notahead_exit]]
403
404This method combines [example {
405i_ast_pop_discard/rewind
406i_loc_pop_rewind
407i_status_negate
408}]
409
410Parsers use it at the end of negative lookahead predicates which
411generate ASTs.
412
413[call [arg objectName] [method si:kleene_abort]]
414
415This method combines [example {
416i_loc_pop_rewind/discard
417i:fail_return
418}]
419
420Parsers use it to stop a positive repetition when its first, required, expression fails.
421
422[call [arg objectName] [method si:kleene_close]]
423
424This method combines [example {
425i_error_pop_merge
426i_loc_pop_rewind/discard
427i:fail_status_ok
428i:fail_return
429}]
430
431Parsers use it at the end of repetitions.
432
433[call [arg objectName] [method si:voidvoid_branch]]
434
435This method combines [example {
436i_error_pop_merge
437i:ok_loc_pop_discard
438i:ok_return
439i_loc_rewind
440i_error_push
441}]
442
443Parsers use it when transiting between branches of a choice when both are void.
444
445[call [arg objectName] [method si:voidvalue_branch]]
446
447This method combines [example {
448i_error_pop_merge
449i:ok_loc_pop_discard
450i:ok_return
451i_ast_push
452i_loc_rewind
453i_error_push
454}]
455
456Parsers use it when transiting between branches of a choice when the
457failing branch is void, and the next to test generates an AST.
458
459[call [arg objectName] [method si:valuevoid_branch]]
460
461This method combines [example {
462i_error_pop_merge
463i_ast_pop_rewind/discard
464i:ok_loc_pop_discard
465i:ok_return
466i_loc_rewind
467i_error_push
468}]
469
470Parsers use it when transiting between branches of a choice when the
471failing branch generates an AST, and the next to test is void.
472
473[call [arg objectName] [method si:valuevalue_branch]]
474
475This method combines [example {
476i_error_pop_merge
477i_ast_pop_discard
478i:ok_loc_pop_discard
479i:ok_return
480i_ast_rewind
481i_loc_rewind
482i_error_push
483}]
484
485Parsers use it when transiting between branches of a choice when both
486generate ASTs.
487
488[call [arg objectName] [method si:voidvoid_part]]
489
490This method combines [example {
491i_error_pop_merge
492i:fail_loc_pop_rewind
493i:fail_return
494i_error_push
495}]
496
497Parsers use it when transiting between parts of a sequence and both
498are void.
499
500[call [arg objectName] [method si:voidvalue_part]]
501
502This method combines [example {
503i_error_pop_merge
504i:fail_loc_pop_rewind
505i:fail_return
506i_ast_push
507i_error_push
508}]
509
510Parsers use it when transiting between parts of a sequence and the
511sucessfully matched part is void, and after it an AST is generated.
512
513[call [arg objectName] [method si:valuevalue_part]]
514
515This method combines [example {
516i_error_pop_merge
517i:fail_ast_pop_rewind
518i:fail_loc_pop_rewind
519i:fail_return
520i_error_push
521}]
522
523Parsers use it when transiting between parts of a sequence and both
524parts generate ASTs.
525
526[call [arg objectName] [method si:value_symbol_start] [arg symbol]]
527
528This method combines [example {
529if/found? i_symbol_restore $symbol
530i:found:ok_ast_value_push
531i:found_return
532i_loc_push
533i_ast_push
534}]
535
536Parsers use it at the beginning of a nonterminal symbol generating an
537AST, whose right-hand side may have generated an AST as well.
538
539[call [arg objectName] [method si:value_void_symbol_start] [arg symbol]]
540
541This method combines [example {
542if/found? i_symbol_restore $symbol
543i:found:ok_ast_value_push
544i:found_return
545i_loc_push
546i_ast_push
547}]
548
549Parsers use it at the beginning of a void nonterminal symbol whose
550right-hand side may generate an AST.
551
552[call [arg objectName] [method si:void_symbol_start] [arg symbol]]
553
554This method combines [example {
555if/found? i_symbol_restore $symbol
556i:found_return
557i_loc_push
558i_ast_push
559}]
560
561Parsers use it at the beginning of a nonterminal symbol generating an
562AST whose right-hand side is void.
563
564[call [arg objectName] [method si:void_void_symbol_start] [arg symbol]]
565
566This method combines [example {
567if/found? i_symbol_restore $symbol
568i:found_return
569i_loc_push
570}]
571
572Parsers use it at  the beginning of a void nonterminal symbol whose
573right-hand side is void as well.
574
575[call [arg objectName] [method si:reduce_symbol_end] [arg symbol]]
576
577This method combines [example {
578i_value_clear/reduce $symbol
579i_symbol_save        $symbol
580i_error_nonterminal  $symbol
581i_ast_pop_rewind
582i_loc_pop_discard
583i:ok_ast_value_push
584}]
585
586Parsers use it at the end of a non-terminal symbol generating an AST
587using the AST generated by the right-hand side as child.
588
589[call [arg objectName] [method si:void_leaf_symbol_end] [arg symbol]]
590
591This method combines [example {
592i_value_clear/leaf  $symbol
593i_symbol_save       $symbol
594i_error_nonterminal $symbol
595i_loc_pop_discard
596i:ok_ast_value_push
597}]
598
599Parsers use it at the end of a non-terminal symbol generating an AST
600whose right-hand side is void.
601
602[call [arg objectName] [method si:value_leaf_symbol_end] [arg symbol]]
603
604This method combines [example {
605i_value_clear/leaf  $symbol
606i_symbol_save       $symbol
607i_error_nonterminal $symbol
608i_loc_pop_discard
609i_ast_pop_rewind
610i:ok_ast_value_push
611}]
612
613Parsers use it at the end of a non-terminal symbol generating an AST
614discarding the AST generated by the right-hand side.
615
616[call [arg objectName] [method si:value_clear_symbol_end] [arg symbol]]
617
618This method combines [example {
619i_value_clear
620i_symbol_save       $symbol
621i_error_nonterminal $symbol
622i_loc_pop_discard
623i_ast_pop_rewind
624}]
625
626Parsers use it at the end of a void non-terminal symbol, discarding
627the AST generated by the right-hand side.
628
629[call [arg objectName] [method si:void_clear_symbol_end] [arg symbol]]
630
631This method combines [example {
632i_value_clear
633i_symbol_save       $symbol
634i_error_nonterminal $symbol
635i_loc_pop_discard
636}]
637
638Parsers use it at the end of a void non-terminal symbol with a void
639right-hand side.
640
641[call [arg objectName] [method si:next_char] [arg tok]]
642[call [arg objectName] [method si:next_range] [arg toks] [arg toke]]
643[call [arg objectName] [method si:next_alnum]]
644[call [arg objectName] [method si:next_alpha]]
645[call [arg objectName] [method si:next_ascii]]
646[call [arg objectName] [method si:next_ddigit]]
647[call [arg objectName] [method si:next_digit]]
648[call [arg objectName] [method si:next_graph]]
649[call [arg objectName] [method si:next_lower]]
650[call [arg objectName] [method si:next_print]]
651[call [arg objectName] [method si:next_punct]]
652[call [arg objectName] [method si:next_space]]
653[call [arg objectName] [method si:next_upper]]
654[call [arg objectName] [method si:next_wordchar]]
655[call [arg objectName] [method si:next_xdigit]]
656
657These methods all combine [example {
658i_input_next $msg
659i:fail_return
660}]
661
662with the appropriate [cmd i_test_xxx] instruction. Parsers use them for
663handling atomic expressions.
664
665[list_end]
666[para]
667
668[include include/feedback.inc]
669[manpage_end]
670