1% This program by D. E. Knuth is not copyrighted and can be used freely.
2% Version 0 was released in December, 1981.
3% Version 1 was released in September, 1982, with version 0 of TeX.
4% Slight changes were made in October, 1982, for version 0.6 of TeX.
5% Version 1.1 changed "_" to "\_" if not within an identifier (November, 1982).
6% Version 1.2 added @@= and @@\ and marked changed modules (December, 1982).
7% Version 1.3 marked and indexed changed modules better (January, 1983).
8% Version 1.4 added "history" (February, 1983).
9% Version 1.5 conformed to TeX version 0.96 (March, 1983).
10% Version 1.6 conformed to TeX version 0.98 (May, 1983).
11% Version 1.7 introduced the new change file format (June, 1983).
12% Version 2 was released in July, 1983, with version 0.999 of TeX.
13% Version 2.1 corrected a bug in changed_module reckoning (August, 1983).
14% Version 2.2 corrected it better (August, 1983).
15% Version 2.3 starts the output with \input webmac (August, 1983).
16% Version 2.4 fixed a bug in compress(#) (September, 1983).
17% Version 2.5 cleared xrefswitch after module names (November, 1983).
18% Version 2.6 fixed a bug in declaration of trans array (January, 1984).
19% Version 2.7 fixed a bug in real constants (August, 1984).
20% Version 2.8 fixed a bug in change_buffer movement (August, 1985).
21% Version 2.9 increased max_refs and max_toks to 30000 each (January, 1987).
22% Version 3, for Sewell's book, fixed long-line bug in input_ln (March, 1989).
23% Version 3.1 fixed a bug for programs with only one module (April, 1989).
24% Version 4 was major change to allow 8-bit input (September, 1989).
25% Version 4.1, for Breitenlohner, avoids English-only output (March, 1990).
26% Version 4.2 conforms to ANSI standard for-loop rules (September, 1990).
27% Version 4.3 catches extra } in input (Breitenlohner, September, 1991).
28% Version 4.4 corrects changed_module logic, %-overflow (January, 1992).
29
30% Here is TeX material that gets inserted after \input webmac
31\def\hang{\hangindent 3em\indent\ignorespaces}
32\font\ninerm=cmr9
33\let\mc=\ninerm % medium caps for names like SAIL
34\def\PASCAL{Pascal}
35\def\pb{$\.|\ldots\.|$} % Pascal brackets (|...|)
36\def\v{\.{\char'174}} % vertical (|) in typewriter font
37\def\dleft{[\![} \def\dright{]\!]} % double brackets
38\mathchardef\RA="3221 % right arrow
39\mathchardef\BA="3224 % double arrow
40\def\({} % kludge for alphabetizing certain module names
41
42\def\title{WEAVE}
43\def\contentspagenumber{15} % should be odd
44\def\topofcontents{\null\vfill
45  \titlefalse % include headline on the contents page
46  \def\rheader{\mainfont Appendix D\hfil \contentspagenumber}
47  \centerline{\titlefont The {\ttitlefont WEAVE} processor}
48  \vskip 15pt
49  \centerline{(Version 4.4)}
50  \vfill}
51\pageno=\contentspagenumber \advance\pageno by 1
52
53@* Introduction.
54This program converts a \.{WEB} file to a \TeX\ file. It was written
55by D. E. Knuth in October, 1981; a somewhat similar {\mc SAIL} program had
56been developed in March, 1979, although the earlier program used a top-down
57parsing method that is quite different from the present scheme.
58
59The code uses a few features of the local \PASCAL\ compiler that may need
60to be changed in other installations:
61
62\yskip\item{1)} Case statements have a default.
63\item{2)} Input-output routines may need to be adapted for use with a particular
64character set and/or for printing messages on the user's terminal.
65
66\yskip\noindent
67These features are also present in the \PASCAL\ version of \TeX, where they
68are used in a similar (but more complex) way. System-dependent portions
69of \.{WEAVE} can be identified by looking at the entries for `system
70dependencies' in the index below.
71@!@^system dependencies@>
72
73The ``banner line'' defined here should be changed whenever \.{WEAVE}
74is modified.
75
76@d banner=='This is WEAVE, Version 4.4'
77
78@ The program begins with a fairly normal header, made up of pieces that
79@^system dependencies@>
80will mostly be filled in later. The \.{WEB} input comes from files |web_file|
81and |change_file|, and the \TeX\ output goes to file |tex_file|.
82
83If it is necessary to abort the job because of a fatal error, the program
84calls the `|jump_out|' procedure, which goes to the label |end_of_WEAVE|.
85
86@d end_of_WEAVE = 9999 {go here to wrap it up}
87
88@p @t\4@>@<Compiler directives@>@/
89program WEAVE(@!web_file,@!change_file,@!tex_file);
90label end_of_WEAVE; {go here to finish}
91const @<Constants in the outer block@>@/
92type @<Types in the outer block@>@/
93var @<Globals in the outer block@>@/
94@<Error handling procedures@>@/
95procedure initialize;
96  var @<Local variables for initialization@>@/
97  begin @<Set initial values@>@/
98  end;
99
100@ Some of this code is optional for use when debugging only;
101such material is enclosed between the delimiters |debug| and $|gubed|$.
102Other parts, delimited by |stat| and $|tats|$, are optionally included
103if statistics about \.{WEAVE}'s memory usage are desired.
104
105@d debug==@{ {change this to `$\\{debug}\equiv\null$' when debugging}
106@d gubed==@t@>@} {change this to `$\\{gubed}\equiv\null$' when debugging}
107@f debug==begin
108@f gubed==end
109@#
110@d stat==@{ {change this to `$\\{stat}\equiv\null$'
111  when gathering usage statistics}
112@d tats==@t@>@} {change this to `$\\{tats}\equiv\null$'
113  when gathering usage statistics}
114@f stat==begin
115@f tats==end
116
117@ The \PASCAL\ compiler used to develop this system has ``compiler
118directives'' that can appear in comments whose first character is a dollar sign.
119In production versions of \.{WEAVE} these directives tell the compiler that
120@^system dependencies@>
121it is safe to avoid range checks and to leave out the extra code it inserts
122for the \PASCAL\ debugger's benefit, although interrupts will occur if
123there is arithmetic overflow.
124
125@<Compiler directives@>=
126@{@&$C-,A+,D-@} {no range check, catch arithmetic overflow, no debug overhead}
127@!debug @{@&$C+,D+@}@+ gubed {but turn everything on when debugging}
128
129@ Labels are given symbolic names by the following definitions. We insert
130the label `|exit|:' just before the `\ignorespaces|end|\unskip' of a
131procedure in which we have used the `|return|' statement defined below;
132the label `|restart|' is occasionally used at the very beginning of a
133procedure; and the label `|reswitch|' is occasionally used just prior to
134a \&{case} statement in which some cases change the conditions and we wish to
135branch to the newly applicable case.
136Loops that are set up with the \&{loop} construction defined below are
137commonly exited by going to `|done|' or to `|found|' or to `|not_found|',
138and they are sometimes repeated by going to `|continue|'.
139
140@d exit=10 {go here to leave a procedure}
141@d restart=20 {go here to start a procedure again}
142@d reswitch=21 {go here to start a case statement again}
143@d continue=22 {go here to resume a loop}
144@d done=30 {go here to exit a loop}
145@d found=31 {go here when you've found it}
146@d not_found=32 {go here when you've found something else}
147
148@ Here are some macros for common programming idioms.
149
150@d incr(#) == #:=#+1 {increase a variable by unity}
151@d decr(#) == #:=#-1 {decrease a variable by unity}
152@d loop == @+ while true do@+ {repeat over and over until a |goto| happens}
153@d do_nothing == {empty statement}
154@d return == goto exit {terminate a procedure call}
155@f return == nil
156@f loop == xclause
157
158@ We assume that |case| statements may include a default case that applies
159if no matching label is found. Thus, we shall use constructions like
160@^system dependencies@>
161$$\vbox{\halign{#\hfil\cr
162|case x of|\cr
1631: $\langle\,$code for $x=1\,\rangle$;\cr
1643: $\langle\,$code for $x=3\,\rangle$;\cr
165|othercases| $\langle\,$code for |x<>1| and |x<>3|$\,\rangle$\cr
166|endcases|\cr}}$$
167since most \PASCAL\ compilers have plugged this hole in the language by
168incorporating some sort of default mechanism. For example, the compiler
169used to develop \.{WEB} and \TeX\ allows `|others|:' as a default label,
170and other \PASCAL s allow syntaxes like `\ignorespaces|else|\unskip' or
171`\&{otherwise}' or `\\{otherwise}:', etc. The definitions of |othercases|
172and |endcases| should be changed to agree with local conventions.
173(Of course, if no default mechanism is available, the |case| statements of
174this program must be extended by listing all remaining cases.)
175
176@d othercases == others: {default for cases not listed explicitly}
177@d endcases == @+end {follows the default case in an extended |case| statement}
178@f othercases == else
179@f endcases == end
180
181@ The following parameters are set big enough to handle \TeX, so they
182should be sufficient for most applications of \.{WEAVE}.
183
184@<Constants...@>=
185@!max_bytes=45000; {|1/ww| times the number of bytes in identifiers,
186  index entries, and module names; must be less than 65536}
187@!max_names=5000; {number of identifiers, index entries, and module names;
188  must be less than 10240}
189@!max_modules=2000;{greater than the total number of modules}
190@!hash_size=353; {should be prime}
191@!buf_size=100; {maximum length of input line}
192@!longest_name=400; {module names shouldn't be longer than this}
193@!long_buf_size=500; {|buf_size+longest_name|}
194@!line_length=80; {lines of \TeX\ output have at most this many characters,
195  should be less than 256}
196@!max_refs=30000; {number of cross references; must be less than 65536}
197@!max_toks=30000; {number of symbols in \PASCAL\ texts being parsed;
198  must be less than 65536}
199@!max_texts=2000; {number of phrases in \PASCAL\ texts being parsed;
200  must be less than 10240}
201@!max_scraps=1000; {number of tokens in \PASCAL\ texts being parsed}
202@!stack_size=200; {number of simultaneous output levels}
203
204@ A global variable called |history| will contain one of four values
205at the end of every run: |spotless| means that no unusual messages were
206printed; |harmless_message| means that a message of possible interest
207was printed but no serious errors were detected; |error_message| means that
208at least one error was found; |fatal_message| means that the program
209terminated abnormally. The value of |history| does not influence the
210behavior of the program; it is simply computed for the convenience
211of systems that might want to use such information.
212
213@d spotless=0 {|history| value for normal jobs}
214@d harmless_message=1 {|history| value when non-serious info was printed}
215@d error_message=2 {|history| value when an error was noted}
216@d fatal_message=3 {|history| value when we had to stop prematurely}
217@#
218@d mark_harmless==@t@>@+if history=spotless then history:=harmless_message
219@d mark_error==history:=error_message
220@d mark_fatal==history:=fatal_message
221
222@<Glob...@>=@!history:spotless..fatal_message; {how bad was this run?}
223
224@ @<Set init...@>=history:=spotless;
225
226@* The character set.
227One of the main goals in the design of \.{WEB} has been to make it readily
228portable between a wide variety of computers. Yet \.{WEB} by its very
229nature must use a greater variety of characters than most computer
230programs deal with, and character encoding is one of the areas in which
231existing machines differ most widely from each other.
232
233To resolve this problem, all input to \.{WEAVE} and \.{TANGLE} is
234converted to an internal eight-bit code that is essentially standard
235ASCII, the ``American Standard Code for Information Interchange.''
236The conversion is done immediately when each character is read in.
237Conversely, characters are converted from ASCII to the user's external
238representation just before they are output. (The original ASCII code
239was seven bits only; \.{WEB} now allows eight bits in an attempt to
240keep up with modern times.)
241
242Such an internal code is relevant to users of \.{WEB} only because it is
243the code used for preprocessed constants like \.{"A"}. If you are writing
244a program in \.{WEB} that makes use of such one-character constants, you
245should convert your input to ASCII form, like \.{WEAVE} and \.{TANGLE} do.
246Otherwise \.{WEB}'s internal coding scheme does not affect you.
247@^ASCII code@>
248
249Here is a table of the standard visible ASCII codes:
250$$\def\:{\char\count255\global\advance\count255 by 1}
251\count255='40
252\vbox{
253\hbox{\hbox to 40pt{\it\hfill0\/\hfill}%
254\hbox to 40pt{\it\hfill1\/\hfill}%
255\hbox to 40pt{\it\hfill2\/\hfill}%
256\hbox to 40pt{\it\hfill3\/\hfill}%
257\hbox to 40pt{\it\hfill4\/\hfill}%
258\hbox to 40pt{\it\hfill5\/\hfill}%
259\hbox to 40pt{\it\hfill6\/\hfill}%
260\hbox to 40pt{\it\hfill7\/\hfill}}
261\vskip 4pt
262\hrule
263\def\^{\vrule height 10.5pt depth 4.5pt}
264\halign{\hbox to 0pt{\hskip -24pt\O{#0}\hfill}&\^
265\hbox to 40pt{\tt\hfill#\hfill\^}&
266&\hbox to 40pt{\tt\hfill#\hfill\^}\cr
26704&\:&\:&\:&\:&\:&\:&\:&\:\cr\noalign{\hrule}
26805&\:&\:&\:&\:&\:&\:&\:&\:\cr\noalign{\hrule}
26906&\:&\:&\:&\:&\:&\:&\:&\:\cr\noalign{\hrule}
27007&\:&\:&\:&\:&\:&\:&\:&\:\cr\noalign{\hrule}
27110&\:&\:&\:&\:&\:&\:&\:&\:\cr\noalign{\hrule}
27211&\:&\:&\:&\:&\:&\:&\:&\:\cr\noalign{\hrule}
27312&\:&\:&\:&\:&\:&\:&\:&\:\cr\noalign{\hrule}
27413&\:&\:&\:&\:&\:&\:&\:&\:\cr\noalign{\hrule}
27514&\:&\:&\:&\:&\:&\:&\:&\:\cr\noalign{\hrule}
27615&\:&\:&\:&\:&\:&\:&\:&\:\cr\noalign{\hrule}
27716&\:&\:&\:&\:&\:&\:&\:&\:\cr\noalign{\hrule}
27817&\:&\:&\:&\:&\:&\:&\:\cr}
279\hrule width 280pt}$$
280(Actually, of course, code @'040 is an invisible blank space.)  Code @'136
281was once an upward arrow (\.{\char'13}), and code @'137 was
282once a left arrow (\.^^X), in olden times when the first draft
283of ASCII code was prepared; but \.{WEB} works with today's standard
284ASCII in which those codes represent circumflex and underline as shown.
285
286@<Types...@>=
287@!ASCII_code=0..255; {eight-bit numbers, a subrange of the integers}
288
289@ The original \PASCAL\ compiler was designed in the late 60s, when six-bit
290character sets were common, so it did not make provision for lowercase
291letters. Nowadays, of course, we need to deal with both capital and small
292letters in a convenient way, so \.{WEB} assumes that it is being used
293with a \PASCAL\ whose character set contains at least the characters of
294standard ASCII as listed above. Some \PASCAL\ compilers use the original
295name |char| for the data type associated with the characters in text files,
296while other \PASCAL s consider |char| to be a 64-element subrange of a larger
297data type that has some other name.
298
299In order to accommodate this difference, we shall use the name |text_char|
300to stand for the data type of the characters in the input and output
301files.  We shall also assume that |text_char| consists of the elements
302|chr(first_text_char)| through |chr(last_text_char)|, inclusive. The
303following definitions should be adjusted if necessary.
304@^system dependencies@>
305
306@d text_char == char {the data type of characters in text files}
307@d first_text_char=0 {ordinal number of the smallest element of |text_char|}
308@d last_text_char=255 {ordinal number of the largest element of |text_char|}
309
310@<Types...@>=
311@!text_file=packed file of text_char;
312
313@ The \.{WEAVE} and \.{TANGLE} processors convert between ASCII code and
314the user's external character set by means of arrays |xord| and |xchr|
315that are analogous to \PASCAL's |ord| and |chr| functions.
316
317@<Globals...@>=
318@!xord: array [text_char] of ASCII_code;
319  {specifies conversion of input characters}
320@!xchr: array [ASCII_code] of text_char;
321  {specifies conversion of output characters}
322
323@ If we assume that every system using \.{WEB} is able to read and write the
324visible characters of standard ASCII (although not necessarily using the
325ASCII codes to represent them), the following assignment statements initialize
326most of the |xchr| array properly, without needing any system-dependent
327changes. For example, the statement \.{xchr[@@\'101]:=\'A\'} that appears
328in the present \.{WEB} file might be encoded in, say, {\mc EBCDIC} code
329on the external medium on which it resides, but \.{TANGLE} will convert from
330this external code to ASCII and back again. Therefore the assignment
331statement \.{XCHR[65]:=\'A\'} will appear in the corresponding \PASCAL\ file,
332and \PASCAL\ will compile this statement so that |xchr[65]| receives the
333character \.A in the external (|char|) code. Note that it would be quite
334incorrect to say \.{xchr[@@\'101]:="A"}, because |"A"| is a constant of
335type |integer|, not |char|, and because we have $|"A"|=65$ regardless of
336the external character set.
337
338@<Set init...@>=
339xchr[@'40]:=' ';
340xchr[@'41]:='!';
341xchr[@'42]:='"';
342xchr[@'43]:='#';
343xchr[@'44]:='$';
344xchr[@'45]:='%';
345xchr[@'46]:='&';
346xchr[@'47]:='''';@/
347xchr[@'50]:='(';
348xchr[@'51]:=')';
349xchr[@'52]:='*';
350xchr[@'53]:='+';
351xchr[@'54]:=',';
352xchr[@'55]:='-';
353xchr[@'56]:='.';
354xchr[@'57]:='/';@/
355xchr[@'60]:='0';
356xchr[@'61]:='1';
357xchr[@'62]:='2';
358xchr[@'63]:='3';
359xchr[@'64]:='4';
360xchr[@'65]:='5';
361xchr[@'66]:='6';
362xchr[@'67]:='7';@/
363xchr[@'70]:='8';
364xchr[@'71]:='9';
365xchr[@'72]:=':';
366xchr[@'73]:=';';
367xchr[@'74]:='<';
368xchr[@'75]:='=';
369xchr[@'76]:='>';
370xchr[@'77]:='?';@/
371xchr[@'100]:='@@';
372xchr[@'101]:='A';
373xchr[@'102]:='B';
374xchr[@'103]:='C';
375xchr[@'104]:='D';
376xchr[@'105]:='E';
377xchr[@'106]:='F';
378xchr[@'107]:='G';@/
379xchr[@'110]:='H';
380xchr[@'111]:='I';
381xchr[@'112]:='J';
382xchr[@'113]:='K';
383xchr[@'114]:='L';
384xchr[@'115]:='M';
385xchr[@'116]:='N';
386xchr[@'117]:='O';@/
387xchr[@'120]:='P';
388xchr[@'121]:='Q';
389xchr[@'122]:='R';
390xchr[@'123]:='S';
391xchr[@'124]:='T';
392xchr[@'125]:='U';
393xchr[@'126]:='V';
394xchr[@'127]:='W';@/
395xchr[@'130]:='X';
396xchr[@'131]:='Y';
397xchr[@'132]:='Z';
398xchr[@'133]:='[';
399xchr[@'134]:='\';
400xchr[@'135]:=']';
401xchr[@'136]:='^';
402xchr[@'137]:='_';@/
403xchr[@'140]:='`';
404xchr[@'141]:='a';
405xchr[@'142]:='b';
406xchr[@'143]:='c';
407xchr[@'144]:='d';
408xchr[@'145]:='e';
409xchr[@'146]:='f';
410xchr[@'147]:='g';@/
411xchr[@'150]:='h';
412xchr[@'151]:='i';
413xchr[@'152]:='j';
414xchr[@'153]:='k';
415xchr[@'154]:='l';
416xchr[@'155]:='m';
417xchr[@'156]:='n';
418xchr[@'157]:='o';@/
419xchr[@'160]:='p';
420xchr[@'161]:='q';
421xchr[@'162]:='r';
422xchr[@'163]:='s';
423xchr[@'164]:='t';
424xchr[@'165]:='u';
425xchr[@'166]:='v';
426xchr[@'167]:='w';@/
427xchr[@'170]:='x';
428xchr[@'171]:='y';
429xchr[@'172]:='z';
430xchr[@'173]:='{';
431xchr[@'174]:='|';
432xchr[@'175]:='}';
433xchr[@'176]:='~';@/
434xchr[0]:=' '; xchr[@'177]:=' '; {these ASCII codes are not used}
435
436@ Some of the ASCII codes below @'40 have been given symbolic names in
437\.{WEAVE} and \.{TANGLE} because they are used with a special meaning.
438
439@d and_sign=@'4 {equivalent to `\.{and}'}
440@d not_sign=@'5 {equivalent to `\.{not}'}
441@d set_element_sign=@'6 {equivalent to `\.{in}'}
442@d tab_mark=@'11 {ASCII code used as tab-skip}
443@d line_feed=@'12 {ASCII code thrown away at end of line}
444@d form_feed=@'14 {ASCII code used at end of page}
445@d carriage_return=@'15 {ASCII code used at end of line}
446@d left_arrow=@'30 {equivalent to `\.{:=}'}
447@d not_equal=@'32 {equivalent to `\.{<>}'}
448@d less_or_equal=@'34 {equivalent to `\.{<=}'}
449@d greater_or_equal=@'35 {equivalent to `\.{>=}'}
450@d equivalence_sign=@'36 {equivalent to `\.{==}'}
451@d or_sign=@'37 {equivalent to `\.{or}'}
452
453@ When we initialize the |xord| array and the remaining parts of |xchr|,
454it will be convenient to make use of an index variable, |i|.
455
456@<Local variables for init...@>=
457@!i:0..255;
458
459@ Here now is the system-dependent part of the character set.
460If \.{WEB} is being implemented on a garden-variety \PASCAL\ for which
461only standard ASCII codes will appear in the input and output files, you
462don't need to make any changes here. But if you have, for example, an extended
463character set like the one in Appendix~C of {\sl The \TeX book}, the first
464line of code in this module should be changed to
465$$\hbox{|for i:=1 to @'37 do xchr[i]:=chr(i);|}$$
466\.{WEB}'s character set is essentially identical to \TeX's, even with respect to
467characters less than @'40.
468@^system dependencies@>
469
470Changes to the present module will make \.{WEB} more friendly on computers
471that have an extended character set, so that one can type things like
472\.^^Z\ instead of \.{<>}. If you have an extended set of characters that
473are easily incorporated into text files, you can assign codes arbitrarily
474here, giving an |xchr| equivalent to whatever characters the users of
475\.{WEB} are allowed to have in their input files, provided that unsuitable
476characters do not correspond to special codes like |carriage_return|
477that are listed above.
478
479(The present file \.{WEAVE.WEB} does not contain any of the non-ASCII
480characters, because it is intended to be used with all implementations of
481\.{WEB}.  It was originally created on a Stanford system that has a
482convenient extended character set, then ``sanitized'' by applying another
483program that transliterated all of the non-standard characters into
484standard equivalents.)
485
486@<Set init...@>=
487for i:=1 to @'37 do xchr[i]:=' ';
488for i:=@'200 to @'377 do xchr[i]:=' ';
489
490@ The following system-independent code makes the |xord| array contain a
491suitable inverse to the information in |xchr|.
492
493@<Set init...@>=
494for i:=first_text_char to last_text_char do xord[chr(i)]:=" ";
495for i:=1 to @'377 do xord[xchr[i]]:=i;
496xord[' ']:=" ";
497
498@* Input and output.
499The input conventions of this program are intended to be very much like those
500of \TeX\ (except, of course, that they are much simpler, because much less
501needs to be done). Furthermore they are identical to those of \.{TANGLE}.
502Therefore people who need to make modifications to all three systems
503should be able to do so without too many headaches.
504
505We use the standard \PASCAL\ input/output procedures in several places that
506\TeX\ cannot, since \.{WEAVE} does not have to deal with files that are named
507dynamically by the user, and since there is no input from the terminal.
508
509@ Terminal output is done by writing on file |term_out|, which is assumed to
510consist of characters of type |text_char|:
511@^system dependencies@>
512
513@d print(#)==write(term_out,#) {`|print|' means write on the terminal}
514@d print_ln(#)==write_ln(term_out,#) {`|print|' and then start new line}
515@d new_line==write_ln(term_out) {start new line}
516@d print_nl(#)==  {print information starting on a new line}
517  begin new_line; print(#);
518  end
519
520@<Globals...@>=
521@!term_out:text_file; {the terminal as an output file}
522
523@ Different systems have different ways of specifying that the output on a
524certain file will appear on the user's terminal. Here is one way to do this
525on the \PASCAL\ system that was used in \.{TANGLE}'s initial development:
526@^system dependencies@>
527
528@<Set init...@>=
529rewrite(term_out,'TTY:'); {send |term_out| output to the terminal}
530
531@ The |update_terminal| procedure is called when we want
532to make sure that everything we have output to the terminal so far has
533actually left the computer's internal buffers and been sent.
534@^system dependencies@>
535
536@d update_terminal == break(term_out) {empty the terminal output buffer}
537
538@ The main input comes from |web_file|; this input may be overridden
539by changes in |change_file|. (If |change_file| is empty, there are no changes.)
540
541@<Globals...@>=
542@!web_file:text_file; {primary input}
543@!change_file:text_file; {updates}
544
545@ The following code opens the input files.  Since these files were listed
546in the program header, we assume that the \PASCAL\ runtime system has
547already checked that suitable file names have been given; therefore no
548additional error checking needs to be done. We will see below that
549\.{WEAVE} reads through the entire input twice.
550@^system dependencies@>
551
552@p procedure open_input; {prepare to read |web_file| and |change_file|}
553begin reset(web_file); reset(change_file);
554end;
555
556@ The main output goes to |tex_file|.
557
558@<Globals...@>=
559@!tex_file: text_file;
560
561@ The following code opens |tex_file|.
562Since this file was listed in the program header, we assume that the
563\PASCAL\ runtime system has checked that a suitable external file name has
564been given.
565@^system dependencies@>
566
567@<Set init...@>=
568rewrite(tex_file);
569
570@ Input goes into an array called |buffer|.
571
572@<Globals...@>=@!buffer: array[0..long_buf_size] of ASCII_code;
573
574@ The |input_ln| procedure brings the next line of input from the specified
575file into the |buffer| array and returns the value |true|, unless the file has
576already been entirely read, in which case it returns |false|. The conventions
577of \TeX\ are followed; i.e., |ASCII_code| numbers representing the next line
578of the file are input into |buffer[0]|, |buffer[1]|, \dots,
579|buffer[limit-1]|; trailing blanks are ignored;
580and the global variable |limit| is set to the length of the
581@^system dependencies@>
582line. The value of |limit| must be strictly less than |buf_size|.
583
584We assume that none of the |ASCII_code| values
585of |buffer[j]| for |0<=j<limit| is equal to 0, @'177, |line_feed|, |form_feed|,
586or |carriage_return|. Since |buf_size| is strictly less than |long_buf_size|,
587some of \.{WEAVE}'s routines use the fact that it is safe to refer to
588|buffer[limit+2]| without overstepping the bounds of the array.
589
590@p function input_ln(var f:text_file):boolean;
591  {inputs a line or returns |false|}
592var final_limit:0..buf_size; {|limit| without trailing blanks}
593begin limit:=0; final_limit:=0;
594if eof(f) then input_ln:=false
595else  begin while not eoln(f) do
596    begin buffer[limit]:=xord[f^]; get(f);
597    incr(limit);
598    if buffer[limit-1]<>" " then final_limit:=limit;
599    if limit=buf_size then
600      begin while not eoln(f) do get(f);
601      decr(limit); {keep |buffer[buf_size]| empty}
602      if final_limit>limit then final_limit:=limit;
603      print_nl('! Input line too long'); loc:=0; error;
604@.Input line too long@>
605      end;
606    end;
607  read_ln(f); limit:=final_limit; input_ln:=true;
608  end;
609end;
610
611@* Reporting errors to the user.
612The \.{WEAVE} processor operates in three phases: first it inputs the source
613file and stores cross-reference data, then it inputs the source once again and
614produces the \TeX\ output file, and finally it sorts and outputs the index.
615
616The global variables |phase_one| and |phase_three| tell which Phase we are in.
617
618@<Globals...@>=
619@!phase_one: boolean; {|true| in Phase I, |false| in Phases II and III}
620@!phase_three: boolean; {|true| in Phase III, |false| in Phases I and II}
621
622@ If an error is detected while we are debugging,
623we usually want to look at the contents of memory.
624A special procedure will be declared later for this purpose.
625
626@<Error handling...@>=
627@!debug@+ procedure debug_help; forward;@+gubed
628
629@ The command `|err_print('! Error message')|' will report a syntax error to
630the user, by printing the error message at the beginning of a new line and
631then giving an indication of where the error was spotted in the source file.
632Note that no period follows the error message, since the error routine
633will automatically supply a period.
634
635The actual error indications are provided by a procedure called |error|.
636However, error messages are not actually reported during phase one,
637since errors detected on the first pass will be detected again
638during the second.
639
640@d err_print(#)==
641  begin if not phase_one then
642    begin new_line; print(#); error;
643    end;
644  end
645
646@<Error handling...@>=
647procedure error; {prints `\..' and location of error message}
648var@!k,@!l: 0..long_buf_size; {indices into |buffer|}
649begin @<Print error location based on input buffer@>;
650update_terminal; mark_error;
651@!debug debug_skipped:=debug_cycle;debug_help;@+gubed
652end;
653
654@ The error locations can be indicated by using the global variables
655|loc|, |line|, and |changing|, which tell respectively the first
656unlooked-at position in |buffer|, the current line number, and whether or not
657the current line is from |change_file| or |web_file|.
658This routine should be modified on systems whose standard text editor
659has special line-numbering conventions.
660@^system dependencies@>
661
662@<Print error location based on input buffer@>=
663begin if changing then print('. (change file ')@+else print('. (');
664print_ln('l.', line:1, ')');
665if loc>=limit then l:=limit else l:=loc;
666for k:=1 to l do
667  if buffer[k-1]=tab_mark then print(' ')
668  else print(xchr[buffer[k-1]]); {print the characters already read}
669new_line;
670for k:=1 to l do print(' '); {space out the next line}
671for k:=l+1 to limit do print(xchr[buffer[k-1]]); {print the part not yet read}
672if buffer[limit]="|" then print(xchr["|"]);
673  {end of \PASCAL\ text in module names}
674print(' '); {this space separates the message from future asterisks}
675end
676
677@ The |jump_out| procedure just cuts across all active procedure levels
678and jumps out of the program. This is the only non-local \&{goto} statement
679in \.{WEAVE}. It is used when no recovery from a particular error has
680been provided.
681
682Some \PASCAL\ compilers do not implement non-local |goto| statements.
683@^system dependencies@>
684In such cases the code that appears at label |end_of_WEAVE| should be
685copied into the |jump_out| procedure, followed by a call to a system procedure
686that terminates the program.
687
688@d fatal_error(#)==begin new_line; print(#); error; mark_fatal; jump_out;
689  end
690
691@<Error handling...@>=
692procedure jump_out;
693begin goto end_of_WEAVE;
694end;
695
696@ Sometimes the program's behavior is far different from what it should be,
697and \.{WEAVE} prints an error message that is really for the \.{WEAVE}
698maintenance person, not the user. In such cases the program says
699|confusion('indication of where we are')|.
700
701@d confusion(#)==fatal_error('! This can''t happen (',#,')')
702@.This can't happen@>
703
704@ An overflow stop occurs if \.{WEAVE}'s tables aren't large enough.
705
706@d overflow(#)==fatal_error('! Sorry, ',#,' capacity exceeded')
707@.Sorry, x capacity exceeded@>
708
709@* Data structures.
710During the first phase of its processing, \.{WEAVE} puts identifier names,
711index entries, and module names into the large |byte_mem| array, which is
712packed with eight-bit integers. Allocation is sequential, since names are
713never deleted.
714
715An auxiliary array |byte_start| is used as a directory for |byte_mem|,
716and the |link|, |ilk|, and |xref| arrays give further information about names.
717These auxiliary arrays consist of sixteen-bit items.
718
719@<Types...@>=
720@!eight_bits=0..255; {unsigned one-byte quantity}
721@!sixteen_bits=0..65535; {unsigned two-byte quantity}
722
723@ \.{WEAVE} has been designed to avoid the need for indices that are more
724than sixteen bits wide, so that it can be used on most computers. But
725there are programs that need more than 65536 bytes; \TeX\ is one of these.
726To get around this problem, a slight complication has been added to the
727data structures:  |byte_mem| is a two-dimensional array, whose first index
728is either 0 or 1. (For generality, the first index is actually allowed to
729run between 0 and |ww-1|, where |ww| is defined to be 2; the program will
730work for any positive value of |ww|, and it can be simplified in obvious
731ways if |ww=1|.)
732
733@d ww=2 {we multiply the byte capacity by approximately this amount}
734
735@<Globals...@>=
736@!byte_mem: packed array [0..ww-1,0..max_bytes] of ASCII_code;
737  {characters of names}
738@!byte_start: array [0..max_names] of sixteen_bits; {directory into |byte_mem|}
739@!link: array [0..max_names] of sixteen_bits; {hash table or tree links}
740@!ilk: array [0..max_names] of sixteen_bits; {type codes or tree links}
741@!xref: array [0..max_names] of sixteen_bits; {heads of cross-reference lists}
742
743@ The names of identifiers are found by computing a hash address |h| and
744then looking at strings of bytes signified by |hash[h]|, |link[hash[h]]|,
745|link[link[hash[h]]]|, \dots, until either finding the desired name
746or encountering a zero.
747
748A `|name_pointer|' variable, which signifies a name, is an index into
749|byte_start|. The actual sequence of characters in the name pointed to by
750|p| appears in positions |byte_start[p]| to |byte_start[p+ww]-1|, inclusive,
751in the segment of |byte_mem| whose first index is |p mod ww|. Thus, when
752|ww=2| the even-numbered name bytes appear in |byte_mem[0,@t$*$@>]|
753and the odd-numbered ones appear in |byte_mem[1,@t$*$@>]|.
754The pointer 0 is used for undefined module names; we don't
755want to use it for the names of identifiers, since 0 stands for a null
756pointer in a linked list.
757
758We usually have |byte_start[name_ptr+w]=byte_ptr[(name_ptr+w) mod ww]|
759for |0<=w<ww|, since these are the starting positions for the next |ww|
760names to be stored in |byte_mem|.
761
762@d length(#)==byte_start[#+ww]-byte_start[#] {the length of a name}
763
764@<Types...@>=
765@!name_pointer=0..max_names; {identifies a name}
766
767@ @<Global...@>=
768@!name_ptr:name_pointer; {first unused position in |byte_start|}
769@!byte_ptr:array [0..ww-1] of 0..max_bytes;
770  {first unused position in |byte_mem|}
771
772@ @<Local variables for init...@>=
773@!wi: 0..ww-1; {to initialize the |byte_mem| indices}
774
775@ @<Set init...@>=
776for wi:=0 to ww-1 do
777  begin byte_start[wi]:=0; byte_ptr[wi]:=0;
778  end;
779byte_start[ww]:=0; {this makes name 0 of length zero}
780name_ptr:=1;
781
782@ Several types of identifiers are distinguished by their |ilk|:
783
784\yskip\hang |normal| identifiers are part of the \PASCAL\ program and
785will appear in italic type.
786
787\yskip\hang |roman| identifiers are index entries that appear after
788\.{@@\^} in the \.{WEB} file.
789
790\yskip\hang |wildcard| identifiers are index entries that appear after
791\.{@@:} in the \.{WEB} file.
792
793\yskip\hang |typewriter| identifiers are index entries that appear after
794\.{@@.} in the \.{WEB} file.
795
796\yskip\hang |array_like|, |begin_like|, \dots, |var_like|
797identifiers are \PASCAL\ reserved words whose |ilk| explains how they are
798to be treated when \PASCAL\ code is being formatted.
799
800\yskip\hang Finally, if |c| is an ASCII code, an |ilk| equal to
801|char_like+c| denotes a reserved word that will be converted to character
802|c|.
803
804@d normal=0 {ordinary identifiers have |normal| ilk}
805@d roman=1 {normal index entries have |roman| ilk}
806@d wildcard=2 {user-formatted index entries have |wildcard| ilk}
807@d typewriter=3 {`typewriter type' entries have |typewriter| ilk}
808@d reserved(#)==(ilk[#]>typewriter) {tells if a name is a reserved word}
809@d array_like=4 {\&{array}, \&{file}, \&{set}}
810@d begin_like=5 {\&{begin}}
811@d case_like=6 {\&{case}}
812@d const_like=7 {\&{const}, \&{label}, \&{type}}
813@d div_like=8 {\&{div}, \&{mod}}
814@d do_like=9 {\&{do}, \&{of}, \&{then}}
815@d else_like=10 {\&{else}}
816@d end_like=11 {\&{end}}
817@d for_like=12 {\&{for}, \&{while}, \&{with}}
818@d goto_like=13 {\&{goto}, \&{packed}}
819@d if_like=14 {\&{if}}
820@d in_like=15 {\&{in}}
821@d nil_like=16 {\&{nil}}
822@d proc_like=17 {\&{function}, \&{procedure}, \&{program}}
823@d record_like=18 {\&{record}}
824@d repeat_like=19 {\&{repeat}}
825@d to_like=20 {\&{downto}, \&{to}}
826@d until_like=21 {\&{until}}
827@d var_like=22 {\&{var}}
828@d loop_like=23 {\&{loop}, \&{xclause}}
829@d char_like=24 {\&{and}, \&{or}, \&{not}, \&{in}}
830
831@ The names of modules are stored in |byte_mem| together
832with the identifier names, but a hash table is not used for them because
833\.{WEAVE} needs to be able to recognize a module name when given a prefix of
834that name. A conventional binary seach tree is used to retrieve module names,
835with fields called |llink| and |rlink| in place of |link| and |ilk|. The
836root of this tree is |rlink[0]|.
837
838@d llink==link {left link in binary search tree for module names}
839@d rlink==ilk {right link in binary search tree for module names}
840@d root==rlink[0] {the root of the binary search tree for module names}
841
842@<Set init...@>=
843root:=0; {the binary search tree starts out with nothing in it}
844
845@ Here is a little procedure that prints the text of a given name on the
846user's terminal.
847
848@p procedure print_id(@!p:name_pointer); {print identifier or module name}
849var k:0..max_bytes; {index into |byte_mem|}
850@!w:0..ww-1; {row of |byte_mem|}
851begin if p>=name_ptr then print('IMPOSSIBLE')
852else  begin w:=p mod ww;
853  for k:=byte_start[p] to byte_start[p+ww]-1 do
854    print(xchr[byte_mem[w,k]]);
855  end;
856end;
857
858@ We keep track of the current module number in
859|module_count|, which is the total number of modules that have started.
860Modules which have been altered by a change file entry
861have their |changed_module| flag turned on during the first phase.
862
863@<Globals...@>=
864@!module_count:0..max_modules; {the current module number}
865@!changed_module: packed array [0..max_modules] of boolean; {is it changed?}
866@!change_exists: boolean; {has any module changed?}
867
868@ The other large memory area in \.{WEAVE} keeps the cross-reference data.
869All uses of the name |p| are recorded in a linked list beginning at
870|xref[p]|, which points into the |xmem| array. Entries in |xmem| consist
871of two sixteen-bit items per word, called the |num| and |xlink| fields.
872If |x| is an index into |xmem|, reached from name |p|, the value of |num(x)|
873is either a module number where |p| is used, or it is |def_flag| plus a
874module number where |p| is defined; and |xlink(x)| points to the next such
875cross reference for |p|, if any. This list of cross references is in
876decreasing order by module number. The current number of cross references
877is |xref_ptr|.
878
879The global variable |xref_switch| is set either to |def_flag| or to zero,
880depending on whether the next cross reference to an identifier is to be
881underlined or not in the index. This switch is set to |def_flag| when
882\.{@@!} or \.{@@d} or \.{@@f} is scanned, and it is cleared to zero when
883the next identifier or index entry cross reference has been made. Similarly,
884the global variable |mod_xref_switch| is either |def_flag| or zero, depending
885on whether a module name is being defined or used.
886
887@d num(#)==xmem[#].num_field
888@d xlink(#)==xmem[#].xlink_field
889@d def_flag=10240 {must be strictly larger than |max_modules|}
890
891@ @<Types...@>=
892@!xref_number=0..max_refs;
893
894@ @<Globals...@>=
895@!xmem:array[xref_number] of packed record@t@>@/
896  @!num_field: sixteen_bits; {module number plus zero or |def_flag|}
897  @!xlink_field: sixteen_bits; {pointer to the previous cross reference}
898  end;
899@!xref_ptr:xref_number; {the largest occupied position in |xmem|}
900@!xref_switch,@!mod_xref_switch:0..def_flag; {either zero or |def_flag|}
901
902@ @<Set init...@>=xref_ptr:=0; xref_switch:=0; mod_xref_switch:=0; num(0):=0;
903xref[0]:=0; {cross references to undefined modules}
904
905@ A new cross reference for an identifier is formed by calling |new_xref|,
906which discards duplicate entries and ignores non-underlined references
907to one-letter identifiers or \PASCAL's reserved words.
908
909@d append_xref(#)==if xref_ptr=max_refs then overflow('cross reference')
910  else  begin incr(xref_ptr); num(xref_ptr):=#;
911    end
912
913@p procedure new_xref(@!p:name_pointer);
914label exit;
915var q:xref_number; {pointer to previous cross reference}
916@!m,@!n: sixteen_bits; {new and previous cross-reference value}
917begin if (reserved(p)or(byte_start[p]+1=byte_start[p+ww]))and
918  (xref_switch=0) then return;
919m:=module_count+xref_switch; xref_switch:=0; q:=xref[p];
920if q>0 then
921  begin n:=num(q);
922  if (n=m)or(n=m+def_flag) then return
923  else if m=n+def_flag then
924    begin num(q):=m; return;
925    end;
926  end;
927append_xref(m); xlink(xref_ptr):=q; xref[p]:=xref_ptr;
928exit: end;
929
930@ The cross reference lists for module names are slightly different. Suppose
931that a module name is defined in modules $m_1$, \dots, $m_k$ and used in
932modules $n_1$, \dots, $n_l$. Then its list will contain $m_1+|def_flag|$,
933$m_k+|def_flag|$, \dots, $m_2+|def_flag|$, $n_l$, \dots, $n_1$, in
934this order.  After Phase II, however, the order will be
935$m_1+|def_flag|$, \dots, $m_k+|def_flag|$, $n_1$, \dots, $n_l$.
936
937@p procedure new_mod_xref(@!p:name_pointer);
938var q,@!r:xref_number; {pointers to previous cross references}
939begin q:=xref[p]; r:=0;
940if q>0 then
941  begin if mod_xref_switch=0 then while num(q)>=def_flag do
942    begin r:=q; q:=xlink(q);
943    end
944  else if num(q)>=def_flag then
945    begin r:=q; q:=xlink(q);
946    end;
947  end;
948append_xref(module_count+mod_xref_switch); xlink(xref_ptr):=q;
949mod_xref_switch:=0;
950if r=0 then xref[p]:=xref_ptr
951else xlink(r):=xref_ptr;
952end;
953
954@ A third large area of memory is used for sixteen-bit `tokens', which appear
955in short lists similar to the strings of characters in |byte_mem|. Token lists
956are used to contain the result of \PASCAL\ code translated into \TeX\ form;
957further details about them will be explained later. A |text_pointer| variable
958is an index into |tok_start|.
959
960@<Types...@>=
961@!text_pointer=0..max_texts; {identifies a token list}
962
963@ The first position of |tok_mem|
964that is unoccupied by replacement text is called |tok_ptr|, and the first
965unused location of |tok_start| is called |text_ptr|.
966Thus, we usually have |tok_start[text_ptr]=tok_ptr|.
967
968@<Glob...@>=
969@t\hskip1em@>@!tok_mem: packed array [0..max_toks] of sixteen_bits; {tokens}
970@t\hskip1em@>@!tok_start: array [text_pointer] of sixteen_bits;
971  {directory into |tok_mem|}
972@t\hskip1em@>@!text_ptr:text_pointer; {first unused position in |tok_start|}
973@t\hskip1em@>@!tok_ptr:0..max_toks; {first unused position in |tok_mem|}
974stat@!max_tok_ptr,@!max_txt_ptr:0..max_toks; {largest values occurring}
975tats
976
977@ @<Set init...@>=
978tok_ptr:=1; text_ptr:=1; tok_start[0]:=1; tok_start[1]:=1;
979stat max_tok_ptr:=1; max_txt_ptr:=1;@+tats
980
981@* Searching for identifiers.
982The hash table described above is updated by the |id_lookup| procedure,
983which finds a given identifier and returns a pointer to its index in
984|byte_start|. The identifier is supposed to match character by character
985and it is also supposed to have a given |ilk| code; the same name may be
986present more than once if it is supposed to appear in the index with
987different typesetting conventions.
988If the identifier was not already present, it is inserted into the table.
989
990Because of the way \.{WEAVE}'s scanning mechanism works, it is most convenient
991to let |id_lookup| search for an identifier that is present in the |buffer|
992array. Two other global variables specify its position in the buffer: the
993first character is |buffer[id_first]|, and the last is |buffer[id_loc-1]|.
994
995@<Glob...@>=
996@!id_first:0..long_buf_size; {where the current identifier begins in the buffer}
997@!id_loc:0..long_buf_size; {just after the current identifier in the buffer}
998@#
999@!hash:array [0..hash_size] of sixteen_bits; {heads of hash lists}
1000
1001@ Initially all the hash lists are empty.
1002
1003@<Local variables for init...@>=
1004@!h:0..hash_size; {index into hash-head array}
1005
1006@ @<Set init...@>=
1007for h:=0 to hash_size-1 do hash[h]:=0;
1008
1009@ Here now is the main procedure for finding identifiers (and index
1010entries).  The parameter |t| is set to the desired |ilk| code. The
1011identifier must either have |ilk=t|, or we must have
1012|t=normal| and the identifier must be a reserved word.
1013
1014@p function id_lookup(@!t:eight_bits):name_pointer; {finds current identifier}
1015label found;
1016var i:0..long_buf_size; {index into |buffer|}
1017@!h:0..hash_size; {hash code}
1018@!k:0..max_bytes; {index into |byte_mem|}
1019@!w:0..ww-1; {row of |byte_mem|}
1020@!l:0..long_buf_size; {length of the given identifier}
1021@!p:name_pointer; {where the identifier is being sought}
1022begin l:=id_loc-id_first; {compute the length}
1023@<Compute the hash code |h|@>;
1024@<Compute the name location |p|@>;
1025if p=name_ptr then @<Enter a new name into the table at position |p|@>;
1026id_lookup:=p;
1027end;
1028
1029@ A simple hash code is used: If the sequence of
1030ASCII codes is $c_1c_2\ldots c_m$, its hash value will be
1031$$(2^{n-1}c_1+2^{n-2}c_2+\cdots+c_n)\,\bmod\,|hash_size|.$$
1032
1033@<Compute the hash...@>=
1034h:=buffer[id_first]; i:=id_first+1;
1035while i<id_loc do
1036  begin h:=(h+h+buffer[i]) mod hash_size; incr(i);
1037  end
1038
1039@ If the identifier is new, it will be placed in position |p=name_ptr|,
1040otherwise |p| will point to its existing location.
1041
1042@<Compute the name location...@>=
1043p:=hash[h];
1044while p<>0 do
1045  begin if (length(p)=l)and((ilk[p]=t)or((t=normal)and reserved(p))) then
1046    @<Compare name |p| with current identifier,
1047      |goto found| if equal@>;
1048  p:=link[p];
1049  end;
1050p:=name_ptr; {the current identifier is new}
1051link[p]:=hash[h]; hash[h]:=p; {insert |p| at beginning of hash list}
1052found:
1053
1054@ @<Compare name |p|...@>=
1055begin i:=id_first; k:=byte_start[p]; w:=p mod ww;
1056while (i<id_loc)and(buffer[i]=byte_mem[w,k]) do
1057  begin incr(i); incr(k);
1058  end;
1059if i=id_loc then goto found; {all characters agree}
1060end
1061
1062@ When we begin the following segment of the program, |p=name_ptr|.
1063
1064@<Enter a new name...@>=
1065begin w:=name_ptr mod ww;
1066if byte_ptr[w]+l>max_bytes then overflow('byte memory');
1067if name_ptr+ww>max_names then overflow('name');
1068i:=id_first; k:=byte_ptr[w]; {get ready to move the identifier into |byte_mem|}
1069while i<id_loc do
1070  begin byte_mem[w,k]:=buffer[i]; incr(k); incr(i);
1071  end;
1072byte_ptr[w]:=k; byte_start[name_ptr+ww]:=k; incr(name_ptr);
1073ilk[p]:=t; xref[p]:=0;
1074end
1075
1076@* Initializing the table of reserved words.
1077We have to get \PASCAL's reserved words into the hash table, and the
1078simplest way to do this is to insert them every time \.{WEAVE} is run.
1079A few macros permit us to do the initialization with a compact program.
1080
1081@d sid9(#)==buffer[9]:=#;cur_name:=id_lookup
1082@d sid8(#)==buffer[8]:=#;sid9
1083@d sid7(#)==buffer[7]:=#;sid8
1084@d sid6(#)==buffer[6]:=#;sid7
1085@d sid5(#)==buffer[5]:=#;sid6
1086@d sid4(#)==buffer[4]:=#;sid5
1087@d sid3(#)==buffer[3]:=#;sid4
1088@d sid2(#)==buffer[2]:=#;sid3
1089@d sid1(#)==buffer[1]:=#;sid2
1090@d id2==id_first:=8; sid8
1091@d id3==id_first:=7; sid7
1092@d id4==id_first:=6; sid6
1093@d id5==id_first:=5; sid5
1094@d id6==id_first:=4; sid4
1095@d id7==id_first:=3; sid3
1096@d id8==id_first:=2; sid2
1097@d id9==id_first:=1; sid1
1098
1099@<Globals...@>=
1100@!cur_name:name_pointer; {points to the identifier just inserted}
1101
1102@ The intended use of the macros above might not be immediately obvious,
1103but the riddle is answered by the following:
1104
1105@<Store all the reserved words@>=
1106id_loc:=10;@/
1107id3("a")("n")("d")(char_like+and_sign);@/
1108id5("a")("r")("r")("a")("y")(array_like);@/
1109id5("b")("e")("g")("i")("n")(begin_like);@/
1110id4("c")("a")("s")("e")(case_like);@/
1111id5("c")("o")("n")("s")("t")(const_like);@/
1112id3("d")("i")("v")(div_like);@/
1113id2("d")("o")(do_like);@/
1114id6("d")("o")("w")("n")("t")("o")(to_like);@/
1115id4("e")("l")("s")("e")(else_like);@/
1116id3("e")("n")("d")(end_like);@/
1117id4("f")("i")("l")("e")(array_like);@/
1118id3("f")("o")("r")(for_like);@/
1119id8("f")("u")("n")("c")("t")("i")("o")("n")(proc_like);@/
1120id4("g")("o")("t")("o")(goto_like);@/
1121id2("i")("f")(if_like);@/
1122id2("i")("n")(char_like+set_element_sign);@/
1123id5("l")("a")("b")("e")("l")(const_like);@/
1124id3("m")("o")("d")(div_like);@/
1125id3("n")("i")("l")(nil_like);@/
1126id3("n")("o")("t")(char_like+not_sign);@/
1127id2("o")("f")(do_like);@/
1128id2("o")("r")(char_like+or_sign);@/
1129id6("p")("a")("c")("k")("e")("d")(goto_like);@/
1130id9("p")("r")("o")("c")("e")("d")("u")("r")("e")(proc_like);@/
1131id7("p")("r")("o")("g")("r")("a")("m")(proc_like);@/
1132id6("r")("e")("c")("o")("r")("d")(record_like);@/
1133id6("r")("e")("p")("e")("a")("t")(repeat_like);@/
1134id3("s")("e")("t")(array_like);@/
1135id4("t")("h")("e")("n")(do_like);@/
1136id2("t")("o")(to_like);@/
1137id4("t")("y")("p")("e")(const_like);@/
1138id5("u")("n")("t")("i")("l")(until_like);@/
1139id3("v")("a")("r")(var_like);@/
1140id5("w")("h")("i")("l")("e")(for_like);@/
1141id4("w")("i")("t")("h")(for_like);@/
1142id7("x")("c")("l")("a")("u")("s")("e")(loop_like);@/
1143
1144@* Searching for module names.
1145The |mod_lookup| procedure finds the module name |mod_text[1..l]| in the
1146search tree, after inserting it if necessary, and returns a pointer to
1147where it was found.
1148
1149@<Glob...@>=
1150@!mod_text:array [0..longest_name] of ASCII_code; {name being sought for}
1151
1152@ According to the rules of \.{WEB}, no module name
1153should be a proper prefix of another, so a ``clean'' comparison should
1154occur between any two names. The result of |mod_lookup| is 0 if this
1155prefix condition is violated. An error message is printed when such violations
1156are detected during phase two of \.{WEAVE}.
1157
1158@d less=0 {the first name is lexicographically less than the second}
1159@d equal=1 {the first name is equal to the second}
1160@d greater=2 {the first name is lexicographically greater than the second}
1161@d prefix=3 {the first name is a proper prefix of the second}
1162@d extension=4 {the first name is a proper extension of the second}
1163
1164@p function mod_lookup(@!l:sixteen_bits):name_pointer; {finds module name}
1165label found;
1166var c:less..extension; {comparison between two names}
1167@!j:0..longest_name; {index into |mod_text|}
1168@!k:0..max_bytes; {index into |byte_mem|}
1169@!w:0..ww-1; {row of |byte_mem|}
1170@!p:name_pointer; {current node of the search tree}
1171@!q:name_pointer; {father of node |p|}
1172begin c:=greater; q:=0; p:=root;
1173while p<>0 do
1174  begin @<Set variable |c| to the result of comparing the given name
1175    to name |p|@>;
1176  q:=p;
1177  if c=less then p:=llink[q]
1178  else if c=greater then p:=rlink[q]
1179  else goto found;
1180  end;
1181@<Enter a new module name into the tree@>;
1182found: if c<>equal then
1183  begin err_print('! Incompatible section names'); p:=0;
1184@.Incompatible section names@>
1185  end;
1186mod_lookup:=p;
1187end;
1188
1189@ @<Enter a new module name...@>=
1190w:=name_ptr mod ww; k:=byte_ptr[w];
1191if k+l>max_bytes then overflow('byte memory');
1192if name_ptr>max_names-ww then overflow('name');
1193p:=name_ptr;
1194if c=less then llink[q]:=p else rlink[q]:=p;
1195llink[p]:=0; rlink[p]:=0; xref[p]:=0; c:=equal;
1196for j:=1 to l do byte_mem[w,k+j-1]:=mod_text[j];
1197byte_ptr[w]:=k+l; byte_start[name_ptr+ww]:=k+l; incr(name_ptr);
1198
1199@ @<Set variable |c|...@>=
1200begin k:=byte_start[p]; w:=p mod ww; c:=equal; j:=1;
1201while (k<byte_start[p+ww]) and (j<=l) and (mod_text[j]=byte_mem[w,k]) do
1202  begin incr(k); incr(j);
1203  end;
1204if k=byte_start[p+ww] then
1205  if j>l then c:=equal
1206  else c:=extension
1207else if j>l then c:=prefix
1208else if mod_text[j]<byte_mem[w,k] then c:=less
1209else c:=greater;
1210end
1211
1212@ The |prefix_lookup| procedure is supposed to find exactly one module
1213name that has |mod_text[1..l]| as a prefix. Actually the algorithm
1214silently accepts also the situation that some module name is a prefix of
1215|mod_text[1..l]|, because the user who painstakingly typed in more than
1216necessary probably doesn't want to be told about the wasted effort.
1217
1218Recall that error messages are not printed during phase one. It is
1219possible that the |prefix_lookup| procedure will fail on the first pass,
1220because there is no match, yet the second pass might detect no error if a
1221matching module name has occurred after the offending prefix. In such a
1222case the cross-reference information will be incorrect and \.{WEAVE} will
1223report no error. However, such a mistake will be detected by the
1224\.{TANGLE} processor.
1225
1226@p function prefix_lookup(@!l:sixteen_bits):name_pointer; {finds name extension}
1227var c:less..extension; {comparison between two names}
1228@!count:0..max_names; {the number of hits}
1229@!j:0..longest_name; {index into |mod_text|}
1230@!k:0..max_bytes; {index into |byte_mem|}
1231@!w:0..ww-1; {row of |byte_mem|}
1232@!p:name_pointer; {current node of the search tree}
1233@!q:name_pointer; {another place to resume the search after one branch is done}
1234@!r:name_pointer; {extension found}
1235begin q:=0; p:=root; count:=0; r:=0; {begin search at root of tree}
1236while p<>0 do
1237  begin @<Set variable |c| to the result of comparing...@>;
1238  if c=less then p:=llink[p]
1239  else if c=greater then p:=rlink[p]
1240  else  begin r:=p; incr(count); q:=rlink[p]; p:=llink[p];
1241    end;
1242  if p=0 then
1243    begin p:=q; q:=0;
1244    end;
1245  end;
1246if count<>1 then
1247  if count=0 then err_print('! Name does not match')
1248@.Name does not match@>
1249  else err_print('! Ambiguous prefix');
1250@.Ambiguous prefix@>
1251prefix_lookup:=r; {the result will be 0 if there was no match}
1252end;
1253
1254@* Lexical scanning.
1255Let us now consider the subroutines that read the \.{WEB} source file
1256and break it into meaningful units. There are four such procedures:
1257One simply skips to the next `\.{@@\ }' or `\.{@@*}' that begins a
1258module; another passes over the \TeX\ text at the beginning of a
1259module; the third passes over the \TeX\ text in a \PASCAL\ comment;
1260and the last, which is the most interesting, gets the next token of
1261a \PASCAL\ text.
1262
1263@ But first we need to consider the low-level routine |get_line|
1264that takes care of merging |change_file| into |web_file|. The |get_line|
1265procedure also updates the line numbers for error messages.
1266
1267@<Globals...@>=
1268@!ii:integer; {general purpose |for| loop variable in the outer block}
1269@!line:integer; {the number of the current line in the current file}
1270@!other_line:integer; {the number of the current line in the input file that
1271  is not currently being read}
1272@!temp_line:integer; {used when interchanging |line| with |other_line|}
1273@!limit:0..long_buf_size; {the last character position occupied in the buffer}
1274@!loc:0..long_buf_size; {the next character position to be read from the buffer}
1275@!input_has_ended: boolean; {if |true|, there is no more input}
1276@!changing: boolean; {if |true|, the current line is from |change_file|}
1277@!change_pending: boolean; {if |true|, the current change is not yet
1278  recorded in |changed_module[module_count]|}
1279
1280@ As we change |changing| from |true| to |false| and back again, we must
1281remember to swap the values of |line| and |other_line| so that the |err_print|
1282routine will be sure to report the correct line number.
1283
1284@d change_changing==
1285  changing := not changing;
1286  temp_line:=other_line; other_line:=line; line:=temp_line
1287    {|line @t$\null\BA\null$@> other_line|}
1288
1289@ When |changing| is |false|, the next line of |change_file| is kept in
1290|change_buffer[0..change_limit]|, for purposes of comparison with the next
1291line of |web_file|. After the change file has been completely input, we
1292set |change_limit:=0|, so that no further matches will be made.
1293
1294@<Globals...@>=
1295@!change_buffer:array[0..buf_size] of ASCII_code;
1296@!change_limit:0..buf_size; {the last position occupied in |change_buffer|}
1297
1298@ Here's a simple function that checks if the two buffers are different.
1299
1300@p function lines_dont_match:boolean;
1301label exit;
1302var k:0..buf_size; {index into the buffers}
1303begin lines_dont_match:=true;
1304if change_limit<>limit then return;
1305if limit>0 then
1306  for k:=0 to limit-1 do if change_buffer[k]<>buffer[k] then return;
1307lines_dont_match:=false;
1308exit: end;
1309
1310@ Procedure |prime_the_change_buffer| sets |change_buffer| in preparation
1311for the next matching operation. Since blank lines in the change file are
1312not used for matching, we have |(change_limit=0)and not changing| if and
1313only if the change file is exhausted. This procedure is called only
1314when |changing| is true; hence error messages will be reported correctly.
1315
1316@p procedure prime_the_change_buffer;
1317label continue, done, exit;
1318var k:0..buf_size; {index into the buffers}
1319begin change_limit:=0; {this value will be used if the change file ends}
1320@<Skip over comment lines in the change file; |return| if end of file@>;
1321@<Skip to the next nonblank line; |return| if end of file@>;
1322@<Move |buffer| and |limit| to |change_buffer| and |change_limit|@>;
1323exit: end;
1324
1325@ While looking for a line that begins with \.{@@x} in the change file,
1326we allow lines that begin with \.{@@}, as long as they don't begin with
1327\.{@@y} or \.{@@z} (which would probably indicate that the change file is
1328fouled up).
1329
1330@<Skip over comment lines in the change file...@>=
1331loop@+  begin incr(line);
1332  if not input_ln(change_file) then return;
1333  if limit<2 then goto continue;
1334  if buffer[0]<>"@@" then goto continue;
1335  if (buffer[1]>="X")and(buffer[1]<="Z") then
1336    buffer[1]:=buffer[1]+"z"-"Z"; {lowercasify}
1337  if buffer[1]="x" then goto done;
1338  if (buffer[1]="y")or(buffer[1]="z") then
1339    begin loc:=2; err_print('! Where is the matching @@x?');
1340@.Where is the match...@>
1341    end;
1342continue: end;
1343done:
1344
1345@ Here we are looking at lines following the \.{@@x}.
1346
1347@<Skip to the next nonblank line...@>=
1348repeat incr(line);
1349  if not input_ln(change_file) then
1350    begin err_print('! Change file ended after @@x');
1351@.Change file ended...@>
1352    return;
1353    end;
1354until limit>0;
1355
1356@ @<Move |buffer| and |limit| to |change_buffer| and |change_limit|@>=
1357begin change_limit:=limit;
1358if limit>0 then for k:=0 to limit-1 do change_buffer[k]:=buffer[k];
1359end
1360
1361@ The following procedure is used to see if the next change entry should
1362go into effect; it is called only when |changing| is false.
1363The idea is to test whether or not the current
1364contents of |buffer| matches the current contents of |change_buffer|.
1365If not, there's nothing more to do; but if so, a change is called for:
1366All of the text down to the \.{@@y} is supposed to match. An error
1367message is issued if any discrepancy is found. Then the procedure
1368prepares to read the next line from |change_file|.
1369
1370When a match is found, the current module is marked as changed unless
1371the first line after the \.{@@x} and after the \.{@@y} both start with
1372either |'@@*'| or |'@@ '| (possibly preceded by whitespace).
1373
1374@d if_module_start_then_make_change_pending(#)==
1375  loc:=0; buffer[limit]:="!";
1376  while (buffer[loc]=" ")or(buffer[loc]=tab_mark) do incr(loc);
1377  buffer[limit]:=" ";
1378  if buffer[loc]="@@" then
1379    if (buffer[loc+1]="*") or
1380       (buffer[loc+1]=" ") or (buffer[loc+1]=tab_mark) then
1381      change_pending:=#
1382
1383@p procedure check_change; {switches to |change_file| if the buffers match}
1384label exit;
1385var n:integer; {the number of discrepancies found}
1386@!k:0..buf_size; {index into the buffers}
1387begin if lines_dont_match then return;
1388change_pending:=false;
1389if not changed_module[module_count] then
1390  begin if_module_start_then_make_change_pending(true);
1391  if not change_pending then changed_module[module_count]:=true;
1392  end;
1393n:=0;
1394loop@+  begin change_changing; {now it's |true|}
1395  incr(line);
1396  if not input_ln(change_file) then
1397    begin err_print('! Change file ended before @@y');
1398@.Change file ended...@>
1399    change_limit:=0;  change_changing; {|false| again}
1400    return;
1401    end;
1402  @<If the current line starts with \.{@@y},
1403    report any discrepancies and |return|@>;
1404  @<Move |buffer| and |limit|...@>;
1405  change_changing; {now it's |false|}
1406  incr(line);
1407  if not input_ln(web_file) then
1408    begin err_print('! WEB file ended during a change');
1409@.WEB file ended...@>
1410    input_has_ended:=true; return;
1411    end;
1412  if lines_dont_match then incr(n);
1413  end;
1414exit: end;
1415
1416@ @<If the current line starts with \.{@@y}...@>=
1417if limit>1 then if buffer[0]="@@" then
1418  begin if (buffer[1]>="X")and(buffer[1]<="Z") then
1419    buffer[1]:=buffer[1]+"z"-"Z"; {lowercasify}
1420  if (buffer[1]="x")or(buffer[1]="z") then
1421    begin loc:=2; err_print('! Where is the matching @@y?');
1422@.Where is the match...@>
1423    end
1424  else if buffer[1]="y" then
1425    begin if n>0 then
1426      begin loc:=2; err_print('! Hmm... ',n:1,
1427        ' of the preceding lines failed to match');
1428@.Hmm... n of the preceding...@>
1429      end;
1430    return;
1431    end;
1432  end
1433
1434@ The |reset_input| procedure, which gets \.{WEAVE} ready to read the
1435user's \.{WEB} input, is used at the beginning of phases one and two.
1436
1437@p procedure reset_input;
1438begin open_input; line:=0; other_line:=0;@/
1439changing:=true; prime_the_change_buffer; change_changing;@/
1440limit:=0; loc:=1; buffer[0]:=" "; input_has_ended:=false;
1441end;
1442
1443@ The |get_line| procedure is called when |loc>limit|; it puts the next
1444line of merged input into the buffer and updates the other variables
1445appropriately. A space is placed at the right end of the line.
1446
1447@p procedure get_line; {inputs the next line}
1448label restart;
1449begin restart:if changing then
1450  @<Read from |change_file| and maybe turn off |changing|@>;
1451if not changing then
1452  begin @<Read from |web_file| and maybe turn on |changing|@>;
1453  if changing then goto restart;
1454  end;
1455loc:=0; buffer[limit]:=" ";
1456end;
1457
1458@ @<Read from |web_file|...@>=
1459begin incr(line);
1460if not input_ln(web_file) then input_has_ended:=true
1461else if limit=change_limit then
1462  if buffer[0]=change_buffer[0] then
1463    if change_limit>0 then check_change;
1464end
1465
1466@ @<Read from |change_file|...@>=
1467begin incr(line);
1468if not input_ln(change_file) then
1469  begin err_print('! Change file ended without @@z');
1470@.Change file ended...@>
1471  buffer[0]:="@@"; buffer[1]:="z"; limit:=2;
1472  end;
1473if limit>0 then {check if the change has ended}
1474  begin if change_pending then
1475    begin if_module_start_then_make_change_pending(false);
1476    if change_pending then
1477      begin changed_module[module_count]:=true; change_pending:=false;
1478      end;
1479    end;
1480  buffer[limit]:=" ";
1481  if buffer[0]="@@" then
1482    begin if (buffer[1]>="X")and(buffer[1]<="Z") then
1483      buffer[1]:=buffer[1]+"z"-"Z"; {lowercasify}
1484    if (buffer[1]="x")or(buffer[1]="y") then
1485      begin loc:=2; err_print('! Where is the matching @@z?');
1486@.Where is the match...@>
1487      end
1488    else if buffer[1]="z" then
1489      begin prime_the_change_buffer; change_changing;
1490      end;
1491    end;
1492  end;
1493end
1494
1495@ At the end of the program, we will tell the user if the change file
1496had a line that didn't match any relevant line in |web_file|.
1497
1498@<Check that all changes have been read@>=
1499if change_limit<>0 then {|changing| is false}
1500  begin for ii:=0 to change_limit do buffer[ii]:=change_buffer[ii];
1501  limit:=change_limit; changing:=true; line:=other_line; loc:=change_limit;
1502  err_print('! Change file entry did not match');
1503@.Change file entry did not match@>
1504  end
1505
1506@ Control codes in \.{WEB}, which begin with `\.{@@}', are converted
1507into a numeric code designed to simplify \.{WEAVE}'s logic; for example,
1508larger numbers are given to the control codes that denote more significant
1509milestones, and the code of |new_module| should be the largest of
1510all. Some of these numeric control codes take the place of ASCII
1511control codes that will not otherwise appear in the output of the
1512scanning routines.
1513@^ASCII code@>
1514
1515@d ignore=0 {control code of no interest to \.{WEAVE}}
1516@d verbatim=@'2 {extended ASCII alpha will not appear}
1517@d force_line=@'3 {extended ASCII beta will not appear}
1518@d begin_comment=@'11 {ASCII tab mark will not appear}
1519@d end_comment=@'12 {ASCII line feed will not appear}
1520@d octal=@'14 {ASCII form feed will not appear}
1521@d hex=@'15 {ASCII carriage return will not appear}
1522@d double_dot=@'40 {ASCII space will not appear except in strings}
1523@d no_underline=@'175 {this code will be intercepted without confusion}
1524@d underline=@'176 {this code will be intercepted without confusion}
1525@d param=@'177 {ASCII delete will not appear}
1526@d xref_roman=@'203 {control code for `\.{@@\^}'}
1527@d xref_wildcard=@'204 {control code for `\.{@@:}'}
1528@d xref_typewriter=@'205 {control code for `\.{@@.}'}
1529@d TeX_string=@'206 {control code for `\.{@@t}'}
1530@d check_sum=@'207 {control code for `\.{@@\$}'}
1531@d join=@'210 {control code for `\.{@@\&}'}
1532@d thin_space=@'211 {control code for `\.{@@,}'}
1533@d math_break=@'212 {control code for `\.{@@\char'174}'}
1534@d line_break=@'213 {control code for `\.{@@/}'}
1535@d big_line_break=@'214 {control code for `\.{@@\#}'}
1536@d no_line_break=@'215 {control code for `\.{@@+}'}
1537@d pseudo_semi=@'216 {control code for `\.{@@;}'}
1538@d format=@'217 {control code for `\.{@@f}'}
1539@d definition=@'220 {control code for `\.{@@d}'}
1540@d begin_Pascal=@'221 {control code for `\.{@@p}'}
1541@d module_name=@'222 {control code for `\.{@@<}'}
1542@d new_module=@'223 {control code for `\.{@@\ }' and `\.{@@*}'}
1543
1544@ Control codes are converted from ASCII to \.{WEAVE}'s internal
1545representation by the |control_code| routine.
1546
1547@p function control_code(@!c:ASCII_code):eight_bits; {convert |c|
1548  after \.{@@}}
1549begin case c of
1550"@@": control_code:="@@"; {`quoted' at sign}
1551"'": control_code:=octal; {precedes octal constant}
1552"""": control_code:=hex; {precedes hexadecimal constant}
1553"$": control_code:=check_sum; {precedes check sum constant}
1554" ",tab_mark,"*": control_code:=new_module; {beginning of a new module}
1555"=": control_code:=verbatim;
1556"\": control_code:=force_line;
1557"D","d": control_code:=definition; {macro definition}
1558"F","f": control_code:=format; {format definition}
1559"{": control_code:=begin_comment; {begin-comment delimiter}
1560"}": control_code:=end_comment; {end-comment delimiter}
1561"P","p": control_code:=begin_Pascal; {\PASCAL\ text in unnamed module}
1562"&": control_code:=join; {concatenate two tokens}
1563"<": control_code:=module_name; {beginning of a module name}
1564">": begin err_print('! Extra @@>'); control_code:=ignore;
1565@.Extra \AT!>@>
1566  end; {end of module name should not be discovered in this way}
1567"T","t": control_code:=TeX_string; {\TeX\ box within \PASCAL}
1568"!": control_code:=underline; {set definition flag}
1569"?": control_code:=no_underline; {reset definition flag}
1570"^": control_code:=xref_roman; {index entry to be typeset normally}
1571":": control_code:=xref_wildcard; {index entry to be in user format}
1572".": control_code:=xref_typewriter; {index entry to be in typewriter type}
1573",": control_code:=thin_space; {puts extra space in \PASCAL\ format}
1574"|": control_code:=math_break; {allows a break in a formula}
1575"/": control_code:=line_break; {forces end-of-line in \PASCAL\ format}
1576"#": control_code:=big_line_break; {forces end-of-line and some space besides}
1577"+": control_code:=no_line_break; {cancels end-of-line down to single space}
1578";": control_code:=pseudo_semi; {acts like a semicolon, but is invisible}
1579@t\4@>@<Special control codes allowed only when debugging@>@;
1580othercases begin err_print('! Unknown control code'); control_code:=ignore;
1581@.Unknown control code@>
1582  end
1583endcases;
1584end;
1585
1586@ If \.{WEAVE} is compiled with debugging commands, one can write
1587\.{@@2}, \.{@@1}, and \.{@@0} to turn tracing fully on, partly on,
1588and off, respectively.
1589@.\AT!2@>
1590@.\AT!1@>
1591
1592@<Special control codes...@>=
1593@!debug@t@>@/
1594"0","1","2": begin tracing:=c-"0"; control_code:=ignore;
1595  end;
1596gubed
1597
1598@ The |skip_limbo| routine is used on the first pass to skip through
1599portions of the input that are not in any modules, i.e., that precede
1600the first module. After this procedure has been called, the value of
1601|input_has_ended| will tell whether or not a new module has
1602actually been found.
1603
1604@p procedure skip_limbo; {skip to next module}
1605label exit;
1606var c:ASCII_code; {character following \.{@@}}
1607begin loop if loc>limit then
1608    begin get_line;
1609    if input_has_ended then return;
1610    end
1611  else  begin buffer[limit+1]:="@@";
1612    while buffer[loc]<>"@@" do incr(loc);
1613    if loc<=limit then
1614      begin loc:=loc+2; c:=buffer[loc-1];
1615      if (c=" ")or(c=tab_mark)or(c="*") then return;
1616      end;
1617    end;
1618exit: end;
1619
1620@ The |skip_TeX| routine is used on the first pass to skip through
1621the \TeX\ code at the beginning of a module. It returns the next
1622control code or `\v' found in the input. A |new_module| is
1623assumed to exist at the very end of the file.
1624
1625@p function skip_TeX: eight_bits; {skip past pure \TeX\ code}
1626label done;
1627var c:eight_bits; {control code found}
1628begin loop begin if loc>limit then
1629    begin get_line;
1630    if input_has_ended then
1631      begin c:=new_module; goto done;
1632      end;
1633    end;
1634  buffer[limit+1]:="@@";
1635  repeat c:=buffer[loc]; incr(loc);
1636  if c="|" then goto done;
1637  until c="@@";
1638  if loc<=limit then
1639    begin c:=control_code(buffer[loc]); incr(loc); goto done;
1640    end;
1641  end;
1642done:skip_TeX:=c;
1643end;
1644
1645@ The |skip_comment| routine is used on the first pass to skip
1646through \TeX\ code in \PASCAL\ comments. The |bal| parameter
1647tells how many left braces are assumed to have been scanned when
1648this routine is called, and the procedure returns a corresponding
1649value of |bal| at the point that scanning has stopped. Scanning
1650stops either at a `\v' that introduces \PASCAL\ text,
1651in which case the returned value is positive, or it stops at the
1652end of the comment, in which case the returned value is zero.
1653The scanning also stops in anomalous situations when the comment
1654doesn't end or when it contains an illegal use of \.{@@}.
1655One should call |skip_comment(1)| when beginning to scan a comment.
1656
1657@p function skip_comment(@!bal:eight_bits):eight_bits; {skips \TeX\
1658  code in comments}
1659label done;
1660var c:ASCII_code; {the current character}
1661begin loop begin if loc>limit then
1662    begin get_line;
1663    if input_has_ended then
1664      begin bal:=0; goto done;
1665      end; {an error message will occur in phase two}
1666    end;
1667  c:=buffer[loc]; incr(loc);
1668  if c="|" then goto done;
1669  @<Do special things when |c="@@", "\", "{", "}"|; |goto done| at end@>;
1670  end;
1671done: skip_comment:=bal;
1672end;
1673
1674@ @<Do special things when |c="@@"...@>=
1675if c="@@" then
1676  begin c:=buffer[loc];
1677  if (c<>" ")and(c<>tab_mark)and(c<>"*") then incr(loc)
1678  else  begin decr(loc); bal:=0; goto done;
1679    end {an error message will occur in phase two}
1680  end
1681else if (c="\")and(buffer[loc]<>"@@") then incr(loc)
1682else if c="{" then incr(bal)
1683else if c="}" then
1684  begin decr(bal);
1685  if bal=0 then goto done;
1686  end
1687
1688@* Inputting the next token.
1689As stated above, \.{WEAVE}'s most interesting lexical scanning routine is the
1690|get_next| function that inputs the next token of \PASCAL\ input. However,
1691|get_next| is not especially complicated.
1692
1693The result of |get_next| is either an ASCII code for some special character,
1694or it is a special code representing a pair of characters (e.g., `\.{:=}'
1695or `\.{..}'), or it is the numeric value computed by the |control_code|
1696procedure, or it is one of the following special codes:
1697
1698\yskip\hang |exponent|: The `\.E' in a real constant.
1699
1700\yskip\hang |identifier|: In this case the global variables |id_first|
1701and |id_loc| will have been set to the appropriate values needed by the
1702|id_lookup| routine.
1703
1704\yskip\hang |string|: In this case the global variables |id_first| and
1705|id_loc| will have been set to the beginning and ending-plus-one locations
1706in the buffer.  The string ends with the first reappearance of its initial
1707delimiter; thus, for example, $$\.{\'This isn\'\'t a single string\'}$$
1708will be treated as two consecutive strings, the first being \.{\'This
1709isn\'}.
1710
1711\yskip\noindent Furthermore, some of the control codes cause
1712|get_next| to take additional actions:
1713
1714\yskip\hang |xref_roman|, |xref_wildcard|,
1715|xref_typewriter|, |TeX_string|: The values of
1716|id_first| and |id_loc| will be set so that the string in question appears
1717in |buffer[id_first..(id_loc-1)]|.
1718
1719\yskip\hang |module_name|: In this case the global variable |cur_module| will
1720point to the |byte_start| entry for the module name that has just been scanned.
1721
1722\yskip\noindent If |get_next| sees `\.{@@!}' or `\.{@@?}',
1723it sets |xref_switch| to |def_flag| or zero and goes on to the next token.
1724
1725A global variable called |scanning_hex| is set |true| during the time that
1726the letters \.A through \.F should be treated as if they were digits.
1727
1728@d exponent=@'200 {\.E or \.e following a digit}
1729@d string=@'201 {\PASCAL\ string or \.{WEB} precomputed string}
1730@d identifier=@'202 {\PASCAL\ identifier or reserved word}
1731
1732@<Globals...@>=
1733@!cur_module: name_pointer; {name of module just scanned}
1734@!scanning_hex: boolean; {are we scanning a hexadecimal constant?}
1735
1736@ @<Set init...@>=
1737scanning_hex:=false;
1738
1739@ As one might expect, |get_next| consists mostly of a big switch
1740that branches to the various special cases that can arise.
1741
1742@d up_to(#)==#-24,#-23,#-22,#-21,#-20,#-19,#-18,#-17,#-16,#-15,#-14,
1743  #-13,#-12,#-11,#-10,#-9,#-8,#-7,#-6,#-5,#-4,#-3,#-2,#-1,#
1744
1745@p function get_next:eight_bits; {produces the next input token}
1746label restart,done,found;
1747var c:eight_bits; {the current character}
1748@!d:eight_bits; {the next character}
1749@!j,@!k:0..longest_name; {indices into |mod_text|}
1750begin restart: if loc>limit then
1751  begin get_line;
1752  if input_has_ended then
1753    begin c:=new_module; goto found;
1754    end;
1755  end;
1756c:=buffer[loc]; incr(loc);
1757if scanning_hex then @<Go to |found| if |c| is a hexadecimal digit,
1758  otherwise set |scanning_hex:=false|@>;
1759case c of
1760"A",up_to("Z"),"a",up_to("z"): @<Get an identifier@>;
1761"'","""": @<Get a string@>;
1762"@@": @<Get control code and possible module name@>;
1763@t\4@>@<Compress two-symbol combinations like `\.{:=}'@>@;
1764" ",tab_mark: goto restart; {ignore spaces and tabs}
1765"}": begin err_print('! Extra }'); goto restart;
1766@.Extra \}@>
1767  end;
1768othercases if c>=128 then goto restart {ignore nonstandard characters}
1769  else do_nothing
1770endcases;
1771found:@!debug if trouble_shooting then debug_help;@;@+gubed@/
1772get_next:=c;
1773end;
1774
1775@ @<Go to |found| if |c| is a hexadecimal digit...@>=
1776if ((c>="0")and(c<="9"))or((c>="A")and(c<="F")) then goto found
1777else scanning_hex:=false
1778
1779@ Note that the following code substitutes \.{@@\{} and \.{@@\}} for the
1780respective combinations `\.{(*}' and `\.{*)}'. Explicit braces should be used
1781for \TeX\ comments in \PASCAL\ text.
1782
1783@d compress(#)==begin if loc<=limit then begin c:=#; incr(loc); end; end
1784
1785@<Compress two-symbol...@>=
1786".": if buffer[loc]="." then compress(double_dot)
1787  else if buffer[loc]=")" then compress("]");
1788":": if buffer[loc]="=" then compress(left_arrow);
1789"=": if buffer[loc]="=" then compress(equivalence_sign);
1790">": if buffer[loc]="=" then compress(greater_or_equal);
1791"<": if buffer[loc]="=" then compress(less_or_equal)
1792  else if buffer[loc]=">" then compress(not_equal);
1793"(": if buffer[loc]="*" then compress(begin_comment)
1794  else if buffer[loc]="." then compress("[");
1795"*": if buffer[loc]=")" then compress(end_comment);
1796
1797@ @<Get an identifier@>=
1798begin if ((c="E")or(c="e"))and(loc>1) then
1799  if (buffer[loc-2]<="9")and(buffer[loc-2]>="0") then c:=exponent;
1800if c<>exponent then
1801  begin decr(loc); id_first:=loc;
1802  repeat incr(loc); d:=buffer[loc];
1803  until ((d<"0")or((d>"9")and(d<"A"))or((d>"Z")and(d<"a"))or(d>"z"))and(d<>"_");
1804  c:=identifier; id_loc:=loc;
1805  end;
1806end
1807
1808@ A string that starts and ends with single or double quote marks is
1809scanned by the following piece of the program.
1810
1811@<Get a string@>=
1812begin id_first:=loc-1;
1813repeat d:=buffer[loc]; incr(loc);
1814if loc>limit then
1815  begin err_print('! String constant didn''t end');
1816@.String constant didn't end@>
1817  loc:=limit; d:=c;
1818  end;
1819until d=c;
1820id_loc:=loc; c:=string;
1821end
1822
1823@ After an \.{@@} sign has been scanned, the next character tells us
1824whether there is more work to do.
1825
1826@<Get control code and possible module name@>=
1827begin c:=control_code(buffer[loc]); incr(loc);
1828if c=underline then
1829  begin xref_switch:=def_flag; goto restart;
1830  end
1831else if c=no_underline then
1832  begin xref_switch:=0; goto restart;
1833  end
1834else if (c<=TeX_string)and(c>=xref_roman) then
1835  @<Scan to the next \.{@@>}@>
1836else if c=hex then scanning_hex:=true
1837else if c=module_name then
1838  @<Scan the module name and make |cur_module| point to it@>
1839else if c=verbatim then @<Scan a verbatim string@>;
1840end
1841
1842@ The occurrence of a module name sets |xref_switch| to zero,
1843because the module name might (for example) follow \&{var}.
1844
1845@<Scan the module name...@>=
1846begin @<Put module name into |mod_text[1..k]|@>;
1847if k>3 then
1848  begin if (mod_text[k]=".")and(mod_text[k-1]=".")and(mod_text[k-2]=".") then
1849    cur_module:=prefix_lookup(k-3)
1850  else cur_module:=mod_lookup(k);
1851  end
1852else cur_module:=mod_lookup(k);
1853xref_switch:=0;
1854end
1855
1856@ Module names are placed into the |mod_text| array with consecutive spaces,
1857tabs, and carriage-returns replaced by single spaces. There will be no
1858spaces at the beginning or the end. (We set |mod_text[0]:=" "| to facilitate
1859this, since the |mod_lookup| routine uses |mod_text[1]| as the first
1860character of the name.)
1861
1862@<Set init...@>=mod_text[0]:=" ";
1863
1864@ @<Put module name...@>=
1865k:=0;
1866loop@+  begin if loc>limit then
1867    begin get_line;
1868    if input_has_ended then
1869      begin err_print('! Input ended in section name');
1870@.Input ended in section name@>
1871      loc:=1; goto done;
1872      end;
1873    end;
1874  d:=buffer[loc];
1875  @<If end of name, |goto done|@>;
1876  incr(loc); if k<longest_name-1 then incr(k);
1877  if (d=" ")or(d=tab_mark) then
1878    begin d:=" "; if mod_text[k-1]=" " then decr(k);
1879    end;
1880  mod_text[k]:=d;
1881  end;
1882done: @<Check for overlong name@>;
1883if (mod_text[k]=" ")and(k>0) then decr(k)
1884
1885@ @<If end of name,...@>=
1886if d="@@" then
1887  begin d:=buffer[loc+1];
1888  if d=">" then
1889    begin loc:=loc+2; goto done;
1890    end;
1891  if (d=" ")or(d=tab_mark)or(d="*") then
1892    begin err_print('! Section name didn''t end'); goto done;
1893@.Section name didn't end@>
1894    end;
1895  incr(k); mod_text[k]:="@@"; incr(loc); {now |d=buffer[loc]| again}
1896  end
1897
1898@ @<Check for overlong name@>=
1899if k>=longest_name-2 then
1900  begin print_nl('! Section name too long: ');
1901@.Section name too long@>
1902  for j:=1 to 25 do print(xchr[mod_text[j]]);
1903  print('...'); mark_harmless;
1904  end
1905
1906@ @<Scan to the next...@>=
1907begin id_first:=loc; buffer[limit+1]:="@@";
1908while buffer[loc]<>"@@" do incr(loc);
1909id_loc:=loc;
1910if loc>limit then
1911  begin err_print('! Control text didn''t end'); loc:=limit;
1912@.Control text didn't end@>
1913  end
1914else  begin loc:=loc+2;
1915  if buffer[loc-1]<>">" then
1916    err_print('! Control codes are forbidden in control text');
1917@.Control codes are forbidden...@>
1918  end;
1919end
1920
1921@ A verbatim \PASCAL\ string will be treated like ordinary strings, but
1922with no surrounding delimiters.  At the present point in the program we
1923have |buffer[loc-1]=verbatim|; we must set |id_first| to the beginning
1924of the string itself, and |id_loc| to its ending-plus-one location in the
1925buffer.  We also set |loc| to the position just after the ending delimiter.
1926
1927@<Scan a verbatim string@>=
1928begin id_first:=loc; incr(loc);
1929buffer[limit+1]:="@@"; buffer[limit+2]:=">";
1930while (buffer[loc]<>"@@")or(buffer[loc+1]<>">") do incr(loc);
1931if loc>=limit then err_print('! Verbatim string didn''t end');
1932@.Verbatim string didn't end@>
1933id_loc:=loc; loc:=loc+2;
1934end
1935
1936@* Phase one processing.
1937We now have accumulated enough subroutines to make it possible to carry out
1938\.{WEAVE}'s first pass over the source file. If everything works right,
1939both phase one and phase two of \.{WEAVE} will assign the same numbers to
1940modules, and these numbers will agree with what \.{TANGLE} does.
1941
1942The global variable |next_control| often contains the most recent output of
1943|get_next|; in interesting cases, this will be the control code that
1944ended a module or part of a module.
1945
1946@<Glob...@>=@!next_control:eight_bits; {control code waiting to be acting upon}
1947
1948@ The overall processing strategy in phase one has the following
1949straightforward outline.
1950
1951@<Phase I: Read all the user's text and store the cross references@>=
1952phase_one:=true; phase_three:=false;
1953reset_input;
1954module_count:=0; skip_limbo; change_exists:=false;
1955while not input_has_ended do
1956  @<Store cross reference data for the current module@>;
1957changed_module[module_count]:=change_exists;
1958  {the index changes if anything does}
1959phase_one:=false; {prepare for second phase}
1960@<Print error messages about unused or undefined module names@>;
1961
1962@ @<Store cross reference data...@>=
1963begin incr(module_count);
1964if module_count=max_modules then overflow('section number');
1965changed_module[module_count]:=changing;
1966 {it will become |true| if any line changes}
1967if buffer[loc-1]="*" then
1968  begin print('*',module_count:1);
1969  update_terminal; {print a progress report}
1970  end;
1971@<Store cross references in the \TeX\ part of a module@>;
1972@<Store cross references in the \(definition part of a module@>;
1973@<Store cross references in the \PASCAL\ part of a module@>;
1974if changed_module[module_count] then change_exists:=true;
1975end
1976
1977@ The |Pascal_xref| subroutine stores references to identifiers in
1978\PASCAL\ text material beginning with the current value of |next_control|
1979and continuing until |next_control| is `\.\{' or `\v', or until the next
1980``milestone'' is passed (i.e., |next_control>=format|). If
1981|next_control>=format| when |Pascal_xref| is called, nothing will happen;
1982but if |next_control="|"| upon entry, the procedure assumes that this is
1983the `\v' preceding \PASCAL\ text that is to be processed.
1984
1985The program uses the fact that our internal code numbers satisfy
1986the relations |xref_roman=identifier+roman| and |xref_wildcard=identifier
1987+wildcard| and |xref_typewriter=identifier+
1988typewriter| and |normal=0|. An implied `\.{@@!}' is inserted after
1989\&{function}, \&{procedure}, \&{program}, and \&{var}.
1990
1991@p procedure Pascal_xref; {makes cross references for \PASCAL\ identifiers}
1992label exit;
1993var p:name_pointer; {a referenced name}
1994begin while next_control<format do
1995  begin if (next_control>=identifier)and
1996      (next_control<=xref_typewriter) then
1997    begin p:=id_lookup(next_control-identifier); new_xref(p);
1998    if (ilk[p]=proc_like)or(ilk[p]=var_like) then
1999      xref_switch:=def_flag; {implied `\.{@@!}'}
2000    end;
2001  next_control:=get_next;
2002  if (next_control="|")or(next_control="{") then return;
2003  end;
2004exit:end;
2005
2006@ The |outer_xref| subroutine is like |Pascal_xref| but it begins
2007with |next_control<>"|"| and ends with |next_control>=format|. Thus, it
2008handles \PASCAL\ text with embedded comments.
2009
2010@p procedure outer_xref; {extension of |Pascal_xref|}
2011var bal:eight_bits; {brace level in comment}
2012begin while next_control<format do
2013  if next_control<>"{" then Pascal_xref
2014  else  begin bal:=skip_comment(1); next_control:="|";
2015    while bal>0 do
2016      begin Pascal_xref;
2017      if next_control="|" then bal:=skip_comment(bal)
2018      else bal:=0; {an error will be reported in phase two}
2019      end;
2020    end;
2021end;
2022
2023@ In the \TeX\ part of a module, cross reference entries are made only for
2024the identifiers in \PASCAL\ texts enclosed in \pb, or for control texts
2025enclosed in \.{@@\^}$\,\ldots\,$\.{@@>} or \.{@@.}$\,\ldots\,$\.{@@>}
2026or \.{@@:}$\,\ldots\,$\.{@@>}.
2027
2028@<Store cross references in the \T...@>=
2029repeat next_control:=skip_TeX;
2030case next_control of
2031underline: xref_switch:=def_flag;
2032no_underline: xref_switch:=0;
2033"|": Pascal_xref;
2034xref_roman, xref_wildcard, xref_typewriter, module_name:
2035  begin loc:=loc-2; next_control:=get_next; {scan to \.{@@>}}
2036  if next_control<>module_name then
2037    new_xref(id_lookup(next_control-identifier));
2038  end;
2039othercases do_nothing
2040endcases;
2041until next_control>=format
2042
2043@ During the definition and \PASCAL\ parts of a module, cross references
2044are made for all identifiers except reserved words; however, the
2045identifiers in a format definition are referenced even if they are
2046reserved. The \TeX\ code in comments is, of course, ignored, except for
2047\PASCAL\ portions enclosed in \pb; the text of a module name is skipped
2048entirely, even if it contains \pb\ constructions.
2049
2050The variables |lhs| and |rhs| point to the respective identifiers involved
2051in a format definition.
2052
2053@<Global...@>=
2054@!lhs,@!rhs:name_pointer; {indices into |byte_start| for format identifiers}
2055
2056@ When we get to the following code we have |next_control>=format|.
2057
2058@<Store cross references in the \(d...@>=
2059while next_control<=definition do {|format| or |definition|}
2060  begin xref_switch:=def_flag; {implied \.{@@!}}
2061  if next_control=definition then next_control:=get_next
2062  else @<Process a format definition@>;
2063  outer_xref;
2064  end
2065
2066@ Error messages for improper format definitions will be issued in phase
2067two. Our job in phase one is to define the |ilk| of a properly formatted
2068identifier, and to fool the |new_xref| routine into thinking that the
2069identifier on the right-hand side of the format definition is not a
2070reserved word.
2071
2072@<Process a form...@>=
2073begin next_control:=get_next;
2074if next_control=identifier then
2075  begin lhs:=id_lookup(normal); ilk[lhs]:=normal; new_xref(lhs);
2076  next_control:=get_next;
2077  if next_control=equivalence_sign then
2078    begin next_control:=get_next;
2079    if next_control=identifier then
2080      begin rhs:=id_lookup(normal);
2081      ilk[lhs]:=ilk[rhs]; ilk[rhs]:=normal; new_xref(rhs);
2082      ilk[rhs]:=ilk[lhs]; next_control:=get_next;
2083      end;
2084    end;
2085  end;
2086end
2087
2088@ Finally, when the \TeX\ and definition parts have been treated, we have
2089|next_control>=begin_Pascal|.
2090
2091@<Store cross references in the \P...@>=
2092if next_control<=module_name then {|begin_Pascal| or |module_name|}
2093  begin if next_control=begin_Pascal then mod_xref_switch:=0
2094  else mod_xref_switch:=def_flag;
2095  repeat if next_control=module_name then new_mod_xref(cur_module);
2096    next_control:=get_next; outer_xref;
2097  until next_control>module_name;
2098  end
2099
2100@ After phase one has looked at everything, we want to check that each
2101module name was both defined and used.
2102The variable |cur_xref| will point to cross references for the
2103current module name of interest.
2104
2105@<Glob...@>=@!cur_xref:xref_number; {temporary cross reference pointer}
2106
2107@ The following recursive procedure
2108walks through the tree of module names and prints out anomalies.
2109@^recursion@>
2110
2111@p procedure mod_check(@!p:name_pointer); {print anomalies in subtree |p|}
2112begin if p>0 then
2113  begin mod_check(llink[p]);@/
2114  cur_xref:=xref[p];
2115  if num(cur_xref)<def_flag then
2116    begin print_nl('! Never defined: <'); print_id(p);
2117@.Never defined: <section name>@>
2118    print('>'); mark_harmless;
2119    end;
2120  while num(cur_xref)>=def_flag do cur_xref:=xlink(cur_xref);
2121  if cur_xref=0 then
2122    begin print_nl('! Never used: <'); print_id(p); print('>');
2123@.Never used: <section name>@>
2124    mark_harmless;
2125    end;
2126  mod_check(rlink[p]);
2127  end;
2128end;
2129
2130@ @<Print error messages about un...@>=@+mod_check(root)
2131
2132@* Low-level output routines.
2133The \TeX\ output is supposed to appear in lines at most |line_length|
2134characters long, so we place it into an output buffer. During the output
2135process, |out_line| will hold the current line number of the line about to
2136be output.
2137
2138@<Glo...@>=
2139@!out_buf:array[0..line_length] of ASCII_code; {assembled characters}
2140@!out_ptr:0..line_length; {number of characters in |out_buf|}
2141@!out_line: integer; {coordinates of next line to be output}
2142
2143@ The |flush_buffer| routine empties the buffer up to a given breakpoint,
2144and moves any remaining characters to the beginning of the next line.
2145If the |per_cent| parameter is |true|, a |"%"| is appended to the line
2146that is being output; in this case the breakpoint |b| should be strictly
2147less than |line_length|. If the |per_cent| parameter is |false|,
2148trailing blanks are suppressed.
2149The characters emptied from the buffer form a new line of output;
2150if the |carryover| parameter is true, a |"%"| in that line will be
2151carried over to the next line (so that \TeX\ will ignore the completion
2152of commented-out text).
2153
2154@p procedure flush_buffer(@!b:eight_bits;@!per_cent,@!carryover:boolean);
2155  {outputs |out_buf[1..b]|, where |b<=out_ptr|}
2156label done,found;
2157var j,@!k:0..line_length;
2158begin j:=b;
2159if not per_cent then {remove trailing blanks}
2160  loop@+  begin if j=0 then goto done;
2161    if out_buf[j]<>" " then goto done;
2162    decr(j);
2163    end;
2164done: for k:=1 to j do write(tex_file,xchr[out_buf[k]]);
2165if per_cent then write(tex_file,xchr["%"]);
2166write_ln(tex_file); incr(out_line);
2167if carryover then
2168  for k:=1 to j do
2169    if out_buf[k]="%" then
2170      if (k=1)or(out_buf[k-1]<>"\") then {comment mode should be preserved}
2171        begin out_buf[b]:="%"; decr(b); goto found;
2172        end;
2173found: if (b<out_ptr) then
2174  for k:=b+1 to out_ptr do out_buf[k-b]:=out_buf[k];
2175out_ptr:=out_ptr-b;
2176end;
2177
2178@ When we are copying \TeX\ source material, we retain line breaks
2179that occur in the input, except that an empty line is not
2180output when the \TeX\ source line was nonempty. For example, a line
2181of the \TeX\ file that contains only an index cross-reference entry
2182will not be copied. The |finish_line| routine is called just before
2183|get_line| inputs a new line, and just after a line break token has
2184been emitted during the output of translated \PASCAL\ text.
2185
2186@p procedure finish_line; {do this at the end of a line}
2187label exit;
2188var k:0..buf_size; {index into |buffer|}
2189begin if out_ptr>0 then flush_buffer(out_ptr,false,false)
2190else  begin for k:=0 to limit do
2191    if (buffer[k]<>" ")and(buffer[k]<>tab_mark) then return;
2192  flush_buffer(0,false,false);
2193  end;
2194exit:end;
2195
2196@ In particular, the |finish_line| procedure is called near the very
2197beginning of phase two. We initialize the output variables in a slightly
2198tricky way so that the first line of the output file will be
2199`\.{\\input webmac}'.
2200@.\\input webmac@>
2201@.webmac@>
2202
2203@<Set init...@>=
2204out_ptr:=1; out_line:=1; out_buf[1]:="c"; write(tex_file,'\input webma');
2205
2206@ When we wish to append the character |c| to the output buffer, we write
2207`$|out|(c)$'; this will cause the buffer to be emptied if it was already
2208full. Similarly, `$|out2|(c_1)(c_2)$' appends a pair of characters.
2209A line break will occur at a space or after a single-nonletter
2210\TeX\ control sequence.
2211
2212@d oot(#)==@;@/
2213  if out_ptr=line_length then break_out;
2214  incr(out_ptr); out_buf[out_ptr]:=#;
2215@d oot1(#)==oot(#)@+end
2216@d oot2(#)==oot(#)@,oot1
2217@d oot3(#)==oot(#)@,oot2
2218@d oot4(#)==oot(#)@,oot3
2219@d oot5(#)==oot(#)@,oot4
2220@d out==@+begin oot1
2221@d out2==@+begin oot2
2222@d out3==@+begin oot3
2223@d out4==@+begin oot4
2224@d out5==@+begin oot5
2225
2226@ The |break_out| routine is called just before the output buffer is about
2227to overflow. To make this routine a little faster, we initialize position
22280 of the output buffer to `\.\\'; this character isn't really output.
2229
2230@<Set init...@>=
2231out_buf[0]:="\";
2232
2233@ A long line is broken at a blank space or just before a backslash that isn't
2234preceded by another backslash. In the latter case, a |"%"| is output at
2235the break.
2236
2237@p procedure break_out; {finds a way to break the output line}
2238label exit;
2239var k:0..line_length; {index into |out_buf|}
2240@!d:ASCII_code; {character from the buffer}
2241begin k:=out_ptr;
2242loop@+  begin if k=0 then
2243    @<Print warning message, break the line, |return|@>;
2244  d:=out_buf[k];
2245  if d=" " then
2246    begin flush_buffer(k,false,true); return;
2247    end;
2248  if (d="\")and(out_buf[k-1]<>"\") then {in this case |k>1|}
2249    begin flush_buffer(k-1,true,true); return;
2250    end;
2251  decr(k);
2252  end;
2253exit:end;
2254
2255@ We get to this module only in unusual cases that the entire output line
2256consists of a string of backslashes followed by a string of nonblank
2257non-backslashes. In such cases it is almost always safe to break the
2258line by putting a |"%"| just before the last character.
2259
2260@<Print warning message...@>=
2261begin print_nl('! Line had to be broken (output l.',out_line:1);
2262@.Line had to be broken@>
2263print_ln('):');
2264for k:=1 to out_ptr-1 do print(xchr[out_buf[k]]);
2265new_line; mark_harmless;
2266flush_buffer(out_ptr-1,true,true); return;
2267end
2268
2269@ Here is a procedure that outputs a module number in decimal notation.
2270
2271@<Glob...@>=@!dig:array[0..4] of 0..9; {digits to output}
2272
2273@ The number to be converted by |out_mod| is known to be less than
2274|def_flag|, so it cannot have more than five decimal digits.  If
2275the module is changed, we output `\.{\\*}' just after the number.
2276
2277@p procedure out_mod(@!m:integer); {output a module number}
2278var k:0..5; {index into |dig|}
2279@!a:integer; {accumulator}
2280begin k:=0; a:=m;
2281repeat dig[k]:=a mod 10; a:=a div 10; incr(k);
2282until a=0;
2283repeat decr(k); out(dig[k]+"0");
2284until k=0;
2285if changed_module[m] then out2("\")("*");
2286@.\\*@>
2287end;
2288
2289@ The |out_name| subroutine is used to output an identifier or index
2290entry, enclosing it in braces.
2291
2292@p procedure out_name(@!p:name_pointer); {outputs a name}
2293var k:0..max_bytes; {index into |byte_mem|}
2294@!w:0..ww-1; {row of |byte_mem|}
2295begin out("{"); w:=p mod ww;
2296for k:=byte_start[p] to byte_start[p+ww]-1 do
2297  begin if byte_mem[w,k]="_" then out("\");
2298@.\\_@>
2299  out(byte_mem[w,k]);
2300  end;
2301out("}");
2302end;
2303
2304@* Routines that copy \TeX\ material.
2305During phase two, we use the subroutines |copy_limbo|, |copy_TeX|, and
2306|copy_comment| in place of the analogous |skip_limbo|, |skip_TeX|, and
2307|skip_comment| that were used in phase one.
2308
2309The |copy_limbo| routine, for example, takes \TeX\ material that is not
2310part of any module and transcribes it almost verbatim to the output file.
2311No `\.{@@}' signs should occur in such material except in `\.{@@@@}'
2312pairs; such pairs are replaced by singletons.
2313
2314@p procedure copy_limbo; {copy \TeX\ code until the next module begins}
2315label exit;
2316var c:ASCII_code; {character following \.{@@} sign}
2317begin loop if loc>limit then
2318    begin finish_line; get_line;
2319    if input_has_ended then return;
2320    end
2321  else  begin buffer[limit+1]:="@@";
2322    @<Copy up to control code, |return| if finished@>;
2323    end;
2324exit:end;
2325
2326@ @<Copy up to control...@>=
2327while buffer[loc]<>"@@" do
2328  begin out(buffer[loc]); incr(loc);
2329  end;
2330if loc<=limit then
2331  begin loc:=loc+2; c:=buffer[loc-1];
2332  if (c=" ")or(c=tab_mark)or(c="*") then return;
2333  if (c<>"z")and(c<>"Z") then
2334    begin out("@@");
2335    if c<>"@@" then err_print('! Double @@ required outside of sections');
2336@.Double \AT! required...@>
2337    end;
2338  end
2339
2340@ The |copy_TeX| routine processes the \TeX\ code at the beginning of a
2341module; for example, the words you are now reading were copied in this
2342way. It returns the next control code or `\v' found in the input.
2343
2344@p function copy_TeX:eight_bits; {copy pure \TeX\ material}
2345label done;
2346var c:eight_bits; {control code found}
2347begin loop begin if loc>limit then
2348    begin finish_line; get_line;
2349    if input_has_ended then
2350      begin c:=new_module; goto done;
2351      end;
2352    end;
2353  buffer[limit+1]:="@@";
2354  @<Copy up to `\v' or control code, |goto done| if finished@>;
2355  end;
2356done:copy_TeX:=c;
2357end;
2358
2359@ We don't copy spaces or tab marks into the beginning of a line. This
2360makes the test for empty lines in |finish_line| work.
2361
2362@<Copy up to `\v'...@>=
2363repeat c:=buffer[loc]; incr(loc);
2364if c="|" then goto done;
2365if c<>"@@" then
2366  begin out(c);
2367  if (out_ptr=1)and((c=" ")or(c=tab_mark)) then decr(out_ptr);
2368  end;
2369until c="@@";
2370if loc<=limit then
2371  begin c:=control_code(buffer[loc]); incr(loc);
2372  goto done;
2373  end
2374
2375@ The |copy_comment| uses and returns a brace-balance value, following the
2376conventions of |skip_comment| above. Instead of copying the \TeX\ material
2377into the output buffer, this procedure copies it into the token memory.
2378The abbreviation |app_tok(t)| is used to append token |t| to the current
2379token list, and it also makes sure that it is possible to append at least
2380one further token without overflow.
2381
2382@d app_tok(#)==begin if tok_ptr+2>max_toks then overflow('token');
2383  tok_mem[tok_ptr]:=#; incr(tok_ptr);
2384  end
2385
2386@p function copy_comment(@!bal:eight_bits):eight_bits; {copies \TeX\ code in
2387  comments}
2388label done;
2389var c:ASCII_code; {current character being copied}
2390begin loop begin if loc>limit then
2391    begin get_line;
2392    if input_has_ended then
2393      begin err_print('! Input ended in mid-comment');
2394@.Input ended in mid-comment@>
2395      loc:=1; @<Clear |bal| and |goto done|@>;
2396      end;
2397    end;
2398  c:=buffer[loc]; incr(loc);
2399  if c="|" then goto done;
2400  app_tok(c);
2401  @<Copy special things when |c="@@", "\", "{", "}"|;
2402    |goto done| at end@>;
2403  end;
2404done: copy_comment:=bal;
2405end;
2406
2407@ @<Copy special things when |c="@@"...@>=
2408if c="@@" then
2409  begin incr(loc);
2410  if buffer[loc-1]<>"@@" then
2411    begin err_print('! Illegal use of @@ in comment');
2412@.Illegal use of \AT!...@>
2413    loc:=loc-2; decr(tok_ptr); @<Clear |bal|...@>;
2414    end;
2415  end
2416else if (c="\")and(buffer[loc]<>"@@") then
2417  begin app_tok(buffer[loc]); incr(loc);
2418  end
2419else if c="{" then incr(bal)
2420else if c="}" then
2421  begin decr(bal);
2422  if bal=0 then goto done;
2423  end
2424
2425@ When the comment has terminated abruptly due to an error, we output
2426enough right braces to keep \TeX\ happy.
2427
2428@<Clear |bal|...@>=
2429app_tok(" "); {this is done in case the previous character was `\.\\'}
2430repeat app_tok("}"); decr(bal);
2431until bal=0;
2432goto done;
2433
2434@* Parsing.
2435The most intricate part of \.{WEAVE} is its mechanism for converting
2436\PASCAL-like code into \TeX\ code, and we might as well plunge into this
2437aspect of the program now. A ``bottom up'' approach is used to parse the
2438\PASCAL-like material, since \.{WEAVE} must deal with fragmentary
2439constructions whose overall ``part of speech'' is not known.
2440
2441At the lowest level, the input is represented as a sequence of entities
2442that we shall call {\it scraps}, where each scrap of information consists
2443of two parts, its {\it category} and its {\it translation}. The category
2444is essentially a syntactic class, and the translation is a token list that
2445represents \TeX\ code. Rules of syntax and semantics tell us how to
2446combine adjacent scraps into larger ones, and if we are lucky an entire
2447\PASCAL\ text that starts out as hundreds of small scraps will join
2448together into one gigantic scrap whose translation is the desired \TeX\
2449code. If we are unlucky, we will be left with several scraps that don't
2450combine; their translations will simply be output, one by one.
2451
2452The combination rules are given as context-sensitive productions that are
2453applied from left to right. Suppose that we are currently working on the
2454sequence of scraps $s_1\,s_2\ldots s_n$. We try first to find the longest
2455production that applies to an initial substring $s_1\,s_2\ldots\,$; but if
2456no such productions exist, we find to find the longest production
2457applicable to the next substring $s_2\,s_3\ldots\,$; and if that fails, we
2458try to match $s_3\,s_4\ldots\,$, etc.
2459
2460A production applies if the category codes have a given pattern. For
2461example, one of the productions is
2462$$|open|\;|math|\;|semi|\;\RA\;|open|\;|math|$$
2463and it means that three consecutive scraps whose respective categories are
2464|open|, |math|, and |semi| are con\-verted to two scraps whose categories
2465are |open| and |math|. This production also has an associated rule that
2466tells how to combine the translation parts:
2467$$\eqalign{O_2&=O_1\cr
2468M_2&=M_1\,S\,\.{\\,}\,\hbox{|opt|\thinspace\tt5}\cr}$$
2469This means that the |open| scrap has not changed, while the new |math| scrap
2470has a translation $M_2$ composed of the translation $M_1$ of the original
2471|math| scrap followed by the translation |S| of the |semi| scrap followed
2472by `\.{\\,}' followed by `|opt|' followed by `\.5'. (In the \TeX\ file,
2473this will specify an additional thin space after the semicolon, followed
2474by an optional line break with penalty 50.) Translation rules use subscripts
2475to distinguish between translations of scraps whose categories have the
2476same initial letter; these subscripts are assigned from left to right.
2477
2478$\.{WEAVE}$ also has the production rule
2479$$|semi|\;\RA\;|terminator|$$
2480(meaning that a semicolon can terminate a \PASCAL\ statement). Since
2481productions are applied from left to right, this rule will be activated
2482only if the |semi| is not preceded by scraps that match other productions;
2483in particular, a |semi| that is preceded by `|open| |math|' will have
2484disappeared because of the production above, and such semicolons do not
2485act as statement terminators.  This incidentally is how \.{WEAVE} is able
2486to treat semicolons in two distinctly different ways, the first of which
2487is intended for semicolons in the parameter list of a procedure
2488declaration.
2489
2490The translation rule corresponding to $|semi|\;\RA\;|terminator|$ is
2491$$T=S$$
2492but we shall not mention translation rules in the common case that the
2493translation of the new scrap on the right-hand side is simply the
2494concatenation of the disappearing scraps on the left-hand side.
2495
2496@ Here is a list of the category codes that scraps can have.
2497
2498@d simp=1 {the translation can be used both in horizontal mode
2499  and in math mode of \TeX}
2500@d math=2 {the translation should be used only in \TeX\ math mode}
2501@d intro=3 {a statement is expected to follow this, after a space and
2502  an optional break}
2503@d open=4 {denotes an incomplete parenthesized quantity to be used in
2504  math mode}
2505@d beginning=5 {denotes an incomplete compound statement to be used in
2506  horizontal mode}
2507@d close=6 {ends a parenthesis or compound statement}
2508@d alpha=7 {denotes the beginning of a clause}
2509@d omega=8 {denotes the ending of a clause and possible comment following}
2510@d semi=9 {denotes a semicolon and possible comment following it}
2511@d terminator=10 {something that ends a statement or declaration}
2512@d stmt=11 {denotes a statement or declaration including its terminator}
2513@d cond=12 {precedes an \&{if} clause that might have a matching \&{else}}
2514@d clause=13 {precedes a statement after which indentation ends}
2515@d colon=14 {denotes a colon}
2516@d exp=15 {stands for the E in a floating point constant}
2517@d proc=16 {denotes a procedure or program or function heading}
2518@d case_head=17 {denotes a case statement or record heading}
2519@d record_head=18 {denotes a record heading without indentation}
2520@d var_head=19 {denotes a variable declaration heading}
2521@d elsie=20 {\&{else}}
2522@d casey=21 {\&{case}}
2523@d mod_scrap=22 {denotes a module name}
2524
2525@p @!debug procedure print_cat(@!c:eight_bits);
2526  {symbolic printout of a category}
2527begin case c of
2528simp: print('simp');
2529math: print('math');
2530intro: print('intro');
2531open: print('open');
2532beginning: print('beginning');
2533close: print('close');
2534alpha: print('alpha');
2535omega: print('omega');
2536semi: print('semi');
2537terminator: print('terminator');
2538stmt: print('stmt');
2539cond: print('cond');
2540clause: print('clause');
2541colon: print('colon');
2542exp: print('exp');
2543proc: print('proc');
2544case_head: print('casehead');
2545record_head: print('recordhead');
2546var_head: print('varhead');
2547elsie: print('elsie');
2548casey: print('casey');
2549mod_scrap: print('module');
2550othercases print('UNKNOWN')
2551endcases;
2552end;
2553gubed
2554
2555@ The token lists for translated \TeX\ output contain some special control
2556symbols as well as ordinary characters. These control symbols are
2557interpreted by \.{WEAVE} before they are written to the output file.
2558
2559\yskip\hang |break_space| denotes an optional line break or an en space;
2560
2561\yskip\hang |force| denotes a line break;
2562
2563\yskip\hang |big_force| denotes a line break with additional vertical space;
2564
2565\yskip\hang |opt| denotes an optional line break (with the continuation
2566line indented two ems with respect to the normal starting position)---this
2567code is followed by an integer |n|, and the break will occur with penalty
2568$10n$;
2569
2570\yskip\hang |backup| denotes a backspace of one em;
2571
2572\yskip\hang |cancel| obliterates any |break_space| or |force| or |big_force|
2573tokens that immediately precede or follow it and also cancels any
2574|backup| tokens that follow it;
2575
2576\yskip\hang |indent| causes future lines to be indented one more em;
2577
2578\yskip\hang |outdent| causes future lines to be indented one less em.
2579
2580\yskip\noindent All of these tokens are removed from the \TeX\ output that
2581comes from \PASCAL\ text between \pb\ signs; |break_space| and |force| and
2582|big_force| become single spaces in this mode. The translation of other
2583\PASCAL\ texts results in \TeX\ control sequences \.{\\1}, \.{\\2},
2584\.{\\3}, \.{\\4}, \.{\\5}, \.{\\6}, \.{\\7} corresponding respectively to
2585|indent|, |outdent|, |opt|, |backup|, |break_space|, |force|, and
2586|big_force|. However, a sequence of consecutive `\.\ ', |break_space|,
2587|force|, and/or |big_force| tokens is first replaced by a single token
2588(the maximum of the given ones).
2589
2590The tokens |math_rel|, |math_bin|, |math_op| will be translated into
2591\.{\\mathrel\{}, \.{\\mathbin\{}, and \.{\\mathop\{}, respectively.
2592Other control sequences in the \TeX\ output will be `\.{\\\\\{}$\,\ldots\,$\.\}'
2593surrounding identifiers, `\.{\\\&\{}$\,\ldots\,$\.\}' surrounding
2594reserved words, `\.{\\.\{}$\,\ldots\,$\.\}' surrounding strings,
2595`\.{\\C\{}$\,\ldots\,$\.\}$\,$|force|' surrounding comments, and
2596`\.{\\X$n$:}$\,\ldots\,$\.{\\X}' surrounding module names, where
2597|n| is the module number.
2598
2599@d math_bin=@'203
2600@d math_rel=@'204
2601@d math_op=@'205
2602@d big_cancel=@'206 {like |cancel|, also overrides spaces}
2603@d cancel=@'207 {overrides |backup|, |break_space|, |force|, |big_force|}
2604@d indent=cancel+1 {one more tab (\.{\\1})}
2605@d outdent=cancel+2 {one less tab (\.{\\2})}
2606@d opt=cancel+3 {optional break in mid-statement (\.{\\3})}
2607@d backup=cancel+4 {stick out one unit to the left (\.{\\4})}
2608@d break_space=cancel+5 {optional break between statements (\.{\\5})}
2609@d force=cancel+6 {forced break between statements (\.{\\6})}
2610@d big_force=cancel+7 {forced break with additional space (\.{\\7})}
2611@d end_translation=big_force+1 {special sentinel token at end of list}
2612
2613@ The raw input is converted into scraps according to the following table,
2614which gives category codes followed by the translations. Sometimes a single
2615item of input produces more than one scrap.
2616\def\stars {\.{**}}%
2617(The symbol `\stars' stands for `\.{\\\&\{{\rm identifier}\}}',
2618i.e., the identifier itself treated as a reserved word. In a few cases the
2619category is given as `|@!comment|'; this is not an actual category code, it
2620means that the translation will be treated as a comment, as explained
2621below.)
2622
2623\yskip\halign{\quad#\hfil&\quad#\hfil\cr
2624\.{<>}&|math:|\.{\\I}\cr
2625\.{<=}&|math:|\.{\\L}\cr
2626\.{>=}&|math:|\.{\\G}\cr
2627\.{:=}&|math:|\.{\\K}\cr
2628\.{==}&|math:|\.{\\S}\cr
2629\.{(*}&|math:|\.{\\B}\cr
2630\.{*)}&|math:|\.{\\T}\cr
2631\.{(.}&|open:|\.[\cr
2632\.{.)}&|close:|\.]\cr
2633\."$\,$string$\,$\."&|simp:|\.{\\.\{"{\rm$\,$modified string$\,$}"\}}\cr
2634\.\'$\,$string$\,$\.\'&|simp:|\.{\\.\{\\\'{\rm$\,$modified
2635  string$\,$}\\\'\}}\cr
2636\.{@@=}$\,$string$\,$\.{@@>}&|simp:|\.{\\=\{{\rm$\,$modified string$\,$}\}}\cr
2637\#&|math:|\.{\\\#}\cr
2638\.\$&|math:|\.{\\\$}\cr
2639\.\_&|math:|\.{\\\_}\cr
2640\.\%&|math:|\.{\\\%}\cr
2641\.\^&|math:|\.{\\\^}\cr
2642\.(&|open:|\.(\cr
2643\.)&|close:|\.)\cr
2644\.[&|open:|\.[\cr
2645\.]&|close:|\.]\cr
2646\.*&|math:|\.{\\ast}\cr
2647\.,&|math:|\.,|@,opt@,|\.9\cr
2648\.{..}&|math:|\.{\\to}\cr
2649\..&|simp:|\..\cr
2650\.:&|colon:|\.:\cr
2651\.;&|semi:|\.;\cr
2652identifier&|simp:|\.{\\\\\{{\rm$\,$identifier$\,$}\}}\cr
2653\.E in constant&|exp:|\.{\\E\{}\cr
2654digit $d$&|simp:|$d$\cr
2655other character $c$&|math:|$c$\cr
2656\.{and}&|math:|\.{\\W}\cr
2657\.{array}&|alpha:|\stars\cr
2658\.{begin}&|beginning:|$|force|\,\stars\,|cancel|$\qquad|intro:|\cr
2659\.{case}&|casey:|\qquad|alpha:|$|force|\,\stars$\cr
2660\.{const}&|intro:|$|force|\,|backup|\,\stars$\cr
2661\.{div}&|math:|$|math_bin|\,\stars\,\.\}$\cr
2662\.{do}&|omega:|\stars\cr
2663\.{downto}&|math:|$|math_rel|\,\stars\,\.\}$\cr
2664\.{else}&|terminator:|\qquad|elsie:|$|force|\,|backup|\,\stars$\cr
2665\.{end}&|terminator:|\qquad|close:|$|force|\,\stars$\cr
2666\.{file}&|alpha:|\stars\cr
2667\.{for}&|alpha:|$|force|\,\stars$\cr
2668\.{function}&|proc:|$|force|\,|backup|\,\stars\,|cancel|$\qquad
2669  |intro:|$|indent|\,\.{\\\ }$\cr
2670\.{goto}&|intro:|\stars\cr
2671\.{if}&|cond:|\qquad|alpha:|$|force|\,\stars$\cr
2672\.{in}&|math:|\.{\\in}\cr
2673\.{label}&|intro:|$|force|\,|backup|\,\stars$\cr
2674\.{mod}&|math:|$|math_bin|\,\stars\,\.\}$\cr
2675\.{nil}&|simp:|\stars\cr
2676\.{not}&|math:|\.{\\R}\cr
2677\.{of}&|omega:|\stars\cr
2678\.{or}&|math:|\.{\\V}\cr
2679\.{packed}&|intro:|\stars\cr
2680\.{procedure}&|proc:|$|force|\,|backup|\,\stars\,|cancel|$\qquad
2681  |intro:|$|indent|\,\.{\\\ }$\cr
2682\.{program}&|proc:|$|force|\,|backup|\,\stars\,|cancel|$\qquad
2683  |intro:|$|indent|\,\.{\\\ }$\cr
2684\.{record}&|record_head:|\stars\qquad|intro:|\cr
2685\.{repeat}&|beginning:|$|force|\,|indent|\,\stars\,|cancel|$\qquad|intro:|\cr
2686\.{set}&|alpha:|\stars\cr
2687\.{then}&|omega:|\stars\cr
2688\.{to}&|math:|$|math_rel|\,\stars\,\.\}$\cr
2689\.{type}&|intro:|$|force|\,|backup|\,\stars$\cr
2690\.{until}&|terminator:|\qquad|close:|$|force|\,|backup|\,\stars$\qquad
2691  |clause:|\cr
2692\.{var}&|var_head:|$|force|\,|backup|\,\stars\,|cancel|$\qquad|intro:|\cr
2693\.{while}&|alpha:|$|force|\,\stars$\cr
2694\.{with}&|alpha:|$|force|\,\stars$\cr
2695\.{xclause}&|alpha:|$|force|\,\.{\\\~}$\qquad|omega:|\stars\cr
2696\.{@@\'}$\,$const&|simp:|\.{\\O\{}\hbox{const}\.\}\cr
2697\.{@@"}$\,$const&|simp:|\.{\\H\{}\hbox{const}\.\}\cr
2698\.{@@\$}&|simp:|\.{\\)}\cr
2699\.{@@\\}&|simp:|\.{\\]}\cr
2700\.{@@,}&|math:|\.{\\,}\cr
2701\.{@@t}$\,$stuff$\,$\.{@@>}&|simp:|\.{\\hbox\{{\rm$\,$stuff$\,$}\}}\cr
2702\.{@@<}$\,$module$\,$\.{@@>}&|mod_scrap:|\.{\\X$n$:{\rm$\,$module$\,$}\\X}\cr
2703\.{@@\#}&|comment:||big_force|\cr
2704\.{@@/}&|comment:||force|\cr
2705\.{@@\char'174}&|simp:|$|opt|\,\.0$\cr
2706\.{@@+}&|comment:|$|big_cancel|\,\.{\\\ }\,|big_cancel|$\cr
2707\.{@@;}&|semi:|\cr
2708\.{@@\&}&|math:|\.{\\J}\cr
2709\.{@@\{}&|math:|\.{\\B}\cr
2710\.{@@\}}&|math:|\.{\\T}\cr}
2711\yskip\noindent When a string is output, certain characters are preceded by
2712`\.\\' signs so that they will print properly.
2713
2714A comment in the input will be combined with the preceding
2715|omega| or |semi| scrap, or with the following |terminator| scrap, if
2716possible; otherwise it will be inserted as a separate |terminator| scrap.
2717An additional ``comment'' is effectively appended at the end of the
2718\PASCAL\ text, just before translation begins; this consists of a |cancel|
2719token in the case of \PASCAL\ text in \pb, otherwise it consists of a
2720|force| token.
2721
2722From this table it is evident that \.{WEAVE} will parse a lot of non-\PASCAL\
2723programs. For example, the reserved words `\.{for}' and `\.{array}' are
2724treated in an identical way by \.{WEAVE} from a syntactic standpoint,
2725and semantically they are equivalent except that a forced line break occurs
2726just before `\&{for}'; \PASCAL\ programmers may well be surprised at this
2727similarity. The idea is to keep \.{WEAVE}'s rules as simple as possible,
2728consistent with doing a reasonable job on syntactically correct \PASCAL\
2729programs. The production rules below have been formulated in the same
2730spirit of ``almost anything goes.''
2731
2732@ Here is a table of all the productions. The reader can best get a feel for
2733@^productions, table of@>
2734how they work by trying them out by hand on small examples; no amount of
2735explanation will be as effective as watching the rules in action.
2736Parsing can also be watched by debugging with `\.{@@2}'.
2737
2738\def\[#1]{\quad$\dleft#1\dright$}
2739\def\sp{\.{\ }}
2740\yskip
2741\halign to\the\hsize{\hfil\it# &
2742  #\hfil\hskip-200pt\tabskip 0pt plus 100pt&
2743  #\hfil\tabskip0pt\cr
2744&Production categories\[\hbox{translations}]&Remarks\cr
2745\noalign{\yskip}
27461&|alpha@,math@,colon| $\RA$ |alpha@,math|&e.g., |case v:boolean of|\cr
27472&|alpha@,math@,omega| $\RA$ |clause|\[C=A\,\sp\,\.\$\,M\,\.\$\,\sp\,|indent|\,
2748O]&e.g., |while x>0 do|\cr
27493&|alpha@,omega| $\RA$ |clause|\[C=A\,\sp\,|indent|\,O]&e.g., |file of|\cr
27504&|alpha@,simp| $\RA$ |alpha@,math|&convert to math mode\cr
27515&|beginning@,close@,(terminator@t or @>stmt)| $\RA$ |stmt|&compound statement
2752ends\cr
27536&|beginning@,stmt| $\RA$ |beginning|\[B_2=B_1\,|break_space|\,S]&compound
2754statement grows\cr
27557&|case_head@,casey@,clause| $\RA$ |case_head|\[C_4=C_1\,|outdent|\,C_2\,C_3]&
2756variant records\cr
27578&|case_head@,close@,terminator| $\RA$ |stmt|\[S=C_1\,|cancel|\,|outdent|\,
2758C_2\,T]&end of case statement\cr
27599&|case_head@,stmt| $\RA$ |case_head|\[C_2=C_1\,|force|\,S]&case statement
2760grows\cr
276110&|casey@,clause| $\RA$ |case_head|&beginning of case statement\cr
276211&|clause@,stmt| $\RA$ |stmt|\[S_2=C\,|break_space|\,S_1\,|cancel|\,|outdent|\,
2763|force|]&end of controlled statement\cr
276412&|cond@,clause@,stmt@,elsie| $\RA$ |clause|\[C_3=C_1\,C_2\,|break_space|\,S\,
2765E\,\sp\,|cancel|]&complete conditional\cr
276613&|cond@,clause@,stmt| $\RA$ |stmt|\cr
2767&\qquad\[S_2=C_1\,C_2\,|break_space|\,S_1\,
2768|cancel|\,|outdent|\,|force|]&incomplete conditional\cr
276914&|elsie| $\RA$ |intro|&unmatched else\cr
277015&|exp@,math@,simp|* $\RA$ |math|\[M_2=E\,M_1\,S\,\.\}]&signed exponent\cr
277116&|exp@,simp|* $\RA$ |math|\[M=E\,S\,\.\}]&unsigned exponent\cr
277217&|intro@,stmt| $\RA$ |stmt|\[S_2=I\,\sp\,|opt|\,\.7\,|cancel|\,S_1]&labeled
2773statement, etc.\cr
277418&|math@,close| $\RA$ |stmt@,close|\[S=\.\$\,M\,\.\$]&end of field list\cr
277519&|math@,colon| $\RA$ |intro|\[I=|force|\,|backup|\,\.\$\,M\,\.\$\,C]&compound
2776label\cr
277720&|math@,math| $\RA$ |math|&simple concatenation\cr
277821&|math@,simp| $\RA$ |math|&simple concatenation\cr
277922&|math@,stmt| $\RA$ |stmt|\cr
2780&\qquad\[S_2=\.\$\,M\,\.\$\,|indent|\,|break_space|\,
2781S_1\,|cancel|\,|outdent|\,|force|]&macro or type definition\cr
278223&|math@,terminator| $\RA$ |stmt|\[S=\.\$\,M\,\.\$\,T]&statement involving
2783math\cr
278424&|mod_scrap@,(terminator@t or @>semi)| $\RA$ |stmt|\[S=M\,T\,|force|]&module
2785like a statement\cr
278625&|mod_scrap| $\RA$ |simp|&module unlike a statement\cr
278726&|open@,case_head@,close| $\RA$ |math|\[M=O\,\.\$\,|cancel|\,C_1\,
2788|cancel|\,|outdent|\,\.\$\,C_2]&case in field list\cr
278927&|open@,close| $\RA$ |math|\[M=O\,\.\\\,\.,\,C]&empty set |[]|\cr
279028&|open@,math@,case_head@,close| $\RA$ |math|\cr
2791&\qquad\[M_2=O\,M_1\,\.\$\,|cancel|\,
2792C_1\,|cancel|\,|outdent|\,\.\$\,C_2]&case in field list\cr
279329&|open@,math@,close| $\RA$ |math|&parenthesized group\cr
279430&|open@,math@,colon| $\RA$ |open@,math|&colon in parentheses\cr
279531&|open@,math@,proc@,intro| $\RA$ |open@,math|\[M_2=M_1\,|math_op|\,|cancel|\,
2796P\,\.\}]&|procedure| in parentheses\cr
279732&|open@,math@,semi| $\RA$ |open@,math|\[M_2=M_1\,S\,\.\\\,\.,\,|opt|\,
2798\.5]&semicolon in parentheses\cr
279933&|open@,math@,var_head@,intro| $\RA$ |open@,math|\[M_2=M_1\,|math_op|\,
2800|cancel|\,V\,\.\}]&|var| in parentheses\cr
280134&|open@,proc@,intro| $\RA$ |open@,math|\[M=|math_op|\,|cancel|\,
2802P\,\.\}]&|procedure| in parentheses\cr
280335&|open@,simp| $\RA$ |open@,math|&convert to math mode\cr
280436&|open@,stmt@,close| $\RA$ |math|\[M=O\,\.\$\,|cancel|\,S\,|cancel|\,
2805\.\$\,C]&field list\cr
280637&|open@,var_head@,intro| $\RA$ |open@,math|\[M=|math_op|\,|cancel|\,V\,
2807\.\}]&|var| in parentheses\cr
280838&|proc@,beginning@,close@,terminator| $\RA$ |stmt|\[S=P\,|cancel|\,
2809|outdent|\,B\,C\,T]&end of procedure declaration\cr
281039&|proc@,stmt| $\RA$ |proc|\[P_2=P_1\,|break_space|\,S]&procedure declaration
2811grows\cr
281240&|record_head@,intro@,casey| $\RA$ |casey|\[C_2=R\,I\,\sp\,|cancel|\,C_1]&
2813\&{record case} $\ldots$\cr
281441&|record_head| $\RA$ |case_head|\[C=|indent|\,R\,|cancel|]&other \&{record}
2815structures\cr
281642&|semi| $\RA$ |terminator|&semicolon after statement\cr
281743&|simp@,close| $\RA$ |stmt@,close|&end of field list\cr
281844&|simp@,colon| $\RA$ |intro|\[I=|force|\,|backup|\,S\,C]&simple label\cr
281945&|simp@,math| $\RA$ |math|&simple concatenation\cr
282046&|simp@,mod_scrap| $\RA$ |mod_scrap|&in emergencies\cr
282147&|simp@,simp| $\RA$ |simp|&simple concatenation\cr
282248&|simp@,terminator| $\RA$ |stmt|&simple statement\cr
282349&|stmt@,stmt| $\RA$ |stmt|\[S_3=S_1\,|break_space|\,S_2]&adjacent
2824statements\cr
282550&|terminator| $\RA$ |stmt|&empty statement\cr
282651&|var_head@,beginning| $\RA$ |stmt@,beginning|&end of variable
2827declarations\cr
282852&|var_head@,math@,colon| $\RA$ |var_head@,intro|\[I=\.\$\,M\,\.\$\,C]&
2829variable declaration\cr
283053&|var_head@,simp@,colon| $\RA$ |var_head@,intro|&variable declaration\cr
283154&|var_head@,stmt| $\RA$ |var_head|\[V_2=V_1\,|break_space|\,S]&variable
2832declarations grow\cr}
2833\yskip\noindent
2834Translations are not specified here when they are simple concatenations
2835of the scraps that change. For example, the full translation of
2836`|open@,math@,colon| $\RA$ |open@,math|' is $O_2=O_1$, $M_2=M_1C$.
2837
2838The notation `|simp|*', in the |exp|-related productions above,
2839stands for a |simp| scrap that isn't followed by another |simp|.
2840
2841@* Implementing the productions.
2842When \PASCAL\ text is to be processed with the grammar above, we put its
2843initial scraps $s_1\ldots s_n$ into two arrays |cat[1..n]| and |trans[1..n]|.
2844The value of |cat[k]| is simply a category code from the list above; the
2845value of |trans[k]| is a text pointer, i.e., an index into |tok_start|.
2846Our production rules have the nice property that the right-hand side is never
2847longer than the left-hand side. Therefore it is convenient to use sequential
2848allocation for the current sequence of scraps. Five pointers are used to
2849manage the parsing:
2850
2851\yskip\hang |pp| (the parsing pointer) is such that we are trying to match
2852the category codes |cat[pp]@,cat[pp+1]|$\,\ldots\,$ to the left-hand sides
2853of productions.
2854
2855\yskip\hang |scrap_base|, |lo_ptr|, |hi_ptr|, and |scrap_ptr| are such that
2856the current sequence of scraps appears in positions |scrap_base| through
2857|lo_ptr| and |hi_ptr| through |scrap_ptr|, inclusive, in the |cat| and
2858|trans| arrays. Scraps located between |scrap_base| and |lo_ptr| have
2859been examined, while those in positions |>=hi_ptr| have not yet been
2860looked at by the parsing process.
2861
2862\yskip\noindent Initially |scrap_ptr| is set to the position of the final
2863scrap to be parsed, and it doesn't change its value. The parsing process
2864makes sure that |lo_ptr>=pp+3|, since productions have as many as four terms,
2865by moving scraps from |hi_ptr| to |lo_ptr|. If there are
2866fewer than |pp+3| scraps left, the positions up to |pp+3| are filled with
2867blanks that will not match in any productions. Parsing stops when
2868|pp=lo_ptr+1| and |hi_ptr=scrap_ptr+1|.
2869
2870The |trans| array elements are declared to be of type |0..10239| instead
2871of type |text_pointer|, because the final sorting phase of \.{WEAVE}
2872uses this array to contain elements of type |name_pointer|. Both
2873of these types are subranges of |0..10239|.
2874
2875@<Glo...@>=
2876@!cat:array[0..max_scraps] of eight_bits; {category codes of scraps}
2877@!trans:array[0..max_scraps] of 0..10239; {translation texts of scraps}
2878@!pp:0..max_scraps; {current position for reducing productions}
2879@!scrap_base:0..max_scraps; {beginning of the current scrap sequence}
2880@!scrap_ptr:0..max_scraps; {ending of the current scrap sequence}
2881@!lo_ptr:0..max_scraps; {last scrap that has been examined}
2882@!hi_ptr:0..max_scraps; {first scrap that has not been examined}
2883stat@!max_scr_ptr:0..max_scraps; {largest value assumed by |scrap_ptr|}
2884tats
2885
2886@ @<Set init...@>=
2887scrap_base:=1; scrap_ptr:=0;
2888stat max_scr_ptr:=0; @+tats
2889
2890@ Token lists in |@!tok_mem| are composed of the following kinds of
2891items for \TeX\ output.
2892
2893\yskip\item{$\bullet$}ASCII codes and special codes like |force| and
2894|math_rel| represent themselves;
2895
2896\item{$\bullet$}|id_flag+p| represents \.{\\\\\{{\rm identifier $p$}\}};
2897
2898\item{$\bullet$}|res_flag+p| represents \.{\\\&\{{\rm identifier $p$}\}};
2899
2900\item{$\bullet$}|mod_flag+p| represents module name |p|;
2901
2902\item{$\bullet$}|tok_flag+p| represents token list number |p|;
2903
2904\item{$\bullet$}|inner_tok_flag+p| represents token list number |p|, to be
2905translated without line-break controls.
2906
2907@d id_flag=10240 {signifies an identifier}
2908@d res_flag=id_flag+id_flag {signifies a reserved word}
2909@d mod_flag=res_flag+id_flag {signifies a module name}
2910@d tok_flag==mod_flag+id_flag {signifies a token list}
2911@d inner_tok_flag==tok_flag+id_flag {signifies a token list in `\pb'}
2912@#
2913@d lbrace==xchr["{"] {this avoids possible \PASCAL\ compiler confusion}
2914@d rbrace==xchr["}"] {because these braces might occur within comments}
2915
2916@p @!debug procedure print_text(@!p:text_pointer); {prints a token list}
2917var j:0..max_toks; {index into |tok_mem|}
2918@!r:0..id_flag-1; {remainder of token after the flag has been stripped off}
2919begin if p>=text_ptr then print('BAD')
2920else for j:=tok_start[p] to tok_start[p+1]-1 do
2921  begin r:=tok_mem[j] mod id_flag;
2922  case tok_mem[j] div id_flag of
2923  1: begin print('\\',lbrace); print_id(r); print(rbrace);
2924    end; {|id_flag|}
2925  2: begin print('\&',lbrace); print_id(r); print(rbrace);
2926    end; {|res_flag|}
2927  3: begin print('<'); print_id(r); print('>');
2928    end; {|mod_flag|}
2929  4: print('[[',r:1,']]'); {|tok_flag|}
2930  5: print('|[[',r:1,']]|'); {|inner_tok_flag|}
2931  othercases @<Print token |r| in symbolic form@>
2932  endcases;
2933  end;
2934end;
2935gubed
2936
2937@ @<Print token |r|...@>=
2938case r of
2939math_bin: print('\mathbin',lbrace);
2940math_rel: print('\mathrel',lbrace);
2941math_op: print('\mathop',lbrace);
2942big_cancel: print('[ccancel]');
2943cancel: print('[cancel]');
2944indent: print('[indent]');
2945outdent: print('[outdent]');
2946backup: print('[backup]');
2947opt: print('[opt]');
2948break_space: print('[break]');
2949force: print('[force]');
2950big_force: print('[fforce]');
2951end_translation: print('[quit]');
2952othercases print(xchr[r])
2953endcases
2954
2955@ The production rules listed above are embedded directly into the \.{WEAVE}
2956program, since it is easier to do this than to write an interpretive system
2957that would handle production systems in general. Several macros are defined
2958here so that the program for each production is fairly short.
2959
2960All of our productions conform to the general notion that some |k|
2961consecutive scraps starting at some position |j| are to be replaced by a
2962single scrap of some category |c| whose translation is composed from the
2963translations of the disappearing scraps. After this production has been
2964applied, the production pointer |pp| should change by an amount |d|. Such
2965a production can be represented by the quadruple $(j,k,c,d)$. For example,
2966the production `|simp@,math| $\RA$ |math|' would be represented by
2967`$(|pp|,2,|math|,-1)$'; in this case the pointer $pp$ should decrease by 1
2968after the production has been applied, because some productions with
2969|math| in their second positions might now match, but no productions have
2970|math| in the third or fourth position of their left-hand sides. Note that
2971the value of |d| is determined by the whole collection of productions, not
2972by an individual one. Consider the further example
2973`|var_head@,math@,colon| $\RA$ |var_head@,intro|', which is represented by
2974`$(|pp|+1,2,|intro|,+1)$'; the $+1$ here is deduced by looking at the
2975grammar and seeing that no matches could possibly occur at positions |<=pp|
2976after this production has been applied. The determination of |d| has been
2977done by hand in each case, based on the full set of productions but not on
2978the grammar of \PASCAL\ or on the rules for constructing the initial
2979scraps.
2980
2981We also attach a serial number to each production, so that additional
2982information is available when debugging. For example, the program below
2983contains the statement `|reduce(pp+1,2,intro,+1)(52)|' when it implements
2984the production just mentioned.
2985
2986Before calling |reduce|, the program should have appended the tokens of
2987the new translation to the |tok_mem| array. We commonly want to append
2988copies of several existing translations, and macros are defined to
2989simplify these common cases. For example, |app2(pp)| will append the
2990translations of two consecutive scraps, |trans[pp]| and |trans[pp+1]|, to
2991the current token list. If the entire new translation is formed in this
2992way, we write `$|squash|(j,k,c,d)$' instead of `$|reduce|(j,k,c,d)$'. For
2993example, `|squash(pp,2,math,-1)|' is an abbreviation for `|app2(pp);
2994reduce(pp,2,math,-1)|'.
2995
2996The code below is an exact translation of the production rules into
2997\PASCAL, using such macros, and the reader should have no difficulty
2998understanding the format by comparing the code with the symbolic
2999productions as they were listed earlier.
3000
3001{\sl Caution:\/} The macros |app|, |app1|, |app2|, and |app3| are
3002sequences of statements that are not enclosed with |begin| and $|end|$,
3003because such delimiters would make the \PASCAL\ program much longer. This
3004means that it is necessary to write |begin| and |end| explicitly when such
3005a macro is used as a single statement. Several mysterious bugs in the
3006original programming of \.{WEAVE} were caused by a failure to remember
3007this fact.  Next time the author will know better.
3008
3009@d production(#)==@!debug prod(#) gubed; goto found
3010@d reduce(#)==red(#); production
3011@d production_end(#)==@!debug prod(#) gubed; goto found;
3012  end
3013@d squash(#)==begin sq(#); production_end
3014@d app(#)==tok_mem[tok_ptr]:=#; incr(tok_ptr) {this is like |app_tok|,
3015  but it doesn't test for overflow}
3016@d app1(#)==tok_mem[tok_ptr]:=tok_flag+trans[#]; incr(tok_ptr)
3017@d app2(#)==app1(#);app1(#+1)
3018@d app3(#)==app2(#);app1(#+2)
3019
3020@ Let us consider the big case statement for productions now, before looking
3021at its context. We want to design the program so that this case statement
3022works, so we might as well not keep ourselves in suspense about exactly what
3023code needs to be provided with a proper environment.
3024
3025The code here is more complicated than it need be, since some popular
3026\PASCAL\ compilers are unable to deal with procedures that contain a lot
3027of program text. The |translate| procedure, which incorporates the |case|
3028statement here, would become too long for those compilers if we did
3029not do something to split the cases into parts. Therefore
3030a separate procedure called |five_cases| has been introduced.
3031@^split procedures@>
3032This auxiliary procedure contains approximately half of the program text
3033that |translate| would otherwise have had. There's also a procedure
3034called |alpha_cases|, which turned out to be necessary because the best
3035two-way split wasn't good enough. The procedure could be split further
3036in an analogous manner, but the present scheme works on all compilers
3037known to the author.
3038
3039@<Match a production at |pp|, or increase |pp| if there is no match@>=
3040if cat[pp]<=alpha then
3041  if cat[pp]<alpha then five_cases@+else alpha_cases
3042else  begin case cat[pp] of
3043  case_head: @<Cases for |case_head|@>;
3044  casey: @<Cases for |casey|@>;
3045  clause: @<Cases for |clause|@>;
3046  cond: @<Cases for |cond|@>;
3047  elsie: @<Cases for |elsie|@>;
3048  exp: @<Cases for |exp|@>;
3049  mod_scrap: @<Cases for |mod_scrap|@>;
3050  proc: @<Cases for |proc|@>;
3051  record_head: @<Cases for |record_head|@>;
3052  semi: @<Cases for |semi|@>;
3053  stmt: @<Cases for |stmt|@>;
3054  terminator: @<Cases for |terminator|@>;
3055  var_head: @<Cases for |var_head|@>;
3056  othercases do_nothing
3057  endcases;@/
3058  incr(pp); {if no match was found, we move to the right}
3059  found: end
3060
3061@ Here are the procedures that need to be present for the reason just
3062explained.
3063
3064@<Declaration of subprocedures for |translate|@>=
3065procedure five_cases; {handles almost half of the syntax}
3066label found;
3067begin case cat[pp] of
3068beginning: @<Cases for |beginning|@>;
3069intro: @<Cases for |intro|@>;
3070math: @<Cases for |math|@>;
3071open: @<Cases for |open|@>;
3072simp: @<Cases for |simp|@>;
3073othercases do_nothing
3074endcases;@/
3075incr(pp); {if no match was found, we move to the right}
3076found: end;
3077@#
3078procedure alpha_cases;
3079label found;
3080begin @<Cases for |alpha|@>;
3081incr(pp); {if no match was found, we move to the right}
3082found: end;
3083
3084@ Now comes the code that tries to match each production starting
3085with a particular type of scrap. Whenever a match is discovered,
3086the |squash| or |reduce| macro will cause the appropriate action
3087to be performed, followed by |goto found|.
3088
3089@<Cases for |alpha|@>=
3090if cat[pp+1]=math then
3091  begin if cat[pp+2]=colon then squash(pp+1,2,math,0)(1)
3092  else if cat[pp+2]=omega then
3093    begin app1(pp); app(" "); app("$"); app1(pp+1);
3094    app("$"); app(" "); app(indent); app1(pp+2);
3095    reduce(pp,3,clause,-2)(2);
3096    end;
3097  end
3098else if cat[pp+1]=omega then
3099  begin app1(pp); app(" "); app(indent); app1(pp+1);
3100  reduce(pp,2,clause,-2)(3);
3101  end
3102else if cat[pp+1]=simp then squash(pp+1,1,math,0)(4)
3103
3104@ @<Cases for |beginning|@>=
3105if cat[pp+1]=close then
3106  begin if (cat[pp+2]=terminator)or(cat[pp+2]=stmt) then
3107    squash(pp,3,stmt,-2)(5);
3108  end
3109else if cat[pp+1]=stmt then
3110  begin app1(pp); app(break_space); app1(pp+1);
3111  reduce(pp,2,beginning,-1)(6);
3112  end
3113
3114@ @<Cases for |case_head|@>=
3115if cat[pp+1]=casey then
3116  begin if cat[pp+2]=clause then
3117    begin app1(pp); app(outdent); app2(pp+1);
3118    reduce(pp,3,case_head,0)(7);
3119    end;
3120  end
3121else if cat[pp+1]=close then
3122  begin if cat[pp+2]=terminator then
3123    begin app1(pp); app(cancel); app(outdent); app2(pp+1);
3124    reduce(pp,3,stmt,-2)(8);
3125    end;
3126  end
3127else if cat[pp+1]=stmt then
3128  begin app1(pp); app(force); app1(pp+1);
3129  reduce(pp,2,case_head,0)(9);
3130  end
3131
3132@ @<Cases for |casey|@>=
3133if cat[pp+1]=clause then squash(pp,2,case_head,0)(10)
3134
3135@ @<Cases for |clause|@>=
3136if cat[pp+1]=stmt then
3137  begin app1(pp); app(break_space); app1(pp+1);
3138  app(cancel); app(outdent);
3139  app(force); reduce(pp,2,stmt,-2)(11);
3140  end
3141
3142@ @<Cases for |cond|@>=
3143if (cat[pp+1]=clause)and(cat[pp+2]=stmt) then
3144  if cat[pp+3]=elsie then
3145    begin app2(pp); app(break_space); app2(pp+2); app(" ");
3146    app(cancel); reduce(pp,4,clause,-2)(12);
3147    end
3148  else  begin app2(pp); app(break_space); app1(pp+2); app(cancel);
3149    app(outdent); app(force); reduce(pp,3,stmt,-2)(13);
3150    end
3151
3152@ @<Cases for |elsie|@>=
3153squash(pp,1,intro,-3)(14)
3154
3155@ @<Cases for |exp|@>=
3156if cat[pp+1]=math then
3157  begin if cat[pp+2]=simp then if cat[pp+3]<>simp then
3158    begin app3(pp); app("}"); reduce(pp,3,math,-1)(15);
3159    end;
3160  end
3161else if cat[pp+1]=simp then if cat[pp+2]<>simp then
3162  begin app2(pp); app("}"); reduce(pp,2,math,-1)(16);
3163  end
3164
3165@ @<Cases for |intro|@>=
3166if cat[pp+1]=stmt then
3167  begin app1(pp); app(" "); app(opt); app("7");
3168  app(cancel); app1(pp+1); reduce(pp,2,stmt,-2)(17);
3169  end
3170
3171@ @<Cases for |math|@>=
3172if cat[pp+1]=close then
3173  begin app("$"); app1(pp); app("$"); reduce(pp,1,stmt,-2)(18);
3174  end
3175else if cat[pp+1]=colon then
3176  begin app(force); app(backup); app("$"); app1(pp);
3177  app("$"); app1(pp+1); reduce(pp,2,intro,-3)(19);
3178  end
3179else if cat[pp+1]=math then squash(pp,2,math,-1)(20)
3180else if cat[pp+1]=simp then squash(pp,2,math,-1)(21)
3181else if cat[pp+1]=stmt then
3182  begin app("$"); app1(pp); app("$"); app(indent);
3183  app(break_space); app1(pp+1); app(cancel); app(outdent);
3184  app(force); reduce(pp,2,stmt,-2)(22);
3185  end
3186else if cat[pp+1]=terminator then
3187  begin app("$"); app1(pp); app("$"); app1(pp+1);
3188  reduce(pp,2,stmt,-2)(23);
3189  end
3190
3191@ @<Cases for |mod_scrap|@>=
3192if (cat[pp+1]=terminator)or(cat[pp+1]=semi) then
3193  begin app2(pp); app(force); reduce(pp,2,stmt,-2)(24);
3194  end
3195else squash(pp,1,simp,-2)(25)
3196
3197@ @<Cases for |open|@>=
3198if (cat[pp+1]=case_head)and(cat[pp+2]=close) then
3199  begin app1(pp); app("$"); app(cancel); app1(pp+1); app(cancel);
3200  app(outdent); app("$"); app1(pp+2); reduce(pp,3,math,-1)(26);
3201  end
3202else if cat[pp+1]=close then
3203  begin app1(pp); app("\"); app(","); app1(pp+1);
3204@.\\,@>
3205  reduce(pp,2,math,-1)(27);
3206  end
3207else if cat[pp+1]=math then @<Cases for |open@,math|@>
3208else if cat[pp+1]=proc then
3209  begin if cat[pp+2]=intro then
3210    begin app(math_op); app(cancel); app1(pp+1); app("}");
3211    reduce(pp+1,2,math,0)(34);
3212    end;
3213  end
3214else if cat[pp+1]=simp then squash(pp+1,1,math,0)(35)
3215else if (cat[pp+1]=stmt)and(cat[pp+2]=close) then
3216  begin app1(pp); app("$"); app(cancel); app1(pp+1); app(cancel);
3217  app("$"); app1(pp+2); reduce(pp,3,math,-1)(36);
3218  end
3219else if cat[pp+1]=var_head then
3220  begin if cat[pp+2]=intro then
3221    begin app(math_op); app(cancel); app1(pp+1); app("}");
3222    reduce(pp+1,2,math,0)(37);
3223    end;
3224  end
3225
3226@ @<Cases for |open@,math|@>=
3227begin if (cat[pp+2]=case_head)and(cat[pp+3]=close) then
3228  begin app2(pp); app("$"); app(cancel); app1(pp+2); app(cancel);
3229  app(outdent); app("$"); app1(pp+3); reduce(pp,4,math,-1)(28);
3230  end
3231else if cat[pp+2]=close then squash(pp,3,math,-1)(29)
3232else if cat[pp+2]=colon then squash(pp+1,2,math,0)(30)
3233else if cat[pp+2]=proc then
3234  begin if cat[pp+3]=intro then
3235    begin app1(pp+1); app(math_op); app(cancel);
3236    app1(pp+2); app("}"); reduce(pp+1,3,math,0)(31);
3237    end;
3238  end
3239else if cat[pp+2]=semi then
3240  begin app2(pp+1); app("\"); app(","); app(opt); app("5");
3241@.\\,@>
3242  reduce(pp+1,2,math,0)(32);
3243  end
3244else if cat[pp+2]=var_head then
3245  begin if cat[pp+3]=intro then
3246    begin app1(pp+1); app(math_op); app(cancel);
3247    app1(pp+2); app("}"); reduce(pp+1,3,math,0)(31);
3248    end;
3249  end;
3250end
3251
3252@ @<Cases for |proc|@>=
3253if cat[pp+1]=beginning then
3254  begin if (cat[pp+2]=close)and(cat[pp+3]=terminator) then
3255    begin app1(pp); app(cancel); app(outdent); app3(pp+1);
3256    reduce(pp,4,stmt,-2)(38);
3257    end;
3258  end
3259else if cat[pp+1]=stmt then
3260  begin app1(pp); app(break_space); app1(pp+1);
3261  reduce(pp,2,proc,-2)(39);
3262  end
3263
3264@ @<Cases for |record_head|@>=
3265if (cat[pp+1]=intro)and(cat[pp+2]=casey) then
3266  begin app2(pp); app(" "); app(cancel); app1(pp+2);
3267  reduce(pp,3,casey,-2)(40);
3268  end
3269else  begin app(indent); app1(pp); app(cancel);
3270  reduce(pp,1,case_head,0)(41);
3271  end
3272
3273@ @<Cases for |semi|@>=
3274squash(pp,1,terminator,-3)(42)
3275
3276@ @<Cases for |simp|@>=
3277if cat[pp+1]=close then squash(pp,1,stmt,-2)(43)
3278else if cat[pp+1]=colon then
3279  begin app(force); app(backup); app2(pp); reduce(pp,2,intro,-3)(44);
3280  end
3281else if cat[pp+1]=math then squash(pp,2,math,-1)(45)
3282else if cat[pp+1]=mod_scrap then squash(pp,2,mod_scrap,0)(46)
3283else if cat[pp+1]=simp then squash(pp,2,simp,-2)(47)
3284else if cat[pp+1]=terminator then squash(pp,2,stmt,-2)(48)
3285
3286@ @<Cases for |stmt|@>=
3287if cat[pp+1]=stmt then
3288  begin app1(pp); app(break_space); app1(pp+1);
3289  reduce(pp,2,stmt,-2)(49);
3290  end
3291
3292@ @<Cases for |terminator|@>=
3293squash(pp,1,stmt,-2)(50)
3294
3295@ @<Cases for |var_head|@>=
3296if cat[pp+1]=beginning then squash(pp,1,stmt,-2)(51)
3297else if cat[pp+1]=math then
3298  begin if cat[pp+2]=colon then
3299    begin app("$"); app1(pp+1); app("$"); app1(pp+2);
3300    reduce(pp+1,2,intro,+1)(52);
3301    end;
3302  end
3303else if cat[pp+1]=simp then
3304  begin if cat[pp+2]=colon then squash(pp+1,2,intro,+1)(53);
3305  end
3306else if cat[pp+1]=stmt then
3307  begin app1(pp); app(break_space); app1(pp+1);
3308  reduce(pp,2,var_head,-2)(54);
3309  end
3310
3311@ The `|freeze_text|' macro is used to give official status to a token list.
3312Before saying |freeze_text|, items are appended to the current token list,
3313and we know that the eventual number of this token list will be the current
3314value of |text_ptr|. But no list of that number really exists as yet,
3315because no ending point for the current list has been
3316stored in the |tok_start| array. After saying |freeze_text|, the
3317old current token list becomes legitimate, and its number is the current
3318value of |text_ptr-1| since |text_ptr| has been increased. The new
3319current token list is empty and ready to be appended to.
3320Note that |freeze_text| does not check to see that |text_ptr| hasn't gotten
3321too large, since it is assumed that this test was done beforehand.
3322
3323@d freeze_text==incr(text_ptr); tok_start[text_ptr]:=tok_ptr
3324
3325@ The `|reduce|' macro used in our code for productions actually calls on
3326a procedure named `|red|', which makes the appropriate changes to the
3327scrap list.
3328
3329@p procedure red(@!j:sixteen_bits; @!k:eight_bits; @!c:eight_bits;
3330  @!d:integer);
3331var i:0..max_scraps; {index into scrap memory}
3332begin cat[j]:=c; trans[j]:=text_ptr; freeze_text;
3333if k>1 then
3334  begin for i:=j+k to lo_ptr do
3335    begin cat[i-k+1]:=cat[i]; trans[i-k+1]:=trans[i];
3336    end;
3337  lo_ptr:=lo_ptr-k+1;
3338  end;
3339@<Change |pp| to $\max(|scrap_base|,|pp+d|)$@>;
3340end;
3341
3342@ @<Change |pp| to $\max(|scrap_base|,|pp+d|)$@>=
3343if pp+d>=scrap_base then pp:=pp+d
3344else pp:=scrap_base
3345
3346@ Similarly, the `|squash|' macro invokes a procedure called `|sq|'. This
3347procedure takes advantage of the simplification that occurs when |k=1|.
3348
3349@p procedure sq(@!j:sixteen_bits; @!k:eight_bits; @!c:eight_bits;
3350  @!d:integer);
3351var i:0..max_scraps; {index into scrap memory}
3352begin if k=1 then
3353  begin cat[j]:=c; @<Change |pp|...@>;
3354  end
3355else  begin for i:=j to j+k-1 do
3356    begin app1(i);
3357    end;
3358  red(j,k,c,d);
3359  end;
3360end;
3361
3362@ Here now is the code that applies productions as long as possible. It
3363requires two local labels (|found| and |done|), as well as a local
3364variable (|i|).
3365
3366@<Reduce the scraps using the productions until no more rules apply@>=
3367loop@+begin @<Make sure the entries |cat[pp..(pp+3)]| are defined@>;
3368  if (tok_ptr+8>max_toks)or(text_ptr+4>max_texts) then
3369    begin stat if tok_ptr>max_tok_ptr then max_tok_ptr:=tok_ptr;
3370    if text_ptr>max_txt_ptr then max_txt_ptr:=text_ptr;
3371    tats@;@/
3372    overflow('token/text');
3373    end;
3374  if pp>lo_ptr then goto done;
3375  @<Match a production...@>;
3376  end;
3377done:
3378
3379@ If we get to the end of the scrap list, category codes equal to zero are
3380stored, since zero does not match anything in a production.
3381
3382@<Make sure the entries...@>=
3383if lo_ptr<pp+3 then
3384  begin repeat if hi_ptr<=scrap_ptr then
3385    begin incr(lo_ptr);@/
3386    cat[lo_ptr]:=cat[hi_ptr]; trans[lo_ptr]:=trans[hi_ptr];@/
3387    incr(hi_ptr);
3388    end;
3389  until (hi_ptr>scrap_ptr)or(lo_ptr=pp+3);
3390  for i:=lo_ptr+1 to pp+3 do cat[i]:=0;
3391  end
3392
3393@ If \.{WEAVE} is being run in debugging mode, the production numbers and
3394current stack categories will be printed out when |tracing| is set to 2;
3395a sequence of two or more irreducible scraps will be printed out when
3396|tracing| is set to 1.
3397@.\AT!2@>
3398@.\AT!1@>
3399
3400@<Glo...@>=
3401@!debug@!tracing:0..2; {can be used to show parsing details}
3402gubed
3403
3404@ The |prod| procedure is called in debugging mode just after |reduce| or
3405|squash|; its parameter is the number of the production that has just
3406been applied.
3407
3408@p @!debug procedure prod(@!n:eight_bits); {shows current categories}
3409var k:1..max_scraps; {index into |cat|}
3410begin if tracing=2 then
3411  begin print_nl(n:1,':');
3412  for k:=scrap_base to lo_ptr do
3413    begin if k=pp then print('*') @+ else print(' ');
3414    print_cat(cat[k]);
3415    end;
3416  if hi_ptr<=scrap_ptr then print('...'); {indicate that more is coming}
3417  end;
3418end;
3419gubed
3420
3421@ The |translate| function assumes that scraps have been stored in
3422positions |scrap_base| through |scrap_ptr| of |cat| and |trans|. It
3423appends a |terminator| scrap and begins to apply productions as much as
3424possible. The result is a token list containing the translation of
3425the given sequence of scraps.
3426
3427After calling |translate|, we will have |text_ptr+3<=max_texts| and
3428|tok_ptr+6<=max_toks|, so it will be possible to create up to three token
3429lists with up to six tokens without checking for overflow. Before calling
3430|translate|, we should have |text_ptr<max_texts| and |scrap_ptr<max_scraps|,
3431since |translate| might add a new text and a new scrap before it checks
3432for overflow.
3433
3434@p @<Declaration of subprocedures for |translate|@>@;
3435function translate:text_pointer; {converts a sequence of scraps}
3436label done,found;
3437var i: 1..max_scraps; {index into |cat|}
3438@!j:0..max_scraps; {runs through final scraps}
3439@!k:0..long_buf_size; {index into |buffer|}
3440begin pp:=scrap_base; lo_ptr:=pp-1; hi_ptr:=pp;
3441@<If tracing, print an indication of where we are@>;
3442@<Reduce the scraps...@>;
3443if (lo_ptr=scrap_base)and(cat[lo_ptr]<>math) then translate:=trans[lo_ptr]
3444else @<Combine the irreducible scraps that remain@>;
3445end;
3446
3447@ If the initial sequence of scraps does not reduce to a single scrap,
3448we concatenate the translations of all remaining scraps, separated by
3449blank spaces, with dollar signs surrounding the translations of |math|
3450scraps.
3451
3452@<Combine the irreducible...@>=
3453begin @<If semi-tracing, show the irreducible scraps@>;
3454for j:=scrap_base to lo_ptr do
3455  begin if j<>scrap_base then
3456    begin app(" ");
3457    end;
3458  if cat[j]=math then
3459    begin app("$");
3460    end;
3461  app1(j);
3462  if cat[j]=math then
3463    begin app("$");
3464    end;
3465  if tok_ptr+6>max_toks then overflow('token');
3466  end;
3467freeze_text; translate:=text_ptr-1;
3468end
3469
3470@ @<If semi-tracing, show the irreducible scraps@>=
3471@!debug if (lo_ptr>scrap_base)and(tracing=1) then
3472  begin print_nl('Irreducible scrap sequence in section ',module_count:1);
3473  print_ln(':'); mark_harmless;
3474  for j:=scrap_base to lo_ptr do
3475    begin print(' '); print_cat(cat[j]);
3476    end;
3477  end;
3478gubed
3479
3480@ @<If tracing,...@>=
3481@!debug if tracing=2 then
3482  begin print_nl('Tracing after l.',line:1,':'); mark_harmless;
3483  if loc>50 then
3484    begin print('...');
3485    for k:=loc-50 to loc do print(xchr[buffer[k-1]]);
3486    end
3487  else for k:=1 to loc do print(xchr[buffer[k-1]]);
3488  end
3489gubed
3490
3491@* Initializing the scraps.
3492If we are going to use the powerful production mechanism just developed, we
3493must get the scraps set up in the first place, given a \PASCAL\ text. A table
3494of the initial scraps corresponding to \PASCAL\ tokens appeared above in the
3495section on parsing; our goal now is to implement that table. We shall do this
3496by implementing a subroutine called |Pascal_parse| that is analogous to the
3497|Pascal_xref| routine used during phase one.
3498
3499Like |Pascal_xref|, the |Pascal_parse| procedure starts with the current
3500value of |next_control| and it uses the operation |next_control:=get_next|
3501repeatedly to read \PASCAL\ text until encountering the next `\v' or
3502`\.\{', or until |next_control>=format|. The scraps corresponding to what
3503it reads are appended into the |cat| and |trans| arrays, and |scrap_ptr|
3504is advanced.
3505
3506Like |prod|, this procedure has to split into pieces so that each
3507part is short enough to be handled by \PASCAL\ compilers that discriminate
3508against long subroutines. This time there are two split-off routines,
3509called |easy_cases| and |sub_cases|.
3510@^split procedures@>
3511
3512After studying |Pascal_parse|, we will look at the sub-procedures
3513|app_comment|, |app_octal|, and |app_hex| that are used in some of its
3514branches.
3515
3516@p @<Declaration of the |app_comment| procedure@>@;
3517@<Declaration of the |app_octal| and |app_hex| procedures@>@;
3518@<Declaration of the |easy_cases| procedure@>@;
3519@<Declaration of the |sub_cases| procedure@>@;
3520procedure Pascal_parse; {creates scraps from \PASCAL\ tokens}
3521label reswitch, exit;
3522var j:0..long_buf_size; {index into |buffer|}
3523@!p:name_pointer; {identifier designator}
3524begin while next_control<format do
3525  begin @<Append the scrap appropriate to |next_control|@>;
3526  next_control:=get_next;
3527  if (next_control="|")or(next_control="{") then return;
3528  end;
3529exit:end;
3530
3531@ The macros defined here are helpful abbreviations for the operations
3532needed when generating the scraps. A scrap of category |c| whose
3533translation has three tokens $t_1$, $t_2$, $t_3$ is generated by
3534|sc3|$(t_1)(t_2)(t_3)(c)$, etc.
3535
3536@d s0(#)==incr(scrap_ptr); cat[scrap_ptr]:=#; trans[scrap_ptr]:=text_ptr;
3537  freeze_text;
3538  end
3539@d s1(#)==app(#);s0
3540@d s2(#)==app(#);s1
3541@d s3(#)==app(#);s2
3542@d s4(#)==app(#);s3
3543@d sc4==@+begin s4
3544@d sc3==@+begin s3
3545@d sc2==@+begin s2
3546@d sc1==@+begin s1
3547@d sc0(#)==begin incr(scrap_ptr); cat[scrap_ptr]:=#; trans[scrap_ptr]:=0;
3548  end
3549@d comment_scrap(#)==begin app(#); app_comment;
3550  end
3551
3552@ @<Append the scr...@>=
3553@<Make sure that there is room for at least four more scraps, six more
3554tokens, and four more texts@>;
3555reswitch: case next_control of
3556string,verbatim: @<Append a \(string scrap@>;
3557identifier: @<Append an identifier scrap@>;
3558TeX_string: @<Append a \TeX\ string scrap@>;
3559othercases easy_cases
3560endcases
3561
3562@ The |easy_cases| each result in straightforward scraps.
3563
3564@<Declaration of the |easy_cases| procedure@>=
3565procedure easy_cases; {a subprocedure of |Pascal_parse|}
3566begin case next_control of
3567set_element_sign: sc3("\")("i")("n")(math);
3568@.\\in@>
3569double_dot: sc3("\")("t")("o")(math);
3570@.\\to@>
3571"#","$","%","^","_": sc2("\")(next_control)(math);
3572@.\\\#@>
3573@.\\\$@>
3574@.\\\%@>
3575@.\\\^@>
3576ignore,"|",xref_roman,xref_wildcard,xref_typewriter: do_nothing;
3577"(","[": sc1(next_control)(open);
3578")","]": sc1(next_control)(close);
3579"*": sc4("\")("a")("s")("t")(math);
3580@.\\ast@>
3581",": sc3(",")(opt)("9")(math);
3582".","0","1","2","3","4","5","6","7","8","9": sc1(next_control)(simp);
3583";": sc1(";")(semi);
3584":": sc1(":")(colon);
3585@t\4@>  @<Cases involving nonstandard ASCII characters@>@;
3586exponent: sc3("\")("E")("{")(exp);
3587@.\\E@>
3588begin_comment: sc2("\")("B")(math);
3589@.\\B@>
3590end_comment: sc2("\")("T")(math);
3591@.\\T@>
3592octal: app_octal;
3593hex: app_hex;
3594check_sum: sc2("\")(")")(simp);
3595@.\\)@>
3596force_line: sc2("\")("]")(simp);
3597@.\\]@>
3598thin_space: sc2("\")(",")(math);
3599@.\\,@>
3600math_break: sc2(opt)("0")(simp);
3601line_break: comment_scrap(force);
3602big_line_break: comment_scrap(big_force);
3603no_line_break: begin app(big_cancel); app("\"); app(" ");
3604@.\\\ @>
3605  comment_scrap(big_cancel);
3606  end;
3607pseudo_semi: sc0(semi);
3608join: sc2("\")("J")(math);
3609@.\\J@>
3610othercases sc1(next_control)(math)
3611endcases;
3612end;
3613
3614@ @<Make sure that there is room for at least four...@>=
3615if (scrap_ptr+4>max_scraps)or(tok_ptr+6>max_toks)or(text_ptr+4>max_texts) then
3616  begin stat if scrap_ptr>max_scr_ptr then max_scr_ptr:=scrap_ptr;
3617  if tok_ptr>max_tok_ptr then max_tok_ptr:=tok_ptr;
3618  if text_ptr>max_txt_ptr then max_txt_ptr:=text_ptr;
3619  tats@;@/
3620  overflow('scrap/token/text');
3621  end
3622
3623@ Some nonstandard ASCII characters may have entered \.{WEAVE} by means of
3624standard ones. They are converted to \TeX\ control sequences so that it is
3625possible to keep \.{WEAVE} from stepping beyond standard ASCII.
3626
3627@<Cases involving nonstandard...@>=
3628not_equal: sc2("\")("I")(math);
3629@.\\I@>
3630less_or_equal: sc2("\")("L")(math);
3631@.\\L@>
3632greater_or_equal: sc2("\")("G")(math);
3633@.\\G@>
3634equivalence_sign: sc2("\")("S")(math);
3635@.\\S@>
3636and_sign: sc2("\")("W")(math);
3637@.\\W@>
3638or_sign: sc2("\")("V")(math);
3639@.\\V@>
3640not_sign: sc2("\")("R")(math);
3641@.\\R@>
3642left_arrow: sc2("\")("K")(math);
3643@.\\K@>
3644
3645@ The following code must use |app_tok| instead of |app| in order to
3646protect against overflow. Note that |tok_ptr+1<=max_toks| after |app_tok|
3647has been used, so another |app| is legitimate before testing again.
3648
3649Many of the special characters in a string must be prefixed by `\.\\' so that
3650\TeX\ will print them properly.
3651@^special string characters@>
3652
3653@<Append a \(string scrap@>=
3654begin app("\");
3655if next_control=verbatim then
3656  begin app("=");
3657@.\\=@>
3658  end
3659else  begin app(".");
3660@.\\.@>
3661  end;
3662app("{"); j:=id_first;
3663while j<id_loc do
3664  begin case buffer[j] of
3665  " ","\","#","%","$","^","'","`","{","}","~","&","_":
3666      begin app("\");
3667      end;
3668@.\\\ @>
3669@.\\\\@>
3670@.\\\#@>
3671@.\\\%@>
3672@.\\\$@>
3673@.\\\^@>
3674@.\\\'@>
3675@.\\\`@>
3676@.\\\{@>
3677@.\\\}@>
3678@.\\\~@>
3679@.\\\&@>
3680@.\\_@>
3681  "@@": if buffer[j+1]="@@" then incr(j)
3682    else err_print('! Double @@ should be used in strings');
3683@.Double \AT! should be used...@>
3684  othercases do_nothing
3685  endcases;@/
3686  app_tok(buffer[j]); incr(j);
3687  end;
3688sc1("}")(simp);
3689end
3690
3691@ @<Append a \TeX\ string scrap@>=
3692begin app("\"); app("h"); app("b"); app("o"); app("x");
3693app("{");
3694for j:=id_first to id_loc-1 do app_tok(buffer[j]);
3695sc1("}")(simp);
3696end
3697
3698@ @<Append an identifier scrap@>=
3699begin p:=id_lookup(normal);
3700case ilk[p] of
3701normal,array_like,const_like,div_like,
3702  do_like,for_like,goto_like,nil_like,to_like: sub_cases(p);
3703@t\4@>@<Cases that generate more than one scrap@>@;
3704othercases begin next_control:=ilk[p]-char_like; goto reswitch;
3705  end {\&{and}, \&{in}, \&{not}, \&{or}}
3706endcases;
3707end
3708
3709@ The |sub_cases| also result in straightforward scraps.
3710
3711@<Declaration of the |sub_cases| procedure@>=
3712procedure sub_cases(@!p:name_pointer); {a subprocedure of |Pascal_parse|}
3713begin case ilk[p] of
3714normal: sc1(id_flag+p)(simp); {not a reserved word}
3715array_like: sc1(res_flag+p)(alpha); {\&{array}, \&{file}, \&{set}}
3716const_like: sc3(force)(backup)(res_flag+p)(intro);
3717  {\&{const}, \&{label}, \&{type}}
3718div_like: sc3(math_bin)(res_flag+p)("}")(math); {\&{div}, \&{mod}}
3719do_like: sc1(res_flag+p)(omega); {\&{do}, \&{of}, \&{then}}
3720for_like: sc2(force)(res_flag+p)(alpha); {\&{for}, \&{while}, \&{with}}
3721goto_like: sc1(res_flag+p)(intro); {\&{goto}, \&{packed}}
3722nil_like: sc1(res_flag+p)(simp); {\&{nil}}
3723to_like: sc3(math_rel)(res_flag+p)("}")(math); {\&{downto}, \&{to}}
3724end;
3725end;
3726
3727@ @<Cases that generate more than one scrap@>=
3728begin_like: begin sc3(force)(res_flag+p)(cancel)(beginning); sc0(intro);
3729  end; {\&{begin}}
3730case_like: begin sc0(casey); sc2(force)(res_flag+p)(alpha);
3731  end; {\&{case}}
3732else_like: begin @<Append |terminator| if not already present@>;
3733  sc3(force)(backup)(res_flag+p)(elsie);
3734  end; {\&{else}}
3735end_like: begin @<Append |term...@>;
3736  sc2(force)(res_flag+p)(close);
3737  end; {\&{end}}
3738if_like: begin sc0(cond); sc2(force)(res_flag+p)(alpha);
3739  end; {\&{if}}
3740loop_like: begin sc3(force)("\")("~")(alpha);
3741@.\\\~@>
3742  sc1(res_flag+p)(omega);
3743  end; {\&{xclause}}
3744proc_like: begin sc4(force)(backup)(res_flag+p)(cancel)(proc);
3745  sc3(indent)("\")(" ")(intro);
3746@.\\\ @>
3747  end; {\&{function}, \&{procedure}, \&{program}}
3748record_like: begin sc1(res_flag+p)(record_head); sc0(intro);
3749  end; {\&{record}}
3750repeat_like: begin sc4(force)(indent)(res_flag+p)(cancel)(beginning);
3751  sc0(intro);
3752  end; {\&{repeat}}
3753until_like: begin @<Append |term...@>;
3754  sc3(force)(backup)(res_flag+p)(close); sc0(clause);
3755  end; {\&{until}}
3756var_like: begin sc4(force)(backup)(res_flag+p)(cancel)(var_head); sc0(intro);
3757  end; {\&{var}}
3758
3759@ If a comment or semicolon appears before the reserved words \&{end},
3760\&{else}, or \&{until}, the |semi| or |terminator| scrap that is already
3761present overrides the |terminator| scrap belonging to this reserved word.
3762
3763@<Append |termin...@>=
3764if (scrap_ptr<scrap_base)or((cat[scrap_ptr]<>terminator)and
3765    (cat[scrap_ptr]<>semi)) then sc0(terminator)
3766
3767@ A comment is incorporated into the previous scrap if that scrap is of type
3768|omega| or |semi| or |terminator|. (These three categories have consecutive
3769category codes.) Otherwise the comment is entered as a separate scrap
3770of type |terminator|, and it will combine with a |terminator| scrap that
3771immediately follows~it.
3772
3773The |app_comment| procedure takes care of placing a comment at the end of the
3774current scrap list. When |app_comment| is called, we assume that the current
3775token list is the translation of the comment involved.
3776
3777@<Declaration of the |app_comment|...@>=
3778procedure app_comment; {append a comment to the scrap list}
3779begin freeze_text;
3780if (scrap_ptr<scrap_base)or(cat[scrap_ptr]<omega)or
3781    (cat[scrap_ptr]>terminator) then sc0(terminator)
3782else  begin app1(scrap_ptr); {|cat[scrap_ptr]| is
3783    |omega| or |semi| or |terminator|}
3784  end;
3785app(text_ptr-1+tok_flag); trans[scrap_ptr]:=text_ptr; freeze_text;
3786end;
3787
3788@ We are now finished with |Pascal_parse|, except for two relatively
3789trivial subprocedures that convert constants into tokens.
3790
3791@<Declaration of the |app_octal| and...@>=
3792procedure app_octal;
3793begin app("\"); app("O"); app("{");
3794@.\\O@>
3795while (buffer[loc]>="0")and(buffer[loc]<="7") do
3796  begin app_tok(buffer[loc]); incr(loc);
3797  end;
3798sc1("}")(simp);
3799end;
3800@#
3801procedure app_hex;
3802begin app("\"); app("H"); app("{");
3803@.\\H@>
3804while ((buffer[loc]>="0")and(buffer[loc]<="9"))or@|
3805    ((buffer[loc]>="A")and(buffer[loc]<="F")) do
3806  begin app_tok(buffer[loc]); incr(loc);
3807  end;
3808sc1("}")(simp);
3809end;
3810
3811
3812@ When the `\v' that introduces \PASCAL\ text is sensed, a call on
3813|Pascal_translate| will return a pointer to the \TeX\ translation of
3814that text. If scraps exist in the |cat| and |trans| arrays, they are
3815unaffected by this translation process.
3816
3817@p function Pascal_translate: text_pointer;
3818var p:text_pointer; {points to the translation}
3819@!save_base:0..max_scraps; {holds original value of |scrap_base|}
3820begin save_base:=scrap_base; scrap_base:=scrap_ptr+1;
3821Pascal_parse; {get the scraps together}
3822if next_control<>"|" then err_print('! Missing "|" after Pascal text');
3823@.Missing "|"...@>
3824app_tok(cancel); app_comment; {place a |cancel| token as a final ``comment''}
3825p:=translate; {make the translation}
3826stat if scrap_ptr>max_scr_ptr then max_scr_ptr:=scrap_ptr;@;@+tats@;@/
3827scrap_ptr:=scrap_base-1; scrap_base:=save_base; {scrap the scraps}
3828Pascal_translate:=p;
3829end;
3830
3831@ The |outer_parse| routine is to |Pascal_parse| as |outer_xref|
3832is to |Pascal_xref|: It constructs a sequence of scraps for \PASCAL\ text
3833until |next_control>=format|. Thus, it takes care of embedded comments.
3834
3835@p procedure outer_parse; {makes scraps from \PASCAL\ tokens and comments}
3836var bal:eight_bits; {brace level in comment}
3837@!p,@!q:text_pointer; {partial comments}
3838begin while next_control<format do
3839  if next_control<>"{" then Pascal_parse
3840  else  begin @<Make sure that there is room for at least seven more
3841      tokens, three more texts, and one more scrap@>;
3842    app("\"); app("C"); app("{");
3843@.\\C@>
3844    bal:=copy_comment(1); next_control:="|";
3845    while bal>0 do
3846      begin p:=text_ptr; freeze_text; q:=Pascal_translate;
3847      {at this point we have |tok_ptr+6<=max_toks|}
3848      app(tok_flag+p); app(inner_tok_flag+q);
3849      if next_control="|" then bal:=copy_comment(bal)
3850      else bal:=0; {an error has been reported}
3851      end;
3852    app(force); app_comment; {the full comment becomes a scrap}
3853    end;
3854end;
3855
3856@ @<Make sure that there is room for at least seven more...@>=
3857if (tok_ptr+7>max_toks)or(text_ptr+3>max_texts)or(scrap_ptr>=max_scraps) then
3858  begin stat if scrap_ptr>max_scr_ptr then max_scr_ptr:=scrap_ptr;
3859  if tok_ptr>max_tok_ptr then max_tok_ptr:=tok_ptr;
3860  if text_ptr>max_txt_ptr then max_txt_ptr:=text_ptr;
3861  tats@;@/
3862  overflow('token/text/scrap');
3863  end
3864
3865@* Output of tokens.
3866So far our programs have only built up multi-layered token lists in
3867\.{WEAVE}'s internal memory; we have to figure out how to get them into
3868the desired final form. The job of converting token lists to characters in
3869the \TeX\ output file is not difficult, although it is an implicitly
3870recursive process. Four main considerations had to be kept in mind when
3871this part of \.{WEAVE} was designed.  (a) There are two modes of output:
3872|outer| mode, which translates tokens like |force| into line-breaking
3873control sequences, and |inner| mode, which ignores them except that blank
3874spaces take the place of line breaks. (b) The |cancel| instruction applies
3875to adjacent token or tokens that are output, and this cuts across levels
3876of recursion since `|cancel|' occurs at the beginning or end of a token
3877list on one level. (c) The \TeX\ output file will be semi-readable if line
3878breaks are inserted after the result of tokens like |break_space| and
3879|force|.  (d) The final line break should be suppressed, and there should
3880be no |force| token output immediately after `\.{\\Y\\P}'.
3881
3882@ The output process uses a stack to keep track of what is going on at
3883different ``levels'' as the token lists are being written out. Entries on
3884this stack have three parts:
3885
3886\yskip\hang |end_field| is the |tok_mem| location where the token list of a
3887particular level will end;
3888
3889\yskip\hang |tok_field| is the |tok_mem| location from which the next token
3890on a particular level will be read;
3891
3892\yskip\hang |mode_field| is the current mode, either |inner| or |outer|.
3893
3894\yskip\noindent The current values of these quantities are referred to
3895quite frequently, so they are stored in a separate place instead of in the
3896|stack| array. We call the current values |cur_end|, |cur_tok|, and
3897|cur_mode|.
3898
3899The global variable |stack_ptr| tells how many levels of output are
3900currently in progress. The end of output occurs when an |end_translation|
3901token is found, so the stack is never empty except when we first begin the
3902output process.
3903
3904@d inner=0 {value of |mode| for \PASCAL\ texts within \TeX\ texts}
3905@d outer=1 {value of |mode| for \PASCAL\ texts in modules}
3906
3907@<Types...@>=
3908@!mode=inner..outer;@/
3909@!output_state=record@!end_field:sixteen_bits; {ending location of token list}
3910  @!tok_field:sixteen_bits; {present location within token list}
3911  @!mode_field:mode; {interpretation of control tokens}
3912  end;
3913
3914@ @d cur_end==cur_state.end_field {current ending location in |tok_mem|}
3915@d cur_tok==cur_state.tok_field {location of next output token in |tok_mem|}
3916@d cur_mode==cur_state.mode_field {current mode of interpretation}
3917@d init_stack==stack_ptr:=0;cur_mode:=outer {do this to initialize the stack}
3918
3919@<Glob...@>=
3920@!cur_state:output_state; {|cur_end|, |cur_tok|, |cur_mode|}
3921@!stack:array[1..stack_size] of output_state; {info for non-current levels}
3922@!stack_ptr:0..stack_size; {first unused location in the output state stack}
3923stat@!max_stack_ptr:0..stack_size; {largest value assumed by |stack_ptr|}
3924tats
3925
3926@ @<Set init...@>=stat max_stack_ptr:=0;@+tats
3927
3928@ To insert token-list |p| into the output, the |push_level| subroutine
3929is called; it saves the old level of output and gets a new one going.
3930The value of |cur_mode| is not changed.
3931
3932@p procedure push_level(@!p:text_pointer); {suspends the current level}
3933begin if stack_ptr=stack_size then overflow('stack')
3934else  begin if stack_ptr>0 then
3935    stack[stack_ptr]:=cur_state; {save |cur_end|$\,\ldots\,$|cur_mode|}
3936  incr(stack_ptr);
3937  stat if stack_ptr>max_stack_ptr then
3938    max_stack_ptr:=stack_ptr;@;@+tats@;@/
3939  cur_tok:=tok_start[p]; cur_end:=tok_start[p+1];
3940  end;
3941end;
3942
3943@ Conversely, the |pop_level| routine restores the conditions that were in
3944force when the current level was begun. This subroutine will never be
3945called when |stack_ptr=1|. It is so simple, we declare it as a macro:
3946
3947@d pop_level==begin decr(stack_ptr); cur_state:=stack[stack_ptr];
3948  end {do this when |cur_tok| reaches |cur_end|}
3949
3950@ The |get_output| function returns the next byte of output that is not a
3951reference to a token list. It returns the values |identifier| or |res_word|
3952or |mod_name| if the next token is to be an identifier (typeset in
3953italics), a reserved word (typeset in boldface) or a module name (typeset
3954by a complex routine that might generate additional levels of output).
3955In these cases |cur_name| points to the identifier or module name in
3956question.
3957
3958@d res_word=@'201 {returned by |get_output| for reserved words}
3959@d mod_name=@'200 {returned by |get_output| for module names}
3960
3961@p function get_output:eight_bits; {returns the next token of output}
3962label restart;
3963var a:sixteen_bits; {current item read from |tok_mem|}
3964begin restart: while cur_tok=cur_end do pop_level;
3965a:=tok_mem[cur_tok]; incr(cur_tok);
3966if a>=@'400 then
3967  begin cur_name:=a mod id_flag;
3968  case a div id_flag of
3969  2: a:=res_word; {|a=res_flag+cur_name|}
3970  3: a:=mod_name; {|a=mod_flag+cur_name|}
3971  4: begin push_level(cur_name); goto restart;
3972    end; {|a=tok_flag+cur_name|}
3973  5: begin push_level(cur_name); cur_mode:=inner; goto restart;
3974    end; {|a=inner_tok_flag+cur_name|}
3975  othercases a:=identifier {|a=id_flag+cur_name|}
3976  endcases;
3977  end;
3978@!debug if trouble_shooting then debug_help; @+ gubed@/
3979get_output:=a;
3980end;
3981
3982@ The real work associated with token output is done by |make_output|.
3983This procedure appends an |end_translation| token to the current token list,
3984and then it repeatedly calls |get_output| and feeds characters to the output
3985buffer until reaching the |end_translation| sentinel. It is possible for
3986|make_output| to
3987be called recursively, since a module name may include embedded \PASCAL\
3988text; however, the depth of recursion never exceeds one level, since
3989module names cannot be inside of module names.
3990
3991A procedure called |output_Pascal| does the scanning, translation, and
3992output of \PASCAL\ text within `\pb' brackets, and this procedure uses
3993|make_output| to output the current token list. Thus, the recursive call
3994of |make_output| actually occurs when |make_output| calls |output_Pascal|
3995while outputting the name of a module.
3996@^recursion@>
3997
3998@p procedure make_output; forward; @t\2@>@#
3999procedure output_Pascal; {outputs the current token list}
4000var save_tok_ptr,@!save_text_ptr,@!save_next_control:sixteen_bits;
4001  {values to be restored}
4002p:text_pointer; {translation of the \PASCAL\ text}
4003begin save_tok_ptr:=tok_ptr; save_text_ptr:=text_ptr;
4004save_next_control:=next_control; next_control:="|"; p:=Pascal_translate;
4005app(p+inner_tok_flag);
4006make_output; {output the list}
4007stat if text_ptr>max_txt_ptr then max_txt_ptr:=text_ptr;
4008if tok_ptr>max_tok_ptr then max_tok_ptr:=tok_ptr;@;@+tats@;@/
4009text_ptr:=save_text_ptr; tok_ptr:=save_tok_ptr; {forget the tokens}
4010next_control:=save_next_control; {restore |next_control| to original state}
4011end;
4012
4013@ Here is \.{WEAVE}'s major output handler.
4014
4015@p procedure make_output; {outputs the equivalents of tokens}
4016label reswitch,exit,found;
4017var a:eight_bits; {current output byte}
4018@!b:eight_bits; {next output byte}
4019@!k,@!k_limit:0..max_bytes; {indices into |byte_mem|}
4020@!w:0..ww-1; {row of |byte_mem|}
4021@!j:0..long_buf_size; {index into |buffer|}
4022@!string_delimiter:ASCII_code; {first and last character of
4023  string being copied}
4024@!save_loc,@!save_limit:0..long_buf_size; {|loc| and |limit| to be restored}
4025@!cur_mod_name:name_pointer; {name of module being output}
4026@!save_mode:mode; {value of |cur_mode| before a sequence of breaks}
4027begin app(end_translation); {append a sentinel}
4028freeze_text; push_level(text_ptr-1);
4029loop@+  begin a:=get_output;
4030reswitch: case a of
4031  end_translation: return;
4032  identifier,res_word:@<Output an identifier@>;
4033  mod_name:@<Output a module name@>;
4034  math_bin,math_op,math_rel:@<Output a \.{\\math} operator@>;
4035  cancel: begin repeat a:=get_output;
4036    until (a<backup)or(a>big_force);
4037    goto reswitch;
4038    end;
4039  big_cancel: begin repeat a:=get_output;
4040    until ((a<backup)and(a<>" "))or(a>big_force);
4041    goto reswitch;
4042    end;
4043  indent,outdent,opt,backup,break_space,force,big_force:@<Output a
4044    \(control, look ahead in case of line breaks,
4045    possibly |goto reswitch|@>;
4046  othercases out(a) {otherwise |a| is an ASCII character}
4047  endcases;
4048  end;
4049exit:end;
4050
4051@ An identifier of length one does not have to be enclosed in braces, and it
4052looks slightly better if set in a math-italic font instead of a (slightly
4053narrower) text-italic font. Thus we output `\.{\\\char'174a}' but
4054`\.{\\\\\{aa\}}'.
4055
4056@<Output an identifier@>=
4057begin out("\");
4058if a=identifier then
4059  if length(cur_name)=1 then out("|")
4060@.\\|@>
4061  else out("\")
4062@.\\\\@>
4063else out("&"); {|a=res_word|}
4064@.\\\&@>
4065if length(cur_name)=1 then out(byte_mem[cur_name mod ww,byte_start[cur_name]])
4066else out_name(cur_name);
4067end
4068
4069@ @<Output a \....@>=
4070begin out5("\")("m")("a")("t")("h");
4071if a=math_bin then out3("b")("i")("n")
4072else if a=math_rel then out3("r")("e")("l")
4073else out2("o")("p");
4074out("{");
4075end
4076
4077@ The current mode does not affect the behavior of \.{WEAVE}'s output routine
4078except when we are outputting control tokens.
4079
4080@<Output a \(control...@>=
4081if a<break_space then
4082  begin if cur_mode=outer then
4083    begin out2("\")(a-cancel+"0");
4084@.\\1@>
4085@.\\2@>
4086@.\\3@>
4087@.\\4@>
4088@.\\5@>
4089@.\\6@>
4090@.\\7@>
4091    if a=opt then out(get_output) {|opt| is followed by a digit}
4092    end
4093  else if a=opt then b:=get_output {ignore digit following |opt|}
4094  end
4095else @<Look ahead for strongest line break, |goto reswitch|@>
4096
4097@ If several of the tokens |break_space|, |force|, |big_force| occur in a
4098row, possibly mixed with blank spaces (which are ignored),
4099the largest one is used. A line break also occurs in the output file,
4100except at the very end of the translation. The very first line break
4101is suppressed (i.e., a line break that follows `\.{\\Y\\P}').
4102
4103@<Look ahead for st...@>=
4104begin b:=a; save_mode:=cur_mode;
4105loop@+  begin a:=get_output;
4106  if (a=cancel)or(a=big_cancel) then goto reswitch;
4107    {|cancel| overrides everything}
4108  if ((a<>" ")and(a<break_space))or(a>big_force) then
4109    begin if save_mode=outer then
4110      begin if out_ptr>3 then
4111        if (out_buf[out_ptr]="P")and
4112          (out_buf[out_ptr-1]="\")and
4113@.\\P@>
4114@.\\Y@>
4115          (out_buf[out_ptr-2]="Y")and
4116          (out_buf[out_ptr-3]="\") then
4117          goto reswitch;
4118@.\\1@>
4119@.\\2@>
4120@.\\3@>
4121@.\\4@>
4122@.\\5@>
4123@.\\6@>
4124@.\\7@>
4125      out2("\")(b-cancel+"0");
4126      if a<>end_translation then finish_line;
4127      end
4128    else if (a<>end_translation)and(cur_mode=inner) then out(" ");
4129    goto reswitch;
4130    end;
4131  if a>b then b:=a; {if |a=" "| we have |a<b|}
4132  end;
4133end
4134
4135@ The remaining part of |make_output| is somewhat more complicated. When we
4136output a module name, we may need to enter the parsing and translation
4137routines, since the name may contain \PASCAL\ code embedded in
4138\pb\ constructions. This \PASCAL\ code is placed at the end of the active
4139input buffer and the translation process uses the end of the active
4140|tok_mem| area.
4141
4142@<Output a module name@>=
4143begin out2("\")("X");
4144@.\\X@>
4145cur_xref:=xref[cur_name];
4146if num(cur_xref)>=def_flag then
4147  begin out_mod(num(cur_xref)-def_flag);
4148  if phase_three then
4149    begin cur_xref:=xlink(cur_xref);
4150    while num(cur_xref)>=def_flag do
4151      begin out2(",")(" ");
4152      out_mod(num(cur_xref)-def_flag);
4153      cur_xref:=xlink(cur_xref);
4154      end;
4155    end;
4156  end
4157else out("0"); {output the module number, or zero if it was undefined}
4158out(":"); @<Output the text of the module name@>;
4159out2("\")("X");
4160end
4161
4162@ @<Output the text...@>=
4163k:=byte_start[cur_name]; w:=cur_name mod ww; k_limit:=byte_start[cur_name+ww];
4164cur_mod_name:=cur_name;
4165while k<k_limit do
4166  begin b:=byte_mem[w,k]; incr(k);
4167  if b="@@" then @<Skip next character, give error if not `\.{@@}'@>;
4168  if b<>"|" then out(b)
4169  else  begin @<Copy the \PASCAL\ text into |buffer[(limit+1)..j]|@>;
4170    save_loc:=loc; save_limit:=limit; loc:=limit+2; limit:=j+1;
4171    buffer[limit]:="|"; output_Pascal;
4172    loc:=save_loc; limit:=save_limit;
4173    end;
4174  end
4175
4176@ @<Skip next char...@>=
4177begin if byte_mem[w,k]<>"@@" then
4178  begin print_nl('! Illegal control code in section name:');
4179@.Illegal control code...@>
4180  print_nl('<'); print_id(cur_mod_name); print('> '); mark_error;
4181  end;
4182incr(k);
4183end
4184
4185@ The \PASCAL\ text enclosed in \pb\ should not contain `\v' characters,
4186except within strings. We put a `\v' at the front of the buffer, so that an
4187error message that displays the whole buffer will look a little bit sensible.
4188The variable |string_delimiter| is zero outside of strings, otherwise it
4189equals the delimiter that began the string being copied.
4190
4191@<Copy the \PASCAL\ text into...@>=
4192j:=limit+1; buffer[j]:="|"; string_delimiter:=0;
4193loop@+  begin if k>=k_limit then
4194    begin print_nl('! Pascal text in section name didn''t end:');
4195@.Pascal text...didn't end@>
4196    print_nl('<'); print_id(cur_mod_name); print('> '); mark_error;
4197    goto found;
4198    end;
4199  b:=byte_mem[w,k]; incr(k);
4200  if b="@@" then @<Copy a control code into the buffer@>
4201  else  begin if (b="""")or(b="'") then
4202      if string_delimiter=0 then string_delimiter:=b
4203      else if string_delimiter=b then string_delimiter:=0;
4204    if (b<>"|")or(string_delimiter<>0) then
4205      begin if j>long_buf_size-3 then overflow('buffer');
4206      incr(j); buffer[j]:=b;
4207      end
4208    else goto found;
4209    end;
4210  end;
4211found:
4212
4213@ @<Copy a control code into the buffer@>=
4214begin if j>long_buf_size-4 then overflow('buffer');
4215buffer[j+1]:="@@"; buffer[j+2]:=byte_mem[w,k]; j:=j+2; incr(k);
4216end
4217
4218@* Phase two processing.
4219We have assembled enough pieces of the puzzle in order to be ready to specify
4220the processing in \.{WEAVE}'s main pass over the source file. Phase two
4221is analogous to phase one, except that more work is involved because we must
4222actually output the \TeX\ material instead of merely looking at the
4223\.{WEB} specifications.
4224
4225@<Phase II: Read all the text again and translate it to \TeX\ form@>=
4226reset_input; print_nl('Writing the output file...');
4227module_count:=0;
4228copy_limbo;
4229finish_line; flush_buffer(0,false,false); {insert a blank line, it looks nice}
4230while not input_has_ended do @<Translate the \(current module@>
4231
4232@ The output file will contain the control sequence \.{\\Y} between non-null
4233sections of a module, e.g., between the \TeX\ and definition parts if both
4234are nonempty. This puts a little white space between the parts when they are
4235printed. However, we don't want \.{\\Y} to occur between two definitions
4236within a single module. The variables |out_line| or |out_ptr| will
4237change if a section is non-null, so the following macros `|save_position|'
4238and `|emit_space_if_needed|' are able to handle the situation:
4239
4240@d save_position==save_line:=out_line; save_place:=out_ptr
4241@d emit_space_if_needed==if (save_line<>out_line)or(save_place<>out_ptr) then
4242  out2("\")("Y")
4243@.\\Y@>
4244
4245@<Glo...@>=
4246@!save_line:integer; {former value of |out_line|}
4247@!save_place:sixteen_bits; {former value of |out_ptr|}
4248
4249@ @<Translate the \(current module@>=
4250begin incr(module_count);@/
4251@<Output the code for the beginning of a new module@>;
4252save_position;@/
4253@<Translate the \TeX\ part of the current module@>;
4254@<Translate the \(definition part of the current module@>;
4255@<Translate the \PASCAL\ part of the current module@>;
4256@<Show cross references to this module@>;
4257@<Output the code for the end of a module@>;
4258end
4259
4260@ Modules beginning with the \.{WEB} control sequence `\.{@@\ }' start in the
4261output with the \TeX\ control sequence `\.{\\M}', followed by the module
4262number. Similarly, `\.{@@*}' modules lead to the control sequence `\.{\\N}'.
4263If this is a changed module, we put \.{*} just before the module number.
4264
4265@<Output the code for the beginning...@>=
4266out("\");
4267if buffer[loc-1]<>"*" then out("M")
4268@.\\M@>
4269else  begin out("N"); print('*',module_count:1);
4270@.\\N@>
4271  update_terminal; {print a progress report}
4272  end;
4273out_mod(module_count); out2(".")(" ")
4274
4275@ In the \TeX\ part of a module, we simply copy the source text, except that
4276index entries are not copied and \PASCAL\ text within \pb\ is translated.
4277
4278@<Translate the \T...@>=
4279repeat next_control:=copy_TeX;
4280case next_control of
4281"|": begin init_stack; output_Pascal;
4282  end;
4283"@@": out("@@");
4284octal: @<Translate an octal constant appearing in \TeX\ text@>;
4285hex: @<Translate a hexadecimal constant appearing in \TeX\ text@>;
4286TeX_string,xref_roman,xref_wildcard,xref_typewriter,module_name:
4287  begin loc:=loc-2; next_control:=get_next; {skip to \.{@@>}}
4288  if next_control=TeX_string then
4289    err_print('! TeX string should be in Pascal text only');
4290@.TeX string should be...@>
4291  end;
4292begin_comment,end_comment,check_sum,thin_space,math_break,line_break,
4293  big_line_break,no_line_break,join,pseudo_semi:
4294    err_print('! You can''t do that in TeX text');
4295@.You can't do that...@>
4296othercases do_nothing
4297endcases;
4298until next_control>=format
4299
4300@ @<Translate an octal constant appearing in \TeX\ text@>=
4301begin out3("\")("O")("{");
4302@.\\O@>
4303while (buffer[loc]>="0")and(buffer[loc]<="7") do
4304  begin out(buffer[loc]); incr(loc);
4305  end; {since |buffer[limit]=" "|, this loop will end}
4306out("}");
4307end
4308
4309@ @<Translate a hexadecimal constant appearing in \TeX\ text@>=
4310begin out3("\")("H")("{");
4311@.\\H@>
4312while ((buffer[loc]>="0")and(buffer[loc]<="9"))or@|
4313    ((buffer[loc]>="A")and(buffer[loc]<="F")) do
4314  begin out(buffer[loc]); incr(loc);
4315  end;
4316out("}");
4317end
4318@ When we get to the following code we have |next_control>=format|, and
4319the token memory is in its initial empty state.
4320
4321@<Translate the \(d...@>=
4322if next_control<=definition then {definition part non-empty}
4323  begin emit_space_if_needed; save_position;
4324  end;
4325while next_control<=definition do {|format| or |definition|}
4326  begin init_stack;
4327  if next_control=definition then @<Start a macro definition@>
4328  else @<Start a format definition@>;
4329  outer_parse; finish_Pascal;
4330  end
4331
4332@ The |finish_Pascal| procedure outputs the translation of the current
4333scraps, preceded by the control sequence `\.{\\P}' and followed by the
4334control sequence `\.{\\par}'. It also restores the token and scrap
4335memories to their initial empty state.
4336
4337A |force| token is appended to the current scraps before translation
4338takes place, so that the translation will normally end with \.{\\6} or
4339\.{\\7} (the \TeX\ macros for |force| and |big_force|). This \.{\\6} or
4340\.{\\7} is replaced by the concluding \.{\\par} or by \.{\\Y\\par}.
4341
4342@p procedure finish_Pascal; {finishes a definition or a \PASCAL\ part}
4343var p:text_pointer; {translation of the scraps}
4344begin out2("\")("P"); app_tok(force); app_comment; p:=translate;
4345@.\\P@>
4346app(p+tok_flag); make_output; {output the list}
4347if out_ptr>1 then
4348  if out_buf[out_ptr-1]="\" then
4349@.\\6@>
4350@.\\7@>
4351@.\\Y@>
4352    if out_buf[out_ptr]="6" then out_ptr:=out_ptr-2
4353    else if out_buf[out_ptr]="7" then out_buf[out_ptr]:="Y";
4354out4("\")("p")("a")("r"); finish_line;
4355stat if text_ptr>max_txt_ptr then max_txt_ptr:=text_ptr;
4356if tok_ptr>max_tok_ptr then max_tok_ptr:=tok_ptr;
4357if scrap_ptr>max_scr_ptr then max_scr_ptr:=scrap_ptr;
4358tats@;@/
4359tok_ptr:=1; text_ptr:=1; scrap_ptr:=0; {forget the tokens and the scraps}
4360end;
4361
4362@ @<Start a macro...@>=
4363begin sc2("\")("D")(intro); {this will produce `\&{define }'}
4364@.\\D@>
4365next_control:=get_next;
4366if next_control<>identifier then err_print('! Improper macro definition')
4367@.Improper macro definition@>
4368else sc1(id_flag+id_lookup(normal))(math);
4369next_control:=get_next;
4370end
4371
4372@ @<Start a format...@>=
4373begin sc2("\")("F")(intro); {this will produce `\&{format }'}
4374@.\\F@>
4375next_control:=get_next;
4376if next_control=identifier then
4377  begin sc1(id_flag+id_lookup(normal))(math);
4378  next_control:=get_next;
4379  if next_control=equivalence_sign then
4380    begin sc2("\")("S")(math); {output an equivalence sign}
4381@.\\S@>
4382    next_control:=get_next;
4383    if next_control=identifier then
4384      begin sc1(id_flag+id_lookup(normal))(math);
4385      sc0(semi); {insert an invisible semicolon}
4386      next_control:=get_next;
4387      end;
4388    end;
4389  end;
4390if scrap_ptr<>5 then err_print('! Improper format definition');
4391@.Improper format definition@>
4392end
4393
4394@ Finally, when the \TeX\ and definition parts have been treated, we have
4395|next_control>=begin_Pascal|. We will make the global variable |this_module|
4396point to the current module name, if it has a name.
4397
4398@<Glob...@>=@!this_module:name_pointer; {the current module name, or zero}
4399
4400@ @<Translate the \P...@>=
4401this_module:=0;
4402if next_control<=module_name then
4403  begin emit_space_if_needed; init_stack;
4404  if next_control=begin_Pascal then next_control:=get_next
4405  else  begin this_module:=cur_module;
4406    @<Check that |=| or |==| follows this module name, and
4407      emit the scraps to start the module definition@>;
4408    end;
4409  while next_control<=module_name do
4410    begin outer_parse;
4411    @<Emit the scrap for a module name if present@>;
4412    end;
4413  finish_Pascal;
4414  end
4415
4416@ @<Check that |=|...@>=
4417repeat next_control:=get_next;
4418until next_control<>"+"; {allow optional `\.{+=}'}
4419if (next_control<>"=")and(next_control<>equivalence_sign) then
4420  err_print('! You need an = sign after the section name')
4421@.You need an = sign...@>
4422else next_control:=get_next;
4423if out_ptr>1 then
4424  if (out_buf[out_ptr]="Y")and(out_buf[out_ptr-1]="\") then
4425@.\\Y@>
4426    begin app(backup); {the module name will be flush left}
4427    end;
4428sc1(mod_flag+this_module)(mod_scrap);
4429cur_xref:=xref[this_module];
4430if num(cur_xref)<>module_count+def_flag then
4431  begin sc3(math_rel)("+")("}")(math);
4432    {module name is multiply defined}
4433  this_module:=0; {so we won't give cross-reference info here}
4434  end;
4435sc2("\")("S")(math); {output an equivalence sign}
4436@.\\S@>
4437sc1(force)(semi); {this forces a line break unless `\.{@@+}' follows}
4438
4439@ @<Emit the scrap...@>=
4440if next_control<module_name then
4441  begin err_print('! You can''t do that in Pascal text');
4442@.You can't do that...@>
4443  next_control:=get_next;
4444  end
4445else if next_control=module_name then
4446  begin sc1(mod_flag+cur_module)(mod_scrap); next_control:=get_next;
4447  end
4448
4449@ Cross references relating to a named module are given after the module ends.
4450
4451@<Show cross...@>=
4452if this_module>0 then
4453  begin @<Rearrange the list pointed to by |cur_xref|@>;
4454  footnote(def_flag); footnote(0);
4455  end
4456
4457@ To rearrange the order of the linked list of cross references, we need
4458four more variables that point to cross reference entries.  We'll end up
4459with a list pointed to by |cur_xref|.
4460
4461@<Glob...@>=
4462@!next_xref,@!this_xref,@!first_xref,@!mid_xref:xref_number;
4463  {pointer variables for rearranging a list}
4464
4465@ We want to rearrange the cross reference list so that all the entries with
4466|def_flag| come first, in ascending order; then come all the other
4467entries, in ascending order.  There may be no entries in either one or both
4468of these categories.
4469
4470@<Rearrange the list...@>=
4471first_xref:=xref[this_module];
4472this_xref:=xlink(first_xref); {bypass current module number}
4473if num(this_xref)>def_flag then
4474  begin mid_xref:=this_xref; cur_xref:=0; {this value doesn't matter}
4475  repeat  next_xref:=xlink(this_xref); xlink(this_xref):=cur_xref;
4476    cur_xref:=this_xref; this_xref:=next_xref;
4477  until num(this_xref)<=def_flag;
4478  xlink(first_xref):=cur_xref;
4479  end
4480else mid_xref:=0; {first list null}
4481cur_xref:=0;
4482while this_xref<>0 do
4483  begin next_xref:=xlink(this_xref); xlink(this_xref):=cur_xref;
4484  cur_xref:=this_xref; this_xref:=next_xref;
4485  end;
4486if mid_xref>0 then xlink(mid_xref):=cur_xref
4487else xlink(first_xref):=cur_xref;
4488cur_xref:=xlink(first_xref)
4489
4490@ The |footnote| procedure gives cross reference information about
4491multiply defined module names (if the |flag| parameter is |def_flag|), or about
4492the uses of a module name (if the |flag| parameter is zero). It assumes that
4493|cur_xref| points to the first cross-reference entry of interest, and it
4494leaves |cur_xref| pointing to the first element not printed.  Typical outputs:
4495`\.{\\A101.}'; `\.{\\Us370\\ET1009.}'; `\.{\\As8, 27\\*, 51\\ETs64.}'.
4496
4497@p procedure footnote(@!flag:sixteen_bits); {outputs module cross-references}
4498label done,exit;
4499var q:xref_number; {cross-reference pointer variable}
4500begin if num(cur_xref)<=flag then return;
4501finish_line; out("\");
4502@.\\A@>
4503@.\\U@>
4504if flag=0 then out("U")@+else out("A");
4505@<Output all the module numbers on the reference list |cur_xref|@>;
4506out(".");
4507exit:end;
4508
4509@ The following code distinguishes three cases, according as the number
4510of cross references is one, two, or more than two. Variable |q| points
4511to the first cross reference, and the last link is a zero.
4512
4513@<Output all the module numbers...@>=
4514q:=cur_xref; if num(xlink(q))>flag then out("s"); {plural}
4515@.\\As@>
4516@.\\Us@>
4517loop@+  begin out_mod(num(cur_xref)-flag);
4518  cur_xref:=xlink(cur_xref); {point to the next cross reference to output}
4519  if num(cur_xref)<=flag then goto done;
4520  if num(xlink(cur_xref))>flag then out2(",")(" ") {not the last}
4521  else begin out3("\")("E")("T"); {the last}
4522@.\\ET@>
4523    if cur_xref<>xlink(q) then out("s"); {the last of more than two}
4524@.\\ETs@>
4525    end;
4526  end;
4527done:
4528
4529@ @<Output the code for the end of a module@>=
4530out3("\")("f")("i"); finish_line;
4531flush_buffer(0,false,false); {insert a blank line, it looks nice}
4532@.\\fi@>
4533
4534@* Phase three processing.
4535We are nearly finished! \.{WEAVE}'s only remaining task is to write out the
4536index, after sorting the identifiers and index entries.
4537
4538@<Phase III: Output the cross-reference index@>=
4539phase_three:=true; print_nl('Writing the index...');
4540if change_exists then
4541  begin finish_line; @<Tell about changed modules@>;
4542  end;
4543finish_line; out4("\")("i")("n")("x"); finish_line;
4544@.\\inx@>
4545@<Do the first pass of sorting@>;
4546@<Sort and output the index@>;
4547out4("\")("f")("i")("n"); finish_line;
4548@.\\fin@>
4549@<Output all the module names@>;
4550out4("\")("c")("o")("n"); finish_line;
4551@.\\con@>
4552print('Done.');
4553
4554@ Just before the index comes a list of all the changed modules, including
4555the index module itself.
4556
4557@<Glob...@>=
4558@!k_module:0..max_modules; {runs through the modules}
4559
4560@ @<Tell about changed modules@>=
4561begin {remember that the index is already marked as changed}
4562k_module:=1;
4563out4("\")("c")("h")(" ");
4564while k_module<module_count do
4565  begin if changed_module[k_module] then
4566    begin out_mod(k_module); out2(",")(" ");
4567    end;
4568  incr(k_module);
4569  end;
4570out_mod(k_module);
4571out(".");
4572end
4573
4574@ A left-to-right radix sorting method is used, since this makes it easy to
4575adjust the collating sequence and since the running time will be at worst
4576proportional to the total length of all entries in the index. We put the
4577identifiers into 230 different lists based on their first characters.
4578(Uppercase letters are put into the same list as the corresponding lowercase
4579letters, since we want to have `$t<\\{TeX}<\&{to}$'.) The
4580list for character |c| begins at location |bucket[c]| and continues through
4581the |blink| array.
4582
4583@<Glob...@>=
4584@!bucket:array[ASCII_code] of name_pointer;
4585@!next_name: name_pointer; {successor of |cur_name| when sorting}
4586@!c:ASCII_code; {index into |bucket|}
4587@!h:0..hash_size; {index into |hash|}
4588@!blink:array[0..max_names] of sixteen_bits; {links in the buckets}
4589
4590@ To begin the sorting, we go through all the hash lists and put each entry
4591having a nonempty cross-reference list into the proper bucket.
4592
4593@<Do the first pass...@>=
4594for c:=0 to 255 do bucket[c]:=0;
4595for h:=0 to hash_size-1 do
4596  begin next_name:=hash[h];
4597  while next_name<>0 do
4598    begin cur_name:=next_name; next_name:=link[cur_name];
4599    if xref[cur_name]<>0 then
4600      begin c:=byte_mem[cur_name mod ww,byte_start[cur_name]];
4601      if (c<="Z")and(c>="A") then c:=c+@'40;
4602      blink[cur_name]:=bucket[c]; bucket[c]:=cur_name;
4603      end;
4604    end;
4605  end
4606
4607@ During the sorting phase we shall use the |cat| and |trans| arrays from
4608\.{WEAVE}'s parsing algorithm and rename them |depth| and |head|. They now
4609represent a stack of identifier lists for all the index entries that have
4610not yet been output. The variable |sort_ptr| tells how many such lists are
4611present; the lists are output in reverse order (first |sort_ptr|, then
4612|sort_ptr-1|, etc.). The |j|th list starts at |head[j]|, and if the first
4613|k| characters of all entries on this list are known to be equal we have
4614|depth[j]=k|.
4615
4616@d depth==cat {reclaims memory that is no longer needed for parsing}
4617@d head==trans {ditto}
4618@d sort_ptr==scrap_ptr {ditto}
4619@d max_sorts==max_scraps {ditto}
4620
4621@<Globals...@>=
4622@!cur_depth:eight_bits; {depth of current buckets}
4623@!cur_byte:0..max_bytes; {index into |byte_mem|}
4624@!cur_bank:0..ww-1; {row of |byte_mem|}
4625@!cur_val:sixteen_bits; {current cross reference number}
4626stat@!max_sort_ptr:0..max_sorts;@+tats {largest value of |sort_ptr|}
4627
4628@ @<Set init...@>=stat max_sort_ptr:=0;@+tats
4629
4630@ The desired alphabetic order is specified by the |collate| array; namely,
4631|collate[0]<collate[1]<@t$\cdots$@><collate[229]|.
4632
4633@<Glob...@>=@!collate:array[0..229] of ASCII_code; {collation order}
4634
4635@ @<Local variables for init...@>=
4636@!c:ASCII_code; {used to initialize |collate|}
4637
4638@ We use the order $\hbox{null}<\.\ <\hbox{other characters}<\.\_<
4639\.A=\.a<\cdots<\.Z=\.z<\.0<\cdots<\.9.$
4640
4641@<Set init...@>=
4642collate[0]:=0; collate[1]:=" ";
4643for c:=1 to " "-1 do collate[c+1]:=c;
4644for c:=" "+1 to "0"-1 do collate[c]:=c;
4645for c:="9"+1 to "A"-1 do collate[c-10]:=c;
4646for c:="Z"+1 to "_"-1 do collate[c-36]:=c;
4647collate["_"-36]:="_"+1;
4648for c:="z"+1 to 255 do collate[c-63]:=c;
4649collate[193]:="_";
4650for c:="a" to "z" do collate[c-"a"+194]:=c;
4651for c:="0" to "9" do collate[c-"0"+220]:=c;
4652
4653@ Procedure |unbucket| goes through the buckets and adds nonempty lists
4654to the stack, using the collating sequence specified in the |collate| array.
4655The parameter to |unbucket| tells the current depth in the buckets.
4656Any two sequences that agree in their first 255 character positions are
4657regarded as identical.
4658
4659@d infinity=255 {$\infty$ (approximately)}
4660
4661@p procedure unbucket(@!d:eight_bits); {empties buckets having depth |d|}
4662var c:ASCII_code; {index into |bucket|}
4663begin for c:=229 downto 0 do if bucket[collate[c]]>0 then
4664  begin if sort_ptr>max_sorts then overflow('sorting');
4665  incr(sort_ptr);
4666  stat if sort_ptr>max_sort_ptr then max_sort_ptr:=sort_ptr;@;@+tats@;@/
4667  if c=0 then depth[sort_ptr]:=infinity else depth[sort_ptr]:=d;
4668  head[sort_ptr]:=bucket[collate[c]]; bucket[collate[c]]:=0;
4669  end;
4670end;
4671
4672@ @<Sort and output...@>=
4673sort_ptr:=0; unbucket(1);
4674while sort_ptr>0 do
4675  begin cur_depth:=cat[sort_ptr];
4676  if (blink[head[sort_ptr]]=0)or(cur_depth=infinity) then
4677    @<Output index entries for the list at |sort_ptr|@>
4678  else @<Split the list at |sort_ptr| into further lists@>;
4679  end
4680
4681@ @<Split the list...@>=
4682begin next_name:=head[sort_ptr];
4683repeat cur_name:=next_name; next_name:=blink[cur_name];
4684  cur_byte:=byte_start[cur_name]+cur_depth; cur_bank:=cur_name mod ww;
4685  if cur_byte=byte_start[cur_name+ww] then c:=0 {we hit the end of the name}
4686  else  begin c:=byte_mem[cur_bank,cur_byte];
4687    if (c<="Z")and(c>="A") then c:=c+@'40;
4688    end;
4689  blink[cur_name]:=bucket[c]; bucket[c]:=cur_name;
4690until next_name=0;
4691decr(sort_ptr); unbucket(cur_depth+1);
4692end
4693
4694@ @<Output index...@>=
4695begin cur_name:=head[sort_ptr];
4696@!debug if trouble_shooting then debug_help;@;@+gubed@/
4697repeat out2("\")(":");
4698@.\\:@>
4699  @<Output the name at |cur_name|@>;
4700  @<Output the cross-references at |cur_name|@>;
4701  cur_name:=blink[cur_name];
4702until cur_name=0;
4703decr(sort_ptr);
4704end
4705
4706@ @<Output the name...@>=
4707case ilk[cur_name] of
4708normal: if length(cur_name)=1 then out2("\")("|")@+else out2("\")("\");
4709@.\\|@>
4710@.\\\\@>
4711roman: do_nothing;
4712wildcard: out2("\")("9");
4713@.\\9@>
4714typewriter: out2("\")(".");
4715@.\\.@>
4716othercases out2("\")("&")
4717@.\\\&@>
4718endcases;@/
4719out_name(cur_name)
4720
4721@ Section numbers that are to be underlined are enclosed in
4722`\.{\\[}$\,\ldots\,$\.]'.
4723
4724@<Output the cross-references...@>=
4725@<Invert the cross-reference list at |cur_name|, making |cur_xref| the head@>;
4726repeat out2(",")(" "); cur_val:=num(cur_xref);
4727if cur_val<def_flag then out_mod(cur_val)
4728else  begin out2("\")("["); out_mod(cur_val-def_flag); out("]");
4729@.\\[@>
4730  end;
4731cur_xref:=xlink(cur_xref);
4732until cur_xref=0;
4733out("."); finish_line
4734
4735@ List inversion is best thought of as popping elements off one stack and
4736pushing them onto another. In this case |cur_xref| will be the head of
4737the stack that we push things onto.
4738
4739@<Invert the cross-reference list at |cur_name|, making |cur_xref| the head@>=
4740this_xref:=xref[cur_name]; cur_xref:=0;
4741repeat next_xref:=xlink(this_xref); xlink(this_xref):=cur_xref;
4742cur_xref:=this_xref; this_xref:=next_xref;
4743until this_xref=0
4744
4745@ The following recursive procedure walks through the tree of module names and
4746prints them.
4747@^recursion@>
4748
4749@p procedure mod_print(p:name_pointer); {print all module names in subtree |p|}
4750begin if p>0 then
4751  begin mod_print(llink[p]);@/
4752  out2("\")(":");@/
4753@.\\:@>
4754  tok_ptr:=1; text_ptr:=1; scrap_ptr:=0; init_stack;
4755  app(p+mod_flag); make_output;
4756  footnote(0); {|cur_xref| was set by |make_output|}
4757  finish_line;@/
4758  mod_print(rlink[p]);
4759  end;
4760end;
4761
4762@ @<Output all the module names@>=@+mod_print(root)
4763
4764@* Debugging.
4765The \PASCAL\ debugger with which \.{WEAVE} was developed allows breakpoints
4766to be set, and variables can be read and changed, but procedures cannot be
4767executed. Therefore a `|debug_help|' procedure has been inserted in the main
4768loops of each phase of the program; when |ddt| and |dd| are set to appropriate
4769values, symbolic printouts of various tables will appear.
4770
4771The idea is to set a breakpoint inside the |debug_help| routine, at the
4772place of `\ignorespaces|breakpoint:|\unskip' below.  Then when
4773|debug_help| is to be activated, set |trouble_shooting| equal to |true|.
4774The |debug_help| routine will prompt you for values of |ddt| and |dd|,
4775discontinuing this when |ddt<=0|; thus you type $2n+1$ integers, ending
4776with zero or a negative number. Then control either passes to the
4777breakpoint, allowing you to look at and/or change variables (if you typed
4778zero), or to exit the routine (if you typed a negative value).
4779
4780Another global variable, |debug_cycle|, can be used to skip silently
4781past calls on |debug_help|. If you set |debug_cycle>1|, the program stops
4782only every |debug_cycle| times |debug_help| is called; however,
4783any error stop will set |debug_cycle| to zero.
4784
4785@<Globals...@>=
4786@!debug@!trouble_shooting:boolean; {is |debug_help| wanted?}
4787@!ddt:integer; {operation code for the |debug_help| routine}
4788@!dd:integer; {operand in procedures performed by |debug_help|}
4789@!debug_cycle:integer; {threshold for |debug_help| stopping}
4790@!debug_skipped:integer; {we have skipped this many |debug_help| calls}
4791@!term_in:text_file; {the user's terminal as an input file}
4792gubed
4793
4794@ The debugging routine needs to read from the user's terminal.
4795@^system dependencies@>
4796@<Set init...@>=
4797@!debug trouble_shooting:=true; debug_cycle:=1; debug_skipped:=0; tracing:=0;@/
4798trouble_shooting:=false; debug_cycle:=99999; {use these when it almost works}
4799reset(term_in,'TTY:','/I'); {open |term_in| as the terminal, don't do a |get|}
4800gubed
4801
4802@ @d breakpoint=888 {place where a breakpoint is desirable}
4803@^system dependencies@>
4804
4805@p @!debug procedure debug_help; {routine to display various things}
4806label breakpoint,exit;
4807var k:integer; {index into various arrays}
4808begin incr(debug_skipped);
4809if debug_skipped<debug_cycle then return;
4810debug_skipped:=0;
4811loop@+  begin print_nl('#'); update_terminal; {prompt}
4812  read(term_in,ddt); {read a debug-command code}
4813  if ddt<0 then return
4814  else if ddt=0 then
4815    begin goto breakpoint;@\ {go to every label at least once}
4816    breakpoint: ddt:=0;@\
4817    end
4818  else  begin read(term_in,dd);
4819    case ddt of
4820    1: print_id(dd);
4821    2: print_text(dd);
4822    3: for k:=1 to dd do print(xchr[buffer[k]]);
4823    4: for k:=1 to dd do print(xchr[mod_text[k]]);
4824    5: for k:=1 to out_ptr do print(xchr[out_buf[k]]);
4825    6: for k:=1 to dd do
4826      begin print_cat(cat[k]); print(' ');
4827      end;
4828    othercases print('?')
4829    endcases;
4830    end;
4831  end;
4832exit:end;
4833gubed
4834
4835@* The main program.
4836Let's put it all together now: \.{WEAVE} starts and ends here.
4837@^system dependencies@>
4838
4839The main procedure has been split into three sub-procedures in order to
4840keep certain \PASCAL\ compilers from overflowing their capacity.
4841@^split procedures@>
4842
4843@p procedure Phase_I;
4844begin @<Phase I:...@>;
4845end;
4846@#
4847procedure Phase_II;
4848begin @<Phase II:...@>;
4849end;
4850@#
4851begin initialize; {beginning of the main program}
4852print_ln(banner); {print a ``banner line''}
4853@<Store all the reserved words@>;
4854Phase_I; Phase_II;@/
4855@<Phase III:...@>;
4856@<Check that all changes have been read@>;
4857end_of_WEAVE:
4858stat @<Print statistics about memory usage@>;@+tats@;@/
4859@t\4\4@>{here files should be closed if the operating system requires it}
4860@<Print the job |history|@>;
4861end.
4862
4863@ @<Print statistics about memory usage@>=
4864print_nl('Memory usage statistics: ',
4865  name_ptr:1,' names, ', xref_ptr:1,' cross references, ',
4866  byte_ptr[0]:1);
4867for cur_bank:=1 to ww-1 do print('+',byte_ptr[cur_bank]:1);
4868  print(' bytes;');
4869print_nl('parsing required ',max_scr_ptr:1,' scraps, ',max_txt_ptr:1,
4870  ' texts, ',max_tok_ptr:1,' tokens, ', max_stack_ptr:1,' levels;');
4871print_nl('sorting required ',max_sort_ptr:1, ' levels.')
4872
4873@ Some implementations may wish to pass the |history| value to the
4874operating system so that it can be used to govern whether or not other
4875programs are started. Here we simply report the history to the user.
4876@^system dependencies@>
4877
4878@<Print the job |history|@>=
4879case history of
4880spotless: print_nl('(No errors were found.)');
4881harmless_message: print_nl('(Did you see the warning message above?)');
4882error_message: print_nl('(Pardon me, but I think I spotted something wrong.)');
4883fatal_message: print_nl('(That was a fatal error, my friend.)');
4884end {there are no other cases}
4885
4886@* System-dependent changes.
4887This module should be replaced, if necessary, by changes to the program
4888that are necessary to make \.{WEAVE} work at a particular installation.
4889It is usually best to design your change file so that all changes to
4890previous modules preserve the module numbering; then everybody's version
4891will be consistent with the printed program. More extensive changes,
4892which introduce new modules, can be inserted here; then only the index
4893itself will get a new module number.
4894@^system dependencies@>
4895
4896@* Index.
4897If you have read and understood the code for Phase III above, you know what
4898is in this index and how it got here. All modules in which an identifier is
4899used are listed with that identifier, except that reserved words are
4900indexed only when they appear in format definitions, and the appearances
4901of identifiers in module names are not indexed. Underlined entries
4902correspond to where the identifier was declared. Error messages, control
4903sequences put into the output, and a few
4904other things like ``recursion'' are indexed here too.
4905