1% This program by D. E. Knuth is not copyrighted and can be used freely. 2% Version 0 was released in December, 1981. 3% Version 1 was released in September, 1982, with version 0 of TeX. 4% Slight changes were made in October, 1982, for version 0.6 of TeX. 5% Version 1.1 changed "_" to "\_" if not within an identifier (November, 1982). 6% Version 1.2 added @@= and @@\ and marked changed modules (December, 1982). 7% Version 1.3 marked and indexed changed modules better (January, 1983). 8% Version 1.4 added "history" (February, 1983). 9% Version 1.5 conformed to TeX version 0.96 (March, 1983). 10% Version 1.6 conformed to TeX version 0.98 (May, 1983). 11% Version 1.7 introduced the new change file format (June, 1983). 12% Version 2 was released in July, 1983, with version 0.999 of TeX. 13% Version 2.1 corrected a bug in changed_module reckoning (August, 1983). 14% Version 2.2 corrected it better (August, 1983). 15% Version 2.3 starts the output with \input webmac (August, 1983). 16% Version 2.4 fixed a bug in compress(#) (September, 1983). 17% Version 2.5 cleared xrefswitch after module names (November, 1983). 18% Version 2.6 fixed a bug in declaration of trans array (January, 1984). 19% Version 2.7 fixed a bug in real constants (August, 1984). 20% Version 2.8 fixed a bug in change_buffer movement (August, 1985). 21% Version 2.9 increased max_refs and max_toks to 30000 each (January, 1987). 22% Version 3, for Sewell's book, fixed long-line bug in input_ln (March, 1989). 23% Version 3.1 fixed a bug for programs with only one module (April, 1989). 24% Version 4 was major change to allow 8-bit input (September, 1989). 25% Version 4.1, for Breitenlohner, avoids English-only output (March, 1990). 26% Version 4.2 conforms to ANSI standard for-loop rules (September, 1990). 27% Version 4.3 catches extra } in input (Breitenlohner, September, 1991). 28% Version 4.4 corrects changed_module logic, %-overflow (January, 1992). 29 30% Here is TeX material that gets inserted after \input webmac 31\def\hang{\hangindent 3em\indent\ignorespaces} 32\font\ninerm=cmr9 33\let\mc=\ninerm % medium caps for names like SAIL 34\def\PASCAL{Pascal} 35\def\pb{$\.|\ldots\.|$} % Pascal brackets (|...|) 36\def\v{\.{\char'174}} % vertical (|) in typewriter font 37\def\dleft{[\![} \def\dright{]\!]} % double brackets 38\mathchardef\RA="3221 % right arrow 39\mathchardef\BA="3224 % double arrow 40\def\({} % kludge for alphabetizing certain module names 41 42\def\title{WEAVE} 43\def\contentspagenumber{15} % should be odd 44\def\topofcontents{\null\vfill 45 \titlefalse % include headline on the contents page 46 \def\rheader{\mainfont Appendix D\hfil \contentspagenumber} 47 \centerline{\titlefont The {\ttitlefont WEAVE} processor} 48 \vskip 15pt 49 \centerline{(Version 4.4)} 50 \vfill} 51\pageno=\contentspagenumber \advance\pageno by 1 52 53@* Introduction. 54This program converts a \.{WEB} file to a \TeX\ file. It was written 55by D. E. Knuth in October, 1981; a somewhat similar {\mc SAIL} program had 56been developed in March, 1979, although the earlier program used a top-down 57parsing method that is quite different from the present scheme. 58 59The code uses a few features of the local \PASCAL\ compiler that may need 60to be changed in other installations: 61 62\yskip\item{1)} Case statements have a default. 63\item{2)} Input-output routines may need to be adapted for use with a particular 64character set and/or for printing messages on the user's terminal. 65 66\yskip\noindent 67These features are also present in the \PASCAL\ version of \TeX, where they 68are used in a similar (but more complex) way. System-dependent portions 69of \.{WEAVE} can be identified by looking at the entries for `system 70dependencies' in the index below. 71@!@^system dependencies@> 72 73The ``banner line'' defined here should be changed whenever \.{WEAVE} 74is modified. 75 76@d banner=='This is WEAVE, Version 4.4' 77 78@ The program begins with a fairly normal header, made up of pieces that 79@^system dependencies@> 80will mostly be filled in later. The \.{WEB} input comes from files |web_file| 81and |change_file|, and the \TeX\ output goes to file |tex_file|. 82 83If it is necessary to abort the job because of a fatal error, the program 84calls the `|jump_out|' procedure, which goes to the label |end_of_WEAVE|. 85 86@d end_of_WEAVE = 9999 {go here to wrap it up} 87 88@p @t\4@>@<Compiler directives@>@/ 89program WEAVE(@!web_file,@!change_file,@!tex_file); 90label end_of_WEAVE; {go here to finish} 91const @<Constants in the outer block@>@/ 92type @<Types in the outer block@>@/ 93var @<Globals in the outer block@>@/ 94@<Error handling procedures@>@/ 95procedure initialize; 96 var @<Local variables for initialization@>@/ 97 begin @<Set initial values@>@/ 98 end; 99 100@ Some of this code is optional for use when debugging only; 101such material is enclosed between the delimiters |debug| and $|gubed|$. 102Other parts, delimited by |stat| and $|tats|$, are optionally included 103if statistics about \.{WEAVE}'s memory usage are desired. 104 105@d debug==@{ {change this to `$\\{debug}\equiv\null$' when debugging} 106@d gubed==@t@>@} {change this to `$\\{gubed}\equiv\null$' when debugging} 107@f debug==begin 108@f gubed==end 109@# 110@d stat==@{ {change this to `$\\{stat}\equiv\null$' 111 when gathering usage statistics} 112@d tats==@t@>@} {change this to `$\\{tats}\equiv\null$' 113 when gathering usage statistics} 114@f stat==begin 115@f tats==end 116 117@ The \PASCAL\ compiler used to develop this system has ``compiler 118directives'' that can appear in comments whose first character is a dollar sign. 119In production versions of \.{WEAVE} these directives tell the compiler that 120@^system dependencies@> 121it is safe to avoid range checks and to leave out the extra code it inserts 122for the \PASCAL\ debugger's benefit, although interrupts will occur if 123there is arithmetic overflow. 124 125@<Compiler directives@>= 126@{@&$C-,A+,D-@} {no range check, catch arithmetic overflow, no debug overhead} 127@!debug @{@&$C+,D+@}@+ gubed {but turn everything on when debugging} 128 129@ Labels are given symbolic names by the following definitions. We insert 130the label `|exit|:' just before the `\ignorespaces|end|\unskip' of a 131procedure in which we have used the `|return|' statement defined below; 132the label `|restart|' is occasionally used at the very beginning of a 133procedure; and the label `|reswitch|' is occasionally used just prior to 134a \&{case} statement in which some cases change the conditions and we wish to 135branch to the newly applicable case. 136Loops that are set up with the \&{loop} construction defined below are 137commonly exited by going to `|done|' or to `|found|' or to `|not_found|', 138and they are sometimes repeated by going to `|continue|'. 139 140@d exit=10 {go here to leave a procedure} 141@d restart=20 {go here to start a procedure again} 142@d reswitch=21 {go here to start a case statement again} 143@d continue=22 {go here to resume a loop} 144@d done=30 {go here to exit a loop} 145@d found=31 {go here when you've found it} 146@d not_found=32 {go here when you've found something else} 147 148@ Here are some macros for common programming idioms. 149 150@d incr(#) == #:=#+1 {increase a variable by unity} 151@d decr(#) == #:=#-1 {decrease a variable by unity} 152@d loop == @+ while true do@+ {repeat over and over until a |goto| happens} 153@d do_nothing == {empty statement} 154@d return == goto exit {terminate a procedure call} 155@f return == nil 156@f loop == xclause 157 158@ We assume that |case| statements may include a default case that applies 159if no matching label is found. Thus, we shall use constructions like 160@^system dependencies@> 161$$\vbox{\halign{#\hfil\cr 162|case x of|\cr 1631: $\langle\,$code for $x=1\,\rangle$;\cr 1643: $\langle\,$code for $x=3\,\rangle$;\cr 165|othercases| $\langle\,$code for |x<>1| and |x<>3|$\,\rangle$\cr 166|endcases|\cr}}$$ 167since most \PASCAL\ compilers have plugged this hole in the language by 168incorporating some sort of default mechanism. For example, the compiler 169used to develop \.{WEB} and \TeX\ allows `|others|:' as a default label, 170and other \PASCAL s allow syntaxes like `\ignorespaces|else|\unskip' or 171`\&{otherwise}' or `\\{otherwise}:', etc. The definitions of |othercases| 172and |endcases| should be changed to agree with local conventions. 173(Of course, if no default mechanism is available, the |case| statements of 174this program must be extended by listing all remaining cases.) 175 176@d othercases == others: {default for cases not listed explicitly} 177@d endcases == @+end {follows the default case in an extended |case| statement} 178@f othercases == else 179@f endcases == end 180 181@ The following parameters are set big enough to handle \TeX, so they 182should be sufficient for most applications of \.{WEAVE}. 183 184@<Constants...@>= 185@!max_bytes=45000; {|1/ww| times the number of bytes in identifiers, 186 index entries, and module names; must be less than 65536} 187@!max_names=5000; {number of identifiers, index entries, and module names; 188 must be less than 10240} 189@!max_modules=2000;{greater than the total number of modules} 190@!hash_size=353; {should be prime} 191@!buf_size=100; {maximum length of input line} 192@!longest_name=400; {module names shouldn't be longer than this} 193@!long_buf_size=500; {|buf_size+longest_name|} 194@!line_length=80; {lines of \TeX\ output have at most this many characters, 195 should be less than 256} 196@!max_refs=30000; {number of cross references; must be less than 65536} 197@!max_toks=30000; {number of symbols in \PASCAL\ texts being parsed; 198 must be less than 65536} 199@!max_texts=2000; {number of phrases in \PASCAL\ texts being parsed; 200 must be less than 10240} 201@!max_scraps=1000; {number of tokens in \PASCAL\ texts being parsed} 202@!stack_size=200; {number of simultaneous output levels} 203 204@ A global variable called |history| will contain one of four values 205at the end of every run: |spotless| means that no unusual messages were 206printed; |harmless_message| means that a message of possible interest 207was printed but no serious errors were detected; |error_message| means that 208at least one error was found; |fatal_message| means that the program 209terminated abnormally. The value of |history| does not influence the 210behavior of the program; it is simply computed for the convenience 211of systems that might want to use such information. 212 213@d spotless=0 {|history| value for normal jobs} 214@d harmless_message=1 {|history| value when non-serious info was printed} 215@d error_message=2 {|history| value when an error was noted} 216@d fatal_message=3 {|history| value when we had to stop prematurely} 217@# 218@d mark_harmless==@t@>@+if history=spotless then history:=harmless_message 219@d mark_error==history:=error_message 220@d mark_fatal==history:=fatal_message 221 222@<Glob...@>=@!history:spotless..fatal_message; {how bad was this run?} 223 224@ @<Set init...@>=history:=spotless; 225 226@* The character set. 227One of the main goals in the design of \.{WEB} has been to make it readily 228portable between a wide variety of computers. Yet \.{WEB} by its very 229nature must use a greater variety of characters than most computer 230programs deal with, and character encoding is one of the areas in which 231existing machines differ most widely from each other. 232 233To resolve this problem, all input to \.{WEAVE} and \.{TANGLE} is 234converted to an internal eight-bit code that is essentially standard 235ASCII, the ``American Standard Code for Information Interchange.'' 236The conversion is done immediately when each character is read in. 237Conversely, characters are converted from ASCII to the user's external 238representation just before they are output. (The original ASCII code 239was seven bits only; \.{WEB} now allows eight bits in an attempt to 240keep up with modern times.) 241 242Such an internal code is relevant to users of \.{WEB} only because it is 243the code used for preprocessed constants like \.{"A"}. If you are writing 244a program in \.{WEB} that makes use of such one-character constants, you 245should convert your input to ASCII form, like \.{WEAVE} and \.{TANGLE} do. 246Otherwise \.{WEB}'s internal coding scheme does not affect you. 247@^ASCII code@> 248 249Here is a table of the standard visible ASCII codes: 250$$\def\:{\char\count255\global\advance\count255 by 1} 251\count255='40 252\vbox{ 253\hbox{\hbox to 40pt{\it\hfill0\/\hfill}% 254\hbox to 40pt{\it\hfill1\/\hfill}% 255\hbox to 40pt{\it\hfill2\/\hfill}% 256\hbox to 40pt{\it\hfill3\/\hfill}% 257\hbox to 40pt{\it\hfill4\/\hfill}% 258\hbox to 40pt{\it\hfill5\/\hfill}% 259\hbox to 40pt{\it\hfill6\/\hfill}% 260\hbox to 40pt{\it\hfill7\/\hfill}} 261\vskip 4pt 262\hrule 263\def\^{\vrule height 10.5pt depth 4.5pt} 264\halign{\hbox to 0pt{\hskip -24pt\O{#0}\hfill}&\^ 265\hbox to 40pt{\tt\hfill#\hfill\^}& 266&\hbox to 40pt{\tt\hfill#\hfill\^}\cr 26704&\:&\:&\:&\:&\:&\:&\:&\:\cr\noalign{\hrule} 26805&\:&\:&\:&\:&\:&\:&\:&\:\cr\noalign{\hrule} 26906&\:&\:&\:&\:&\:&\:&\:&\:\cr\noalign{\hrule} 27007&\:&\:&\:&\:&\:&\:&\:&\:\cr\noalign{\hrule} 27110&\:&\:&\:&\:&\:&\:&\:&\:\cr\noalign{\hrule} 27211&\:&\:&\:&\:&\:&\:&\:&\:\cr\noalign{\hrule} 27312&\:&\:&\:&\:&\:&\:&\:&\:\cr\noalign{\hrule} 27413&\:&\:&\:&\:&\:&\:&\:&\:\cr\noalign{\hrule} 27514&\:&\:&\:&\:&\:&\:&\:&\:\cr\noalign{\hrule} 27615&\:&\:&\:&\:&\:&\:&\:&\:\cr\noalign{\hrule} 27716&\:&\:&\:&\:&\:&\:&\:&\:\cr\noalign{\hrule} 27817&\:&\:&\:&\:&\:&\:&\:\cr} 279\hrule width 280pt}$$ 280(Actually, of course, code @'040 is an invisible blank space.) Code @'136 281was once an upward arrow (\.{\char'13}), and code @'137 was 282once a left arrow (\.^^X), in olden times when the first draft 283of ASCII code was prepared; but \.{WEB} works with today's standard 284ASCII in which those codes represent circumflex and underline as shown. 285 286@<Types...@>= 287@!ASCII_code=0..255; {eight-bit numbers, a subrange of the integers} 288 289@ The original \PASCAL\ compiler was designed in the late 60s, when six-bit 290character sets were common, so it did not make provision for lowercase 291letters. Nowadays, of course, we need to deal with both capital and small 292letters in a convenient way, so \.{WEB} assumes that it is being used 293with a \PASCAL\ whose character set contains at least the characters of 294standard ASCII as listed above. Some \PASCAL\ compilers use the original 295name |char| for the data type associated with the characters in text files, 296while other \PASCAL s consider |char| to be a 64-element subrange of a larger 297data type that has some other name. 298 299In order to accommodate this difference, we shall use the name |text_char| 300to stand for the data type of the characters in the input and output 301files. We shall also assume that |text_char| consists of the elements 302|chr(first_text_char)| through |chr(last_text_char)|, inclusive. The 303following definitions should be adjusted if necessary. 304@^system dependencies@> 305 306@d text_char == char {the data type of characters in text files} 307@d first_text_char=0 {ordinal number of the smallest element of |text_char|} 308@d last_text_char=255 {ordinal number of the largest element of |text_char|} 309 310@<Types...@>= 311@!text_file=packed file of text_char; 312 313@ The \.{WEAVE} and \.{TANGLE} processors convert between ASCII code and 314the user's external character set by means of arrays |xord| and |xchr| 315that are analogous to \PASCAL's |ord| and |chr| functions. 316 317@<Globals...@>= 318@!xord: array [text_char] of ASCII_code; 319 {specifies conversion of input characters} 320@!xchr: array [ASCII_code] of text_char; 321 {specifies conversion of output characters} 322 323@ If we assume that every system using \.{WEB} is able to read and write the 324visible characters of standard ASCII (although not necessarily using the 325ASCII codes to represent them), the following assignment statements initialize 326most of the |xchr| array properly, without needing any system-dependent 327changes. For example, the statement \.{xchr[@@\'101]:=\'A\'} that appears 328in the present \.{WEB} file might be encoded in, say, {\mc EBCDIC} code 329on the external medium on which it resides, but \.{TANGLE} will convert from 330this external code to ASCII and back again. Therefore the assignment 331statement \.{XCHR[65]:=\'A\'} will appear in the corresponding \PASCAL\ file, 332and \PASCAL\ will compile this statement so that |xchr[65]| receives the 333character \.A in the external (|char|) code. Note that it would be quite 334incorrect to say \.{xchr[@@\'101]:="A"}, because |"A"| is a constant of 335type |integer|, not |char|, and because we have $|"A"|=65$ regardless of 336the external character set. 337 338@<Set init...@>= 339xchr[@'40]:=' '; 340xchr[@'41]:='!'; 341xchr[@'42]:='"'; 342xchr[@'43]:='#'; 343xchr[@'44]:='$'; 344xchr[@'45]:='%'; 345xchr[@'46]:='&'; 346xchr[@'47]:='''';@/ 347xchr[@'50]:='('; 348xchr[@'51]:=')'; 349xchr[@'52]:='*'; 350xchr[@'53]:='+'; 351xchr[@'54]:=','; 352xchr[@'55]:='-'; 353xchr[@'56]:='.'; 354xchr[@'57]:='/';@/ 355xchr[@'60]:='0'; 356xchr[@'61]:='1'; 357xchr[@'62]:='2'; 358xchr[@'63]:='3'; 359xchr[@'64]:='4'; 360xchr[@'65]:='5'; 361xchr[@'66]:='6'; 362xchr[@'67]:='7';@/ 363xchr[@'70]:='8'; 364xchr[@'71]:='9'; 365xchr[@'72]:=':'; 366xchr[@'73]:=';'; 367xchr[@'74]:='<'; 368xchr[@'75]:='='; 369xchr[@'76]:='>'; 370xchr[@'77]:='?';@/ 371xchr[@'100]:='@@'; 372xchr[@'101]:='A'; 373xchr[@'102]:='B'; 374xchr[@'103]:='C'; 375xchr[@'104]:='D'; 376xchr[@'105]:='E'; 377xchr[@'106]:='F'; 378xchr[@'107]:='G';@/ 379xchr[@'110]:='H'; 380xchr[@'111]:='I'; 381xchr[@'112]:='J'; 382xchr[@'113]:='K'; 383xchr[@'114]:='L'; 384xchr[@'115]:='M'; 385xchr[@'116]:='N'; 386xchr[@'117]:='O';@/ 387xchr[@'120]:='P'; 388xchr[@'121]:='Q'; 389xchr[@'122]:='R'; 390xchr[@'123]:='S'; 391xchr[@'124]:='T'; 392xchr[@'125]:='U'; 393xchr[@'126]:='V'; 394xchr[@'127]:='W';@/ 395xchr[@'130]:='X'; 396xchr[@'131]:='Y'; 397xchr[@'132]:='Z'; 398xchr[@'133]:='['; 399xchr[@'134]:='\'; 400xchr[@'135]:=']'; 401xchr[@'136]:='^'; 402xchr[@'137]:='_';@/ 403xchr[@'140]:='`'; 404xchr[@'141]:='a'; 405xchr[@'142]:='b'; 406xchr[@'143]:='c'; 407xchr[@'144]:='d'; 408xchr[@'145]:='e'; 409xchr[@'146]:='f'; 410xchr[@'147]:='g';@/ 411xchr[@'150]:='h'; 412xchr[@'151]:='i'; 413xchr[@'152]:='j'; 414xchr[@'153]:='k'; 415xchr[@'154]:='l'; 416xchr[@'155]:='m'; 417xchr[@'156]:='n'; 418xchr[@'157]:='o';@/ 419xchr[@'160]:='p'; 420xchr[@'161]:='q'; 421xchr[@'162]:='r'; 422xchr[@'163]:='s'; 423xchr[@'164]:='t'; 424xchr[@'165]:='u'; 425xchr[@'166]:='v'; 426xchr[@'167]:='w';@/ 427xchr[@'170]:='x'; 428xchr[@'171]:='y'; 429xchr[@'172]:='z'; 430xchr[@'173]:='{'; 431xchr[@'174]:='|'; 432xchr[@'175]:='}'; 433xchr[@'176]:='~';@/ 434xchr[0]:=' '; xchr[@'177]:=' '; {these ASCII codes are not used} 435 436@ Some of the ASCII codes below @'40 have been given symbolic names in 437\.{WEAVE} and \.{TANGLE} because they are used with a special meaning. 438 439@d and_sign=@'4 {equivalent to `\.{and}'} 440@d not_sign=@'5 {equivalent to `\.{not}'} 441@d set_element_sign=@'6 {equivalent to `\.{in}'} 442@d tab_mark=@'11 {ASCII code used as tab-skip} 443@d line_feed=@'12 {ASCII code thrown away at end of line} 444@d form_feed=@'14 {ASCII code used at end of page} 445@d carriage_return=@'15 {ASCII code used at end of line} 446@d left_arrow=@'30 {equivalent to `\.{:=}'} 447@d not_equal=@'32 {equivalent to `\.{<>}'} 448@d less_or_equal=@'34 {equivalent to `\.{<=}'} 449@d greater_or_equal=@'35 {equivalent to `\.{>=}'} 450@d equivalence_sign=@'36 {equivalent to `\.{==}'} 451@d or_sign=@'37 {equivalent to `\.{or}'} 452 453@ When we initialize the |xord| array and the remaining parts of |xchr|, 454it will be convenient to make use of an index variable, |i|. 455 456@<Local variables for init...@>= 457@!i:0..255; 458 459@ Here now is the system-dependent part of the character set. 460If \.{WEB} is being implemented on a garden-variety \PASCAL\ for which 461only standard ASCII codes will appear in the input and output files, you 462don't need to make any changes here. But if you have, for example, an extended 463character set like the one in Appendix~C of {\sl The \TeX book}, the first 464line of code in this module should be changed to 465$$\hbox{|for i:=1 to @'37 do xchr[i]:=chr(i);|}$$ 466\.{WEB}'s character set is essentially identical to \TeX's, even with respect to 467characters less than @'40. 468@^system dependencies@> 469 470Changes to the present module will make \.{WEB} more friendly on computers 471that have an extended character set, so that one can type things like 472\.^^Z\ instead of \.{<>}. If you have an extended set of characters that 473are easily incorporated into text files, you can assign codes arbitrarily 474here, giving an |xchr| equivalent to whatever characters the users of 475\.{WEB} are allowed to have in their input files, provided that unsuitable 476characters do not correspond to special codes like |carriage_return| 477that are listed above. 478 479(The present file \.{WEAVE.WEB} does not contain any of the non-ASCII 480characters, because it is intended to be used with all implementations of 481\.{WEB}. It was originally created on a Stanford system that has a 482convenient extended character set, then ``sanitized'' by applying another 483program that transliterated all of the non-standard characters into 484standard equivalents.) 485 486@<Set init...@>= 487for i:=1 to @'37 do xchr[i]:=' '; 488for i:=@'200 to @'377 do xchr[i]:=' '; 489 490@ The following system-independent code makes the |xord| array contain a 491suitable inverse to the information in |xchr|. 492 493@<Set init...@>= 494for i:=first_text_char to last_text_char do xord[chr(i)]:=" "; 495for i:=1 to @'377 do xord[xchr[i]]:=i; 496xord[' ']:=" "; 497 498@* Input and output. 499The input conventions of this program are intended to be very much like those 500of \TeX\ (except, of course, that they are much simpler, because much less 501needs to be done). Furthermore they are identical to those of \.{TANGLE}. 502Therefore people who need to make modifications to all three systems 503should be able to do so without too many headaches. 504 505We use the standard \PASCAL\ input/output procedures in several places that 506\TeX\ cannot, since \.{WEAVE} does not have to deal with files that are named 507dynamically by the user, and since there is no input from the terminal. 508 509@ Terminal output is done by writing on file |term_out|, which is assumed to 510consist of characters of type |text_char|: 511@^system dependencies@> 512 513@d print(#)==write(term_out,#) {`|print|' means write on the terminal} 514@d print_ln(#)==write_ln(term_out,#) {`|print|' and then start new line} 515@d new_line==write_ln(term_out) {start new line} 516@d print_nl(#)== {print information starting on a new line} 517 begin new_line; print(#); 518 end 519 520@<Globals...@>= 521@!term_out:text_file; {the terminal as an output file} 522 523@ Different systems have different ways of specifying that the output on a 524certain file will appear on the user's terminal. Here is one way to do this 525on the \PASCAL\ system that was used in \.{TANGLE}'s initial development: 526@^system dependencies@> 527 528@<Set init...@>= 529rewrite(term_out,'TTY:'); {send |term_out| output to the terminal} 530 531@ The |update_terminal| procedure is called when we want 532to make sure that everything we have output to the terminal so far has 533actually left the computer's internal buffers and been sent. 534@^system dependencies@> 535 536@d update_terminal == break(term_out) {empty the terminal output buffer} 537 538@ The main input comes from |web_file|; this input may be overridden 539by changes in |change_file|. (If |change_file| is empty, there are no changes.) 540 541@<Globals...@>= 542@!web_file:text_file; {primary input} 543@!change_file:text_file; {updates} 544 545@ The following code opens the input files. Since these files were listed 546in the program header, we assume that the \PASCAL\ runtime system has 547already checked that suitable file names have been given; therefore no 548additional error checking needs to be done. We will see below that 549\.{WEAVE} reads through the entire input twice. 550@^system dependencies@> 551 552@p procedure open_input; {prepare to read |web_file| and |change_file|} 553begin reset(web_file); reset(change_file); 554end; 555 556@ The main output goes to |tex_file|. 557 558@<Globals...@>= 559@!tex_file: text_file; 560 561@ The following code opens |tex_file|. 562Since this file was listed in the program header, we assume that the 563\PASCAL\ runtime system has checked that a suitable external file name has 564been given. 565@^system dependencies@> 566 567@<Set init...@>= 568rewrite(tex_file); 569 570@ Input goes into an array called |buffer|. 571 572@<Globals...@>=@!buffer: array[0..long_buf_size] of ASCII_code; 573 574@ The |input_ln| procedure brings the next line of input from the specified 575file into the |buffer| array and returns the value |true|, unless the file has 576already been entirely read, in which case it returns |false|. The conventions 577of \TeX\ are followed; i.e., |ASCII_code| numbers representing the next line 578of the file are input into |buffer[0]|, |buffer[1]|, \dots, 579|buffer[limit-1]|; trailing blanks are ignored; 580and the global variable |limit| is set to the length of the 581@^system dependencies@> 582line. The value of |limit| must be strictly less than |buf_size|. 583 584We assume that none of the |ASCII_code| values 585of |buffer[j]| for |0<=j<limit| is equal to 0, @'177, |line_feed|, |form_feed|, 586or |carriage_return|. Since |buf_size| is strictly less than |long_buf_size|, 587some of \.{WEAVE}'s routines use the fact that it is safe to refer to 588|buffer[limit+2]| without overstepping the bounds of the array. 589 590@p function input_ln(var f:text_file):boolean; 591 {inputs a line or returns |false|} 592var final_limit:0..buf_size; {|limit| without trailing blanks} 593begin limit:=0; final_limit:=0; 594if eof(f) then input_ln:=false 595else begin while not eoln(f) do 596 begin buffer[limit]:=xord[f^]; get(f); 597 incr(limit); 598 if buffer[limit-1]<>" " then final_limit:=limit; 599 if limit=buf_size then 600 begin while not eoln(f) do get(f); 601 decr(limit); {keep |buffer[buf_size]| empty} 602 if final_limit>limit then final_limit:=limit; 603 print_nl('! Input line too long'); loc:=0; error; 604@.Input line too long@> 605 end; 606 end; 607 read_ln(f); limit:=final_limit; input_ln:=true; 608 end; 609end; 610 611@* Reporting errors to the user. 612The \.{WEAVE} processor operates in three phases: first it inputs the source 613file and stores cross-reference data, then it inputs the source once again and 614produces the \TeX\ output file, and finally it sorts and outputs the index. 615 616The global variables |phase_one| and |phase_three| tell which Phase we are in. 617 618@<Globals...@>= 619@!phase_one: boolean; {|true| in Phase I, |false| in Phases II and III} 620@!phase_three: boolean; {|true| in Phase III, |false| in Phases I and II} 621 622@ If an error is detected while we are debugging, 623we usually want to look at the contents of memory. 624A special procedure will be declared later for this purpose. 625 626@<Error handling...@>= 627@!debug@+ procedure debug_help; forward;@+gubed 628 629@ The command `|err_print('! Error message')|' will report a syntax error to 630the user, by printing the error message at the beginning of a new line and 631then giving an indication of where the error was spotted in the source file. 632Note that no period follows the error message, since the error routine 633will automatically supply a period. 634 635The actual error indications are provided by a procedure called |error|. 636However, error messages are not actually reported during phase one, 637since errors detected on the first pass will be detected again 638during the second. 639 640@d err_print(#)== 641 begin if not phase_one then 642 begin new_line; print(#); error; 643 end; 644 end 645 646@<Error handling...@>= 647procedure error; {prints `\..' and location of error message} 648var@!k,@!l: 0..long_buf_size; {indices into |buffer|} 649begin @<Print error location based on input buffer@>; 650update_terminal; mark_error; 651@!debug debug_skipped:=debug_cycle;debug_help;@+gubed 652end; 653 654@ The error locations can be indicated by using the global variables 655|loc|, |line|, and |changing|, which tell respectively the first 656unlooked-at position in |buffer|, the current line number, and whether or not 657the current line is from |change_file| or |web_file|. 658This routine should be modified on systems whose standard text editor 659has special line-numbering conventions. 660@^system dependencies@> 661 662@<Print error location based on input buffer@>= 663begin if changing then print('. (change file ')@+else print('. ('); 664print_ln('l.', line:1, ')'); 665if loc>=limit then l:=limit else l:=loc; 666for k:=1 to l do 667 if buffer[k-1]=tab_mark then print(' ') 668 else print(xchr[buffer[k-1]]); {print the characters already read} 669new_line; 670for k:=1 to l do print(' '); {space out the next line} 671for k:=l+1 to limit do print(xchr[buffer[k-1]]); {print the part not yet read} 672if buffer[limit]="|" then print(xchr["|"]); 673 {end of \PASCAL\ text in module names} 674print(' '); {this space separates the message from future asterisks} 675end 676 677@ The |jump_out| procedure just cuts across all active procedure levels 678and jumps out of the program. This is the only non-local \&{goto} statement 679in \.{WEAVE}. It is used when no recovery from a particular error has 680been provided. 681 682Some \PASCAL\ compilers do not implement non-local |goto| statements. 683@^system dependencies@> 684In such cases the code that appears at label |end_of_WEAVE| should be 685copied into the |jump_out| procedure, followed by a call to a system procedure 686that terminates the program. 687 688@d fatal_error(#)==begin new_line; print(#); error; mark_fatal; jump_out; 689 end 690 691@<Error handling...@>= 692procedure jump_out; 693begin goto end_of_WEAVE; 694end; 695 696@ Sometimes the program's behavior is far different from what it should be, 697and \.{WEAVE} prints an error message that is really for the \.{WEAVE} 698maintenance person, not the user. In such cases the program says 699|confusion('indication of where we are')|. 700 701@d confusion(#)==fatal_error('! This can''t happen (',#,')') 702@.This can't happen@> 703 704@ An overflow stop occurs if \.{WEAVE}'s tables aren't large enough. 705 706@d overflow(#)==fatal_error('! Sorry, ',#,' capacity exceeded') 707@.Sorry, x capacity exceeded@> 708 709@* Data structures. 710During the first phase of its processing, \.{WEAVE} puts identifier names, 711index entries, and module names into the large |byte_mem| array, which is 712packed with eight-bit integers. Allocation is sequential, since names are 713never deleted. 714 715An auxiliary array |byte_start| is used as a directory for |byte_mem|, 716and the |link|, |ilk|, and |xref| arrays give further information about names. 717These auxiliary arrays consist of sixteen-bit items. 718 719@<Types...@>= 720@!eight_bits=0..255; {unsigned one-byte quantity} 721@!sixteen_bits=0..65535; {unsigned two-byte quantity} 722 723@ \.{WEAVE} has been designed to avoid the need for indices that are more 724than sixteen bits wide, so that it can be used on most computers. But 725there are programs that need more than 65536 bytes; \TeX\ is one of these. 726To get around this problem, a slight complication has been added to the 727data structures: |byte_mem| is a two-dimensional array, whose first index 728is either 0 or 1. (For generality, the first index is actually allowed to 729run between 0 and |ww-1|, where |ww| is defined to be 2; the program will 730work for any positive value of |ww|, and it can be simplified in obvious 731ways if |ww=1|.) 732 733@d ww=2 {we multiply the byte capacity by approximately this amount} 734 735@<Globals...@>= 736@!byte_mem: packed array [0..ww-1,0..max_bytes] of ASCII_code; 737 {characters of names} 738@!byte_start: array [0..max_names] of sixteen_bits; {directory into |byte_mem|} 739@!link: array [0..max_names] of sixteen_bits; {hash table or tree links} 740@!ilk: array [0..max_names] of sixteen_bits; {type codes or tree links} 741@!xref: array [0..max_names] of sixteen_bits; {heads of cross-reference lists} 742 743@ The names of identifiers are found by computing a hash address |h| and 744then looking at strings of bytes signified by |hash[h]|, |link[hash[h]]|, 745|link[link[hash[h]]]|, \dots, until either finding the desired name 746or encountering a zero. 747 748A `|name_pointer|' variable, which signifies a name, is an index into 749|byte_start|. The actual sequence of characters in the name pointed to by 750|p| appears in positions |byte_start[p]| to |byte_start[p+ww]-1|, inclusive, 751in the segment of |byte_mem| whose first index is |p mod ww|. Thus, when 752|ww=2| the even-numbered name bytes appear in |byte_mem[0,@t$*$@>]| 753and the odd-numbered ones appear in |byte_mem[1,@t$*$@>]|. 754The pointer 0 is used for undefined module names; we don't 755want to use it for the names of identifiers, since 0 stands for a null 756pointer in a linked list. 757 758We usually have |byte_start[name_ptr+w]=byte_ptr[(name_ptr+w) mod ww]| 759for |0<=w<ww|, since these are the starting positions for the next |ww| 760names to be stored in |byte_mem|. 761 762@d length(#)==byte_start[#+ww]-byte_start[#] {the length of a name} 763 764@<Types...@>= 765@!name_pointer=0..max_names; {identifies a name} 766 767@ @<Global...@>= 768@!name_ptr:name_pointer; {first unused position in |byte_start|} 769@!byte_ptr:array [0..ww-1] of 0..max_bytes; 770 {first unused position in |byte_mem|} 771 772@ @<Local variables for init...@>= 773@!wi: 0..ww-1; {to initialize the |byte_mem| indices} 774 775@ @<Set init...@>= 776for wi:=0 to ww-1 do 777 begin byte_start[wi]:=0; byte_ptr[wi]:=0; 778 end; 779byte_start[ww]:=0; {this makes name 0 of length zero} 780name_ptr:=1; 781 782@ Several types of identifiers are distinguished by their |ilk|: 783 784\yskip\hang |normal| identifiers are part of the \PASCAL\ program and 785will appear in italic type. 786 787\yskip\hang |roman| identifiers are index entries that appear after 788\.{@@\^} in the \.{WEB} file. 789 790\yskip\hang |wildcard| identifiers are index entries that appear after 791\.{@@:} in the \.{WEB} file. 792 793\yskip\hang |typewriter| identifiers are index entries that appear after 794\.{@@.} in the \.{WEB} file. 795 796\yskip\hang |array_like|, |begin_like|, \dots, |var_like| 797identifiers are \PASCAL\ reserved words whose |ilk| explains how they are 798to be treated when \PASCAL\ code is being formatted. 799 800\yskip\hang Finally, if |c| is an ASCII code, an |ilk| equal to 801|char_like+c| denotes a reserved word that will be converted to character 802|c|. 803 804@d normal=0 {ordinary identifiers have |normal| ilk} 805@d roman=1 {normal index entries have |roman| ilk} 806@d wildcard=2 {user-formatted index entries have |wildcard| ilk} 807@d typewriter=3 {`typewriter type' entries have |typewriter| ilk} 808@d reserved(#)==(ilk[#]>typewriter) {tells if a name is a reserved word} 809@d array_like=4 {\&{array}, \&{file}, \&{set}} 810@d begin_like=5 {\&{begin}} 811@d case_like=6 {\&{case}} 812@d const_like=7 {\&{const}, \&{label}, \&{type}} 813@d div_like=8 {\&{div}, \&{mod}} 814@d do_like=9 {\&{do}, \&{of}, \&{then}} 815@d else_like=10 {\&{else}} 816@d end_like=11 {\&{end}} 817@d for_like=12 {\&{for}, \&{while}, \&{with}} 818@d goto_like=13 {\&{goto}, \&{packed}} 819@d if_like=14 {\&{if}} 820@d in_like=15 {\&{in}} 821@d nil_like=16 {\&{nil}} 822@d proc_like=17 {\&{function}, \&{procedure}, \&{program}} 823@d record_like=18 {\&{record}} 824@d repeat_like=19 {\&{repeat}} 825@d to_like=20 {\&{downto}, \&{to}} 826@d until_like=21 {\&{until}} 827@d var_like=22 {\&{var}} 828@d loop_like=23 {\&{loop}, \&{xclause}} 829@d char_like=24 {\&{and}, \&{or}, \&{not}, \&{in}} 830 831@ The names of modules are stored in |byte_mem| together 832with the identifier names, but a hash table is not used for them because 833\.{WEAVE} needs to be able to recognize a module name when given a prefix of 834that name. A conventional binary seach tree is used to retrieve module names, 835with fields called |llink| and |rlink| in place of |link| and |ilk|. The 836root of this tree is |rlink[0]|. 837 838@d llink==link {left link in binary search tree for module names} 839@d rlink==ilk {right link in binary search tree for module names} 840@d root==rlink[0] {the root of the binary search tree for module names} 841 842@<Set init...@>= 843root:=0; {the binary search tree starts out with nothing in it} 844 845@ Here is a little procedure that prints the text of a given name on the 846user's terminal. 847 848@p procedure print_id(@!p:name_pointer); {print identifier or module name} 849var k:0..max_bytes; {index into |byte_mem|} 850@!w:0..ww-1; {row of |byte_mem|} 851begin if p>=name_ptr then print('IMPOSSIBLE') 852else begin w:=p mod ww; 853 for k:=byte_start[p] to byte_start[p+ww]-1 do 854 print(xchr[byte_mem[w,k]]); 855 end; 856end; 857 858@ We keep track of the current module number in 859|module_count|, which is the total number of modules that have started. 860Modules which have been altered by a change file entry 861have their |changed_module| flag turned on during the first phase. 862 863@<Globals...@>= 864@!module_count:0..max_modules; {the current module number} 865@!changed_module: packed array [0..max_modules] of boolean; {is it changed?} 866@!change_exists: boolean; {has any module changed?} 867 868@ The other large memory area in \.{WEAVE} keeps the cross-reference data. 869All uses of the name |p| are recorded in a linked list beginning at 870|xref[p]|, which points into the |xmem| array. Entries in |xmem| consist 871of two sixteen-bit items per word, called the |num| and |xlink| fields. 872If |x| is an index into |xmem|, reached from name |p|, the value of |num(x)| 873is either a module number where |p| is used, or it is |def_flag| plus a 874module number where |p| is defined; and |xlink(x)| points to the next such 875cross reference for |p|, if any. This list of cross references is in 876decreasing order by module number. The current number of cross references 877is |xref_ptr|. 878 879The global variable |xref_switch| is set either to |def_flag| or to zero, 880depending on whether the next cross reference to an identifier is to be 881underlined or not in the index. This switch is set to |def_flag| when 882\.{@@!} or \.{@@d} or \.{@@f} is scanned, and it is cleared to zero when 883the next identifier or index entry cross reference has been made. Similarly, 884the global variable |mod_xref_switch| is either |def_flag| or zero, depending 885on whether a module name is being defined or used. 886 887@d num(#)==xmem[#].num_field 888@d xlink(#)==xmem[#].xlink_field 889@d def_flag=10240 {must be strictly larger than |max_modules|} 890 891@ @<Types...@>= 892@!xref_number=0..max_refs; 893 894@ @<Globals...@>= 895@!xmem:array[xref_number] of packed record@t@>@/ 896 @!num_field: sixteen_bits; {module number plus zero or |def_flag|} 897 @!xlink_field: sixteen_bits; {pointer to the previous cross reference} 898 end; 899@!xref_ptr:xref_number; {the largest occupied position in |xmem|} 900@!xref_switch,@!mod_xref_switch:0..def_flag; {either zero or |def_flag|} 901 902@ @<Set init...@>=xref_ptr:=0; xref_switch:=0; mod_xref_switch:=0; num(0):=0; 903xref[0]:=0; {cross references to undefined modules} 904 905@ A new cross reference for an identifier is formed by calling |new_xref|, 906which discards duplicate entries and ignores non-underlined references 907to one-letter identifiers or \PASCAL's reserved words. 908 909@d append_xref(#)==if xref_ptr=max_refs then overflow('cross reference') 910 else begin incr(xref_ptr); num(xref_ptr):=#; 911 end 912 913@p procedure new_xref(@!p:name_pointer); 914label exit; 915var q:xref_number; {pointer to previous cross reference} 916@!m,@!n: sixteen_bits; {new and previous cross-reference value} 917begin if (reserved(p)or(byte_start[p]+1=byte_start[p+ww]))and 918 (xref_switch=0) then return; 919m:=module_count+xref_switch; xref_switch:=0; q:=xref[p]; 920if q>0 then 921 begin n:=num(q); 922 if (n=m)or(n=m+def_flag) then return 923 else if m=n+def_flag then 924 begin num(q):=m; return; 925 end; 926 end; 927append_xref(m); xlink(xref_ptr):=q; xref[p]:=xref_ptr; 928exit: end; 929 930@ The cross reference lists for module names are slightly different. Suppose 931that a module name is defined in modules $m_1$, \dots, $m_k$ and used in 932modules $n_1$, \dots, $n_l$. Then its list will contain $m_1+|def_flag|$, 933$m_k+|def_flag|$, \dots, $m_2+|def_flag|$, $n_l$, \dots, $n_1$, in 934this order. After Phase II, however, the order will be 935$m_1+|def_flag|$, \dots, $m_k+|def_flag|$, $n_1$, \dots, $n_l$. 936 937@p procedure new_mod_xref(@!p:name_pointer); 938var q,@!r:xref_number; {pointers to previous cross references} 939begin q:=xref[p]; r:=0; 940if q>0 then 941 begin if mod_xref_switch=0 then while num(q)>=def_flag do 942 begin r:=q; q:=xlink(q); 943 end 944 else if num(q)>=def_flag then 945 begin r:=q; q:=xlink(q); 946 end; 947 end; 948append_xref(module_count+mod_xref_switch); xlink(xref_ptr):=q; 949mod_xref_switch:=0; 950if r=0 then xref[p]:=xref_ptr 951else xlink(r):=xref_ptr; 952end; 953 954@ A third large area of memory is used for sixteen-bit `tokens', which appear 955in short lists similar to the strings of characters in |byte_mem|. Token lists 956are used to contain the result of \PASCAL\ code translated into \TeX\ form; 957further details about them will be explained later. A |text_pointer| variable 958is an index into |tok_start|. 959 960@<Types...@>= 961@!text_pointer=0..max_texts; {identifies a token list} 962 963@ The first position of |tok_mem| 964that is unoccupied by replacement text is called |tok_ptr|, and the first 965unused location of |tok_start| is called |text_ptr|. 966Thus, we usually have |tok_start[text_ptr]=tok_ptr|. 967 968@<Glob...@>= 969@t\hskip1em@>@!tok_mem: packed array [0..max_toks] of sixteen_bits; {tokens} 970@t\hskip1em@>@!tok_start: array [text_pointer] of sixteen_bits; 971 {directory into |tok_mem|} 972@t\hskip1em@>@!text_ptr:text_pointer; {first unused position in |tok_start|} 973@t\hskip1em@>@!tok_ptr:0..max_toks; {first unused position in |tok_mem|} 974stat@!max_tok_ptr,@!max_txt_ptr:0..max_toks; {largest values occurring} 975tats 976 977@ @<Set init...@>= 978tok_ptr:=1; text_ptr:=1; tok_start[0]:=1; tok_start[1]:=1; 979stat max_tok_ptr:=1; max_txt_ptr:=1;@+tats 980 981@* Searching for identifiers. 982The hash table described above is updated by the |id_lookup| procedure, 983which finds a given identifier and returns a pointer to its index in 984|byte_start|. The identifier is supposed to match character by character 985and it is also supposed to have a given |ilk| code; the same name may be 986present more than once if it is supposed to appear in the index with 987different typesetting conventions. 988If the identifier was not already present, it is inserted into the table. 989 990Because of the way \.{WEAVE}'s scanning mechanism works, it is most convenient 991to let |id_lookup| search for an identifier that is present in the |buffer| 992array. Two other global variables specify its position in the buffer: the 993first character is |buffer[id_first]|, and the last is |buffer[id_loc-1]|. 994 995@<Glob...@>= 996@!id_first:0..long_buf_size; {where the current identifier begins in the buffer} 997@!id_loc:0..long_buf_size; {just after the current identifier in the buffer} 998@# 999@!hash:array [0..hash_size] of sixteen_bits; {heads of hash lists} 1000 1001@ Initially all the hash lists are empty. 1002 1003@<Local variables for init...@>= 1004@!h:0..hash_size; {index into hash-head array} 1005 1006@ @<Set init...@>= 1007for h:=0 to hash_size-1 do hash[h]:=0; 1008 1009@ Here now is the main procedure for finding identifiers (and index 1010entries). The parameter |t| is set to the desired |ilk| code. The 1011identifier must either have |ilk=t|, or we must have 1012|t=normal| and the identifier must be a reserved word. 1013 1014@p function id_lookup(@!t:eight_bits):name_pointer; {finds current identifier} 1015label found; 1016var i:0..long_buf_size; {index into |buffer|} 1017@!h:0..hash_size; {hash code} 1018@!k:0..max_bytes; {index into |byte_mem|} 1019@!w:0..ww-1; {row of |byte_mem|} 1020@!l:0..long_buf_size; {length of the given identifier} 1021@!p:name_pointer; {where the identifier is being sought} 1022begin l:=id_loc-id_first; {compute the length} 1023@<Compute the hash code |h|@>; 1024@<Compute the name location |p|@>; 1025if p=name_ptr then @<Enter a new name into the table at position |p|@>; 1026id_lookup:=p; 1027end; 1028 1029@ A simple hash code is used: If the sequence of 1030ASCII codes is $c_1c_2\ldots c_m$, its hash value will be 1031$$(2^{n-1}c_1+2^{n-2}c_2+\cdots+c_n)\,\bmod\,|hash_size|.$$ 1032 1033@<Compute the hash...@>= 1034h:=buffer[id_first]; i:=id_first+1; 1035while i<id_loc do 1036 begin h:=(h+h+buffer[i]) mod hash_size; incr(i); 1037 end 1038 1039@ If the identifier is new, it will be placed in position |p=name_ptr|, 1040otherwise |p| will point to its existing location. 1041 1042@<Compute the name location...@>= 1043p:=hash[h]; 1044while p<>0 do 1045 begin if (length(p)=l)and((ilk[p]=t)or((t=normal)and reserved(p))) then 1046 @<Compare name |p| with current identifier, 1047 |goto found| if equal@>; 1048 p:=link[p]; 1049 end; 1050p:=name_ptr; {the current identifier is new} 1051link[p]:=hash[h]; hash[h]:=p; {insert |p| at beginning of hash list} 1052found: 1053 1054@ @<Compare name |p|...@>= 1055begin i:=id_first; k:=byte_start[p]; w:=p mod ww; 1056while (i<id_loc)and(buffer[i]=byte_mem[w,k]) do 1057 begin incr(i); incr(k); 1058 end; 1059if i=id_loc then goto found; {all characters agree} 1060end 1061 1062@ When we begin the following segment of the program, |p=name_ptr|. 1063 1064@<Enter a new name...@>= 1065begin w:=name_ptr mod ww; 1066if byte_ptr[w]+l>max_bytes then overflow('byte memory'); 1067if name_ptr+ww>max_names then overflow('name'); 1068i:=id_first; k:=byte_ptr[w]; {get ready to move the identifier into |byte_mem|} 1069while i<id_loc do 1070 begin byte_mem[w,k]:=buffer[i]; incr(k); incr(i); 1071 end; 1072byte_ptr[w]:=k; byte_start[name_ptr+ww]:=k; incr(name_ptr); 1073ilk[p]:=t; xref[p]:=0; 1074end 1075 1076@* Initializing the table of reserved words. 1077We have to get \PASCAL's reserved words into the hash table, and the 1078simplest way to do this is to insert them every time \.{WEAVE} is run. 1079A few macros permit us to do the initialization with a compact program. 1080 1081@d sid9(#)==buffer[9]:=#;cur_name:=id_lookup 1082@d sid8(#)==buffer[8]:=#;sid9 1083@d sid7(#)==buffer[7]:=#;sid8 1084@d sid6(#)==buffer[6]:=#;sid7 1085@d sid5(#)==buffer[5]:=#;sid6 1086@d sid4(#)==buffer[4]:=#;sid5 1087@d sid3(#)==buffer[3]:=#;sid4 1088@d sid2(#)==buffer[2]:=#;sid3 1089@d sid1(#)==buffer[1]:=#;sid2 1090@d id2==id_first:=8; sid8 1091@d id3==id_first:=7; sid7 1092@d id4==id_first:=6; sid6 1093@d id5==id_first:=5; sid5 1094@d id6==id_first:=4; sid4 1095@d id7==id_first:=3; sid3 1096@d id8==id_first:=2; sid2 1097@d id9==id_first:=1; sid1 1098 1099@<Globals...@>= 1100@!cur_name:name_pointer; {points to the identifier just inserted} 1101 1102@ The intended use of the macros above might not be immediately obvious, 1103but the riddle is answered by the following: 1104 1105@<Store all the reserved words@>= 1106id_loc:=10;@/ 1107id3("a")("n")("d")(char_like+and_sign);@/ 1108id5("a")("r")("r")("a")("y")(array_like);@/ 1109id5("b")("e")("g")("i")("n")(begin_like);@/ 1110id4("c")("a")("s")("e")(case_like);@/ 1111id5("c")("o")("n")("s")("t")(const_like);@/ 1112id3("d")("i")("v")(div_like);@/ 1113id2("d")("o")(do_like);@/ 1114id6("d")("o")("w")("n")("t")("o")(to_like);@/ 1115id4("e")("l")("s")("e")(else_like);@/ 1116id3("e")("n")("d")(end_like);@/ 1117id4("f")("i")("l")("e")(array_like);@/ 1118id3("f")("o")("r")(for_like);@/ 1119id8("f")("u")("n")("c")("t")("i")("o")("n")(proc_like);@/ 1120id4("g")("o")("t")("o")(goto_like);@/ 1121id2("i")("f")(if_like);@/ 1122id2("i")("n")(char_like+set_element_sign);@/ 1123id5("l")("a")("b")("e")("l")(const_like);@/ 1124id3("m")("o")("d")(div_like);@/ 1125id3("n")("i")("l")(nil_like);@/ 1126id3("n")("o")("t")(char_like+not_sign);@/ 1127id2("o")("f")(do_like);@/ 1128id2("o")("r")(char_like+or_sign);@/ 1129id6("p")("a")("c")("k")("e")("d")(goto_like);@/ 1130id9("p")("r")("o")("c")("e")("d")("u")("r")("e")(proc_like);@/ 1131id7("p")("r")("o")("g")("r")("a")("m")(proc_like);@/ 1132id6("r")("e")("c")("o")("r")("d")(record_like);@/ 1133id6("r")("e")("p")("e")("a")("t")(repeat_like);@/ 1134id3("s")("e")("t")(array_like);@/ 1135id4("t")("h")("e")("n")(do_like);@/ 1136id2("t")("o")(to_like);@/ 1137id4("t")("y")("p")("e")(const_like);@/ 1138id5("u")("n")("t")("i")("l")(until_like);@/ 1139id3("v")("a")("r")(var_like);@/ 1140id5("w")("h")("i")("l")("e")(for_like);@/ 1141id4("w")("i")("t")("h")(for_like);@/ 1142id7("x")("c")("l")("a")("u")("s")("e")(loop_like);@/ 1143 1144@* Searching for module names. 1145The |mod_lookup| procedure finds the module name |mod_text[1..l]| in the 1146search tree, after inserting it if necessary, and returns a pointer to 1147where it was found. 1148 1149@<Glob...@>= 1150@!mod_text:array [0..longest_name] of ASCII_code; {name being sought for} 1151 1152@ According to the rules of \.{WEB}, no module name 1153should be a proper prefix of another, so a ``clean'' comparison should 1154occur between any two names. The result of |mod_lookup| is 0 if this 1155prefix condition is violated. An error message is printed when such violations 1156are detected during phase two of \.{WEAVE}. 1157 1158@d less=0 {the first name is lexicographically less than the second} 1159@d equal=1 {the first name is equal to the second} 1160@d greater=2 {the first name is lexicographically greater than the second} 1161@d prefix=3 {the first name is a proper prefix of the second} 1162@d extension=4 {the first name is a proper extension of the second} 1163 1164@p function mod_lookup(@!l:sixteen_bits):name_pointer; {finds module name} 1165label found; 1166var c:less..extension; {comparison between two names} 1167@!j:0..longest_name; {index into |mod_text|} 1168@!k:0..max_bytes; {index into |byte_mem|} 1169@!w:0..ww-1; {row of |byte_mem|} 1170@!p:name_pointer; {current node of the search tree} 1171@!q:name_pointer; {father of node |p|} 1172begin c:=greater; q:=0; p:=root; 1173while p<>0 do 1174 begin @<Set variable |c| to the result of comparing the given name 1175 to name |p|@>; 1176 q:=p; 1177 if c=less then p:=llink[q] 1178 else if c=greater then p:=rlink[q] 1179 else goto found; 1180 end; 1181@<Enter a new module name into the tree@>; 1182found: if c<>equal then 1183 begin err_print('! Incompatible section names'); p:=0; 1184@.Incompatible section names@> 1185 end; 1186mod_lookup:=p; 1187end; 1188 1189@ @<Enter a new module name...@>= 1190w:=name_ptr mod ww; k:=byte_ptr[w]; 1191if k+l>max_bytes then overflow('byte memory'); 1192if name_ptr>max_names-ww then overflow('name'); 1193p:=name_ptr; 1194if c=less then llink[q]:=p else rlink[q]:=p; 1195llink[p]:=0; rlink[p]:=0; xref[p]:=0; c:=equal; 1196for j:=1 to l do byte_mem[w,k+j-1]:=mod_text[j]; 1197byte_ptr[w]:=k+l; byte_start[name_ptr+ww]:=k+l; incr(name_ptr); 1198 1199@ @<Set variable |c|...@>= 1200begin k:=byte_start[p]; w:=p mod ww; c:=equal; j:=1; 1201while (k<byte_start[p+ww]) and (j<=l) and (mod_text[j]=byte_mem[w,k]) do 1202 begin incr(k); incr(j); 1203 end; 1204if k=byte_start[p+ww] then 1205 if j>l then c:=equal 1206 else c:=extension 1207else if j>l then c:=prefix 1208else if mod_text[j]<byte_mem[w,k] then c:=less 1209else c:=greater; 1210end 1211 1212@ The |prefix_lookup| procedure is supposed to find exactly one module 1213name that has |mod_text[1..l]| as a prefix. Actually the algorithm 1214silently accepts also the situation that some module name is a prefix of 1215|mod_text[1..l]|, because the user who painstakingly typed in more than 1216necessary probably doesn't want to be told about the wasted effort. 1217 1218Recall that error messages are not printed during phase one. It is 1219possible that the |prefix_lookup| procedure will fail on the first pass, 1220because there is no match, yet the second pass might detect no error if a 1221matching module name has occurred after the offending prefix. In such a 1222case the cross-reference information will be incorrect and \.{WEAVE} will 1223report no error. However, such a mistake will be detected by the 1224\.{TANGLE} processor. 1225 1226@p function prefix_lookup(@!l:sixteen_bits):name_pointer; {finds name extension} 1227var c:less..extension; {comparison between two names} 1228@!count:0..max_names; {the number of hits} 1229@!j:0..longest_name; {index into |mod_text|} 1230@!k:0..max_bytes; {index into |byte_mem|} 1231@!w:0..ww-1; {row of |byte_mem|} 1232@!p:name_pointer; {current node of the search tree} 1233@!q:name_pointer; {another place to resume the search after one branch is done} 1234@!r:name_pointer; {extension found} 1235begin q:=0; p:=root; count:=0; r:=0; {begin search at root of tree} 1236while p<>0 do 1237 begin @<Set variable |c| to the result of comparing...@>; 1238 if c=less then p:=llink[p] 1239 else if c=greater then p:=rlink[p] 1240 else begin r:=p; incr(count); q:=rlink[p]; p:=llink[p]; 1241 end; 1242 if p=0 then 1243 begin p:=q; q:=0; 1244 end; 1245 end; 1246if count<>1 then 1247 if count=0 then err_print('! Name does not match') 1248@.Name does not match@> 1249 else err_print('! Ambiguous prefix'); 1250@.Ambiguous prefix@> 1251prefix_lookup:=r; {the result will be 0 if there was no match} 1252end; 1253 1254@* Lexical scanning. 1255Let us now consider the subroutines that read the \.{WEB} source file 1256and break it into meaningful units. There are four such procedures: 1257One simply skips to the next `\.{@@\ }' or `\.{@@*}' that begins a 1258module; another passes over the \TeX\ text at the beginning of a 1259module; the third passes over the \TeX\ text in a \PASCAL\ comment; 1260and the last, which is the most interesting, gets the next token of 1261a \PASCAL\ text. 1262 1263@ But first we need to consider the low-level routine |get_line| 1264that takes care of merging |change_file| into |web_file|. The |get_line| 1265procedure also updates the line numbers for error messages. 1266 1267@<Globals...@>= 1268@!ii:integer; {general purpose |for| loop variable in the outer block} 1269@!line:integer; {the number of the current line in the current file} 1270@!other_line:integer; {the number of the current line in the input file that 1271 is not currently being read} 1272@!temp_line:integer; {used when interchanging |line| with |other_line|} 1273@!limit:0..long_buf_size; {the last character position occupied in the buffer} 1274@!loc:0..long_buf_size; {the next character position to be read from the buffer} 1275@!input_has_ended: boolean; {if |true|, there is no more input} 1276@!changing: boolean; {if |true|, the current line is from |change_file|} 1277@!change_pending: boolean; {if |true|, the current change is not yet 1278 recorded in |changed_module[module_count]|} 1279 1280@ As we change |changing| from |true| to |false| and back again, we must 1281remember to swap the values of |line| and |other_line| so that the |err_print| 1282routine will be sure to report the correct line number. 1283 1284@d change_changing== 1285 changing := not changing; 1286 temp_line:=other_line; other_line:=line; line:=temp_line 1287 {|line @t$\null\BA\null$@> other_line|} 1288 1289@ When |changing| is |false|, the next line of |change_file| is kept in 1290|change_buffer[0..change_limit]|, for purposes of comparison with the next 1291line of |web_file|. After the change file has been completely input, we 1292set |change_limit:=0|, so that no further matches will be made. 1293 1294@<Globals...@>= 1295@!change_buffer:array[0..buf_size] of ASCII_code; 1296@!change_limit:0..buf_size; {the last position occupied in |change_buffer|} 1297 1298@ Here's a simple function that checks if the two buffers are different. 1299 1300@p function lines_dont_match:boolean; 1301label exit; 1302var k:0..buf_size; {index into the buffers} 1303begin lines_dont_match:=true; 1304if change_limit<>limit then return; 1305if limit>0 then 1306 for k:=0 to limit-1 do if change_buffer[k]<>buffer[k] then return; 1307lines_dont_match:=false; 1308exit: end; 1309 1310@ Procedure |prime_the_change_buffer| sets |change_buffer| in preparation 1311for the next matching operation. Since blank lines in the change file are 1312not used for matching, we have |(change_limit=0)and not changing| if and 1313only if the change file is exhausted. This procedure is called only 1314when |changing| is true; hence error messages will be reported correctly. 1315 1316@p procedure prime_the_change_buffer; 1317label continue, done, exit; 1318var k:0..buf_size; {index into the buffers} 1319begin change_limit:=0; {this value will be used if the change file ends} 1320@<Skip over comment lines in the change file; |return| if end of file@>; 1321@<Skip to the next nonblank line; |return| if end of file@>; 1322@<Move |buffer| and |limit| to |change_buffer| and |change_limit|@>; 1323exit: end; 1324 1325@ While looking for a line that begins with \.{@@x} in the change file, 1326we allow lines that begin with \.{@@}, as long as they don't begin with 1327\.{@@y} or \.{@@z} (which would probably indicate that the change file is 1328fouled up). 1329 1330@<Skip over comment lines in the change file...@>= 1331loop@+ begin incr(line); 1332 if not input_ln(change_file) then return; 1333 if limit<2 then goto continue; 1334 if buffer[0]<>"@@" then goto continue; 1335 if (buffer[1]>="X")and(buffer[1]<="Z") then 1336 buffer[1]:=buffer[1]+"z"-"Z"; {lowercasify} 1337 if buffer[1]="x" then goto done; 1338 if (buffer[1]="y")or(buffer[1]="z") then 1339 begin loc:=2; err_print('! Where is the matching @@x?'); 1340@.Where is the match...@> 1341 end; 1342continue: end; 1343done: 1344 1345@ Here we are looking at lines following the \.{@@x}. 1346 1347@<Skip to the next nonblank line...@>= 1348repeat incr(line); 1349 if not input_ln(change_file) then 1350 begin err_print('! Change file ended after @@x'); 1351@.Change file ended...@> 1352 return; 1353 end; 1354until limit>0; 1355 1356@ @<Move |buffer| and |limit| to |change_buffer| and |change_limit|@>= 1357begin change_limit:=limit; 1358if limit>0 then for k:=0 to limit-1 do change_buffer[k]:=buffer[k]; 1359end 1360 1361@ The following procedure is used to see if the next change entry should 1362go into effect; it is called only when |changing| is false. 1363The idea is to test whether or not the current 1364contents of |buffer| matches the current contents of |change_buffer|. 1365If not, there's nothing more to do; but if so, a change is called for: 1366All of the text down to the \.{@@y} is supposed to match. An error 1367message is issued if any discrepancy is found. Then the procedure 1368prepares to read the next line from |change_file|. 1369 1370When a match is found, the current module is marked as changed unless 1371the first line after the \.{@@x} and after the \.{@@y} both start with 1372either |'@@*'| or |'@@ '| (possibly preceded by whitespace). 1373 1374@d if_module_start_then_make_change_pending(#)== 1375 loc:=0; buffer[limit]:="!"; 1376 while (buffer[loc]=" ")or(buffer[loc]=tab_mark) do incr(loc); 1377 buffer[limit]:=" "; 1378 if buffer[loc]="@@" then 1379 if (buffer[loc+1]="*") or 1380 (buffer[loc+1]=" ") or (buffer[loc+1]=tab_mark) then 1381 change_pending:=# 1382 1383@p procedure check_change; {switches to |change_file| if the buffers match} 1384label exit; 1385var n:integer; {the number of discrepancies found} 1386@!k:0..buf_size; {index into the buffers} 1387begin if lines_dont_match then return; 1388change_pending:=false; 1389if not changed_module[module_count] then 1390 begin if_module_start_then_make_change_pending(true); 1391 if not change_pending then changed_module[module_count]:=true; 1392 end; 1393n:=0; 1394loop@+ begin change_changing; {now it's |true|} 1395 incr(line); 1396 if not input_ln(change_file) then 1397 begin err_print('! Change file ended before @@y'); 1398@.Change file ended...@> 1399 change_limit:=0; change_changing; {|false| again} 1400 return; 1401 end; 1402 @<If the current line starts with \.{@@y}, 1403 report any discrepancies and |return|@>; 1404 @<Move |buffer| and |limit|...@>; 1405 change_changing; {now it's |false|} 1406 incr(line); 1407 if not input_ln(web_file) then 1408 begin err_print('! WEB file ended during a change'); 1409@.WEB file ended...@> 1410 input_has_ended:=true; return; 1411 end; 1412 if lines_dont_match then incr(n); 1413 end; 1414exit: end; 1415 1416@ @<If the current line starts with \.{@@y}...@>= 1417if limit>1 then if buffer[0]="@@" then 1418 begin if (buffer[1]>="X")and(buffer[1]<="Z") then 1419 buffer[1]:=buffer[1]+"z"-"Z"; {lowercasify} 1420 if (buffer[1]="x")or(buffer[1]="z") then 1421 begin loc:=2; err_print('! Where is the matching @@y?'); 1422@.Where is the match...@> 1423 end 1424 else if buffer[1]="y" then 1425 begin if n>0 then 1426 begin loc:=2; err_print('! Hmm... ',n:1, 1427 ' of the preceding lines failed to match'); 1428@.Hmm... n of the preceding...@> 1429 end; 1430 return; 1431 end; 1432 end 1433 1434@ The |reset_input| procedure, which gets \.{WEAVE} ready to read the 1435user's \.{WEB} input, is used at the beginning of phases one and two. 1436 1437@p procedure reset_input; 1438begin open_input; line:=0; other_line:=0;@/ 1439changing:=true; prime_the_change_buffer; change_changing;@/ 1440limit:=0; loc:=1; buffer[0]:=" "; input_has_ended:=false; 1441end; 1442 1443@ The |get_line| procedure is called when |loc>limit|; it puts the next 1444line of merged input into the buffer and updates the other variables 1445appropriately. A space is placed at the right end of the line. 1446 1447@p procedure get_line; {inputs the next line} 1448label restart; 1449begin restart:if changing then 1450 @<Read from |change_file| and maybe turn off |changing|@>; 1451if not changing then 1452 begin @<Read from |web_file| and maybe turn on |changing|@>; 1453 if changing then goto restart; 1454 end; 1455loc:=0; buffer[limit]:=" "; 1456end; 1457 1458@ @<Read from |web_file|...@>= 1459begin incr(line); 1460if not input_ln(web_file) then input_has_ended:=true 1461else if limit=change_limit then 1462 if buffer[0]=change_buffer[0] then 1463 if change_limit>0 then check_change; 1464end 1465 1466@ @<Read from |change_file|...@>= 1467begin incr(line); 1468if not input_ln(change_file) then 1469 begin err_print('! Change file ended without @@z'); 1470@.Change file ended...@> 1471 buffer[0]:="@@"; buffer[1]:="z"; limit:=2; 1472 end; 1473if limit>0 then {check if the change has ended} 1474 begin if change_pending then 1475 begin if_module_start_then_make_change_pending(false); 1476 if change_pending then 1477 begin changed_module[module_count]:=true; change_pending:=false; 1478 end; 1479 end; 1480 buffer[limit]:=" "; 1481 if buffer[0]="@@" then 1482 begin if (buffer[1]>="X")and(buffer[1]<="Z") then 1483 buffer[1]:=buffer[1]+"z"-"Z"; {lowercasify} 1484 if (buffer[1]="x")or(buffer[1]="y") then 1485 begin loc:=2; err_print('! Where is the matching @@z?'); 1486@.Where is the match...@> 1487 end 1488 else if buffer[1]="z" then 1489 begin prime_the_change_buffer; change_changing; 1490 end; 1491 end; 1492 end; 1493end 1494 1495@ At the end of the program, we will tell the user if the change file 1496had a line that didn't match any relevant line in |web_file|. 1497 1498@<Check that all changes have been read@>= 1499if change_limit<>0 then {|changing| is false} 1500 begin for ii:=0 to change_limit do buffer[ii]:=change_buffer[ii]; 1501 limit:=change_limit; changing:=true; line:=other_line; loc:=change_limit; 1502 err_print('! Change file entry did not match'); 1503@.Change file entry did not match@> 1504 end 1505 1506@ Control codes in \.{WEB}, which begin with `\.{@@}', are converted 1507into a numeric code designed to simplify \.{WEAVE}'s logic; for example, 1508larger numbers are given to the control codes that denote more significant 1509milestones, and the code of |new_module| should be the largest of 1510all. Some of these numeric control codes take the place of ASCII 1511control codes that will not otherwise appear in the output of the 1512scanning routines. 1513@^ASCII code@> 1514 1515@d ignore=0 {control code of no interest to \.{WEAVE}} 1516@d verbatim=@'2 {extended ASCII alpha will not appear} 1517@d force_line=@'3 {extended ASCII beta will not appear} 1518@d begin_comment=@'11 {ASCII tab mark will not appear} 1519@d end_comment=@'12 {ASCII line feed will not appear} 1520@d octal=@'14 {ASCII form feed will not appear} 1521@d hex=@'15 {ASCII carriage return will not appear} 1522@d double_dot=@'40 {ASCII space will not appear except in strings} 1523@d no_underline=@'175 {this code will be intercepted without confusion} 1524@d underline=@'176 {this code will be intercepted without confusion} 1525@d param=@'177 {ASCII delete will not appear} 1526@d xref_roman=@'203 {control code for `\.{@@\^}'} 1527@d xref_wildcard=@'204 {control code for `\.{@@:}'} 1528@d xref_typewriter=@'205 {control code for `\.{@@.}'} 1529@d TeX_string=@'206 {control code for `\.{@@t}'} 1530@d check_sum=@'207 {control code for `\.{@@\$}'} 1531@d join=@'210 {control code for `\.{@@\&}'} 1532@d thin_space=@'211 {control code for `\.{@@,}'} 1533@d math_break=@'212 {control code for `\.{@@\char'174}'} 1534@d line_break=@'213 {control code for `\.{@@/}'} 1535@d big_line_break=@'214 {control code for `\.{@@\#}'} 1536@d no_line_break=@'215 {control code for `\.{@@+}'} 1537@d pseudo_semi=@'216 {control code for `\.{@@;}'} 1538@d format=@'217 {control code for `\.{@@f}'} 1539@d definition=@'220 {control code for `\.{@@d}'} 1540@d begin_Pascal=@'221 {control code for `\.{@@p}'} 1541@d module_name=@'222 {control code for `\.{@@<}'} 1542@d new_module=@'223 {control code for `\.{@@\ }' and `\.{@@*}'} 1543 1544@ Control codes are converted from ASCII to \.{WEAVE}'s internal 1545representation by the |control_code| routine. 1546 1547@p function control_code(@!c:ASCII_code):eight_bits; {convert |c| 1548 after \.{@@}} 1549begin case c of 1550"@@": control_code:="@@"; {`quoted' at sign} 1551"'": control_code:=octal; {precedes octal constant} 1552"""": control_code:=hex; {precedes hexadecimal constant} 1553"$": control_code:=check_sum; {precedes check sum constant} 1554" ",tab_mark,"*": control_code:=new_module; {beginning of a new module} 1555"=": control_code:=verbatim; 1556"\": control_code:=force_line; 1557"D","d": control_code:=definition; {macro definition} 1558"F","f": control_code:=format; {format definition} 1559"{": control_code:=begin_comment; {begin-comment delimiter} 1560"}": control_code:=end_comment; {end-comment delimiter} 1561"P","p": control_code:=begin_Pascal; {\PASCAL\ text in unnamed module} 1562"&": control_code:=join; {concatenate two tokens} 1563"<": control_code:=module_name; {beginning of a module name} 1564">": begin err_print('! Extra @@>'); control_code:=ignore; 1565@.Extra \AT!>@> 1566 end; {end of module name should not be discovered in this way} 1567"T","t": control_code:=TeX_string; {\TeX\ box within \PASCAL} 1568"!": control_code:=underline; {set definition flag} 1569"?": control_code:=no_underline; {reset definition flag} 1570"^": control_code:=xref_roman; {index entry to be typeset normally} 1571":": control_code:=xref_wildcard; {index entry to be in user format} 1572".": control_code:=xref_typewriter; {index entry to be in typewriter type} 1573",": control_code:=thin_space; {puts extra space in \PASCAL\ format} 1574"|": control_code:=math_break; {allows a break in a formula} 1575"/": control_code:=line_break; {forces end-of-line in \PASCAL\ format} 1576"#": control_code:=big_line_break; {forces end-of-line and some space besides} 1577"+": control_code:=no_line_break; {cancels end-of-line down to single space} 1578";": control_code:=pseudo_semi; {acts like a semicolon, but is invisible} 1579@t\4@>@<Special control codes allowed only when debugging@>@; 1580othercases begin err_print('! Unknown control code'); control_code:=ignore; 1581@.Unknown control code@> 1582 end 1583endcases; 1584end; 1585 1586@ If \.{WEAVE} is compiled with debugging commands, one can write 1587\.{@@2}, \.{@@1}, and \.{@@0} to turn tracing fully on, partly on, 1588and off, respectively. 1589@.\AT!2@> 1590@.\AT!1@> 1591 1592@<Special control codes...@>= 1593@!debug@t@>@/ 1594"0","1","2": begin tracing:=c-"0"; control_code:=ignore; 1595 end; 1596gubed 1597 1598@ The |skip_limbo| routine is used on the first pass to skip through 1599portions of the input that are not in any modules, i.e., that precede 1600the first module. After this procedure has been called, the value of 1601|input_has_ended| will tell whether or not a new module has 1602actually been found. 1603 1604@p procedure skip_limbo; {skip to next module} 1605label exit; 1606var c:ASCII_code; {character following \.{@@}} 1607begin loop if loc>limit then 1608 begin get_line; 1609 if input_has_ended then return; 1610 end 1611 else begin buffer[limit+1]:="@@"; 1612 while buffer[loc]<>"@@" do incr(loc); 1613 if loc<=limit then 1614 begin loc:=loc+2; c:=buffer[loc-1]; 1615 if (c=" ")or(c=tab_mark)or(c="*") then return; 1616 end; 1617 end; 1618exit: end; 1619 1620@ The |skip_TeX| routine is used on the first pass to skip through 1621the \TeX\ code at the beginning of a module. It returns the next 1622control code or `\v' found in the input. A |new_module| is 1623assumed to exist at the very end of the file. 1624 1625@p function skip_TeX: eight_bits; {skip past pure \TeX\ code} 1626label done; 1627var c:eight_bits; {control code found} 1628begin loop begin if loc>limit then 1629 begin get_line; 1630 if input_has_ended then 1631 begin c:=new_module; goto done; 1632 end; 1633 end; 1634 buffer[limit+1]:="@@"; 1635 repeat c:=buffer[loc]; incr(loc); 1636 if c="|" then goto done; 1637 until c="@@"; 1638 if loc<=limit then 1639 begin c:=control_code(buffer[loc]); incr(loc); goto done; 1640 end; 1641 end; 1642done:skip_TeX:=c; 1643end; 1644 1645@ The |skip_comment| routine is used on the first pass to skip 1646through \TeX\ code in \PASCAL\ comments. The |bal| parameter 1647tells how many left braces are assumed to have been scanned when 1648this routine is called, and the procedure returns a corresponding 1649value of |bal| at the point that scanning has stopped. Scanning 1650stops either at a `\v' that introduces \PASCAL\ text, 1651in which case the returned value is positive, or it stops at the 1652end of the comment, in which case the returned value is zero. 1653The scanning also stops in anomalous situations when the comment 1654doesn't end or when it contains an illegal use of \.{@@}. 1655One should call |skip_comment(1)| when beginning to scan a comment. 1656 1657@p function skip_comment(@!bal:eight_bits):eight_bits; {skips \TeX\ 1658 code in comments} 1659label done; 1660var c:ASCII_code; {the current character} 1661begin loop begin if loc>limit then 1662 begin get_line; 1663 if input_has_ended then 1664 begin bal:=0; goto done; 1665 end; {an error message will occur in phase two} 1666 end; 1667 c:=buffer[loc]; incr(loc); 1668 if c="|" then goto done; 1669 @<Do special things when |c="@@", "\", "{", "}"|; |goto done| at end@>; 1670 end; 1671done: skip_comment:=bal; 1672end; 1673 1674@ @<Do special things when |c="@@"...@>= 1675if c="@@" then 1676 begin c:=buffer[loc]; 1677 if (c<>" ")and(c<>tab_mark)and(c<>"*") then incr(loc) 1678 else begin decr(loc); bal:=0; goto done; 1679 end {an error message will occur in phase two} 1680 end 1681else if (c="\")and(buffer[loc]<>"@@") then incr(loc) 1682else if c="{" then incr(bal) 1683else if c="}" then 1684 begin decr(bal); 1685 if bal=0 then goto done; 1686 end 1687 1688@* Inputting the next token. 1689As stated above, \.{WEAVE}'s most interesting lexical scanning routine is the 1690|get_next| function that inputs the next token of \PASCAL\ input. However, 1691|get_next| is not especially complicated. 1692 1693The result of |get_next| is either an ASCII code for some special character, 1694or it is a special code representing a pair of characters (e.g., `\.{:=}' 1695or `\.{..}'), or it is the numeric value computed by the |control_code| 1696procedure, or it is one of the following special codes: 1697 1698\yskip\hang |exponent|: The `\.E' in a real constant. 1699 1700\yskip\hang |identifier|: In this case the global variables |id_first| 1701and |id_loc| will have been set to the appropriate values needed by the 1702|id_lookup| routine. 1703 1704\yskip\hang |string|: In this case the global variables |id_first| and 1705|id_loc| will have been set to the beginning and ending-plus-one locations 1706in the buffer. The string ends with the first reappearance of its initial 1707delimiter; thus, for example, $$\.{\'This isn\'\'t a single string\'}$$ 1708will be treated as two consecutive strings, the first being \.{\'This 1709isn\'}. 1710 1711\yskip\noindent Furthermore, some of the control codes cause 1712|get_next| to take additional actions: 1713 1714\yskip\hang |xref_roman|, |xref_wildcard|, 1715|xref_typewriter|, |TeX_string|: The values of 1716|id_first| and |id_loc| will be set so that the string in question appears 1717in |buffer[id_first..(id_loc-1)]|. 1718 1719\yskip\hang |module_name|: In this case the global variable |cur_module| will 1720point to the |byte_start| entry for the module name that has just been scanned. 1721 1722\yskip\noindent If |get_next| sees `\.{@@!}' or `\.{@@?}', 1723it sets |xref_switch| to |def_flag| or zero and goes on to the next token. 1724 1725A global variable called |scanning_hex| is set |true| during the time that 1726the letters \.A through \.F should be treated as if they were digits. 1727 1728@d exponent=@'200 {\.E or \.e following a digit} 1729@d string=@'201 {\PASCAL\ string or \.{WEB} precomputed string} 1730@d identifier=@'202 {\PASCAL\ identifier or reserved word} 1731 1732@<Globals...@>= 1733@!cur_module: name_pointer; {name of module just scanned} 1734@!scanning_hex: boolean; {are we scanning a hexadecimal constant?} 1735 1736@ @<Set init...@>= 1737scanning_hex:=false; 1738 1739@ As one might expect, |get_next| consists mostly of a big switch 1740that branches to the various special cases that can arise. 1741 1742@d up_to(#)==#-24,#-23,#-22,#-21,#-20,#-19,#-18,#-17,#-16,#-15,#-14, 1743 #-13,#-12,#-11,#-10,#-9,#-8,#-7,#-6,#-5,#-4,#-3,#-2,#-1,# 1744 1745@p function get_next:eight_bits; {produces the next input token} 1746label restart,done,found; 1747var c:eight_bits; {the current character} 1748@!d:eight_bits; {the next character} 1749@!j,@!k:0..longest_name; {indices into |mod_text|} 1750begin restart: if loc>limit then 1751 begin get_line; 1752 if input_has_ended then 1753 begin c:=new_module; goto found; 1754 end; 1755 end; 1756c:=buffer[loc]; incr(loc); 1757if scanning_hex then @<Go to |found| if |c| is a hexadecimal digit, 1758 otherwise set |scanning_hex:=false|@>; 1759case c of 1760"A",up_to("Z"),"a",up_to("z"): @<Get an identifier@>; 1761"'","""": @<Get a string@>; 1762"@@": @<Get control code and possible module name@>; 1763@t\4@>@<Compress two-symbol combinations like `\.{:=}'@>@; 1764" ",tab_mark: goto restart; {ignore spaces and tabs} 1765"}": begin err_print('! Extra }'); goto restart; 1766@.Extra \}@> 1767 end; 1768othercases if c>=128 then goto restart {ignore nonstandard characters} 1769 else do_nothing 1770endcases; 1771found:@!debug if trouble_shooting then debug_help;@;@+gubed@/ 1772get_next:=c; 1773end; 1774 1775@ @<Go to |found| if |c| is a hexadecimal digit...@>= 1776if ((c>="0")and(c<="9"))or((c>="A")and(c<="F")) then goto found 1777else scanning_hex:=false 1778 1779@ Note that the following code substitutes \.{@@\{} and \.{@@\}} for the 1780respective combinations `\.{(*}' and `\.{*)}'. Explicit braces should be used 1781for \TeX\ comments in \PASCAL\ text. 1782 1783@d compress(#)==begin if loc<=limit then begin c:=#; incr(loc); end; end 1784 1785@<Compress two-symbol...@>= 1786".": if buffer[loc]="." then compress(double_dot) 1787 else if buffer[loc]=")" then compress("]"); 1788":": if buffer[loc]="=" then compress(left_arrow); 1789"=": if buffer[loc]="=" then compress(equivalence_sign); 1790">": if buffer[loc]="=" then compress(greater_or_equal); 1791"<": if buffer[loc]="=" then compress(less_or_equal) 1792 else if buffer[loc]=">" then compress(not_equal); 1793"(": if buffer[loc]="*" then compress(begin_comment) 1794 else if buffer[loc]="." then compress("["); 1795"*": if buffer[loc]=")" then compress(end_comment); 1796 1797@ @<Get an identifier@>= 1798begin if ((c="E")or(c="e"))and(loc>1) then 1799 if (buffer[loc-2]<="9")and(buffer[loc-2]>="0") then c:=exponent; 1800if c<>exponent then 1801 begin decr(loc); id_first:=loc; 1802 repeat incr(loc); d:=buffer[loc]; 1803 until ((d<"0")or((d>"9")and(d<"A"))or((d>"Z")and(d<"a"))or(d>"z"))and(d<>"_"); 1804 c:=identifier; id_loc:=loc; 1805 end; 1806end 1807 1808@ A string that starts and ends with single or double quote marks is 1809scanned by the following piece of the program. 1810 1811@<Get a string@>= 1812begin id_first:=loc-1; 1813repeat d:=buffer[loc]; incr(loc); 1814if loc>limit then 1815 begin err_print('! String constant didn''t end'); 1816@.String constant didn't end@> 1817 loc:=limit; d:=c; 1818 end; 1819until d=c; 1820id_loc:=loc; c:=string; 1821end 1822 1823@ After an \.{@@} sign has been scanned, the next character tells us 1824whether there is more work to do. 1825 1826@<Get control code and possible module name@>= 1827begin c:=control_code(buffer[loc]); incr(loc); 1828if c=underline then 1829 begin xref_switch:=def_flag; goto restart; 1830 end 1831else if c=no_underline then 1832 begin xref_switch:=0; goto restart; 1833 end 1834else if (c<=TeX_string)and(c>=xref_roman) then 1835 @<Scan to the next \.{@@>}@> 1836else if c=hex then scanning_hex:=true 1837else if c=module_name then 1838 @<Scan the module name and make |cur_module| point to it@> 1839else if c=verbatim then @<Scan a verbatim string@>; 1840end 1841 1842@ The occurrence of a module name sets |xref_switch| to zero, 1843because the module name might (for example) follow \&{var}. 1844 1845@<Scan the module name...@>= 1846begin @<Put module name into |mod_text[1..k]|@>; 1847if k>3 then 1848 begin if (mod_text[k]=".")and(mod_text[k-1]=".")and(mod_text[k-2]=".") then 1849 cur_module:=prefix_lookup(k-3) 1850 else cur_module:=mod_lookup(k); 1851 end 1852else cur_module:=mod_lookup(k); 1853xref_switch:=0; 1854end 1855 1856@ Module names are placed into the |mod_text| array with consecutive spaces, 1857tabs, and carriage-returns replaced by single spaces. There will be no 1858spaces at the beginning or the end. (We set |mod_text[0]:=" "| to facilitate 1859this, since the |mod_lookup| routine uses |mod_text[1]| as the first 1860character of the name.) 1861 1862@<Set init...@>=mod_text[0]:=" "; 1863 1864@ @<Put module name...@>= 1865k:=0; 1866loop@+ begin if loc>limit then 1867 begin get_line; 1868 if input_has_ended then 1869 begin err_print('! Input ended in section name'); 1870@.Input ended in section name@> 1871 loc:=1; goto done; 1872 end; 1873 end; 1874 d:=buffer[loc]; 1875 @<If end of name, |goto done|@>; 1876 incr(loc); if k<longest_name-1 then incr(k); 1877 if (d=" ")or(d=tab_mark) then 1878 begin d:=" "; if mod_text[k-1]=" " then decr(k); 1879 end; 1880 mod_text[k]:=d; 1881 end; 1882done: @<Check for overlong name@>; 1883if (mod_text[k]=" ")and(k>0) then decr(k) 1884 1885@ @<If end of name,...@>= 1886if d="@@" then 1887 begin d:=buffer[loc+1]; 1888 if d=">" then 1889 begin loc:=loc+2; goto done; 1890 end; 1891 if (d=" ")or(d=tab_mark)or(d="*") then 1892 begin err_print('! Section name didn''t end'); goto done; 1893@.Section name didn't end@> 1894 end; 1895 incr(k); mod_text[k]:="@@"; incr(loc); {now |d=buffer[loc]| again} 1896 end 1897 1898@ @<Check for overlong name@>= 1899if k>=longest_name-2 then 1900 begin print_nl('! Section name too long: '); 1901@.Section name too long@> 1902 for j:=1 to 25 do print(xchr[mod_text[j]]); 1903 print('...'); mark_harmless; 1904 end 1905 1906@ @<Scan to the next...@>= 1907begin id_first:=loc; buffer[limit+1]:="@@"; 1908while buffer[loc]<>"@@" do incr(loc); 1909id_loc:=loc; 1910if loc>limit then 1911 begin err_print('! Control text didn''t end'); loc:=limit; 1912@.Control text didn't end@> 1913 end 1914else begin loc:=loc+2; 1915 if buffer[loc-1]<>">" then 1916 err_print('! Control codes are forbidden in control text'); 1917@.Control codes are forbidden...@> 1918 end; 1919end 1920 1921@ A verbatim \PASCAL\ string will be treated like ordinary strings, but 1922with no surrounding delimiters. At the present point in the program we 1923have |buffer[loc-1]=verbatim|; we must set |id_first| to the beginning 1924of the string itself, and |id_loc| to its ending-plus-one location in the 1925buffer. We also set |loc| to the position just after the ending delimiter. 1926 1927@<Scan a verbatim string@>= 1928begin id_first:=loc; incr(loc); 1929buffer[limit+1]:="@@"; buffer[limit+2]:=">"; 1930while (buffer[loc]<>"@@")or(buffer[loc+1]<>">") do incr(loc); 1931if loc>=limit then err_print('! Verbatim string didn''t end'); 1932@.Verbatim string didn't end@> 1933id_loc:=loc; loc:=loc+2; 1934end 1935 1936@* Phase one processing. 1937We now have accumulated enough subroutines to make it possible to carry out 1938\.{WEAVE}'s first pass over the source file. If everything works right, 1939both phase one and phase two of \.{WEAVE} will assign the same numbers to 1940modules, and these numbers will agree with what \.{TANGLE} does. 1941 1942The global variable |next_control| often contains the most recent output of 1943|get_next|; in interesting cases, this will be the control code that 1944ended a module or part of a module. 1945 1946@<Glob...@>=@!next_control:eight_bits; {control code waiting to be acting upon} 1947 1948@ The overall processing strategy in phase one has the following 1949straightforward outline. 1950 1951@<Phase I: Read all the user's text and store the cross references@>= 1952phase_one:=true; phase_three:=false; 1953reset_input; 1954module_count:=0; skip_limbo; change_exists:=false; 1955while not input_has_ended do 1956 @<Store cross reference data for the current module@>; 1957changed_module[module_count]:=change_exists; 1958 {the index changes if anything does} 1959phase_one:=false; {prepare for second phase} 1960@<Print error messages about unused or undefined module names@>; 1961 1962@ @<Store cross reference data...@>= 1963begin incr(module_count); 1964if module_count=max_modules then overflow('section number'); 1965changed_module[module_count]:=changing; 1966 {it will become |true| if any line changes} 1967if buffer[loc-1]="*" then 1968 begin print('*',module_count:1); 1969 update_terminal; {print a progress report} 1970 end; 1971@<Store cross references in the \TeX\ part of a module@>; 1972@<Store cross references in the \(definition part of a module@>; 1973@<Store cross references in the \PASCAL\ part of a module@>; 1974if changed_module[module_count] then change_exists:=true; 1975end 1976 1977@ The |Pascal_xref| subroutine stores references to identifiers in 1978\PASCAL\ text material beginning with the current value of |next_control| 1979and continuing until |next_control| is `\.\{' or `\v', or until the next 1980``milestone'' is passed (i.e., |next_control>=format|). If 1981|next_control>=format| when |Pascal_xref| is called, nothing will happen; 1982but if |next_control="|"| upon entry, the procedure assumes that this is 1983the `\v' preceding \PASCAL\ text that is to be processed. 1984 1985The program uses the fact that our internal code numbers satisfy 1986the relations |xref_roman=identifier+roman| and |xref_wildcard=identifier 1987+wildcard| and |xref_typewriter=identifier+ 1988typewriter| and |normal=0|. An implied `\.{@@!}' is inserted after 1989\&{function}, \&{procedure}, \&{program}, and \&{var}. 1990 1991@p procedure Pascal_xref; {makes cross references for \PASCAL\ identifiers} 1992label exit; 1993var p:name_pointer; {a referenced name} 1994begin while next_control<format do 1995 begin if (next_control>=identifier)and 1996 (next_control<=xref_typewriter) then 1997 begin p:=id_lookup(next_control-identifier); new_xref(p); 1998 if (ilk[p]=proc_like)or(ilk[p]=var_like) then 1999 xref_switch:=def_flag; {implied `\.{@@!}'} 2000 end; 2001 next_control:=get_next; 2002 if (next_control="|")or(next_control="{") then return; 2003 end; 2004exit:end; 2005 2006@ The |outer_xref| subroutine is like |Pascal_xref| but it begins 2007with |next_control<>"|"| and ends with |next_control>=format|. Thus, it 2008handles \PASCAL\ text with embedded comments. 2009 2010@p procedure outer_xref; {extension of |Pascal_xref|} 2011var bal:eight_bits; {brace level in comment} 2012begin while next_control<format do 2013 if next_control<>"{" then Pascal_xref 2014 else begin bal:=skip_comment(1); next_control:="|"; 2015 while bal>0 do 2016 begin Pascal_xref; 2017 if next_control="|" then bal:=skip_comment(bal) 2018 else bal:=0; {an error will be reported in phase two} 2019 end; 2020 end; 2021end; 2022 2023@ In the \TeX\ part of a module, cross reference entries are made only for 2024the identifiers in \PASCAL\ texts enclosed in \pb, or for control texts 2025enclosed in \.{@@\^}$\,\ldots\,$\.{@@>} or \.{@@.}$\,\ldots\,$\.{@@>} 2026or \.{@@:}$\,\ldots\,$\.{@@>}. 2027 2028@<Store cross references in the \T...@>= 2029repeat next_control:=skip_TeX; 2030case next_control of 2031underline: xref_switch:=def_flag; 2032no_underline: xref_switch:=0; 2033"|": Pascal_xref; 2034xref_roman, xref_wildcard, xref_typewriter, module_name: 2035 begin loc:=loc-2; next_control:=get_next; {scan to \.{@@>}} 2036 if next_control<>module_name then 2037 new_xref(id_lookup(next_control-identifier)); 2038 end; 2039othercases do_nothing 2040endcases; 2041until next_control>=format 2042 2043@ During the definition and \PASCAL\ parts of a module, cross references 2044are made for all identifiers except reserved words; however, the 2045identifiers in a format definition are referenced even if they are 2046reserved. The \TeX\ code in comments is, of course, ignored, except for 2047\PASCAL\ portions enclosed in \pb; the text of a module name is skipped 2048entirely, even if it contains \pb\ constructions. 2049 2050The variables |lhs| and |rhs| point to the respective identifiers involved 2051in a format definition. 2052 2053@<Global...@>= 2054@!lhs,@!rhs:name_pointer; {indices into |byte_start| for format identifiers} 2055 2056@ When we get to the following code we have |next_control>=format|. 2057 2058@<Store cross references in the \(d...@>= 2059while next_control<=definition do {|format| or |definition|} 2060 begin xref_switch:=def_flag; {implied \.{@@!}} 2061 if next_control=definition then next_control:=get_next 2062 else @<Process a format definition@>; 2063 outer_xref; 2064 end 2065 2066@ Error messages for improper format definitions will be issued in phase 2067two. Our job in phase one is to define the |ilk| of a properly formatted 2068identifier, and to fool the |new_xref| routine into thinking that the 2069identifier on the right-hand side of the format definition is not a 2070reserved word. 2071 2072@<Process a form...@>= 2073begin next_control:=get_next; 2074if next_control=identifier then 2075 begin lhs:=id_lookup(normal); ilk[lhs]:=normal; new_xref(lhs); 2076 next_control:=get_next; 2077 if next_control=equivalence_sign then 2078 begin next_control:=get_next; 2079 if next_control=identifier then 2080 begin rhs:=id_lookup(normal); 2081 ilk[lhs]:=ilk[rhs]; ilk[rhs]:=normal; new_xref(rhs); 2082 ilk[rhs]:=ilk[lhs]; next_control:=get_next; 2083 end; 2084 end; 2085 end; 2086end 2087 2088@ Finally, when the \TeX\ and definition parts have been treated, we have 2089|next_control>=begin_Pascal|. 2090 2091@<Store cross references in the \P...@>= 2092if next_control<=module_name then {|begin_Pascal| or |module_name|} 2093 begin if next_control=begin_Pascal then mod_xref_switch:=0 2094 else mod_xref_switch:=def_flag; 2095 repeat if next_control=module_name then new_mod_xref(cur_module); 2096 next_control:=get_next; outer_xref; 2097 until next_control>module_name; 2098 end 2099 2100@ After phase one has looked at everything, we want to check that each 2101module name was both defined and used. 2102The variable |cur_xref| will point to cross references for the 2103current module name of interest. 2104 2105@<Glob...@>=@!cur_xref:xref_number; {temporary cross reference pointer} 2106 2107@ The following recursive procedure 2108walks through the tree of module names and prints out anomalies. 2109@^recursion@> 2110 2111@p procedure mod_check(@!p:name_pointer); {print anomalies in subtree |p|} 2112begin if p>0 then 2113 begin mod_check(llink[p]);@/ 2114 cur_xref:=xref[p]; 2115 if num(cur_xref)<def_flag then 2116 begin print_nl('! Never defined: <'); print_id(p); 2117@.Never defined: <section name>@> 2118 print('>'); mark_harmless; 2119 end; 2120 while num(cur_xref)>=def_flag do cur_xref:=xlink(cur_xref); 2121 if cur_xref=0 then 2122 begin print_nl('! Never used: <'); print_id(p); print('>'); 2123@.Never used: <section name>@> 2124 mark_harmless; 2125 end; 2126 mod_check(rlink[p]); 2127 end; 2128end; 2129 2130@ @<Print error messages about un...@>=@+mod_check(root) 2131 2132@* Low-level output routines. 2133The \TeX\ output is supposed to appear in lines at most |line_length| 2134characters long, so we place it into an output buffer. During the output 2135process, |out_line| will hold the current line number of the line about to 2136be output. 2137 2138@<Glo...@>= 2139@!out_buf:array[0..line_length] of ASCII_code; {assembled characters} 2140@!out_ptr:0..line_length; {number of characters in |out_buf|} 2141@!out_line: integer; {coordinates of next line to be output} 2142 2143@ The |flush_buffer| routine empties the buffer up to a given breakpoint, 2144and moves any remaining characters to the beginning of the next line. 2145If the |per_cent| parameter is |true|, a |"%"| is appended to the line 2146that is being output; in this case the breakpoint |b| should be strictly 2147less than |line_length|. If the |per_cent| parameter is |false|, 2148trailing blanks are suppressed. 2149The characters emptied from the buffer form a new line of output; 2150if the |carryover| parameter is true, a |"%"| in that line will be 2151carried over to the next line (so that \TeX\ will ignore the completion 2152of commented-out text). 2153 2154@p procedure flush_buffer(@!b:eight_bits;@!per_cent,@!carryover:boolean); 2155 {outputs |out_buf[1..b]|, where |b<=out_ptr|} 2156label done,found; 2157var j,@!k:0..line_length; 2158begin j:=b; 2159if not per_cent then {remove trailing blanks} 2160 loop@+ begin if j=0 then goto done; 2161 if out_buf[j]<>" " then goto done; 2162 decr(j); 2163 end; 2164done: for k:=1 to j do write(tex_file,xchr[out_buf[k]]); 2165if per_cent then write(tex_file,xchr["%"]); 2166write_ln(tex_file); incr(out_line); 2167if carryover then 2168 for k:=1 to j do 2169 if out_buf[k]="%" then 2170 if (k=1)or(out_buf[k-1]<>"\") then {comment mode should be preserved} 2171 begin out_buf[b]:="%"; decr(b); goto found; 2172 end; 2173found: if (b<out_ptr) then 2174 for k:=b+1 to out_ptr do out_buf[k-b]:=out_buf[k]; 2175out_ptr:=out_ptr-b; 2176end; 2177 2178@ When we are copying \TeX\ source material, we retain line breaks 2179that occur in the input, except that an empty line is not 2180output when the \TeX\ source line was nonempty. For example, a line 2181of the \TeX\ file that contains only an index cross-reference entry 2182will not be copied. The |finish_line| routine is called just before 2183|get_line| inputs a new line, and just after a line break token has 2184been emitted during the output of translated \PASCAL\ text. 2185 2186@p procedure finish_line; {do this at the end of a line} 2187label exit; 2188var k:0..buf_size; {index into |buffer|} 2189begin if out_ptr>0 then flush_buffer(out_ptr,false,false) 2190else begin for k:=0 to limit do 2191 if (buffer[k]<>" ")and(buffer[k]<>tab_mark) then return; 2192 flush_buffer(0,false,false); 2193 end; 2194exit:end; 2195 2196@ In particular, the |finish_line| procedure is called near the very 2197beginning of phase two. We initialize the output variables in a slightly 2198tricky way so that the first line of the output file will be 2199`\.{\\input webmac}'. 2200@.\\input webmac@> 2201@.webmac@> 2202 2203@<Set init...@>= 2204out_ptr:=1; out_line:=1; out_buf[1]:="c"; write(tex_file,'\input webma'); 2205 2206@ When we wish to append the character |c| to the output buffer, we write 2207`$|out|(c)$'; this will cause the buffer to be emptied if it was already 2208full. Similarly, `$|out2|(c_1)(c_2)$' appends a pair of characters. 2209A line break will occur at a space or after a single-nonletter 2210\TeX\ control sequence. 2211 2212@d oot(#)==@;@/ 2213 if out_ptr=line_length then break_out; 2214 incr(out_ptr); out_buf[out_ptr]:=#; 2215@d oot1(#)==oot(#)@+end 2216@d oot2(#)==oot(#)@,oot1 2217@d oot3(#)==oot(#)@,oot2 2218@d oot4(#)==oot(#)@,oot3 2219@d oot5(#)==oot(#)@,oot4 2220@d out==@+begin oot1 2221@d out2==@+begin oot2 2222@d out3==@+begin oot3 2223@d out4==@+begin oot4 2224@d out5==@+begin oot5 2225 2226@ The |break_out| routine is called just before the output buffer is about 2227to overflow. To make this routine a little faster, we initialize position 22280 of the output buffer to `\.\\'; this character isn't really output. 2229 2230@<Set init...@>= 2231out_buf[0]:="\"; 2232 2233@ A long line is broken at a blank space or just before a backslash that isn't 2234preceded by another backslash. In the latter case, a |"%"| is output at 2235the break. 2236 2237@p procedure break_out; {finds a way to break the output line} 2238label exit; 2239var k:0..line_length; {index into |out_buf|} 2240@!d:ASCII_code; {character from the buffer} 2241begin k:=out_ptr; 2242loop@+ begin if k=0 then 2243 @<Print warning message, break the line, |return|@>; 2244 d:=out_buf[k]; 2245 if d=" " then 2246 begin flush_buffer(k,false,true); return; 2247 end; 2248 if (d="\")and(out_buf[k-1]<>"\") then {in this case |k>1|} 2249 begin flush_buffer(k-1,true,true); return; 2250 end; 2251 decr(k); 2252 end; 2253exit:end; 2254 2255@ We get to this module only in unusual cases that the entire output line 2256consists of a string of backslashes followed by a string of nonblank 2257non-backslashes. In such cases it is almost always safe to break the 2258line by putting a |"%"| just before the last character. 2259 2260@<Print warning message...@>= 2261begin print_nl('! Line had to be broken (output l.',out_line:1); 2262@.Line had to be broken@> 2263print_ln('):'); 2264for k:=1 to out_ptr-1 do print(xchr[out_buf[k]]); 2265new_line; mark_harmless; 2266flush_buffer(out_ptr-1,true,true); return; 2267end 2268 2269@ Here is a procedure that outputs a module number in decimal notation. 2270 2271@<Glob...@>=@!dig:array[0..4] of 0..9; {digits to output} 2272 2273@ The number to be converted by |out_mod| is known to be less than 2274|def_flag|, so it cannot have more than five decimal digits. If 2275the module is changed, we output `\.{\\*}' just after the number. 2276 2277@p procedure out_mod(@!m:integer); {output a module number} 2278var k:0..5; {index into |dig|} 2279@!a:integer; {accumulator} 2280begin k:=0; a:=m; 2281repeat dig[k]:=a mod 10; a:=a div 10; incr(k); 2282until a=0; 2283repeat decr(k); out(dig[k]+"0"); 2284until k=0; 2285if changed_module[m] then out2("\")("*"); 2286@.\\*@> 2287end; 2288 2289@ The |out_name| subroutine is used to output an identifier or index 2290entry, enclosing it in braces. 2291 2292@p procedure out_name(@!p:name_pointer); {outputs a name} 2293var k:0..max_bytes; {index into |byte_mem|} 2294@!w:0..ww-1; {row of |byte_mem|} 2295begin out("{"); w:=p mod ww; 2296for k:=byte_start[p] to byte_start[p+ww]-1 do 2297 begin if byte_mem[w,k]="_" then out("\"); 2298@.\\_@> 2299 out(byte_mem[w,k]); 2300 end; 2301out("}"); 2302end; 2303 2304@* Routines that copy \TeX\ material. 2305During phase two, we use the subroutines |copy_limbo|, |copy_TeX|, and 2306|copy_comment| in place of the analogous |skip_limbo|, |skip_TeX|, and 2307|skip_comment| that were used in phase one. 2308 2309The |copy_limbo| routine, for example, takes \TeX\ material that is not 2310part of any module and transcribes it almost verbatim to the output file. 2311No `\.{@@}' signs should occur in such material except in `\.{@@@@}' 2312pairs; such pairs are replaced by singletons. 2313 2314@p procedure copy_limbo; {copy \TeX\ code until the next module begins} 2315label exit; 2316var c:ASCII_code; {character following \.{@@} sign} 2317begin loop if loc>limit then 2318 begin finish_line; get_line; 2319 if input_has_ended then return; 2320 end 2321 else begin buffer[limit+1]:="@@"; 2322 @<Copy up to control code, |return| if finished@>; 2323 end; 2324exit:end; 2325 2326@ @<Copy up to control...@>= 2327while buffer[loc]<>"@@" do 2328 begin out(buffer[loc]); incr(loc); 2329 end; 2330if loc<=limit then 2331 begin loc:=loc+2; c:=buffer[loc-1]; 2332 if (c=" ")or(c=tab_mark)or(c="*") then return; 2333 if (c<>"z")and(c<>"Z") then 2334 begin out("@@"); 2335 if c<>"@@" then err_print('! Double @@ required outside of sections'); 2336@.Double \AT! required...@> 2337 end; 2338 end 2339 2340@ The |copy_TeX| routine processes the \TeX\ code at the beginning of a 2341module; for example, the words you are now reading were copied in this 2342way. It returns the next control code or `\v' found in the input. 2343 2344@p function copy_TeX:eight_bits; {copy pure \TeX\ material} 2345label done; 2346var c:eight_bits; {control code found} 2347begin loop begin if loc>limit then 2348 begin finish_line; get_line; 2349 if input_has_ended then 2350 begin c:=new_module; goto done; 2351 end; 2352 end; 2353 buffer[limit+1]:="@@"; 2354 @<Copy up to `\v' or control code, |goto done| if finished@>; 2355 end; 2356done:copy_TeX:=c; 2357end; 2358 2359@ We don't copy spaces or tab marks into the beginning of a line. This 2360makes the test for empty lines in |finish_line| work. 2361 2362@<Copy up to `\v'...@>= 2363repeat c:=buffer[loc]; incr(loc); 2364if c="|" then goto done; 2365if c<>"@@" then 2366 begin out(c); 2367 if (out_ptr=1)and((c=" ")or(c=tab_mark)) then decr(out_ptr); 2368 end; 2369until c="@@"; 2370if loc<=limit then 2371 begin c:=control_code(buffer[loc]); incr(loc); 2372 goto done; 2373 end 2374 2375@ The |copy_comment| uses and returns a brace-balance value, following the 2376conventions of |skip_comment| above. Instead of copying the \TeX\ material 2377into the output buffer, this procedure copies it into the token memory. 2378The abbreviation |app_tok(t)| is used to append token |t| to the current 2379token list, and it also makes sure that it is possible to append at least 2380one further token without overflow. 2381 2382@d app_tok(#)==begin if tok_ptr+2>max_toks then overflow('token'); 2383 tok_mem[tok_ptr]:=#; incr(tok_ptr); 2384 end 2385 2386@p function copy_comment(@!bal:eight_bits):eight_bits; {copies \TeX\ code in 2387 comments} 2388label done; 2389var c:ASCII_code; {current character being copied} 2390begin loop begin if loc>limit then 2391 begin get_line; 2392 if input_has_ended then 2393 begin err_print('! Input ended in mid-comment'); 2394@.Input ended in mid-comment@> 2395 loc:=1; @<Clear |bal| and |goto done|@>; 2396 end; 2397 end; 2398 c:=buffer[loc]; incr(loc); 2399 if c="|" then goto done; 2400 app_tok(c); 2401 @<Copy special things when |c="@@", "\", "{", "}"|; 2402 |goto done| at end@>; 2403 end; 2404done: copy_comment:=bal; 2405end; 2406 2407@ @<Copy special things when |c="@@"...@>= 2408if c="@@" then 2409 begin incr(loc); 2410 if buffer[loc-1]<>"@@" then 2411 begin err_print('! Illegal use of @@ in comment'); 2412@.Illegal use of \AT!...@> 2413 loc:=loc-2; decr(tok_ptr); @<Clear |bal|...@>; 2414 end; 2415 end 2416else if (c="\")and(buffer[loc]<>"@@") then 2417 begin app_tok(buffer[loc]); incr(loc); 2418 end 2419else if c="{" then incr(bal) 2420else if c="}" then 2421 begin decr(bal); 2422 if bal=0 then goto done; 2423 end 2424 2425@ When the comment has terminated abruptly due to an error, we output 2426enough right braces to keep \TeX\ happy. 2427 2428@<Clear |bal|...@>= 2429app_tok(" "); {this is done in case the previous character was `\.\\'} 2430repeat app_tok("}"); decr(bal); 2431until bal=0; 2432goto done; 2433 2434@* Parsing. 2435The most intricate part of \.{WEAVE} is its mechanism for converting 2436\PASCAL-like code into \TeX\ code, and we might as well plunge into this 2437aspect of the program now. A ``bottom up'' approach is used to parse the 2438\PASCAL-like material, since \.{WEAVE} must deal with fragmentary 2439constructions whose overall ``part of speech'' is not known. 2440 2441At the lowest level, the input is represented as a sequence of entities 2442that we shall call {\it scraps}, where each scrap of information consists 2443of two parts, its {\it category} and its {\it translation}. The category 2444is essentially a syntactic class, and the translation is a token list that 2445represents \TeX\ code. Rules of syntax and semantics tell us how to 2446combine adjacent scraps into larger ones, and if we are lucky an entire 2447\PASCAL\ text that starts out as hundreds of small scraps will join 2448together into one gigantic scrap whose translation is the desired \TeX\ 2449code. If we are unlucky, we will be left with several scraps that don't 2450combine; their translations will simply be output, one by one. 2451 2452The combination rules are given as context-sensitive productions that are 2453applied from left to right. Suppose that we are currently working on the 2454sequence of scraps $s_1\,s_2\ldots s_n$. We try first to find the longest 2455production that applies to an initial substring $s_1\,s_2\ldots\,$; but if 2456no such productions exist, we find to find the longest production 2457applicable to the next substring $s_2\,s_3\ldots\,$; and if that fails, we 2458try to match $s_3\,s_4\ldots\,$, etc. 2459 2460A production applies if the category codes have a given pattern. For 2461example, one of the productions is 2462$$|open|\;|math|\;|semi|\;\RA\;|open|\;|math|$$ 2463and it means that three consecutive scraps whose respective categories are 2464|open|, |math|, and |semi| are con\-verted to two scraps whose categories 2465are |open| and |math|. This production also has an associated rule that 2466tells how to combine the translation parts: 2467$$\eqalign{O_2&=O_1\cr 2468M_2&=M_1\,S\,\.{\\,}\,\hbox{|opt|\thinspace\tt5}\cr}$$ 2469This means that the |open| scrap has not changed, while the new |math| scrap 2470has a translation $M_2$ composed of the translation $M_1$ of the original 2471|math| scrap followed by the translation |S| of the |semi| scrap followed 2472by `\.{\\,}' followed by `|opt|' followed by `\.5'. (In the \TeX\ file, 2473this will specify an additional thin space after the semicolon, followed 2474by an optional line break with penalty 50.) Translation rules use subscripts 2475to distinguish between translations of scraps whose categories have the 2476same initial letter; these subscripts are assigned from left to right. 2477 2478$\.{WEAVE}$ also has the production rule 2479$$|semi|\;\RA\;|terminator|$$ 2480(meaning that a semicolon can terminate a \PASCAL\ statement). Since 2481productions are applied from left to right, this rule will be activated 2482only if the |semi| is not preceded by scraps that match other productions; 2483in particular, a |semi| that is preceded by `|open| |math|' will have 2484disappeared because of the production above, and such semicolons do not 2485act as statement terminators. This incidentally is how \.{WEAVE} is able 2486to treat semicolons in two distinctly different ways, the first of which 2487is intended for semicolons in the parameter list of a procedure 2488declaration. 2489 2490The translation rule corresponding to $|semi|\;\RA\;|terminator|$ is 2491$$T=S$$ 2492but we shall not mention translation rules in the common case that the 2493translation of the new scrap on the right-hand side is simply the 2494concatenation of the disappearing scraps on the left-hand side. 2495 2496@ Here is a list of the category codes that scraps can have. 2497 2498@d simp=1 {the translation can be used both in horizontal mode 2499 and in math mode of \TeX} 2500@d math=2 {the translation should be used only in \TeX\ math mode} 2501@d intro=3 {a statement is expected to follow this, after a space and 2502 an optional break} 2503@d open=4 {denotes an incomplete parenthesized quantity to be used in 2504 math mode} 2505@d beginning=5 {denotes an incomplete compound statement to be used in 2506 horizontal mode} 2507@d close=6 {ends a parenthesis or compound statement} 2508@d alpha=7 {denotes the beginning of a clause} 2509@d omega=8 {denotes the ending of a clause and possible comment following} 2510@d semi=9 {denotes a semicolon and possible comment following it} 2511@d terminator=10 {something that ends a statement or declaration} 2512@d stmt=11 {denotes a statement or declaration including its terminator} 2513@d cond=12 {precedes an \&{if} clause that might have a matching \&{else}} 2514@d clause=13 {precedes a statement after which indentation ends} 2515@d colon=14 {denotes a colon} 2516@d exp=15 {stands for the E in a floating point constant} 2517@d proc=16 {denotes a procedure or program or function heading} 2518@d case_head=17 {denotes a case statement or record heading} 2519@d record_head=18 {denotes a record heading without indentation} 2520@d var_head=19 {denotes a variable declaration heading} 2521@d elsie=20 {\&{else}} 2522@d casey=21 {\&{case}} 2523@d mod_scrap=22 {denotes a module name} 2524 2525@p @!debug procedure print_cat(@!c:eight_bits); 2526 {symbolic printout of a category} 2527begin case c of 2528simp: print('simp'); 2529math: print('math'); 2530intro: print('intro'); 2531open: print('open'); 2532beginning: print('beginning'); 2533close: print('close'); 2534alpha: print('alpha'); 2535omega: print('omega'); 2536semi: print('semi'); 2537terminator: print('terminator'); 2538stmt: print('stmt'); 2539cond: print('cond'); 2540clause: print('clause'); 2541colon: print('colon'); 2542exp: print('exp'); 2543proc: print('proc'); 2544case_head: print('casehead'); 2545record_head: print('recordhead'); 2546var_head: print('varhead'); 2547elsie: print('elsie'); 2548casey: print('casey'); 2549mod_scrap: print('module'); 2550othercases print('UNKNOWN') 2551endcases; 2552end; 2553gubed 2554 2555@ The token lists for translated \TeX\ output contain some special control 2556symbols as well as ordinary characters. These control symbols are 2557interpreted by \.{WEAVE} before they are written to the output file. 2558 2559\yskip\hang |break_space| denotes an optional line break or an en space; 2560 2561\yskip\hang |force| denotes a line break; 2562 2563\yskip\hang |big_force| denotes a line break with additional vertical space; 2564 2565\yskip\hang |opt| denotes an optional line break (with the continuation 2566line indented two ems with respect to the normal starting position)---this 2567code is followed by an integer |n|, and the break will occur with penalty 2568$10n$; 2569 2570\yskip\hang |backup| denotes a backspace of one em; 2571 2572\yskip\hang |cancel| obliterates any |break_space| or |force| or |big_force| 2573tokens that immediately precede or follow it and also cancels any 2574|backup| tokens that follow it; 2575 2576\yskip\hang |indent| causes future lines to be indented one more em; 2577 2578\yskip\hang |outdent| causes future lines to be indented one less em. 2579 2580\yskip\noindent All of these tokens are removed from the \TeX\ output that 2581comes from \PASCAL\ text between \pb\ signs; |break_space| and |force| and 2582|big_force| become single spaces in this mode. The translation of other 2583\PASCAL\ texts results in \TeX\ control sequences \.{\\1}, \.{\\2}, 2584\.{\\3}, \.{\\4}, \.{\\5}, \.{\\6}, \.{\\7} corresponding respectively to 2585|indent|, |outdent|, |opt|, |backup|, |break_space|, |force|, and 2586|big_force|. However, a sequence of consecutive `\.\ ', |break_space|, 2587|force|, and/or |big_force| tokens is first replaced by a single token 2588(the maximum of the given ones). 2589 2590The tokens |math_rel|, |math_bin|, |math_op| will be translated into 2591\.{\\mathrel\{}, \.{\\mathbin\{}, and \.{\\mathop\{}, respectively. 2592Other control sequences in the \TeX\ output will be `\.{\\\\\{}$\,\ldots\,$\.\}' 2593surrounding identifiers, `\.{\\\&\{}$\,\ldots\,$\.\}' surrounding 2594reserved words, `\.{\\.\{}$\,\ldots\,$\.\}' surrounding strings, 2595`\.{\\C\{}$\,\ldots\,$\.\}$\,$|force|' surrounding comments, and 2596`\.{\\X$n$:}$\,\ldots\,$\.{\\X}' surrounding module names, where 2597|n| is the module number. 2598 2599@d math_bin=@'203 2600@d math_rel=@'204 2601@d math_op=@'205 2602@d big_cancel=@'206 {like |cancel|, also overrides spaces} 2603@d cancel=@'207 {overrides |backup|, |break_space|, |force|, |big_force|} 2604@d indent=cancel+1 {one more tab (\.{\\1})} 2605@d outdent=cancel+2 {one less tab (\.{\\2})} 2606@d opt=cancel+3 {optional break in mid-statement (\.{\\3})} 2607@d backup=cancel+4 {stick out one unit to the left (\.{\\4})} 2608@d break_space=cancel+5 {optional break between statements (\.{\\5})} 2609@d force=cancel+6 {forced break between statements (\.{\\6})} 2610@d big_force=cancel+7 {forced break with additional space (\.{\\7})} 2611@d end_translation=big_force+1 {special sentinel token at end of list} 2612 2613@ The raw input is converted into scraps according to the following table, 2614which gives category codes followed by the translations. Sometimes a single 2615item of input produces more than one scrap. 2616\def\stars {\.{**}}% 2617(The symbol `\stars' stands for `\.{\\\&\{{\rm identifier}\}}', 2618i.e., the identifier itself treated as a reserved word. In a few cases the 2619category is given as `|@!comment|'; this is not an actual category code, it 2620means that the translation will be treated as a comment, as explained 2621below.) 2622 2623\yskip\halign{\quad#\hfil&\quad#\hfil\cr 2624\.{<>}&|math:|\.{\\I}\cr 2625\.{<=}&|math:|\.{\\L}\cr 2626\.{>=}&|math:|\.{\\G}\cr 2627\.{:=}&|math:|\.{\\K}\cr 2628\.{==}&|math:|\.{\\S}\cr 2629\.{(*}&|math:|\.{\\B}\cr 2630\.{*)}&|math:|\.{\\T}\cr 2631\.{(.}&|open:|\.[\cr 2632\.{.)}&|close:|\.]\cr 2633\."$\,$string$\,$\."&|simp:|\.{\\.\{"{\rm$\,$modified string$\,$}"\}}\cr 2634\.\'$\,$string$\,$\.\'&|simp:|\.{\\.\{\\\'{\rm$\,$modified 2635 string$\,$}\\\'\}}\cr 2636\.{@@=}$\,$string$\,$\.{@@>}&|simp:|\.{\\=\{{\rm$\,$modified string$\,$}\}}\cr 2637\#&|math:|\.{\\\#}\cr 2638\.\$&|math:|\.{\\\$}\cr 2639\.\_&|math:|\.{\\\_}\cr 2640\.\%&|math:|\.{\\\%}\cr 2641\.\^&|math:|\.{\\\^}\cr 2642\.(&|open:|\.(\cr 2643\.)&|close:|\.)\cr 2644\.[&|open:|\.[\cr 2645\.]&|close:|\.]\cr 2646\.*&|math:|\.{\\ast}\cr 2647\.,&|math:|\.,|@,opt@,|\.9\cr 2648\.{..}&|math:|\.{\\to}\cr 2649\..&|simp:|\..\cr 2650\.:&|colon:|\.:\cr 2651\.;&|semi:|\.;\cr 2652identifier&|simp:|\.{\\\\\{{\rm$\,$identifier$\,$}\}}\cr 2653\.E in constant&|exp:|\.{\\E\{}\cr 2654digit $d$&|simp:|$d$\cr 2655other character $c$&|math:|$c$\cr 2656\.{and}&|math:|\.{\\W}\cr 2657\.{array}&|alpha:|\stars\cr 2658\.{begin}&|beginning:|$|force|\,\stars\,|cancel|$\qquad|intro:|\cr 2659\.{case}&|casey:|\qquad|alpha:|$|force|\,\stars$\cr 2660\.{const}&|intro:|$|force|\,|backup|\,\stars$\cr 2661\.{div}&|math:|$|math_bin|\,\stars\,\.\}$\cr 2662\.{do}&|omega:|\stars\cr 2663\.{downto}&|math:|$|math_rel|\,\stars\,\.\}$\cr 2664\.{else}&|terminator:|\qquad|elsie:|$|force|\,|backup|\,\stars$\cr 2665\.{end}&|terminator:|\qquad|close:|$|force|\,\stars$\cr 2666\.{file}&|alpha:|\stars\cr 2667\.{for}&|alpha:|$|force|\,\stars$\cr 2668\.{function}&|proc:|$|force|\,|backup|\,\stars\,|cancel|$\qquad 2669 |intro:|$|indent|\,\.{\\\ }$\cr 2670\.{goto}&|intro:|\stars\cr 2671\.{if}&|cond:|\qquad|alpha:|$|force|\,\stars$\cr 2672\.{in}&|math:|\.{\\in}\cr 2673\.{label}&|intro:|$|force|\,|backup|\,\stars$\cr 2674\.{mod}&|math:|$|math_bin|\,\stars\,\.\}$\cr 2675\.{nil}&|simp:|\stars\cr 2676\.{not}&|math:|\.{\\R}\cr 2677\.{of}&|omega:|\stars\cr 2678\.{or}&|math:|\.{\\V}\cr 2679\.{packed}&|intro:|\stars\cr 2680\.{procedure}&|proc:|$|force|\,|backup|\,\stars\,|cancel|$\qquad 2681 |intro:|$|indent|\,\.{\\\ }$\cr 2682\.{program}&|proc:|$|force|\,|backup|\,\stars\,|cancel|$\qquad 2683 |intro:|$|indent|\,\.{\\\ }$\cr 2684\.{record}&|record_head:|\stars\qquad|intro:|\cr 2685\.{repeat}&|beginning:|$|force|\,|indent|\,\stars\,|cancel|$\qquad|intro:|\cr 2686\.{set}&|alpha:|\stars\cr 2687\.{then}&|omega:|\stars\cr 2688\.{to}&|math:|$|math_rel|\,\stars\,\.\}$\cr 2689\.{type}&|intro:|$|force|\,|backup|\,\stars$\cr 2690\.{until}&|terminator:|\qquad|close:|$|force|\,|backup|\,\stars$\qquad 2691 |clause:|\cr 2692\.{var}&|var_head:|$|force|\,|backup|\,\stars\,|cancel|$\qquad|intro:|\cr 2693\.{while}&|alpha:|$|force|\,\stars$\cr 2694\.{with}&|alpha:|$|force|\,\stars$\cr 2695\.{xclause}&|alpha:|$|force|\,\.{\\\~}$\qquad|omega:|\stars\cr 2696\.{@@\'}$\,$const&|simp:|\.{\\O\{}\hbox{const}\.\}\cr 2697\.{@@"}$\,$const&|simp:|\.{\\H\{}\hbox{const}\.\}\cr 2698\.{@@\$}&|simp:|\.{\\)}\cr 2699\.{@@\\}&|simp:|\.{\\]}\cr 2700\.{@@,}&|math:|\.{\\,}\cr 2701\.{@@t}$\,$stuff$\,$\.{@@>}&|simp:|\.{\\hbox\{{\rm$\,$stuff$\,$}\}}\cr 2702\.{@@<}$\,$module$\,$\.{@@>}&|mod_scrap:|\.{\\X$n$:{\rm$\,$module$\,$}\\X}\cr 2703\.{@@\#}&|comment:||big_force|\cr 2704\.{@@/}&|comment:||force|\cr 2705\.{@@\char'174}&|simp:|$|opt|\,\.0$\cr 2706\.{@@+}&|comment:|$|big_cancel|\,\.{\\\ }\,|big_cancel|$\cr 2707\.{@@;}&|semi:|\cr 2708\.{@@\&}&|math:|\.{\\J}\cr 2709\.{@@\{}&|math:|\.{\\B}\cr 2710\.{@@\}}&|math:|\.{\\T}\cr} 2711\yskip\noindent When a string is output, certain characters are preceded by 2712`\.\\' signs so that they will print properly. 2713 2714A comment in the input will be combined with the preceding 2715|omega| or |semi| scrap, or with the following |terminator| scrap, if 2716possible; otherwise it will be inserted as a separate |terminator| scrap. 2717An additional ``comment'' is effectively appended at the end of the 2718\PASCAL\ text, just before translation begins; this consists of a |cancel| 2719token in the case of \PASCAL\ text in \pb, otherwise it consists of a 2720|force| token. 2721 2722From this table it is evident that \.{WEAVE} will parse a lot of non-\PASCAL\ 2723programs. For example, the reserved words `\.{for}' and `\.{array}' are 2724treated in an identical way by \.{WEAVE} from a syntactic standpoint, 2725and semantically they are equivalent except that a forced line break occurs 2726just before `\&{for}'; \PASCAL\ programmers may well be surprised at this 2727similarity. The idea is to keep \.{WEAVE}'s rules as simple as possible, 2728consistent with doing a reasonable job on syntactically correct \PASCAL\ 2729programs. The production rules below have been formulated in the same 2730spirit of ``almost anything goes.'' 2731 2732@ Here is a table of all the productions. The reader can best get a feel for 2733@^productions, table of@> 2734how they work by trying them out by hand on small examples; no amount of 2735explanation will be as effective as watching the rules in action. 2736Parsing can also be watched by debugging with `\.{@@2}'. 2737 2738\def\[#1]{\quad$\dleft#1\dright$} 2739\def\sp{\.{\ }} 2740\yskip 2741\halign to\the\hsize{\hfil\it# & 2742 #\hfil\hskip-200pt\tabskip 0pt plus 100pt& 2743 #\hfil\tabskip0pt\cr 2744&Production categories\[\hbox{translations}]&Remarks\cr 2745\noalign{\yskip} 27461&|alpha@,math@,colon| $\RA$ |alpha@,math|&e.g., |case v:boolean of|\cr 27472&|alpha@,math@,omega| $\RA$ |clause|\[C=A\,\sp\,\.\$\,M\,\.\$\,\sp\,|indent|\, 2748O]&e.g., |while x>0 do|\cr 27493&|alpha@,omega| $\RA$ |clause|\[C=A\,\sp\,|indent|\,O]&e.g., |file of|\cr 27504&|alpha@,simp| $\RA$ |alpha@,math|&convert to math mode\cr 27515&|beginning@,close@,(terminator@t or @>stmt)| $\RA$ |stmt|&compound statement 2752ends\cr 27536&|beginning@,stmt| $\RA$ |beginning|\[B_2=B_1\,|break_space|\,S]&compound 2754statement grows\cr 27557&|case_head@,casey@,clause| $\RA$ |case_head|\[C_4=C_1\,|outdent|\,C_2\,C_3]& 2756variant records\cr 27578&|case_head@,close@,terminator| $\RA$ |stmt|\[S=C_1\,|cancel|\,|outdent|\, 2758C_2\,T]&end of case statement\cr 27599&|case_head@,stmt| $\RA$ |case_head|\[C_2=C_1\,|force|\,S]&case statement 2760grows\cr 276110&|casey@,clause| $\RA$ |case_head|&beginning of case statement\cr 276211&|clause@,stmt| $\RA$ |stmt|\[S_2=C\,|break_space|\,S_1\,|cancel|\,|outdent|\, 2763|force|]&end of controlled statement\cr 276412&|cond@,clause@,stmt@,elsie| $\RA$ |clause|\[C_3=C_1\,C_2\,|break_space|\,S\, 2765E\,\sp\,|cancel|]&complete conditional\cr 276613&|cond@,clause@,stmt| $\RA$ |stmt|\cr 2767&\qquad\[S_2=C_1\,C_2\,|break_space|\,S_1\, 2768|cancel|\,|outdent|\,|force|]&incomplete conditional\cr 276914&|elsie| $\RA$ |intro|&unmatched else\cr 277015&|exp@,math@,simp|* $\RA$ |math|\[M_2=E\,M_1\,S\,\.\}]&signed exponent\cr 277116&|exp@,simp|* $\RA$ |math|\[M=E\,S\,\.\}]&unsigned exponent\cr 277217&|intro@,stmt| $\RA$ |stmt|\[S_2=I\,\sp\,|opt|\,\.7\,|cancel|\,S_1]&labeled 2773statement, etc.\cr 277418&|math@,close| $\RA$ |stmt@,close|\[S=\.\$\,M\,\.\$]&end of field list\cr 277519&|math@,colon| $\RA$ |intro|\[I=|force|\,|backup|\,\.\$\,M\,\.\$\,C]&compound 2776label\cr 277720&|math@,math| $\RA$ |math|&simple concatenation\cr 277821&|math@,simp| $\RA$ |math|&simple concatenation\cr 277922&|math@,stmt| $\RA$ |stmt|\cr 2780&\qquad\[S_2=\.\$\,M\,\.\$\,|indent|\,|break_space|\, 2781S_1\,|cancel|\,|outdent|\,|force|]¯o or type definition\cr 278223&|math@,terminator| $\RA$ |stmt|\[S=\.\$\,M\,\.\$\,T]&statement involving 2783math\cr 278424&|mod_scrap@,(terminator@t or @>semi)| $\RA$ |stmt|\[S=M\,T\,|force|]&module 2785like a statement\cr 278625&|mod_scrap| $\RA$ |simp|&module unlike a statement\cr 278726&|open@,case_head@,close| $\RA$ |math|\[M=O\,\.\$\,|cancel|\,C_1\, 2788|cancel|\,|outdent|\,\.\$\,C_2]&case in field list\cr 278927&|open@,close| $\RA$ |math|\[M=O\,\.\\\,\.,\,C]&empty set |[]|\cr 279028&|open@,math@,case_head@,close| $\RA$ |math|\cr 2791&\qquad\[M_2=O\,M_1\,\.\$\,|cancel|\, 2792C_1\,|cancel|\,|outdent|\,\.\$\,C_2]&case in field list\cr 279329&|open@,math@,close| $\RA$ |math|&parenthesized group\cr 279430&|open@,math@,colon| $\RA$ |open@,math|&colon in parentheses\cr 279531&|open@,math@,proc@,intro| $\RA$ |open@,math|\[M_2=M_1\,|math_op|\,|cancel|\, 2796P\,\.\}]&|procedure| in parentheses\cr 279732&|open@,math@,semi| $\RA$ |open@,math|\[M_2=M_1\,S\,\.\\\,\.,\,|opt|\, 2798\.5]&semicolon in parentheses\cr 279933&|open@,math@,var_head@,intro| $\RA$ |open@,math|\[M_2=M_1\,|math_op|\, 2800|cancel|\,V\,\.\}]&|var| in parentheses\cr 280134&|open@,proc@,intro| $\RA$ |open@,math|\[M=|math_op|\,|cancel|\, 2802P\,\.\}]&|procedure| in parentheses\cr 280335&|open@,simp| $\RA$ |open@,math|&convert to math mode\cr 280436&|open@,stmt@,close| $\RA$ |math|\[M=O\,\.\$\,|cancel|\,S\,|cancel|\, 2805\.\$\,C]&field list\cr 280637&|open@,var_head@,intro| $\RA$ |open@,math|\[M=|math_op|\,|cancel|\,V\, 2807\.\}]&|var| in parentheses\cr 280838&|proc@,beginning@,close@,terminator| $\RA$ |stmt|\[S=P\,|cancel|\, 2809|outdent|\,B\,C\,T]&end of procedure declaration\cr 281039&|proc@,stmt| $\RA$ |proc|\[P_2=P_1\,|break_space|\,S]&procedure declaration 2811grows\cr 281240&|record_head@,intro@,casey| $\RA$ |casey|\[C_2=R\,I\,\sp\,|cancel|\,C_1]& 2813\&{record case} $\ldots$\cr 281441&|record_head| $\RA$ |case_head|\[C=|indent|\,R\,|cancel|]&other \&{record} 2815structures\cr 281642&|semi| $\RA$ |terminator|&semicolon after statement\cr 281743&|simp@,close| $\RA$ |stmt@,close|&end of field list\cr 281844&|simp@,colon| $\RA$ |intro|\[I=|force|\,|backup|\,S\,C]&simple label\cr 281945&|simp@,math| $\RA$ |math|&simple concatenation\cr 282046&|simp@,mod_scrap| $\RA$ |mod_scrap|&in emergencies\cr 282147&|simp@,simp| $\RA$ |simp|&simple concatenation\cr 282248&|simp@,terminator| $\RA$ |stmt|&simple statement\cr 282349&|stmt@,stmt| $\RA$ |stmt|\[S_3=S_1\,|break_space|\,S_2]&adjacent 2824statements\cr 282550&|terminator| $\RA$ |stmt|&empty statement\cr 282651&|var_head@,beginning| $\RA$ |stmt@,beginning|&end of variable 2827declarations\cr 282852&|var_head@,math@,colon| $\RA$ |var_head@,intro|\[I=\.\$\,M\,\.\$\,C]& 2829variable declaration\cr 283053&|var_head@,simp@,colon| $\RA$ |var_head@,intro|&variable declaration\cr 283154&|var_head@,stmt| $\RA$ |var_head|\[V_2=V_1\,|break_space|\,S]&variable 2832declarations grow\cr} 2833\yskip\noindent 2834Translations are not specified here when they are simple concatenations 2835of the scraps that change. For example, the full translation of 2836`|open@,math@,colon| $\RA$ |open@,math|' is $O_2=O_1$, $M_2=M_1C$. 2837 2838The notation `|simp|*', in the |exp|-related productions above, 2839stands for a |simp| scrap that isn't followed by another |simp|. 2840 2841@* Implementing the productions. 2842When \PASCAL\ text is to be processed with the grammar above, we put its 2843initial scraps $s_1\ldots s_n$ into two arrays |cat[1..n]| and |trans[1..n]|. 2844The value of |cat[k]| is simply a category code from the list above; the 2845value of |trans[k]| is a text pointer, i.e., an index into |tok_start|. 2846Our production rules have the nice property that the right-hand side is never 2847longer than the left-hand side. Therefore it is convenient to use sequential 2848allocation for the current sequence of scraps. Five pointers are used to 2849manage the parsing: 2850 2851\yskip\hang |pp| (the parsing pointer) is such that we are trying to match 2852the category codes |cat[pp]@,cat[pp+1]|$\,\ldots\,$ to the left-hand sides 2853of productions. 2854 2855\yskip\hang |scrap_base|, |lo_ptr|, |hi_ptr|, and |scrap_ptr| are such that 2856the current sequence of scraps appears in positions |scrap_base| through 2857|lo_ptr| and |hi_ptr| through |scrap_ptr|, inclusive, in the |cat| and 2858|trans| arrays. Scraps located between |scrap_base| and |lo_ptr| have 2859been examined, while those in positions |>=hi_ptr| have not yet been 2860looked at by the parsing process. 2861 2862\yskip\noindent Initially |scrap_ptr| is set to the position of the final 2863scrap to be parsed, and it doesn't change its value. The parsing process 2864makes sure that |lo_ptr>=pp+3|, since productions have as many as four terms, 2865by moving scraps from |hi_ptr| to |lo_ptr|. If there are 2866fewer than |pp+3| scraps left, the positions up to |pp+3| are filled with 2867blanks that will not match in any productions. Parsing stops when 2868|pp=lo_ptr+1| and |hi_ptr=scrap_ptr+1|. 2869 2870The |trans| array elements are declared to be of type |0..10239| instead 2871of type |text_pointer|, because the final sorting phase of \.{WEAVE} 2872uses this array to contain elements of type |name_pointer|. Both 2873of these types are subranges of |0..10239|. 2874 2875@<Glo...@>= 2876@!cat:array[0..max_scraps] of eight_bits; {category codes of scraps} 2877@!trans:array[0..max_scraps] of 0..10239; {translation texts of scraps} 2878@!pp:0..max_scraps; {current position for reducing productions} 2879@!scrap_base:0..max_scraps; {beginning of the current scrap sequence} 2880@!scrap_ptr:0..max_scraps; {ending of the current scrap sequence} 2881@!lo_ptr:0..max_scraps; {last scrap that has been examined} 2882@!hi_ptr:0..max_scraps; {first scrap that has not been examined} 2883stat@!max_scr_ptr:0..max_scraps; {largest value assumed by |scrap_ptr|} 2884tats 2885 2886@ @<Set init...@>= 2887scrap_base:=1; scrap_ptr:=0; 2888stat max_scr_ptr:=0; @+tats 2889 2890@ Token lists in |@!tok_mem| are composed of the following kinds of 2891items for \TeX\ output. 2892 2893\yskip\item{$\bullet$}ASCII codes and special codes like |force| and 2894|math_rel| represent themselves; 2895 2896\item{$\bullet$}|id_flag+p| represents \.{\\\\\{{\rm identifier $p$}\}}; 2897 2898\item{$\bullet$}|res_flag+p| represents \.{\\\&\{{\rm identifier $p$}\}}; 2899 2900\item{$\bullet$}|mod_flag+p| represents module name |p|; 2901 2902\item{$\bullet$}|tok_flag+p| represents token list number |p|; 2903 2904\item{$\bullet$}|inner_tok_flag+p| represents token list number |p|, to be 2905translated without line-break controls. 2906 2907@d id_flag=10240 {signifies an identifier} 2908@d res_flag=id_flag+id_flag {signifies a reserved word} 2909@d mod_flag=res_flag+id_flag {signifies a module name} 2910@d tok_flag==mod_flag+id_flag {signifies a token list} 2911@d inner_tok_flag==tok_flag+id_flag {signifies a token list in `\pb'} 2912@# 2913@d lbrace==xchr["{"] {this avoids possible \PASCAL\ compiler confusion} 2914@d rbrace==xchr["}"] {because these braces might occur within comments} 2915 2916@p @!debug procedure print_text(@!p:text_pointer); {prints a token list} 2917var j:0..max_toks; {index into |tok_mem|} 2918@!r:0..id_flag-1; {remainder of token after the flag has been stripped off} 2919begin if p>=text_ptr then print('BAD') 2920else for j:=tok_start[p] to tok_start[p+1]-1 do 2921 begin r:=tok_mem[j] mod id_flag; 2922 case tok_mem[j] div id_flag of 2923 1: begin print('\\',lbrace); print_id(r); print(rbrace); 2924 end; {|id_flag|} 2925 2: begin print('\&',lbrace); print_id(r); print(rbrace); 2926 end; {|res_flag|} 2927 3: begin print('<'); print_id(r); print('>'); 2928 end; {|mod_flag|} 2929 4: print('[[',r:1,']]'); {|tok_flag|} 2930 5: print('|[[',r:1,']]|'); {|inner_tok_flag|} 2931 othercases @<Print token |r| in symbolic form@> 2932 endcases; 2933 end; 2934end; 2935gubed 2936 2937@ @<Print token |r|...@>= 2938case r of 2939math_bin: print('\mathbin',lbrace); 2940math_rel: print('\mathrel',lbrace); 2941math_op: print('\mathop',lbrace); 2942big_cancel: print('[ccancel]'); 2943cancel: print('[cancel]'); 2944indent: print('[indent]'); 2945outdent: print('[outdent]'); 2946backup: print('[backup]'); 2947opt: print('[opt]'); 2948break_space: print('[break]'); 2949force: print('[force]'); 2950big_force: print('[fforce]'); 2951end_translation: print('[quit]'); 2952othercases print(xchr[r]) 2953endcases 2954 2955@ The production rules listed above are embedded directly into the \.{WEAVE} 2956program, since it is easier to do this than to write an interpretive system 2957that would handle production systems in general. Several macros are defined 2958here so that the program for each production is fairly short. 2959 2960All of our productions conform to the general notion that some |k| 2961consecutive scraps starting at some position |j| are to be replaced by a 2962single scrap of some category |c| whose translation is composed from the 2963translations of the disappearing scraps. After this production has been 2964applied, the production pointer |pp| should change by an amount |d|. Such 2965a production can be represented by the quadruple $(j,k,c,d)$. For example, 2966the production `|simp@,math| $\RA$ |math|' would be represented by 2967`$(|pp|,2,|math|,-1)$'; in this case the pointer $pp$ should decrease by 1 2968after the production has been applied, because some productions with 2969|math| in their second positions might now match, but no productions have 2970|math| in the third or fourth position of their left-hand sides. Note that 2971the value of |d| is determined by the whole collection of productions, not 2972by an individual one. Consider the further example 2973`|var_head@,math@,colon| $\RA$ |var_head@,intro|', which is represented by 2974`$(|pp|+1,2,|intro|,+1)$'; the $+1$ here is deduced by looking at the 2975grammar and seeing that no matches could possibly occur at positions |<=pp| 2976after this production has been applied. The determination of |d| has been 2977done by hand in each case, based on the full set of productions but not on 2978the grammar of \PASCAL\ or on the rules for constructing the initial 2979scraps. 2980 2981We also attach a serial number to each production, so that additional 2982information is available when debugging. For example, the program below 2983contains the statement `|reduce(pp+1,2,intro,+1)(52)|' when it implements 2984the production just mentioned. 2985 2986Before calling |reduce|, the program should have appended the tokens of 2987the new translation to the |tok_mem| array. We commonly want to append 2988copies of several existing translations, and macros are defined to 2989simplify these common cases. For example, |app2(pp)| will append the 2990translations of two consecutive scraps, |trans[pp]| and |trans[pp+1]|, to 2991the current token list. If the entire new translation is formed in this 2992way, we write `$|squash|(j,k,c,d)$' instead of `$|reduce|(j,k,c,d)$'. For 2993example, `|squash(pp,2,math,-1)|' is an abbreviation for `|app2(pp); 2994reduce(pp,2,math,-1)|'. 2995 2996The code below is an exact translation of the production rules into 2997\PASCAL, using such macros, and the reader should have no difficulty 2998understanding the format by comparing the code with the symbolic 2999productions as they were listed earlier. 3000 3001{\sl Caution:\/} The macros |app|, |app1|, |app2|, and |app3| are 3002sequences of statements that are not enclosed with |begin| and $|end|$, 3003because such delimiters would make the \PASCAL\ program much longer. This 3004means that it is necessary to write |begin| and |end| explicitly when such 3005a macro is used as a single statement. Several mysterious bugs in the 3006original programming of \.{WEAVE} were caused by a failure to remember 3007this fact. Next time the author will know better. 3008 3009@d production(#)==@!debug prod(#) gubed; goto found 3010@d reduce(#)==red(#); production 3011@d production_end(#)==@!debug prod(#) gubed; goto found; 3012 end 3013@d squash(#)==begin sq(#); production_end 3014@d app(#)==tok_mem[tok_ptr]:=#; incr(tok_ptr) {this is like |app_tok|, 3015 but it doesn't test for overflow} 3016@d app1(#)==tok_mem[tok_ptr]:=tok_flag+trans[#]; incr(tok_ptr) 3017@d app2(#)==app1(#);app1(#+1) 3018@d app3(#)==app2(#);app1(#+2) 3019 3020@ Let us consider the big case statement for productions now, before looking 3021at its context. We want to design the program so that this case statement 3022works, so we might as well not keep ourselves in suspense about exactly what 3023code needs to be provided with a proper environment. 3024 3025The code here is more complicated than it need be, since some popular 3026\PASCAL\ compilers are unable to deal with procedures that contain a lot 3027of program text. The |translate| procedure, which incorporates the |case| 3028statement here, would become too long for those compilers if we did 3029not do something to split the cases into parts. Therefore 3030a separate procedure called |five_cases| has been introduced. 3031@^split procedures@> 3032This auxiliary procedure contains approximately half of the program text 3033that |translate| would otherwise have had. There's also a procedure 3034called |alpha_cases|, which turned out to be necessary because the best 3035two-way split wasn't good enough. The procedure could be split further 3036in an analogous manner, but the present scheme works on all compilers 3037known to the author. 3038 3039@<Match a production at |pp|, or increase |pp| if there is no match@>= 3040if cat[pp]<=alpha then 3041 if cat[pp]<alpha then five_cases@+else alpha_cases 3042else begin case cat[pp] of 3043 case_head: @<Cases for |case_head|@>; 3044 casey: @<Cases for |casey|@>; 3045 clause: @<Cases for |clause|@>; 3046 cond: @<Cases for |cond|@>; 3047 elsie: @<Cases for |elsie|@>; 3048 exp: @<Cases for |exp|@>; 3049 mod_scrap: @<Cases for |mod_scrap|@>; 3050 proc: @<Cases for |proc|@>; 3051 record_head: @<Cases for |record_head|@>; 3052 semi: @<Cases for |semi|@>; 3053 stmt: @<Cases for |stmt|@>; 3054 terminator: @<Cases for |terminator|@>; 3055 var_head: @<Cases for |var_head|@>; 3056 othercases do_nothing 3057 endcases;@/ 3058 incr(pp); {if no match was found, we move to the right} 3059 found: end 3060 3061@ Here are the procedures that need to be present for the reason just 3062explained. 3063 3064@<Declaration of subprocedures for |translate|@>= 3065procedure five_cases; {handles almost half of the syntax} 3066label found; 3067begin case cat[pp] of 3068beginning: @<Cases for |beginning|@>; 3069intro: @<Cases for |intro|@>; 3070math: @<Cases for |math|@>; 3071open: @<Cases for |open|@>; 3072simp: @<Cases for |simp|@>; 3073othercases do_nothing 3074endcases;@/ 3075incr(pp); {if no match was found, we move to the right} 3076found: end; 3077@# 3078procedure alpha_cases; 3079label found; 3080begin @<Cases for |alpha|@>; 3081incr(pp); {if no match was found, we move to the right} 3082found: end; 3083 3084@ Now comes the code that tries to match each production starting 3085with a particular type of scrap. Whenever a match is discovered, 3086the |squash| or |reduce| macro will cause the appropriate action 3087to be performed, followed by |goto found|. 3088 3089@<Cases for |alpha|@>= 3090if cat[pp+1]=math then 3091 begin if cat[pp+2]=colon then squash(pp+1,2,math,0)(1) 3092 else if cat[pp+2]=omega then 3093 begin app1(pp); app(" "); app("$"); app1(pp+1); 3094 app("$"); app(" "); app(indent); app1(pp+2); 3095 reduce(pp,3,clause,-2)(2); 3096 end; 3097 end 3098else if cat[pp+1]=omega then 3099 begin app1(pp); app(" "); app(indent); app1(pp+1); 3100 reduce(pp,2,clause,-2)(3); 3101 end 3102else if cat[pp+1]=simp then squash(pp+1,1,math,0)(4) 3103 3104@ @<Cases for |beginning|@>= 3105if cat[pp+1]=close then 3106 begin if (cat[pp+2]=terminator)or(cat[pp+2]=stmt) then 3107 squash(pp,3,stmt,-2)(5); 3108 end 3109else if cat[pp+1]=stmt then 3110 begin app1(pp); app(break_space); app1(pp+1); 3111 reduce(pp,2,beginning,-1)(6); 3112 end 3113 3114@ @<Cases for |case_head|@>= 3115if cat[pp+1]=casey then 3116 begin if cat[pp+2]=clause then 3117 begin app1(pp); app(outdent); app2(pp+1); 3118 reduce(pp,3,case_head,0)(7); 3119 end; 3120 end 3121else if cat[pp+1]=close then 3122 begin if cat[pp+2]=terminator then 3123 begin app1(pp); app(cancel); app(outdent); app2(pp+1); 3124 reduce(pp,3,stmt,-2)(8); 3125 end; 3126 end 3127else if cat[pp+1]=stmt then 3128 begin app1(pp); app(force); app1(pp+1); 3129 reduce(pp,2,case_head,0)(9); 3130 end 3131 3132@ @<Cases for |casey|@>= 3133if cat[pp+1]=clause then squash(pp,2,case_head,0)(10) 3134 3135@ @<Cases for |clause|@>= 3136if cat[pp+1]=stmt then 3137 begin app1(pp); app(break_space); app1(pp+1); 3138 app(cancel); app(outdent); 3139 app(force); reduce(pp,2,stmt,-2)(11); 3140 end 3141 3142@ @<Cases for |cond|@>= 3143if (cat[pp+1]=clause)and(cat[pp+2]=stmt) then 3144 if cat[pp+3]=elsie then 3145 begin app2(pp); app(break_space); app2(pp+2); app(" "); 3146 app(cancel); reduce(pp,4,clause,-2)(12); 3147 end 3148 else begin app2(pp); app(break_space); app1(pp+2); app(cancel); 3149 app(outdent); app(force); reduce(pp,3,stmt,-2)(13); 3150 end 3151 3152@ @<Cases for |elsie|@>= 3153squash(pp,1,intro,-3)(14) 3154 3155@ @<Cases for |exp|@>= 3156if cat[pp+1]=math then 3157 begin if cat[pp+2]=simp then if cat[pp+3]<>simp then 3158 begin app3(pp); app("}"); reduce(pp,3,math,-1)(15); 3159 end; 3160 end 3161else if cat[pp+1]=simp then if cat[pp+2]<>simp then 3162 begin app2(pp); app("}"); reduce(pp,2,math,-1)(16); 3163 end 3164 3165@ @<Cases for |intro|@>= 3166if cat[pp+1]=stmt then 3167 begin app1(pp); app(" "); app(opt); app("7"); 3168 app(cancel); app1(pp+1); reduce(pp,2,stmt,-2)(17); 3169 end 3170 3171@ @<Cases for |math|@>= 3172if cat[pp+1]=close then 3173 begin app("$"); app1(pp); app("$"); reduce(pp,1,stmt,-2)(18); 3174 end 3175else if cat[pp+1]=colon then 3176 begin app(force); app(backup); app("$"); app1(pp); 3177 app("$"); app1(pp+1); reduce(pp,2,intro,-3)(19); 3178 end 3179else if cat[pp+1]=math then squash(pp,2,math,-1)(20) 3180else if cat[pp+1]=simp then squash(pp,2,math,-1)(21) 3181else if cat[pp+1]=stmt then 3182 begin app("$"); app1(pp); app("$"); app(indent); 3183 app(break_space); app1(pp+1); app(cancel); app(outdent); 3184 app(force); reduce(pp,2,stmt,-2)(22); 3185 end 3186else if cat[pp+1]=terminator then 3187 begin app("$"); app1(pp); app("$"); app1(pp+1); 3188 reduce(pp,2,stmt,-2)(23); 3189 end 3190 3191@ @<Cases for |mod_scrap|@>= 3192if (cat[pp+1]=terminator)or(cat[pp+1]=semi) then 3193 begin app2(pp); app(force); reduce(pp,2,stmt,-2)(24); 3194 end 3195else squash(pp,1,simp,-2)(25) 3196 3197@ @<Cases for |open|@>= 3198if (cat[pp+1]=case_head)and(cat[pp+2]=close) then 3199 begin app1(pp); app("$"); app(cancel); app1(pp+1); app(cancel); 3200 app(outdent); app("$"); app1(pp+2); reduce(pp,3,math,-1)(26); 3201 end 3202else if cat[pp+1]=close then 3203 begin app1(pp); app("\"); app(","); app1(pp+1); 3204@.\\,@> 3205 reduce(pp,2,math,-1)(27); 3206 end 3207else if cat[pp+1]=math then @<Cases for |open@,math|@> 3208else if cat[pp+1]=proc then 3209 begin if cat[pp+2]=intro then 3210 begin app(math_op); app(cancel); app1(pp+1); app("}"); 3211 reduce(pp+1,2,math,0)(34); 3212 end; 3213 end 3214else if cat[pp+1]=simp then squash(pp+1,1,math,0)(35) 3215else if (cat[pp+1]=stmt)and(cat[pp+2]=close) then 3216 begin app1(pp); app("$"); app(cancel); app1(pp+1); app(cancel); 3217 app("$"); app1(pp+2); reduce(pp,3,math,-1)(36); 3218 end 3219else if cat[pp+1]=var_head then 3220 begin if cat[pp+2]=intro then 3221 begin app(math_op); app(cancel); app1(pp+1); app("}"); 3222 reduce(pp+1,2,math,0)(37); 3223 end; 3224 end 3225 3226@ @<Cases for |open@,math|@>= 3227begin if (cat[pp+2]=case_head)and(cat[pp+3]=close) then 3228 begin app2(pp); app("$"); app(cancel); app1(pp+2); app(cancel); 3229 app(outdent); app("$"); app1(pp+3); reduce(pp,4,math,-1)(28); 3230 end 3231else if cat[pp+2]=close then squash(pp,3,math,-1)(29) 3232else if cat[pp+2]=colon then squash(pp+1,2,math,0)(30) 3233else if cat[pp+2]=proc then 3234 begin if cat[pp+3]=intro then 3235 begin app1(pp+1); app(math_op); app(cancel); 3236 app1(pp+2); app("}"); reduce(pp+1,3,math,0)(31); 3237 end; 3238 end 3239else if cat[pp+2]=semi then 3240 begin app2(pp+1); app("\"); app(","); app(opt); app("5"); 3241@.\\,@> 3242 reduce(pp+1,2,math,0)(32); 3243 end 3244else if cat[pp+2]=var_head then 3245 begin if cat[pp+3]=intro then 3246 begin app1(pp+1); app(math_op); app(cancel); 3247 app1(pp+2); app("}"); reduce(pp+1,3,math,0)(31); 3248 end; 3249 end; 3250end 3251 3252@ @<Cases for |proc|@>= 3253if cat[pp+1]=beginning then 3254 begin if (cat[pp+2]=close)and(cat[pp+3]=terminator) then 3255 begin app1(pp); app(cancel); app(outdent); app3(pp+1); 3256 reduce(pp,4,stmt,-2)(38); 3257 end; 3258 end 3259else if cat[pp+1]=stmt then 3260 begin app1(pp); app(break_space); app1(pp+1); 3261 reduce(pp,2,proc,-2)(39); 3262 end 3263 3264@ @<Cases for |record_head|@>= 3265if (cat[pp+1]=intro)and(cat[pp+2]=casey) then 3266 begin app2(pp); app(" "); app(cancel); app1(pp+2); 3267 reduce(pp,3,casey,-2)(40); 3268 end 3269else begin app(indent); app1(pp); app(cancel); 3270 reduce(pp,1,case_head,0)(41); 3271 end 3272 3273@ @<Cases for |semi|@>= 3274squash(pp,1,terminator,-3)(42) 3275 3276@ @<Cases for |simp|@>= 3277if cat[pp+1]=close then squash(pp,1,stmt,-2)(43) 3278else if cat[pp+1]=colon then 3279 begin app(force); app(backup); app2(pp); reduce(pp,2,intro,-3)(44); 3280 end 3281else if cat[pp+1]=math then squash(pp,2,math,-1)(45) 3282else if cat[pp+1]=mod_scrap then squash(pp,2,mod_scrap,0)(46) 3283else if cat[pp+1]=simp then squash(pp,2,simp,-2)(47) 3284else if cat[pp+1]=terminator then squash(pp,2,stmt,-2)(48) 3285 3286@ @<Cases for |stmt|@>= 3287if cat[pp+1]=stmt then 3288 begin app1(pp); app(break_space); app1(pp+1); 3289 reduce(pp,2,stmt,-2)(49); 3290 end 3291 3292@ @<Cases for |terminator|@>= 3293squash(pp,1,stmt,-2)(50) 3294 3295@ @<Cases for |var_head|@>= 3296if cat[pp+1]=beginning then squash(pp,1,stmt,-2)(51) 3297else if cat[pp+1]=math then 3298 begin if cat[pp+2]=colon then 3299 begin app("$"); app1(pp+1); app("$"); app1(pp+2); 3300 reduce(pp+1,2,intro,+1)(52); 3301 end; 3302 end 3303else if cat[pp+1]=simp then 3304 begin if cat[pp+2]=colon then squash(pp+1,2,intro,+1)(53); 3305 end 3306else if cat[pp+1]=stmt then 3307 begin app1(pp); app(break_space); app1(pp+1); 3308 reduce(pp,2,var_head,-2)(54); 3309 end 3310 3311@ The `|freeze_text|' macro is used to give official status to a token list. 3312Before saying |freeze_text|, items are appended to the current token list, 3313and we know that the eventual number of this token list will be the current 3314value of |text_ptr|. But no list of that number really exists as yet, 3315because no ending point for the current list has been 3316stored in the |tok_start| array. After saying |freeze_text|, the 3317old current token list becomes legitimate, and its number is the current 3318value of |text_ptr-1| since |text_ptr| has been increased. The new 3319current token list is empty and ready to be appended to. 3320Note that |freeze_text| does not check to see that |text_ptr| hasn't gotten 3321too large, since it is assumed that this test was done beforehand. 3322 3323@d freeze_text==incr(text_ptr); tok_start[text_ptr]:=tok_ptr 3324 3325@ The `|reduce|' macro used in our code for productions actually calls on 3326a procedure named `|red|', which makes the appropriate changes to the 3327scrap list. 3328 3329@p procedure red(@!j:sixteen_bits; @!k:eight_bits; @!c:eight_bits; 3330 @!d:integer); 3331var i:0..max_scraps; {index into scrap memory} 3332begin cat[j]:=c; trans[j]:=text_ptr; freeze_text; 3333if k>1 then 3334 begin for i:=j+k to lo_ptr do 3335 begin cat[i-k+1]:=cat[i]; trans[i-k+1]:=trans[i]; 3336 end; 3337 lo_ptr:=lo_ptr-k+1; 3338 end; 3339@<Change |pp| to $\max(|scrap_base|,|pp+d|)$@>; 3340end; 3341 3342@ @<Change |pp| to $\max(|scrap_base|,|pp+d|)$@>= 3343if pp+d>=scrap_base then pp:=pp+d 3344else pp:=scrap_base 3345 3346@ Similarly, the `|squash|' macro invokes a procedure called `|sq|'. This 3347procedure takes advantage of the simplification that occurs when |k=1|. 3348 3349@p procedure sq(@!j:sixteen_bits; @!k:eight_bits; @!c:eight_bits; 3350 @!d:integer); 3351var i:0..max_scraps; {index into scrap memory} 3352begin if k=1 then 3353 begin cat[j]:=c; @<Change |pp|...@>; 3354 end 3355else begin for i:=j to j+k-1 do 3356 begin app1(i); 3357 end; 3358 red(j,k,c,d); 3359 end; 3360end; 3361 3362@ Here now is the code that applies productions as long as possible. It 3363requires two local labels (|found| and |done|), as well as a local 3364variable (|i|). 3365 3366@<Reduce the scraps using the productions until no more rules apply@>= 3367loop@+begin @<Make sure the entries |cat[pp..(pp+3)]| are defined@>; 3368 if (tok_ptr+8>max_toks)or(text_ptr+4>max_texts) then 3369 begin stat if tok_ptr>max_tok_ptr then max_tok_ptr:=tok_ptr; 3370 if text_ptr>max_txt_ptr then max_txt_ptr:=text_ptr; 3371 tats@;@/ 3372 overflow('token/text'); 3373 end; 3374 if pp>lo_ptr then goto done; 3375 @<Match a production...@>; 3376 end; 3377done: 3378 3379@ If we get to the end of the scrap list, category codes equal to zero are 3380stored, since zero does not match anything in a production. 3381 3382@<Make sure the entries...@>= 3383if lo_ptr<pp+3 then 3384 begin repeat if hi_ptr<=scrap_ptr then 3385 begin incr(lo_ptr);@/ 3386 cat[lo_ptr]:=cat[hi_ptr]; trans[lo_ptr]:=trans[hi_ptr];@/ 3387 incr(hi_ptr); 3388 end; 3389 until (hi_ptr>scrap_ptr)or(lo_ptr=pp+3); 3390 for i:=lo_ptr+1 to pp+3 do cat[i]:=0; 3391 end 3392 3393@ If \.{WEAVE} is being run in debugging mode, the production numbers and 3394current stack categories will be printed out when |tracing| is set to 2; 3395a sequence of two or more irreducible scraps will be printed out when 3396|tracing| is set to 1. 3397@.\AT!2@> 3398@.\AT!1@> 3399 3400@<Glo...@>= 3401@!debug@!tracing:0..2; {can be used to show parsing details} 3402gubed 3403 3404@ The |prod| procedure is called in debugging mode just after |reduce| or 3405|squash|; its parameter is the number of the production that has just 3406been applied. 3407 3408@p @!debug procedure prod(@!n:eight_bits); {shows current categories} 3409var k:1..max_scraps; {index into |cat|} 3410begin if tracing=2 then 3411 begin print_nl(n:1,':'); 3412 for k:=scrap_base to lo_ptr do 3413 begin if k=pp then print('*') @+ else print(' '); 3414 print_cat(cat[k]); 3415 end; 3416 if hi_ptr<=scrap_ptr then print('...'); {indicate that more is coming} 3417 end; 3418end; 3419gubed 3420 3421@ The |translate| function assumes that scraps have been stored in 3422positions |scrap_base| through |scrap_ptr| of |cat| and |trans|. It 3423appends a |terminator| scrap and begins to apply productions as much as 3424possible. The result is a token list containing the translation of 3425the given sequence of scraps. 3426 3427After calling |translate|, we will have |text_ptr+3<=max_texts| and 3428|tok_ptr+6<=max_toks|, so it will be possible to create up to three token 3429lists with up to six tokens without checking for overflow. Before calling 3430|translate|, we should have |text_ptr<max_texts| and |scrap_ptr<max_scraps|, 3431since |translate| might add a new text and a new scrap before it checks 3432for overflow. 3433 3434@p @<Declaration of subprocedures for |translate|@>@; 3435function translate:text_pointer; {converts a sequence of scraps} 3436label done,found; 3437var i: 1..max_scraps; {index into |cat|} 3438@!j:0..max_scraps; {runs through final scraps} 3439@!k:0..long_buf_size; {index into |buffer|} 3440begin pp:=scrap_base; lo_ptr:=pp-1; hi_ptr:=pp; 3441@<If tracing, print an indication of where we are@>; 3442@<Reduce the scraps...@>; 3443if (lo_ptr=scrap_base)and(cat[lo_ptr]<>math) then translate:=trans[lo_ptr] 3444else @<Combine the irreducible scraps that remain@>; 3445end; 3446 3447@ If the initial sequence of scraps does not reduce to a single scrap, 3448we concatenate the translations of all remaining scraps, separated by 3449blank spaces, with dollar signs surrounding the translations of |math| 3450scraps. 3451 3452@<Combine the irreducible...@>= 3453begin @<If semi-tracing, show the irreducible scraps@>; 3454for j:=scrap_base to lo_ptr do 3455 begin if j<>scrap_base then 3456 begin app(" "); 3457 end; 3458 if cat[j]=math then 3459 begin app("$"); 3460 end; 3461 app1(j); 3462 if cat[j]=math then 3463 begin app("$"); 3464 end; 3465 if tok_ptr+6>max_toks then overflow('token'); 3466 end; 3467freeze_text; translate:=text_ptr-1; 3468end 3469 3470@ @<If semi-tracing, show the irreducible scraps@>= 3471@!debug if (lo_ptr>scrap_base)and(tracing=1) then 3472 begin print_nl('Irreducible scrap sequence in section ',module_count:1); 3473 print_ln(':'); mark_harmless; 3474 for j:=scrap_base to lo_ptr do 3475 begin print(' '); print_cat(cat[j]); 3476 end; 3477 end; 3478gubed 3479 3480@ @<If tracing,...@>= 3481@!debug if tracing=2 then 3482 begin print_nl('Tracing after l.',line:1,':'); mark_harmless; 3483 if loc>50 then 3484 begin print('...'); 3485 for k:=loc-50 to loc do print(xchr[buffer[k-1]]); 3486 end 3487 else for k:=1 to loc do print(xchr[buffer[k-1]]); 3488 end 3489gubed 3490 3491@* Initializing the scraps. 3492If we are going to use the powerful production mechanism just developed, we 3493must get the scraps set up in the first place, given a \PASCAL\ text. A table 3494of the initial scraps corresponding to \PASCAL\ tokens appeared above in the 3495section on parsing; our goal now is to implement that table. We shall do this 3496by implementing a subroutine called |Pascal_parse| that is analogous to the 3497|Pascal_xref| routine used during phase one. 3498 3499Like |Pascal_xref|, the |Pascal_parse| procedure starts with the current 3500value of |next_control| and it uses the operation |next_control:=get_next| 3501repeatedly to read \PASCAL\ text until encountering the next `\v' or 3502`\.\{', or until |next_control>=format|. The scraps corresponding to what 3503it reads are appended into the |cat| and |trans| arrays, and |scrap_ptr| 3504is advanced. 3505 3506Like |prod|, this procedure has to split into pieces so that each 3507part is short enough to be handled by \PASCAL\ compilers that discriminate 3508against long subroutines. This time there are two split-off routines, 3509called |easy_cases| and |sub_cases|. 3510@^split procedures@> 3511 3512After studying |Pascal_parse|, we will look at the sub-procedures 3513|app_comment|, |app_octal|, and |app_hex| that are used in some of its 3514branches. 3515 3516@p @<Declaration of the |app_comment| procedure@>@; 3517@<Declaration of the |app_octal| and |app_hex| procedures@>@; 3518@<Declaration of the |easy_cases| procedure@>@; 3519@<Declaration of the |sub_cases| procedure@>@; 3520procedure Pascal_parse; {creates scraps from \PASCAL\ tokens} 3521label reswitch, exit; 3522var j:0..long_buf_size; {index into |buffer|} 3523@!p:name_pointer; {identifier designator} 3524begin while next_control<format do 3525 begin @<Append the scrap appropriate to |next_control|@>; 3526 next_control:=get_next; 3527 if (next_control="|")or(next_control="{") then return; 3528 end; 3529exit:end; 3530 3531@ The macros defined here are helpful abbreviations for the operations 3532needed when generating the scraps. A scrap of category |c| whose 3533translation has three tokens $t_1$, $t_2$, $t_3$ is generated by 3534|sc3|$(t_1)(t_2)(t_3)(c)$, etc. 3535 3536@d s0(#)==incr(scrap_ptr); cat[scrap_ptr]:=#; trans[scrap_ptr]:=text_ptr; 3537 freeze_text; 3538 end 3539@d s1(#)==app(#);s0 3540@d s2(#)==app(#);s1 3541@d s3(#)==app(#);s2 3542@d s4(#)==app(#);s3 3543@d sc4==@+begin s4 3544@d sc3==@+begin s3 3545@d sc2==@+begin s2 3546@d sc1==@+begin s1 3547@d sc0(#)==begin incr(scrap_ptr); cat[scrap_ptr]:=#; trans[scrap_ptr]:=0; 3548 end 3549@d comment_scrap(#)==begin app(#); app_comment; 3550 end 3551 3552@ @<Append the scr...@>= 3553@<Make sure that there is room for at least four more scraps, six more 3554tokens, and four more texts@>; 3555reswitch: case next_control of 3556string,verbatim: @<Append a \(string scrap@>; 3557identifier: @<Append an identifier scrap@>; 3558TeX_string: @<Append a \TeX\ string scrap@>; 3559othercases easy_cases 3560endcases 3561 3562@ The |easy_cases| each result in straightforward scraps. 3563 3564@<Declaration of the |easy_cases| procedure@>= 3565procedure easy_cases; {a subprocedure of |Pascal_parse|} 3566begin case next_control of 3567set_element_sign: sc3("\")("i")("n")(math); 3568@.\\in@> 3569double_dot: sc3("\")("t")("o")(math); 3570@.\\to@> 3571"#","$","%","^","_": sc2("\")(next_control)(math); 3572@.\\\#@> 3573@.\\\$@> 3574@.\\\%@> 3575@.\\\^@> 3576ignore,"|",xref_roman,xref_wildcard,xref_typewriter: do_nothing; 3577"(","[": sc1(next_control)(open); 3578")","]": sc1(next_control)(close); 3579"*": sc4("\")("a")("s")("t")(math); 3580@.\\ast@> 3581",": sc3(",")(opt)("9")(math); 3582".","0","1","2","3","4","5","6","7","8","9": sc1(next_control)(simp); 3583";": sc1(";")(semi); 3584":": sc1(":")(colon); 3585@t\4@> @<Cases involving nonstandard ASCII characters@>@; 3586exponent: sc3("\")("E")("{")(exp); 3587@.\\E@> 3588begin_comment: sc2("\")("B")(math); 3589@.\\B@> 3590end_comment: sc2("\")("T")(math); 3591@.\\T@> 3592octal: app_octal; 3593hex: app_hex; 3594check_sum: sc2("\")(")")(simp); 3595@.\\)@> 3596force_line: sc2("\")("]")(simp); 3597@.\\]@> 3598thin_space: sc2("\")(",")(math); 3599@.\\,@> 3600math_break: sc2(opt)("0")(simp); 3601line_break: comment_scrap(force); 3602big_line_break: comment_scrap(big_force); 3603no_line_break: begin app(big_cancel); app("\"); app(" "); 3604@.\\\ @> 3605 comment_scrap(big_cancel); 3606 end; 3607pseudo_semi: sc0(semi); 3608join: sc2("\")("J")(math); 3609@.\\J@> 3610othercases sc1(next_control)(math) 3611endcases; 3612end; 3613 3614@ @<Make sure that there is room for at least four...@>= 3615if (scrap_ptr+4>max_scraps)or(tok_ptr+6>max_toks)or(text_ptr+4>max_texts) then 3616 begin stat if scrap_ptr>max_scr_ptr then max_scr_ptr:=scrap_ptr; 3617 if tok_ptr>max_tok_ptr then max_tok_ptr:=tok_ptr; 3618 if text_ptr>max_txt_ptr then max_txt_ptr:=text_ptr; 3619 tats@;@/ 3620 overflow('scrap/token/text'); 3621 end 3622 3623@ Some nonstandard ASCII characters may have entered \.{WEAVE} by means of 3624standard ones. They are converted to \TeX\ control sequences so that it is 3625possible to keep \.{WEAVE} from stepping beyond standard ASCII. 3626 3627@<Cases involving nonstandard...@>= 3628not_equal: sc2("\")("I")(math); 3629@.\\I@> 3630less_or_equal: sc2("\")("L")(math); 3631@.\\L@> 3632greater_or_equal: sc2("\")("G")(math); 3633@.\\G@> 3634equivalence_sign: sc2("\")("S")(math); 3635@.\\S@> 3636and_sign: sc2("\")("W")(math); 3637@.\\W@> 3638or_sign: sc2("\")("V")(math); 3639@.\\V@> 3640not_sign: sc2("\")("R")(math); 3641@.\\R@> 3642left_arrow: sc2("\")("K")(math); 3643@.\\K@> 3644 3645@ The following code must use |app_tok| instead of |app| in order to 3646protect against overflow. Note that |tok_ptr+1<=max_toks| after |app_tok| 3647has been used, so another |app| is legitimate before testing again. 3648 3649Many of the special characters in a string must be prefixed by `\.\\' so that 3650\TeX\ will print them properly. 3651@^special string characters@> 3652 3653@<Append a \(string scrap@>= 3654begin app("\"); 3655if next_control=verbatim then 3656 begin app("="); 3657@.\\=@> 3658 end 3659else begin app("."); 3660@.\\.@> 3661 end; 3662app("{"); j:=id_first; 3663while j<id_loc do 3664 begin case buffer[j] of 3665 " ","\","#","%","$","^","'","`","{","}","~","&","_": 3666 begin app("\"); 3667 end; 3668@.\\\ @> 3669@.\\\\@> 3670@.\\\#@> 3671@.\\\%@> 3672@.\\\$@> 3673@.\\\^@> 3674@.\\\'@> 3675@.\\\`@> 3676@.\\\{@> 3677@.\\\}@> 3678@.\\\~@> 3679@.\\\&@> 3680@.\\_@> 3681 "@@": if buffer[j+1]="@@" then incr(j) 3682 else err_print('! Double @@ should be used in strings'); 3683@.Double \AT! should be used...@> 3684 othercases do_nothing 3685 endcases;@/ 3686 app_tok(buffer[j]); incr(j); 3687 end; 3688sc1("}")(simp); 3689end 3690 3691@ @<Append a \TeX\ string scrap@>= 3692begin app("\"); app("h"); app("b"); app("o"); app("x"); 3693app("{"); 3694for j:=id_first to id_loc-1 do app_tok(buffer[j]); 3695sc1("}")(simp); 3696end 3697 3698@ @<Append an identifier scrap@>= 3699begin p:=id_lookup(normal); 3700case ilk[p] of 3701normal,array_like,const_like,div_like, 3702 do_like,for_like,goto_like,nil_like,to_like: sub_cases(p); 3703@t\4@>@<Cases that generate more than one scrap@>@; 3704othercases begin next_control:=ilk[p]-char_like; goto reswitch; 3705 end {\&{and}, \&{in}, \&{not}, \&{or}} 3706endcases; 3707end 3708 3709@ The |sub_cases| also result in straightforward scraps. 3710 3711@<Declaration of the |sub_cases| procedure@>= 3712procedure sub_cases(@!p:name_pointer); {a subprocedure of |Pascal_parse|} 3713begin case ilk[p] of 3714normal: sc1(id_flag+p)(simp); {not a reserved word} 3715array_like: sc1(res_flag+p)(alpha); {\&{array}, \&{file}, \&{set}} 3716const_like: sc3(force)(backup)(res_flag+p)(intro); 3717 {\&{const}, \&{label}, \&{type}} 3718div_like: sc3(math_bin)(res_flag+p)("}")(math); {\&{div}, \&{mod}} 3719do_like: sc1(res_flag+p)(omega); {\&{do}, \&{of}, \&{then}} 3720for_like: sc2(force)(res_flag+p)(alpha); {\&{for}, \&{while}, \&{with}} 3721goto_like: sc1(res_flag+p)(intro); {\&{goto}, \&{packed}} 3722nil_like: sc1(res_flag+p)(simp); {\&{nil}} 3723to_like: sc3(math_rel)(res_flag+p)("}")(math); {\&{downto}, \&{to}} 3724end; 3725end; 3726 3727@ @<Cases that generate more than one scrap@>= 3728begin_like: begin sc3(force)(res_flag+p)(cancel)(beginning); sc0(intro); 3729 end; {\&{begin}} 3730case_like: begin sc0(casey); sc2(force)(res_flag+p)(alpha); 3731 end; {\&{case}} 3732else_like: begin @<Append |terminator| if not already present@>; 3733 sc3(force)(backup)(res_flag+p)(elsie); 3734 end; {\&{else}} 3735end_like: begin @<Append |term...@>; 3736 sc2(force)(res_flag+p)(close); 3737 end; {\&{end}} 3738if_like: begin sc0(cond); sc2(force)(res_flag+p)(alpha); 3739 end; {\&{if}} 3740loop_like: begin sc3(force)("\")("~")(alpha); 3741@.\\\~@> 3742 sc1(res_flag+p)(omega); 3743 end; {\&{xclause}} 3744proc_like: begin sc4(force)(backup)(res_flag+p)(cancel)(proc); 3745 sc3(indent)("\")(" ")(intro); 3746@.\\\ @> 3747 end; {\&{function}, \&{procedure}, \&{program}} 3748record_like: begin sc1(res_flag+p)(record_head); sc0(intro); 3749 end; {\&{record}} 3750repeat_like: begin sc4(force)(indent)(res_flag+p)(cancel)(beginning); 3751 sc0(intro); 3752 end; {\&{repeat}} 3753until_like: begin @<Append |term...@>; 3754 sc3(force)(backup)(res_flag+p)(close); sc0(clause); 3755 end; {\&{until}} 3756var_like: begin sc4(force)(backup)(res_flag+p)(cancel)(var_head); sc0(intro); 3757 end; {\&{var}} 3758 3759@ If a comment or semicolon appears before the reserved words \&{end}, 3760\&{else}, or \&{until}, the |semi| or |terminator| scrap that is already 3761present overrides the |terminator| scrap belonging to this reserved word. 3762 3763@<Append |termin...@>= 3764if (scrap_ptr<scrap_base)or((cat[scrap_ptr]<>terminator)and 3765 (cat[scrap_ptr]<>semi)) then sc0(terminator) 3766 3767@ A comment is incorporated into the previous scrap if that scrap is of type 3768|omega| or |semi| or |terminator|. (These three categories have consecutive 3769category codes.) Otherwise the comment is entered as a separate scrap 3770of type |terminator|, and it will combine with a |terminator| scrap that 3771immediately follows~it. 3772 3773The |app_comment| procedure takes care of placing a comment at the end of the 3774current scrap list. When |app_comment| is called, we assume that the current 3775token list is the translation of the comment involved. 3776 3777@<Declaration of the |app_comment|...@>= 3778procedure app_comment; {append a comment to the scrap list} 3779begin freeze_text; 3780if (scrap_ptr<scrap_base)or(cat[scrap_ptr]<omega)or 3781 (cat[scrap_ptr]>terminator) then sc0(terminator) 3782else begin app1(scrap_ptr); {|cat[scrap_ptr]| is 3783 |omega| or |semi| or |terminator|} 3784 end; 3785app(text_ptr-1+tok_flag); trans[scrap_ptr]:=text_ptr; freeze_text; 3786end; 3787 3788@ We are now finished with |Pascal_parse|, except for two relatively 3789trivial subprocedures that convert constants into tokens. 3790 3791@<Declaration of the |app_octal| and...@>= 3792procedure app_octal; 3793begin app("\"); app("O"); app("{"); 3794@.\\O@> 3795while (buffer[loc]>="0")and(buffer[loc]<="7") do 3796 begin app_tok(buffer[loc]); incr(loc); 3797 end; 3798sc1("}")(simp); 3799end; 3800@# 3801procedure app_hex; 3802begin app("\"); app("H"); app("{"); 3803@.\\H@> 3804while ((buffer[loc]>="0")and(buffer[loc]<="9"))or@| 3805 ((buffer[loc]>="A")and(buffer[loc]<="F")) do 3806 begin app_tok(buffer[loc]); incr(loc); 3807 end; 3808sc1("}")(simp); 3809end; 3810 3811 3812@ When the `\v' that introduces \PASCAL\ text is sensed, a call on 3813|Pascal_translate| will return a pointer to the \TeX\ translation of 3814that text. If scraps exist in the |cat| and |trans| arrays, they are 3815unaffected by this translation process. 3816 3817@p function Pascal_translate: text_pointer; 3818var p:text_pointer; {points to the translation} 3819@!save_base:0..max_scraps; {holds original value of |scrap_base|} 3820begin save_base:=scrap_base; scrap_base:=scrap_ptr+1; 3821Pascal_parse; {get the scraps together} 3822if next_control<>"|" then err_print('! Missing "|" after Pascal text'); 3823@.Missing "|"...@> 3824app_tok(cancel); app_comment; {place a |cancel| token as a final ``comment''} 3825p:=translate; {make the translation} 3826stat if scrap_ptr>max_scr_ptr then max_scr_ptr:=scrap_ptr;@;@+tats@;@/ 3827scrap_ptr:=scrap_base-1; scrap_base:=save_base; {scrap the scraps} 3828Pascal_translate:=p; 3829end; 3830 3831@ The |outer_parse| routine is to |Pascal_parse| as |outer_xref| 3832is to |Pascal_xref|: It constructs a sequence of scraps for \PASCAL\ text 3833until |next_control>=format|. Thus, it takes care of embedded comments. 3834 3835@p procedure outer_parse; {makes scraps from \PASCAL\ tokens and comments} 3836var bal:eight_bits; {brace level in comment} 3837@!p,@!q:text_pointer; {partial comments} 3838begin while next_control<format do 3839 if next_control<>"{" then Pascal_parse 3840 else begin @<Make sure that there is room for at least seven more 3841 tokens, three more texts, and one more scrap@>; 3842 app("\"); app("C"); app("{"); 3843@.\\C@> 3844 bal:=copy_comment(1); next_control:="|"; 3845 while bal>0 do 3846 begin p:=text_ptr; freeze_text; q:=Pascal_translate; 3847 {at this point we have |tok_ptr+6<=max_toks|} 3848 app(tok_flag+p); app(inner_tok_flag+q); 3849 if next_control="|" then bal:=copy_comment(bal) 3850 else bal:=0; {an error has been reported} 3851 end; 3852 app(force); app_comment; {the full comment becomes a scrap} 3853 end; 3854end; 3855 3856@ @<Make sure that there is room for at least seven more...@>= 3857if (tok_ptr+7>max_toks)or(text_ptr+3>max_texts)or(scrap_ptr>=max_scraps) then 3858 begin stat if scrap_ptr>max_scr_ptr then max_scr_ptr:=scrap_ptr; 3859 if tok_ptr>max_tok_ptr then max_tok_ptr:=tok_ptr; 3860 if text_ptr>max_txt_ptr then max_txt_ptr:=text_ptr; 3861 tats@;@/ 3862 overflow('token/text/scrap'); 3863 end 3864 3865@* Output of tokens. 3866So far our programs have only built up multi-layered token lists in 3867\.{WEAVE}'s internal memory; we have to figure out how to get them into 3868the desired final form. The job of converting token lists to characters in 3869the \TeX\ output file is not difficult, although it is an implicitly 3870recursive process. Four main considerations had to be kept in mind when 3871this part of \.{WEAVE} was designed. (a) There are two modes of output: 3872|outer| mode, which translates tokens like |force| into line-breaking 3873control sequences, and |inner| mode, which ignores them except that blank 3874spaces take the place of line breaks. (b) The |cancel| instruction applies 3875to adjacent token or tokens that are output, and this cuts across levels 3876of recursion since `|cancel|' occurs at the beginning or end of a token 3877list on one level. (c) The \TeX\ output file will be semi-readable if line 3878breaks are inserted after the result of tokens like |break_space| and 3879|force|. (d) The final line break should be suppressed, and there should 3880be no |force| token output immediately after `\.{\\Y\\P}'. 3881 3882@ The output process uses a stack to keep track of what is going on at 3883different ``levels'' as the token lists are being written out. Entries on 3884this stack have three parts: 3885 3886\yskip\hang |end_field| is the |tok_mem| location where the token list of a 3887particular level will end; 3888 3889\yskip\hang |tok_field| is the |tok_mem| location from which the next token 3890on a particular level will be read; 3891 3892\yskip\hang |mode_field| is the current mode, either |inner| or |outer|. 3893 3894\yskip\noindent The current values of these quantities are referred to 3895quite frequently, so they are stored in a separate place instead of in the 3896|stack| array. We call the current values |cur_end|, |cur_tok|, and 3897|cur_mode|. 3898 3899The global variable |stack_ptr| tells how many levels of output are 3900currently in progress. The end of output occurs when an |end_translation| 3901token is found, so the stack is never empty except when we first begin the 3902output process. 3903 3904@d inner=0 {value of |mode| for \PASCAL\ texts within \TeX\ texts} 3905@d outer=1 {value of |mode| for \PASCAL\ texts in modules} 3906 3907@<Types...@>= 3908@!mode=inner..outer;@/ 3909@!output_state=record@!end_field:sixteen_bits; {ending location of token list} 3910 @!tok_field:sixteen_bits; {present location within token list} 3911 @!mode_field:mode; {interpretation of control tokens} 3912 end; 3913 3914@ @d cur_end==cur_state.end_field {current ending location in |tok_mem|} 3915@d cur_tok==cur_state.tok_field {location of next output token in |tok_mem|} 3916@d cur_mode==cur_state.mode_field {current mode of interpretation} 3917@d init_stack==stack_ptr:=0;cur_mode:=outer {do this to initialize the stack} 3918 3919@<Glob...@>= 3920@!cur_state:output_state; {|cur_end|, |cur_tok|, |cur_mode|} 3921@!stack:array[1..stack_size] of output_state; {info for non-current levels} 3922@!stack_ptr:0..stack_size; {first unused location in the output state stack} 3923stat@!max_stack_ptr:0..stack_size; {largest value assumed by |stack_ptr|} 3924tats 3925 3926@ @<Set init...@>=stat max_stack_ptr:=0;@+tats 3927 3928@ To insert token-list |p| into the output, the |push_level| subroutine 3929is called; it saves the old level of output and gets a new one going. 3930The value of |cur_mode| is not changed. 3931 3932@p procedure push_level(@!p:text_pointer); {suspends the current level} 3933begin if stack_ptr=stack_size then overflow('stack') 3934else begin if stack_ptr>0 then 3935 stack[stack_ptr]:=cur_state; {save |cur_end|$\,\ldots\,$|cur_mode|} 3936 incr(stack_ptr); 3937 stat if stack_ptr>max_stack_ptr then 3938 max_stack_ptr:=stack_ptr;@;@+tats@;@/ 3939 cur_tok:=tok_start[p]; cur_end:=tok_start[p+1]; 3940 end; 3941end; 3942 3943@ Conversely, the |pop_level| routine restores the conditions that were in 3944force when the current level was begun. This subroutine will never be 3945called when |stack_ptr=1|. It is so simple, we declare it as a macro: 3946 3947@d pop_level==begin decr(stack_ptr); cur_state:=stack[stack_ptr]; 3948 end {do this when |cur_tok| reaches |cur_end|} 3949 3950@ The |get_output| function returns the next byte of output that is not a 3951reference to a token list. It returns the values |identifier| or |res_word| 3952or |mod_name| if the next token is to be an identifier (typeset in 3953italics), a reserved word (typeset in boldface) or a module name (typeset 3954by a complex routine that might generate additional levels of output). 3955In these cases |cur_name| points to the identifier or module name in 3956question. 3957 3958@d res_word=@'201 {returned by |get_output| for reserved words} 3959@d mod_name=@'200 {returned by |get_output| for module names} 3960 3961@p function get_output:eight_bits; {returns the next token of output} 3962label restart; 3963var a:sixteen_bits; {current item read from |tok_mem|} 3964begin restart: while cur_tok=cur_end do pop_level; 3965a:=tok_mem[cur_tok]; incr(cur_tok); 3966if a>=@'400 then 3967 begin cur_name:=a mod id_flag; 3968 case a div id_flag of 3969 2: a:=res_word; {|a=res_flag+cur_name|} 3970 3: a:=mod_name; {|a=mod_flag+cur_name|} 3971 4: begin push_level(cur_name); goto restart; 3972 end; {|a=tok_flag+cur_name|} 3973 5: begin push_level(cur_name); cur_mode:=inner; goto restart; 3974 end; {|a=inner_tok_flag+cur_name|} 3975 othercases a:=identifier {|a=id_flag+cur_name|} 3976 endcases; 3977 end; 3978@!debug if trouble_shooting then debug_help; @+ gubed@/ 3979get_output:=a; 3980end; 3981 3982@ The real work associated with token output is done by |make_output|. 3983This procedure appends an |end_translation| token to the current token list, 3984and then it repeatedly calls |get_output| and feeds characters to the output 3985buffer until reaching the |end_translation| sentinel. It is possible for 3986|make_output| to 3987be called recursively, since a module name may include embedded \PASCAL\ 3988text; however, the depth of recursion never exceeds one level, since 3989module names cannot be inside of module names. 3990 3991A procedure called |output_Pascal| does the scanning, translation, and 3992output of \PASCAL\ text within `\pb' brackets, and this procedure uses 3993|make_output| to output the current token list. Thus, the recursive call 3994of |make_output| actually occurs when |make_output| calls |output_Pascal| 3995while outputting the name of a module. 3996@^recursion@> 3997 3998@p procedure make_output; forward; @t\2@>@# 3999procedure output_Pascal; {outputs the current token list} 4000var save_tok_ptr,@!save_text_ptr,@!save_next_control:sixteen_bits; 4001 {values to be restored} 4002p:text_pointer; {translation of the \PASCAL\ text} 4003begin save_tok_ptr:=tok_ptr; save_text_ptr:=text_ptr; 4004save_next_control:=next_control; next_control:="|"; p:=Pascal_translate; 4005app(p+inner_tok_flag); 4006make_output; {output the list} 4007stat if text_ptr>max_txt_ptr then max_txt_ptr:=text_ptr; 4008if tok_ptr>max_tok_ptr then max_tok_ptr:=tok_ptr;@;@+tats@;@/ 4009text_ptr:=save_text_ptr; tok_ptr:=save_tok_ptr; {forget the tokens} 4010next_control:=save_next_control; {restore |next_control| to original state} 4011end; 4012 4013@ Here is \.{WEAVE}'s major output handler. 4014 4015@p procedure make_output; {outputs the equivalents of tokens} 4016label reswitch,exit,found; 4017var a:eight_bits; {current output byte} 4018@!b:eight_bits; {next output byte} 4019@!k,@!k_limit:0..max_bytes; {indices into |byte_mem|} 4020@!w:0..ww-1; {row of |byte_mem|} 4021@!j:0..long_buf_size; {index into |buffer|} 4022@!string_delimiter:ASCII_code; {first and last character of 4023 string being copied} 4024@!save_loc,@!save_limit:0..long_buf_size; {|loc| and |limit| to be restored} 4025@!cur_mod_name:name_pointer; {name of module being output} 4026@!save_mode:mode; {value of |cur_mode| before a sequence of breaks} 4027begin app(end_translation); {append a sentinel} 4028freeze_text; push_level(text_ptr-1); 4029loop@+ begin a:=get_output; 4030reswitch: case a of 4031 end_translation: return; 4032 identifier,res_word:@<Output an identifier@>; 4033 mod_name:@<Output a module name@>; 4034 math_bin,math_op,math_rel:@<Output a \.{\\math} operator@>; 4035 cancel: begin repeat a:=get_output; 4036 until (a<backup)or(a>big_force); 4037 goto reswitch; 4038 end; 4039 big_cancel: begin repeat a:=get_output; 4040 until ((a<backup)and(a<>" "))or(a>big_force); 4041 goto reswitch; 4042 end; 4043 indent,outdent,opt,backup,break_space,force,big_force:@<Output a 4044 \(control, look ahead in case of line breaks, 4045 possibly |goto reswitch|@>; 4046 othercases out(a) {otherwise |a| is an ASCII character} 4047 endcases; 4048 end; 4049exit:end; 4050 4051@ An identifier of length one does not have to be enclosed in braces, and it 4052looks slightly better if set in a math-italic font instead of a (slightly 4053narrower) text-italic font. Thus we output `\.{\\\char'174a}' but 4054`\.{\\\\\{aa\}}'. 4055 4056@<Output an identifier@>= 4057begin out("\"); 4058if a=identifier then 4059 if length(cur_name)=1 then out("|") 4060@.\\|@> 4061 else out("\") 4062@.\\\\@> 4063else out("&"); {|a=res_word|} 4064@.\\\&@> 4065if length(cur_name)=1 then out(byte_mem[cur_name mod ww,byte_start[cur_name]]) 4066else out_name(cur_name); 4067end 4068 4069@ @<Output a \....@>= 4070begin out5("\")("m")("a")("t")("h"); 4071if a=math_bin then out3("b")("i")("n") 4072else if a=math_rel then out3("r")("e")("l") 4073else out2("o")("p"); 4074out("{"); 4075end 4076 4077@ The current mode does not affect the behavior of \.{WEAVE}'s output routine 4078except when we are outputting control tokens. 4079 4080@<Output a \(control...@>= 4081if a<break_space then 4082 begin if cur_mode=outer then 4083 begin out2("\")(a-cancel+"0"); 4084@.\\1@> 4085@.\\2@> 4086@.\\3@> 4087@.\\4@> 4088@.\\5@> 4089@.\\6@> 4090@.\\7@> 4091 if a=opt then out(get_output) {|opt| is followed by a digit} 4092 end 4093 else if a=opt then b:=get_output {ignore digit following |opt|} 4094 end 4095else @<Look ahead for strongest line break, |goto reswitch|@> 4096 4097@ If several of the tokens |break_space|, |force|, |big_force| occur in a 4098row, possibly mixed with blank spaces (which are ignored), 4099the largest one is used. A line break also occurs in the output file, 4100except at the very end of the translation. The very first line break 4101is suppressed (i.e., a line break that follows `\.{\\Y\\P}'). 4102 4103@<Look ahead for st...@>= 4104begin b:=a; save_mode:=cur_mode; 4105loop@+ begin a:=get_output; 4106 if (a=cancel)or(a=big_cancel) then goto reswitch; 4107 {|cancel| overrides everything} 4108 if ((a<>" ")and(a<break_space))or(a>big_force) then 4109 begin if save_mode=outer then 4110 begin if out_ptr>3 then 4111 if (out_buf[out_ptr]="P")and 4112 (out_buf[out_ptr-1]="\")and 4113@.\\P@> 4114@.\\Y@> 4115 (out_buf[out_ptr-2]="Y")and 4116 (out_buf[out_ptr-3]="\") then 4117 goto reswitch; 4118@.\\1@> 4119@.\\2@> 4120@.\\3@> 4121@.\\4@> 4122@.\\5@> 4123@.\\6@> 4124@.\\7@> 4125 out2("\")(b-cancel+"0"); 4126 if a<>end_translation then finish_line; 4127 end 4128 else if (a<>end_translation)and(cur_mode=inner) then out(" "); 4129 goto reswitch; 4130 end; 4131 if a>b then b:=a; {if |a=" "| we have |a<b|} 4132 end; 4133end 4134 4135@ The remaining part of |make_output| is somewhat more complicated. When we 4136output a module name, we may need to enter the parsing and translation 4137routines, since the name may contain \PASCAL\ code embedded in 4138\pb\ constructions. This \PASCAL\ code is placed at the end of the active 4139input buffer and the translation process uses the end of the active 4140|tok_mem| area. 4141 4142@<Output a module name@>= 4143begin out2("\")("X"); 4144@.\\X@> 4145cur_xref:=xref[cur_name]; 4146if num(cur_xref)>=def_flag then 4147 begin out_mod(num(cur_xref)-def_flag); 4148 if phase_three then 4149 begin cur_xref:=xlink(cur_xref); 4150 while num(cur_xref)>=def_flag do 4151 begin out2(",")(" "); 4152 out_mod(num(cur_xref)-def_flag); 4153 cur_xref:=xlink(cur_xref); 4154 end; 4155 end; 4156 end 4157else out("0"); {output the module number, or zero if it was undefined} 4158out(":"); @<Output the text of the module name@>; 4159out2("\")("X"); 4160end 4161 4162@ @<Output the text...@>= 4163k:=byte_start[cur_name]; w:=cur_name mod ww; k_limit:=byte_start[cur_name+ww]; 4164cur_mod_name:=cur_name; 4165while k<k_limit do 4166 begin b:=byte_mem[w,k]; incr(k); 4167 if b="@@" then @<Skip next character, give error if not `\.{@@}'@>; 4168 if b<>"|" then out(b) 4169 else begin @<Copy the \PASCAL\ text into |buffer[(limit+1)..j]|@>; 4170 save_loc:=loc; save_limit:=limit; loc:=limit+2; limit:=j+1; 4171 buffer[limit]:="|"; output_Pascal; 4172 loc:=save_loc; limit:=save_limit; 4173 end; 4174 end 4175 4176@ @<Skip next char...@>= 4177begin if byte_mem[w,k]<>"@@" then 4178 begin print_nl('! Illegal control code in section name:'); 4179@.Illegal control code...@> 4180 print_nl('<'); print_id(cur_mod_name); print('> '); mark_error; 4181 end; 4182incr(k); 4183end 4184 4185@ The \PASCAL\ text enclosed in \pb\ should not contain `\v' characters, 4186except within strings. We put a `\v' at the front of the buffer, so that an 4187error message that displays the whole buffer will look a little bit sensible. 4188The variable |string_delimiter| is zero outside of strings, otherwise it 4189equals the delimiter that began the string being copied. 4190 4191@<Copy the \PASCAL\ text into...@>= 4192j:=limit+1; buffer[j]:="|"; string_delimiter:=0; 4193loop@+ begin if k>=k_limit then 4194 begin print_nl('! Pascal text in section name didn''t end:'); 4195@.Pascal text...didn't end@> 4196 print_nl('<'); print_id(cur_mod_name); print('> '); mark_error; 4197 goto found; 4198 end; 4199 b:=byte_mem[w,k]; incr(k); 4200 if b="@@" then @<Copy a control code into the buffer@> 4201 else begin if (b="""")or(b="'") then 4202 if string_delimiter=0 then string_delimiter:=b 4203 else if string_delimiter=b then string_delimiter:=0; 4204 if (b<>"|")or(string_delimiter<>0) then 4205 begin if j>long_buf_size-3 then overflow('buffer'); 4206 incr(j); buffer[j]:=b; 4207 end 4208 else goto found; 4209 end; 4210 end; 4211found: 4212 4213@ @<Copy a control code into the buffer@>= 4214begin if j>long_buf_size-4 then overflow('buffer'); 4215buffer[j+1]:="@@"; buffer[j+2]:=byte_mem[w,k]; j:=j+2; incr(k); 4216end 4217 4218@* Phase two processing. 4219We have assembled enough pieces of the puzzle in order to be ready to specify 4220the processing in \.{WEAVE}'s main pass over the source file. Phase two 4221is analogous to phase one, except that more work is involved because we must 4222actually output the \TeX\ material instead of merely looking at the 4223\.{WEB} specifications. 4224 4225@<Phase II: Read all the text again and translate it to \TeX\ form@>= 4226reset_input; print_nl('Writing the output file...'); 4227module_count:=0; 4228copy_limbo; 4229finish_line; flush_buffer(0,false,false); {insert a blank line, it looks nice} 4230while not input_has_ended do @<Translate the \(current module@> 4231 4232@ The output file will contain the control sequence \.{\\Y} between non-null 4233sections of a module, e.g., between the \TeX\ and definition parts if both 4234are nonempty. This puts a little white space between the parts when they are 4235printed. However, we don't want \.{\\Y} to occur between two definitions 4236within a single module. The variables |out_line| or |out_ptr| will 4237change if a section is non-null, so the following macros `|save_position|' 4238and `|emit_space_if_needed|' are able to handle the situation: 4239 4240@d save_position==save_line:=out_line; save_place:=out_ptr 4241@d emit_space_if_needed==if (save_line<>out_line)or(save_place<>out_ptr) then 4242 out2("\")("Y") 4243@.\\Y@> 4244 4245@<Glo...@>= 4246@!save_line:integer; {former value of |out_line|} 4247@!save_place:sixteen_bits; {former value of |out_ptr|} 4248 4249@ @<Translate the \(current module@>= 4250begin incr(module_count);@/ 4251@<Output the code for the beginning of a new module@>; 4252save_position;@/ 4253@<Translate the \TeX\ part of the current module@>; 4254@<Translate the \(definition part of the current module@>; 4255@<Translate the \PASCAL\ part of the current module@>; 4256@<Show cross references to this module@>; 4257@<Output the code for the end of a module@>; 4258end 4259 4260@ Modules beginning with the \.{WEB} control sequence `\.{@@\ }' start in the 4261output with the \TeX\ control sequence `\.{\\M}', followed by the module 4262number. Similarly, `\.{@@*}' modules lead to the control sequence `\.{\\N}'. 4263If this is a changed module, we put \.{*} just before the module number. 4264 4265@<Output the code for the beginning...@>= 4266out("\"); 4267if buffer[loc-1]<>"*" then out("M") 4268@.\\M@> 4269else begin out("N"); print('*',module_count:1); 4270@.\\N@> 4271 update_terminal; {print a progress report} 4272 end; 4273out_mod(module_count); out2(".")(" ") 4274 4275@ In the \TeX\ part of a module, we simply copy the source text, except that 4276index entries are not copied and \PASCAL\ text within \pb\ is translated. 4277 4278@<Translate the \T...@>= 4279repeat next_control:=copy_TeX; 4280case next_control of 4281"|": begin init_stack; output_Pascal; 4282 end; 4283"@@": out("@@"); 4284octal: @<Translate an octal constant appearing in \TeX\ text@>; 4285hex: @<Translate a hexadecimal constant appearing in \TeX\ text@>; 4286TeX_string,xref_roman,xref_wildcard,xref_typewriter,module_name: 4287 begin loc:=loc-2; next_control:=get_next; {skip to \.{@@>}} 4288 if next_control=TeX_string then 4289 err_print('! TeX string should be in Pascal text only'); 4290@.TeX string should be...@> 4291 end; 4292begin_comment,end_comment,check_sum,thin_space,math_break,line_break, 4293 big_line_break,no_line_break,join,pseudo_semi: 4294 err_print('! You can''t do that in TeX text'); 4295@.You can't do that...@> 4296othercases do_nothing 4297endcases; 4298until next_control>=format 4299 4300@ @<Translate an octal constant appearing in \TeX\ text@>= 4301begin out3("\")("O")("{"); 4302@.\\O@> 4303while (buffer[loc]>="0")and(buffer[loc]<="7") do 4304 begin out(buffer[loc]); incr(loc); 4305 end; {since |buffer[limit]=" "|, this loop will end} 4306out("}"); 4307end 4308 4309@ @<Translate a hexadecimal constant appearing in \TeX\ text@>= 4310begin out3("\")("H")("{"); 4311@.\\H@> 4312while ((buffer[loc]>="0")and(buffer[loc]<="9"))or@| 4313 ((buffer[loc]>="A")and(buffer[loc]<="F")) do 4314 begin out(buffer[loc]); incr(loc); 4315 end; 4316out("}"); 4317end 4318@ When we get to the following code we have |next_control>=format|, and 4319the token memory is in its initial empty state. 4320 4321@<Translate the \(d...@>= 4322if next_control<=definition then {definition part non-empty} 4323 begin emit_space_if_needed; save_position; 4324 end; 4325while next_control<=definition do {|format| or |definition|} 4326 begin init_stack; 4327 if next_control=definition then @<Start a macro definition@> 4328 else @<Start a format definition@>; 4329 outer_parse; finish_Pascal; 4330 end 4331 4332@ The |finish_Pascal| procedure outputs the translation of the current 4333scraps, preceded by the control sequence `\.{\\P}' and followed by the 4334control sequence `\.{\\par}'. It also restores the token and scrap 4335memories to their initial empty state. 4336 4337A |force| token is appended to the current scraps before translation 4338takes place, so that the translation will normally end with \.{\\6} or 4339\.{\\7} (the \TeX\ macros for |force| and |big_force|). This \.{\\6} or 4340\.{\\7} is replaced by the concluding \.{\\par} or by \.{\\Y\\par}. 4341 4342@p procedure finish_Pascal; {finishes a definition or a \PASCAL\ part} 4343var p:text_pointer; {translation of the scraps} 4344begin out2("\")("P"); app_tok(force); app_comment; p:=translate; 4345@.\\P@> 4346app(p+tok_flag); make_output; {output the list} 4347if out_ptr>1 then 4348 if out_buf[out_ptr-1]="\" then 4349@.\\6@> 4350@.\\7@> 4351@.\\Y@> 4352 if out_buf[out_ptr]="6" then out_ptr:=out_ptr-2 4353 else if out_buf[out_ptr]="7" then out_buf[out_ptr]:="Y"; 4354out4("\")("p")("a")("r"); finish_line; 4355stat if text_ptr>max_txt_ptr then max_txt_ptr:=text_ptr; 4356if tok_ptr>max_tok_ptr then max_tok_ptr:=tok_ptr; 4357if scrap_ptr>max_scr_ptr then max_scr_ptr:=scrap_ptr; 4358tats@;@/ 4359tok_ptr:=1; text_ptr:=1; scrap_ptr:=0; {forget the tokens and the scraps} 4360end; 4361 4362@ @<Start a macro...@>= 4363begin sc2("\")("D")(intro); {this will produce `\&{define }'} 4364@.\\D@> 4365next_control:=get_next; 4366if next_control<>identifier then err_print('! Improper macro definition') 4367@.Improper macro definition@> 4368else sc1(id_flag+id_lookup(normal))(math); 4369next_control:=get_next; 4370end 4371 4372@ @<Start a format...@>= 4373begin sc2("\")("F")(intro); {this will produce `\&{format }'} 4374@.\\F@> 4375next_control:=get_next; 4376if next_control=identifier then 4377 begin sc1(id_flag+id_lookup(normal))(math); 4378 next_control:=get_next; 4379 if next_control=equivalence_sign then 4380 begin sc2("\")("S")(math); {output an equivalence sign} 4381@.\\S@> 4382 next_control:=get_next; 4383 if next_control=identifier then 4384 begin sc1(id_flag+id_lookup(normal))(math); 4385 sc0(semi); {insert an invisible semicolon} 4386 next_control:=get_next; 4387 end; 4388 end; 4389 end; 4390if scrap_ptr<>5 then err_print('! Improper format definition'); 4391@.Improper format definition@> 4392end 4393 4394@ Finally, when the \TeX\ and definition parts have been treated, we have 4395|next_control>=begin_Pascal|. We will make the global variable |this_module| 4396point to the current module name, if it has a name. 4397 4398@<Glob...@>=@!this_module:name_pointer; {the current module name, or zero} 4399 4400@ @<Translate the \P...@>= 4401this_module:=0; 4402if next_control<=module_name then 4403 begin emit_space_if_needed; init_stack; 4404 if next_control=begin_Pascal then next_control:=get_next 4405 else begin this_module:=cur_module; 4406 @<Check that |=| or |==| follows this module name, and 4407 emit the scraps to start the module definition@>; 4408 end; 4409 while next_control<=module_name do 4410 begin outer_parse; 4411 @<Emit the scrap for a module name if present@>; 4412 end; 4413 finish_Pascal; 4414 end 4415 4416@ @<Check that |=|...@>= 4417repeat next_control:=get_next; 4418until next_control<>"+"; {allow optional `\.{+=}'} 4419if (next_control<>"=")and(next_control<>equivalence_sign) then 4420 err_print('! You need an = sign after the section name') 4421@.You need an = sign...@> 4422else next_control:=get_next; 4423if out_ptr>1 then 4424 if (out_buf[out_ptr]="Y")and(out_buf[out_ptr-1]="\") then 4425@.\\Y@> 4426 begin app(backup); {the module name will be flush left} 4427 end; 4428sc1(mod_flag+this_module)(mod_scrap); 4429cur_xref:=xref[this_module]; 4430if num(cur_xref)<>module_count+def_flag then 4431 begin sc3(math_rel)("+")("}")(math); 4432 {module name is multiply defined} 4433 this_module:=0; {so we won't give cross-reference info here} 4434 end; 4435sc2("\")("S")(math); {output an equivalence sign} 4436@.\\S@> 4437sc1(force)(semi); {this forces a line break unless `\.{@@+}' follows} 4438 4439@ @<Emit the scrap...@>= 4440if next_control<module_name then 4441 begin err_print('! You can''t do that in Pascal text'); 4442@.You can't do that...@> 4443 next_control:=get_next; 4444 end 4445else if next_control=module_name then 4446 begin sc1(mod_flag+cur_module)(mod_scrap); next_control:=get_next; 4447 end 4448 4449@ Cross references relating to a named module are given after the module ends. 4450 4451@<Show cross...@>= 4452if this_module>0 then 4453 begin @<Rearrange the list pointed to by |cur_xref|@>; 4454 footnote(def_flag); footnote(0); 4455 end 4456 4457@ To rearrange the order of the linked list of cross references, we need 4458four more variables that point to cross reference entries. We'll end up 4459with a list pointed to by |cur_xref|. 4460 4461@<Glob...@>= 4462@!next_xref,@!this_xref,@!first_xref,@!mid_xref:xref_number; 4463 {pointer variables for rearranging a list} 4464 4465@ We want to rearrange the cross reference list so that all the entries with 4466|def_flag| come first, in ascending order; then come all the other 4467entries, in ascending order. There may be no entries in either one or both 4468of these categories. 4469 4470@<Rearrange the list...@>= 4471first_xref:=xref[this_module]; 4472this_xref:=xlink(first_xref); {bypass current module number} 4473if num(this_xref)>def_flag then 4474 begin mid_xref:=this_xref; cur_xref:=0; {this value doesn't matter} 4475 repeat next_xref:=xlink(this_xref); xlink(this_xref):=cur_xref; 4476 cur_xref:=this_xref; this_xref:=next_xref; 4477 until num(this_xref)<=def_flag; 4478 xlink(first_xref):=cur_xref; 4479 end 4480else mid_xref:=0; {first list null} 4481cur_xref:=0; 4482while this_xref<>0 do 4483 begin next_xref:=xlink(this_xref); xlink(this_xref):=cur_xref; 4484 cur_xref:=this_xref; this_xref:=next_xref; 4485 end; 4486if mid_xref>0 then xlink(mid_xref):=cur_xref 4487else xlink(first_xref):=cur_xref; 4488cur_xref:=xlink(first_xref) 4489 4490@ The |footnote| procedure gives cross reference information about 4491multiply defined module names (if the |flag| parameter is |def_flag|), or about 4492the uses of a module name (if the |flag| parameter is zero). It assumes that 4493|cur_xref| points to the first cross-reference entry of interest, and it 4494leaves |cur_xref| pointing to the first element not printed. Typical outputs: 4495`\.{\\A101.}'; `\.{\\Us370\\ET1009.}'; `\.{\\As8, 27\\*, 51\\ETs64.}'. 4496 4497@p procedure footnote(@!flag:sixteen_bits); {outputs module cross-references} 4498label done,exit; 4499var q:xref_number; {cross-reference pointer variable} 4500begin if num(cur_xref)<=flag then return; 4501finish_line; out("\"); 4502@.\\A@> 4503@.\\U@> 4504if flag=0 then out("U")@+else out("A"); 4505@<Output all the module numbers on the reference list |cur_xref|@>; 4506out("."); 4507exit:end; 4508 4509@ The following code distinguishes three cases, according as the number 4510of cross references is one, two, or more than two. Variable |q| points 4511to the first cross reference, and the last link is a zero. 4512 4513@<Output all the module numbers...@>= 4514q:=cur_xref; if num(xlink(q))>flag then out("s"); {plural} 4515@.\\As@> 4516@.\\Us@> 4517loop@+ begin out_mod(num(cur_xref)-flag); 4518 cur_xref:=xlink(cur_xref); {point to the next cross reference to output} 4519 if num(cur_xref)<=flag then goto done; 4520 if num(xlink(cur_xref))>flag then out2(",")(" ") {not the last} 4521 else begin out3("\")("E")("T"); {the last} 4522@.\\ET@> 4523 if cur_xref<>xlink(q) then out("s"); {the last of more than two} 4524@.\\ETs@> 4525 end; 4526 end; 4527done: 4528 4529@ @<Output the code for the end of a module@>= 4530out3("\")("f")("i"); finish_line; 4531flush_buffer(0,false,false); {insert a blank line, it looks nice} 4532@.\\fi@> 4533 4534@* Phase three processing. 4535We are nearly finished! \.{WEAVE}'s only remaining task is to write out the 4536index, after sorting the identifiers and index entries. 4537 4538@<Phase III: Output the cross-reference index@>= 4539phase_three:=true; print_nl('Writing the index...'); 4540if change_exists then 4541 begin finish_line; @<Tell about changed modules@>; 4542 end; 4543finish_line; out4("\")("i")("n")("x"); finish_line; 4544@.\\inx@> 4545@<Do the first pass of sorting@>; 4546@<Sort and output the index@>; 4547out4("\")("f")("i")("n"); finish_line; 4548@.\\fin@> 4549@<Output all the module names@>; 4550out4("\")("c")("o")("n"); finish_line; 4551@.\\con@> 4552print('Done.'); 4553 4554@ Just before the index comes a list of all the changed modules, including 4555the index module itself. 4556 4557@<Glob...@>= 4558@!k_module:0..max_modules; {runs through the modules} 4559 4560@ @<Tell about changed modules@>= 4561begin {remember that the index is already marked as changed} 4562k_module:=1; 4563out4("\")("c")("h")(" "); 4564while k_module<module_count do 4565 begin if changed_module[k_module] then 4566 begin out_mod(k_module); out2(",")(" "); 4567 end; 4568 incr(k_module); 4569 end; 4570out_mod(k_module); 4571out("."); 4572end 4573 4574@ A left-to-right radix sorting method is used, since this makes it easy to 4575adjust the collating sequence and since the running time will be at worst 4576proportional to the total length of all entries in the index. We put the 4577identifiers into 230 different lists based on their first characters. 4578(Uppercase letters are put into the same list as the corresponding lowercase 4579letters, since we want to have `$t<\\{TeX}<\&{to}$'.) The 4580list for character |c| begins at location |bucket[c]| and continues through 4581the |blink| array. 4582 4583@<Glob...@>= 4584@!bucket:array[ASCII_code] of name_pointer; 4585@!next_name: name_pointer; {successor of |cur_name| when sorting} 4586@!c:ASCII_code; {index into |bucket|} 4587@!h:0..hash_size; {index into |hash|} 4588@!blink:array[0..max_names] of sixteen_bits; {links in the buckets} 4589 4590@ To begin the sorting, we go through all the hash lists and put each entry 4591having a nonempty cross-reference list into the proper bucket. 4592 4593@<Do the first pass...@>= 4594for c:=0 to 255 do bucket[c]:=0; 4595for h:=0 to hash_size-1 do 4596 begin next_name:=hash[h]; 4597 while next_name<>0 do 4598 begin cur_name:=next_name; next_name:=link[cur_name]; 4599 if xref[cur_name]<>0 then 4600 begin c:=byte_mem[cur_name mod ww,byte_start[cur_name]]; 4601 if (c<="Z")and(c>="A") then c:=c+@'40; 4602 blink[cur_name]:=bucket[c]; bucket[c]:=cur_name; 4603 end; 4604 end; 4605 end 4606 4607@ During the sorting phase we shall use the |cat| and |trans| arrays from 4608\.{WEAVE}'s parsing algorithm and rename them |depth| and |head|. They now 4609represent a stack of identifier lists for all the index entries that have 4610not yet been output. The variable |sort_ptr| tells how many such lists are 4611present; the lists are output in reverse order (first |sort_ptr|, then 4612|sort_ptr-1|, etc.). The |j|th list starts at |head[j]|, and if the first 4613|k| characters of all entries on this list are known to be equal we have 4614|depth[j]=k|. 4615 4616@d depth==cat {reclaims memory that is no longer needed for parsing} 4617@d head==trans {ditto} 4618@d sort_ptr==scrap_ptr {ditto} 4619@d max_sorts==max_scraps {ditto} 4620 4621@<Globals...@>= 4622@!cur_depth:eight_bits; {depth of current buckets} 4623@!cur_byte:0..max_bytes; {index into |byte_mem|} 4624@!cur_bank:0..ww-1; {row of |byte_mem|} 4625@!cur_val:sixteen_bits; {current cross reference number} 4626stat@!max_sort_ptr:0..max_sorts;@+tats {largest value of |sort_ptr|} 4627 4628@ @<Set init...@>=stat max_sort_ptr:=0;@+tats 4629 4630@ The desired alphabetic order is specified by the |collate| array; namely, 4631|collate[0]<collate[1]<@t$\cdots$@><collate[229]|. 4632 4633@<Glob...@>=@!collate:array[0..229] of ASCII_code; {collation order} 4634 4635@ @<Local variables for init...@>= 4636@!c:ASCII_code; {used to initialize |collate|} 4637 4638@ We use the order $\hbox{null}<\.\ <\hbox{other characters}<\.\_< 4639\.A=\.a<\cdots<\.Z=\.z<\.0<\cdots<\.9.$ 4640 4641@<Set init...@>= 4642collate[0]:=0; collate[1]:=" "; 4643for c:=1 to " "-1 do collate[c+1]:=c; 4644for c:=" "+1 to "0"-1 do collate[c]:=c; 4645for c:="9"+1 to "A"-1 do collate[c-10]:=c; 4646for c:="Z"+1 to "_"-1 do collate[c-36]:=c; 4647collate["_"-36]:="_"+1; 4648for c:="z"+1 to 255 do collate[c-63]:=c; 4649collate[193]:="_"; 4650for c:="a" to "z" do collate[c-"a"+194]:=c; 4651for c:="0" to "9" do collate[c-"0"+220]:=c; 4652 4653@ Procedure |unbucket| goes through the buckets and adds nonempty lists 4654to the stack, using the collating sequence specified in the |collate| array. 4655The parameter to |unbucket| tells the current depth in the buckets. 4656Any two sequences that agree in their first 255 character positions are 4657regarded as identical. 4658 4659@d infinity=255 {$\infty$ (approximately)} 4660 4661@p procedure unbucket(@!d:eight_bits); {empties buckets having depth |d|} 4662var c:ASCII_code; {index into |bucket|} 4663begin for c:=229 downto 0 do if bucket[collate[c]]>0 then 4664 begin if sort_ptr>max_sorts then overflow('sorting'); 4665 incr(sort_ptr); 4666 stat if sort_ptr>max_sort_ptr then max_sort_ptr:=sort_ptr;@;@+tats@;@/ 4667 if c=0 then depth[sort_ptr]:=infinity else depth[sort_ptr]:=d; 4668 head[sort_ptr]:=bucket[collate[c]]; bucket[collate[c]]:=0; 4669 end; 4670end; 4671 4672@ @<Sort and output...@>= 4673sort_ptr:=0; unbucket(1); 4674while sort_ptr>0 do 4675 begin cur_depth:=cat[sort_ptr]; 4676 if (blink[head[sort_ptr]]=0)or(cur_depth=infinity) then 4677 @<Output index entries for the list at |sort_ptr|@> 4678 else @<Split the list at |sort_ptr| into further lists@>; 4679 end 4680 4681@ @<Split the list...@>= 4682begin next_name:=head[sort_ptr]; 4683repeat cur_name:=next_name; next_name:=blink[cur_name]; 4684 cur_byte:=byte_start[cur_name]+cur_depth; cur_bank:=cur_name mod ww; 4685 if cur_byte=byte_start[cur_name+ww] then c:=0 {we hit the end of the name} 4686 else begin c:=byte_mem[cur_bank,cur_byte]; 4687 if (c<="Z")and(c>="A") then c:=c+@'40; 4688 end; 4689 blink[cur_name]:=bucket[c]; bucket[c]:=cur_name; 4690until next_name=0; 4691decr(sort_ptr); unbucket(cur_depth+1); 4692end 4693 4694@ @<Output index...@>= 4695begin cur_name:=head[sort_ptr]; 4696@!debug if trouble_shooting then debug_help;@;@+gubed@/ 4697repeat out2("\")(":"); 4698@.\\:@> 4699 @<Output the name at |cur_name|@>; 4700 @<Output the cross-references at |cur_name|@>; 4701 cur_name:=blink[cur_name]; 4702until cur_name=0; 4703decr(sort_ptr); 4704end 4705 4706@ @<Output the name...@>= 4707case ilk[cur_name] of 4708normal: if length(cur_name)=1 then out2("\")("|")@+else out2("\")("\"); 4709@.\\|@> 4710@.\\\\@> 4711roman: do_nothing; 4712wildcard: out2("\")("9"); 4713@.\\9@> 4714typewriter: out2("\")("."); 4715@.\\.@> 4716othercases out2("\")("&") 4717@.\\\&@> 4718endcases;@/ 4719out_name(cur_name) 4720 4721@ Section numbers that are to be underlined are enclosed in 4722`\.{\\[}$\,\ldots\,$\.]'. 4723 4724@<Output the cross-references...@>= 4725@<Invert the cross-reference list at |cur_name|, making |cur_xref| the head@>; 4726repeat out2(",")(" "); cur_val:=num(cur_xref); 4727if cur_val<def_flag then out_mod(cur_val) 4728else begin out2("\")("["); out_mod(cur_val-def_flag); out("]"); 4729@.\\[@> 4730 end; 4731cur_xref:=xlink(cur_xref); 4732until cur_xref=0; 4733out("."); finish_line 4734 4735@ List inversion is best thought of as popping elements off one stack and 4736pushing them onto another. In this case |cur_xref| will be the head of 4737the stack that we push things onto. 4738 4739@<Invert the cross-reference list at |cur_name|, making |cur_xref| the head@>= 4740this_xref:=xref[cur_name]; cur_xref:=0; 4741repeat next_xref:=xlink(this_xref); xlink(this_xref):=cur_xref; 4742cur_xref:=this_xref; this_xref:=next_xref; 4743until this_xref=0 4744 4745@ The following recursive procedure walks through the tree of module names and 4746prints them. 4747@^recursion@> 4748 4749@p procedure mod_print(p:name_pointer); {print all module names in subtree |p|} 4750begin if p>0 then 4751 begin mod_print(llink[p]);@/ 4752 out2("\")(":");@/ 4753@.\\:@> 4754 tok_ptr:=1; text_ptr:=1; scrap_ptr:=0; init_stack; 4755 app(p+mod_flag); make_output; 4756 footnote(0); {|cur_xref| was set by |make_output|} 4757 finish_line;@/ 4758 mod_print(rlink[p]); 4759 end; 4760end; 4761 4762@ @<Output all the module names@>=@+mod_print(root) 4763 4764@* Debugging. 4765The \PASCAL\ debugger with which \.{WEAVE} was developed allows breakpoints 4766to be set, and variables can be read and changed, but procedures cannot be 4767executed. Therefore a `|debug_help|' procedure has been inserted in the main 4768loops of each phase of the program; when |ddt| and |dd| are set to appropriate 4769values, symbolic printouts of various tables will appear. 4770 4771The idea is to set a breakpoint inside the |debug_help| routine, at the 4772place of `\ignorespaces|breakpoint:|\unskip' below. Then when 4773|debug_help| is to be activated, set |trouble_shooting| equal to |true|. 4774The |debug_help| routine will prompt you for values of |ddt| and |dd|, 4775discontinuing this when |ddt<=0|; thus you type $2n+1$ integers, ending 4776with zero or a negative number. Then control either passes to the 4777breakpoint, allowing you to look at and/or change variables (if you typed 4778zero), or to exit the routine (if you typed a negative value). 4779 4780Another global variable, |debug_cycle|, can be used to skip silently 4781past calls on |debug_help|. If you set |debug_cycle>1|, the program stops 4782only every |debug_cycle| times |debug_help| is called; however, 4783any error stop will set |debug_cycle| to zero. 4784 4785@<Globals...@>= 4786@!debug@!trouble_shooting:boolean; {is |debug_help| wanted?} 4787@!ddt:integer; {operation code for the |debug_help| routine} 4788@!dd:integer; {operand in procedures performed by |debug_help|} 4789@!debug_cycle:integer; {threshold for |debug_help| stopping} 4790@!debug_skipped:integer; {we have skipped this many |debug_help| calls} 4791@!term_in:text_file; {the user's terminal as an input file} 4792gubed 4793 4794@ The debugging routine needs to read from the user's terminal. 4795@^system dependencies@> 4796@<Set init...@>= 4797@!debug trouble_shooting:=true; debug_cycle:=1; debug_skipped:=0; tracing:=0;@/ 4798trouble_shooting:=false; debug_cycle:=99999; {use these when it almost works} 4799reset(term_in,'TTY:','/I'); {open |term_in| as the terminal, don't do a |get|} 4800gubed 4801 4802@ @d breakpoint=888 {place where a breakpoint is desirable} 4803@^system dependencies@> 4804 4805@p @!debug procedure debug_help; {routine to display various things} 4806label breakpoint,exit; 4807var k:integer; {index into various arrays} 4808begin incr(debug_skipped); 4809if debug_skipped<debug_cycle then return; 4810debug_skipped:=0; 4811loop@+ begin print_nl('#'); update_terminal; {prompt} 4812 read(term_in,ddt); {read a debug-command code} 4813 if ddt<0 then return 4814 else if ddt=0 then 4815 begin goto breakpoint;@\ {go to every label at least once} 4816 breakpoint: ddt:=0;@\ 4817 end 4818 else begin read(term_in,dd); 4819 case ddt of 4820 1: print_id(dd); 4821 2: print_text(dd); 4822 3: for k:=1 to dd do print(xchr[buffer[k]]); 4823 4: for k:=1 to dd do print(xchr[mod_text[k]]); 4824 5: for k:=1 to out_ptr do print(xchr[out_buf[k]]); 4825 6: for k:=1 to dd do 4826 begin print_cat(cat[k]); print(' '); 4827 end; 4828 othercases print('?') 4829 endcases; 4830 end; 4831 end; 4832exit:end; 4833gubed 4834 4835@* The main program. 4836Let's put it all together now: \.{WEAVE} starts and ends here. 4837@^system dependencies@> 4838 4839The main procedure has been split into three sub-procedures in order to 4840keep certain \PASCAL\ compilers from overflowing their capacity. 4841@^split procedures@> 4842 4843@p procedure Phase_I; 4844begin @<Phase I:...@>; 4845end; 4846@# 4847procedure Phase_II; 4848begin @<Phase II:...@>; 4849end; 4850@# 4851begin initialize; {beginning of the main program} 4852print_ln(banner); {print a ``banner line''} 4853@<Store all the reserved words@>; 4854Phase_I; Phase_II;@/ 4855@<Phase III:...@>; 4856@<Check that all changes have been read@>; 4857end_of_WEAVE: 4858stat @<Print statistics about memory usage@>;@+tats@;@/ 4859@t\4\4@>{here files should be closed if the operating system requires it} 4860@<Print the job |history|@>; 4861end. 4862 4863@ @<Print statistics about memory usage@>= 4864print_nl('Memory usage statistics: ', 4865 name_ptr:1,' names, ', xref_ptr:1,' cross references, ', 4866 byte_ptr[0]:1); 4867for cur_bank:=1 to ww-1 do print('+',byte_ptr[cur_bank]:1); 4868 print(' bytes;'); 4869print_nl('parsing required ',max_scr_ptr:1,' scraps, ',max_txt_ptr:1, 4870 ' texts, ',max_tok_ptr:1,' tokens, ', max_stack_ptr:1,' levels;'); 4871print_nl('sorting required ',max_sort_ptr:1, ' levels.') 4872 4873@ Some implementations may wish to pass the |history| value to the 4874operating system so that it can be used to govern whether or not other 4875programs are started. Here we simply report the history to the user. 4876@^system dependencies@> 4877 4878@<Print the job |history|@>= 4879case history of 4880spotless: print_nl('(No errors were found.)'); 4881harmless_message: print_nl('(Did you see the warning message above?)'); 4882error_message: print_nl('(Pardon me, but I think I spotted something wrong.)'); 4883fatal_message: print_nl('(That was a fatal error, my friend.)'); 4884end {there are no other cases} 4885 4886@* System-dependent changes. 4887This module should be replaced, if necessary, by changes to the program 4888that are necessary to make \.{WEAVE} work at a particular installation. 4889It is usually best to design your change file so that all changes to 4890previous modules preserve the module numbering; then everybody's version 4891will be consistent with the printed program. More extensive changes, 4892which introduce new modules, can be inserted here; then only the index 4893itself will get a new module number. 4894@^system dependencies@> 4895 4896@* Index. 4897If you have read and understood the code for Phase III above, you know what 4898is in this index and how it got here. All modules in which an identifier is 4899used are listed with that identifier, except that reserved words are 4900indexed only when they appear in format definitions, and the appearances 4901of identifiers in module names are not indexed. Underlined entries 4902correspond to where the identifier was declared. Error messages, control 4903sequences put into the output, and a few 4904other things like ``recursion'' are indexed here too. 4905