1% This program by D. E. Knuth is not copyrighted and can be used freely.
2% Version 0.0 was more-or-less debugged on June 4, 1985.
3% Version 0.1 improved formatting of : and added \\ (June 15, 1985).
4% Version 0.2 improved formatting of good, fixed @@ bug (August 4, 1985).
5% Version 0.3 fixed minor bug in change_file move (August 30, 1985).
6% Version 0.4 fixed minor bug regarding empty comments (April 8, 1989).
7% Version 1.0 was tuned up for the METAFONTware report (April 16, 1989).
8% Version 1.1 ditto, with input handled by Hosek's idea (April 27, 1989).
9% Version 2 has the new primitives of METAFONT 2.0 (October 16, 1989).
10
11% Here is TeX material that gets inserted after \input webmac
12\def\hang{\hangindent 3em\indent\ignorespaces}
13\font\ninerm=cmr9
14\let\mc=\ninerm % medium caps for names like SAIL
15\def\PASCAL{Pascal}
16\font\logo=manfnt % font used for the METAFONT logo
17\def\MF{{\logo META}\-{\logo FONT}}
18\def\pb{$\.|\ldots\.|$} % MF brackets (|...|)
19\def\v{\.{\char'174}} % vertical (|) in typewriter font
20\def\dleft{[\![} \def\dright{]\!]} % double brackets
21\mathchardef\RA="3221 % right arrow
22\mathchardef\BA="3224 % double arrow
23\def\({} % kludge for alphabetizing certain module names
24\chardef\V=`\| % vertical line in a string
25
26\def\title{MFT}
27\def\contentspagenumber{401}
28\def\topofcontents{\null
29  \def\titlepage{F} % include headline on the contents page
30  \def\rheader{\mainfont\hfil \contentspagenumber}
31  \vfill
32  \centerline{\titlefont The {\ttitlefont MFT} processor}
33  \vskip 15pt
34  \centerline{(Version 2.0, October 1989)}
35  \vfill}
36\def\botofcontents{\vfill
37  \centerline{\hsize 5in\baselineskip9pt
38    \vbox{\ninerm\noindent
39    The preparation of this report
40    was supported in part by the National Science
41    Foundation under grants IST-8201926, MCS-8300984, and
42    CCR-8610181,
43    and by the System Development Foundation. `\TeX' is a
44    trademark of the American Mathematical Society.
45    `{\logo hijklmnj}\kern1pt' is a trademark of Addison-Wesley
46    Publishing Company.}}}
47\pageno=\contentspagenumber \advance\pageno by 1
48
49@* Introduction.
50This program converts a \MF\ source file to a \TeX\ file. It was written
51by D.~E. Knuth in June, 1985; a somewhat similar {\mc SAIL} program had
52@^Knuth, Donald Ervin@>
53been developed in January, 1980.
54
55The general idea is to input a file called, say, \.{foo.mf} and to produce an
56output file called, say, \.{foo.tex}. The latter file, when processed by \TeX,
57will yield a ``prettyprinted'' representation of the input file.
58@^user manual@>
59
60Line breaks in the input are carried over into the output; moreover,
61blank spaces at the beginning of a line are converted to quads of indentation
62in the output. Thus, the user has full control over the indentation and line
63breaks. Each line of input is translated independently of the others.
64
65A slight change to \MF's comment convention allows further control.
66Namely, `\.{\%\%}' indicates that the remainder of an input line should be
67copied verbatim to the output; this interrupts the translation and forces
68\.{MFT} to produce a certain result.
69
70Furthermore, `\.{\%\%\%} $\langle\,$token$_1\,\rangle\ldots
71\langle\,$token$_n\,\rangle$'
72introduces a change in \.{MFT}'s formatting rules; all tokens after the first
73will henceforth be translated according to the current conventions for
74$\langle\,$token$_1\,\rangle$. The tokens must be symbolic (i.e., not
75numeric or string tokens). For example, the input line
76$$\.{\%\%\% addto fill draw filldraw}$$
77says that the `\.{fill}', `\.{draw}', and `\.{filldraw}' operations of
78plain \MF\ should be formatted as the primitive token `\.{addto}', i.e.,
79in boldface type. (Without such reformatting commands, \.{MFT} would treat
80`\.{fill}' like an ordinary tag or variable name. In fact, you need
81a reformatting command even to get parentheses to act like delimiters!)
82
83\MF\ comments, which follow a single \.\% sign, should be valid \TeX\
84input.  But \MF\ material can be included in \pb\ within a comment; this
85will be translated by \.{MFT} as if it were not in a comment. For example,
86a phrase like `\.{make} \.{\V x2r\V} \.{zero}' will be translated into
87`\.{make \$x\_\{2r\}\$ zero}'.
88
89The rules just stated apply to lines that contain one, two, or three \.\% signs
90in a row. Comments to \.{MFT} can follow `\.{\%\%\%\%}'.
91Five or more \.\% signs should not be used.
92
93Beside the normal input file, \.{MFT} also looks for a change file
94(e.g., `\.{foo.ch}'), which allows substitutions to be made in the
95translation. The change file follows the conventions of \.{WEB}, and
96it should be null if there are no changes. (Changes usually contain
97verbatim instructions to compensate for the fact that \.{MFT} cannot
98format everything in an optimum way.)
99
100There's also a third input file (e.g., `\.{plain.mft}'), which is
101input before the other two. This file normally contains the `\.{\%\%\%}'
102formatting commands that are necessary to tune \.{MFT} to a particular
103style of \MF\ code, so it is called the style file.
104
105The output of \.{MFT} should be accompanied by the macros in a small
106package called \.{mftmac.tex}.
107@.mftmac@>
108
109Caveat: This program is not as ``bulletproof'' as the other routines
110produced by Stanford's \TeX\ project. It takes care of a great deal of
111tedious formatting, but it can produce strange output, because \MF\ is
112an extremely general language. Users should proofread their output carefully.
113
114@ \.{MFT} uses a few features of the local \PASCAL\ compiler that may
115need to be changed in other installations:
116
117\yskip\item{1)} Case statements have a default.
118\item{2)} Input-output routines may need to be adapted for use with a particular
119character set and/or for printing messages on the user's terminal.
120
121\yskip\noindent
122These features are also present in the \PASCAL\ version of \TeX, where they
123are used in a similar (but more complex) way. System-dependent portions
124of \.{MFT} can be identified by looking at the entries for `system
125dependencies' in the index below.
126@!@^system dependencies@>
127
128The ``banner line'' defined here should be changed whenever \.{MFT}
129is modified.
130
131@d banner=='This is MFT, Version 2.0'
132
133@ The program begins with a fairly normal header, made up of pieces that
134@^system dependencies@>
135will mostly be filled in later. The \.{MF} input comes from files |mf_file|,
136|change_file|, and |style_file|; the \TeX\ output goes to file |tex_file|.
137
138If it is necessary to abort the job because of a fatal error, the program
139calls the `|jump_out|' procedure, which goes to the label |end_of_MFT|.
140
141@d end_of_MFT = 9999 {go here to wrap it up}
142
143@p @t\4@>@<Compiler directives@>@/
144program MFT(@!mf_file,@!change_file,@!style_file,@!tex_file);
145label end_of_MFT; {go here to finish}
146const @<Constants in the outer block@>@/
147type @<Types in the outer block@>@/
148var @<Globals in the outer block@>@/
149@<Error handling procedures@>@/
150procedure initialize;
151  var @<Local variables for initialization@>@/
152  begin @<Set initial values@>@/
153  end;
154
155@ The \PASCAL\ compiler used to develop this system has ``compiler
156directives'' that can appear in comments whose first character is a dollar sign.
157In our case these directives tell the compiler to detect
158@^system dependencies@>
159things that are out of range.
160
161@<Compiler directives@>=
162@{@&$C+,A+,D-@} {range check, catch arithmetic overflow, no debug overhead}
163
164@ Labels are given symbolic names by the following definitions. We insert
165the label `|exit|:' just before the `\ignorespaces|end|\unskip' of a
166procedure in which we have used the `|return|' statement defined below;
167the label `|restart|' is occasionally used at the very beginning of a
168procedure; and the label `|reswitch|' is occasionally used just prior to
169a \&{case} statement in which some cases change the conditions and we wish to
170branch to the newly applicable case.
171Loops that are set up with the \&{loop} construction defined below are
172commonly exited by going to `|done|' or to `|found|' or to `|not_found|',
173and they are sometimes repeated by going to `|continue|'.
174
175@d exit=10 {go here to leave a procedure}
176@d restart=20 {go here to start a procedure again}
177@d reswitch=21 {go here to start a case statement again}
178@d continue=22 {go here to resume a loop}
179@d done=30 {go here to exit a loop}
180@d found=31 {go here when you've found it}
181@d not_found=32 {go here when you've found something else}
182
183@ Here are some macros for common programming idioms.
184
185@d incr(#) == #:=#+1 {increase a variable by unity}
186@d decr(#) == #:=#-1 {decrease a variable by unity}
187@d loop == @+ while true do@+ {repeat over and over until a |goto| happens}
188@d do_nothing == {empty statement}
189@d return == goto exit {terminate a procedure call}
190@f return == nil
191@f loop == xclause
192
193@ We assume that |case| statements may include a default case that applies
194if no matching label is found. Thus, we shall use constructions like
195@^system dependencies@>
196$$\vbox{\halign{#\hfil\cr
197|case x of|\cr
1981: $\langle\,$code for $x=1\,\rangle$;\cr
1993: $\langle\,$code for $x=3\,\rangle$;\cr
200|othercases| $\langle\,$code for |x<>1| and |x<>3|$\,\rangle$\cr
201|endcases|\cr}}$$
202since most \PASCAL\ compilers have plugged this hole in the language by
203incorporating some sort of default mechanism. For example, the compiler
204used to develop \.{WEB} and \TeX\ allows `|others|:' as a default label,
205and other \PASCAL s allow syntaxes like `\ignorespaces|else|\unskip' or
206`\&{otherwise}' or `\\{otherwise}:', etc. The definitions of |othercases|
207and |endcases| should be changed to agree with local conventions.
208(Of course, if no default mechanism is available, the |case| statements of
209this program must be extended by listing all remaining cases.)
210
211@d othercases == others: {default for cases not listed explicitly}
212@d endcases == @+end {follows the default case in an extended |case| statement}
213@f othercases == else
214@f endcases == end
215
216@ The following parameters are set big enough to handle the Computer
217Modern fonts, so they should be sufficient for most applications of \.{MFT}.
218
219@<Constants...@>=
220@!max_bytes=10000; {the number of bytes in tokens; must be less than 65536}
221@!max_names=1000; {number of tokens}
222@!hash_size=353; {should be prime}
223@!buf_size=100; {maximum length of input line}
224@!line_length=80; {lines of \TeX\ output have at most this many characters,
225  should be less than 256}
226
227@ A global variable called |history| will contain one of four values
228at the end of every run: |spotless| means that no unusual messages were
229printed; |harmless_message| means that a message of possible interest
230was printed but no serious errors were detected; |error_message| means that
231at least one error was found; |fatal_message| means that the program
232terminated abnormally. The value of |history| does not influence the
233behavior of the program; it is simply computed for the convenience
234of systems that might want to use such information.
235
236@d spotless=0 {|history| value for normal jobs}
237@d harmless_message=1 {|history| value when non-serious info was printed}
238@d error_message=2 {|history| value when an error was noted}
239@d fatal_message=3 {|history| value when we had to stop prematurely}
240@#
241@d mark_harmless==@t@>@+if history=spotless then history:=harmless_message
242@d mark_error==history:=error_message
243@d mark_fatal==history:=fatal_message
244
245@<Glob...@>=@!history:spotless..fatal_message; {how bad was this run?}
246
247@ @<Set init...@>=history:=spotless;
248
249@* The character set.
250\.{MFT} works internally with ASCII codes, like all other programs
251associated with \TeX\ and \MF. The present section has been lifted
252almost verbatim from the \MF\ program.
253@^ASCII code@>
254
255@ Characters of text that have been converted to \MF's internal form
256are said to be of type |ASCII_code|, which is a subrange of the integers.
257
258@<Types...@>=
259@!ASCII_code=0..255; {eight-bit numbers}
260
261@ The original \PASCAL\ compiler was designed in the late 60s, when six-bit
262character sets were common, so it did not make provision for lowercase
263letters. Nowadays, of course, we need to deal with both capital and small
264letters in a convenient way, especially in a program for font design;
265so the present specification of \.{MFT} has been written under the assumption
266that the \PASCAL\ compiler and run-time system permit the use of text files
267with more than 64 distinguishable characters. More precisely, we assume that
268the character set contains at least the letters and symbols associated
269with ASCII codes @'40 through @'176. If additional characters are present,
270\.{MFT} can be configured to work with them too.
271
272Since we are dealing with more characters than were present in the first
273\PASCAL\ compilers, we have to decide what to call the associated data
274type. Some \PASCAL s use the original name |char| for the
275characters in text files, even though there now are more than 64 such
276characters, while other \PASCAL s consider |char| to be a 64-element
277subrange of a larger data type that has some other name.
278
279In order to accommodate this difference, we shall use the name |text_char|
280to stand for the data type of the characters that are converted to and
281from |ASCII_code| when they are input and output. We shall also assume
282that |text_char| consists of the elements |chr(first_text_char)| through
283|chr(last_text_char)|, inclusive. The following definitions should be
284adjusted if necessary.
285@^system dependencies@>
286
287@d text_char == char {the data type of characters in text files}
288@d first_text_char=0 {ordinal number of the smallest element of |text_char|}
289@d last_text_char=255 {ordinal number of the largest element of |text_char|}
290
291@<Types...@>=
292@!text_file=packed file of text_char;
293
294@ @<Local variables for init...@>=
295@!i:0..255;
296
297@ The \.{MFT} processor converts between ASCII code and
298the user's external character set by means of arrays |xord| and |xchr|
299that are analogous to \PASCAL's |ord| and |chr| functions.
300
301@<Glob...@>=
302@!xord: array [text_char] of ASCII_code;
303  {specifies conversion of input characters}
304@!xchr: array [ASCII_code] of text_char;
305  {specifies conversion of output characters}
306
307@ Since we are assuming that our \PASCAL\ system is able to read and write the
308visible characters of standard ASCII (although not necessarily using the
309ASCII codes to represent them), the following assignment statements initialize
310most of the |xchr| array properly, without needing any system-dependent
311changes. On the other hand, it is possible to implement \.{MFT} with
312less complete character sets, and in such cases it will be necessary to
313change something here.
314@^system dependencies@>
315
316@<Set init...@>=
317xchr[@'40]:=' ';
318xchr[@'41]:='!';
319xchr[@'42]:='"';
320xchr[@'43]:='#';
321xchr[@'44]:='$';
322xchr[@'45]:='%';
323xchr[@'46]:='&';
324xchr[@'47]:='''';@/
325xchr[@'50]:='(';
326xchr[@'51]:=')';
327xchr[@'52]:='*';
328xchr[@'53]:='+';
329xchr[@'54]:=',';
330xchr[@'55]:='-';
331xchr[@'56]:='.';
332xchr[@'57]:='/';@/
333xchr[@'60]:='0';
334xchr[@'61]:='1';
335xchr[@'62]:='2';
336xchr[@'63]:='3';
337xchr[@'64]:='4';
338xchr[@'65]:='5';
339xchr[@'66]:='6';
340xchr[@'67]:='7';@/
341xchr[@'70]:='8';
342xchr[@'71]:='9';
343xchr[@'72]:=':';
344xchr[@'73]:=';';
345xchr[@'74]:='<';
346xchr[@'75]:='=';
347xchr[@'76]:='>';
348xchr[@'77]:='?';@/
349xchr[@'100]:='@@';
350xchr[@'101]:='A';
351xchr[@'102]:='B';
352xchr[@'103]:='C';
353xchr[@'104]:='D';
354xchr[@'105]:='E';
355xchr[@'106]:='F';
356xchr[@'107]:='G';@/
357xchr[@'110]:='H';
358xchr[@'111]:='I';
359xchr[@'112]:='J';
360xchr[@'113]:='K';
361xchr[@'114]:='L';
362xchr[@'115]:='M';
363xchr[@'116]:='N';
364xchr[@'117]:='O';@/
365xchr[@'120]:='P';
366xchr[@'121]:='Q';
367xchr[@'122]:='R';
368xchr[@'123]:='S';
369xchr[@'124]:='T';
370xchr[@'125]:='U';
371xchr[@'126]:='V';
372xchr[@'127]:='W';@/
373xchr[@'130]:='X';
374xchr[@'131]:='Y';
375xchr[@'132]:='Z';
376xchr[@'133]:='[';
377xchr[@'134]:='\';
378xchr[@'135]:=']';
379xchr[@'136]:='^';
380xchr[@'137]:='_';@/
381xchr[@'140]:='`';
382xchr[@'141]:='a';
383xchr[@'142]:='b';
384xchr[@'143]:='c';
385xchr[@'144]:='d';
386xchr[@'145]:='e';
387xchr[@'146]:='f';
388xchr[@'147]:='g';@/
389xchr[@'150]:='h';
390xchr[@'151]:='i';
391xchr[@'152]:='j';
392xchr[@'153]:='k';
393xchr[@'154]:='l';
394xchr[@'155]:='m';
395xchr[@'156]:='n';
396xchr[@'157]:='o';@/
397xchr[@'160]:='p';
398xchr[@'161]:='q';
399xchr[@'162]:='r';
400xchr[@'163]:='s';
401xchr[@'164]:='t';
402xchr[@'165]:='u';
403xchr[@'166]:='v';
404xchr[@'167]:='w';@/
405xchr[@'170]:='x';
406xchr[@'171]:='y';
407xchr[@'172]:='z';
408xchr[@'173]:='{';
409xchr[@'174]:='|';
410xchr[@'175]:='}';
411xchr[@'176]:='~';
412
413@ The ASCII code is ``standard'' only to a certain extent, since many
414computer installations have found it advantageous to have ready access
415to more than 94 printing characters.  If \.{MFT} is being used
416on a garden-variety \PASCAL\ for which only standard ASCII
417codes will appear in the input and output files, it doesn't really matter
418what codes are specified in |xchr[0..@'37]|, but the safest policy is to
419blank everything out by using the code shown below.
420
421However, other settings of |xchr| will make \.{MFT} more friendly on
422computers that have an extended character set, so that users can type things
423like `\.^^Z' instead of `\.{<>}', and so that \.{MFT} can echo the
424page breaks found in its input.  People with extended character sets can
425assign codes arbitrarily, giving an |xchr| equivalent to whatever
426characters the users of \.{MFT} are allowed to have in their input files.
427Appropriate changes to \.{MFT}'s |char_class| table should then be made.
428(Unlike \TeX, each installation of \MF\ has a fixed assignment of category
429codes, called the |char_class|.) Such changes make portability of programs
430more difficult, so they should be introduced cautiously if at all.
431@^character set dependencies@>
432@^system dependencies@>
433
434@<Set init...@>=
435for i:=0 to @'37 do xchr[i]:=' ';
436for i:=@'177 to @'377 do xchr[i]:=' ';
437
438@ The following system-independent code makes the |xord| array contain a
439suitable inverse to the information in |xchr|. Note that if |xchr[i]=xchr[j]|
440where |i<j<@'177|, the value of |xord[xchr[i]]| will turn out to be
441|j| or more; hence, standard ASCII code numbers will be used instead of
442codes below @'40 in case there is a coincidence.
443
444@<Set init...@>=
445for i:=first_text_char to last_text_char do xord[chr(i)]:=@'177;
446for i:=@'200 to @'377 do xord[xchr[i]]:=i;
447for i:=1 to @'176 do xord[xchr[i]]:=i;
448
449@* Input and output.
450The I/O conventions of this program are essentially identical to those
451of \.{WEAVE}.  Therefore people who need to make modifications should be
452able to do so without too many headaches.
453
454@ Terminal output is done by writing on file |term_out|, which is assumed to
455consist of characters of type |text_char|:
456@^system dependencies@>
457
458@d print(#)==write(term_out,#) {`|print|' means write on the terminal}
459@d print_ln(#)==write_ln(term_out,#) {`|print|' and then start new line}
460@d new_line==write_ln(term_out) {start new line on the terminal}
461@d print_nl(#)==  {print information starting on a new line}
462  begin new_line; print(#);
463  end
464
465@<Globals...@>=
466@!term_out:text_file; {the terminal as an output file}
467
468@ Different systems have different ways of specifying that the output on a
469certain file will appear on the user's terminal. Here is one way to do this
470on the \PASCAL\ system that was used in \.{WEAVE}'s initial development:
471@^system dependencies@>
472
473@<Set init...@>=
474rewrite(term_out,'TTY:'); {send |term_out| output to the terminal}
475
476@ The |update_terminal| procedure is called when we want
477to make sure that everything we have output to the terminal so far has
478actually left the computer's internal buffers and been sent.
479@^system dependencies@>
480
481@d update_terminal == break(term_out) {empty the terminal output buffer}
482
483@ The main input comes from |mf_file|; this input may be overridden
484by changes in |change_file|. (If |change_file| is empty, there are no changes.)
485Furthermore the |style_file| is input first; it is unchangeable.
486
487@<Globals...@>=
488@!mf_file:text_file; {primary input}
489@!change_file:text_file; {updates}
490@!style_file:text_file; {formatting bootstrap}
491
492@ The following code opens the input files.  Since these files were listed
493in the program header, we assume that the \PASCAL\ runtime system has
494already checked that suitable file names have been given; therefore no
495additional error checking needs to be done.
496@^system dependencies@>
497
498@p procedure open_input; {prepare to read the inputs}
499begin reset(mf_file); reset(change_file); reset(style_file);
500end;
501
502@ The main output goes to |tex_file|.
503
504@<Globals...@>=
505@!tex_file: text_file;
506
507@ The following code opens |tex_file|.
508Since this file was listed in the program header, we assume that the
509\PASCAL\ runtime system has checked that a suitable external file name has
510been given.
511@^system dependencies@>
512
513@<Set init...@>=
514rewrite(tex_file);
515
516@ Input goes into an array called |buffer|.
517
518@<Globals...@>=@!buffer: array[0..buf_size] of ASCII_code;
519
520@ The |input_ln| procedure brings the next line of input from the specified
521file into the |buffer| array and returns the value |true|, unless the file has
522already been entirely read, in which case it returns |false|. The conventions
523of \TeX\ are followed; i.e., |ASCII_code| numbers representing the next line
524of the file are input into |buffer[0]|, |buffer[1]|, \dots,
525|buffer[limit-1]|; trailing blanks are ignored;
526and the global variable |limit| is set to the length of the
527@^system dependencies@>
528line. The value of |limit| must be strictly less than |buf_size|.
529
530@p function input_ln(var f:text_file):boolean;
531  {inputs a line or returns |false|}
532var final_limit:0..buf_size; {|limit| without trailing blanks}
533begin limit:=0; final_limit:=0;
534if eof(f) then input_ln:=false
535else  begin while not eoln(f) do
536    begin buffer[limit]:=xord[f^]; get(f);
537    incr(limit);
538    if buffer[limit-1]<>" " then final_limit:=limit;
539    if limit=buf_size then
540      begin while not eoln(f) do get(f);
541      decr(limit); {keep |buffer[buf_size]| empty}
542      if final_limit>limit then final_limit:=limit;
543      print_nl('! Input line too long'); loc:=0; error;
544@.Input line too long@>
545      end;
546    end;
547  read_ln(f); limit:=final_limit; input_ln:=true;
548  end;
549end;
550
551@* Reporting errors to the user.
552The command `|err_print('! Error message')|' will report a syntax error to
553the user, by printing the error message at the beginning of a new line and
554then giving an indication of where the error was spotted in the source file.
555Note that no period follows the error message, since the error routine
556will automatically supply a period.
557
558The actual error indications are provided by a procedure called |error|.
559
560@d err_print(#)==
561    begin new_line; print(#); error;
562    end
563
564@<Error handling...@>=
565procedure error; {prints `\..' and location of error message}
566var@!k,@!l: 0..buf_size; {indices into |buffer|}
567begin @<Print error location based on input buffer@>;
568update_terminal; mark_error;
569end;
570
571@ The error locations can be indicated by using the global variables
572|loc|, |line|, |styling|, and |changing|, which tell respectively the first
573unlooked-at position in |buffer|, the current line number, and whether or not
574the current line is from |style_file| or |change_file| or |mf_file|.
575This routine should be modified on systems whose standard text editor
576has special line-numbering conventions.
577@^system dependencies@>
578
579@<Print error location based on input buffer@>=
580begin if styling then print('. (style file ')
581else if changing then print('. (change file ')@+else print('. (');
582print_ln('l.', line:1, ')');
583if loc>=limit then l:=limit else l:=loc;
584for k:=1 to l do
585  print(xchr[buffer[k-1]]); {print the characters already read}
586new_line;
587for k:=1 to l do print(' '); {space out the next line}
588for k:=l+1 to limit do print(xchr[buffer[k-1]]); {print the part not yet read}
589end
590
591@ The |jump_out| procedure just cuts across all active procedure levels
592and jumps out of the program. This is the only non-local \&{goto} statement
593in \.{MFT}. It is used when no recovery from a particular error has
594been provided.
595
596Some \PASCAL\ compilers do not implement non-local |goto| statements.
597@^system dependencies@>
598In such cases the code that appears at label |end_of_MFT| should be
599copied into the |jump_out| procedure, followed by a call to a system procedure
600that terminates the program.
601
602@d fatal_error(#)==begin new_line; print(#); error; mark_fatal; jump_out;
603  end
604
605@<Error handling...@>=
606procedure jump_out;
607begin goto end_of_MFT;
608end;
609
610@ Sometimes the program's behavior is far different from what it should be,
611and \.{MFT} prints an error message that is really for the \.{MFT}
612maintenance person, not the user. In such cases the program says
613|confusion('indication of where we are')|.
614
615@d confusion(#)==fatal_error('! This can''t happen (',#,')')
616@.This can't happen@>
617
618@ An overflow stop occurs if \.{MFT}'s tables aren't large enough.
619
620@d overflow(#)==fatal_error('! Sorry, ',#,' capacity exceeded')
621@.Sorry, x capacity exceeded@>
622
623@* Inserting the changes.
624Let's turn now to the low-level routine |get_line|
625that takes care of merging |change_file| into |mf_file|. The |get_line|
626procedure also updates the line numbers for error messages.
627(This routine was copied from \.{WEAVE}, but updated to include |styling|.)
628
629@<Globals...@>=
630@!line:integer; {the number of the current line in the current file}
631@!other_line:integer; {the number of the current line in the input file that
632  is not currently being read}
633@!temp_line:integer; {used when interchanging |line| with |other_line|}
634@!limit:0..buf_size; {the last character position occupied in the buffer}
635@!loc:0..buf_size; {the next character position to be read from the buffer}
636@!input_has_ended: boolean; {if |true|, there is no more input}
637@!changing: boolean; {if |true|, the current line is from |change_file|}
638@!styling: boolean; {if |true|, the current line is from |style_file|}
639
640@ As we change |changing| from |true| to |false| and back again, we must
641remember to swap the values of |line| and |other_line| so that the |err_print|
642routine will be sure to report the correct line number.
643
644@d change_changing==
645  changing := not changing;
646  temp_line:=other_line; other_line:=line; line:=temp_line
647    {|line @t$\null\BA\null$@> other_line|}
648
649@ When |changing| is |false|, the next line of |change_file| is kept in
650|change_buffer[0..change_limit]|, for purposes of comparison with the next
651line of |mf_file|. After the change file has been completely input, we
652set |change_limit:=0|, so that no further matches will be made.
653
654@<Globals...@>=
655@!change_buffer:array[0..buf_size] of ASCII_code;
656@!change_limit:0..buf_size; {the last position occupied in |change_buffer|}
657
658@ Here's a simple function that checks if the two buffers are different.
659
660@p function lines_dont_match:boolean;
661label exit;
662var k:0..buf_size; {index into the buffers}
663begin lines_dont_match:=true;
664if change_limit<>limit then return;
665if limit>0 then
666  for k:=0 to limit-1 do if change_buffer[k]<>buffer[k] then return;
667lines_dont_match:=false;
668exit: end;
669
670@ Procedure |prime_the_change_buffer| sets |change_buffer| in preparation
671for the next matching operation. Since blank lines in the change file are
672not used for matching, we have |(change_limit=0)and not changing| if and
673only if the change file is exhausted. This procedure is called only
674when |changing| is true; hence error messages will be reported correctly.
675
676@p procedure prime_the_change_buffer;
677label continue, done, exit;
678var k:0..buf_size; {index into the buffers}
679begin change_limit:=0; {this value will be used if the change file ends}
680@<Skip over comment lines in the change file; |return| if end of file@>;
681@<Skip to the next nonblank line; |return| if end of file@>;
682@<Move |buffer| and |limit| to |change_buffer| and |change_limit|@>;
683exit: end;
684
685@ While looking for a line that begins with \.{@@x} in the change file,
686we allow lines that begin with \.{@@}, as long as they don't begin with
687\.{@@y} or \.{@@z} (which would probably indicate that the change file is
688fouled up).
689
690@<Skip over comment lines in the change file...@>=
691loop@+  begin incr(line);
692  if not input_ln(change_file) then return;
693  if limit<2 then goto continue;
694  if buffer[0]<>"@@" then goto continue;
695  if (buffer[1]>="X")and(buffer[1]<="Z") then
696    buffer[1]:=buffer[1]+"z"-"Z"; {lowercasify}
697  if buffer[1]="x" then goto done;
698  if (buffer[1]="y")or(buffer[1]="z") then
699    begin loc:=2; err_print('! Where is the matching @@x?');
700@.Where is the match...@>
701    end;
702continue: end;
703done:
704
705@ Here we are looking at lines following the \.{@@x}.
706
707@<Skip to the next nonblank line...@>=
708repeat incr(line);
709  if not input_ln(change_file) then
710    begin err_print('! Change file ended after @@x');
711@.Change file ended...@>
712    return;
713    end;
714until limit>0;
715
716@ @<Move |buffer| and |limit| to |change_buffer| and |change_limit|@>=
717begin change_limit:=limit;
718if limit>0 then for k:=0 to limit-1 do change_buffer[k]:=buffer[k];
719end
720
721@ The following procedure is used to see if the next change entry should
722go into effect; it is called only when |changing| is false.
723The idea is to test whether or not the current
724contents of |buffer| matches the current contents of |change_buffer|.
725If not, there's nothing more to do; but if so, a change is called for:
726All of the text down to the \.{@@y} is supposed to match. An error
727message is issued if any discrepancy is found. Then the procedure
728prepares to read the next line from |change_file|.
729
730@p procedure check_change; {switches to |change_file| if the buffers match}
731label exit;
732var n:integer; {the number of discrepancies found}
733@!k:0..buf_size; {index into the buffers}
734begin if lines_dont_match then return;
735n:=0;
736loop@+  begin change_changing; {now it's |true|}
737  incr(line);
738  if not input_ln(change_file) then
739    begin err_print('! Change file ended before @@y');
740@.Change file ended...@>
741    change_limit:=0;  change_changing; {|false| again}
742    return;
743    end;
744  @<If the current line starts with \.{@@y},
745    report any discrepancies and |return|@>;
746  @<Move |buffer| and |limit|...@>;
747  change_changing; {now it's |false|}
748  incr(line);
749  if not input_ln(mf_file) then
750    begin err_print('! MF file ended during a change');
751@.MF file ended...@>
752    input_has_ended:=true; return;
753    end;
754  if lines_dont_match then incr(n);
755  end;
756exit: end;
757
758@ @<If the current line starts with \.{@@y}...@>=
759if limit>1 then if buffer[0]="@@" then
760  begin if (buffer[1]>="X")and(buffer[1]<="Z") then
761    buffer[1]:=buffer[1]+"z"-"Z"; {lowercasify}
762  if (buffer[1]="x")or(buffer[1]="z") then
763    begin loc:=2; err_print('! Where is the matching @@y?');
764@.Where is the match...@>
765    end
766  else if buffer[1]="y" then
767    begin if n>0 then
768      begin loc:=2; err_print('! Hmm... ',n:1,
769        ' of the preceding lines failed to match');
770@.Hmm... n of the preceding...@>
771      end;
772    return;
773    end;
774  end
775
776@ Here's what we do to get the input rolling.
777
778@<Initialize the input system@>=
779begin open_input; line:=0; other_line:=0;@/
780changing:=true; prime_the_change_buffer; change_changing;@/
781styling:=true; limit:=0; loc:=1; buffer[0]:=" "; input_has_ended:=false;
782end
783
784@ The |get_line| procedure is called when |loc>limit|; it puts the next
785line of merged input into the buffer and updates the other variables
786appropriately.
787
788@p procedure get_line; {inputs the next line}
789label restart;
790begin restart: if styling then
791  @<Read from |style_file| and maybe turn off |styling|@>;
792if not styling then
793  begin if changing then
794    @<Read from |change_file| and maybe turn off |changing|@>;
795  if not changing then
796    begin @<Read from |mf_file| and maybe turn on |changing|@>;
797    if changing then goto restart;
798    end;
799  end;
800end;
801
802@ @<Read from |mf_file|...@>=
803begin incr(line);
804if not input_ln(mf_file) then input_has_ended:=true
805else if limit=change_limit then
806  if buffer[0]=change_buffer[0] then
807    if change_limit>0 then check_change;
808end
809
810@ @<Read from |style_file|...@>=
811begin incr(line);
812if not input_ln(style_file) then
813  begin styling:=false; line:=0;
814  end;
815end
816
817@ @<Read from |change_file|...@>=
818begin incr(line);
819if not input_ln(change_file) then
820  begin err_print('! Change file ended without @@z');
821@.Change file ended...@>
822  buffer[0]:="@@"; buffer[1]:="z"; limit:=2;
823  end;
824if limit>1 then {check if the change has ended}
825  if buffer[0]="@@" then
826    begin if (buffer[1]>="X")and(buffer[1]<="Z") then
827      buffer[1]:=buffer[1]+"z"-"Z"; {lowercasify}
828    if (buffer[1]="x")or(buffer[1]="y") then
829      begin loc:=2; err_print('! Where is the matching @@z?');
830@.Where is the match...@>
831      end
832    else if buffer[1]="z" then
833      begin prime_the_change_buffer; change_changing;
834      end;
835    end;
836end
837
838@ At the end of the program, we will tell the user if the change file
839had a line that didn't match any relevant line in |mf_file|.
840
841@<Check that all changes have been read@>=
842if change_limit<>0 then {|changing| is false}
843  begin for loc:=0 to change_limit do buffer[loc]:=change_buffer[loc];
844  limit:=change_limit; changing:=true; line:=other_line; loc:=change_limit;
845  err_print('! Change file entry did not match');
846@.Change file entry did not match@>
847  end
848
849@* Data structures.
850\.{MFT} puts token names
851into the large |byte_mem| array, which is packed with eight-bit integers.
852Allocation is sequential, since names are never deleted.
853
854An auxiliary array |byte_start| is used as a directory for |byte_mem|;
855the |link| and |ilk| arrays give further information about names.
856These auxiliary arrays consist of sixteen-bit items.
857
858@<Types...@>=
859@!eight_bits=0..255; {unsigned one-byte quantity}
860@!sixteen_bits=0..65535; {unsigned two-byte quantity}
861
862@ \.{MFT} has been designed to avoid the need for indices that are more
863than sixteen bits wide, so that it can be used on most computers.
864
865@<Globals...@>=
866@!byte_mem: packed array [0..max_bytes] of ASCII_code; {characters of names}
867@!byte_start: array [0..max_names] of sixteen_bits; {directory into |byte_mem|}
868@!link: array [0..max_names] of sixteen_bits; {hash table links}
869@!ilk: array [0..max_names] of sixteen_bits; {type codes}
870
871@ The names of tokens are found by computing a hash address |h| and
872then looking at strings of bytes signified by |hash[h]|, |link[hash[h]]|,
873|link[link[hash[h]]]|, \dots, until either finding the desired name
874or encountering a zero.
875
876A `|name_pointer|' variable, which signifies a name, is an index into
877|byte_start|. The actual sequence of characters in the name pointed to by
878|p| appears in positions |byte_start[p]| to |byte_start[p+1]-1|, inclusive,
879of |byte_mem|.
880
881We usually have |byte_start[name_ptr]=byte_ptr|, which is
882the starting position for the next name to be stored in |byte_mem|.
883
884@d length(#)==byte_start[#+1]-byte_start[#] {the length of a name}
885
886@<Types...@>=
887@!name_pointer=0..max_names; {identifies a name}
888
889@ @<Global...@>=
890@!name_ptr:name_pointer; {first unused position in |byte_start|}
891@!byte_ptr:0..max_bytes; {first unused position in |byte_mem|}
892
893@ @<Set init...@>=
894byte_start[0]:=0; byte_ptr:=0;
895byte_start[1]:=0; {this makes name 0 of length zero}
896name_ptr:=1;
897
898@ The hash table described above is updated by the |lookup| procedure,
899which finds a given name and returns a pointer to its index in
900|byte_start|. The token is supposed to match character by character.
901If it was not already present, it is inserted into the table.
902
903Because of the way \.{MFT}'s scanning mechanism works, it is most convenient
904to let |lookup| search for a token that is present in the |buffer|
905array. Two other global variables specify its position in the buffer: the
906first character is |buffer[id_first]|, and the last is |buffer[id_loc-1]|.
907
908@<Glob...@>=
909@!id_first:0..buf_size; {where the current token begins in the buffer}
910@!id_loc:0..buf_size; {just after the current token in the buffer}
911@#
912@!hash:array [0..hash_size] of sixteen_bits; {heads of hash lists}
913
914@ Initially all the hash lists are empty.
915
916@<Local variables for init...@>=
917@!h:0..hash_size; {index into hash-head array}
918
919@ @<Set init...@>=
920for h:=0 to hash_size-1 do hash[h]:=0;
921
922@ Here now is the main procedure for finding tokens.
923
924@p function lookup:name_pointer; {finds current token}
925label found;
926var i:0..buf_size; {index into |buffer|}
927@!h:0..hash_size; {hash code}
928@!k:0..max_bytes; {index into |byte_mem|}
929@!l:0..buf_size; {length of the given token}
930@!p:name_pointer; {where the token is being sought}
931begin l:=id_loc-id_first; {compute the length}
932@<Compute the hash code |h|@>;
933@<Compute the name location |p|@>;
934if p=name_ptr then @<Enter a new name into the table at position |p|@>;
935lookup:=p;
936end;
937
938@ A simple hash code is used: If the sequence of
939ASCII codes is $c_1c_2\ldots c_m$, its hash value will be
940$$(2^{n-1}c_1+2^{n-2}c_2+\cdots+c_n)\,\bmod\,|hash_size|.$$
941
942@<Compute the hash...@>=
943h:=buffer[id_first]; i:=id_first+1;
944while i<id_loc do
945  begin h:=(h+h+buffer[i]) mod hash_size; incr(i);
946  end
947
948@ If the token is new, it will be placed in position |p=name_ptr|,
949otherwise |p| will point to its existing location.
950
951@<Compute the name location...@>=
952p:=hash[h];
953while p<>0 do
954  begin if length(p)=l then
955    @<Compare name |p| with current token,
956      |goto found| if equal@>;
957  p:=link[p];
958  end;
959p:=name_ptr; {the current token is new}
960link[p]:=hash[h]; hash[h]:=p; {insert |p| at beginning of hash list}
961found:
962
963@ @<Compare name |p|...@>=
964begin i:=id_first; k:=byte_start[p];
965while (i<id_loc)and(buffer[i]=byte_mem[k]) do
966  begin incr(i); incr(k);
967  end;
968if i=id_loc then goto found; {all characters agree}
969end
970
971@ When we begin the following segment of the program, |p=name_ptr|.
972
973@<Enter a new name...@>=
974begin if byte_ptr+l>max_bytes then overflow('byte memory');
975if name_ptr+1>max_names then overflow('name');
976i:=id_first; {get ready to move the token into |byte_mem|}
977while i<id_loc do
978  begin byte_mem[byte_ptr]:=buffer[i]; incr(byte_ptr); incr(i);
979  end;
980incr(name_ptr); byte_start[name_ptr]:=byte_ptr;
981@<Assign the default value to |ilk[p]|@>;
982end
983
984@* Initializing the primitive tokens.
985Each token read by \.{MFT} is recognized as belonging to one of the
986following ``types'':
987
988@d indentation=0 {internal code for space at beginning of a line}
989@d end_of_line=1 {internal code for hypothetical token at end of a line}
990@d end_of_file=2 {internal code for hypothetical token at end of the input}
991@d verbatim=3 {internal code for the token `\.{\%\%}'}
992@d set_format=4 {internal code for the token `\.{\%\%\%}'}
993@d mft_comment=5 {internal code for the token `\.{\%\%\%\%}'}
994@d min_action_type=6 {smallest code for tokens that produce ``real'' output}
995@d numeric_token=6 {internal code for tokens like `\.{3.14159}'}
996@d string_token=7 {internal code for tokens like `|"pie"|'}
997@d min_symbolic_token=8 {smallest internal code for a symbolic token}
998@d op=8 {internal code for tokens like `\.{sqrt}'}
999@d command=9 {internal code for tokens like `\.{addto}'}
1000@d endit=10 {internal code for tokens like `\.{fi}'}
1001@d binary=11 {internal code for tokens like `\.{and}'}
1002@d abinary=12 {internal code for tokens like `\.{+}'}
1003@d bbinary=13 {internal code for tokens like `\.{step}'}
1004@d ampersand=14 {internal code for the token `\.{\char`\&}'}
1005@d pyth_sub=15 {internal code for the token `\.{+-+}'}
1006@d as_is=16 {internal code for tokens like `\.{]}'}
1007@d bold=17 {internal code for tokens like `\.{nullpen}'}
1008@d type_name=18 {internal code for tokens like `\.{numeric}'}
1009@d path_join=19 {internal code for the token `\.{..}'}
1010@d colon=20 {internal code for the token `\.:'}
1011@d semicolon=21 {internal code for the token `\.;'}
1012@d backslash=22 {internal code for the token `\.{\\}'}
1013@d double_back=23 {internal code for the token `\.{\\\\}'}
1014@d less_or_equal=24 {internal code for the token `\.{<=}'}
1015@d greater_or_equal=25 {internal code for the token `\.{>=}'}
1016@d not_equal=26 {internal code for the token `\.{<>}'}
1017@d sharp=27 {internal code for the token `\.{\char`\#}'}
1018@d comment=28 {internal code for the token `\.{\char`\%}'}
1019@d recomment=29 {internal code used to resume a comment after `\pb'}
1020@d min_suffix=30 {smallest code for symbolic tokens in suffixes}
1021@d internal=30 {internal code for tokens like `\.{pausing}'}
1022@d input_command=31 {internal code for tokens like `\.{input}'}
1023@d special_tag=32 {internal code for tags that take at most one subscript}
1024@d tag=33 {internal code for nonprimitive tokens}
1025
1026@<Assign the default value to |ilk[p]|@>=ilk[p]:=tag
1027
1028@ We have to get \MF's primitives into the hash table, and the
1029simplest way to do this is to insert them every time \.{MFT} is run.
1030
1031A few macros permit us to do the initialization with a compact program.
1032We use the fact that the longest primitive is \.{intersectiontimes},
1033which is 17 letters long.
1034
1035@d spr17(#)==buffer[17]:=#;cur_tok:=lookup;ilk[cur_tok]:=
1036@d spr16(#)==buffer[16]:=#;spr17
1037@d spr15(#)==buffer[15]:=#;spr16
1038@d spr14(#)==buffer[14]:=#;spr15
1039@d spr13(#)==buffer[13]:=#;spr14
1040@d spr12(#)==buffer[12]:=#;spr13
1041@d spr11(#)==buffer[11]:=#;spr12
1042@d spr10(#)==buffer[10]:=#;spr11
1043@d spr9(#)==buffer[9]:=#;spr10
1044@d spr8(#)==buffer[8]:=#;spr9
1045@d spr7(#)==buffer[7]:=#;spr8
1046@d spr6(#)==buffer[6]:=#;spr7
1047@d spr5(#)==buffer[5]:=#;spr6
1048@d spr4(#)==buffer[4]:=#;spr5
1049@d spr3(#)==buffer[3]:=#;spr4
1050@d spr2(#)==buffer[2]:=#;spr3
1051@d spr1(#)==buffer[1]:=#;spr2
1052@d pr1==id_first:=17; spr17
1053@d pr2==id_first:=16; spr16
1054@d pr3==id_first:=15; spr15
1055@d pr4==id_first:=14; spr14
1056@d pr5==id_first:=13; spr13
1057@d pr6==id_first:=12; spr12
1058@d pr7==id_first:=11; spr11
1059@d pr8==id_first:=10; spr10
1060@d pr9==id_first:=9; spr9
1061@d pr10==id_first:=8; spr8
1062@d pr11==id_first:=7; spr7
1063@d pr12==id_first:=6; spr6
1064@d pr13==id_first:=5; spr5
1065@d pr14==id_first:=4; spr4
1066@d pr15==id_first:=3; spr3
1067@d pr16==id_first:=2; spr2
1068@d pr17==id_first:=1; spr1
1069
1070@ The intended use of the macros above might not be immediately obvious,
1071but the riddle is answered by the following:
1072
1073@<Store all the primitives@>=
1074id_loc:=18;@/
1075pr2(".")(".")(path_join);@/
1076pr1("[")(as_is);@/
1077pr1("]")(as_is);@/
1078pr1("}")(as_is);@/
1079pr1("{")(as_is);@/
1080pr1(":")(colon);@/
1081pr2(":")(":")(colon);@/
1082pr3("|")("|")(":")(colon);@/
1083pr2(":")("=")(as_is);@/
1084pr1(",")(as_is);@/
1085pr1(";")(semicolon);@/
1086pr1("\")(backslash);@/
1087pr2("\")("\")(double_back);@/
1088pr5("a")("d")("d")("t")("o")(command);@/
1089pr2("a")("t")(bbinary);@/
1090pr7("a")("t")("l")("e")("a")("s")("t")(op);@/
1091pr10("b")("e")("g")("i")("n")("g")("r")("o")("u")("p")(command);
1092pr8("c")("o")("n")("t")("r")("o")("l")("s")(op);@/
1093pr4("c")("u")("l")("l")(command);@/
1094pr4("c")("u")("r")("l")(op);@/
1095pr10("d")("e")("l")("i")("m")("i")("t")("e")("r")("s")(command);@/
1096pr7("d")("i")("s")("p")("l")("a")("y")(command);@/
1097pr8("e")("n")("d")("g")("r")("o")("u")("p")(endit);@/
1098pr8("e")("v")("e")("r")("y")("j")("o")("b")(command);@/
1099pr6("e")("x")("i")("t")("i")("f")(command);@/
1100pr11("e")("x")("p")("a")("n")("d")("a")("f")("t")("e")("r")(command);@/
1101pr4("f")("r")("o")("m")(bbinary);@/
1102pr8("i")("n")("w")("i")("n")("d")("o")("w")(bbinary);@/
1103pr7("i")("n")("t")("e")("r")("i")("m")(command);@/
1104pr3("l")("e")("t")(command);@/
1105pr11("n")("e")("w")("i")("n")("t")("e")("r")("n")("a")("l")(command);@/
1106pr2("o")("f")(command);@/
1107pr10("o")("p")("e")("n")("w")("i")("n")("d")("o")("w")(command);@/
1108pr10("r")("a")("n")("d")("o")("m")("s")("e")("e")("d")(command);@/
1109pr4("s")("a")("v")("e")(command);@/
1110pr10("s")("c")("a")("n")("t")("o")("k")("e")("n")("s")(command);@/
1111pr7("s")("h")("i")("p")("o")("u")("t")(command);@/
1112pr4("s")("t")("e")("p")(bbinary);@/
1113pr3("s")("t")("r")(command);@/
1114pr7("t")("e")("n")("s")("i")("o")("n")(op);@/
1115pr2("t")("o")(bbinary);@/
1116pr5("u")("n")("t")("i")("l")(bbinary);@/
1117pr3("d")("e")("f")(command);@/
1118pr6("v")("a")("r")("d")("e")("f")(command);@/
1119
1120@ (There are so many primitives, it's necessary to break this long
1121initialization code up into pieces so as not to overflow \.{WEAVE}'s capacity.)
1122
1123@<Store all the primitives@>=
1124pr10("p")("r")("i")("m")("a")("r")("y")("d")("e")("f")(command);@/
1125pr12("s")("e")("c")("o")("n")("d")("a")("r")("y")("d")("e")("f")(command);@/
1126pr11("t")("e")("r")("t")("i")("a")("r")("y")("d")("e")("f")(command);@/
1127pr6("e")("n")("d")("d")("e")("f")(endit);@/
1128pr3("f")("o")("r")(command);@/
1129pr11("f")("o")("r")("s")("u")("f")("f")("i")("x")("e")("s")(command);@/
1130pr7("f")("o")("r")("e")("v")("e")("r")(command);@/
1131pr6("e")("n")("d")("f")("o")("r")(endit);@/
1132pr5("q")("u")("o")("t")("e")(command);@/
1133pr4("e")("x")("p")("r")(command);@/
1134pr6("s")("u")("f")("f")("i")("x")(command);@/
1135pr4("t")("e")("x")("t")(command);@/
1136pr7("p")("r")("i")("m")("a")("r")("y")(command);@/
1137pr9("s")("e")("c")("o")("n")("d")("a")("r")("y")(command);@/
1138pr8("t")("e")("r")("t")("i")("a")("r")("y")(command);@/
1139pr5("i")("n")("p")("u")("t")(input_command);@/
1140pr8("e")("n")("d")("i")("n")("p")("u")("t")(bold);@/
1141pr2("i")("f")(command);@/
1142pr2("f")("i")(endit);@/
1143pr4("e")("l")("s")("e")(command);@/
1144pr6("e")("l")("s")("e")("i")("f")(command);@/
1145pr4("t")("r")("u")("e")(bold);@/
1146pr5("f")("a")("l")("s")("e")(bold);@/
1147pr11("n")("u")("l")("l")("p")("i")("c")("t")("u")("r")("e")(bold);@/
1148pr7("n")("u")("l")("l")("p")("e")("n")(bold);@/
1149pr7("j")("o")("b")("n")("a")("m")("e")(bold);@/
1150pr10("r")("e")("a")("d")("s")("t")("r")("i")("n")("g")(bold);@/
1151pr9("p")("e")("n")("c")("i")("r")("c")("l")("e")(bold);@/
1152pr4("g")("o")("o")("d")(special_tag);@/
1153pr2("=")(":")(as_is);@/
1154pr3("=")(":")("|")(as_is);@/
1155pr4("=")(":")("|")(">")(as_is);@/
1156pr3("|")("=")(":")(as_is);@/
1157pr4("|")("=")(":")(">")(as_is);@/
1158pr4("|")("=")(":")("|")(as_is);@/
1159pr5("|")("=")(":")("|")(">")(as_is);@/
1160pr6("|")("=")(":")("|")(">")(">")(as_is);@/
1161pr4("k")("e")("r")("n")(binary);
1162pr6("s")("k")("i")("p")("t")("o")(command);@/
1163
1164@ (Does anybody out there remember the commercials that went \.{LS-MFT}?)
1165
1166@<Store all the prim...@>=
1167pr13("n")("o")("r")("m")("a")("l")("d")("e")("v")("i")("a")("t")("e")(op);@/
1168pr3("o")("d")("d")(op);@/
1169pr5("k")("n")("o")("w")("n")(op);@/
1170pr7("u")("n")("k")("n")("o")("w")("n")(op);@/
1171pr3("n")("o")("t")(op);@/
1172pr7("d")("e")("c")("i")("m")("a")("l")(op);@/
1173pr7("r")("e")("v")("e")("r")("s")("e")(op);@/
1174pr8("m")("a")("k")("e")("p")("a")("t")("h")(op);@/
1175pr7("m")("a")("k")("e")("p")("e")("n")(op);@/
1176pr11("t")("o")("t")("a")("l")("w")("e")("i")("g")("h")("t")(op);@/
1177pr3("o")("c")("t")(op);@/
1178pr3("h")("e")("x")(op);@/
1179pr5("A")("S")("C")("I")("I")(op);@/
1180pr4("c")("h")("a")("r")(op);@/
1181pr6("l")("e")("n")("g")("t")("h")(op);@/
1182pr13("t")("u")("r")("n")("i")("n")("g")("n")("u")("m")("b")("e")("r")(op);@/
1183pr5("x")("p")("a")("r")("t")(op);@/
1184pr5("y")("p")("a")("r")("t")(op);@/
1185pr6("x")("x")("p")("a")("r")("t")(op);@/
1186pr6("x")("y")("p")("a")("r")("t")(op);@/
1187pr6("y")("x")("p")("a")("r")("t")(op);@/
1188pr6("y")("y")("p")("a")("r")("t")(op);@/
1189pr4("s")("q")("r")("t")(op);@/
1190pr4("m")("e")("x")("p")(op);@/
1191pr4("m")("l")("o")("g")(op);@/
1192pr4("s")("i")("n")("d")(op);@/
1193pr4("c")("o")("s")("d")(op);@/
1194pr5("f")("l")("o")("o")("r")(op);@/
1195pr14("u")("n")("i")("f")("o")("r")("m")("d")("e")("v")("i")("a")("t")("e")(op);
1196  @/
1197pr10("c")("h")("a")("r")("e")("x")("i")("s")("t")("s")(op);@/
1198pr5("a")("n")("g")("l")("e")(op);@/
1199pr5("c")("y")("c")("l")("e")(op);@/
1200
1201@ (If you think this \.{WEB} code is ugly, you should see the Pascal code
1202it produces.)
1203
1204@<Store all the primitives@>=
1205pr13("t")("r")("a")("c")("i")("n")("g")
1206 ("t")("i")("t")("l")("e")("s")(internal);@/
1207pr16("t")("r")("a")("c")("i")("n")("g")
1208 ("e")("q")("u")("a")("t")("i")("o")("n")("s")(internal);@/
1209pr15("t")("r")("a")("c")("i")("n")("g")
1210 ("c")("a")("p")("s")("u")("l")("e")("s")(internal);@/
1211pr14("t")("r")("a")("c")("i")("n")("g")
1212 ("c")("h")("o")("i")("c")("e")("s")(internal);@/
1213pr12("t")("r")("a")("c")("i")("n")("g")
1214 ("s")("p")("e")("c")("s")(internal);@/
1215pr11("t")("r")("a")("c")("i")("n")("g")
1216 ("p")("e")("n")("s")(internal);@/
1217pr15("t")("r")("a")("c")("i")("n")("g")
1218 ("c")("o")("m")("m")("a")("n")("d")("s")(internal);@/
1219pr13("t")("r")("a")("c")("i")("n")("g")
1220 ("m")("a")("c")("r")("o")("s")(internal);@/
1221pr12("t")("r")("a")("c")("i")("n")("g")
1222 ("e")("d")("g")("e")("s")(internal);@/
1223pr13("t")("r")("a")("c")("i")("n")("g")
1224 ("o")("u")("t")("p")("u")("t")(internal);@/
1225pr12("t")("r")("a")("c")("i")("n")("g")
1226 ("s")("t")("a")("t")("s")(internal);@/
1227pr13("t")("r")("a")("c")("i")("n")("g")
1228 ("o")("n")("l")("i")("n")("e")(internal);@/
1229
1230@ @<Store all the primitives@>=
1231pr4("y")("e")("a")("r")(internal);@/
1232pr5("m")("o")("n")("t")("h")(internal);@/
1233pr3("d")("a")("y")(internal);@/
1234pr4("t")("i")("m")("e")(internal);@/
1235pr8("c")("h")("a")("r")("c")("o")("d")("e")(internal);@/
1236pr7("c")("h")("a")("r")("f")("a")("m")(internal);@/
1237pr6("c")("h")("a")("r")("w")("d")(internal);@/
1238pr6("c")("h")("a")("r")("h")("t")(internal);@/
1239pr6("c")("h")("a")("r")("d")("p")(internal);@/
1240pr6("c")("h")("a")("r")("i")("c")(internal);@/
1241pr6("c")("h")("a")("r")("d")("x")(internal);@/
1242pr6("c")("h")("a")("r")("d")("y")(internal);@/
1243pr10("d")("e")("s")("i")("g")("n")("s")("i")("z")("e")(internal);@/
1244pr4("h")("p")("p")("p")(internal);@/
1245pr4("v")("p")("p")("p")(internal);@/
1246pr7("x")("o")("f")("f")("s")("e")("t")(internal);@/
1247pr7("y")("o")("f")("f")("s")("e")("t")(internal);@/
1248pr7("p")("a")("u")("s")("i")("n")("g")(internal);@/
1249pr12("s")("h")("o")("w")
1250 ("s")("t")("o")("p")("p")("i")("n")("g")(internal);@/
1251pr10("f")("o")("n")("t")("m")("a")("k")("i")("n")("g")(internal);@/
1252pr8("p")("r")("o")("o")("f")("i")("n")("g")(internal);@/
1253pr9("s")("m")("o")("o")("t")("h")("i")("n")("g")(internal);@/
1254pr12("a")("u")("t")("o")("r")("o")("u")("n")("d")("i")("n")("g")(internal);@/
1255pr11("g")("r")("a")("n")("u")("l")("a")("r")("i")("t")("y")(internal);@/
1256pr6("f")("i")("l")("l")("i")("n")(internal);@/
1257pr12("t")("u")("r")("n")("i")("n")("g")("c")("h")("e")("c")("k")(internal);@/
1258pr12("w")("a")("r")("n")("i")("n")("g")("c")("h")("e")("c")("k")(internal);@/
1259pr12("b")("o")("u")("n")("d")("a")("r")("y")("c")("h")("a")("r")(internal);@/
1260
1261@ Still more.
1262
1263@<Store all the prim...@>=
1264pr1("+")(abinary);@/
1265pr1("-")(abinary);@/
1266pr1("*")(abinary);@/
1267pr1("/")(as_is);@/
1268pr2("+")("+")(binary);@/
1269pr3("+")("-")("+")(pyth_sub);@/
1270pr3("a")("n")("d")(binary);@/
1271pr2("o")("r")(binary);@/
1272pr1("<")(as_is);@/
1273pr2("<")("=")(less_or_equal);@/
1274pr1(">")(as_is);@/
1275pr2(">")("=")(greater_or_equal);@/
1276pr1("=")(as_is);@/
1277pr2("<")(">")(not_equal);@/
1278pr9("s")("u")("b")("s")("t")("r")("i")("n")("g")(command);@/
1279pr7("s")("u")("b")("p")("a")("t")("h")(command);@/
1280pr13("d")("i")("r")("e")("c")("t")("i")("o")("n")@|
1281 ("t")("i")("m")("e")(command);@/
1282pr5("p")("o")("i")("n")("t")(command);@/
1283pr10("p")("r")("e")("c")("o")("n")("t")("r")("o")("l")(command);@/
1284pr11("p")("o")("s")("t")("c")("o")("n")("t")("r")("o")("l")(command);@/
1285pr9("p")("e")("n")("o")("f")("f")("s")("e")("t")(command);@/
1286pr1("&")(ampersand);@/
1287pr7("r")("o")("t")("a")("t")("e")("d")(binary);@/
1288pr7("s")("l")("a")("n")("t")("e")("d")(binary);@/
1289pr6("s")("c")("a")("l")("e")("d")(binary);@/
1290pr7("s")("h")("i")("f")("t")("e")("d")(binary);@/
1291pr11("t")("r")("a")("n")("s")("f")("o")("r")("m")("e")("d")(binary);@/
1292pr7("x")("s")("c")("a")("l")("e")("d")(binary);@/
1293pr7("y")("s")("c")("a")("l")("e")("d")(binary);@/
1294pr7("z")("s")("c")("a")("l")("e")("d")(binary);@/
1295pr17("i")("n")("t")("e")("r")("s")("e")("c")("t")("i")("o")("n")@|
1296 ("t")("i")("m")("e")("s")(binary);@/
1297pr7("n")("u")("m")("e")("r")("i")("c")(type_name);@/
1298pr6("s")("t")("r")("i")("n")("g")(type_name);@/
1299pr7("b")("o")("o")("l")("e")("a")("n")(type_name);@/
1300pr4("p")("a")("t")("h")(type_name);@/
1301pr3("p")("e")("n")(type_name);@/
1302pr7("p")("i")("c")("t")("u")("r")("e")(type_name);@/
1303pr9("t")("r")("a")("n")("s")("f")("o")("r")("m")(type_name);@/
1304pr4("p")("a")("i")("r")(type_name);@/
1305
1306@ At last we are done with the tedious initialization of primitives.
1307
1308@<Store all the prim...@>=
1309pr3("e")("n")("d")(endit);@/
1310pr4("d")("u")("m")("p")(endit);@/
1311pr9("b")("a")("t")("c")("h")("m")("o")("d")("e")(bold);
1312pr11("n")("o")("n")("s")("t")("o")("p")("m")("o")("d")("e")(bold);
1313pr10("s")("c")("r")("o")("l")("l")("m")("o")("d")("e")(bold);
1314pr13("e")("r")("r")("o")("r")("s")("t")("o")("p")@|
1315 ("m")("o")("d")("e")(bold);
1316pr5("i")("n")("n")("e")("r")(command);@/
1317pr5("o")("u")("t")("e")("r")(command);@/
1318pr9("s")("h")("o")("w")("t")("o")("k")("e")("n")(command);@/
1319pr9("s")("h")("o")("w")("s")("t")("a")("t")("s")(bold);@/
1320pr4("s")("h")("o")("w")(command);@/
1321pr12("s")("h")("o")("w")("v")("a")("r")("i")("a")("b")("l")("e")(command);@/
1322pr16("s")("h")("o")("w")@|
1323 ("d")("e")("p")("e")("n")("d")("e")("n")("c")("i")("e")("s")(bold);@/
1324pr7("c")("o")("n")("t")("o")("u")("r")(command);@/
1325pr10("d")("o")("u")("b")("l")("e")("p")("a")("t")("h")(command);@/
1326pr4("a")("l")("s")("o")(command);@/
1327pr7("w")("i")("t")("h")("p")("e")("n")(command);@/
1328pr10("w")("i")("t")("h")("w")("e")("i")("g")("h")("t")(command);@/
1329pr8("d")("r")("o")("p")("p")("i")("n")("g")(command);@/
1330pr7("k")("e")("e")("p")("i")("n")("g")(command);@/
1331pr7("m")("e")("s")("s")("a")("g")("e")(command);@/
1332pr10("e")("r")("r")("m")("e")("s")("s")("a")("g")("e")(command);@/
1333pr7("e")("r")("r")("h")("e")("l")("p")(command);@/
1334pr8("c")("h")("a")("r")("l")("i")("s")("t")(command);@/
1335pr8("l")("i")("g")("t")("a")("b")("l")("e")(command);@/
1336pr10("e")("x")("t")("e")("n")("s")("i")("b")("l")("e")(command);@/
1337pr10("h")("e")("a")("d")("e")("r")("b")("y")("t")("e")(command);@/
1338pr9("f")("o")("n")("t")("d")("i")("m")("e")("n")(command);@/
1339pr7("s")("p")("e")("c")("i")("a")("l")(command);@/
1340pr10("n")("u")("m")("s")("p")("e")("c")("i")("a")("l")(command);@/
1341pr1("%")(comment);@/
1342pr2("%")("%")(verbatim);@/
1343pr3("%")("%")("%")(set_format);@/
1344pr4("%")("%")("%")("%")(mft_comment);@/
1345pr1("#")(sharp);@/
1346
1347@ We also want to store a few other strings of characters that are
1348used in \.{MFT}'s translation to \TeX\ code.
1349
1350@d ttr1(#)==byte_mem[byte_ptr-1]:=#; cur_tok:=name_ptr;
1351  incr(name_ptr); byte_start[name_ptr]:=byte_ptr
1352@d ttr2(#)==byte_mem[byte_ptr-2]:=#; ttr1
1353@d ttr3(#)==byte_mem[byte_ptr-3]:=#; ttr2
1354@d ttr4(#)==byte_mem[byte_ptr-4]:=#; ttr3
1355@d ttr5(#)==byte_mem[byte_ptr-5]:=#; ttr4
1356@d tr1==incr(byte_ptr); ttr1
1357@d tr2==byte_ptr:=byte_ptr+2; ttr2
1358@d tr3==byte_ptr:=byte_ptr+3; ttr3
1359@d tr4==byte_ptr:=byte_ptr+4; ttr4
1360@d tr5==byte_ptr:=byte_ptr+5; ttr5
1361
1362@<Glob...@>=
1363@!translation:array[ASCII_code] of name_pointer;
1364@!i:ASCII_code; {index into |translation|}
1365
1366@ @<Store all the translations@>=
1367for i:=0 to 255 do translation[i]:=0;
1368tr2("\")("$"); translation["$"]:=cur_tok;@/
1369tr2("\")("#"); translation["#"]:=cur_tok;@/
1370tr2("\")("&"); translation["&"]:=cur_tok;@/
1371tr2("\")("{"); translation["{"]:=cur_tok;@/
1372tr2("\")("}"); translation["}"]:=cur_tok;@/
1373tr2("\")("_"); translation["_"]:=cur_tok;@/
1374tr2("\")("%"); translation["%"]:=cur_tok;@/
1375tr4("\")("B")("S")(" "); translation["\"]:=cur_tok;@/
1376tr4("\")("H")("A")(" "); translation["^"]:=cur_tok;@/
1377tr4("\")("T")("I")(" "); translation["~"]:=cur_tok;@/
1378tr5("\")("a")("s")("t")(" "); translation["*"]:=cur_tok;@/
1379tr4("\")("A")("M")(" "); tr_amp:=cur_tok;@/
1380@.\\AM, etc@>
1381tr4("\")("B")("L")(" "); tr_skip:=cur_tok;@/
1382tr4("\")("S")("H")(" "); tr_sharp:=cur_tok;@/
1383tr4("\")("P")("S")(" "); tr_ps:=cur_tok;@/
1384tr4("\")("l")("e")(" "); tr_le:=cur_tok;@/
1385tr4("\")("g")("e")(" "); tr_ge:=cur_tok;@/
1386tr4("\")("n")("e")(" "); tr_ne:=cur_tok;@/
1387tr5("\")("q")("u")("a")("d"); tr_quad:=cur_tok;@/
1388
1389@ @<Glob...@>=
1390@!tr_le,@!tr_ge,@!tr_ne,@!tr_amp,@!tr_sharp,@!tr_skip,@!tr_ps,
1391 @!tr_quad:name_pointer; {special translations}
1392
1393@* Inputting the next token.
1394\.{MFT}'s lexical scanning routine is called |get_next|. This procedure
1395inputs the next token of \MF\ input and puts its encoded meaning into
1396two global variables, |cur_type| and |cur_tok|.
1397
1398@<Glob...@>=
1399@!cur_type:eight_bits; {type of token just scanned}
1400@!cur_tok:integer; {hash table or buffer location}
1401@!prev_type:eight_bits; {previous value of |cur_type|}
1402@!prev_tok:integer; {previous value of |cur_tok|}
1403
1404@ @<Set init...@>=
1405cur_type:=end_of_line; cur_tok:=0;
1406
1407@ Two global state variables affect the behavior of |get_next|: A space
1408will be considered significant when |start_of_line| is |true|,
1409and the buffer will be considered devoid of information when |empty_buffer|
1410is |true|.
1411
1412@<Glob...@>=
1413@!start_of_line:boolean; {has the current line had nothing but spaces so far?}
1414@!empty_buffer:boolean; {is it time to input a new line?}
1415
1416@ The 256 |ASCII_code| characters are grouped into classes by means of
1417the |char_class| table. Individual class numbers have no semantic
1418or syntactic significance, expect in a few instances defined here.
1419There's also |max_class|, which can be used as a basis for additional
1420class numbers in nonstandard extensions of \MF.
1421
1422@d digit_class=0 {the class number of \.{0123456789}}
1423@d period_class=1 {the class number of `\..'}
1424@d space_class=2 {the class number of spaces and nonstandard characters}
1425@d percent_class=3 {the class number of `\.\%'}
1426@d string_class=4 {the class number of `\."'}
1427@d right_paren_class=8 {the class number of `\.)'}
1428@d isolated_classes==5,6,7,8 {characters that make length-one tokens only}
1429@d letter_class=9 {letters and the underline character}
1430@d left_bracket_class=17 {`\.['}
1431@d right_bracket_class=18 {`\.]'}
1432@d invalid_class=20 {bad character in the input}
1433@d end_line_class=21 {end of an input line (\.{MFT} only)}
1434@d max_class=21 {the largest class number}
1435
1436@<Glob...@>=
1437@!char_class:array[ASCII_code] of 0..max_class; {the class numbers}
1438
1439@ If changes are made to accommodate non-ASCII character sets, they should be
1440essentially the same in \.{MFT} as in \MF. However, \.{MFT} has an additional
1441class number, the |end_line_class|, which is used only for the special
1442character |carriage_return| that is placed at the end of the input buffer.
1443@^character set dependencies@>
1444@^system dependencies@>
1445
1446@d carriage_return=@'15 {special code placed in |buffer[limit]|}
1447
1448@<Set init...@>=
1449for i:="0" to "9" do char_class[i]:=digit_class;
1450char_class["."]:=period_class;
1451char_class[" "]:=space_class;
1452char_class["%"]:=percent_class;
1453char_class[""""]:=string_class;@/
1454char_class[","]:=5;
1455char_class[";"]:=6;
1456char_class["("]:=7;
1457char_class[")"]:=right_paren_class;
1458for i:="A" to "Z" do char_class[i]:=letter_class;
1459for i:="a" to "z" do char_class[i]:=letter_class;
1460char_class["_"]:=letter_class;@/
1461char_class["<"]:=10;
1462char_class["="]:=10;
1463char_class[">"]:=10;
1464char_class[":"]:=10;
1465char_class["|"]:=10;@/
1466char_class["`"]:=11;
1467char_class["'"]:=11;@/
1468char_class["+"]:=12;
1469char_class["-"]:=12;@/
1470char_class["/"]:=13;
1471char_class["*"]:=13;
1472char_class["\"]:=13;@/
1473char_class["!"]:=14;
1474char_class["?"]:=14;@/
1475char_class["#"]:=15;
1476char_class["&"]:=15;
1477char_class["@@"]:=15;
1478char_class["$"]:=15;@/
1479char_class["^"]:=16;
1480char_class["~"]:=16;@/
1481char_class["["]:=left_bracket_class;
1482char_class["]"]:=right_bracket_class;@/
1483char_class["{"]:=19;
1484char_class["}"]:=19;@/
1485for i:=0 to " "-1 do char_class[i]:=invalid_class;
1486char_class[carriage_return]:=end_line_class;@/
1487for i:=127 to 255 do char_class[i]:=invalid_class;
1488
1489@ And now we're ready to take the plunge into |get_next| itself.
1490
1491@d switch=25 {a label in |get_next|}
1492@d pass_digits=85 {another}
1493@d pass_fraction=86 {and still another, although |goto| is considered harmful}
1494
1495@p procedure get_next; {sets |cur_type| and |cur_tok| to next token}
1496label switch,pass_digits,pass_fraction,done,found,exit;
1497var @!c:ASCII_code; {the current character in the buffer}
1498@!class:ASCII_code; {its class number}
1499begin prev_type:=cur_type; prev_tok:=cur_tok;
1500if empty_buffer then
1501  @<Bring in a new line of input; |return| if the file has ended@>;
1502switch: c:=buffer[loc]; id_first:=loc; incr(loc); class:=char_class[c];
1503@<Branch on the |class|, scan the token; |return| directly if the
1504  token is special, or |goto found| if it needs to be looked up@>;
1505found:id_loc:=loc; cur_tok:=lookup; cur_type:=ilk[cur_tok];
1506exit:end;
1507
1508@ @d emit(#)==@t@>@+begin cur_type:=#; cur_tok:=id_first; return;@+end
1509
1510@<Branch on the |class|...@>=
1511case class of
1512digit_class:goto pass_digits;
1513period_class:begin class:=char_class[buffer[loc]];
1514  if class>period_class then goto switch {ignore isolated `\..'}
1515  else if class<period_class then goto pass_fraction; {|class=digit_class|}
1516  end;
1517space_class:if start_of_line then emit(indentation)
1518  else goto switch;
1519end_line_class: emit(end_of_line);
1520string_class:@<Get a string token and |return|@>;
1521isolated_classes: goto found;
1522invalid_class:@<Decry the invalid character and |goto switch|@>;
1523othercases do_nothing {letters, etc.}
1524endcases;@/
1525while char_class[buffer[loc]]=class do incr(loc);
1526goto found;
1527pass_digits: while char_class[buffer[loc]]=digit_class do incr(loc);
1528if buffer[loc]<>"." then goto done;
1529if char_class[buffer[loc+1]]<>digit_class then goto done;
1530incr(loc);
1531pass_fraction:repeat incr(loc);
1532until char_class[buffer[loc]]<>digit_class;
1533done:emit(numeric_token)
1534
1535@ @<Get a string token and |return|@>=
1536loop@+begin if buffer[loc]="""" then
1537    begin incr(loc); emit(string_token);
1538    end;
1539  if loc=limit then @<Decry the missing string delimiter and |goto switch|@>;
1540  incr(loc);
1541  end
1542
1543@ @<Decry the missing string delimiter and |goto switch|@>=
1544begin err_print('! Incomplete string will be ignored'); goto switch;
1545@.Incomplete string...@>
1546end
1547
1548@ @<Decry the invalid character and |goto switch|@>=
1549begin err_print('! Invalid character will be ignored'); goto switch;
1550@.Invalid character...@>
1551end
1552
1553@ @<Bring in a new line of input; |return| if the file has ended@>=
1554begin get_line;
1555if input_has_ended then emit(end_of_file);
1556buffer[limit]:=carriage_return; loc:=0; start_of_line:=true;
1557empty_buffer:=false;
1558end
1559
1560@* Low-level output routines.
1561The \TeX\ output is supposed to appear in lines at most |line_length|
1562characters long, so we place it into an output buffer. During the output
1563process, |out_line| will hold the current line number of the line about to
1564be output.
1565
1566@<Glo...@>=
1567@!out_buf:array[0..line_length] of ASCII_code; {assembled characters}
1568@!out_ptr:0..line_length; {number of characters in |out_buf|}
1569@!out_line: integer; {coordinates of next line to be output}
1570
1571@ The |flush_buffer| routine empties the buffer up to a given breakpoint,
1572and moves any remaining characters to the beginning of the next line.
1573If the |per_cent| parameter is |true|, a |"%"| is appended to the line
1574that is being output; in this case the breakpoint |b| should be strictly
1575less than |line_length|. If the |per_cent| parameter is |false|,
1576trailing blanks are suppressed.
1577The characters emptied from the buffer form a new line of output.
1578
1579@p procedure flush_buffer(@!b:eight_bits;@!per_cent:boolean);
1580  {outputs |out_buf[1..b]|, where |b<=out_ptr|}
1581label done;
1582var j,@!k:0..line_length;
1583begin j:=b;
1584if not per_cent then {remove trailing blanks}
1585  loop@+  begin if j=0 then goto done;
1586    if out_buf[j]<>" " then goto done;
1587    decr(j);
1588    end;
1589done: for k:=1 to j do write(tex_file,xchr[out_buf[k]]);
1590if per_cent then write(tex_file,xchr["%"]);
1591write_ln(tex_file); incr(out_line);
1592if b<out_ptr then for k:=b+1 to out_ptr do out_buf[k-b]:=out_buf[k];
1593out_ptr:=out_ptr-b;
1594end;
1595
1596@ \.{MFT} calls |flush_buffer(out_ptr,false)| before it has input
1597anything. We initialize the output variables
1598so that the first line of the output file will be `\.{\\input mftmac}'.
1599@.\\input mftmac@>
1600@.mftmac@>
1601
1602@<Set init...@>=
1603out_ptr:=1; out_buf[1]:=" "; out_line:=1; write(tex_file,'\input mftmac');
1604
1605@ When we wish to append the character |c| to the output buffer, we write
1606`$|out|(c)$'; this will cause the buffer to be emptied if it was already
1607full. Similarly, `$|out2|(c_1)(c_2)$' appends a pair of characters.
1608A line break will occur at a space or after a single-nonletter
1609\TeX\ control sequence.
1610
1611@d oot(#)==@;@/
1612  if out_ptr=line_length then break_out;
1613  incr(out_ptr); out_buf[out_ptr]:=#;
1614@d oot1(#)==oot(#)@+end
1615@d oot2(#)==oot(#)@,oot1
1616@d oot3(#)==oot(#)@,oot2
1617@d oot4(#)==oot(#)@,oot3
1618@d oot5(#)==oot(#)@,oot4
1619@d out==@+begin oot1
1620@d out2==@+begin oot2
1621@d out3==@+begin oot3
1622@d out4==@+begin oot4
1623@d out5==@+begin oot5
1624
1625@ The |break_out| routine is called just before the output buffer is about
1626to overflow. To make this routine a little faster, we initialize position
16270 of the output buffer to `\.\\'; this character isn't really output.
1628
1629@<Set init...@>=
1630out_buf[0]:="\";
1631
1632@ A long line is broken at a blank space or just before a backslash that isn't
1633preceded by another backslash. In the latter case, a |"%"| is output at
1634the break. (This policy has a known bug, in the rare situation that the
1635backslash was in a string constant that's being output ``verbatim.'')
1636
1637@p procedure break_out; {finds a way to break the output line}
1638label exit;
1639var k:0..line_length; {index into |out_buf|}
1640@!d:ASCII_code; {character from the buffer}
1641begin k:=out_ptr;
1642loop@+  begin if k=0 then
1643    @<Print warning message, break the line, |return|@>;
1644  d:=out_buf[k];
1645  if d=" " then
1646    begin flush_buffer(k,false); return;
1647    end;
1648  if (d="\")and(out_buf[k-1]<>"\") then {in this case |k>1|}
1649    begin flush_buffer(k-1,true); return;
1650    end;
1651  decr(k);
1652  end;
1653exit:end;
1654
1655@ We get to this module only in unusual cases that the entire output line
1656consists of a string of backslashes followed by a string of nonblank
1657non-backslashes. In such cases it is almost always safe to break the
1658line by putting a |"%"| just before the last character.
1659
1660@<Print warning message...@>=
1661begin print_nl('! Line had to be broken (output l.',out_line:1);
1662@.Line had to be broken@>
1663print_ln('):');
1664for k:=1 to out_ptr-1 do print(xchr[out_buf[k]]);
1665new_line; mark_harmless;
1666flush_buffer(out_ptr-1,true); return;
1667end
1668
1669@ To output a string of bytes from |byte_mem|, we call |out_str|.
1670
1671@p procedure out_str(@!p:name_pointer); {outputs a string}
1672var @!k:0..max_bytes; {index into |byte_mem|}
1673begin for k:=byte_start[p] to byte_start[p+1]-1 do out(byte_mem[k]);
1674end;
1675
1676@ The |out_name| subroutine is used to output a symbolic token.
1677Unusual characters are translated into forms that won't screw up.
1678
1679@p procedure out_name(@!p:name_pointer); {outputs a name}
1680var @!k:0..max_bytes; {index into |byte_mem|}
1681@!t:name_pointer; {translation of character being output, if any}
1682begin for k:=byte_start[p] to byte_start[p+1]-1 do
1683  begin t:=translation[byte_mem[k]];
1684        if t=0 then out(byte_mem[k])
1685  else out_str(t);
1686  end;
1687end;
1688
1689@ We often want to output a name after calling a numeric macro
1690(e.g., `\.{\\1\{foo\}}').
1691
1692@p procedure out_mac_and_name(@!n:ASCII_code; @!p:name_pointer);
1693begin out("\"); out(n);
1694if length(p)=1 then out_name(p)
1695else  begin out("{"); out_name(p); out("}");
1696  end;
1697end;
1698
1699@ Here's a routine that simply copies from the input buffer to the output
1700buffer.
1701
1702@p procedure copy(@!first_loc:integer); {output |buffer[first_loc..loc-1]|}
1703var @!k:0..buf_size; {|buffer| location being copied}
1704begin for k:=first_loc to loc-1 do out(buffer[k]);
1705end;
1706
1707@* Translation.
1708The main work of \.{MFT} is accomplished by a routine that translates
1709the tokens, one by one, with a limited amount of lookahead/lookbehind.
1710Automata theorists might loosely call this a ``finite state transducer,''
1711because the flow of control is comparatively simple.
1712
1713@p procedure do_the_translation;
1714label restart,reswitch,done,exit;
1715var @!k:0..buf_size; {looks ahead in the buffer}
1716@!t:integer; {type that spreads to new tokens}
1717begin restart:if out_ptr>0 then flush_buffer(out_ptr,false);
1718empty_buffer:=true;
1719loop@+  begin get_next;
1720  if start_of_line then @<Do special actions at the start of a line@>;
1721  reswitch:case cur_type of
1722  numeric_token:@<Translate a numeric token or a fraction@>;
1723  string_token:@<Translate a string token@>;
1724  indentation:out_str(tr_quad);
1725  end_of_line,mft_comment:@<Wind up a line of translation and |goto restart|,
1726    or finish a \pb\ segment and |goto reswitch|@>;
1727  end_of_file:return;
1728@t\4@>  @<Cases that translate primitive tokens@>@;
1729  comment,recomment:@<Translate a comment and |goto restart|,
1730    unless there's a \pb\ segment@>;
1731  verbatim:@<Copy the rest of the current input line to the output,
1732    then |goto restart|@>;
1733  set_format:@<Change the translation format of tokens,
1734    and |goto restart| or |reswitch|@>;
1735  internal,special_tag,tag:@<Translate a tag and possible subscript@>;
1736  end;  {all cases have been listed}
1737  end;
1738exit:end;
1739
1740@ @<Do special actions at the start of a line@>=
1741if cur_type>=min_action_type then
1742  begin out("$"); start_of_line:=false;
1743  case cur_type of
1744  endit:out2("\")("!");
1745@.\\!@>
1746  binary,abinary,bbinary,ampersand,pyth_sub:out2("{")("}");
1747@.\{\}@>
1748  othercases do_nothing
1749  endcases;
1750  end
1751else if cur_type=end_of_line then
1752  begin out_str(tr_skip); goto restart;
1753  end
1754else if cur_type=mft_comment then goto restart
1755
1756@ Let's start with some of the easier translations, so that the harder
1757ones will also be easy when we get to them. A string like |"cat"|
1758comes out `\.{\\7"cat"}'.
1759
1760@<Translate a string token@>=
1761begin out2("\")("7"); copy(cur_tok);
1762@.\\7@>
1763end
1764
1765@ Similarly, the translation of `\.{sqrt}' is `\.{\\1\{sqrt\}}'.
1766
1767@<Cases that translate primitive tokens@>=
1768op: out_mac_and_name("1",cur_tok);
1769@.\\1@>
1770command: out_mac_and_name("2",cur_tok);
1771@.\\2@>
1772type_name: if prev_type=command then out_mac_and_name("1",cur_tok)
1773  else out_mac_and_name("2",cur_tok);
1774endit: out_mac_and_name("3",cur_tok);
1775@.\\3@>
1776bbinary: out_mac_and_name("4",cur_tok);
1777@.\\4@>
1778bold: out_mac_and_name("5",cur_tok);
1779@.\\5@>
1780binary: out_mac_and_name("6",cur_tok);
1781@.\\6@>
1782path_join: out_mac_and_name("8",cur_tok);
1783@.\\8@>
1784colon: out_mac_and_name("?",cur_tok);
1785@.\\?@>
1786
1787@ Here are a few more easy cases.
1788
1789@<Cases that translate primitive tokens@>=
1790as_is,sharp,abinary: out_name(cur_tok);
1791double_back: out2("\")(";");
1792@.\\;@>
1793semicolon: begin out_name(cur_tok); get_next;
1794  if cur_type<>end_of_line then if cur_type<>endit then out2("\")(" ");
1795@.\\\char32@>
1796  goto reswitch;
1797  end;
1798
1799@ Some of the primitives have a fixed output (independent of |cur_tok|):
1800
1801@<Cases that translate primitive tokens@>=
1802backslash:out_str(translation["\"]);
1803pyth_sub:out_str(tr_ps);
1804less_or_equal:out_str(tr_le);
1805greater_or_equal:out_str(tr_ge);
1806not_equal:out_str(tr_ne);
1807ampersand:out_str(tr_amp);
1808
1809@ The remaining primitive is slightly special.
1810
1811@<Cases that translate primitive tokens@>=
1812input_command: begin out_mac_and_name("2",cur_tok);
1813  out5("\")("h")("b")("o")("x");
1814  @<Scan the file name and output it in \.{typewriter type}@>;
1815  end;
1816
1817@ File names have different formats on different computers, so we don't scan
1818them with |get_next|. Here we use
1819a rule that probably covers most cases satisfactorily: We ignore leading
1820blanks, then consider the file name to consist of all subsequent characters
1821up to the first blank, semicolon, comment, or end-of-line.
1822(A |carriage_return| appears at the end of the line.)
1823
1824@<Scan the file name and output it in \.{typewriter type}@>=
1825while buffer[loc]=" " do incr(loc);
1826out5("{")("\")("t")("t")(" ");
1827while (buffer[loc]<>" ")and(buffer[loc]<>"%")and(buffer[loc]<>";")
1828  and(loc<limit) do
1829  begin out(buffer[loc]); incr(loc);
1830  end;
1831out("}")
1832
1833@ @<Translate a numeric token or a fraction@>=
1834if buffer[loc]="/" then
1835  if char_class[buffer[loc+1]]=digit_class then {it's a fraction}
1836    begin out5("\")("f")("r")("a")("c"); copy(cur_tok); get_next;
1837@.\\frac@>
1838    out2("/")("{"); get_next; copy(cur_tok); out("}");
1839    end
1840  else copy(cur_tok)
1841else copy(cur_tok)
1842
1843@ @<Translate a tag and possible subscript@>=
1844begin if length(cur_tok)=1 then out_name(cur_tok)
1845else out_mac_and_name("\",cur_tok);
1846@.\\\\@>
1847get_next;
1848if byte_mem[byte_start[prev_tok]]="'" then goto reswitch;
1849case prev_type of
1850internal:begin if (cur_type=numeric_token)or(cur_type>=min_suffix) then
1851    out2("\")(",");
1852@.\\,@>
1853  goto reswitch;
1854  end;
1855special_tag:if cur_type<min_suffix then goto reswitch
1856  else  begin out("."); cur_type:=internal; goto reswitch;
1857@..@>
1858    end;
1859tag:begin if cur_type=tag then if byte_mem[byte_start[cur_tok]]="'" then
1860    goto reswitch; {a sequence of primes goes on the main line}
1861  if (cur_type=numeric_token)or(cur_type>=min_suffix) then
1862    @<Translate a subscript@>
1863  else if cur_type=sharp then out_str(tr_sharp)
1864  else goto reswitch;
1865  end;
1866end; {there are no other cases}
1867end
1868
1869@ @<Translate a subscript@>=
1870begin out2("_")("{");
1871loop@+  begin if cur_type>=min_suffix then out_name(cur_tok)
1872  else copy(cur_tok);
1873  if prev_type=special_tag then
1874    begin get_next; goto done;
1875    end;
1876  get_next;
1877  if cur_type<min_suffix then if cur_type<>numeric_token then goto done;
1878  if cur_type=prev_type then
1879    if cur_type=numeric_token then out2("\")(",")
1880@.\\,@>
1881    else if char_class[byte_mem[byte_start[cur_tok]]]=@|
1882     char_class[byte_mem[byte_start[prev_tok]]] then
1883      if byte_mem[byte_start[prev_tok]]<>"." then out(".")
1884      else out2("\")(",");
1885  end;
1886done: out("}"); goto reswitch;
1887end
1888
1889@ The tricky thing about comments is that they might contain \pb.
1890We scan ahead for this, and replace the second `\.{\char'174}'
1891by a |carriage_return|.
1892
1893@<Translate a comment and |goto restart|...@>=
1894begin if cur_type=comment then out2("\")("9");
1895@.\\9@>
1896id_first:=loc;
1897while (loc<limit)and(buffer[loc]<>"|") do incr(loc);
1898copy(id_first);
1899if loc<limit then
1900  begin start_of_line:=true; incr(loc); k:=loc;
1901  while (k<limit)and(buffer[k]<>"|") do incr(k);
1902  buffer[k]:=carriage_return;
1903  end
1904else  begin if out_buf[out_ptr]="\" then out(" ");
1905  out4("\")("p")("a")("r"); goto restart;
1906@.\\par@>
1907  end;
1908end
1909
1910@ @<Copy the rest of the current input line to the output...@>=
1911begin id_first:=loc; loc:=limit; copy(id_first);
1912if out_ptr=0 then
1913  begin out_ptr:=1; out_buf[1]:=" ";
1914  end;
1915goto restart;
1916end
1917
1918@ @<Wind up a line of translation...@>=
1919begin out("$");
1920if (loc<limit)and(cur_type=end_of_line) then
1921  begin cur_type:=recomment; goto reswitch;
1922  end
1923else  begin out4("\")("p")("a")("r"); goto restart;
1924@.\\par@>
1925  end;
1926end
1927
1928@ @<Change the translation format...@>=
1929begin start_of_line:=false; get_next; t:=cur_type;
1930while cur_type>=min_symbolic_token do
1931  begin get_next;
1932  if cur_type>=min_symbolic_token then ilk[cur_tok]:=t;
1933  end;
1934if cur_type<>end_of_line then if cur_type<>mft_comment then
1935  begin err_print('! Only symbolic tokens should appear after %%%');
1936@.Only symbolic tokens...@>
1937  goto reswitch;
1938  end;
1939empty_buffer:=true; goto restart;
1940end
1941
1942@* The main program.
1943Let's put it all together now: \.{MFT} starts and ends here.
1944@^system dependencies@>
1945
1946@p begin initialize; {beginning of the main program}
1947print_ln(banner); {print a ``banner line''}
1948@<Store all the primitives@>;
1949@<Store all the translations@>;
1950@<Initialize the input...@>;
1951do_the_translation;
1952@<Check that all changes have been read@>;
1953end_of_MFT:{here files should be closed if the operating system requires it}
1954@<Print the job |history|@>;
1955end.
1956
1957@ Some implementations may wish to pass the |history| value to the
1958operating system so that it can be used to govern whether or not other
1959programs are started. Here we simply report the history to the user.
1960@^system dependencies@>
1961
1962@<Print the job |history|@>=
1963case history of
1964spotless: print_nl('(No errors were found.)');
1965harmless_message: print_nl('(Did you see the warning message above?)');
1966error_message: print_nl('(Pardon me, but I think I spotted something wrong.)');
1967fatal_message: print_nl('(That was a fatal error, my friend.)');
1968end {there are no other cases}
1969
1970@* System-dependent changes.
1971This module should be replaced, if necessary, by changes to the program
1972that are necessary to make \.{MFT} work at a particular installation.
1973It is usually best to design your change file so that all changes to
1974previous modules preserve the module numbering; then everybody's version
1975will be consistent with the printed program. More extensive changes,
1976which introduce new modules, can be inserted here; then only the index
1977itself will get a new module number.
1978@^system dependencies@>
1979
1980@* Index.
1981