1% Copyright (c) 1994-2008 by SIL International 2% Copyright (c) 2009-2012 by Jonathan Kew 3% Copyright (c) 2010-2012 by Han Han The Thanh 4% Copyright (c) 2012-2013 by Khaled Hosny 5 6% SIL Author(s): Jonathan Kew 7 8% Part of the XeTeX typesetting system 9 10% Permission is hereby granted, free of charge, to any person obtaining a copy 11% of this software and associated documentation files (the "Software"), to deal 12% in the Software without restriction, including without limitation the rights 13% to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 14% copies of the Software, and to permit persons to whom the Software is 15% furnished to do so, subject to the following conditions: 16 17% The above copyright notice and this permission notice shall be included in all 18% copies or substantial portions of the Software. 19 20% THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 21% IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 22% FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 23% COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER 24% IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN 25% CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 26 27% Except as contained in this notice, the name of the copyright holders shall 28% not be used in advertising or otherwise to promote the sale, use or other 29% dealings in this Software without prior written authorization from the 30% copyright holders. 31 32% e-TeX is copyright (C) 1999-2012 by P. Breitenlohner (1994,98 by the NTS 33% team); all rights are reserved. Copying of this file is authorized only if 34% (1) you are P. Breitenlohner, or if (2) you make absolutely no changes to 35% your copy. (Programs such as TIE allow the application of several change 36% files to tex.web; the master files tex.web and etex.ch should stay intact.) 37 38% See etex_gen.tex for hints on how to install this program. 39% And see etripman.tex for details about how to validate it. 40 41% e-TeX and NTS are trademarks of the NTS group. 42% TeX is a trademark of the American Mathematical Society. 43% METAFONT is a trademark of Addison-Wesley Publishing Company. 44 45% This program is directly derived from Donald E. Knuth's TeX; 46% the change history which follows and the reward offered for finders of 47% bugs refer specifically to TeX; they should not be taken as referring 48% to e-TeX, although the change history is relevant in that it 49% demonstrates the evolutionary path followed. This program is not TeX; 50% that name is reserved strictly for the program which is the creation 51% and sole responsibility of Professor Knuth. 52 53% Version 0 was released in September 1982 after it passed a variety of tests. 54% Version 1 was released in November 1983 after thorough testing. 55% Version 1.1 fixed ``disappearing font identifiers'' et alia (July 1984). 56% Version 1.2 allowed `0' in response to an error, et alia (October 1984). 57% Version 1.3 made memory allocation more flexible and local (November 1984). 58% Version 1.4 fixed accents right after line breaks, et alia (April 1985). 59% Version 1.5 fixed \the\toks after other expansion in \edefs (August 1985). 60% Version 2.0 (almost identical to 1.5) corresponds to "Volume B" (April 1986). 61% Version 2.1 corrected anomalies in discretionary breaks (January 1987). 62% Version 2.2 corrected "(Please type...)" with null \endlinechar (April 1987). 63% Version 2.3 avoided incomplete page in premature termination (August 1987). 64% Version 2.4 fixed \noaligned rules in indented displays (August 1987). 65% Version 2.5 saved cur_order when expanding tokens (September 1987). 66% Version 2.6 added 10sp slop when shipping leaders (November 1987). 67% Version 2.7 improved rounding of negative-width characters (November 1987). 68% Version 2.8 fixed weird bug if no \patterns are used (December 1987). 69% Version 2.9 made \csname\endcsname's "relax" local (December 1987). 70% Version 2.91 fixed \outer\def\a0{}\a\a bug (April 1988). 71% Version 2.92 fixed \patterns, also file names with complex macros (May 1988). 72% Version 2.93 fixed negative halving in allocator when mem_min<0 (June 1988). 73% Version 2.94 kept open_log_file from calling fatal_error (November 1988). 74% Version 2.95 solved that problem a better way (December 1988). 75% Version 2.96 corrected bug in "Infinite shrinkage" recovery (January 1989). 76% Version 2.97 corrected blunder in creating 2.95 (February 1989). 77% Version 2.98 omitted save_for_after at outer level (March 1989). 78% Version 2.99 caught $$\begingroup\halign..$$ (June 1989). 79% Version 2.991 caught .5\ifdim.6... (June 1989). 80% Version 2.992 introduced major changes for 8-bit extensions (September 1989). 81% Version 2.993 fixed a save_stack synchronization bug et alia (December 1989). 82% Version 3.0 fixed unusual displays; was more \output robust (March 1990). 83% Version 3.1 fixed nullfont, disabled \write{\the\prevgraf} (September 1990). 84% Version 3.14 fixed unprintable font names and corrected typos (March 1991). 85% Version 3.141 more of same; reconstituted ligatures better (March 1992). 86% Version 3.1415 preserved nonexplicit kerns, tidied up (February 1993). 87% Version 3.14159 allowed fontmemsize to change; bulletproofing (March 1995). 88% Version 3.141592 fixed \xleaders, glueset, weird alignments (December 2002). 89% Version 3.1415926 was a general cleanup with minor fixes (February 2008). 90% Version 3.14159265 was similar (January 2014). 91 92% A reward of $327.68 will be paid to the first finder of any remaining bug. 93 94% A preliminary version of TeX--XeT was released in April 1992. 95% TeX--XeT version 1.0 was released in June 1992, 96% version 1.1 prevented arith overflow in glue computation (Oct 1992). 97% A preliminary e-TeX version 0.95 was operational in March 1994. 98% Version 1.0beta was released in May 1995. 99% Version 1.01beta fixed bugs in just_copy and every_eof (December 1995). 100% Version 1.02beta allowed 256 mark classes (March 1996). 101% Version 1.1 changed \group{type,level} -> \currentgroup{type,level}, 102% first public release (October 1996). 103% Version 2.0 development was started in March 1997; 104% fixed a ligature-\beginR bug in January 1998; 105% was released in March 1998. 106% Version 2.1 fixed a \marks bug (when min_halfword<>0) (January 1999). 107% Version 2.2 development was started in Feb 2003; released in Oct 2004. 108% fixed a bug in sparse array handling (0=>null), Jun 2002; 109% fixed a bug in \lastnodetype (cur_val=>cur_val_level) 110% reported by Hartmut Henkel <hartmut_henkel@@gmx.de>, 111% fix by Fabrice Popineau <Fabrice.Popineau@@supelec.fr>, 112% Jan 2004; 113% another bug in sparse array handling (cur_ptr=>cur_chr) 114% reported by Taco Hoekwater <taco@@elvenkind.com>, Jul 2004; 115% fixed a sparse array reference count bug (\let,\futurelet), 116% fix by Bernd Raichle <berd@@dante.de>, Aug 2004; 117% reorganized handling of banner, additional token list and 118% integer parameters, and similar in order to reduce the 119% interference between eTeX, pdfTeX, and web2c change files. 120% adapted to tex.web 3.141592, revised glue rounding for mixed 121% direction typesetting; 122% fixed a bug in the revised glue rounding code, detected by 123% Tigran Aivazian <tigran@@aivazian.fsnet.co.uk>, Oct 2004. 124% Version 2.3 development was started in Feb 2008; released in Apr 2011. 125% fixed a bug in hyph_code handling (\savinghyphcodes) 126% reported by Vladimir Volovich <vvv@@vsu.ru>, Feb 2008. 127% fixed the error messages for improper use of \protected, 128% reported by Heiko Oberdiek 129% <heiko.oberdiek@@googlemail.com>, May 2010. 130% some rearrangements to reduce interferences between 131% e-TeX and pTeX, in part suggested by Hironori Kitagawa 132% <h_kitagawa2001@@yahoo.co.jp>, Mar 2011. 133% Version 2.4 fixed an uninitialized line number bug, released in May 2012. 134% Version 2.5 development was started in Aug 2012; released in Feb 2013. 135% better tracing of font definitions, reported by 136% Bruno Le Floch <blflatex@@gmail.com>, Jul 2012. 137% Version 2.6 development was started in Mar 2013; released in ??? 201?. 138% enable hyphenation of text between \beginL and \endL or 139% between \beginR and \endR, problem reported by 140% Vafa Khalighi <vafalgk@@gmail.com>, Nov 2013. 141% better handling of right-to-left text -- to be done. 142 143% Although considerable effort has been expended to make the e-TeX program 144% correct and reliable, no warranty is implied; the author disclaims any 145% obligation or liability for damages, including but not limited to 146% special, indirect, or consequential damages arising out of or in 147% connection with the use or performance of this software. This work has 148% been a ``labor of love'' and the author hopes that users enjoy it. 149 150% Here is TeX material that gets inserted after \input webmac 151\input xewebmac 152\def\hang{\hangindent 3em\noindent\ignorespaces} 153\def\hangg#1 {\hang\hbox{#1 }} 154\def\textindent#1{\hangindent2.5em\noindent\hbox to2.5em{\hss#1 }\ignorespaces} 155\font\ninerm=cmr9 156\let\mc=\ninerm % medium caps for names like SAIL 157 158\def\mirror#1{% 159 \phantom{#1}% 160 \rlap{\special{pdf:btrans xscale -1}#1\special{pdf:etrans}}% 161} 162\def\XeTeX{X\kern-.125em\lower.5ex\hbox{\mirror{E}}\kern-.1667em\TeX} 163 164\def\eTeX{$\varepsilon$-\TeX} 165\font\revrm=xbmc10 % for right-to-left text 166% to generate xbmc10 (i.e., reflected cmbx10) use a file 167% xbmc10.mf containing: 168%+++++++++++++++++++++++++++++++++++++++++++++++++ 169% if unknown cmbase: input cmbase fi 170% extra_endchar := extra_endchar & 171% "currentpicture:=currentpicture " & 172% "reflectedabout((.5[l,r],0),(.5[l,r],1));"; 173% input cmbx10 174%+++++++++++++++++++++++++++++++++++++++++++++++++ 175\def\TeXeT{\TeX-{\revrm\beginR\TeX\endR}} % for TeX-XeT 176\def\PASCAL{Pascal} 177\def\ph{\hbox{Pascal-H}} 178\def\pct!{{\char`\%}} % percent sign in ordinary text 179\def\grp{\.{\char'173...\char'175}} 180\font\logo=logo10 % font used for the METAFONT logo 181\def\MF{{\logo META}\-{\logo FONT}} 182\def\<#1>{$\langle#1\rangle$} 183\def\section{\mathhexbox278} 184 185\def\(#1){} % this is used to make section names sort themselves better 186\def\9#1{} % this is used for sort keys in the index via @@:sort key}{entry@@> 187 188\outer\def\N#1. \[#2]#3.{\MN#1.\vfil\eject % begin starred section 189 \def\rhead{PART #2:\uppercase{#3}} % define running headline 190 \message{*\modno} % progress report 191 \edef\next{\write\cont{\Z{\?#2]#3}{\modno}{\the\pageno}}}\next 192 \ifon\startsection{\bf\ignorespaces#3.\quad}\ignorespaces} 193\let\?=\relax % we want to be able to \write a \? 194 195\def\title{\XeTeX} 196% system dependent redefinitions of \title should come later 197% and should use: 198% \toks0=\expandafter{\title} 199% \edef\title{...\the\toks0...} 200\def\topofcontents{\hsize 5.5in 201 \vglue 0pt plus 1fil minus 1.5in 202 \def\?##1]{\hbox to 1in{\hfil##1.\ }} 203 } 204\def\botofcontents{\vskip 0pt plus 1fil minus 1.5in} 205\pageno=3 206\def\glob{13} % this should be the section number of "<Global...>" 207\def\gglob{20, 26} % this should be the next two sections of "<Global...>" 208 209@* \[1] Introduction. 210This is \XeTeX, a program derived from and extending the capabilities of 211\TeX, a document compiler intended to produce typesetting of high 212quality. 213The \PASCAL\ program that follows is the definition of \TeX82, a standard 214@:PASCAL}{\PASCAL@> 215@!@:TeX82}{\TeX82@> 216version of \TeX\ that is designed to be highly portable so that identical output 217will be obtainable on a great variety of computers. 218 219The main purpose of the following program is to explain the algorithms of \TeX\ 220as clearly as possible. As a result, the program will not necessarily be very 221efficient when a particular \PASCAL\ compiler has translated it into a 222particular machine language. However, the program has been written so that it 223can be tuned to run efficiently in a wide variety of operating environments 224by making comparatively few changes. Such flexibility is possible because 225the documentation that follows is written in the \.{WEB} language, which is 226at a higher level than \PASCAL; the preprocessing step that converts \.{WEB} 227to \PASCAL\ is able to introduce most of the necessary refinements. 228Semi-automatic translation to other languages is also feasible, because the 229program below does not make extensive use of features that are peculiar to 230\PASCAL. 231 232A large piece of software like \TeX\ has inherent complexity that cannot 233be reduced below a certain level of difficulty, although each individual 234part is fairly simple by itself. The \.{WEB} language is intended to make 235the algorithms as readable as possible, by reflecting the way the 236individual program pieces fit together and by providing the 237cross-references that connect different parts. Detailed comments about 238what is going on, and about why things were done in certain ways, have 239been liberally sprinkled throughout the program. These comments explain 240features of the implementation, but they rarely attempt to explain the 241\TeX\ language itself, since the reader is supposed to be familiar with 242{\sl The \TeX book}. 243@.WEB@> 244@:TeXbook}{\sl The \TeX book@> 245 246@ The present implementation has a long ancestry, beginning in the summer 247of~1977, when Michael~F. Plass and Frank~M. Liang designed and coded 248a prototype 249@^Plass, Michael Frederick@> 250@^Liang, Franklin Mark@> 251@^Knuth, Donald Ervin@> 252based on some specifications that the author had made in May of that year. 253This original proto\TeX\ included macro definitions and elementary 254manipulations on boxes and glue, but it did not have line-breaking, 255page-breaking, mathematical formulas, alignment routines, error recovery, 256or the present semantic nest; furthermore, 257it used character lists instead of token lists, so that a control sequence 258like \.{\\halign} was represented by a list of seven characters. A 259complete version of \TeX\ was designed and coded by the author in late 2601977 and early 1978; that program, like its prototype, was written in the 261{\mc SAIL} language, for which an excellent debugging system was 262available. Preliminary plans to convert the {\mc SAIL} code into a form 263somewhat like the present ``web'' were developed by Luis Trabb~Pardo and 264@^Trabb Pardo, Luis Isidoro@> 265the author at the beginning of 1979, and a complete implementation was 266created by Ignacio~A. Zabala in 1979 and 1980. The \TeX82 program, which 267@^Zabala Salelles, Ignacio Andr\'es@> 268was written by the author during the latter part of 1981 and the early 269part of 1982, also incorporates ideas from the 1979 implementation of 270@^Guibas, Leonidas Ioannis@> 271@^Sedgewick, Robert@> 272@^Wyatt, Douglas Kirk@> 273\TeX\ in {\mc MESA} that was written by Leonidas Guibas, Robert Sedgewick, 274and Douglas Wyatt at the Xerox Palo Alto Research Center. Several hundred 275refinements were introduced into \TeX82 based on the experiences gained with 276the original implementations, so that essentially every part of the system 277has been substantially improved. After the appearance of ``Version 0'' in 278September 1982, this program benefited greatly from the comments of 279many other people, notably David~R. Fuchs and Howard~W. Trickey. 280A final revision in September 1989 extended the input character set to 281eight-bit codes and introduced the ability to hyphenate words from 282different languages, based on some ideas of Michael~J. Ferguson. 283@^Fuchs, David Raymond@> 284@^Trickey, Howard Wellington@> 285@^Ferguson, Michael John@> 286 287No doubt there still is plenty of room for improvement, but the author 288is firmly committed to keeping \TeX82 ``frozen'' from now on; stability 289and reliability are to be its main virtues. 290 291On the other hand, the \.{WEB} description can be extended without changing 292the core of \TeX82 itself, and the program has been designed so that such 293extensions are not extremely difficult to make. 294The |banner| string defined here should be changed whenever \TeX\ 295undergoes any modifications, so that it will be clear which version of 296\TeX\ might be the guilty party when a problem arises. 297@^extensions to \TeX@> 298@^system dependencies@> 299 300This program contains code for various features extending \TeX, 301therefore this program is called `\XeTeX' and not 302`\TeX'; the official name `\TeX' by itself is reserved 303for software systems that are fully compatible with each other. 304A special test suite called the ``\.{TRIP} test'' is available for 305helping to determine whether a particular implementation deserves to be 306known as `\TeX' [cf.~Stanford Computer Science report CS1027, 307November 1984]. 308 309A similar test suite called the ``\.{e-TRIP} test'' is available for 310helping to determine whether a particular implementation deserves to be 311known as `\eTeX'. 312 313@d eTeX_version=2 { \.{\\eTeXversion} } 314@d eTeX_revision==".6" { \.{\\eTeXrevision} } 315@d eTeX_version_string=='-2.6' {current \eTeX\ version} 316@# 317@d XeTeX_version=0 { \.{\\XeTeXversion} } 318@d XeTeX_revision==".99992" { \.{\\XeTeXrevision} } 319@d XeTeX_version_string=='-0.99992' {current \XeTeX\ version} 320@# 321@d XeTeX_banner=='This is XeTeX, Version 3.14159265',eTeX_version_string,XeTeX_version_string 322 {printed when \XeTeX\ starts} 323@# 324@d banner=='This is TeX, Version 3.14159265' {printed when \TeX\ starts} 325@# 326@d TEX==XETEX {change program name into |XETEX|} 327@# 328@d TeXXeT_code=0 {the \TeXXeT\ feature is optional} 329@# 330@d XeTeX_dash_break_code=1 {non-zero to enable breaks after en- and em-dashes} 331@# 332@d XeTeX_upwards_code=2 {non-zero if the main vertical list is being built upwards} 333@d XeTeX_use_glyph_metrics_code=3 {non-zero to use exact glyph height/depth} 334@d XeTeX_inter_char_tokens_code=4 {non-zero to enable \.{\\XeTeXinterchartokens} insertion} 335@# 336@d XeTeX_input_normalization_code=5 {normalization mode:, 1 for NFC, 2 for NFD, else none} 337@# 338@d XeTeX_default_input_mode_code=6 {input mode for newly opened files} 339@d XeTeX_input_mode_auto=0 340@d XeTeX_input_mode_utf8=1 341@d XeTeX_input_mode_utf16be=2 342@d XeTeX_input_mode_utf16le=3 343@d XeTeX_input_mode_raw=4 344@d XeTeX_input_mode_icu_mapping=5 345@# 346@d XeTeX_default_input_encoding_code=7 {|str_number| of encoding name if |mode =| ICU} 347@# 348@d XeTeX_tracing_fonts_code=8 {non-zero to log native fonts used} 349@d eTeX_states=9 {number of \eTeX\ state variables in |eqtb|} 350 351@ Different \PASCAL s have slightly different conventions, and the present 352@!@:PASCAL H}{\ph@> 353program expresses \TeX\ in terms of the \PASCAL\ that was 354available to the author in 1982. Constructions that apply to 355this particular compiler, which we shall call \ph, should help the 356reader see how to make an appropriate interface for other systems 357if necessary. (\ph\ is Charles Hedrick's modification of a compiler 358@^Hedrick, Charles Locke@> 359for the DECsystem-10 that was originally developed at the University of 360Hamburg; cf.\ {\sl SOFTWARE---Practice \AM\ Experience \bf6} (1976), 36129--42. The \TeX\ program below is intended to be adaptable, without 362extensive changes, to most other versions of \PASCAL, so it does not fully 363use the admirable features of \ph. Indeed, a conscious effort has been 364made here to avoid using several idiosyncratic features of standard 365\PASCAL\ itself, so that most of the code can be translated mechanically 366into other high-level languages. For example, the `\&{with}' and `\\{new}' 367features are not used, nor are pointer types, set types, or enumerated 368scalar types; there are no `\&{var}' parameters, except in the case of files 369--- \eTeX, however, does use `\&{var}' parameters for the |reverse| function; 370there are no tag fields on variant records; there are no assignments 371|real:=integer|; no procedures are declared local to other procedures.) 372 373The portions of this program that involve system-dependent code, where 374changes might be necessary because of differences between \PASCAL\ compilers 375and/or differences between 376operating systems, can be identified by looking at the sections whose 377numbers are listed under `system dependencies' in the index. Furthermore, 378the index entries for `dirty \PASCAL' list all places where the restrictions 379of \PASCAL\ have not been followed perfectly, for one reason or another. 380@!@^system dependencies@> 381@!@^dirty \PASCAL@> 382 383Incidentally, \PASCAL's standard |round| function can be problematical, 384because it disagrees with the IEEE floating-point standard. 385Many implementors have 386therefore chosen to substitute their own home-grown rounding procedure. 387 388@ The program begins with a normal \PASCAL\ program heading, whose 389components will be filled in later, using the conventions of \.{WEB}. 390@.WEB@> 391For example, the portion of the program called `\X\glob:Global 392variables\X' below will be replaced by a sequence of variable declarations 393that starts in $\section\glob$ of this documentation. In this way, we are able 394to define each individual global variable when we are prepared to 395understand what it means; we do not have to define all of the globals at 396once. Cross references in $\section\glob$, where it says ``See also 397sections \gglob, \dots,'' also make it possible to look at the set of 398all global variables, if desired. Similar remarks apply to the other 399portions of the program heading. 400 401Actually the heading shown here is not quite normal: The |program| line 402does not mention any |output| file, because \ph\ would ask the \TeX\ user 403to specify a file name if |output| were specified here. 404@:PASCAL H}{\ph@> 405@^system dependencies@> 406 407@d mtype==t@&y@&p@&e {this is a \.{WEB} coding trick:} 408@f mtype==type {`\&{mtype}' will be equivalent to `\&{type}'} 409@f type==true {but `|type|' will not be treated as a reserved word} 410 411@p @t\4@>@<Compiler directives@>@/ 412program TEX; {all file names are defined dynamically} 413label @<Labels in the outer block@>@/ 414const @<Constants in the outer block@>@/ 415mtype @<Types in the outer block@>@/ 416var @<Global variables@>@/ 417@# 418procedure initialize; {this procedure gets things started properly} 419 var @<Local variables for initialization@>@/ 420 begin @<Initialize whatever \TeX\ might access@>@; 421 end;@# 422@t\4@>@<Basic printing procedures@>@/ 423@t\4@>@<Error handling procedures@>@/ 424 425@ The overall \TeX\ program begins with the heading just shown, after which 426comes a bunch of procedure declarations and function declarations. 427Finally we will get to the main program, which begins with the 428comment `|start_here|'. If you want to skip down to the 429main program now, you can look up `|start_here|' in the index. 430But the author suggests that the best way to understand this program 431is to follow pretty much the order of \TeX's components as they appear in the 432\.{WEB} description you are now reading, since the present ordering is 433intended to combine the advantages of the ``bottom up'' and ``top down'' 434approaches to the problem of understanding a somewhat complicated system. 435 436@ Three labels must be declared in the main program, so we give them 437symbolic names. 438 439@d start_of_TEX=1 {go here when \TeX's variables are initialized} 440@d end_of_TEX=9998 {go here to close files and terminate gracefully} 441@d final_end=9999 {this label marks the ending of the program} 442 443@<Labels in the out...@>= 444start_of_TEX@t\hskip-2pt@>, end_of_TEX@t\hskip-2pt@>,@,final_end; 445 {key control points} 446 447@ Some of the code below is intended to be used only when diagnosing the 448strange behavior that sometimes occurs when \TeX\ is being installed or 449when system wizards are fooling around with \TeX\ without quite knowing 450what they are doing. Such code will not normally be compiled; it is 451delimited by the codewords `$|debug|\ldots|gubed|$', with apologies 452to people who wish to preserve the purity of English. 453 454Similarly, there is some conditional code delimited by 455`$|stat|\ldots|tats|$' that is intended for use when statistics are to be 456kept about \TeX's memory usage. The |stat| $\ldots$ |tats| code also 457implements diagnostic information for \.{\\tracingparagraphs} and 458\.{\\tracingpages}. 459@^debugging@> 460 461@d debug==@{ {change this to `$\\{debug}\equiv\null$' when debugging} 462@d gubed==@t@>@} {change this to `$\\{gubed}\equiv\null$' when debugging} 463@f debug==begin 464@f gubed==end 465@# 466@d stat==@{ {change this to `$\\{stat}\equiv\null$' when gathering 467 usage statistics} 468@d tats==@t@>@} {change this to `$\\{tats}\equiv\null$' when gathering 469 usage statistics} 470@f stat==begin 471@f tats==end 472 473@ This program has two important variations: (1) There is a long and slow 474version called \.{INITEX}, which does the extra calculations needed to 475@.INITEX@> 476initialize \TeX's internal tables; and (2)~there is a shorter and faster 477production version, which cuts the initialization to a bare minimum. 478Parts of the program that are needed in (1) but not in (2) are delimited by 479the codewords `$|init|\ldots|tini|$'. 480 481@d init== {change this to `$\\{init}\equiv\.{@@\{}$' in the production version} 482@d tini== {change this to `$\\{tini}\equiv\.{@@\}}$' in the production version} 483@f init==begin 484@f tini==end 485 486@<Initialize whatever...@>= 487@<Set initial values of key variables@>@/ 488@!init @<Initialize table entries (done by \.{INITEX} only)@>@;@+tini 489 490@ If the first character of a \PASCAL\ comment is a dollar sign, 491\ph\ treats the comment as a list of ``compiler directives'' that will 492affect the translation of this program into machine language. The 493directives shown below specify full checking and inclusion of the \PASCAL\ 494debugger when \TeX\ is being debugged, but they cause range checking and other 495redundant code to be eliminated when the production system is being generated. 496Arithmetic overflow will be detected in all cases. 497@:PASCAL H}{\ph@> 498@^system dependencies@> 499@^overflow in arithmetic@> 500 501@<Compiler directives@>= 502@{@&$C-,A+,D-@} {no range check, catch arithmetic overflow, no debug overhead} 503@!debug @{@&$C+,D+@}@+ gubed {but turn everything on when debugging} 504 505@ This \TeX\ implementation conforms to the rules of the {\sl Pascal User 506@:PASCAL}{\PASCAL@> 507@^system dependencies@> 508Manual} published by Jensen and Wirth in 1975, except where system-dependent 509@^Wirth, Niklaus@> 510@^Jensen, Kathleen@> 511code is necessary to make a useful system program, and except in another 512respect where such conformity would unnecessarily obscure the meaning 513and clutter up the code: We assume that |case| statements may include a 514default case that applies if no matching label is found. Thus, we shall use 515constructions like 516$$\vbox{\halign{\ignorespaces#\hfil\cr 517|case x of|\cr 5181: $\langle\,$code for $x=1\,\rangle$;\cr 5193: $\langle\,$code for $x=3\,\rangle$;\cr 520|othercases| $\langle\,$code for |x<>1| and |x<>3|$\,\rangle$\cr 521|endcases|\cr}}$$ 522since most \PASCAL\ compilers have plugged this hole in the language by 523incorporating some sort of default mechanism. For example, the \ph\ 524compiler allows `|others|:' as a default label, and other \PASCAL s allow 525syntaxes like `\&{else}' or `\&{otherwise}' or `\\{otherwise}:', etc. The 526definitions of |othercases| and |endcases| should be changed to agree with 527local conventions. Note that no semicolon appears before |endcases| in 528this program, so the definition of |endcases| should include a semicolon 529if the compiler wants one. (Of course, if no default mechanism is 530available, the |case| statements of \TeX\ will have to be laboriously 531extended by listing all remaining cases. People who are stuck with such 532\PASCAL s have, in fact, done this, successfully but not happily!) 533@:PASCAL H}{\ph@> 534 535@d othercases == others: {default for cases not listed explicitly} 536@d endcases == @+end {follows the default case in an extended |case| statement} 537@f othercases == else 538@f endcases == end 539 540@ The following parameters can be changed at compile time to extend or 541reduce \TeX's capacity. They may have different values in \.{INITEX} and 542in production versions of \TeX. 543@.INITEX@> 544@^system dependencies@> 545 546@<Constants...@>= 547@!mem_max=30000; {greatest index in \TeX's internal |mem| array; 548 must be strictly less than |max_halfword|; 549 must be equal to |mem_top| in \.{INITEX}, otherwise |>=mem_top|} 550@!mem_min=0; {smallest index in \TeX's internal |mem| array; 551 must be |min_halfword| or more; 552 must be equal to |mem_bot| in \.{INITEX}, otherwise |<=mem_bot|} 553@!buf_size=500; {maximum number of characters simultaneously present in 554 current lines of open files and in control sequences between 555 \.{\\csname} and \.{\\endcsname}; must not exceed |max_halfword|} 556@!error_line=72; {width of context lines on terminal error messages} 557@!half_error_line=42; {width of first lines of contexts in terminal 558 error messages; should be between 30 and |error_line-15|} 559@!max_print_line=79; {width of longest text lines output; should be at least 60} 560@!stack_size=200; {maximum number of simultaneous input sources} 561@!max_in_open=6; {maximum number of input files and error insertions that 562 can be going on simultaneously} 563@!font_max=75; {maximum internal font number; must not exceed |max_quarterword| 564 and must be at most |font_base+256|} 565@!font_mem_size=20000; {number of words of |font_info| for all fonts} 566@!param_size=60; {maximum number of simultaneous macro parameters} 567@!nest_size=40; {maximum number of semantic levels simultaneously active} 568@!max_strings=3000; {maximum number of strings; must not exceed |max_halfword|} 569@!string_vacancies=8000; {the minimum number of characters that should be 570 available for the user's control sequences and font names, 571 after \TeX's own error messages are stored} 572@!pool_size=32000; {maximum number of characters in strings, including all 573 error messages and help texts, and the names of all fonts and 574 control sequences; must exceed |string_vacancies| by the total 575 length of \TeX's own strings, which is currently about 23000} 576@!save_size=600; {space for saving values outside of current group; must be 577 at most |max_halfword|} 578@!trie_size=8000; {space for hyphenation patterns; should be larger for 579 \.{INITEX} than it is in production versions of \TeX} 580@!trie_op_size=500; {space for ``opcodes'' in the hyphenation patterns} 581@!dvi_buf_size=800; {size of the output buffer; must be a multiple of 8} 582@!file_name_size=40; {file names shouldn't be longer than this} 583@!pool_name='TeXformats:TEX.POOL '; 584 {string of length |file_name_size|; tells where the string pool appears} 585@.TeXformats@> 586 587@ Like the preceding parameters, the following quantities can be changed 588at compile time to extend or reduce \TeX's capacity. But if they are changed, 589it is necessary to rerun the initialization program \.{INITEX} 590@.INITEX@> 591to generate new tables for the production \TeX\ program. 592One can't simply make helter-skelter changes to the following constants, 593since certain rather complex initialization 594numbers are computed from them. They are defined here using 595\.{WEB} macros, instead of being put into \PASCAL's |const| list, in order to 596emphasize this distinction. 597 598@d mem_bot=0 {smallest index in the |mem| array dumped by \.{INITEX}; 599 must not be less than |mem_min|} 600@d mem_top==30000 {largest index in the |mem| array dumped by \.{INITEX}; 601 must be substantially larger than |mem_bot| 602 and not greater than |mem_max|} 603@d font_base=0 {smallest internal font number; must not be less 604 than |min_quarterword|} 605@d hash_size=2100 {maximum number of control sequences; it should be at most 606 about |(mem_max-mem_min)/10|} 607@d hash_prime=1777 {a prime number equal to about 85\pct! of |hash_size|} 608@d hyph_size=307 {another prime; the number of \.{\\hyphenation} exceptions} 609@d biggest_char=65535 {the largest allowed character number; 610 must be |<=max_quarterword|, this refers to UTF16 codepoints that we store in strings, etc; 611 actual character codes can exceed this range, up to |biggest_usv|} 612@d too_big_char=65536 {|biggest_char+1|} 613@d special_char=65537 {|biggest_char+2|} 614@d number_chars=65536 {|biggest_char+1|} 615@d biggest_usv=@"10FFFF {the largest Unicode Scalar Value} 616@d too_big_usv=@"110000 {|biggest_usv+1|} 617@d number_usvs=@"110000 {|biggest_usv+1|} 618@d biggest_reg=255 {the largest allowed register number; 619 must be |<=max_quarterword|} 620@d number_regs=256 {|biggest_reg+1|} 621@d font_biggest=255 {the real biggest font} 622@d number_fonts=font_biggest-font_base+2 623@d number_math_families=256 624@d number_math_fonts=number_math_families+number_math_families+number_math_families 625@d math_font_biggest=number_math_fonts-1 626@d text_size=0 {size code for the largest size in a family} 627@d script_size=number_math_families {size code for the medium size in a family} 628@d script_script_size=number_math_families+number_math_families {size code for the smallest size in a family} 629@d biggest_lang=255 {the largest hyphenation language} 630@d too_big_lang=256 {|biggest_lang+1|} 631@^system dependencies@> 632 633@ In case somebody has inadvertently made bad settings of the ``constants,'' 634\TeX\ checks them using a global variable called |bad|. 635 636This is the first of many sections of \TeX\ where global variables are 637defined. 638 639@<Glob...@>= 640@!bad:integer; {is some ``constant'' wrong?} 641 642@ Later on we will say `\ignorespaces|if mem_max>=max_halfword then bad:=14|', 643or something similar. (We can't do that until |max_halfword| has been defined.) 644 645@<Check the ``constant'' values for consistency@>= 646bad:=0; 647if (half_error_line<30)or(half_error_line>error_line-15) then bad:=1; 648if max_print_line<60 then bad:=2; 649if dvi_buf_size mod 8<>0 then bad:=3; 650if mem_bot+1100>mem_top then bad:=4; 651if hash_prime>hash_size then bad:=5; 652if max_in_open>=128 then bad:=6; 653if mem_top<256+11 then bad:=7; {we will want |null_list>255|} 654 655@ Labels are given symbolic names by the following definitions, so that 656occasional |goto| statements will be meaningful. We insert the label 657`|exit|' just before the `\ignorespaces|end|\unskip' of a procedure in 658which we have used the `|return|' statement defined below; the label 659`|restart|' is occasionally used at the very beginning of a procedure; and 660the label `|reswitch|' is occasionally used just prior to a |case| 661statement in which some cases change the conditions and we wish to branch 662to the newly applicable case. Loops that are set up with the |loop| 663construction defined below are commonly exited by going to `|done|' or to 664`|found|' or to `|not_found|', and they are sometimes repeated by going to 665`|continue|'. If two or more parts of a subroutine start differently but 666end up the same, the shared code may be gathered together at 667`|common_ending|'. 668 669Incidentally, this program never declares a label that isn't actually used, 670because some fussy \PASCAL\ compilers will complain about redundant labels. 671 672@d exit=10 {go here to leave a procedure} 673@d restart=20 {go here to start a procedure again} 674@d reswitch=21 {go here to start a case statement again} 675@d continue=22 {go here to resume a loop} 676@d done=30 {go here to exit a loop} 677@d done1=31 {like |done|, when there is more than one loop} 678@d done2=32 {for exiting the second loop in a long block} 679@d done3=33 {for exiting the third loop in a very long block} 680@d done4=34 {for exiting the fourth loop in an extremely long block} 681@d done5=35 {for exiting the fifth loop in an immense block} 682@d done6=36 {for exiting the sixth loop in a block} 683@d found=40 {go here when you've found it} 684@d found1=41 {like |found|, when there's more than one per routine} 685@d found2=42 {like |found|, when there's more than two per routine} 686@d not_found=45 {go here when you've found nothing} 687@d not_found1=46 {like |not_found|, when there's more than one} 688@d not_found2=47 {like |not_found|, when there's more than two} 689@d not_found3=48 {like |not_found|, when there's more than three} 690@d not_found4=49 {like |not_found|, when there's more than four} 691@d common_ending=50 {go here when you want to merge with another branch} 692 693@ Here are some macros for common programming idioms. 694 695@d incr(#) == #:=#+1 {increase a variable by unity} 696@d decr(#) == #:=#-1 {decrease a variable by unity} 697@d negate(#) == #:=-# {change the sign of a variable} 698@d loop == @+ while true do@+ {repeat over and over until a |goto| happens} 699@f loop == xclause 700 {\.{WEB}'s |xclause| acts like `\ignorespaces|while true do|\unskip'} 701@d do_nothing == {empty statement} 702@d return == goto exit {terminate a procedure call} 703@f return == nil 704@d empty=0 {symbolic name for a null constant} 705 706@* \[2] The character set. 707In order to make \TeX\ readily portable to a wide variety of 708computers, all of its input text is converted to an internal eight-bit 709code that includes standard ASCII, the ``American Standard Code for 710Information Interchange.'' This conversion is done immediately when each 711character is read in. Conversely, characters are converted from ASCII to 712the user's external representation just before they are output to a 713text file. 714 715Such an internal code is relevant to users of \TeX\ primarily because it 716governs the positions of characters in the fonts. For example, the 717character `\.A' has ASCII code $65=@'101$, and when \TeX\ typesets 718this letter it specifies character number 65 in the current font. 719If that font actually has `\.A' in a different position, \TeX\ doesn't 720know what the real position is; the program that does the actual printing from 721\TeX's device-independent files is responsible for converting from ASCII to 722a particular font encoding. 723@^ASCII code@> 724 725\TeX's internal code also defines the value of constants 726that begin with a reverse apostrophe; and it provides an index to the 727\.{\\catcode}, \.{\\mathcode}, \.{\\uccode}, \.{\\lccode}, and \.{\\delcode} 728tables. 729 730@ Characters of text that have been converted to \TeX's internal form 731are said to be of type |ASCII_code|, which is a subrange of the integers. 732For xetex, we rename |ASCII_code| as |UTF16_code|. But we also have a 733new type |UTF8_code|, used when we construct filenames to pass to the 734system libraries. 735 736@d ASCII_code==UTF16_code 737@d packed_ASCII_code==packed_UTF16_code 738 739@<Types...@>= 740@!ASCII_code=0..biggest_char; {16-bit numbers} 741@!UTF8_code=0..255; {8-bit numbers} 742@!UnicodeScalar=0..biggest_usv; {Unicode scalars} 743 744@ The original \PASCAL\ compiler was designed in the late 60s, when six-bit 745character sets were common, so it did not make provision for lowercase 746letters. Nowadays, of course, we need to deal with both capital and small 747letters in a convenient way, especially in a program for typesetting; 748so the present specification of \TeX\ has been written under the assumption 749that the \PASCAL\ compiler and run-time system permit the use of text files 750with more than 64 distinguishable characters. More precisely, we assume that 751the character set contains at least the letters and symbols associated 752with ASCII codes @'40 through @'176; all of these characters are now 753available on most computer terminals. 754 755Since we are dealing with more characters than were present in the first 756\PASCAL\ compilers, we have to decide what to call the associated data 757type. Some \PASCAL s use the original name |char| for the 758characters in text files, even though there now are more than 64 such 759characters, while other \PASCAL s consider |char| to be a 64-element 760subrange of a larger data type that has some other name. 761 762In order to accommodate this difference, we shall use the name |text_char| 763to stand for the data type of the characters that are converted to and 764from |ASCII_code| when they are input and output. We shall also assume 765that |text_char| consists of the elements |chr(first_text_char)| through 766|chr(last_text_char)|, inclusive. The following definitions should be 767adjusted if necessary. 768@^system dependencies@> 769 770@d text_char == char {the data type of characters in text files} 771@d first_text_char=0 {ordinal number of the smallest element of |text_char|} 772@d last_text_char=biggest_char {ordinal number of the largest element of |text_char|} 773 774@<Local variables for init...@>= 775@!i:integer; 776 777@ The \TeX\ processor converts between ASCII code and 778the user's external character set by means of arrays |xord| and |xchr| 779that are analogous to \PASCAL's |ord| and |chr| functions. 780 781@<Glob...@>= 782@!xchr: array [ASCII_code] of text_char; 783 {specifies conversion of output characters} 784 785@ Since we are assuming that our \PASCAL\ system is able to read and 786write the visible characters of standard ASCII (although not 787necessarily using the ASCII codes to represent them), the following 788assignment statements initialize the standard part of the |xchr| array 789properly, without needing any system-dependent changes. On the other 790hand, it is possible to implement \TeX\ with less complete character 791sets, and in such cases it will be necessary to change something here. 792@^system dependencies@> 793 794@ Some of the ASCII codes without visible characters have been given symbolic 795names in this program because they are used with a special meaning. 796 797@d null_code=@'0 {ASCII code that might disappear} 798@d carriage_return=@'15 {ASCII code used at end of line} 799@d invalid_code=@'177 {ASCII code that many systems prohibit in text files} 800 801@ The ASCII code is ``standard'' only to a certain extent, since many 802computer installations have found it advantageous to have ready access 803to more than 94 printing characters. Appendix~C of {\sl The \TeX book\/} 804gives a complete specification of the intended correspondence between 805characters and \TeX's internal representation. 806@:TeXbook}{\sl The \TeX book@> 807 808If \TeX\ is being used 809on a garden-variety \PASCAL\ for which only standard ASCII 810codes will appear in the input and output files, it doesn't really matter 811what codes are specified in |xchr[0..@'37]|, but the safest policy is to 812blank everything out by using the code shown below. 813 814However, other settings of |xchr| will make \TeX\ more friendly on 815computers that have an extended character set, so that users can type things 816like `\.^^Z' instead of `\.{\\ne}'. People with extended character sets can 817assign codes arbitrarily, giving an |xchr| equivalent to whatever 818characters the users of \TeX\ are allowed to have in their input files. 819It is best to make the codes correspond to the intended interpretations as 820shown in Appendix~C whenever possible; but this is not necessary. For 821example, in countries with an alphabet of more than 26 letters, it is 822usually best to map the additional letters into codes less than~@'40. 823To get the most ``permissive'' character set, change |' '| on the 824right of these assignment statements to |chr(i)|. 825@^character set dependencies@> 826@^system dependencies@> 827 828@<Set init...@>= 829for i:=0 to @'37 do xchr[i]:=' '; 830for i:=@'177 to @'377 do xchr[i]:=' '; 831 832@ The following system-independent code makes the |xord| array contain a 833suitable inverse to the information in |xchr|. Note that if |xchr[i]=xchr[j]| 834where |i<j<@'177|, the value of |xord[xchr[i]]| will turn out to be 835|j| or more; hence, standard ASCII code numbers will be used instead of 836codes below @'40 in case there is a coincidence. 837 838@<Set init...@>= 839for i:=0 to @'176 do xord[xchr[i]]:=i; 840 841@* \[3] Input and output. 842The bane of portability is the fact that different operating systems treat 843input and output quite differently, perhaps because computer scientists 844have not given sufficient attention to this problem. People have felt somehow 845that input and output are not part of ``real'' programming. Well, it is true 846that some kinds of programming are more fun than others. With existing 847input/output conventions being so diverse and so messy, the only sources of 848joy in such parts of the code are the rare occasions when one can find a 849way to make the program a little less bad than it might have been. We have 850two choices, either to attack I/O now and get it over with, or to postpone 851I/O until near the end. Neither prospect is very attractive, so let's 852get it over with. 853 854The basic operations we need to do are (1)~inputting and outputting of 855text, to or from a file or the user's terminal; (2)~inputting and 856outputting of eight-bit bytes, to or from a file; (3)~instructing the 857operating system to initiate (``open'') or to terminate (``close'') input or 858output from a specified file; (4)~testing whether the end of an input 859file has been reached. 860 861\TeX\ needs to deal with two kinds of files. 862We shall use the term |alpha_file| for a file that contains textual data, 863and the term |byte_file| for a file that contains eight-bit binary information. 864These two types turn out to be the same on many computers, but 865sometimes there is a significant distinction, so we shall be careful to 866distinguish between them. Standard protocols for transferring 867such files from computer to computer, via high-speed networks, are 868now becoming available to more and more communities of users. 869 870The program actually makes use also of a third kind of file, called a 871|word_file|, when dumping and reloading base information for its own 872initialization. We shall define a word file later; but it will be possible 873for us to specify simple operations on word files before they are defined. 874 875@<Types...@>= 876@!eight_bits=0..255; {unsigned one-byte quantity} 877@!alpha_file=packed file of text_char; {files that contain textual data} 878@!byte_file=packed file of eight_bits; {files that contain binary data} 879 880@ Most of what we need to do with respect to input and output can be handled 881by the I/O facilities that are standard in \PASCAL, i.e., the routines 882called |get|, |put|, |eof|, and so on. But 883standard \PASCAL\ does not allow file variables to be associated with file 884names that are determined at run time, so it cannot be used to implement 885\TeX; some sort of extension to \PASCAL's ordinary |reset| and |rewrite| 886is crucial for our purposes. We shall assume that |name_of_file| is a variable 887of an appropriate type such that the \PASCAL\ run-time system being used to 888implement \TeX\ can open a file whose external name is specified by 889|name_of_file|. 890@^system dependencies@> 891 892@<Glob...@>= 893@!name_of_file:packed array[1..file_name_size] of char;@;@/ 894 {on some systems this may be a \&{record} variable} 895@!name_of_file16:array[1..file_name_size] of UTF16_code;@;@/ 896 {but sometimes we need a UTF16 version of the name} 897@!name_length:0..file_name_size;@/{this many characters are actually 898 relevant in |name_of_file| (the rest are blank)} 899@!name_length16:0..file_name_size; 900 901@ The \ph\ compiler with which the present version of \TeX\ was prepared has 902extended the rules of \PASCAL\ in a very convenient way. To open file~|f|, 903we can write 904$$\vbox{\halign{#\hfil\qquad&#\hfil\cr 905|reset(f,@t\\{name}@>,'/O')|&for input;\cr 906|rewrite(f,@t\\{name}@>,'/O')|&for output.\cr}}$$ 907The `\\{name}' parameter, which is of type `{\bf packed array 908$[\langle\\{any}\rangle]$ of \\{char}}', stands for the name of 909the external file that is being opened for input or output. 910Blank spaces that might appear in \\{name} are ignored. 911 912The `\.{/O}' parameter tells the operating system not to issue its own 913error messages if something goes wrong. If a file of the specified name 914cannot be found, or if such a file cannot be opened for some other reason 915(e.g., someone may already be trying to write the same file), we will have 916|@!erstat(f)<>0| after an unsuccessful |reset| or |rewrite|. This allows 917\TeX\ to undertake appropriate corrective action. 918@:PASCAL H}{\ph@> 919@^system dependencies@> 920 921\TeX's file-opening procedures return |false| if no file identified by 922|name_of_file| could be opened. 923 924@d reset_OK(#)==erstat(#)=0 925@d rewrite_OK(#)==erstat(#)=0 926 927@p function a_open_in(var f:alpha_file):boolean; 928 {open a text file for input} 929begin reset(f,name_of_file,'/O'); a_open_in:=reset_OK(f); 930end; 931@# 932function a_open_out(var f:alpha_file):boolean; 933 {open a text file for output} 934begin rewrite(f,name_of_file,'/O'); a_open_out:=rewrite_OK(f); 935end; 936@# 937function b_open_in(var f:byte_file):boolean; 938 {open a binary file for input} 939begin reset(f,name_of_file,'/O'); b_open_in:=reset_OK(f); 940end; 941@# 942function b_open_out(var f:byte_file):boolean; 943 {open a binary file for output} 944begin rewrite(f,name_of_file,'/O'); b_open_out:=rewrite_OK(f); 945end; 946@# 947function w_open_in(var f:word_file):boolean; 948 {open a word file for input} 949begin reset(f,name_of_file,'/O'); w_open_in:=reset_OK(f); 950end; 951@# 952function w_open_out(var f:word_file):boolean; 953 {open a word file for output} 954begin rewrite(f,name_of_file,'/O'); w_open_out:=rewrite_OK(f); 955end; 956 957@ Files can be closed with the \ph\ routine `|close(f)|', which 958@:PASCAL H}{\ph@> 959@^system dependencies@> 960should be used when all input or output with respect to |f| has been completed. 961This makes |f| available to be opened again, if desired; and if |f| was used for 962output, the |close| operation makes the corresponding external file appear 963on the user's area, ready to be read. 964 965These procedures should not generate error messages if a file is 966being closed before it has been successfully opened. 967 968@p procedure a_close(var f:alpha_file); {close a text file} 969begin close(f); 970end; 971@# 972procedure b_close(var f:byte_file); {close a binary file} 973begin close(f); 974end; 975@# 976procedure w_close(var f:word_file); {close a word file} 977begin close(f); 978end; 979 980@ Binary input and output are done with \PASCAL's ordinary |get| and |put| 981procedures, so we don't have to make any other special arrangements for 982binary~I/O. Text output is also easy to do with standard \PASCAL\ routines. 983The treatment of text input is more difficult, however, because 984of the necessary translation to |ASCII_code| values. 985\TeX's conventions should be efficient, and they should 986blend nicely with the user's operating environment. 987 988@ Input from text files is read one line at a time, using a routine called 989|input_ln|. This function is defined in terms of global variables called 990|buffer|, |first|, and |last| that will be described in detail later; for 991now, it suffices for us to know that |buffer| is an array of |ASCII_code| 992values, and that |first| and |last| are indices into this array 993representing the beginning and ending of a line of text. 994 995@<Glob...@>= 996@!buffer:array[0..buf_size] of ASCII_code; {lines of characters being read} 997@!first:0..buf_size; {the first unused position in |buffer|} 998@!last:0..buf_size; {end of the line just input to |buffer|} 999@!max_buf_stack:0..buf_size; {largest index used in |buffer|} 1000 1001@ The |input_ln| function brings the next line of input from the specified 1002file into available positions of the buffer array and returns the value 1003|true|, unless the file has already been entirely read, in which case it 1004returns |false| and sets |last:=first|. In general, the |ASCII_code| 1005numbers that represent the next line of the file are input into 1006|buffer[first]|, |buffer[first+1]|, \dots, |buffer[last-1]|; and the 1007global variable |last| is set equal to |first| plus the length of the 1008line. Trailing blanks are removed from the line; thus, either |last=first| 1009(in which case the line was entirely blank) or |buffer[last-1]<>" "|. 1010 1011An overflow error is given, however, if the normal actions of |input_ln| 1012would make |last>=buf_size|; this is done so that other parts of \TeX\ 1013can safely look at the contents of |buffer[last+1]| without overstepping 1014the bounds of the |buffer| array. Upon entry to |input_ln|, the condition 1015|first<buf_size| will always hold, so that there is always room for an 1016``empty'' line. 1017 1018The variable |max_buf_stack|, which is used to keep track of how large 1019the |buf_size| parameter must be to accommodate the present job, is 1020also kept up to date by |input_ln|. 1021 1022If the |bypass_eoln| parameter is |true|, |input_ln| will do a |get| 1023before looking at the first character of the line; this skips over 1024an |eoln| that was in |f^|. The procedure does not do a |get| when it 1025reaches the end of the line; therefore it can be used to acquire input 1026from the user's terminal as well as from ordinary text files. 1027 1028Standard \PASCAL\ says that a file should have |eoln| immediately 1029before |eof|, but \TeX\ needs only a weaker restriction: If |eof| 1030occurs in the middle of a line, the system function |eoln| should return 1031a |true| result (even though |f^| will be undefined). 1032 1033Since the inner loop of |input_ln| is part of \TeX's ``inner loop''---each 1034character of input comes in at this place---it is wise to reduce system 1035overhead by making use of special routines that read in an entire array 1036of characters at once, if such routines are available. The following 1037code uses standard \PASCAL\ to illustrate what needs to be done, but 1038finer tuning is often possible at well-developed \PASCAL\ sites. 1039@^inner loop@> 1040 1041@p function input_ln(var f:alpha_file;@!bypass_eoln:boolean):boolean; 1042 {inputs the next line or returns |false|} 1043var last_nonblank:0..buf_size; {|last| with trailing blanks removed} 1044begin if bypass_eoln then if not eof(f) then get(f); 1045 {input the first character of the line into |f^|} 1046last:=first; {cf.\ Matthew 19\thinspace:\thinspace30} 1047if eof(f) then input_ln:=false 1048else begin last_nonblank:=first; 1049 while not eoln(f) do 1050 begin if last>=max_buf_stack then 1051 begin max_buf_stack:=last+1; 1052 if max_buf_stack=buf_size then 1053 @<Report overflow of the input buffer, and abort@>; 1054 end; 1055 buffer[last]:=xord[f^]; get(f); incr(last); 1056 if buffer[last-1]<>" " then last_nonblank:=last; 1057 end; 1058 last:=last_nonblank; input_ln:=true; 1059 end; 1060end; 1061 1062@ The user's terminal acts essentially like other files of text, except 1063that it is used both for input and for output. When the terminal is 1064considered an input file, the file variable is called |term_in|, and when it 1065is considered an output file the file variable is |term_out|. 1066@^system dependencies@> 1067 1068@<Glob...@>= 1069@!term_in:alpha_file; {the terminal as an input file} 1070@!term_out:alpha_file; {the terminal as an output file} 1071 1072@ Here is how to open the terminal files 1073in \ph. The `\.{/I}' switch suppresses the first |get|. 1074@:PASCAL H}{\ph@> 1075@^system dependencies@> 1076 1077@d t_open_in==reset(term_in,'TTY:','/O/I') {open the terminal for text input} 1078@d t_open_out==rewrite(term_out,'TTY:','/O') {open the terminal for text output} 1079 1080@ Sometimes it is necessary to synchronize the input/output mixture that 1081happens on the user's terminal, and three system-dependent 1082procedures are used for this 1083purpose. The first of these, |update_terminal|, is called when we want 1084to make sure that everything we have output to the terminal so far has 1085actually left the computer's internal buffers and been sent. 1086The second, |clear_terminal|, is called when we wish to cancel any 1087input that the user may have typed ahead (since we are about to 1088issue an unexpected error message). The third, |wake_up_terminal|, 1089is supposed to revive the terminal if the user has disabled it by 1090some instruction to the operating system. The following macros show how 1091these operations can be specified in \ph: 1092@:PASCAL H}{\ph@> 1093@^system dependencies@> 1094 1095@d update_terminal == break(term_out) {empty the terminal output buffer} 1096@d clear_terminal == break_in(term_in,true) {clear the terminal input buffer} 1097@d wake_up_terminal == do_nothing {cancel the user's cancellation of output} 1098 1099@ We need a special routine to read the first line of \TeX\ input from 1100the user's terminal. This line is different because it is read before we 1101have opened the transcript file; there is sort of a ``chicken and 1102egg'' problem here. If the user types `\.{\\input paper}' on the first 1103line, or if some macro invoked by that line does such an \.{\\input}, 1104the transcript file will be named `\.{paper.log}'; but if no \.{\\input} 1105commands are performed during the first line of terminal input, the transcript 1106file will acquire its default name `\.{texput.log}'. (The transcript file 1107will not contain error messages generated by the first line before the 1108first \.{\\input} command.) 1109@.texput@> 1110 1111The first line is even more special if we are lucky enough to have an operating 1112system that treats \TeX\ differently from a run-of-the-mill \PASCAL\ object 1113program. It's nice to let the user start running a \TeX\ job by typing 1114a command line like `\.{tex paper}'; in such a case, \TeX\ will operate 1115as if the first line of input were `\.{paper}', i.e., the first line will 1116consist of the remainder of the command line, after the part that invoked 1117\TeX. 1118 1119The first line is special also because it may be read before \TeX\ has 1120input a format file. In such cases, normal error messages cannot yet 1121be given. The following code uses concepts that will be explained later. 1122(If the \PASCAL\ compiler does not support non-local |@!goto|\unskip, the 1123@^system dependencies@> 1124statement `|goto final_end|' should be replaced by something that 1125quietly terminates the program.) 1126 1127@<Report overflow of the input buffer, and abort@>= 1128if format_ident=0 then 1129 begin write_ln(term_out,'Buffer size exceeded!'); goto final_end; 1130@.Buffer size exceeded@> 1131 end 1132else begin cur_input.loc_field:=first; cur_input.limit_field:=last-1; 1133 overflow("buffer size",buf_size); 1134@:TeX capacity exceeded buffer size}{\quad buffer size@> 1135 end 1136 1137@ Different systems have different ways to get started. But regardless of 1138what conventions are adopted, the routine that initializes the terminal 1139should satisfy the following specifications: 1140 1141\yskip\textindent{1)}It should open file |term_in| for input from the 1142 terminal. (The file |term_out| will already be open for output to the 1143 terminal.) 1144 1145\textindent{2)}If the user has given a command line, this line should be 1146 considered the first line of terminal input. Otherwise the 1147 user should be prompted with `\.{**}', and the first line of input 1148 should be whatever is typed in response. 1149 1150\textindent{3)}The first line of input, which might or might not be a 1151 command line, should appear in locations |first| to |last-1| of the 1152 |buffer| array. 1153 1154\textindent{4)}The global variable |loc| should be set so that the 1155 character to be read next by \TeX\ is in |buffer[loc]|. This 1156 character should not be blank, and we should have |loc<last|. 1157 1158\yskip\noindent(It may be necessary to prompt the user several times 1159before a non-blank line comes in. The prompt is `\.{**}' instead of the 1160later `\.*' because the meaning is slightly different: `\.{\\input}' need 1161not be typed immediately after~`\.{**}'.) 1162 1163@d loc==cur_input.loc_field {location of first unread character in |buffer|} 1164 1165@ The following program does the required initialization 1166without retrieving a possible command line. 1167It should be clear how to modify this routine to deal with command lines, 1168if the system permits them. 1169@^system dependencies@> 1170 1171@p function init_terminal:boolean; {gets the terminal input started} 1172label exit; 1173begin t_open_in; 1174loop@+begin wake_up_terminal; write(term_out,'**'); update_terminal; 1175@.**@> 1176 if not input_ln(term_in,true) then {this shouldn't happen} 1177 begin write_ln(term_out); 1178 write(term_out,'! End of file on the terminal... why?'); 1179@.End of file on the terminal@> 1180 init_terminal:=false; return; 1181 end; 1182 loc:=first; 1183 while (loc<last)and(buffer[loc]=" ") do incr(loc); 1184 if loc<last then 1185 begin init_terminal:=true; 1186 return; {return unless the line was all blank} 1187 end; 1188 write_ln(term_out,'Please type the name of your input file.'); 1189 end; 1190exit:end; 1191 1192@* \[4] String handling. 1193Control sequence names and diagnostic messages are variable-length strings 1194of eight-bit characters. Since \PASCAL\ does not have a well-developed string 1195mechanism, \TeX\ does all of its string processing by homegrown methods. 1196 1197Elaborate facilities for dynamic strings are not needed, so all of the 1198necessary operations can be handled with a simple data structure. 1199The array |str_pool| contains all of the (eight-bit) ASCII codes in all 1200of the strings, and the array |str_start| contains indices of the starting 1201points of each string. Strings are referred to by integer numbers, so that 1202string number |s| comprises the characters |str_pool[j]| for 1203|str_start_macro[s]<=j<str_start_macro[s+1]|. Additional integer variables 1204|pool_ptr| and |str_ptr| indicate the number of entries used so far 1205in |str_pool| and |str_start|, respectively; locations 1206|str_pool[pool_ptr]| and |str_start_macro[str_ptr]| are 1207ready for the next string to be allocated. 1208 1209String numbers 0 to 255 are reserved for strings that correspond to single 1210ASCII characters. This is in accordance with the conventions of \.{WEB}, 1211@.WEB@> 1212which converts single-character strings into the ASCII code number of the 1213single character involved, while it converts other strings into integers 1214and builds a string pool file. Thus, when the string constant \.{"."} appears 1215in the program below, \.{WEB} converts it into the integer 46, which is the 1216ASCII code for a period, while \.{WEB} will convert a string like \.{"hello"} 1217into some integer greater than~255. String number 46 will presumably be the 1218single character `\..'; but some ASCII codes have no standard visible 1219representation, and \TeX\ sometimes needs to be able to print an arbitrary 1220ASCII character, so the first 256 strings are used to specify exactly what 1221should be printed for each of the 256 possibilities. 1222 1223Elements of the |str_pool| array must be ASCII codes that can actually 1224be printed; i.e., they must have an |xchr| equivalent in the local 1225character set. (This restriction applies only to preloaded strings, 1226not to those generated dynamically by the user.) 1227 1228Some \PASCAL\ compilers won't pack integers into a single byte unless the 1229integers lie in the range |-128..127|. To accommodate such systems 1230we access the string pool only via macros that can easily be redefined. 1231@^system dependencies@> 1232 1233@d si(#) == # {convert from |ASCII_code| to |packed_ASCII_code|} 1234@d so(#) == # {convert from |packed_ASCII_code| to |ASCII_code|} 1235@d str_start_macro(#) == str_start[(#) - too_big_char] 1236 1237@<Types...@>= 1238@!pool_pointer = 0..pool_size; {for variables that point into |str_pool|} 1239@!str_number = 0..max_strings; {for variables that point into |str_start|} 1240@!packed_ASCII_code = 0..biggest_char; {elements of |str_pool| array} 1241 1242@ @<Glob...@>= 1243@!str_pool:packed array[pool_pointer] of packed_ASCII_code; {the characters} 1244@!str_start : array[str_number] of pool_pointer; {the starting pointers} 1245@!pool_ptr : pool_pointer; {first unused position in |str_pool|} 1246@!str_ptr : str_number; {number of the current string being created} 1247@!init_pool_ptr : pool_pointer; {the starting value of |pool_ptr|} 1248@!init_str_ptr : str_number; {the starting value of |str_ptr|} 1249 1250@ Several of the elementary string operations are performed using \.{WEB} 1251macros instead of \PASCAL\ procedures, because many of the 1252operations are done quite frequently and we want to avoid the 1253overhead of procedure calls. For example, here is 1254a simple macro that computes the length of a string. 1255@.WEB@> 1256 1257@p function length(s:str_number):integer; 1258 {the number of characters in string number |s|} 1259begin if (s>=@"10000) then length:=str_start_macro(s+1)-str_start_macro(s) 1260else if (s>=@"20) and (s<@"7F) then length:=1 1261else if (s<=@"7F) then length:=3 1262else if (s<@"100) then length:=4 1263else length:=8 1264end; 1265 1266@ The length of the current string is called |cur_length|: 1267 1268@d cur_length == (pool_ptr - str_start_macro(str_ptr)) 1269 1270@ Strings are created by appending character codes to |str_pool|. 1271The |append_char| macro, defined here, does not check to see if the 1272value of |pool_ptr| has gotten too high; this test is supposed to be 1273made before |append_char| is used. There is also a |flush_char| 1274macro, which erases the last character appended. 1275 1276To test if there is room to append |l| more characters to |str_pool|, 1277we shall write |str_room(l)|, which aborts \TeX\ and gives an 1278apologetic error message if there isn't enough room. 1279 1280@d append_char(#) == {put |ASCII_code| \# at the end of |str_pool|} 1281begin str_pool[pool_ptr]:=si(#); incr(pool_ptr); 1282end 1283@d flush_char == decr(pool_ptr) {forget the last character in the pool} 1284@d str_room(#) == {make sure that the pool hasn't overflowed} 1285 begin if pool_ptr+# > pool_size then 1286 overflow("pool size",pool_size-init_pool_ptr); 1287@:TeX capacity exceeded pool size}{\quad pool size@> 1288 end 1289 1290@ Once a sequence of characters has been appended to |str_pool|, it 1291officially becomes a string when the function |make_string| is called. 1292This function returns the identification number of the new string as its 1293value. 1294 1295@p function make_string : str_number; {current string enters the pool} 1296begin if str_ptr=max_strings then 1297 overflow("number of strings",max_strings-init_str_ptr); 1298@:TeX capacity exceeded number of strings}{\quad number of strings@> 1299incr(str_ptr); str_start_macro(str_ptr):=pool_ptr; 1300make_string:=str_ptr-1; 1301end; 1302 1303@ To destroy the most recently made string, we say |flush_string|. 1304 1305@d flush_string==begin decr(str_ptr); pool_ptr:=str_start_macro(str_ptr); 1306 end 1307 1308@p procedure append_str(@!s:str_number); { append an existing string to the current string } 1309var i: integer; 1310 j: pool_pointer; 1311begin 1312 i:=length(s); 1313 str_room(i); 1314 j:=str_start_macro(s); 1315 while (i > 0) do begin 1316 append_char(str_pool[j]); 1317 incr(j); decr(i); 1318 end; 1319end; 1320 1321@ The following subroutine compares string |s| with another string of the 1322same length that appears in |buffer| starting at position |k|; 1323the result is |true| if and only if the strings are equal. 1324Empirical tests indicate that |str_eq_buf| is used in such a way that 1325it tends to return |true| about 80 percent of the time. 1326 1327@p function str_eq_buf(@!s:str_number;@!k:integer):boolean; 1328 {test equality of strings} 1329label not_found; {loop exit} 1330var j: pool_pointer; {running index} 1331@!result: boolean; {result of comparison} 1332begin j:=str_start_macro(s); 1333while j<str_start_macro(s+1) do 1334 begin 1335 if buffer[k]>=@"10000 then 1336 if so(str_pool[j])<>@"D800+(buffer[k]-@"10000)div@"400 then 1337 begin result:=false; goto not_found; 1338 end 1339 else if so(str_pool[j+1])<>@"DC00+(buffer[k]-@"10000)mod@"400 then 1340 begin result:=false; goto not_found; 1341 end 1342 else incr(j) 1343 else if so(str_pool[j])<>buffer[k] then 1344 begin result:=false; goto not_found; 1345 end; 1346 incr(j); incr(k); 1347 end; 1348result:=true; 1349not_found: str_eq_buf:=result; 1350end; 1351 1352@ Here is a similar routine, but it compares two strings in the string pool, 1353and it does not assume that they have the same length. 1354 1355@p function str_eq_str(@!s,@!t:str_number):boolean; 1356 {test equality of strings} 1357label not_found; {loop exit} 1358var j,@!k: pool_pointer; {running indices} 1359@!result: boolean; {result of comparison} 1360begin result:=false; 1361if length(s)<>length(t) then goto not_found; 1362if (length(s)=1) then begin 1363 if s<65536 then begin 1364 if t<65536 then begin 1365 if s<>t then goto not_found; 1366 end 1367 else begin 1368 if s<>str_pool[str_start_macro(t)] then goto not_found; 1369 end; 1370 end 1371 else begin 1372 if t<65536 then begin 1373 if str_pool[str_start_macro(s)]<>t then goto not_found; 1374 end 1375 else begin 1376 if str_pool[str_start_macro(s)]<>str_pool[str_start_macro(t)] then 1377 goto not_found; 1378 end; 1379 end; 1380 end 1381else begin 1382 j:=str_start_macro(s); k:=str_start_macro(t); 1383 while j<str_start_macro(s+1) do 1384 begin if str_pool[j]<>str_pool[k] then goto not_found; 1385 incr(j); incr(k); 1386 end; 1387 end; 1388result:=true; 1389not_found: str_eq_str:=result; 1390end; 1391 1392@ The initial values of |str_pool|, |str_start|, |pool_ptr|, 1393and |str_ptr| are computed by the \.{INITEX} program, based in part 1394on the information that \.{WEB} has output while processing \TeX. 1395@.INITEX@> 1396@^string pool@> 1397 1398@p @!init function get_strings_started:boolean; {initializes the string pool, 1399 but returns |false| if something goes wrong} 1400label done,exit; 1401var 1402@!m,@!n:text_char; {characters input from |pool_file|} 1403@!g:str_number; {garbage} 1404@!a:integer; {accumulator for check sum} 1405@!c:boolean; {check sum has been checked} 1406begin pool_ptr:=0; str_ptr:=0; str_start[0]:=0; 1407@<Make the first 256 strings@>; 1408@<Read the other strings from the \.{TEX.POOL} file and return |true|, 1409 or give an error message and return |false|@>; 1410exit:end; 1411tini 1412 1413@ The first 65536 strings will consist of a single character only. 1414But we don't actually make them; they're simulated on the fly. 1415 1416@<Make the first 256...@>= 1417begin 1418str_ptr:=too_big_char; 1419str_start_macro(str_ptr):=pool_ptr; 1420end 1421 1422@ The first 128 strings will contain 95 standard ASCII characters, and the 1423other 33 characters will be printed in three-symbol form like `\.{\^\^A}' 1424unless a system-dependent change is made here. Installations that have 1425an extended character set, where for example |xchr[@'32]=@t\.{\'^^Z\'}@>|, 1426would like string @'32 to be the single character @'32 instead of the 1427three characters @'136, @'136, @'132 (\.{\^\^Z}). On the other hand, 1428even people with an extended character set will want to represent string 1429@'15 by \.{\^\^M}, since @'15 is |carriage_return|; the idea is to 1430produce visible strings instead of tabs or line-feeds or carriage-returns 1431or bell-rings or characters that are treated anomalously in text files. 1432 1433Unprintable characters of codes 128--255 are, similarly, rendered 1434\.{\^\^80}--\.{\^\^ff}. 1435 1436The boolean expression defined here should be |true| unless \TeX\ 1437internal code number~|k| corresponds to a non-troublesome visible 1438symbol in the local character set. An appropriate formula for the 1439extended character set recommended in {\sl The \TeX book\/} would, for 1440example, be `|k in [0,@'10..@'12,@'14,@'15,@'33,@'177..@'377]|'. 1441If character |k| cannot be printed, and |k<@'200|, then character |k+@'100| or 1442|k-@'100| must be printable; moreover, ASCII codes |[@'41..@'46, 1443@'60..@'71, @'136, @'141..@'146, @'160..@'171]| must be printable. 1444Thus, at least 81 printable characters are needed. 1445@:TeXbook}{\sl The \TeX book@> 1446@^character set dependencies@> 1447@^system dependencies@> 1448 1449@ When the \.{WEB} system program called \.{TANGLE} processes the \.{TEX.WEB} 1450description that you are now reading, it outputs the \PASCAL\ program 1451\.{TEX.PAS} and also a string pool file called \.{TEX.POOL}. The \.{INITEX} 1452@.WEB@>@.INITEX@> 1453program reads the latter file, where each string appears as a two-digit decimal 1454length followed by the string itself, and the information is recorded in 1455\TeX's string memory. 1456 1457@<Glob...@>= 1458@!init @!pool_file:alpha_file; {the string-pool file output by \.{TANGLE}} 1459tini 1460 1461@ @d bad_pool(#)==begin wake_up_terminal; write_ln(term_out,#); 1462 a_close(pool_file); get_strings_started:=false; return; 1463 end 1464@<Read the other strings...@>= 1465name_of_file:=pool_name; {we needn't set |name_length|} 1466if a_open_in(pool_file) then 1467 begin c:=false; 1468 repeat @<Read one string, but return |false| if the 1469 string memory space is getting too tight for comfort@>; 1470 until c; 1471 a_close(pool_file); get_strings_started:=true; 1472 end 1473else bad_pool('! I can''t read TEX.POOL.') 1474@.I can't read TEX.POOL@> 1475 1476@ @<Read one string...@>= 1477begin if eof(pool_file) then bad_pool('! TEX.POOL has no check sum.'); 1478@.TEX.POOL has no check sum@> 1479read(pool_file,m,n); {read two digits of string length} 1480if m='*' then @<Check the pool check sum@> 1481else begin if (xord[m]<"0")or(xord[m]>"9")or@| 1482 (xord[n]<"0")or(xord[n]>"9") then 1483 bad_pool('! TEX.POOL line doesn''t begin with two digits.'); 1484@.TEX.POOL line doesn't...@> 1485 l:=xord[m]*10+xord[n]-"0"*11; {compute the length} 1486 if pool_ptr+l+string_vacancies>pool_size then 1487 bad_pool('! You have to increase POOLSIZE.'); 1488@.You have to increase POOLSIZE@> 1489 for k:=1 to l do 1490 begin if eoln(pool_file) then m:=' '@+else read(pool_file,m); 1491 append_char(xord[m]); 1492 end; 1493 read_ln(pool_file); g:=make_string; 1494 end; 1495end 1496 1497@ The \.{WEB} operation \.{@@\$} denotes the value that should be at the 1498end of this \.{TEX.POOL} file; any other value means that the wrong pool 1499file has been loaded. 1500@^check sum@> 1501 1502@<Check the pool check sum@>= 1503begin a:=0; k:=1; 1504loop@+ begin if (xord[n]<"0")or(xord[n]>"9") then 1505 bad_pool('! TEX.POOL check sum doesn''t have nine digits.'); 1506@.TEX.POOL check sum...@> 1507 a:=10*a+xord[n]-"0"; 1508 if k=9 then goto done; 1509 incr(k); read(pool_file,n); 1510 end; 1511done: if a<>@$ then bad_pool('! TEX.POOL doesn''t match; TANGLE me again.'); 1512@.TEX.POOL doesn't match@> 1513c:=true; 1514end 1515 1516@* \[5] On-line and off-line printing. 1517Messages that are sent to a user's terminal and to the transcript-log file 1518are produced by several `|print|' procedures. These procedures will 1519direct their output to a variety of places, based on the setting of 1520the global variable |selector|, which has the following possible 1521values: 1522 1523\yskip 1524\hang |term_and_log|, the normal setting, prints on the terminal and on the 1525 transcript file. 1526 1527\hang |log_only|, prints only on the transcript file. 1528 1529\hang |term_only|, prints only on the terminal. 1530 1531\hang |no_print|, doesn't print at all. This is used only in rare cases 1532 before the transcript file is open. 1533 1534\hang |pseudo|, puts output into a cyclic buffer that is used 1535 by the |show_context| routine; when we get to that routine we shall discuss 1536 the reasoning behind this curious mode. 1537 1538\hang |new_string|, appends the output to the current string in the 1539 string pool. 1540 1541\hang 0 to 15, prints on one of the sixteen files for \.{\\write} output. 1542 1543\yskip 1544\noindent The symbolic names `|term_and_log|', etc., have been assigned 1545numeric codes that satisfy the convenient relations |no_print+1=term_only|, 1546|no_print+2=log_only|, |term_only+2=log_only+1=term_and_log|. 1547 1548Three additional global variables, |tally| and |term_offset| and 1549|file_offset|, record the number of characters that have been printed 1550since they were most recently cleared to zero. We use |tally| to record 1551the length of (possibly very long) stretches of printing; |term_offset| 1552and |file_offset|, on the other hand, keep track of how many characters 1553have appeared so far on the current line that has been output to the 1554terminal or to the transcript file, respectively. 1555 1556@d no_print=16 {|selector| setting that makes data disappear} 1557@d term_only=17 {printing is destined for the terminal only} 1558@d log_only=18 {printing is destined for the transcript file only} 1559@d term_and_log=19 {normal |selector| setting} 1560@d pseudo=20 {special |selector| setting for |show_context|} 1561@d new_string=21 {printing is deflected to the string pool} 1562@d max_selector=21 {highest selector setting} 1563 1564@<Glob...@>= 1565@!log_file : alpha_file; {transcript of \TeX\ session} 1566@!selector : 0..max_selector; {where to print a message} 1567@!dig : array[0..22] of 0..15; {digits in a number being output} 1568@!tally : integer; {the number of characters recently printed} 1569@!term_offset : 0..max_print_line; 1570 {the number of characters on the current terminal line} 1571@!file_offset : 0..max_print_line; 1572 {the number of characters on the current file line} 1573@!trick_buf:array[0..error_line] of ASCII_code; {circular buffer for 1574 pseudoprinting} 1575@!trick_count: integer; {threshold for pseudoprinting, explained later} 1576@!first_count: integer; {another variable for pseudoprinting} 1577 1578@ @<Initialize the output routines@>= 1579selector:=term_only; tally:=0; term_offset:=0; file_offset:=0; 1580 1581@ Macro abbreviations for output to the terminal and to the log file are 1582defined here for convenience. Some systems need special conventions 1583for terminal output, and it is possible to adhere to those conventions 1584by changing |wterm|, |wterm_ln|, and |wterm_cr| in this section. 1585@^system dependencies@> 1586 1587@d wterm(#)==write(term_out,#) 1588@d wterm_ln(#)==write_ln(term_out,#) 1589@d wterm_cr==write_ln(term_out) 1590@d wlog(#)==write(log_file,#) 1591@d wlog_ln(#)==write_ln(log_file,#) 1592@d wlog_cr==write_ln(log_file) 1593 1594@ To end a line of text output, we call |print_ln|. 1595 1596@<Basic print...@>= 1597procedure print_ln; {prints an end-of-line} 1598begin case selector of 1599term_and_log: begin wterm_cr; wlog_cr; 1600 term_offset:=0; file_offset:=0; 1601 end; 1602log_only: begin wlog_cr; file_offset:=0; 1603 end; 1604term_only: begin wterm_cr; term_offset:=0; 1605 end; 1606no_print,pseudo,new_string: do_nothing; 1607othercases write_ln(write_file[selector]) 1608endcases;@/ 1609end; {|tally| is not affected} 1610 1611@ The |print_raw_char| procedure sends one character to the desired destination, 1612using the |xchr| array to map it into an external character compatible with 1613|input_ln|. All printing comes through |print_ln|, |print_char| or 1614|print_visible_char|. When printing a multi-byte character, the boolean 1615parameter |incr_offset| is set |false| except for the very last byte, to avoid 1616calling |print_ln| in the middle of such character. 1617 1618@<Basic printing...@>= 1619procedure print_raw_char(@!s:ASCII_code;@!incr_offset:boolean); {prints a single character} 1620label exit; {label is not used but nonetheless kept (for other changes?)} 1621begin 1622case selector of 1623term_and_log: begin wterm(xchr[s]); wlog(xchr[s]); 1624 if incr_offset then 1625 begin incr(term_offset); incr(file_offset); 1626 end; 1627 if term_offset=max_print_line then 1628 begin wterm_cr; term_offset:=0; 1629 end; 1630 if file_offset=max_print_line then 1631 begin wlog_cr; file_offset:=0; 1632 end; 1633 end; 1634log_only: begin wlog(xchr[s]); 1635 if incr_offset then incr(file_offset); 1636 if file_offset=max_print_line then print_ln; 1637 end; 1638term_only: begin wterm(xchr[s]); 1639 if incr_offset then incr(term_offset); 1640 if term_offset=max_print_line then print_ln; 1641 end; 1642no_print: do_nothing; 1643pseudo: if tally<trick_count then trick_buf[tally mod error_line]:=s; 1644new_string: begin if pool_ptr<pool_size then append_char(s); 1645 end; {we drop characters if the string space is full} 1646othercases write(write_file[selector],xchr[s]) 1647endcases;@/ 1648incr(tally); 1649exit:end; 1650 1651@ The |print_char| procedure sends one character to the desired destination. 1652Control sequence names, file names and string constructed with 1653\.{\\string} might contain |ASCII_code| values that can't 1654be printed using |print_raw_char|. These characters will be printed 1655in three- or four-symbol form like `\.{\^\^A}' or `\.{\^\^e4}', 1656unless the -8bit option is enabled. 1657Output that goes to the terminal and/or log file is treated differently 1658when it comes to determining whether a character is printable. 1659 1660@d print_visible_char(#)==print_raw_char(#, true) 1661@d print_lc_hex(#)==l:=#; 1662 if l<10 then print_visible_char(l+"0")@+else print_visible_char(l-10+"a") 1663 1664@<Basic printing...@>= 1665procedure print_char(@!s:integer); {prints a single character} 1666label exit; 1667var l: small_number; 1668begin if (selector>pseudo) and (not doing_special) then 1669 {``printing'' to a new string, encode as UTF-16 rather than UTF-8} 1670 begin if s>=@"10000 then 1671 begin print_visible_char(@"D800 + (s - @"10000) div @"400); 1672 print_visible_char(@"DC00 + (s - @"10000) mod @"400); 1673 end else print_visible_char(s); 1674 return; 1675 end; 1676if @<Character |s| is the current new-line character@> then 1677 if selector<pseudo then 1678 begin print_ln; return; 1679 end; 1680 if (s < 32) and (eight_bit_p = 0) and (not doing_special) then {control char: \.{\^\^X}} 1681 begin print_visible_char("^"); print_visible_char("^"); print_visible_char(s+64); 1682 end 1683 else if s < 127 then { printable ASCII } 1684 print_visible_char(s) 1685 else if (s = 127) then { DEL } 1686 begin if (eight_bit_p = 0) and (not doing_special) then 1687 begin print_visible_char("^"); print_visible_char("^"); print_visible_char("?") 1688 end else 1689 print_visible_char(s) 1690 end 1691 else if (s < @"A0) and (eight_bit_p = 0) and (not doing_special) then {C1 controls: \.{\^\^xx}} 1692 begin print_visible_char("^"); print_visible_char("^"); 1693 print_lc_hex((s mod @"100) div @"10); print_lc_hex(s mod @"10); 1694 end 1695 else begin 1696 { |char >= 128|: encode as UTF8 } 1697 if s<@"800 then begin 1698 print_raw_char(@"C0 + s div @"40, false); 1699 print_raw_char(@"80 + s mod @"40, true); 1700 end 1701 else if s<@"10000 then begin 1702 print_raw_char(@"E0 + (s div @"1000), false); 1703 print_raw_char(@"80 + (s mod @"1000) div @"40, false); 1704 print_raw_char(@"80 + (s mod @"40), true); 1705 end 1706 else begin 1707 print_raw_char(@"F0 + (s div @"40000), false); 1708 print_raw_char(@"80 + (s mod @"40000) div @"1000, false); 1709 print_raw_char(@"80 + (s mod @"1000) div @"40, false); 1710 print_raw_char(@"80 + (s mod @"40), true); 1711 end 1712 end; 1713exit:end; 1714 1715@ @d native_room(#)==while native_text_size <= native_len+# do begin 1716 native_text_size:=native_text_size+128; 1717 native_text:=xrealloc(native_text, native_text_size * sizeof(UTF16_code)); 1718 end 1719@d append_native(#)==begin native_text[native_len]:=#; incr(native_len); end 1720 1721@ @<Glob...@>= 1722@!doing_special: boolean; 1723@!native_text: ^UTF16_code; { buffer for collecting native-font strings } 1724@!native_text_size: integer; { size of buffer } 1725@!native_len: integer; 1726@!save_native_len: integer; 1727 1728@ @<Set init...@>= 1729doing_special:=false; 1730native_text_size:=128; 1731native_text:=xmalloc(native_text_size * sizeof(UTF16_code)); 1732 1733@ An entire string is output by calling |print|. Note that if we are outputting 1734the single standard ASCII character \.c, we could call |print("c")|, since 1735|"c"=99| is the number of a single-character string, as explained above. But 1736|print_char("c")| is quicker, so \TeX\ goes directly to the |print_char| 1737routine when it knows that this is safe. (The present implementation 1738assumes that it is always safe to print a visible ASCII character.) 1739@^system dependencies@> 1740 1741@<Basic print...@>= 1742procedure print(@!s:integer); {prints string |s|} 1743label exit; 1744var j:pool_pointer; {current character code position} 1745@!nl:integer; {new-line character to restore} 1746begin if s>=str_ptr then s:="???" {this can't happen} 1747@.???@> 1748else if s<biggest_char then 1749 if s<0 then s:="???" {can't happen} 1750 else begin if selector>pseudo then 1751 begin print_char(s); return; {internal strings are not expanded} 1752 end; 1753 if (@<Character |s| is the current new-line character@>) then 1754 if selector<pseudo then 1755 begin print_ln; return; 1756 end; 1757 nl:=new_line_char; 1758 new_line_char:=-1; 1759 print_char(s); 1760 new_line_char:=nl; 1761 return; 1762 end; 1763j:=str_start_macro(s); 1764while j<str_start_macro(s+1) do begin 1765 if (so(str_pool[j])>=@"D800) and (so(str_pool[j])<=@"DBFF) 1766 and (j+1<str_start_macro(s+1)) 1767 and (so(str_pool[j+1])>=@"DC00) and (so(str_pool[j+1])<=@"DFFF) then 1768 begin print_char(@"10000 + (so(str_pool[j])-@"D800) * @"400 1769 + so(str_pool[j+1])-@"DC00); j:=j+2; 1770 end 1771 else begin print_char(so(str_pool[j])); incr(j); 1772 end; 1773end; 1774exit:end; 1775 1776@ Old versions of \TeX\ needed a procedure called |slow_print| whose function 1777is now subsumed by |print| and the new functionality of |print_char| and 1778|print_visible_char|. We retain the old name |slow_print| here as a 1779possible aid to future software arch\ae ologists. 1780 1781@d slow_print == print 1782 1783@ Here is the very first thing that \TeX\ prints: a headline that identifies 1784the version number and format package. The |term_offset| variable is temporarily 1785incorrect, but the discrepancy is not serious since we assume that the banner 1786and format identifier together will occupy at most |max_print_line| 1787character positions. 1788 1789@<Initialize the output...@>= 1790wterm(banner); 1791if format_ident=0 then wterm_ln(' (no format preloaded)') 1792else begin slow_print(format_ident); print_ln; 1793 end; 1794update_terminal; 1795 1796@ The procedure |print_nl| is like |print|, but it makes sure that the 1797string appears at the beginning of a new line. 1798 1799@<Basic print...@>= 1800procedure print_nl(@!s:str_number); {prints string |s| at beginning of line} 1801begin if ((term_offset>0)and(odd(selector)))or@| 1802 ((file_offset>0)and(selector>=log_only)) then print_ln; 1803print(s); 1804end; 1805 1806@ The procedure |print_esc| prints a string that is preceded by 1807the user's escape character (which is usually a backslash). 1808 1809@<Basic print...@>= 1810procedure print_esc(@!s:str_number); {prints escape character, then |s|} 1811var c:integer; {the escape character code} 1812begin @<Set variable |c| to the current escape character@>; 1813if c>=0 then if c<=biggest_usv then print_char(c); 1814slow_print(s); 1815end; 1816 1817@ An array of digits in the range |0..15| is printed by |print_the_digs|. 1818 1819@<Basic print...@>= 1820procedure print_the_digs(@!k:eight_bits); 1821 {prints |dig[k-1]|$\,\ldots\,$|dig[0]|} 1822begin while k>0 do 1823 begin decr(k); 1824 if dig[k]<10 then print_char("0"+dig[k]) 1825 else print_char("A"-10+dig[k]); 1826 end; 1827end; 1828 1829@ The following procedure, which prints out the decimal representation of a 1830given integer |n|, has been written carefully so that it works properly 1831if |n=0| or if |(-n)| would cause overflow. It does not apply |mod| or |div| 1832to negative arguments, since such operations are not implemented consistently 1833by all \PASCAL\ compilers. 1834 1835@<Basic print...@>= 1836procedure print_int(@!n:integer); {prints an integer in decimal form} 1837var k:0..23; {index to current digit; we assume that $|n|<10^{23}$} 1838@!m:integer; {used to negate |n| in possibly dangerous cases} 1839begin k:=0; 1840if n<0 then 1841 begin print_char("-"); 1842 if n>-100000000 then negate(n) 1843 else begin m:=-1-n; n:=m div 10; m:=(m mod 10)+1; k:=1; 1844 if m<10 then dig[0]:=m 1845 else begin dig[0]:=0; incr(n); 1846 end; 1847 end; 1848 end; 1849repeat dig[k]:=n mod 10; n:=n div 10; incr(k); 1850until n=0; 1851print_the_digs(k); 1852end; 1853 1854@ Here is a trivial procedure to print two digits; it is usually called with 1855a parameter in the range |0<=n<=99|. 1856 1857@p procedure print_two(@!n:integer); {prints two least significant digits} 1858begin n:=abs(n) mod 100; print_char("0"+(n div 10)); 1859print_char("0"+(n mod 10)); 1860end; 1861 1862@ Hexadecimal printing of nonnegative integers is accomplished by |print_hex|. 1863 1864@p procedure print_hex(@!n:integer); 1865 {prints a positive integer in hexadecimal form} 1866var k:0..22; {index to current digit; we assume that $0\L n<16^{22}$} 1867begin k:=0; print_char(""""); 1868repeat dig[k]:=n mod 16; n:=n div 16; incr(k); 1869until n=0; 1870print_the_digs(k); 1871end; 1872 1873@ Old versions of \TeX\ needed a procedure called |print_ASCII| whose function 1874is now subsumed by |print|. We retain the old name here as a possible aid to 1875future software arch\ae ologists. 1876 1877@d print_ASCII == print 1878 1879@ Roman numerals are produced by the |print_roman_int| routine. Readers 1880who like puzzles might enjoy trying to figure out how this tricky code 1881works; therefore no explanation will be given. Notice that 1990 yields 1882\.{mcmxc}, not \.{mxm}. 1883 1884@p procedure print_roman_int(@!n:integer); 1885label exit; 1886var j,@!k: pool_pointer; {mysterious indices into |str_pool|} 1887@!u,@!v: nonnegative_integer; {mysterious numbers} 1888begin j:=str_start_macro("m2d5c2l5x2v5i"); v:=1000; 1889loop@+ begin while n>=v do 1890 begin print_char(so(str_pool[j])); n:=n-v; 1891 end; 1892 if n<=0 then return; {nonpositive input produces no output} 1893 k:=j+2; u:=v div (so(str_pool[k-1])-"0"); 1894 if str_pool[k-1]=si("2") then 1895 begin k:=k+2; u:=u div (so(str_pool[k-1])-"0"); 1896 end; 1897 if n+u>=v then 1898 begin print_char(so(str_pool[k])); n:=n+u; 1899 end 1900 else begin j:=j+2; v:=v div (so(str_pool[j-1])-"0"); 1901 end; 1902 end; 1903exit:end; 1904 1905@ The |print| subroutine will not print a string that is still being 1906created. The following procedure will. 1907 1908@p procedure print_current_string; {prints a yet-unmade string} 1909var j:pool_pointer; {points to current character code} 1910begin j:=str_start_macro(str_ptr); 1911while j<pool_ptr do 1912 begin print_char(so(str_pool[j])); incr(j); 1913 end; 1914end; 1915 1916@ Here is a procedure that asks the user to type a line of input, 1917assuming that the |selector| setting is either |term_only| or |term_and_log|. 1918The input is placed into locations |first| through |last-1| of the 1919|buffer| array, and echoed on the transcript file if appropriate. 1920 1921This procedure is never called when |interaction<scroll_mode|. 1922 1923@d prompt_input(#)==begin wake_up_terminal; print(#); term_input; 1924 end {prints a string and gets a line of input} 1925 1926@p procedure term_input; {gets a line from the terminal} 1927var k:0..buf_size; {index into |buffer|} 1928begin update_terminal; {now the user sees the prompt for sure} 1929if not input_ln(term_in,true) then fatal_error("End of file on the terminal!"); 1930@.End of file on the terminal@> 1931term_offset:=0; {the user's line ended with \<\rm return>} 1932decr(selector); {prepare to echo the input} 1933if last<>first then for k:=first to last-1 do print(buffer[k]); 1934print_ln; incr(selector); {restore previous status} 1935end; 1936 1937@* \[6] Reporting errors. 1938When something anomalous is detected, \TeX\ typically does something like this: 1939$$\vbox{\halign{#\hfil\cr 1940|print_err("Something anomalous has been detected");|\cr 1941|help3("This is the first line of my offer to help.")|\cr 1942|("This is the second line. I'm trying to")|\cr 1943|("explain the best way for you to proceed.");|\cr 1944|error;|\cr}}$$ 1945A two-line help message would be given using |help2|, etc.; these informal 1946helps should use simple vocabulary that complements the words used in the 1947official error message that was printed. (Outside the U.S.A., the help 1948messages should preferably be translated into the local vernacular. Each 1949line of help is at most 60 characters long, in the present implementation, 1950so that |max_print_line| will not be exceeded.) 1951 1952The |print_err| procedure supplies a `\.!' before the official message, 1953and makes sure that the terminal is awake if a stop is going to occur. 1954The |error| procedure supplies a `\..' after the official message, then it 1955shows the location of the error; and if |interaction=error_stop_mode|, 1956it also enters into a dialog with the user, during which time the help 1957message may be printed. 1958@^system dependencies@> 1959 1960@ The global variable |interaction| has four settings, representing increasing 1961amounts of user interaction: 1962 1963@d batch_mode=0 {omits all stops and omits terminal output} 1964@d nonstop_mode=1 {omits all stops} 1965@d scroll_mode=2 {omits error stops} 1966@d error_stop_mode=3 {stops at every opportunity to interact} 1967@d print_err(#)==begin if interaction=error_stop_mode then wake_up_terminal; 1968 print_nl("! "); print(#); 1969 end 1970 1971@<Glob...@>= 1972@!interaction:batch_mode..error_stop_mode; {current level of interaction} 1973 1974@ @<Set init...@>=interaction:=error_stop_mode; 1975 1976@ \TeX\ is careful not to call |error| when the print |selector| setting 1977might be unusual. The only possible values of |selector| at the time of 1978error messages are 1979 1980\yskip\hang|no_print| (when |interaction=batch_mode| 1981 and |log_file| not yet open); 1982 1983\hang|term_only| (when |interaction>batch_mode| and |log_file| not yet open); 1984 1985\hang|log_only| (when |interaction=batch_mode| and |log_file| is open); 1986 1987\hang|term_and_log| (when |interaction>batch_mode| and |log_file| is open). 1988 1989@<Initialize the print |selector| based on |interaction|@>= 1990if interaction=batch_mode then selector:=no_print@+else selector:=term_only 1991 1992@ A global variable |deletions_allowed| is set |false| if the |get_next| 1993routine is active when |error| is called; this ensures that |get_next| 1994and related routines like |get_token| will never be called recursively. 1995A similar interlock is provided by |set_box_allowed|. 1996@^recursion@> 1997 1998The global variable |history| records the worst level of error that 1999has been detected. It has four possible values: |spotless|, |warning_issued|, 2000|error_message_issued|, and |fatal_error_stop|. 2001 2002Another global variable, |error_count|, is increased by one when an 2003|error| occurs without an interactive dialog, and it is reset to zero at 2004the end of every paragraph. If |error_count| reaches 100, \TeX\ decides 2005that there is no point in continuing further. 2006 2007@d spotless=0 {|history| value when nothing has been amiss yet} 2008@d warning_issued=1 {|history| value when |begin_diagnostic| has been called} 2009@d error_message_issued=2 {|history| value when |error| has been called} 2010@d fatal_error_stop=3 {|history| value when termination was premature} 2011 2012@<Glob...@>= 2013@!deletions_allowed:boolean; {is it safe for |error| to call |get_token|?} 2014@!set_box_allowed:boolean; {is it safe to do a \.{\\setbox} assignment?} 2015@!history:spotless..fatal_error_stop; {has the source input been clean so far?} 2016@!error_count:-1..100; {the number of scrolled errors since the 2017 last paragraph ended} 2018 2019@ The value of |history| is initially |fatal_error_stop|, but it will 2020be changed to |spotless| if \TeX\ survives the initialization process. 2021 2022@<Set init...@>= 2023deletions_allowed:=true; set_box_allowed:=true; 2024error_count:=0; {|history| is initialized elsewhere} 2025 2026@ Since errors can be detected almost anywhere in \TeX, we want to declare the 2027error procedures near the beginning of the program. But the error procedures 2028in turn use some other procedures, which need to be declared |forward| 2029before we get to |error| itself. 2030 2031It is possible for |error| to be called recursively if some error arises 2032when |get_token| is being used to delete a token, and/or if some fatal error 2033occurs while \TeX\ is trying to fix a non-fatal one. But such recursion 2034@^recursion@> 2035is never more than two levels deep. 2036 2037@<Error handling...@>= 2038procedure@?normalize_selector; forward;@t\2@>@/ 2039procedure@?get_token; forward;@t\2@>@/ 2040procedure@?term_input; forward;@t\2@>@/ 2041procedure@?show_context; forward;@t\2@>@/ 2042procedure@?begin_file_reading; forward;@t\2@>@/ 2043procedure@?open_log_file; forward;@t\2@>@/ 2044procedure@?close_files_and_terminate; forward;@t\2@>@/ 2045procedure@?clear_for_error_prompt; forward;@t\2@>@/ 2046procedure@?give_err_help; forward;@t\2@>@/ 2047@t\4\hskip-\fontdimen2\font@>@;@+@!debug@+procedure@?debug_help; 2048 forward;@;@+gubed 2049 2050@ Individual lines of help are recorded in the array |help_line|, which 2051contains entries in positions |0..(help_ptr-1)|. They should be printed 2052in reverse order, i.e., with |help_line[0]| appearing last. 2053 2054@d hlp1(#)==help_line[0]:=#;@+end 2055@d hlp2(#)==help_line[1]:=#; hlp1 2056@d hlp3(#)==help_line[2]:=#; hlp2 2057@d hlp4(#)==help_line[3]:=#; hlp3 2058@d hlp5(#)==help_line[4]:=#; hlp4 2059@d hlp6(#)==help_line[5]:=#; hlp5 2060@d help0==help_ptr:=0 {sometimes there might be no help} 2061@d help1==@+begin help_ptr:=1; hlp1 {use this with one help line} 2062@d help2==@+begin help_ptr:=2; hlp2 {use this with two help lines} 2063@d help3==@+begin help_ptr:=3; hlp3 {use this with three help lines} 2064@d help4==@+begin help_ptr:=4; hlp4 {use this with four help lines} 2065@d help5==@+begin help_ptr:=5; hlp5 {use this with five help lines} 2066@d help6==@+begin help_ptr:=6; hlp6 {use this with six help lines} 2067 2068@<Glob...@>= 2069@!help_line:array[0..5] of str_number; {helps for the next |error|} 2070@!help_ptr:0..6; {the number of help lines present} 2071@!use_err_help:boolean; {should the |err_help| list be shown?} 2072 2073@ @<Set init...@>= 2074help_ptr:=0; use_err_help:=false; 2075 2076@ The |jump_out| procedure just cuts across all active procedure levels and 2077goes to |end_of_TEX|. This is the only nontrivial |@!goto| statement in the 2078whole program. It is used when there is no recovery from a particular error. 2079 2080Some \PASCAL\ compilers do not implement non-local |goto| statements. 2081@^system dependencies@> 2082In such cases the body of |jump_out| should simply be 2083`|close_files_and_terminate|;\thinspace' followed by a call on some system 2084procedure that quietly terminates the program. 2085 2086@<Error hand...@>= 2087procedure jump_out; 2088begin goto end_of_TEX; 2089end; 2090 2091@ Here now is the general |error| routine. 2092 2093@<Error hand...@>= 2094procedure error; {completes the job of error reporting} 2095label continue,exit; 2096var c:ASCII_code; {what the user types} 2097@!s1,@!s2,@!s3,@!s4:integer; 2098 {used to save global variables when deleting tokens} 2099begin if history<error_message_issued then history:=error_message_issued; 2100print_char("."); show_context; 2101if interaction=error_stop_mode then @<Get user's advice and |return|@>; 2102incr(error_count); 2103if error_count=100 then 2104 begin print_nl("(That makes 100 errors; please try again.)"); 2105@.That makes 100 errors...@> 2106 history:=fatal_error_stop; jump_out; 2107 end; 2108@<Put help message on the transcript file@>; 2109exit:end; 2110 2111@ @<Get user's advice...@>= 2112loop@+begin continue: clear_for_error_prompt; prompt_input("? "); 2113@.?\relax@> 2114 if last=first then return; 2115 c:=buffer[first]; 2116 if c>="a" then c:=c+"A"-"a"; {convert to uppercase} 2117 @<Interpret code |c| and |return| if done@>; 2118 end 2119 2120@ It is desirable to provide an `\.E' option here that gives the user 2121an easy way to return from \TeX\ to the system editor, with the offending 2122line ready to be edited. But such an extension requires some system 2123wizardry, so the present implementation simply types out the name of the 2124file that should be 2125edited and the relevant line number. 2126@^system dependencies@> 2127 2128There is a secret `\.D' option available when the debugging routines haven't 2129been commented~out. 2130@^debugging@> 2131 2132@<Interpret code |c| and |return| if done@>= 2133case c of 2134"0","1","2","3","4","5","6","7","8","9": if deletions_allowed then 2135 @<Delete \(c)|c-"0"| tokens and |goto continue|@>; 2136@t\4\4@>@;@+@!debug "D": begin debug_help; goto continue;@+end;@+gubed@/ 2137"E": if base_ptr>0 then 2138 begin print_nl("You want to edit file "); 2139@.You want to edit file x@> 2140 slow_print(input_stack[base_ptr].name_field); 2141 print(" at line "); print_int(line); 2142 interaction:=scroll_mode; jump_out; 2143 end; 2144"H": @<Print the help information and |goto continue|@>; 2145"I":@<Introduce new material from the terminal and |return|@>; 2146"Q","R","S":@<Change the interaction level and |return|@>; 2147"X":begin interaction:=scroll_mode; jump_out; 2148 end; 2149othercases do_nothing 2150endcases;@/ 2151@<Print the menu of available options@> 2152 2153@ @<Print the menu...@>= 2154begin print("Type <return> to proceed, S to scroll future error messages,");@/ 2155@.Type <return> to proceed...@> 2156print_nl("R to run without stopping, Q to run quietly,");@/ 2157print_nl("I to insert something, "); 2158if base_ptr>0 then print("E to edit your file,"); 2159if deletions_allowed then 2160 print_nl("1 or ... or 9 to ignore the next 1 to 9 tokens of input,"); 2161print_nl("H for help, X to quit."); 2162end 2163 2164@ Here the author of \TeX\ apologizes for making use of the numerical 2165relation between |"Q"|, |"R"|, |"S"|, and the desired interaction settings 2166|batch_mode|, |nonstop_mode|, |scroll_mode|. 2167@^Knuth, Donald Ervin@> 2168 2169@<Change the interaction...@>= 2170begin error_count:=0; interaction:=batch_mode+c-"Q"; 2171print("OK, entering "); 2172case c of 2173"Q":begin print_esc("batchmode"); decr(selector); 2174 end; 2175"R":print_esc("nonstopmode"); 2176"S":print_esc("scrollmode"); 2177end; {there are no other cases} 2178print("..."); print_ln; update_terminal; return; 2179end 2180 2181@ When the following code is executed, |buffer[(first+1)..(last-1)]| may 2182contain the material inserted by the user; otherwise another prompt will 2183be given. In order to understand this part of the program fully, you need 2184to be familiar with \TeX's input stacks. 2185 2186@<Introduce new material...@>= 2187begin begin_file_reading; {enter a new syntactic level for terminal input} 2188{now |state=mid_line|, so an initial blank space will count as a blank} 2189if last>first+1 then 2190 begin loc:=first+1; buffer[first]:=" "; 2191 end 2192else begin prompt_input("insert>"); loc:=first; 2193@.insert>@> 2194 end; 2195first:=last; 2196cur_input.limit_field:=last-1; {no |end_line_char| ends this line} 2197return; 2198end 2199 2200@ We allow deletion of up to 99 tokens at a time. 2201 2202@<Delete \(c)|c-"0"| tokens...@>= 2203begin s1:=cur_tok; s2:=cur_cmd; s3:=cur_chr; s4:=align_state; 2204align_state:=1000000; OK_to_interrupt:=false; 2205if (last>first+1) and (buffer[first+1]>="0")and(buffer[first+1]<="9") then 2206 c:=c*10+buffer[first+1]-"0"*11 2207else c:=c-"0"; 2208while c>0 do 2209 begin get_token; {one-level recursive call of |error| is possible} 2210 decr(c); 2211 end; 2212cur_tok:=s1; cur_cmd:=s2; cur_chr:=s3; align_state:=s4; OK_to_interrupt:=true; 2213help2("I have just deleted some text, as you asked.")@/ 2214("You can now delete more, or insert, or whatever."); 2215show_context; goto continue; 2216end 2217 2218@ @<Print the help info...@>= 2219begin if use_err_help then 2220 begin give_err_help; use_err_help:=false; 2221 end 2222else begin if help_ptr=0 then 2223 help2("Sorry, I don't know how to help in this situation.")@/ 2224 @t\kern1em@>("Maybe you should try asking a human?"); 2225 repeat decr(help_ptr); print(help_line[help_ptr]); print_ln; 2226 until help_ptr=0; 2227 end; 2228help4("Sorry, I already gave what help I could...")@/ 2229 ("Maybe you should try asking a human?")@/ 2230 ("An error might have occurred before I noticed any problems.")@/ 2231 ("``If all else fails, read the instructions.''");@/ 2232goto continue; 2233end 2234 2235@ @<Put help message on the transcript file@>= 2236if interaction>batch_mode then decr(selector); {avoid terminal output} 2237if use_err_help then 2238 begin print_ln; give_err_help; 2239 end 2240else while help_ptr>0 do 2241 begin decr(help_ptr); print_nl(help_line[help_ptr]); 2242 end; 2243print_ln; 2244if interaction>batch_mode then incr(selector); {re-enable terminal output} 2245print_ln 2246 2247@ A dozen or so error messages end with a parenthesized integer, so we 2248save a teeny bit of program space by declaring the following procedure: 2249 2250@p procedure int_error(@!n:integer); 2251begin print(" ("); print_int(n); print_char(")"); error; 2252end; 2253 2254@ In anomalous cases, the print selector might be in an unknown state; 2255the following subroutine is called to fix things just enough to keep 2256running a bit longer. 2257 2258@p procedure normalize_selector; 2259begin if log_opened then selector:=term_and_log 2260else selector:=term_only; 2261if job_name=0 then open_log_file; 2262if interaction=batch_mode then decr(selector); 2263end; 2264 2265@ The following procedure prints \TeX's last words before dying. 2266 2267@d succumb==begin if interaction=error_stop_mode then 2268 interaction:=scroll_mode; {no more interaction} 2269 if log_opened then error; 2270 @!debug if interaction>batch_mode then debug_help;@+gubed@;@/ 2271 history:=fatal_error_stop; jump_out; {irrecoverable error} 2272 end 2273 2274@<Error hand...@>= 2275procedure fatal_error(@!s:str_number); {prints |s|, and that's it} 2276begin normalize_selector;@/ 2277print_err("Emergency stop"); help1(s); succumb; 2278@.Emergency stop@> 2279end; 2280 2281@ Here is the most dreaded error message. 2282 2283@<Error hand...@>= 2284procedure overflow(@!s:str_number;@!n:integer); {stop due to finiteness} 2285begin normalize_selector; 2286print_err("TeX capacity exceeded, sorry ["); 2287@.TeX capacity exceeded ...@> 2288print(s); print_char("="); print_int(n); print_char("]"); 2289help2("If you really absolutely need more capacity,")@/ 2290 ("you can ask a wizard to enlarge me."); 2291succumb; 2292end; 2293 2294@ The program might sometime run completely amok, at which point there is 2295no choice but to stop. If no previous error has been detected, that's bad 2296news; a message is printed that is really intended for the \TeX\ 2297maintenance person instead of the user (unless the user has been 2298particularly diabolical). The index entries for `this can't happen' may 2299help to pinpoint the problem. 2300@^dry rot@> 2301 2302@<Error hand...@>= 2303procedure confusion(@!s:str_number); 2304 {consistency check violated; |s| tells where} 2305begin normalize_selector; 2306if history<error_message_issued then 2307 begin print_err("This can't happen ("); print(s); print_char(")"); 2308@.This can't happen@> 2309 help1("I'm broken. Please show this to someone who can fix can fix"); 2310 end 2311else begin print_err("I can't go on meeting you like this"); 2312@.I can't go on...@> 2313 help2("One of your faux pas seems to have wounded me deeply...")@/ 2314 ("in fact, I'm barely conscious. Please fix it and try again."); 2315 end; 2316succumb; 2317end; 2318 2319@ Users occasionally want to interrupt \TeX\ while it's running. 2320If the \PASCAL\ runtime system allows this, one can implement 2321a routine that sets the global variable |interrupt| to some nonzero value 2322when such an interrupt is signalled. Otherwise there is probably at least 2323a way to make |interrupt| nonzero using the \PASCAL\ debugger. 2324@^system dependencies@> 2325@^debugging@> 2326 2327@d check_interrupt==begin if interrupt<>0 then pause_for_instructions; 2328 end 2329 2330@<Global...@>= 2331@!interrupt:integer; {should \TeX\ pause for instructions?} 2332@!OK_to_interrupt:boolean; {should interrupts be observed?} 2333 2334@ @<Set init...@>= 2335interrupt:=0; OK_to_interrupt:=true; 2336 2337@ When an interrupt has been detected, the program goes into its 2338highest interaction level and lets the user have nearly the full flexibility of 2339the |error| routine. \TeX\ checks for interrupts only at times when it is 2340safe to do this. 2341 2342@p procedure pause_for_instructions; 2343begin if OK_to_interrupt then 2344 begin interaction:=error_stop_mode; 2345 if (selector=log_only)or(selector=no_print) then 2346 incr(selector); 2347 print_err("Interruption"); 2348@.Interruption@> 2349 help3("You rang?")@/ 2350 ("Try to insert some instructions for me (e.g.,`I\showlists'),")@/ 2351 ("unless you just want to quit by typing `X'."); 2352 deletions_allowed:=false; error; deletions_allowed:=true; 2353 interrupt:=0; 2354 end; 2355end; 2356 2357@* \[7] Arithmetic with scaled dimensions. 2358The principal computations performed by \TeX\ are done entirely in terms of 2359integers less than $2^{31}$ in magnitude; and divisions are done only when both 2360dividend and divisor are nonnegative. Thus, the arithmetic specified in this 2361program can be carried out in exactly the same way on a wide variety of 2362computers, including some small ones. Why? Because the arithmetic 2363calculations need to be spelled out precisely in order to guarantee that 2364\TeX\ will produce identical output on different machines. If some 2365quantities were rounded differently in different implementations, we would 2366find that line breaks and even page breaks might occur in different places. 2367Hence the arithmetic of \TeX\ has been designed with care, and systems that 2368claim to be implementations of \TeX82 should follow precisely the 2369@:TeX82}{\TeX82@> 2370calculations as they appear in the present program. 2371 2372(Actually there are three places where \TeX\ uses |div| with a possibly negative 2373numerator. These are harmless; see |div| in the index. Also if the user 2374sets the \.{\\time} or the \.{\\year} to a negative value, some diagnostic 2375information will involve negative-numerator division. The same remarks 2376apply for |mod| as well as for |div|.) 2377 2378@ Here is a routine that calculates half of an integer, using an 2379unambiguous convention with respect to signed odd numbers. 2380 2381@p function half(@!x:integer):integer; 2382begin if odd(x) then half:=(x+1) div 2 2383else half:=x @!div 2; 2384end; 2385 2386@ Fixed-point arithmetic is done on {\sl scaled integers\/} that are multiples 2387of $2^{-16}$. In other words, a binary point is assumed to be sixteen bit 2388positions from the right end of a binary computer word. 2389 2390@d unity == @'200000 {$2^{16}$, represents 1.00000} 2391@d two == @'400000 {$2^{17}$, represents 2.00000} 2392 2393@<Types...@>= 2394@!scaled = integer; {this type is used for scaled integers} 2395@!nonnegative_integer=0..@'17777777777; {$0\L x<2^{31}$} 2396@!small_number=0..63; {this type is self-explanatory} 2397 2398@ The following function is used to create a scaled integer from a given decimal 2399fraction $(.d_0d_1\ldots d_{k-1})$, where |0<=k<=17|. The digit $d_i$ is 2400given in |dig[i]|, and the calculation produces a correctly rounded result. 2401 2402@p function round_decimals(@!k:small_number) : scaled; 2403 {converts a decimal fraction} 2404var a:integer; {the accumulator} 2405begin a:=0; 2406while k>0 do 2407 begin decr(k); a:=(a+dig[k]*two) div 10; 2408 end; 2409round_decimals:=(a+1) div 2; 2410end; 2411 2412@ Conversely, here is a procedure analogous to |print_int|. If the output 2413of this procedure is subsequently read by \TeX\ and converted by the 2414|round_decimals| routine above, it turns out that the original value will 2415be reproduced exactly; the ``simplest'' such decimal number is output, 2416but there is always at least one digit following the decimal point. 2417 2418The invariant relation in the \&{repeat} loop is that a sequence of 2419decimal digits yet to be printed will yield the original number if and only if 2420they form a fraction~$f$ in the range $s-\delta\L10\cdot2^{16}f<s$. 2421We can stop if and only if $f=0$ satisfies this condition; the loop will 2422terminate before $s$ can possibly become zero. 2423 2424@p procedure print_scaled(@!s:scaled); {prints scaled real, rounded to five 2425 digits} 2426var delta:scaled; {amount of allowable inaccuracy} 2427begin if s<0 then 2428 begin print_char("-"); negate(s); {print the sign, if negative} 2429 end; 2430print_int(s div unity); {print the integer part} 2431print_char("."); 2432s:=10*(s mod unity)+5; delta:=10; 2433repeat if delta>unity then s:=s+@'100000-50000; {round the last digit} 2434print_char("0"+(s div unity)); s:=10*(s mod unity); delta:=delta*10; 2435until s<=delta; 2436end; 2437 2438@ Physical sizes that a \TeX\ user specifies for portions of documents are 2439represented internally as scaled points. Thus, if we define an `sp' (scaled 2440@^sp@> 2441point) as a unit equal to $2^{-16}$ printer's points, every dimension 2442inside of \TeX\ is an integer number of sp. There are exactly 24434,736,286.72 sp per inch. Users are not allowed to specify dimensions 2444larger than $2^{30}-1$ sp, which is a distance of about 18.892 feet (5.7583 2445meters); two such quantities can be added without overflow on a 32-bit 2446computer. 2447 2448The present implementation of \TeX\ does not check for overflow when 2449@^overflow in arithmetic@> 2450dimensions are added or subtracted. This could be done by inserting a 2451few dozen tests of the form `\ignorespaces|if x>=@'10000000000 then 2452@t\\{report\_overflow}@>|', but the chance of overflow is so remote that 2453such tests do not seem worthwhile. 2454 2455\TeX\ needs to do only a few arithmetic operations on scaled quantities, 2456other than addition and subtraction, and the following subroutines do most of 2457the work. A single computation might use several subroutine calls, and it is 2458desirable to avoid producing multiple error messages in case of arithmetic 2459overflow; so the routines set the global variable |arith_error| to |true| 2460instead of reporting errors directly to the user. Another global variable, 2461|remainder|, holds the remainder after a division. 2462 2463@<Glob...@>= 2464@!arith_error:boolean; {has arithmetic overflow occurred recently?} 2465@!remainder:scaled; {amount subtracted to get an exact division} 2466 2467@ The first arithmetical subroutine we need computes $nx+y$, where |x| 2468and~|y| are |scaled| and |n| is an integer. We will also use it to 2469multiply integers. 2470 2471@d nx_plus_y(#)==mult_and_add(#,@'7777777777) 2472@d mult_integers(#)==mult_and_add(#,0,@'17777777777) 2473 2474@p function mult_and_add(@!n:integer;@!x,@!y,@!max_answer:scaled):scaled; 2475begin if n<0 then 2476 begin negate(x); negate(n); 2477 end; 2478if n=0 then mult_and_add:=y 2479else if ((x<=(max_answer-y) div n)and(-x<=(max_answer+y) div n)) then 2480 mult_and_add:=n*x+y 2481else begin arith_error:=true; mult_and_add:=0; 2482 end; 2483end; 2484 2485@ We also need to divide scaled dimensions by integers. 2486 2487@p function x_over_n(@!x:scaled;@!n:integer):scaled; 2488var negative:boolean; {should |remainder| be negated?} 2489begin negative:=false; 2490if n=0 then 2491 begin arith_error:=true; x_over_n:=0; remainder:=x; 2492 end 2493else begin if n<0 then 2494 begin negate(x); negate(n); negative:=true; 2495 end; 2496 if x>=0 then 2497 begin x_over_n:=x div n; remainder:=x mod n; 2498 end 2499 else begin x_over_n:=-((-x) div n); remainder:=-((-x) mod n); 2500 end; 2501 end; 2502if negative then negate(remainder); 2503end; 2504 2505@ Then comes the multiplication of a scaled number by a fraction |n/d|, 2506where |n| and |d| are nonnegative integers |<=@t$2^{16}$@>| and |d| is 2507positive. It would be too dangerous to multiply by~|n| and then divide 2508by~|d|, in separate operations, since overflow might well occur; and it 2509would be too inaccurate to divide by |d| and then multiply by |n|. Hence 2510this subroutine simulates 1.5-precision arithmetic. 2511 2512@p function xn_over_d(@!x:scaled; @!n,@!d:integer):scaled; 2513var positive:boolean; {was |x>=0|?} 2514@!t,@!u,@!v:nonnegative_integer; {intermediate quantities} 2515begin if x>=0 then positive:=true 2516else begin negate(x); positive:=false; 2517 end; 2518t:=(x mod @'100000)*n; 2519u:=(x div @'100000)*n+(t div @'100000); 2520v:=(u mod d)*@'100000 + (t mod @'100000); 2521if u div d>=@'100000 then arith_error:=true 2522else u:=@'100000*(u div d) + (v div d); 2523if positive then 2524 begin xn_over_d:=u; remainder:=v mod d; 2525 end 2526else begin xn_over_d:=-u; remainder:=-(v mod d); 2527 end; 2528end; 2529 2530@ The next subroutine is used to compute the ``badness'' of glue, when a 2531total~|t| is supposed to be made from amounts that sum to~|s|. According 2532to {\sl The \TeX book}, the badness of this situation is $100(t/s)^3$; 2533however, badness is simply a heuristic, so we need not squeeze out the 2534last drop of accuracy when computing it. All we really want is an 2535approximation that has similar properties. 2536@:TeXbook}{\sl The \TeX book@> 2537 2538The actual method used to compute the badness is easier to read from the 2539program than to describe in words. It produces an integer value that is a 2540reasonably close approximation to $100(t/s)^3$, and all implementations 2541of \TeX\ should use precisely this method. Any badness of $2^{13}$ or more is 2542treated as infinitely bad, and represented by 10000. 2543 2544It is not difficult to prove that $$\hbox{|badness(t+1,s)>=badness(t,s) 2545>=badness(t,s+1)|}.$$ The badness function defined here is capable of 2546computing at most 1095 distinct values, but that is plenty. 2547 2548@d inf_bad = 10000 {infinitely bad value} 2549 2550@p function badness(@!t,@!s:scaled):halfword; {compute badness, given |t>=0|} 2551var r:integer; {approximation to $\alpha t/s$, where $\alpha^3\approx 2552 100\cdot2^{18}$} 2553begin if t=0 then badness:=0 2554else if s<=0 then badness:=inf_bad 2555else begin if t<=7230584 then r:=(t*297) div s {$297^3=99.94\times2^{18}$} 2556 else if s>=1663497 then r:=t div (s div 297) 2557 else r:=t; 2558 if r>1290 then badness:=inf_bad {$1290^3<2^{31}<1291^3$} 2559 else badness:=(r*r*r+@'400000) div @'1000000; 2560 end; {that was $r^3/2^{18}$, rounded to the nearest integer} 2561end; 2562 2563@ When \TeX\ ``packages'' a list into a box, it needs to calculate the 2564proportionality ratio by which the glue inside the box should stretch 2565or shrink. This calculation does not affect \TeX's decision making, 2566so the precise details of rounding, etc., in the glue calculation are not 2567of critical importance for the consistency of results on different computers. 2568 2569We shall use the type |glue_ratio| for such proportionality ratios. 2570A glue ratio should take the same amount of memory as an 2571|integer| (usually 32 bits) if it is to blend smoothly with \TeX's 2572other data structures. Thus |glue_ratio| should be equivalent to 2573|short_real| in some implementations of \PASCAL. Alternatively, 2574it is possible to deal with glue ratios using nothing but fixed-point 2575arithmetic; see {\sl TUGboat \bf3},1 (March 1982), 10--27. (But the 2576routines cited there must be modified to allow negative glue ratios.) 2577@^system dependencies@> 2578 2579@d set_glue_ratio_zero(#) == #:=0.0 {store the representation of zero ratio} 2580@d set_glue_ratio_one(#) == #:=1.0 {store the representation of unit ratio} 2581@d float(#) == # {convert from |glue_ratio| to type |real|} 2582@d unfloat(#) == # {convert from |real| to type |glue_ratio|} 2583@d float_constant(#) == #.0 {convert |integer| constant to |real|} 2584 2585@<Types...@>= 2586@!glue_ratio=real; {one-word representation of a glue expansion factor} 2587 2588@* \[8] Packed data. 2589In order to make efficient use of storage space, \TeX\ bases its major data 2590structures on a |memory_word|, which contains either a (signed) integer, 2591possibly scaled, or a (signed) |glue_ratio|, or a small number of 2592fields that are one half or one quarter of the size used for storing 2593integers. 2594 2595If |x| is a variable of type |memory_word|, it contains up to four 2596fields that can be referred to as follows: 2597$$\vbox{\halign{\hfil#&#\hfil&#\hfil\cr 2598|x|&.|int|&(an |integer|)\cr 2599|x|&.|sc|\qquad&(a |scaled| integer)\cr 2600|x|&.|gr|&(a |glue_ratio|)\cr 2601|x.hh.lh|, |x.hh|&.|rh|&(two halfword fields)\cr 2602|x.hh.b0|, |x.hh.b1|, |x.hh|&.|rh|&(two quarterword fields, one halfword 2603 field)\cr 2604|x.qqqq.b0|, |x.qqqq.b1|, |x.qqqq|&.|b2|, |x.qqqq.b3|\hskip-100pt 2605 &\qquad\qquad\qquad(four quarterword fields)\cr}}$$ 2606This is somewhat cumbersome to write, and not very readable either, but 2607macros will be used to make the notation shorter and more transparent. 2608The \PASCAL\ code below gives a formal definition of |memory_word| and 2609its subsidiary types, using packed variant records. \TeX\ makes no 2610assumptions about the relative positions of the fields within a word. 2611 2612Since we are assuming 32-bit integers, a halfword must contain at least 261316 bits, and a quarterword must contain at least 8 bits. 2614@^system dependencies@> 2615But it doesn't hurt to have more bits; for example, with enough 36-bit 2616words you might be able to have |mem_max| as large as 262142, which is 2617eight times as much memory as anybody had during the first four years of 2618\TeX's existence. 2619 2620N.B.: Valuable memory space will be dreadfully wasted unless \TeX\ is compiled 2621by a \PASCAL\ that packs all of the |memory_word| variants into 2622the space of a single integer. This means, for example, that |glue_ratio| 2623words should be |short_real| instead of |real| on some computers. Some 2624\PASCAL\ compilers will pack an integer whose subrange is `|0..255|' into 2625an eight-bit field, but others insist on allocating space for an additional 2626sign bit; on such systems you can get 256 values into a quarterword only 2627if the subrange is `|-128..127|'. 2628 2629The present implementation tries to accommodate as many variations as possible, 2630so it makes few assumptions. If integers having the subrange 2631`|min_quarterword..max_quarterword|' can be packed into a quarterword, 2632and if integers having the subrange `|min_halfword..max_halfword|' 2633can be packed into a halfword, everything should work satisfactorily. 2634 2635It is usually most efficient to have |min_quarterword=min_halfword=0|, 2636so one should try to achieve this unless it causes a severe problem. 2637The values defined here are recommended for most 32-bit computers. 2638 2639@d min_quarterword=0 {smallest allowable value in a |quarterword|} 2640@d max_quarterword=@"FFFF {largest allowable value in a |quarterword|} 2641@d min_halfword==-@"FFFFFFF {smallest allowable value in a |halfword|} 2642@d max_halfword==@"3FFFFFFF {largest allowable value in a |halfword|} 2643 2644@ Here are the inequalities that the quarterword and halfword values 2645must satisfy (or rather, the inequalities that they mustn't satisfy): 2646 2647@<Check the ``constant''...@>= 2648init if (mem_min<>mem_bot)or(mem_max<>mem_top) then bad:=10;@+tini@;@/ 2649if (mem_min>mem_bot)or(mem_max<mem_top) then bad:=10; 2650if (min_quarterword>0)or(max_quarterword<@"7FFF) then bad:=11; 2651if (min_halfword>0)or(max_halfword<@"3FFFFFFF) then bad:=12; 2652if (min_quarterword<min_halfword)or@| 2653 (max_quarterword>max_halfword) then bad:=13; 2654if (mem_min<min_halfword)or(mem_max>=max_halfword)or@| 2655 (mem_bot-mem_min>max_halfword+1) then bad:=14; 2656if (font_base<min_quarterword)or(font_max>max_quarterword) then bad:=15; 2657if font_max>font_base+256 then bad:=16; 2658if (save_size>max_halfword)or(max_strings>max_halfword) then bad:=17; 2659if buf_size>max_halfword then bad:=18; 2660if max_quarterword-min_quarterword<@"FFFF then bad:=19; 2661 2662@ The operation of adding or subtracting |min_quarterword| occurs quite 2663frequently in \TeX, so it is convenient to abbreviate this operation 2664by using the macros |qi| and |qo| for input and output to and from 2665quarterword format. 2666 2667The inner loop of \TeX\ will run faster with respect to compilers 2668that don't optimize expressions like `|x+0|' and `|x-0|', if these 2669macros are simplified in the obvious way when |min_quarterword=0|. 2670@^inner loop@>@^system dependencies@> 2671 2672@d qi(#)==#+min_quarterword 2673 {to put an |eight_bits| item into a quarterword} 2674@d qo(#)==#-min_quarterword 2675 {to take an |eight_bits| item out of a quarterword} 2676@d hi(#)==#+min_halfword 2677 {to put a sixteen-bit item into a halfword} 2678@d ho(#)==#-min_halfword 2679 {to take a sixteen-bit item from a halfword} 2680 2681@ The reader should study the following definitions closely: 2682@^system dependencies@> 2683 2684@d sc==int {|scaled| data is equivalent to |integer|} 2685 2686@<Types...@>= 2687@!quarterword = min_quarterword..max_quarterword; {1/4 of a word} 2688@!halfword=min_halfword..max_halfword; {1/2 of a word} 2689@!two_choices = 1..2; {used when there are two variants in a record} 2690@!four_choices = 1..4; {used when there are four variants in a record} 2691@!two_halves = packed record@;@/ 2692 @!rh:halfword; 2693 case two_choices of 2694 1: (@!lh:halfword); 2695 2: (@!b0:quarterword; @!b1:quarterword); 2696 end; 2697@!four_quarters = packed record@;@/ 2698 @!b0:quarterword; 2699 @!b1:quarterword; 2700 @!b2:quarterword; 2701 @!b3:quarterword; 2702 end; 2703@!memory_word = record@;@/ 2704 case four_choices of 2705 1: (@!int:integer); 2706 2: (@!gr:glue_ratio); 2707 3: (@!hh:two_halves); 2708 4: (@!qqqq:four_quarters); 2709 end; 2710@!word_file = gzFile; 2711 2712@ When debugging, we may want to print a |memory_word| without knowing 2713what type it is; so we print it in all modes. 2714@^dirty \PASCAL@>@^debugging@> 2715 2716@p @!debug procedure print_word(@!w:memory_word); 2717 {prints |w| in all ways} 2718begin print_int(w.int); print_char(" ");@/ 2719print_scaled(w.sc); print_char(" ");@/ 2720print_scaled(round(unity*float(w.gr))); print_ln;@/ 2721@^real multiplication@> 2722print_int(w.hh.lh); print_char("="); print_int(w.hh.b0); print_char(":"); 2723print_int(w.hh.b1); print_char(";"); print_int(w.hh.rh); print_char(" ");@/ 2724print_int(w.qqqq.b0); print_char(":"); print_int(w.qqqq.b1); print_char(":"); 2725print_int(w.qqqq.b2); print_char(":"); print_int(w.qqqq.b3); 2726end; 2727gubed 2728 2729@* \[9] Dynamic memory allocation. 2730The \TeX\ system does nearly all of its own memory allocation, so that it 2731can readily be transported into environments that do not have automatic 2732facilities for strings, garbage collection, etc., and so that it can be in 2733control of what error messages the user receives. The dynamic storage 2734requirements of \TeX\ are handled by providing a large array |mem| in 2735which consecutive blocks of words are used as nodes by the \TeX\ routines. 2736 2737Pointer variables are indices into this array, or into another array 2738called |eqtb| that will be explained later. A pointer variable might 2739also be a special flag that lies outside the bounds of |mem|, so we 2740allow pointers to assume any |halfword| value. The minimum halfword 2741value represents a null pointer. \TeX\ does not assume that |mem[null]| exists. 2742 2743@d pointer==halfword {a flag or a location in |mem| or |eqtb|} 2744@d null==min_halfword {the null pointer} 2745 2746@<Glob...@>= 2747@!temp_ptr:pointer; {a pointer variable for occasional emergency use} 2748 2749@ The |mem| array is divided into two regions that are allocated separately, 2750but the dividing line between these two regions is not fixed; they grow 2751together until finding their ``natural'' size in a particular job. 2752Locations less than or equal to |lo_mem_max| are used for storing 2753variable-length records consisting of two or more words each. This region 2754is maintained using an algorithm similar to the one described in exercise 27552.5--19 of {\sl The Art of Computer Programming}. However, no size field 2756appears in the allocated nodes; the program is responsible for knowing the 2757relevant size when a node is freed. Locations greater than or equal to 2758|hi_mem_min| are used for storing one-word records; a conventional 2759\.{AVAIL} stack is used for allocation in this region. 2760 2761Locations of |mem| between |mem_bot| and |mem_top| may be dumped as part 2762of preloaded format files, by the \.{INITEX} preprocessor. 2763@.INITEX@> 2764Production versions of \TeX\ may extend the memory at both ends in order to 2765provide more space; locations between |mem_min| and |mem_bot| are always 2766used for variable-size nodes, and locations between |mem_top| and |mem_max| 2767are always used for single-word nodes. 2768 2769The key pointers that govern |mem| allocation have a prescribed order: 2770$$\advance\thickmuskip-2mu 2771\hbox{|null<=mem_min<=mem_bot<lo_mem_max< 2772 hi_mem_min<mem_top<=mem_end<=mem_max|.}$$ 2773 2774Empirical tests show that the present implementation of \TeX\ tends to 2775spend about 9\pct! of its running time allocating nodes, and about 6\pct! 2776deallocating them after their use. 2777 2778@<Glob...@>= 2779@!mem : array[mem_min..mem_max] of memory_word; {the big dynamic storage area} 2780@!lo_mem_max : pointer; {the largest location of variable-size memory in use} 2781@!hi_mem_min : pointer; {the smallest location of one-word memory in use} 2782 2783@ In order to study the memory requirements of particular applications, it 2784is possible to prepare a version of \TeX\ that keeps track of current and 2785maximum memory usage. When code between the delimiters |@!stat| $\ldots$ 2786|tats| is not ``commented out,'' \TeX\ will run a bit slower but it will 2787report these statistics when |tracing_stats| is sufficiently large. 2788 2789@<Glob...@>= 2790@!var_used, @!dyn_used : integer; {how much memory is in use} 2791 2792@ Let's consider the one-word memory region first, since it's the 2793simplest. The pointer variable |mem_end| holds the highest-numbered location 2794of |mem| that has ever been used. The free locations of |mem| that 2795occur between |hi_mem_min| and |mem_end|, inclusive, are of type 2796|two_halves|, and we write |info(p)| and |link(p)| for the |lh| 2797and |rh| fields of |mem[p]| when it is of this type. The single-word 2798free locations form a linked list 2799$$|avail|,\;\hbox{|link(avail)|},\;\hbox{|link(link(avail))|},\;\ldots$$ 2800terminated by |null|. 2801 2802@d link(#) == mem[#].hh.rh {the |link| field of a memory word} 2803@d info(#) == mem[#].hh.lh {the |info| field of a memory word} 2804 2805@<Glob...@>= 2806@!avail : pointer; {head of the list of available one-word nodes} 2807@!mem_end : pointer; {the last one-word node used in |mem|} 2808 2809@ If memory is exhausted, it might mean that the user has forgotten 2810a right brace. We will define some procedures later that try to help 2811pinpoint the trouble. 2812 2813@p @<Declare the procedure called |show_token_list|@>@/ 2814@<Declare the procedure called |runaway|@> 2815 2816@ The function |get_avail| returns a pointer to a new one-word node whose 2817|link| field is null. However, \TeX\ will halt if there is no more room left. 2818@^inner loop@> 2819 2820If the available-space list is empty, i.e., if |avail=null|, 2821we try first to increase |mem_end|. If that cannot be done, i.e., if 2822|mem_end=mem_max|, we try to decrease |hi_mem_min|. If that cannot be 2823done, i.e., if |hi_mem_min=lo_mem_max+1|, we have to quit. 2824 2825@p function get_avail : pointer; {single-word node allocation} 2826var p:pointer; {the new node being got} 2827begin p:=avail; {get top location in the |avail| stack} 2828if p<>null then avail:=link(avail) {and pop it off} 2829else if mem_end<mem_max then {or go into virgin territory} 2830 begin incr(mem_end); p:=mem_end; 2831 end 2832else begin decr(hi_mem_min); p:=hi_mem_min; 2833 if hi_mem_min<=lo_mem_max then 2834 begin runaway; {if memory is exhausted, display possible runaway text} 2835 overflow("main memory size",mem_max+1-mem_min); 2836 {quit; all one-word nodes are busy} 2837@:TeX capacity exceeded main memory size}{\quad main memory size@> 2838 end; 2839 end; 2840link(p):=null; {provide an oft-desired initialization of the new node} 2841@!stat incr(dyn_used);@+tats@;{maintain statistics} 2842get_avail:=p; 2843end; 2844 2845@ Conversely, a one-word node is recycled by calling |free_avail|. 2846This routine is part of \TeX's ``inner loop,'' so we want it to be fast. 2847@^inner loop@> 2848 2849@d free_avail(#)== {single-word node liberation} 2850 begin link(#):=avail; avail:=#; 2851 @!stat decr(dyn_used);@+tats@/ 2852 end 2853 2854@ There's also a |fast_get_avail| routine, which saves the procedure-call 2855overhead at the expense of extra programming. This routine is used in 2856the places that would otherwise account for the most calls of |get_avail|. 2857@^inner loop@> 2858 2859@d fast_get_avail(#)==@t@>@;@/ 2860 begin #:=avail; {avoid |get_avail| if possible, to save time} 2861 if #=null then #:=get_avail 2862 else begin avail:=link(#); link(#):=null; 2863 @!stat incr(dyn_used);@+tats@/ 2864 end; 2865 end 2866 2867@ The procedure |flush_list(p)| frees an entire linked list of 2868one-word nodes that starts at position |p|. 2869@^inner loop@> 2870 2871@p procedure flush_list(@!p:pointer); {makes list of single-word nodes 2872 available} 2873var @!q,@!r:pointer; {list traversers} 2874begin if p<>null then 2875 begin r:=p; 2876 repeat q:=r; r:=link(r); @!stat decr(dyn_used);@+tats@/ 2877 until r=null; {now |q| is the last node on the list} 2878 link(q):=avail; avail:=p; 2879 end; 2880end; 2881 2882@ The available-space list that keeps track of the variable-size portion 2883of |mem| is a nonempty, doubly-linked circular list of empty nodes, 2884pointed to by the roving pointer |rover|. 2885 2886Each empty node has size 2 or more; the first word contains the special 2887value |max_halfword| in its |link| field and the size in its |info| field; 2888the second word contains the two pointers for double linking. 2889 2890Each nonempty node also has size 2 or more. Its first word is of type 2891|two_halves|\kern-1pt, and its |link| field is never equal to |max_halfword|. 2892Otherwise there is complete flexibility with respect to the contents 2893of its other fields and its other words. 2894 2895(We require |mem_max<max_halfword| because terrible things can happen 2896when |max_halfword| appears in the |link| field of a nonempty node.) 2897 2898@d empty_flag == max_halfword {the |link| of an empty variable-size node} 2899@d is_empty(#) == (link(#)=empty_flag) {tests for empty node} 2900@d node_size == info {the size field in empty variable-size nodes} 2901@d llink(#) == info(#+1) {left link in doubly-linked list of empty nodes} 2902@d rlink(#) == link(#+1) {right link in doubly-linked list of empty nodes} 2903 2904@<Glob...@>= 2905@!rover : pointer; {points to some node in the list of empties} 2906 2907@ A call to |get_node| with argument |s| returns a pointer to a new node 2908of size~|s|, which must be 2~or more. The |link| field of the first word 2909of this new node is set to null. An overflow stop occurs if no suitable 2910space exists. 2911 2912If |get_node| is called with $s=2^{30}$, it simply merges adjacent free 2913areas and returns the value |max_halfword|. 2914 2915@p function get_node(@!s:integer):pointer; {variable-size node allocation} 2916label found,exit,restart; 2917var p:pointer; {the node currently under inspection} 2918@!q:pointer; {the node physically after node |p|} 2919@!r:integer; {the newly allocated node, or a candidate for this honor} 2920@!t:integer; {temporary register} 2921begin restart: p:=rover; {start at some free node in the ring} 2922repeat @<Try to allocate within node |p| and its physical successors, 2923 and |goto found| if allocation was possible@>; 2924@^inner loop@> 2925p:=rlink(p); {move to the next node in the ring} 2926until p=rover; {repeat until the whole list has been traversed} 2927if s=@'10000000000 then 2928 begin get_node:=max_halfword; return; 2929 end; 2930if lo_mem_max+2<hi_mem_min then if lo_mem_max+2<=mem_bot+max_halfword then 2931 @<Grow more variable-size memory and |goto restart|@>; 2932overflow("main memory size",mem_max+1-mem_min); 2933 {sorry, nothing satisfactory is left} 2934@:TeX capacity exceeded main memory size}{\quad main memory size@> 2935found: link(r):=null; {this node is now nonempty} 2936@!stat var_used:=var_used+s; {maintain usage statistics} 2937tats@;@/ 2938get_node:=r; 2939exit:end; 2940 2941@ The lower part of |mem| grows by 1000 words at a time, unless 2942we are very close to going under. When it grows, we simply link 2943a new node into the available-space list. This method of controlled 2944growth helps to keep the |mem| usage consecutive when \TeX\ is 2945implemented on ``virtual memory'' systems. 2946@^virtual memory@> 2947 2948@<Grow more variable-size memory and |goto restart|@>= 2949begin if hi_mem_min-lo_mem_max>=1998 then t:=lo_mem_max+1000 2950else t:=lo_mem_max+1+(hi_mem_min-lo_mem_max) div 2; 2951 {|lo_mem_max+2<=t<hi_mem_min|} 2952p:=llink(rover); q:=lo_mem_max; rlink(p):=q; llink(rover):=q;@/ 2953if t>mem_bot+max_halfword then t:=mem_bot+max_halfword; 2954rlink(q):=rover; llink(q):=p; link(q):=empty_flag; node_size(q):=t-lo_mem_max;@/ 2955lo_mem_max:=t; link(lo_mem_max):=null; info(lo_mem_max):=null; 2956rover:=q; goto restart; 2957end 2958 2959@ Empirical tests show that the routine in this section performs a 2960node-merging operation about 0.75 times per allocation, on the average, 2961after which it finds that |r>p+1| about 95\pct! of the time. 2962 2963@<Try to allocate...@>= 2964q:=p+node_size(p); {find the physical successor} 2965@^inner loop@> 2966while is_empty(q) do {merge node |p| with node |q|} 2967 begin t:=rlink(q); 2968 if q=rover then rover:=t; 2969 llink(t):=llink(q); rlink(llink(q)):=t;@/ 2970 q:=q+node_size(q); 2971 end; 2972r:=q-s; 2973if r>p+1 then @<Allocate from the top of node |p| and |goto found|@>; 2974if r=p then if rlink(p)<>p then 2975 @<Allocate entire node |p| and |goto found|@>; 2976node_size(p):=q-p {reset the size in case it grew} 2977 2978@ @<Allocate from the top...@>= 2979begin node_size(p):=r-p; {store the remaining size} 2980@^inner loop@> 2981rover:=p; {start searching here next time} 2982goto found; 2983end 2984 2985@ Here we delete node |p| from the ring, and let |rover| rove around. 2986 2987@<Allocate entire...@>= 2988begin rover:=rlink(p); t:=llink(p); 2989llink(rover):=t; rlink(t):=rover; 2990goto found; 2991end 2992 2993@ Conversely, when some variable-size node |p| of size |s| is no longer needed, 2994the operation |free_node(p,s)| will make its words available, by inserting 2995|p| as a new empty node just before where |rover| now points. 2996@^inner loop@> 2997 2998@p procedure free_node(@!p:pointer; @!s:halfword); {variable-size node 2999 liberation} 3000var q:pointer; {|llink(rover)|} 3001begin node_size(p):=s; link(p):=empty_flag; 3002q:=llink(rover); llink(p):=q; rlink(p):=rover; {set both links} 3003llink(rover):=p; rlink(q):=p; {insert |p| into the ring} 3004@!stat var_used:=var_used-s;@+tats@;{maintain statistics} 3005end; 3006 3007@ Just before \.{INITEX} writes out the memory, it sorts the doubly linked 3008available space list. The list is probably very short at such times, so a 3009simple insertion sort is used. The smallest available location will be 3010pointed to by |rover|, the next-smallest by |rlink(rover)|, etc. 3011 3012@p @!init procedure sort_avail; {sorts the available variable-size nodes 3013 by location} 3014var p,@!q,@!r: pointer; {indices into |mem|} 3015@!old_rover:pointer; {initial |rover| setting} 3016begin p:=get_node(@'10000000000); {merge adjacent free areas} 3017p:=rlink(rover); rlink(rover):=max_halfword; old_rover:=rover; 3018while p<>old_rover do @<Sort \(p)|p| into the list starting at |rover| 3019 and advance |p| to |rlink(p)|@>; 3020p:=rover; 3021while rlink(p)<>max_halfword do 3022 begin llink(rlink(p)):=p; p:=rlink(p); 3023 end; 3024rlink(p):=rover; llink(rover):=p; 3025end; 3026tini 3027 3028@ The following |while| loop is guaranteed to 3029terminate, since the list that starts at 3030|rover| ends with |max_halfword| during the sorting procedure. 3031 3032@<Sort \(p)|p|...@>= 3033if p<rover then 3034 begin q:=p; p:=rlink(q); rlink(q):=rover; rover:=q; 3035 end 3036else begin q:=rover; 3037 while rlink(q)<p do q:=rlink(q); 3038 r:=rlink(p); rlink(p):=rlink(q); rlink(q):=p; p:=r; 3039 end 3040 3041@* \[10] Data structures for boxes and their friends. 3042From the computer's standpoint, \TeX's chief mission is to create 3043horizontal and vertical lists. We shall now investigate how the elements 3044of these lists are represented internally as nodes in the dynamic memory. 3045 3046A horizontal or vertical list is linked together by |link| fields in 3047the first word of each node. Individual nodes represent boxes, glue, 3048penalties, or special things like discretionary hyphens; because of this 3049variety, some nodes are longer than others, and we must distinguish different 3050kinds of nodes. We do this by putting a `|type|' field in the first word, 3051together with the link and an optional `|subtype|'. 3052 3053@d type(#) == mem[#].hh.b0 {identifies what kind of node this is} 3054@d subtype(#) == mem[#].hh.b1 {secondary identification in some cases} 3055 3056@ A |@!char_node|, which represents a single character, is the most important 3057kind of node because it accounts for the vast majority of all boxes. 3058Special precautions are therefore taken to ensure that a |char_node| does 3059not take up much memory space. Every such node is one word long, and in fact 3060it is identifiable by this property, since other kinds of nodes have at least 3061two words, and they appear in |mem| locations less than |hi_mem_min|. 3062This makes it possible to omit the |type| field in a |char_node|, leaving 3063us room for two bytes that identify a |font| and a |character| within 3064that font. 3065 3066Note that the format of a |char_node| allows for up to 256 different 3067fonts and up to 256 characters per font; but most implementations will 3068probably limit the total number of fonts to fewer than 75 per job, 3069and most fonts will stick to characters whose codes are 3070less than 128 (since higher codes 3071are more difficult to access on most keyboards). 3072 3073Extensions of \TeX\ intended for oriental languages will need even more 3074than $256\times256$ possible characters, when we consider different sizes 3075@^oriental characters@>@^Chinese characters@>@^Japanese characters@> 3076and styles of type. It is suggested that Chinese and Japanese fonts be 3077handled by representing such characters in two consecutive |char_node| 3078entries: The first of these has |font=font_base|, and its |link| points 3079to the second; 3080the second identifies the font and the character dimensions. 3081The saving feature about oriental characters is that most of them have 3082the same box dimensions. The |character| field of the first |char_node| 3083is a ``\\{charext}'' that distinguishes between graphic symbols whose 3084dimensions are identical for typesetting purposes. (See the \MF\ manual.) 3085Such an extension of \TeX\ would not be difficult; further details are 3086left to the reader. 3087 3088In order to make sure that the |character| code fits in a quarterword, 3089\TeX\ adds the quantity |min_quarterword| to the actual code. 3090 3091Character nodes appear only in horizontal lists, never in vertical lists. 3092 3093@d is_char_node(#) == (#>=hi_mem_min) 3094 {does the argument point to a |char_node|?} 3095@d font == type {the font code in a |char_node|} 3096@d character == subtype {the character code in a |char_node|} 3097 3098@ An |hlist_node| stands for a box that was made from a horizontal list. 3099Each |hlist_node| is seven words long, and contains the following fields 3100(in addition to the mandatory |type| and |link|, which we shall not 3101mention explicitly when discussing the other node types): The |height| and 3102|width| and |depth| are scaled integers denoting the dimensions of the 3103box. There is also a |shift_amount| field, a scaled integer indicating 3104how much this box should be lowered (if it appears in a horizontal list), 3105or how much it should be moved to the right (if it appears in a vertical 3106list). There is a |list_ptr| field, which points to the beginning of the 3107list from which this box was fabricated; if |list_ptr| is |null|, the box 3108is empty. Finally, there are three fields that represent the setting of 3109the glue: |glue_set(p)| is a word of type |glue_ratio| that represents 3110the proportionality constant for glue setting; |glue_sign(p)| is 3111|stretching| or |shrinking| or |normal| depending on whether or not the 3112glue should stretch or shrink or remain rigid; and |glue_order(p)| 3113specifies the order of infinity to which glue setting applies (|normal|, 3114|fil|, |fill|, or |filll|). The |subtype| field is not used in \TeX. 3115In \eTeX\ the |subtype| field records the box direction mode |box_lr|. 3116 3117@d hlist_node=0 {|type| of hlist nodes} 3118@d box_node_size=7 {number of words to allocate for a box node} 3119@d width_offset=1 {position of |width| field in a box node} 3120@d depth_offset=2 {position of |depth| field in a box node} 3121@d height_offset=3 {position of |height| field in a box node} 3122@d width(#) == mem[#+width_offset].sc {width of the box, in sp} 3123@d depth(#) == mem[#+depth_offset].sc {depth of the box, in sp} 3124@d height(#) == mem[#+height_offset].sc {height of the box, in sp} 3125@d shift_amount(#) == mem[#+4].sc {repositioning distance, in sp} 3126@d list_offset=5 {position of |list_ptr| field in a box node} 3127@d list_ptr(#) == link(#+list_offset) {beginning of the list inside the box} 3128@d glue_order(#) == subtype(#+list_offset) {applicable order of infinity} 3129@d glue_sign(#) == type(#+list_offset) {stretching or shrinking} 3130@d normal=0 {the most common case when several cases are named} 3131@d stretching = 1 {glue setting applies to the stretch components} 3132@d shrinking = 2 {glue setting applies to the shrink components} 3133@d glue_offset = 6 {position of |glue_set| in a box node} 3134@d glue_set(#) == mem[#+glue_offset].gr 3135 {a word of type |glue_ratio| for glue setting} 3136 3137@ The |new_null_box| function returns a pointer to an |hlist_node| in 3138which all subfields have the values corresponding to `\.{\\hbox\{\}}'. 3139The |subtype| field is set to |min_quarterword|, since that's the desired 3140|span_count| value if this |hlist_node| is changed to an |unset_node|. 3141 3142@p function new_null_box:pointer; {creates a new box node} 3143var p:pointer; {the new node} 3144begin p:=get_node(box_node_size); type(p):=hlist_node; 3145subtype(p):=min_quarterword; 3146width(p):=0; depth(p):=0; height(p):=0; shift_amount(p):=0; list_ptr(p):=null; 3147glue_sign(p):=normal; glue_order(p):=normal; set_glue_ratio_zero(glue_set(p)); 3148new_null_box:=p; 3149end; 3150 3151@ A |vlist_node| is like an |hlist_node| in all respects except that it 3152contains a vertical list. 3153 3154@d vlist_node=1 {|type| of vlist nodes} 3155 3156@ A |rule_node| stands for a solid black rectangle; it has |width|, 3157|depth|, and |height| fields just as in an |hlist_node|. However, if 3158any of these dimensions is $-2^{30}$, the actual value will be determined 3159by running the rule up to the boundary of the innermost enclosing box. 3160This is called a ``running dimension.'' The |width| is never running in 3161an hlist; the |height| and |depth| are never running in a~vlist. 3162 3163@d rule_node=2 {|type| of rule nodes} 3164@d rule_node_size=4 {number of words to allocate for a rule node} 3165@d null_flag==-@'10000000000 {$-2^{30}$, signifies a missing item} 3166@d is_running(#) == (#=null_flag) {tests for a running dimension} 3167 3168@ A new rule node is delivered by the |new_rule| function. It 3169makes all the dimensions ``running,'' so you have to change the 3170ones that are not allowed to run. 3171 3172@p function new_rule:pointer; 3173var p:pointer; {the new node} 3174begin p:=get_node(rule_node_size); type(p):=rule_node; 3175subtype(p):=0; {the |subtype| is not used} 3176width(p):=null_flag; depth(p):=null_flag; height(p):=null_flag; 3177new_rule:=p; 3178end; 3179 3180@ Insertions are represented by |ins_node| records, where the |subtype| 3181indicates the corresponding box number. For example, `\.{\\insert 250}' 3182leads to an |ins_node| whose |subtype| is |250+min_quarterword|. 3183The |height| field of an |ins_node| is slightly misnamed; it actually holds 3184the natural height plus depth of the vertical list being inserted. 3185The |depth| field holds the |split_max_depth| to be used in case this 3186insertion is split, and the |split_top_ptr| points to the corresponding 3187|split_top_skip|. The |float_cost| field holds the |floating_penalty| that 3188will be used if this insertion floats to a subsequent page after a 3189split insertion of the same class. There is one more field, the 3190|ins_ptr|, which points to the beginning of the vlist for the insertion. 3191 3192@d ins_node=3 {|type| of insertion nodes} 3193@d ins_node_size=5 {number of words to allocate for an insertion} 3194@d float_cost(#)==mem[#+1].int {the |floating_penalty| to be used} 3195@d ins_ptr(#)==info(#+4) {the vertical list to be inserted} 3196@d split_top_ptr(#)==link(#+4) {the |split_top_skip| to be used} 3197 3198@ A |mark_node| has a |mark_ptr| field that points to the reference count 3199of a token list that contains the user's \.{\\mark} text. 3200In addition there is a |mark_class| field that contains the mark class. 3201 3202@d mark_node=4 {|type| of a mark node} 3203@d small_node_size=2 {number of words to allocate for most node types} 3204@d mark_ptr(#)==link(#+1) {head of the token list for a mark} 3205@d mark_class(#)==info(#+1) {the mark class} 3206 3207@ An |adjust_node|, which occurs only in horizontal lists, 3208specifies material that will be moved out into the surrounding 3209vertical list; i.e., it is used to implement \TeX's `\.{\\vadjust}' 3210operation. The |adjust_ptr| field points to the vlist containing this 3211material. 3212 3213@d adjust_node=5 {|type| of an adjust node} 3214@d adjust_pre == subtype {|if subtype <>0| it is pre-adjustment} 3215@#{|append_list| is used to append a list to |tail|} 3216@d append_list(#) == begin link(tail):=link(#); append_list_end 3217@d append_list_end(#) == tail:=#; end 3218@d adjust_ptr(#)==mem[#+1].int 3219 {vertical list to be moved out of horizontal list} 3220 3221@ A |ligature_node|, which occurs only in horizontal lists, specifies 3222a character that was fabricated from the interaction of two or more 3223actual characters. The second word of the node, which is called the 3224|lig_char| word, contains |font| and |character| fields just as in a 3225|char_node|. The characters that generated the ligature have not been 3226forgotten, since they are needed for diagnostic messages and for 3227hyphenation; the |lig_ptr| field points to a linked list of character 3228nodes for all original characters that have been deleted. (This list 3229might be empty if the characters that generated the ligature were 3230retained in other nodes.) 3231 3232The |subtype| field is 0, plus 2 and/or 1 if the original source of the 3233ligature included implicit left and/or right boundaries. 3234 3235@d ligature_node=6 {|type| of a ligature node} 3236@d lig_char(#)==#+1 {the word where the ligature is to be found} 3237@d lig_ptr(#)==link(lig_char(#)) {the list of characters} 3238 3239@ The |new_ligature| function creates a ligature node having given 3240contents of the |font|, |character|, and |lig_ptr| fields. We also have 3241a |new_lig_item| function, which returns a two-word node having a given 3242|character| field. Such nodes are used for temporary processing as ligatures 3243are being created. 3244 3245@p function new_ligature(@!f,@!c:quarterword; @!q:pointer):pointer; 3246var p:pointer; {the new node} 3247begin p:=get_node(small_node_size); type(p):=ligature_node; 3248font(lig_char(p)):=f; character(lig_char(p)):=c; lig_ptr(p):=q; 3249subtype(p):=0; new_ligature:=p; 3250end; 3251@# 3252function new_lig_item(@!c:quarterword):pointer; 3253var p:pointer; {the new node} 3254begin p:=get_node(small_node_size); character(p):=c; lig_ptr(p):=null; 3255new_lig_item:=p; 3256end; 3257 3258@ A |disc_node|, which occurs only in horizontal lists, specifies a 3259``dis\-cretion\-ary'' line break. If such a break occurs at node |p|, the text 3260that starts at |pre_break(p)| will precede the break, the text that starts at 3261|post_break(p)| will follow the break, and text that appears in the next 3262|replace_count(p)| nodes will be ignored. For example, an ordinary 3263discretionary hyphen, indicated by `\.{\\-}', yields a |disc_node| with 3264|pre_break| pointing to a |char_node| containing a hyphen, |post_break=null|, 3265and |replace_count=0|. All three of the discretionary texts must be 3266lists that consist entirely of character, kern, box, rule, and ligature nodes. 3267 3268If |pre_break(p)=null|, the |ex_hyphen_penalty| will be charged for this 3269break. Otherwise the |hyphen_penalty| will be charged. The texts will 3270actually be substituted into the list by the line-breaking algorithm if it 3271decides to make the break, and the discretionary node will disappear at 3272that time; thus, the output routine sees only discretionaries that were 3273not chosen. 3274 3275@d disc_node=7 {|type| of a discretionary node} 3276@d replace_count==subtype {how many subsequent nodes to replace} 3277@d pre_break==llink {text that precedes a discretionary break} 3278@d post_break==rlink {text that follows a discretionary break} 3279 3280@p function new_disc:pointer; {creates an empty |disc_node|} 3281var p:pointer; {the new node} 3282begin p:=get_node(small_node_size); type(p):=disc_node; 3283replace_count(p):=0; pre_break(p):=null; post_break(p):=null; 3284new_disc:=p; 3285end; 3286 3287@ A |whatsit_node| is a wild card reserved for extensions to \TeX. The 3288|subtype| field in its first word says what `\\{whatsit}' it is, and 3289implicitly determines the node size (which must be 2 or more) and the 3290format of the remaining words. When a |whatsit_node| is encountered 3291in a list, special actions are invoked; knowledgeable people who are 3292careful not to mess up the rest of \TeX\ are able to make \TeX\ do new 3293things by adding code at the end of the program. For example, there 3294might be a `\TeX nicolor' extension to specify different colors of ink, 3295@^extensions to \TeX@> 3296and the whatsit node might contain the desired parameters. 3297 3298The present implementation of \TeX\ treats the features associated with 3299`\.{\\write}' and `\.{\\special}' as if they were extensions, in order to 3300illustrate how such routines might be coded. We shall defer further 3301discussion of extensions until the end of this program. 3302 3303@d whatsit_node=8 {|type| of special extension nodes} 3304 3305@ To support ``native'' fonts, we build |native_word_node|s, which are variable 3306size whatsits. These have the same |width|, |depth|, and |height| fields as a 3307|box_node|, at offsets 1-3, and then a word containing a size field for the 3308node, a font number, a length, and a glyph count. Then there is a field 3309containing a C pointer to a glyph info array; this and the glyph count are set 3310by |set_native_metrics|. Copying and freeing of these nodes needs to take 3311account of this! This is followed by |2*length| bytes, for the actual 3312characters of the string (in UTF-16). 3313 3314So |native_node_size|, which does not include any space for the actual text, is 33156. 3316 33170-3 whatsits subtypes are used for open, write, close, special; 4 is language; 3318pdf\TeX\ uses up through 30-something, so we use subtypes starting from 40. 3319 3320There are also |glyph_node|s; these are like |native_word_node|s in having 3321|width|, |depth|, and |height| fields, but then they contain a glyph ID rather 3322than size and length fields, and there's no subsidiary C pointer. 3323 3324@d native_word_node=40 {|subtype| of whatsits that hold |native_font| words} 3325@d glyph_node=41 {|subtype| in whatsits that hold glyph numbers} 3326 3327@d native_node_size=6 {size of a |native_word| node (plus the actual chars) -- see also \.{xetex.h}} 3328@d glyph_node_size=5 3329@d native_size(#)==mem[#+4].qqqq.b0 3330@d native_font(#)==mem[#+4].qqqq.b1 3331@d native_length(#)==mem[#+4].qqqq.b2 3332@d native_glyph_count(#)==mem[#+4].qqqq.b3 3333@d native_glyph_info_ptr(#)==mem[#+5].ptr 3334@d native_glyph_info_size=10 {number of bytes of info per glyph: 16-bit glyph ID, 32-bit x and y coords} 3335@d native_glyph==native_length {in |glyph_node|s, we store the glyph number here} 3336 3337@d free_native_glyph_info(#) == 3338 begin 3339 if native_glyph_info_ptr(#) <> null_ptr then begin 3340 libc_free(native_glyph_info_ptr(#)); 3341 native_glyph_info_ptr(#):=null_ptr; 3342 native_glyph_count(#):=0; 3343 end 3344 end 3345 3346@p procedure copy_native_glyph_info(src:pointer; dest:pointer); 3347var glyph_count:integer; 3348begin 3349 if native_glyph_info_ptr(src) <> null_ptr then begin 3350 glyph_count:=native_glyph_count(src); 3351 native_glyph_info_ptr(dest):=xmalloc_array(char, glyph_count * native_glyph_info_size); 3352 memcpy(native_glyph_info_ptr(dest), native_glyph_info_ptr(src), glyph_count * native_glyph_info_size); 3353 native_glyph_count(dest):=glyph_count; 3354 end 3355end; 3356 3357@ Picture files are handled with nodes that include fields for the transform associated 3358with the picture, and a pathname for the picture file itself. 3359They also have 3360the |width|, |depth|, and |height| fields of a |box_node| at offsets 1-3. (|depth| will 3361always be zero, as it happens.) 3362 3363So |pic_node_size|, which does not include any space for the picture file pathname, is 7. 3364 3365A |pdf_node| is just like |pic_node|, but generate a different \.{XDV} file code. 3366 3367@d pic_node=42 {|subtype| in whatsits that hold picture file references} 3368@d pdf_node=43 {|subtype| in whatsits that hold PDF page references} 3369@# 3370@d pic_node_size=8 {must sync with \.{xetex.h}} 3371@d pic_path_length(#)==mem[#+4].hh.b0 3372@d pic_page(#)==mem[#+4].hh.b1 3373@d pic_transform1(#)==mem[#+5].hh.lh 3374@d pic_transform2(#)==mem[#+5].hh.rh 3375@d pic_transform3(#)==mem[#+6].hh.lh 3376@d pic_transform4(#)==mem[#+6].hh.rh 3377@d pic_transform5(#)==mem[#+7].hh.lh 3378@d pic_transform6(#)==mem[#+7].hh.rh 3379 3380@ A |math_node|, which occurs only in horizontal lists, appears before and 3381after mathematical formulas. The |subtype| field is |before| before the 3382formula and |after| after it. There is a |width| field, which represents 3383the amount of surrounding space inserted by \.{\\mathsurround}. 3384 3385In addition a |math_node| with |subtype>after| and |width=0| will be 3386(ab)used to record a regular |math_node| reinserted after being 3387discarded at a line break or one of the text direction primitives ( 3388\.{\\beginL}, \.{\\endL}, \.{\\beginR}, and \.{\\endR} ). 3389 3390@d math_node=9 {|type| of a math node} 3391@d before=0 {|subtype| for math node that introduces a formula} 3392@d after=1 {|subtype| for math node that winds up a formula} 3393@# 3394@d M_code=2 3395@d begin_M_code=M_code+before {|subtype| for \.{\\beginM} node} 3396@d end_M_code=M_code+after {|subtype| for \.{\\endM} node} 3397@d L_code=4 3398@d begin_L_code=L_code+begin_M_code {|subtype| for \.{\\beginL} node} 3399@d end_L_code=L_code+end_M_code {|subtype| for \.{\\endL} node} 3400@d R_code=L_code+L_code 3401@d begin_R_code=R_code+begin_M_code {|subtype| for \.{\\beginR} node} 3402@d end_R_code=R_code+end_M_code {|subtype| for \.{\\endR} node} 3403@# 3404@d end_LR(#)==odd(subtype(#)) 3405@d end_LR_type(#)==(L_code*(subtype(#) div L_code)+end_M_code) 3406@d begin_LR_type(#)==(#-after+before) 3407 3408@p function new_math(@!w:scaled;@!s:small_number):pointer; 3409var p:pointer; {the new node} 3410begin p:=get_node(small_node_size); type(p):=math_node; 3411subtype(p):=s; width(p):=w; new_math:=p; 3412end; 3413 3414@ \TeX\ makes use of the fact that |hlist_node|, |vlist_node|, 3415|rule_node|, |ins_node|, |mark_node|, |adjust_node|, |ligature_node|, 3416|disc_node|, |whatsit_node|, and |math_node| are at the low end of the 3417type codes, by permitting a break at glue in a list if and only if the 3418|type| of the previous node is less than |math_node|. Furthermore, a 3419node is discarded after a break if its type is |math_node| or~more. 3420 3421@d precedes_break(#)==(type(#)<math_node) 3422@d non_discardable(#)==(type(#)<math_node) 3423 3424@ A |glue_node| represents glue in a list. However, it is really only 3425a pointer to a separate glue specification, since \TeX\ makes use of the 3426fact that many essentially identical nodes of glue are usually present. 3427If |p| points to a |glue_node|, |glue_ptr(p)| points to 3428another packet of words that specify the stretch and shrink components, etc. 3429 3430Glue nodes also serve to represent leaders; the |subtype| is used to 3431distinguish between ordinary glue (which is called |normal|) and the three 3432kinds of leaders (which are called |a_leaders|, |c_leaders|, and |x_leaders|). 3433The |leader_ptr| field points to a rule node or to a box node containing the 3434leaders; it is set to |null| in ordinary glue nodes. 3435 3436Many kinds of glue are computed from \TeX's ``skip'' parameters, and 3437it is helpful to know which parameter has led to a particular glue node. 3438Therefore the |subtype| is set to indicate the source of glue, whenever 3439it originated as a parameter. We will be defining symbolic names for the 3440parameter numbers later (e.g., |line_skip_code=0|, |baseline_skip_code=1|, 3441etc.); it suffices for now to say that the |subtype| of parametric glue 3442will be the same as the parameter number, plus~one. 3443 3444In math formulas there are two more possibilities for the |subtype| in a 3445glue node: |mu_glue| denotes an \.{\\mskip} (where the units are scaled \.{mu} 3446instead of scaled \.{pt}); and |cond_math_glue| denotes the `\.{\\nonscript}' 3447feature that cancels the glue node immediately following if it appears 3448in a subscript. 3449 3450@d glue_node=10 {|type| of node that points to a glue specification} 3451@d cond_math_glue=98 {special |subtype| to suppress glue in the next node} 3452@d mu_glue=99 {|subtype| for math glue} 3453@d a_leaders=100 {|subtype| for aligned leaders} 3454@d c_leaders=101 {|subtype| for centered leaders} 3455@d x_leaders=102 {|subtype| for expanded leaders} 3456@d glue_ptr==llink {pointer to a glue specification} 3457@d leader_ptr==rlink {pointer to box or rule node for leaders} 3458 3459@ A glue specification has a halfword reference count in its first word, 3460@^reference counts@> 3461representing |null| plus the number of glue nodes that point to it (less one). 3462Note that the reference count appears in the same position as 3463the |link| field in list nodes; this is the field that is initialized 3464to |null| when a node is allocated, and it is also the field that is flagged 3465by |empty_flag| in empty nodes. 3466 3467Glue specifications also contain three |scaled| fields, for the |width|, 3468|stretch|, and |shrink| dimensions. Finally, there are two one-byte 3469fields called |stretch_order| and |shrink_order|; these contain the 3470orders of infinity (|normal|, |fil|, |fill|, or |filll|) 3471corresponding to the stretch and shrink values. 3472 3473@d glue_spec_size=4 {number of words to allocate for a glue specification} 3474@d glue_ref_count(#) == link(#) {reference count of a glue specification} 3475@d stretch(#) == mem[#+2].sc {the stretchability of this glob of glue} 3476@d shrink(#) == mem[#+3].sc {the shrinkability of this glob of glue} 3477@d stretch_order == type {order of infinity for stretching} 3478@d shrink_order == subtype {order of infinity for shrinking} 3479@d fil=1 {first-order infinity} 3480@d fill=2 {second-order infinity} 3481@d filll=3 {third-order infinity} 3482 3483@<Types...@>= 3484@!glue_ord=normal..filll; {infinity to the 0, 1, 2, or 3 power} 3485 3486@ Here is a function that returns a pointer to a copy of a glue spec. 3487The reference count in the copy is |null|, because there is assumed 3488to be exactly one reference to the new specification. 3489 3490@p function new_spec(@!p:pointer):pointer; {duplicates a glue specification} 3491var q:pointer; {the new spec} 3492begin q:=get_node(glue_spec_size);@/ 3493mem[q]:=mem[p]; glue_ref_count(q):=null;@/ 3494width(q):=width(p); stretch(q):=stretch(p); shrink(q):=shrink(p); 3495new_spec:=q; 3496end; 3497 3498@ And here's a function that creates a glue node for a given parameter 3499identified by its code number; for example, 3500|new_param_glue(line_skip_code)| returns a pointer to a glue node for the 3501current \.{\\lineskip}. 3502 3503@p function new_param_glue(@!n:small_number):pointer; 3504var p:pointer; {the new node} 3505@!q:pointer; {the glue specification} 3506begin p:=get_node(small_node_size); type(p):=glue_node; subtype(p):=n+1; 3507leader_ptr(p):=null;@/ 3508q:=@<Current |mem| equivalent of glue parameter number |n|@>@t@>; 3509glue_ptr(p):=q; incr(glue_ref_count(q)); 3510new_param_glue:=p; 3511end; 3512 3513@ Glue nodes that are more or less anonymous are created by |new_glue|, 3514whose argument points to a glue specification. 3515 3516@p function new_glue(@!q:pointer):pointer; 3517var p:pointer; {the new node} 3518begin p:=get_node(small_node_size); type(p):=glue_node; subtype(p):=normal; 3519leader_ptr(p):=null; glue_ptr(p):=q; incr(glue_ref_count(q)); 3520new_glue:=p; 3521end; 3522 3523@ Still another subroutine is needed: This one is sort of a combination 3524of |new_param_glue| and |new_glue|. It creates a glue node for one of 3525the current glue parameters, but it makes a fresh copy of the glue 3526specification, since that specification will probably be subject to change, 3527while the parameter will stay put. The global variable |temp_ptr| is 3528set to the address of the new spec. 3529 3530@p function new_skip_param(@!n:small_number):pointer; 3531var p:pointer; {the new node} 3532begin temp_ptr:=new_spec(@<Current |mem| equivalent of glue parameter...@>); 3533p:=new_glue(temp_ptr); glue_ref_count(temp_ptr):=null; subtype(p):=n+1; 3534new_skip_param:=p; 3535end; 3536 3537@ A |kern_node| has a |width| field to specify a (normally negative) 3538amount of spacing. This spacing correction appears in horizontal lists 3539between letters like A and V when the font designer said that it looks 3540better to move them closer together or further apart. A kern node can 3541also appear in a vertical list, when its `|width|' denotes additional 3542spacing in the vertical direction. The |subtype| is either |normal| (for 3543kerns inserted from font information or math mode calculations) or |explicit| 3544(for kerns inserted from \.{\\kern} and \.{\\/} commands) or |acc_kern| 3545(for kerns inserted from non-math accents) or |mu_glue| (for kerns 3546inserted from \.{\\mkern} specifications in math formulas). 3547 3548@d kern_node=11 {|type| of a kern node} 3549@d explicit=1 {|subtype| of kern nodes from \.{\\kern} and \.{\\/}} 3550@d acc_kern=2 {|subtype| of kern nodes from accents} 3551 3552@# {memory structure for marginal kerns} 3553@d margin_kern_node = 40 3554@d margin_kern_node_size = 3 3555@d margin_char(#) == info(# + 2) {unused for now; relevant for font expansion} 3556 3557@# {|subtype| of marginal kerns} 3558@d left_side == 0 3559@d right_side == 1 3560 3561@# {base for lp/rp codes starts from 2: 3562 0 for |hyphen_char|, 3563 1 for |skew_char|} 3564@d lp_code_base == 2 3565@d rp_code_base == 3 3566 3567@d max_hlist_stack = 512 {maximum fill level for |hlist_stack|} 3568{maybe good if larger than |2 * max_quarterword|, so that box nesting level would overflow first} 3569 3570@ The |new_kern| function creates a kern node having a given width. 3571 3572@p function new_kern(@!w:scaled):pointer; 3573var p:pointer; {the new node} 3574begin p:=get_node(small_node_size); type(p):=kern_node; 3575subtype(p):=normal; 3576width(p):=w; 3577new_kern:=p; 3578end; 3579 3580@ @<Glob...@>= 3581@!last_leftmost_char: pointer; 3582@!last_rightmost_char: pointer; 3583@!hlist_stack:array[0..max_hlist_stack] of pointer; {stack for |find_protchar_left()| and |find_protchar_right()|} 3584@!hlist_stack_level:0..max_hlist_stack; {fill level for |hlist_stack|} 3585@!first_p: pointer; {to access the first node of the paragraph} 3586@!global_prev_p: pointer; {to access |prev_p| in |line_break|; should be kept in sync with |prev_p| by |update_prev_p|} 3587 3588@ A |penalty_node| specifies the penalty associated with line or page 3589breaking, in its |penalty| field. This field is a fullword integer, but 3590the full range of integer values is not used: Any penalty |>=10000| is 3591treated as infinity, and no break will be allowed for such high values. 3592Similarly, any penalty |<=-10000| is treated as negative infinity, and a 3593break will be forced. 3594 3595@d penalty_node=12 {|type| of a penalty node} 3596@d inf_penalty=inf_bad {``infinite'' penalty value} 3597@d eject_penalty=-inf_penalty {``negatively infinite'' penalty value} 3598@d penalty(#) == mem[#+1].int {the added cost of breaking a list here} 3599 3600@ Anyone who has been reading the last few sections of the program will 3601be able to guess what comes next. 3602 3603@p function new_penalty(@!m:integer):pointer; 3604var p:pointer; {the new node} 3605begin p:=get_node(small_node_size); type(p):=penalty_node; 3606subtype(p):=0; {the |subtype| is not used} 3607penalty(p):=m; new_penalty:=p; 3608end; 3609 3610@ You might think that we have introduced enough node types by now. Well, 3611almost, but there is one more: An |unset_node| has nearly the same format 3612as an |hlist_node| or |vlist_node|; it is used for entries in \.{\\halign} 3613or \.{\\valign} that are not yet in their final form, since the box 3614dimensions are their ``natural'' sizes before any glue adjustment has been 3615made. The |glue_set| word is not present; instead, we have a |glue_stretch| 3616field, which contains the total stretch of order |glue_order| that is 3617present in the hlist or vlist being boxed. 3618Similarly, the |shift_amount| field is replaced by a |glue_shrink| field, 3619containing the total shrink of order |glue_sign| that is present. 3620The |subtype| field is called |span_count|; an unset box typically 3621contains the data for |qo(span_count)+1| columns. 3622Unset nodes will be changed to box nodes when alignment is completed. 3623 3624@d unset_node=13 {|type| for an unset node} 3625@d glue_stretch(#)==mem[#+glue_offset].sc {total stretch in an unset node} 3626@d glue_shrink==shift_amount {total shrink in an unset node} 3627@d span_count==subtype {indicates the number of spanned columns} 3628 3629@ In fact, there are still more types coming. When we get to math formula 3630processing we will see that a |style_node| has |type=14|; and a number 3631of larger type codes will also be defined, for use in math mode only. 3632 3633@ Warning: If any changes are made to these data structure layouts, such as 3634changing any of the node sizes or even reordering the words of nodes, 3635the |copy_node_list| procedure and the memory initialization code 3636below may have to be changed. Such potentially dangerous parts of the 3637program are listed in the index under `data structure assumptions'. 3638@!@^data structure assumptions@> 3639However, other references to the nodes are made symbolically in terms of 3640the \.{WEB} macro definitions above, so that format changes will leave 3641\TeX's other algorithms intact. 3642@^system dependencies@> 3643 3644@* \[11] Memory layout. 3645Some areas of |mem| are dedicated to fixed usage, since static allocation is 3646more efficient than dynamic allocation when we can get away with it. For 3647example, locations |mem_bot| to |mem_bot+3| are always used to store the 3648specification for glue that is `\.{0pt plus 0pt minus 0pt}'. The 3649following macro definitions accomplish the static allocation by giving 3650symbolic names to the fixed positions. Static variable-size nodes appear 3651in locations |mem_bot| through |lo_mem_stat_max|, and static single-word nodes 3652appear in locations |hi_mem_stat_min| through |mem_top|, inclusive. It is 3653harmless to let |lig_trick| and |garbage| share the same location of |mem|. 3654 3655@d zero_glue==mem_bot {specification for \.{0pt plus 0pt minus 0pt}} 3656@d fil_glue==zero_glue+glue_spec_size {\.{0pt plus 1fil minus 0pt}} 3657@d fill_glue==fil_glue+glue_spec_size {\.{0pt plus 1fill minus 0pt}} 3658@d ss_glue==fill_glue+glue_spec_size {\.{0pt plus 1fil minus 1fil}} 3659@d fil_neg_glue==ss_glue+glue_spec_size {\.{0pt plus -1fil minus 0pt}} 3660@d lo_mem_stat_max==fil_neg_glue+glue_spec_size-1 {largest statically 3661 allocated word in the variable-size |mem|} 3662@# 3663@d page_ins_head==mem_top {list of insertion data for current page} 3664@d contrib_head==mem_top-1 {vlist of items not yet on current page} 3665@d page_head==mem_top-2 {vlist for current page} 3666@d temp_head==mem_top-3 {head of a temporary list of some kind} 3667@d hold_head==mem_top-4 {head of a temporary list of another kind} 3668@d adjust_head==mem_top-5 {head of adjustment list returned by |hpack|} 3669@d active==mem_top-7 {head of active list in |line_break|, needs two words} 3670@d align_head==mem_top-8 {head of preamble list for alignments} 3671@d end_span==mem_top-9 {tail of spanned-width lists} 3672@d omit_template==mem_top-10 {a constant token list} 3673@d null_list==mem_top-11 {permanently empty list} 3674@d lig_trick==mem_top-12 {a ligature masquerading as a |char_node|} 3675@d garbage==mem_top-12 {used for scrap information} 3676@d backup_head==mem_top-13 {head of token list built by |scan_keyword|} 3677@d pre_adjust_head==mem_top-14 {head of pre-adjustment list returned by |hpack|} 3678@d hi_mem_stat_min==mem_top-14 {smallest statically allocated word in 3679 the one-word |mem|} 3680@d hi_mem_stat_usage=15 {the number of one-word nodes always present} 3681 3682@ The following code gets |mem| off to a good start, when \TeX\ is 3683initializing itself the slow~way. 3684 3685@<Local variables for init...@>= 3686@!k:integer; {index into |mem|, |eqtb|, etc.} 3687 3688@ @<Initialize table entries...@>= 3689for k:=mem_bot+1 to lo_mem_stat_max do mem[k].sc:=0; 3690 {all glue dimensions are zeroed} 3691@^data structure assumptions@> 3692k:=mem_bot;@+while k<=lo_mem_stat_max do 3693 {set first words of glue specifications} 3694 begin glue_ref_count(k):=null+1; 3695 stretch_order(k):=normal; shrink_order(k):=normal; 3696 k:=k+glue_spec_size; 3697 end; 3698stretch(fil_glue):=unity; stretch_order(fil_glue):=fil;@/ 3699stretch(fill_glue):=unity; stretch_order(fill_glue):=fill;@/ 3700stretch(ss_glue):=unity; stretch_order(ss_glue):=fil;@/ 3701shrink(ss_glue):=unity; shrink_order(ss_glue):=fil;@/ 3702stretch(fil_neg_glue):=-unity; stretch_order(fil_neg_glue):=fil;@/ 3703rover:=lo_mem_stat_max+1; 3704link(rover):=empty_flag; {now initialize the dynamic memory} 3705node_size(rover):=1000; {which is a 1000-word available node} 3706llink(rover):=rover; rlink(rover):=rover;@/ 3707lo_mem_max:=rover+1000; link(lo_mem_max):=null; info(lo_mem_max):=null;@/ 3708for k:=hi_mem_stat_min to mem_top do 3709 mem[k]:=mem[lo_mem_max]; {clear list heads} 3710@<Initialize the special list heads and constant nodes@>; 3711avail:=null; mem_end:=mem_top; 3712hi_mem_min:=hi_mem_stat_min; {initialize the one-word memory} 3713var_used:=lo_mem_stat_max+1-mem_bot; dyn_used:=hi_mem_stat_usage; 3714 {initialize statistics} 3715 3716@ If \TeX\ is extended improperly, the |mem| array might get screwed up. 3717For example, some pointers might be wrong, or some ``dead'' nodes might not 3718have been freed when the last reference to them disappeared. Procedures 3719|check_mem| and |search_mem| are available to help diagnose such 3720problems. These procedures make use of two arrays called |free| and 3721|was_free| that are present only if \TeX's debugging routines have 3722been included. (You may want to decrease the size of |mem| while you 3723@^debugging@> 3724are debugging.) 3725 3726@<Glob...@>= 3727@!debug @!free: packed array [mem_min..mem_max] of boolean; {free cells} 3728@t\hskip10pt@>@!was_free: packed array [mem_min..mem_max] of boolean; 3729 {previously free cells} 3730@t\hskip10pt@>@!was_mem_end,@!was_lo_max,@!was_hi_min: pointer; 3731 {previous |mem_end|, |lo_mem_max|, and |hi_mem_min|} 3732@t\hskip10pt@>@!panicking:boolean; {do we want to check memory constantly?} 3733gubed 3734 3735@ @<Set initial...@>= 3736@!debug was_mem_end:=mem_min; {indicate that everything was previously free} 3737was_lo_max:=mem_min; was_hi_min:=mem_max; 3738panicking:=false; 3739gubed 3740 3741@ Procedure |check_mem| makes sure that the available space lists of 3742|mem| are well formed, and it optionally prints out all locations 3743that are reserved now but were free the last time this procedure was called. 3744 3745@p @!debug procedure check_mem(@!print_locs : boolean); 3746label done1,done2; {loop exits} 3747var p,@!q:pointer; {current locations of interest in |mem|} 3748@!clobbered:boolean; {is something amiss?} 3749begin for p:=mem_min to lo_mem_max do free[p]:=false; {you can probably 3750 do this faster} 3751for p:=hi_mem_min to mem_end do free[p]:=false; {ditto} 3752@<Check single-word |avail| list@>; 3753@<Check variable-size |avail| list@>; 3754@<Check flags of unavailable nodes@>; 3755if print_locs then @<Print newly busy locations@>; 3756for p:=mem_min to lo_mem_max do was_free[p]:=free[p]; 3757for p:=hi_mem_min to mem_end do was_free[p]:=free[p]; 3758 {|was_free:=free| might be faster} 3759was_mem_end:=mem_end; was_lo_max:=lo_mem_max; was_hi_min:=hi_mem_min; 3760end; 3761gubed 3762 3763@ @<Check single-word...@>= 3764p:=avail; q:=null; clobbered:=false; 3765while p<>null do 3766 begin if (p>mem_end)or(p<hi_mem_min) then clobbered:=true 3767 else if free[p] then clobbered:=true; 3768 if clobbered then 3769 begin print_nl("AVAIL list clobbered at "); 3770@.AVAIL list clobbered...@> 3771 print_int(q); goto done1; 3772 end; 3773 free[p]:=true; q:=p; p:=link(q); 3774 end; 3775done1: 3776 3777@ @<Check variable-size...@>= 3778p:=rover; q:=null; clobbered:=false; 3779repeat if (p>=lo_mem_max)or(p<mem_min) then clobbered:=true 3780 else if (rlink(p)>=lo_mem_max)or(rlink(p)<mem_min) then clobbered:=true 3781 else if not(is_empty(p))or(node_size(p)<2)or@| 3782 (p+node_size(p)>lo_mem_max)or@| (llink(rlink(p))<>p) then clobbered:=true; 3783 if clobbered then 3784 begin print_nl("Double-AVAIL list clobbered at "); 3785 print_int(q); goto done2; 3786 end; 3787for q:=p to p+node_size(p)-1 do {mark all locations free} 3788 begin if free[q] then 3789 begin print_nl("Doubly free location at "); 3790@.Doubly free location...@> 3791 print_int(q); goto done2; 3792 end; 3793 free[q]:=true; 3794 end; 3795q:=p; p:=rlink(p); 3796until p=rover; 3797done2: 3798 3799@ @<Check flags...@>= 3800p:=mem_min; 3801while p<=lo_mem_max do {node |p| should not be empty} 3802 begin if is_empty(p) then 3803 begin print_nl("Bad flag at "); print_int(p); 3804@.Bad flag...@> 3805 end; 3806 while (p<=lo_mem_max) and not free[p] do incr(p); 3807 while (p<=lo_mem_max) and free[p] do incr(p); 3808 end 3809 3810@ @<Print newly busy...@>= 3811begin print_nl("New busy locs:"); 3812for p:=mem_min to lo_mem_max do 3813 if not free[p] and ((p>was_lo_max) or was_free[p]) then 3814 begin print_char(" "); print_int(p); 3815 end; 3816for p:=hi_mem_min to mem_end do 3817 if not free[p] and 3818 ((p<was_hi_min) or (p>was_mem_end) or was_free[p]) then 3819 begin print_char(" "); print_int(p); 3820 end; 3821end 3822 3823@ The |search_mem| procedure attempts to answer the question ``Who points 3824to node~|p|?'' In doing so, it fetches |link| and |info| fields of |mem| 3825that might not be of type |two_halves|. Strictly speaking, this is 3826@^dirty \PASCAL@> 3827undefined in \PASCAL, and it can lead to ``false drops'' (words that seem to 3828point to |p| purely by coincidence). But for debugging purposes, we want 3829to rule out the places that do {\sl not\/} point to |p|, so a few false 3830drops are tolerable. 3831 3832@p @!debug procedure search_mem(@!p:pointer); {look for pointers to |p|} 3833var q:integer; {current position being searched} 3834begin for q:=mem_min to lo_mem_max do 3835 begin if link(q)=p then 3836 begin print_nl("LINK("); print_int(q); print_char(")"); 3837 end; 3838 if info(q)=p then 3839 begin print_nl("INFO("); print_int(q); print_char(")"); 3840 end; 3841 end; 3842for q:=hi_mem_min to mem_end do 3843 begin if link(q)=p then 3844 begin print_nl("LINK("); print_int(q); print_char(")"); 3845 end; 3846 if info(q)=p then 3847 begin print_nl("INFO("); print_int(q); print_char(")"); 3848 end; 3849 end; 3850@<Search |eqtb| for equivalents equal to |p|@>; 3851@<Search |save_stack| for equivalents that point to |p|@>; 3852@<Search |hyph_list| for pointers to |p|@>; 3853end; 3854gubed 3855 3856@ Some stuff for character protrusion. 3857 3858@p 3859procedure pdf_error(t, p: str_number); 3860begin 3861 normalize_selector; 3862 print_err("Error"); 3863 if t <> 0 then begin 3864 print(" ("); 3865 print(t); 3866 print(")"); 3867 end; 3868 print(": "); print(p); 3869 succumb; 3870end; 3871 3872function prev_rightmost(s, e: pointer): pointer; 3873{finds the node preceding the rightmost node |e|; |s| is some node 3874before |e|} 3875var p: pointer; 3876begin 3877 prev_rightmost:=null; 3878 p:=s; 3879 if p = null then 3880 return; 3881 while link(p) <> e do begin 3882 p:=link(p); 3883 if p = null then 3884 return; 3885 end; 3886 prev_rightmost:=p; 3887end; 3888 3889function round_xn_over_d(@!x:scaled; @!n,@!d:integer):scaled; 3890var positive:boolean; {was |x>=0|?} 3891@!t,@!u,@!v:nonnegative_integer; {intermediate quantities} 3892begin if x>=0 then positive:=true 3893else begin negate(x); positive:=false; 3894 end; 3895t:=(x mod @'100000)*n; 3896u:=(x div @'100000)*n+(t div @'100000); 3897v:=(u mod d)*@'100000 + (t mod @'100000); 3898if u div d>=@'100000 then arith_error:=true 3899else u:=@'100000*(u div d) + (v div d); 3900v:=v mod d; 3901if 2*v >= d then 3902 incr(u); 3903if positive then 3904 round_xn_over_d:=u 3905else 3906 round_xn_over_d:=-u; 3907end; 3908 3909@* \[12] Displaying boxes. 3910We can reinforce our knowledge of the data structures just introduced 3911by considering two procedures that display a list in symbolic form. 3912The first of these, called |short_display|, is used in ``overfull box'' 3913messages to give the top-level description of a list. The other one, 3914called |show_node_list|, prints a detailed description of exactly what 3915is in the data structure. 3916 3917The philosophy of |short_display| is to ignore the fine points about exactly 3918what is inside boxes, except that ligatures and discretionary breaks are 3919expanded. As a result, |short_display| is a recursive procedure, but the 3920recursion is never more than one level deep. 3921@^recursion@> 3922 3923A global variable |font_in_short_display| keeps track of the font code that 3924is assumed to be present when |short_display| begins; deviations from this 3925font will be printed. 3926 3927@<Glob...@>= 3928@!font_in_short_display:integer; {an internal font number} 3929 3930@ Boxes, rules, inserts, whatsits, marks, and things in general that are 3931sort of ``complicated'' are indicated only by printing `\.{[]}'. 3932 3933@p procedure short_display(@!p:integer); {prints highlights of list |p|} 3934var n:integer; {for replacement counts} 3935begin while p>mem_min do 3936 begin if is_char_node(p) then 3937 begin if p<=mem_end then 3938 begin if font(p)<>font_in_short_display then 3939 begin if (font(p)<font_base)or(font(p)>font_max) then 3940 print_char("*") 3941@.*\relax@> 3942 else @<Print the font identifier for |font(p)|@>; 3943 print_char(" "); font_in_short_display:=font(p); 3944 end; 3945 print_ASCII(qo(character(p))); 3946 end; 3947 end 3948 else @<Print a short indication of the contents of node |p|@>; 3949 p:=link(p); 3950 end; 3951end; 3952 3953@ @<Print a short indication of the contents of node |p|@>= 3954case type(p) of 3955hlist_node,vlist_node,ins_node,mark_node,adjust_node, 3956 unset_node: print("[]"); 3957whatsit_node: 3958 case subtype(p) of 3959 native_word_node: begin 3960 if native_font(p)<>font_in_short_display then begin 3961 print_esc(font_id_text(native_font(p))); 3962 print_char(" "); 3963 font_in_short_display:=native_font(p); 3964 end; 3965 print_native_word(p); 3966 end; 3967 othercases print("[]") 3968 endcases; 3969rule_node: print_char("|"); 3970glue_node: if glue_ptr(p)<>zero_glue then print_char(" "); 3971math_node: if subtype(p)>=L_code then print("[]") 3972 else print_char("$"); 3973ligature_node: short_display(lig_ptr(p)); 3974disc_node: begin short_display(pre_break(p)); 3975 short_display(post_break(p));@/ 3976 n:=replace_count(p); 3977 while n>0 do 3978 begin if link(p)<>null then p:=link(p); 3979 decr(n); 3980 end; 3981 end; 3982othercases do_nothing 3983endcases 3984 3985@ The |show_node_list| routine requires some auxiliary subroutines: one to 3986print a font-and-character combination, one to print a token list without 3987its reference count, and one to print a rule dimension. 3988 3989@p procedure print_font_and_char(@!p:integer); {prints |char_node| data} 3990begin if p>mem_end then print_esc("CLOBBERED.") 3991else begin if (font(p)<font_base)or(font(p)>font_max) then print_char("*") 3992@.*\relax@> 3993 else @<Print the font identifier for |font(p)|@>; 3994 print_char(" "); print_ASCII(qo(character(p))); 3995 end; 3996end; 3997@# 3998procedure print_mark(@!p:integer); {prints token list data in braces} 3999begin print_char("{"); 4000if (p<hi_mem_min)or(p>mem_end) then print_esc("CLOBBERED.") 4001else show_token_list(link(p),null,max_print_line-10); 4002print_char("}"); 4003end; 4004@# 4005procedure print_rule_dimen(@!d:scaled); {prints dimension in rule node} 4006begin if is_running(d) then print_char("*") else print_scaled(d); 4007@.*\relax@> 4008end; 4009 4010@ Then there is a subroutine that prints glue stretch and shrink, possibly 4011followed by the name of finite units: 4012 4013@p procedure print_glue(@!d:scaled;@!order:integer;@!s:str_number); 4014 {prints a glue component} 4015begin print_scaled(d); 4016if (order<normal)or(order>filll) then print("foul") 4017else if order>normal then 4018 begin print("fil"); 4019 while order>fil do 4020 begin print_char("l"); decr(order); 4021 end; 4022 end 4023else if s<>0 then print(s); 4024end; 4025 4026@ The next subroutine prints a whole glue specification. 4027 4028@p procedure print_spec(@!p:integer;@!s:str_number); 4029 {prints a glue specification} 4030begin if (p<mem_min)or(p>=lo_mem_max) then print_char("*") 4031@.*\relax@> 4032else begin print_scaled(width(p)); 4033 if s<>0 then print(s); 4034 if stretch(p)<>0 then 4035 begin print(" plus "); print_glue(stretch(p),stretch_order(p),s); 4036 end; 4037 if shrink(p)<>0 then 4038 begin print(" minus "); print_glue(shrink(p),shrink_order(p),s); 4039 end; 4040 end; 4041end; 4042 4043@ We also need to declare some procedures that appear later in this 4044documentation. 4045 4046@p @<Declare procedures needed for displaying the elements of mlists@>@; 4047@<Declare the procedure called |print_skip_param|@> 4048 4049@ Since boxes can be inside of boxes, |show_node_list| is inherently recursive, 4050@^recursion@> 4051up to a given maximum number of levels. The history of nesting is indicated 4052by the current string, which will be printed at the beginning of each line; 4053the length of this string, namely |cur_length|, is the depth of nesting. 4054 4055Recursive calls on |show_node_list| therefore use the following pattern: 4056 4057@d node_list_display(#)== 4058 begin append_char("."); show_node_list(#); flush_char; 4059 end {|str_room| need not be checked; see |show_box| below} 4060 4061@ A global variable called |depth_threshold| is used to record the maximum 4062depth of nesting for which |show_node_list| will show information. If we 4063have |depth_threshold=0|, for example, only the top level information will 4064be given and no sublists will be traversed. Another global variable, called 4065|breadth_max|, tells the maximum number of items to show at each level; 4066|breadth_max| had better be positive, or you won't see anything. 4067 4068@<Glob...@>= 4069@!depth_threshold : integer; {maximum nesting depth in box displays} 4070@!breadth_max : integer; {maximum number of items shown at the same list level} 4071 4072@ Now we are ready for |show_node_list| itself. This procedure has been 4073written to be ``extra robust'' in the sense that it should not crash or get 4074into a loop even if the data structures have been messed up by bugs in 4075the rest of the program. You can safely call its parent routine 4076|show_box(p)| for arbitrary values of |p| when you are debugging \TeX. 4077However, in the presence of bad data, the procedure may 4078@^dirty \PASCAL@>@^debugging@> 4079fetch a |memory_word| whose variant is different from the way it was stored; 4080for example, it might try to read |mem[p].hh| when |mem[p]| 4081contains a scaled integer, if |p| is a pointer that has been 4082clobbered or chosen at random. 4083 4084@p procedure show_node_list(@!p:integer); {prints a node list symbolically} 4085label exit; 4086var n:integer; {the number of items already printed at this level} 4087i:integer; {temp index for printing chars of picfile paths} 4088@!g:real; {a glue ratio, as a floating point number} 4089begin if cur_length>depth_threshold then 4090 begin if p>null then print(" []"); 4091 {indicate that there's been some truncation} 4092 return; 4093 end; 4094n:=0; 4095while p>mem_min do 4096 begin print_ln; print_current_string; {display the nesting history} 4097 if p>mem_end then {pointer out of range} 4098 begin print("Bad link, display aborted."); return; 4099@.Bad link...@> 4100 end; 4101 incr(n); if n>breadth_max then {time to stop} 4102 begin print("etc."); return; 4103@.etc@> 4104 end; 4105 @<Display node |p|@>; 4106 p:=link(p); 4107 end; 4108exit: 4109end; 4110 4111@ @<Display node |p|@>= 4112if is_char_node(p) then print_font_and_char(p) 4113else case type(p) of 4114 hlist_node,vlist_node,unset_node: @<Display box |p|@>; 4115 rule_node: @<Display rule |p|@>; 4116 ins_node: @<Display insertion |p|@>; 4117 whatsit_node: @<Display the whatsit node |p|@>; 4118 glue_node: @<Display glue |p|@>; 4119 kern_node: @<Display kern |p|@>; 4120 margin_kern_node: begin 4121 print_esc("kern"); 4122 print_scaled(width(p)); 4123 if subtype(p) = left_side then 4124 print(" (left margin)") 4125 else 4126 print(" (right margin)"); 4127 end; 4128 math_node: @<Display math node |p|@>; 4129 ligature_node: @<Display ligature |p|@>; 4130 penalty_node: @<Display penalty |p|@>; 4131 disc_node: @<Display discretionary |p|@>; 4132 mark_node: @<Display mark |p|@>; 4133 adjust_node: @<Display adjustment |p|@>; 4134 @t\4@>@<Cases of |show_node_list| that arise in mlists only@>@; 4135 othercases print("Unknown node type!") 4136 endcases 4137 4138@ @<Display box |p|@>= 4139begin if type(p)=hlist_node then print_esc("h") 4140else if type(p)=vlist_node then print_esc("v") 4141else print_esc("unset"); 4142print("box("); print_scaled(height(p)); print_char("+"); 4143print_scaled(depth(p)); print(")x"); print_scaled(width(p)); 4144if type(p)=unset_node then 4145 @<Display special fields of the unset node |p|@> 4146else begin @<Display the value of |glue_set(p)|@>; 4147 if shift_amount(p)<>0 then 4148 begin print(", shifted "); print_scaled(shift_amount(p)); 4149 end; 4150 if eTeX_ex then @<Display if this box is never to be reversed@>; 4151 end; 4152node_list_display(list_ptr(p)); {recursive call} 4153end 4154 4155@ @<Display special fields of the unset node |p|@>= 4156begin if span_count(p)<>min_quarterword then 4157 begin print(" ("); print_int(qo(span_count(p))+1); 4158 print(" columns)"); 4159 end; 4160if glue_stretch(p)<>0 then 4161 begin print(", stretch "); print_glue(glue_stretch(p),glue_order(p),0); 4162 end; 4163if glue_shrink(p)<>0 then 4164 begin print(", shrink "); print_glue(glue_shrink(p),glue_sign(p),0); 4165 end; 4166end 4167 4168@ The code will have to change in this place if |glue_ratio| is 4169a structured type instead of an ordinary |real|. Note that this routine 4170should avoid arithmetic errors even if the |glue_set| field holds an 4171arbitrary random value. The following code assumes that a properly 4172formed nonzero |real| number has absolute value $2^{20}$ or more when 4173it is regarded as an integer; this precaution was adequate to prevent 4174floating point underflow on the author's computer. 4175@^system dependencies@> 4176@^dirty \PASCAL@> 4177 4178@<Display the value of |glue_set(p)|@>= 4179g:=float(glue_set(p)); 4180if (g<>float_constant(0))and(glue_sign(p)<>normal) then 4181 begin print(", glue set "); 4182 if glue_sign(p)=shrinking then print("- "); 4183 if abs(mem[p+glue_offset].int)<@'4000000 then print("?.?") 4184 else if abs(g)>float_constant(20000) then 4185 begin if g>float_constant(0) then print_char(">") 4186 else print("< -"); 4187 print_glue(20000*unity,glue_order(p),0); 4188 end 4189 else print_glue(round(unity*g),glue_order(p),0); 4190@^real multiplication@> 4191 end 4192 4193@ @<Display rule |p|@>= 4194begin print_esc("rule("); print_rule_dimen(height(p)); print_char("+"); 4195print_rule_dimen(depth(p)); print(")x"); print_rule_dimen(width(p)); 4196end 4197 4198@ @<Display insertion |p|@>= 4199begin print_esc("insert"); print_int(qo(subtype(p))); 4200print(", natural size "); print_scaled(height(p)); 4201print("; split("); print_spec(split_top_ptr(p),0); 4202print_char(","); print_scaled(depth(p)); 4203print("); float cost "); print_int(float_cost(p)); 4204node_list_display(ins_ptr(p)); {recursive call} 4205end 4206 4207@ @<Display glue |p|@>= 4208if subtype(p)>=a_leaders then @<Display leaders |p|@> 4209else begin print_esc("glue"); 4210 if subtype(p)<>normal then 4211 begin print_char("("); 4212 if subtype(p)<cond_math_glue then 4213 print_skip_param(subtype(p)-1) 4214 else if subtype(p)=cond_math_glue then print_esc("nonscript") 4215 else print_esc("mskip"); 4216 print_char(")"); 4217 end; 4218 if subtype(p)<>cond_math_glue then 4219 begin print_char(" "); 4220 if subtype(p)<cond_math_glue then print_spec(glue_ptr(p),0) 4221 else print_spec(glue_ptr(p),"mu"); 4222 end; 4223 end 4224 4225@ @<Display leaders |p|@>= 4226begin print_esc(""); 4227if subtype(p)=c_leaders then print_char("c") 4228else if subtype(p)=x_leaders then print_char("x"); 4229print("leaders "); print_spec(glue_ptr(p),0); 4230node_list_display(leader_ptr(p)); {recursive call} 4231end 4232 4233@ An ``explicit'' kern value is indicated implicitly by an explicit space. 4234 4235@<Display kern |p|@>= 4236if subtype(p)<>mu_glue then 4237 begin print_esc("kern"); 4238 if subtype(p)<>normal then print_char(" "); 4239 print_scaled(width(p)); 4240 if subtype(p)=acc_kern then print(" (for accent)"); 4241@.for accent@> 4242 end 4243else begin print_esc("mkern"); print_scaled(width(p)); print("mu"); 4244 end 4245 4246@ @<Display math node |p|@>= 4247if subtype(p)>after then 4248 begin if end_LR(p) then print_esc("end") 4249 else print_esc("begin"); 4250 if subtype(p)>R_code then print_char("R") 4251 else if subtype(p)>L_code then print_char("L") 4252 else print_char("M"); 4253 end else 4254begin print_esc("math"); 4255if subtype(p)=before then print("on") 4256else print("off"); 4257if width(p)<>0 then 4258 begin print(", surrounded "); print_scaled(width(p)); 4259 end; 4260end 4261 4262@ @<Display ligature |p|@>= 4263begin print_font_and_char(lig_char(p)); print(" (ligature "); 4264if subtype(p)>1 then print_char("|"); 4265font_in_short_display:=font(lig_char(p)); short_display(lig_ptr(p)); 4266if odd(subtype(p)) then print_char("|"); 4267print_char(")"); 4268end 4269 4270@ @<Display penalty |p|@>= 4271begin print_esc("penalty "); print_int(penalty(p)); 4272end 4273 4274@ The |post_break| list of a discretionary node is indicated by a prefixed 4275`\.{\char'174}' instead of the `\..' before the |pre_break| list. 4276 4277@<Display discretionary |p|@>= 4278begin print_esc("discretionary"); 4279if replace_count(p)>0 then 4280 begin print(" replacing "); print_int(replace_count(p)); 4281 end; 4282node_list_display(pre_break(p)); {recursive call} 4283append_char("|"); show_node_list(post_break(p)); flush_char; {recursive call} 4284end 4285 4286@ @<Display mark |p|@>= 4287begin print_esc("mark"); 4288if mark_class(p)<>0 then 4289 begin print_char("s"); print_int(mark_class(p)); 4290 end; 4291print_mark(mark_ptr(p)); 4292end 4293 4294@ @<Display adjustment |p|@>= 4295begin print_esc("vadjust"); if adjust_pre(p) <> 0 then print(" pre "); 4296node_list_display(adjust_ptr(p)); {recursive call} 4297end 4298 4299@ The recursive machinery is started by calling |show_box|. 4300@^recursion@> 4301 4302@p procedure show_box(@!p:pointer); 4303begin @<Assign the values |depth_threshold:=show_box_depth| and 4304 |breadth_max:=show_box_breadth|@>; 4305if breadth_max<=0 then breadth_max:=5; 4306if pool_ptr+depth_threshold>=pool_size then 4307 depth_threshold:=pool_size-pool_ptr-1; 4308 {now there's enough room for prefix string} 4309show_node_list(p); {the show starts at |p|} 4310print_ln; 4311end; 4312 4313procedure short_display_n(@!p, m:integer); {prints highlights of list |p|} 4314begin 4315 breadth_max:=m; 4316 depth_threshold:=pool_size-pool_ptr-1; 4317 show_node_list(p); {the show starts at |p|} 4318end; 4319 4320@* \[13] Destroying boxes. 4321When we are done with a node list, we are obliged to return it to free 4322storage, including all of its sublists. The recursive procedure 4323|flush_node_list| does this for us. 4324 4325@ First, however, we shall consider two non-recursive procedures that do 4326simpler tasks. The first of these, |delete_token_ref|, is called when 4327a pointer to a token list's reference count is being removed. This means 4328that the token list should disappear if the reference count was |null|, 4329otherwise the count should be decreased by one. 4330@^reference counts@> 4331 4332@d token_ref_count(#) == info(#) {reference count preceding a token list} 4333 4334@p procedure delete_token_ref(@!p:pointer); {|p| points to the reference count 4335 of a token list that is losing one reference} 4336begin if token_ref_count(p)=null then flush_list(p) 4337else decr(token_ref_count(p)); 4338end; 4339 4340@ Similarly, |delete_glue_ref| is called when a pointer to a glue 4341specification is being withdrawn. 4342@^reference counts@> 4343@d fast_delete_glue_ref(#)==@t@>@;@/ 4344 begin if glue_ref_count(#)=null then free_node(#,glue_spec_size) 4345 else decr(glue_ref_count(#)); 4346 end 4347 4348@p procedure delete_glue_ref(@!p:pointer); {|p| points to a glue specification} 4349fast_delete_glue_ref(p); 4350 4351@ Now we are ready to delete any node list, recursively. 4352In practice, the nodes deleted are usually charnodes (about 2/3 of the time), 4353and they are glue nodes in about half of the remaining cases. 4354@^recursion@> 4355 4356@p procedure flush_node_list(@!p:pointer); {erase list of nodes starting at |p|} 4357label done; {go here when node |p| has been freed} 4358var q:pointer; {successor to node |p|} 4359begin while p<>null do 4360@^inner loop@> 4361 begin q:=link(p); 4362 if is_char_node(p) then free_avail(p) 4363 else begin case type(p) of 4364 hlist_node,vlist_node,unset_node: begin flush_node_list(list_ptr(p)); 4365 free_node(p,box_node_size); goto done; 4366 end; 4367 rule_node: begin free_node(p,rule_node_size); goto done; 4368 end; 4369 ins_node: begin flush_node_list(ins_ptr(p)); 4370 delete_glue_ref(split_top_ptr(p)); 4371 free_node(p,ins_node_size); goto done; 4372 end; 4373 whatsit_node: @<Wipe out the whatsit node |p| and |goto done|@>; 4374 glue_node: begin fast_delete_glue_ref(glue_ptr(p)); 4375 if leader_ptr(p)<>null then flush_node_list(leader_ptr(p)); 4376 end; 4377 kern_node,math_node,penalty_node: do_nothing; 4378 margin_kern_node: begin 4379 free_node(p, margin_kern_node_size); 4380 goto done; 4381 end; 4382 ligature_node: flush_node_list(lig_ptr(p)); 4383 mark_node: delete_token_ref(mark_ptr(p)); 4384 disc_node: begin flush_node_list(pre_break(p)); 4385 flush_node_list(post_break(p)); 4386 end; 4387 adjust_node: flush_node_list(adjust_ptr(p)); 4388 @t\4@>@<Cases of |flush_node_list| that arise in mlists only@>@; 4389 othercases confusion("flushing") 4390@:this can't happen flushing}{\quad flushing@> 4391 endcases;@/ 4392 free_node(p,small_node_size); 4393 done:end; 4394 p:=q; 4395 end; 4396end; 4397 4398@* \[14] Copying boxes. 4399Another recursive operation that acts on boxes is sometimes needed: The 4400procedure |copy_node_list| returns a pointer to another node list that has 4401the same structure and meaning as the original. Note that since glue 4402specifications and token lists have reference counts, we need not make 4403copies of them. Reference counts can never get too large to fit in a 4404halfword, since each pointer to a node is in a different memory address, 4405and the total number of memory addresses fits in a halfword. 4406@^recursion@> 4407@^reference counts@> 4408 4409(Well, there actually are also references from outside |mem|; if the 4410|save_stack| is made arbitrarily large, it would theoretically be possible 4411to break \TeX\ by overflowing a reference count. But who would want to do that?) 4412 4413@d add_token_ref(#)==incr(token_ref_count(#)) {new reference to a token list} 4414@d add_glue_ref(#)==incr(glue_ref_count(#)) {new reference to a glue spec} 4415 4416@ The copying procedure copies words en masse without bothering 4417to look at their individual fields. If the node format changes---for 4418example, if the size is altered, or if some link field is moved to another 4419relative position---then this code may need to be changed too. 4420@^data structure assumptions@> 4421 4422@p function copy_node_list(@!p:pointer):pointer; {makes a duplicate of the 4423 node list that starts at |p| and returns a pointer to the new list} 4424var h:pointer; {temporary head of copied list} 4425@!q:pointer; {previous position in new list} 4426@!r:pointer; {current node being fabricated for new list} 4427@!words:0..5; {number of words remaining to be copied} 4428begin h:=get_avail; q:=h; 4429while p<>null do 4430 begin @<Make a copy of node |p| in node |r|@>; 4431 link(q):=r; q:=r; p:=link(p); 4432 end; 4433link(q):=null; q:=link(h); free_avail(h); 4434copy_node_list:=q; 4435end; 4436 4437@ @<Make a copy of node |p|...@>= 4438words:=1; {this setting occurs in more branches than any other} 4439if is_char_node(p) then r:=get_avail 4440else @<Case statement to copy different types and set |words| to the number 4441 of initial words not yet copied@>; 4442while words>0 do 4443 begin decr(words); mem[r+words]:=mem[p+words]; 4444 end 4445 4446@ @<Case statement to copy...@>= 4447case type(p) of 4448hlist_node,vlist_node,unset_node: begin r:=get_node(box_node_size); 4449 mem[r+6]:=mem[p+6]; mem[r+5]:=mem[p+5]; {copy the last two words} 4450 list_ptr(r):=copy_node_list(list_ptr(p)); {this affects |mem[r+5]|} 4451 words:=5; 4452 end; 4453rule_node: begin r:=get_node(rule_node_size); words:=rule_node_size; 4454 end; 4455ins_node: begin r:=get_node(ins_node_size); mem[r+4]:=mem[p+4]; 4456 add_glue_ref(split_top_ptr(p)); 4457 ins_ptr(r):=copy_node_list(ins_ptr(p)); {this affects |mem[r+4]|} 4458 words:=ins_node_size-1; 4459 end; 4460whatsit_node:@<Make a partial copy of the whatsit node |p| and make |r| 4461 point to it; set |words| to the number of initial words not yet copied@>; 4462glue_node: begin r:=get_node(small_node_size); add_glue_ref(glue_ptr(p)); 4463 glue_ptr(r):=glue_ptr(p); leader_ptr(r):=copy_node_list(leader_ptr(p)); 4464 end; 4465kern_node,math_node,penalty_node: begin r:=get_node(small_node_size); 4466 words:=small_node_size; 4467 end; 4468margin_kern_node: begin 4469 r:=get_node(margin_kern_node_size); 4470 words:=margin_kern_node_size; 4471 end; 4472ligature_node: begin r:=get_node(small_node_size); 4473 mem[lig_char(r)]:=mem[lig_char(p)]; {copy |font| and |character|} 4474 lig_ptr(r):=copy_node_list(lig_ptr(p)); 4475 end; 4476disc_node: begin r:=get_node(small_node_size); 4477 pre_break(r):=copy_node_list(pre_break(p)); 4478 post_break(r):=copy_node_list(post_break(p)); 4479 end; 4480mark_node: begin r:=get_node(small_node_size); add_token_ref(mark_ptr(p)); 4481 words:=small_node_size; 4482 end; 4483adjust_node: begin r:=get_node(small_node_size); 4484 adjust_ptr(r):=copy_node_list(adjust_ptr(p)); 4485 end; {|words=1=small_node_size-1|} 4486othercases confusion("copying") 4487@:this can't happen copying}{\quad copying@> 4488endcases 4489 4490@* \[15] The command codes. 4491Before we can go any further, we need to define symbolic names for the internal 4492code numbers that represent the various commands obeyed by \TeX. These codes 4493are somewhat arbitrary, but not completely so. For example, the command 4494codes for character types are fixed by the language, since a user says, 4495e.g., `\.{\\catcode \`\\\${} = 3}' to make \.{\char'44} a math delimiter, 4496and the command code |math_shift| is equal to~3. Some other codes have 4497been made adjacent so that |case| statements in the program need not consider 4498cases that are widely spaced, or so that |case| statements can be replaced 4499by |if| statements. 4500 4501At any rate, here is the list, for future reference. First come the 4502``catcode'' commands, several of which share their numeric codes with 4503ordinary commands when the catcode cannot emerge from \TeX's scanning routine. 4504 4505@d escape=0 {escape delimiter (called \.\\ in {\sl The \TeX book\/})} 4506@:TeXbook}{\sl The \TeX book@> 4507@d relax=0 {do nothing ( \.{\\relax} )} 4508@d left_brace=1 {beginning of a group ( \.\{ )} 4509@d right_brace=2 {ending of a group ( \.\} )} 4510@d math_shift=3 {mathematics shift character ( \.\$ )} 4511@d tab_mark=4 {alignment delimiter ( \.\&, \.{\\span} )} 4512@d car_ret=5 {end of line ( |carriage_return|, \.{\\cr}, \.{\\crcr} )} 4513@d out_param=5 {output a macro parameter} 4514@d mac_param=6 {macro parameter symbol ( \.\# )} 4515@d sup_mark=7 {superscript ( \.{\char'136} )} 4516@d sub_mark=8 {subscript ( \.{\char'137} )} 4517@d ignore=9 {characters to ignore ( \.{\^\^@@} )} 4518@d endv=9 {end of \<v_j> list in alignment template} 4519@d spacer=10 {characters equivalent to blank space ( \.{\ } )} 4520@d letter=11 {characters regarded as letters ( \.{A..Z}, \.{a..z} )} 4521@d other_char=12 {none of the special character types} 4522@d active_char=13 {characters that invoke macros ( \.{\char`\~} )} 4523@d par_end=13 {end of paragraph ( \.{\\par} )} 4524@d match=13 {match a macro parameter} 4525@d comment=14 {characters that introduce comments ( \.\% )} 4526@d end_match=14 {end of parameters to macro} 4527@d stop=14 {end of job ( \.{\\end}, \.{\\dump} )} 4528@d invalid_char=15 {characters that shouldn't appear ( \.{\^\^?} )} 4529@d delim_num=15 {specify delimiter numerically ( \.{\\delimiter} )} 4530@d max_char_code=15 {largest catcode for individual characters} 4531 4532@ Next are the ordinary run-of-the-mill command codes. Codes that are 4533|min_internal| or more represent internal quantities that might be 4534expanded by `\.{\\the}'. 4535 4536@d char_num=16 {character specified numerically ( \.{\\char} )} 4537@d math_char_num=17 {explicit math code ( \.{\\mathchar} )} 4538@d mark=18 {mark definition ( \.{\\mark} )} 4539@d xray=19 {peek inside of \TeX\ ( \.{\\show}, \.{\\showbox}, etc.~)} 4540@d make_box=20 {make a box ( \.{\\box}, \.{\\copy}, \.{\\hbox}, etc.~)} 4541@d hmove=21 {horizontal motion ( \.{\\moveleft}, \.{\\moveright} )} 4542@d vmove=22 {vertical motion ( \.{\\raise}, \.{\\lower} )} 4543@d un_hbox=23 {unglue a box ( \.{\\unhbox}, \.{\\unhcopy} )} 4544@d un_vbox=24 {unglue a box ( \.{\\unvbox}, \.{\\unvcopy} )} 4545 {( or \.{\\pagediscards}, \.{\\splitdiscards} )} 4546@d remove_item=25 {nullify last item ( \.{\\unpenalty}, 4547 \.{\\unkern}, \.{\\unskip} )} 4548@d hskip=26 {horizontal glue ( \.{\\hskip}, \.{\\hfil}, etc.~)} 4549@d vskip=27 {vertical glue ( \.{\\vskip}, \.{\\vfil}, etc.~)} 4550@d mskip=28 {math glue ( \.{\\mskip} )} 4551@d kern=29 {fixed space ( \.{\\kern})} 4552@d mkern=30 {math kern ( \.{\\mkern} )} 4553@d leader_ship=31 {use a box ( \.{\\shipout}, \.{\\leaders}, etc.~)} 4554@d halign=32 {horizontal table alignment ( \.{\\halign} )} 4555@d valign=33 {vertical table alignment ( \.{\\valign} )} 4556 {or text direction directives ( \.{\\beginL}, etc.~)} 4557@d no_align=34 {temporary escape from alignment ( \.{\\noalign} )} 4558@d vrule=35 {vertical rule ( \.{\\vrule} )} 4559@d hrule=36 {horizontal rule ( \.{\\hrule} )} 4560@d insert=37 {vlist inserted in box ( \.{\\insert} )} 4561@d vadjust=38 {vlist inserted in enclosing paragraph ( \.{\\vadjust} )} 4562@d ignore_spaces=39 {gobble |spacer| tokens ( \.{\\ignorespaces} )} 4563@d after_assignment=40 {save till assignment is done ( \.{\\afterassignment} )} 4564@d after_group=41 {save till group is done ( \.{\\aftergroup} )} 4565@d break_penalty=42 {additional badness ( \.{\\penalty} )} 4566@d start_par=43 {begin paragraph ( \.{\\indent}, \.{\\noindent} )} 4567@d ital_corr=44 {italic correction ( \.{\\/} )} 4568@d accent=45 {attach accent in text ( \.{\\accent} )} 4569@d math_accent=46 {attach accent in math ( \.{\\mathaccent} )} 4570@d discretionary=47 {discretionary texts ( \.{\\-}, \.{\\discretionary} )} 4571@d eq_no=48 {equation number ( \.{\\eqno}, \.{\\leqno} )} 4572@d left_right=49 {variable delimiter ( \.{\\left}, \.{\\right} )} 4573 {( or \.{\\middle} )} 4574@d math_comp=50 {component of formula ( \.{\\mathbin}, etc.~)} 4575@d limit_switch=51 {diddle limit conventions ( \.{\\displaylimits}, etc.~)} 4576@d above=52 {generalized fraction ( \.{\\above}, \.{\\atop}, etc.~)} 4577@d math_style=53 {style specification ( \.{\\displaystyle}, etc.~)} 4578@d math_choice=54 {choice specification ( \.{\\mathchoice} )} 4579@d non_script=55 {conditional math glue ( \.{\\nonscript} )} 4580@d vcenter=56 {vertically center a vbox ( \.{\\vcenter} )} 4581@d case_shift=57 {force specific case ( \.{\\lowercase}, \.{\\uppercase}~)} 4582@d message=58 {send to user ( \.{\\message}, \.{\\errmessage} )} 4583@d extension=59 {extensions to \TeX\ ( \.{\\write}, \.{\\special}, etc.~)} 4584@d in_stream=60 {files for reading ( \.{\\openin}, \.{\\closein} )} 4585@d begin_group=61 {begin local grouping ( \.{\\begingroup} )} 4586@d end_group=62 {end local grouping ( \.{\\endgroup} )} 4587@d omit=63 {omit alignment template ( \.{\\omit} )} 4588@d ex_space=64 {explicit space ( \.{\\\ } )} 4589@d no_boundary=65 {suppress boundary ligatures ( \.{\\noboundary} )} 4590@d radical=66 {square root and similar signs ( \.{\\radical} )} 4591@d end_cs_name=67 {end control sequence ( \.{\\endcsname} )} 4592@d min_internal=68 {the smallest code that can follow \.{\\the}} 4593@d char_given=68 {character code defined by \.{\\chardef}} 4594@d math_given=69 {math code defined by \.{\\mathchardef}} 4595@d XeTeX_math_given=70 {extended math code defined by \.{\\Umathchardef}} 4596@d last_item=71 {most recent item ( \.{\\lastpenalty}, 4597 \.{\\lastkern}, \.{\\lastskip} )} 4598@d max_non_prefixed_command=71 {largest command code that can't be \.{\\global}} 4599 4600@ The next codes are special; they all relate to mode-independent 4601assignment of values to \TeX's internal registers or tables. 4602Codes that are |max_internal| or less represent internal quantities 4603that might be expanded by `\.{\\the}'. 4604 4605@d toks_register=72 {token list register ( \.{\\toks} )} 4606@d assign_toks=73 {special token list ( \.{\\output}, \.{\\everypar}, etc.~)} 4607@d assign_int=74 {user-defined integer ( \.{\\tolerance}, \.{\\day}, etc.~)} 4608@d assign_dimen=75 {user-defined length ( \.{\\hsize}, etc.~)} 4609@d assign_glue=76 {user-defined glue ( \.{\\baselineskip}, etc.~)} 4610@d assign_mu_glue=77 {user-defined muglue ( \.{\\thinmuskip}, etc.~)} 4611@d assign_font_dimen=78 {user-defined font dimension ( \.{\\fontdimen} )} 4612@d assign_font_int=79 {user-defined font integer ( \.{\\hyphenchar}, 4613 \.{\\skewchar} )} 4614@d set_aux=80 {specify state info ( \.{\\spacefactor}, \.{\\prevdepth} )} 4615@d set_prev_graf=81 {specify state info ( \.{\\prevgraf} )} 4616@d set_page_dimen=82 {specify state info ( \.{\\pagegoal}, etc.~)} 4617@d set_page_int=83 {specify state info ( \.{\\deadcycles}, 4618 \.{\\insertpenalties} )} 4619 {( or \.{\\interactionmode} )} 4620@d set_box_dimen=84 {change dimension of box ( \.{\\wd}, \.{\\ht}, \.{\\dp} )} 4621@d set_shape=85 {specify fancy paragraph shape ( \.{\\parshape} )} 4622 {(or \.{\\interlinepenalties}, etc.~)} 4623@d def_code=86 {define a character code ( \.{\\catcode}, etc.~)} 4624@d XeTeX_def_code=87 {\.{\\Umathcode}, \.{\\Udelcode}} 4625@d def_family=88 {declare math fonts ( \.{\\textfont}, etc.~)} 4626@d set_font=89 {set current font ( font identifiers )} 4627@d def_font=90 {define a font file ( \.{\\font} )} 4628@d register=91 {internal register ( \.{\\count}, \.{\\dimen}, etc.~)} 4629@d max_internal=91 {the largest code that can follow \.{\\the}} 4630@d advance=92 {advance a register or parameter ( \.{\\advance} )} 4631@d multiply=93 {multiply a register or parameter ( \.{\\multiply} )} 4632@d divide=94 {divide a register or parameter ( \.{\\divide} )} 4633@d prefix=95 {qualify a definition ( \.{\\global}, \.{\\long}, \.{\\outer} )} 4634 {( or \.{\\protected} )} 4635@d let=96 {assign a command code ( \.{\\let}, \.{\\futurelet} )} 4636@d shorthand_def=97 {code definition ( \.{\\chardef}, \.{\\countdef}, etc.~)} 4637@d read_to_cs=98 {read into a control sequence ( \.{\\read} )} 4638 {( or \.{\\readline} )} 4639@d def=99 {macro definition ( \.{\\def}, \.{\\gdef}, \.{\\xdef}, \.{\\edef} )} 4640@d set_box=100 {set a box ( \.{\\setbox} )} 4641@d hyph_data=101 {hyphenation data ( \.{\\hyphenation}, \.{\\patterns} )} 4642@d set_interaction=102 {define level of interaction ( \.{\\batchmode}, etc.~)} 4643@d max_command=102 {the largest command code seen at |big_switch|} 4644 4645@ The remaining command codes are extra special, since they cannot get through 4646\TeX's scanner to the main control routine. They have been given values higher 4647than |max_command| so that their special nature is easily discernible. 4648The ``expandable'' commands come first. 4649 4650@d undefined_cs=max_command+1 {initial state of most |eq_type| fields} 4651@d expand_after=max_command+2 {special expansion ( \.{\\expandafter} )} 4652@d no_expand=max_command+3 {special nonexpansion ( \.{\\noexpand} )} 4653@d input=max_command+4 {input a source file ( \.{\\input}, \.{\\endinput} )} 4654 {( or \.{\\scantokens} )} 4655@d if_test=max_command+5 {conditional text ( \.{\\if}, \.{\\ifcase}, etc.~)} 4656@d fi_or_else=max_command+6 {delimiters for conditionals ( \.{\\else}, etc.~)} 4657@d cs_name=max_command+7 {make a control sequence from tokens ( \.{\\csname} )} 4658@d convert=max_command+8 {convert to text ( \.{\\number}, \.{\\string}, etc.~)} 4659@d the=max_command+9 {expand an internal quantity ( \.{\\the} )} 4660 {( or \.{\\unexpanded}, \.{\\detokenize} )} 4661@d top_bot_mark=max_command+10 {inserted mark ( \.{\\topmark}, etc.~)} 4662@d call=max_command+11 {non-long, non-outer control sequence} 4663@d long_call=max_command+12 {long, non-outer control sequence} 4664@d outer_call=max_command+13 {non-long, outer control sequence} 4665@d long_outer_call=max_command+14 {long, outer control sequence} 4666@d end_template=max_command+15 {end of an alignment template} 4667@d dont_expand=max_command+16 {the following token was marked by \.{\\noexpand}} 4668@d glue_ref=max_command+17 {the equivalent points to a glue specification} 4669@d shape_ref=max_command+18 {the equivalent points to a parshape specification} 4670@d box_ref=max_command+19 {the equivalent points to a box node, or is |null|} 4671@d data=max_command+20 {the equivalent is simply a halfword number} 4672 4673@* \[16] The semantic nest. 4674\TeX\ is typically in the midst of building many lists at once. For example, 4675when a math formula is being processed, \TeX\ is in math mode and 4676working on an mlist; this formula has temporarily interrupted \TeX\ from 4677being in horizontal mode and building the hlist of a paragraph; and this 4678paragraph has temporarily interrupted \TeX\ from being in vertical mode 4679and building the vlist for the next page of a document. Similarly, when a 4680\.{\\vbox} occurs inside of an \.{\\hbox}, \TeX\ is temporarily 4681interrupted from working in restricted horizontal mode, and it enters 4682internal vertical mode. The ``semantic nest'' is a stack that 4683keeps track of what lists and modes are currently suspended. 4684 4685At each level of processing we are in one of six modes: 4686 4687\yskip\hang|vmode| stands for vertical mode (the page builder); 4688 4689\hang|hmode| stands for horizontal mode (the paragraph builder); 4690 4691\hang|mmode| stands for displayed formula mode; 4692 4693\hang|-vmode| stands for internal vertical mode (e.g., in a \.{\\vbox}); 4694 4695\hang|-hmode| stands for restricted horizontal mode (e.g., in an \.{\\hbox}); 4696 4697\hang|-mmode| stands for math formula mode (not displayed). 4698 4699\yskip\noindent The mode is temporarily set to zero while processing \.{\\write} 4700texts in the |ship_out| routine. 4701 4702Numeric values are assigned to |vmode|, |hmode|, and |mmode| so that 4703\TeX's ``big semantic switch'' can select the appropriate thing to 4704do by computing the value |abs(mode)+cur_cmd|, where |mode| is the current 4705mode and |cur_cmd| is the current command code. 4706 4707@d vmode=1 {vertical mode} 4708@d hmode=vmode+max_command+1 {horizontal mode} 4709@d mmode=hmode+max_command+1 {math mode} 4710 4711@p procedure print_mode(@!m:integer); {prints the mode represented by |m|} 4712begin if m>0 then 4713 case m div (max_command+1) of 4714 0:print("vertical"); 4715 1:print("horizontal"); 4716 2:print("display math"); 4717 end 4718else if m=0 then print("no") 4719else case (-m) div (max_command+1) of 4720 0:print("internal vertical"); 4721 1:print("restricted horizontal"); 4722 2:print("math"); 4723 end; 4724print(" mode"); 4725end; 4726 4727@ The state of affairs at any semantic level can be represented by 4728five values: 4729 4730\yskip\hang|mode| is the number representing the semantic mode, as 4731just explained. 4732 4733\yskip\hang|head| is a |pointer| to a list head for the list being built; 4734|link(head)| therefore points to the first element of the list, or 4735to |null| if the list is empty. 4736 4737\yskip\hang|tail| is a |pointer| to the final node of the list being 4738built; thus, |tail=head| if and only if the list is empty. 4739 4740\yskip\hang|prev_graf| is the number of lines of the current paragraph that 4741have already been put into the present vertical list. 4742 4743\yskip\hang|aux| is an auxiliary |memory_word| that gives further information 4744that is needed to characterize the situation. 4745 4746\yskip\noindent 4747In vertical mode, |aux| is also known as |prev_depth|; it is the scaled 4748value representing the depth of the previous box, for use in baseline 4749calculations, or it is |<=-1000|pt if the next box on the vertical list is to 4750be exempt from baseline calculations. In horizontal mode, |aux| is also 4751known as |space_factor| and |clang|; it holds the current space factor used in 4752spacing calculations, and the current language used for hyphenation. 4753(The value of |clang| is undefined in restricted horizontal mode.) 4754In math mode, |aux| is also known as |incompleat_noad|; if 4755not |null|, it points to a record that represents the numerator of a 4756generalized fraction for which the denominator is currently being formed 4757in the current list. 4758 4759There is also a sixth quantity, |mode_line|, which correlates 4760the semantic nest with the user's input; |mode_line| contains the source 4761line number at which the current level of nesting was entered. The negative 4762of this line number is the |mode_line| at the level of the 4763user's output routine. 4764 4765A seventh quantity, |eTeX_aux|, is used by the extended features \eTeX. 4766In vertical modes it is known as |LR_save| and holds the LR stack when a 4767paragraph is interrupted by a displayed formula. In display math mode 4768it is known as |LR_box| and holds a pointer to a prototype box for the 4769display. In math mode it is known as |delim_ptr| and points to the most 4770recent |left_noad| or |middle_noad| of a |math_left_group|. 4771 4772In horizontal mode, the |prev_graf| field is used for initial language data. 4773 4774The semantic nest is an array called |nest| that holds the |mode|, |head|, 4775|tail|, |prev_graf|, |aux|, and |mode_line| values for all semantic levels 4776below the currently active one. Information about the currently active 4777level is kept in the global quantities |mode|, |head|, |tail|, |prev_graf|, 4778|aux|, and |mode_line|, which live in a \PASCAL\ record that is ready to 4779be pushed onto |nest| if necessary. 4780 4781@d ignore_depth==-65536000 {|prev_depth| value that is ignored} 4782 4783@<Types...@>= 4784@!list_state_record=record@!mode_field:-mmode..mmode;@+ 4785 @!head_field,@!tail_field: pointer; 4786 @!eTeX_aux_field: pointer; 4787 @!pg_field,@!ml_field: integer;@+ 4788 @!aux_field: memory_word; 4789 end; 4790 4791@ @d mode==cur_list.mode_field {current mode} 4792@d head==cur_list.head_field {header node of current list} 4793@d tail==cur_list.tail_field {final node on current list} 4794@d eTeX_aux==cur_list.eTeX_aux_field {auxiliary data for \eTeX} 4795@d LR_save==eTeX_aux {LR stack when a paragraph is interrupted} 4796@d LR_box==eTeX_aux {prototype box for display} 4797@d delim_ptr==eTeX_aux {most recent left or right noad of a math left group} 4798@d prev_graf==cur_list.pg_field {number of paragraph lines accumulated} 4799@d aux==cur_list.aux_field {auxiliary data about the current list} 4800@d prev_depth==aux.sc {the name of |aux| in vertical mode} 4801@d space_factor==aux.hh.lh {part of |aux| in horizontal mode} 4802@d clang==aux.hh.rh {the other part of |aux| in horizontal mode} 4803@d incompleat_noad==aux.int {the name of |aux| in math mode} 4804@d mode_line==cur_list.ml_field {source file line number at beginning of list} 4805 4806@<Glob...@>= 4807@!nest:array[0..nest_size] of list_state_record; 4808@!nest_ptr:0..nest_size; {first unused location of |nest|} 4809@!max_nest_stack:0..nest_size; {maximum of |nest_ptr| when pushing} 4810@!cur_list:list_state_record; {the ``top'' semantic state} 4811@!shown_mode:-mmode..mmode; {most recent mode shown by \.{\\tracingcommands}} 4812 4813@ Here is a common way to make the current list grow: 4814 4815@d tail_append(#)==begin link(tail):=#; tail:=link(tail); 4816 end 4817 4818@ We will see later that the vertical list at the bottom semantic level is split 4819into two parts; the ``current page'' runs from |page_head| to |page_tail|, 4820and the ``contribution list'' runs from |contrib_head| to |tail| of 4821semantic level zero. The idea is that contributions are first formed in 4822vertical mode, then ``contributed'' to the current page (during which time 4823the page-breaking decisions are made). For now, we don't need to know 4824any more details about the page-building process. 4825 4826@<Set init...@>= 4827nest_ptr:=0; max_nest_stack:=0; 4828mode:=vmode; head:=contrib_head; tail:=contrib_head; 4829eTeX_aux:=null; 4830prev_depth:=ignore_depth; mode_line:=0; 4831prev_graf:=0; shown_mode:=0; 4832@<Start a new current page@>; 4833 4834@ When \TeX's work on one level is interrupted, the state is saved by 4835calling |push_nest|. This routine changes |head| and |tail| so that 4836a new (empty) list is begun; it does not change |mode| or |aux|. 4837 4838@p procedure push_nest; {enter a new semantic level, save the old} 4839begin if nest_ptr>max_nest_stack then 4840 begin max_nest_stack:=nest_ptr; 4841 if nest_ptr=nest_size then overflow("semantic nest size",nest_size); 4842@:TeX capacity exceeded semantic nest size}{\quad semantic nest size@> 4843 end; 4844nest[nest_ptr]:=cur_list; {stack the record} 4845incr(nest_ptr); head:=get_avail; tail:=head; prev_graf:=0; mode_line:=line; 4846eTeX_aux:=null; 4847end; 4848 4849@ Conversely, when \TeX\ is finished on the current level, the former 4850state is restored by calling |pop_nest|. This routine will never be 4851called at the lowest semantic level, nor will it be called unless |head| 4852is a node that should be returned to free memory. 4853 4854@p procedure pop_nest; {leave a semantic level, re-enter the old} 4855begin free_avail(head); decr(nest_ptr); cur_list:=nest[nest_ptr]; 4856end; 4857 4858@ Here is a procedure that displays what \TeX\ is working on, at all levels. 4859 4860@p procedure@?print_totals; forward;@t\2@> 4861procedure show_activities; 4862var p:0..nest_size; {index into |nest|} 4863@!m:-mmode..mmode; {mode} 4864@!a:memory_word; {auxiliary} 4865@!q,@!r:pointer; {for showing the current page} 4866@!t:integer; {ditto} 4867begin nest[nest_ptr]:=cur_list; {put the top level into the array} 4868print_nl(""); print_ln; 4869for p:=nest_ptr downto 0 do 4870 begin m:=nest[p].mode_field; a:=nest[p].aux_field; 4871 print_nl("### "); print_mode(m); 4872 print(" entered at line "); print_int(abs(nest[p].ml_field)); 4873 if m=hmode then if nest[p].pg_field <> @'40600000 then 4874 begin print(" (language"); print_int(nest[p].pg_field mod @'200000); 4875 print(":hyphenmin"); print_int(nest[p].pg_field div @'20000000); 4876 print_char(","); print_int((nest[p].pg_field div @'200000) mod @'100); 4877 print_char(")"); 4878 end; 4879 if nest[p].ml_field<0 then print(" (\output routine)"); 4880 if p=0 then 4881 begin @<Show the status of the current page@>; 4882 if link(contrib_head)<>null then 4883 print_nl("### recent contributions:"); 4884 end; 4885 show_box(link(nest[p].head_field)); 4886 @<Show the auxiliary field, |a|@>; 4887 end; 4888end; 4889 4890@ @<Show the auxiliary...@>= 4891case abs(m) div (max_command+1) of 48920: begin print_nl("prevdepth "); 4893 if a.sc<=ignore_depth then print("ignored") 4894 else print_scaled(a.sc); 4895 if nest[p].pg_field<>0 then 4896 begin print(", prevgraf "); 4897 print_int(nest[p].pg_field); print(" line"); 4898 if nest[p].pg_field<>1 then print_char("s"); 4899 end; 4900 end; 49011: begin print_nl("spacefactor "); print_int(a.hh.lh); 4902 if m>0 then@+ if a.hh.rh>0 then 4903 begin print(", current language "); print_int(a.hh.rh);@+ 4904 end; 4905 end; 49062: if a.int<>null then 4907 begin print("this will be denominator of:"); show_box(a.int);@+ 4908 end; 4909end {there are no other cases} 4910 4911@* \[17] The table of equivalents. 4912Now that we have studied the data structures for \TeX's semantic routines, 4913we ought to consider the data structures used by its syntactic routines. In 4914other words, our next concern will be 4915the tables that \TeX\ looks at when it is scanning 4916what the user has written. 4917 4918The biggest and most important such table is called |eqtb|. It holds the 4919current ``equivalents'' of things; i.e., it explains what things mean 4920or what their current values are, for all quantities that are subject to 4921the nesting structure provided by \TeX's grouping mechanism. There are six 4922parts to |eqtb|: 4923 4924\yskip\hangg 1) |eqtb[active_base..(hash_base-1)]| holds the current 4925equivalents of single-character control sequences. 4926 4927\yskip\hangg 2) |eqtb[hash_base..(glue_base-1)]| holds the current 4928equivalents of multiletter control sequences. 4929 4930\yskip\hangg 3) |eqtb[glue_base..(local_base-1)]| holds the current 4931equivalents of glue parameters like the current baselineskip. 4932 4933\yskip\hangg 4) |eqtb[local_base..(int_base-1)]| holds the current 4934equivalents of local halfword quantities like the current box registers, 4935the current ``catcodes,'' the current font, and a pointer to the current 4936paragraph shape. 4937 4938\yskip\hangg 5) |eqtb[int_base..(dimen_base-1)]| holds the current 4939equivalents of fullword integer parameters like the current hyphenation 4940penalty. 4941 4942\yskip\hangg 6) |eqtb[dimen_base..eqtb_size]| holds the current equivalents 4943of fullword dimension parameters like the current hsize or amount of 4944hanging indentation. 4945 4946\yskip\noindent Note that, for example, the current amount of 4947baselineskip glue is determined by the setting of a particular location 4948in region~3 of |eqtb|, while the current meaning of the control sequence 4949`\.{\\baselineskip}' (which might have been changed by \.{\\def} or 4950\.{\\let}) appears in region~2. 4951 4952@ Each entry in |eqtb| is a |memory_word|. Most of these words are of type 4953|two_halves|, and subdivided into three fields: 4954 4955\yskip\hangg 1) The |eq_level| (a quarterword) is the level of grouping at 4956which this equivalent was defined. If the level is |level_zero|, the 4957equivalent has never been defined; |level_one| refers to the outer level 4958(outside of all groups), and this level is also used for global 4959definitions that never go away. Higher levels are for equivalents that 4960will disappear at the end of their group. @^global definitions@> 4961 4962\yskip\hangg 2) The |eq_type| (another quarterword) specifies what kind of 4963entry this is. There are many types, since each \TeX\ primitive like 4964\.{\\hbox}, \.{\\def}, etc., has its own special code. The list of 4965command codes above includes all possible settings of the |eq_type| field. 4966 4967\yskip\hangg 3) The |equiv| (a halfword) is the current equivalent value. 4968This may be a font number, a pointer into |mem|, or a variety of other 4969things. 4970 4971@d eq_level_field(#)==#.hh.b1 4972@d eq_type_field(#)==#.hh.b0 4973@d equiv_field(#)==#.hh.rh 4974@d eq_level(#)==eq_level_field(eqtb[#]) {level of definition} 4975@d eq_type(#)==eq_type_field(eqtb[#]) {command code for equivalent} 4976@d equiv(#)==equiv_field(eqtb[#]) {equivalent value} 4977@d level_zero=min_quarterword {level for undefined quantities} 4978@d level_one=level_zero+1 {outermost level for defined quantities} 4979 4980@ Many locations in |eqtb| have symbolic names. The purpose of the next 4981paragraphs is to define these names, and to set up the initial values of the 4982equivalents. 4983 4984In the first region we have |number_chars| equivalents for ``active characters'' 4985that act as control sequences, followed by |number_chars| equivalents for 4986single-character control sequences. 4987 4988Then comes region~2, which corresponds to the hash table that we will 4989define later. The maximum address in this region is used for a dummy 4990control sequence that is perpetually undefined. There also are several 4991locations for control sequences that are perpetually defined 4992(since they are used in error recovery). 4993 4994@d active_base=1 {beginning of region 1, for active character equivalents} 4995@d single_base=active_base+number_usvs 4996 {equivalents of one-character control sequences} 4997@d null_cs=single_base+number_usvs {equivalent of \.{\\csname\\endcsname}} 4998@d hash_base=null_cs+1 {beginning of region 2, for the hash table} 4999@d frozen_control_sequence=hash_base+hash_size {for error recovery} 5000@d frozen_protection=frozen_control_sequence {inaccessible but definable} 5001@d frozen_cr=frozen_control_sequence+1 {permanent `\.{\\cr}'} 5002@d frozen_end_group=frozen_control_sequence+2 {permanent `\.{\\endgroup}'} 5003@d frozen_right=frozen_control_sequence+3 {permanent `\.{\\right}'} 5004@d frozen_fi=frozen_control_sequence+4 {permanent `\.{\\fi}'} 5005@d frozen_end_template=frozen_control_sequence+5 {permanent `\.{\\endtemplate}'} 5006@d frozen_endv=frozen_control_sequence+6 {second permanent `\.{\\endtemplate}'} 5007@d frozen_relax=frozen_control_sequence+7 {permanent `\.{\\relax}'} 5008@d end_write=frozen_control_sequence+8 {permanent `\.{\\endwrite}'} 5009@d frozen_dont_expand=frozen_control_sequence+9 5010 {permanent `\.{\\notexpanded:}'} 5011@d frozen_null_font=frozen_control_sequence+10 5012 {permanent `\.{\\nullfont}'} 5013@d font_id_base=frozen_null_font-font_base 5014 {begins table of 257 permanent font identifiers} 5015@d undefined_control_sequence=frozen_null_font+257 {dummy location} 5016@d glue_base=undefined_control_sequence+1 {beginning of region 3} 5017 5018@<Initialize table entries...@>= 5019eq_type(undefined_control_sequence):=undefined_cs; 5020equiv(undefined_control_sequence):=null; 5021eq_level(undefined_control_sequence):=level_zero; 5022for k:=active_base to undefined_control_sequence-1 do 5023 eqtb[k]:=eqtb[undefined_control_sequence]; 5024 5025@ Here is a routine that displays the current meaning of an |eqtb| entry 5026in region 1 or~2. (Similar routines for the other regions will appear 5027below.) 5028 5029@<Show equivalent |n|, in region 1 or 2@>= 5030begin sprint_cs(n); print_char("="); print_cmd_chr(eq_type(n),equiv(n)); 5031if eq_type(n)>=call then 5032 begin print_char(":"); show_token_list(link(equiv(n)),null,32); 5033 end; 5034end 5035 5036@ Region 3 of |eqtb| contains the |number_regs| \.{\\skip} registers, 5037as well as the glue parameters defined here. It is important that the 5038``muskip'' parameters have larger numbers than the others. 5039 5040@d line_skip_code=0 {interline glue if |baseline_skip| is infeasible} 5041@d baseline_skip_code=1 {desired glue between baselines} 5042@d par_skip_code=2 {extra glue just above a paragraph} 5043@d above_display_skip_code=3 {extra glue just above displayed math} 5044@d below_display_skip_code=4 {extra glue just below displayed math} 5045@d above_display_short_skip_code=5 5046 {glue above displayed math following short lines} 5047@d below_display_short_skip_code=6 5048 {glue below displayed math following short lines} 5049@d left_skip_code=7 {glue at left of justified lines} 5050@d right_skip_code=8 {glue at right of justified lines} 5051@d top_skip_code=9 {glue at top of main pages} 5052@d split_top_skip_code=10 {glue at top of split pages} 5053@d tab_skip_code=11 {glue between aligned entries} 5054@d space_skip_code=12 {glue between words (if not |zero_glue|)} 5055@d xspace_skip_code=13 {glue after sentences (if not |zero_glue|)} 5056@d par_fill_skip_code=14 {glue on last line of paragraph} 5057@d XeTeX_linebreak_skip_code=15 {glue introduced at potential linebreak location} 5058@d thin_mu_skip_code=16 {thin space in math formula} 5059@d med_mu_skip_code=17 {medium space in math formula} 5060@d thick_mu_skip_code=18 {thick space in math formula} 5061@d glue_pars=19 {total number of glue parameters} 5062@d skip_base=glue_base+glue_pars {table of |number_regs| ``skip'' registers} 5063@d mu_skip_base=skip_base+number_regs 5064 {table of |number_regs| ``muskip'' registers} 5065@d local_base=mu_skip_base+number_regs {beginning of region 4} 5066@# 5067@d skip(#)==equiv(skip_base+#) {|mem| location of glue specification} 5068@d mu_skip(#)==equiv(mu_skip_base+#) {|mem| location of math glue spec} 5069@d glue_par(#)==equiv(glue_base+#) {|mem| location of glue specification} 5070@d line_skip==glue_par(line_skip_code) 5071@d baseline_skip==glue_par(baseline_skip_code) 5072@d par_skip==glue_par(par_skip_code) 5073@d above_display_skip==glue_par(above_display_skip_code) 5074@d below_display_skip==glue_par(below_display_skip_code) 5075@d above_display_short_skip==glue_par(above_display_short_skip_code) 5076@d below_display_short_skip==glue_par(below_display_short_skip_code) 5077@d left_skip==glue_par(left_skip_code) 5078@d right_skip==glue_par(right_skip_code) 5079@d top_skip==glue_par(top_skip_code) 5080@d split_top_skip==glue_par(split_top_skip_code) 5081@d tab_skip==glue_par(tab_skip_code) 5082@d space_skip==glue_par(space_skip_code) 5083@d xspace_skip==glue_par(xspace_skip_code) 5084@d par_fill_skip==glue_par(par_fill_skip_code) 5085@d XeTeX_linebreak_skip==glue_par(XeTeX_linebreak_skip_code) 5086@d thin_mu_skip==glue_par(thin_mu_skip_code) 5087@d med_mu_skip==glue_par(med_mu_skip_code) 5088@d thick_mu_skip==glue_par(thick_mu_skip_code) 5089 5090@<Current |mem| equivalent of glue parameter number |n|@>=glue_par(n) 5091 5092@ Sometimes we need to convert \TeX's internal code numbers into symbolic 5093form. The |print_skip_param| routine gives the symbolic name of a glue 5094parameter. 5095 5096@<Declare the procedure called |print_skip_param|@>= 5097procedure print_skip_param(@!n:integer); 5098begin case n of 5099line_skip_code: print_esc("lineskip"); 5100baseline_skip_code: print_esc("baselineskip"); 5101par_skip_code: print_esc("parskip"); 5102above_display_skip_code: print_esc("abovedisplayskip"); 5103below_display_skip_code: print_esc("belowdisplayskip"); 5104above_display_short_skip_code: print_esc("abovedisplayshortskip"); 5105below_display_short_skip_code: print_esc("belowdisplayshortskip"); 5106left_skip_code: print_esc("leftskip"); 5107right_skip_code: print_esc("rightskip"); 5108top_skip_code: print_esc("topskip"); 5109split_top_skip_code: print_esc("splittopskip"); 5110tab_skip_code: print_esc("tabskip"); 5111space_skip_code: print_esc("spaceskip"); 5112xspace_skip_code: print_esc("xspaceskip"); 5113par_fill_skip_code: print_esc("parfillskip"); 5114XeTeX_linebreak_skip_code: print_esc("XeTeXlinebreakskip"); 5115thin_mu_skip_code: print_esc("thinmuskip"); 5116med_mu_skip_code: print_esc("medmuskip"); 5117thick_mu_skip_code: print_esc("thickmuskip"); 5118othercases print("[unknown glue parameter!]") 5119endcases; 5120end; 5121 5122@ The symbolic names for glue parameters are put into \TeX's hash table 5123by using the routine called |primitive|, defined below. Let us enter them 5124now, so that we don't have to list all those parameter names anywhere else. 5125 5126@<Put each of \TeX's primitives into the hash table@>= 5127primitive("lineskip",assign_glue,glue_base+line_skip_code);@/ 5128@!@:line_skip_}{\.{\\lineskip} primitive@> 5129primitive("baselineskip",assign_glue,glue_base+baseline_skip_code);@/ 5130@!@:baseline_skip_}{\.{\\baselineskip} primitive@> 5131primitive("parskip",assign_glue,glue_base+par_skip_code);@/ 5132@!@:par_skip_}{\.{\\parskip} primitive@> 5133primitive("abovedisplayskip",assign_glue,glue_base+above_display_skip_code);@/ 5134@!@:above_display_skip_}{\.{\\abovedisplayskip} primitive@> 5135primitive("belowdisplayskip",assign_glue,glue_base+below_display_skip_code);@/ 5136@!@:below_display_skip_}{\.{\\belowdisplayskip} primitive@> 5137primitive("abovedisplayshortskip", 5138 assign_glue,glue_base+above_display_short_skip_code);@/ 5139@!@:above_display_short_skip_}{\.{\\abovedisplayshortskip} primitive@> 5140primitive("belowdisplayshortskip", 5141 assign_glue,glue_base+below_display_short_skip_code);@/ 5142@!@:below_display_short_skip_}{\.{\\belowdisplayshortskip} primitive@> 5143primitive("leftskip",assign_glue,glue_base+left_skip_code);@/ 5144@!@:left_skip_}{\.{\\leftskip} primitive@> 5145primitive("rightskip",assign_glue,glue_base+right_skip_code);@/ 5146@!@:right_skip_}{\.{\\rightskip} primitive@> 5147primitive("topskip",assign_glue,glue_base+top_skip_code);@/ 5148@!@:top_skip_}{\.{\\topskip} primitive@> 5149primitive("splittopskip",assign_glue,glue_base+split_top_skip_code);@/ 5150@!@:split_top_skip_}{\.{\\splittopskip} primitive@> 5151primitive("tabskip",assign_glue,glue_base+tab_skip_code);@/ 5152@!@:tab_skip_}{\.{\\tabskip} primitive@> 5153primitive("spaceskip",assign_glue,glue_base+space_skip_code);@/ 5154@!@:space_skip_}{\.{\\spaceskip} primitive@> 5155primitive("xspaceskip",assign_glue,glue_base+xspace_skip_code);@/ 5156@!@:xspace_skip_}{\.{\\xspaceskip} primitive@> 5157primitive("parfillskip",assign_glue,glue_base+par_fill_skip_code);@/ 5158@!@:par_fill_skip_}{\.{\\parfillskip} primitive@> 5159primitive("XeTeXlinebreakskip",assign_glue,glue_base+XeTeX_linebreak_skip_code);@/ 5160@!@:XeTeX_linebreak_skip_}{\.{\\XeTeXlinebreakskip} primitive@> 5161primitive("thinmuskip",assign_mu_glue,glue_base+thin_mu_skip_code);@/ 5162@!@:thin_mu_skip_}{\.{\\thinmuskip} primitive@> 5163primitive("medmuskip",assign_mu_glue,glue_base+med_mu_skip_code);@/ 5164@!@:med_mu_skip_}{\.{\\medmuskip} primitive@> 5165primitive("thickmuskip",assign_mu_glue,glue_base+thick_mu_skip_code);@/ 5166@!@:thick_mu_skip_}{\.{\\thickmuskip} primitive@> 5167 5168@ @<Cases of |print_cmd_chr| for symbolic printing of primitives@>= 5169assign_glue,assign_mu_glue: if chr_code<skip_base then 5170 print_skip_param(chr_code-glue_base) 5171 else if chr_code<mu_skip_base then 5172 begin print_esc("skip"); print_int(chr_code-skip_base); 5173 end 5174 else begin print_esc("muskip"); print_int(chr_code-mu_skip_base); 5175 end; 5176 5177@ All glue parameters and registers are initially `\.{0pt plus0pt minus0pt}'. 5178 5179@<Initialize table entries...@>= 5180equiv(glue_base):=zero_glue; eq_level(glue_base):=level_one; 5181eq_type(glue_base):=glue_ref; 5182for k:=glue_base+1 to local_base-1 do eqtb[k]:=eqtb[glue_base]; 5183glue_ref_count(zero_glue):=glue_ref_count(zero_glue)+local_base-glue_base; 5184 5185@ @<Show equivalent |n|, in region 3@>= 5186if n<skip_base then 5187 begin print_skip_param(n-glue_base); print_char("="); 5188 if n<glue_base+thin_mu_skip_code then print_spec(equiv(n),"pt") 5189 else print_spec(equiv(n),"mu"); 5190 end 5191else if n<mu_skip_base then 5192 begin print_esc("skip"); print_int(n-skip_base); print_char("="); 5193 print_spec(equiv(n),"pt"); 5194 end 5195else begin print_esc("muskip"); print_int(n-mu_skip_base); print_char("="); 5196 print_spec(equiv(n),"mu"); 5197 end 5198 5199@ Region 4 of |eqtb| contains the local quantities defined here. The 5200bulk of this region is taken up by five tables that are indexed by eight-bit 5201characters; these tables are important to both the syntactic and semantic 5202portions of \TeX. There are also a bunch of special things like font and 5203token parameters, as well as the tables of \.{\\toks} and \.{\\box} 5204registers. 5205 5206@d par_shape_loc=local_base {specifies paragraph shape} 5207@d output_routine_loc=local_base+1 {points to token list for \.{\\output}} 5208@d every_par_loc=local_base+2 {points to token list for \.{\\everypar}} 5209@d every_math_loc=local_base+3 {points to token list for \.{\\everymath}} 5210@d every_display_loc=local_base+4 {points to token list for \.{\\everydisplay}} 5211@d every_hbox_loc=local_base+5 {points to token list for \.{\\everyhbox}} 5212@d every_vbox_loc=local_base+6 {points to token list for \.{\\everyvbox}} 5213@d every_job_loc=local_base+7 {points to token list for \.{\\everyjob}} 5214@d every_cr_loc=local_base+8 {points to token list for \.{\\everycr}} 5215@d err_help_loc=local_base+9 {points to token list for \.{\\errhelp}} 5216@d tex_toks=local_base+10 {end of \TeX's token list parameters} 5217@# 5218@d etex_toks_base=tex_toks {base for \eTeX's token list parameters} 5219@d every_eof_loc=etex_toks_base {points to token list for \.{\\everyeof}} 5220@d XeTeX_inter_char_loc=every_eof_loc+1 {not really used, but serves as a flag} 5221@d etex_toks=XeTeX_inter_char_loc+1 {end of \eTeX's token list parameters} 5222@# 5223@d toks_base=etex_toks {table of |number_regs| token list registers} 5224@# 5225@d etex_pen_base=toks_base+number_regs {start of table of \eTeX's penalties} 5226@d inter_line_penalties_loc=etex_pen_base {additional penalties between lines} 5227@d club_penalties_loc=etex_pen_base+1 {penalties for creating club lines} 5228@d widow_penalties_loc=etex_pen_base+2 {penalties for creating widow lines} 5229@d display_widow_penalties_loc=etex_pen_base+3 {ditto, just before a display} 5230@d etex_pens=etex_pen_base+4 {end of table of \eTeX's penalties} 5231@# 5232@d box_base=etex_pens {table of |number_regs| box registers} 5233@d cur_font_loc=box_base+number_regs {internal font number outside math mode} 5234@d math_font_base=cur_font_loc+1 {table of |number_math_fonts| math font numbers} 5235@d cat_code_base=math_font_base+number_math_fonts 5236 {table of |number_usvs| command codes (the ``catcodes'')} 5237@d lc_code_base=cat_code_base+number_usvs {table of |number_usvs| lowercase mappings} 5238@d uc_code_base=lc_code_base+number_usvs {table of |number_usvs| uppercase mappings} 5239@d sf_code_base=uc_code_base+number_usvs {table of |number_usvs| spacefactor mappings} 5240@d math_code_base=sf_code_base+number_usvs {table of |number_usvs| math mode mappings} 5241@d int_base=math_code_base+number_usvs {beginning of region 5} 5242@# 5243@d par_shape_ptr==equiv(par_shape_loc) 5244@d output_routine==equiv(output_routine_loc) 5245@d every_par==equiv(every_par_loc) 5246@d every_math==equiv(every_math_loc) 5247@d every_display==equiv(every_display_loc) 5248@d every_hbox==equiv(every_hbox_loc) 5249@d every_vbox==equiv(every_vbox_loc) 5250@d every_job==equiv(every_job_loc) 5251@d every_cr==equiv(every_cr_loc) 5252@d err_help==equiv(err_help_loc) 5253@d toks(#)==equiv(toks_base+#) 5254@d box(#)==equiv(box_base+#) 5255@d cur_font==equiv(cur_font_loc) 5256@d fam_fnt(#)==equiv(math_font_base+#) 5257@d cat_code(#)==equiv(cat_code_base+#) 5258@d lc_code(#)==equiv(lc_code_base+#) 5259@d uc_code(#)==equiv(uc_code_base+#) 5260@d sf_code(#)==equiv(sf_code_base+#) 5261@d math_code(#)==equiv(math_code_base+#) 5262 {Note: |math_code(c)| is the true math code plus |min_halfword|} 5263 5264@<Put each...@>= 5265primitive("output",assign_toks,output_routine_loc); 5266@!@:output_}{\.{\\output} primitive@> 5267primitive("everypar",assign_toks,every_par_loc); 5268@!@:every_par_}{\.{\\everypar} primitive@> 5269primitive("everymath",assign_toks,every_math_loc); 5270@!@:every_math_}{\.{\\everymath} primitive@> 5271primitive("everydisplay",assign_toks,every_display_loc); 5272@!@:every_display_}{\.{\\everydisplay} primitive@> 5273primitive("everyhbox",assign_toks,every_hbox_loc); 5274@!@:every_hbox_}{\.{\\everyhbox} primitive@> 5275primitive("everyvbox",assign_toks,every_vbox_loc); 5276@!@:every_vbox_}{\.{\\everyvbox} primitive@> 5277primitive("everyjob",assign_toks,every_job_loc); 5278@!@:every_job_}{\.{\\everyjob} primitive@> 5279primitive("everycr",assign_toks,every_cr_loc); 5280@!@:every_cr_}{\.{\\everycr} primitive@> 5281primitive("errhelp",assign_toks,err_help_loc); 5282@!@:err_help_}{\.{\\errhelp} primitive@> 5283 5284@ @<Cases of |print_cmd_chr|...@>= 5285assign_toks: if chr_code>=toks_base then 5286 begin print_esc("toks"); print_int(chr_code-toks_base); 5287 end 5288else case chr_code of 5289 output_routine_loc: print_esc("output"); 5290 every_par_loc: print_esc("everypar"); 5291 every_math_loc: print_esc("everymath"); 5292 every_display_loc: print_esc("everydisplay"); 5293 every_hbox_loc: print_esc("everyhbox"); 5294 every_vbox_loc: print_esc("everyvbox"); 5295 every_job_loc: print_esc("everyjob"); 5296 every_cr_loc: print_esc("everycr"); 5297 @/@<Cases of |assign_toks| for |print_cmd_chr|@>@/ 5298 othercases print_esc("errhelp") 5299 endcases; 5300 5301@ We initialize most things to null or undefined values. An undefined font 5302is represented by the internal code |font_base|. 5303 5304However, the character code tables are given initial values based on the 5305conventional interpretation of ASCII code. These initial values should 5306not be changed when \TeX\ is adapted for use with non-English languages; 5307all changes to the initialization conventions should be made in format 5308packages, not in \TeX\ itself, so that global interchange of formats is 5309possible. 5310 5311@d null_font==font_base 5312@d var_fam_class = 7 5313@d active_math_char = @"1FFFFF 5314@d is_active_math_char(#) == math_char_field(#) = active_math_char 5315@d is_var_family(#) == math_class_field(#) = 7 5316 5317@<Initialize table entries...@>= 5318par_shape_ptr:=null; eq_type(par_shape_loc):=shape_ref; 5319eq_level(par_shape_loc):=level_one;@/ 5320for k:=etex_pen_base to etex_pens-1 do 5321 eqtb[k]:=eqtb[par_shape_loc]; 5322for k:=output_routine_loc to toks_base+number_regs-1 do 5323 eqtb[k]:=eqtb[undefined_control_sequence]; 5324box(0):=null; eq_type(box_base):=box_ref; eq_level(box_base):=level_one; 5325for k:=box_base+1 to box_base+number_regs-1 do eqtb[k]:=eqtb[box_base]; 5326cur_font:=null_font; eq_type(cur_font_loc):=data; 5327eq_level(cur_font_loc):=level_one;@/ 5328for k:=math_font_base to math_font_base+number_math_fonts-1 do eqtb[k]:=eqtb[cur_font_loc]; 5329equiv(cat_code_base):=0; eq_type(cat_code_base):=data; 5330eq_level(cat_code_base):=level_one;@/ 5331for k:=cat_code_base+1 to int_base-1 do eqtb[k]:=eqtb[cat_code_base]; 5332for k:=0 to number_usvs-1 do 5333 begin cat_code(k):=other_char; math_code(k):=hi(k); sf_code(k):=1000; 5334 end; 5335cat_code(carriage_return):=car_ret; cat_code(" "):=spacer; 5336cat_code("\"):=escape; cat_code("%"):=comment; 5337cat_code(invalid_code):=invalid_char; cat_code(null_code):=ignore; 5338for k:="0" to "9" do math_code(k):=hi(k + set_class_field(var_fam_class)); 5339for k:="A" to "Z" do 5340 begin cat_code(k):=letter; cat_code(k+"a"-"A"):=letter;@/ 5341 math_code(k):=hi(k + set_family_field(1) + set_class_field(var_fam_class)); 5342 math_code(k+"a"-"A"):=hi(k+"a"-"A" + set_family_field(1) + set_class_field(var_fam_class));@/ 5343 lc_code(k):=k+"a"-"A"; lc_code(k+"a"-"A"):=k+"a"-"A";@/ 5344 uc_code(k):=k; uc_code(k+"a"-"A"):=k;@/ 5345 sf_code(k):=999; 5346 end; 5347 5348@ @<Show equivalent |n|, in region 4@>= 5349if (n=par_shape_loc) or ((n>=etex_pen_base) and (n<etex_pens)) then 5350 begin print_cmd_chr(set_shape,n); print_char("="); 5351 if equiv(n)=null then print_char("0") 5352 else if n>par_shape_loc then 5353 begin print_int(penalty(equiv(n))); print_char(" "); 5354 print_int(penalty(equiv(n)+1)); 5355 if penalty(equiv(n))>1 then print_esc("ETC."); 5356 end 5357 else print_int(info(par_shape_ptr)); 5358 end 5359else if n<toks_base then 5360 begin print_cmd_chr(assign_toks,n); print_char("="); 5361 if equiv(n)<>null then show_token_list(link(equiv(n)),null,32); 5362 end 5363else if n<box_base then 5364 begin print_esc("toks"); print_int(n-toks_base); print_char("="); 5365 if equiv(n)<>null then show_token_list(link(equiv(n)),null,32); 5366 end 5367else if n<cur_font_loc then 5368 begin print_esc("box"); print_int(n-box_base); print_char("="); 5369 if equiv(n)=null then print("void") 5370 else begin depth_threshold:=0; breadth_max:=1; show_node_list(equiv(n)); 5371 end; 5372 end 5373else if n<cat_code_base then @<Show the font identifier in |eqtb[n]|@> 5374else @<Show the halfword code in |eqtb[n]|@> 5375 5376@ @<Show the font identifier in |eqtb[n]|@>= 5377begin if n=cur_font_loc then print("current font") 5378else if n<math_font_base+script_size then 5379 begin print_esc("textfont"); print_int(n-math_font_base); 5380 end 5381else if n<math_font_base+script_script_size then 5382 begin print_esc("scriptfont"); print_int(n-math_font_base-script_size); 5383 end 5384else begin print_esc("scriptscriptfont"); 5385 print_int(n-math_font_base-script_script_size); 5386 end; 5387print_char("=");@/ 5388print_esc(hash[font_id_base+equiv(n)].rh); 5389 {that's |font_id_text(equiv(n))|} 5390end 5391 5392@ @<Show the halfword code in |eqtb[n]|@>= 5393if n<math_code_base then 5394 begin if n<lc_code_base then 5395 begin print_esc("catcode"); print_int(n-cat_code_base); 5396 end 5397 else if n<uc_code_base then 5398 begin print_esc("lccode"); print_int(n-lc_code_base); 5399 end 5400 else if n<sf_code_base then 5401 begin print_esc("uccode"); print_int(n-uc_code_base); 5402 end 5403 else begin print_esc("sfcode"); print_int(n-sf_code_base); 5404 end; 5405 print_char("="); print_int(equiv(n)); 5406 end 5407else begin print_esc("mathcode"); print_int(n-math_code_base); 5408 print_char("="); print_int(ho(equiv(n))); 5409 end 5410 5411@ Region 5 of |eqtb| contains the integer parameters and registers defined 5412here, as well as the |del_code| table. The latter table differs from the 5413|cat_code..math_code| tables that precede it, since delimiter codes are 5414fullword integers while the other kinds of codes occupy at most a 5415halfword. This is what makes region~5 different from region~4. We will 5416store the |eq_level| information in an auxiliary array of quarterwords 5417that will be defined later. 5418 5419@d pretolerance_code=0 {badness tolerance before hyphenation} 5420@d tolerance_code=1 {badness tolerance after hyphenation} 5421@d line_penalty_code=2 {added to the badness of every line} 5422@d hyphen_penalty_code=3 {penalty for break after discretionary hyphen} 5423@d ex_hyphen_penalty_code=4 {penalty for break after explicit hyphen} 5424@d club_penalty_code=5 {penalty for creating a club line} 5425@d widow_penalty_code=6 {penalty for creating a widow line} 5426@d display_widow_penalty_code=7 {ditto, just before a display} 5427@d broken_penalty_code=8 {penalty for breaking a page at a broken line} 5428@d bin_op_penalty_code=9 {penalty for breaking after a binary operation} 5429@d rel_penalty_code=10 {penalty for breaking after a relation} 5430@d pre_display_penalty_code=11 5431 {penalty for breaking just before a displayed formula} 5432@d post_display_penalty_code=12 5433 {penalty for breaking just after a displayed formula} 5434@d inter_line_penalty_code=13 {additional penalty between lines} 5435@d double_hyphen_demerits_code=14 {demerits for double hyphen break} 5436@d final_hyphen_demerits_code=15 {demerits for final hyphen break} 5437@d adj_demerits_code=16 {demerits for adjacent incompatible lines} 5438@d mag_code=17 {magnification ratio} 5439@d delimiter_factor_code=18 {ratio for variable-size delimiters} 5440@d looseness_code=19 {change in number of lines for a paragraph} 5441@d time_code=20 {current time of day} 5442@d day_code=21 {current day of the month} 5443@d month_code=22 {current month of the year} 5444@d year_code=23 {current year of our Lord} 5445@d show_box_breadth_code=24 {nodes per level in |show_box|} 5446@d show_box_depth_code=25 {maximum level in |show_box|} 5447@d hbadness_code=26 {hboxes exceeding this badness will be shown by |hpack|} 5448@d vbadness_code=27 {vboxes exceeding this badness will be shown by |vpack|} 5449@d pausing_code=28 {pause after each line is read from a file} 5450@d tracing_online_code=29 {show diagnostic output on terminal} 5451@d tracing_macros_code=30 {show macros as they are being expanded} 5452@d tracing_stats_code=31 {show memory usage if \TeX\ knows it} 5453@d tracing_paragraphs_code=32 {show line-break calculations} 5454@d tracing_pages_code=33 {show page-break calculations} 5455@d tracing_output_code=34 {show boxes when they are shipped out} 5456@d tracing_lost_chars_code=35 {show characters that aren't in the font} 5457@d tracing_commands_code=36 {show command codes at |big_switch|} 5458@d tracing_restores_code=37 {show equivalents when they are restored} 5459@d uc_hyph_code=38 {hyphenate words beginning with a capital letter} 5460@d output_penalty_code=39 {penalty found at current page break} 5461@d max_dead_cycles_code=40 {bound on consecutive dead cycles of output} 5462@d hang_after_code=41 {hanging indentation changes after this many lines} 5463@d floating_penalty_code=42 {penalty for insertions heldover after a split} 5464@d global_defs_code=43 {override \.{\\global} specifications} 5465@d cur_fam_code=44 {current family} 5466@d escape_char_code=45 {escape character for token output} 5467@d default_hyphen_char_code=46 {value of \.{\\hyphenchar} when a font is loaded} 5468@d default_skew_char_code=47 {value of \.{\\skewchar} when a font is loaded} 5469@d end_line_char_code=48 {character placed at the right end of the buffer} 5470@d new_line_char_code=49 {character that prints as |print_ln|} 5471@d language_code=50 {current hyphenation table} 5472@d left_hyphen_min_code=51 {minimum left hyphenation fragment size} 5473@d right_hyphen_min_code=52 {minimum right hyphenation fragment size} 5474@d holding_inserts_code=53 {do not remove insertion nodes from \.{\\box255}} 5475@d error_context_lines_code=54 {maximum intermediate line pairs shown} 5476@d tex_int_pars=55 {total number of \TeX's integer parameters} 5477@# 5478@d etex_int_base=tex_int_pars {base for \eTeX's integer parameters} 5479@d tracing_assigns_code=etex_int_base {show assignments} 5480@d tracing_groups_code=etex_int_base+1 {show save/restore groups} 5481@d tracing_ifs_code=etex_int_base+2 {show conditionals} 5482@d tracing_scan_tokens_code=etex_int_base+3 {show pseudo file open and close} 5483@d tracing_nesting_code=etex_int_base+4 {show incomplete groups and ifs within files} 5484@d pre_display_direction_code=etex_int_base+5 {text direction preceding a display} 5485@d last_line_fit_code=etex_int_base+6 {adjustment for last line of paragraph} 5486@d saving_vdiscards_code=etex_int_base+7 {save items discarded from vlists} 5487@d saving_hyph_codes_code=etex_int_base+8 {save hyphenation codes for languages} 5488@d suppress_fontnotfound_error_code=etex_int_base+9 {suppress errors for missing fonts} 5489@d XeTeX_linebreak_locale_code=etex_int_base+10 {string number of locale to use for linebreak locations} 5490@d XeTeX_linebreak_penalty_code=etex_int_base+11 {penalty to use at locale-dependent linebreak locations} 5491@d XeTeX_protrude_chars_code=etex_int_base+12 {protrude chars at left/right edge of paragraphs} 5492@d eTeX_state_code=etex_int_base+13 {\eTeX\ state variables} 5493@d etex_int_pars=eTeX_state_code+eTeX_states {total number of \eTeX's integer parameters} 5494@# 5495@d int_pars=etex_int_pars {total number of integer parameters} 5496@d count_base=int_base+int_pars {|number_regs| user \.{\\count} registers} 5497@d del_code_base=count_base+number_regs {|number_usvs| delimiter code mappings} 5498@d dimen_base=del_code_base+number_usvs {beginning of region 6} 5499@# 5500@d del_code(#)==eqtb[del_code_base+#].int 5501@d count(#)==eqtb[count_base+#].int 5502@d int_par(#)==eqtb[int_base+#].int {an integer parameter} 5503@d pretolerance==int_par(pretolerance_code) 5504@d tolerance==int_par(tolerance_code) 5505@d line_penalty==int_par(line_penalty_code) 5506@d hyphen_penalty==int_par(hyphen_penalty_code) 5507@d ex_hyphen_penalty==int_par(ex_hyphen_penalty_code) 5508@d club_penalty==int_par(club_penalty_code) 5509@d widow_penalty==int_par(widow_penalty_code) 5510@d display_widow_penalty==int_par(display_widow_penalty_code) 5511@d broken_penalty==int_par(broken_penalty_code) 5512@d bin_op_penalty==int_par(bin_op_penalty_code) 5513@d rel_penalty==int_par(rel_penalty_code) 5514@d pre_display_penalty==int_par(pre_display_penalty_code) 5515@d post_display_penalty==int_par(post_display_penalty_code) 5516@d inter_line_penalty==int_par(inter_line_penalty_code) 5517@d double_hyphen_demerits==int_par(double_hyphen_demerits_code) 5518@d final_hyphen_demerits==int_par(final_hyphen_demerits_code) 5519@d adj_demerits==int_par(adj_demerits_code) 5520@d mag==int_par(mag_code) 5521@d delimiter_factor==int_par(delimiter_factor_code) 5522@d looseness==int_par(looseness_code) 5523@d time==int_par(time_code) 5524@d day==int_par(day_code) 5525@d month==int_par(month_code) 5526@d year==int_par(year_code) 5527@d show_box_breadth==int_par(show_box_breadth_code) 5528@d show_box_depth==int_par(show_box_depth_code) 5529@d hbadness==int_par(hbadness_code) 5530@d vbadness==int_par(vbadness_code) 5531@d pausing==int_par(pausing_code) 5532@d tracing_online==int_par(tracing_online_code) 5533@d tracing_macros==int_par(tracing_macros_code) 5534@d tracing_stats==int_par(tracing_stats_code) 5535@d tracing_paragraphs==int_par(tracing_paragraphs_code) 5536@d tracing_pages==int_par(tracing_pages_code) 5537@d tracing_output==int_par(tracing_output_code) 5538@d tracing_lost_chars==int_par(tracing_lost_chars_code) 5539@d tracing_commands==int_par(tracing_commands_code) 5540@d tracing_restores==int_par(tracing_restores_code) 5541@d uc_hyph==int_par(uc_hyph_code) 5542@d output_penalty==int_par(output_penalty_code) 5543@d max_dead_cycles==int_par(max_dead_cycles_code) 5544@d hang_after==int_par(hang_after_code) 5545@d floating_penalty==int_par(floating_penalty_code) 5546@d global_defs==int_par(global_defs_code) 5547@d cur_fam==int_par(cur_fam_code) 5548@d escape_char==int_par(escape_char_code) 5549@d default_hyphen_char==int_par(default_hyphen_char_code) 5550@d default_skew_char==int_par(default_skew_char_code) 5551@d end_line_char==int_par(end_line_char_code) 5552@d new_line_char==int_par(new_line_char_code) 5553@d language==int_par(language_code) 5554@d left_hyphen_min==int_par(left_hyphen_min_code) 5555@d right_hyphen_min==int_par(right_hyphen_min_code) 5556@d holding_inserts==int_par(holding_inserts_code) 5557@d error_context_lines==int_par(error_context_lines_code) 5558@# 5559@d tracing_assigns==int_par(tracing_assigns_code) 5560@d tracing_groups==int_par(tracing_groups_code) 5561@d tracing_ifs==int_par(tracing_ifs_code) 5562@d tracing_scan_tokens==int_par(tracing_scan_tokens_code) 5563@d tracing_nesting==int_par(tracing_nesting_code) 5564@d pre_display_direction==int_par(pre_display_direction_code) 5565@d last_line_fit==int_par(last_line_fit_code) 5566@d saving_vdiscards==int_par(saving_vdiscards_code) 5567@d saving_hyph_codes==int_par(saving_hyph_codes_code) 5568@d suppress_fontnotfound_error==int_par(suppress_fontnotfound_error_code) 5569@d XeTeX_linebreak_locale==int_par(XeTeX_linebreak_locale_code) 5570@d XeTeX_linebreak_penalty==int_par(XeTeX_linebreak_penalty_code) 5571@d XeTeX_protrude_chars==int_par(XeTeX_protrude_chars_code) 5572 5573@<Assign the values |depth_threshold:=show_box_depth|...@>= 5574depth_threshold:=show_box_depth; 5575breadth_max:=show_box_breadth 5576 5577@ We can print the symbolic name of an integer parameter as follows. 5578 5579@p procedure print_param(@!n:integer); 5580begin case n of 5581pretolerance_code:print_esc("pretolerance"); 5582tolerance_code:print_esc("tolerance"); 5583line_penalty_code:print_esc("linepenalty"); 5584hyphen_penalty_code:print_esc("hyphenpenalty"); 5585ex_hyphen_penalty_code:print_esc("exhyphenpenalty"); 5586club_penalty_code:print_esc("clubpenalty"); 5587widow_penalty_code:print_esc("widowpenalty"); 5588display_widow_penalty_code:print_esc("displaywidowpenalty"); 5589broken_penalty_code:print_esc("brokenpenalty"); 5590bin_op_penalty_code:print_esc("binoppenalty"); 5591rel_penalty_code:print_esc("relpenalty"); 5592pre_display_penalty_code:print_esc("predisplaypenalty"); 5593post_display_penalty_code:print_esc("postdisplaypenalty"); 5594inter_line_penalty_code:print_esc("interlinepenalty"); 5595double_hyphen_demerits_code:print_esc("doublehyphendemerits"); 5596final_hyphen_demerits_code:print_esc("finalhyphendemerits"); 5597adj_demerits_code:print_esc("adjdemerits"); 5598mag_code:print_esc("mag"); 5599delimiter_factor_code:print_esc("delimiterfactor"); 5600looseness_code:print_esc("looseness"); 5601time_code:print_esc("time"); 5602day_code:print_esc("day"); 5603month_code:print_esc("month"); 5604year_code:print_esc("year"); 5605show_box_breadth_code:print_esc("showboxbreadth"); 5606show_box_depth_code:print_esc("showboxdepth"); 5607hbadness_code:print_esc("hbadness"); 5608vbadness_code:print_esc("vbadness"); 5609pausing_code:print_esc("pausing"); 5610tracing_online_code:print_esc("tracingonline"); 5611tracing_macros_code:print_esc("tracingmacros"); 5612tracing_stats_code:print_esc("tracingstats"); 5613tracing_paragraphs_code:print_esc("tracingparagraphs"); 5614tracing_pages_code:print_esc("tracingpages"); 5615tracing_output_code:print_esc("tracingoutput"); 5616tracing_lost_chars_code:print_esc("tracinglostchars"); 5617tracing_commands_code:print_esc("tracingcommands"); 5618tracing_restores_code:print_esc("tracingrestores"); 5619uc_hyph_code:print_esc("uchyph"); 5620output_penalty_code:print_esc("outputpenalty"); 5621max_dead_cycles_code:print_esc("maxdeadcycles"); 5622hang_after_code:print_esc("hangafter"); 5623floating_penalty_code:print_esc("floatingpenalty"); 5624global_defs_code:print_esc("globaldefs"); 5625cur_fam_code:print_esc("fam"); 5626escape_char_code:print_esc("escapechar"); 5627default_hyphen_char_code:print_esc("defaulthyphenchar"); 5628default_skew_char_code:print_esc("defaultskewchar"); 5629end_line_char_code:print_esc("endlinechar"); 5630new_line_char_code:print_esc("newlinechar"); 5631language_code:print_esc("language"); 5632left_hyphen_min_code:print_esc("lefthyphenmin"); 5633right_hyphen_min_code:print_esc("righthyphenmin"); 5634holding_inserts_code:print_esc("holdinginserts"); 5635error_context_lines_code:print_esc("errorcontextlines"); 5636XeTeX_linebreak_penalty_code:print_esc("XeTeXlinebreakpenalty"); 5637XeTeX_protrude_chars_code:print_esc("XeTeXprotrudechars"); 5638@/@<Cases for |print_param|@>@/ 5639othercases print("[unknown integer parameter!]") 5640endcases; 5641end; 5642 5643@ The integer parameter names must be entered into the hash table. 5644 5645@<Put each...@>= 5646primitive("pretolerance",assign_int,int_base+pretolerance_code);@/ 5647@!@:pretolerance_}{\.{\\pretolerance} primitive@> 5648primitive("tolerance",assign_int,int_base+tolerance_code);@/ 5649@!@:tolerance_}{\.{\\tolerance} primitive@> 5650primitive("linepenalty",assign_int,int_base+line_penalty_code);@/ 5651@!@:line_penalty_}{\.{\\linepenalty} primitive@> 5652primitive("hyphenpenalty",assign_int,int_base+hyphen_penalty_code);@/ 5653@!@:hyphen_penalty_}{\.{\\hyphenpenalty} primitive@> 5654primitive("exhyphenpenalty",assign_int,int_base+ex_hyphen_penalty_code);@/ 5655@!@:ex_hyphen_penalty_}{\.{\\exhyphenpenalty} primitive@> 5656primitive("clubpenalty",assign_int,int_base+club_penalty_code);@/ 5657@!@:club_penalty_}{\.{\\clubpenalty} primitive@> 5658primitive("widowpenalty",assign_int,int_base+widow_penalty_code);@/ 5659@!@:widow_penalty_}{\.{\\widowpenalty} primitive@> 5660primitive("displaywidowpenalty", 5661 assign_int,int_base+display_widow_penalty_code);@/ 5662@!@:display_widow_penalty_}{\.{\\displaywidowpenalty} primitive@> 5663primitive("brokenpenalty",assign_int,int_base+broken_penalty_code);@/ 5664@!@:broken_penalty_}{\.{\\brokenpenalty} primitive@> 5665primitive("binoppenalty",assign_int,int_base+bin_op_penalty_code);@/ 5666@!@:bin_op_penalty_}{\.{\\binoppenalty} primitive@> 5667primitive("relpenalty",assign_int,int_base+rel_penalty_code);@/ 5668@!@:rel_penalty_}{\.{\\relpenalty} primitive@> 5669primitive("predisplaypenalty",assign_int,int_base+pre_display_penalty_code);@/ 5670@!@:pre_display_penalty_}{\.{\\predisplaypenalty} primitive@> 5671primitive("postdisplaypenalty",assign_int,int_base+post_display_penalty_code);@/ 5672@!@:post_display_penalty_}{\.{\\postdisplaypenalty} primitive@> 5673primitive("interlinepenalty",assign_int,int_base+inter_line_penalty_code);@/ 5674@!@:inter_line_penalty_}{\.{\\interlinepenalty} primitive@> 5675primitive("doublehyphendemerits", 5676 assign_int,int_base+double_hyphen_demerits_code);@/ 5677@!@:double_hyphen_demerits_}{\.{\\doublehyphendemerits} primitive@> 5678primitive("finalhyphendemerits", 5679 assign_int,int_base+final_hyphen_demerits_code);@/ 5680@!@:final_hyphen_demerits_}{\.{\\finalhyphendemerits} primitive@> 5681primitive("adjdemerits",assign_int,int_base+adj_demerits_code);@/ 5682@!@:adj_demerits_}{\.{\\adjdemerits} primitive@> 5683primitive("mag",assign_int,int_base+mag_code);@/ 5684@!@:mag_}{\.{\\mag} primitive@> 5685primitive("delimiterfactor",assign_int,int_base+delimiter_factor_code);@/ 5686@!@:delimiter_factor_}{\.{\\delimiterfactor} primitive@> 5687primitive("looseness",assign_int,int_base+looseness_code);@/ 5688@!@:looseness_}{\.{\\looseness} primitive@> 5689primitive("time",assign_int,int_base+time_code);@/ 5690@!@:time_}{\.{\\time} primitive@> 5691primitive("day",assign_int,int_base+day_code);@/ 5692@!@:day_}{\.{\\day} primitive@> 5693primitive("month",assign_int,int_base+month_code);@/ 5694@!@:month_}{\.{\\month} primitive@> 5695primitive("year",assign_int,int_base+year_code);@/ 5696@!@:year_}{\.{\\year} primitive@> 5697primitive("showboxbreadth",assign_int,int_base+show_box_breadth_code);@/ 5698@!@:show_box_breadth_}{\.{\\showboxbreadth} primitive@> 5699primitive("showboxdepth",assign_int,int_base+show_box_depth_code);@/ 5700@!@:show_box_depth_}{\.{\\showboxdepth} primitive@> 5701primitive("hbadness",assign_int,int_base+hbadness_code);@/ 5702@!@:hbadness_}{\.{\\hbadness} primitive@> 5703primitive("vbadness",assign_int,int_base+vbadness_code);@/ 5704@!@:vbadness_}{\.{\\vbadness} primitive@> 5705primitive("pausing",assign_int,int_base+pausing_code);@/ 5706@!@:pausing_}{\.{\\pausing} primitive@> 5707primitive("tracingonline",assign_int,int_base+tracing_online_code);@/ 5708@!@:tracing_online_}{\.{\\tracingonline} primitive@> 5709primitive("tracingmacros",assign_int,int_base+tracing_macros_code);@/ 5710@!@:tracing_macros_}{\.{\\tracingmacros} primitive@> 5711primitive("tracingstats",assign_int,int_base+tracing_stats_code);@/ 5712@!@:tracing_stats_}{\.{\\tracingstats} primitive@> 5713primitive("tracingparagraphs",assign_int,int_base+tracing_paragraphs_code);@/ 5714@!@:tracing_paragraphs_}{\.{\\tracingparagraphs} primitive@> 5715primitive("tracingpages",assign_int,int_base+tracing_pages_code);@/ 5716@!@:tracing_pages_}{\.{\\tracingpages} primitive@> 5717primitive("tracingoutput",assign_int,int_base+tracing_output_code);@/ 5718@!@:tracing_output_}{\.{\\tracingoutput} primitive@> 5719primitive("tracinglostchars",assign_int,int_base+tracing_lost_chars_code);@/ 5720@!@:tracing_lost_chars_}{\.{\\tracinglostchars} primitive@> 5721primitive("tracingcommands",assign_int,int_base+tracing_commands_code);@/ 5722@!@:tracing_commands_}{\.{\\tracingcommands} primitive@> 5723primitive("tracingrestores",assign_int,int_base+tracing_restores_code);@/ 5724@!@:tracing_restores_}{\.{\\tracingrestores} primitive@> 5725primitive("uchyph",assign_int,int_base+uc_hyph_code);@/ 5726@!@:uc_hyph_}{\.{\\uchyph} primitive@> 5727primitive("outputpenalty",assign_int,int_base+output_penalty_code);@/ 5728@!@:output_penalty_}{\.{\\outputpenalty} primitive@> 5729primitive("maxdeadcycles",assign_int,int_base+max_dead_cycles_code);@/ 5730@!@:max_dead_cycles_}{\.{\\maxdeadcycles} primitive@> 5731primitive("hangafter",assign_int,int_base+hang_after_code);@/ 5732@!@:hang_after_}{\.{\\hangafter} primitive@> 5733primitive("floatingpenalty",assign_int,int_base+floating_penalty_code);@/ 5734@!@:floating_penalty_}{\.{\\floatingpenalty} primitive@> 5735primitive("globaldefs",assign_int,int_base+global_defs_code);@/ 5736@!@:global_defs_}{\.{\\globaldefs} primitive@> 5737primitive("fam",assign_int,int_base+cur_fam_code);@/ 5738@!@:fam_}{\.{\\fam} primitive@> 5739primitive("escapechar",assign_int,int_base+escape_char_code);@/ 5740@!@:escape_char_}{\.{\\escapechar} primitive@> 5741primitive("defaulthyphenchar",assign_int,int_base+default_hyphen_char_code);@/ 5742@!@:default_hyphen_char_}{\.{\\defaulthyphenchar} primitive@> 5743primitive("defaultskewchar",assign_int,int_base+default_skew_char_code);@/ 5744@!@:default_skew_char_}{\.{\\defaultskewchar} primitive@> 5745primitive("endlinechar",assign_int,int_base+end_line_char_code);@/ 5746@!@:end_line_char_}{\.{\\endlinechar} primitive@> 5747primitive("newlinechar",assign_int,int_base+new_line_char_code);@/ 5748@!@:new_line_char_}{\.{\\newlinechar} primitive@> 5749primitive("language",assign_int,int_base+language_code);@/ 5750@!@:language_}{\.{\\language} primitive@> 5751primitive("lefthyphenmin",assign_int,int_base+left_hyphen_min_code);@/ 5752@!@:left_hyphen_min_}{\.{\\lefthyphenmin} primitive@> 5753primitive("righthyphenmin",assign_int,int_base+right_hyphen_min_code);@/ 5754@!@:right_hyphen_min_}{\.{\\righthyphenmin} primitive@> 5755primitive("holdinginserts",assign_int,int_base+holding_inserts_code);@/ 5756@!@:holding_inserts_}{\.{\\holdinginserts} primitive@> 5757primitive("errorcontextlines",assign_int,int_base+error_context_lines_code);@/ 5758@!@:error_context_lines_}{\.{\\errorcontextlines} primitive@> 5759primitive("XeTeXlinebreakpenalty",assign_int,int_base+XeTeX_linebreak_penalty_code);@/ 5760@!@:XeTeX_linebreak_penalty_}{\.{\\XeTeXlinebreakpenalty} primitive@> 5761primitive("XeTeXprotrudechars",assign_int,int_base+XeTeX_protrude_chars_code);@/ 5762@!@:XeTeX_protrude_chars_}{\.{\\XeTeXprotrudechars} primitive@> 5763 5764@ @<Cases of |print_cmd_chr|...@>= 5765assign_int: if chr_code<count_base then print_param(chr_code-int_base) 5766 else begin print_esc("count"); print_int(chr_code-count_base); 5767 end; 5768 5769@ The integer parameters should really be initialized by a macro package; 5770the following initialization does the minimum to keep \TeX\ from 5771complete failure. 5772@^null delimiter@> 5773 5774@<Initialize table entries...@>= 5775for k:=int_base to del_code_base-1 do eqtb[k].int:=0; 5776mag:=1000; tolerance:=10000; hang_after:=1; max_dead_cycles:=25; 5777escape_char:="\"; end_line_char:=carriage_return; 5778for k:=0 to number_chars-1 do del_code(k):=-1; 5779del_code("."):=0; {this null delimiter is used in error recovery} 5780 5781@ The following procedure, which is called just before \TeX\ initializes its 5782input and output, establishes the initial values of the date and time. 5783@^system dependencies@> 5784Since standard \PASCAL\ cannot provide such information, something special 5785is needed. The program here simply specifies July 4, 1776, at noon; but 5786users probably want a better approximation to the truth. 5787 5788@p procedure fix_date_and_time; 5789begin time:=12*60; {minutes since midnight} 5790day:=4; {fourth day of the month} 5791month:=7; {seventh month of the year} 5792year:=1776; {Anno Domini} 5793end; 5794 5795@ @<Show equivalent |n|, in region 5@>= 5796begin if n<count_base then print_param(n-int_base) 5797else if n<del_code_base then 5798 begin print_esc("count"); print_int(n-count_base); 5799 end 5800else begin print_esc("delcode"); print_int(n-del_code_base); 5801 end; 5802print_char("="); print_int(eqtb[n].int); 5803end 5804 5805@ @<Set variable |c| to the current escape character@>=c:=escape_char 5806 5807@ @<Character |s| is the current new-line character@>=s=new_line_char 5808 5809@ \TeX\ is occasionally supposed to print diagnostic information that 5810goes only into the transcript file, unless |tracing_online| is positive. 5811Here are two routines that adjust the destination of print commands: 5812 5813@p procedure begin_diagnostic; {prepare to do some tracing} 5814begin old_setting:=selector; 5815if (tracing_online<=0)and(selector=term_and_log) then 5816 begin decr(selector); 5817 if history=spotless then history:=warning_issued; 5818 end; 5819end; 5820@# 5821procedure end_diagnostic(@!blank_line:boolean); 5822 {restore proper conditions after tracing} 5823begin print_nl(""); 5824if blank_line then print_ln; 5825selector:=old_setting; 5826end; 5827 5828@ Of course we had better declare another global variable, if the previous 5829routines are going to work. 5830 5831@<Glob...@>= 5832@!old_setting:0..max_selector; 5833 5834@ The final region of |eqtb| contains the dimension parameters defined 5835here, and the |number_regs| \.{\\dimen} registers. 5836 5837@d par_indent_code=0 {indentation of paragraphs} 5838@d math_surround_code=1 {space around math in text} 5839@d line_skip_limit_code=2 {threshold for |line_skip| instead of |baseline_skip|} 5840@d hsize_code=3 {line width in horizontal mode} 5841@d vsize_code=4 {page height in vertical mode} 5842@d max_depth_code=5 {maximum depth of boxes on main pages} 5843@d split_max_depth_code=6 {maximum depth of boxes on split pages} 5844@d box_max_depth_code=7 {maximum depth of explicit vboxes} 5845@d hfuzz_code=8 {tolerance for overfull hbox messages} 5846@d vfuzz_code=9 {tolerance for overfull vbox messages} 5847@d delimiter_shortfall_code=10 {maximum amount uncovered by variable delimiters} 5848@d null_delimiter_space_code=11 {blank space in null delimiters} 5849@d script_space_code=12 {extra space after subscript or superscript} 5850@d pre_display_size_code=13 {length of text preceding a display} 5851@d display_width_code=14 {length of line for displayed equation} 5852@d display_indent_code=15 {indentation of line for displayed equation} 5853@d overfull_rule_code=16 {width of rule that identifies overfull hboxes} 5854@d hang_indent_code=17 {amount of hanging indentation} 5855@d h_offset_code=18 {amount of horizontal offset when shipping pages out} 5856@d v_offset_code=19 {amount of vertical offset when shipping pages out} 5857@d emergency_stretch_code=20 {reduces badnesses on final pass of line-breaking} 5858@d pdf_page_width_code=21 {page width of the PDF output} 5859@d pdf_page_height_code=22 {page height of the PDF output} 5860@d dimen_pars=23 {total number of dimension parameters} 5861@d scaled_base=dimen_base+dimen_pars 5862 {table of |number_regs| user-defined \.{\\dimen} registers} 5863@d eqtb_size=scaled_base+biggest_reg {largest subscript of |eqtb|} 5864@# 5865@d dimen(#)==eqtb[scaled_base+#].sc 5866@d dimen_par(#)==eqtb[dimen_base+#].sc {a scaled quantity} 5867@d par_indent==dimen_par(par_indent_code) 5868@d math_surround==dimen_par(math_surround_code) 5869@d line_skip_limit==dimen_par(line_skip_limit_code) 5870@d hsize==dimen_par(hsize_code) 5871@d vsize==dimen_par(vsize_code) 5872@d max_depth==dimen_par(max_depth_code) 5873@d split_max_depth==dimen_par(split_max_depth_code) 5874@d box_max_depth==dimen_par(box_max_depth_code) 5875@d hfuzz==dimen_par(hfuzz_code) 5876@d vfuzz==dimen_par(vfuzz_code) 5877@d delimiter_shortfall==dimen_par(delimiter_shortfall_code) 5878@d null_delimiter_space==dimen_par(null_delimiter_space_code) 5879@d script_space==dimen_par(script_space_code) 5880@d pre_display_size==dimen_par(pre_display_size_code) 5881@d display_width==dimen_par(display_width_code) 5882@d display_indent==dimen_par(display_indent_code) 5883@d overfull_rule==dimen_par(overfull_rule_code) 5884@d hang_indent==dimen_par(hang_indent_code) 5885@d h_offset==dimen_par(h_offset_code) 5886@d v_offset==dimen_par(v_offset_code) 5887@d emergency_stretch==dimen_par(emergency_stretch_code) 5888@d pdf_page_width == dimen_par(pdf_page_width_code) 5889@d pdf_page_height == dimen_par(pdf_page_height_code) 5890 5891@p procedure print_length_param(@!n:integer); 5892begin case n of 5893par_indent_code:print_esc("parindent"); 5894math_surround_code:print_esc("mathsurround"); 5895line_skip_limit_code:print_esc("lineskiplimit"); 5896hsize_code:print_esc("hsize"); 5897vsize_code:print_esc("vsize"); 5898max_depth_code:print_esc("maxdepth"); 5899split_max_depth_code:print_esc("splitmaxdepth"); 5900box_max_depth_code:print_esc("boxmaxdepth"); 5901hfuzz_code:print_esc("hfuzz"); 5902vfuzz_code:print_esc("vfuzz"); 5903delimiter_shortfall_code:print_esc("delimitershortfall"); 5904null_delimiter_space_code:print_esc("nulldelimiterspace"); 5905script_space_code:print_esc("scriptspace"); 5906pre_display_size_code:print_esc("predisplaysize"); 5907display_width_code:print_esc("displaywidth"); 5908display_indent_code:print_esc("displayindent"); 5909overfull_rule_code:print_esc("overfullrule"); 5910hang_indent_code:print_esc("hangindent"); 5911h_offset_code:print_esc("hoffset"); 5912v_offset_code:print_esc("voffset"); 5913emergency_stretch_code:print_esc("emergencystretch"); 5914pdf_page_width_code: print_esc("pdfpagewidth"); 5915pdf_page_height_code: print_esc("pdfpageheight"); 5916othercases print("[unknown dimen parameter!]") 5917endcases; 5918end; 5919 5920@ @<Put each...@>= 5921primitive("parindent",assign_dimen,dimen_base+par_indent_code);@/ 5922@!@:par_indent_}{\.{\\parindent} primitive@> 5923primitive("mathsurround",assign_dimen,dimen_base+math_surround_code);@/ 5924@!@:math_surround_}{\.{\\mathsurround} primitive@> 5925primitive("lineskiplimit",assign_dimen,dimen_base+line_skip_limit_code);@/ 5926@!@:line_skip_limit_}{\.{\\lineskiplimit} primitive@> 5927primitive("hsize",assign_dimen,dimen_base+hsize_code);@/ 5928@!@:hsize_}{\.{\\hsize} primitive@> 5929primitive("vsize",assign_dimen,dimen_base+vsize_code);@/ 5930@!@:vsize_}{\.{\\vsize} primitive@> 5931primitive("maxdepth",assign_dimen,dimen_base+max_depth_code);@/ 5932@!@:max_depth_}{\.{\\maxdepth} primitive@> 5933primitive("splitmaxdepth",assign_dimen,dimen_base+split_max_depth_code);@/ 5934@!@:split_max_depth_}{\.{\\splitmaxdepth} primitive@> 5935primitive("boxmaxdepth",assign_dimen,dimen_base+box_max_depth_code);@/ 5936@!@:box_max_depth_}{\.{\\boxmaxdepth} primitive@> 5937primitive("hfuzz",assign_dimen,dimen_base+hfuzz_code);@/ 5938@!@:hfuzz_}{\.{\\hfuzz} primitive@> 5939primitive("vfuzz",assign_dimen,dimen_base+vfuzz_code);@/ 5940@!@:vfuzz_}{\.{\\vfuzz} primitive@> 5941primitive("delimitershortfall", 5942 assign_dimen,dimen_base+delimiter_shortfall_code);@/ 5943@!@:delimiter_shortfall_}{\.{\\delimitershortfall} primitive@> 5944primitive("nulldelimiterspace", 5945 assign_dimen,dimen_base+null_delimiter_space_code);@/ 5946@!@:null_delimiter_space_}{\.{\\nulldelimiterspace} primitive@> 5947primitive("scriptspace",assign_dimen,dimen_base+script_space_code);@/ 5948@!@:script_space_}{\.{\\scriptspace} primitive@> 5949primitive("predisplaysize",assign_dimen,dimen_base+pre_display_size_code);@/ 5950@!@:pre_display_size_}{\.{\\predisplaysize} primitive@> 5951primitive("displaywidth",assign_dimen,dimen_base+display_width_code);@/ 5952@!@:display_width_}{\.{\\displaywidth} primitive@> 5953primitive("displayindent",assign_dimen,dimen_base+display_indent_code);@/ 5954@!@:display_indent_}{\.{\\displayindent} primitive@> 5955primitive("overfullrule",assign_dimen,dimen_base+overfull_rule_code);@/ 5956@!@:overfull_rule_}{\.{\\overfullrule} primitive@> 5957primitive("hangindent",assign_dimen,dimen_base+hang_indent_code);@/ 5958@!@:hang_indent_}{\.{\\hangindent} primitive@> 5959primitive("hoffset",assign_dimen,dimen_base+h_offset_code);@/ 5960@!@:h_offset_}{\.{\\hoffset} primitive@> 5961primitive("voffset",assign_dimen,dimen_base+v_offset_code);@/ 5962@!@:v_offset_}{\.{\\voffset} primitive@> 5963primitive("emergencystretch",assign_dimen,dimen_base+emergency_stretch_code);@/ 5964@!@:emergency_stretch_}{\.{\\emergencystretch} primitive@> 5965 5966primitive("pdfpagewidth",assign_dimen,dimen_base+pdf_page_width_code);@/ 5967@!@:pdf_page_width_}{\.{\\pdfpagewidth} primitive@> 5968primitive("pdfpageheight",assign_dimen,dimen_base+pdf_page_height_code);@/ 5969@!@:pdf_page_height_}{\.{\\pdfpageheight} primitive@> 5970 5971@ @<Cases of |print_cmd_chr|...@>= 5972assign_dimen: if chr_code<scaled_base then 5973 print_length_param(chr_code-dimen_base) 5974 else begin print_esc("dimen"); print_int(chr_code-scaled_base); 5975 end; 5976 5977@ @<Initialize table entries...@>= 5978for k:=dimen_base to eqtb_size do eqtb[k].sc:=0; 5979 5980@ @<Show equivalent |n|, in region 6@>= 5981begin if n<scaled_base then print_length_param(n-dimen_base) 5982else begin print_esc("dimen"); print_int(n-scaled_base); 5983 end; 5984print_char("="); print_scaled(eqtb[n].sc); print("pt"); 5985end 5986 5987@ Here is a procedure that displays the contents of |eqtb[n]| 5988symbolically. 5989 5990@p@t\4@>@<Declare the procedure called |print_cmd_chr|@>@;@/ 5991@!stat procedure show_eqtb(@!n:pointer); 5992begin if n<active_base then print_char("?") {this can't happen} 5993else if n<glue_base then @<Show equivalent |n|, in region 1 or 2@> 5994else if n<local_base then @<Show equivalent |n|, in region 3@> 5995else if n<int_base then @<Show equivalent |n|, in region 4@> 5996else if n<dimen_base then @<Show equivalent |n|, in region 5@> 5997else if n<=eqtb_size then @<Show equivalent |n|, in region 6@> 5998else print_char("?"); {this can't happen either} 5999end; 6000tats 6001 6002@ The last two regions of |eqtb| have fullword values instead of the 6003three fields |eq_level|, |eq_type|, and |equiv|. An |eq_type| is unnecessary, 6004but \TeX\ needs to store the |eq_level| information in another array 6005called |xeq_level|. 6006 6007@<Glob...@>= 6008@!eqtb:array[active_base..eqtb_size] of memory_word; 6009@!xeq_level:array[int_base..eqtb_size] of quarterword; 6010 6011@ @<Set init...@>= 6012for k:=int_base to eqtb_size do xeq_level[k]:=level_one; 6013 6014@ When the debugging routine |search_mem| is looking for pointers having a 6015given value, it is interested only in regions 1 to~3 of~|eqtb|, and in the 6016first part of region~4. 6017 6018@<Search |eqtb| for equivalents equal to |p|@>= 6019for q:=active_base to box_base+biggest_reg do 6020 begin if equiv(q)=p then 6021 begin print_nl("EQUIV("); print_int(q); print_char(")"); 6022 end; 6023 end 6024 6025@* \[18] The hash table. 6026Control sequences are stored and retrieved by means of a fairly standard hash 6027table algorithm called the method of ``coalescing lists'' (cf.\ Algorithm 6.4C 6028in {\sl The Art of Computer Programming\/}). Once a control sequence enters the 6029table, it is never removed, because there are complicated situations 6030involving \.{\\gdef} where the removal of a control sequence at the end of 6031a group would be a mistake preventable only by the introduction of a 6032complicated reference-count mechanism. 6033 6034The actual sequence of letters forming a control sequence identifier is 6035stored in the |str_pool| array together with all the other strings. An 6036auxiliary array |hash| consists of items with two halfword fields per 6037word. The first of these, called |next(p)|, points to the next identifier 6038belonging to the same coalesced list as the identifier corresponding to~|p|; 6039and the other, called |text(p)|, points to the |str_start| entry for 6040|p|'s identifier. If position~|p| of the hash table is empty, we have 6041|text(p)=0|; if position |p| is either empty or the end of a coalesced 6042hash list, we have |next(p)=0|. An auxiliary pointer variable called 6043|hash_used| is maintained in such a way that all locations |p>=hash_used| 6044are nonempty. The global variable |cs_count| tells how many multiletter 6045control sequences have been defined, if statistics are being kept. 6046 6047A global boolean variable called |no_new_control_sequence| is set to 6048|true| during the time that new hash table entries are forbidden. 6049 6050@d next(#) == hash[#].lh {link for coalesced lists} 6051@d text(#) == hash[#].rh {string number for control sequence name} 6052@d hash_is_full == (hash_used=hash_base) {test if all positions are occupied} 6053@d font_id_text(#) == text(font_id_base+#) {a frozen font identifier's name} 6054 6055@<Glob...@>= 6056@!hash: array[hash_base..undefined_control_sequence-1] of two_halves; 6057 {the hash table} 6058@!hash_used:pointer; {allocation pointer for |hash|} 6059@!no_new_control_sequence:boolean; {are new identifiers legal?} 6060@!cs_count:integer; {total number of known identifiers} 6061 6062@ Primitive support needs a few extra variables and definitions 6063 6064@d prim_size=480 {maximum number of primitives } 6065@d prim_prime=409 {about 85\pct! of |primitive_size|} 6066@d prim_base=1 6067@d prim_next(#) == prim[#].lh {link for coalesced lists} 6068@d prim_text(#) == prim[#].rh {string number for control sequence name} 6069@d prim_is_full == (prim_used=prim_base) {test if all positions are occupied} 6070@d prim_eq_level_field(#)==#.hh.b1 6071@d prim_eq_type_field(#)==#.hh.b0 6072@d prim_equiv_field(#)==#.hh.rh 6073@d prim_eq_level(#)==prim_eq_level_field(prim_eqtb[#]) {level of definition} 6074@d prim_eq_type(#)==prim_eq_type_field(prim_eqtb[#]) {command code for equivalent} 6075@d prim_equiv(#)==prim_equiv_field(prim_eqtb[#]) {equivalent value} 6076@d undefined_primitive=0 6077 6078@<Glob...@>= 6079@!prim: array [0..prim_size] of two_halves; {the primitives table} 6080@!prim_used:pointer; {allocation pointer for |prim|} 6081@!prim_eqtb:array[0..prim_size] of memory_word; 6082 6083@ @<Set init...@>= 6084no_new_control_sequence:=true; {new identifiers are usually forbidden} 6085prim_next(0):=0; prim_text(0):=0; 6086for k:=1 to prim_size do prim[k]:=prim[0]; 6087prim_eq_level(0):=level_zero; 6088prim_eq_type(0):=undefined_cs; 6089prim_equiv(0):=null; 6090for k:=1 to prim_size do prim_eqtb[k]:=prim_eqtb[0]; 6091next(hash_base):=0; text(hash_base):=0; 6092for k:=hash_base+1 to undefined_control_sequence-1 do hash[k]:=hash[hash_base]; 6093 6094@ @<Initialize table entries...@>= 6095prim_used:=prim_size; {nothing is used} 6096hash_used:=frozen_control_sequence; {nothing is used} 6097cs_count:=0; 6098eq_type(frozen_dont_expand):=dont_expand; 6099text(frozen_dont_expand):="notexpanded:"; 6100@.notexpanded:@> 6101 6102eq_type(frozen_primitive):=ignore_spaces; 6103equiv(frozen_primitive):=1; 6104eq_level(frozen_primitive):=level_one; 6105text(frozen_primitive):="primitive"; 6106 6107@ Here is the subroutine that searches the hash table for an identifier 6108that matches a given string of length |l>0| appearing in |buffer[j.. 6109(j+l-1)]|. If the identifier is found, the corresponding hash table address 6110is returned. Otherwise, if the global variable |no_new_control_sequence| 6111is |true|, the dummy address |undefined_control_sequence| is returned. 6112Otherwise the identifier is inserted into the hash table and its location 6113is returned. 6114 6115@p function id_lookup(@!j,@!l:integer):pointer; {search the hash table} 6116label found; {go here if you found it} 6117var h:integer; {hash code} 6118@!d:integer; {number of characters in incomplete current string} 6119@!p:pointer; {index in |hash| array} 6120@!k:pointer; {index in |buffer| array} 6121@!ll:integer; {length in UTF16 code units} 6122begin @<Compute the hash code |h|@>; 6123p:=h+hash_base; {we start searching here; note that |0<=h<hash_prime|} 6124ll:=l; for d:=0 to l-1 do if buffer[j+d]>=@"10000 then incr(ll); 6125loop@+begin if text(p)>0 then if length(text(p))=ll then 6126 if str_eq_buf(text(p),j) then goto found; 6127 if next(p)=0 then 6128 begin if no_new_control_sequence then 6129 p:=undefined_control_sequence 6130 else @<Insert a new control sequence after |p|, then make 6131 |p| point to it@>; 6132 goto found; 6133 end; 6134 p:=next(p); 6135 end; 6136found: id_lookup:=p; 6137end; 6138 6139@ @<Insert a new control...@>= 6140begin if text(p)>0 then 6141 begin repeat if hash_is_full then overflow("hash size",hash_size); 6142@:TeX capacity exceeded hash size}{\quad hash size@> 6143 decr(hash_used); 6144 until text(hash_used)=0; {search for an empty location in |hash|} 6145 next(p):=hash_used; p:=hash_used; 6146 end; 6147str_room(ll); d:=cur_length; 6148while pool_ptr>str_start_macro(str_ptr) do 6149 begin decr(pool_ptr); str_pool[pool_ptr+l]:=str_pool[pool_ptr]; 6150 end; {move current string up to make room for another} 6151for k:=j to j+l-1 do begin 6152 if buffer[k]<@"10000 then append_char(buffer[k]) 6153 else begin 6154 append_char(@"D800+(buffer[k]-@"10000)div@"400); 6155 append_char(@"DC00+(buffer[k]-@"10000)mod@"400); 6156 end 6157end; 6158text(p):=make_string; pool_ptr:=pool_ptr+d; 6159@!stat incr(cs_count);@+tats@;@/ 6160end 6161 6162@ The value of |hash_prime| should be roughly 85\pct! of |hash_size|, and it 6163should be a prime number. The theory of hashing tells us to expect fewer 6164than two table probes, on the average, when the search is successful. 6165[See J.~S. Vitter, {\sl Journal of the ACM\/ \bf30} (1983), 231--258.] 6166@^Vitter, Jeffrey Scott@> 6167 6168@<Compute the hash code |h|@>= 6169h:=0; 6170for k:=j to j+l-1 do 6171 begin h:=h+h+buffer[k]; 6172 while h>=hash_prime do h:=h-hash_prime; 6173 end 6174 6175@ Here is the subroutine that searches the primitive table for an identifier 6176 6177@p function prim_lookup(@!s:str_number):pointer; {search the primitives table} 6178label found; {go here if you found it} 6179var h:integer; {hash code} 6180@!p:pointer; {index in |hash| array} 6181@!k:pointer; {index in string pool} 6182@!j,@!l:integer; 6183begin 6184if s<256 then begin 6185 p:=s; 6186 if (p<0) or (prim_eq_level(p)<>level_one) then 6187 p:=undefined_primitive; 6188end 6189else begin 6190 j:=str_start_macro(s); 6191 if s = str_ptr then l:=cur_length else l:=length(s); 6192 @<Compute the primitive code |h|@>; 6193 p:=h+prim_base; {we start searching here; note that |0<=h<hash_prime|} 6194 loop@+begin if prim_text(p)>0 then if length(prim_text(p))=l then 6195 if str_eq_str(prim_text(p),s) then goto found; 6196 if prim_next(p)=0 then 6197 begin if no_new_control_sequence then 6198 p:=undefined_primitive 6199 else @<Insert a new primitive after |p|, then make 6200 |p| point to it@>; 6201 goto found; 6202 end; 6203 p:=prim_next(p); 6204 end; 6205 end; 6206found: prim_lookup:=p; 6207end; 6208 6209@ @<Insert a new primitive...@>= 6210begin if prim_text(p)>0 then 6211 begin repeat if prim_is_full then overflow("primitive size",prim_size); 6212@:TeX capacity exceeded primitive size}{\quad primitive size@> 6213 decr(prim_used); 6214 until prim_text(prim_used)=0; {search for an empty location in |prim|} 6215 prim_next(p):=prim_used; p:=prim_used; 6216 end; 6217prim_text(p):=s; 6218end 6219 6220@ The value of |prim_prime| should be roughly 85\pct! of 6221|prim_size|, and it should be a prime number. 6222 6223@<Compute the primitive code |h|@>= 6224h:=str_pool[j]; 6225for k:=j+1 to j+l-1 do 6226 begin h:=h+h+str_pool[k]; 6227 while h>=prim_prime do h:=h-prim_prime; 6228 end 6229 6230@ Single-character control sequences do not need to be looked up in a hash 6231table, since we can use the character code itself as a direct address. 6232The procedure |print_cs| prints the name of a control sequence, given 6233a pointer to its address in |eqtb|. A space is printed after the name 6234unless it is a single nonletter or an active character. This procedure 6235might be invoked with invalid data, so it is ``extra robust.'' The 6236individual characters must be printed one at a time using |print|, since 6237they may be unprintable. 6238 6239@<Basic printing...@>= 6240procedure print_cs(@!p:integer); {prints a purported control sequence} 6241begin if p<hash_base then {single character} 6242 if p>=single_base then 6243 if p=null_cs then 6244 begin print_esc("csname"); print_esc("endcsname"); print_char(" "); 6245 end 6246 else begin print_esc(p-single_base); 6247 if cat_code(p-single_base)=letter then print_char(" "); 6248 end 6249 else if p<active_base then print_esc("IMPOSSIBLE.") 6250@.IMPOSSIBLE@> 6251 else print_char(p-active_base) 6252else if p>=undefined_control_sequence then print_esc("IMPOSSIBLE.") 6253else if (text(p)<0)or(text(p)>=str_ptr) then print_esc("NONEXISTENT.") 6254@.NONEXISTENT@> 6255else begin print_esc(text(p)); 6256 print_char(" "); 6257 end; 6258end; 6259 6260@ Here is a similar procedure; it avoids the error checks, and it never 6261prints a space after the control sequence. 6262 6263@<Basic printing procedures@>= 6264procedure sprint_cs(@!p:pointer); {prints a control sequence} 6265begin if p<hash_base then 6266 if p<single_base then print_char(p-active_base) 6267 else if p<null_cs then print_esc(p-single_base) 6268 else begin print_esc("csname"); print_esc("endcsname"); 6269 end 6270else print_esc(text(p)); 6271end; 6272 6273@ We need to put \TeX's ``primitive'' control sequences into the hash 6274table, together with their command code (which will be the |eq_type|) 6275and an operand (which will be the |equiv|). The |primitive| procedure 6276does this, in a way that no \TeX\ user can. The global value |cur_val| 6277contains the new |eqtb| pointer after |primitive| has acted. 6278 6279@p @!init procedure primitive(@!s:str_number;@!c:quarterword;@!o:halfword); 6280var k:pool_pointer; {index into |str_pool|} 6281@!j:0..buf_size; {index into |buffer|} 6282@!l:small_number; {length of the string} 6283@!prim_val:integer; {needed to fill |prim_eqtb|} 6284begin if s<256 then begin 6285 cur_val:=s+single_base; 6286 prim_val:=s; 6287end 6288else begin k:=str_start_macro(s); l:=str_start_macro(s+1)-k; 6289 {we will move |s| into the (possibly non-empty) |buffer|} 6290 if first+l>buf_size+1 then 6291 overflow("buffer size",buf_size); 6292@:TeX capacity exceeded buffer size}{\quad buffer size@> 6293 for j:=0 to l-1 do buffer[first+j]:=so(str_pool[k+j]); 6294 cur_val:=id_lookup(first,l); {|no_new_control_sequence| is |false|} 6295 flush_string; text(cur_val):=s; {we don't want to have the string twice} 6296 prim_val:=prim_lookup(s); 6297 end; 6298eq_level(cur_val):=level_one; eq_type(cur_val):=c; equiv(cur_val):=o; 6299prim_eq_level(prim_val):=level_one; 6300prim_eq_type(prim_val):=c; 6301prim_equiv(prim_val):=o; 6302end; 6303tini 6304 6305@ Many of \TeX's primitives need no |equiv|, since they are identifiable 6306by their |eq_type| alone. These primitives are loaded into the hash table 6307as follows: 6308 6309@<Put each of \TeX's primitives into the hash table@>= 6310primitive(" ",ex_space,0);@/ 6311@!@:Single-character primitives /}{\quad\.{\\\ }@> 6312primitive("/",ital_corr,0);@/ 6313@!@:Single-character primitives /}{\quad\.{\\/}@> 6314primitive("accent",accent,0);@/ 6315@!@:accent_}{\.{\\accent} primitive@> 6316primitive("advance",advance,0);@/ 6317@!@:advance_}{\.{\\advance} primitive@> 6318primitive("afterassignment",after_assignment,0);@/ 6319@!@:after_assignment_}{\.{\\afterassignment} primitive@> 6320primitive("aftergroup",after_group,0);@/ 6321@!@:after_group_}{\.{\\aftergroup} primitive@> 6322primitive("begingroup",begin_group,0);@/ 6323@!@:begin_group_}{\.{\\begingroup} primitive@> 6324primitive("char",char_num,0);@/ 6325@!@:char_}{\.{\\char} primitive@> 6326primitive("csname",cs_name,0);@/ 6327@!@:cs_name_}{\.{\\csname} primitive@> 6328primitive("delimiter",delim_num,0);@/ 6329@!@:delimiter_}{\.{\\delimiter} primitive@> 6330primitive("XeTeXdelimiter",delim_num,1);@/ 6331primitive("Udelimiter",delim_num,1);@/ 6332@!@:U_delimiter_}{\.{\\Udelimiter} primitive@> 6333primitive("divide",divide,0);@/ 6334@!@:divide_}{\.{\\divide} primitive@> 6335primitive("endcsname",end_cs_name,0);@/ 6336@!@:end_cs_name_}{\.{\\endcsname} primitive@> 6337primitive("endgroup",end_group,0); 6338@!@:end_group_}{\.{\\endgroup} primitive@> 6339text(frozen_end_group):="endgroup"; eqtb[frozen_end_group]:=eqtb[cur_val];@/ 6340primitive("expandafter",expand_after,0);@/ 6341@!@:expand_after_}{\.{\\expandafter} primitive@> 6342primitive("font",def_font,0);@/ 6343@!@:font_}{\.{\\font} primitive@> 6344primitive("fontdimen",assign_font_dimen,0);@/ 6345@!@:font_dimen_}{\.{\\fontdimen} primitive@> 6346primitive("halign",halign,0);@/ 6347@!@:halign_}{\.{\\halign} primitive@> 6348primitive("hrule",hrule,0);@/ 6349@!@:hrule_}{\.{\\hrule} primitive@> 6350primitive("ignorespaces",ignore_spaces,0);@/ 6351@!@:ignore_spaces_}{\.{\\ignorespaces} primitive@> 6352primitive("insert",insert,0);@/ 6353@!@:insert_}{\.{\\insert} primitive@> 6354primitive("mark",mark,0);@/ 6355@!@:mark_}{\.{\\mark} primitive@> 6356primitive("mathaccent",math_accent,0);@/ 6357@!@:math_accent_}{\.{\\mathaccent} primitive@> 6358primitive("XeTeXmathaccent",math_accent,1);@/ 6359primitive("Umathaccent",math_accent,1);@/ 6360@!@:U_math_accent_}{\.{\\Umathaccent} primitive@> 6361primitive("mathchar",math_char_num,0);@/ 6362@!@:math_char_}{\.{\\mathchar} primitive@> 6363primitive("XeTeXmathcharnum",math_char_num,1);@/ 6364primitive("Umathcharnum",math_char_num,1);@/ 6365@!@:U_math_char_num_}{\.{\\Umathcharnum} primitive@> 6366primitive("XeTeXmathchar",math_char_num,2);@/ 6367primitive("Umathchar",math_char_num,2);@/ 6368@!@:U_math_char_}{\.{\\Umathchar} primitive@> 6369primitive("mathchoice",math_choice,0);@/ 6370@!@:math_choice_}{\.{\\mathchoice} primitive@> 6371primitive("multiply",multiply,0);@/ 6372@!@:multiply_}{\.{\\multiply} primitive@> 6373primitive("noalign",no_align,0);@/ 6374@!@:no_align_}{\.{\\noalign} primitive@> 6375primitive("noboundary",no_boundary,0);@/ 6376@!@:no_boundary_}{\.{\\noboundary} primitive@> 6377primitive("noexpand",no_expand,0);@/ 6378@!@:no_expand_}{\.{\\noexpand} primitive@> 6379primitive("primitive",no_expand,1);@/ 6380@!@:primitive_}{\.{\\primitive} primitive@> 6381primitive("nonscript",non_script,0);@/ 6382@!@:non_script_}{\.{\\nonscript} primitive@> 6383primitive("omit",omit,0);@/ 6384@!@:omit_}{\.{\\omit} primitive@> 6385primitive("parshape",set_shape,par_shape_loc);@/ 6386@!@:par_shape_}{\.{\\parshape} primitive@> 6387primitive("penalty",break_penalty,0);@/ 6388@!@:penalty_}{\.{\\penalty} primitive@> 6389primitive("prevgraf",set_prev_graf,0);@/ 6390@!@:prev_graf_}{\.{\\prevgraf} primitive@> 6391primitive("radical",radical,0);@/ 6392@!@:radical_}{\.{\\radical} primitive@> 6393primitive("XeTeXradical",radical,1);@/ 6394primitive("Uradical",radical,1);@/ 6395@!@:U_radical_}{\.{\\Uradical} primitive@> 6396primitive("read",read_to_cs,0);@/ 6397@!@:read_}{\.{\\read} primitive@> 6398primitive("relax",relax,too_big_usv); {cf.\ |scan_file_name|} 6399@!@:relax_}{\.{\\relax} primitive@> 6400text(frozen_relax):="relax"; eqtb[frozen_relax]:=eqtb[cur_val];@/ 6401primitive("setbox",set_box,0);@/ 6402@!@:set_box_}{\.{\\setbox} primitive@> 6403primitive("the",the,0);@/ 6404@!@:the_}{\.{\\the} primitive@> 6405primitive("toks",toks_register,mem_bot);@/ 6406@!@:toks_}{\.{\\toks} primitive@> 6407primitive("vadjust",vadjust,0);@/ 6408@!@:vadjust_}{\.{\\vadjust} primitive@> 6409primitive("valign",valign,0);@/ 6410@!@:valign_}{\.{\\valign} primitive@> 6411primitive("vcenter",vcenter,0);@/ 6412@!@:vcenter_}{\.{\\vcenter} primitive@> 6413primitive("vrule",vrule,0);@/ 6414@!@:vrule_}{\.{\\vrule} primitive@> 6415 6416@ Each primitive has a corresponding inverse, so that it is possible to 6417display the cryptic numeric contents of |eqtb| in symbolic form. 6418Every call of |primitive| in this program is therefore accompanied by some 6419straightforward code that forms part of the |print_cmd_chr| routine 6420below. 6421 6422@<Cases of |print_cmd_chr|...@>= 6423accent: print_esc("accent"); 6424advance: print_esc("advance"); 6425after_assignment: print_esc("afterassignment"); 6426after_group: print_esc("aftergroup"); 6427assign_font_dimen: print_esc("fontdimen"); 6428begin_group: print_esc("begingroup"); 6429break_penalty: print_esc("penalty"); 6430char_num: print_esc("char"); 6431cs_name: print_esc("csname"); 6432def_font: print_esc("font"); 6433delim_num: if chr_code=1 then print_esc("Udelimiter") 6434 else print_esc("delimiter"); 6435divide: print_esc("divide"); 6436end_cs_name: print_esc("endcsname"); 6437end_group: print_esc("endgroup"); 6438ex_space: print_esc(" "); 6439expand_after: if chr_code=0 then print_esc("expandafter") 6440 @<Cases of |expandafter| for |print_cmd_chr|@>; 6441halign: print_esc("halign"); 6442hrule: print_esc("hrule"); 6443ignore_spaces: if chr_code=0 then print_esc("ignorespaces") else print_esc("primitive"); 6444insert: print_esc("insert"); 6445ital_corr: print_esc("/"); 6446mark: begin print_esc("mark"); 6447 if chr_code>0 then print_char("s"); 6448 end; 6449math_accent: if chr_code=1 then print_esc("Umathaccent") 6450 else print_esc("mathaccent"); 6451math_char_num: if chr_code=2 then print_esc("Umathchar") 6452 else if chr_code=1 then print_esc("Umathcharnum") 6453 else print_esc("mathchar"); 6454math_choice: print_esc("mathchoice"); 6455multiply: print_esc("multiply"); 6456no_align: print_esc("noalign"); 6457no_boundary:print_esc("noboundary"); 6458no_expand: if chr_code=0 then print_esc("noexpand") 6459 else print_esc("primitive"); 6460non_script: print_esc("nonscript"); 6461omit: print_esc("omit"); 6462radical: if chr_code=1 then print_esc("Uradical") else print_esc("radical"); 6463read_to_cs: if chr_code=0 then print_esc("read") 6464 @<Cases of |read| for |print_cmd_chr|@>; 6465relax: print_esc("relax"); 6466set_box: print_esc("setbox"); 6467set_prev_graf: print_esc("prevgraf"); 6468set_shape: case chr_code of 6469 par_shape_loc: print_esc("parshape"); 6470 @<Cases of |set_shape| for |print_cmd_chr|@>@;@/ 6471 end; {there are no other cases} 6472the: if chr_code=0 then print_esc("the") 6473 @<Cases of |the| for |print_cmd_chr|@>; 6474toks_register: @<Cases of |toks_register| for |print_cmd_chr|@>; 6475vadjust: print_esc("vadjust"); 6476valign: if chr_code=0 then print_esc("valign")@/ 6477 @<Cases of |valign| for |print_cmd_chr|@>; 6478vcenter: print_esc("vcenter"); 6479vrule: print_esc("vrule"); 6480 6481@ We will deal with the other primitives later, at some point in the program 6482where their |eq_type| and |equiv| values are more meaningful. For example, 6483the primitives for math mode will be loaded when we consider the routines 6484that deal with formulas. It is easy to find where each particular 6485primitive was treated by looking in the index at the end; for example, the 6486section where |"radical"| entered |eqtb| is listed under `\.{\\radical} 6487primitive'. (Primitives consisting of a single nonalphabetic character, 6488@!like `\.{\\/}', are listed under `Single-character primitives'.) 6489@!@^Single-character primitives@> 6490 6491Meanwhile, this is a convenient place to catch up on something we were unable 6492to do before the hash table was defined: 6493 6494@<Print the font identifier for |font(p)|@>= 6495print_esc(font_id_text(font(p))) 6496 6497@* \[19] Saving and restoring equivalents. 6498The nested structure provided by `$\.{\char'173}\ldots\.{\char'175}$' groups 6499in \TeX\ means that |eqtb| entries valid in outer groups should be saved 6500and restored later if they are overridden inside the braces. When a new |eqtb| 6501value is being assigned, the program therefore checks to see if the previous 6502entry belongs to an outer level. In such a case, the old value is placed 6503on the |save_stack| just before the new value enters |eqtb|. At the 6504end of a grouping level, i.e., when the right brace is sensed, the 6505|save_stack| is used to restore the outer values, and the inner ones are 6506destroyed. 6507 6508Entries on the |save_stack| are of type |memory_word|. The top item on 6509this stack is |save_stack[p]|, where |p=save_ptr-1|; it contains three 6510fields called |save_type|, |save_level|, and |save_index|, and it is 6511interpreted in one of five ways: 6512 6513\yskip\hangg 1) If |save_type(p)=restore_old_value|, then 6514|save_index(p)| is a location in |eqtb| whose current value should 6515be destroyed at the end of the current group and replaced by |save_stack[p-1]|. 6516Furthermore if |save_index(p)>=int_base|, then |save_level(p)| 6517should replace the corresponding entry in |xeq_level|. 6518 6519\yskip\hangg 2) If |save_type(p)=restore_zero|, then |save_index(p)| 6520is a location in |eqtb| whose current value should be destroyed at the end 6521of the current group, when it should be 6522replaced by the current value of |eqtb[undefined_control_sequence]|. 6523 6524\yskip\hangg 3) If |save_type(p)=insert_token|, then |save_index(p)| 6525is a token that should be inserted into \TeX's input when the current 6526group ends. 6527 6528\yskip\hangg 4) If |save_type(p)=level_boundary|, then |save_level(p)| 6529is a code explaining what kind of group we were previously in, and 6530|save_index(p)| points to the level boundary word at the bottom of 6531the entries for that group. 6532Furthermore, in extended \eTeX\ mode, |save_stack[p-1]| contains the 6533source line number at which the current level of grouping was entered. 6534 6535\yskip\hang 5) If |save_type(p)=restore_sa|, then |sa_chain| points to a 6536chain of sparse array entries to be restored at the end of the current 6537group. Furthermore |save_index(p)| and |save_level(p)| should replace 6538the values of |sa_chain| and |sa_level| respectively. 6539 6540@d save_type(#)==save_stack[#].hh.b0 {classifies a |save_stack| entry} 6541@d save_level(#)==save_stack[#].hh.b1 6542 {saved level for regions 5 and 6, or group code} 6543@d save_index(#)==save_stack[#].hh.rh 6544 {|eqtb| location or token or |save_stack| location} 6545@d restore_old_value=0 {|save_type| when a value should be restored later} 6546@d restore_zero=1 {|save_type| when an undefined entry should be restored} 6547@d insert_token=2 {|save_type| when a token is being saved for later use} 6548@d level_boundary=3 {|save_type| corresponding to beginning of group} 6549@d restore_sa=4 {|save_type| when sparse array entries should be restored} 6550 6551@p@t\4@>@<Declare \eTeX\ procedures for tracing and input@> 6552 6553@ Here are the group codes that are used to discriminate between different 6554kinds of groups. They allow \TeX\ to decide what special actions, if any, 6555should be performed when a group ends. 6556\def\grp{\.{\char'173...\char'175}} 6557 6558Some groups are not supposed to be ended by right braces. For example, 6559the `\.\$' that begins a math formula causes a |math_shift_group| to 6560be started, and this should be terminated by a matching `\.\$'. Similarly, 6561a group that starts with \.{\\left} should end with \.{\\right}, and 6562one that starts with \.{\\begingroup} should end with \.{\\endgroup}. 6563 6564@d bottom_level=0 {group code for the outside world} 6565@d simple_group=1 {group code for local structure only} 6566@d hbox_group=2 {code for `\.{\\hbox}\grp'} 6567@d adjusted_hbox_group=3 {code for `\.{\\hbox}\grp' in vertical mode} 6568@d vbox_group=4 {code for `\.{\\vbox}\grp'} 6569@d vtop_group=5 {code for `\.{\\vtop}\grp'} 6570@d align_group=6 {code for `\.{\\halign}\grp', `\.{\\valign}\grp'} 6571@d no_align_group=7 {code for `\.{\\noalign}\grp'} 6572@d output_group=8 {code for output routine} 6573@d math_group=9 {code for, e.g., `\.{\char'136}\grp'} 6574@d disc_group=10 {code for `\.{\\discretionary}\grp\grp\grp'} 6575@d insert_group=11 {code for `\.{\\insert}\grp', `\.{\\vadjust}\grp'} 6576@d vcenter_group=12 {code for `\.{\\vcenter}\grp'} 6577@d math_choice_group=13 {code for `\.{\\mathchoice}\grp\grp\grp\grp'} 6578@d semi_simple_group=14 {code for `\.{\\begingroup...\\endgroup}'} 6579@d math_shift_group=15 {code for `\.{\$...\$}'} 6580@d math_left_group=16 {code for `\.{\\left...\\right}'} 6581@d max_group_code=16 6582 6583@<Types...@>= 6584@!group_code=0..max_group_code; {|save_level| for a level boundary} 6585 6586@ The global variable |cur_group| keeps track of what sort of group we are 6587currently in. Another global variable, |cur_boundary|, points to the 6588topmost |level_boundary| word. And |cur_level| is the current depth of 6589nesting. The routines are designed to preserve the condition that no entry 6590in the |save_stack| or in |eqtb| ever has a level greater than |cur_level|. 6591 6592@ @<Glob...@>= 6593@!save_stack : array[0..save_size] of memory_word; 6594@!save_ptr : 0..save_size; {first unused entry on |save_stack|} 6595@!max_save_stack:0..save_size; {maximum usage of save stack} 6596@!cur_level: quarterword; {current nesting level for groups} 6597@!cur_group: group_code; {current group type} 6598@!cur_boundary: 0..save_size; {where the current level begins} 6599 6600@ At this time it might be a good idea for the reader to review the introduction 6601to |eqtb| that was given above just before the long lists of parameter names. 6602Recall that the ``outer level'' of the program is |level_one|, since 6603undefined control sequences are assumed to be ``defined'' at |level_zero|. 6604 6605@<Set init...@>= 6606save_ptr:=0; cur_level:=level_one; cur_group:=bottom_level; cur_boundary:=0; 6607max_save_stack:=0; 6608 6609@ The following macro is used to test if there is room for up to seven more 6610entries on |save_stack|. By making a conservative test like this, we can 6611get by with testing for overflow in only a few places. 6612 6613@d check_full_save_stack==if save_ptr>max_save_stack then 6614 begin max_save_stack:=save_ptr; 6615 if max_save_stack>save_size-7 then overflow("save size",save_size); 6616@:TeX capacity exceeded save size}{\quad save size@> 6617 end 6618 6619@ Procedure |new_save_level| is called when a group begins. The 6620argument is a group identification code like `|hbox_group|'. After 6621calling this routine, it is safe to put five more entries on |save_stack|. 6622 6623In some cases integer-valued items are placed onto the 6624|save_stack| just below a |level_boundary| word, because this is a 6625convenient place to keep information that is supposed to ``pop up'' just 6626when the group has finished. 6627For example, when `\.{\\hbox to 100pt}\grp' is being treated, the 100pt 6628dimension is stored on |save_stack| just before |new_save_level| is 6629called. 6630 6631We use the notation |saved(k)| to stand for an integer item that 6632appears in location |save_ptr+k| of the save stack. 6633 6634@d saved(#)==save_stack[save_ptr+#].int 6635 6636@p procedure new_save_level(@!c:group_code); {begin a new level of grouping} 6637begin check_full_save_stack; 6638if eTeX_ex then 6639 begin saved(0):=line; incr(save_ptr); 6640 end; 6641save_type(save_ptr):=level_boundary; save_level(save_ptr):=cur_group; 6642save_index(save_ptr):=cur_boundary; 6643if cur_level=max_quarterword then overflow("grouping levels", 6644@:TeX capacity exceeded grouping levels}{\quad grouping levels@> 6645 max_quarterword-min_quarterword); 6646 {quit if |(cur_level+1)| is too big to be stored in |eqtb|} 6647cur_boundary:=save_ptr; cur_group:=c; 6648@!stat if tracing_groups>0 then group_trace(false);@+tats@;@/ 6649incr(cur_level); incr(save_ptr); 6650end; 6651 6652@ Just before an entry of |eqtb| is changed, the following procedure should 6653be called to update the other data structures properly. It is important 6654to keep in mind that reference counts in |mem| include references from 6655within |save_stack|, so these counts must be handled carefully. 6656@^reference counts@> 6657 6658@p procedure eq_destroy(@!w:memory_word); {gets ready to forget |w|} 6659var q:pointer; {|equiv| field of |w|} 6660begin case eq_type_field(w) of 6661call,long_call,outer_call,long_outer_call: delete_token_ref(equiv_field(w)); 6662glue_ref: delete_glue_ref(equiv_field(w)); 6663shape_ref: begin q:=equiv_field(w); {we need to free a \.{\\parshape} block} 6664 if q<>null then free_node(q,info(q)+info(q)+1); 6665 end; {such a block is |2n+1| words long, where |n=info(q)|} 6666box_ref: flush_node_list(equiv_field(w)); 6667@/@<Cases for |eq_destroy|@>@/ 6668othercases do_nothing 6669endcases; 6670end; 6671 6672@ To save a value of |eqtb[p]| that was established at level |l|, we 6673can use the following subroutine. 6674 6675@p procedure eq_save(@!p:pointer;@!l:quarterword); {saves |eqtb[p]|} 6676begin check_full_save_stack; 6677if l=level_zero then save_type(save_ptr):=restore_zero 6678else begin save_stack[save_ptr]:=eqtb[p]; incr(save_ptr); 6679 save_type(save_ptr):=restore_old_value; 6680 end; 6681save_level(save_ptr):=l; save_index(save_ptr):=p; incr(save_ptr); 6682end; 6683 6684@ The procedure |eq_define| defines an |eqtb| entry having specified 6685|eq_type| and |equiv| fields, and saves the former value if appropriate. 6686This procedure is used only for entries in the first four regions of |eqtb|, 6687i.e., only for entries that have |eq_type| and |equiv| fields. 6688After calling this routine, it is safe to put four more entries on 6689|save_stack|, provided that there was room for four more entries before 6690the call, since |eq_save| makes the necessary test. 6691 6692@d assign_trace(#)==@!stat if tracing_assigns>0 then restore_trace(#); 6693 tats 6694 6695@p procedure eq_define(@!p:pointer;@!t:quarterword;@!e:halfword); 6696 {new data for |eqtb|} 6697label exit; 6698begin if eTeX_ex and(eq_type(p)=t)and(equiv(p)=e) then 6699 begin assign_trace(p,"reassigning")@;@/ 6700 eq_destroy(eqtb[p]); return; 6701 end; 6702assign_trace(p,"changing")@;@/ 6703if eq_level(p)=cur_level then eq_destroy(eqtb[p]) 6704else if cur_level>level_one then eq_save(p,eq_level(p)); 6705eq_level(p):=cur_level; eq_type(p):=t; equiv(p):=e; 6706assign_trace(p,"into")@;@/ 6707exit:end; 6708 6709@ The counterpart of |eq_define| for the remaining (fullword) positions in 6710|eqtb| is called |eq_word_define|. Since |xeq_level[p]>=level_one| for all 6711|p|, a `|restore_zero|' will never be used in this case. 6712 6713@p procedure eq_word_define(@!p:pointer;@!w:integer); 6714label exit; 6715begin if eTeX_ex and(eqtb[p].int=w) then 6716 begin assign_trace(p,"reassigning")@;@/ 6717 return; 6718 end; 6719assign_trace(p,"changing")@;@/ 6720if xeq_level[p]<>cur_level then 6721 begin eq_save(p,xeq_level[p]); xeq_level[p]:=cur_level; 6722 end; 6723eqtb[p].int:=w; 6724assign_trace(p,"into")@;@/ 6725exit:end; 6726 6727@ The |eq_define| and |eq_word_define| routines take care of local definitions. 6728@^global definitions@> 6729Global definitions are done in almost the same way, but there is no need 6730to save old values, and the new value is associated with |level_one|. 6731 6732@p procedure geq_define(@!p:pointer;@!t:quarterword;@!e:halfword); 6733 {global |eq_define|} 6734begin assign_trace(p,"globally changing")@;@/ 6735begin eq_destroy(eqtb[p]); 6736eq_level(p):=level_one; eq_type(p):=t; equiv(p):=e; 6737end; 6738assign_trace(p,"into")@;@/ 6739end; 6740@# 6741procedure geq_word_define(@!p:pointer;@!w:integer); {global |eq_word_define|} 6742begin assign_trace(p,"globally changing")@;@/ 6743begin eqtb[p].int:=w; xeq_level[p]:=level_one; 6744end; 6745assign_trace(p,"into")@;@/ 6746end; 6747 6748@ Subroutine |save_for_after| puts a token on the stack for save-keeping. 6749 6750@p procedure save_for_after(@!t:halfword); 6751begin if cur_level>level_one then 6752 begin check_full_save_stack; 6753 save_type(save_ptr):=insert_token; save_level(save_ptr):=level_zero; 6754 save_index(save_ptr):=t; incr(save_ptr); 6755 end; 6756end; 6757 6758@ The |unsave| routine goes the other way, taking items off of |save_stack|. 6759This routine takes care of restoration when a level ends; everything 6760belonging to the topmost group is cleared off of the save stack. 6761 6762@p 6763procedure@?back_input; forward; @t\2@> 6764procedure unsave; {pops the top level off the save stack} 6765label done; 6766var p:pointer; {position to be restored} 6767@!l:quarterword; {saved level, if in fullword regions of |eqtb|} 6768@!t:halfword; {saved value of |cur_tok|} 6769@!a:boolean; {have we already processed an \.{\\aftergroup} ?} 6770begin a:=false; 6771if cur_level>level_one then 6772 begin decr(cur_level); 6773 @<Clear off top level from |save_stack|@>; 6774 end 6775else confusion("curlevel"); {|unsave| is not used when |cur_group=bottom_level|} 6776@:this can't happen curlevel}{\quad curlevel@> 6777end; 6778 6779@ @<Clear off...@>= 6780loop@+begin decr(save_ptr); 6781 if save_type(save_ptr)=level_boundary then goto done; 6782 p:=save_index(save_ptr); 6783 if save_type(save_ptr)=insert_token then 6784 @<Insert token |p| into \TeX's input@> 6785 else if save_type(save_ptr)=restore_sa then 6786 begin sa_restore; sa_chain:=p; sa_level:=save_level(save_ptr); 6787 end 6788 else begin if save_type(save_ptr)=restore_old_value then 6789 begin l:=save_level(save_ptr); decr(save_ptr); 6790 end 6791 else save_stack[save_ptr]:=eqtb[undefined_control_sequence]; 6792 @<Store \(s)|save_stack[save_ptr]| in |eqtb[p]|, unless 6793 |eqtb[p]| holds a global value@>; 6794 end; 6795 end; 6796done: @!stat if tracing_groups>0 then group_trace(true);@+tats@;@/ 6797if grp_stack[in_open]=cur_boundary then group_warning; 6798 {groups possibly not properly nested with files} 6799cur_group:=save_level(save_ptr); cur_boundary:=save_index(save_ptr); 6800if eTeX_ex then decr(save_ptr) 6801 6802@ A global definition, which sets the level to |level_one|, 6803@^global definitions@> 6804will not be undone by |unsave|. If at least one global definition of 6805|eqtb[p]| has been carried out within the group that just ended, the 6806last such definition will therefore survive. 6807 6808@<Store \(s)|save...@>= 6809if p<int_base then 6810 if eq_level(p)=level_one then 6811 begin eq_destroy(save_stack[save_ptr]); {destroy the saved value} 6812 @!stat if tracing_restores>0 then restore_trace(p,"retaining");@+tats@;@/ 6813 end 6814 else begin eq_destroy(eqtb[p]); {destroy the current value} 6815 eqtb[p]:=save_stack[save_ptr]; {restore the saved value} 6816 @!stat if tracing_restores>0 then restore_trace(p,"restoring");@+tats@;@/ 6817 end 6818else if xeq_level[p]<>level_one then 6819 begin eqtb[p]:=save_stack[save_ptr]; xeq_level[p]:=l; 6820 @!stat if tracing_restores>0 then restore_trace(p,"restoring");@+tats@;@/ 6821 end 6822else begin 6823 @!stat if tracing_restores>0 then restore_trace(p,"retaining");@+tats@;@/ 6824 end 6825 6826@ @<Declare \eTeX\ procedures for tr...@>= 6827@!stat procedure restore_trace(@!p:pointer;@!s:str_number); 6828 {|eqtb[p]| has just been restored or retained} 6829begin begin_diagnostic; print_char("{"); print(s); print_char(" "); 6830show_eqtb(p); print_char("}"); 6831end_diagnostic(false); 6832end; 6833tats 6834 6835@ When looking for possible pointers to a memory location, it is helpful 6836to look for references from |eqtb| that might be waiting on the 6837save stack. Of course, we might find spurious pointers too; but this 6838routine is merely an aid when debugging, and at such times we are 6839grateful for any scraps of information, even if they prove to be irrelevant. 6840@^dirty \PASCAL@> 6841 6842@<Search |save_stack| for equivalents that point to |p|@>= 6843if save_ptr>0 then for q:=0 to save_ptr-1 do 6844 begin if equiv_field(save_stack[q])=p then 6845 begin print_nl("SAVE("); print_int(q); print_char(")"); 6846 end; 6847 end 6848 6849@ Most of the parameters kept in |eqtb| can be changed freely, but there's 6850an exception: The magnification should not be used with two different 6851values during any \TeX\ job, since a single magnification is applied to an 6852entire run. The global variable |mag_set| is set to the current magnification 6853whenever it becomes necessary to ``freeze'' it at a particular value. 6854 6855@<Glob...@>= 6856@!mag_set:integer; {if nonzero, this magnification should be used henceforth} 6857 6858@ @<Set init...@>= 6859mag_set:=0; 6860 6861@ The |prepare_mag| subroutine is called whenever \TeX\ wants to use |mag| 6862for magnification. 6863 6864@p procedure prepare_mag; 6865begin if (mag_set>0)and(mag<>mag_set) then 6866 begin print_err("Incompatible magnification ("); print_int(mag); 6867@.Incompatible magnification@> 6868 print(");"); print_nl(" the previous value will be retained"); 6869 help2("I can handle only one magnification ratio per job. So I've")@/ 6870 ("reverted to the magnification you used earlier on this run.");@/ 6871 int_error(mag_set); 6872 geq_word_define(int_base+mag_code,mag_set); {|mag:=mag_set|} 6873 end; 6874if (mag<=0)or(mag>32768) then 6875 begin print_err("Illegal magnification has been changed to 1000");@/ 6876@.Illegal magnification...@> 6877 help1("The magnification ratio must be between 1 and 32768."); 6878 int_error(mag); geq_word_define(int_base+mag_code,1000); 6879 end; 6880mag_set:=mag; 6881end; 6882 6883@* \[20] Token lists. 6884A \TeX\ token is either a character or a control sequence, and it is 6885@^token@> 6886represented internally in one of two ways: (1)~A character whose ASCII 6887code number is |c| and whose command code is |m| is represented as the 6888number $2^{21}m+c$; the command code is in the range |1<=m<=14|. (2)~A control 6889sequence whose |eqtb| address is |p| is represented as the number 6890|cs_token_flag+p|. Here |cs_token_flag=@t$2^{25}-1$@>| is larger than 6891$2^{21}m+c$, yet it is small enough that |cs_token_flag+p< max_halfword|; 6892thus, a token fits comfortably in a halfword. 6893 6894A token |t| represents a |left_brace| command if and only if 6895|t<left_brace_limit|; it represents a |right_brace| command if and only if 6896we have |left_brace_limit<=t<right_brace_limit|; and it represents a |match| or 6897|end_match| command if and only if |match_token<=t<=end_match_token|. 6898The following definitions take care of these token-oriented constants 6899and a few others. 6900 6901@d cs_token_flag=@"1FFFFFF {amount added to the |eqtb| location in a 6902 token that stands for a control sequence; is a multiple of~|@"10000|, less~1} 6903@d max_char_val=@"200000 {to separate char and command code} 6904@d left_brace_token=@"200000 {$2^{21}\cdot|left_brace|$} 6905@d left_brace_limit=@"400000 {$2^{21}\cdot(|left_brace|+1)$} 6906@d right_brace_token=@"400000 {$2^{21}\cdot|right_brace|$} 6907@d right_brace_limit=@"600000 {$2^{21}\cdot(|right_brace|+1)$} 6908@d math_shift_token=@"600000 {$2^{21}\cdot|math_shift|$} 6909@d tab_token=@"800000 {$2^{21}\cdot|tab_mark|$} 6910@d out_param_token=@"A00000 {$2^{21}\cdot|out_param|$} 6911@d space_token=@"1400020 {$2^{21}\cdot|spacer|+|" "|$} 6912@d letter_token=@"1600000 {$2^{21}\cdot|letter|$} 6913@d other_token=@"1800000 {$2^{21}\cdot|other_char|$} 6914@d match_token=@"1A00000 {$2^{21}\cdot|match|$} 6915@d end_match_token=@"1C00000 {$2^{21}\cdot|end_match|$} 6916@# 6917@d protected_token=end_match_token+1 {$2^{21}\cdot|end_match|+1$} 6918 6919@ @<Check the ``constant''...@>= 6920if cs_token_flag+undefined_control_sequence>max_halfword then bad:=21; 6921 6922@ A token list is a singly linked list of one-word nodes in |mem|, where 6923each word contains a token and a link. Macro definitions, output-routine 6924definitions, marks, \.{\\write} texts, and a few other things 6925are remembered by \TeX\ in the form 6926of token lists, usually preceded by a node with a reference count in its 6927|token_ref_count| field. The token stored in location |p| is called 6928|info(p)|. 6929 6930Three special commands appear in the token lists of macro definitions. 6931When |m=match|, it means that \TeX\ should scan a parameter 6932for the current macro; when |m=end_match|, it means that parameter 6933matching should end and \TeX\ should start reading the macro text; and 6934when |m=out_param|, it means that \TeX\ should insert parameter 6935number |c| into the text at this point. 6936 6937The enclosing \.{\char'173} and \.{\char'175} characters of a macro 6938definition are omitted, but the final right brace of an output routine 6939is included at the end of its token list. 6940 6941Here is an example macro definition that illustrates these conventions. 6942After \TeX\ processes the text 6943$$\.{\\def\\mac a\#1\#2 \\b \{\#1\\-a \#\#1\#2 \#2\}}$$ 6944the definition of \.{\\mac} is represented as a token list containing 6945$$\def\,{\hskip2pt} 6946\vbox{\halign{\hfil#\hfil\cr 6947(reference count), |letter|\,\.a, |match|\,\#, |match|\,\#, |spacer|\,\.\ , 6948\.{\\b}, |end_match|,\cr 6949|out_param|\,1, \.{\\-}, |letter|\,\.a, |spacer|\,\.\ , |mac_param|\,\#, 6950|other_char|\,\.1,\cr 6951|out_param|\,2, |spacer|\,\.\ , |out_param|\,2.\cr}}$$ 6952The procedure |scan_toks| builds such token lists, and |macro_call| 6953does the parameter matching. 6954@^reference counts@> 6955 6956Examples such as 6957$$\.{\\def\\m\{\\def\\m\{a\}\ b\}}$$ 6958explain why reference counts would be needed even if \TeX\ had no \.{\\let} 6959operation: When the token list for \.{\\m} is being read, the redefinition of 6960\.{\\m} changes the |eqtb| entry before the token list has been fully 6961consumed, so we dare not simply destroy a token list when its 6962control sequence is being redefined. 6963 6964If the parameter-matching part of a definition ends with `\.{\#\{}', 6965the corresponding token list will have `\.\{' just before the `|end_match|' 6966and also at the very end. The first `\.\{' is used to delimit the parameter; the 6967second one keeps the first from disappearing. 6968 6969@ The procedure |show_token_list|, which prints a symbolic form of 6970the token list that starts at a given node |p|, illustrates these 6971conventions. The token list being displayed should not begin with a reference 6972count. However, the procedure is intended to be robust, so that if the 6973memory links are awry or if |p| is not really a pointer to a token list, 6974nothing catastrophic will happen. 6975 6976An additional parameter |q| is also given; this parameter is either null 6977or it points to a node in the token list where a certain magic computation 6978takes place that will be explained later. (Basically, |q| is non-null when 6979we are printing the two-line context information at the time of an error 6980message; |q| marks the place corresponding to where the second line 6981should begin.) 6982 6983For example, if |p| points to the node containing the first \.a in the 6984token list above, then |show_token_list| will print the string 6985$$\hbox{`\.{a\#1\#2\ \\b\ ->\#1\\-a\ \#\#1\#2\ \#2}';}$$ 6986and if |q| points to the node containing the second \.a, 6987the magic computation will be performed just before the second \.a is printed. 6988 6989The generation will stop, and `\.{\\ETC.}' will be printed, if the length 6990of printing exceeds a given limit~|l|. Anomalous entries are printed in the 6991form of control sequences that are not followed by a blank space, e.g., 6992`\.{\\BAD.}'; this cannot be confused with actual control sequences because 6993a real control sequence named \.{BAD} would come out `\.{\\BAD\ }'. 6994 6995@<Declare the procedure called |show_token_list|@>= 6996procedure show_token_list(@!p,@!q:integer;@!l:integer); 6997label exit; 6998var m,@!c:integer; {pieces of a token} 6999@!match_chr:integer; {character used in a `|match|'} 7000@!n:ASCII_code; {the highest parameter number, as an ASCII digit} 7001begin match_chr:="#"; n:="0"; tally:=0; 7002while (p<>null) and (tally<l) do 7003 begin if p=q then @<Do magic computation@>; 7004 @<Display token |p|, and |return| if there are problems@>; 7005 p:=link(p); 7006 end; 7007if p<>null then print_esc("ETC."); 7008@.ETC@> 7009exit: 7010end; 7011 7012@ @<Display token |p|...@>= 7013if (p<hi_mem_min) or (p>mem_end) then 7014 begin print_esc("CLOBBERED."); return; 7015@.CLOBBERED@> 7016 end; 7017if info(p)>=cs_token_flag then print_cs(info(p)-cs_token_flag) 7018else begin m:=info(p) div max_char_val; c:=info(p) mod max_char_val; 7019 if info(p)<0 then print_esc("BAD.") 7020@.BAD@> 7021 else @<Display the token $(|m|,|c|)$@>; 7022 end 7023 7024@ The procedure usually ``learns'' the character code used for macro 7025parameters by seeing one in a |match| command before it runs into any 7026|out_param| commands. 7027 7028@<Display the token ...@>= 7029case m of 7030left_brace,right_brace,math_shift,tab_mark,sup_mark,sub_mark,spacer, 7031 letter,other_char: print_char(c); 7032mac_param: begin print_char(c); print_char(c); 7033 end; 7034out_param: begin print_char(match_chr); 7035 if c<=9 then print_char(c+"0") 7036 else begin print_char("!"); return; 7037 end; 7038 end; 7039match: begin match_chr:=c; print_char(c); incr(n); print_char(n); 7040 if n>"9" then return; 7041 end; 7042end_match: if c=0 then print("->"); 7043@.->@> 7044othercases print_esc("BAD.") 7045@.BAD@> 7046endcases 7047 7048@ Here's the way we sometimes want to display a token list, given a pointer 7049to its reference count; the pointer may be null. 7050 7051@p procedure token_show(@!p:pointer); 7052begin if p<>null then show_token_list(link(p),null,10000000); 7053end; 7054 7055@ The |print_meaning| subroutine displays |cur_cmd| and |cur_chr| in 7056symbolic form, including the expansion of a macro or mark. 7057 7058@p procedure print_meaning; 7059begin print_cmd_chr(cur_cmd,cur_chr); 7060if cur_cmd>=call then 7061 begin print_char(":"); print_ln; token_show(cur_chr); 7062 end 7063else if (cur_cmd=top_bot_mark)and(cur_chr<marks_code) then 7064 begin print_char(":"); print_ln; 7065 token_show(cur_mark[cur_chr]); 7066 end; 7067end; 7068 7069@* \[21] Introduction to the syntactic routines. 7070Let's pause a moment now and try to look at the Big Picture. 7071The \TeX\ program consists of three main parts: syntactic routines, 7072semantic routines, and output routines. The chief purpose of the 7073syntactic routines is to deliver the user's input to the semantic routines, 7074one token at a time. The semantic routines act as an interpreter 7075responding to these tokens, which may be regarded as commands. And the 7076output routines are periodically called on to convert box-and-glue 7077lists into a compact set of instructions that will be sent 7078to a typesetter. We have discussed the basic data structures and utility 7079routines of \TeX, so we are good and ready to plunge into the real activity by 7080considering the syntactic routines. 7081 7082Our current goal is to come to grips with the |get_next| procedure, 7083which is the keystone of \TeX's input mechanism. Each call of |get_next| 7084sets the value of three variables |cur_cmd|, |cur_chr|, and |cur_cs|, 7085representing the next input token. 7086$$\vbox{\halign{#\hfil\cr 7087 \hbox{|cur_cmd| denotes a command code from the long list of codes 7088 given above;}\cr 7089 \hbox{|cur_chr| denotes a character code or other modifier of the command 7090 code;}\cr 7091 \hbox{|cur_cs| is the |eqtb| location of the current control sequence,}\cr 7092 \hbox{\qquad if the current token was a control sequence, 7093 otherwise it's zero.}\cr}}$$ 7094Underlying this external behavior of |get_next| is all the machinery 7095necessary to convert from character files to tokens. At a given time we 7096may be only partially finished with the reading of several files (for 7097which \.{\\input} was specified), and partially finished with the expansion 7098of some user-defined macros and/or some macro parameters, and partially 7099finished with the generation of some text in a template for \.{\\halign}, 7100and so on. When reading a character file, special characters must be 7101classified as math delimiters, etc.; comments and extra blank spaces must 7102be removed, paragraphs must be recognized, and control sequences must be 7103found in the hash table. Furthermore there are occasions in which the 7104scanning routines have looked ahead for a word like `\.{plus}' but only 7105part of that word was found, hence a few characters must be put back 7106into the input and scanned again. 7107 7108To handle these situations, which might all be present simultaneously, 7109\TeX\ uses various stacks that hold information about the incomplete 7110activities, and there is a finite state control for each level of the 7111input mechanism. These stacks record the current state of an implicitly 7112recursive process, but the |get_next| procedure is not recursive. 7113Therefore it will not be difficult to translate these algorithms into 7114low-level languages that do not support recursion. 7115 7116@<Glob...@>= 7117@!cur_cmd: eight_bits; {current command set by |get_next|} 7118@!cur_chr: halfword; {operand of current command} 7119@!cur_cs: pointer; {control sequence found here, zero if none found} 7120@!cur_tok: halfword; {packed representative of |cur_cmd| and |cur_chr|} 7121 7122@ The |print_cmd_chr| routine prints a symbolic interpretation of a 7123command code and its modifier. This is used in certain `\.{You can\'t}' 7124error messages, and in the implementation of diagnostic routines like 7125\.{\\show}. 7126 7127The body of |print_cmd_chr| is a rather tedious listing of print 7128commands, and most of it is essentially an inverse to the |primitive| 7129routine that enters a \TeX\ primitive into |eqtb|. Therefore much of 7130this procedure appears elsewhere in the program, 7131together with the corresponding |primitive| calls. 7132 7133@d chr_cmd(#)==begin print(#); 7134 if chr_code < @"10000 then print_ASCII(chr_code) 7135 else print_char(chr_code); {non-Plane 0 Unicodes can't be sent through |print_ASCII|} 7136 end 7137 7138@<Declare the procedure called |print_cmd_chr|@>= 7139procedure print_cmd_chr(@!cmd:quarterword;@!chr_code:halfword); 7140var n:integer; {temp variable} 7141@!font_name_str:str_number; {local vars for \.{\\fontname} quoting extension} 7142@!quote_char:UTF16_code; 7143begin case cmd of 7144left_brace: chr_cmd("begin-group character "); 7145right_brace: chr_cmd("end-group character "); 7146math_shift: chr_cmd("math shift character "); 7147mac_param: chr_cmd("macro parameter character "); 7148sup_mark: chr_cmd("superscript character "); 7149sub_mark: chr_cmd("subscript character "); 7150endv: print("end of alignment template"); 7151spacer: chr_cmd("blank space "); 7152letter: chr_cmd("the letter "); 7153other_char: chr_cmd("the character "); 7154@t\4@>@<Cases of |print_cmd_chr| for symbolic printing of primitives@>@/ 7155othercases print("[unknown command code!]") 7156endcases; 7157end; 7158 7159@ Here is a procedure that displays the current command. 7160 7161@p procedure show_cur_cmd_chr; 7162var n:integer; {level of \.{\\if...\\fi} nesting} 7163@!l:integer; {line where \.{\\if} started} 7164@!p:pointer; 7165begin begin_diagnostic; print_nl("{"); 7166if mode<>shown_mode then 7167 begin print_mode(mode); print(": "); shown_mode:=mode; 7168 end; 7169print_cmd_chr(cur_cmd,cur_chr); 7170if tracing_ifs>0 then 7171 if cur_cmd>=if_test then if cur_cmd<=fi_or_else then 7172 begin print(": "); 7173 if cur_cmd=fi_or_else then 7174 begin print_cmd_chr(if_test,cur_if); print_char(" "); 7175 n:=0; l:=if_line; 7176 end 7177 else begin n:=1; l:=line; 7178 end; 7179 p:=cond_ptr; 7180 while p<>null do 7181 begin incr(n); p:=link(p); 7182 end; 7183 print("(level "); print_int(n); print_char(")"); print_if_line(l); 7184 end; 7185print_char("}"); 7186end_diagnostic(false); 7187end; 7188 7189@* \[22] Input stacks and states. 7190This implementation of 7191\TeX\ uses two different conventions for representing sequential stacks. 7192@^stack conventions@>@^conventions for representing stacks@> 7193 7194\yskip\hangg 1) If there is frequent access to the top entry, and if the 7195stack is essentially never empty, then the top entry is kept in a global 7196variable (even better would be a machine register), and the other entries 7197appear in the array $\\{stack}[0\to(\\{ptr}-1)]$. For example, the 7198semantic stack described above is handled this way, and so is the input 7199stack that we are about to study. 7200 7201\yskip\hangg 2) If there is infrequent top access, the entire stack contents 7202are in the array $\\{stack}[0\to(\\{ptr}-1)]$. For example, the |save_stack| 7203is treated this way, as we have seen. 7204 7205\yskip\noindent 7206The state of \TeX's input mechanism appears in the input stack, whose 7207entries are records with six fields, called |state|, |index|, |start|, |loc|, 7208|limit|, and |name|. This stack is maintained with 7209convention~(1), so it is declared in the following way: 7210 7211@<Types...@>= 7212@!in_state_record = record 7213 @!state_field, @!index_field: quarterword; 7214 @!start_field,@!loc_field, @!limit_field, @!name_field: halfword; 7215 end; 7216 7217@ @<Glob...@>= 7218@!input_stack : array[0..stack_size] of in_state_record; 7219@!input_ptr : 0..stack_size; {first unused location of |input_stack|} 7220@!max_in_stack: 0..stack_size; {largest value of |input_ptr| when pushing} 7221@!cur_input : in_state_record; 7222 {the ``top'' input state, according to convention (1)} 7223 7224@ We've already defined the special variable |loc==cur_input.loc_field| 7225in our discussion of basic input-output routines. The other components of 7226|cur_input| are defined in the same way: 7227 7228@d state==cur_input.state_field {current scanner state} 7229@d index==cur_input.index_field {reference for buffer information} 7230@d start==cur_input.start_field {starting position in |buffer|} 7231@d limit==cur_input.limit_field {end of current line in |buffer|} 7232@d name==cur_input.name_field {name of the current file} 7233 7234@ Let's look more closely now at the control variables 7235(|state|,~|index|,~|start|,~|loc|,~|limit|,~|name|), 7236assuming that \TeX\ is reading a line of characters that have been input 7237from some file or from the user's terminal. There is an array called 7238|buffer| that acts as a stack of all lines of characters that are 7239currently being read from files, including all lines on subsidiary 7240levels of the input stack that are not yet completed. \TeX\ will return to 7241the other lines when it is finished with the present input file. 7242 7243(Incidentally, on a machine with byte-oriented addressing, it might be 7244appropriate to combine |buffer| with the |str_pool| array, 7245letting the buffer entries grow downward from the top of the string pool 7246and checking that these two tables don't bump into each other.) 7247 7248The line we are currently working on begins in position |start| of the 7249buffer; the next character we are about to read is |buffer[loc]|; and 7250|limit| is the location of the last character present. If |loc>limit|, 7251the line has been completely read. Usually |buffer[limit]| is the 7252|end_line_char|, denoting the end of a line, but this is not 7253true if the current line is an insertion that was entered on the user's 7254terminal in response to an error message. 7255 7256The |name| variable is a string number that designates the name of 7257the current file, if we are reading a text file. It is zero if we 7258are reading from the terminal; it is |n+1| if we are reading from 7259input stream |n|, where |0<=n<=16|. (Input stream 16 stands for 7260an invalid stream number; in such cases the input is actually from 7261the terminal, under control of the procedure |read_toks|.) 7262Finally |18<=name<=19| indicates that we are reading a pseudo file 7263created by the \.{\\scantokens} command. 7264 7265The |state| variable has one of three values, when we are scanning such 7266files: 7267$$\baselineskip 15pt\vbox{\halign{#\hfil\cr 72681) |state=mid_line| is the normal state.\cr 72692) |state=skip_blanks| is like |mid_line|, but blanks are ignored.\cr 72703) |state=new_line| is the state at the beginning of a line.\cr}}$$ 7271These state values are assigned numeric codes so that if we add the state 7272code to the next character's command code, we get distinct values. For 7273example, `|mid_line+spacer|' stands for the case that a blank 7274space character occurs in the middle of a line when it is not being 7275ignored; after this case is processed, the next value of |state| will 7276be |skip_blanks|. 7277 7278@d mid_line=1 {|state| code when scanning a line of characters} 7279@d skip_blanks=2+max_char_code {|state| code when ignoring blanks} 7280@d new_line=3+max_char_code+max_char_code {|state| code at start of line} 7281 7282@ Additional information about the current line is available via the 7283|index| variable, which counts how many lines of characters are present 7284in the buffer below the current level. We have |index=0| when reading 7285from the terminal and prompting the user for each line; then if the user types, 7286e.g., `\.{\\input paper}', we will have |index=1| while reading 7287the file \.{paper.tex}. However, it does not follow that |index| is the 7288same as the input stack pointer, since many of the levels on the input 7289stack may come from token lists. For example, the instruction `\.{\\input 7290paper}' might occur in a token list. 7291 7292The global variable |in_open| is equal to the |index| 7293value of the highest non-token-list level. Thus, the number of partially read 7294lines in the buffer is |in_open+1|, and we have |in_open=index| 7295when we are not reading a token list. 7296 7297If we are not currently reading from the terminal, or from an input 7298stream, we are reading from the file variable |input_file[index]|. We use 7299the notation |terminal_input| as a convenient abbreviation for |name=0|, 7300and |cur_file| as an abbreviation for |input_file[index]|. 7301 7302The global variable |line| contains the line number in the topmost 7303open file, for use in error messages. If we are not reading from 7304the terminal, |line_stack[index]| holds the line number for the 7305enclosing level, so that |line| can be restored when the current 7306file has been read. Line numbers should never be negative, since the 7307negative of the current line number is used to identify the user's output 7308routine in the |mode_line| field of the semantic nest entries. 7309 7310If more information about the input state is needed, it can be 7311included in small arrays like those shown here. For example, 7312the current page or segment number in the input file might be 7313put into a variable |@!page|, maintained for enclosing levels in 7314`\ignorespaces|@!page_stack:array[1..max_in_open] of integer|\unskip' 7315by analogy with |line_stack|. 7316@^system dependencies@> 7317 7318@d terminal_input==(name=0) {are we reading from the terminal?} 7319@d cur_file==input_file[index] {the current |alpha_file| variable} 7320 7321@<Glob...@>= 7322@!in_open : 0..max_in_open; {the number of lines in the buffer, less one} 7323@!open_parens : 0..max_in_open; {the number of open text files} 7324@!input_file : array[1..max_in_open] of alpha_file; 7325@!line : integer; {current line number in the current source file} 7326@!line_stack : array[1..max_in_open] of integer; 7327 7328@ Users of \TeX\ sometimes forget to balance left and right braces properly, 7329and one of the ways \TeX\ tries to spot such errors is by considering an 7330input file as broken into subfiles by control sequences that 7331are declared to be \.{\\outer}. 7332 7333A variable called |scanner_status| tells \TeX\ whether or not to complain 7334when a subfile ends. This variable has six possible values: 7335 7336\yskip\hang|normal|, means that a subfile can safely end here without incident. 7337 7338\yskip\hang|skipping|, means that a subfile can safely end here, but not a file, 7339because we're reading past some conditional text that was not selected. 7340 7341\yskip\hang|defining|, means that a subfile shouldn't end now because a 7342macro is being defined. 7343 7344\yskip\hang|matching|, means that a subfile shouldn't end now because a 7345macro is being used and we are searching for the end of its arguments. 7346 7347\yskip\hang|aligning|, means that a subfile shouldn't end now because we are 7348not finished with the preamble of an \.{\\halign} or \.{\\valign}. 7349 7350\yskip\hang|absorbing|, means that a subfile shouldn't end now because we are 7351reading a balanced token list for \.{\\message}, \.{\\write}, etc. 7352 7353\yskip\noindent 7354If the |scanner_status| is not |normal|, the variable |warning_index| points 7355to the |eqtb| location for the relevant control sequence name to print 7356in an error message. 7357 7358@d skipping=1 {|scanner_status| when passing conditional text} 7359@d defining=2 {|scanner_status| when reading a macro definition} 7360@d matching=3 {|scanner_status| when reading macro arguments} 7361@d aligning=4 {|scanner_status| when reading an alignment preamble} 7362@d absorbing=5 {|scanner_status| when reading a balanced text} 7363 7364@<Glob...@>= 7365@!scanner_status : normal..absorbing; {can a subfile end now?} 7366@!warning_index : pointer; {identifier relevant to non-|normal| scanner status} 7367@!def_ref : pointer; {reference count of token list being defined} 7368 7369@ Here is a procedure that uses |scanner_status| to print a warning message 7370when a subfile has ended, and at certain other crucial times: 7371 7372@<Declare the procedure called |runaway|@>= 7373procedure runaway; 7374var p:pointer; {head of runaway list} 7375begin if scanner_status>skipping then 7376 begin print_nl("Runaway "); 7377@.Runaway...@> 7378 case scanner_status of 7379 defining: begin print("definition"); p:=def_ref; 7380 end; 7381 matching: begin print("argument"); p:=temp_head; 7382 end; 7383 aligning: begin print("preamble"); p:=hold_head; 7384 end; 7385 absorbing: begin print("text"); p:=def_ref; 7386 end; 7387 end; {there are no other cases} 7388 print_char("?");print_ln; show_token_list(link(p),null,error_line-10); 7389 end; 7390end; 7391 7392@ However, all this discussion about input state really applies only to the 7393case that we are inputting from a file. There is another important case, 7394namely when we are currently getting input from a token list. In this case 7395|state=token_list|, and the conventions about the other state variables 7396are different: 7397 7398\yskip\hang|loc| is a pointer to the current node in the token list, i.e., 7399the node that will be read next. If |loc=null|, the token list has been 7400fully read. 7401 7402\yskip\hang|start| points to the first node of the token list; this node 7403may or may not contain a reference count, depending on the type of token 7404list involved. 7405 7406\yskip\hang|token_type|, which takes the place of |index| in the 7407discussion above, is a code number that explains what kind of token list 7408is being scanned. 7409 7410\yskip\hang|name| points to the |eqtb| address of the control sequence 7411being expanded, if the current token list is a macro. 7412 7413\yskip\hang|param_start|, which takes the place of |limit|, tells where 7414the parameters of the current macro begin in the |param_stack|, if the 7415current token list is a macro. 7416 7417\yskip\noindent The |token_type| can take several values, depending on 7418where the current token list came from: 7419 7420\yskip\hang|parameter|, if a parameter is being scanned; 7421 7422\hang|u_template|, if the \<u_j> part of an alignment 7423template is being scanned; 7424 7425\hang|v_template|, if the \<v_j> part of an alignment 7426template is being scanned; 7427 7428\hang|backed_up|, if the token list being scanned has been inserted as 7429`to be read again'. 7430 7431\hang|inserted|, if the token list being scanned has been inserted as 7432the text expansion of a \.{\\count} or similar variable; 7433 7434\hang|macro|, if a user-defined control sequence is being scanned; 7435 7436\hang|output_text|, if an \.{\\output} routine is being scanned; 7437 7438\hang|every_par_text|, if the text of \.{\\everypar} is being scanned; 7439 7440\hang|every_math_text|, if the text of \.{\\everymath} is being scanned; 7441 7442\hang|every_display_text|, if the text of \.{\\everydisplay} is being scanned; 7443 7444\hang|every_hbox_text|, if the text of \.{\\everyhbox} is being scanned; 7445 7446\hang|every_vbox_text|, if the text of \.{\\everyvbox} is being scanned; 7447 7448\hang|every_job_text|, if the text of \.{\\everyjob} is being scanned; 7449 7450\hang|every_cr_text|, if the text of \.{\\everycr} is being scanned; 7451 7452\hang|mark_text|, if the text of a \.{\\mark} is being scanned; 7453 7454\hang|write_text|, if the text of a \.{\\write} is being scanned. 7455 7456\yskip\noindent 7457The codes for |output_text|, |every_par_text|, etc., are equal to a constant 7458plus the corresponding codes for token list parameters |output_routine_loc|, 7459|every_par_loc|, etc. The token list begins with a reference count if and 7460only if |token_type>=macro|. 7461@^reference counts@> 7462 7463Since \eTeX's additional token list parameters precede |toks_base|, the 7464corresponding token types must precede |write_text|. 7465 7466@d token_list=0 {|state| code when scanning a token list} 7467@d token_type==index {type of current token list} 7468@d param_start==limit {base of macro parameters in |param_stack|} 7469@d parameter=0 {|token_type| code for parameter} 7470@d u_template=1 {|token_type| code for \<u_j> template} 7471@d v_template=2 {|token_type| code for \<v_j> template} 7472@d backed_up=3 {|token_type| code for text to be reread} 7473@d backed_up_char=4 {special code for backed-up char from \\XeTeXinterchartoks hook} 7474@d inserted=5 {|token_type| code for inserted texts} 7475@d macro=6 {|token_type| code for defined control sequences} 7476@d output_text=7 {|token_type| code for output routines} 7477@d every_par_text=8 {|token_type| code for \.{\\everypar}} 7478@d every_math_text=9 {|token_type| code for \.{\\everymath}} 7479@d every_display_text=10 {|token_type| code for \.{\\everydisplay}} 7480@d every_hbox_text=11 {|token_type| code for \.{\\everyhbox}} 7481@d every_vbox_text=12 {|token_type| code for \.{\\everyvbox}} 7482@d every_job_text=13 {|token_type| code for \.{\\everyjob}} 7483@d every_cr_text=14 {|token_type| code for \.{\\everycr}} 7484@d mark_text=15 {|token_type| code for \.{\\topmark}, etc.} 7485@# 7486@d eTeX_text_offset=output_routine_loc-output_text 7487@d every_eof_text=every_eof_loc-eTeX_text_offset 7488 {|token_type| code for \.{\\everyeof}} 7489@# 7490@d inter_char_text=XeTeX_inter_char_loc-eTeX_text_offset 7491 {|token_type| code for \.{\\XeTeXinterchartoks}} 7492@# 7493@d write_text=toks_base-eTeX_text_offset {|token_type| code for \.{\\write}} 7494 7495@ The |param_stack| is an auxiliary array used to hold pointers to the token 7496lists for parameters at the current level and subsidiary levels of input. 7497This stack is maintained with convention (2), and it grows at a different 7498rate from the others. 7499 7500@<Glob...@>= 7501@!param_stack:array [0..param_size] of pointer; 7502 {token list pointers for parameters} 7503@!param_ptr:0..param_size; {first unused entry in |param_stack|} 7504@!max_param_stack:integer; 7505 {largest value of |param_ptr|, will be |<=param_size+9|} 7506 7507@ The input routines must also interact with the processing of 7508\.{\\halign} and \.{\\valign}, since the appearance of tab marks and 7509\.{\\cr} in certain places is supposed to trigger the beginning of special 7510\<v_j> template text in the scanner. This magic is accomplished by an 7511|align_state| variable that is increased by~1 when a `\.{\char'173}' is 7512scanned and decreased by~1 when a `\.{\char'175}' is scanned. The |align_state| 7513is nonzero during the \<u_j> template, after which it is set to zero; the 7514\<v_j> template begins when a tab mark or \.{\\cr} occurs at a time that 7515|align_state=0|. 7516 7517@<Glob...@>= 7518@!align_state:integer; {group level with respect to current alignment} 7519 7520@ Thus, the ``current input state'' can be very complicated indeed; there 7521can be many levels and each level can arise in a variety of ways. The 7522|show_context| procedure, which is used by \TeX's error-reporting routine to 7523print out the current input state on all levels down to the most recent 7524line of characters from an input file, illustrates most of these conventions. 7525The global variable |base_ptr| contains the lowest level that was 7526displayed by this procedure. 7527 7528@<Glob...@>= 7529@!base_ptr:0..stack_size; {shallowest level shown by |show_context|} 7530 7531@ The status at each level is indicated by printing two lines, where the first 7532line indicates what was read so far and the second line shows what remains 7533to be read. The context is cropped, if necessary, so that the first line 7534contains at most |half_error_line| characters, and the second contains 7535at most |error_line|. Non-current input levels whose |token_type| is 7536`|backed_up|' are shown only if they have not been fully read. 7537 7538@p procedure show_context; {prints where the scanner is} 7539label done; 7540var old_setting:0..max_selector; {saved |selector| setting} 7541@!nn:integer; {number of contexts shown so far, less one} 7542@!bottom_line:boolean; {have we reached the final context to be shown?} 7543@<Local variables for formatting calculations@>@/ 7544begin base_ptr:=input_ptr; input_stack[base_ptr]:=cur_input; 7545 {store current state} 7546nn:=-1; bottom_line:=false; 7547loop@+begin cur_input:=input_stack[base_ptr]; {enter into the context} 7548 if (state<>token_list) then 7549 if (name>19) or (base_ptr=0) then bottom_line:=true; 7550 if (base_ptr=input_ptr)or bottom_line or(nn<error_context_lines) then 7551 @<Display the current context@> 7552 else if nn=error_context_lines then 7553 begin print_nl("..."); incr(nn); {omitted if |error_context_lines<0|} 7554 end; 7555 if bottom_line then goto done; 7556 decr(base_ptr); 7557 end; 7558done: cur_input:=input_stack[input_ptr]; {restore original state} 7559end; 7560 7561@ @<Display the current context@>= 7562begin if (base_ptr=input_ptr) or (state<>token_list) or 7563 (token_type<>backed_up) or (loc<>null) then 7564 {we omit backed-up token lists that have already been read} 7565 begin tally:=0; {get ready to count characters} 7566 old_setting:=selector; 7567 if state<>token_list then 7568 begin @<Print location of current line@>; 7569 @<Pseudoprint the line@>; 7570 end 7571 else begin @<Print type of token list@>; 7572 @<Pseudoprint the token list@>; 7573 end; 7574 selector:=old_setting; {stop pseudoprinting} 7575 @<Print two lines using the tricky pseudoprinted information@>; 7576 incr(nn); 7577 end; 7578end 7579 7580@ This routine should be changed, if necessary, to give the best possible 7581indication of where the current line resides in the input file. 7582For example, on some systems it is best to print both a page and line number. 7583@^system dependencies@> 7584 7585@<Print location of current line@>= 7586if name<=17 then 7587 if terminal_input then 7588 if base_ptr=0 then print_nl("<*>") else print_nl("<insert> ") 7589 else begin print_nl("<read "); 7590 if name=17 then print_char("*")@+else print_int(name-1); 7591@.*\relax@> 7592 print_char(">"); 7593 end 7594else begin print_nl("l."); 7595 if index=in_open then print_int(line) 7596 else print_int(line_stack[index+1]); {input from a pseudo file} 7597 end; 7598print_char(" ") 7599 7600@ @<Print type of token list@>= 7601case token_type of 7602parameter: print_nl("<argument> "); 7603u_template,v_template: print_nl("<template> "); 7604backed_up,backed_up_char: if loc=null then print_nl("<recently read> ") 7605 else print_nl("<to be read again> "); 7606inserted: print_nl("<inserted text> "); 7607macro: begin print_ln; print_cs(name); 7608 end; 7609output_text: print_nl("<output> "); 7610every_par_text: print_nl("<everypar> "); 7611every_math_text: print_nl("<everymath> "); 7612every_display_text: print_nl("<everydisplay> "); 7613every_hbox_text: print_nl("<everyhbox> "); 7614every_vbox_text: print_nl("<everyvbox> "); 7615every_job_text: print_nl("<everyjob> "); 7616every_cr_text: print_nl("<everycr> "); 7617mark_text: print_nl("<mark> "); 7618every_eof_text: print_nl("<everyeof> "); 7619inter_char_text: print_nl("<XeTeXinterchartoks> "); 7620write_text: print_nl("<write> "); 7621othercases print_nl("?") {this should never happen} 7622endcases 7623 7624@ Here it is necessary to explain a little trick. We don't want to store a long 7625string that corresponds to a token list, because that string might take up 7626lots of memory; and we are printing during a time when an error message is 7627being given, so we dare not do anything that might overflow one of \TeX's 7628tables. So `pseudoprinting' is the answer: We enter a mode of printing 7629that stores characters into a buffer of length |error_line|, where character 7630$k+1$ is placed into \hbox{|trick_buf[k mod error_line]|} if 7631|k<trick_count|, otherwise character |k| is dropped. Initially we set 7632|tally:=0| and |trick_count:=1000000|; then when we reach the 7633point where transition from line 1 to line 2 should occur, we 7634set |first_count:=tally| and |trick_count:=@tmax@>(error_line, 7635tally+1+error_line-half_error_line)|. At the end of the 7636pseudoprinting, the values of |first_count|, |tally|, and 7637|trick_count| give us all the information we need to print the two lines, 7638and all of the necessary text is in |trick_buf|. 7639 7640Namely, let |l| be the length of the descriptive information that appears 7641on the first line. The length of the context information gathered for that 7642line is |k=first_count|, and the length of the context information 7643gathered for line~2 is $m=\min(|tally|, |trick_count|)-k$. If |l+k<=h|, 7644where |h=half_error_line|, we print |trick_buf[0..k-1]| after the 7645descriptive information on line~1, and set |n:=l+k|; here |n| is the 7646length of line~1. If $l+k>h$, some cropping is necessary, so we set |n:=h| 7647and print `\.{...}' followed by 7648$$\hbox{|trick_buf[(l+k-h+3)..k-1]|,}$$ 7649where subscripts of |trick_buf| are circular modulo |error_line|. The 7650second line consists of |n|~spaces followed by |trick_buf[k..(k+m-1)]|, 7651unless |n+m>error_line|; in the latter case, further cropping is done. 7652This is easier to program than to explain. 7653 7654@<Local variables for formatting...@>= 7655@!i:0..buf_size; {index into |buffer|} 7656@!j:0..buf_size; {end of current line in |buffer|} 7657@!l:0..half_error_line; {length of descriptive information on line 1} 7658@!m:integer; {context information gathered for line 2} 7659@!n:0..error_line; {length of line 1} 7660@!p: integer; {starting or ending place in |trick_buf|} 7661@!q: integer; {temporary index} 7662 7663@ The following code sets up the print routines so that they will gather 7664the desired information. 7665 7666@d begin_pseudoprint== 7667 begin l:=tally; tally:=0; selector:=pseudo; 7668 trick_count:=1000000; 7669 end 7670@d set_trick_count== 7671 begin first_count:=tally; 7672 trick_count:=tally+1+error_line-half_error_line; 7673 if trick_count<error_line then trick_count:=error_line; 7674 end 7675 7676@ And the following code uses the information after it has been gathered. 7677 7678@<Print two lines using the tricky pseudoprinted information@>= 7679if trick_count=1000000 then set_trick_count; 7680 {|set_trick_count| must be performed} 7681if tally<trick_count then m:=tally-first_count 7682else m:=trick_count-first_count; {context on line 2} 7683if l+first_count<=half_error_line then 7684 begin p:=0; n:=l+first_count; 7685 end 7686else begin print("..."); p:=l+first_count-half_error_line+3; 7687 n:=half_error_line; 7688 end; 7689for q:=p to first_count-1 do print_visible_char(trick_buf[q mod error_line]); 7690print_ln; 7691for q:=1 to n do print_visible_char(" "); {print |n| spaces to begin line~2} 7692if m+n<=error_line then p:=first_count+m else p:=first_count+(error_line-n-3); 7693for q:=first_count to p-1 do print_visible_char(trick_buf[q mod error_line]); 7694if m+n>error_line then print("...") 7695 7696@ But the trick is distracting us from our current goal, which is to 7697understand the input state. So let's concentrate on the data structures that 7698are being pseudoprinted as we finish up the |show_context| procedure. 7699 7700@<Pseudoprint the line@>= 7701begin_pseudoprint; 7702if buffer[limit]=end_line_char then j:=limit 7703else j:=limit+1; {determine the effective end of the line} 7704if j>0 then for i:=start to j-1 do 7705 begin if i=loc then set_trick_count; 7706 print_char(buffer[i]); 7707 end 7708 7709@ @<Pseudoprint the token list@>= 7710begin_pseudoprint; 7711if token_type<macro then show_token_list(start,loc,100000) 7712else show_token_list(link(start),loc,100000) {avoid reference count} 7713 7714@ Here is the missing piece of |show_token_list| that is activated when the 7715token beginning line~2 is about to be shown: 7716 7717@<Do magic computation@>=set_trick_count 7718 7719@* \[23] Maintaining the input stacks. 7720The following subroutines change the input status in commonly needed ways. 7721 7722First comes |push_input|, which stores the current state and creates a 7723new level (having, initially, the same properties as the old). 7724 7725@d push_input==@t@> {enter a new input level, save the old} 7726 begin if input_ptr>max_in_stack then 7727 begin max_in_stack:=input_ptr; 7728 if input_ptr=stack_size then overflow("input stack size",stack_size); 7729@:TeX capacity exceeded input stack size}{\quad input stack size@> 7730 end; 7731 input_stack[input_ptr]:=cur_input; {stack the record} 7732 incr(input_ptr); 7733 end 7734 7735@ And of course what goes up must come down. 7736 7737@d pop_input==@t@> {leave an input level, re-enter the old} 7738 begin decr(input_ptr); cur_input:=input_stack[input_ptr]; 7739 end 7740 7741@ Here is a procedure that starts a new level of token-list input, given 7742a token list |p| and its type |t|. If |t=macro|, the calling routine should 7743set |name| and |loc|. 7744 7745@d back_list(#)==begin_token_list(#,backed_up) {backs up a simple token list} 7746@d ins_list(#)==begin_token_list(#,inserted) {inserts a simple token list} 7747 7748@p procedure begin_token_list(@!p:pointer;@!t:quarterword); 7749begin push_input; state:=token_list; start:=p; token_type:=t; 7750if t>=macro then {the token list starts with a reference count} 7751 begin add_token_ref(p); 7752 if t=macro then param_start:=param_ptr 7753 else begin loc:=link(p); 7754 if tracing_macros>1 then 7755 begin begin_diagnostic; print_nl(""); 7756 case t of 7757 mark_text:print_esc("mark"); 7758 write_text:print_esc("write"); 7759 othercases print_cmd_chr(assign_toks,t-output_text+output_routine_loc) 7760 endcases;@/ 7761 print("->"); token_show(p); end_diagnostic(false); 7762 end; 7763 end; 7764 end 7765else loc:=p; 7766end; 7767 7768@ When a token list has been fully scanned, the following computations 7769should be done as we leave that level of input. The |token_type| tends 7770to be equal to either |backed_up| or |inserted| about 2/3 of the time. 7771@^inner loop@> 7772 7773@p procedure end_token_list; {leave a token-list input level} 7774begin if token_type>=backed_up then {token list to be deleted} 7775 begin if token_type<=inserted then flush_list(start) 7776 else begin delete_token_ref(start); {update reference count} 7777 if token_type=macro then {parameters must be flushed} 7778 while param_ptr>param_start do 7779 begin decr(param_ptr); 7780 flush_list(param_stack[param_ptr]); 7781 end; 7782 end; 7783 end 7784else if token_type=u_template then 7785 if align_state>500000 then align_state:=0 7786 else fatal_error("(interwoven alignment preambles are not allowed)"); 7787@.interwoven alignment preambles...@> 7788pop_input; 7789check_interrupt; 7790end; 7791 7792@ Sometimes \TeX\ has read too far and wants to ``unscan'' what it has 7793seen. The |back_input| procedure takes care of this by putting the token 7794just scanned back into the input stream, ready to be read again. This 7795procedure can be used only if |cur_tok| represents the token to be 7796replaced. Some applications of \TeX\ use this procedure a lot, 7797so it has been slightly optimized for speed. 7798@^inner loop@> 7799 7800@p procedure back_input; {undoes one token of input} 7801var p:pointer; {a token list of length one} 7802begin while (state=token_list)and(loc=null)and(token_type<>v_template) do 7803 end_token_list; {conserve stack space} 7804p:=get_avail; info(p):=cur_tok; 7805if cur_tok<right_brace_limit then 7806 if cur_tok<left_brace_limit then decr(align_state) 7807 else incr(align_state); 7808push_input; state:=token_list; start:=p; token_type:=backed_up; 7809loc:=p; {that was |back_list(p)|, without procedure overhead} 7810end; 7811 7812@ @<Insert token |p| into \TeX's input@>= 7813begin t:=cur_tok; cur_tok:=p; 7814if a then 7815 begin p:=get_avail; info(p):=cur_tok; link(p):=loc; loc:=p; start:=p; 7816 if cur_tok<right_brace_limit then 7817 if cur_tok<left_brace_limit then decr(align_state) 7818 else incr(align_state); 7819 end 7820else begin back_input; a:=eTeX_ex; 7821 end; 7822cur_tok:=t; 7823end 7824 7825@ The |back_error| routine is used when we want to replace an offending token 7826just before issuing an error message. This routine, like |back_input|, 7827requires that |cur_tok| has been set. We disable interrupts during the 7828call of |back_input| so that the help message won't be lost. 7829 7830@p procedure back_error; {back up one token and call |error|} 7831begin OK_to_interrupt:=false; back_input; OK_to_interrupt:=true; error; 7832end; 7833@# 7834procedure ins_error; {back up one inserted token and call |error|} 7835begin OK_to_interrupt:=false; back_input; token_type:=inserted; 7836OK_to_interrupt:=true; error; 7837end; 7838 7839@ The |begin_file_reading| procedure starts a new level of input for lines 7840of characters to be read from a file, or as an insertion from the 7841terminal. It does not take care of opening the file, nor does it set |loc| 7842or |limit| or |line|. 7843@^system dependencies@> 7844 7845@p procedure begin_file_reading; 7846begin if in_open=max_in_open then overflow("text input levels",max_in_open); 7847@:TeX capacity exceeded text input levels}{\quad text input levels@> 7848if first=buf_size then overflow("buffer size",buf_size); 7849@:TeX capacity exceeded buffer size}{\quad buffer size@> 7850incr(in_open); push_input; index:=in_open; 7851eof_seen[index]:=false; 7852grp_stack[index]:=cur_boundary; if_stack[index]:=cond_ptr; 7853line_stack[index]:=line; start:=first; state:=mid_line; 7854name:=0; {|terminal_input| is now |true|} 7855end; 7856 7857@ Conversely, the variables must be downdated when such a level of input 7858is finished: 7859 7860@p procedure end_file_reading; 7861begin first:=start; line:=line_stack[index]; 7862if (name=18)or(name=19) then pseudo_close else 7863if name>17 then u_close(cur_file); {forget it} 7864pop_input; decr(in_open); 7865end; 7866 7867@ In order to keep the stack from overflowing during a long sequence of 7868inserted `\.{\\show}' commands, the following routine removes completed 7869error-inserted lines from memory. 7870 7871@p procedure clear_for_error_prompt; 7872begin while (state<>token_list)and terminal_input and@| 7873 (input_ptr>0)and(loc>limit) do end_file_reading; 7874print_ln; clear_terminal; 7875end; 7876 7877@ To get \TeX's whole input mechanism going, we perform the following 7878actions. 7879 7880@<Initialize the input routines@>= 7881begin input_ptr:=0; max_in_stack:=0; 7882in_open:=0; open_parens:=0; max_buf_stack:=0; 7883grp_stack[0]:=0; if_stack[0]:=null; 7884param_ptr:=0; max_param_stack:=0; 7885first:=buf_size; repeat buffer[first]:=0; decr(first); until first=0; 7886scanner_status:=normal; warning_index:=null; first:=1; 7887state:=new_line; start:=1; index:=0; line:=0; name:=0; 7888force_eof:=false; 7889align_state:=1000000;@/ 7890if not init_terminal then goto final_end; 7891limit:=last; first:=last+1; {|init_terminal| has set |loc| and |last|} 7892end 7893 7894@* \[24] Getting the next token. 7895The heart of \TeX's input mechanism is the |get_next| procedure, which 7896we shall develop in the next few sections of the program. Perhaps we 7897shouldn't actually call it the ``heart,'' however, because it really acts 7898as \TeX's eyes and mouth, reading the source files and gobbling them up. 7899And it also helps \TeX\ to regurgitate stored token lists that are to be 7900processed again. 7901@^eyes and mouth@> 7902 7903The main duty of |get_next| is to input one token and to set |cur_cmd| 7904and |cur_chr| to that token's command code and modifier. Furthermore, if 7905the input token is a control sequence, the |eqtb| location of that control 7906sequence is stored in |cur_cs|; otherwise |cur_cs| is set to zero. 7907 7908Underlying this simple description is a certain amount of complexity 7909because of all the cases that need to be handled. 7910However, the inner loop of |get_next| is reasonably short and fast. 7911 7912When |get_next| is asked to get the next token of a \.{\\read} line, 7913it sets |cur_cmd=cur_chr=cur_cs=0| in the case that no more tokens 7914appear on that line. (There might not be any tokens at all, if the 7915|end_line_char| has |ignore| as its catcode.) 7916 7917@ The value of |par_loc| is the |eqtb| address of `\.{\\par}'. This quantity 7918is needed because a blank line of input is supposed to be exactly equivalent 7919to the appearance of \.{\\par}; we must set |cur_cs:=par_loc| 7920when detecting a blank line. 7921 7922@<Glob...@>= 7923@!par_loc:pointer; {location of `\.{\\par}' in |eqtb|} 7924@!par_token:halfword; {token representing `\.{\\par}'} 7925 7926@ @<Put each...@>= 7927primitive("par",par_end,too_big_usv); {cf.\ |scan_file_name|} 7928@!@:par_}{\.{\\par} primitive@> 7929par_loc:=cur_val; par_token:=cs_token_flag+par_loc; 7930 7931@ @<Cases of |print_cmd_chr|...@>= 7932par_end:print_esc("par"); 7933 7934@ Before getting into |get_next|, let's consider the subroutine that 7935is called when an `\.{\\outer}' control sequence has been scanned or 7936when the end of a file has been reached. These two cases are distinguished 7937by |cur_cs|, which is zero at the end of a file. 7938 7939@p procedure check_outer_validity; 7940var p:pointer; {points to inserted token list} 7941@!q:pointer; {auxiliary pointer} 7942begin if scanner_status<>normal then 7943 begin deletions_allowed:=false; 7944 @<Back up an outer control sequence so that it can be reread@>; 7945 if scanner_status>skipping then 7946 @<Tell the user what has run away and try to recover@> 7947 else begin print_err("Incomplete "); print_cmd_chr(if_test,cur_if); 7948@.Incomplete \\if...@> 7949 print("; all text was ignored after line "); print_int(skip_line); 7950 help3("A forbidden control sequence occurred in skipped text.")@/ 7951 ("This kind of error happens when you say `\if...' and forget")@/ 7952 ("the matching `\fi'. I've inserted a `\fi'; this might work."); 7953 if cur_cs<>0 then cur_cs:=0 7954 else help_line[2]:=@| 7955 "The file ended while I was skipping conditional text."; 7956 cur_tok:=cs_token_flag+frozen_fi; ins_error; 7957 end; 7958 deletions_allowed:=true; 7959 end; 7960end; 7961 7962@ An outer control sequence that occurs in a \.{\\read} will not be reread, 7963since the error recovery for \.{\\read} is not very powerful. 7964 7965@<Back up an outer control sequence so that it can be reread@>= 7966if cur_cs<>0 then 7967 begin if (state=token_list)or(name<1)or(name>17) then 7968 begin p:=get_avail; info(p):=cs_token_flag+cur_cs; 7969 back_list(p); {prepare to read the control sequence again} 7970 end; 7971 cur_cmd:=spacer; cur_chr:=" "; {replace it by a space} 7972 end 7973 7974@ @<Tell the user what has run away...@>= 7975begin runaway; {print a definition, argument, or preamble} 7976if cur_cs=0 then print_err("File ended") 7977@.File ended while scanning...@> 7978else begin cur_cs:=0; print_err("Forbidden control sequence found"); 7979@.Forbidden control sequence...@> 7980 end; 7981print(" while scanning "); 7982@<Print either `\.{definition}' or `\.{use}' or `\.{preamble}' or `\.{text}', 7983 and insert tokens that should lead to recovery@>; 7984print(" of "); sprint_cs(warning_index); 7985help4("I suspect you have forgotten a `}', causing me")@/ 7986("to read past where you wanted me to stop.")@/ 7987("I'll try to recover; but if the error is serious,")@/ 7988("you'd better type `E' or `X' now and fix your file.");@/ 7989error; 7990end 7991 7992@ The recovery procedure can't be fully understood without knowing more 7993about the \TeX\ routines that should be aborted, but we can sketch the 7994ideas here: For a runaway definition we will insert a right brace; for a 7995runaway preamble, we will insert a special \.{\\cr} token and a right 7996brace; and for a runaway argument, we will set |long_state| to 7997|outer_call| and insert \.{\\par}. 7998 7999@<Print either `\.{definition}' or ...@>= 8000p:=get_avail; 8001case scanner_status of 8002defining:begin print("definition"); info(p):=right_brace_token+"}"; 8003 end; 8004matching:begin print("use"); info(p):=par_token; long_state:=outer_call; 8005 end; 8006aligning:begin print("preamble"); info(p):=right_brace_token+"}"; q:=p; 8007 p:=get_avail; link(p):=q; info(p):=cs_token_flag+frozen_cr; 8008 align_state:=-1000000; 8009 end; 8010absorbing:begin print("text"); info(p):=right_brace_token+"}"; 8011 end; 8012end; {there are no other cases} 8013ins_list(p) 8014 8015@ We need to mention a procedure here that may be called by |get_next|. 8016 8017@p procedure@?firm_up_the_line; forward; 8018 8019@ Now we're ready to take the plunge into |get_next| itself. Parts of 8020this routine are executed more often than any other instructions of \TeX. 8021@^mastication@>@^inner loop@> 8022 8023@d switch=25 {a label in |get_next|} 8024@d start_cs=26 {another} 8025@d not_exp=27 8026 8027@p procedure get_next; {sets |cur_cmd|, |cur_chr|, |cur_cs| to next token} 8028label restart, {go here to get the next input token} 8029 switch, {go here to eat the next character from a file} 8030 reswitch, {go here to digest it again} 8031 start_cs, {go here to start looking for a control sequence} 8032 found, {go here when a control sequence has been found} 8033 not_exp, {go here when ^^ turned out not to start an expanded code} 8034 exit; {go here when the next input token has been got} 8035var k:0..buf_size; {an index into |buffer|} 8036@!t:halfword; {a token} 8037@!cat:0..max_char_code; {|cat_code(cur_chr)|, usually} 8038@!c:UnicodeScalar; {constituent of a possible expanded code} 8039@!lower:UTF16_code; {lower surrogate of a possible UTF-16 compound} 8040@!d:small_number; {number of excess characters in an expanded code} 8041@!sup_count:small_number; {number of identical |sup_mark| characters} 8042begin restart: cur_cs:=0; 8043if state<>token_list then 8044@<Input from external file, |goto restart| if no input found@> 8045else @<Input from token list, |goto restart| if end of list or 8046 if a parameter needs to be expanded@>; 8047@<If an alignment entry has just ended, take appropriate action@>; 8048exit:end; 8049 8050@ An alignment entry ends when a tab or \.{\\cr} occurs, provided that the 8051current level of braces is the same as the level that was present at the 8052beginning of that alignment entry; i.e., provided that |align_state| has 8053returned to the value it had after the \<u_j> template for that entry. 8054@^inner loop@> 8055 8056@<If an alignment entry has just ended, take appropriate action@>= 8057if cur_cmd<=car_ret then if cur_cmd>=tab_mark then if align_state=0 then 8058 @<Insert the \(v)\<v_j> template and |goto restart|@> 8059 8060@ @<Input from external file, |goto restart| if no input found@>= 8061@^inner loop@> 8062begin switch: if loc<=limit then {current line not yet finished} 8063 begin cur_chr:=buffer[loc]; incr(loc); 8064 if (cur_chr >= @"D800) and (cur_chr < @"DC00) 8065 and (loc <= limit) and (buffer[loc] >= @"DC00) and (buffer[loc] < @"E000) then 8066 begin 8067 lower := buffer[loc] - @"DC00; 8068 incr(loc); 8069 cur_chr := @"10000 + (cur_chr - @"D800) * 1024 + lower; 8070 end; 8071 reswitch: cur_cmd:=cat_code(cur_chr); 8072 @<Change state if necessary, and |goto switch| if the 8073 current character should be ignored, 8074 or |goto reswitch| if the current character 8075 changes to another@>; 8076 end 8077else begin state:=new_line;@/ 8078 @<Move to next line of file, 8079 or |goto restart| if there is no next line, 8080 or |return| if a \.{\\read} line has finished@>; 8081 check_interrupt; 8082 goto switch; 8083 end; 8084end 8085 8086@ The following 48-way switch accomplishes the scanning quickly, assuming 8087that a decent \PASCAL\ compiler has translated the code. Note that the numeric 8088values for |mid_line|, |skip_blanks|, and |new_line| are spaced 8089apart from each other by |max_char_code+1|, so we can add a character's 8090command code to the state to get a single number that characterizes both. 8091 8092@d any_state_plus(#) == mid_line+#,skip_blanks+#,new_line+# 8093 8094@<Change state if necessary...@>= 8095case state+cur_cmd of 8096@<Cases where character is ignored@>: goto switch; 8097any_state_plus(escape): @<Scan a control sequence 8098 and set |state:=skip_blanks| or |mid_line|@>; 8099any_state_plus(active_char): @<Process an active-character control sequence 8100 and set |state:=mid_line|@>; 8101any_state_plus(sup_mark): @<If this |sup_mark| starts an expanded character 8102 like~\.{\^\^A} or~\.{\^\^df}, then |goto reswitch|, 8103 otherwise set |state:=mid_line|@>; 8104any_state_plus(invalid_char): @<Decry the invalid character and 8105 |goto restart|@>; 8106@t\4@>@<Handle situations involving spaces, braces, changes of state@>@; 8107othercases do_nothing 8108endcases 8109 8110@ @<Cases where character is ignored@>= 8111any_state_plus(ignore),skip_blanks+spacer,new_line+spacer 8112 8113@ We go to |restart| instead of to |switch|, because |state| might equal 8114|token_list| after the error has been dealt with 8115(cf.\ |clear_for_error_prompt|). 8116 8117@<Decry the invalid...@>= 8118begin print_err("Text line contains an invalid character"); 8119@.Text line contains...@> 8120help2("A funny symbol that I can't read has just been input.")@/ 8121("Continue, and I'll forget that it ever happened.");@/ 8122deletions_allowed:=false; error; deletions_allowed:=true; 8123goto restart; 8124end 8125 8126@ @d add_delims_to(#)==#+math_shift,#+tab_mark,#+mac_param, 8127 #+sub_mark,#+letter,#+other_char 8128 8129@<Handle situations involving spaces, braces, changes of state@>= 8130mid_line+spacer:@<Enter |skip_blanks| state, emit a space@>; 8131mid_line+car_ret:@<Finish line, emit a space@>; 8132skip_blanks+car_ret,any_state_plus(comment): 8133 @<Finish line, |goto switch|@>; 8134new_line+car_ret:@<Finish line, emit a \.{\\par}@>; 8135mid_line+left_brace: incr(align_state); 8136skip_blanks+left_brace,new_line+left_brace: begin 8137 state:=mid_line; incr(align_state); 8138 end; 8139mid_line+right_brace: decr(align_state); 8140skip_blanks+right_brace,new_line+right_brace: begin 8141 state:=mid_line; decr(align_state); 8142 end; 8143add_delims_to(skip_blanks),add_delims_to(new_line): state:=mid_line; 8144 8145@ When a character of type |spacer| gets through, its character code is 8146changed to $\.{"\ "}=@'40$. This means that the ASCII codes for tab and space, 8147and for the space inserted at the end of a line, will 8148be treated alike when macro parameters are being matched. We do this 8149since such characters are indistinguishable on most computer terminal displays. 8150 8151@<Finish line, emit a space@>= 8152begin loc:=limit+1; cur_cmd:=spacer; cur_chr:=" "; 8153end 8154 8155@ The following code is performed only when |cur_cmd=spacer|. 8156 8157@<Enter |skip_blanks| state, emit a space@>= 8158begin state:=skip_blanks; cur_chr:=" "; 8159end 8160 8161@ @<Finish line, |goto switch|@>= 8162begin loc:=limit+1; goto switch; 8163end 8164 8165@ @<Finish line, emit a \.{\\par}@>= 8166begin loc:=limit+1; cur_cs:=par_loc; cur_cmd:=eq_type(cur_cs); 8167cur_chr:=equiv(cur_cs); 8168if cur_cmd>=outer_call then check_outer_validity; 8169end 8170 8171@ Notice that a code like \.{\^\^8} becomes \.x if not followed by a hex digit. 8172 8173@d is_hex(#)==(((#>="0")and(#<="9"))or((#>="a")and(#<="f"))) 8174@d hex_to_cur_chr== 8175 if c<="9" then cur_chr:=c-"0" @+else cur_chr:=c-"a"+10; 8176 if cc<="9" then cur_chr:=16*cur_chr+cc-"0" 8177 else cur_chr:=16*cur_chr+cc-"a"+10 8178@d long_hex_to_cur_chr== 8179 if c<="9" then cur_chr:=c-"0" @+else cur_chr:=c-"a"+10; 8180 if cc<="9" then cur_chr:=16*cur_chr+cc-"0" 8181 else cur_chr:=16*cur_chr+cc-"a"+10; 8182 if ccc<="9" then cur_chr:=16*cur_chr+ccc-"0" 8183 else cur_chr:=16*cur_chr+ccc-"a"+10; 8184 if cccc<="9" then cur_chr:=16*cur_chr+cccc-"0" 8185 else cur_chr:=16*cur_chr+cccc-"a"+10 8186 8187@<If this |sup_mark| starts an expanded character...@>= 8188begin if cur_chr=buffer[loc] then if loc<limit then 8189 begin sup_count:=2; 8190 {we have |^^| and another char; check how many |^|s we have altogether, up to a max of 6} 8191 while (sup_count<6) and (loc+2*sup_count-2<=limit) and (cur_chr=buffer[loc+sup_count-1]) do 8192 incr(sup_count); 8193 {check whether we have enough hex chars for the number of |^|s} 8194 for d:=1 to sup_count do 8195 if not is_hex(buffer[loc+sup_count-2+d]) then {found a non-hex char, so do single |^^X| style} 8196 begin c:=buffer[loc+1]; 8197 if c<@'200 then 8198 begin loc:=loc+2; 8199 if c<@'100 then cur_chr:=c+@'100 @+else cur_chr:=c-@'100; 8200 goto reswitch; 8201 end; 8202 goto not_exp; 8203 end; 8204 {there were the right number of hex chars, so convert them} 8205 cur_chr:=0; 8206 for d:=1 to sup_count do 8207 begin c:=buffer[loc+sup_count-2+d]; 8208 if c<="9" then cur_chr:=16*cur_chr+c-"0" 8209 else cur_chr:=16*cur_chr+c-"a"+10; 8210 end; 8211 {check the resulting value is within the valid range} 8212 if cur_chr>biggest_usv then 8213 begin cur_chr:=buffer[loc]; 8214 goto not_exp; 8215 end; 8216 loc:=loc+2*sup_count-1; 8217 goto reswitch; 8218 end; 8219not_exp: 8220state:=mid_line; 8221end 8222 8223@ @<Process an active-character...@>= 8224begin cur_cs:=cur_chr+active_base; 8225cur_cmd:=eq_type(cur_cs); cur_chr:=equiv(cur_cs); state:=mid_line; 8226if cur_cmd>=outer_call then check_outer_validity; 8227end 8228 8229@ Control sequence names are scanned only when they appear in some line of 8230a file; once they have been scanned the first time, their |eqtb| location 8231serves as a unique identification, so \TeX\ doesn't need to refer to the 8232original name any more except when it prints the equivalent in symbolic form. 8233 8234The program that scans a control sequence has been written carefully 8235in order to avoid the blowups that might otherwise occur if a malicious 8236user tried something like `\.{\\catcode\'15=0}'. The algorithm might 8237look at |buffer[limit+1]|, but it never looks at |buffer[limit+2]|. 8238 8239If expanded characters like `\.{\^\^A}' or `\.{\^\^df}' 8240appear in or just following 8241a control sequence name, they are converted to single characters in the 8242buffer and the process is repeated, slowly but surely. 8243 8244@<Scan a control...@>= 8245begin if loc>limit then cur_cs:=null_cs {|state| is irrelevant in this case} 8246else begin start_cs: k:=loc; cur_chr:=buffer[k]; cat:=cat_code(cur_chr); 8247 incr(k); 8248 if cat=letter then state:=skip_blanks 8249 else if cat=spacer then state:=skip_blanks 8250 else state:=mid_line; 8251 if (cat=letter)and(k<=limit) then 8252 @<Scan ahead in the buffer until finding a nonletter; 8253 if an expanded code is encountered, reduce it 8254 and |goto start_cs|; otherwise if a multiletter control 8255 sequence is found, adjust |cur_cs| and |loc|, and 8256 |goto found|@> 8257 else @<If an expanded code is present, reduce it and |goto start_cs|@>; 8258 {At this point, we have a single-character cs name in the buffer. 8259 But if the character code is > @"FFFF, we treat it like a multiletter name 8260 for string purposes, because we use UTF-16 in the string pool.} 8261 if buffer[loc]>@"FFFF then begin 8262 cur_cs:=id_lookup(loc,1); incr(loc); goto found; 8263 end; 8264 cur_cs:=single_base+buffer[loc]; incr(loc); 8265 end; 8266found: cur_cmd:=eq_type(cur_cs); cur_chr:=equiv(cur_cs); 8267if cur_cmd>=outer_call then check_outer_validity; 8268end 8269 8270@ Whenever we reach the following piece of code, we will have 8271|cur_chr=buffer[k-1]| and |k<=limit+1| and |cat=cat_code(cur_chr)|. If an 8272expanded code like \.{\^\^A} or \.{\^\^df} appears in |buffer[(k-1)..(k+1)]| 8273or |buffer[(k-1)..(k+2)]|, we 8274will store the corresponding code in |buffer[k-1]| and shift the rest of 8275the buffer left two or three places. 8276 8277@<If an expanded...@>= 8278begin if (cat=sup_mark) and (buffer[k]=cur_chr) and (k<limit) then 8279 begin sup_count:=2; 8280 {we have |^^| and another char; check how many |^|s we have altogether, up to a max of 6} 8281 while (sup_count<6) and (k+2*sup_count-2<=limit) and (buffer[k+sup_count-1]=cur_chr) do 8282 incr(sup_count); 8283 {check whether we have enough hex chars for the number of |^|s} 8284 for d:=1 to sup_count do 8285 if not is_hex(buffer[k+sup_count-2+d]) then {found a non-hex char, so do single |^^X| style} 8286 begin c:=buffer[k+1]; 8287 if c<@'200 then 8288 begin if c<@'100 then buffer[k-1]:=c+@'100 @+else buffer[k-1]:=c-@'100; 8289 d:=2; limit:=limit-d; 8290 while k<=limit do 8291 begin buffer[k]:=buffer[k+d]; incr(k); 8292 end; 8293 goto start_cs; 8294 end 8295 else sup_count:=0; 8296 end; 8297 if sup_count>0 then {there were the right number of hex chars, so convert them} 8298 begin cur_chr:=0; 8299 for d:=1 to sup_count do 8300 begin c:=buffer[k+sup_count-2+d]; 8301 if c<="9" then cur_chr:=16*cur_chr+c-"0" 8302 else cur_chr:=16*cur_chr+c-"a"+10; 8303 end; 8304 {check the resulting value is within the valid range} 8305 if cur_chr>biggest_usv then cur_chr:=buffer[k] 8306 else begin buffer[k-1]:=cur_chr; 8307 d:=2*sup_count-1; 8308 {shift the rest of the buffer left by |d| chars} 8309 limit:=limit-d; 8310 while k<=limit do 8311 begin buffer[k]:=buffer[k+d]; incr(k); 8312 end; 8313 goto start_cs; 8314 end 8315 end 8316 end 8317end 8318 8319@ @<Scan ahead in the buffer...@>= 8320begin repeat cur_chr:=buffer[k]; cat:=cat_code(cur_chr); incr(k); 8321until (cat<>letter)or(k>limit); 8322@<If an expanded...@>; 8323if cat<>letter then decr(k); 8324 {now |k| points to first nonletter} 8325if k>loc+1 then {multiletter control sequence has been scanned} 8326 begin cur_cs:=id_lookup(loc,k-loc); loc:=k; goto found; 8327 end; 8328end 8329 8330@ Let's consider now what happens when |get_next| is looking at a token list. 8331 8332@<Input from token list, |goto restart| if end of list or 8333 if a parameter needs to be expanded@>= 8334if loc<>null then {list not exhausted} 8335@^inner loop@> 8336 begin t:=info(loc); loc:=link(loc); {move to next} 8337 if t>=cs_token_flag then {a control sequence token} 8338 begin cur_cs:=t-cs_token_flag; 8339 cur_cmd:=eq_type(cur_cs); cur_chr:=equiv(cur_cs); 8340 if cur_cmd>=outer_call then 8341 if cur_cmd=dont_expand then 8342 @<Get the next token, suppressing expansion@> 8343 else check_outer_validity; 8344 end 8345 else begin cur_cmd:=t div max_char_val; cur_chr:=t mod max_char_val; 8346 case cur_cmd of 8347 left_brace: incr(align_state); 8348 right_brace: decr(align_state); 8349 out_param: @<Insert macro parameter and |goto restart|@>; 8350 othercases do_nothing 8351 endcases; 8352 end; 8353 end 8354else begin {we are done with this token list} 8355 end_token_list; goto restart; {resume previous level} 8356 end 8357 8358@ The present point in the program is reached only when the |expand| 8359routine has inserted a special marker into the input. In this special 8360case, |info(loc)| is known to be a control sequence token, and |link(loc)=null|. 8361 8362@d no_expand_flag=special_char {this characterizes a special variant of |relax|} 8363 8364@<Get the next token, suppressing expansion@>= 8365begin cur_cs:=info(loc)-cs_token_flag; loc:=null;@/ 8366cur_cmd:=eq_type(cur_cs); cur_chr:=equiv(cur_cs); 8367if cur_cmd>max_command then 8368 begin cur_cmd:=relax; cur_chr:=no_expand_flag; 8369 end; 8370end 8371 8372@ @<Insert macro parameter...@>= 8373begin begin_token_list(param_stack[param_start+cur_chr-1],parameter); 8374goto restart; 8375end 8376 8377@ All of the easy branches of |get_next| have now been taken care of. 8378There is one more branch. 8379 8380@d end_line_char_inactive == (end_line_char<0)or(end_line_char>255) 8381 8382@<Move to next line of file, or |goto restart|...@>= 8383if name>17 then @<Read next line of file into |buffer|, or 8384 |goto restart| if the file has ended@> 8385else begin if not terminal_input then {\.{\\read} line has ended} 8386 begin cur_cmd:=0; cur_chr:=0; return; 8387 end; 8388 if input_ptr>0 then {text was inserted during error recovery} 8389 begin end_file_reading; goto restart; {resume previous level} 8390 end; 8391 if selector<log_only then open_log_file; 8392 if interaction>nonstop_mode then 8393 begin if end_line_char_inactive then incr(limit); 8394 if limit=start then {previous line was empty} 8395 print_nl("(Please type a command or say `\end')"); 8396@.Please type...@> 8397 print_ln; first:=start; 8398 prompt_input("*"); {input on-line into |buffer|} 8399@.*\relax@> 8400 limit:=last; 8401 if end_line_char_inactive then decr(limit) 8402 else buffer[limit]:=end_line_char; 8403 first:=limit+1; 8404 loc:=start; 8405 end 8406 else fatal_error("*** (job aborted, no legal \end found)"); 8407@.job aborted@> 8408 {nonstop mode, which is intended for overnight batch processing, 8409 never waits for on-line input} 8410 end 8411 8412@ The global variable |force_eof| is normally |false|; it is set |true| 8413by an \.{\\endinput} command. 8414 8415@<Glob...@>= 8416@!force_eof:boolean; {should the next \.{\\input} be aborted early?} 8417 8418@ @<Read next line of file into |buffer|, or 8419 |goto restart| if the file has ended@>= 8420begin incr(line); first:=start; 8421if not force_eof then 8422 if name<=19 then 8423 begin if pseudo_input then {not end of file} 8424 firm_up_the_line {this sets |limit|} 8425 else if (every_eof<>null)and not eof_seen[index] then 8426 begin limit:=first-1; eof_seen[index]:=true; {fake one empty line} 8427 begin_token_list(every_eof,every_eof_text); goto restart; 8428 end 8429 else force_eof:=true; 8430 end 8431 else 8432 begin if input_ln(cur_file,true) then {not end of file} 8433 firm_up_the_line {this sets |limit|} 8434 else if (every_eof<>null)and not eof_seen[index] then 8435 begin limit:=first-1; eof_seen[index]:=true; {fake one empty line} 8436 begin_token_list(every_eof,every_eof_text); goto restart; 8437 end 8438 else force_eof:=true; 8439 end; 8440if force_eof then 8441 begin if tracing_nesting>0 then 8442 if (grp_stack[in_open]<>cur_boundary)or@| 8443 (if_stack[in_open]<>cond_ptr) then file_warning; 8444 {give warning for some unfinished groups and/or conditionals} 8445 if name>=19 then 8446 begin print_char(")"); decr(open_parens); 8447 update_terminal; {show user that file has been read} 8448 end; 8449 force_eof:=false; 8450 end_file_reading; {resume previous level} 8451 check_outer_validity; goto restart; 8452 end; 8453if end_line_char_inactive then decr(limit) 8454else buffer[limit]:=end_line_char; 8455first:=limit+1; loc:=start; {ready to read} 8456end 8457 8458@ If the user has set the |pausing| parameter to some positive value, 8459and if nonstop mode has not been selected, each line of input is displayed 8460on the terminal and the transcript file, followed by `\.{=>}'. 8461\TeX\ waits for a response. If the response is simply |carriage_return|, the 8462line is accepted as it stands, otherwise the line typed is 8463used instead of the line in the file. 8464 8465@p procedure firm_up_the_line; 8466var k:0..buf_size; {an index into |buffer|} 8467begin limit:=last; 8468if pausing>0 then if interaction>nonstop_mode then 8469 begin wake_up_terminal; print_ln; 8470 if start<limit then for k:=start to limit-1 do print(buffer[k]); 8471 first:=limit; prompt_input("=>"); {wait for user response} 8472@.=>@> 8473 if last>first then 8474 begin for k:=first to last-1 do {move line down in buffer} 8475 buffer[k+start-first]:=buffer[k]; 8476 limit:=start+last-first; 8477 end; 8478 end; 8479end; 8480 8481@ Since |get_next| is used so frequently in \TeX, it is convenient 8482to define three related procedures that do a little more: 8483 8484\yskip\hang|get_token| not only sets |cur_cmd| and |cur_chr|, it 8485also sets |cur_tok|, a packed halfword version of the current token. 8486 8487\yskip\hang|get_x_token|, meaning ``get an expanded token,'' is like 8488|get_token|, but if the current token turns out to be a user-defined 8489control sequence (i.e., a macro call), or a conditional, 8490or something like \.{\\topmark} or \.{\\expandafter} or \.{\\csname}, 8491it is eliminated from the input by beginning the expansion of the macro 8492or the evaluation of the conditional. 8493 8494\yskip\hang|x_token| is like |get_x_token| except that it assumes that 8495|get_next| has already been called. 8496 8497\yskip\noindent 8498In fact, these three procedures account for almost every use of |get_next|. 8499 8500@ No new control sequences will be defined except during a call of 8501|get_token|, or when \.{\\csname} compresses a token list, because 8502|no_new_control_sequence| is always |true| at other times. 8503 8504@p procedure get_token; {sets |cur_cmd|, |cur_chr|, |cur_tok|} 8505begin no_new_control_sequence:=false; get_next; no_new_control_sequence:=true; 8506@^inner loop@> 8507if cur_cs=0 then cur_tok:=(cur_cmd*max_char_val)+cur_chr 8508else cur_tok:=cs_token_flag+cur_cs; 8509end; 8510 8511@* \[25] Expanding the next token. 8512Only a dozen or so command codes |>max_command| can possibly be returned by 8513|get_next|; in increasing order, they are |undefined_cs|, |expand_after|, 8514|no_expand|, |input|, |if_test|, |fi_or_else|, |cs_name|, |convert|, |the|, 8515|top_bot_mark|, |call|, |long_call|, |outer_call|, |long_outer_call|, and 8516|end_template|.{\emergencystretch=40pt\par} 8517 8518The |expand| subroutine is used when |cur_cmd>max_command|. It removes a 8519``call'' or a conditional or one of the other special operations just 8520listed. It follows that |expand| might invoke itself recursively. In all 8521cases, |expand| destroys the current token, but it sets things up so that 8522the next |get_next| will deliver the appropriate next token. The value of 8523|cur_tok| need not be known when |expand| is called. 8524 8525Since several of the basic scanning routines communicate via global variables, 8526their values are saved as local variables of |expand| so that 8527recursive calls don't invalidate them. 8528@^recursion@> 8529 8530@p@t\4@>@<Declare the procedure called |macro_call|@>@;@/ 8531@t\4@>@<Declare the procedure called |insert_relax|@>@;@/ 8532@t\4@>@<Declare \eTeX\ procedures for expanding@>@;@/ 8533procedure@?pass_text; forward;@t\2@> 8534procedure@?start_input; forward;@t\2@> 8535procedure@?conditional; forward;@t\2@> 8536procedure@?get_x_token; forward;@t\2@> 8537procedure@?conv_toks; forward;@t\2@> 8538procedure@?ins_the_toks; forward;@t\2@> 8539procedure expand; 8540label reswitch; 8541var t:halfword; {token that is being ``expanded after''} 8542@!b:boolean; {keep track of nested csnames} 8543@!p,@!q,@!r:pointer; {for list manipulation} 8544@!j:0..buf_size; {index into |buffer|} 8545@!cv_backup:integer; {to save the global quantity |cur_val|} 8546@!cvl_backup,@!radix_backup,@!co_backup:small_number; 8547 {to save |cur_val_level|, etc.} 8548@!backup_backup:pointer; {to save |link(backup_head)|} 8549@!save_scanner_status:small_number; {temporary storage of |scanner_status|} 8550begin cv_backup:=cur_val; cvl_backup:=cur_val_level; radix_backup:=radix; 8551co_backup:=cur_order; backup_backup:=link(backup_head); 8552reswitch: 8553if cur_cmd<call then @<Expand a nonmacro@> 8554else if cur_cmd<end_template then macro_call 8555else @<Insert a token containing |frozen_endv|@>; 8556cur_val:=cv_backup; cur_val_level:=cvl_backup; radix:=radix_backup; 8557cur_order:=co_backup; link(backup_head):=backup_backup; 8558end; 8559 8560@ @<Glob...@>= 8561@!is_in_csname: boolean; 8562 8563@ @<Set init...@>= 8564is_in_csname:=false; 8565 8566@ @<Expand a nonmacro@>= 8567begin if tracing_commands>1 then show_cur_cmd_chr; 8568case cur_cmd of 8569top_bot_mark:@<Insert the \(a)appropriate mark text into the scanner@>; 8570expand_after:if cur_chr=0 then @<Expand the token after the next token@> 8571 else @<Negate a boolean conditional and |goto reswitch|@>; 8572no_expand: if cur_chr=0 then @<Suppress expansion of the next token@> 8573 else @<Implement \.{\\primitive}@>; 8574cs_name:@<Manufacture a control sequence name@>; 8575convert:conv_toks; {this procedure is discussed in Part 27 below} 8576the:ins_the_toks; {this procedure is discussed in Part 27 below} 8577if_test:conditional; {this procedure is discussed in Part 28 below} 8578fi_or_else:@<Terminate the current conditional and skip to \.{\\fi}@>; 8579input:@<Initiate or terminate input from a file@>; 8580othercases @<Complain about an undefined macro@> 8581endcases; 8582end 8583 8584@ It takes only a little shuffling to do what \TeX\ calls \.{\\expandafter}. 8585 8586@<Expand the token after...@>= 8587begin get_token; t:=cur_tok; get_token; 8588if cur_cmd>max_command then expand@+else back_input; 8589cur_tok:=t; back_input; 8590end 8591 8592@ The implementation of \.{\\noexpand} is a bit trickier, because it is 8593necessary to insert a special `|dont_expand|' marker into \TeX's reading 8594mechanism. This special marker is processed by |get_next|, but it does 8595not slow down the inner loop. 8596 8597Since \.{\\outer} macros might arise here, we must also 8598clear the |scanner_status| temporarily. 8599 8600@<Suppress expansion...@>= 8601begin save_scanner_status:=scanner_status; scanner_status:=normal; 8602get_token; scanner_status:=save_scanner_status; t:=cur_tok; 8603back_input; {now |start| and |loc| point to the backed-up token |t|} 8604if t>=cs_token_flag then 8605 begin p:=get_avail; info(p):=cs_token_flag+frozen_dont_expand; 8606 link(p):=loc; start:=p; loc:=p; 8607 end; 8608end 8609 8610@ The \.{\\primitive} handling. If the primitive meaning of the next 8611token is an expandable command, it suffices to replace the current 8612token with the primitive one and restart |expand|/ 8613 8614Otherwise, the token we just read has to be pushed back, as well 8615as a token matching the internal form of \.{\\primitive}, that is 8616sneaked in as an alternate form of |ignore_spaces|. 8617@!@:primitive_}{\.{\\primitive} primitive (internalized)@> 8618 8619Simply pushing back a token that matches the correct internal command 8620does not work, because approach would not survive roundtripping to a 8621temporary file. 8622 8623@<Implement \.{\\primitive}@>= 8624begin save_scanner_status:=scanner_status; scanner_status:=normal; 8625get_token; scanner_status:=save_scanner_status; 8626if cur_cs < hash_base then 8627 cur_cs:=prim_lookup(cur_cs-257) 8628else 8629 cur_cs:=prim_lookup(text(cur_cs)); 8630if cur_cs<>undefined_primitive then begin 8631 t:=prim_eq_type(cur_cs); 8632 if t>max_command then begin 8633 cur_cmd:=t; 8634 cur_chr:=prim_equiv(cur_cs); 8635 cur_tok:=(cur_cmd*max_char_val)+cur_chr; 8636 cur_cs :=0; 8637 goto reswitch; 8638 end 8639 else begin 8640 back_input; { now |loc| and |start| point to a one-item list } 8641 p:=get_avail; info(p):=cs_token_flag+frozen_primitive; 8642 link(p):=loc; loc:=p; start:=p; 8643 end; 8644 end; 8645end 8646 8647@ @<Complain about an undefined macro@>= 8648begin print_err("Undefined control sequence"); 8649@.Undefined control sequence@> 8650help5("The control sequence at the end of the top line")@/ 8651("of your error message was never \def'ed. If you have")@/ 8652("misspelled it (e.g., `\hobx'), type `I' and the correct")@/ 8653("spelling (e.g., `I\hbox'). Otherwise just continue,")@/ 8654("and I'll forget about whatever was undefined."); 8655error; 8656end 8657 8658@ The |expand| procedure and some other routines that construct token 8659lists find it convenient to use the following macros, which are valid only if 8660the variables |p| and |q| are reserved for token-list building. 8661 8662@d store_new_token(#)==begin q:=get_avail; link(p):=q; info(q):=#; 8663 p:=q; {|link(p)| is |null|} 8664 end 8665@d fast_store_new_token(#)==begin fast_get_avail(q); link(p):=q; info(q):=#; 8666 p:=q; {|link(p)| is |null|} 8667 end 8668 8669@ @<Manufacture a control...@>= 8670begin r:=get_avail; p:=r; {head of the list of characters} 8671b:=is_in_csname; is_in_csname:=true; 8672repeat get_x_token; 8673if cur_cs=0 then store_new_token(cur_tok); 8674until cur_cs<>0; 8675if cur_cmd<>end_cs_name then @<Complain about missing \.{\\endcsname}@>; 8676is_in_csname:=b; 8677@<Look up the characters of list |r| in the hash table, and set |cur_cs|@>; 8678flush_list(r); 8679if eq_type(cur_cs)=undefined_cs then 8680 begin eq_define(cur_cs,relax,too_big_usv); 8681 {N.B.: The |save_stack| might change} 8682 end; {the control sequence will now match `\.{\\relax}'} 8683cur_tok:=cur_cs+cs_token_flag; back_input; 8684end 8685 8686@ @<Complain about missing \.{\\endcsname}@>= 8687begin print_err("Missing "); print_esc("endcsname"); print(" inserted"); 8688@.Missing \\endcsname...@> 8689help2("The control sequence marked <to be read again> should")@/ 8690 ("not appear between \csname and \endcsname."); 8691back_error; 8692end 8693 8694@ @<Look up the characters of list |r| in the hash table...@>= 8695j:=first; p:=link(r); 8696while p<>null do 8697 begin if j>=max_buf_stack then 8698 begin max_buf_stack:=j+1; 8699 if max_buf_stack=buf_size then 8700 overflow("buffer size",buf_size); 8701@:TeX capacity exceeded buffer size}{\quad buffer size@> 8702 end; 8703 buffer[j]:=info(p) mod max_char_val; incr(j); p:=link(p); 8704 end; 8705if (j>first+1) or (buffer[first]>@"FFFF) then 8706 begin no_new_control_sequence:=false; cur_cs:=id_lookup(first,j-first); 8707 no_new_control_sequence:=true; 8708 end 8709else if j=first then cur_cs:=null_cs {the list is empty} 8710else cur_cs:=single_base+buffer[first] {the list has length one} 8711 8712@ An |end_template| command is effectively changed to an |endv| command 8713by the following code. (The reason for this is discussed below; the 8714|frozen_end_template| at the end of the template has passed the 8715|check_outer_validity| test, so its mission of error detection has been 8716accomplished.) 8717 8718@<Insert a token containing |frozen_endv|@>= 8719begin cur_tok:=cs_token_flag+frozen_endv; back_input; 8720end 8721 8722@ The processing of \.{\\input} involves the |start_input| subroutine, 8723which will be declared later; the processing of \.{\\endinput} is trivial. 8724 8725@<Put each...@>= 8726primitive("input",input,0);@/ 8727@!@:input_}{\.{\\input} primitive@> 8728primitive("endinput",input,1);@/ 8729@!@:end_input_}{\.{\\endinput} primitive@> 8730 8731@ @<Cases of |print_cmd_chr|...@>= 8732input: if chr_code=0 then print_esc("input") 8733 @/@<Cases of |input| for |print_cmd_chr|@>@/ 8734 else print_esc("endinput"); 8735 8736@ @<Initiate or terminate input...@>= 8737if cur_chr=1 then force_eof:=true 8738@/@<Cases for |input|@>@/ 8739else if name_in_progress then insert_relax 8740else start_input 8741 8742@ Sometimes the expansion looks too far ahead, so we want to insert 8743a harmless \.{\\relax} into the user's input. 8744 8745@<Declare the procedure called |insert_relax|@>= 8746procedure insert_relax; 8747begin cur_tok:=cs_token_flag+cur_cs; back_input; 8748cur_tok:=cs_token_flag+frozen_relax; back_input; token_type:=inserted; 8749end; 8750 8751@ Here is a recursive procedure that is \TeX's usual way to get the 8752next token of input. It has been slightly optimized to take account of 8753common cases. 8754 8755@p procedure get_x_token; {sets |cur_cmd|, |cur_chr|, |cur_tok|, 8756 and expands macros} 8757label restart,done; 8758begin restart: get_next; 8759@^inner loop@> 8760if cur_cmd<=max_command then goto done; 8761if cur_cmd>=call then 8762 if cur_cmd<end_template then macro_call 8763 else begin cur_cs:=frozen_endv; cur_cmd:=endv; 8764 goto done; {|cur_chr=null_list|} 8765 end 8766else expand; 8767goto restart; 8768done: if cur_cs=0 then cur_tok:=(cur_cmd*max_char_val)+cur_chr 8769else cur_tok:=cs_token_flag+cur_cs; 8770end; 8771 8772@ The |get_x_token| procedure is equivalent to two consecutive 8773procedure calls: |get_next; x_token|. 8774 8775@p procedure x_token; {|get_x_token| without the initial |get_next|} 8776begin while cur_cmd>max_command do 8777 begin expand; 8778 get_next; 8779 end; 8780if cur_cs=0 then cur_tok:=(cur_cmd*max_char_val)+cur_chr 8781else cur_tok:=cs_token_flag+cur_cs; 8782end; 8783 8784@ A control sequence that has been \.{\\def}'ed by the user is expanded by 8785\TeX's |macro_call| procedure. 8786 8787Before we get into the details of |macro_call|, however, let's consider the 8788treatment of primitives like \.{\\topmark}, since they are essentially 8789macros without parameters. The token lists for such marks are kept in a 8790global array of five pointers; we refer to the individual entries of this 8791array by symbolic names |top_mark|, etc. The value of |top_mark| is either 8792|null| or a pointer to the reference count of a token list. 8793 8794@d marks_code==5 {add this for \.{\\topmarks} etc.} 8795@# 8796@d top_mark_code=0 {the mark in effect at the previous page break} 8797@d first_mark_code=1 {the first mark between |top_mark| and |bot_mark|} 8798@d bot_mark_code=2 {the mark in effect at the current page break} 8799@d split_first_mark_code=3 {the first mark found by \.{\\vsplit}} 8800@d split_bot_mark_code=4 {the last mark found by \.{\\vsplit}} 8801@d top_mark==cur_mark[top_mark_code] 8802@d first_mark==cur_mark[first_mark_code] 8803@d bot_mark==cur_mark[bot_mark_code] 8804@d split_first_mark==cur_mark[split_first_mark_code] 8805@d split_bot_mark==cur_mark[split_bot_mark_code] 8806 8807@<Glob...@>= 8808@!cur_mark:array[top_mark_code..split_bot_mark_code] of pointer; 8809 {token lists for marks} 8810 8811@ @<Set init...@>= 8812top_mark:=null; first_mark:=null; bot_mark:=null; 8813split_first_mark:=null; split_bot_mark:=null; 8814 8815@ @<Put each...@>= 8816primitive("topmark",top_bot_mark,top_mark_code); 8817@!@:top_mark_}{\.{\\topmark} primitive@> 8818primitive("firstmark",top_bot_mark,first_mark_code); 8819@!@:first_mark_}{\.{\\firstmark} primitive@> 8820primitive("botmark",top_bot_mark,bot_mark_code); 8821@!@:bot_mark_}{\.{\\botmark} primitive@> 8822primitive("splitfirstmark",top_bot_mark,split_first_mark_code); 8823@!@:split_first_mark_}{\.{\\splitfirstmark} primitive@> 8824primitive("splitbotmark",top_bot_mark,split_bot_mark_code); 8825@!@:split_bot_mark_}{\.{\\splitbotmark} primitive@> 8826 8827@ @<Cases of |print_cmd_chr|...@>= 8828top_bot_mark: begin case (chr_code mod marks_code) of 8829 first_mark_code: print_esc("firstmark"); 8830 bot_mark_code: print_esc("botmark"); 8831 split_first_mark_code: print_esc("splitfirstmark"); 8832 split_bot_mark_code: print_esc("splitbotmark"); 8833 othercases print_esc("topmark") 8834 endcases; 8835 if chr_code>=marks_code then print_char("s"); 8836 end; 8837 8838@ The following code is activated when |cur_cmd=top_bot_mark| and 8839when |cur_chr| is a code like |top_mark_code|. 8840 8841@<Insert the \(a)appropriate mark text into the scanner@>= 8842begin t:=cur_chr mod marks_code; 8843if cur_chr>=marks_code then scan_register_num@+else cur_val:=0; 8844if cur_val=0 then cur_ptr:=cur_mark[t] 8845else @<Compute the mark pointer for mark type |t| and class |cur_val|@>; 8846if cur_ptr<>null then begin_token_list(cur_ptr,mark_text); 8847end 8848 8849@ Now let's consider |macro_call| itself, which is invoked when \TeX\ is 8850scanning a control sequence whose |cur_cmd| is either |call|, |long_call|, 8851|outer_call|, or |long_outer_call|. The control sequence definition 8852appears in the token list whose reference count is in location |cur_chr| 8853of |mem|. 8854 8855The global variable |long_state| will be set to |call| or to |long_call|, 8856depending on whether or not the control sequence disallows \.{\\par} 8857in its parameters. The |get_next| routine will set |long_state| to 8858|outer_call| and emit \.{\\par}, if a file ends or if an \.{\\outer} 8859control sequence occurs in the midst of an argument. 8860 8861@<Glob...@>= 8862@!long_state:call..long_outer_call; {governs the acceptance of \.{\\par}} 8863 8864@ The parameters, if any, must be scanned before the macro is expanded. 8865Parameters are token lists without reference counts. They are placed on 8866an auxiliary stack called |pstack| while they are being scanned, since 8867the |param_stack| may be losing entries during the matching process. 8868(Note that |param_stack| can't be gaining entries, since |macro_call| is 8869the only routine that puts anything onto |param_stack|, and it 8870is not recursive.) 8871 8872@<Glob...@>= 8873@!pstack:array[0..8] of pointer; {arguments supplied to a macro} 8874 8875@ After parameter scanning is complete, the parameters are moved to the 8876|param_stack|. Then the macro body is fed to the scanner; in other words, 8877|macro_call| places the defined text of the control sequence at the 8878top of\/ \TeX's input stack, so that |get_next| will proceed to read it 8879next. 8880 8881The global variable |cur_cs| contains the |eqtb| address of the control sequence 8882being expanded, when |macro_call| begins. If this control sequence has not been 8883declared \.{\\long}, i.e., if its command code in the |eq_type| field is 8884not |long_call| or |long_outer_call|, its parameters are not allowed to contain 8885the control sequence \.{\\par}. If an illegal \.{\\par} appears, the macro 8886call is aborted, and the \.{\\par} will be rescanned. 8887 8888@<Declare the procedure called |macro_call|@>= 8889procedure macro_call; {invokes a user-defined control sequence} 8890label exit, continue, done, done1, found; 8891var r:pointer; {current node in the macro's token list} 8892@!p:pointer; {current node in parameter token list being built} 8893@!q:pointer; {new node being put into the token list} 8894@!s:pointer; {backup pointer for parameter matching} 8895@!t:pointer; {cycle pointer for backup recovery} 8896@!u,@!v:pointer; {auxiliary pointers for backup recovery} 8897@!rbrace_ptr:pointer; {one step before the last |right_brace| token} 8898@!n:small_number; {the number of parameters scanned} 8899@!unbalance:halfword; {unmatched left braces in current parameter} 8900@!m:halfword; {the number of tokens or groups (usually)} 8901@!ref_count:pointer; {start of the token list} 8902@!save_scanner_status:small_number; {|scanner_status| upon entry} 8903@!save_warning_index:pointer; {|warning_index| upon entry} 8904@!match_chr:ASCII_code; {character used in parameter} 8905begin save_scanner_status:=scanner_status; save_warning_index:=warning_index; 8906warning_index:=cur_cs; ref_count:=cur_chr; r:=link(ref_count); n:=0; 8907if tracing_macros>0 then @<Show the text of the macro being expanded@>; 8908if info(r)=protected_token then r:=link(r); 8909if info(r)<>end_match_token then 8910 @<Scan the parameters and make |link(r)| point to the macro body; but 8911 |return| if an illegal \.{\\par} is detected@>; 8912@<Feed the macro body and its parameters to the scanner@>; 8913exit:scanner_status:=save_scanner_status; warning_index:=save_warning_index; 8914end; 8915 8916@ Before we put a new token list on the input stack, it is wise to clean off 8917all token lists that have recently been depleted. Then a user macro that ends 8918with a call to itself will not require unbounded stack space. 8919 8920@<Feed the macro body and its parameters to the scanner@>= 8921while (state=token_list)and(loc=null)and(token_type<>v_template) do 8922 end_token_list; {conserve stack space} 8923begin_token_list(ref_count,macro); name:=warning_index; loc:=link(r); 8924if n>0 then 8925 begin if param_ptr+n>max_param_stack then 8926 begin max_param_stack:=param_ptr+n; 8927 if max_param_stack>param_size then 8928 overflow("parameter stack size",param_size); 8929@:TeX capacity exceeded parameter stack size}{\quad parameter stack size@> 8930 end; 8931 for m:=0 to n-1 do param_stack[param_ptr+m]:=pstack[m]; 8932 param_ptr:=param_ptr+n; 8933 end 8934 8935@ At this point, the reader will find it advisable to review the explanation 8936of token list format that was presented earlier, since many aspects of that 8937format are of importance chiefly in the |macro_call| routine. 8938 8939The token list might begin with a string of compulsory tokens before the 8940first |match| or |end_match|. In that case the macro name is supposed to be 8941followed by those tokens; the following program will set |s=null| to 8942represent this restriction. Otherwise |s| will be set to the first token of 8943a string that will delimit the next parameter. 8944 8945@<Scan the parameters and make |link(r)| point to the macro body...@>= 8946begin scanner_status:=matching; unbalance:=0; 8947long_state:=eq_type(cur_cs); 8948if long_state>=outer_call then long_state:=long_state-2; 8949repeat link(temp_head):=null; 8950if (info(r)>=end_match_token)or(info(r)<match_token) then s:=null 8951else begin match_chr:=info(r)-match_token; s:=link(r); r:=s; 8952 p:=temp_head; m:=0; 8953 end; 8954@<Scan a parameter until its delimiter string has been found; or, if |s=null|, 8955 simply scan the delimiter string@>;@/ 8956{now |info(r)| is a token whose command code is either |match| or |end_match|} 8957until info(r)=end_match_token; 8958end 8959 8960@ If |info(r)| is a |match| or |end_match| command, it cannot be equal to 8961any token found by |get_token|. Therefore an undelimited parameter---i.e., 8962a |match| that is immediately followed by |match| or |end_match|---will 8963always fail the test `|cur_tok=info(r)|' in the following algorithm. 8964 8965@<Scan a parameter until its delimiter string has been found; or, ...@>= 8966continue: get_token; {set |cur_tok| to the next token of input} 8967if cur_tok=info(r) then 8968 @<Advance \(r)|r|; |goto found| if the parameter delimiter has been 8969 fully matched, otherwise |goto continue|@>; 8970@<Contribute the recently matched tokens to the current parameter, and 8971 |goto continue| if a partial match is still in effect; 8972 but abort if |s=null|@>; 8973if cur_tok=par_token then if long_state<>long_call then 8974 @<Report a runaway argument and abort@>; 8975if cur_tok<right_brace_limit then 8976 if cur_tok<left_brace_limit then 8977 @<Contribute an entire group to the current parameter@> 8978 else @<Report an extra right brace and |goto continue|@> 8979else @<Store the current token, but |goto continue| if it is 8980 a blank space that would become an undelimited parameter@>; 8981incr(m); 8982if info(r)>end_match_token then goto continue; 8983if info(r)<match_token then goto continue; 8984found: if s<>null then @<Tidy up the parameter just scanned, and tuck it away@> 8985 8986@ @<Store the current token, but |goto continue| if it is...@>= 8987begin if cur_tok=space_token then 8988 if info(r)<=end_match_token then 8989 if info(r)>=match_token then goto continue; 8990store_new_token(cur_tok); 8991end 8992 8993@ A slightly subtle point arises here: When the parameter delimiter ends 8994with `\.{\#\{}', the token list will have a left brace both before and 8995after the |end_match|\kern-.4pt. Only one of these should affect the 8996|align_state|, but both will be scanned, so we must make a correction. 8997 8998@<Advance \(r)|r|; |goto found| if the parameter delimiter has been fully...@>= 8999begin r:=link(r); 9000if (info(r)>=match_token)and(info(r)<=end_match_token) then 9001 begin if cur_tok<left_brace_limit then decr(align_state); 9002 goto found; 9003 end 9004else goto continue; 9005end 9006 9007@ @<Report an extra right brace and |goto continue|@>= 9008begin back_input; print_err("Argument of "); sprint_cs(warning_index); 9009@.Argument of \\x has...@> 9010print(" has an extra }"); 9011help6("I've run across a `}' that doesn't seem to match anything.")@/ 9012 ("For example, `\def\a#1{...}' and `\a}' would produce")@/ 9013 ("this error. If you simply proceed now, the `\par' that")@/ 9014 ("I've just inserted will cause me to report a runaway")@/ 9015 ("argument that might be the root of the problem. But if")@/ 9016 ("your `}' was spurious, just type `2' and it will go away."); 9017incr(align_state); long_state:=call; cur_tok:=par_token; ins_error; 9018goto continue; 9019end {a white lie; the \.{\\par} won't always trigger a runaway} 9020 9021@ If |long_state=outer_call|, a runaway argument has already been reported. 9022 9023@<Report a runaway argument and abort@>= 9024begin if long_state=call then 9025 begin runaway; print_err("Paragraph ended before "); 9026@.Paragraph ended before...@> 9027 sprint_cs(warning_index); print(" was complete"); 9028 help3("I suspect you've forgotten a `}', causing me to apply this")@/ 9029 ("control sequence to too much text. How can we recover?")@/ 9030 ("My plan is to forget the whole thing and hope for the best."); 9031 back_error; 9032 end; 9033pstack[n]:=link(temp_head); align_state:=align_state-unbalance; 9034for m:=0 to n do flush_list(pstack[m]); 9035return; 9036end 9037 9038@ When the following code becomes active, we have matched tokens from |s| to 9039the predecessor of |r|, and we have found that |cur_tok<>info(r)|. An 9040interesting situation now presents itself: If the parameter is to be 9041delimited by a string such as `\.{ab}', and if we have scanned `\.{aa}', 9042we want to contribute one `\.a' to the current parameter and resume 9043looking for a `\.b'. The program must account for such partial matches and 9044for others that can be quite complex. But most of the time we have |s=r| 9045and nothing needs to be done. 9046 9047Incidentally, it is possible for \.{\\par} tokens to sneak in to certain 9048parameters of non-\.{\\long} macros. For example, consider a case like 9049`\.{\\def\\a\#1\\par!\{...\}}' where the first \.{\\par} is not followed 9050by an exclamation point. In such situations it does not seem appropriate 9051to prohibit the \.{\\par}, so \TeX\ keeps quiet about this bending of 9052the rules. 9053 9054@<Contribute the recently matched tokens to the current parameter...@>= 9055if s<>r then 9056 if s=null then @<Report an improper use of the macro and abort@> 9057 else begin t:=s; 9058 repeat store_new_token(info(t)); incr(m); u:=link(t); v:=s; 9059 loop@+ begin if u=r then 9060 if cur_tok<>info(v) then goto done 9061 else begin r:=link(v); goto continue; 9062 end; 9063 if info(u)<>info(v) then goto done; 9064 u:=link(u); v:=link(v); 9065 end; 9066 done: t:=link(t); 9067 until t=r; 9068 r:=s; {at this point, no tokens are recently matched} 9069 end 9070 9071@ @<Report an improper use...@>= 9072begin print_err("Use of "); sprint_cs(warning_index); 9073@.Use of x doesn't match...@> 9074print(" doesn't match its definition"); 9075help4("If you say, e.g., `\def\a1{...}', then you must always")@/ 9076 ("put `1' after `\a', since control sequence names are")@/ 9077 ("made up of letters only. The macro here has not been")@/ 9078 ("followed by the required stuff, so I'm ignoring it."); 9079error; return; 9080end 9081 9082@ @<Contribute an entire group to the current parameter@>= 9083begin unbalance:=1; 9084@^inner loop@> 9085loop@+ begin fast_store_new_token(cur_tok); get_token; 9086 if cur_tok=par_token then if long_state<>long_call then 9087 @<Report a runaway argument and abort@>; 9088 if cur_tok<right_brace_limit then 9089 if cur_tok<left_brace_limit then incr(unbalance) 9090 else begin decr(unbalance); 9091 if unbalance=0 then goto done1; 9092 end; 9093 end; 9094done1: rbrace_ptr:=p; store_new_token(cur_tok); 9095end 9096 9097@ If the parameter consists of a single group enclosed in braces, we must 9098strip off the enclosing braces. That's why |rbrace_ptr| was introduced. 9099 9100@<Tidy up the parameter just scanned, and tuck it away@>= 9101begin if (m=1)and(info(p)<right_brace_limit)and(p<>temp_head) then 9102 begin link(rbrace_ptr):=null; free_avail(p); 9103 p:=link(temp_head); pstack[n]:=link(p); free_avail(p); 9104 end 9105else pstack[n]:=link(temp_head); 9106incr(n); 9107if tracing_macros>0 then 9108 begin begin_diagnostic; print_nl(match_chr); print_int(n); 9109 print("<-"); show_token_list(pstack[n-1],null,1000); 9110 end_diagnostic(false); 9111 end; 9112end 9113 9114@ @<Show the text of the macro being expanded@>= 9115begin begin_diagnostic; print_ln; print_cs(warning_index); 9116token_show(ref_count); end_diagnostic(false); 9117end 9118 9119@* \[26] Basic scanning subroutines. 9120Let's turn now to some procedures that \TeX\ calls upon frequently to digest 9121certain kinds of patterns in the input. Most of these are quite simple; 9122some are quite elaborate. Almost all of the routines call |get_x_token|, 9123which can cause them to be invoked recursively. 9124@^stomach@> 9125@^recursion@> 9126 9127@ The |scan_left_brace| routine is called when a left brace is supposed to be 9128the next non-blank token. (The term ``left brace'' means, more precisely, 9129a character whose catcode is |left_brace|.) \TeX\ allows \.{\\relax} to 9130appear before the |left_brace|. 9131 9132@p procedure scan_left_brace; {reads a mandatory |left_brace|} 9133begin @<Get the next non-blank non-relax non-call token@>; 9134if cur_cmd<>left_brace then 9135 begin print_err("Missing { inserted"); 9136@.Missing \{ inserted@> 9137 help4("A left brace was mandatory here, so I've put one in.")@/ 9138 ("You might want to delete and/or insert some corrections")@/ 9139 ("so that I will find a matching right brace soon.")@/ 9140 ("(If you're confused by all this, try typing `I}' now.)"); 9141 back_error; cur_tok:=left_brace_token+"{"; cur_cmd:=left_brace; 9142 cur_chr:="{"; incr(align_state); 9143 end; 9144end; 9145 9146@ @<Get the next non-blank non-relax non-call token@>= 9147repeat get_x_token; 9148until (cur_cmd<>spacer)and(cur_cmd<>relax) 9149 9150@ The |scan_optional_equals| routine looks for an optional `\.=' sign preceded 9151by optional spaces; `\.{\\relax}' is not ignored here. 9152 9153@p procedure scan_optional_equals; 9154begin @<Get the next non-blank non-call token@>; 9155if cur_tok<>other_token+"=" then back_input; 9156end; 9157 9158@ @<Get the next non-blank non-call token@>= 9159repeat get_x_token; 9160until cur_cmd<>spacer 9161 9162@ In case you are getting bored, here is a slightly less trivial routine: 9163Given a string of lowercase letters, like `\.{pt}' or `\.{plus}' or 9164`\.{width}', the |scan_keyword| routine checks to see whether the next 9165tokens of input match this string. The match must be exact, except that 9166uppercase letters will match their lowercase counterparts; uppercase 9167equivalents are determined by subtracting |"a"-"A"|, rather than using the 9168|uc_code| table, since \TeX\ uses this routine only for its own limited 9169set of keywords. 9170 9171If a match is found, the characters are effectively removed from the input 9172and |true| is returned. Otherwise |false| is returned, and the input 9173is left essentially unchanged (except for the fact that some macros 9174may have been expanded, etc.). 9175@^inner loop@> 9176 9177@p function scan_keyword(@!s:str_number):boolean; {look for a given string} 9178label exit; 9179var p:pointer; {tail of the backup list} 9180@!q:pointer; {new node being added to the token list via |store_new_token|} 9181@!k:pool_pointer; {index into |str_pool|} 9182begin p:=backup_head; link(p):=null; 9183if s<too_big_char then begin 9184 while true do 9185 begin get_x_token; {recursion is possible here} 9186@^recursion@> 9187 if (cur_cs=0)and@| 9188 ((cur_chr=s)or(cur_chr=s-"a"+"A")) then 9189 begin store_new_token(cur_tok); 9190 flush_list(link(backup_head)); scan_keyword:=true; return; 9191 end 9192 else if (cur_cmd<>spacer)or(p<>backup_head) then 9193 begin back_input; 9194 if p<>backup_head then back_list(link(backup_head)); 9195 scan_keyword:=false; return; 9196 end; 9197 end; 9198 end; 9199k:=str_start_macro(s); 9200while k<str_start_macro(s+1) do 9201 begin get_x_token; {recursion is possible here} 9202@^recursion@> 9203 if (cur_cs=0)and@| 9204 ((cur_chr=so(str_pool[k]))or(cur_chr=so(str_pool[k])-"a"+"A")) then 9205 begin store_new_token(cur_tok); incr(k); 9206 end 9207 else if (cur_cmd<>spacer)or(p<>backup_head) then 9208 begin back_input; 9209 if p<>backup_head then back_list(link(backup_head)); 9210 scan_keyword:=false; return; 9211 end; 9212 end; 9213flush_list(link(backup_head)); scan_keyword:=true; 9214exit:end; 9215 9216@ Here is a procedure that sounds an alarm when mu and non-mu units 9217are being switched. 9218 9219@p procedure mu_error; 9220begin print_err("Incompatible glue units"); 9221@.Incompatible glue units@> 9222help1("I'm going to assume that 1mu=1pt when they're mixed."); 9223error; 9224end; 9225 9226@ The next routine `|scan_something_internal|' is used to fetch internal 9227numeric quantities like `\.{\\hsize}', and also to handle the `\.{\\the}' 9228when expanding constructions like `\.{\\the\\toks0}' and 9229`\.{\\the\\baselineskip}'. Soon we will be considering the |scan_int| 9230procedure, which calls |scan_something_internal|; on the other hand, 9231|scan_something_internal| also calls |scan_int|, for constructions like 9232`\.{\\catcode\`\\\$}' or `\.{\\fontdimen} \.3 \.{\\ff}'. So we 9233have to declare |scan_int| as a |forward| procedure. A few other 9234procedures are also declared at this point. 9235 9236@p procedure@?scan_int; forward; {scans an integer value} 9237@t\4\4@>@<Declare procedures that scan restricted classes of integers@>@; 9238@t\4\4@>@<Declare \eTeX\ procedures for scanning@>@; 9239@t\4\4@>@<Declare procedures that scan font-related stuff@> 9240 9241@ \TeX\ doesn't know exactly what to expect when |scan_something_internal| 9242begins. For example, an integer or dimension or glue value could occur 9243immediately after `\.{\\hskip}'; and one can even say \.{\\the} with 9244respect to token lists in constructions like 9245`\.{\\xdef\\o\{\\the\\output\}}'. On the other hand, only integers are 9246allowed after a construction like `\.{\\count}'. To handle the various 9247possibilities, |scan_something_internal| has a |level| parameter, which 9248tells the ``highest'' kind of quantity that |scan_something_internal| is 9249allowed to produce. Six levels are distinguished, namely |int_val|, 9250|dimen_val|, |glue_val|, |mu_val|, |ident_val|, and |tok_val|. 9251 9252The output of |scan_something_internal| (and of the other routines 9253|scan_int|, |scan_dimen|, and |scan_glue| below) is put into the global 9254variable |cur_val|, and its level is put into |cur_val_level|. The highest 9255values of |cur_val_level| are special: |mu_val| is used only when 9256|cur_val| points to something in a ``muskip'' register, or to one of the 9257three parameters \.{\\thinmuskip}, \.{\\medmuskip}, \.{\\thickmuskip}; 9258|ident_val| is used only when |cur_val| points to a font identifier; 9259|tok_val| is used only when |cur_val| points to |null| or to the reference 9260count of a token list. The last two cases are allowed only when 9261|scan_something_internal| is called with |level=tok_val|. 9262 9263If the output is glue, |cur_val| will point to a glue specification, and 9264the reference count of that glue will have been updated to reflect this 9265reference; if the output is a nonempty token list, |cur_val| will point to 9266its reference count, but in this case the count will not have been updated. 9267Otherwise |cur_val| will contain the integer or scaled value in question. 9268 9269@d int_val=0 {integer values} 9270@d dimen_val=1 {dimension values} 9271@d glue_val=2 {glue specifications} 9272@d mu_val=3 {math glue specifications} 9273@d ident_val=4 {font identifier} 9274@d tok_val=5 {token lists} 9275@d inter_char_val=6 {inter-character (class) token lists} 9276 9277@<Glob...@>= 9278@!cur_val:integer; {value returned by numeric scanners} 9279@!cur_val1:integer; {value returned by numeric scanners} 9280@!cur_val_level:int_val..tok_val; {the ``level'' of this value} 9281 9282@ The hash table is initialized with `\.{\\count}', `\.{\\dimen}', `\.{\\skip}', 9283and `\.{\\muskip}' all having |register| as their command code; they are 9284distinguished by the |chr_code|, which is either |int_val|, |dimen_val|, 9285|glue_val|, or |mu_val| more than |mem_bot| (dynamic variable-size nodes 9286cannot have these values) 9287 9288@<Put each...@>= 9289primitive("count",register,mem_bot+int_val); 9290@!@:count_}{\.{\\count} primitive@> 9291primitive("dimen",register,mem_bot+dimen_val); 9292@!@:dimen_}{\.{\\dimen} primitive@> 9293primitive("skip",register,mem_bot+glue_val); 9294@!@:skip_}{\.{\\skip} primitive@> 9295primitive("muskip",register,mem_bot+mu_val); 9296@!@:mu_skip_}{\.{\\muskip} primitive@> 9297 9298@ @<Cases of |print_cmd_chr|...@>= 9299register: @<Cases of |register| for |print_cmd_chr|@>; 9300 9301@ OK, we're ready for |scan_something_internal| itself. A second parameter, 9302|negative|, is set |true| if the value that is found should be negated. 9303It is assumed that |cur_cmd| and |cur_chr| represent the first token of 9304the internal quantity to be scanned; an error will be signalled if 9305|cur_cmd<min_internal| or |cur_cmd>max_internal|. 9306 9307@d scanned_result_end(#)==cur_val_level:=#;@+end 9308@d scanned_result(#)==@+begin cur_val:=#;scanned_result_end 9309 9310@p procedure scan_something_internal(@!level:small_number;@!negative:boolean); 9311 {fetch an internal parameter} 9312label exit; 9313var m:halfword; {|chr_code| part of the operand token} 9314 n, k, kk: integer; {accumulators} 9315@!q,@!r:pointer; {general purpose indices} 9316@!tx:pointer; {effective tail node} 9317@!i:four_quarters; {character info} 9318@!p:0..nest_size; {index into |nest|} 9319begin m:=cur_chr; 9320case cur_cmd of 9321def_code: @<Fetch a character code from some table@>; 9322XeTeX_def_code: 9323 begin 9324 scan_usv_num; 9325 if m=sf_code_base then begin 9326 scanned_result(ho(sf_code(cur_val) div @"10000))(int_val) 9327 end 9328 else if m=math_code_base then begin 9329 scanned_result(ho(math_code(cur_val)))(int_val) 9330 end 9331 else if m=math_code_base+1 then begin 9332 print_err("Can't use \Umathcode as a number (try \Umathcodenum)"); 9333 help2("\Umathcode is for setting a mathcode from separate values;")@/ 9334 ("use \Umathcodenum to access them as single values."); error; 9335 scanned_result(0)(int_val) 9336 end 9337 else if m=del_code_base then begin 9338 scanned_result(ho(del_code(cur_val)))(int_val) 9339 end else begin 9340 print_err("Can't use \Udelcode as a number (try \Udelcodenum)"); 9341 help2("\Udelcode is for setting a delcode from separate values;")@/ 9342 ("use \Udelcodenum to access them as single values."); error; 9343 scanned_result(0)(int_val); 9344 end; 9345 end; 9346toks_register,assign_toks,def_family,set_font,def_font: @<Fetch a token list or 9347 font identifier, provided that |level=tok_val|@>; 9348assign_int: scanned_result(eqtb[m].int)(int_val); 9349assign_dimen: scanned_result(eqtb[m].sc)(dimen_val); 9350assign_glue: scanned_result(equiv(m))(glue_val); 9351assign_mu_glue: scanned_result(equiv(m))(mu_val); 9352set_aux: @<Fetch the |space_factor| or the |prev_depth|@>; 9353set_prev_graf: @<Fetch the |prev_graf|@>; 9354set_page_int:@<Fetch the |dead_cycles| or the |insert_penalties|@>; 9355set_page_dimen: @<Fetch something on the |page_so_far|@>; 9356set_shape: @<Fetch the |par_shape| size@>; 9357set_box_dimen: @<Fetch a box dimension@>; 9358char_given,math_given: scanned_result(cur_chr)(int_val); 9359assign_font_dimen: @<Fetch a font dimension@>; 9360assign_font_int: @<Fetch a font integer@>; 9361register: @<Fetch a register@>; 9362last_item: @<Fetch an item in the current node, if appropriate@>; 9363othercases @<Complain that \.{\\the} can't do this; give zero result@> 9364endcases;@/ 9365while cur_val_level>level do @<Convert \(c)|cur_val| to a lower level@>; 9366@<Fix the reference count, if any, and negate |cur_val| if |negative|@>; 9367exit:end; 9368 9369@ @<Fetch a character code from some table@>= 9370begin scan_usv_num; 9371if m=math_code_base then begin 9372 cur_val1:=ho(math_code(cur_val)); 9373 if is_active_math_char(cur_val1) then 9374 cur_val1:=@"8000 9375 else if (math_class_field(cur_val1)>7) or 9376 (math_fam_field(cur_val1)>15) or 9377 (math_char_field(cur_val1)>255) then 9378 begin print_err("Extended mathchar used as mathchar"); 9379@.Bad mathchar@> 9380 help2("A mathchar number must be between 0 and ""7FFF.")@/ 9381 ("I changed this one to zero."); int_error(cur_val1); cur_val1:=0; 9382 end; 9383 cur_val1:=(math_class_field(cur_val1)*@"1000) + 9384 (math_fam_field(cur_val1)*@"100) + 9385 math_char_field(cur_val1); 9386 scanned_result(cur_val1)(int_val) 9387 end 9388else if m=del_code_base then begin 9389 cur_val1:=del_code(cur_val); 9390 if cur_val1>=@"40000000 then begin 9391 print_err("Extended delcode used as delcode"); 9392@.Bad delcode@> 9393 help2("A delimiter code must be between 0 and ""7FFFFFF.")@/ 9394 ("I changed this one to zero."); error; 9395 scanned_result(0)(int_val); 9396 end else begin 9397 scanned_result(cur_val1)(int_val); 9398 end 9399end 9400else if m<sf_code_base then scanned_result(equiv(m+cur_val))(int_val) 9401else if m<math_code_base then scanned_result(equiv(m+cur_val) mod @"10000)(int_val) 9402else scanned_result(eqtb[m+cur_val].int)(int_val); 9403end 9404 9405@ @<Fetch a token list...@>= 9406if level<>tok_val then 9407 begin print_err("Missing number, treated as zero"); 9408@.Missing number...@> 9409 help3("A number should have been here; I inserted `0'.")@/ 9410 ("(If you can't figure out why I needed to see a number,")@/ 9411 ("look up `weird error' in the index to The TeXbook.)"); 9412@:TeXbook}{\sl The \TeX book@> 9413 back_error; scanned_result(0)(dimen_val); 9414 end 9415else if cur_cmd<=assign_toks then 9416 begin if cur_cmd<assign_toks then {|cur_cmd=toks_register|} 9417 if m=mem_bot then 9418 begin scan_register_num; 9419 if cur_val<256 then cur_val:=equiv(toks_base+cur_val) 9420 else begin find_sa_element(tok_val,cur_val,false); 9421 if cur_ptr=null then cur_val:=null 9422 else cur_val:=sa_ptr(cur_ptr); 9423 end; 9424 end 9425 else cur_val:=sa_ptr(m) 9426 else if cur_chr=XeTeX_inter_char_loc then begin 9427 scan_eight_bit_int; cur_ptr:=cur_val; 9428 scan_eight_bit_int; 9429 find_sa_element(inter_char_val, cur_ptr * @"100 + cur_val, false); 9430 if cur_ptr=null then cur_val:=null 9431 else cur_val:=sa_ptr(cur_ptr); 9432 end else cur_val:=equiv(m); 9433 cur_val_level:=tok_val; 9434 end 9435else begin back_input; scan_font_ident; 9436 scanned_result(font_id_base+cur_val)(ident_val); 9437 end 9438 9439@ Users refer to `\.{\\the\\spacefactor}' only in horizontal 9440mode, and to `\.{\\the\\prevdepth}' only in vertical mode; so we put the 9441associated mode in the modifier part of the |set_aux| command. 9442The |set_page_int| command has modifier 0 or 1, for `\.{\\deadcycles}' and 9443`\.{\\insertpenalties}', respectively. The |set_box_dimen| command is 9444modified by either |width_offset|, |height_offset|, or |depth_offset|. 9445And the |last_item| command is modified by either |int_val|, |dimen_val|, 9446|glue_val|, |input_line_no_code|, or |badness_code|. 9447\eTeX\ inserts |last_node_type_code| after |glue_val| and adds 9448the codes for its extensions: |eTeX_version_code|, \dots\ . 9449 9450@d last_node_type_code=glue_val+1 {code for \.{\\lastnodetype}} 9451@d input_line_no_code=glue_val+2 {code for \.{\\inputlineno}} 9452@d badness_code=input_line_no_code+1 {code for \.{\\badness}} 9453@# 9454@d eTeX_int=badness_code+1 {first of \eTeX\ codes for integers} 9455@# 9456@d XeTeX_int=eTeX_int+8 {first of \XeTeX\ codes for integers} 9457@# 9458@d XeTeX_version_code=XeTeX_int {code for \.{\\XeTeXversion}} 9459 9460@d XeTeX_count_glyphs_code=XeTeX_int+1 9461 9462@d XeTeX_count_variations_code=XeTeX_int+2 {Deprecated} 9463@d XeTeX_variation_code=XeTeX_int+3 {Deprecated} 9464@d XeTeX_find_variation_by_name_code=XeTeX_int+4 {Deprecated} 9465@d XeTeX_variation_min_code=XeTeX_int+5 {Deprecated} 9466@d XeTeX_variation_max_code=XeTeX_int+6 {Deprecated} 9467@d XeTeX_variation_default_code=XeTeX_int+7 {Deprecated} 9468 9469@d XeTeX_count_features_code=XeTeX_int+8 9470@d XeTeX_feature_code_code=XeTeX_int+9 9471@d XeTeX_find_feature_by_name_code=XeTeX_int+10 9472@d XeTeX_is_exclusive_feature_code=XeTeX_int+11 9473@d XeTeX_count_selectors_code=XeTeX_int+12 9474@d XeTeX_selector_code_code=XeTeX_int+13 9475@d XeTeX_find_selector_by_name_code=XeTeX_int+14 9476@d XeTeX_is_default_selector_code=XeTeX_int+15 9477 9478@d XeTeX_OT_count_scripts_code=XeTeX_int+16 9479@d XeTeX_OT_count_languages_code=XeTeX_int+17 9480@d XeTeX_OT_count_features_code=XeTeX_int+18 9481@d XeTeX_OT_script_code=XeTeX_int+19 9482@d XeTeX_OT_language_code=XeTeX_int+20 9483@d XeTeX_OT_feature_code=XeTeX_int+21 9484 9485@d XeTeX_map_char_to_glyph_code=XeTeX_int+22 9486@d XeTeX_glyph_index_code=XeTeX_int+23 9487@d XeTeX_font_type_code=XeTeX_int+24 9488 9489@d XeTeX_first_char_code=XeTeX_int+25 9490@d XeTeX_last_char_code=XeTeX_int+26 9491 9492@d pdf_last_x_pos_code = XeTeX_int+27 9493@d pdf_last_y_pos_code = XeTeX_int+28 9494@d pdf_strcmp_code = XeTeX_int+29 9495@d pdf_shell_escape_code = XeTeX_int+30 9496 9497@d XeTeX_pdf_page_count_code = XeTeX_int+31 9498 9499@# 9500@d XeTeX_dim=XeTeX_int+32 {first of \XeTeX\ codes for dimensions} 9501 9502@d XeTeX_glyph_bounds_code = XeTeX_dim 9503 9504@# 9505@d eTeX_dim=XeTeX_dim+1 {first of \eTeX\ codes for dimensions} 9506 {changed for \XeTeX\ to make room for \XeTeX\ integers and dimens} 9507@d eTeX_glue=eTeX_dim+9 {first of \eTeX\ codes for glue} 9508@d eTeX_mu=eTeX_glue+1 {first of \eTeX\ codes for muglue} 9509@d eTeX_expr=eTeX_mu+1 {first of \eTeX\ codes for expressions} 9510 9511@<Put each...@>= 9512primitive("spacefactor",set_aux,hmode); 9513@!@:space_factor_}{\.{\\spacefactor} primitive@> 9514primitive("prevdepth",set_aux,vmode);@/ 9515@!@:prev_depth_}{\.{\\prevdepth} primitive@> 9516primitive("deadcycles",set_page_int,0); 9517@!@:dead_cycles_}{\.{\\deadcycles} primitive@> 9518primitive("insertpenalties",set_page_int,1); 9519@!@:insert_penalties_}{\.{\\insertpenalties} primitive@> 9520primitive("wd",set_box_dimen,width_offset); 9521@!@:wd_}{\.{\\wd} primitive@> 9522primitive("ht",set_box_dimen,height_offset); 9523@!@:ht_}{\.{\\ht} primitive@> 9524primitive("dp",set_box_dimen,depth_offset); 9525@!@:dp_}{\.{\\dp} primitive@> 9526primitive("lastpenalty",last_item,int_val); 9527@!@:last_penalty_}{\.{\\lastpenalty} primitive@> 9528primitive("lastkern",last_item,dimen_val); 9529@!@:last_kern_}{\.{\\lastkern} primitive@> 9530primitive("lastskip",last_item,glue_val); 9531@!@:last_skip_}{\.{\\lastskip} primitive@> 9532primitive("inputlineno",last_item,input_line_no_code); 9533@!@:input_line_no_}{\.{\\inputlineno} primitive@> 9534primitive("badness",last_item,badness_code); 9535@!@:badness_}{\.{\\badness} primitive@> 9536 9537@ @<Cases of |print_cmd_chr|...@>= 9538set_aux: if chr_code=vmode then print_esc("prevdepth") 9539@+else print_esc("spacefactor"); 9540set_page_int: if chr_code=0 then print_esc("deadcycles") 9541@/@<Cases of |set_page_int| for |print_cmd_chr|@>@/ 9542@+else print_esc("insertpenalties"); 9543set_box_dimen: if chr_code=width_offset then print_esc("wd") 9544else if chr_code=height_offset then print_esc("ht") 9545else print_esc("dp"); 9546last_item: case chr_code of 9547 int_val: print_esc("lastpenalty"); 9548 dimen_val: print_esc("lastkern"); 9549 glue_val: print_esc("lastskip"); 9550 input_line_no_code: print_esc("inputlineno"); 9551 pdf_shell_escape_code: print_esc("shellescape"); 9552 @/@<Cases of |last_item| for |print_cmd_chr|@>@/ 9553 othercases print_esc("badness") 9554 endcases; 9555 9556@ @<Fetch the |space_factor| or the |prev_depth|@>= 9557if abs(mode)<>m then 9558 begin print_err("Improper "); print_cmd_chr(set_aux,m); 9559@.Improper \\spacefactor@> 9560@.Improper \\prevdepth@> 9561 help4("You can refer to \spacefactor only in horizontal mode;")@/ 9562 ("you can refer to \prevdepth only in vertical mode; and")@/ 9563 ("neither of these is meaningful inside \write. So")@/ 9564 ("I'm forgetting what you said and using zero instead."); 9565 error; 9566 if level<>tok_val then scanned_result(0)(dimen_val) 9567 else scanned_result(0)(int_val); 9568 end 9569else if m=vmode then scanned_result(prev_depth)(dimen_val) 9570else scanned_result(space_factor)(int_val) 9571 9572@ @<Fetch the |dead_cycles| or the |insert_penalties|@>= 9573begin if m=0 then cur_val:=dead_cycles 9574@/@<Cases for `Fetch the |dead_cycles| or the |insert_penalties|'@>@/ 9575else cur_val:=insert_penalties; 9576cur_val_level:=int_val; 9577end 9578 9579@ @<Fetch a box dimension@>= 9580begin scan_register_num; fetch_box(q); 9581if q=null then cur_val:=0 @+else cur_val:=mem[q+m].sc; 9582cur_val_level:=dimen_val; 9583end 9584 9585@ Inside an \.{\\output} routine, a user may wish to look at the page totals 9586that were present at the moment when output was triggered. 9587 9588@d max_dimen==@'7777777777 {$2^{30}-1$} 9589 9590@<Fetch something on the |page_so_far|@>= 9591begin if (page_contents=empty) and (not output_active) then 9592 if m=0 then cur_val:=max_dimen@+else cur_val:=0 9593else cur_val:=page_so_far[m]; 9594cur_val_level:=dimen_val; 9595end 9596 9597@ @<Fetch the |prev_graf|@>= 9598if mode=0 then scanned_result(0)(int_val) {|prev_graf=0| within \.{\\write}} 9599else begin nest[nest_ptr]:=cur_list; p:=nest_ptr; 9600 while abs(nest[p].mode_field)<>vmode do decr(p); 9601 scanned_result(nest[p].pg_field)(int_val); 9602 end 9603 9604@ @<Fetch the |par_shape| size@>= 9605begin if m>par_shape_loc then @<Fetch a penalties array element@> 9606else if par_shape_ptr=null then cur_val:=0 9607else cur_val:=info(par_shape_ptr); 9608cur_val_level:=int_val; 9609end 9610 9611@ Here is where \.{\\lastpenalty}, \.{\\lastkern}, \.{\\lastskip}, and 9612\.{\\lastnodetype} are 9613implemented. The reference count for \.{\\lastskip} will be updated later. 9614 9615We also handle \.{\\inputlineno} and \.{\\badness} here, because they are 9616legal in similar contexts. 9617 9618The macro |find_effective_tail_eTeX| sets |tx| to the last non-\.{\\endM} 9619node of the current list. 9620 9621@d find_effective_tail_eTeX== 9622tx:=tail; 9623if not is_char_node(tx) then 9624 if (type(tx)=math_node)and(subtype(tx)=end_M_code) then 9625 begin r:=head; 9626 repeat q:=r; r:=link(q); 9627 until r=tx; 9628 tx:=q; 9629 end 9630@# 9631@d find_effective_tail==find_effective_tail_eTeX 9632 9633@<Fetch an item in the current node...@>= 9634if m>=input_line_no_code then 9635 if m>=eTeX_glue then @<Process an expression and |return|@>@; 9636 else if m>=XeTeX_dim then 9637 begin case m of 9638 @/@<Cases for fetching a dimension value@>@/ 9639 end; {there are no other cases} 9640 cur_val_level:=dimen_val; 9641 end 9642 else begin case m of 9643 input_line_no_code: cur_val:=line; 9644 badness_code: cur_val:=last_badness; 9645 pdf_shell_escape_code: 9646 begin 9647 if shellenabledp then begin 9648 if restrictedshell then cur_val:=2 9649 else cur_val:=1; 9650 end 9651 else cur_val:=0; 9652 end; 9653 @/@<Cases for fetching an integer value@>@/ 9654 end; {there are no other cases} 9655 cur_val_level:=int_val; 9656 end 9657else begin if cur_chr=glue_val then cur_val:=zero_glue@+else cur_val:=0; 9658 find_effective_tail; 9659 if cur_chr=last_node_type_code then 9660 begin cur_val_level:=int_val; 9661 if (tx=head)or(mode=0) then cur_val:=-1; 9662 end 9663 else cur_val_level:=cur_chr; 9664 if not is_char_node(tx)and(mode<>0) then 9665 case cur_chr of 9666 int_val: if type(tx)=penalty_node then cur_val:=penalty(tx); 9667 dimen_val: if type(tx)=kern_node then cur_val:=width(tx); 9668 glue_val: if type(tx)=glue_node then 9669 begin cur_val:=glue_ptr(tx); 9670 if subtype(tx)=mu_glue then cur_val_level:=mu_val; 9671 end; 9672 last_node_type_code: if type(tx)<=unset_node then cur_val:=type(tx)+1 9673 else cur_val:=unset_node+2; 9674 end {there are no other cases} 9675 else if (mode=vmode)and(tx=head) then 9676 case cur_chr of 9677 int_val: cur_val:=last_penalty; 9678 dimen_val: cur_val:=last_kern; 9679 glue_val: if last_glue<>max_halfword then cur_val:=last_glue; 9680 last_node_type_code: cur_val:=last_node_type; 9681 end; {there are no other cases} 9682 end 9683 9684@ @<Fetch a font dimension@>= 9685begin find_font_dimen(false); font_info[fmem_ptr].sc:=0; 9686scanned_result(font_info[cur_val].sc)(dimen_val); 9687end 9688 9689@ @<Fetch a font integer@>= 9690begin scan_font_ident; 9691if m=0 then scanned_result(hyphen_char[cur_val])(int_val) 9692else if m=1 then scanned_result(skew_char[cur_val])(int_val) 9693else begin 9694 n:=cur_val; 9695 if is_native_font(n) then scan_glyph_number(n) 9696 else scan_char_num; 9697 k:=cur_val; 9698 case m of 9699 lp_code_base: scanned_result(get_cp_code(n, k, left_side))(int_val); 9700 rp_code_base: scanned_result(get_cp_code(n, k, right_side))(int_val); 9701 end; 9702end; 9703end 9704 9705@ @<Fetch a register@>= 9706begin if (m<mem_bot)or(m>lo_mem_stat_max) then 9707 begin cur_val_level:=sa_type(m); 9708 if cur_val_level<glue_val then cur_val:=sa_int(m) 9709 else cur_val:=sa_ptr(m); 9710 end 9711else begin scan_register_num; cur_val_level:=m-mem_bot; 9712 if cur_val>255 then 9713 begin find_sa_element(cur_val_level,cur_val,false); 9714 if cur_ptr=null then 9715 if cur_val_level<glue_val then cur_val:=0 9716 else cur_val:=zero_glue 9717 else if cur_val_level<glue_val then cur_val:=sa_int(cur_ptr) 9718 else cur_val:=sa_ptr(cur_ptr); 9719 end 9720 else 9721 case cur_val_level of 9722int_val:cur_val:=count(cur_val); 9723dimen_val:cur_val:=dimen(cur_val); 9724glue_val: cur_val:=skip(cur_val); 9725mu_val: cur_val:=mu_skip(cur_val); 9726end; {there are no other cases} 9727 end; 9728end 9729 9730@ @<Complain that \.{\\the} can't do this; give zero result@>= 9731begin print_err("You can't use `"); print_cmd_chr(cur_cmd,cur_chr); 9732@.You can't use x after ...@> 9733print("' after "); print_esc("the"); 9734help1("I'm forgetting what you said and using zero instead."); 9735error; 9736if level<>tok_val then scanned_result(0)(dimen_val) 9737else scanned_result(0)(int_val); 9738end 9739 9740@ When a |glue_val| changes to a |dimen_val|, we use the width component 9741of the glue; there is no need to decrease the reference count, since it 9742has not yet been increased. When a |dimen_val| changes to an |int_val|, 9743we use scaled points so that the value doesn't actually change. And when a 9744|mu_val| changes to a |glue_val|, the value doesn't change either. 9745 9746@<Convert \(c)|cur_val| to a lower level@>= 9747begin if cur_val_level=glue_val then cur_val:=width(cur_val) 9748else if cur_val_level=mu_val then mu_error; 9749decr(cur_val_level); 9750end 9751 9752@ If |cur_val| points to a glue specification at this point, the reference 9753count for the glue does not yet include the reference by |cur_val|. 9754If |negative| is |true|, |cur_val_level| is known to be |<=mu_val|. 9755 9756@<Fix the reference count, if any, ...@>= 9757if negative then 9758 if cur_val_level>=glue_val then 9759 begin cur_val:=new_spec(cur_val); 9760 @<Negate all three glue components of |cur_val|@>; 9761 end 9762 else negate(cur_val) 9763else if (cur_val_level>=glue_val)and(cur_val_level<=mu_val) then 9764 add_glue_ref(cur_val) 9765 9766@ @<Negate all three...@>= 9767begin negate(width(cur_val)); 9768negate(stretch(cur_val)); 9769negate(shrink(cur_val)); 9770end 9771 9772@ Our next goal is to write the |scan_int| procedure, which scans anything that 9773\TeX\ treats as an integer. But first we might as well look at some simple 9774applications of |scan_int| that have already been made inside of 9775|scan_something_internal|. 9776 9777@ @<Declare procedures that scan restricted classes of integers@>= 9778 9779procedure scan_glyph_number(f: internal_font_number); 9780{ scan a glyph ID for native font |f|, identified by Unicode value or name or glyph number } 9781begin 9782 if scan_keyword("/") then {set cp value by glyph name} 9783 begin 9784 scan_and_pack_name; {result is in |nameoffile|} 9785 scanned_result(map_glyph_to_index(f))(int_val); 9786 end else if scan_keyword("u") then {set cp value by unicode} 9787 begin 9788 scan_char_num; 9789 scanned_result(map_char_to_glyph(f,cur_val))(int_val); 9790 end else 9791 scan_int; 9792end; 9793 9794procedure scan_char_class; 9795begin scan_int; 9796if (cur_val<0)or(cur_val>256) then 9797 begin print_err("Bad character class"); 9798@.Bad character code@> 9799 help2("A character class must be between 0 and 256.")@/ 9800 ("I changed this one to zero."); int_error(cur_val); cur_val:=0; 9801 end; 9802end; 9803 9804procedure scan_eight_bit_int; 9805begin scan_int; 9806if (cur_val<0)or(cur_val>255) then 9807 begin print_err("Bad register code"); 9808@.Bad register code@> 9809 help2("A register code or char class must be between 0 and 255.")@/ 9810 ("I changed this one to zero."); int_error(cur_val); cur_val:=0; 9811 end; 9812end; 9813 9814@ @<Declare procedures that scan restricted classes of integers@>= 9815procedure scan_usv_num; 9816begin scan_int; 9817if (cur_val<0)or(cur_val>biggest_usv) then 9818 begin print_err("Bad character code"); 9819@.Bad character code@> 9820 help2("A Unicode scalar value must be between 0 and ""10FFFF.")@/ 9821 ("I changed this one to zero."); int_error(cur_val); cur_val:=0; 9822 end; 9823end; 9824 9825procedure scan_char_num; 9826begin scan_int; 9827if (cur_val<0)or(cur_val>biggest_char) then 9828 begin print_err("Bad character code"); 9829@.Bad character code@> 9830 help2("A character number must be between 0 and 65535.")@/ 9831 ("I changed this one to zero."); int_error(cur_val); cur_val:=0; 9832 end; 9833end; 9834 9835@ While we're at it, we might as well deal with similar routines that 9836will be needed later. 9837 9838@<Declare procedures that scan restricted classes of integers@>= 9839procedure scan_xetex_math_char_int; 9840begin scan_int; 9841 if is_active_math_char(cur_val) then begin 9842 if cur_val <> active_math_char then begin 9843 print_err("Bad active XeTeX math code"); 9844 help2("Since I ignore class and family for active math chars,")@/ 9845 ("I changed this one to ""1FFFFF."); int_error(cur_val); 9846 cur_val:=active_math_char; 9847 end 9848 end else if math_char_field(cur_val)>biggest_usv then begin 9849 print_err("Bad XeTeX math character code"); 9850 help2("Since I expected a character number between 0 and ""10FFFF,")@/ 9851 ("I changed this one to zero."); int_error(cur_val); cur_val:=0; 9852 end; 9853end; 9854 9855procedure scan_math_class_int; 9856begin scan_int; 9857if (cur_val<0)or(cur_val>7) then 9858 begin print_err("Bad math class"); 9859@.Bad number@> 9860 help2("Since I expected to read a number between 0 and 7,")@/ 9861 ("I changed this one to zero."); int_error(cur_val); cur_val:=0; 9862 end; 9863end; 9864 9865procedure scan_math_fam_int; 9866begin scan_int; 9867if (cur_val<0)or(cur_val>number_math_families-1) then 9868 begin print_err("Bad math family"); 9869@.Bad number@> 9870 help2("Since I expected to read a number between 0 and 255,")@/ 9871 ("I changed this one to zero."); int_error(cur_val); cur_val:=0; 9872 end; 9873end; 9874 9875procedure scan_four_bit_int; 9876begin scan_int; 9877if (cur_val<0)or(cur_val>15) then 9878 begin print_err("Bad number"); 9879@.Bad number@> 9880 help2("Since I expected to read a number between 0 and 15,")@/ 9881 ("I changed this one to zero."); int_error(cur_val); cur_val:=0; 9882 end; 9883end; 9884 9885@ @<Declare procedures that scan restricted classes of integers@>= 9886procedure scan_fifteen_bit_int; 9887begin scan_int; 9888if (cur_val<0)or(cur_val>@'77777) then 9889 begin print_err("Bad mathchar"); 9890@.Bad mathchar@> 9891 help2("A mathchar number must be between 0 and 32767.")@/ 9892 ("I changed this one to zero."); int_error(cur_val); cur_val:=0; 9893 end; 9894end; 9895 9896@ @<Declare procedures that scan restricted classes of integers@>= 9897procedure scan_delimiter_int; 9898begin scan_int; 9899if (cur_val<0)or(cur_val>@'777777777) then 9900 begin print_err("Bad delimiter code"); 9901@.Bad delimiter code@> 9902 help2("A numeric delimiter code must be between 0 and 2^{27}-1.")@/ 9903 ("I changed this one to zero."); int_error(cur_val); cur_val:=0; 9904 end; 9905end; 9906 9907@ An integer number can be preceded by any number of spaces and `\.+' or 9908`\.-' signs. Then comes either a decimal constant (i.e., radix 10), an 9909octal constant (i.e., radix 8, preceded by~\.\'), a hexadecimal constant 9910(radix 16, preceded by~\."), an alphabetic constant (preceded by~\.\`), or 9911an internal variable. After scanning is complete, 9912|cur_val| will contain the answer, which must be at most 9913$2^{31}-1=2147483647$ in absolute value. The value of |radix| is set to 991410, 8, or 16 in the cases of decimal, octal, or hexadecimal constants, 9915otherwise |radix| is set to zero. An optional space follows a constant. 9916 9917@d octal_token=other_token+"'" {apostrophe, indicates an octal constant} 9918@d hex_token=other_token+"""" {double quote, indicates a hex constant} 9919@d alpha_token=other_token+"`" {reverse apostrophe, precedes alpha constants} 9920@d point_token=other_token+"." {decimal point} 9921@d continental_point_token=other_token+"," {decimal point, Eurostyle} 9922 9923@<Glob...@>= 9924@!radix:small_number; {|scan_int| sets this to 8, 10, 16, or zero} 9925 9926@ We initialize the following global variables just in case |expand| 9927comes into action before any of the basic scanning routines has assigned 9928them a value. 9929 9930@<Set init...@>= 9931cur_val:=0; cur_val_level:=int_val; radix:=0; cur_order:=normal; 9932 9933@ The |scan_int| routine is used also to scan the integer part of a 9934fraction; for example, the `\.3' in `\.{3.14159}' will be found by 9935|scan_int|. The |scan_dimen| routine assumes that |cur_tok=point_token| 9936after the integer part of such a fraction has been scanned by |scan_int|, 9937and that the decimal point has been backed up to be scanned again. 9938 9939@p procedure scan_int; {sets |cur_val| to an integer} 9940label done; 9941var negative:boolean; {should the answer be negated?} 9942@!m:integer; {|@t$2^{31}$@> div radix|, the threshold of danger} 9943@!d:small_number; {the digit just scanned} 9944@!vacuous:boolean; {have no digits appeared?} 9945@!OK_so_far:boolean; {has an error message been issued?} 9946begin radix:=0; OK_so_far:=true;@/ 9947@<Get the next non-blank non-sign token; set |negative| appropriately@>; 9948if cur_tok=alpha_token then @<Scan an alphabetic character code into |cur_val|@> 9949else if (cur_cmd>=min_internal)and(cur_cmd<=max_internal) then 9950 scan_something_internal(int_val,false) 9951else @<Scan a numeric constant@>; 9952if negative then negate(cur_val); 9953end; 9954 9955@ @<Get the next non-blank non-sign token...@>= 9956negative:=false; 9957repeat @<Get the next non-blank non-call token@>; 9958if cur_tok=other_token+"-" then 9959 begin negative:=not negative; cur_tok:=other_token+"+"; 9960 end; 9961until cur_tok<>other_token+"+" 9962 9963@ A space is ignored after an alphabetic character constant, so that 9964such constants behave like numeric ones. 9965 9966@<Scan an alphabetic character code into |cur_val|@>= 9967begin get_token; {suppress macro expansion} 9968if cur_tok<cs_token_flag then 9969 begin cur_val:=cur_chr; 9970 if cur_cmd<=right_brace then 9971 if cur_cmd=right_brace then incr(align_state) 9972 else decr(align_state); 9973 end 9974else if cur_tok<cs_token_flag+single_base then 9975 cur_val:=cur_tok-cs_token_flag-active_base 9976else cur_val:=cur_tok-cs_token_flag-single_base; 9977if cur_val>biggest_usv then 9978 begin print_err("Improper alphabetic constant"); 9979@.Improper alphabetic constant@> 9980 help2("A one-character control sequence belongs after a ` mark.")@/ 9981 ("So I'm essentially inserting \0 here."); 9982 cur_val:="0"; back_error; 9983 end 9984else @<Scan an optional space@>; 9985end 9986 9987@ @<Scan an optional space@>= 9988begin get_x_token; if cur_cmd<>spacer then back_input; 9989end 9990 9991@ @<Scan a numeric constant@>= 9992begin radix:=10; m:=214748364; 9993if cur_tok=octal_token then 9994 begin radix:=8; m:=@'2000000000; get_x_token; 9995 end 9996else if cur_tok=hex_token then 9997 begin radix:=16; m:=@'1000000000; get_x_token; 9998 end; 9999vacuous:=true; cur_val:=0;@/ 10000@<Accumulate the constant until |cur_tok| is not a suitable digit@>; 10001if vacuous then @<Express astonishment that no number was here@> 10002else if cur_cmd<>spacer then back_input; 10003end 10004 10005@ @d infinity==@'17777777777 {the largest positive value that \TeX\ knows} 10006@d zero_token=other_token+"0" {zero, the smallest digit} 10007@d A_token=letter_token+"A" {the smallest special hex digit} 10008@d other_A_token=other_token+"A" {special hex digit of type |other_char|} 10009 10010@<Accumulate the constant...@>= 10011loop@+ begin if (cur_tok<zero_token+radix)and(cur_tok>=zero_token)and 10012 (cur_tok<=zero_token+9) then d:=cur_tok-zero_token 10013 else if radix=16 then 10014 if (cur_tok<=A_token+5)and(cur_tok>=A_token) then d:=cur_tok-A_token+10 10015 else if (cur_tok<=other_A_token+5)and(cur_tok>=other_A_token) then 10016 d:=cur_tok-other_A_token+10 10017 else goto done 10018 else goto done; 10019 vacuous:=false; 10020 if (cur_val>=m)and((cur_val>m)or(d>7)or(radix<>10)) then 10021 begin if OK_so_far then 10022 begin print_err("Number too big"); 10023@.Number too big@> 10024 help2("I can only go up to 2147483647='17777777777=""7FFFFFFF,")@/ 10025 ("so I'm using that number instead of yours."); 10026 error; cur_val:=infinity; OK_so_far:=false; 10027 end; 10028 end 10029 else cur_val:=cur_val*radix+d; 10030 get_x_token; 10031 end; 10032done: 10033 10034@ @<Express astonishment...@>= 10035begin print_err("Missing number, treated as zero"); 10036@.Missing number...@> 10037help3("A number should have been here; I inserted `0'.")@/ 10038 ("(If you can't figure out why I needed to see a number,")@/ 10039 ("look up `weird error' in the index to The TeXbook.)"); 10040@:TeXbook}{\sl The \TeX book@> 10041back_error; 10042end 10043 10044@ The |scan_dimen| routine is similar to |scan_int|, but it sets |cur_val| to 10045a |scaled| value, i.e., an integral number of sp. One of its main tasks 10046is therefore to interpret the abbreviations for various kinds of units and 10047to convert measurements to scaled points. 10048 10049There are three parameters: |mu| is |true| if the finite units must be 10050`\.{mu}', while |mu| is |false| if `\.{mu}' units are disallowed; 10051|inf| is |true| if the infinite units `\.{fil}', `\.{fill}', `\.{filll}' 10052are permitted; and |shortcut| is |true| if |cur_val| already contains 10053an integer and only the units need to be considered. 10054 10055The order of infinity that was found in the case of infinite glue is returned 10056in the global variable |cur_order|. 10057 10058@<Glob...@>= 10059@!cur_order:glue_ord; {order of infinity found by |scan_dimen|} 10060 10061@ Constructions like `\.{-\'77 pt}' are legal dimensions, so |scan_dimen| 10062may begin with |scan_int|. This explains why it is convenient to use 10063|scan_int| also for the integer part of a decimal fraction. 10064 10065Several branches of |scan_dimen| work with |cur_val| as an integer and 10066with an auxiliary fraction |f|, so that the actual quantity of interest is 10067$|cur_val|+|f|/2^{16}$. At the end of the routine, this ``unpacked'' 10068representation is put into the single word |cur_val|, which suddenly 10069switches significance from |integer| to |scaled|. 10070 10071@d attach_fraction=88 {go here to pack |cur_val| and |f| into |cur_val|} 10072@d attach_sign=89 {go here when |cur_val| is correct except perhaps for sign} 10073@d scan_normal_dimen==scan_dimen(false,false,false) 10074 10075@p procedure xetex_scan_dimen(@!mu,@!inf,@!shortcut,@!requires_units:boolean); 10076 {sets |cur_val| to a dimension} 10077label done, done1, done2, found, not_found, attach_fraction, attach_sign; 10078var negative:boolean; {should the answer be negated?} 10079@!f:integer; {numerator of a fraction whose denominator is $2^{16}$} 10080@<Local variables for dimension calculations@>@; 10081begin f:=0; arith_error:=false; cur_order:=normal; negative:=false; 10082if not shortcut then 10083 begin @<Get the next non-blank non-sign...@>; 10084 if (cur_cmd>=min_internal)and(cur_cmd<=max_internal) then 10085 @<Fetch an internal dimension and |goto attach_sign|, 10086 or fetch an internal integer@> 10087 else begin back_input; 10088 if cur_tok=continental_point_token then cur_tok:=point_token; 10089 if cur_tok<>point_token then scan_int 10090 else begin radix:=10; cur_val:=0; 10091 end; 10092 if cur_tok=continental_point_token then cur_tok:=point_token; 10093 if (radix=10)and(cur_tok=point_token) then @<Scan decimal fraction@>; 10094 end; 10095 end; 10096if cur_val<0 then {in this case |f=0|} 10097 begin negative:=not negative; negate(cur_val); 10098 end; 10099if requires_units then begin 10100@<Scan units and set |cur_val| to $x\cdot(|cur_val|+f/2^{16})$, where there 10101 are |x| sp per unit; |goto attach_sign| if the units are internal@>; 10102@<Scan an optional space@>; 10103end else begin 10104 if cur_val>=@'40000 then arith_error:=true 10105 else cur_val:=cur_val*unity+f; 10106end; 10107attach_sign: if arith_error or(abs(cur_val)>=@'10000000000) then 10108 @<Report that this dimension is out of range@>; 10109if negative then negate(cur_val); 10110end; 10111 10112procedure scan_dimen(@!mu,@!inf,@!shortcut:boolean); 10113begin 10114 xetex_scan_dimen(mu,inf,shortcut,true); 10115end; 10116 10117@ For XeTeX, we have an additional version |scan_decimal|, like |scan_dimen| 10118but without any scanning of units. 10119 10120@p procedure scan_decimal; 10121 {sets |cur_val| to a quantity expressed as a decimal fraction} 10122begin 10123 xetex_scan_dimen(false, false, false, false); 10124end; 10125 10126@ @<Fetch an internal dimension and |goto attach_sign|...@>= 10127if mu then 10128 begin scan_something_internal(mu_val,false); 10129 @<Coerce glue to a dimension@>; 10130 if cur_val_level=mu_val then goto attach_sign; 10131 if cur_val_level<>int_val then mu_error; 10132 end 10133else begin scan_something_internal(dimen_val,false); 10134 if cur_val_level=dimen_val then goto attach_sign; 10135 end 10136 10137@ @<Local variables for dimension calculations@>= 10138@!num,@!denom:1..65536; {conversion ratio for the scanned units} 10139@!k,@!kk:small_number; {number of digits in a decimal fraction} 10140@!p,@!q:pointer; {top of decimal digit stack} 10141@!v:scaled; {an internal dimension} 10142@!save_cur_val:integer; {temporary storage of |cur_val|} 10143 10144@ The following code is executed when |scan_something_internal| was 10145called asking for |mu_val|, when we really wanted a ``mudimen'' instead 10146of ``muglue.'' 10147 10148@<Coerce glue to a dimension@>= 10149if cur_val_level>=glue_val then 10150 begin v:=width(cur_val); delete_glue_ref(cur_val); cur_val:=v; 10151 end 10152 10153@ When the following code is executed, we have |cur_tok=point_token|, but this 10154token has been backed up using |back_input|; we must first discard it. 10155 10156It turns out that a decimal point all by itself is equivalent to `\.{0.0}'. 10157Let's hope people don't use that fact. 10158 10159@<Scan decimal fraction@>= 10160begin k:=0; p:=null; get_token; {|point_token| is being re-scanned} 10161loop@+ begin get_x_token; 10162 if (cur_tok>zero_token+9)or(cur_tok<zero_token) then goto done1; 10163 if k<17 then {digits for |k>=17| cannot affect the result} 10164 begin q:=get_avail; link(q):=p; info(q):=cur_tok-zero_token; 10165 p:=q; incr(k); 10166 end; 10167 end; 10168done1: for kk:=k downto 1 do 10169 begin dig[kk-1]:=info(p); q:=p; p:=link(p); free_avail(q); 10170 end; 10171f:=round_decimals(k); 10172if cur_cmd<>spacer then back_input; 10173end 10174 10175@ Now comes the harder part: At this point in the program, |cur_val| is a 10176nonnegative integer and $f/2^{16}$ is a nonnegative fraction less than 1; 10177we want to multiply the sum of these two quantities by the appropriate 10178factor, based on the specified units, in order to produce a |scaled| 10179result, and we want to do the calculation with fixed point arithmetic that 10180does not overflow. 10181 10182@<Scan units and set |cur_val| to $x\cdot(|cur_val|+f/2^{16})$...@>= 10183if inf then @<Scan for \(f)\.{fil} units; |goto attach_fraction| if found@>; 10184@<Scan for \(u)units that are internal dimensions; 10185 |goto attach_sign| with |cur_val| set if found@>; 10186if mu then @<Scan for \(m)\.{mu} units and |goto attach_fraction|@>; 10187if scan_keyword("true") then @<Adjust \(f)for the magnification ratio@>; 10188@.true@> 10189if scan_keyword("pt") then goto attach_fraction; {the easy case} 10190@.pt@> 10191@<Scan for \(a)all other units and adjust |cur_val| and |f| accordingly; 10192 |goto done| in the case of scaled points@>; 10193attach_fraction: if cur_val>=@'40000 then arith_error:=true 10194else cur_val:=cur_val*unity+f; 10195done: 10196 10197@ A specification like `\.{filllll}' or `\.{fill L L L}' will lead to two 10198error messages (one for each additional keyword \.{"l"}). 10199 10200@<Scan for \(f)\.{fil} units...@>= 10201if scan_keyword("fil") then 10202@.fil@> 10203 begin cur_order:=fil; 10204 while scan_keyword("l") do 10205 begin if cur_order=filll then 10206 begin print_err("Illegal unit of measure ("); 10207@.Illegal unit of measure@> 10208 print("replaced by filll)"); 10209 help1("I dddon't go any higher than filll."); error; 10210 end 10211 else incr(cur_order); 10212 end; 10213 goto attach_fraction; 10214 end 10215 10216@ @<Scan for \(u)units that are internal dimensions...@>= 10217save_cur_val:=cur_val; 10218@<Get the next non-blank non-call...@>; 10219if (cur_cmd<min_internal)or(cur_cmd>max_internal) then back_input 10220else begin if mu then 10221 begin scan_something_internal(mu_val,false); @<Coerce glue...@>; 10222 if cur_val_level<>mu_val then mu_error; 10223 end 10224 else scan_something_internal(dimen_val,false); 10225 v:=cur_val; goto found; 10226 end; 10227if mu then goto not_found; 10228if scan_keyword("em") then v:=(@<The em width for |cur_font|@>) 10229@.em@> 10230else if scan_keyword("ex") then v:=(@<The x-height for |cur_font|@>) 10231@.ex@> 10232else goto not_found; 10233@<Scan an optional space@>; 10234found:cur_val:=nx_plus_y(save_cur_val,v,xn_over_d(v,f,@'200000)); 10235goto attach_sign; 10236not_found: 10237 10238@ @<Scan for \(m)\.{mu} units and |goto attach_fraction|@>= 10239if scan_keyword("mu") then goto attach_fraction 10240@.mu@> 10241else begin print_err("Illegal unit of measure ("); print("mu inserted)"); 10242@.Illegal unit of measure@> 10243 help4("The unit of measurement in math glue must be mu.")@/ 10244 ("To recover gracefully from this error, it's best to")@/ 10245 ("delete the erroneous units; e.g., type `2' to delete")@/ 10246 ("two letters. (See Chapter 27 of The TeXbook.)"); 10247@:TeXbook}{\sl The \TeX book@> 10248 error; goto attach_fraction; 10249 end 10250 10251@ @<Adjust \(f)for the magnification ratio@>= 10252begin prepare_mag; 10253if mag<>1000 then 10254 begin cur_val:=xn_over_d(cur_val,1000,mag); 10255 f:=(1000*f+@'200000*remainder) div mag; 10256 cur_val:=cur_val+(f div @'200000); f:=f mod @'200000; 10257 end; 10258end 10259 10260@ The necessary conversion factors can all be specified exactly as 10261fractions whose numerator and denominator sum to 32768 or less. 10262According to the definitions here, $\rm2660\,dd\approx1000.33297\,mm$; 10263this agrees well with the value $\rm1000.333\,mm$ cited by Bosshard 10264@^Bosshard, Hans Rudolf@> 10265in {\sl Technische Grundlagen zur Satzherstellung\/} (Bern, 1980). 10266 10267@d set_conversion_end(#)== denom:=#; end 10268@d set_conversion(#)==@+begin num:=#; set_conversion_end 10269 10270@<Scan for \(a)all other units and adjust |cur_val| and |f|...@>= 10271if scan_keyword("in") then set_conversion(7227)(100) 10272@.in@> 10273else if scan_keyword("pc") then set_conversion(12)(1) 10274@.pc@> 10275else if scan_keyword("cm") then set_conversion(7227)(254) 10276@.cm@> 10277else if scan_keyword("mm") then set_conversion(7227)(2540) 10278@.mm@> 10279else if scan_keyword("bp") then set_conversion(7227)(7200) 10280@.bp@> 10281else if scan_keyword("dd") then set_conversion(1238)(1157) 10282@.dd@> 10283else if scan_keyword("cc") then set_conversion(14856)(1157) 10284@.cc@> 10285else if scan_keyword("sp") then goto done 10286@.sp@> 10287else @<Complain about unknown unit and |goto done2|@>; 10288cur_val:=xn_over_d(cur_val,num,denom); 10289f:=(num*f+@'200000*remainder) div denom;@/ 10290cur_val:=cur_val+(f div @'200000); f:=f mod @'200000; 10291done2: 10292 10293@ @<Complain about unknown unit...@>= 10294begin print_err("Illegal unit of measure ("); print("pt inserted)"); 10295@.Illegal unit of measure@> 10296help6("Dimensions can be in units of em, ex, in, pt, pc,")@/ 10297 ("cm, mm, dd, cc, bp, or sp; but yours is a new one!")@/ 10298 ("I'll assume that you meant to say pt, for printer's points.")@/ 10299 ("To recover gracefully from this error, it's best to")@/ 10300 ("delete the erroneous units; e.g., type `2' to delete")@/ 10301 ("two letters. (See Chapter 27 of The TeXbook.)"); 10302@:TeXbook}{\sl The \TeX book@> 10303error; goto done2; 10304end 10305 10306 10307@ @<Report that this dimension is out of range@>= 10308begin print_err("Dimension too large"); 10309@.Dimension too large@> 10310help2("I can't work with sizes bigger than about 19 feet.")@/ 10311 ("Continue and I'll use the largest value I can.");@/ 10312error; cur_val:=max_dimen; arith_error:=false; 10313end 10314 10315@ The final member of \TeX's value-scanning trio is |scan_glue|, which 10316makes |cur_val| point to a glue specification. The reference count of that 10317glue spec will take account of the fact that |cur_val| is pointing to~it. 10318 10319The |level| parameter should be either |glue_val| or |mu_val|. 10320 10321Since |scan_dimen| was so much more complex than |scan_int|, we might expect 10322|scan_glue| to be even worse. But fortunately, it is very simple, since 10323most of the work has already been done. 10324 10325@p procedure scan_glue(@!level:small_number); 10326 {sets |cur_val| to a glue spec pointer} 10327label exit; 10328var negative:boolean; {should the answer be negated?} 10329@!q:pointer; {new glue specification} 10330@!mu:boolean; {does |level=mu_val|?} 10331begin mu:=(level=mu_val); @<Get the next non-blank non-sign...@>; 10332if (cur_cmd>=min_internal)and(cur_cmd<=max_internal) then 10333 begin scan_something_internal(level,negative); 10334 if cur_val_level>=glue_val then 10335 begin if cur_val_level<>level then mu_error; 10336 return; 10337 end; 10338 if cur_val_level=int_val then scan_dimen(mu,false,true) 10339 else if level=mu_val then mu_error; 10340 end 10341else begin back_input; scan_dimen(mu,false,false); 10342 if negative then negate(cur_val); 10343 end; 10344@<Create a new glue specification whose width is |cur_val|; scan for its 10345 stretch and shrink components@>; 10346exit:end; 10347@# 10348@<Declare procedures needed for expressions@>@; 10349 10350@ @<Create a new glue specification whose width is |cur_val|...@>= 10351q:=new_spec(zero_glue); width(q):=cur_val; 10352if scan_keyword("plus") then 10353@.plus@> 10354 begin scan_dimen(mu,true,false); 10355 stretch(q):=cur_val; stretch_order(q):=cur_order; 10356 end; 10357if scan_keyword("minus") then 10358@.minus@> 10359 begin scan_dimen(mu,true,false); 10360 shrink(q):=cur_val; shrink_order(q):=cur_order; 10361 end; 10362cur_val:=q 10363 10364@ Here's a similar procedure that returns a pointer to a rule node. This 10365routine is called just after \TeX\ has seen \.{\\hrule} or \.{\\vrule}; 10366therefore |cur_cmd| will be either |hrule| or |vrule|. The idea is to store 10367the default rule dimensions in the node, then to override them if 10368`\.{height}' or `\.{width}' or `\.{depth}' specifications are 10369found (in any order). 10370 10371@d default_rule=26214 {0.4\thinspace pt} 10372 10373@p function scan_rule_spec:pointer; 10374label reswitch; 10375var q:pointer; {the rule node being created} 10376begin q:=new_rule; {|width|, |depth|, and |height| all equal |null_flag| now} 10377if cur_cmd=vrule then width(q):=default_rule 10378else begin height(q):=default_rule; depth(q):=0; 10379 end; 10380reswitch: if scan_keyword("width") then 10381@.width@> 10382 begin scan_normal_dimen; width(q):=cur_val; goto reswitch; 10383 end; 10384if scan_keyword("height") then 10385@.height@> 10386 begin scan_normal_dimen; height(q):=cur_val; goto reswitch; 10387 end; 10388if scan_keyword("depth") then 10389@.depth@> 10390 begin scan_normal_dimen; depth(q):=cur_val; goto reswitch; 10391 end; 10392scan_rule_spec:=q; 10393end; 10394 10395@* \[27] Building token lists. 10396The token lists for macros and for other things like \.{\\mark} and \.{\\output} 10397and \.{\\write} are produced by a procedure called |scan_toks|. 10398 10399Before we get into the details of |scan_toks|, let's consider a much 10400simpler task, that of converting the current string into a token list. 10401The |str_toks| function does this; it classifies spaces as type |spacer| 10402and everything else as type |other_char|. 10403 10404The token list created by |str_toks| begins at |link(temp_head)| and ends 10405at the value |p| that is returned. (If |p=temp_head|, the list is empty.) 10406 10407The |str_toks_cat| function is the same, except that the catcode |cat| is 10408stamped on all the characters, unless zero is passed in which case it 10409chooses |spacer| or |other_char| automatically. 10410 10411@p @t\4@>@<Declare \eTeX\ procedures for token lists@>@;@/ 10412function str_toks_cat(@!b:pool_pointer;@!cat:small_number):pointer; 10413 {changes the string |str_pool[b..pool_ptr]| to a token list} 10414var p:pointer; {tail of the token list} 10415@!q:pointer; {new node being added to the token list via |store_new_token|} 10416@!t:halfword; {token being appended} 10417@!k:pool_pointer; {index into |str_pool|} 10418begin str_room(1); 10419p:=temp_head; link(p):=null; k:=b; 10420while k<pool_ptr do 10421 begin t:=so(str_pool[k]); 10422 if (t=" ") and (cat=0) then t:=space_token 10423 else begin if (t >= @"D800) and (t <= @"DBFF) and (k+1 < pool_ptr) 10424 and (so(str_pool[k+1]) >= @"DC00) and (so(str_pool[k+1]) <= @"DFFF) then 10425 begin 10426 incr(k); 10427 t := @"10000 + (t - @"D800) * @"400 + (so(str_pool[k]) - @"DC00); 10428 end; 10429 if cat=0 then t := other_token + t 10430 else t := max_char_val * cat + t; 10431 end; 10432 fast_store_new_token(t); 10433 incr(k); 10434 end; 10435pool_ptr:=b; str_toks_cat:=p; 10436end; 10437 10438function str_toks(@!b:pool_pointer):pointer; 10439begin str_toks:=str_toks_cat(b,0); end; 10440 10441@ The main reason for wanting |str_toks| is the next function, 10442|the_toks|, which has similar input/output characteristics. 10443 10444This procedure is supposed to scan something like `\.{\\skip\\count12}', 10445i.e., whatever can follow `\.{\\the}', and it constructs a token list 10446containing something like `\.{-3.0pt minus 0.5fill}'. 10447 10448@p function the_toks:pointer; 10449label exit; 10450var old_setting:0..max_selector; {holds |selector| setting} 10451@!p,@!q,@!r:pointer; {used for copying a token list} 10452@!b:pool_pointer; {base of temporary string} 10453@!c:small_number; {value of |cur_chr|} 10454begin @<Handle \.{\\unexpanded} or \.{\\detokenize} and |return|@>;@/ 10455get_x_token; scan_something_internal(tok_val,false); 10456if cur_val_level>=ident_val then @<Copy the token list@> 10457else begin old_setting:=selector; selector:=new_string; b:=pool_ptr; 10458 case cur_val_level of 10459 int_val:print_int(cur_val); 10460 dimen_val:begin print_scaled(cur_val); print("pt"); 10461 end; 10462 glue_val: begin print_spec(cur_val,"pt"); delete_glue_ref(cur_val); 10463 end; 10464 mu_val: begin print_spec(cur_val,"mu"); delete_glue_ref(cur_val); 10465 end; 10466 end; {there are no other cases} 10467 selector:=old_setting; the_toks:=str_toks(b); 10468 end; 10469exit:end; 10470 10471@ @<Copy the token list@>= 10472begin p:=temp_head; link(p):=null; 10473if cur_val_level=ident_val then store_new_token(cs_token_flag+cur_val) 10474else if cur_val<>null then 10475 begin r:=link(cur_val); {do not copy the reference count} 10476 while r<>null do 10477 begin fast_store_new_token(info(r)); r:=link(r); 10478 end; 10479 end; 10480the_toks:=p; 10481end 10482 10483@ Here's part of the |expand| subroutine that we are now ready to complete: 10484 10485@p procedure ins_the_toks; 10486begin link(garbage):=the_toks; ins_list(link(temp_head)); 10487end; 10488 10489@ The primitives \.{\\number}, \.{\\romannumeral}, \.{\\string}, \.{\\meaning}, 10490\.{\\fontname}, and \.{\\jobname} are defined as follows. 10491 10492\eTeX\ adds \.{\\eTeXrevision} such that |job_name_code| remains last. 10493 10494@d number_code=0 {command code for \.{\\number}} 10495@d roman_numeral_code=1 {command code for \.{\\romannumeral}} 10496@d string_code=2 {command code for \.{\\string}} 10497@d meaning_code=3 {command code for \.{\\meaning}} 10498@d font_name_code=4 {command code for \.{\\fontname}} 10499@d etex_convert_base=5 {base for \eTeX's command codes} 10500@d eTeX_revision_code=etex_convert_base {command code for \.{\\eTeXrevision}} 10501 10502@d XeTeX_revision_code=6 10503@d XeTeX_variation_name_code=7 10504@d XeTeX_feature_name_code=8 10505@d XeTeX_selector_name_code=9 10506@d XeTeX_glyph_name_code=10 10507 10508@d left_margin_kern_code=11 10509@d right_margin_kern_code=12 10510 10511@d XeTeX_Uchar_code = 13 10512@d XeTeX_Ucharcat_code = 14 10513 10514@d etex_convert_codes=XeTeX_Ucharcat_code+1 {end of \eTeX's command codes} 10515@d job_name_code=etex_convert_codes {command code for \.{\\jobname}} 10516 10517@<Put each...@>= 10518primitive("number",convert,number_code);@/ 10519@!@:number_}{\.{\\number} primitive@> 10520primitive("romannumeral",convert,roman_numeral_code);@/ 10521@!@:roman_numeral_}{\.{\\romannumeral} primitive@> 10522primitive("string",convert,string_code);@/ 10523@!@:string_}{\.{\\string} primitive@> 10524primitive("meaning",convert,meaning_code);@/ 10525@!@:meaning_}{\.{\\meaning} primitive@> 10526primitive("fontname",convert,font_name_code);@/ 10527@!@:font_name_}{\.{\\fontname} primitive@> 10528primitive("jobname",convert,job_name_code);@/ 10529@!@:job_name_}{\.{\\jobname} primitive@> 10530primitive("leftmarginkern",convert,left_margin_kern_code);@/ 10531@!@:left_margin_kern_}{\.{\\leftmarginkern} primitive@> 10532primitive("rightmarginkern",convert,right_margin_kern_code);@/ 10533@!@:right_margin_kern_}{\.{\\rightmarginkern} primitive@> 10534 10535primitive("Uchar",convert,XeTeX_Uchar_code);@/ 10536@!@:XeTeX_Uchar_}{\.{\\Uchar} primitive@> 10537primitive("Ucharcat",convert,XeTeX_Ucharcat_code);@/ 10538@!@:XeTeX_Ucharcat_}{\.{\\Ucharcat} primitive@> 10539 10540@ @<Cases of |print_cmd_chr|...@>= 10541convert: case chr_code of 10542 number_code: print_esc("number"); 10543 roman_numeral_code: print_esc("romannumeral"); 10544 string_code: print_esc("string"); 10545 meaning_code: print_esc("meaning"); 10546 font_name_code: print_esc("fontname"); 10547 pdf_strcmp_code: print_esc("strcmp"); 10548 left_margin_kern_code: print_esc("leftmarginkern"); 10549 right_margin_kern_code: print_esc("rightmarginkern"); 10550 @/@<Cases of |convert| for |print_cmd_chr|@>@/ 10551 othercases print_esc("jobname") 10552 endcases; 10553 10554@ The procedure |conv_toks| uses |str_toks| to insert the token list 10555for |convert| functions into the scanner; `\.{\\outer}' control sequences 10556are allowed to follow `\.{\\string}' and `\.{\\meaning}'. 10557 10558The extra temp string |u| is needed because |pdf_scan_ext_toks| incorporates 10559any pending string in its output. In order to save such a pending string, 10560we have to create a temporary string that is destroyed immediately after. 10561 10562@d save_cur_string==if str_start_macro(str_ptr)<pool_ptr then u:=make_string else u:=0 10563@d restore_cur_string==if u<>0 then decr(str_ptr) 10564 10565@p procedure conv_toks; 10566var old_setting:0..max_selector; {holds |selector| setting} 10567@!save_warning_index, @!save_def_ref:pointer; 10568@!u: str_number; 10569@!c:small_number; {desired type of conversion} 10570@!save_scanner_status:small_number; {|scanner_status| upon entry} 10571@!b:pool_pointer; {base of temporary string} 10572@!fnt,@!arg1,@!arg2:integer; {args for \XeTeX\ extensions} 10573@!font_name_str:str_number; {local vars for \.{\\fontname} quoting extension} 10574@!i:small_number; 10575@!quote_char:UTF16_code; 10576@!cat:small_number; {desired catcode, or 0 for automatic |spacer|/|other_char| selection} 10577@!saved_chr:UnicodeScalar; 10578p, q: pointer; 10579begin cat:=0; c:=cur_chr; @<Scan the argument for command |c|@>; 10580old_setting:=selector; selector:=new_string; b:=pool_ptr; 10581@<Print the result of command |c|@>; 10582selector:=old_setting; link(garbage):=str_toks_cat(b,cat); ins_list(link(temp_head)); 10583end; 10584 10585@ Not all catcode values are allowed by \.{\\Ucharcat}: 10586@d illegal_Ucharcat_catcode(#)==(#<left_brace)or(#>other_char)or(#=out_param)or(#=ignore) 10587 10588@<Scan the argument for command |c|@>= 10589case c of 10590number_code,roman_numeral_code: scan_int; 10591string_code, meaning_code: begin save_scanner_status:=scanner_status; 10592 scanner_status:=normal; get_token; scanner_status:=save_scanner_status; 10593 end; 10594font_name_code: scan_font_ident; 10595XeTeX_Uchar_code: scan_usv_num; 10596XeTeX_Ucharcat_code: 10597 begin 10598 scan_usv_num; 10599 saved_chr:=cur_val; 10600 scan_int; 10601 if illegal_Ucharcat_catcode(cur_val) then 10602 begin print_err("Invalid code ("); print_int(cur_val); 10603@.Invalid code@> 10604 print("), should be in the ranges 1..4, 6..8, 10..12"); 10605 help1("I'm going to use 12 instead of that illegal code value.");@/ 10606 error; cat:=12; 10607 end else 10608 cat:=cur_val; 10609 cur_val:=saved_chr; 10610 end; 10611@/@<Cases of `Scan the argument for command |c|'@>@/ 10612job_name_code: if job_name=0 then open_log_file; 10613end {there are no other cases} 10614 10615@ @<Print the result of command |c|@>= 10616case c of 10617number_code: print_int(cur_val); 10618roman_numeral_code: print_roman_int(cur_val); 10619string_code:if cur_cs<>0 then sprint_cs(cur_cs) 10620 else print_char(cur_chr); 10621meaning_code: print_meaning; 10622font_name_code: begin 10623 font_name_str:=font_name[cur_val]; 10624 if is_native_font(cur_val) then begin 10625 quote_char:=""""; 10626 for i:=0 to length(font_name_str) - 1 do 10627 if str_pool[str_start_macro(font_name_str) + i] = """" then quote_char:="'"; 10628 print_char(quote_char); 10629 print(font_name_str); 10630 print_char(quote_char); 10631 end else 10632 print(font_name_str); 10633 if font_size[cur_val]<>font_dsize[cur_val] then 10634 begin print(" at "); print_scaled(font_size[cur_val]); 10635 print("pt"); 10636 end; 10637 end; 10638XeTeX_Uchar_code, 10639XeTeX_Ucharcat_code: print_char(cur_val); 10640@/@<Cases of `Print the result of command |c|'@>@/ 10641job_name_code: print_file_name(job_name, 0, 0); 10642end {there are no other cases} 10643 10644@ Now we can't postpone the difficulties any longer; we must bravely tackle 10645|scan_toks|. This function returns a pointer to the tail of a new token 10646list, and it also makes |def_ref| point to the reference count at the 10647head of that list. 10648 10649There are two boolean parameters, |macro_def| and |xpand|. If |macro_def| 10650is true, the goal is to create the token list for a macro definition; 10651otherwise the goal is to create the token list for some other \TeX\ 10652primitive: \.{\\mark}, \.{\\output}, \.{\\everypar}, \.{\\lowercase}, 10653\.{\\uppercase}, \.{\\message}, \.{\\errmessage}, \.{\\write}, or 10654\.{\\special}. In the latter cases a left brace must be scanned next; this 10655left brace will not be part of the token list, nor will the matching right 10656brace that comes at the end. If |xpand| is false, the token list will 10657simply be copied from the input using |get_token|. Otherwise all expandable 10658tokens will be expanded until unexpandable tokens are left, except that 10659the results of expanding `\.{\\the}' are not expanded further. 10660If both |macro_def| and |xpand| are true, the expansion applies 10661only to the macro body (i.e., to the material following the first 10662|left_brace| character). 10663 10664The value of |cur_cs| when |scan_toks| begins should be the |eqtb| 10665address of the control sequence to display in ``runaway'' error 10666messages. 10667 10668@p function scan_toks(@!macro_def,@!xpand:boolean):pointer; 10669label found,done,done1,done2; 10670var t:halfword; {token representing the highest parameter number} 10671@!s:halfword; {saved token} 10672@!p:pointer; {tail of the token list being built} 10673@!q:pointer; {new node being added to the token list via |store_new_token|} 10674@!unbalance:halfword; {number of unmatched left braces} 10675@!hash_brace:halfword; {possible `\.{\#\{}' token} 10676begin if macro_def then scanner_status:=defining 10677@+else scanner_status:=absorbing; 10678warning_index:=cur_cs; def_ref:=get_avail; token_ref_count(def_ref):=null; 10679p:=def_ref; hash_brace:=0; t:=zero_token; 10680if macro_def then @<Scan and build the parameter part of the macro definition@> 10681else scan_left_brace; {remove the compulsory left brace} 10682@<Scan and build the body of the token list; |goto found| when finished@>; 10683found: scanner_status:=normal; 10684if hash_brace<>0 then store_new_token(hash_brace); 10685scan_toks:=p; 10686end; 10687 10688@ @<Scan and build the parameter part...@>= 10689begin loop begin get_token; {set |cur_cmd|, |cur_chr|, |cur_tok|} 10690 if cur_tok<right_brace_limit then goto done1; 10691 if cur_cmd=mac_param then 10692 @<If the next character is a parameter number, make |cur_tok| 10693 a |match| token; but if it is a left brace, store 10694 `|left_brace|, |end_match|', set |hash_brace|, and |goto done|@>; 10695 store_new_token(cur_tok); 10696 end; 10697done1: store_new_token(end_match_token); 10698if cur_cmd=right_brace then 10699 @<Express shock at the missing left brace; |goto found|@>; 10700done: end 10701 10702@ @<Express shock...@>= 10703begin print_err("Missing { inserted"); incr(align_state); 10704@.Missing \{ inserted@> 10705help2("Where was the left brace? You said something like `\def\a}',")@/ 10706 ("which I'm going to interpret as `\def\a{}'."); error; goto found; 10707end 10708 10709@ @<If the next character is a parameter number...@>= 10710begin s:=match_token+cur_chr; get_token; 10711if cur_cmd=left_brace then 10712 begin hash_brace:=cur_tok; 10713 store_new_token(cur_tok); store_new_token(end_match_token); 10714 goto done; 10715 end; 10716if t=zero_token+9 then 10717 begin print_err("You already have nine parameters"); 10718@.You already have nine...@> 10719 help1("I'm going to ignore the # sign you just used."); error; 10720 end 10721else begin incr(t); 10722 if cur_tok<>t then 10723 begin print_err("Parameters must be numbered consecutively"); 10724@.Parameters...consecutively@> 10725 help2("I've inserted the digit you should have used after the #.")@/ 10726 ("Type `1' to delete what you did use."); back_error; 10727 end; 10728 cur_tok:=s; 10729 end; 10730end 10731 10732@ @<Scan and build the body of the token list; |goto found| when finished@>= 10733unbalance:=1; 10734loop@+ begin if xpand then @<Expand the next part of the input@> 10735 else get_token; 10736 if cur_tok<right_brace_limit then 10737 if cur_cmd<right_brace then incr(unbalance) 10738 else begin decr(unbalance); 10739 if unbalance=0 then goto found; 10740 end 10741 else if cur_cmd=mac_param then 10742 if macro_def then @<Look for parameter number or \.{\#\#}@>; 10743 store_new_token(cur_tok); 10744 end 10745 10746@ Here we insert an entire token list created by |the_toks| without 10747expanding it further. 10748 10749@<Expand the next part of the input@>= 10750begin loop begin get_next; 10751 if cur_cmd>=call then 10752 if info(link(cur_chr))=protected_token then 10753 begin cur_cmd:=relax; cur_chr:=no_expand_flag; 10754 end; 10755 if cur_cmd<=max_command then goto done2; 10756 if cur_cmd<>the then expand 10757 else begin q:=the_toks; 10758 if link(temp_head)<>null then 10759 begin link(p):=link(temp_head); p:=q; 10760 end; 10761 end; 10762 end; 10763done2: x_token 10764end 10765 10766@ @<Look for parameter number...@>= 10767begin s:=cur_tok; 10768if xpand then get_x_token else get_token; 10769if cur_cmd<>mac_param then 10770 if (cur_tok<=zero_token)or(cur_tok>t) then 10771 begin print_err("Illegal parameter number in definition of "); 10772@.Illegal parameter number...@> 10773 sprint_cs(warning_index); 10774 help3("You meant to type ## instead of #, right?")@/ 10775 ("Or maybe a } was forgotten somewhere earlier, and things")@/ 10776 ("are all screwed up? I'm going to assume that you meant ##."); 10777 back_error; cur_tok:=s; 10778 end 10779 else cur_tok:=out_param_token-"0"+cur_chr; 10780end 10781 10782@ Another way to create a token list is via the \.{\\read} command. The 10783sixteen files potentially usable for reading appear in the following 10784global variables. The value of |read_open[n]| will be |closed| if 10785stream number |n| has not been opened or if it has been fully read; 10786|just_open| if an \.{\\openin} but not a \.{\\read} has been done; 10787and |normal| if it is open and ready to read the next line. 10788 10789@d closed=2 {not open, or at end of file} 10790@d just_open=1 {newly opened, first line not yet read} 10791 10792@<Glob...@>= 10793@!read_file:array[0..15] of unicode_file; {used for \.{\\read}} 10794@!read_open:array[0..16] of normal..closed; {state of |read_file[n]|} 10795 10796@ @<Set init...@>= 10797for k:=0 to 16 do read_open[k]:=closed; 10798 10799@ The |read_toks| procedure constructs a token list like that for any 10800macro definition, and makes |cur_val| point to it. Parameter |r| points 10801to the control sequence that will receive this token list. 10802 10803@p procedure read_toks(@!n:integer;@!r:pointer;@!j:halfword); 10804label done; 10805var p:pointer; {tail of the token list} 10806@!q:pointer; {new node being added to the token list via |store_new_token|} 10807@!s:integer; {saved value of |align_state|} 10808@!m:small_number; {stream number} 10809begin scanner_status:=defining; warning_index:=r; 10810def_ref:=get_avail; token_ref_count(def_ref):=null; 10811p:=def_ref; {the reference count} 10812store_new_token(end_match_token); 10813if (n<0)or(n>15) then m:=16@+else m:=n; 10814s:=align_state; align_state:=1000000; {disable tab marks, etc.} 10815repeat @<Input and store tokens from the next line of the file@>; 10816until align_state=1000000; 10817cur_val:=def_ref; scanner_status:=normal; align_state:=s; 10818end; 10819 10820@ @<Input and store tokens from the next line of the file@>= 10821begin_file_reading; name:=m+1; 10822if read_open[m]=closed then @<Input for \.{\\read} from the terminal@> 10823else if read_open[m]=just_open then @<Input the first line of |read_file[m]|@> 10824else @<Input the next line of |read_file[m]|@>; 10825limit:=last; 10826if end_line_char_inactive then decr(limit) 10827else buffer[limit]:=end_line_char; 10828first:=limit+1; loc:=start; state:=new_line;@/ 10829@<Handle \.{\\readline} and |goto done|@>;@/ 10830loop@+ begin get_token; 10831 if cur_tok=0 then goto done; 10832 {|cur_cmd=cur_chr=0| will occur at the end of the line} 10833 if align_state<1000000 then {unmatched `\.\}' aborts the line} 10834 begin repeat get_token; until cur_tok=0; 10835 align_state:=1000000; goto done; 10836 end; 10837 store_new_token(cur_tok); 10838 end; 10839done: end_file_reading 10840 10841@ Here we input on-line into the |buffer| array, prompting the user explicitly 10842if |n>=0|. The value of |n| is set negative so that additional prompts 10843will not be given in the case of multi-line input. 10844 10845@<Input for \.{\\read} from the terminal@>= 10846if interaction>nonstop_mode then 10847 if n<0 then prompt_input("") 10848 else begin wake_up_terminal; 10849 print_ln; sprint_cs(r); prompt_input("="); n:=-1; 10850 end 10851else fatal_error("*** (cannot \read from terminal in nonstop modes)") 10852@.cannot \\read@> 10853 10854@ The first line of a file must be treated specially, since |input_ln| 10855must be told not to start with |get|. 10856@^system dependencies@> 10857 10858@<Input the first line of |read_file[m]|@>= 10859if input_ln(read_file[m],false) then read_open[m]:=normal 10860else begin u_close(read_file[m]); read_open[m]:=closed; 10861 end 10862 10863@ An empty line is appended at the end of a |read_file|. 10864@^empty line at end of file@> 10865 10866@<Input the next line of |read_file[m]|@>= 10867begin if not input_ln(read_file[m],true) then 10868 begin u_close(read_file[m]); read_open[m]:=closed; 10869 if align_state<>1000000 then 10870 begin runaway; 10871 print_err("File ended within "); print_esc("read"); 10872@.File ended within \\read@> 10873 help1("This \read has unbalanced braces."); 10874 align_state:=1000000; error; 10875 end; 10876 end; 10877end 10878 10879@* \[28] Conditional processing. 10880We consider now the way \TeX\ handles various kinds of \.{\\if} commands. 10881 10882@d unless_code=32 {amount added for `\.{\\unless}' prefix} 10883@# 10884@d if_char_code=0 { `\.{\\if}' } 10885@d if_cat_code=1 { `\.{\\ifcat}' } 10886@d if_int_code=2 { `\.{\\ifnum}' } 10887@d if_dim_code=3 { `\.{\\ifdim}' } 10888@d if_odd_code=4 { `\.{\\ifodd}' } 10889@d if_vmode_code=5 { `\.{\\ifvmode}' } 10890@d if_hmode_code=6 { `\.{\\ifhmode}' } 10891@d if_mmode_code=7 { `\.{\\ifmmode}' } 10892@d if_inner_code=8 { `\.{\\ifinner}' } 10893@d if_void_code=9 { `\.{\\ifvoid}' } 10894@d if_hbox_code=10 { `\.{\\ifhbox}' } 10895@d if_vbox_code=11 { `\.{\\ifvbox}' } 10896@d ifx_code=12 { `\.{\\ifx}' } 10897@d if_eof_code=13 { `\.{\\ifeof}' } 10898@d if_true_code=14 { `\.{\\iftrue}' } 10899@d if_false_code=15 { `\.{\\iffalse}' } 10900@d if_case_code=16 { `\.{\\ifcase}' } 10901@d if_primitive_code=21 { `\.{\\ifprimitive}' } 10902 10903@<Put each...@>= 10904primitive("if",if_test,if_char_code); 10905@!@:if_char_}{\.{\\if} primitive@> 10906primitive("ifcat",if_test,if_cat_code); 10907@!@:if_cat_code_}{\.{\\ifcat} primitive@> 10908primitive("ifnum",if_test,if_int_code); 10909@!@:if_int_}{\.{\\ifnum} primitive@> 10910primitive("ifdim",if_test,if_dim_code); 10911@!@:if_dim_}{\.{\\ifdim} primitive@> 10912primitive("ifodd",if_test,if_odd_code); 10913@!@:if_odd_}{\.{\\ifodd} primitive@> 10914primitive("ifvmode",if_test,if_vmode_code); 10915@!@:if_vmode_}{\.{\\ifvmode} primitive@> 10916primitive("ifhmode",if_test,if_hmode_code); 10917@!@:if_hmode_}{\.{\\ifhmode} primitive@> 10918primitive("ifmmode",if_test,if_mmode_code); 10919@!@:if_mmode_}{\.{\\ifmmode} primitive@> 10920primitive("ifinner",if_test,if_inner_code); 10921@!@:if_inner_}{\.{\\ifinner} primitive@> 10922primitive("ifvoid",if_test,if_void_code); 10923@!@:if_void_}{\.{\\ifvoid} primitive@> 10924primitive("ifhbox",if_test,if_hbox_code); 10925@!@:if_hbox_}{\.{\\ifhbox} primitive@> 10926primitive("ifvbox",if_test,if_vbox_code); 10927@!@:if_vbox_}{\.{\\ifvbox} primitive@> 10928primitive("ifx",if_test,ifx_code); 10929@!@:ifx_}{\.{\\ifx} primitive@> 10930primitive("ifeof",if_test,if_eof_code); 10931@!@:if_eof_}{\.{\\ifeof} primitive@> 10932primitive("iftrue",if_test,if_true_code); 10933@!@:if_true_}{\.{\\iftrue} primitive@> 10934primitive("iffalse",if_test,if_false_code); 10935@!@:if_false_}{\.{\\iffalse} primitive@> 10936primitive("ifcase",if_test,if_case_code); 10937@!@:if_case_}{\.{\\ifcase} primitive@> 10938primitive("ifprimitive",if_test,if_primitive_code); 10939@!@:if_primitive_}{\.{\\ifprimitive} primitive@> 10940 10941@ @<Cases of |print_cmd_chr|...@>= 10942if_test: begin if chr_code>=unless_code then print_esc("unless"); 10943case chr_code mod unless_code of 10944 if_cat_code:print_esc("ifcat"); 10945 if_int_code:print_esc("ifnum"); 10946 if_dim_code:print_esc("ifdim"); 10947 if_odd_code:print_esc("ifodd"); 10948 if_vmode_code:print_esc("ifvmode"); 10949 if_hmode_code:print_esc("ifhmode"); 10950 if_mmode_code:print_esc("ifmmode"); 10951 if_inner_code:print_esc("ifinner"); 10952 if_void_code:print_esc("ifvoid"); 10953 if_hbox_code:print_esc("ifhbox"); 10954 if_vbox_code:print_esc("ifvbox"); 10955 ifx_code:print_esc("ifx"); 10956 if_eof_code:print_esc("ifeof"); 10957 if_true_code:print_esc("iftrue"); 10958 if_false_code:print_esc("iffalse"); 10959 if_case_code:print_esc("ifcase"); 10960 if_primitive_code:print_esc("ifprimitive"); 10961 @/@<Cases of |if_test| for |print_cmd_chr|@>@/ 10962 othercases print_esc("if") 10963 endcases; 10964end; 10965 10966@ Conditions can be inside conditions, and this nesting has a stack 10967that is independent of the |save_stack|. 10968 10969Four global variables represent the top of the condition stack: 10970|cond_ptr| points to pushed-down entries, if any; |if_limit| specifies 10971the largest code of a |fi_or_else| command that is syntactically legal; 10972|cur_if| is the name of the current type of conditional; and |if_line| 10973is the line number at which it began. 10974 10975If no conditions are currently in progress, the condition stack has the 10976special state |cond_ptr=null|, |if_limit=normal|, |cur_if=0|, |if_line=0|. 10977Otherwise |cond_ptr| points to a two-word node; the |type|, |subtype|, and 10978|link| fields of the first word contain |if_limit|, |cur_if|, and 10979|cond_ptr| at the next level, and the second word contains the 10980corresponding |if_line|. 10981 10982@d if_node_size=2 {number of words in stack entry for conditionals} 10983@d if_line_field(#)==mem[#+1].int 10984@d if_code=1 {code for \.{\\if...} being evaluated} 10985@d fi_code=2 {code for \.{\\fi}} 10986@d else_code=3 {code for \.{\\else}} 10987@d or_code=4 {code for \.{\\or}} 10988 10989@<Glob...@>= 10990@!cond_ptr:pointer; {top of the condition stack} 10991@!if_limit:normal..or_code; {upper bound on |fi_or_else| codes} 10992@!cur_if:small_number; {type of conditional being worked on} 10993@!if_line:integer; {line where that conditional began} 10994 10995@ @<Set init...@>= 10996cond_ptr:=null; if_limit:=normal; cur_if:=0; if_line:=0; 10997 10998@ @<Put each...@>= 10999primitive("fi",fi_or_else,fi_code); 11000@!@:fi_}{\.{\\fi} primitive@> 11001text(frozen_fi):="fi"; eqtb[frozen_fi]:=eqtb[cur_val]; 11002primitive("or",fi_or_else,or_code); 11003@!@:or_}{\.{\\or} primitive@> 11004primitive("else",fi_or_else,else_code); 11005@!@:else_}{\.{\\else} primitive@> 11006 11007@ @<Cases of |print_cmd_chr|...@>= 11008fi_or_else: if chr_code=fi_code then print_esc("fi") 11009 else if chr_code=or_code then print_esc("or") 11010 else print_esc("else"); 11011 11012@ When we skip conditional text, we keep track of the line number 11013where skipping began, for use in error messages. 11014 11015@<Glob...@>= 11016@!skip_line:integer; {skipping began here} 11017 11018@ Here is a procedure that ignores text until coming to an \.{\\or}, 11019\.{\\else}, or \.{\\fi} at level zero of $\.{\\if}\ldots\.{\\fi}$ 11020nesting. After it has acted, |cur_chr| will indicate the token that 11021was found, but |cur_tok| will not be set (because this makes the 11022procedure run faster). 11023 11024@p procedure pass_text; 11025label done; 11026var l:integer; {level of $\.{\\if}\ldots\.{\\fi}$ nesting} 11027@!save_scanner_status:small_number; {|scanner_status| upon entry} 11028begin save_scanner_status:=scanner_status; scanner_status:=skipping; l:=0; 11029skip_line:=line; 11030loop@+ begin get_next; 11031 if cur_cmd=fi_or_else then 11032 begin if l=0 then goto done; 11033 if cur_chr=fi_code then decr(l); 11034 end 11035 else if cur_cmd=if_test then incr(l); 11036 end; 11037done: scanner_status:=save_scanner_status; 11038if tracing_ifs>0 then show_cur_cmd_chr; 11039end; 11040 11041@ When we begin to process a new \.{\\if}, we set |if_limit:=if_code|; then 11042if\/ \.{\\or} or \.{\\else} or \.{\\fi} occurs before the current \.{\\if} 11043condition has been evaluated, \.{\\relax} will be inserted. 11044For example, a sequence of commands like `\.{\\ifvoid1\\else...\\fi}' 11045would otherwise require something after the `\.1'. 11046 11047@<Push the condition stack@>= 11048begin p:=get_node(if_node_size); link(p):=cond_ptr; type(p):=if_limit; 11049subtype(p):=cur_if; if_line_field(p):=if_line; 11050cond_ptr:=p; cur_if:=cur_chr; if_limit:=if_code; if_line:=line; 11051end 11052 11053@ @<Pop the condition stack@>= 11054begin if if_stack[in_open]=cond_ptr then if_warning; 11055 {conditionals possibly not properly nested with files} 11056p:=cond_ptr; if_line:=if_line_field(p); 11057cur_if:=subtype(p); if_limit:=type(p); cond_ptr:=link(p); 11058free_node(p,if_node_size); 11059end 11060 11061@ Here's a procedure that changes the |if_limit| code corresponding to 11062a given value of |cond_ptr|. 11063 11064@p procedure change_if_limit(@!l:small_number;@!p:pointer); 11065label exit; 11066var q:pointer; 11067begin if p=cond_ptr then if_limit:=l {that's the easy case} 11068else begin q:=cond_ptr; 11069 loop@+ begin if q=null then confusion("if"); 11070@:this can't happen if}{\quad if@> 11071 if link(q)=p then 11072 begin type(q):=l; return; 11073 end; 11074 q:=link(q); 11075 end; 11076 end; 11077exit:end; 11078 11079@ A condition is started when the |expand| procedure encounters 11080an |if_test| command; in that case |expand| reduces to |conditional|, 11081which is a recursive procedure. 11082@^recursion@> 11083 11084@p procedure conditional; 11085label exit,common_ending; 11086var b:boolean; {is the condition true?} 11087@!e:boolean; {keep track of nested csnames} 11088@!r:"<"..">"; {relation to be evaluated} 11089@!m,@!n:integer; {to be tested against the second operand} 11090@!p,@!q:pointer; {for traversing token lists in \.{\\ifx} tests} 11091@!save_scanner_status:small_number; {|scanner_status| upon entry} 11092@!save_cond_ptr:pointer; {|cond_ptr| corresponding to this conditional} 11093@!this_if:small_number; {type of this conditional} 11094@!is_unless:boolean; {was this if preceded by `\.{\\unless}' ?} 11095begin if tracing_ifs>0 then if tracing_commands<=1 then show_cur_cmd_chr; 11096@<Push the condition stack@>;@+save_cond_ptr:=cond_ptr; 11097is_unless:=(cur_chr>=unless_code); this_if:=cur_chr mod unless_code;@/ 11098@<Either process \.{\\ifcase} or set |b| to the value of a boolean condition@>; 11099if is_unless then b:=not b; 11100if tracing_commands>1 then @<Display the value of |b|@>; 11101if b then 11102 begin change_if_limit(else_code,save_cond_ptr); 11103 return; {wait for \.{\\else} or \.{\\fi}} 11104 end; 11105@<Skip to \.{\\else} or \.{\\fi}, then |goto common_ending|@>; 11106common_ending: if cur_chr=fi_code then @<Pop the condition stack@> 11107else if_limit:=fi_code; {wait for \.{\\fi}} 11108exit:end; 11109 11110@ In a construction like `\.{\\if\\iftrue abc\\else d\\fi}', the first 11111\.{\\else} that we come to after learning that the \.{\\if} is false is 11112not the \.{\\else} we're looking for. Hence the following curious 11113logic is needed. 11114 11115@ @<Skip to \.{\\else} or \.{\\fi}...@>= 11116loop@+ begin pass_text; 11117 if cond_ptr=save_cond_ptr then 11118 begin if cur_chr<>or_code then goto common_ending; 11119 print_err("Extra "); print_esc("or"); 11120@.Extra \\or@> 11121 help1("I'm ignoring this; it doesn't match any \if."); 11122 error; 11123 end 11124 else if cur_chr=fi_code then @<Pop the condition stack@>; 11125 end 11126 11127@ @<Either process \.{\\ifcase} or set |b|...@>= 11128case this_if of 11129if_char_code, if_cat_code: @<Test if two characters match@>; 11130if_int_code, if_dim_code: @<Test relation between integers or dimensions@>; 11131if_odd_code: @<Test if an integer is odd@>; 11132if_vmode_code: b:=(abs(mode)=vmode); 11133if_hmode_code: b:=(abs(mode)=hmode); 11134if_mmode_code: b:=(abs(mode)=mmode); 11135if_inner_code: b:=(mode<0); 11136if_void_code, if_hbox_code, if_vbox_code: @<Test box register status@>; 11137ifx_code: @<Test if two tokens match@>; 11138if_eof_code: begin scan_four_bit_int; b:=(read_open[cur_val]=closed); 11139 end; 11140if_true_code: b:=true; 11141if_false_code: b:=false; 11142@/@<Cases for |conditional|@>@/ 11143if_case_code: @<Select the appropriate case 11144 and |return| or |goto common_ending|@>; 11145if_primitive_code: begin 11146 save_scanner_status:=scanner_status; 11147 scanner_status:=normal; 11148 get_next; 11149 scanner_status:=save_scanner_status; 11150 if cur_cs < hash_base then 11151 m:=prim_lookup(cur_cs-257) 11152 else 11153 m:=prim_lookup(text(cur_cs)); 11154 b :=((cur_cmd<>undefined_cs) and 11155 (m<>undefined_primitive) and 11156 (cur_cmd=prim_eq_type(m)) and 11157 (cur_chr=prim_equiv(m))); 11158 end; 11159end {there are no other cases} 11160 11161@ @<Display the value of |b|@>= 11162begin begin_diagnostic; 11163if b then print("{true}")@+else print("{false}"); 11164end_diagnostic(false); 11165end 11166 11167@ Here we use the fact that |"<"|, |"="|, and |">"| are consecutive ASCII 11168codes. 11169@^ASCII code@> 11170 11171@<Test relation between integers or dimensions@>= 11172begin if this_if=if_int_code then scan_int@+else scan_normal_dimen; 11173n:=cur_val; @<Get the next non-blank non-call...@>; 11174if (cur_tok>=other_token+"<")and(cur_tok<=other_token+">") then 11175 r:=cur_tok-other_token 11176else begin print_err("Missing = inserted for "); 11177@.Missing = inserted@> 11178 print_cmd_chr(if_test,this_if); 11179 help1("I was expecting to see `<', `=', or `>'. Didn't."); 11180 back_error; r:="="; 11181 end; 11182if this_if=if_int_code then scan_int@+else scan_normal_dimen; 11183case r of 11184"<": b:=(n<cur_val); 11185"=": b:=(n=cur_val); 11186">": b:=(n>cur_val); 11187end; 11188end 11189 11190@ @<Test if an integer is odd@>= 11191begin scan_int; b:=odd(cur_val); 11192end 11193 11194@ @<Test box register status@>= 11195begin scan_register_num; fetch_box(p); 11196if this_if=if_void_code then b:=(p=null) 11197else if p=null then b:=false 11198else if this_if=if_hbox_code then b:=(type(p)=hlist_node) 11199else b:=(type(p)=vlist_node); 11200end 11201 11202@ An active character will be treated as category 13 following 11203\.{\\if\\noexpand} or following \.{\\ifcat\\noexpand}. We use the fact that 11204active characters have the smallest tokens, among all control sequences. 11205 11206@d get_x_token_or_active_char==@t@>@; 11207 begin get_x_token; 11208 if cur_cmd=relax then if cur_chr=no_expand_flag then 11209 begin cur_cmd:=active_char; 11210 cur_chr:=cur_tok-cs_token_flag-active_base; 11211 end; 11212 end 11213 11214@<Test if two characters match@>= 11215begin get_x_token_or_active_char; 11216if (cur_cmd>active_char)or(cur_chr>biggest_usv) then {not a character} 11217 begin m:=relax; n:=too_big_usv; 11218 end 11219else begin m:=cur_cmd; n:=cur_chr; 11220 end; 11221get_x_token_or_active_char; 11222if (cur_cmd>active_char)or(cur_chr>biggest_usv) then 11223 begin cur_cmd:=relax; cur_chr:=too_big_usv; 11224 end; 11225if this_if=if_char_code then b:=(n=cur_chr)@+else b:=(m=cur_cmd); 11226end 11227 11228@ Note that `\.{\\ifx}' will declare two macros different if one is \\{long} 11229or \\{outer} and the other isn't, even though the texts of the macros are 11230the same. 11231 11232We need to reset |scanner_status|, since \.{\\outer} control sequences 11233are allowed, but we might be scanning a macro definition or preamble. 11234 11235@<Test if two tokens match@>= 11236begin save_scanner_status:=scanner_status; scanner_status:=normal; 11237get_next; n:=cur_cs; p:=cur_cmd; q:=cur_chr; 11238get_next; if cur_cmd<>p then b:=false 11239else if cur_cmd<call then b:=(cur_chr=q) 11240else @<Test if two macro texts match@>; 11241scanner_status:=save_scanner_status; 11242end 11243 11244@ Note also that `\.{\\ifx}' decides that macros \.{\\a} and \.{\\b} are 11245different in examples like this: 11246$$\vbox{\halign{\.{#}\hfil&\qquad\.{#}\hfil\cr 11247 {}\\def\\a\{\\c\}& 11248 {}\\def\\c\{\}\cr 11249 {}\\def\\b\{\\d\}& 11250 {}\\def\\d\{\}\cr}}$$ 11251 11252@<Test if two macro texts match@>= 11253begin p:=link(cur_chr); q:=link(equiv(n)); {omit reference counts} 11254if p=q then b:=true 11255else begin while (p<>null)and(q<>null) do 11256 if info(p)<>info(q) then p:=null 11257 else begin p:=link(p); q:=link(q); 11258 end; 11259 b:=((p=null)and(q=null)); 11260 end; 11261end 11262 11263@ @<Select the appropriate case and |return| or |goto common_ending|@>= 11264begin scan_int; n:=cur_val; {|n| is the number of cases to pass} 11265if tracing_commands>1 then 11266 begin begin_diagnostic; print("{case "); print_int(n); print_char("}"); 11267 end_diagnostic(false); 11268 end; 11269while n<>0 do 11270 begin pass_text; 11271 if cond_ptr=save_cond_ptr then 11272 if cur_chr=or_code then decr(n) 11273 else goto common_ending 11274 else if cur_chr=fi_code then @<Pop the condition stack@>; 11275 end; 11276change_if_limit(or_code,save_cond_ptr); 11277return; {wait for \.{\\or}, \.{\\else}, or \.{\\fi}} 11278end 11279 11280@ The processing of conditionals is complete except for the following 11281code, which is actually part of |expand|. It comes into play when 11282\.{\\or}, \.{\\else}, or \.{\\fi} is scanned. 11283 11284@<Terminate the current conditional and skip to \.{\\fi}@>= 11285begin if tracing_ifs>0 then if tracing_commands<=1 then show_cur_cmd_chr; 11286if cur_chr>if_limit then 11287 if if_limit=if_code then insert_relax {condition not yet evaluated} 11288 else begin print_err("Extra "); print_cmd_chr(fi_or_else,cur_chr); 11289@.Extra \\or@> 11290@.Extra \\else@> 11291@.Extra \\fi@> 11292 help1("I'm ignoring this; it doesn't match any \if."); 11293 error; 11294 end 11295else begin while cur_chr<>fi_code do pass_text; {skip to \.{\\fi}} 11296 @<Pop the condition stack@>; 11297 end; 11298end 11299 11300@* \[29] File names. 11301It's time now to fret about file names. Besides the fact that different 11302operating systems treat files in different ways, we must cope with the 11303fact that completely different naming conventions are used by different 11304groups of people. The following programs show what is required for one 11305particular operating system; similar routines for other systems are not 11306difficult to devise. 11307@^fingers@> 11308@^system dependencies@> 11309 11310\TeX\ assumes that a file name has three parts: the name proper; its 11311``extension''; and a ``file area'' where it is found in an external file 11312system. The extension of an input file or a write file is assumed to be 11313`\.{.tex}' unless otherwise specified; it is `\.{.log}' on the 11314transcript file that records each run of \TeX; it is `\.{.tfm}' on the font 11315metric files that describe characters in the fonts \TeX\ uses; it is 11316`\.{.dvi}' on the output files that specify typesetting information; and it 11317is `\.{.fmt}' on the format files written by \.{INITEX} to initialize \TeX. 11318The file area can be arbitrary on input files, but files are usually 11319output to the user's current area. If an input file cannot be 11320found on the specified area, \TeX\ will look for it on a special system 11321area; this special area is intended for commonly used input files like 11322\.{webmac.tex}. 11323 11324Simple uses of \TeX\ refer only to file names that have no explicit 11325extension or area. For example, a person usually says `\.{\\input} \.{paper}' 11326or `\.{\\font\\tenrm} \.= \.{helvetica}' instead of `\.{\\input} 11327\.{paper.new}' or `\.{\\font\\tenrm} \.= \.{<csd.knuth>test}'. Simple file 11328names are best, because they make the \TeX\ source files portable; 11329whenever a file name consists entirely of letters and digits, it should be 11330treated in the same way by all implementations of \TeX. However, users 11331need the ability to refer to other files in their environment, especially 11332when responding to error messages concerning unopenable files; therefore 11333we want to let them use the syntax that appears in their favorite 11334operating system. 11335 11336@ In order to isolate the system-dependent aspects of file names, the 11337@^system dependencies@> 11338system-independent parts of \TeX\ are expressed in terms 11339of three system-dependent 11340procedures called |begin_name|, |more_name|, and |end_name|. In 11341essence, if the user-specified characters of the file name are $c_1\ldots c_n$, 11342the system-independent driver program does the operations 11343$$|begin_name|;\,|more_name|(c_1);\,\ldots\,;\,|more_name|(c_n); 11344\,|end_name|.$$ 11345These three procedures communicate with each other via global variables. 11346Afterwards the file name will appear in the string pool as three strings 11347called |cur_name|\penalty10000\hskip-.05em, 11348|cur_area|, and |cur_ext|; the latter two are null (i.e., 11349|""|), unless they were explicitly specified by the user. 11350 11351Actually the situation is slightly more complicated, because \TeX\ needs 11352to know when the file name ends. The |more_name| routine is a function 11353(with side effects) that returns |true| on the calls |more_name|$(c_1)$, 11354\dots, |more_name|$(c_{n-1})$. The final call |more_name|$(c_n)$ 11355returns |false|; or, it returns |true| and the token following $c_n$ is 11356something like `\.{\\hbox}' (i.e., not a character). In other words, 11357|more_name| is supposed to return |true| unless it is sure that the 11358file name has been completely scanned; and |end_name| is supposed to be able 11359to finish the assembly of |cur_name|, |cur_area|, and |cur_ext| regardless of 11360whether $|more_name|(c_n)$ returned |true| or |false|. 11361 11362@<Glob...@>= 11363@!cur_name:str_number; {name of file just scanned} 11364@!cur_area:str_number; {file area just scanned, or \.{""}} 11365@!cur_ext:str_number; {file extension just scanned, or \.{""}} 11366 11367@ The file names we shall deal with for illustrative purposes have the 11368following structure: If the name contains `\.>' or `\.:', the file area 11369consists of all characters up to and including the final such character; 11370otherwise the file area is null. If the remaining file name contains 11371`\..', the file extension consists of all such characters from the first 11372remaining `\..' to the end, otherwise the file extension is null. 11373@^system dependencies@> 11374 11375We can scan such file names easily by using two global variables that keep track 11376of the occurrences of area and extension delimiters: 11377 11378@<Glob...@>= 11379@!area_delimiter:pool_pointer; {the most recent `\.>' or `\.:', if any} 11380@!ext_delimiter:pool_pointer; {the relevant `\..', if any} 11381@!file_name_quote_char:UTF16_code; 11382 11383@ Input files that can't be found in the user's area may appear in a standard 11384system area called |TEX_area|. Font metric files whose areas are not given 11385explicitly are assumed to appear in a standard system area called 11386|TEX_font_area|. These system area names will, of course, vary from place 11387to place. 11388@^system dependencies@> 11389 11390@d TEX_area=="TeXinputs:" 11391@.TeXinputs@> 11392@d TEX_font_area=="TeXfonts:" 11393@.TeXfonts@> 11394 11395@ Here now is the first of the system-dependent routines for file name scanning. 11396@^system dependencies@> 11397 11398@p procedure begin_name; 11399begin area_delimiter:=0; ext_delimiter:=0; 11400file_name_quote_char:=0; 11401end; 11402 11403@ And here's the second. The string pool might change as the file name is 11404being scanned, since a new \.{\\csname} might be entered; therefore we keep 11405|area_delimiter| and |ext_delimiter| relative to the beginning of the current 11406string, instead of assigning an absolute address like |pool_ptr| to them. 11407@^system dependencies@> 11408 11409@p function more_name(@!c:ASCII_code):boolean; 11410begin if c=" " then more_name:=false 11411else begin str_room(1); append_char(c); {contribute |c| to the current string} 11412 if (c=">")or(c=":") then 11413 begin area_delimiter:=cur_length; ext_delimiter:=0; 11414 end 11415 else if (c=".")and(ext_delimiter=0) then ext_delimiter:=cur_length; 11416 more_name:=true; 11417 end; 11418end; 11419 11420@ The third. 11421@^system dependencies@> 11422 11423@p procedure end_name; 11424begin if str_ptr+3>max_strings then 11425 overflow("number of strings",max_strings-init_str_ptr); 11426@:TeX capacity exceeded number of strings}{\quad number of strings@> 11427if area_delimiter=0 then cur_area:="" 11428else begin cur_area:=str_ptr; 11429 str_start_macro(str_ptr+1):=str_start_macro(str_ptr)+area_delimiter; incr(str_ptr); 11430 end; 11431if ext_delimiter=0 then 11432 begin cur_ext:=""; cur_name:=make_string; 11433 end 11434else begin cur_name:=str_ptr; 11435 str_start_macro(str_ptr+1):=str_start_macro(str_ptr)+ext_delimiter-area_delimiter-1; 11436 incr(str_ptr); cur_ext:=make_string; 11437 end; 11438end; 11439 11440@ Conversely, here is a routine that takes three strings and prints a file 11441name that might have produced them. (The routine is system dependent, because 11442some operating systems put the file area last instead of first.) 11443@^system dependencies@> 11444 11445@<Basic printing...@>= 11446procedure print_file_name(@!n,@!a,@!e:integer); 11447begin slow_print(a); slow_print(n); slow_print(e); 11448end; 11449 11450@ Another system-dependent routine is needed to convert three internal 11451\TeX\ strings 11452into the |name_of_file| value that is used to open files. The present code 11453allows both lowercase and uppercase letters in the file name. 11454@^system dependencies@> 11455 11456@d append_to_name(#)==begin c:=#; incr(k); 11457 if k<=file_name_size then name_of_file[k]:=xchr[c]; 11458 end 11459 11460@p procedure pack_file_name(@!n,@!a,@!e:str_number); 11461var k:integer; {number of positions filled in |name_of_file|} 11462@!c: ASCII_code; {character being packed} 11463@!j:pool_pointer; {index into |str_pool|} 11464begin k:=0; 11465for j:=str_start_macro(a) to str_start_macro(a+1)-1 do append_to_name(so(str_pool[j])); 11466for j:=str_start_macro(n) to str_start_macro(n+1)-1 do append_to_name(so(str_pool[j])); 11467for j:=str_start_macro(e) to str_start_macro(e+1)-1 do append_to_name(so(str_pool[j])); 11468if k<=file_name_size then name_length:=k@+else name_length:=file_name_size; 11469for k:=name_length+1 to file_name_size do name_of_file[k]:=' '; 11470end; 11471 11472@ A messier routine is also needed, since format file names must be scanned 11473before \TeX's string mechanism has been initialized. We shall use the 11474global variable |TEX_format_default| to supply the text for default system areas 11475and extensions related to format files. 11476@^system dependencies@> 11477 11478@d format_default_length=20 {length of the |TEX_format_default| string} 11479@d format_area_length=11 {length of its area part} 11480@d format_ext_length=4 {length of its `\.{.fmt}' part} 11481@d format_extension=".fmt" {the extension, as a \.{WEB} constant} 11482 11483@<Glob...@>= 11484@!TEX_format_default:packed array[1..format_default_length] of char; 11485 11486@ @<Set init...@>= 11487TEX_format_default:='TeXformats:plain.fmt'; 11488@.TeXformats@> 11489@.plain@> 11490@^system dependencies@> 11491 11492@ @<Check the ``constant'' values for consistency@>= 11493if format_default_length>file_name_size then bad:=31; 11494 11495@ Here is the messy routine that was just mentioned. It sets |name_of_file| 11496from the first |n| characters of |TEX_format_default|, followed by 11497|buffer[a..b]|, followed by the last |format_ext_length| characters of 11498|TEX_format_default|. 11499 11500We dare not give error messages here, since \TeX\ calls this routine before 11501the |error| routine is ready to roll. Instead, we simply drop excess characters, 11502since the error will be detected in another way when a strange file name 11503isn't found. 11504@^system dependencies@> 11505 11506@p procedure pack_buffered_name(@!n:small_number;@!a,@!b:integer); 11507var k:integer; {number of positions filled in |name_of_file|} 11508@!c: ASCII_code; {character being packed} 11509@!j:integer; {index into |buffer| or |TEX_format_default|} 11510begin if n+b-a+1+format_ext_length>file_name_size then 11511 b:=a+file_name_size-n-1-format_ext_length; 11512k:=0; 11513for j:=1 to n do append_to_name(xord[TEX_format_default[j]]); 11514for j:=a to b do append_to_name(buffer[j]); 11515for j:=format_default_length-format_ext_length+1 to format_default_length do 11516 append_to_name(xord[TEX_format_default[j]]); 11517if k<=file_name_size then name_length:=k@+else name_length:=file_name_size; 11518for k:=name_length+1 to file_name_size do name_of_file[k]:=' '; 11519end; 11520 11521@ Here is the only place we use |pack_buffered_name|. This part of the program 11522becomes active when a ``virgin'' \TeX\ is trying to get going, just after 11523the preliminary initialization, or when the user is substituting another 11524format file by typing `\.\&' after the initial `\.{**}' prompt. The buffer 11525contains the first line of input in |buffer[loc..(last-1)]|, where 11526|loc<last| and |buffer[loc]<>" "|. 11527 11528@<Declare the function called |open_fmt_file|@>= 11529function open_fmt_file:boolean; 11530label found,exit; 11531var j:0..buf_size; {the first space after the format file name} 11532begin j:=loc; 11533if buffer[loc]="&" then 11534 begin incr(loc); j:=loc; buffer[last]:=" "; 11535 while buffer[j]<>" " do incr(j); 11536 pack_buffered_name(0,loc,j-1); {try first without the system file area} 11537 if w_open_in(fmt_file) then goto found; 11538 pack_buffered_name(format_area_length,loc,j-1); 11539 {now try the system format file area} 11540 if w_open_in(fmt_file) then goto found; 11541 wake_up_terminal; 11542 wterm_ln('Sorry, I can''t find that format;',' will try PLAIN.'); 11543@.Sorry, I can't find...@> 11544 update_terminal; 11545 end; 11546 {now pull out all the stops: try for the system \.{plain} file} 11547pack_buffered_name(format_default_length-format_ext_length,1,0); 11548if not w_open_in(fmt_file) then 11549 begin wake_up_terminal; 11550 wterm_ln('I can''t find the PLAIN format file!'); 11551@.I can't find PLAIN...@> 11552@.plain@> 11553 open_fmt_file:=false; return; 11554 end; 11555found:loc:=j; open_fmt_file:=true; 11556exit:end; 11557 11558@ Operating systems often make it possible to determine the exact name (and 11559possible version number) of a file that has been opened. The following routine, 11560which simply makes a \TeX\ string from the value of |name_of_file|, should 11561ideally be changed to deduce the full name of file~|f|, which is the file 11562most recently opened, if it is possible to do this in a \PASCAL\ program. 11563@^system dependencies@> 11564 11565This routine might be called after string memory has overflowed, hence 11566we dare not use `|str_room|'. 11567 11568@p function make_name_string:str_number; 11569var k:0..file_name_size; {index into |name_of_file|} 11570begin if (pool_ptr+name_length>pool_size)or(str_ptr=max_strings)or 11571 (cur_length>0) then 11572 make_name_string:="?" 11573else begin 11574 make_utf16_name; 11575 for k:=0 to name_length16-1 do append_char(name_of_file16[k]); 11576 make_name_string:=make_string; 11577 end; 11578end; 11579function u_make_name_string(var f:unicode_file):str_number; 11580begin u_make_name_string:=make_name_string; 11581end; 11582function a_make_name_string(var f:alpha_file):str_number; 11583begin a_make_name_string:=make_name_string; 11584end; 11585function b_make_name_string(var f:byte_file):str_number; 11586begin b_make_name_string:=make_name_string; 11587end; 11588function w_make_name_string(var f:word_file):str_number; 11589begin w_make_name_string:=make_name_string; 11590end; 11591 11592@ Now let's consider the ``driver'' 11593routines by which \TeX\ deals with file names 11594in a system-independent manner. First comes a procedure that looks for a 11595file name in the input by calling |get_x_token| for the information. 11596 11597@p procedure scan_file_name; 11598label done; 11599begin name_in_progress:=true; begin_name; 11600@<Get the next non-blank non-call...@>; 11601loop@+begin if (cur_cmd>other_char)or(cur_chr>biggest_char) then 11602 {not a character} 11603 begin back_input; goto done; 11604 end; 11605 if not more_name(cur_chr) then goto done; 11606 get_x_token; 11607 end; 11608done: end_name; name_in_progress:=false; 11609end; 11610 11611@ The global variable |name_in_progress| is used to prevent recursive 11612use of |scan_file_name|, since the |begin_name| and other procedures 11613communicate via global variables. Recursion would arise only by 11614devious tricks like `\.{\\input\\input f}'; such attempts at sabotage 11615must be thwarted. Furthermore, |name_in_progress| prevents \.{\\input} 11616@^recursion@> 11617from being initiated when a font size specification is being scanned. 11618 11619Another global variable, |job_name|, contains the file name that was first 11620\.{\\input} by the user. This name is extended by `\.{.log}' and `\.{.dvi}' 11621and `\.{.fmt}' in the names of \TeX's output files. 11622 11623@<Glob...@>= 11624@!name_in_progress:boolean; {is a file name being scanned?} 11625@!job_name:str_number; {principal file name} 11626@!log_opened:boolean; {has the transcript file been opened?} 11627 11628@ Initially |job_name=0|; it becomes nonzero as soon as the true name is known. 11629We have |job_name=0| if and only if the `\.{log}' file has not been opened, 11630except of course for a short time just after |job_name| has become nonzero. 11631 11632@<Initialize the output...@>= 11633job_name:=0; name_in_progress:=false; log_opened:=false; 11634 11635@ Here is a routine that manufactures the output file names, assuming that 11636|job_name<>0|. It ignores and changes the current settings of |cur_area| 11637and |cur_ext|. 11638 11639@d pack_cur_name==pack_file_name(cur_name,cur_area,cur_ext) 11640 11641@p procedure pack_job_name(@!s:str_number); {|s = ".log"|, |output_file_extension|, or 11642 |format_extension|} 11643begin cur_area:=""; cur_ext:=s; 11644cur_name:=job_name; pack_cur_name; 11645end; 11646 11647@ If some trouble arises when \TeX\ tries to open a file, the following 11648routine calls upon the user to supply another file name. Parameter~|s| 11649is used in the error message to identify the type of file; parameter~|e| 11650is the default extension if none is given. Upon exit from the routine, 11651variables |cur_name|, |cur_area|, |cur_ext|, and |name_of_file| are 11652ready for another attempt at file opening. 11653 11654@p procedure prompt_file_name(@!s,@!e:str_number); 11655label done; 11656var k:0..buf_size; {index into |buffer|} 11657begin if interaction=scroll_mode then wake_up_terminal; 11658if s="input file name" then print_err("I can't find file `") 11659@.I can't find file x@> 11660else print_err("I can't write on file `"); 11661@.I can't write on file x@> 11662print_file_name(cur_name,cur_area,cur_ext); print("'."); 11663if e=".tex" then show_context; 11664print_nl("Please type another "); print(s); 11665@.Please type...@> 11666if interaction<scroll_mode then 11667 fatal_error("*** (job aborted, file error in nonstop mode)"); 11668@.job aborted, file error...@> 11669clear_terminal; prompt_input(": "); @<Scan file name in the buffer@>; 11670if cur_ext="" then cur_ext:=e; 11671pack_cur_name; 11672end; 11673 11674@ @<Scan file name in the buffer@>= 11675begin begin_name; k:=first; 11676while (buffer[k]=" ")and(k<last) do incr(k); 11677loop@+ begin if k=last then goto done; 11678 if not more_name(buffer[k]) then goto done; 11679 incr(k); 11680 end; 11681done:end_name; 11682end 11683 11684@ Here's an example of how these conventions are used. Whenever it is time to 11685ship out a box of stuff, we shall use the macro |ensure_dvi_open|. 11686 11687@d ensure_dvi_open==if output_file_name=0 then 11688 begin if job_name=0 then open_log_file; 11689 pack_job_name(output_file_extension); 11690 while not dvi_open_out(dvi_file) do 11691 prompt_file_name("file name for output",output_file_extension); 11692 output_file_name:=b_make_name_string(dvi_file); 11693 end 11694 11695@<Glob...@>= 11696@!output_file_extension: str_number; 11697@!no_pdf_output: boolean; 11698@!dvi_file: byte_file; {the device-independent output goes here} 11699@!output_file_name: str_number; {full name of the output file} 11700@!log_name:str_number; {full name of the log file} 11701 11702@ @<Initialize the output...@>= 11703 output_file_name:=0; 11704 if no_pdf_output then output_file_extension:=".xdv" 11705 else output_file_extension:=".pdf"; 11706 11707@ The |open_log_file| routine is used to open the transcript file and to help 11708it catch up to what has previously been printed on the terminal. 11709 11710@p procedure open_log_file; 11711var old_setting:0..max_selector; {previous |selector| setting} 11712@!k:0..buf_size; {index into |months| and |buffer|} 11713@!l:0..buf_size; {end of first input line} 11714@!months:packed array [1..36] of char; {abbreviations of month names} 11715begin old_setting:=selector; 11716if job_name=0 then job_name:="texput"; 11717@.texput@> 11718pack_job_name(".log"); 11719while not a_open_out(log_file) do @<Try to get a different log file name@>; 11720log_name:=a_make_name_string(log_file); 11721selector:=log_only; log_opened:=true; 11722@<Print the banner line, including the date and time@>; 11723input_stack[input_ptr]:=cur_input; {make sure bottom level is in memory} 11724print_nl("**"); 11725@.**@> 11726l:=input_stack[0].limit_field; {last position of first line} 11727if buffer[l]=end_line_char then decr(l); 11728for k:=1 to l do print(buffer[k]); 11729print_ln; {now the transcript file contains the first line of input} 11730selector:=old_setting+2; {|log_only| or |term_and_log|} 11731end; 11732 11733@ Sometimes |open_log_file| is called at awkward moments when \TeX\ is 11734unable to print error messages or even to |show_context|. 11735The |prompt_file_name| routine can result in a |fatal_error|, but the |error| 11736routine will not be invoked because |log_opened| will be false. 11737 11738The normal idea of |batch_mode| is that nothing at all should be written 11739on the terminal. However, in the unusual case that 11740no log file could be opened, we make an exception and allow 11741an explanatory message to be seen. 11742 11743Incidentally, the program always refers to the log file as a `\.{transcript 11744file}', because some systems cannot use the extension `\.{.log}' for 11745this file. 11746 11747@<Try to get a different log file name@>= 11748begin selector:=term_only; 11749prompt_file_name("transcript file name",".log"); 11750end 11751 11752@ @<Print the banner...@>= 11753begin wlog(banner); 11754slow_print(format_ident); print(" "); 11755print_int(day); print_char(" "); 11756months:='JANFEBMARAPRMAYJUNJULAUGSEPOCTNOVDEC'; 11757for k:=3*month-2 to 3*month do wlog(months[k]); 11758print_char(" "); print_int(year); print_char(" "); 11759print_two(time div 60); print_char(":"); print_two(time mod 60); 11760if eTeX_ex then 11761 begin; wlog_cr; wlog('entering extended mode'); 11762 end; 11763end 11764 11765@ Let's turn now to the procedure that is used to initiate file reading 11766when an `\.{\\input}' command is being processed. 11767 11768@p procedure start_input; {\TeX\ will \.{\\input} something} 11769label done; 11770begin scan_file_name; {set |cur_name| to desired file name} 11771if cur_ext="" then cur_ext:=".tex"; 11772pack_cur_name; 11773loop@+ begin begin_file_reading; {set up |cur_file| and new level of input} 11774 if a_open_in(cur_file) then goto done; 11775 if cur_area="" then 11776 begin pack_file_name(cur_name,TEX_area,cur_ext); 11777 if a_open_in(cur_file) then goto done; 11778 end; 11779 end_file_reading; {remove the level that didn't work} 11780 prompt_file_name("input file name",".tex"); 11781 end; 11782done: name:=a_make_name_string(cur_file); 11783if job_name=0 then 11784 begin job_name:=cur_name; open_log_file; 11785 end; {|open_log_file| doesn't |show_context|, so |limit| 11786 and |loc| needn't be set to meaningful values yet} 11787if term_offset+length(name)>max_print_line-2 then print_ln 11788else if (term_offset>0)or(file_offset>0) then print_char(" "); 11789print_char("("); incr(open_parens); slow_print(name); update_terminal; 11790state:=new_line; 11791if name=str_ptr-1 then {we can conserve string pool space now} 11792 begin flush_string; name:=cur_name; 11793 end; 11794@<Read the first line of the new file@>; 11795end; 11796 11797@ Here we have to remember to tell the |input_ln| routine not to 11798start with a |get|. If the file is empty, it is considered to 11799contain a single blank line. 11800@^system dependencies@> 11801@^empty line at end of file@> 11802 11803@<Read the first line...@>= 11804begin line:=1; 11805if input_ln(cur_file,false) then do_nothing; 11806firm_up_the_line; 11807if end_line_char_inactive then decr(limit) 11808else buffer[limit]:=end_line_char; 11809first:=limit+1; loc:=start; 11810end 11811 11812@* \[30] Font metric data. 11813\TeX\ gets its knowledge about fonts from font metric files, also called 11814\.{TFM} files; the `\.T' in `\.{TFM}' stands for \TeX, 11815but other programs know about them too. 11816@:TFM files}{\.{TFM} files@> 11817@^font metric files@> 11818 11819The information in a \.{TFM} file appears in a sequence of 8-bit bytes. 11820Since the number of bytes is always a multiple of 4, we could 11821also regard the file as a sequence of 32-bit words, but \TeX\ uses the 11822byte interpretation. The format of \.{TFM} files was designed by 11823Lyle Ramshaw in 1980. The intent is to convey a lot of different kinds 11824@^Ramshaw, Lyle Harold@> 11825of information in a compact but useful form. 11826 11827@<Glob...@>= 11828@!tfm_file:byte_file; 11829 11830@ The first 24 bytes (6 words) of a \.{TFM} file contain twelve 16-bit 11831integers that give the lengths of the various subsequent portions 11832of the file. These twelve integers are, in order: 11833$$\vbox{\halign{\hfil#&$\null=\null$#\hfil\cr 11834|lf|&length of the entire file, in words;\cr 11835|lh|&length of the header data, in words;\cr 11836|bc|&smallest character code in the font;\cr 11837|ec|&largest character code in the font;\cr 11838|nw|&number of words in the width table;\cr 11839|nh|&number of words in the height table;\cr 11840|nd|&number of words in the depth table;\cr 11841|ni|&number of words in the italic correction table;\cr 11842|nl|&number of words in the lig/kern table;\cr 11843|nk|&number of words in the kern table;\cr 11844|ne|&number of words in the extensible character table;\cr 11845|np|&number of font parameter words.\cr}}$$ 11846They are all nonnegative and less than $2^{15}$. We must have |bc-1<=ec<=255|, 11847and 11848$$\hbox{|lf=6+lh+(ec-bc+1)+nw+nh+nd+ni+nl+nk+ne+np|.}$$ 11849Note that a font may contain as many as 256 characters (if |bc=0| and |ec=255|), 11850and as few as 0 characters (if |bc=ec+1|). 11851 11852Incidentally, when two or more 8-bit bytes are combined to form an integer of 1185316 or more bits, the most significant bytes appear first in the file. 11854This is called BigEndian order. 11855@!@^BigEndian order@> 11856 11857@ The rest of the \.{TFM} file may be regarded as a sequence of ten data 11858arrays having the informal specification 11859$$\def\arr$[#1]#2${\&{array} $[#1]$ \&{of} #2} 11860\vbox{\halign{\hfil\\{#}&$\,:\,$\arr#\hfil\cr 11861header&|[0..lh-1]@t\\{stuff}@>|\cr 11862char\_info&|[bc..ec]char_info_word|\cr 11863width&|[0..nw-1]fix_word|\cr 11864height&|[0..nh-1]fix_word|\cr 11865depth&|[0..nd-1]fix_word|\cr 11866italic&|[0..ni-1]fix_word|\cr 11867lig\_kern&|[0..nl-1]lig_kern_command|\cr 11868kern&|[0..nk-1]fix_word|\cr 11869exten&|[0..ne-1]extensible_recipe|\cr 11870param&|[1..np]fix_word|\cr}}$$ 11871The most important data type used here is a |@!fix_word|, which is 11872a 32-bit representation of a binary fraction. A |fix_word| is a signed 11873quantity, with the two's complement of the entire word used to represent 11874negation. Of the 32 bits in a |fix_word|, exactly 12 are to the left of the 11875binary point; thus, the largest |fix_word| value is $2048-2^{-20}$, and 11876the smallest is $-2048$. We will see below, however, that all but two of 11877the |fix_word| values must lie between $-16$ and $+16$. 11878 11879@ The first data array is a block of header information, which contains 11880general facts about the font. The header must contain at least two words, 11881|header[0]| and |header[1]|, whose meaning is explained below. 11882Additional header information of use to other software routines might 11883also be included, but \TeX82 does not need to know about such details. 11884For example, 16 more words of header information are in use at the Xerox 11885Palo Alto Research Center; the first ten specify the character coding 11886scheme used (e.g., `\.{XEROX text}' or `\.{TeX math symbols}'), the next five 11887give the font identifier (e.g., `\.{HELVETICA}' or `\.{CMSY}'), and the 11888last gives the ``face byte.'' The program that converts \.{DVI} files 11889to Xerox printing format gets this information by looking at the \.{TFM} 11890file, which it needs to read anyway because of other information that 11891is not explicitly repeated in \.{DVI}~format. 11892 11893\yskip\hang|header[0]| is a 32-bit check sum that \TeX\ will copy into 11894the \.{DVI} output file. Later on when the \.{DVI} file is printed, 11895possibly on another computer, the actual font that gets used is supposed 11896to have a check sum that agrees with the one in the \.{TFM} file used by 11897\TeX. In this way, users will be warned about potential incompatibilities. 11898(However, if the check sum is zero in either the font file or the \.{TFM} 11899file, no check is made.) The actual relation between this check sum and 11900the rest of the \.{TFM} file is not important; the check sum is simply an 11901identification number with the property that incompatible fonts almost 11902always have distinct check sums. 11903@^check sum@> 11904 11905\yskip\hang|header[1]| is a |fix_word| containing the design size of 11906the font, in units of \TeX\ points. This number must be at least 1.0; it is 11907fairly arbitrary, but usually the design size is 10.0 for a ``10 point'' 11908font, i.e., a font that was designed to look best at a 10-point size, 11909whatever that really means. When a \TeX\ user asks for a font 11910`\.{at} $\delta$ \.{pt}', the effect is to override the design size 11911and replace it by $\delta$, and to multiply the $x$ and~$y$ coordinates 11912of the points in the font image by a factor of $\delta$ divided by the 11913design size. {\sl All other dimensions in the\/ \.{TFM} file are 11914|fix_word|\kern-1pt\ numbers in design-size units}, with the exception of 11915|param[1]| (which denotes the slant ratio). Thus, for example, the value 11916of |param[6]|, which defines the \.{em} unit, is often the |fix_word| value 11917$2^{20}=1.0$, since many fonts have a design size equal to one em. 11918The other dimensions must be less than 16 design-size units in absolute 11919value; thus, |header[1]| and |param[1]| are the only |fix_word| 11920entries in the whole \.{TFM} file whose first byte might be something 11921besides 0 or 255. 11922 11923@ Next comes the |char_info| array, which contains one |@!char_info_word| 11924per character. Each word in this part of the file contains six fields 11925packed into four bytes as follows. 11926 11927\yskip\hang first byte: |@!width_index| (8 bits)\par 11928\hang second byte: |@!height_index| (4 bits) times 16, plus |@!depth_index| 11929 (4~bits)\par 11930\hang third byte: |@!italic_index| (6 bits) times 4, plus |@!tag| 11931 (2~bits)\par 11932\hang fourth byte: |@!remainder| (8 bits)\par 11933\yskip\noindent 11934The actual width of a character is \\{width}|[width_index]|, in design-size 11935units; this is a device for compressing information, since many characters 11936have the same width. Since it is quite common for many characters 11937to have the same height, depth, or italic correction, the \.{TFM} format 11938imposes a limit of 16 different heights, 16 different depths, and 1193964 different italic corrections. 11940 11941@!@^italic correction@> 11942The italic correction of a character has two different uses. 11943(a)~In ordinary text, the italic correction is added to the width only if 11944the \TeX\ user specifies `\.{\\/}' after the character. 11945(b)~In math formulas, the italic correction is always added to the width, 11946except with respect to the positioning of subscripts. 11947 11948Incidentally, the relation $\\{width}[0]=\\{height}[0]=\\{depth}[0]= 11949\\{italic}[0]=0$ should always hold, so that an index of zero implies a 11950value of zero. The |width_index| should never be zero unless the 11951character does not exist in the font, since a character is valid if and 11952only if it lies between |bc| and |ec| and has a nonzero |width_index|. 11953 11954@ The |tag| field in a |char_info_word| has four values that explain how to 11955interpret the |remainder| field. 11956 11957\yskip\hangg|tag=0| (|no_tag|) means that |remainder| is unused.\par 11958\hangg|tag=1| (|lig_tag|) means that this character has a ligature/kerning 11959program starting at position |remainder| in the |lig_kern| array.\par 11960\hangg|tag=2| (|list_tag|) means that this character is part of a chain of 11961characters of ascending sizes, and not the largest in the chain. The 11962|remainder| field gives the character code of the next larger character.\par 11963\hangg|tag=3| (|ext_tag|) means that this character code represents an 11964extensible character, i.e., a character that is built up of smaller pieces 11965so that it can be made arbitrarily large. The pieces are specified in 11966|@!exten[remainder]|.\par 11967\yskip\noindent 11968Characters with |tag=2| and |tag=3| are treated as characters with |tag=0| 11969unless they are used in special circumstances in math formulas. For example, 11970the \.{\\sum} operation looks for a |list_tag|, and the \.{\\left} 11971operation looks for both |list_tag| and |ext_tag|. 11972 11973@d no_tag=0 {vanilla character} 11974@d lig_tag=1 {character has a ligature/kerning program} 11975@d list_tag=2 {character has a successor in a charlist} 11976@d ext_tag=3 {character is extensible} 11977 11978@ The |lig_kern| array contains instructions in a simple programming language 11979that explains what to do for special letter pairs. Each word in this array is a 11980|@!lig_kern_command| of four bytes. 11981 11982\yskip\hang first byte: |skip_byte|, indicates that this is the final program 11983 step if the byte is 128 or more, otherwise the next step is obtained by 11984 skipping this number of intervening steps.\par 11985\hang second byte: |next_char|, ``if |next_char| follows the current character, 11986 then perform the operation and stop, otherwise continue.''\par 11987\hang third byte: |op_byte|, indicates a ligature step if less than~128, 11988 a kern step otherwise.\par 11989\hang fourth byte: |remainder|.\par 11990\yskip\noindent 11991In a kern step, an 11992additional space equal to |kern[256*(op_byte-128)+remainder]| is inserted 11993between the current character and |next_char|. This amount is 11994often negative, so that the characters are brought closer together 11995by kerning; but it might be positive. 11996 11997There are eight kinds of ligature steps, having |op_byte| codes $4a+2b+c$ where 11998$0\le a\le b+c$ and $0\le b,c\le1$. The character whose code is 11999|remainder| is inserted between the current character and |next_char|; 12000then the current character is deleted if $b=0$, and |next_char| is 12001deleted if $c=0$; then we pass over $a$~characters to reach the next 12002current character (which may have a ligature/kerning program of its own). 12003 12004If the very first instruction of the |lig_kern| array has |skip_byte=255|, 12005the |next_char| byte is the so-called right boundary character of this font; 12006the value of |next_char| need not lie between |bc| and~|ec|. 12007If the very last instruction of the |lig_kern| array has |skip_byte=255|, 12008there is a special ligature/kerning program for a left boundary character, 12009beginning at location |256*op_byte+remainder|. 12010The interpretation is that \TeX\ puts implicit boundary characters 12011before and after each consecutive string of characters from the same font. 12012These implicit characters do not appear in the output, but they can affect 12013ligatures and kerning. 12014 12015If the very first instruction of a character's |lig_kern| program has 12016|skip_byte>128|, the program actually begins in location 12017|256*op_byte+remainder|. This feature allows access to large |lig_kern| 12018arrays, because the first instruction must otherwise 12019appear in a location |<=255|. 12020 12021Any instruction with |skip_byte>128| in the |lig_kern| array must satisfy 12022the condition 12023$$\hbox{|256*op_byte+remainder<nl|.}$$ 12024If such an instruction is encountered during 12025normal program execution, it denotes an unconditional halt; no ligature 12026or kerning command is performed. 12027 12028@d stop_flag==qi(128) {value indicating `\.{STOP}' in a lig/kern program} 12029@d kern_flag==qi(128) {op code for a kern step} 12030@d skip_byte(#)==#.b0 12031@d next_char(#)==#.b1 12032@d op_byte(#)==#.b2 12033@d rem_byte(#)==#.b3 12034 12035@ Extensible characters are specified by an |@!extensible_recipe|, which 12036consists of four bytes called |@!top|, |@!mid|, |@!bot|, and |@!rep| (in this 12037order). These bytes are the character codes of individual pieces used to 12038build up a large symbol. If |top|, |mid|, or |bot| are zero, they are not 12039present in the built-up result. For example, an extensible vertical line is 12040like an extensible bracket, except that the top and bottom pieces are missing. 12041 12042Let $T$, $M$, $B$, and $R$ denote the respective pieces, or an empty box 12043if the piece isn't present. Then the extensible characters have the form 12044$TR^kMR^kB$ from top to bottom, for some |k>=0|, unless $M$ is absent; 12045in the latter case we can have $TR^kB$ for both even and odd values of~|k|. 12046The width of the extensible character is the width of $R$; and the 12047height-plus-depth is the sum of the individual height-plus-depths of the 12048components used, since the pieces are butted together in a vertical list. 12049 12050@d ext_top(#)==#.b0 {|top| piece in a recipe} 12051@d ext_mid(#)==#.b1 {|mid| piece in a recipe} 12052@d ext_bot(#)==#.b2 {|bot| piece in a recipe} 12053@d ext_rep(#)==#.b3 {|rep| piece in a recipe} 12054 12055@ The final portion of a \.{TFM} file is the |param| array, which is another 12056sequence of |fix_word| values. 12057 12058\yskip\hang|param[1]=slant| is the amount of italic slant, which is used 12059to help position accents. For example, |slant=.25| means that when you go 12060up one unit, you also go .25 units to the right. The |slant| is a pure 12061number; it's the only |fix_word| other than the design size itself that is 12062not scaled by the design size. 12063 12064\hang|param[2]=space| is the normal spacing between words in text. 12065Note that character |" "| in the font need not have anything to do with 12066blank spaces. 12067 12068\hang|param[3]=space_stretch| is the amount of glue stretching between words. 12069 12070\hang|param[4]=space_shrink| is the amount of glue shrinking between words. 12071 12072\hang|param[5]=x_height| is the size of one ex in the font; it is also 12073the height of letters for which accents don't have to be raised or lowered. 12074 12075\hang|param[6]=quad| is the size of one em in the font. 12076 12077\hang|param[7]=extra_space| is the amount added to |param[2]| at the 12078ends of sentences. 12079 12080\yskip\noindent 12081If fewer than seven parameters are present, \TeX\ sets the missing parameters 12082to zero. Fonts used for math symbols are required to have 12083additional parameter information, which is explained later. 12084 12085@d slant_code=1 12086@d space_code=2 12087@d space_stretch_code=3 12088@d space_shrink_code=4 12089@d x_height_code=5 12090@d quad_code=6 12091@d extra_space_code=7 12092 12093@ So that is what \.{TFM} files hold. Since \TeX\ has to absorb such information 12094about lots of fonts, it stores most of the data in a large array called 12095|font_info|. Each item of |font_info| is a |memory_word|; the |fix_word| 12096data gets converted into |scaled| entries, while everything else goes into 12097words of type |four_quarters|. 12098 12099When the user defines \.{\\font\\f}, say, \TeX\ assigns an internal number 12100to the user's font~\.{\\f}. Adding this number to |font_id_base| gives the 12101|eqtb| location of a ``frozen'' control sequence that will always select 12102the font. 12103 12104@<Types...@>= 12105@!internal_font_number=font_base..font_max; {|font| in a |char_node|} 12106@!font_index=0..font_mem_size; {index into |font_info|} 12107 12108@ Here now is the (rather formidable) array of font arrays. 12109 12110@d otgr_font_flag=@"FFFE 12111@d aat_font_flag=@"FFFF 12112@d is_aat_font(#)==(font_area[#]=aat_font_flag) 12113@d is_ot_font(#)==((font_area[#]=otgr_font_flag) and (usingOpenType(font_layout_engine[#]))) 12114@d is_gr_font(#)==((font_area[#]=otgr_font_flag) and (usingGraphite(font_layout_engine[#]))) 12115@d is_otgr_font(#)==(font_area[#]=otgr_font_flag) 12116@d is_native_font(#)==(is_aat_font(#) or is_otgr_font(#)) 12117 {native fonts have |font_area| = 65534 or 65535, 12118 which would be a string containing an invalid Unicode character} 12119@d is_new_mathfont(#)==((font_area[#]=otgr_font_flag) and (isOpenTypeMathFont(font_layout_engine[#]))) 12120 12121@d non_char==qi(too_big_char) {a |halfword| code that can't match a real character} 12122@d non_address=0 {a spurious |bchar_label|} 12123 12124@<Glob...@>= 12125@!font_info:array[font_index] of memory_word; 12126 {the big collection of font data} 12127@!fmem_ptr:font_index; {first unused word of |font_info|} 12128@!font_ptr:internal_font_number; {largest internal font number in use} 12129@!font_check:array[internal_font_number] of four_quarters; {check sum} 12130@!font_size:array[internal_font_number] of scaled; {``at'' size} 12131@!font_dsize:array[internal_font_number] of scaled; {``design'' size} 12132@!font_params:array[internal_font_number] of font_index; {how many font 12133 parameters are present} 12134@!font_name:array[internal_font_number] of str_number; {name of the font} 12135@!font_area:array[internal_font_number] of str_number; {area of the font} 12136@!font_bc:array[internal_font_number] of eight_bits; 12137 {beginning (smallest) character code} 12138@!font_ec:array[internal_font_number] of eight_bits; 12139 {ending (largest) character code} 12140@!font_glue:array[internal_font_number] of pointer; 12141 {glue specification for interword space, |null| if not allocated} 12142@!font_used:array[internal_font_number] of boolean; 12143 {has a character from this font actually appeared in the output?} 12144@!hyphen_char:array[internal_font_number] of integer; 12145 {current \.{\\hyphenchar} values} 12146@!skew_char:array[internal_font_number] of integer; 12147 {current \.{\\skewchar} values} 12148@!bchar_label:array[internal_font_number] of font_index; 12149 {start of |lig_kern| program for left boundary character, 12150 |non_address| if there is none} 12151@!font_bchar:array[internal_font_number] of min_quarterword..non_char; 12152 {right boundary character, |non_char| if there is none} 12153@!font_false_bchar:array[internal_font_number] of min_quarterword..non_char; 12154 {|font_bchar| if it doesn't exist in the font, otherwise |non_char|} 12155 12156@ Besides the arrays just enumerated, we have directory arrays that make it 12157easy to get at the individual entries in |font_info|. For example, the 12158|char_info| data for character |c| in font |f| will be in 12159|font_info[char_base[f]+c].qqqq|; and if |w| is the |width_index| 12160part of this word (the |b0| field), the width of the character is 12161|font_info[width_base[f]+w].sc|. (These formulas assume that 12162|min_quarterword| has already been added to |c| and to |w|, since \TeX\ 12163stores its quarterwords that way.) 12164 12165@<Glob...@>= 12166@!char_base:array[internal_font_number] of integer; 12167 {base addresses for |char_info|} 12168@!width_base:array[internal_font_number] of integer; 12169 {base addresses for widths} 12170@!height_base:array[internal_font_number] of integer; 12171 {base addresses for heights} 12172@!depth_base:array[internal_font_number] of integer; 12173 {base addresses for depths} 12174@!italic_base:array[internal_font_number] of integer; 12175 {base addresses for italic corrections} 12176@!lig_kern_base:array[internal_font_number] of integer; 12177 {base addresses for ligature/kerning programs} 12178@!kern_base:array[internal_font_number] of integer; 12179 {base addresses for kerns} 12180@!exten_base:array[internal_font_number] of integer; 12181 {base addresses for extensible recipes} 12182@!param_base:array[internal_font_number] of integer; 12183 {base addresses for font parameters} 12184 12185@ @<Set init...@>= 12186for k:=font_base to font_max do font_used[k]:=false; 12187 12188@ \TeX\ always knows at least one font, namely the null font. It has no 12189characters, and its seven parameters are all equal to zero. 12190 12191@<Initialize table...@>= 12192font_ptr:=null_font; fmem_ptr:=7; 12193font_name[null_font]:="nullfont"; font_area[null_font]:=""; 12194hyphen_char[null_font]:="-"; skew_char[null_font]:=-1; 12195bchar_label[null_font]:=non_address; 12196font_bchar[null_font]:=non_char; font_false_bchar[null_font]:=non_char; 12197font_bc[null_font]:=1; font_ec[null_font]:=0; 12198font_size[null_font]:=0; font_dsize[null_font]:=0; 12199char_base[null_font]:=0; width_base[null_font]:=0; 12200height_base[null_font]:=0; depth_base[null_font]:=0; 12201italic_base[null_font]:=0; lig_kern_base[null_font]:=0; 12202kern_base[null_font]:=0; exten_base[null_font]:=0; 12203font_glue[null_font]:=null; font_params[null_font]:=7; 12204param_base[null_font]:=-1; 12205for k:=0 to 6 do font_info[k].sc:=0; 12206 12207@ @<Put each...@>= 12208primitive("nullfont",set_font,null_font); 12209@!@:null_font_}{\.{\\nullfont} primitive@> 12210text(frozen_null_font):="nullfont"; eqtb[frozen_null_font]:=eqtb[cur_val]; 12211 12212@ Of course we want to define macros that suppress the detail of how font 12213information is actually packed, so that we don't have to write things like 12214$$\hbox{|font_info[width_base[f]+font_info[char_base[f]+c].qqqq.b0].sc|}$$ 12215too often. The \.{WEB} definitions here make |char_info(f)(c)| the 12216|four_quarters| word of font information corresponding to character 12217|c| of font |f|. If |q| is such a word, |char_width(f)(q)| will be 12218the character's width; hence the long formula above is at least 12219abbreviated to 12220$$\hbox{|char_width(f)(char_info(f)(c))|.}$$ 12221Usually, of course, we will fetch |q| first and look at several of its 12222fields at the same time. 12223 12224The italic correction of a character will be denoted by 12225|char_italic(f)(q)|, so it is analogous to |char_width|. But we will get 12226at the height and depth in a slightly different way, since we usually want 12227to compute both height and depth if we want either one. The value of 12228|height_depth(q)| will be the 8-bit quantity 12229$$b=|height_index|\times16+|depth_index|,$$ and if |b| is such a byte we 12230will write |char_height(f)(b)| and |char_depth(f)(b)| for the height and 12231depth of the character |c| for which |q=char_info(f)(c)|. Got that? 12232 12233The tag field will be called |char_tag(q)|; the remainder byte will be 12234called |rem_byte(q)|, using a macro that we have already defined above. 12235 12236Access to a character's |width|, |height|, |depth|, and |tag| fields is 12237part of \TeX's inner loop, so we want these macros to produce code that is 12238as fast as possible under the circumstances. 12239@^inner loop@> 12240 12241@d char_info_end(#)==#].qqqq 12242@d char_info(#)==font_info[char_base[#]+char_info_end 12243@d char_width_end(#)==#.b0].sc 12244@d char_width(#)==font_info[width_base[#]+char_width_end 12245@d char_exists(#)==(#.b0>min_quarterword) 12246@d char_italic_end(#)==(qo(#.b2)) div 4].sc 12247@d char_italic(#)==font_info[italic_base[#]+char_italic_end 12248@d height_depth(#)==qo(#.b1) 12249@d char_height_end(#)==(#) div 16].sc 12250@d char_height(#)==font_info[height_base[#]+char_height_end 12251@d char_depth_end(#)==(#) mod 16].sc 12252@d char_depth(#)==font_info[depth_base[#]+char_depth_end 12253@d char_tag(#)==((qo(#.b2)) mod 4) 12254 12255@ The global variable |null_character| is set up to be a word of 12256|char_info| for a character that doesn't exist. Such a word provides a 12257convenient way to deal with erroneous situations. 12258 12259@<Glob...@>= 12260@!null_character:four_quarters; {nonexistent character information} 12261 12262@ @<Set init...@>= 12263null_character.b0:=min_quarterword; null_character.b1:=min_quarterword; 12264null_character.b2:=min_quarterword; null_character.b3:=min_quarterword; 12265 12266@ Here are some macros that help process ligatures and kerns. 12267We write |char_kern(f)(j)| to find the amount of kerning specified by 12268kerning command~|j| in font~|f|. If |j| is the |char_info| for a character 12269with a ligature/kern program, the first instruction of that program is either 12270|i=font_info[lig_kern_start(f)(j)]| or |font_info[lig_kern_restart(f)(i)]|, 12271depending on whether or not |skip_byte(i)<=stop_flag|. 12272 12273The constant |kern_base_offset| should be simplified, for \PASCAL\ compilers 12274that do not do local optimization. 12275@^system dependencies@> 12276 12277@d char_kern_end(#)==256*op_byte(#)+rem_byte(#)].sc 12278@d char_kern(#)==font_info[kern_base[#]+char_kern_end 12279@d kern_base_offset==256*(128+min_quarterword) 12280@d lig_kern_start(#)==lig_kern_base[#]+rem_byte {beginning of lig/kern program} 12281@d lig_kern_restart_end(#)==256*op_byte(#)+rem_byte(#)+32768-kern_base_offset 12282@d lig_kern_restart(#)==lig_kern_base[#]+lig_kern_restart_end 12283 12284@ Font parameters are referred to as |slant(f)|, |space(f)|, etc. 12285 12286@d param_end(#)==param_base[#]].sc 12287@d param(#)==font_info[#+param_end 12288@d slant==param(slant_code) {slant to the right, per unit distance upward} 12289@d space==param(space_code) {normal space between words} 12290@d space_stretch==param(space_stretch_code) {stretch between words} 12291@d space_shrink==param(space_shrink_code) {shrink between words} 12292@d x_height==param(x_height_code) {one ex} 12293@d quad==param(quad_code) {one em} 12294@d extra_space==param(extra_space_code) {additional space at end of sentence} 12295 12296@<The em width for |cur_font|@>=quad(cur_font) 12297 12298@ @<The x-height for |cur_font|@>=x_height(cur_font) 12299 12300@ \TeX\ checks the information of a \.{TFM} file for validity as the 12301file is being read in, so that no further checks will be needed when 12302typesetting is going on. The somewhat tedious subroutine that does this 12303is called |read_font_info|. It has four parameters: the user font 12304identifier~|u|, the file name and area strings |nom| and |aire|, and the 12305``at'' size~|s|. If |s|~is negative, it's the negative of a scale factor 12306to be applied to the design size; |s=-1000| is the normal case. 12307Otherwise |s| will be substituted for the design size; in this 12308case, |s| must be positive and less than $2048\rm\,pt$ 12309(i.e., it must be less than $2^{27}$ when considered as an integer). 12310 12311The subroutine opens and closes a global file variable called |tfm_file|. 12312It returns the value of the internal font number that was just loaded. 12313If an error is detected, an error message is issued and no font 12314information is stored; |null_font| is returned in this case. 12315 12316@d bad_tfm=11 {label for |read_font_info|} 12317@d abort==goto bad_tfm {do this when the \.{TFM} data is wrong} 12318 12319@p function read_font_info(@!u:pointer;@!nom,@!aire:str_number; 12320 @!s:scaled):internal_font_number; {input a \.{TFM} file} 12321label done,bad_tfm,not_found; 12322var k:font_index; {index into |font_info|} 12323@!file_opened:boolean; {was |tfm_file| successfully opened?} 12324@!lf,@!lh,@!bc,@!ec,@!nw,@!nh,@!nd,@!ni,@!nl,@!nk,@!ne,@!np:halfword; 12325 {sizes of subfiles} 12326@!f:internal_font_number; {the new font's number} 12327@!g:internal_font_number; {the number to return} 12328@!a,@!b,@!c,@!d:eight_bits; {byte variables} 12329@!qw:four_quarters;@!sw:scaled; {accumulators} 12330@!bch_label:integer; {left boundary start location, or infinity} 12331@!bchar:0..256; {right boundary character, or 256} 12332@!z:scaled; {the design size or the ``at'' size} 12333@!alpha:integer;@!beta:1..16; 12334 {auxiliary quantities used in fixed-point multiplication} 12335begin g:=null_font;@/ 12336file_opened:=false; 12337pack_file_name(nom,aire,cur_ext); 12338if XeTeX_tracing_fonts_state>0 then begin 12339 begin_diagnostic; 12340 print_nl("Requested font """); 12341 print_c_string(stringcast(name_of_file+1)); 12342 print('"'); 12343 if s < 0 then begin 12344 print(" scaled "); 12345 print_int(-s); 12346 end else begin 12347 print(" at "); 12348 print_scaled(s); 12349 print("pt"); 12350 end; 12351 end_diagnostic(false); 12352end; 12353if quoted_filename then begin 12354 { quoted name, so try for a native font } 12355 g:=load_native_font(u,nom,aire,s); 12356 if g<>null_font then goto done; 12357end; 12358{ it was an unquoted name, or not found as an installed font, so try for a TFM file } 12359@<Read and check the font data if file exists; 12360 |abort| if the \.{TFM} file is 12361 malformed; if there's no room for this font, say so and |goto 12362 done|; otherwise |incr(font_ptr)| and |goto done|@>; 12363if g<>null_font then goto done; 12364if not quoted_filename then begin 12365 { we failed to find a TFM file, so try for a native font } 12366 g:=load_native_font(u,nom,aire,s); 12367 if g<>null_font then goto done 12368end; 12369bad_tfm: 12370if suppress_fontnotfound_error=0 then begin 12371 @<Report that the font won't be loaded@>; 12372 end; 12373done: if file_opened then b_close(tfm_file); 12374if XeTeX_tracing_fonts_state>0 then begin 12375 if g=null_font then begin 12376 begin_diagnostic; 12377 print_nl(" -> font not found, using ""nullfont"""); 12378 end_diagnostic(false); 12379 end else if file_opened then begin 12380 begin_diagnostic; 12381 print_nl(" -> "); 12382 print_c_string(stringcast(name_of_file+1)); 12383 end_diagnostic(false); 12384 end; 12385end; 12386read_font_info:=g; 12387end; 12388 12389@ There are programs called \.{TFtoPL} and \.{PLtoTF} that convert 12390between the \.{TFM} format and a symbolic property-list format 12391that can be easily edited. These programs contain extensive 12392diagnostic information, so \TeX\ does not have to bother giving 12393precise details about why it rejects a particular \.{TFM} file. 12394@.TFtoPL@> @.PLtoTF@> 12395 12396@d start_font_error_message==print_err("Font "); sprint_cs(u); 12397 print_char("="); 12398 if file_name_quote_char<>0 then print_char(file_name_quote_char); 12399 print_file_name(nom,aire,cur_ext); 12400 if file_name_quote_char<>0 then print_char(file_name_quote_char); 12401 if s>=0 then 12402 begin print(" at "); print_scaled(s); print("pt"); 12403 end 12404 else if s<>-1000 then 12405 begin print(" scaled "); print_int(-s); 12406 end 12407 12408@<Report that the font won't be loaded@>= 12409start_font_error_message; 12410@.Font x=xx not loadable...@> 12411if file_opened then print(" not loadable: Bad metric (TFM) file") 12412else print(" not loadable: Metric (TFM) file not found"); 12413help5("I wasn't able to read the size data for this font,")@/ 12414("so I will ignore the font specification.")@/ 12415("[Wizards can fix TFM files using TFtoPL/PLtoTF.]")@/ 12416("You might try inserting a different font spec;")@/ 12417("e.g., type `I\font<same font id>=<substitute font name>'."); 12418error 12419 12420@ @<Read and check...@>= 12421@<Open |tfm_file| for input and |begin|@>; 12422@<Read the {\.{TFM}} size fields@>; 12423@<Use size fields to allocate font information@>; 12424@<Read the {\.{TFM}} header@>; 12425@<Read character data@>; 12426@<Read box dimensions@>; 12427@<Read ligature/kern program@>; 12428@<Read extensible character recipes@>; 12429@<Read font parameters@>; 12430@<Make final adjustments and |goto done|@>; 12431end 12432 12433@ @<Open |tfm_file| for input...@>= 12434if aire="" then pack_file_name(nom,TEX_font_area,".tfm") 12435else pack_file_name(nom,aire,".tfm"); 12436check_for_tfm_font_mapping; 12437if b_open_in(tfm_file) then begin 12438 file_opened:=true 12439 12440@ Note: A malformed \.{TFM} file might be shorter than it claims to be; 12441thus |eof(tfm_file)| might be true when |read_font_info| refers to 12442|tfm_file^| or when it says |get(tfm_file)|. If such circumstances 12443cause system error messages, you will have to defeat them somehow, 12444for example by defining |fget| to be `\ignorespaces|begin get(tfm_file);| 12445|if eof(tfm_file) then abort; end|\unskip'. 12446@^system dependencies@> 12447 12448@d fget==get(tfm_file) 12449@d fbyte==tfm_file^ 12450@d read_sixteen(#)==begin #:=fbyte; 12451 if #>127 then abort; 12452 fget; #:=#*@'400+fbyte; 12453 end 12454@d store_four_quarters(#)==begin fget; a:=fbyte; qw.b0:=qi(a); 12455 fget; b:=fbyte; qw.b1:=qi(b); 12456 fget; c:=fbyte; qw.b2:=qi(c); 12457 fget; d:=fbyte; qw.b3:=qi(d); 12458 #:=qw; 12459 end 12460 12461@ @<Read the {\.{TFM}} size fields@>= 12462begin read_sixteen(lf); 12463fget; read_sixteen(lh); 12464fget; read_sixteen(bc); 12465fget; read_sixteen(ec); 12466if (bc>ec+1)or(ec>255) then abort; 12467if bc>255 then {|bc=256| and |ec=255|} 12468 begin bc:=1; ec:=0; 12469 end; 12470fget; read_sixteen(nw); 12471fget; read_sixteen(nh); 12472fget; read_sixteen(nd); 12473fget; read_sixteen(ni); 12474fget; read_sixteen(nl); 12475fget; read_sixteen(nk); 12476fget; read_sixteen(ne); 12477fget; read_sixteen(np); 12478if lf<>6+lh+(ec-bc+1)+nw+nh+nd+ni+nl+nk+ne+np then abort; 12479if (nw=0)or(nh=0)or(nd=0)or(ni=0) then abort; 12480end 12481 12482@ The preliminary settings of the index-offset variables |char_base|, 12483|width_base|, |lig_kern_base|, |kern_base|, and |exten_base| will be 12484corrected later by subtracting |min_quarterword| from them; and we will 12485subtract 1 from |param_base| too. It's best to forget about such anomalies 12486until later. 12487 12488@<Use size fields to allocate font information@>= 12489lf:=lf-6-lh; {|lf| words should be loaded into |font_info|} 12490if np<7 then lf:=lf+7-np; {at least seven parameters will appear} 12491if (font_ptr=font_max)or(fmem_ptr+lf>font_mem_size) then 12492 @<Apologize for not loading the font, |goto done|@>; 12493f:=font_ptr+1; 12494char_base[f]:=fmem_ptr-bc; 12495width_base[f]:=char_base[f]+ec+1; 12496height_base[f]:=width_base[f]+nw; 12497depth_base[f]:=height_base[f]+nh; 12498italic_base[f]:=depth_base[f]+nd; 12499lig_kern_base[f]:=italic_base[f]+ni; 12500kern_base[f]:=lig_kern_base[f]+nl-kern_base_offset; 12501exten_base[f]:=kern_base[f]+kern_base_offset+nk; 12502param_base[f]:=exten_base[f]+ne 12503 12504@ @<Apologize for not loading...@>= 12505begin start_font_error_message; 12506print(" not loaded: Not enough room left"); 12507@.Font x=xx not loaded...@> 12508help4("I'm afraid I won't be able to make use of this font,")@/ 12509("because my memory for character-size data is too small.")@/ 12510("If you're really stuck, ask a wizard to enlarge me.")@/ 12511("Or maybe try `I\font<same font id>=<name of loaded font>'."); 12512error; goto done; 12513end 12514 12515@ Only the first two words of the header are needed by \TeX82. 12516 12517@<Read the {\.{TFM}} header@>= 12518begin if lh<2 then abort; 12519store_four_quarters(font_check[f]); 12520fget; read_sixteen(z); {this rejects a negative design size} 12521fget; z:=z*@'400+fbyte; fget; z:=(z*@'20)+(fbyte div@'20); 12522if z<unity then abort; 12523while lh>2 do 12524 begin fget;fget;fget;fget;decr(lh); {ignore the rest of the header} 12525 end; 12526font_dsize[f]:=z; 12527if s<>-1000 then 12528 if s>=0 then z:=s 12529 else z:=xn_over_d(z,-s,1000); 12530font_size[f]:=z; 12531end 12532 12533@ @<Read character data@>= 12534for k:=fmem_ptr to width_base[f]-1 do 12535 begin store_four_quarters(font_info[k].qqqq); 12536 if (a>=nw)or(b div @'20>=nh)or(b mod @'20>=nd)or 12537 (c div 4>=ni) then abort; 12538 case c mod 4 of 12539 lig_tag: if d>=nl then abort; 12540 ext_tag: if d>=ne then abort; 12541 list_tag: @<Check for charlist cycle@>; 12542 othercases do_nothing {|no_tag|} 12543 endcases; 12544 end 12545 12546@ We want to make sure that there is no cycle of characters linked together 12547by |list_tag| entries, since such a cycle would get \TeX\ into an endless 12548loop. If such a cycle exists, the routine here detects it when processing 12549the largest character code in the cycle. 12550 12551@d check_byte_range(#)==begin if (#<bc)or(#>ec) then abort@+end 12552@d current_character_being_worked_on==k+bc-fmem_ptr 12553 12554@<Check for charlist cycle@>= 12555begin check_byte_range(d); 12556while d<current_character_being_worked_on do 12557 begin qw:=char_info(f)(d); 12558 {N.B.: not |qi(d)|, since |char_base[f]| hasn't been adjusted yet} 12559 if char_tag(qw)<>list_tag then goto not_found; 12560 d:=qo(rem_byte(qw)); {next character on the list} 12561 end; 12562if d=current_character_being_worked_on then abort; {yes, there's a cycle} 12563not_found:end 12564 12565@ A |fix_word| whose four bytes are $(a,b,c,d)$ from left to right represents 12566the number 12567$$x=\left\{\vcenter{\halign{$#$,\hfil\qquad&if $#$\hfil\cr 12568b\cdot2^{-4}+c\cdot2^{-12}+d\cdot2^{-20}&a=0;\cr 12569-16+b\cdot2^{-4}+c\cdot2^{-12}+d\cdot2^{-20}&a=255.\cr}}\right.$$ 12570(No other choices of |a| are allowed, since the magnitude of a number in 12571design-size units must be less than 16.) We want to multiply this 12572quantity by the integer~|z|, which is known to be less than $2^{27}$. 12573If $|z|<2^{23}$, the individual multiplications $b\cdot z$, 12574$c\cdot z$, $d\cdot z$ cannot overflow; otherwise we will divide |z| by 2, 125754, 8, or 16, to obtain a multiplier less than $2^{23}$, and we can 12576compensate for this later. If |z| has thereby been replaced by 12577$|z|^\prime=|z|/2^e$, let $\beta=2^{4-e}$; we shall compute 12578$$\lfloor(b+c\cdot2^{-8}+d\cdot2^{-16})\,z^\prime/\beta\rfloor$$ 12579if $a=0$, or the same quantity minus $\alpha=2^{4+e}z^\prime$ if $a=255$. 12580This calculation must be done exactly, in order to guarantee portability 12581of \TeX\ between computers. 12582 12583@d store_scaled(#)==begin fget; a:=fbyte; fget; b:=fbyte; 12584 fget; c:=fbyte; fget; d:=fbyte;@/ 12585 sw:=(((((d*z)div@'400)+(c*z))div@'400)+(b*z))div beta; 12586 if a=0 then #:=sw@+else if a=255 then #:=sw-alpha@+else abort; 12587 end 12588 12589@<Read box dimensions@>= 12590begin @<Replace |z| by $|z|^\prime$ and compute $\alpha,\beta$@>; 12591for k:=width_base[f] to lig_kern_base[f]-1 do 12592 store_scaled(font_info[k].sc); 12593if font_info[width_base[f]].sc<>0 then abort; {\\{width}[0] must be zero} 12594if font_info[height_base[f]].sc<>0 then abort; {\\{height}[0] must be zero} 12595if font_info[depth_base[f]].sc<>0 then abort; {\\{depth}[0] must be zero} 12596if font_info[italic_base[f]].sc<>0 then abort; {\\{italic}[0] must be zero} 12597end 12598 12599@ @<Replace |z|...@>= 12600begin alpha:=16; 12601while z>=@'40000000 do 12602 begin z:=z div 2; alpha:=alpha+alpha; 12603 end; 12604beta:=256 div alpha; alpha:=alpha*z; 12605end 12606 12607@ @d check_existence(#)==@t@>@;@/ 12608 begin check_byte_range(#); 12609 qw:=char_info(f)(#); {N.B.: not |qi(#)|} 12610 if not char_exists(qw) then abort; 12611 end 12612 12613@<Read ligature/kern program@>= 12614bch_label:=@'77777; bchar:=256; 12615if nl>0 then 12616 begin for k:=lig_kern_base[f] to kern_base[f]+kern_base_offset-1 do 12617 begin store_four_quarters(font_info[k].qqqq); 12618 if a>128 then 12619 begin if 256*c+d>=nl then abort; 12620 if a=255 then if k=lig_kern_base[f] then bchar:=b; 12621 end 12622 else begin if b<>bchar then check_existence(b); 12623 if c<128 then check_existence(d) {check ligature} 12624 else if 256*(c-128)+d>=nk then abort; {check kern} 12625 if a<128 then if k-lig_kern_base[f]+a+1>=nl then abort; 12626 end; 12627 end; 12628 if a=255 then bch_label:=256*c+d; 12629 end; 12630for k:=kern_base[f]+kern_base_offset to exten_base[f]-1 do 12631 store_scaled(font_info[k].sc); 12632 12633@ @<Read extensible character recipes@>= 12634for k:=exten_base[f] to param_base[f]-1 do 12635 begin store_four_quarters(font_info[k].qqqq); 12636 if a<>0 then check_existence(a); 12637 if b<>0 then check_existence(b); 12638 if c<>0 then check_existence(c); 12639 check_existence(d); 12640 end 12641 12642@ We check to see that the \.{TFM} file doesn't end prematurely; but 12643no error message is given for files having more than |lf| words. 12644 12645@<Read font parameters@>= 12646begin for k:=1 to np do 12647 if k=1 then {the |slant| parameter is a pure number} 12648 begin fget; sw:=fbyte; if sw>127 then sw:=sw-256; 12649 fget; sw:=sw*@'400+fbyte; fget; sw:=sw*@'400+fbyte; 12650 fget; font_info[param_base[f]].sc:= 12651 (sw*@'20)+(fbyte div@'20); 12652 end 12653 else store_scaled(font_info[param_base[f]+k-1].sc); 12654if eof(tfm_file) then abort; 12655for k:=np+1 to 7 do font_info[param_base[f]+k-1].sc:=0; 12656end 12657 12658@ Now to wrap it up, we have checked all the necessary things about the \.{TFM} 12659file, and all we need to do is put the finishing touches on the data for 12660the new font. 12661 12662@d adjust(#)==#[f]:=qo(#[f]) 12663 {correct for the excess |min_quarterword| that was added} 12664 12665@<Make final adjustments...@>= 12666if np>=7 then font_params[f]:=np@+else font_params[f]:=7; 12667hyphen_char[f]:=default_hyphen_char; skew_char[f]:=default_skew_char; 12668if bch_label<nl then bchar_label[f]:=bch_label+lig_kern_base[f] 12669else bchar_label[f]:=non_address; 12670font_bchar[f]:=qi(bchar); 12671font_false_bchar[f]:=qi(bchar); 12672if bchar<=ec then if bchar>=bc then 12673 begin qw:=char_info(f)(bchar); {N.B.: not |qi(bchar)|} 12674 if char_exists(qw) then font_false_bchar[f]:=non_char; 12675 end; 12676font_name[f]:=nom; 12677font_area[f]:=aire; 12678font_bc[f]:=bc; font_ec[f]:=ec; font_glue[f]:=null; 12679adjust(char_base); adjust(width_base); adjust(lig_kern_base); 12680adjust(kern_base); adjust(exten_base); 12681decr(param_base[f]); 12682fmem_ptr:=fmem_ptr+lf; font_ptr:=f; g:=f; 12683font_mapping[f]:=load_tfm_font_mapping; 12684goto done 12685 12686@ Before we forget about the format of these tables, let's deal with two 12687of \TeX's basic scanning routines related to font information. 12688 12689@<Declare procedures that scan font-related stuff@>= 12690procedure scan_font_ident; 12691var f:internal_font_number; 12692@!m:halfword; 12693begin @<Get the next non-blank non-call...@>; 12694if cur_cmd=def_font then f:=cur_font 12695else if cur_cmd=set_font then f:=cur_chr 12696else if cur_cmd=def_family then 12697 begin m:=cur_chr; scan_math_fam_int; f:=equiv(m+cur_val); 12698 end 12699else begin print_err("Missing font identifier"); 12700@.Missing font identifier@> 12701 help2("I was looking for a control sequence whose")@/ 12702 ("current meaning has been defined by \font."); 12703 back_error; f:=null_font; 12704 end; 12705cur_val:=f; 12706end; 12707 12708@ The following routine is used to implement `\.{\\fontdimen} |n| |f|'. 12709The boolean parameter |writing| is set |true| if the calling program 12710intends to change the parameter value. 12711 12712@<Declare procedures that scan font-related stuff@>= 12713procedure find_font_dimen(@!writing:boolean); 12714 {sets |cur_val| to |font_info| location} 12715var f:internal_font_number; 12716@!n:integer; {the parameter number} 12717begin scan_int; n:=cur_val; scan_font_ident; f:=cur_val; 12718if n<=0 then cur_val:=fmem_ptr 12719else begin if writing and(n<=space_shrink_code)and@| 12720 (n>=space_code)and(font_glue[f]<>null) then 12721 begin delete_glue_ref(font_glue[f]); 12722 font_glue[f]:=null; 12723 end; 12724 if n>font_params[f] then 12725 if f<font_ptr then cur_val:=fmem_ptr 12726 else @<Increase the number of parameters in the last font@> 12727 else cur_val:=n+param_base[f]; 12728 end; 12729@<Issue an error message if |cur_val=fmem_ptr|@>; 12730end; 12731 12732@ @<Issue an error message if |cur_val=fmem_ptr|@>= 12733if cur_val=fmem_ptr then 12734 begin print_err("Font "); print_esc(font_id_text(f)); 12735 print(" has only "); print_int(font_params[f]); 12736 print(" fontdimen parameters"); 12737@.Font x has only...@> 12738 help2("To increase the number of font parameters, you must")@/ 12739 ("use \fontdimen immediately after the \font is loaded."); 12740 error; 12741 end 12742 12743@ @<Increase the number of parameters...@>= 12744begin repeat if fmem_ptr=font_mem_size then 12745 overflow("font memory",font_mem_size); 12746@:TeX capacity exceeded font memory}{\quad font memory@> 12747font_info[fmem_ptr].sc:=0; incr(fmem_ptr); incr(font_params[f]); 12748until n=font_params[f]; 12749cur_val:=fmem_ptr-1; {this equals |param_base[f]+font_params[f]|} 12750end 12751 12752@ When \TeX\ wants to typeset a character that doesn't exist, the 12753character node is not created; thus the output routine can assume 12754that characters exist when it sees them. The following procedure 12755prints a warning message unless the user has suppressed it. 12756 12757@<Declare subroutines for |new_character|@>= 12758procedure char_warning(@!f:internal_font_number;@!c:integer); 12759var old_setting: integer; {saved value of |tracing_online|} 12760begin if tracing_lost_chars>0 then 12761 begin old_setting:=tracing_online; 12762 if eTeX_ex and(tracing_lost_chars>1) then tracing_online:=1; 12763 begin begin_diagnostic; 12764 print_nl("Missing character: There is no "); 12765@.Missing character@> 12766 if c < @"10000 then print_ASCII(c) 12767 else print_char(c); {non-Plane 0 Unicodes can't be sent through |print_ASCII|} 12768 print(" in font "); 12769 slow_print(font_name[f]); print_char("!"); end_diagnostic(false); 12770 end; 12771 tracing_online:=old_setting; 12772 end; 12773end; 12774 12775@ We need a few subroutines for |new_character|. 12776 12777@p @t\4@>@<Declare subroutines for |new_character|@>@; 12778 12779@ Here is a function that returns a pointer to a character node for a 12780given character in a given font. If that character doesn't exist, 12781|null| is returned instead. 12782 12783@p function new_character(@!f:internal_font_number;@!c:eight_bits):pointer; 12784label exit; 12785var p:pointer; {newly allocated node} 12786begin if font_bc[f]<=c then if font_ec[f]>=c then 12787 if char_exists(char_info(f)(qi(c))) then 12788 begin p:=get_avail; font(p):=f; character(p):=qi(c); 12789 new_character:=p; return; 12790 end; 12791char_warning(f,c); 12792new_character:=null; 12793exit:end; 12794 12795@* \[31] Device-independent file format. 12796The most important output produced by a run of \TeX\ is the ``device 12797independent'' (\.{DVI}) file that specifies where characters and rules 12798are to appear on printed pages. The form of these files was designed by 12799David R. Fuchs in 1979. Almost any reasonable typesetting device can be 12800@^Fuchs, David Raymond@> 12801@:DVI_files}{\.{DVI} files@> 12802driven by a program that takes \.{DVI} files as input, and dozens of such 12803\.{DVI}-to-whatever programs have been written. Thus, it is possible to 12804print the output of \TeX\ on many different kinds of equipment, using \TeX\ 12805as a device-independent ``front end.'' 12806 12807A \.{DVI} file is a stream of 8-bit bytes, which may be regarded as a 12808series of commands in a machine-like language. The first byte of each command 12809is the operation code, and this code is followed by zero or more bytes 12810that provide parameters to the command. The parameters themselves may consist 12811of several consecutive bytes; for example, the `|set_rule|' command has two 12812parameters, each of which is four bytes long. Parameters are usually 12813regarded as nonnegative integers; but four-byte-long parameters, 12814and shorter parameters that denote distances, can be 12815either positive or negative. Such parameters are given in two's complement 12816notation. For example, a two-byte-long distance parameter has a value between 12817$-2^{15}$ and $2^{15}-1$. As in \.{TFM} files, numbers that occupy 12818more than one byte position appear in BigEndian order. 12819 12820\XeTeX\ extends the format of \.{DVI} with its own commands, and thus produced 12821``extended device independent'' (\.{XDV}) files. 12822 12823A \.{DVI} file consists of a ``preamble,'' followed by a sequence of one 12824or more ``pages,'' followed by a ``postamble.'' The preamble is simply a 12825|pre| command, with its parameters that define the dimensions used in the 12826file; this must come first. Each ``page'' consists of a |bop| command, 12827followed by any number of other commands that tell where characters are to 12828be placed on a physical page, followed by an |eop| command. The pages 12829appear in the order that \TeX\ generated them. If we ignore |nop| commands 12830and \\{fnt\_def} commands (which are allowed between any two commands in 12831the file), each |eop| command is immediately followed by a |bop| command, 12832or by a |post| command; in the latter case, there are no more pages in the 12833file, and the remaining bytes form the postamble. Further details about 12834the postamble will be explained later. 12835 12836Some parameters in \.{DVI} commands are ``pointers.'' These are four-byte 12837quantities that give the location number of some other byte in the file; 12838the first byte is number~0, then comes number~1, and so on. For example, 12839one of the parameters of a |bop| command points to the previous |bop|; 12840this makes it feasible to read the pages in backwards order, in case the 12841results are being directed to a device that stacks its output face up. 12842Suppose the preamble of a \.{DVI} file occupies bytes 0 to 99. Now if the 12843first page occupies bytes 100 to 999, say, and if the second 12844page occupies bytes 1000 to 1999, then the |bop| that starts in byte 1000 12845points to 100 and the |bop| that starts in byte 2000 points to 1000. (The 12846very first |bop|, i.e., the one starting in byte 100, has a pointer of~$-1$.) 12847 12848@ The \.{DVI} format is intended to be both compact and easily interpreted 12849by a machine. Compactness is achieved by making most of the information 12850implicit instead of explicit. When a \.{DVI}-reading program reads the 12851commands for a page, it keeps track of several quantities: (a)~The current 12852font |f| is an integer; this value is changed only 12853by \\{fnt} and \\{fnt\_num} commands. (b)~The current position on the page 12854is given by two numbers called the horizontal and vertical coordinates, 12855|h| and |v|. Both coordinates are zero at the upper left corner of the page; 12856moving to the right corresponds to increasing the horizontal coordinate, and 12857moving down corresponds to increasing the vertical coordinate. Thus, the 12858coordinates are essentially Cartesian, except that vertical directions are 12859flipped; the Cartesian version of |(h,v)| would be |(h,-v)|. (c)~The 12860current spacing amounts are given by four numbers |w|, |x|, |y|, and |z|, 12861where |w| and~|x| are used for horizontal spacing and where |y| and~|z| 12862are used for vertical spacing. (d)~There is a stack containing 12863|(h,v,w,x,y,z)| values; the \.{DVI} commands |push| and |pop| are used to 12864change the current level of operation. Note that the current font~|f| is 12865not pushed and popped; the stack contains only information about 12866positioning. 12867 12868The values of |h|, |v|, |w|, |x|, |y|, and |z| are signed integers having up 12869to 32 bits, including the sign. Since they represent physical distances, 12870there is a small unit of measurement such that increasing |h| by~1 means 12871moving a certain tiny distance to the right. The actual unit of 12872measurement is variable, as explained below; \TeX\ sets things up so that 12873its \.{DVI} output is in sp units, i.e., scaled points, in agreement with 12874all the |scaled| dimensions in \TeX's data structures. 12875 12876@ Here is a list of all the commands that may appear in a \.{XDV} file. Each 12877command is specified by its symbolic name (e.g., |bop|), its opcode byte 12878(e.g., 139), and its parameters (if any). The parameters are followed 12879by a bracketed number telling how many bytes they occupy; for example, 12880`|p[4]|' means that parameter |p| is four bytes long. 12881 12882\yskip\hang|set_char_0| 0. Typeset character number~0 from font~|f| 12883such that the reference point of the character is at |(h,v)|. Then 12884increase |h| by the width of that character. Note that a character may 12885have zero or negative width, so one cannot be sure that |h| will advance 12886after this command; but |h| usually does increase. 12887 12888\yskip\hang\\{set\_char\_1} through \\{set\_char\_127} (opcodes 1 to 127). 12889Do the operations of |set_char_0|; but use the character whose number 12890matches the opcode, instead of character~0. 12891 12892\yskip\hang|set1| 128 |c[1]|. Same as |set_char_0|, except that character 12893number~|c| is typeset. \TeX82 uses this command for characters in the 12894range |128<=c<256|. 12895 12896\yskip\hang|@!set2| 129 |c[2]|. Same as |set1|, except that |c|~is two 12897bytes long, so it is in the range |0<=c<65536|. \TeX82 never uses this 12898command, but it should come in handy for extensions of \TeX\ that deal 12899with oriental languages. 12900@^oriental characters@>@^Chinese characters@>@^Japanese characters@> 12901 12902\yskip\hang|@!set3| 130 |c[3]|. Same as |set1|, except that |c|~is three 12903bytes long, so it can be as large as $2^{24}-1$. Not even the Chinese 12904language has this many characters, but this command might prove useful 12905in some yet unforeseen extension. 12906 12907\yskip\hang|@!set4| 131 |c[4]|. Same as |set1|, except that |c|~is four 12908bytes long. Imagine that. 12909 12910\yskip\hang|set_rule| 132 |a[4]| |b[4]|. Typeset a solid black rectangle 12911of height~|a| and width~|b|, with its bottom left corner at |(h,v)|. Then 12912set |h:=h+b|. If either |a<=0| or |b<=0|, nothing should be typeset. Note 12913that if |b<0|, the value of |h| will decrease even though nothing else happens. 12914See below for details about how to typeset rules so that consistency with 12915\MF\ is guaranteed. 12916 12917\yskip\hang|@!put1| 133 |c[1]|. Typeset character number~|c| from font~|f| 12918such that the reference point of the character is at |(h,v)|. (The `put' 12919commands are exactly like the `set' commands, except that they simply put out a 12920character or a rule without moving the reference point afterwards.) 12921 12922\yskip\hang|@!put2| 134 |c[2]|. Same as |set2|, except that |h| is not changed. 12923 12924\yskip\hang|@!put3| 135 |c[3]|. Same as |set3|, except that |h| is not changed. 12925 12926\yskip\hang|@!put4| 136 |c[4]|. Same as |set4|, except that |h| is not changed. 12927 12928\yskip\hang|put_rule| 137 |a[4]| |b[4]|. Same as |set_rule|, except that 12929|h| is not changed. 12930 12931\yskip\hang|nop| 138. No operation, do nothing. Any number of |nop|'s 12932may occur between \.{DVI} commands, but a |nop| cannot be inserted between 12933a command and its parameters or between two parameters. 12934 12935\yskip\hang|bop| 139 $c_0[4]$ $c_1[4]$ $\ldots$ $c_9[4]$ $p[4]$. Beginning 12936of a page: Set |(h,v,w,x,y,z):=(0,0,0,0,0,0)| and set the stack empty. Set 12937the current font |f| to an undefined value. The ten $c_i$ parameters hold 12938the values of \.{\\count0} $\ldots$ \.{\\count9} in \TeX\ at the time 12939\.{\\shipout} was invoked for this page; they can be used to identify 12940pages, if a user wants to print only part of a \.{DVI} file. The parameter 12941|p| points to the previous |bop| in the file; the first 12942|bop| has $p=-1$. 12943 12944\yskip\hang|eop| 140. End of page: Print what you have read since the 12945previous |bop|. At this point the stack should be empty. (The \.{DVI}-reading 12946programs that drive most output devices will have kept a buffer of the 12947material that appears on the page that has just ended. This material is 12948largely, but not entirely, in order by |v| coordinate and (for fixed |v|) by 12949|h|~coordinate; so it usually needs to be sorted into some order that is 12950appropriate for the device in question.) 12951 12952\yskip\hang|push| 141. Push the current values of |(h,v,w,x,y,z)| onto the 12953top of the stack; do not change any of these values. Note that |f| is 12954not pushed. 12955 12956\yskip\hang|pop| 142. Pop the top six values off of the stack and assign 12957them respectively to |(h,v,w,x,y,z)|. The number of pops should never 12958exceed the number of pushes, since it would be highly embarrassing if the 12959stack were empty at the time of a |pop| command. 12960 12961\yskip\hang|right1| 143 |b[1]|. Set |h:=h+b|, i.e., move right |b| units. 12962The parameter is a signed number in two's complement notation, |-128<=b<128|; 12963if |b<0|, the reference point moves left. 12964 12965\yskip\hang|right2| 144 |b[2]|. Same as |right1|, except that |b| is a 12966two-byte quantity in the range |-32768<=b<32768|. 12967 12968\yskip\hang|right3| 145 |b[3]|. Same as |right1|, except that |b| is a 12969three-byte quantity in the range |@t$-2^{23}$@><=b<@t$2^{23}$@>|. 12970 12971\yskip\hang|right4| 146 |b[4]|. Same as |right1|, except that |b| is a 12972four-byte quantity in the range |@t$-2^{31}$@><=b<@t$2^{31}$@>|. 12973 12974\yskip\hang|w0| 147. Set |h:=h+w|; i.e., move right |w| units. With luck, 12975this parameterless command will usually suffice, because the same kind of motion 12976will occur several times in succession; the following commands explain how 12977|w| gets particular values. 12978 12979\yskip\hang|w1| 148 |b[1]|. Set |w:=b| and |h:=h+b|. The value of |b| is a 12980signed quantity in two's complement notation, |-128<=b<128|. This command 12981changes the current |w|~spacing and moves right by |b|. 12982 12983\yskip\hang|@!w2| 149 |b[2]|. Same as |w1|, but |b| is two bytes long, 12984|-32768<=b<32768|. 12985 12986\yskip\hang|@!w3| 150 |b[3]|. Same as |w1|, but |b| is three bytes long, 12987|@t$-2^{23}$@><=b<@t$2^{23}$@>|. 12988 12989\yskip\hang|@!w4| 151 |b[4]|. Same as |w1|, but |b| is four bytes long, 12990|@t$-2^{31}$@><=b<@t$2^{31}$@>|. 12991 12992\yskip\hang|x0| 152. Set |h:=h+x|; i.e., move right |x| units. The `|x|' 12993commands are like the `|w|' commands except that they involve |x| instead 12994of |w|. 12995 12996\yskip\hang|x1| 153 |b[1]|. Set |x:=b| and |h:=h+b|. The value of |b| is a 12997signed quantity in two's complement notation, |-128<=b<128|. This command 12998changes the current |x|~spacing and moves right by |b|. 12999 13000\yskip\hang|@!x2| 154 |b[2]|. Same as |x1|, but |b| is two bytes long, 13001|-32768<=b<32768|. 13002 13003\yskip\hang|@!x3| 155 |b[3]|. Same as |x1|, but |b| is three bytes long, 13004|@t$-2^{23}$@><=b<@t$2^{23}$@>|. 13005 13006\yskip\hang|@!x4| 156 |b[4]|. Same as |x1|, but |b| is four bytes long, 13007|@t$-2^{31}$@><=b<@t$2^{31}$@>|. 13008 13009\yskip\hang|down1| 157 |a[1]|. Set |v:=v+a|, i.e., move down |a| units. 13010The parameter is a signed number in two's complement notation, |-128<=a<128|; 13011if |a<0|, the reference point moves up. 13012 13013\yskip\hang|@!down2| 158 |a[2]|. Same as |down1|, except that |a| is a 13014two-byte quantity in the range |-32768<=a<32768|. 13015 13016\yskip\hang|@!down3| 159 |a[3]|. Same as |down1|, except that |a| is a 13017three-byte quantity in the range |@t$-2^{23}$@><=a<@t$2^{23}$@>|. 13018 13019\yskip\hang|@!down4| 160 |a[4]|. Same as |down1|, except that |a| is a 13020four-byte quantity in the range |@t$-2^{31}$@><=a<@t$2^{31}$@>|. 13021 13022\yskip\hang|y0| 161. Set |v:=v+y|; i.e., move down |y| units. With luck, 13023this parameterless command will usually suffice, because the same kind of motion 13024will occur several times in succession; the following commands explain how 13025|y| gets particular values. 13026 13027\yskip\hang|y1| 162 |a[1]|. Set |y:=a| and |v:=v+a|. The value of |a| is a 13028signed quantity in two's complement notation, |-128<=a<128|. This command 13029changes the current |y|~spacing and moves down by |a|. 13030 13031\yskip\hang|@!y2| 163 |a[2]|. Same as |y1|, but |a| is two bytes long, 13032|-32768<=a<32768|. 13033 13034\yskip\hang|@!y3| 164 |a[3]|. Same as |y1|, but |a| is three bytes long, 13035|@t$-2^{23}$@><=a<@t$2^{23}$@>|. 13036 13037\yskip\hang|@!y4| 165 |a[4]|. Same as |y1|, but |a| is four bytes long, 13038|@t$-2^{31}$@><=a<@t$2^{31}$@>|. 13039 13040\yskip\hang|z0| 166. Set |v:=v+z|; i.e., move down |z| units. The `|z|' commands 13041are like the `|y|' commands except that they involve |z| instead of |y|. 13042 13043\yskip\hang|z1| 167 |a[1]|. Set |z:=a| and |v:=v+a|. The value of |a| is a 13044signed quantity in two's complement notation, |-128<=a<128|. This command 13045changes the current |z|~spacing and moves down by |a|. 13046 13047\yskip\hang|@!z2| 168 |a[2]|. Same as |z1|, but |a| is two bytes long, 13048|-32768<=a<32768|. 13049 13050\yskip\hang|@!z3| 169 |a[3]|. Same as |z1|, but |a| is three bytes long, 13051|@t$-2^{23}$@><=a<@t$2^{23}$@>|. 13052 13053\yskip\hang|@!z4| 170 |a[4]|. Same as |z1|, but |a| is four bytes long, 13054|@t$-2^{31}$@><=a<@t$2^{31}$@>|. 13055 13056\yskip\hang|fnt_num_0| 171. Set |f:=0|. Font 0 must previously have been 13057defined by a \\{fnt\_def} instruction, as explained below. 13058 13059\yskip\hang\\{fnt\_num\_1} through \\{fnt\_num\_63} (opcodes 172 to 234). Set 13060|f:=1|, \dots, \hbox{|f:=63|}, respectively. 13061 13062\yskip\hang|fnt1| 235 |k[1]|. Set |f:=k|. \TeX82 uses this command for font 13063numbers in the range |64<=k<256|. 13064 13065\yskip\hang|@!fnt2| 236 |k[2]|. Same as |fnt1|, except that |k|~is two 13066bytes long, so it is in the range |0<=k<65536|. \TeX82 never generates this 13067command, but large font numbers may prove useful for specifications of 13068color or texture, or they may be used for special fonts that have fixed 13069numbers in some external coding scheme. 13070 13071\yskip\hang|@!fnt3| 237 |k[3]|. Same as |fnt1|, except that |k|~is three 13072bytes long, so it can be as large as $2^{24}-1$. 13073 13074\yskip\hang|@!fnt4| 238 |k[4]|. Same as |fnt1|, except that |k|~is four 13075bytes long; this is for the really big font numbers (and for the negative ones). 13076 13077\yskip\hang|xxx1| 239 |k[1]| |x[k]|. This command is undefined in 13078general; it functions as a $(k+2)$-byte |nop| unless special \.{DVI}-reading 13079programs are being used. \TeX82 generates |xxx1| when a short enough 13080\.{\\special} appears, setting |k| to the number of bytes being sent. It 13081is recommended that |x| be a string having the form of a keyword followed 13082by possible parameters relevant to that keyword. 13083 13084\yskip\hang|@!xxx2| 240 |k[2]| |x[k]|. Like |xxx1|, but |0<=k<65536|. 13085 13086\yskip\hang|@!xxx3| 241 |k[3]| |x[k]|. Like |xxx1|, but |0<=k<@t$2^{24}$@>|. 13087 13088\yskip\hang|xxx4| 242 |k[4]| |x[k]|. Like |xxx1|, but |k| can be ridiculously 13089large. \TeX82 uses |xxx4| when sending a string of length 256 or more. 13090 13091\yskip\hang|fnt_def1| 243 |k[1]| |c[4]| |s[4]| |d[4]| |a[1]| |l[1]| |n[a+l]|. 13092Define font |k|, where |0<=k<256|; font definitions will be explained shortly. 13093 13094\yskip\hang|@!fnt_def2| 244 |k[2]| |c[4]| |s[4]| |d[4]| |a[1]| |l[1]| |n[a+l]|. 13095Define font |k|, where |0<=k<65536|. 13096 13097\yskip\hang|@!fnt_def3| 245 |k[3]| |c[4]| |s[4]| |d[4]| |a[1]| |l[1]| |n[a+l]|. 13098Define font |k|, where |0<=k<@t$2^{24}$@>|. 13099 13100\yskip\hang|@!fnt_def4| 246 |k[4]| |c[4]| |s[4]| |d[4]| |a[1]| |l[1]| |n[a+l]|. 13101Define font |k|, where |@t$-2^{31}$@><=k<@t$2^{31}$@>|. 13102 13103\yskip\hang|pre| 247 |i[1]| |num[4]| |den[4]| |mag[4]| |k[1]| |x[k]|. 13104Beginning of the preamble; this must come at the very beginning of the 13105file. Parameters |i|, |num|, |den|, |mag|, |k|, and |x| are explained below. 13106 13107\yskip\hang|post| 248. Beginning of the postamble, see below. 13108 13109\yskip\hang|post_post| 249. Ending of the postamble, see below. 13110 13111\yskip\noindent Commands 250--255 are undefined in normal \.{DVI} files, but 13112the following commands are used in \.{XDV} files. 13113 13114\yskip\hang\vbox{\halign{#&#\hfil\cr 13115|define_native_font| 252 & |k[4]| |s[4]| |flags[2]| |l[1]| |n[l]| |i[4]|\cr 13116& |if (flags and COLORED) then| |rgba[4]|\cr 13117& |if (flags and EXTEND) then| |extend[4]|\cr 13118& |if (flags and SLANT) then| |slant[4]|\cr 13119& |if (flags and EMBOLDEN) then| |embolden[4]|\cr 13120}} 13121 13122\yskip\hang|set_glyphs| 253 |w[4]| |k[2]| |xy[8k]| |g[2k]|. 13123 13124\yskip\noindent Commands 250 and 255 are undefined in normal \.{XDV} files. 13125 13126@ @d set_char_0=0 {typeset character 0 and move right} 13127@d set1=128 {typeset a character and move right} 13128@d set_rule=132 {typeset a rule and move right} 13129@d put_rule=137 {typeset a rule} 13130@d nop=138 {no operation} 13131@d bop=139 {beginning of page} 13132@d eop=140 {ending of page} 13133@d push=141 {save the current positions} 13134@d pop=142 {restore previous positions} 13135@d right1=143 {move right} 13136@d w0=147 {move right by |w|} 13137@d w1=148 {move right and set |w|} 13138@d x0=152 {move right by |x|} 13139@d x1=153 {move right and set |x|} 13140@d down1=157 {move down} 13141@d y0=161 {move down by |y|} 13142@d y1=162 {move down and set |y|} 13143@d z0=166 {move down by |z|} 13144@d z1=167 {move down and set |z|} 13145@d fnt_num_0=171 {set current font to 0} 13146@d fnt1=235 {set current font} 13147@d xxx1=239 {extension to \.{DVI} primitives} 13148@d xxx4=242 {potentially long extension to \.{DVI} primitives} 13149@d fnt_def1=243 {define the meaning of a font number} 13150@d pre=247 {preamble} 13151@d post=248 {postamble beginning} 13152@d post_post=249 {postamble ending} 13153 13154@d define_native_font=252 {define native font} 13155@d set_glyphs=253 {sequence of glyphs with individual x-y coordinates} 13156 13157@ The preamble contains basic information about the file as a whole. As 13158stated above, there are six parameters: 13159$$\hbox{|@!i[1]| |@!num[4]| |@!den[4]| |@!mag[4]| |@!k[1]| |@!x[k]|.}$$ 13160The |i| byte identifies \.{DVI} format; in \XeTeX\ this byte is set to~6, as we 13161have new \.{DVI} opcodes, while in \TeX82 it is always set to~2. (The value 13162|i=3| is used for an extended format that allows a mixture of right-to-left and 13163left-to-right typesetting. Older versions of \XeTeX\ used |i=4| and |i=5|.) 13164 13165The next two parameters, |num| and |den|, are positive integers that define 13166the units of measurement; they are the numerator and denominator of a 13167fraction by which all dimensions in the \.{DVI} file could be multiplied 13168in order to get lengths in units of $10^{-7}$ meters. Since $\rm 7227{pt} = 13169254{cm}$, and since \TeX\ works with scaled points where there are $2^{16}$ 13170sp in a point, \TeX\ sets 13171$|num|/|den|=(254\cdot10^5)/(7227\cdot2^{16})=25400000/473628672$. 13172@^sp@> 13173 13174The |mag| parameter is what \TeX\ calls \.{\\mag}, i.e., 1000 times the 13175desired magnification. The actual fraction by which dimensions are 13176multiplied is therefore $|mag|\cdot|num|/1000|den|$. Note that if a \TeX\ 13177source document does not call for any `\.{true}' dimensions, and if you 13178change it only by specifying a different \.{\\mag} setting, the \.{DVI} 13179file that \TeX\ creates will be completely unchanged except for the value 13180of |mag| in the preamble and postamble. (Fancy \.{DVI}-reading programs allow 13181users to override the |mag|~setting when a \.{DVI} file is being printed.) 13182 13183Finally, |k| and |x| allow the \.{DVI} writer to include a comment, which is not 13184interpreted further. The length of comment |x| is |k|, where |0<=k<256|. 13185 13186@d id_byte=6 {identifies the kind of \.{DVI} files described here} 13187 13188@ Font definitions for a given font number |k| contain further parameters 13189$$\hbox{|c[4]| |s[4]| |d[4]| |a[1]| |l[1]| |n[a+l]|.}$$ 13190The four-byte value |c| is the check sum that \TeX\ found in the \.{TFM} 13191file for this font; |c| should match the check sum of the font found by 13192programs that read this \.{DVI} file. 13193@^check sum@> 13194 13195Parameter |s| contains a fixed-point scale factor that is applied to 13196the character widths in font |k|; font dimensions in \.{TFM} files and 13197other font files are relative to this quantity, which is called the 13198``at size'' elsewhere in this documentation. The value of |s| is 13199always positive and less than $2^{27}$. It is given in the same units 13200as the other \.{DVI} dimensions, i.e., in sp when \TeX82 has made the 13201file. Parameter |d| is similar to |s|; it is the ``design size,'' and 13202(like~|s|) it is given in \.{DVI} units. Thus, font |k| is to be used 13203at $|mag|\cdot s/1000d$ times its normal size. 13204 13205The remaining part of a font definition gives the external name of the font, 13206which is an ASCII string of length |a+l|. The number |a| is the length 13207of the ``area'' or directory, and |l| is the length of the font name itself; 13208the standard local system font area is supposed to be used when |a=0|. 13209The |n| field contains the area in its first |a| bytes. 13210 13211Font definitions must appear before the first use of a particular font number. 13212Once font |k| is defined, it must not be defined again; however, we 13213shall see below that font definitions appear in the postamble as well as 13214in the pages, so in this sense each font number is defined exactly twice, 13215if at all. Like |nop| commands, font definitions can 13216appear before the first |bop|, or between an |eop| and a |bop|. 13217 13218@ Sometimes it is desirable to make horizontal or vertical rules line up 13219precisely with certain features in characters of a font. It is possible to 13220guarantee the correct matching between \.{DVI} output and the characters 13221generated by \MF\ by adhering to the following principles: (1)~The \MF\ 13222characters should be positioned so that a bottom edge or left edge that is 13223supposed to line up with the bottom or left edge of a rule appears at the 13224reference point, i.e., in row~0 and column~0 of the \MF\ raster. This 13225ensures that the position of the rule will not be rounded differently when 13226the pixel size is not a perfect multiple of the units of measurement in 13227the \.{DVI} file. (2)~A typeset rule of height $a>0$ and width $b>0$ 13228should be equivalent to a \MF-generated character having black pixels in 13229precisely those raster positions whose \MF\ coordinates satisfy 13230|0<=x<@t$\alpha$@>b| and |0<=y<@t$\alpha$@>a|, where $\alpha$ is the number 13231of pixels per \.{DVI} unit. 13232@:METAFONT}{\MF@> 13233@^alignment of rules with characters@> 13234@^rules aligning with characters@> 13235 13236@ The last page in a \.{DVI} file is followed by `|post|'; this command 13237introduces the postamble, which summarizes important facts that \TeX\ has 13238accumulated about the file, making it possible to print subsets of the data 13239with reasonable efficiency. The postamble has the form 13240$$\vbox{\halign{\hbox{#\hfil}\cr 13241 |post| |p[4]| |num[4]| |den[4]| |mag[4]| |l[4]| |u[4]| |s[2]| |t[2]|\cr 13242 $\langle\,$font definitions$\,\rangle$\cr 13243 |post_post| |q[4]| |i[1]| 223's$[{\G}4]$\cr}}$$ 13244Here |p| is a pointer to the final |bop| in the file. The next three 13245parameters, |num|, |den|, and |mag|, are duplicates of the quantities that 13246appeared in the preamble. 13247 13248Parameters |l| and |u| give respectively the height-plus-depth of the tallest 13249page and the width of the widest page, in the same units as other dimensions 13250of the file. These numbers might be used by a \.{DVI}-reading program to 13251position individual ``pages'' on large sheets of film or paper; however, 13252the standard convention for output on normal size paper is to position each 13253page so that the upper left-hand corner is exactly one inch from the left 13254and the top. Experience has shown that it is unwise to design \.{DVI}-to-printer 13255software that attempts cleverly to center the output; a fixed position of 13256the upper left corner is easiest for users to understand and to work with. 13257Therefore |l| and~|u| are often ignored. 13258 13259Parameter |s| is the maximum stack depth (i.e., the largest excess of 13260|push| commands over |pop| commands) needed to process this file. Then 13261comes |t|, the total number of pages (|bop| commands) present. 13262 13263The postamble continues with font definitions, which are any number of 13264\\{fnt\_def} commands as described above, possibly interspersed with |nop| 13265commands. Each font number that is used in the \.{DVI} file must be defined 13266exactly twice: Once before it is first selected by a \\{fnt} command, and once 13267in the postamble. 13268 13269@ The last part of the postamble, following the |post_post| byte that 13270signifies the end of the font definitions, contains |q|, a pointer to the 13271|post| command that started the postamble. An identification byte, |i|, 13272comes next; this currently equals~2, as in the preamble. 13273 13274The |i| byte is followed by four or more bytes that are all equal to 13275the decimal number 223 (i.e., @'337 in octal). \TeX\ puts out four to seven of 13276these trailing bytes, until the total length of the file is a multiple of 13277four bytes, since this works out best on machines that pack four bytes per 13278word; but any number of 223's is allowed, as long as there are at least four 13279of them. In effect, 223 is a sort of signature that is added at the very end. 13280@^Fuchs, David Raymond@> 13281 13282This curious way to finish off a \.{DVI} file makes it feasible for 13283\.{DVI}-reading programs to find the postamble first, on most computers, 13284even though \TeX\ wants to write the postamble last. Most operating 13285systems permit random access to individual words or bytes of a file, so 13286the \.{DVI} reader can start at the end and skip backwards over the 223's 13287until finding the identification byte. Then it can back up four bytes, read 13288|q|, and move to byte |q| of the file. This byte should, of course, 13289contain the value 248 (|post|); now the postamble can be read, so the 13290\.{DVI} reader can discover all the information needed for typesetting the 13291pages. Note that it is also possible to skip through the \.{DVI} file at 13292reasonably high speed to locate a particular page, if that proves 13293desirable. This saves a lot of time, since \.{DVI} files used in production 13294jobs tend to be large. 13295 13296Unfortunately, however, standard \PASCAL\ does not include the ability to 13297@^system dependencies@> 13298access a random position in a file, or even to determine the length of a file. 13299Almost all systems nowadays provide the necessary capabilities, so \.{DVI} 13300format has been designed to work most efficiently with modern operating systems. 13301But if \.{DVI} files have to be processed under the restrictions of standard 13302\PASCAL, one can simply read them from front to back, since the necessary 13303header information is present in the preamble and in the font definitions. 13304(The |l| and |u| and |s| and |t| parameters, which appear only in the 13305postamble, are ``frills'' that are handy but not absolutely necessary.) 13306 13307@* \[32] Shipping pages out. 13308After considering \TeX's eyes and stomach, we come now to the bowels. 13309@^bowels@> 13310 13311The |ship_out| procedure is given a pointer to a box; its mission is 13312to describe that box in \.{DVI} form, outputting a ``page'' to |dvi_file|. 13313The \.{DVI} coordinates $(h,v)=(0,0)$ should correspond to the upper left 13314corner of the box being shipped. 13315 13316Since boxes can be inside of boxes inside of boxes, the main work of 13317|ship_out| is done by two mutually recursive routines, |hlist_out| 13318and |vlist_out|, which traverse the hlists and vlists inside of horizontal 13319and vertical boxes. 13320 13321As individual pages are being processed, we need to accumulate 13322information about the entire set of pages, since such statistics must be 13323reported in the postamble. The global variables |total_pages|, |max_v|, 13324|max_h|, |max_push|, and |last_bop| are used to record this information. 13325 13326The variable |doing_leaders| is |true| while leaders are being output. 13327The variable |dead_cycles| contains the number of times an output routine 13328has been initiated since the last |ship_out|. 13329 13330A few additional global variables are also defined here for use in 13331|vlist_out| and |hlist_out|. They could have been local variables, but 13332that would waste stack space when boxes are deeply nested, since the 13333values of these variables are not needed during recursive calls. 13334@^recursion@> 13335 13336@<Glob...@>= 13337@!total_pages:integer; {the number of pages that have been shipped out} 13338@!max_v:scaled; {maximum height-plus-depth of pages shipped so far} 13339@!max_h:scaled; {maximum width of pages shipped so far} 13340@!max_push:integer; {deepest nesting of |push| commands encountered so far} 13341@!last_bop:integer; {location of previous |bop| in the \.{DVI} output} 13342@!dead_cycles:integer; {recent outputs that didn't ship anything out} 13343@!doing_leaders:boolean; {are we inside a leader box?} 13344@# 13345@!c,@!f:quarterword; {character and font in current |char_node|} 13346@!rule_ht,@!rule_dp,@!rule_wd:scaled; {size of current rule being output} 13347@!g:pointer; {current glue specification} 13348@!lq,@!lr:integer; {quantities used in calculations for leaders} 13349 13350@ @<Set init...@>= 13351total_pages:=0; max_v:=0; max_h:=0; max_push:=0; last_bop:=-1; 13352doing_leaders:=false; dead_cycles:=0; cur_s:=-1; 13353 13354@ The \.{DVI} bytes are output to a buffer instead of being written directly 13355to the output file. This makes it possible to reduce the overhead of 13356subroutine calls, thereby measurably speeding up the computation, since 13357output of \.{DVI} bytes is part of \TeX's inner loop. And it has another 13358advantage as well, since we can change instructions in the buffer in order to 13359make the output more compact. For example, a `|down2|' command can be 13360changed to a `|y2|', thereby making a subsequent `|y0|' command possible, 13361saving two bytes. 13362 13363The output buffer is divided into two parts of equal size; the bytes found 13364in |dvi_buf[0..half_buf-1]| constitute the first half, and those in 13365|dvi_buf[half_buf..dvi_buf_size-1]| constitute the second. The global 13366variable |dvi_ptr| points to the position that will receive the next 13367output byte. When |dvi_ptr| reaches |dvi_limit|, which is always equal 13368to one of the two values |half_buf| or |dvi_buf_size|, the half buffer that 13369is about to be invaded next is sent to the output and |dvi_limit| is 13370changed to its other value. Thus, there is always at least a half buffer's 13371worth of information present, except at the very beginning of the job. 13372 13373Bytes of the \.{DVI} file are numbered sequentially starting with 0; 13374the next byte to be generated will be number |dvi_offset+dvi_ptr|. 13375A byte is present in the buffer only if its number is |>=dvi_gone|. 13376 13377@<Types...@>= 13378@!dvi_index=0..dvi_buf_size; {an index into the output buffer} 13379 13380@ Some systems may find it more efficient to make |dvi_buf| a |packed| 13381array, since output of four bytes at once may be facilitated. 13382@^system dependencies@> 13383 13384@<Glob...@>= 13385@!dvi_buf:array[dvi_index] of eight_bits; {buffer for \.{DVI} output} 13386@!half_buf:dvi_index; {half of |dvi_buf_size|} 13387@!dvi_limit:dvi_index; {end of the current half buffer} 13388@!dvi_ptr:dvi_index; {the next available buffer address} 13389@!dvi_offset:integer; {|dvi_buf_size| times the number of times the 13390 output buffer has been fully emptied} 13391@!dvi_gone:integer; {the number of bytes already output to |dvi_file|} 13392 13393@ Initially the buffer is all in one piece; we will output half of it only 13394after it first fills up. 13395 13396@<Set init...@>= 13397half_buf:=dvi_buf_size div 2; dvi_limit:=dvi_buf_size; dvi_ptr:=0; 13398dvi_offset:=0; dvi_gone:=0; 13399 13400@ The actual output of |dvi_buf[a..b]| to |dvi_file| is performed by calling 13401|write_dvi(a,b)|. For best results, this procedure should be optimized to 13402run as fast as possible on each particular system, since it is part of 13403\TeX's inner loop. It is safe to assume that |a| and |b+1| will both be 13404multiples of 4 when |write_dvi(a,b)| is called; therefore it is possible on 13405many machines to use efficient methods to pack four bytes per word and to 13406output an array of words with one system call. 13407@^system dependencies@> 13408@^inner loop@> 13409@^defecation@> 13410 13411@p procedure write_dvi(@!a,@!b:dvi_index); 13412var k:dvi_index; 13413begin for k:=a to b do write(dvi_file,dvi_buf[k]); 13414end; 13415 13416@ To put a byte in the buffer without paying the cost of invoking a procedure 13417each time, we use the macro |dvi_out|. 13418 13419@d dvi_out(#)==@+begin dvi_buf[dvi_ptr]:=#; incr(dvi_ptr); 13420 if dvi_ptr=dvi_limit then dvi_swap; 13421 end 13422 13423@p procedure dvi_swap; {outputs half of the buffer} 13424begin if dvi_limit=dvi_buf_size then 13425 begin write_dvi(0,half_buf-1); dvi_limit:=half_buf; 13426 dvi_offset:=dvi_offset+dvi_buf_size; dvi_ptr:=0; 13427 end 13428else begin write_dvi(half_buf,dvi_buf_size-1); dvi_limit:=dvi_buf_size; 13429 end; 13430dvi_gone:=dvi_gone+half_buf; 13431end; 13432 13433@ Here is how we clean out the buffer when \TeX\ is all through; |dvi_ptr| 13434will be a multiple of~4. 13435 13436@<Empty the last bytes out of |dvi_buf|@>= 13437if dvi_limit=half_buf then write_dvi(half_buf,dvi_buf_size-1); 13438if dvi_ptr>0 then write_dvi(0,dvi_ptr-1) 13439 13440@ The |dvi_four| procedure outputs four bytes in two's complement notation, 13441without risking arithmetic overflow. 13442 13443@p procedure dvi_four(@!x:integer); 13444begin if x>=0 then dvi_out(x div @'100000000) 13445else begin x:=x+@'10000000000; 13446 x:=x+@'10000000000; 13447 dvi_out((x div @'100000000) + 128); 13448 end; 13449x:=x mod @'100000000; dvi_out(x div @'200000); 13450x:=x mod @'200000; dvi_out(x div @'400); 13451dvi_out(x mod @'400); 13452end; 13453 13454procedure dvi_two(s: UTF16_code); 13455begin 13456 dvi_out(s div @'400); 13457 dvi_out(s mod @'400); 13458end; 13459 13460@ A mild optimization of the output is performed by the |dvi_pop| 13461routine, which issues a |pop| unless it is possible to cancel a 13462`|push| |pop|' pair. The parameter to |dvi_pop| is the byte address 13463following the old |push| that matches the new |pop|. 13464 13465@p procedure dvi_pop(@!l:integer); 13466begin if (l=dvi_offset+dvi_ptr)and(dvi_ptr>0) then decr(dvi_ptr) 13467else dvi_out(pop); 13468end; 13469 13470@ Here's a procedure that outputs a font definition. Since \TeX82 uses at 13471most 256 different fonts per job, |fnt_def1| is always used as the command code. 13472 13473@p procedure dvi_native_font_def(@!f:internal_font_number); 13474var 13475 font_def_length, i: integer; 13476begin 13477 dvi_out(define_native_font); 13478 dvi_four(f-font_base-1); 13479 font_def_length:=make_font_def(f); 13480 for i:=0 to font_def_length - 1 do dvi_out(xdv_buffer[i]); 13481end; 13482 13483procedure dvi_font_def(@!f:internal_font_number); 13484var k:pool_pointer; {index into |str_pool|} 13485l:integer; {length of name without mapping option} 13486begin if is_native_font(f) then dvi_native_font_def(f) else 13487begin dvi_out(fnt_def1); 13488dvi_out(f-font_base-1);@/ 13489dvi_out(qo(font_check[f].b0)); 13490dvi_out(qo(font_check[f].b1)); 13491dvi_out(qo(font_check[f].b2)); 13492dvi_out(qo(font_check[f].b3));@/ 13493dvi_four(font_size[f]); 13494dvi_four(font_dsize[f]);@/ 13495dvi_out(length(font_area[f])); 13496@<Output the font name whose internal number is |f|@>; 13497end; 13498 13499@ @<Output the font name whose internal number is |f|@>= 13500l:=0; k:=str_start_macro(font_name[f]); 13501{search for colon; we will truncate the name there} 13502while (l=0) and (k<str_start_macro(font_name[f]+1)) do begin 13503 if so(str_pool[k]) = ":" then l:=k-str_start_macro(font_name[f]); 13504 incr(k); 13505end; 13506if l=0 then l:=length(font_name[f]); {no colon found} 13507dvi_out(l); 13508for k:=str_start_macro(font_area[f]) to str_start_macro(font_area[f]+1)-1 do 13509 dvi_out(so(str_pool[k])); 13510for k:=str_start_macro(font_name[f]) to str_start_macro(font_name[f])+l-1 do 13511 dvi_out(so(str_pool[k])); 13512end; 13513 13514@ Versions of \TeX\ intended for small computers might well choose to omit 13515the ideas in the next few parts of this program, since it is not really 13516necessary to optimize the \.{DVI} code by making use of the |w0|, |x0|, 13517|y0|, and |z0| commands. Furthermore, the algorithm that we are about to 13518describe does not pretend to give an optimum reduction in the length 13519of the \.{DVI} code; after all, speed is more important than compactness. 13520But the method is surprisingly effective, and it takes comparatively little 13521time. 13522 13523We can best understand the basic idea by first considering a simpler problem 13524that has the same essential characteristics. Given a sequence of digits, 13525say $3\,1\,4\,1\,5\,9\,2\,6\,5\,3\,5\,8\,9$, we want to assign subscripts 13526$d$, $y$, or $z$ to each digit so as to maximize the number of ``$y$-hits'' 13527and ``$z$-hits''; a $y$-hit is an instance of two appearances of the same 13528digit with the subscript $y$, where no $y$'s intervene between the two 13529appearances, and a $z$-hit is defined similarly. For example, the sequence 13530above could be decorated with subscripts as follows: 13531$$3_z\,1_y\,4_d\,1_y\,5_y\,9_d\,2_d\,6_d\,5_y\,3_z\,5_y\,8_d\,9_d.$$ 13532There are three $y$-hits ($1_y\ldots1_y$ and $5_y\ldots5_y\ldots5_y$) and 13533one $z$-hit ($3_z\ldots3_z$); there are no $d$-hits, since the two appearances 13534of $9_d$ have $d$'s between them, but we don't count $d$-hits so it doesn't 13535matter how many there are. These subscripts are analogous to the \.{DVI} 13536commands called \\{down}, $y$, and $z$, and the digits are analogous to 13537different amounts of vertical motion; a $y$-hit or $z$-hit corresponds to 13538the opportunity to use the one-byte commands |y0| or |z0| in a \.{DVI} file. 13539 13540\TeX's method of assigning subscripts works like this: Append a new digit, 13541say $\delta$, to the right of the sequence. Now look back through the 13542sequence until one of the following things happens: (a)~You see 13543$\delta_y$ or $\delta_z$, and this was the first time you encountered a 13544$y$ or $z$ subscript, respectively. Then assign $y$ or $z$ to the new 13545$\delta$; you have scored a hit. (b)~You see $\delta_d$, and no $y$ 13546subscripts have been encountered so far during this search. Then change 13547the previous $\delta_d$ to $\delta_y$ (this corresponds to changing a 13548command in the output buffer), and assign $y$ to the new $\delta$; it's 13549another hit. (c)~You see $\delta_d$, and a $y$ subscript has been seen 13550but not a $z$. Change the previous $\delta_d$ to $\delta_z$ and assign 13551$z$ to the new $\delta$. (d)~You encounter both $y$ and $z$ subscripts 13552before encountering a suitable $\delta$, or you scan all the way to the 13553front of the sequence. Assign $d$ to the new $\delta$; this assignment may 13554be changed later. 13555 13556The subscripts $3_z\,1_y\,4_d\ldots\,$ in the example above were, in fact, 13557produced by this procedure, as the reader can verify. (Go ahead and try it.) 13558 13559@ In order to implement such an idea, \TeX\ maintains a stack of pointers 13560to the \\{down}, $y$, and $z$ commands that have been generated for the 13561current page. And there is a similar stack for \\{right}, |w|, and |x| 13562commands. These stacks are called the down stack and right stack, and their 13563top elements are maintained in the variables |down_ptr| and |right_ptr|. 13564 13565Each entry in these stacks contains four fields: The |width| field is 13566the amount of motion down or to the right; the |location| field is the 13567byte number of the \.{DVI} command in question (including the appropriate 13568|dvi_offset|); the |link| field points to the next item below this one 13569on the stack; and the |info| field encodes the options for possible change 13570in the \.{DVI} command. 13571 13572@d movement_node_size=3 {number of words per entry in the down and right stacks} 13573@d location(#)==mem[#+2].int {\.{DVI} byte number for a movement command} 13574 13575@<Glob...@>= 13576@!down_ptr,@!right_ptr:pointer; {heads of the down and right stacks} 13577 13578@ @<Set init...@>= 13579down_ptr:=null; right_ptr:=null; 13580 13581@ Here is a subroutine that produces a \.{DVI} command for some specified 13582downward or rightward motion. It has two parameters: |w| is the amount 13583of motion, and |o| is either |down1| or |right1|. We use the fact that 13584the command codes have convenient arithmetic properties: |y1-down1=w1-right1| 13585and |z1-down1=x1-right1|. 13586 13587@p procedure movement(@!w:scaled;@!o:eight_bits); 13588label exit,found,not_found,2,1; 13589var mstate:small_number; {have we seen a |y| or |z|?} 13590@!p,@!q:pointer; {current and top nodes on the stack} 13591@!k:integer; {index into |dvi_buf|, modulo |dvi_buf_size|} 13592begin q:=get_node(movement_node_size); {new node for the top of the stack} 13593width(q):=w; location(q):=dvi_offset+dvi_ptr; 13594if o=down1 then 13595 begin link(q):=down_ptr; down_ptr:=q; 13596 end 13597else begin link(q):=right_ptr; right_ptr:=q; 13598 end; 13599@<Look at the other stack entries until deciding what sort of \.{DVI} command 13600 to generate; |goto found| if node |p| is a ``hit''@>; 13601@<Generate a |down| or |right| command for |w| and |return|@>; 13602found: @<Generate a |y0| or |z0| command in order to reuse a previous 13603 appearance of~|w|@>; 13604exit:end; 13605 13606@ The |info| fields in the entries of the down stack or the right stack 13607have six possible settings: |y_here| or |z_here| mean that the \.{DVI} 13608command refers to |y| or |z|, respectively (or to |w| or |x|, in the 13609case of horizontal motion); |yz_OK| means that the \.{DVI} command is 13610\\{down} (or \\{right}) but can be changed to either |y| or |z| (or 13611to either |w| or |x|); |y_OK| means that it is \\{down} and can be changed 13612to |y| but not |z|; |z_OK| is similar; and |d_fixed| means it must stay 13613\\{down}. 13614 13615The four settings |yz_OK|, |y_OK|, |z_OK|, |d_fixed| would not need to 13616be distinguished from each other if we were simply solving the 13617digit-subscripting problem mentioned above. But in \TeX's case there is 13618a complication because of the nested structure of |push| and |pop| 13619commands. Suppose we add parentheses to the digit-subscripting problem, 13620redefining hits so that $\delta_y\ldots \delta_y$ is a hit if all $y$'s between 13621the $\delta$'s are enclosed in properly nested parentheses, and if the 13622parenthesis level of the right-hand $\delta_y$ is deeper than or equal to 13623that of the left-hand one. Thus, `(' and `)' correspond to `|push|' 13624and `|pop|'. Now if we want to assign a subscript to the final 1 in the 13625sequence 13626$$2_y\,7_d\,1_d\,(\,8_z\,2_y\,8_z\,)\,1$$ 13627we cannot change the previous $1_d$ to $1_y$, since that would invalidate 13628the $2_y\ldots2_y$ hit. But we can change it to $1_z$, scoring a hit 13629since the intervening $8_z$'s are enclosed in parentheses. 13630 13631The program below removes movement nodes that are introduced after a |push|, 13632before it outputs the corresponding |pop|. 13633 13634@d y_here=1 {|info| when the movement entry points to a |y| command} 13635@d z_here=2 {|info| when the movement entry points to a |z| command} 13636@d yz_OK=3 {|info| corresponding to an unconstrained \\{down} command} 13637@d y_OK=4 {|info| corresponding to a \\{down} that can't become a |z|} 13638@d z_OK=5 {|info| corresponding to a \\{down} that can't become a |y|} 13639@d d_fixed=6 {|info| corresponding to a \\{down} that can't change} 13640 13641@ When the |movement| procedure gets to the label |found|, the value of 13642|info(p)| will be either |y_here| or |z_here|. If it is, say, |y_here|, 13643the procedure generates a |y0| command (or a |w0| command), and marks 13644all |info| fields between |q| and |p| so that |y| is not OK in that range. 13645 13646@<Generate a |y0| or |z0| command...@>= 13647info(q):=info(p); 13648if info(q)=y_here then 13649 begin dvi_out(o+y0-down1); {|y0| or |w0|} 13650 while link(q)<>p do 13651 begin q:=link(q); 13652 case info(q) of 13653 yz_OK: info(q):=z_OK; 13654 y_OK: info(q):=d_fixed; 13655 othercases do_nothing 13656 endcases; 13657 end; 13658 end 13659else begin dvi_out(o+z0-down1); {|z0| or |x0|} 13660 while link(q)<>p do 13661 begin q:=link(q); 13662 case info(q) of 13663 yz_OK: info(q):=y_OK; 13664 z_OK: info(q):=d_fixed; 13665 othercases do_nothing 13666 endcases; 13667 end; 13668 end 13669 13670@ @<Generate a |down| or |right|...@>= 13671info(q):=yz_OK; 13672if abs(w)>=@'40000000 then 13673 begin dvi_out(o+3); {|down4| or |right4|} 13674 dvi_four(w); return; 13675 end; 13676if abs(w)>=@'100000 then 13677 begin dvi_out(o+2); {|down3| or |right3|} 13678 if w<0 then w:=w+@'100000000; 13679 dvi_out(w div @'200000); w:=w mod @'200000; goto 2; 13680 end; 13681if abs(w)>=@'200 then 13682 begin dvi_out(o+1); {|down2| or |right2|} 13683 if w<0 then w:=w+@'200000; 13684 goto 2; 13685 end; 13686dvi_out(o); {|down1| or |right1|} 13687if w<0 then w:=w+@'400; 13688goto 1; 136892: dvi_out(w div @'400); 136901: dvi_out(w mod @'400); return 13691 13692@ As we search through the stack, we are in one of three states, 13693|y_seen|, |z_seen|, or |none_seen|, depending on whether we have 13694encountered |y_here| or |z_here| nodes. These states are encoded as 13695multiples of 6, so that they can be added to the |info| fields for quick 13696decision-making. 13697@^inner loop@> 13698 13699@d none_seen=0 {no |y_here| or |z_here| nodes have been encountered yet} 13700@d y_seen=6 {we have seen |y_here| but not |z_here|} 13701@d z_seen=12 {we have seen |z_here| but not |y_here|} 13702 13703@<Look at the other stack entries until deciding...@>= 13704p:=link(q); mstate:=none_seen; 13705while p<>null do 13706 begin if width(p)=w then @<Consider a node with matching width; 13707 |goto found| if it's a hit@> 13708 else case mstate+info(p) of 13709 none_seen+y_here: mstate:=y_seen; 13710 none_seen+z_here: mstate:=z_seen; 13711 y_seen+z_here,z_seen+y_here: goto not_found; 13712 othercases do_nothing 13713 endcases; 13714 p:=link(p); 13715 end; 13716not_found: 13717 13718@ We might find a valid hit in a |y| or |z| byte that is already gone 13719from the buffer. But we can't change bytes that are gone forever; ``the 13720moving finger writes, $\ldots\,\,$.'' 13721 13722@<Consider a node with matching width...@>= 13723case mstate+info(p) of 13724none_seen+yz_OK,none_seen+y_OK,z_seen+yz_OK,z_seen+y_OK:@t@>@;@/ 13725 if location(p)<dvi_gone then goto not_found 13726 else @<Change buffered instruction to |y| or |w| and |goto found|@>; 13727none_seen+z_OK,y_seen+yz_OK,y_seen+z_OK:@t@>@;@/ 13728 if location(p)<dvi_gone then goto not_found 13729 else @<Change buffered instruction to |z| or |x| and |goto found|@>; 13730none_seen+y_here,none_seen+z_here,y_seen+z_here,z_seen+y_here: goto found; 13731othercases do_nothing 13732endcases 13733 13734@ @<Change buffered instruction to |y| or |w| and |goto found|@>= 13735begin k:=location(p)-dvi_offset; 13736if k<0 then k:=k+dvi_buf_size; 13737dvi_buf[k]:=dvi_buf[k]+y1-down1; 13738info(p):=y_here; goto found; 13739end 13740 13741@ @<Change buffered instruction to |z| or |x| and |goto found|@>= 13742begin k:=location(p)-dvi_offset; 13743if k<0 then k:=k+dvi_buf_size; 13744dvi_buf[k]:=dvi_buf[k]+z1-down1; 13745info(p):=z_here; goto found; 13746end 13747 13748@ In case you are wondering when all the movement nodes are removed from 13749\TeX's memory, the answer is that they are recycled just before 13750|hlist_out| and |vlist_out| finish outputting a box. This restores the 13751down and right stacks to the state they were in before the box was output, 13752except that some |info|'s may have become more restrictive. 13753 13754@p procedure prune_movements(@!l:integer); 13755 {delete movement nodes with |location>=l|} 13756label done,exit; 13757var p:pointer; {node being deleted} 13758begin while down_ptr<>null do 13759 begin if location(down_ptr)<l then goto done; 13760 p:=down_ptr; down_ptr:=link(p); free_node(p,movement_node_size); 13761 end; 13762done: while right_ptr<>null do 13763 begin if location(right_ptr)<l then return; 13764 p:=right_ptr; right_ptr:=link(p); free_node(p,movement_node_size); 13765 end; 13766exit:end; 13767 13768@ The actual distances by which we want to move might be computed as the 13769sum of several separate movements. For example, there might be several 13770glue nodes in succession, or we might want to move right by the width of 13771some box plus some amount of glue. More importantly, the baselineskip 13772distances are computed in terms of glue together with the depth and 13773height of adjacent boxes, and we want the \.{DVI} file to lump these 13774three quantities together into a single motion. 13775 13776Therefore, \TeX\ maintains two pairs of global variables: |dvi_h| and |dvi_v| 13777are the |h| and |v| coordinates corresponding to the commands actually 13778output to the \.{DVI} file, while |cur_h| and |cur_v| are the coordinates 13779corresponding to the current state of the output routines. Coordinate 13780changes will accumulate in |cur_h| and |cur_v| without being reflected 13781in the output, until such a change becomes necessary or desirable; we 13782can call the |movement| procedure whenever we want to make |dvi_h=cur_h| 13783or |dvi_v=cur_v|. 13784 13785The current font reflected in the \.{DVI} output is called |dvi_f|; 13786there is no need for a `\\{cur\_f}' variable. 13787 13788The depth of nesting of |hlist_out| and |vlist_out| is called |cur_s|; 13789this is essentially the depth of |push| commands in the \.{DVI} output. 13790 13791For mixed direction text (\TeXXeT) the current text direction is called 13792|cur_dir|. As the box being shipped out will never be used again and 13793soon be recycled, we can simply reverse any R-text (i.e., right-to-left) 13794segments of hlist nodes as well as complete hlist nodes embedded in such 13795segments. Moreover this can be done iteratively rather than recursively. 13796There are, however, two complications related to leaders that require 13797some additional bookkeeping: (1)~One and the same hlist node might be 13798used more than once (but never inside both L- and R-text); and 13799(2)~leader boxes inside hlists must be aligned with respect to the left 13800edge of the original hlist. 13801 13802A math node is changed into a kern node whenever the text direction 13803remains the same, it is replaced by an |edge_node| if the text direction 13804changes; the subtype of an an |hlist_node| inside R-text is changed to 13805|reversed| once its hlist has been reversed. 13806@!@^data structure assumptions@> 13807 13808@d reversed=1 {subtype for an |hlist_node| whose hlist has been reversed} 13809@d dlist=2 {subtype for an |hlist_node| from display math mode} 13810@d box_lr(#) == (qo(subtype(#))) {direction mode of a box} 13811@d set_box_lr(#) == subtype(#):=set_box_lr_end 13812@d set_box_lr_end(#) == qi(#) 13813@# 13814@d left_to_right=0 13815@d right_to_left=1 13816@d reflected==1-cur_dir {the opposite of |cur_dir|} 13817@# 13818@d synch_h==if cur_h<>dvi_h then 13819 begin movement(cur_h-dvi_h,right1); dvi_h:=cur_h; 13820 end 13821@d synch_v==if cur_v<>dvi_v then 13822 begin movement(cur_v-dvi_v,down1); dvi_v:=cur_v; 13823 end 13824 13825@<Glob...@>= 13826@!dvi_h,@!dvi_v:scaled; {a \.{DVI} reader program thinks we are here} 13827@!cur_h,@!cur_v:scaled; {\TeX\ thinks we are here} 13828@!dvi_f:internal_font_number; {the current font} 13829@!cur_s:integer; {current depth of output box nesting, initially $-1$} 13830 13831@ @<Initialize variables as |ship_out| begins@>= 13832dvi_h:=0; dvi_v:=0; cur_h:=h_offset; dvi_f:=null_font; 13833@<Calculate page dimensions and margins@>; 13834ensure_dvi_open; 13835if total_pages=0 then 13836 begin dvi_out(pre); dvi_out(id_byte); {output the preamble} 13837@^preamble of \.{DVI} file@> 13838 dvi_four(25400000); dvi_four(473628672); {conversion ratio for sp} 13839 prepare_mag; dvi_four(mag); {magnification factor is frozen} 13840 old_setting:=selector; selector:=new_string; 13841 print(" XeTeX output "); print_int(year); print_char("."); 13842 print_two(month); print_char("."); print_two(day); 13843 print_char(":"); print_two(time div 60); 13844 print_two(time mod 60); 13845 selector:=old_setting; dvi_out(cur_length); 13846 for s:=str_start_macro(str_ptr) to pool_ptr-1 do dvi_out(so(str_pool[s])); 13847 pool_ptr:=str_start_macro(str_ptr); {flush the current string} 13848 end 13849 13850@ When |hlist_out| is called, its duty is to output the box represented 13851by the |hlist_node| pointed to by |temp_ptr|. The reference point of that 13852box has coordinates |(cur_h,cur_v)|. 13853 13854Similarly, when |vlist_out| is called, its duty is to output the box represented 13855by the |vlist_node| pointed to by |temp_ptr|. The reference point of that 13856box has coordinates |(cur_h,cur_v)|. 13857@^recursion@> 13858 13859@p procedure@?vlist_out; forward; {|hlist_out| and |vlist_out| are mutually 13860 recursive} 13861 13862@ The recursive procedures |hlist_out| and |vlist_out| each have local variables 13863|save_h| and |save_v| to hold the values of |dvi_h| and |dvi_v| just before 13864entering a new level of recursion. In effect, the values of |save_h| and 13865|save_v| on \TeX's run-time stack correspond to the values of |h| and |v| 13866that a \.{DVI}-reading program will push onto its coordinate stack. 13867 13868@d move_past=13 {go to this label when advancing past glue or a rule} 13869@d fin_rule=14 {go to this label to finish processing a rule} 13870@d next_p=15 {go to this label when finished with node |p|} 13871 13872@d check_next=1236 13873@d end_node_run=1237 13874 13875@p @t\4@>@<Declare procedures needed in |hlist_out|, |vlist_out|@>@t@>@/ 13876procedure hlist_out; {output an |hlist_node| box} 13877label reswitch, move_past, fin_rule, next_p; 13878var base_line: scaled; {the baseline coordinate for this box} 13879@!left_edge: scaled; {the left coordinate for this box} 13880@!save_h,@!save_v: scaled; {what |dvi_h| and |dvi_v| should pop to} 13881@!this_box: pointer; {pointer to containing box} 13882@!g_order: glue_ord; {applicable order of infinity for glue} 13883@!g_sign: normal..shrinking; {selects type of glue} 13884@!p:pointer; {current position in the hlist} 13885@!save_loc:integer; {\.{DVI} byte location upon entry} 13886@!leader_box:pointer; {the leader box being replicated} 13887@!leader_wd:scaled; {width of leader box being replicated} 13888@!lx:scaled; {extra space between leader boxes} 13889@!outer_doing_leaders:boolean; {were we doing leaders?} 13890@!edge:scaled; {right edge of sub-box or leader space} 13891@!prev_p:pointer; {one step behind |p|} 13892@!len: integer; {length of scratch string for native word output} 13893@!q,@!r: pointer; 13894@!k,@!j: integer; 13895@!glue_temp:real; {glue value before rounding} 13896@!cur_glue:real; {glue seen so far} 13897@!cur_g:scaled; {rounded equivalent of |cur_glue| times the glue ratio} 13898begin cur_g:=0; cur_glue:=float_constant(0); 13899this_box:=temp_ptr; g_order:=glue_order(this_box); 13900g_sign:=glue_sign(this_box); 13901@<Merge sequences of words using AAT fonts and inter-word spaces into single nodes@>; 13902p:=list_ptr(this_box); 13903incr(cur_s); 13904if cur_s>0 then dvi_out(push); 13905if cur_s>max_push then max_push:=cur_s; 13906save_loc:=dvi_offset+dvi_ptr; base_line:=cur_v; 13907prev_p:=this_box+list_offset; 13908@<Initialize |hlist_out| for mixed direction typesetting@>; 13909left_edge:=cur_h; 13910while p<>null do @<Output node |p| for |hlist_out| and move to the next node, 13911 maintaining the condition |cur_v=base_line|@>; 13912@<Finish |hlist_out| for mixed direction typesetting@>; 13913prune_movements(save_loc); 13914if cur_s>0 then dvi_pop(save_loc); 13915decr(cur_s); 13916end; 13917 13918@ Extra stuff for justifiable AAT text; need to merge runs of words and normal spaces. 13919 13920@d is_native_word_node(#) == (((#)<>null and (not is_char_node(#)) and (type(#) = whatsit_node) and (subtype(#) = native_word_node))) 13921@d is_glyph_node(#) == (((#)<>null and (not is_char_node(#)) and (type(#) = whatsit_node) and (subtype(#) = glyph_node))) 13922 13923@<Merge sequences of words using AAT fonts and inter-word spaces into single nodes@>= 13924p:=list_ptr(this_box); 13925prev_p:=this_box+list_offset; 13926while p<>null do begin 13927 if link(p) <> null then begin {not worth looking ahead at the end} 13928 if is_native_word_node(p) and (font_area[native_font(p)] = aat_font_flag) 13929 and (font_letter_space[native_font(p)] = 0) then begin 13930 {got a word in an AAT font, might be the start of a run} 13931 r:=p; {|r| is start of possible run} 13932 k:=native_length(r); 13933 q:=link(p); 13934check_next: 13935 @<Advance |q| past ignorable nodes@>; 13936 if (q <> null) and not is_char_node(q) then begin 13937 if (type(q) = glue_node) and (subtype(q) = normal) and (glue_ptr(q) = font_glue[native_font(r)]) then begin 13938 {found a normal space; if the next node is another word in the same font, we'll merge} 13939 q:=link(q); 13940 @<Advance |q| past ignorable nodes@>; 13941 if is_native_word_node(q) and (native_font(q) = native_font(r)) then begin 13942 p:=q; {record new tail of run in |p|} 13943 k:=k + 1 + native_length(q); 13944 q:=link(q); 13945 goto check_next; 13946 end; 13947 goto end_node_run; 13948 end; 13949 if is_native_word_node(q) and (native_font(q) = native_font(r)) then begin 13950 p:=q; {record new tail of run in |p|} 13951 q:=link(q); 13952 goto check_next; 13953 end 13954 end; 13955end_node_run: {now |r| points to first |native_word_node| of the run, and |p| to the last} 13956 if p <> r then begin {merge nodes from |r| to |p| inclusive; total text length is |k|} 13957 str_room(k); 13958 k:=0; {now we'll use this as accumulator for total width} 13959 q:=r; 13960 loop begin 13961 if type(q) = whatsit_node then begin 13962 if subtype(q) = native_word_node then begin 13963 for j:=0 to native_length(q)-1 do 13964 append_char(get_native_char(q, j)); 13965 k:=k + width(q); 13966 end 13967 end else if type(q) = glue_node then begin 13968 append_char(" "); 13969 g:=glue_ptr(q); 13970 k:=k + width(g); 13971 if g_sign <> normal then begin 13972 if g_sign = stretching then begin 13973 if stretch_order(g) = g_order then begin 13974 k:=k + round(float(glue_set(this_box)) * stretch(g)) 13975 end 13976 end else begin 13977 if shrink_order(g) = g_order then begin 13978 k:=k - round(float(glue_set(this_box)) * shrink(g)) 13979 end 13980 end 13981 end 13982 end; 13983 {discretionary and deleted nodes can be discarded here} 13984 if q = p then break 13985 else q:=link(q); 13986 end; 13987done: 13988 q:=new_native_word_node(native_font(r), cur_length); 13989 link(prev_p):=q; 13990 for j:=0 to cur_length - 1 do 13991 set_native_char(q, j, str_pool[str_start_macro(str_ptr) + j]); 13992 link(q):=link(p); 13993 link(p):=null; 13994 flush_node_list(r); 13995 width(q):=k; 13996 set_justified_native_glyphs(q); 13997 p:=q; 13998 pool_ptr:=str_start_macro(str_ptr); {flush the temporary string data} 13999 end 14000 end; 14001 prev_p:=p; 14002 end; 14003 p:=link(p); 14004end 14005 14006@ @<Advance |q| past ignorable nodes@>= 14007while (q <> null) and (not is_char_node(q)) and (type(q) = disc_node) do 14008 q:=link(q) 14009 14010@ We ought to give special care to the efficiency of one part of |hlist_out|, 14011since it belongs to \TeX's inner loop. When a |char_node| is encountered, 14012we save a little time by processing several nodes in succession until 14013reaching a non-|char_node|. The program uses the fact that |set_char_0=0|. 14014@^inner loop@> 14015 14016@<Output node |p| for |hlist_out|...@>= 14017reswitch: if is_char_node(p) then 14018 begin synch_h; synch_v; 14019 repeat f:=font(p); c:=character(p); 14020 if (p<>lig_trick) and (font_mapping[f]<>nil) then c:=apply_tfm_font_mapping(font_mapping[f],c); 14021 if f<>dvi_f then @<Change font |dvi_f| to |f|@>; 14022 if c>=qi(128) then dvi_out(set1); 14023 dvi_out(qo(c));@/ 14024 cur_h:=cur_h+char_width(f)(char_info(f)(c)); 14025 prev_p:=link(prev_p); {N.B.: not |prev_p:=p|, |p| might be |lig_trick|} 14026 p:=link(p); 14027 until not is_char_node(p); 14028 dvi_h:=cur_h; 14029 end 14030else @<Output the non-|char_node| |p| for |hlist_out| 14031 and move to the next node@> 14032 14033@ @<Change font |dvi_f| to |f|@>= 14034begin if not font_used[f] then 14035 begin dvi_font_def(f); font_used[f]:=true; 14036 end; 14037if f<=64+font_base then dvi_out(f-font_base-1+fnt_num_0) 14038else begin dvi_out(fnt1); dvi_out(f-font_base-1); 14039 end; 14040dvi_f:=f; 14041end 14042 14043@ @<Output the non-|char_node| |p| for |hlist_out|...@>= 14044begin case type(p) of 14045hlist_node,vlist_node:@<Output a box in an hlist@>; 14046rule_node: begin rule_ht:=height(p); rule_dp:=depth(p); rule_wd:=width(p); 14047 goto fin_rule; 14048 end; 14049whatsit_node: @<Output the whatsit node |p| in an hlist@>; 14050glue_node: @<Move right or output leaders@>; 14051margin_kern_node: begin 14052 cur_h:=cur_h+width(p); 14053end; 14054kern_node:cur_h:=cur_h+width(p); 14055math_node: @<Handle a math node in |hlist_out|@>; 14056ligature_node: @<Make node |p| look like a |char_node| and |goto reswitch|@>; 14057@/@<Cases of |hlist_out| that arise in mixed direction text only@>@; 14058othercases do_nothing 14059endcases;@/ 14060goto next_p; 14061fin_rule: @<Output a rule in an hlist@>; 14062move_past: cur_h:=cur_h+rule_wd; 14063next_p:prev_p:=p; p:=link(p); 14064end 14065 14066@ @<Output a box in an hlist@>= 14067if list_ptr(p)=null then cur_h:=cur_h+width(p) 14068else begin save_h:=dvi_h; save_v:=dvi_v; 14069 cur_v:=base_line+shift_amount(p); {shift the box down} 14070 temp_ptr:=p; edge:=cur_h+width(p); 14071 if cur_dir=right_to_left then cur_h:=edge; 14072 if type(p)=vlist_node then vlist_out@+else hlist_out; 14073 dvi_h:=save_h; dvi_v:=save_v; 14074 cur_h:=edge; cur_v:=base_line; 14075 end 14076 14077@ @<Output a rule in an hlist@>= 14078if is_running(rule_ht) then rule_ht:=height(this_box); 14079if is_running(rule_dp) then rule_dp:=depth(this_box); 14080rule_ht:=rule_ht+rule_dp; {this is the rule thickness} 14081if (rule_ht>0)and(rule_wd>0) then {we don't output empty rules} 14082 begin synch_h; cur_v:=base_line+rule_dp; synch_v; 14083 dvi_out(set_rule); dvi_four(rule_ht); dvi_four(rule_wd); 14084 cur_v:=base_line; dvi_h:=dvi_h+rule_wd; 14085 end 14086 14087@ @d billion==float_constant(1000000000) 14088@d vet_glue(#)== glue_temp:=#; 14089 if glue_temp>billion then 14090 glue_temp:=billion 14091 else if glue_temp<-billion then 14092 glue_temp:=-billion 14093@# 14094@d round_glue==g:=glue_ptr(p); rule_wd:=width(g)-cur_g; 14095if g_sign<>normal then 14096 begin if g_sign=stretching then 14097 begin if stretch_order(g)=g_order then 14098 begin cur_glue:=cur_glue+stretch(g); 14099 vet_glue(float(glue_set(this_box))*cur_glue); 14100@^real multiplication@> 14101 cur_g:=round(glue_temp); 14102 end; 14103 end 14104 else if shrink_order(g)=g_order then 14105 begin cur_glue:=cur_glue-shrink(g); 14106 vet_glue(float(glue_set(this_box))*cur_glue); 14107 cur_g:=round(glue_temp); 14108 end; 14109 end; 14110rule_wd:=rule_wd+cur_g 14111 14112@<Move right or output leaders@>= 14113begin round_glue; 14114if eTeX_ex then @<Handle a glue node for mixed direction typesetting@>; 14115if subtype(p)>=a_leaders then 14116 @<Output leaders in an hlist, |goto fin_rule| if a rule 14117 or to |next_p| if done@>; 14118goto move_past; 14119end 14120 14121@ @<Output leaders in an hlist...@>= 14122begin leader_box:=leader_ptr(p); 14123if type(leader_box)=rule_node then 14124 begin rule_ht:=height(leader_box); rule_dp:=depth(leader_box); 14125 goto fin_rule; 14126 end; 14127leader_wd:=width(leader_box); 14128if (leader_wd>0)and(rule_wd>0) then 14129 begin rule_wd:=rule_wd+10; {compensate for floating-point rounding} 14130 if cur_dir=right_to_left then cur_h:=cur_h-10; 14131 edge:=cur_h+rule_wd; lx:=0; 14132 @<Let |cur_h| be the position of the first box, and set |leader_wd+lx| 14133 to the spacing between corresponding parts of boxes@>; 14134 while cur_h+leader_wd<=edge do 14135 @<Output a leader box at |cur_h|, 14136 then advance |cur_h| by |leader_wd+lx|@>; 14137 if cur_dir=right_to_left then cur_h:=edge 14138 else cur_h:=edge-10; 14139 goto next_p; 14140 end; 14141end 14142 14143@ The calculations related to leaders require a bit of care. First, in the 14144case of |a_leaders| (aligned leaders), we want to move |cur_h| to 14145|left_edge| plus the smallest multiple of |leader_wd| for which the result 14146is not less than the current value of |cur_h|; i.e., |cur_h| should become 14147$|left_edge|+|leader_wd|\times\lceil 14148(|cur_h|-|left_edge|)/|leader_wd|\rceil$. The program here should work in 14149all cases even though some implementations of \PASCAL\ give nonstandard 14150results for the |div| operation when |cur_h| is less than |left_edge|. 14151 14152In the case of |c_leaders| (centered leaders), we want to increase |cur_h| 14153by half of the excess space not occupied by the leaders; and in the 14154case of |x_leaders| (expanded leaders) we increase |cur_h| 14155by $1/(q+1)$ of this excess space, where $q$ is the number of times the 14156leader box will be replicated. Slight inaccuracies in the division might 14157accumulate; half of this rounding error is placed at each end of the leaders. 14158 14159@<Let |cur_h| be the position of the first box, ...@>= 14160if subtype(p)=a_leaders then 14161 begin save_h:=cur_h; 14162 cur_h:=left_edge+leader_wd*((cur_h-left_edge)@!div leader_wd); 14163 if cur_h<save_h then cur_h:=cur_h+leader_wd; 14164 end 14165else begin lq:=rule_wd div leader_wd; {the number of box copies} 14166 lr:=rule_wd mod leader_wd; {the remaining space} 14167 if subtype(p)=c_leaders then cur_h:=cur_h+(lr div 2) 14168 else begin lx:=lr div (lq+1); 14169 cur_h:=cur_h+((lr-(lq-1)*lx) div 2); 14170 end; 14171 end 14172 14173@ The `\\{synch}' operations here are intended to decrease the number of 14174bytes needed to specify horizontal and vertical motion in the \.{DVI} output. 14175 14176@<Output a leader box at |cur_h|, ...@>= 14177begin cur_v:=base_line+shift_amount(leader_box); synch_v; save_v:=dvi_v;@/ 14178synch_h; save_h:=dvi_h; temp_ptr:=leader_box; 14179if cur_dir=right_to_left then cur_h:=cur_h+leader_wd; 14180outer_doing_leaders:=doing_leaders; doing_leaders:=true; 14181if type(leader_box)=vlist_node then vlist_out@+else hlist_out; 14182doing_leaders:=outer_doing_leaders; 14183dvi_v:=save_v; dvi_h:=save_h; cur_v:=base_line; 14184cur_h:=save_h+leader_wd+lx; 14185end 14186 14187@ The |vlist_out| routine is similar to |hlist_out|, but a bit simpler. 14188 14189@p procedure vlist_out; {output a |vlist_node| box} 14190label move_past, fin_rule, next_p; 14191var left_edge: scaled; {the left coordinate for this box} 14192@!top_edge: scaled; {the top coordinate for this box} 14193@!save_h,@!save_v: scaled; {what |dvi_h| and |dvi_v| should pop to} 14194@!this_box: pointer; {pointer to containing box} 14195@!g_order: glue_ord; {applicable order of infinity for glue} 14196@!g_sign: normal..shrinking; {selects type of glue} 14197@!p:pointer; {current position in the vlist} 14198@!save_loc:integer; {\.{DVI} byte location upon entry} 14199@!leader_box:pointer; {the leader box being replicated} 14200@!leader_ht:scaled; {height of leader box being replicated} 14201@!lx:scaled; {extra space between leader boxes} 14202@!outer_doing_leaders:boolean; {were we doing leaders?} 14203@!edge:scaled; {bottom boundary of leader space} 14204@!glue_temp:real; {glue value before rounding} 14205@!cur_glue:real; {glue seen so far} 14206@!cur_g:scaled; {rounded equivalent of |cur_glue| times the glue ratio} 14207@!upwards:boolean; {whether we're stacking upwards} 14208begin cur_g:=0; cur_glue:=float_constant(0); 14209this_box:=temp_ptr; g_order:=glue_order(this_box); 14210g_sign:=glue_sign(this_box); p:=list_ptr(this_box); 14211upwards:=(subtype(this_box)=min_quarterword+1); 14212incr(cur_s); 14213if cur_s>0 then dvi_out(push); 14214if cur_s>max_push then max_push:=cur_s; 14215save_loc:=dvi_offset+dvi_ptr; left_edge:=cur_h; 14216if upwards then cur_v:=cur_v+depth(this_box) else cur_v:=cur_v-height(this_box); 14217top_edge:=cur_v; 14218while p<>null do @<Output node |p| for |vlist_out| and move to the next node, 14219 maintaining the condition |cur_h=left_edge|@>; 14220prune_movements(save_loc); 14221if cur_s>0 then dvi_pop(save_loc); 14222decr(cur_s); 14223end; 14224 14225@ @<Output node |p| for |vlist_out|...@>= 14226begin if is_char_node(p) then confusion("vlistout") 14227@:this can't happen vlistout}{\quad vlistout@> 14228else @<Output the non-|char_node| |p| for |vlist_out|@>; 14229next_p:p:=link(p); 14230end 14231 14232@ @<Output the non-|char_node| |p| for |vlist_out|@>= 14233begin case type(p) of 14234hlist_node,vlist_node:@<Output a box in a vlist@>; 14235rule_node: begin rule_ht:=height(p); rule_dp:=depth(p); rule_wd:=width(p); 14236 goto fin_rule; 14237 end; 14238whatsit_node: @<Output the whatsit node |p| in a vlist@>; 14239glue_node: @<Move down or output leaders@>; 14240kern_node:if upwards then cur_v:=cur_v-width(p) else cur_v:=cur_v+width(p); 14241othercases do_nothing 14242endcases;@/ 14243goto next_p; 14244fin_rule: @<Output a rule in a vlist, |goto next_p|@>; 14245move_past: if upwards then cur_v:=cur_v-rule_ht else cur_v:=cur_v+rule_ht; 14246end 14247 14248@ The |synch_v| here allows the \.{DVI} output to use one-byte commands 14249for adjusting |v| in most cases, since the baselineskip distance will 14250usually be constant. 14251 14252@<Output a box in a vlist@>= 14253if list_ptr(p)=null then cur_v:=cur_v+height(p)+depth(p) 14254else begin if upwards then cur_v:=cur_v-depth(p) else cur_v:=cur_v+height(p); synch_v; 14255 save_h:=dvi_h; save_v:=dvi_v; 14256 if cur_dir=right_to_left then cur_h:=left_edge-shift_amount(p) 14257 else cur_h:=left_edge+shift_amount(p); {shift the box right} 14258 temp_ptr:=p; 14259 if type(p)=vlist_node then vlist_out@+else hlist_out; 14260 dvi_h:=save_h; dvi_v:=save_v; 14261 if upwards then cur_v:=save_v-height(p) else cur_v:=save_v+depth(p); cur_h:=left_edge; 14262 end 14263 14264@ @<Output a rule in a vlist...@>= 14265if is_running(rule_wd) then rule_wd:=width(this_box); 14266rule_ht:=rule_ht+rule_dp; {this is the rule thickness} 14267if upwards then cur_v:=cur_v-rule_ht else cur_v:=cur_v+rule_ht; 14268if (rule_ht>0)and(rule_wd>0) then {we don't output empty rules} 14269 begin if cur_dir=right_to_left then cur_h:=cur_h-rule_wd; 14270 synch_h; synch_v; 14271 dvi_out(put_rule); dvi_four(rule_ht); dvi_four(rule_wd); 14272 cur_h:=left_edge; 14273 end; 14274goto next_p 14275 14276@ @<Move down or output leaders@>= 14277begin g:=glue_ptr(p); rule_ht:=width(g)-cur_g; 14278if g_sign<>normal then 14279 begin if g_sign=stretching then 14280 begin if stretch_order(g)=g_order then 14281 begin cur_glue:=cur_glue+stretch(g); 14282 vet_glue(float(glue_set(this_box))*cur_glue); 14283@^real multiplication@> 14284 cur_g:=round(glue_temp); 14285 end; 14286 end 14287 else if shrink_order(g)=g_order then 14288 begin cur_glue:=cur_glue-shrink(g); 14289 vet_glue(float(glue_set(this_box))*cur_glue); 14290 cur_g:=round(glue_temp); 14291 end; 14292 end; 14293rule_ht:=rule_ht+cur_g; 14294if subtype(p)>=a_leaders then 14295 @<Output leaders in a vlist, |goto fin_rule| if a rule 14296 or to |next_p| if done@>; 14297goto move_past; 14298end 14299 14300@ @<Output leaders in a vlist...@>= 14301begin leader_box:=leader_ptr(p); 14302if type(leader_box)=rule_node then 14303 begin rule_wd:=width(leader_box); rule_dp:=0; 14304 goto fin_rule; 14305 end; 14306leader_ht:=height(leader_box)+depth(leader_box); 14307if (leader_ht>0)and(rule_ht>0) then 14308 begin rule_ht:=rule_ht+10; {compensate for floating-point rounding} 14309 edge:=cur_v+rule_ht; lx:=0; 14310 @<Let |cur_v| be the position of the first box, and set |leader_ht+lx| 14311 to the spacing between corresponding parts of boxes@>; 14312 while cur_v+leader_ht<=edge do 14313 @<Output a leader box at |cur_v|, 14314 then advance |cur_v| by |leader_ht+lx|@>; 14315 cur_v:=edge-10; goto next_p; 14316 end; 14317end 14318 14319@ @<Let |cur_v| be the position of the first box, ...@>= 14320if subtype(p)=a_leaders then 14321 begin save_v:=cur_v; 14322 cur_v:=top_edge+leader_ht*((cur_v-top_edge)@!div leader_ht); 14323 if cur_v<save_v then cur_v:=cur_v+leader_ht; 14324 end 14325else begin lq:=rule_ht div leader_ht; {the number of box copies} 14326 lr:=rule_ht mod leader_ht; {the remaining space} 14327 if subtype(p)=c_leaders then cur_v:=cur_v+(lr div 2) 14328 else begin lx:=lr div (lq+1); 14329 cur_v:=cur_v+((lr-(lq-1)*lx) div 2); 14330 end; 14331 end 14332 14333@ When we reach this part of the program, |cur_v| indicates the top of a 14334leader box, not its baseline. 14335 14336@<Output a leader box at |cur_v|, ...@>= 14337begin if cur_dir=right_to_left then 14338 cur_h:=left_edge-shift_amount(leader_box) 14339 else cur_h:=left_edge+shift_amount(leader_box); 14340synch_h; save_h:=dvi_h;@/ 14341cur_v:=cur_v+height(leader_box); synch_v; save_v:=dvi_v; 14342temp_ptr:=leader_box; 14343outer_doing_leaders:=doing_leaders; doing_leaders:=true; 14344if type(leader_box)=vlist_node then vlist_out@+else hlist_out; 14345doing_leaders:=outer_doing_leaders; 14346dvi_v:=save_v; dvi_h:=save_h; cur_h:=left_edge; 14347cur_v:=save_v-height(leader_box)+leader_ht+lx; 14348end 14349 14350@ The |hlist_out| and |vlist_out| procedures are now complete, so we are 14351ready for the |ship_out| routine that gets them started in the first place. 14352 14353@p procedure ship_out(@!p:pointer); {output the box |p|} 14354label done; 14355var page_loc:integer; {location of the current |bop|} 14356@!j,@!k:0..9; {indices to first ten count registers} 14357@!s:pool_pointer; {index into |str_pool|} 14358@!old_setting:0..max_selector; {saved |selector| setting} 14359begin 14360if job_name=0 then open_log_file; 14361if tracing_output>0 then 14362 begin print_nl(""); print_ln; 14363 print("Completed box being shipped out"); 14364@.Completed box...@> 14365 end; 14366if term_offset>max_print_line-9 then print_ln 14367else if (term_offset>0)or(file_offset>0) then print_char(" "); 14368print_char("["); j:=9; 14369while (count(j)=0)and(j>0) do decr(j); 14370for k:=0 to j do 14371 begin print_int(count(k)); 14372 if k<j then print_char("."); 14373 end; 14374update_terminal; 14375if tracing_output>0 then 14376 begin print_char("]"); 14377 begin_diagnostic; show_box(p); end_diagnostic(true); 14378 end; 14379@<Ship box |p| out@>; 14380if eTeX_ex then @<Check for LR anomalies at the end of |ship_out|@>; 14381if tracing_output<=0 then print_char("]"); 14382dead_cycles:=0; 14383update_terminal; {progress report} 14384@<Flush the box from memory, showing statistics if requested@>; 14385end; 14386 14387@ @<Flush the box from memory, showing statistics if requested@>= 14388@!stat if tracing_stats>1 then 14389 begin print_nl("Memory usage before: "); 14390@.Memory usage...@> 14391 print_int(var_used); print_char("&"); 14392 print_int(dyn_used); print_char(";"); 14393 end; 14394tats@/ 14395flush_node_list(p); 14396@!stat if tracing_stats>1 then 14397 begin print(" after: "); 14398 print_int(var_used); print_char("&"); 14399 print_int(dyn_used); print("; still untouched: "); 14400 print_int(hi_mem_min-lo_mem_max-1); print_ln; 14401 end; 14402tats 14403 14404@ @<Ship box |p| out@>= 14405@<Update the values of |max_h| and |max_v|; but if the page is too large, 14406 |goto done|@>; 14407@<Initialize variables as |ship_out| begins@>; 14408page_loc:=dvi_offset+dvi_ptr; 14409dvi_out(bop); 14410for k:=0 to 9 do dvi_four(count(k)); 14411dvi_four(last_bop); last_bop:=page_loc; 14412{ generate a pagesize special at start of page } 14413old_setting:=selector; selector:=new_string; 14414print("pdf:pagesize "); 14415if (pdf_page_width > 0) and (pdf_page_height > 0) then begin 14416 print("width"); print(" "); 14417 print_scaled(pdf_page_width); 14418 print("pt"); print(" "); 14419 print("height"); print(" "); 14420 print_scaled(pdf_page_height); 14421 print("pt"); 14422end else 14423 print("default"); 14424selector:=old_setting; 14425dvi_out(xxx1); dvi_out(cur_length); 14426for s:=str_start_macro(str_ptr) to pool_ptr-1 do dvi_out(so(str_pool[s])); 14427pool_ptr:=str_start_macro(str_ptr); {erase the string} 14428cur_v:=height(p)+v_offset; { does this need changing for upwards mode ???? } 14429temp_ptr:=p; 14430if type(p)=vlist_node then vlist_out@+else hlist_out; 14431dvi_out(eop); incr(total_pages); cur_s:=-1; 14432if not no_pdf_output then fflush(dvi_file); 14433done: 14434 14435@ Sometimes the user will generate a huge page because other error messages 14436are being ignored. Such pages are not output to the \.{dvi} file, since they 14437may confuse the printing software. 14438 14439@<Update the values of |max_h| and |max_v|; but if the page is too large...@>= 14440if (height(p)>max_dimen)or@|(depth(p)>max_dimen)or@| 14441 (height(p)+depth(p)+v_offset>max_dimen)or@| 14442 (width(p)+h_offset>max_dimen) then 14443 begin print_err("Huge page cannot be shipped out"); 14444@.Huge page...@> 14445 help2("The page just created is more than 18 feet tall or")@/ 14446 ("more than 18 feet wide, so I suspect something went wrong."); 14447 error; 14448 if tracing_output<=0 then 14449 begin begin_diagnostic; 14450 print_nl("The following box has been deleted:"); 14451@.The following...deleted@> 14452 show_box(p); 14453 end_diagnostic(true); 14454 end; 14455 goto done; 14456 end; 14457if height(p)+depth(p)+v_offset>max_v then max_v:=height(p)+depth(p)+v_offset; 14458if width(p)+h_offset>max_h then max_h:=width(p)+h_offset 14459 14460@ At the end of the program, we must finish things off by writing the 14461post\-amble. If |total_pages=0|, the \.{DVI} file was never opened. 14462If |total_pages>=65536|, the \.{DVI} file will lie. And if 14463|max_push>=65536|, the user deserves whatever chaos might ensue. 14464 14465An integer variable |k| will be declared for use by this routine. 14466 14467@<Finish the \.{DVI} file@>= 14468while cur_s>-1 do 14469 begin if cur_s>0 then dvi_out(pop) 14470 else begin dvi_out(eop); incr(total_pages); 14471 end; 14472 decr(cur_s); 14473 end; 14474if total_pages=0 then print_nl("No pages of output.") 14475@.No pages of output@> 14476else begin dvi_out(post); {beginning of the postamble} 14477 dvi_four(last_bop); last_bop:=dvi_offset+dvi_ptr-5; {|post| location} 14478 dvi_four(25400000); dvi_four(473628672); {conversion ratio for sp} 14479 prepare_mag; dvi_four(mag); {magnification factor} 14480 dvi_four(max_v); dvi_four(max_h);@/ 14481 dvi_out(max_push div 256); dvi_out(max_push mod 256);@/ 14482 dvi_out((total_pages div 256) mod 256); dvi_out(total_pages mod 256);@/ 14483 @<Output the font definitions for all fonts that were used@>; 14484 dvi_out(post_post); dvi_four(last_bop); dvi_out(id_byte);@/ 14485 k:=4+((dvi_buf_size-dvi_ptr) mod 4); {the number of 223's} 14486 while k>0 do 14487 begin dvi_out(223); decr(k); 14488 end; 14489 @<Empty the last bytes out of |dvi_buf|@>; 14490 print_nl("Output written on "); slow_print(output_file_name); 14491@.Output written on x@> 14492 print(" ("); print_int(total_pages); print(" page"); 14493 if total_pages<>1 then print_char("s"); 14494 print(", "); print_int(dvi_offset+dvi_ptr); print(" bytes)."); 14495 b_close(dvi_file); 14496 end 14497 14498@ @<Output the font definitions...@>= 14499while font_ptr>font_base do 14500 begin if font_used[font_ptr] then dvi_font_def(font_ptr); 14501 decr(font_ptr); 14502 end 14503 14504@* \[33] Packaging. 14505We're essentially done with the parts of \TeX\ that are concerned with 14506the input (|get_next|) and the output (|ship_out|). So it's time to 14507get heavily into the remaining part, which does the real work of typesetting. 14508 14509After lists are constructed, \TeX\ wraps them up and puts them into boxes. 14510Two major subroutines are given the responsibility for this task: |hpack| 14511applies to horizontal lists (hlists) and |vpack| applies to vertical lists 14512(vlists). The main duty of |hpack| and |vpack| is to compute the dimensions 14513of the resulting boxes, and to adjust the glue if one of those dimensions 14514is pre-specified. The computed sizes normally enclose all of the material 14515inside the new box; but some items may stick out if negative glue is used, 14516if the box is overfull, or if a \.{\\vbox} includes other boxes that have 14517been shifted left. 14518 14519The subroutine call |hpack(p,w,m)| returns a pointer to an |hlist_node| 14520for a box containing the hlist that starts at |p|. Parameter |w| specifies 14521a width; and parameter |m| is either `|exactly|' or `|additional|'. Thus, 14522|hpack(p,w,exactly)| produces a box whose width is exactly |w|, while 14523|hpack(p,w,additional)| yields a box whose width is the natural width plus 14524|w|. It is convenient to define a macro called `|natural|' to cover the 14525most common case, so that we can say |hpack(p,natural)| to get a box that 14526has the natural width of list |p|. 14527 14528Similarly, |vpack(p,w,m)| returns a pointer to a |vlist_node| for a 14529box containing the vlist that starts at |p|. In this case |w| represents 14530a height instead of a width; the parameter |m| is interpreted as in |hpack|. 14531 14532@d exactly=0 {a box dimension is pre-specified} 14533@d additional=1 {a box dimension is increased from the natural one} 14534@d natural==0,additional {shorthand for parameters to |hpack| and |vpack|} 14535 14536@ The parameters to |hpack| and |vpack| correspond to \TeX's primitives 14537like `\.{\\hbox} \.{to} \.{300pt}', `\.{\\hbox} \.{spread} \.{10pt}'; note 14538that `\.{\\hbox}' with no dimension following it is equivalent to 14539`\.{\\hbox} \.{spread} \.{0pt}'. The |scan_spec| subroutine scans such 14540constructions in the user's input, including the mandatory left brace that 14541follows them, and it puts the specification onto |save_stack| so that the 14542desired box can later be obtained by executing the following code: 14543$$\vbox{\halign{#\hfil\cr 14544|save_ptr:=save_ptr-2;|\cr 14545|hpack(p,saved(1),saved(0)).|\cr}}$$ 14546Special care is necessary to ensure that the special |save_stack| codes 14547are placed just below the new group code, because scanning can change 14548|save_stack| when \.{\\csname} appears. 14549 14550@p procedure scan_spec(@!c:group_code;@!three_codes:boolean); 14551 {scans a box specification and left brace} 14552label found; 14553var @!s:integer; {temporarily saved value} 14554@!spec_code:exactly..additional; 14555begin if three_codes then s:=saved(0); 14556if scan_keyword("to") then spec_code:=exactly 14557@.to@> 14558else if scan_keyword("spread") then spec_code:=additional 14559@.spread@> 14560else begin spec_code:=additional; cur_val:=0; 14561 goto found; 14562 end; 14563scan_normal_dimen; 14564found: if three_codes then 14565 begin saved(0):=s; incr(save_ptr); 14566 end; 14567saved(0):=spec_code; saved(1):=cur_val; save_ptr:=save_ptr+2; 14568new_save_level(c); scan_left_brace; 14569end; 14570 14571@ To figure out the glue setting, |hpack| and |vpack| determine how much 14572stretchability and shrinkability are present, considering all four orders 14573of infinity. The highest order of infinity that has a nonzero coefficient 14574is then used as if no other orders were present. 14575 14576For example, suppose that the given list contains six glue nodes with 14577the respective stretchabilities 3pt, 8fill, 5fil, 6pt, $-3$fil, $-8$fill. 14578Then the total is essentially 2fil; and if a total additional space of 6pt 14579is to be achieved by stretching, the actual amounts of stretch will be 145800pt, 0pt, 15pt, 0pt, $-9$pt, and 0pt, since only `fil' glue will be 14581considered. (The `fill' glue is therefore not really stretching infinitely 14582with respect to `fil'; nobody would actually want that to happen.) 14583 14584The arrays |total_stretch| and |total_shrink| are used to determine how much 14585glue of each kind is present. A global variable |last_badness| is used 14586to implement \.{\\badness}. 14587 14588@<Glob...@>= 14589@!total_stretch, @!total_shrink: array[glue_ord] of scaled; 14590 {glue found by |hpack| or |vpack|} 14591@!last_badness:integer; {badness of the most recently packaged box} 14592 14593@ If the global variable |adjust_tail| is non-null, the |hpack| routine 14594also removes all occurrences of |ins_node|, |mark_node|, and |adjust_node| 14595items and appends the resulting material onto the list that ends at 14596location |adjust_tail|. 14597 14598@< Glob...@>= 14599@!adjust_tail:pointer; {tail of adjustment list} 14600 14601@ @<Set init...@>=adjust_tail:=null; last_badness:=0; 14602 14603@ Some stuff for character protrusion. 14604 14605@d left_pw(#) == char_pw(#, left_side) 14606@d right_pw(#) == char_pw(#, right_side) 14607 14608@p 14609function char_pw(p: pointer; side: small_number): scaled; 14610var f: internal_font_number; 14611 c: integer; 14612begin 14613 char_pw:=0; 14614 if side = left_side then 14615 last_leftmost_char:=null 14616 else 14617 last_rightmost_char:=null; 14618 if p = null then 14619 return; 14620 14621 { native word } 14622 if is_native_word_node(p) then begin 14623 if native_glyph_info_ptr(p) <> null_ptr then begin 14624 f:=native_font(p); 14625 char_pw:=round_xn_over_d(quad(f), get_native_word_cp(p, side), 1000); 14626 end; 14627 return; 14628 end; 14629 14630 { glyph node } 14631 if is_glyph_node(p) then begin 14632 f:=native_font(p); 14633 char_pw:=round_xn_over_d(quad(f), get_cp_code(f, native_glyph(p), side), 1000); 14634 return; 14635 end; 14636 14637 { char node or ligature; same like pdftex } 14638 if not is_char_node(p) then begin 14639 if type(p) = ligature_node then 14640 p:=lig_char(p) 14641 else 14642 return; 14643 end; 14644 f:=font(p); 14645 c:=get_cp_code(f, character(p), side); 14646 case side of 14647 left_side: 14648 last_leftmost_char:=p; 14649 right_side: 14650 last_rightmost_char:=p; 14651 endcases; 14652 if c = 0 then 14653 return; 14654 char_pw:=round_xn_over_d(quad(f), c, 1000); 14655end; 14656 14657function new_margin_kern(w: scaled; p: pointer; side: small_number): pointer; 14658var k: pointer; 14659begin 14660 k:=get_node(margin_kern_node_size); 14661 type(k):=margin_kern_node; 14662 subtype(k):=side; 14663 width(k):=w; 14664 new_margin_kern:=k; 14665end; 14666 14667@ Here now is |hpack|, which contains few if any surprises. 14668 14669@p function hpack(@!p:pointer;@!w:scaled;@!m:small_number):pointer; 14670label reswitch, common_ending, exit, restart; 14671var r:pointer; {the box node that will be returned} 14672@!q:pointer; {trails behind |p|} 14673@!h,@!d,@!x:scaled; {height, depth, and natural width} 14674@!s:scaled; {shift amount} 14675@!g:pointer; {points to a glue specification} 14676@!o:glue_ord; {order of infinity} 14677@!f:internal_font_number; {the font in a |char_node|} 14678@!i:four_quarters; {font information about a |char_node|} 14679@!hd:eight_bits; {height and depth indices for a character} 14680@!pp,@!ppp: pointer; 14681@!total_chars, @!k: integer; 14682begin last_badness:=0; r:=get_node(box_node_size); type(r):=hlist_node; 14683subtype(r):=min_quarterword; shift_amount(r):=0; 14684q:=r+list_offset; link(q):=p;@/ 14685h:=0; @<Clear dimensions to zero@>; 14686if TeXXeT_en then @<Initialize the LR stack@>; 14687while p<>null do @<Examine node |p| in the hlist, taking account of its effect 14688 on the dimensions of the new box, or moving it to the adjustment list; 14689 then advance |p| to the next node@>; 14690if adjust_tail<>null then link(adjust_tail):=null; 14691if pre_adjust_tail<>null then link(pre_adjust_tail):=null; 14692height(r):=h; depth(r):=d;@/ 14693@<Determine the value of |width(r)| and the appropriate glue setting; 14694 then |return| or |goto common_ending|@>; 14695common_ending: @<Finish issuing a diagnostic message 14696 for an overfull or underfull hbox@>; 14697exit: if TeXXeT_en then @<Check for LR anomalies at the end of |hpack|@>; 14698hpack:=r; 14699end; 14700 14701@ @<Clear dimensions to zero@>= 14702d:=0; x:=0; 14703total_stretch[normal]:=0; total_shrink[normal]:=0; 14704total_stretch[fil]:=0; total_shrink[fil]:=0; 14705total_stretch[fill]:=0; total_shrink[fill]:=0; 14706total_stretch[filll]:=0; total_shrink[filll]:=0 14707 14708@ @<Examine node |p| in the hlist, taking account of its effect...@>= 14709@^inner loop@> 14710begin reswitch: while is_char_node(p) do 14711 @<Incorporate character dimensions into the dimensions of 14712 the hbox that will contain~it, then move to the next node@>; 14713if p<>null then 14714 begin case type(p) of 14715 hlist_node,vlist_node,rule_node,unset_node: 14716 @<Incorporate box dimensions into the dimensions of 14717 the hbox that will contain~it@>; 14718 ins_node,mark_node,adjust_node: if (adjust_tail<>null) or (pre_adjust_tail<> null) then 14719 @<Transfer node |p| to the adjustment list@>; 14720 whatsit_node:@<Incorporate a whatsit node into an hbox@>; 14721 glue_node:@<Incorporate glue into the horizontal totals@>; 14722 kern_node: x:=x+width(p); 14723 margin_kern_node: x:=x+width(p); 14724 math_node: begin x:=x+width(p); 14725 if TeXXeT_en then @<Adjust \(t)the LR stack for the |hpack| routine@>; 14726 end; 14727 ligature_node: @<Make node |p| look like a |char_node| 14728 and |goto reswitch|@>; 14729 othercases do_nothing 14730 endcases;@/ 14731 p:=link(p); 14732 end; 14733end 14734 14735 14736@ @<Make node |p| look like a |char_node| and |goto reswitch|@>= 14737begin mem[lig_trick]:=mem[lig_char(p)]; link(lig_trick):=link(p); 14738p:=lig_trick; xtx_ligature_present:=true; goto reswitch; 14739end 14740 14741@ The code here implicitly uses the fact that running dimensions are 14742indicated by |null_flag|, which will be ignored in the calculations 14743because it is a highly negative number. 14744 14745@<Incorporate box dimensions into the dimensions of the hbox...@>= 14746begin x:=x+width(p); 14747if type(p)>=rule_node then s:=0 @+else s:=shift_amount(p); 14748if height(p)-s>h then h:=height(p)-s; 14749if depth(p)+s>d then d:=depth(p)+s; 14750end 14751 14752@ The following code is part of \TeX's inner loop; i.e., adding another 14753character of text to the user's input will cause each of these instructions 14754to be exercised one more time. 14755@^inner loop@> 14756 14757@<Incorporate character dimensions into the dimensions of the hbox...@>= 14758begin f:=font(p); i:=char_info(f)(character(p)); hd:=height_depth(i); 14759x:=x+char_width(f)(i);@/ 14760s:=char_height(f)(hd);@+if s>h then h:=s; 14761s:=char_depth(f)(hd);@+if s>d then d:=s; 14762p:=link(p); 14763end 14764 14765@ Although node |q| is not necessarily the immediate predecessor of node |p|, 14766it always points to some node in the list preceding |p|. Thus, we can delete 14767nodes by moving |q| when necessary. The algorithm takes linear time, and the 14768extra computation does not intrude on the inner loop unless it is necessary 14769to make a deletion. 14770@^inner loop@> 14771 14772@<Glob...@>= 14773@!pre_adjust_tail: pointer; 14774 14775@ @<Set init...@>= 14776pre_adjust_tail:=null; 14777 14778@ Materials in \.{\\vadjust} used with \.{pre} keyword will be appended to 14779|pre_adjust_tail| instead of |adjust_tail|. 14780 14781@d update_adjust_list(#) == begin 14782 if # = null then 14783 confusion("pre vadjust"); 14784 link(#):=adjust_ptr(p); 14785 while link(#) <> null do 14786 #:=link(#); 14787end 14788 14789@<Transfer node |p| to the adjustment list@>= 14790begin while link(q)<>p do q:=link(q); 14791 if type(p) = adjust_node then begin 14792 if adjust_pre(p) <> 0 then 14793 update_adjust_list(pre_adjust_tail) 14794 else 14795 update_adjust_list(adjust_tail); 14796 p:=link(p); free_node(link(q), small_node_size); 14797 end 14798else begin link(adjust_tail):=p; adjust_tail:=p; p:=link(p); 14799 end; 14800link(q):=p; p:=q; 14801end 14802 14803@ @<Incorporate glue into the horizontal totals@>= 14804begin g:=glue_ptr(p); x:=x+width(g);@/ 14805o:=stretch_order(g); total_stretch[o]:=total_stretch[o]+stretch(g); 14806o:=shrink_order(g); total_shrink[o]:=total_shrink[o]+shrink(g); 14807if subtype(p)>=a_leaders then 14808 begin g:=leader_ptr(p); 14809 if height(g)>h then h:=height(g); 14810 if depth(g)>d then d:=depth(g); 14811 end; 14812end 14813 14814@ When we get to the present part of the program, |x| is the natural width 14815of the box being packaged. 14816 14817@<Determine the value of |width(r)| and the appropriate glue setting...@>= 14818if m=additional then w:=x+w; 14819width(r):=w; x:=w-x; {now |x| is the excess to be made up} 14820if x=0 then 14821 begin glue_sign(r):=normal; glue_order(r):=normal; 14822 set_glue_ratio_zero(glue_set(r)); 14823 return; 14824 end 14825else if x>0 then @<Determine horizontal glue stretch setting, then |return| 14826 or \hbox{|goto common_ending|}@> 14827else @<Determine horizontal glue shrink setting, then |return| 14828 or \hbox{|goto common_ending|}@> 14829 14830@ @<Determine horizontal glue stretch setting...@>= 14831begin @<Determine the stretch order@>; 14832glue_order(r):=o; glue_sign(r):=stretching; 14833if total_stretch[o]<>0 then glue_set(r):=unfloat(x/total_stretch[o]) 14834@^real division@> 14835else begin glue_sign(r):=normal; 14836 set_glue_ratio_zero(glue_set(r)); {there's nothing to stretch} 14837 end; 14838if o=normal then if list_ptr(r)<>null then 14839 @<Report an underfull hbox and |goto common_ending|, if this box 14840 is sufficiently bad@>; 14841return; 14842end 14843 14844@ @<Determine the stretch order@>= 14845if total_stretch[filll]<>0 then o:=filll 14846else if total_stretch[fill]<>0 then o:=fill 14847else if total_stretch[fil]<>0 then o:=fil 14848else o:=normal 14849 14850@ @<Report an underfull hbox and |goto common_ending|, if...@>= 14851begin last_badness:=badness(x,total_stretch[normal]); 14852if last_badness>hbadness then 14853 begin print_ln; 14854 if last_badness>100 then print_nl("Underfull")@+else print_nl("Loose"); 14855 print(" \hbox (badness "); print_int(last_badness); 14856@.Underfull \\hbox...@> 14857@.Loose \\hbox...@> 14858 goto common_ending; 14859 end; 14860end 14861 14862@ In order to provide a decent indication of where an overfull or underfull 14863box originated, we use a global variable |pack_begin_line| that is 14864set nonzero only when |hpack| is being called by the paragraph builder 14865or the alignment finishing routine. 14866 14867@<Glob...@>= 14868@!pack_begin_line:integer; {source file line where the current paragraph 14869 or alignment began; a negative value denotes alignment} 14870 14871@ @<Set init...@>= 14872pack_begin_line:=0; 14873 14874@ @<Finish issuing a diagnostic message for an overfull or underfull hbox@>= 14875if output_active then print(") has occurred while \output is active") 14876else begin if pack_begin_line<>0 then 14877 begin if pack_begin_line>0 then print(") in paragraph at lines ") 14878 else print(") in alignment at lines "); 14879 print_int(abs(pack_begin_line)); 14880 print("--"); 14881 end 14882 else print(") detected at line "); 14883 print_int(line); 14884 end; 14885print_ln;@/ 14886font_in_short_display:=null_font; short_display(list_ptr(r)); print_ln;@/ 14887begin_diagnostic; show_box(r); end_diagnostic(true) 14888 14889@ @<Determine horizontal glue shrink setting...@>= 14890begin @<Determine the shrink order@>; 14891glue_order(r):=o; glue_sign(r):=shrinking; 14892if total_shrink[o]<>0 then glue_set(r):=unfloat((-x)/total_shrink[o]) 14893@^real division@> 14894else begin glue_sign(r):=normal; 14895 set_glue_ratio_zero(glue_set(r)); {there's nothing to shrink} 14896 end; 14897if (total_shrink[o]<-x)and(o=normal)and(list_ptr(r)<>null) then 14898 begin last_badness:=1000000; 14899 set_glue_ratio_one(glue_set(r)); {use the maximum shrinkage} 14900 @<Report an overfull hbox and |goto common_ending|, if this box 14901 is sufficiently bad@>; 14902 end 14903else if o=normal then if list_ptr(r)<>null then 14904 @<Report a tight hbox and |goto common_ending|, if this box 14905 is sufficiently bad@>; 14906return; 14907end 14908 14909@ @<Determine the shrink order@>= 14910if total_shrink[filll]<>0 then o:=filll 14911else if total_shrink[fill]<>0 then o:=fill 14912else if total_shrink[fil]<>0 then o:=fil 14913else o:=normal 14914 14915@ @<Report an overfull hbox and |goto common_ending|, if...@>= 14916if (-x-total_shrink[normal]>hfuzz)or(hbadness<100) then 14917 begin if (overfull_rule>0)and(-x-total_shrink[normal]>hfuzz) then 14918 begin while link(q)<>null do q:=link(q); 14919 link(q):=new_rule; 14920 width(link(q)):=overfull_rule; 14921 end; 14922 print_ln; print_nl("Overfull \hbox ("); 14923@.Overfull \\hbox...@> 14924 print_scaled(-x-total_shrink[normal]); print("pt too wide"); 14925 goto common_ending; 14926 end 14927 14928@ @<Report a tight hbox and |goto common_ending|, if...@>= 14929begin last_badness:=badness(-x,total_shrink[normal]); 14930if last_badness>hbadness then 14931 begin print_ln; print_nl("Tight \hbox (badness "); print_int(last_badness); 14932@.Tight \\hbox...@> 14933 goto common_ending; 14934 end; 14935end 14936 14937@ The |vpack| subroutine is actually a special case of a slightly more 14938general routine called |vpackage|, which has four parameters. The fourth 14939parameter, which is |max_dimen| in the case of |vpack|, specifies the 14940maximum depth of the page box that is constructed. The depth is first 14941computed by the normal rules; if it exceeds this limit, the reference 14942point is simply moved down until the limiting depth is attained. 14943 14944@d vpack(#)==vpackage(#,max_dimen) {special case of unconstrained depth} 14945 14946@p function vpackage(@!p:pointer;@!h:scaled;@!m:small_number;@!l:scaled): 14947 pointer; 14948label common_ending, exit; 14949var r:pointer; {the box node that will be returned} 14950@!w,@!d,@!x:scaled; {width, depth, and natural height} 14951@!s:scaled; {shift amount} 14952@!g:pointer; {points to a glue specification} 14953@!o:glue_ord; {order of infinity} 14954begin last_badness:=0; r:=get_node(box_node_size); type(r):=vlist_node; 14955if XeTeX_upwards then subtype(r):=min_quarterword+1 else subtype(r):=min_quarterword; 14956shift_amount(r):=0; 14957list_ptr(r):=p;@/ 14958w:=0; @<Clear dimensions to zero@>; 14959while p<>null do @<Examine node |p| in the vlist, taking account of its effect 14960 on the dimensions of the new box; then advance |p| to the next node@>; 14961width(r):=w; 14962if d>l then 14963 begin x:=x+d-l; depth(r):=l; 14964 end 14965else depth(r):=d; 14966@<Determine the value of |height(r)| and the appropriate glue setting; 14967 then |return| or |goto common_ending|@>; 14968common_ending: @<Finish issuing a diagnostic message 14969 for an overfull or underfull vbox@>; 14970exit: vpackage:=r; 14971end; 14972 14973@ @<Examine node |p| in the vlist, taking account of its effect...@>= 14974begin if is_char_node(p) then confusion("vpack") 14975@:this can't happen vpack}{\quad vpack@> 14976else case type(p) of 14977 hlist_node,vlist_node,rule_node,unset_node: 14978 @<Incorporate box dimensions into the dimensions of 14979 the vbox that will contain~it@>; 14980 whatsit_node:@<Incorporate a whatsit node into a vbox@>; 14981 glue_node: @<Incorporate glue into the vertical totals@>; 14982 kern_node: begin x:=x+d+width(p); d:=0; 14983 end; 14984 othercases do_nothing 14985 endcases; 14986p:=link(p); 14987end 14988 14989@ @<Incorporate box dimensions into the dimensions of the vbox...@>= 14990begin x:=x+d+height(p); d:=depth(p); 14991if type(p)>=rule_node then s:=0 @+else s:=shift_amount(p); 14992if width(p)+s>w then w:=width(p)+s; 14993end 14994 14995@ @<Incorporate glue into the vertical totals@>= 14996begin x:=x+d; d:=0;@/ 14997g:=glue_ptr(p); x:=x+width(g);@/ 14998o:=stretch_order(g); total_stretch[o]:=total_stretch[o]+stretch(g); 14999o:=shrink_order(g); total_shrink[o]:=total_shrink[o]+shrink(g); 15000if subtype(p)>=a_leaders then 15001 begin g:=leader_ptr(p); 15002 if width(g)>w then w:=width(g); 15003 end; 15004end 15005 15006@ When we get to the present part of the program, |x| is the natural height 15007of the box being packaged. 15008 15009@<Determine the value of |height(r)| and the appropriate glue setting...@>= 15010if m=additional then h:=x+h; 15011height(r):=h; x:=h-x; {now |x| is the excess to be made up} 15012if x=0 then 15013 begin glue_sign(r):=normal; glue_order(r):=normal; 15014 set_glue_ratio_zero(glue_set(r)); 15015 return; 15016 end 15017else if x>0 then @<Determine vertical glue stretch setting, then |return| 15018 or \hbox{|goto common_ending|}@> 15019else @<Determine vertical glue shrink setting, then |return| 15020 or \hbox{|goto common_ending|}@> 15021 15022@ @<Determine vertical glue stretch setting...@>= 15023begin @<Determine the stretch order@>; 15024glue_order(r):=o; glue_sign(r):=stretching; 15025if total_stretch[o]<>0 then glue_set(r):=unfloat(x/total_stretch[o]) 15026@^real division@> 15027else begin glue_sign(r):=normal; 15028 set_glue_ratio_zero(glue_set(r)); {there's nothing to stretch} 15029 end; 15030if o=normal then if list_ptr(r)<>null then 15031 @<Report an underfull vbox and |goto common_ending|, if this box 15032 is sufficiently bad@>; 15033return; 15034end 15035 15036@ @<Report an underfull vbox and |goto common_ending|, if...@>= 15037begin last_badness:=badness(x,total_stretch[normal]); 15038if last_badness>vbadness then 15039 begin print_ln; 15040 if last_badness>100 then print_nl("Underfull")@+else print_nl("Loose"); 15041 print(" \vbox (badness "); print_int(last_badness); 15042@.Underfull \\vbox...@> 15043@.Loose \\vbox...@> 15044 goto common_ending; 15045 end; 15046end 15047 15048@ @<Finish issuing a diagnostic message for an overfull or underfull vbox@>= 15049if output_active then print(") has occurred while \output is active") 15050else begin if pack_begin_line<>0 then {it's actually negative} 15051 begin print(") in alignment at lines "); 15052 print_int(abs(pack_begin_line)); 15053 print("--"); 15054 end 15055 else print(") detected at line "); 15056 print_int(line); 15057 print_ln;@/ 15058 end; 15059begin_diagnostic; show_box(r); end_diagnostic(true) 15060 15061@ @<Determine vertical glue shrink setting...@>= 15062begin @<Determine the shrink order@>; 15063glue_order(r):=o; glue_sign(r):=shrinking; 15064if total_shrink[o]<>0 then glue_set(r):=unfloat((-x)/total_shrink[o]) 15065@^real division@> 15066else begin glue_sign(r):=normal; 15067 set_glue_ratio_zero(glue_set(r)); {there's nothing to shrink} 15068 end; 15069if (total_shrink[o]<-x)and(o=normal)and(list_ptr(r)<>null) then 15070 begin last_badness:=1000000; 15071 set_glue_ratio_one(glue_set(r)); {use the maximum shrinkage} 15072 @<Report an overfull vbox and |goto common_ending|, if this box 15073 is sufficiently bad@>; 15074 end 15075else if o=normal then if list_ptr(r)<>null then 15076 @<Report a tight vbox and |goto common_ending|, if this box 15077 is sufficiently bad@>; 15078return; 15079end 15080 15081@ @<Report an overfull vbox and |goto common_ending|, if...@>= 15082if (-x-total_shrink[normal]>vfuzz)or(vbadness<100) then 15083 begin print_ln; print_nl("Overfull \vbox ("); 15084@.Overfull \\vbox...@> 15085 print_scaled(-x-total_shrink[normal]); print("pt too high"); 15086 goto common_ending; 15087 end 15088 15089@ @<Report a tight vbox and |goto common_ending|, if...@>= 15090begin last_badness:=badness(-x,total_shrink[normal]); 15091if last_badness>vbadness then 15092 begin print_ln; print_nl("Tight \vbox (badness "); print_int(last_badness); 15093@.Tight \\vbox...@> 15094 goto common_ending; 15095 end; 15096end 15097 15098@ When a box is being appended to the current vertical list, the 15099baselineskip calculation is handled by the |append_to_vlist| routine. 15100 15101@p procedure append_to_vlist(@!b:pointer); 15102var d:scaled; {deficiency of space between baselines} 15103@!p:pointer; {a new glue node} 15104@!upwards:boolean; 15105begin upwards:=XeTeX_upwards; 15106 if prev_depth>ignore_depth then 15107 begin if upwards then d:=width(baseline_skip)-prev_depth-depth(b) 15108 else d:=width(baseline_skip)-prev_depth-height(b); 15109 if d<line_skip_limit then p:=new_param_glue(line_skip_code) 15110 else begin p:=new_skip_param(baseline_skip_code); 15111 width(temp_ptr):=d; {|temp_ptr=glue_ptr(p)|} 15112 end; 15113 link(tail):=p; tail:=p; 15114 end; 15115link(tail):=b; tail:=b; if upwards then prev_depth:=height(b) else prev_depth:=depth(b); 15116end; 15117 15118@* \[34] Data structures for math mode. 15119When \TeX\ reads a formula that is enclosed between \.\$'s, it constructs an 15120{\sl mlist}, which is essentially a tree structure representing that 15121formula. An mlist is a linear sequence of items, but we can regard it as 15122a tree structure because mlists can appear within mlists. For example, many 15123of the entries can be subscripted or superscripted, and such ``scripts'' 15124are mlists in their own right. 15125 15126An entire formula is parsed into such a tree before any of the actual 15127typesetting is done, because the current style of type is usually not 15128known until the formula has been fully scanned. For example, when the 15129formula `\.{\$a+b \\over c+d\$}' is being read, there is no way to tell 15130that `\.{a+b}' will be in script size until `\.{\\over}' has appeared. 15131 15132During the scanning process, each element of the mlist being built is 15133classified as a relation, a binary operator, an open parenthesis, etc., 15134or as a construct like `\.{\\sqrt}' that must be built up. This classification 15135appears in the mlist data structure. 15136 15137After a formula has been fully scanned, the mlist is converted to an hlist 15138so that it can be incorporated into the surrounding text. This conversion is 15139controlled by a recursive procedure that decides all of the appropriate 15140styles by a ``top-down'' process starting at the outermost level and working 15141in towards the subformulas. The formula is ultimately pasted together using 15142combinations of horizontal and vertical boxes, with glue and penalty nodes 15143inserted as necessary. 15144 15145An mlist is represented internally as a linked list consisting chiefly 15146of ``noads'' (pronounced ``no-adds''), to distinguish them from the somewhat 15147similar ``nodes'' in hlists and vlists. Certain kinds of ordinary nodes are 15148allowed to appear in mlists together with the noads; \TeX\ tells the difference 15149by means of the |type| field, since a noad's |type| is always greater than 15150that of a node. An mlist does not contain character nodes, hlist nodes, vlist 15151nodes, math nodes, ligature nodes, 15152or unset nodes; in particular, each mlist item appears in the 15153variable-size part of |mem|, so the |type| field is always present. 15154 15155@ Each noad is four or more words long. The first word contains the |type| 15156and |subtype| and |link| fields that are already so familiar to us; the 15157second, third, and fourth words are called the noad's |nucleus|, |subscr|, 15158and |supscr| fields. 15159 15160Consider, for example, the simple formula `\.{\$x\^2\$}', which would be 15161parsed into an mlist containing a single element called an |ord_noad|. 15162The |nucleus| of this noad is a representation of `\.x', the |subscr| is 15163empty, and the |supscr| is a representation of `\.2'. 15164 15165The |nucleus|, |subscr|, and |supscr| fields are further broken into 15166subfields. If |p| points to a noad, and if |q| is one of its principal 15167fields (e.g., |q=subscr(p)|), there are several possibilities for the 15168subfields, depending on the |math_type| of |q|. 15169 15170\yskip\hang|math_type(q)=math_char| means that |fam(q)| refers to one of 15171the sixteen font families, and |character(q)| is the number of a character 15172within a font of that family, as in a character node. 15173 15174\yskip\hang|math_type(q)=math_text_char| is similar, but the character is 15175unsubscripted and unsuperscripted and it is followed immediately by another 15176character from the same font. (This |math_type| setting appears only 15177briefly during the processing; it is used to suppress unwanted italic 15178corrections.) 15179 15180\yskip\hang|math_type(q)=empty| indicates a field with no value (the 15181corresponding attribute of noad |p| is not present). 15182 15183\yskip\hang|math_type(q)=sub_box| means that |info(q)| points to a box 15184node (either an |hlist_node| or a |vlist_node|) that should be used as the 15185value of the field. The |shift_amount| in the subsidiary box node is the 15186amount by which that box will be shifted downward. 15187 15188\yskip\hang|math_type(q)=sub_mlist| means that |info(q)| points to 15189an mlist; the mlist must be converted to an hlist in order to obtain 15190the value of this field. 15191 15192\yskip\noindent In the latter case, we might have |info(q)=null|. This 15193is not the same as |math_type(q)=empty|; for example, `\.{\$P\_\{\}\$}' 15194and `\.{\$P\$}' produce different results (the former will not have the 15195``italic correction'' added to the width of |P|, but the ``script skip'' 15196will be added). 15197 15198The definitions of subfields given here are evidently wasteful of space, 15199since a halfword is being used for the |math_type| although only three 15200bits would be needed. However, there are hardly ever many noads present at 15201once, since they are soon converted to nodes that take up even more space, 15202so we can afford to represent them in whatever way simplifies the 15203programming. 15204 15205@d noad_size=4 {number of words in a normal noad} 15206@d nucleus(#)==#+1 {the |nucleus| field of a noad} 15207@d supscr(#)==#+2 {the |supscr| field of a noad} 15208@d subscr(#)==#+3 {the |subscr| field of a noad} 15209@d math_type==link {a |halfword| in |mem|} 15210@d plane_and_fam_field==font {a |quarterword| in |mem|} 15211@d fam(#) == (plane_and_fam_field(#) mod @"100) 15212@d math_char=1 {|math_type| when the attribute is simple} 15213@d sub_box=2 {|math_type| when the attribute is a box} 15214@d sub_mlist=3 {|math_type| when the attribute is a formula} 15215@d math_text_char=4 {|math_type| when italic correction is dubious} 15216 15217@ Each portion of a formula is classified as Ord, Op, Bin, Rel, Ope, 15218Clo, Pun, or Inn, for purposes of spacing and line breaking. An 15219|ord_noad|, |op_noad|, |bin_noad|, |rel_noad|, |open_noad|, |close_noad|, 15220|punct_noad|, or |inner_noad| is used to represent portions of the various 15221types. For example, an `\.=' sign in a formula leads to the creation of a 15222|rel_noad| whose |nucleus| field is a representation of an equals sign 15223(usually |fam=0|, |character=@'75|). A formula preceded by \.{\\mathrel} 15224also results in a |rel_noad|. When a |rel_noad| is followed by an 15225|op_noad|, say, and possibly separated by one or more ordinary nodes (not 15226noads), \TeX\ will insert a penalty node (with the current |rel_penalty|) 15227just after the formula that corresponds to the |rel_noad|, unless there 15228already was a penalty immediately following; and a ``thick space'' will be 15229inserted just before the formula that corresponds to the |op_noad|. 15230 15231A noad of type |ord_noad|, |op_noad|, \dots, |inner_noad| usually 15232has a |subtype=normal|. The only exception is that an |op_noad| might 15233have |subtype=limits| or |no_limits|, if the normal positioning of 15234limits has been overridden for this operator. 15235 15236@d ord_noad=unset_node+3 {|type| of a noad classified Ord} 15237@d op_noad=ord_noad+1 {|type| of a noad classified Op} 15238@d bin_noad=ord_noad+2 {|type| of a noad classified Bin} 15239@d rel_noad=ord_noad+3 {|type| of a noad classified Rel} 15240@d open_noad=ord_noad+4 {|type| of a noad classified Ope} 15241@d close_noad=ord_noad+5 {|type| of a noad classified Clo} 15242@d punct_noad=ord_noad+6 {|type| of a noad classified Pun} 15243@d inner_noad=ord_noad+7 {|type| of a noad classified Inn} 15244@d limits=1 {|subtype| of |op_noad| whose scripts are to be above, below} 15245@d no_limits=2 {|subtype| of |op_noad| whose scripts are to be normal} 15246 15247@ A |radical_noad| is five words long; the fifth word is the |left_delimiter| 15248field, which usually represents a square root sign. 15249 15250A |fraction_noad| is six words long; it has a |right_delimiter| field 15251as well as a |left_delimiter|. 15252 15253Delimiter fields are of type |four_quarters|, and they have four subfields 15254called |small_fam|, |small_char|, |large_fam|, |large_char|. These subfields 15255represent variable-size delimiters by giving the ``small'' and ``large'' 15256starting characters, as explained in Chapter~17 of {\sl The \TeX book}. 15257@:TeXbook}{\sl The \TeX book@> 15258 15259A |fraction_noad| is actually quite different from all other noads. Not 15260only does it have six words, it has |thickness|, |denominator|, and 15261|numerator| fields instead of |nucleus|, |subscr|, and |supscr|. The 15262|thickness| is a scaled value that tells how thick to make a fraction 15263rule; however, the special value |default_code| is used to stand for the 15264|default_rule_thickness| of the current size. The |numerator| and 15265|denominator| point to mlists that define a fraction; we always have 15266$$\hbox{|math_type(numerator)=math_type(denominator)=sub_mlist|}.$$ The 15267|left_delimiter| and |right_delimiter| fields specify delimiters that will 15268be placed at the left and right of the fraction. In this way, a 15269|fraction_noad| is able to represent all of \TeX's operators \.{\\over}, 15270\.{\\atop}, \.{\\above}, \.{\\overwithdelims}, \.{\\atopwithdelims}, and 15271 \.{\\abovewithdelims}. 15272 15273@d left_delimiter(#)==#+4 {first delimiter field of a noad} 15274@d right_delimiter(#)==#+5 {second delimiter field of a fraction noad} 15275@d radical_noad=inner_noad+1 {|type| of a noad for square roots} 15276@d radical_noad_size=5 {number of |mem| words in a radical noad} 15277@d fraction_noad=radical_noad+1 {|type| of a noad for generalized fractions} 15278@d fraction_noad_size=6 {number of |mem| words in a fraction noad} 15279@d small_fam(#)==(mem[#].qqqq.b0 mod @"100) {|fam| for ``small'' delimiter} 15280@d small_char(#)==(mem[#].qqqq.b1 + (mem[#].qqqq.b0 div @"100) * @"10000) {|character| for ``small'' delimiter} 15281@d large_fam(#)==(mem[#].qqqq.b2 mod @"100) {|fam| for ``large'' delimiter} 15282@d large_char(#)==(mem[#].qqqq.b3 + (mem[#].qqqq.b2 div @"100) * @"10000) {|character| for ``large'' delimiter} 15283@d small_plane_and_fam_field(#)==mem[#].qqqq.b0 15284@d small_char_field(#)==mem[#].qqqq.b1 15285@d large_plane_and_fam_field(#)==mem[#].qqqq.b2 15286@d large_char_field(#)==mem[#].qqqq.b3 15287@d thickness==width {|thickness| field in a fraction noad} 15288@d default_code==@'10000000000 {denotes |default_rule_thickness|} 15289@d numerator==supscr {|numerator| field in a fraction noad} 15290@d denominator==subscr {|denominator| field in a fraction noad} 15291 15292@ The global variable |empty_field| is set up for initialization of empty 15293fields in new noads. Similarly, |null_delimiter| is for the initialization 15294of delimiter fields. 15295 15296@<Glob...@>= 15297@!empty_field:two_halves; 15298@!null_delimiter:four_quarters; 15299 15300@ @<Set init...@>= 15301empty_field.rh:=empty; empty_field.lh:=null;@/ 15302null_delimiter.b0:=0; null_delimiter.b1:=min_quarterword;@/ 15303null_delimiter.b2:=0; null_delimiter.b3:=min_quarterword; 15304 15305@ The |new_noad| function creates an |ord_noad| that is completely null. 15306 15307@p function new_noad:pointer; 15308var p:pointer; 15309begin p:=get_node(noad_size); 15310type(p):=ord_noad; subtype(p):=normal; 15311mem[nucleus(p)].hh:=empty_field; 15312mem[subscr(p)].hh:=empty_field; 15313mem[supscr(p)].hh:=empty_field; 15314new_noad:=p; 15315end; 15316 15317@ A few more kinds of noads will complete the set: An |under_noad| has its 15318nucleus underlined; an |over_noad| has it overlined. An |accent_noad| places 15319an accent over its nucleus; the accent character appears as 15320|fam(accent_chr(p))| and |character(accent_chr(p))|. A |vcenter_noad| 15321centers its nucleus vertically with respect to the axis of the formula; 15322in such noads we always have |math_type(nucleus(p))=sub_box|. 15323 15324And finally, we have |left_noad| and |right_noad| types, to implement 15325\TeX's \.{\\left} and \.{\\right} as well as \eTeX's \.{\\middle}. 15326The |nucleus| of such noads is 15327replaced by a |delimiter| field; thus, for example, `\.{\\left(}' produces 15328a |left_noad| such that |delimiter(p)| holds the family and character 15329codes for all left parentheses. A |left_noad| never appears in an mlist 15330except as the first element, and a |right_noad| never appears in an mlist 15331except as the last element; furthermore, we either have both a |left_noad| 15332and a |right_noad|, or neither one is present. The |subscr| and |supscr| 15333fields are always |empty| in a |left_noad| and a |right_noad|. 15334 15335@d under_noad=fraction_noad+1 {|type| of a noad for underlining} 15336@d over_noad=under_noad+1 {|type| of a noad for overlining} 15337@d accent_noad=over_noad+1 {|type| of a noad for accented subformulas} 15338@d fixed_acc=1 {|subtype| for non growing math accents} 15339@d bottom_acc=2 {|subtype| for bottom math accents} 15340@d is_bottom_acc(#)==((subtype(#)=bottom_acc) or (subtype(#)=bottom_acc+fixed_acc)) 15341@d accent_noad_size=5 {number of |mem| words in an accent noad} 15342@d accent_chr(#)==#+4 {the |accent_chr| field of an accent noad} 15343@d vcenter_noad=accent_noad+1 {|type| of a noad for \.{\\vcenter}} 15344@d left_noad=vcenter_noad+1 {|type| of a noad for \.{\\left}} 15345@d right_noad=left_noad+1 {|type| of a noad for \.{\\right}} 15346@d delimiter==nucleus {|delimiter| field in left and right noads} 15347@d middle_noad==1 {|subtype| of right noad representing \.{\\middle}} 15348@d scripts_allowed(#)==(type(#)>=ord_noad)and(type(#)<left_noad) 15349 15350@ Math formulas can also contain instructions like \.{\\textstyle} that 15351override \TeX's normal style rules. A |style_node| is inserted into the 15352data structure to record such instructions; it is three words long, so it 15353is considered a node instead of a noad. The |subtype| is either |display_style| 15354or |text_style| or |script_style| or |script_script_style|. The 15355second and third words of a |style_node| are not used, but they are 15356present because a |choice_node| is converted to a |style_node|. 15357 15358\TeX\ uses even numbers 0, 2, 4, 6 to encode the basic styles 15359|display_style|, \dots, |script_script_style|, and adds~1 to get the 15360``cramped'' versions of these styles. This gives a numerical order that 15361is backwards from the convention of Appendix~G in {\sl The \TeX book\/}; 15362i.e., a smaller style has a larger numerical value. 15363@:TeXbook}{\sl The \TeX book@> 15364 15365@d style_node=unset_node+1 {|type| of a style node} 15366@d style_node_size=3 {number of words in a style node} 15367@d display_style=0 {|subtype| for \.{\\displaystyle}} 15368@d text_style=2 {|subtype| for \.{\\textstyle}} 15369@d script_style=4 {|subtype| for \.{\\scriptstyle}} 15370@d script_script_style=6 {|subtype| for \.{\\scriptscriptstyle}} 15371@d cramped=1 {add this to an uncramped style if you want to cramp it} 15372 15373@p function new_style(@!s:small_number):pointer; {create a style node} 15374var p:pointer; {the new node} 15375begin p:=get_node(style_node_size); type(p):=style_node; 15376subtype(p):=s; width(p):=0; depth(p):=0; {the |width| and |depth| are not used} 15377new_style:=p; 15378end; 15379 15380@ Finally, the \.{\\mathchoice} primitive creates a |choice_node|, which 15381has special subfields |display_mlist|, |text_mlist|, |script_mlist|, 15382and |script_script_mlist| pointing to the mlists for each style. 15383 15384@d choice_node=unset_node+2 {|type| of a choice node} 15385@d display_mlist(#)==info(#+1) {mlist to be used in display style} 15386@d text_mlist(#)==link(#+1) {mlist to be used in text style} 15387@d script_mlist(#)==info(#+2) {mlist to be used in script style} 15388@d script_script_mlist(#)==link(#+2) {mlist to be used in scriptscript style} 15389 15390@p function new_choice:pointer; {create a choice node} 15391var p:pointer; {the new node} 15392begin p:=get_node(style_node_size); type(p):=choice_node; 15393subtype(p):=0; {the |subtype| is not used} 15394display_mlist(p):=null; text_mlist(p):=null; script_mlist(p):=null; 15395script_script_mlist(p):=null; 15396new_choice:=p; 15397end; 15398 15399@ Let's consider now the previously unwritten part of |show_node_list| 15400that displays the things that can only be present in mlists; this 15401program illustrates how to access the data structures just defined. 15402 15403In the context of the following program, |p| points to a node or noad that 15404should be displayed, and the current string contains the ``recursion history'' 15405that leads to this point. The recursion history consists of a dot for each 15406outer level in which |p| is subsidiary to some node, or in which |p| is 15407subsidiary to the |nucleus| field of some noad; the dot is replaced by 15408`\.\_' or `\.\^' or `\./' or `\.\\' if |p| is descended from the |subscr| 15409or |supscr| or |denominator| or |numerator| fields of noads. For example, 15410the current string would be `\.{.\^.\_/}' if |p| points to the |ord_noad| for 15411|x| in the (ridiculous) formula 15412`\.{\$\\sqrt\{a\^\{\\mathinner\{b\_\{c\\over x+y\}\}\}\}\$}'. 15413 15414@<Cases of |show_node_list| that arise...@>= 15415style_node:print_style(subtype(p)); 15416choice_node:@<Display choice node |p|@>; 15417ord_noad,op_noad,bin_noad,rel_noad,open_noad,close_noad,punct_noad,inner_noad, 15418 radical_noad,over_noad,under_noad,vcenter_noad,accent_noad, 15419 left_noad,right_noad:@<Display normal noad |p|@>; 15420fraction_noad:@<Display fraction noad |p|@>; 15421 15422@ Here are some simple routines used in the display of noads. 15423 15424@<Declare procedures needed for displaying the elements of mlists@>= 15425procedure print_fam_and_char(@!p:pointer); {prints family and character} 15426var c:integer; 15427begin print_esc("fam"); print_int(fam(p) mod @"100); print_char(" "); 15428c:=(cast_to_ushort(character(p)) + ((plane_and_fam_field(p) div @"100) * @"10000)); 15429if c < @"10000 then print_ASCII(c) 15430else print_char(c); {non-Plane 0 Unicodes can't be sent through |print_ASCII|} 15431end; 15432@# 15433procedure print_delimiter(@!p:pointer); {prints a delimiter as 24-bit hex value} 15434var a:integer; {accumulator} 15435begin a:=small_fam(p)*256+qo(small_char(p)); 15436a:=a*@"1000+large_fam(p)*256+qo(large_char(p)); 15437if a<0 then print_int(a) {this should never happen} 15438else print_hex(a); 15439end; 15440 15441@ The next subroutine will descend to another level of recursion when a 15442subsidiary mlist needs to be displayed. The parameter |c| indicates what 15443character is to become part of the recursion history. An empty mlist is 15444distinguished from a field with |math_type(p)=empty|, because these are 15445not equivalent (as explained above). 15446@^recursion@> 15447 15448@<Declare procedures needed for displaying...@>= 15449procedure@?show_info; forward;@t\2@>@?{|show_node_list(info(temp_ptr))|} 15450procedure print_subsidiary_data(@!p:pointer;@!c:ASCII_code); 15451 {display a noad field} 15452begin if cur_length>=depth_threshold then 15453 begin if math_type(p)<>empty then print(" []"); 15454 end 15455else begin append_char(c); {include |c| in the recursion history} 15456 temp_ptr:=p; {prepare for |show_info| if recursion is needed} 15457 case math_type(p) of 15458 math_char: begin print_ln; print_current_string; print_fam_and_char(p); 15459 end; 15460 sub_box: show_info; {recursive call} 15461 sub_mlist: if info(p)=null then 15462 begin print_ln; print_current_string; print("{}"); 15463 end 15464 else show_info; {recursive call} 15465 othercases do_nothing {|empty|} 15466 endcases;@/ 15467 flush_char; {remove |c| from the recursion history} 15468 end; 15469end; 15470 15471@ The inelegant introduction of |show_info| in the code above seems better 15472than the alternative of using \PASCAL's strange |forward| declaration for a 15473procedure with parameters. The \PASCAL\ convention about dropping parameters 15474from a post-|forward| procedure is, frankly, so intolerable to the author 15475of \TeX\ that he would rather stoop to communication via a global temporary 15476variable. (A similar stoopidity occurred with respect to |hlist_out| and 15477|vlist_out| above, and it will occur with respect to |mlist_to_hlist| below.) 15478@^Knuth, Donald Ervin@> 15479@:PASCAL}{\PASCAL@> 15480 15481@p procedure show_info; {the reader will kindly forgive this} 15482begin show_node_list(info(temp_ptr)); 15483end; 15484 15485@ @<Declare procedures needed for displaying...@>= 15486procedure print_style(@!c:integer); 15487begin case c div 2 of 154880: print_esc("displaystyle"); {|display_style=0|} 154891: print_esc("textstyle"); {|text_style=2|} 154902: print_esc("scriptstyle"); {|script_style=4|} 154913: print_esc("scriptscriptstyle"); {|script_script_style=6|} 15492othercases print("Unknown style!") 15493endcases; 15494end; 15495 15496@ @<Display choice node |p|@>= 15497begin print_esc("mathchoice"); 15498append_char("D"); show_node_list(display_mlist(p)); flush_char; 15499append_char("T"); show_node_list(text_mlist(p)); flush_char; 15500append_char("S"); show_node_list(script_mlist(p)); flush_char; 15501append_char("s"); show_node_list(script_script_mlist(p)); flush_char; 15502end 15503 15504@ @<Display normal noad |p|@>= 15505begin case type(p) of 15506ord_noad: print_esc("mathord"); 15507op_noad: print_esc("mathop"); 15508bin_noad: print_esc("mathbin"); 15509rel_noad: print_esc("mathrel"); 15510open_noad: print_esc("mathopen"); 15511close_noad: print_esc("mathclose"); 15512punct_noad: print_esc("mathpunct"); 15513inner_noad: print_esc("mathinner"); 15514over_noad: print_esc("overline"); 15515under_noad: print_esc("underline"); 15516vcenter_noad: print_esc("vcenter"); 15517radical_noad: begin print_esc("radical"); print_delimiter(left_delimiter(p)); 15518 end; 15519accent_noad: begin print_esc("accent"); print_fam_and_char(accent_chr(p)); 15520 end; 15521left_noad: begin print_esc("left"); print_delimiter(delimiter(p)); 15522 end; 15523right_noad: begin if subtype(p)=normal then print_esc("right") 15524 else print_esc("middle"); 15525 print_delimiter(delimiter(p)); 15526 end; 15527end; 15528if type(p)<left_noad then 15529 begin if subtype(p)<>normal then 15530 if subtype(p)=limits then print_esc("limits") 15531 else print_esc("nolimits"); 15532 print_subsidiary_data(nucleus(p),"."); 15533 end; 15534print_subsidiary_data(supscr(p),"^"); 15535print_subsidiary_data(subscr(p),"_"); 15536end 15537 15538@ @<Display fraction noad |p|@>= 15539begin print_esc("fraction, thickness "); 15540if thickness(p)=default_code then print("= default") 15541else print_scaled(thickness(p)); 15542if (small_fam(left_delimiter(p))<>0)or@+ 15543 (small_char(left_delimiter(p))<>min_quarterword)or@| 15544 (large_fam(left_delimiter(p))<>0)or@| 15545 (large_char(left_delimiter(p))<>min_quarterword) then 15546 begin print(", left-delimiter "); print_delimiter(left_delimiter(p)); 15547 end; 15548if (small_fam(right_delimiter(p))<>0)or@| 15549 (small_char(right_delimiter(p))<>min_quarterword)or@| 15550 (large_fam(right_delimiter(p))<>0)or@| 15551 (large_char(right_delimiter(p))<>min_quarterword) then 15552 begin print(", right-delimiter "); print_delimiter(right_delimiter(p)); 15553 end; 15554print_subsidiary_data(numerator(p),"\"); 15555print_subsidiary_data(denominator(p),"/"); 15556end 15557 15558@ That which can be displayed can also be destroyed. 15559 15560@<Cases of |flush_node_list| that arise...@>= 15561style_node: begin free_node(p,style_node_size); goto done; 15562 end; 15563choice_node:begin flush_node_list(display_mlist(p)); 15564 flush_node_list(text_mlist(p)); 15565 flush_node_list(script_mlist(p)); 15566 flush_node_list(script_script_mlist(p)); 15567 free_node(p,style_node_size); goto done; 15568 end; 15569ord_noad,op_noad,bin_noad,rel_noad,open_noad,close_noad,punct_noad,inner_noad, 15570 radical_noad,over_noad,under_noad,vcenter_noad,accent_noad:@t@>@;@/ 15571 begin if math_type(nucleus(p))>=sub_box then 15572 flush_node_list(info(nucleus(p))); 15573 if math_type(supscr(p))>=sub_box then 15574 flush_node_list(info(supscr(p))); 15575 if math_type(subscr(p))>=sub_box then 15576 flush_node_list(info(subscr(p))); 15577 if type(p)=radical_noad then free_node(p,radical_noad_size) 15578 else if type(p)=accent_noad then free_node(p,accent_noad_size) 15579 else free_node(p,noad_size); 15580 goto done; 15581 end; 15582left_noad,right_noad: begin free_node(p,noad_size); goto done; 15583 end; 15584fraction_noad: begin flush_node_list(info(numerator(p))); 15585 flush_node_list(info(denominator(p))); 15586 free_node(p,fraction_noad_size); goto done; 15587 end; 15588 15589@* \[35] Subroutines for math mode. 15590In order to convert mlists to hlists, i.e., noads to nodes, we need several 15591subroutines that are conveniently dealt with now. 15592 15593Let us first introduce the macros that make it easy to get at the parameters and 15594other font information. A size code, which is a multiple of 16, is added to a 15595family number to get an index into the table of internal font numbers 15596for each combination of family and size. (Be alert: Size codes get 15597larger as the type gets smaller.) 15598 15599@<Basic printing procedures@>= 15600procedure print_size(@!s:integer); 15601begin if s=text_size then print_esc("textfont") 15602else if s=script_size then print_esc("scriptfont") 15603else print_esc("scriptscriptfont"); 15604end; 15605 15606@ Before an mlist is converted to an hlist, \TeX\ makes sure that 15607the fonts in family~2 have enough parameters to be math-symbol 15608fonts, and that the fonts in family~3 have enough parameters to be 15609math-extension fonts. The math-symbol parameters are referred to by using the 15610following macros, which take a size code as their parameter; for example, 15611|num1(cur_size)| gives the value of the |num1| parameter for the current size. 15612@^parameters for symbols@> 15613@^font parameters@> 15614 15615NB: the access functions here must all put the font \# into /f/ for mathsy(). 15616 15617The accessors are defined with 15618|define_mathsy_accessor(NAME)(fontdimen-number)(NAME)| 15619because I can't see how to only give the name once, with WEB's limited 15620macro capabilities. This seems a bit ugly, but it works. 15621 15622@d total_mathsy_params=22 15623 15624{the following are OpenType MATH constant indices for use with OT math fonts} 15625@d scriptPercentScaleDown = 0 15626@d scriptScriptPercentScaleDown = 1 15627@d delimitedSubFormulaMinHeight = 2 15628@d displayOperatorMinHeight = 3 15629@d mathLeading = 4 15630@d firstMathValueRecord = mathLeading 15631@d axisHeight = 5 15632@d accentBaseHeight = 6 15633@d flattenedAccentBaseHeight = 7 15634@d subscriptShiftDown = 8 15635@d subscriptTopMax = 9 15636@d subscriptBaselineDropMin = 10 15637@d superscriptShiftUp = 11 15638@d superscriptShiftUpCramped = 12 15639@d superscriptBottomMin = 13 15640@d superscriptBaselineDropMax = 14 15641@d subSuperscriptGapMin = 15 15642@d superscriptBottomMaxWithSubscript = 16 15643@d spaceAfterScript = 17 15644@d upperLimitGapMin = 18 15645@d upperLimitBaselineRiseMin = 19 15646@d lowerLimitGapMin = 20 15647@d lowerLimitBaselineDropMin = 21 15648@d stackTopShiftUp = 22 15649@d stackTopDisplayStyleShiftUp = 23 15650@d stackBottomShiftDown = 24 15651@d stackBottomDisplayStyleShiftDown = 25 15652@d stackGapMin = 26 15653@d stackDisplayStyleGapMin = 27 15654@d stretchStackTopShiftUp = 28 15655@d stretchStackBottomShiftDown = 29 15656@d stretchStackGapAboveMin = 30 15657@d stretchStackGapBelowMin = 31 15658@d fractionNumeratorShiftUp = 32 15659@d fractionNumeratorDisplayStyleShiftUp = 33 15660@d fractionDenominatorShiftDown = 34 15661@d fractionDenominatorDisplayStyleShiftDown = 35 15662@d fractionNumeratorGapMin = 36 15663@d fractionNumDisplayStyleGapMin = 37 15664@d fractionRuleThickness = 38 15665@d fractionDenominatorGapMin = 39 15666@d fractionDenomDisplayStyleGapMin = 40 15667@d skewedFractionHorizontalGap = 41 15668@d skewedFractionVerticalGap = 42 15669@d overbarVerticalGap = 43 15670@d overbarRuleThickness = 44 15671@d overbarExtraAscender = 45 15672@d underbarVerticalGap = 46 15673@d underbarRuleThickness = 47 15674@d underbarExtraDescender = 48 15675@d radicalVerticalGap = 49 15676@d radicalDisplayStyleVerticalGap = 50 15677@d radicalRuleThickness = 51 15678@d radicalExtraAscender = 52 15679@d radicalKernBeforeDegree = 53 15680@d radicalKernAfterDegree = 54 15681@d lastMathValueRecord = radicalKernAfterDegree 15682@d radicalDegreeBottomRaisePercent = 55 15683@d lastMathConstant = radicalDegreeBottomRaisePercent 15684 15685 15686@d mathsy(#)==font_info[#+param_base[f]].sc 15687 15688@d define_mathsy_end(#)== 15689 #:=rval; 15690 end 15691@d define_mathsy_body(#)== 15692 var 15693 f: integer; 15694 rval: scaled; 15695 begin 15696 f:=fam_fnt(2 + size_code); 15697 if is_new_mathfont(cur_f) then 15698 rval:=get_native_mathsy_param(cur_f, #) 15699 else 15700 rval:=mathsy(#); 15701 define_mathsy_end 15702@d define_mathsy_accessor(#)==function #(size_code: integer): scaled; define_mathsy_body 15703 15704@p define_mathsy_accessor(math_x_height)(5)(math_x_height); 15705define_mathsy_accessor(math_quad)(6)(math_quad); 15706define_mathsy_accessor(num1)(8)(num1); 15707define_mathsy_accessor(num2)(9)(num2); 15708define_mathsy_accessor(num3)(10)(num3); 15709define_mathsy_accessor(denom1)(11)(denom1); 15710define_mathsy_accessor(denom2)(12)(denom2); 15711define_mathsy_accessor(sup1)(13)(sup1); 15712define_mathsy_accessor(sup2)(14)(sup2); 15713define_mathsy_accessor(sup3)(15)(sup3); 15714define_mathsy_accessor(sub1)(16)(sub1); 15715define_mathsy_accessor(sub2)(17)(sub2); 15716define_mathsy_accessor(sup_drop)(18)(sup_drop); 15717define_mathsy_accessor(sub_drop)(19)(sub_drop); 15718define_mathsy_accessor(delim1)(20)(delim1); 15719define_mathsy_accessor(delim2)(21)(delim2); 15720define_mathsy_accessor(axis_height)(22)(axis_height); 15721 15722@ The math-extension parameters have similar macros, but the size code is 15723omitted (since it is always |cur_size| when we refer to such parameters). 15724@^parameters for symbols@> 15725@^font parameters@> 15726 15727@d total_mathex_params=13 15728 15729@d mathex(#)==font_info[#+param_base[f]].sc 15730 15731@d define_mathex_end(#)== 15732 #:=rval; 15733 end 15734@d define_mathex_body(#)== 15735 var 15736 f: integer; 15737 rval: scaled; 15738 begin 15739 f:=fam_fnt(3 + cur_size); 15740 if is_new_mathfont(cur_f) then 15741 rval:=get_native_mathex_param(cur_f, #) 15742 else 15743 rval:=mathex(#); 15744 define_mathex_end 15745@d define_mathex_accessor(#)==function #:scaled; define_mathex_body 15746 15747@p define_mathex_accessor(default_rule_thickness)(8)(default_rule_thickness); 15748define_mathex_accessor(big_op_spacing1)(9)(big_op_spacing1); 15749define_mathex_accessor(big_op_spacing2)(10)(big_op_spacing2); 15750define_mathex_accessor(big_op_spacing3)(11)(big_op_spacing3); 15751define_mathex_accessor(big_op_spacing4)(12)(big_op_spacing4); 15752define_mathex_accessor(big_op_spacing5)(13)(big_op_spacing5); 15753 15754@ Native font support requires these additional subroutines. 15755 15756|new_native_word_node| creates the node, but does not actually set its metrics; 15757call |set_native_metrics(node)| if that is required. 15758 15759@<Declare subroutines for |new_character|@>= 15760function new_native_word_node(@!f:internal_font_number;@!n:integer):pointer; 15761var 15762 l: integer; 15763 q: pointer; 15764begin 15765 l:=native_node_size + (n * sizeof(UTF16_code) + sizeof(memory_word) - 1) div sizeof(memory_word); 15766 15767 q:=get_node(l); 15768 type(q):=whatsit_node; 15769 subtype(q):=native_word_node; 15770 15771 native_size(q):=l; 15772 native_font(q):=f; 15773 native_length(q):=n; 15774 15775 native_glyph_count(q):=0; 15776 native_glyph_info_ptr(q):=null_ptr; 15777 15778 new_native_word_node:=q; 15779end; 15780 15781function new_native_character(@!f:internal_font_number;@!c:UnicodeScalar):pointer; 15782var 15783 p: pointer; 15784 i, len: integer; 15785begin 15786 if font_mapping[f] <> 0 then begin 15787 if c > @"FFFF then begin 15788 str_room(2); 15789 append_char((c - @"10000) div 1024 + @"D800); 15790 append_char((c - @"10000) mod 1024 + @"DC00); 15791 end 15792 else begin 15793 str_room(1); 15794 append_char(c); 15795 end; 15796 len:=apply_mapping(font_mapping[f], addressof(str_pool[str_start_macro(str_ptr)]), cur_length); 15797 pool_ptr:=str_start_macro(str_ptr); { flush the string, as we'll be using the mapped text instead } 15798 15799 i:=0; 15800 while i < len do begin 15801 if (mapped_text[i] >= @"D800) and (mapped_text[i] < @"DC00) then begin 15802 c:=(mapped_text[i] - @"D800) * 1024 + mapped_text[i+1] - @"DC00 + @"10000; 15803 if map_char_to_glyph(f, c) = 0 then begin 15804 char_warning(f, c); 15805 end; 15806 i:=i + 2; 15807 end 15808 else begin 15809 if map_char_to_glyph(f, mapped_text[i]) = 0 then begin 15810 char_warning(f, mapped_text[i]); 15811 end; 15812 i:=i + 1; 15813 end; 15814 end; 15815 15816 p:=new_native_word_node(f, len); 15817 for i:=0 to len-1 do begin 15818 set_native_char(p, i, mapped_text[i]); 15819 end 15820 end 15821 else begin 15822 if tracing_lost_chars > 0 then 15823 if map_char_to_glyph(f, c) = 0 then begin 15824 char_warning(f, c); 15825 end; 15826 15827 p:=get_node(native_node_size + 1); 15828 type(p):=whatsit_node; 15829 subtype(p):=native_word_node; 15830 15831 native_size(p):=native_node_size + 1; 15832 native_glyph_count(p):=0; 15833 native_glyph_info_ptr(p):=null_ptr; 15834 native_font(p):=f; 15835 15836 if c > @"FFFF then begin 15837 native_length(p):=2; 15838 set_native_char(p, 0, (c - @"10000) div 1024 + @"D800); 15839 set_native_char(p, 1, (c - @"10000) mod 1024 + @"DC00); 15840 end 15841 else begin 15842 native_length(p):=1; 15843 set_native_char(p, 0, c); 15844 end; 15845 end; 15846 15847 set_native_metrics(p, XeTeX_use_glyph_metrics); 15848 15849 new_native_character:=p; 15850end; 15851 15852procedure font_feature_warning(featureNameP:void_pointer; featLen:integer; 15853 settingNameP:void_pointer; setLen:integer); 15854var 15855 i: integer; 15856begin 15857 begin_diagnostic; 15858 print_nl("Unknown "); 15859 if setLen > 0 then begin 15860 print("selector `"); 15861 print_utf8_str(settingNameP, setLen); 15862 print("' for "); 15863 end; 15864 print("feature `"); 15865 print_utf8_str(featureNameP, featLen); 15866 print("' in font `"); 15867 i:=1; 15868 while ord(name_of_file[i]) <> 0 do begin 15869 print_visible_char(name_of_file[i]); { this is already UTF-8 } 15870 incr(i); 15871 end; 15872 print("'."); 15873 end_diagnostic(false); 15874end; 15875 15876procedure font_mapping_warning(mappingNameP:void_pointer; 15877 mappingNameLen:integer; 15878 warningType:integer); { 0: just logging; 1: file not found; 2: can't load } 15879var 15880 i: integer; 15881begin 15882 begin_diagnostic; 15883 if warningType=0 then print_nl("Loaded mapping `") 15884 else print_nl("Font mapping `"); 15885 print_utf8_str(mappingNameP, mappingNameLen); 15886 print("' for font `"); 15887 i:=1; 15888 while ord(name_of_file[i]) <> 0 do begin 15889 print_visible_char(name_of_file[i]); { this is already UTF-8 } 15890 incr(i); 15891 end; 15892 case warningType of 15893 1: print("' not found."); 15894 2: begin print("' not usable;"); 15895 print_nl("bad mapping file or incorrect mapping type."); 15896 end; 15897 othercases print("'.") 15898 endcases; 15899 end_diagnostic(false); 15900end; 15901 15902procedure graphite_warning; 15903var 15904 i: integer; 15905begin 15906 begin_diagnostic; 15907 print_nl("Font `"); 15908 i:=1; 15909 while ord(name_of_file[i]) <> 0 do begin 15910 print_visible_char(name_of_file[i]); { this is already UTF-8 } 15911 incr(i); 15912 end; 15913 print("' does not support Graphite. Trying OpenType layout instead."); 15914 end_diagnostic(false); 15915end; 15916 15917function load_native_font(u: pointer; nom, aire:str_number; s: scaled): internal_font_number; 15918label 15919 done; 15920const 15921 first_math_fontdimen = 10; 15922var 15923 k, num_font_dimens: integer; 15924 font_engine: void_pointer; {really an CFDictionaryRef or XeTeXLayoutEngine} 15925 actual_size: scaled; {|s| converted to real size, if it was negative} 15926 p: pointer; {for temporary |native_char| node we'll create} 15927 ascent, descent, font_slant, x_ht, cap_ht: scaled; 15928 f: internal_font_number; 15929 full_name: str_number; 15930begin 15931 { on entry here, the full name is packed into |name_of_file| in UTF8 form } 15932 15933 load_native_font:=null_font; 15934 15935 font_engine:=find_native_font(name_of_file + 1, s); 15936 if font_engine = 0 then goto done; 15937 15938 if s>=0 then 15939 actual_size:=s 15940 else begin 15941 if (s <> -1000) then 15942 actual_size:=xn_over_d(loaded_font_design_size,-s,1000) 15943 else 15944 actual_size:=loaded_font_design_size; 15945 end; 15946 15947 { look again to see if the font is already loaded, now that we know its canonical name } 15948 str_room(name_length); 15949 for k:=1 to name_length do 15950 append_char(name_of_file[k]); 15951 full_name:=make_string; { not |slow_make_string| because we'll flush it if the font was already loaded } 15952 15953 for f:=font_base+1 to font_ptr do 15954 if (font_area[f] = native_font_type_flag) and str_eq_str(font_name[f], full_name) and (font_size[f] = actual_size) then begin 15955 release_font_engine(font_engine, native_font_type_flag); 15956 flush_string; 15957 load_native_font:=f; 15958 goto done; 15959 end; 15960 15961 if (native_font_type_flag = otgr_font_flag) and isOpenTypeMathFont(font_engine) then 15962 num_font_dimens:=first_math_fontdimen + lastMathConstant 15963 else 15964 num_font_dimens:=8; 15965 15966 if (font_ptr = font_max) or (fmem_ptr + num_font_dimens > font_mem_size) then begin 15967 @<Apologize for not loading the font, |goto done|@>; 15968 end; 15969 15970 { we've found a valid installed font, and have room } 15971 incr(font_ptr); 15972 font_area[font_ptr]:=native_font_type_flag; { set by |find_native_font| to either |aat_font_flag| or |ot_font_flag| } 15973 15974 { store the canonical name } 15975 font_name[font_ptr]:=full_name; 15976 15977 font_check[font_ptr].b0:=0; 15978 font_check[font_ptr].b1:=0; 15979 font_check[font_ptr].b2:=0; 15980 font_check[font_ptr].b3:=0; 15981 font_glue[font_ptr]:=null; 15982 font_dsize[font_ptr]:=loaded_font_design_size; 15983 font_size[font_ptr]:=actual_size; 15984 15985 if (native_font_type_flag = aat_font_flag) then begin 15986 aat_get_font_metrics(font_engine, addressof(ascent), addressof(descent), 15987 addressof(x_ht), addressof(cap_ht), addressof(font_slant)) 15988 end else begin 15989 ot_get_font_metrics(font_engine, addressof(ascent), addressof(descent), 15990 addressof(x_ht), addressof(cap_ht), addressof(font_slant)); 15991 end; 15992 15993 height_base[font_ptr]:=ascent; 15994 depth_base[font_ptr]:=-descent; 15995 15996 font_params[font_ptr]:=num_font_dimens; {we add an extra \.{\\fontdimen8} |:= cap_height|; then OT math fonts have a bunch more} 15997 font_bc[font_ptr]:=0; 15998 font_ec[font_ptr]:=65535; 15999 font_used[font_ptr]:=false; 16000 hyphen_char[font_ptr]:=default_hyphen_char; 16001 skew_char[font_ptr]:=default_skew_char; 16002 param_base[font_ptr]:=fmem_ptr-1; 16003 16004 font_layout_engine[font_ptr]:=font_engine; 16005 font_mapping[font_ptr]:=0; {don't use the mapping, if any, when measuring space here} 16006 font_letter_space[font_ptr]:=loaded_font_letter_space; 16007 16008@# {measure the width of the space character and set up font parameters} 16009 p:=new_native_character(font_ptr, " "); 16010 s:=width(p) + loaded_font_letter_space; 16011 free_node(p, native_size(p)); 16012 16013 font_info[fmem_ptr].sc:=font_slant; {|slant|} 16014 incr(fmem_ptr); 16015 font_info[fmem_ptr].sc:=s; {|space| = width of space character} 16016 incr(fmem_ptr); 16017 font_info[fmem_ptr].sc:=s div 2; {|space_stretch| = 1/2 * space} 16018 incr(fmem_ptr); 16019 font_info[fmem_ptr].sc:=s div 3; {|space_shrink| = 1/3 * space} 16020 incr(fmem_ptr); 16021 font_info[fmem_ptr].sc:=x_ht; {|x_height|} 16022 incr(fmem_ptr); 16023 font_info[fmem_ptr].sc:=font_size[font_ptr]; {|quad| = font size} 16024 incr(fmem_ptr); 16025 font_info[fmem_ptr].sc:=s div 3; {|extra_space| = 1/3 * space} 16026 incr(fmem_ptr); 16027 font_info[fmem_ptr].sc:=cap_ht; {|cap_height|} 16028 incr(fmem_ptr); 16029 16030 if num_font_dimens = first_math_fontdimen + lastMathConstant then begin 16031 font_info[fmem_ptr].int:=num_font_dimens; { \.{\\fontdimen9} |:=| number of assigned fontdimens } 16032 incr(fmem_ptr); 16033 for k:=0 to lastMathConstant do begin 16034 font_info[fmem_ptr].sc:=get_ot_math_constant(font_ptr, k); 16035 incr(fmem_ptr); 16036 end; 16037 end; 16038 16039 font_mapping[font_ptr]:=loaded_font_mapping; 16040 font_flags[font_ptr]:=loaded_font_flags; 16041 16042 load_native_font:=font_ptr; 16043done: 16044end; 16045 16046procedure do_locale_linebreaks(s: integer; len: integer); 16047var 16048 offs, prevOffs, i: integer; 16049 use_penalty, use_skip: boolean; 16050begin 16051 if (XeTeX_linebreak_locale = 0) or (len = 1) then begin 16052 link(tail):=new_native_word_node(main_f, len); 16053 tail:=link(tail); 16054 for i:=0 to len - 1 do 16055 set_native_char(tail, i, native_text[s + i]); 16056 set_native_metrics(tail, XeTeX_use_glyph_metrics); 16057 end else begin 16058 use_skip:=XeTeX_linebreak_skip <> zero_glue; 16059 use_penalty:=XeTeX_linebreak_penalty <> 0 or not use_skip; 16060 linebreak_start(main_f, XeTeX_linebreak_locale, native_text + s, len); 16061 offs:=0; 16062 repeat 16063 prevOffs:=offs; 16064 offs:=linebreak_next; 16065 if offs > 0 then begin 16066 if prevOffs <> 0 then begin 16067 if use_penalty then 16068 tail_append(new_penalty(XeTeX_linebreak_penalty)); 16069 if use_skip then 16070 tail_append(new_param_glue(XeTeX_linebreak_skip_code)); 16071 end; 16072 link(tail):=new_native_word_node(main_f, offs - prevOffs); 16073 tail:=link(tail); 16074 for i:=prevOffs to offs - 1 do 16075 set_native_char(tail, i - prevOffs, native_text[s + i]); 16076 set_native_metrics(tail, XeTeX_use_glyph_metrics); 16077 end; 16078 until offs < 0; 16079 end 16080end; 16081 16082procedure bad_utf8_warning; 16083begin 16084 begin_diagnostic; 16085 print_nl("Invalid UTF-8 byte or sequence"); 16086 if terminal_input then print(" in terminal input") 16087 else begin 16088 print(" at line "); 16089 print_int(line); 16090 end; 16091 print(" replaced by U+FFFD."); 16092 end_diagnostic(false); 16093end; 16094 16095function get_input_normalization_state: integer; 16096begin 16097 if eqtb=nil then get_input_normalization_state:=0 { may be called before eqtb is initialized } 16098 else get_input_normalization_state:=XeTeX_input_normalization_state; 16099end; 16100 16101function get_tracing_fonts_state: integer; 16102begin 16103 get_tracing_fonts_state:=XeTeX_tracing_fonts_state; 16104end; 16105 16106@ We also need to compute the change in style between mlists and their 16107subsidiaries. The following macros define the subsidiary style for 16108an overlined nucleus (|cramped_style|), for a subscript or a superscript 16109(|sub_style| or |sup_style|), or for a numerator or denominator (|num_style| 16110or |denom_style|). 16111 16112@d cramped_style(#)==2*(# div 2)+cramped {cramp the style} 16113@d sub_style(#)==2*(# div 4)+script_style+cramped {smaller and cramped} 16114@d sup_style(#)==2*(# div 4)+script_style+(# mod 2) {smaller} 16115@d num_style(#)==#+2-2*(# div 6) {smaller unless already script-script} 16116@d denom_style(#)==2*(# div 2)+cramped+2-2*(# div 6) {smaller, cramped} 16117 16118@ When the style changes, the following piece of program computes associated 16119information: 16120 16121@<Set up the values of |cur_size| and |cur_mu|, based on |cur_style|@>= 16122begin if cur_style<script_style then cur_size:=text_size 16123else cur_size:=script_size*((cur_style-text_style) div 2); 16124cur_mu:=x_over_n(math_quad(cur_size),18); 16125end 16126 16127@ Here is a function that returns a pointer to a rule node having a given 16128thickness |t|. The rule will extend horizontally to the boundary of the vlist 16129that eventually contains it. 16130 16131@p function fraction_rule(@!t:scaled):pointer; 16132 {construct the bar for a fraction} 16133var p:pointer; {the new node} 16134begin p:=new_rule; height(p):=t; depth(p):=0; fraction_rule:=p; 16135end; 16136 16137@ The |overbar| function returns a pointer to a vlist box that consists of 16138a given box |b|, above which has been placed a kern of height |k| under a 16139fraction rule of thickness |t| under additional space of height |t|. 16140 16141@p function overbar(@!b:pointer;@!k,@!t:scaled):pointer; 16142var p,@!q:pointer; {nodes being constructed} 16143begin p:=new_kern(k); link(p):=b; q:=fraction_rule(t); link(q):=p; 16144p:=new_kern(t); link(p):=q; overbar:=vpack(p,natural); 16145end; 16146 16147@ The |var_delimiter| function, which finds or constructs a sufficiently 16148large delimiter, is the most interesting of the auxiliary functions that 16149currently concern us. Given a pointer |d| to a delimiter field in some noad, 16150together with a size code |s| and a vertical distance |v|, this function 16151returns a pointer to a box that contains the smallest variant of |d| whose 16152height plus depth is |v| or more. (And if no variant is large enough, it 16153returns the largest available variant.) In particular, this routine will 16154construct arbitrarily large delimiters from extensible components, if 16155|d| leads to such characters. 16156 16157The value returned is a box whose |shift_amount| has been set so that 16158the box is vertically centered with respect to the axis in the given size. 16159If a built-up symbol is returned, the height of the box before shifting 16160will be the height of its topmost component. 16161 16162@p@t\4@>@<Declare subprocedures for |var_delimiter|@> 16163procedure stack_glyph_into_box(@!b:pointer;@!f:internal_font_number;@!g:integer); 16164var p,q:pointer; 16165begin 16166 p:=get_node(glyph_node_size); 16167 type(p):=whatsit_node; subtype(p):=glyph_node; 16168 native_font(p):=f; native_glyph(p):=g; 16169 set_native_glyph_metrics(p, 1); 16170 if type(b)=hlist_node then begin 16171 q:=list_ptr(b); 16172 if q=null then list_ptr(b):=p else begin 16173 while link(q)<>null do q:=link(q); 16174 link(q):=p; 16175 if (height(b) < height(p)) then height(b):=height(p); 16176 if (depth(b) < depth(p)) then depth(b):=depth(p); 16177 end; 16178 end else begin 16179 link(p):=list_ptr(b); list_ptr(b):=p; 16180 height(b):=height(p); 16181 if (width(b) < width(p)) then width(b):=width(p); 16182 end; 16183end; 16184 16185procedure stack_glue_into_box(@!b:pointer;@!min,max:scaled); 16186var p,q:pointer; 16187begin 16188 q:=new_spec(zero_glue); 16189 width(q):=min; 16190 stretch(q):=max-min; 16191 p:=new_glue(q); 16192 if type(b)=hlist_node then begin 16193 q:=list_ptr(b); 16194 if q=null then list_ptr(b):=p else begin 16195 while link(q)<>null do q:=link(q); 16196 link(q):=p; 16197 end; 16198 end else begin 16199 link(p):=list_ptr(b); list_ptr(b):=p; 16200 height(b):=height(p); width(b):=width(p); 16201 end; 16202end; 16203 16204function build_opentype_assembly(@!f:internal_font_number;@!a:void_pointer;@!s:scaled;@!horiz:boolean):pointer; 16205 {return a box with height/width at least |s|, using font |f|, with glyph assembly info from |a|} 16206var 16207 b:pointer; {the box we're constructing} 16208 n:integer; {the number of repetitions of each extender} 16209 i,j:integer; {indexes} 16210 g:integer; {glyph code} 16211 p:pointer; {temp pointer} 16212 s_max,o,oo,prev_o,min_o:scaled; 16213 no_extenders: boolean; 16214 nat,str:scaled; {natural size, stretch} 16215begin 16216 b:=new_null_box; 16217 if horiz then 16218 type(b):=hlist_node 16219 else 16220 type(b):=vlist_node; 16221 16222 {figure out how many repeats of each extender to use} 16223 n:=-1; 16224 no_extenders:=true; 16225 min_o:=ot_min_connector_overlap(f); 16226 repeat 16227 n:=n+1; 16228 {calc max possible size with this number of extenders} 16229 s_max:=0; 16230 prev_o:=0; 16231 for i:=0 to ot_part_count(a)-1 do begin 16232 if ot_part_is_extender(a, i) then begin 16233 no_extenders:=false; 16234 for j:=1 to n do begin 16235 o:=ot_part_start_connector(f, a, i); 16236 if min_o<o then o:=min_o; 16237 if prev_o<o then o:=prev_o; 16238 s_max:=s_max-o+ot_part_full_advance(f, a, i); 16239 prev_o:=ot_part_end_connector(f, a, i); 16240 end 16241 end else begin 16242 o:=ot_part_start_connector(f, a, i); 16243 if min_o<o then o:=min_o; 16244 if prev_o<o then o:=prev_o; 16245 s_max:=s_max-o+ot_part_full_advance(f, a, i); 16246 prev_o:=ot_part_end_connector(f, a, i); 16247 end; 16248 end; 16249 until (s_max>=s) or no_extenders; 16250 16251 {assemble box using |n| copies of each extender, 16252 with appropriate glue wherever an overlap occurs} 16253 prev_o:=0; 16254 for i:=0 to ot_part_count(a)-1 do begin 16255 if ot_part_is_extender(a, i) then begin 16256 for j:=1 to n do begin 16257 o:=ot_part_start_connector(f, a, i); 16258 if prev_o<o then o:=prev_o; 16259 oo:=o; {max overlap} 16260 if min_o<o then o:=min_o; 16261 if oo>0 then stack_glue_into_box(b, -oo, -o); 16262 g:=ot_part_glyph(a, i); 16263 stack_glyph_into_box(b, f, g); 16264 prev_o:=ot_part_end_connector(f, a, i); 16265 end 16266 end else begin 16267 o:=ot_part_start_connector(f, a, i); 16268 if prev_o<o then o:=prev_o; 16269 oo:=o; {max overlap} 16270 if min_o<o then o:=min_o; 16271 if oo>0 then stack_glue_into_box(b, -oo, -o); 16272 g:=ot_part_glyph(a, i); 16273 stack_glyph_into_box(b, f, g); 16274 prev_o:=ot_part_end_connector(f, a, i); 16275 end; 16276 end; 16277 16278 {find natural size and total stretch of the box} 16279 p:=list_ptr(b); nat:=0; str:=0; 16280 while p<>null do begin 16281 if type(p)=whatsit_node then begin 16282 if horiz then 16283 nat:=nat+width(p) 16284 else 16285 nat:=nat+height(p)+depth(p); 16286 end else if type(p)=glue_node then begin 16287 nat:=nat+width(glue_ptr(p)); 16288 str:=str+stretch(glue_ptr(p)); 16289 end; 16290 p:=link(p); 16291 end; 16292 16293 {set glue so as to stretch the connections if needed} 16294 o:=0; 16295 if (s>nat) and (str>0) then begin 16296 o:=(s-nat); 16297 {don't stretch more than |str|} 16298 if (o>str) then o:=str; 16299 glue_order(b):=normal; glue_sign(b):=stretching; 16300 glue_set(b):=unfloat(o/str); 16301 if horiz then 16302 width(b):= nat+round(str*float(glue_set(b))) 16303 else 16304 height(b):=nat+round(str*float(glue_set(b))); 16305 end else 16306 if horiz then 16307 width(b):=nat 16308 else 16309 height(b):=nat; 16310 16311 build_opentype_assembly:=b; 16312end; 16313 16314function var_delimiter(@!d:pointer;@!s:integer;@!v:scaled):pointer; 16315label found,continue; 16316var b:pointer; {the box that will be constructed} 16317ot_assembly_ptr:void_pointer; 16318@!f,@!g: internal_font_number; {best-so-far and tentative font codes} 16319@!c,@!x,@!y: quarterword; {best-so-far and tentative character codes} 16320@!m,@!n: integer; {the number of extensible pieces} 16321@!u: scaled; {height-plus-depth of a tentative character} 16322@!w: scaled; {largest height-plus-depth so far} 16323@!q: four_quarters; {character info} 16324@!hd: eight_bits; {height-depth byte} 16325@!r: four_quarters; {extensible pieces} 16326@!z: integer; {runs through font family members} 16327@!large_attempt: boolean; {are we trying the ``large'' variant?} 16328begin f:=null_font; w:=0; large_attempt:=false; 16329z:=small_fam(d); x:=small_char(d); 16330ot_assembly_ptr:=nil; 16331loop@+ begin @<Look at the variants of |(z,x)|; set |f| and |c| whenever 16332 a better character is found; |goto found| as soon as a 16333 large enough variant is encountered@>; 16334 if large_attempt then goto found; {there were none large enough} 16335 large_attempt:=true; z:=large_fam(d); x:=large_char(d); 16336 end; 16337found: if f<>null_font then begin 16338 if not is_ot_font(f) then 16339 @<Make variable |b| point to a box for |(f,c)|@> 16340 else begin 16341 {for OT fonts, c is the glyph ID to use} 16342 if ot_assembly_ptr<>nil then 16343 b:=build_opentype_assembly(f, ot_assembly_ptr, v, 0) 16344 else begin 16345 b:=new_null_box; type(b):=vlist_node; list_ptr(b):=get_node(glyph_node_size); 16346 type(list_ptr(b)):=whatsit_node; subtype(list_ptr(b)):=glyph_node; 16347 native_font(list_ptr(b)):=f; native_glyph(list_ptr(b)):=c; 16348 set_native_glyph_metrics(list_ptr(b), 1); 16349 width(b):=width(list_ptr(b)); 16350 height(b):=height(list_ptr(b)); 16351 depth(b):=depth(list_ptr(b)); 16352 end 16353 end 16354end else begin b:=new_null_box; 16355 width(b):=null_delimiter_space; {use this width if no delimiter was found} 16356 end; 16357shift_amount(b):=half(height(b)-depth(b)) - axis_height(s); 16358var_delimiter:=b; 16359end; 16360 16361@ The search process is complicated slightly by the facts that some of the 16362characters might not be present in some of the fonts, and they might not 16363be probed in increasing order of height. 16364 16365@<Look at the variants of |(z,x)|; set |f| and |c|...@>= 16366if (z<>0)or(x<>min_quarterword) then 16367 begin z:=z+s+script_size; 16368 repeat z:=z-script_size; g:=fam_fnt(z); 16369 if g<>null_font then 16370 @<Look at the list of characters starting with |x| in 16371 font |g|; set |f| and |c| whenever 16372 a better character is found; |goto found| as soon as a 16373 large enough variant is encountered@>; 16374 until z<script_size; 16375 end 16376 16377@ @<Look at the list of characters starting with |x|...@>= 16378if is_ot_font(g) then begin 16379 b:=new_native_character(g, x); 16380 x:=get_native_glyph(b, 0); 16381 free_node(b, native_size(b)); 16382 f:=g; c:=x; w:=0; n:=0; 16383 repeat 16384 y:=get_ot_math_variant(g, x, n, addressof(u), 0); 16385 if u>w then begin 16386 c:=y; w:=u; 16387 if u>=v then goto found; 16388 end; 16389 n:=n+1; 16390 until u<0; 16391 {if we get here, then we didn't find a big enough glyph; check if the char is extensible} 16392 ot_assembly_ptr:=get_ot_assembly_ptr(g, x, 0); 16393 if ot_assembly_ptr<>nil then goto found; 16394end else 16395begin y:=x; 16396if (qo(y)>=font_bc[g])and(qo(y)<=font_ec[g]) then 16397 begin continue: q:=char_info(g)(y); 16398 if char_exists(q) then 16399 begin if char_tag(q)=ext_tag then 16400 begin f:=g; c:=y; goto found; 16401 end; 16402 hd:=height_depth(q); 16403 u:=char_height(g)(hd)+char_depth(g)(hd); 16404 if u>w then 16405 begin f:=g; c:=y; w:=u; 16406 if u>=v then goto found; 16407 end; 16408 if char_tag(q)=list_tag then 16409 begin y:=rem_byte(q); goto continue; 16410 end; 16411 end; 16412 end; 16413end 16414 16415@ Here is a subroutine that creates a new box, whose list contains a 16416single character, and whose width includes the italic correction for 16417that character. The height or depth of the box will be negative, if 16418the height or depth of the character is negative; thus, this routine 16419may deliver a slightly different result than |hpack| would produce. 16420 16421@<Declare subprocedures for |var_delimiter|@>= 16422function char_box(@!f:internal_font_number;@!c:integer):pointer; 16423var q:four_quarters; 16424@!hd:eight_bits; {|height_depth| byte} 16425@!b,@!p:pointer; {the new box and its character node} 16426begin 16427if is_native_font(f) then begin 16428 b:=new_null_box; 16429 p:=new_native_character(f, c); 16430 list_ptr(b):=p; 16431 height(b):=height(p); width(b):=width(p); 16432 if depth(p)<0 then depth(b):=0 else depth(b):=depth(p); 16433 end 16434else begin 16435 q:=char_info(f)(c); hd:=height_depth(q); 16436 b:=new_null_box; width(b):=char_width(f)(q)+char_italic(f)(q); 16437 height(b):=char_height(f)(hd); depth(b):=char_depth(f)(hd); 16438 p:=get_avail; character(p):=c; font(p):=f; 16439 end; 16440list_ptr(b):=p; char_box:=b; 16441end; 16442 16443@ When the following code is executed, |char_tag(q)| will be equal to 16444|ext_tag| if and only if a built-up symbol is supposed to be returned. 16445 16446@<Make variable |b| point to a box for |(f,c)|@>= 16447if char_tag(q)=ext_tag then 16448 @<Construct an extensible character in a new box |b|, 16449 using recipe |rem_byte(q)| and font |f|@> 16450else b:=char_box(f,c) 16451 16452@ When we build an extensible character, it's handy to have the 16453following subroutine, which puts a given character on top 16454of the characters already in box |b|: 16455 16456@<Declare subprocedures for |var_delimiter|@>= 16457procedure stack_into_box(@!b:pointer;@!f:internal_font_number; 16458 @!c:quarterword); 16459var p:pointer; {new node placed into |b|} 16460begin p:=char_box(f,c); link(p):=list_ptr(b); list_ptr(b):=p; 16461height(b):=height(p); 16462end; 16463 16464@ Another handy subroutine computes the height plus depth of 16465a given character: 16466 16467@<Declare subprocedures for |var_delimiter|@>= 16468function height_plus_depth(@!f:internal_font_number;@!c:quarterword):scaled; 16469var q:four_quarters; 16470@!hd:eight_bits; {|height_depth| byte} 16471begin q:=char_info(f)(c); hd:=height_depth(q); 16472height_plus_depth:=char_height(f)(hd)+char_depth(f)(hd); 16473end; 16474 16475@ @<Construct an extensible...@>= 16476begin b:=new_null_box; 16477type(b):=vlist_node; 16478r:=font_info[exten_base[f]+rem_byte(q)].qqqq;@/ 16479@<Compute the minimum suitable height, |w|, and the corresponding 16480 number of extension steps, |n|; also set |width(b)|@>; 16481c:=ext_bot(r); 16482if c<>min_quarterword then stack_into_box(b,f,c); 16483c:=ext_rep(r); 16484for m:=1 to n do stack_into_box(b,f,c); 16485c:=ext_mid(r); 16486if c<>min_quarterword then 16487 begin stack_into_box(b,f,c); c:=ext_rep(r); 16488 for m:=1 to n do stack_into_box(b,f,c); 16489 end; 16490c:=ext_top(r); 16491if c<>min_quarterword then stack_into_box(b,f,c); 16492depth(b):=w-height(b); 16493end 16494 16495@ The width of an extensible character is the width of the repeatable 16496module. If this module does not have positive height plus depth, 16497we don't use any copies of it, otherwise we use as few as possible 16498(in groups of two if there is a middle part). 16499 16500@<Compute the minimum suitable height, |w|, and...@>= 16501c:=ext_rep(r); u:=height_plus_depth(f,c); 16502w:=0; q:=char_info(f)(c); width(b):=char_width(f)(q)+char_italic(f)(q);@/ 16503c:=ext_bot(r);@+if c<>min_quarterword then w:=w+height_plus_depth(f,c); 16504c:=ext_mid(r);@+if c<>min_quarterword then w:=w+height_plus_depth(f,c); 16505c:=ext_top(r);@+if c<>min_quarterword then w:=w+height_plus_depth(f,c); 16506n:=0; 16507if u>0 then while w<v do 16508 begin w:=w+u; incr(n); 16509 if ext_mid(r)<>min_quarterword then w:=w+u; 16510 end 16511 16512@ The next subroutine is much simpler; it is used for numerators and 16513denominators of fractions as well as for displayed operators and 16514their limits above and below. It takes a given box~|b| and 16515changes it so that the new box is centered in a box of width~|w|. 16516The centering is done by putting \.{\\hss} glue at the left and right 16517of the list inside |b|, then packaging the new box; thus, the 16518actual box might not really be centered, if it already contains 16519infinite glue. 16520 16521The given box might contain a single character whose italic correction 16522has been added to the width of the box; in this case a compensating 16523kern is inserted. 16524 16525@p function rebox(@!b:pointer;@!w:scaled):pointer; 16526var p:pointer; {temporary register for list manipulation} 16527@!f:internal_font_number; {font in a one-character box} 16528@!v:scaled; {width of a character without italic correction} 16529begin if (width(b)<>w)and(list_ptr(b)<>null) then 16530 begin if type(b)=vlist_node then b:=hpack(b,natural); 16531 p:=list_ptr(b); 16532 if (is_char_node(p))and(link(p)=null) then 16533 begin f:=font(p); v:=char_width(f)(char_info(f)(character(p))); 16534 if v<>width(b) then link(p):=new_kern(width(b)-v); 16535 end; 16536 free_node(b,box_node_size); 16537 b:=new_glue(ss_glue); link(b):=p; 16538 while link(p)<>null do p:=link(p); 16539 link(p):=new_glue(ss_glue); 16540 rebox:=hpack(b,w,exactly); 16541 end 16542else begin width(b):=w; rebox:=b; 16543 end; 16544end; 16545 16546@ Here is a subroutine that creates a new glue specification from another 16547one that is expressed in `\.{mu}', given the value of the math unit. 16548 16549@d mu_mult(#)==nx_plus_y(n,#,xn_over_d(#,f,@'200000)) 16550 16551@p function math_glue(@!g:pointer;@!m:scaled):pointer; 16552var p:pointer; {the new glue specification} 16553@!n:integer; {integer part of |m|} 16554@!f:scaled; {fraction part of |m|} 16555begin n:=x_over_n(m,@'200000); f:=remainder;@/ 16556if f<0 then 16557 begin decr(n); f:=f+@'200000; 16558 end; 16559p:=get_node(glue_spec_size); 16560width(p):=mu_mult(width(g)); {convert \.{mu} to \.{pt}} 16561stretch_order(p):=stretch_order(g); 16562if stretch_order(p)=normal then stretch(p):=mu_mult(stretch(g)) 16563else stretch(p):=stretch(g); 16564shrink_order(p):=shrink_order(g); 16565if shrink_order(p)=normal then shrink(p):=mu_mult(shrink(g)) 16566else shrink(p):=shrink(g); 16567math_glue:=p; 16568end; 16569 16570@ The |math_kern| subroutine removes |mu_glue| from a kern node, given 16571the value of the math unit. 16572 16573@p procedure math_kern(@!p:pointer;@!m:scaled); 16574var @!n:integer; {integer part of |m|} 16575@!f:scaled; {fraction part of |m|} 16576begin if subtype(p)=mu_glue then 16577 begin n:=x_over_n(m,@'200000); f:=remainder;@/ 16578 if f<0 then 16579 begin decr(n); f:=f+@'200000; 16580 end; 16581 width(p):=mu_mult(width(p)); subtype(p):=explicit; 16582 end; 16583end; 16584 16585@ Sometimes it is necessary to destroy an mlist. The following 16586subroutine empties the current list, assuming that |abs(mode)=mmode|. 16587 16588@p procedure flush_math; 16589begin flush_node_list(link(head)); flush_node_list(incompleat_noad); 16590link(head):=null; tail:=head; incompleat_noad:=null; 16591end; 16592 16593@* \[36] Typesetting math formulas. 16594\TeX's most important routine for dealing with formulas is called 16595|mlist_to_hlist|. After a formula has been scanned and represented as an 16596mlist, this routine converts it to an hlist that can be placed into a box 16597or incorporated into the text of a paragraph. There are three implicit 16598parameters, passed in global variables: |cur_mlist| points to the first 16599node or noad in the given mlist (and it might be |null|); |cur_style| is a 16600style code; and |mlist_penalties| is |true| if penalty nodes for potential 16601line breaks are to be inserted into the resulting hlist. After 16602|mlist_to_hlist| has acted, |link(temp_head)| points to the translated hlist. 16603 16604Since mlists can be inside mlists, the procedure is recursive. And since this 16605is not part of \TeX's inner loop, the program has been written in a manner 16606that stresses compactness over efficiency. 16607@^recursion@> 16608 16609@<Glob...@>= 16610@!cur_mlist:pointer; {beginning of mlist to be translated} 16611@!cur_style:small_number; {style code at current place in the list} 16612@!cur_size:integer; {size code corresponding to |cur_style|} 16613@!cur_mu:scaled; {the math unit width corresponding to |cur_size|} 16614@!mlist_penalties:boolean; {should |mlist_to_hlist| insert penalties?} 16615 16616@ The recursion in |mlist_to_hlist| is due primarily to a subroutine 16617called |clean_box| that puts a given noad field into a box using a given 16618math style; |mlist_to_hlist| can call |clean_box|, which can call 16619|mlist_to_hlist|. 16620@^recursion@> 16621 16622The box returned by |clean_box| is ``clean'' in the 16623sense that its |shift_amount| is zero. 16624 16625@p procedure@?mlist_to_hlist; forward;@t\2@>@/ 16626function clean_box(@!p:pointer;@!s:small_number):pointer; 16627label found; 16628var q:pointer; {beginning of a list to be boxed} 16629@!save_style:small_number; {|cur_style| to be restored} 16630@!x:pointer; {box to be returned} 16631@!r:pointer; {temporary pointer} 16632begin case math_type(p) of 16633math_char: begin cur_mlist:=new_noad; mem[nucleus(cur_mlist)]:=mem[p]; 16634 end; 16635sub_box: begin q:=info(p); goto found; 16636 end; 16637sub_mlist: cur_mlist:=info(p); 16638othercases begin q:=new_null_box; goto found; 16639 end 16640endcases;@/ 16641save_style:=cur_style; cur_style:=s; mlist_penalties:=false;@/ 16642mlist_to_hlist; q:=link(temp_head); {recursive call} 16643cur_style:=save_style; {restore the style} 16644@<Set up the values of |cur_size| and |cur_mu|, based on |cur_style|@>; 16645found: if is_char_node(q)or(q=null) then x:=hpack(q,natural) 16646 else if (link(q)=null)and(type(q)<=vlist_node)and(shift_amount(q)=0) then 16647 x:=q {it's already clean} 16648 else x:=hpack(q,natural); 16649@<Simplify a trivial box@>; 16650clean_box:=x; 16651end; 16652 16653@ Here we save memory space in a common case. 16654 16655@<Simplify a trivial box@>= 16656q:=list_ptr(x); 16657if is_char_node(q) then 16658 begin r:=link(q); 16659 if r<>null then if link(r)=null then if not is_char_node(r) then 16660 if type(r)=kern_node then {unneeded italic correction} 16661 begin free_node(r,small_node_size); link(q):=null; 16662 end; 16663 end 16664 16665@ It is convenient to have a procedure that converts a |math_char| 16666field to an ``unpacked'' form. The |fetch| routine sets |cur_f|, |cur_c|, 16667and |cur_i| to the font code, character code, and character information bytes of 16668a given noad field. It also takes care of issuing error messages for 16669nonexistent characters; in such cases, |char_exists(cur_i)| will be |false| 16670after |fetch| has acted, and the field will also have been reset to |empty|. 16671 16672@p procedure fetch(@!a:pointer); {unpack the |math_char| field |a|} 16673begin cur_c:=cast_to_ushort(character(a)); cur_f:=fam_fnt(fam(a)+cur_size); 16674cur_c:=cur_c + (plane_and_fam_field(a) div @"100) * @"10000; 16675if cur_f=null_font then 16676 @<Complain about an undefined family and set |cur_i| null@> 16677else if is_native_font(cur_f) then begin 16678 cur_i:=null_character; 16679end else begin if (qo(cur_c)>=font_bc[cur_f])and(qo(cur_c)<=font_ec[cur_f]) then 16680 cur_i:=char_info(cur_f)(cur_c) 16681 else cur_i:=null_character; 16682 if not(char_exists(cur_i)) then 16683 begin char_warning(cur_f,qo(cur_c)); 16684 math_type(a):=empty; 16685 end; 16686 end; 16687end; 16688 16689@ @<Complain about an undefined family...@>= 16690begin print_err(""); print_size(cur_size); print_char(" "); 16691print_int(fam(a)); print(" is undefined (character "); 16692print_ASCII(qo(cur_c)); print_char(")"); 16693help4("Somewhere in the math formula just ended, you used the")@/ 16694("stated character from an undefined font family. For example,")@/ 16695("plain TeX doesn't allow \it or \sl in subscripts. Proceed,")@/ 16696("and I'll try to forget that I needed that character."); 16697error; cur_i:=null_character; math_type(a):=empty; 16698end 16699 16700@ The outputs of |fetch| are placed in global variables. 16701 16702@<Glob...@>= 16703@!cur_f:internal_font_number; {the |font| field of a |math_char|} 16704@!cur_c:integer; {the |character| field of a |math_char|} 16705@!cur_i:four_quarters; {the |char_info| of a |math_char|, 16706 or a lig/kern instruction} 16707 16708@ We need to do a lot of different things, so |mlist_to_hlist| makes two 16709passes over the given mlist. 16710 16711The first pass does most of the processing: It removes ``mu'' spacing from 16712glue, it recursively evaluates all subsidiary mlists so that only the 16713top-level mlist remains to be handled, it puts fractions and square roots 16714and such things into boxes, it attaches subscripts and superscripts, and 16715it computes the overall height and depth of the top-level mlist so that 16716the size of delimiters for a |left_noad| and a |right_noad| will be known. 16717The hlist resulting from each noad is recorded in that noad's |new_hlist| 16718field, an integer field that replaces the |nucleus| or |thickness|. 16719@^recursion@> 16720 16721The second pass eliminates all noads and inserts the correct glue and 16722penalties between nodes. 16723 16724@d new_hlist(#)==mem[nucleus(#)].int {the translation of an mlist} 16725 16726@ Here is the overall plan of |mlist_to_hlist|, and the list of its 16727local variables. 16728 16729@d done_with_noad=80 {go here when a noad has been fully translated} 16730@d done_with_node=81 {go here when a node has been fully converted} 16731@d check_dimensions=82 {go here to update |max_h| and |max_d|} 16732@d delete_q=83 {go here to delete |q| and move to the next node} 16733 16734@p@t\4@>@<Declare math construction procedures@> 16735procedure mlist_to_hlist; 16736label reswitch, check_dimensions, done_with_noad, done_with_node, delete_q, 16737 done; 16738var mlist:pointer; {beginning of the given list} 16739@!penalties:boolean; {should penalty nodes be inserted?} 16740@!style:small_number; {the given style} 16741@!save_style:small_number; {holds |cur_style| during recursion} 16742@!q:pointer; {runs through the mlist} 16743@!r:pointer; {the most recent noad preceding |q|} 16744@!r_type:small_number; {the |type| of noad |r|, or |op_noad| if |r=null|} 16745@!t:small_number; {the effective |type| of noad |q| during the second pass} 16746@!p,@!x,@!y,@!z: pointer; {temporary registers for list construction} 16747@!pen:integer; {a penalty to be inserted} 16748@!s:small_number; {the size of a noad to be deleted} 16749@!max_h,@!max_d:scaled; {maximum height and depth of the list translated so far} 16750@!delta:scaled; {offset between subscript and superscript} 16751begin mlist:=cur_mlist; penalties:=mlist_penalties; 16752style:=cur_style; {tuck global parameters away as local variables} 16753q:=mlist; r:=null; r_type:=op_noad; max_h:=0; max_d:=0; 16754@<Set up the values of |cur_size| and |cur_mu|, based on |cur_style|@>; 16755while q<>null do @<Process node-or-noad |q| as much as possible in preparation 16756 for the second pass of |mlist_to_hlist|, then move to the next 16757 item in the mlist@>; 16758@<Convert \(a)a final |bin_noad| to an |ord_noad|@>; 16759@<Make a second pass over the mlist, removing all noads and inserting the 16760 proper spacing and penalties@>; 16761end; 16762 16763@ We use the fact that no character nodes appear in an mlist, hence 16764the field |type(q)| is always present. 16765 16766@<Process node-or-noad...@>= 16767begin @<Do first-pass processing based on |type(q)|; |goto done_with_noad| 16768 if a noad has been fully processed, |goto check_dimensions| if it 16769 has been translated into |new_hlist(q)|, or |goto done_with_node| 16770 if a node has been fully processed@>; 16771check_dimensions: z:=hpack(new_hlist(q),natural); 16772if height(z)>max_h then max_h:=height(z); 16773if depth(z)>max_d then max_d:=depth(z); 16774free_node(z,box_node_size); 16775done_with_noad: r:=q; r_type:=type(r); 16776if r_type=right_noad then 16777 begin r_type:=left_noad; cur_style:=style; @<Set up the values...@>; 16778 end; 16779done_with_node: q:=link(q); 16780end 16781 16782@ One of the things we must do on the first pass is change a |bin_noad| to 16783an |ord_noad| if the |bin_noad| is not in the context of a binary operator. 16784The values of |r| and |r_type| make this fairly easy. 16785 16786@<Do first-pass processing...@>= 16787reswitch: delta:=0; 16788case type(q) of 16789bin_noad: case r_type of 16790 bin_noad,op_noad,rel_noad,open_noad,punct_noad,left_noad: 16791 begin type(q):=ord_noad; goto reswitch; 16792 end; 16793 othercases do_nothing 16794 endcases; 16795rel_noad,close_noad,punct_noad,right_noad: begin@t@>@;@/ 16796 @<Convert \(a)a final |bin_noad| to an |ord_noad|@>; 16797 if type(q)=right_noad then goto done_with_noad; 16798 end; 16799@t\4@>@<Cases for noads that can follow a |bin_noad|@>@; 16800@t\4@>@<Cases for nodes that can appear in an mlist, after which we 16801 |goto done_with_node|@>@; 16802othercases confusion("mlist1") 16803@:this can't happen mlist1}{\quad mlist1@> 16804endcases;@/ 16805@<Convert \(n)|nucleus(q)| to an hlist and attach the sub/superscripts@> 16806 16807@ @<Convert \(a)a final |bin_noad| to an |ord_noad|@>= 16808if r_type=bin_noad then type(r):=ord_noad 16809 16810@ @<Cases for nodes that can appear in an mlist...@>= 16811style_node: begin cur_style:=subtype(q); 16812 @<Set up the values of |cur_size| and |cur_mu|, based on |cur_style|@>; 16813 goto done_with_node; 16814 end; 16815choice_node: @<Change this node to a style node followed by the correct choice, 16816 then |goto done_with_node|@>; 16817ins_node,mark_node,adjust_node, 16818 whatsit_node,penalty_node,disc_node: goto done_with_node; 16819rule_node: begin if height(q)>max_h then max_h:=height(q); 16820 if depth(q)>max_d then max_d:=depth(q); goto done_with_node; 16821 end; 16822glue_node: begin @<Convert \(m)math glue to ordinary glue@>; 16823 goto done_with_node; 16824 end; 16825kern_node: begin math_kern(q,cur_mu); goto done_with_node; 16826 end; 16827 16828@ @d choose_mlist(#)==begin p:=#(q); #(q):=null;@+end 16829 16830@<Change this node to a style node...@>= 16831begin case cur_style div 2 of 168320: choose_mlist(display_mlist); {|display_style=0|} 168331: choose_mlist(text_mlist); {|text_style=2|} 168342: choose_mlist(script_mlist); {|script_style=4|} 168353: choose_mlist(script_script_mlist); {|script_script_style=6|} 16836end; {there are no other cases} 16837flush_node_list(display_mlist(q)); 16838flush_node_list(text_mlist(q)); 16839flush_node_list(script_mlist(q)); 16840flush_node_list(script_script_mlist(q));@/ 16841type(q):=style_node; subtype(q):=cur_style; width(q):=0; depth(q):=0; 16842if p<>null then 16843 begin z:=link(q); link(q):=p; 16844 while link(p)<>null do p:=link(p); 16845 link(p):=z; 16846 end; 16847goto done_with_node; 16848end 16849 16850@ Conditional math glue (`\.{\\nonscript}') results in a |glue_node| 16851pointing to |zero_glue|, with |subtype(q)=cond_math_glue|; in such a case 16852the node following will be eliminated if it is a glue or kern node and if the 16853current size is different from |text_size|. Unconditional math glue 16854(`\.{\\muskip}') is converted to normal glue by multiplying the dimensions 16855by |cur_mu|. 16856@!@:non_script_}{\.{\\nonscript} primitive@> 16857 16858@<Convert \(m)math glue to ordinary glue@>= 16859if subtype(q)=mu_glue then 16860 begin x:=glue_ptr(q); 16861 y:=math_glue(x,cur_mu); delete_glue_ref(x); glue_ptr(q):=y; 16862 subtype(q):=normal; 16863 end 16864else if (cur_size<>text_size)and(subtype(q)=cond_math_glue) then 16865 begin p:=link(q); 16866 if p<>null then if (type(p)=glue_node)or(type(p)=kern_node) then 16867 begin link(q):=link(p); link(p):=null; flush_node_list(p); 16868 end; 16869 end 16870 16871@ @<Cases for noads that can follow a |bin_noad|@>= 16872left_noad: goto done_with_noad; 16873fraction_noad: begin make_fraction(q); goto check_dimensions; 16874 end; 16875op_noad: begin delta:=make_op(q); 16876 if subtype(q)=limits then goto check_dimensions; 16877 end; 16878ord_noad: make_ord(q); 16879open_noad,inner_noad: do_nothing; 16880radical_noad: make_radical(q); 16881over_noad: make_over(q); 16882under_noad: make_under(q); 16883accent_noad: make_math_accent(q); 16884vcenter_noad: make_vcenter(q); 16885 16886@ Most of the actual construction work of |mlist_to_hlist| is done 16887by procedures with names 16888like |make_fraction|, |make_radical|, etc. To illustrate 16889the general setup of such procedures, let's begin with a couple of 16890simple ones. 16891 16892@<Declare math...@>= 16893procedure make_over(@!q:pointer); 16894begin info(nucleus(q)):=@| 16895 overbar(clean_box(nucleus(q),cramped_style(cur_style)),@| 16896 3*default_rule_thickness,default_rule_thickness); 16897math_type(nucleus(q)):=sub_box; 16898end; 16899 16900@ @<Declare math...@>= 16901procedure make_under(@!q:pointer); 16902var p,@!x,@!y: pointer; {temporary registers for box construction} 16903@!delta:scaled; {overall height plus depth} 16904begin x:=clean_box(nucleus(q),cur_style); 16905p:=new_kern(3*default_rule_thickness); link(x):=p; 16906link(p):=fraction_rule(default_rule_thickness); 16907y:=vpack(x,natural); 16908delta:=height(y)+depth(y)+default_rule_thickness; 16909height(y):=height(x); depth(y):=delta-height(y); 16910info(nucleus(q)):=y; math_type(nucleus(q)):=sub_box; 16911end; 16912 16913@ @<Declare math...@>= 16914procedure make_vcenter(@!q:pointer); 16915var v:pointer; {the box that should be centered vertically} 16916@!delta:scaled; {its height plus depth} 16917begin v:=info(nucleus(q)); 16918if type(v)<>vlist_node then confusion("vcenter"); 16919@:this can't happen vcenter}{\quad vcenter@> 16920delta:=height(v)+depth(v); 16921height(v):=axis_height(cur_size)+half(delta); 16922depth(v):=delta-height(v); 16923end; 16924 16925@ According to the rules in the \.{DVI} file specifications, we ensure alignment 16926@^square roots@> 16927between a square root sign and the rule above its nucleus by assuming that the 16928baseline of the square-root symbol is the same as the bottom of the rule. The 16929height of the square-root symbol will be the thickness of the rule, and the 16930depth of the square-root symbol should exceed or equal the height-plus-depth 16931of the nucleus plus a certain minimum clearance~|clr|. The symbol will be 16932placed so that the actual clearance is |clr| plus half the excess. 16933 16934@<Declare math...@>= 16935procedure make_radical(@!q:pointer); 16936var x,@!y:pointer; {temporary registers for box construction} 16937f:internal_font_number; 16938rule_thickness:scaled; {rule thickness} 16939@!delta,@!clr:scaled; {dimensions involved in the calculation} 16940begin f:=fam_fnt(small_fam(left_delimiter(q)) + cur_size); 16941if is_new_mathfont(f) then rule_thickness:=get_ot_math_constant(f,radicalRuleThickness) 16942else rule_thickness:=default_rule_thickness; 16943x:=clean_box(nucleus(q),cramped_style(cur_style)); 16944if is_new_mathfont(f) then begin 16945 if cur_style<text_style then {display style} 16946 clr:=get_ot_math_constant(f,radicalDisplayStyleVerticalGap) 16947 else clr:=get_ot_math_constant(f,radicalVerticalGap); 16948end else begin 16949 if cur_style<text_style then {display style} 16950 clr:=rule_thickness+(abs(math_x_height(cur_size)) div 4) 16951 else begin clr:=rule_thickness; clr:=clr + (abs(clr) div 4); 16952 end; 16953end; 16954y:=var_delimiter(left_delimiter(q),cur_size,height(x)+depth(x)+clr+rule_thickness); 16955if is_new_mathfont(f) then begin 16956 depth(y):=height(y)+depth(y)-rule_thickness; 16957 height(y):=rule_thickness; 16958end; 16959delta:=depth(y)-(height(x)+depth(x)+clr); 16960if delta>0 then clr:=clr+half(delta); {increase the actual clearance} 16961shift_amount(y):=-(height(x)+clr); 16962link(y):=overbar(x,clr,height(y)); 16963info(nucleus(q)):=hpack(y,natural); math_type(nucleus(q)):=sub_box; 16964end; 16965 16966@ Slants are not considered when placing accents in math mode. The accenter is 16967centered over the accentee, and the accent width is treated as zero with 16968respect to the size of the final box. 16969 16970@<Declare math...@>= 16971function compute_ot_math_accent_pos(@!p:pointer):scaled; 16972var 16973 @!q,@!r:pointer; 16974 @!s,@!g:scaled; 16975begin 16976 if (math_type(nucleus(p))=math_char) then begin 16977 fetch(nucleus(p)); 16978 q:=new_native_character(cur_f, qo(cur_c)); 16979 g:=get_native_glyph(q, 0); 16980 s:=get_ot_math_accent_pos(cur_f, g); 16981 end else begin 16982 if (math_type(nucleus(p))=sub_mlist) then begin 16983 r:=info(nucleus(p)); 16984 if (r<>null) and (type(r)=accent_noad) then 16985 s:=compute_ot_math_accent_pos(r) 16986 else 16987 s:=@"7FFFFFFF; 16988 end else 16989 s:=@"7FFFFFFF; 16990 end; 16991 compute_ot_math_accent_pos:=s; 16992end; 16993 16994procedure make_math_accent(@!q:pointer); 16995label done,done1; 16996var p,@!x,@!y:pointer; {temporary registers for box construction} 16997@!a:integer; {address of lig/kern instruction} 16998@!c,@!g:integer; {accent character} 16999@!f:internal_font_number; {its font} 17000@!i:four_quarters; {its |char_info|} 17001@!s,@!sa:scaled; {amount to skew the accent to the right} 17002@!h:scaled; {height of character being accented} 17003@!delta:scaled; {space to remove between accent and accentee} 17004@!w,@!w2:scaled; {width of the accentee, not including sub/superscripts} 17005@!ot_assembly_ptr:void_pointer; 17006begin fetch(accent_chr(q)); 17007x:=null; 17008if is_native_font(cur_f) then 17009 begin c:=cur_c; f:=cur_f; 17010 if not is_bottom_acc(q) then s:=compute_ot_math_accent_pos(q) else s:=0; 17011 x:=clean_box(nucleus(q),cramped_style(cur_style)); w:=width(x); h:=height(x); 17012 end 17013else if char_exists(cur_i) then 17014 begin i:=cur_i; c:=cur_c; f:=cur_f;@/ 17015 @<Compute the amount of skew@>; 17016 x:=clean_box(nucleus(q),cramped_style(cur_style)); w:=width(x); h:=height(x); 17017 @<Switch to a larger accent if available and appropriate@>; 17018 end; 17019if x<>null then begin 17020 if is_new_mathfont(f) then 17021 if is_bottom_acc(q) then delta:=0 17022 else if h<get_ot_math_constant(f, accentBaseHeight) then delta:=h@+else delta:=get_ot_math_constant(f, accentBaseHeight) 17023 else 17024 if h<x_height(f) then delta:=h@+else delta:=x_height(f); 17025 if (math_type(supscr(q))<>empty)or(math_type(subscr(q))<>empty) then 17026 if math_type(nucleus(q))=math_char then 17027 @<Swap the subscript and superscript into box |x|@>; 17028 y:=char_box(f,c); 17029 if is_native_font(f) then begin 17030 {turn the |native_word| node into a |native_glyph| one} 17031 p:=get_node(glyph_node_size); 17032 type(p):=whatsit_node; subtype(p):=glyph_node; 17033 native_font(p):=f; native_glyph(p):=get_native_glyph(list_ptr(y), 0); 17034 set_native_glyph_metrics(p, 1); 17035 free_node(list_ptr(y), native_size(list_ptr(y))); 17036 list_ptr(y):=p; 17037 17038 @<Switch to a larger native-font accent if available and appropriate@>; 17039 17040 {determine horiz positioning} 17041 if is_glyph_node(p) then 17042 begin sa:=get_ot_math_accent_pos(f,native_glyph(p)); 17043 if sa=@"7FFFFFFF then sa:=half(width(y)); 17044 end 17045 else sa:=half(width(y)); 17046 if is_bottom_acc(q) or (s=@"7FFFFFFF) then s:=half(w); 17047 shift_amount(y):=s-sa; 17048 end else 17049 shift_amount(y):=s+half(w-width(y)); 17050 width(y):=0; 17051 if is_bottom_acc(q) then begin 17052 link(x):=y; y:=vpack(x,natural); shift_amount(y):=-(h - height(y)); 17053 end else begin 17054 p:=new_kern(-delta); link(p):=x; link(y):=p; y:=vpack(y,natural); 17055 if height(y)<h then @<Make the height of box |y| equal to |h|@>; 17056 end; 17057 width(y):=width(x); 17058 info(nucleus(q)):=y; 17059 math_type(nucleus(q)):=sub_box; 17060 end; 17061end; 17062 17063@ @<Make the height of box |y|...@>= 17064begin p:=new_kern(h-height(y)); link(p):=list_ptr(y); list_ptr(y):=p; 17065height(y):=h; 17066end 17067 17068@ @<Switch to a larger native-font accent if available and appropriate@>= 17069 if odd(subtype(q)) then {non growing accent} 17070 set_native_glyph_metrics(p, 1) 17071 else begin 17072 c:=native_glyph(p); 17073 a:=0; 17074 repeat 17075 g:=get_ot_math_variant(f, c, a, addressof(w2), 1); 17076 if (w2>0) and (w2<=w) then begin 17077 native_glyph(p):=g; 17078 set_native_glyph_metrics(p, 1); 17079 incr(a); 17080 end; 17081 until (w2<0) or (w2>=w); 17082 if (w2<0) then begin 17083 ot_assembly_ptr:=get_ot_assembly_ptr(f, c, 1); 17084 if ot_assembly_ptr<>nil then begin 17085 free_node(p,glyph_node_size); 17086 p:=build_opentype_assembly(f, ot_assembly_ptr, w, 1); 17087 list_ptr(y):=p; 17088 goto found; 17089 end; 17090 end else 17091 set_native_glyph_metrics(p, 1); 17092 end; 17093found: 17094 width(y):=width(p); height(y):=height(p); depth(y):=depth(p); 17095 if is_bottom_acc(q) then begin 17096 if height(y)<0 then height(y):=0 17097 end else if depth(y)<0 then depth(y):=0; 17098 17099@ @<Switch to a larger accent if available and appropriate@>= 17100loop@+ begin if char_tag(i)<>list_tag then goto done; 17101 y:=rem_byte(i); 17102 i:=char_info(f)(y); 17103 if not char_exists(i) then goto done; 17104 if char_width(f)(i)>w then goto done; 17105 c:=y; 17106 end; 17107done: 17108 17109@ @<Compute the amount of skew@>= 17110s:=0; 17111if math_type(nucleus(q))=math_char then 17112 begin fetch(nucleus(q)); 17113 if char_tag(cur_i)=lig_tag then 17114 begin a:=lig_kern_start(cur_f)(cur_i); 17115 cur_i:=font_info[a].qqqq; 17116 if skip_byte(cur_i)>stop_flag then 17117 begin a:=lig_kern_restart(cur_f)(cur_i); 17118 cur_i:=font_info[a].qqqq; 17119 end; 17120 loop@+ begin if qo(next_char(cur_i))=skew_char[cur_f] then 17121 begin if op_byte(cur_i)>=kern_flag then 17122 if skip_byte(cur_i)<=stop_flag then s:=char_kern(cur_f)(cur_i); 17123 goto done1; 17124 end; 17125 if skip_byte(cur_i)>=stop_flag then goto done1; 17126 a:=a+qo(skip_byte(cur_i))+1; 17127 cur_i:=font_info[a].qqqq; 17128 end; 17129 end; 17130 end; 17131done1: 17132 17133@ @<Swap the subscript and superscript into box |x|@>= 17134begin flush_node_list(x); x:=new_noad; 17135mem[nucleus(x)]:=mem[nucleus(q)]; 17136mem[supscr(x)]:=mem[supscr(q)]; 17137mem[subscr(x)]:=mem[subscr(q)];@/ 17138mem[supscr(q)].hh:=empty_field; 17139mem[subscr(q)].hh:=empty_field;@/ 17140math_type(nucleus(q)):=sub_mlist; info(nucleus(q)):=x; 17141x:=clean_box(nucleus(q),cur_style); delta:=delta+height(x)-h; h:=height(x); 17142end 17143 17144@ The |make_fraction| procedure is a bit different because it sets 17145|new_hlist(q)| directly rather than making a sub-box. 17146 17147@<Declare math...@>= 17148procedure make_fraction(@!q:pointer); 17149var p,@!v,@!x,@!y,@!z:pointer; {temporary registers for box construction} 17150@!delta,@!delta1,@!delta2,@!shift_up,@!shift_down,@!clr:scaled; 17151 {dimensions for box calculations} 17152begin if thickness(q)=default_code then thickness(q):=default_rule_thickness; 17153@<Create equal-width boxes |x| and |z| for the numerator and denominator, 17154 and compute the default amounts |shift_up| and |shift_down| by which they 17155 are displaced from the baseline@>; 17156if thickness(q)=0 then @<Adjust \(s)|shift_up| and |shift_down| for the case 17157 of no fraction line@> 17158else @<Adjust \(s)|shift_up| and |shift_down| for the case of a fraction line@>; 17159@<Construct a vlist box for the fraction, according to |shift_up| and 17160 |shift_down|@>; 17161@<Put the \(f)fraction into a box with its delimiters, and make |new_hlist(q)| 17162 point to it@>; 17163end; 17164 17165@ @<Create equal-width boxes |x| and |z| for the numerator and denom...@>= 17166x:=clean_box(numerator(q),num_style(cur_style)); 17167z:=clean_box(denominator(q),denom_style(cur_style)); 17168if width(x)<width(z) then x:=rebox(x,width(z)) 17169else z:=rebox(z,width(x)); 17170if cur_style<text_style then {display style} 17171 begin shift_up:=num1(cur_size); shift_down:=denom1(cur_size); 17172 end 17173else begin shift_down:=denom2(cur_size); 17174 if thickness(q)<>0 then shift_up:=num2(cur_size) 17175 else shift_up:=num3(cur_size); 17176 end 17177 17178@ The numerator and denominator must be separated by a certain minimum 17179clearance, called |clr| in the following program. The difference between 17180|clr| and the actual clearance is |2delta|. 17181 17182@<Adjust \(s)|shift_up| and |shift_down| for the case of no fraction line@>= 17183begin if is_new_mathfont(cur_f) then begin 17184 if cur_style<text_style then clr:=get_ot_math_constant(cur_f, stackDisplayStyleGapMin) 17185 else clr:=get_ot_math_constant(cur_f, stackGapMin); 17186end else begin 17187 if cur_style<text_style then clr:=7*default_rule_thickness 17188 else clr:=3*default_rule_thickness; 17189end; 17190delta:=half(clr-((shift_up-depth(x))-(height(z)-shift_down))); 17191if delta>0 then 17192 begin shift_up:=shift_up+delta; 17193 shift_down:=shift_down+delta; 17194 end; 17195end 17196 17197@ In the case of a fraction line, the minimum clearance depends on the actual 17198thickness of the line. 17199 17200@<Adjust \(s)|shift_up| and |shift_down| for the case of a fraction line@>= 17201begin if is_new_mathfont(cur_f) then begin 17202 delta:=half(thickness(q)); 17203 if cur_style<text_style then clr:=get_ot_math_constant(cur_f, fractionNumDisplayStyleGapMin) 17204 else clr:=get_ot_math_constant(cur_f, fractionNumeratorGapMin); 17205 delta1:=clr-((shift_up-depth(x))-(axis_height(cur_size)+delta)); 17206 if cur_style<text_style then clr:=get_ot_math_constant(cur_f, fractionDenomDisplayStyleGapMin) 17207 else clr:=get_ot_math_constant(cur_f, fractionDenominatorGapMin); 17208 delta2:=clr-((axis_height(cur_size)-delta)-(height(z)-shift_down)); 17209end else begin 17210 if cur_style<text_style then clr:=3*thickness(q) 17211 else clr:=thickness(q); 17212 delta:=half(thickness(q)); 17213 delta1:=clr-((shift_up-depth(x))-(axis_height(cur_size)+delta)); 17214 delta2:=clr-((axis_height(cur_size)-delta)-(height(z)-shift_down)); 17215end; 17216if delta1>0 then shift_up:=shift_up+delta1; 17217if delta2>0 then shift_down:=shift_down+delta2; 17218end 17219 17220@ @<Construct a vlist box for the fraction...@>= 17221v:=new_null_box; type(v):=vlist_node; 17222height(v):=shift_up+height(x); depth(v):=depth(z)+shift_down; 17223width(v):=width(x); {this also equals |width(z)|} 17224if thickness(q)=0 then 17225 begin p:=new_kern((shift_up-depth(x))-(height(z)-shift_down)); 17226 link(p):=z; 17227 end 17228else begin y:=fraction_rule(thickness(q));@/ 17229 p:=new_kern((axis_height(cur_size)-delta)-@|(height(z)-shift_down));@/ 17230 link(y):=p; link(p):=z;@/ 17231 p:=new_kern((shift_up-depth(x))-(axis_height(cur_size)+delta)); 17232 link(p):=y; 17233 end; 17234link(x):=p; list_ptr(v):=x 17235 17236@ @<Put the \(f)fraction into a box with its delimiters...@>= 17237if cur_style<text_style then delta:=delim1(cur_size) 17238else delta:=delim2(cur_size); 17239x:=var_delimiter(left_delimiter(q), cur_size, delta); link(x):=v;@/ 17240z:=var_delimiter(right_delimiter(q), cur_size, delta); link(v):=z;@/ 17241new_hlist(q):=hpack(x,natural) 17242 17243@ If the nucleus of an |op_noad| is a single character, it is to be 17244centered vertically with respect to the axis, after first being enlarged 17245(via a character list in the font) if we are in display style. The normal 17246convention for placing displayed limits is to put them above and below the 17247operator in display style. 17248 17249The italic correction is removed from the character if there is a subscript 17250and the limits are not being displayed. The |make_op| 17251routine returns the value that should be used as an offset between 17252subscript and superscript. 17253 17254After |make_op| has acted, |subtype(q)| will be |limits| if and only if 17255the limits have been set above and below the operator. In that case, 17256|new_hlist(q)| will already contain the desired final box. 17257 17258@<Declare math...@>= 17259function make_op(@!q:pointer):scaled; 17260label found; 17261var delta:scaled; {offset between subscript and superscript} 17262@!p,@!v,@!x,@!y,@!z:pointer; {temporary registers for box construction} 17263@!c:quarterword;@+@!i:four_quarters; {registers for character examination} 17264@!shift_up,@!shift_down:scaled; {dimensions for box calculation} 17265@!h1,@!h2:scaled; {height of original text-style symbol and possible replacement} 17266@!n,@!g:integer; {potential variant index and glyph code} 17267@!ot_assembly_ptr:void_pointer; 17268@!save_f:internal_font_number; 17269begin if (subtype(q)=normal)and(cur_style<text_style) then 17270 subtype(q):=limits; 17271delta:=0; 17272if math_type(nucleus(q))=math_char then 17273 begin fetch(nucleus(q)); 17274 if not is_ot_font(cur_f) then begin 17275 if (cur_style<text_style)and(char_tag(cur_i)=list_tag) then {make it larger} 17276 begin c:=rem_byte(cur_i); i:=char_info(cur_f)(c); 17277 if char_exists(i) then 17278 begin cur_c:=c; cur_i:=i; character(nucleus(q)):=c; 17279 end; 17280 end; 17281 delta:=char_italic(cur_f)(cur_i); 17282 end; 17283 x:=clean_box(nucleus(q),cur_style); 17284 if is_new_mathfont(cur_f) then begin 17285 p:=list_ptr(x); 17286 if is_glyph_node(p) then begin 17287 if cur_style<text_style then begin 17288 {try to replace the operator glyph with a display-size variant, 17289 ensuring it is larger than the text size} 17290 h1:=get_ot_math_constant(cur_f,displayOperatorMinHeight); 17291 if h1<(height(p)+depth(p))*5/4 then h1:=(height(p)+depth(p))*5/4; 17292 c:=native_glyph(p); 17293 n:=0; 17294 repeat 17295 g:=get_ot_math_variant(cur_f, c, n, addressof(h2), 0); 17296 if h2>0 then begin 17297 native_glyph(p):=g; 17298 set_native_glyph_metrics(p, 1); 17299 end; 17300 incr(n); 17301 until (h2<0) or (h2>=h1); 17302 if (h2<0) then begin 17303 {if we get here, then we didn't find a big enough glyph; check if the char is extensible} 17304 ot_assembly_ptr:=get_ot_assembly_ptr(cur_f, c, 0); 17305 if ot_assembly_ptr<>nil then begin 17306 free_node(p,glyph_node_size); 17307 p:=build_opentype_assembly(cur_f, ot_assembly_ptr, h1, 0); 17308 list_ptr(x):=p; 17309 delta:=0; 17310 goto found; 17311 end; 17312 end else 17313 set_native_glyph_metrics(p, 1); 17314 end; 17315 delta:=get_ot_math_ital_corr(cur_f, native_glyph(p)); 17316found: 17317 width(x):=width(p); height(x):=height(p); depth(x):=depth(p); 17318 end 17319 end; 17320 if (math_type(subscr(q))<>empty)and(subtype(q)<>limits) then 17321 width(x):=width(x)-delta; {remove italic correction} 17322 shift_amount(x):=half(height(x)-depth(x)) - axis_height(cur_size); 17323 {center vertically} 17324 math_type(nucleus(q)):=sub_box; info(nucleus(q)):=x; 17325 end; 17326save_f:=cur_f; 17327if subtype(q)=limits then 17328 @<Construct a box with limits above and below it, skewed by |delta|@>; 17329make_op:=delta; 17330end; 17331 17332@ The following program builds a vlist box |v| for displayed limits. The 17333width of the box is not affected by the fact that the limits may be skewed. 17334 17335@<Construct a box with limits above and below it...@>= 17336begin x:=clean_box(supscr(q),sup_style(cur_style)); 17337y:=clean_box(nucleus(q),cur_style); 17338z:=clean_box(subscr(q),sub_style(cur_style)); 17339v:=new_null_box; type(v):=vlist_node; width(v):=width(y); 17340if width(x)>width(v) then width(v):=width(x); 17341if width(z)>width(v) then width(v):=width(z); 17342x:=rebox(x,width(v)); y:=rebox(y,width(v)); z:=rebox(z,width(v));@/ 17343shift_amount(x):=half(delta); shift_amount(z):=-shift_amount(x); 17344height(v):=height(y); depth(v):=depth(y); 17345@<Attach the limits to |y| and adjust |height(v)|, |depth(v)| to 17346 account for their presence@>; 17347new_hlist(q):=v; 17348end 17349 17350@ We use |shift_up| and |shift_down| in the following program for the 17351amount of glue between the displayed operator |y| and its limits |x| and 17352|z|. The vlist inside box |v| will consist of |x| followed by |y| followed 17353by |z|, with kern nodes for the spaces between and around them. 17354 17355@<Attach the limits to |y| and adjust |height(v)|, |depth(v)|...@>= 17356cur_f:=save_f; 17357if math_type(supscr(q))=empty then 17358 begin free_node(x,box_node_size); list_ptr(v):=y; 17359 end 17360else begin shift_up:=big_op_spacing3-depth(x); 17361 if shift_up<big_op_spacing1 then shift_up:=big_op_spacing1; 17362 p:=new_kern(shift_up); link(p):=y; link(x):=p;@/ 17363 p:=new_kern(big_op_spacing5); link(p):=x; list_ptr(v):=p; 17364 height(v):=height(v)+big_op_spacing5+height(x)+depth(x)+shift_up; 17365 end; 17366if math_type(subscr(q))=empty then free_node(z,box_node_size) 17367else begin shift_down:=big_op_spacing4-height(z); 17368 if shift_down<big_op_spacing2 then shift_down:=big_op_spacing2; 17369 p:=new_kern(shift_down); link(y):=p; link(p):=z;@/ 17370 p:=new_kern(big_op_spacing5); link(z):=p; 17371 depth(v):=depth(v)+big_op_spacing5+height(z)+depth(z)+shift_down; 17372 end 17373 17374@ A ligature found in a math formula does not create a |ligature_node|, because 17375there is no question of hyphenation afterwards; the ligature will simply be 17376stored in an ordinary |char_node|, after residing in an |ord_noad|. 17377 17378The |math_type| is converted to |math_text_char| here if we would not want to 17379apply an italic correction to the current character unless it belongs 17380to a math font (i.e., a font with |space=0|). 17381 17382No boundary characters enter into these ligatures. 17383 17384@<Declare math...@>= 17385procedure make_ord(@!q:pointer); 17386label restart,exit; 17387var a:integer; {address of lig/kern instruction} 17388@!p,@!r:pointer; {temporary registers for list manipulation} 17389begin restart:@t@>@;@/ 17390if math_type(subscr(q))=empty then if math_type(supscr(q))=empty then 17391 if math_type(nucleus(q))=math_char then 17392 begin p:=link(q); 17393 if p<>null then if (type(p)>=ord_noad)and(type(p)<=punct_noad) then 17394 if math_type(nucleus(p))=math_char then 17395 if fam(nucleus(p))=fam(nucleus(q)) then 17396 begin math_type(nucleus(q)):=math_text_char; 17397 fetch(nucleus(q)); 17398 if char_tag(cur_i)=lig_tag then 17399 begin a:=lig_kern_start(cur_f)(cur_i); 17400 cur_c:=character(nucleus(p)); 17401 cur_i:=font_info[a].qqqq; 17402 if skip_byte(cur_i)>stop_flag then 17403 begin a:=lig_kern_restart(cur_f)(cur_i); 17404 cur_i:=font_info[a].qqqq; 17405 end; 17406 loop@+ begin @<If instruction |cur_i| is a kern with |cur_c|, attach 17407 the kern after~|q|; or if it is a ligature with |cur_c|, combine 17408 noads |q| and~|p| appropriately; then |return| if the cursor has 17409 moved past a noad, or |goto restart|@>; 17410 if skip_byte(cur_i)>=stop_flag then return; 17411 a:=a+qo(skip_byte(cur_i))+1; 17412 cur_i:=font_info[a].qqqq; 17413 end; 17414 end; 17415 end; 17416 end; 17417exit:end; 17418 17419@ Note that a ligature between an |ord_noad| and another kind of noad 17420is replaced by an |ord_noad|, when the two noads collapse into one. 17421But we could make a parenthesis (say) change shape when it follows 17422certain letters. Presumably a font designer will define such 17423ligatures only when this convention makes sense. 17424 17425\chardef\?='174 % vertical line to indicate character retention 17426 17427@<If instruction |cur_i| is a kern with |cur_c|, ...@>= 17428if next_char(cur_i)=cur_c then if skip_byte(cur_i)<=stop_flag then 17429 if op_byte(cur_i)>=kern_flag then 17430 begin p:=new_kern(char_kern(cur_f)(cur_i)); 17431 link(p):=link(q); link(q):=p; return; 17432 end 17433 else begin check_interrupt; {allow a way out of infinite ligature loop} 17434 case op_byte(cur_i) of 17435 qi(1),qi(5): character(nucleus(q)):=rem_byte(cur_i); {\.{=:\?}, \.{=:\?>}} 17436 qi(2),qi(6): character(nucleus(p)):=rem_byte(cur_i); {\.{\?=:}, \.{\?=:>}} 17437 qi(3),qi(7),qi(11):begin r:=new_noad; {\.{\?=:\?}, \.{\?=:\?>}, \.{\?=:\?>>}} 17438 character(nucleus(r)):=rem_byte(cur_i); 17439 plane_and_fam_field(nucleus(r)):=fam(nucleus(q));@/ 17440 link(q):=r; link(r):=p; 17441 if op_byte(cur_i)<qi(11) then math_type(nucleus(r)):=math_char 17442 else math_type(nucleus(r)):=math_text_char; {prevent combination} 17443 end; 17444 othercases begin link(q):=link(p); 17445 character(nucleus(q)):=rem_byte(cur_i); {\.{=:}} 17446 mem[subscr(q)]:=mem[subscr(p)]; mem[supscr(q)]:=mem[supscr(p)];@/ 17447 free_node(p,noad_size); 17448 end 17449 endcases; 17450 if op_byte(cur_i)>qi(3) then return; 17451 math_type(nucleus(q)):=math_char; goto restart; 17452 end 17453 17454@ When we get to the following part of the program, we have ``fallen through'' 17455from cases that did not lead to |check_dimensions| or |done_with_noad| or 17456|done_with_node|. Thus, |q|~points to a noad whose nucleus may need to be 17457converted to an hlist, and whose subscripts and superscripts need to be 17458appended if they are present. 17459 17460If |nucleus(q)| is not a |math_char|, the variable |delta| is the amount 17461by which a superscript should be moved right with respect to a subscript 17462when both are present. 17463@^subscripts@> 17464@^superscripts@> 17465 17466@<Convert \(n)|nucleus(q)| to an hlist and attach the sub/superscripts@>= 17467case math_type(nucleus(q)) of 17468math_char, math_text_char: 17469 @<Create a character node |p| for |nucleus(q)|, possibly followed 17470 by a kern node for the italic correction, and set |delta| to the 17471 italic correction if a subscript is present@>; 17472empty: p:=null; 17473sub_box: p:=info(nucleus(q)); 17474sub_mlist: begin cur_mlist:=info(nucleus(q)); save_style:=cur_style; 17475 mlist_penalties:=false; mlist_to_hlist; {recursive call} 17476@^recursion@> 17477 cur_style:=save_style; @<Set up the values...@>; 17478 p:=hpack(link(temp_head),natural); 17479 end; 17480othercases confusion("mlist2") 17481@:this can't happen mlist2}{\quad mlist2@> 17482endcases;@/ 17483new_hlist(q):=p; 17484if (math_type(subscr(q))=empty)and(math_type(supscr(q))=empty) then 17485 goto check_dimensions; 17486make_scripts(q,delta) 17487 17488@ @<Create a character node |p| for |nucleus(q)|...@>= 17489begin fetch(nucleus(q)); 17490if is_native_font(cur_f) then begin 17491 z:=new_native_character(cur_f, qo(cur_c)); 17492 p:=get_node(glyph_node_size); 17493 type(p):=whatsit_node; subtype(p):=glyph_node; 17494 native_font(p):=cur_f; native_glyph(p):=get_native_glyph(z, 0); 17495 set_native_glyph_metrics(p, 1); 17496 free_node(z, native_size(z)); 17497 delta:=get_ot_math_ital_corr(cur_f, native_glyph(p)); 17498 if (math_type(nucleus(q))=math_text_char)and(not is_new_mathfont(cur_f)<>0) then 17499 delta:=0; {no italic correction in mid-word of text font} 17500 if (math_type(subscr(q))=empty)and(delta<>0) then 17501 begin link(p):=new_kern(delta); delta:=0; 17502 end; 17503end else if char_exists(cur_i) then 17504 begin delta:=char_italic(cur_f)(cur_i); p:=new_character(cur_f,qo(cur_c)); 17505 if (math_type(nucleus(q))=math_text_char)and(space(cur_f)<>0) then 17506 delta:=0; {no italic correction in mid-word of text font} 17507 if (math_type(subscr(q))=empty)and(delta<>0) then 17508 begin link(p):=new_kern(delta); delta:=0; 17509 end; 17510 end 17511else p:=null; 17512end 17513 17514@ The purpose of |make_scripts(q,delta)| is to attach the subscript and/or 17515superscript of noad |q| to the list that starts at |new_hlist(q)|, 17516given that the subscript and superscript aren't both empty. The superscript 17517will appear to the right of the subscript by a given distance |delta|. 17518 17519We set |shift_down| and |shift_up| to the minimum amounts to shift the 17520baseline of subscripts and superscripts based on the given nucleus. 17521 17522@<Declare math...@>= 17523function attach_hkern_to_new_hlist(@!q:pointer;@!delta:scaled):pointer; 17524var y,@!z:pointer; {temporary registers for box construction} 17525begin z:=new_kern(delta); 17526if new_hlist(q)=null then new_hlist(q):=z 17527else begin y:=new_hlist(q); 17528 while link(y)<>null do y:=link(y); 17529 link(y):=z; 17530 end; 17531attach_hkern_to_new_hlist:=new_hlist(q); 17532end; 17533 17534procedure make_scripts(@!q:pointer;@!delta:scaled); 17535var p,@!x,@!y,@!z:pointer; {temporary registers for box construction} 17536@!shift_up,@!shift_down,@!clr,@!sub_kern,@!sup_kern:scaled; {dimensions in the calculation} 17537@!script_c:pointer; {temprary native character for sub/superscript} 17538@!script_g:quarterword; {temporary register for sub/superscript native glyph id} 17539@!script_f:internal_font_number; {temporary register for sub/superscript font} 17540@!t:integer; {subsidiary size code} 17541@!save_f:internal_font_number; 17542begin p:=new_hlist(q); 17543script_c:=null; script_g:=0; script_f:=0; sup_kern:=0; sub_kern:=0; 17544if is_char_node(p) or is_glyph_node(p) then 17545 begin shift_up:=0; shift_down:=0; 17546 end 17547else begin z:=hpack(p,natural); 17548 if cur_style<script_style then t:=script_size@+else t:=script_script_size; 17549 shift_up:=height(z)-sup_drop(t); 17550 shift_down:=depth(z)+sub_drop(t); 17551 free_node(z,box_node_size); 17552 end; 17553if math_type(supscr(q))=empty then 17554 @<Construct a subscript box |x| when there is no superscript@> 17555else begin @<Construct a superscript box |x|@>; 17556 if math_type(subscr(q))=empty then shift_amount(x):=-shift_up 17557 else @<Construct a sub/superscript combination box |x|, with the 17558 superscript offset by |delta|@>; 17559 end; 17560if new_hlist(q)=null then new_hlist(q):=x 17561else begin p:=new_hlist(q); 17562 while link(p)<>null do p:=link(p); 17563 link(p):=x; 17564 end; 17565end; 17566 17567@ When there is a subscript without a superscript, the top of the subscript 17568should not exceed the baseline plus four-fifths of the x-height. 17569 17570@<Construct a subscript box |x| when there is no superscript@>= 17571begin 17572save_f:=cur_f; 17573x:=clean_box(subscr(q),sub_style(cur_style)); 17574cur_f:=save_f; 17575width(x):=width(x)+script_space; 17576if shift_down<sub1(cur_size) then shift_down:=sub1(cur_size); 17577if is_new_mathfont(cur_f) then 17578 clr:=height(x)-get_ot_math_constant(cur_f, subscriptTopMax) 17579else 17580 clr:=height(x)-(abs(math_x_height(cur_size)*4) div 5); 17581if shift_down<clr then shift_down:=clr; 17582shift_amount(x):=shift_down; 17583if is_new_mathfont(cur_f) then @<Attach subscript OpenType math kerning@> 17584end 17585 17586@ The bottom of a superscript should never descend below the baseline plus 17587one-fourth of the x-height. 17588 17589@<Construct a superscript box |x|@>= 17590begin 17591save_f:=cur_f; 17592x:=clean_box(supscr(q),sup_style(cur_style)); 17593cur_f:=save_f; 17594width(x):=width(x)+script_space; 17595if odd(cur_style) then clr:=sup3(cur_size) 17596else if cur_style<text_style then clr:=sup1(cur_size) 17597else clr:=sup2(cur_size); 17598if shift_up<clr then shift_up:=clr; 17599if is_new_mathfont(cur_f) then 17600 clr:=depth(x)+get_ot_math_constant(cur_f, superscriptBottomMin) 17601else 17602 clr:=depth(x)+(abs(math_x_height(cur_size)) div 4); 17603if shift_up<clr then shift_up:=clr; 17604if is_new_mathfont(cur_f) then @<Attach superscript OpenType math kerning@> 17605end 17606 17607@ When both subscript and superscript are present, the subscript must be 17608separated from the superscript by at least four times |default_rule_thickness|. 17609If this condition would be violated, the subscript moves down, after which 17610both subscript and superscript move up so that the bottom of the superscript 17611is at least as high as the baseline plus four-fifths of the x-height. 17612 17613@<Construct a sub/superscript combination box |x|...@>= 17614begin 17615save_f:=cur_f; 17616y:=clean_box(subscr(q),sub_style(cur_style)); 17617cur_f:=save_f; 17618width(y):=width(y)+script_space; 17619if shift_down<sub2(cur_size) then shift_down:=sub2(cur_size); 17620if is_new_mathfont(cur_f) then 17621 clr:=get_ot_math_constant(cur_f, subSuperscriptGapMin)-((shift_up-depth(x))-(height(y)-shift_down)) 17622else 17623 clr:=4*default_rule_thickness-((shift_up-depth(x))-(height(y)-shift_down)); 17624if clr>0 then 17625 begin shift_down:=shift_down+clr; 17626 if is_new_mathfont(cur_f) then 17627 clr:=get_ot_math_constant(cur_f, superscriptBottomMaxWithSubscript)-(shift_up-depth(x)) 17628 else 17629 clr:=(abs(math_x_height(cur_size)*4) div 5)-(shift_up-depth(x)); 17630 if clr>0 then 17631 begin shift_up:=shift_up+clr; 17632 shift_down:=shift_down-clr; 17633 end; 17634 end; 17635if is_new_mathfont(cur_f) then begin 17636 @<Attach subscript OpenType math kerning@>@/ 17637 @<Attach superscript OpenType math kerning@> 17638 end; 17639shift_amount(x):=sup_kern+delta-sub_kern; {superscript is |delta| to the right of the subscript} 17640p:=new_kern((shift_up-depth(x))-(height(y)-shift_down)); link(x):=p; link(p):=y; 17641x:=vpack(x,natural); shift_amount(x):=shift_down; 17642end 17643 17644@ OpenType math fonts provide an additional adjustment for the horizontal 17645position of sub/superscripts called math kerning. 17646 17647The following definitions should be kept in sync with \.{XeTeXOTMath.cpp}. 17648 17649@d sup_cmd=0 {superscript kern type for |get_ot_math_kern|} 17650@d sub_cmd=1 {subscript kern type for |get_ot_math_kern|} 17651 17652@<Attach subscript OpenType math kerning@>= 17653begin if math_type(subscr(q))=math_char then 17654 begin save_f:=cur_f; 17655 fetch(subscr(q)); 17656 if is_new_mathfont(cur_f) then 17657 begin script_c:=new_native_character(cur_f, qo(cur_c)); 17658 script_g:=get_native_glyph(script_c, 0); 17659 script_f:=cur_f; 17660 end else 17661 begin script_g:=0; script_f:=0 17662 end; 17663 cur_f:=save_f; 17664 end; 17665if is_glyph_node(p) then 17666 sub_kern:=get_ot_math_kern(native_font(p), native_glyph(p), script_f, script_g, sub_cmd, shift_down); 17667 17668if sub_kern<>0 then p:=attach_hkern_to_new_hlist(q, sub_kern); 17669end; 17670 17671@ @<Attach superscript OpenType math kerning@>= 17672begin if math_type(supscr(q))=math_char then 17673 begin save_f:=cur_f; 17674 fetch(supscr(q)); 17675 if is_new_mathfont(cur_f) then 17676 begin script_c:=new_native_character(cur_f, qo(cur_c)); 17677 script_g:=get_native_glyph(script_c, 0); 17678 script_f:=cur_f; 17679 end else 17680 begin script_g:=0; script_f:=0 17681 end; 17682 cur_f:=save_f; 17683 end; 17684if is_glyph_node(p) then 17685 sup_kern:=get_ot_math_kern(native_font(p), native_glyph(p), script_f, script_g, sup_cmd, shift_up); 17686 17687if (sup_kern<>0) and (math_type(subscr(q))=empty) then 17688 {if there is a superscript the kern will be added to |shift_amount(x)|} 17689 p:=attach_hkern_to_new_hlist(q, sup_kern); 17690end; 17691 17692@ We have now tied up all the loose ends of the first pass of |mlist_to_hlist|. 17693The second pass simply goes through and hooks everything together with the 17694proper glue and penalties. It also handles the |left_noad| and |right_noad| that 17695might be present, since |max_h| and |max_d| are now known. Variable |p| points 17696to a node at the current end of the final hlist. 17697 17698@<Make a second pass over the mlist, ...@>= 17699p:=temp_head; link(p):=null; q:=mlist; r_type:=0; cur_style:=style; 17700@<Set up the values of |cur_size| and |cur_mu|, based on |cur_style|@>; 17701while q<>null do 17702 begin @<If node |q| is a style node, change the style and |goto delete_q|; 17703 otherwise if it is not a noad, put it into the hlist, 17704 advance |q|, and |goto done|; otherwise set |s| to the size 17705 of noad |q|, set |t| to the associated type (|ord_noad.. 17706 inner_noad|), and set |pen| to the associated penalty@>; 17707 @<Append inter-element spacing based on |r_type| and |t|@>; 17708 @<Append any |new_hlist| entries for |q|, and any appropriate penalties@>; 17709 if type(q)=right_noad then t:=open_noad; 17710 r_type:=t; 17711 delete_q: r:=q; q:=link(q); free_node(r,s); 17712 done: end 17713 17714@ Just before doing the big |case| switch in the second pass, the program 17715sets up default values so that most of the branches are short. 17716 17717@<If node |q| is a style node, change the style...@>= 17718t:=ord_noad; s:=noad_size; pen:=inf_penalty; 17719case type(q) of 17720op_noad,open_noad,close_noad,punct_noad,inner_noad: t:=type(q); 17721bin_noad: begin t:=bin_noad; pen:=bin_op_penalty; 17722 end; 17723rel_noad: begin t:=rel_noad; pen:=rel_penalty; 17724 end; 17725ord_noad,vcenter_noad,over_noad,under_noad: do_nothing; 17726radical_noad: s:=radical_noad_size; 17727accent_noad: s:=accent_noad_size; 17728fraction_noad: begin t:=inner_noad; s:=fraction_noad_size; 17729 end; 17730left_noad,right_noad: t:=make_left_right(q,style,max_d,max_h); 17731style_node: @<Change the current style and |goto delete_q|@>; 17732whatsit_node,penalty_node,rule_node,disc_node,adjust_node,ins_node,mark_node, 17733 glue_node,kern_node:@t@>@;@/ 17734 begin link(p):=q; p:=q; q:=link(q); link(p):=null; goto done; 17735 end; 17736othercases confusion("mlist3") 17737@:this can't happen mlist3}{\quad mlist3@> 17738endcases 17739 17740@ The |make_left_right| function constructs a left or right delimiter of 17741the required size and returns the value |open_noad| or |close_noad|. The 17742|right_noad| and |left_noad| will both be based on the original |style|, 17743so they will have consistent sizes. 17744 17745We use the fact that |right_noad-left_noad=close_noad-open_noad|. 17746 17747@<Declare math...@>= 17748function make_left_right(@!q:pointer;@!style:small_number; 17749 @!max_d,@!max_h:scaled):small_number; 17750var delta,@!delta1,@!delta2:scaled; {dimensions used in the calculation} 17751begin cur_style:=style; @<Set up the values...@>; 17752delta2:=max_d+axis_height(cur_size); 17753delta1:=max_h+max_d-delta2; 17754if delta2>delta1 then delta1:=delta2; {|delta1| is max distance from axis} 17755delta:=(delta1 div 500)*delimiter_factor; 17756delta2:=delta1+delta1-delimiter_shortfall; 17757if delta<delta2 then delta:=delta2; 17758new_hlist(q):=var_delimiter(delimiter(q),cur_size,delta); 17759make_left_right:=type(q)-(left_noad-open_noad); {|open_noad| or |close_noad|} 17760end; 17761 17762@ @<Change the current style and |goto delete_q|@>= 17763begin cur_style:=subtype(q); s:=style_node_size; 17764@<Set up the values of |cur_size| and |cur_mu|, based on |cur_style|@>; 17765goto delete_q; 17766end 17767 17768@ The inter-element spacing in math formulas depends on a $8\times8$ table that 17769\TeX\ preloads as a 64-digit string. The elements of this string have the 17770following significance: 17771$$\vbox{\halign{#\hfil\cr 17772\.0 means no space;\cr 17773\.1 means a conditional thin space (\.{\\nonscript\\mskip\\thinmuskip});\cr 17774\.2 means a thin space (\.{\\mskip\\thinmuskip});\cr 17775\.3 means a conditional medium space 17776 (\.{\\nonscript\\mskip\\medmuskip});\cr 17777\.4 means a conditional thick space 17778 (\.{\\nonscript\\mskip\\thickmuskip});\cr 17779\.* means an impossible case.\cr}}$$ 17780This is all pretty cryptic, but {\sl The \TeX book\/} explains what is 17781supposed to happen, and the string makes it happen. 17782@:TeXbook}{\sl The \TeX book@> 17783 17784A global variable |magic_offset| is computed so that if |a| and |b| are 17785in the range |ord_noad..inner_noad|, then |str_pool[a*8+b+magic_offset]| 17786is the digit for spacing between noad types |a| and |b|. 17787 17788If \PASCAL\ had provided a good way to preload constant arrays, this part of 17789the program would not have been so strange. 17790@:PASCAL}{\PASCAL@> 17791 17792@d math_spacing=@;@/ 17793@t\hskip-35pt@> 17794"0234000122*4000133**3**344*0400400*000000234000111*1111112341011" 17795@t$ \hskip-35pt$@> 17796 17797@<Glob...@>= 17798@!magic_offset:integer; {used to find inter-element spacing} 17799 17800@ @<Compute the magic offset@>= 17801magic_offset:=str_start_macro(math_spacing)-9*ord_noad 17802 17803@ @<Append inter-element spacing based on |r_type| and |t|@>= 17804if r_type>0 then {not the first noad} 17805 begin case so(str_pool[r_type*8+t+magic_offset]) of 17806 "0": x:=0; 17807 "1": if cur_style<script_style then x:=thin_mu_skip_code@+else x:=0; 17808 "2": x:=thin_mu_skip_code; 17809 "3": if cur_style<script_style then x:=med_mu_skip_code@+else x:=0; 17810 "4": if cur_style<script_style then x:=thick_mu_skip_code@+else x:=0; 17811 othercases confusion("mlist4") 17812@:this can't happen mlist4}{\quad mlist4@> 17813 endcases; 17814 if x<>0 then 17815 begin y:=math_glue(glue_par(x),cur_mu); 17816 z:=new_glue(y); glue_ref_count(y):=null; link(p):=z; p:=z;@/ 17817 subtype(z):=x+1; {store a symbolic subtype} 17818 end; 17819 end 17820 17821@ We insert a penalty node after the hlist entries of noad |q| if |pen| 17822is not an ``infinite'' penalty, and if the node immediately following |q| 17823is not a penalty node or a |rel_noad| or absent entirely. 17824 17825@<Append any |new_hlist| entries for |q|, and any appropriate penalties@>= 17826if new_hlist(q)<>null then 17827 begin link(p):=new_hlist(q); 17828 repeat p:=link(p); 17829 until link(p)=null; 17830 end; 17831if penalties then if link(q)<>null then if pen<inf_penalty then 17832 begin r_type:=type(link(q)); 17833 if r_type<>penalty_node then if r_type<>rel_noad then 17834 begin z:=new_penalty(pen); link(p):=z; p:=z; 17835 end; 17836 end 17837 17838@* \[37] Alignment. 17839It's sort of a miracle whenever \.{\\halign} and \.{\\valign} work, because 17840they cut across so many of the control structures of \TeX. 17841 17842Therefore the 17843present page is probably not the best place for a beginner to start reading 17844this program; it is better to master everything else first. 17845 17846Let us focus our thoughts on an example of what the input might be, in order 17847to get some idea about how the alignment miracle happens. The example doesn't 17848do anything useful, but it is sufficiently general to indicate all of the 17849special cases that must be dealt with; please do not be disturbed by its 17850apparent complexity and meaninglessness. 17851$$\vbox{\halign{\.{#}\hfil\cr 17852{}\\tabskip 2pt plus 3pt\cr 17853{}\\halign to 300pt\{u1\#v1\&\cr 17854\hskip 50pt\\tabskip 1pt plus 1fil u2\#v2\&\cr 17855\hskip 50pt u3\#v3\\cr\cr 17856\hskip 25pt a1\&\\omit a2\&\\vrule\\cr\cr 17857\hskip 25pt \\noalign\{\\vskip 3pt\}\cr 17858\hskip 25pt b1\\span b2\\cr\cr 17859\hskip 25pt \\omit\&c2\\span\\omit\\cr\}\cr}}$$ 17860Here's what happens: 17861 17862\yskip 17863(0) When `\.{\\halign to 300pt\{}' is scanned, the |scan_spec| routine 17864places the 300pt dimension onto the |save_stack|, and an |align_group| 17865code is placed above it. This will make it possible to complete the alignment 17866when the matching `\.\}' is found. 17867 17868(1) The preamble is scanned next. Macros in the preamble are not expanded, 17869@^preamble@> 17870except as part of a tabskip specification. For example, if \.{u2} had been 17871a macro in the preamble above, it would have been expanded, since \TeX\ 17872must look for `\.{minus...}' as part of the tabskip glue. A ``preamble list'' 17873is constructed based on the user's preamble; in our case it contains the 17874following seven items: 17875$$\vbox{\halign{\.{#}\hfil\qquad&(#)\hfil\cr 17876{}\\glue 2pt plus 3pt&the tabskip preceding column 1\cr 17877{}\\alignrecord, width $-\infty$&preamble info for column 1\cr 17878{}\\glue 2pt plus 3pt&the tabskip between columns 1 and 2\cr 17879{}\\alignrecord, width $-\infty$&preamble info for column 2\cr 17880{}\\glue 1pt plus 1fil&the tabskip between columns 2 and 3\cr 17881{}\\alignrecord, width $-\infty$&preamble info for column 3\cr 17882{}\\glue 1pt plus 1fil&the tabskip following column 3\cr}}$$ 17883These ``alignrecord'' entries have the same size as an |unset_node|, 17884since they will later be converted into such nodes. However, at the 17885moment they have no |type| or |subtype| fields; they have |info| fields 17886instead, and these |info| fields are initially set to the value |end_span|, 17887for reasons explained below. Furthermore, the alignrecord nodes have no 17888|height| or |depth| fields; these are renamed |u_part| and |v_part|, 17889and they point to token lists for the templates of the alignment. 17890For example, the |u_part| field in the first alignrecord points to the 17891token list `\.{u1}', i.e., the template preceding the `\.\#' for column~1. 17892 17893(2) \TeX\ now looks at what follows the \.{\\cr} that ended the preamble. 17894It is not `\.{\\noalign}' or `\.{\\omit}', so this input is put back to 17895be read again, and the template `\.{u1}' is fed to the scanner. Just 17896before reading `\.{u1}', \TeX\ goes into restricted horizontal mode. 17897Just after reading `\.{u1}', \TeX\ will see `\.{a1}', and then (when the 17898{\.\&} is sensed) \TeX\ will see `\.{v1}'. Then \TeX\ scans an |endv| 17899token, indicating the end of a column. At this point an |unset_node| is 17900created, containing the contents of the current hlist (i.e., `\.{u1a1v1}'). 17901The natural width of this unset node replaces the |width| field of the 17902alignrecord for column~1; in general, the alignrecords will record the 17903maximum natural width that has occurred so far in a given column. 17904 17905(3) Since `\.{\\omit}' follows the `\.\&', the templates for column~2 17906are now bypassed. Again \TeX\ goes into restricted horizontal mode and 17907makes an |unset_node| from the resulting hlist; but this time the 17908hlist contains simply `\.{a2}'. The natural width of the new unset box 17909is remembered in the |width| field of the alignrecord for column~2. 17910 17911(4) A third |unset_node| is created for column 3, using essentially the 17912mechanism that worked for column~1; this unset box contains `\.{u3\\vrule 17913v3}'. The vertical rule in this case has running dimensions that will later 17914extend to the height and depth of the whole first row, since each |unset_node| 17915in a row will eventually inherit the height and depth of its enclosing box. 17916 17917(5) The first row has now ended; it is made into a single unset box 17918comprising the following seven items: 17919$$\vbox{\halign{\hbox to 325pt{\qquad\.{#}\hfil}\cr 17920{}\\glue 2pt plus 3pt\cr 17921{}\\unsetbox for 1 column: u1a1v1\cr 17922{}\\glue 2pt plus 3pt\cr 17923{}\\unsetbox for 1 column: a2\cr 17924{}\\glue 1pt plus 1fil\cr 17925{}\\unsetbox for 1 column: u3\\vrule v3\cr 17926{}\\glue 1pt plus 1fil\cr}}$$ 17927The width of this unset row is unimportant, but it has the correct height 17928and depth, so the correct baselineskip glue will be computed as the row 17929is inserted into a vertical list. 17930 17931(6) Since `\.{\\noalign}' follows the current \.{\\cr}, \TeX\ appends 17932additional material (in this case \.{\\vskip 3pt}) to the vertical list. 17933While processing this material, \TeX\ will be in internal vertical 17934mode, and |no_align_group| will be on |save_stack|. 17935 17936(7) The next row produces an unset box that looks like this: 17937$$\vbox{\halign{\hbox to 325pt{\qquad\.{#}\hfil}\cr 17938{}\\glue 2pt plus 3pt\cr 17939{}\\unsetbox for 2 columns: u1b1v1u2b2v2\cr 17940{}\\glue 1pt plus 1fil\cr 17941{}\\unsetbox for 1 column: {\rm(empty)}\cr 17942{}\\glue 1pt plus 1fil\cr}}$$ 17943The natural width of the unset box that spans columns 1~and~2 is stored 17944in a ``span node,'' which we will explain later; the |info| field of the 17945alignrecord for column~1 now points to the new span node, and the |info| 17946of the span node points to |end_span|. 17947 17948(8) The final row produces the unset box 17949$$\vbox{\halign{\hbox to 325pt{\qquad\.{#}\hfil}\cr 17950{}\\glue 2pt plus 3pt\cr 17951{}\\unsetbox for 1 column: {\rm(empty)}\cr 17952{}\\glue 2pt plus 3pt\cr 17953{}\\unsetbox for 2 columns: u2c2v2\cr 17954{}\\glue 1pt plus 1fil\cr}}$$ 17955A new span node is attached to the alignrecord for column 2. 17956 17957(9) The last step is to compute the true column widths and to change all the 17958unset boxes to hboxes, appending the whole works to the vertical list that 17959encloses the \.{\\halign}. The rules for deciding on the final widths of 17960each unset column box will be explained below. 17961 17962\yskip\noindent 17963Note that as \.{\\halign} is being processed, we fearlessly give up control 17964to the rest of \TeX. At critical junctures, an alignment routine is 17965called upon to step in and do some little action, but most of the time 17966these routines just lurk in the background. It's something like 17967post-hypnotic suggestion. 17968 17969@ We have mentioned that alignrecords contain no |height| or |depth| fields. 17970Their |glue_sign| and |glue_order| are pre-empted as well, since it 17971is necessary to store information about what to do when a template ends. 17972This information is called the |extra_info| field. 17973 17974@d u_part(#)==mem[#+height_offset].int {pointer to \<u_j> token list} 17975@d v_part(#)==mem[#+depth_offset].int {pointer to \<v_j> token list} 17976@d extra_info(#)==info(#+list_offset) {info to remember during template} 17977 17978@ Alignments can occur within alignments, so a small stack is used to access 17979the alignrecord information. At each level we have a |preamble| pointer, 17980indicating the beginning of the preamble list; a |cur_align| pointer, 17981indicating the current position in the preamble list; a |cur_span| pointer, 17982indicating the value of |cur_align| at the beginning of a sequence of 17983spanned columns; a |cur_loop| pointer, indicating the tabskip glue before 17984an alignrecord that should be copied next if the current list is extended; 17985and the |align_state| variable, which indicates the nesting of braces so 17986that \.{\\cr} and \.{\\span} and tab marks are properly intercepted. 17987There also are pointers |cur_head| and |cur_tail| to the head and tail 17988of a list of adjustments being moved out from horizontal mode to 17989vertical~mode. 17990 17991The current values of these seven quantities appear in global variables; 17992when they have to be pushed down, they are stored in 5-word nodes, and 17993|align_ptr| points to the topmost such node. 17994 17995@d preamble==link(align_head) {the current preamble list} 17996@d align_stack_node_size=6 {number of |mem| words to save alignment states} 17997 17998@<Glob...@>= 17999@!cur_align:pointer; {current position in preamble list} 18000@!cur_span:pointer; {start of currently spanned columns in preamble list} 18001@!cur_loop:pointer; {place to copy when extending a periodic preamble} 18002@!align_ptr:pointer; {most recently pushed-down alignment stack node} 18003@!cur_head,@!cur_tail:pointer; {adjustment list pointers} 18004@!cur_pre_head,@!cur_pre_tail:pointer; {pre-adjustment list pointers} 18005 18006@ The |align_state| and |preamble| variables are initialized elsewhere. 18007 18008@<Set init...@>= 18009align_ptr:=null; cur_align:=null; cur_span:=null; cur_loop:=null; 18010cur_head:=null; cur_tail:=null; 18011cur_pre_head:=null; cur_pre_tail:=null; 18012 18013@ Alignment stack maintenance is handled by a pair of trivial routines 18014called |push_alignment| and |pop_alignment|. 18015 18016@p procedure push_alignment; 18017var p:pointer; {the new alignment stack node} 18018begin p:=get_node(align_stack_node_size); 18019link(p):=align_ptr; info(p):=cur_align; 18020llink(p):=preamble; rlink(p):=cur_span; 18021mem[p+2].int:=cur_loop; mem[p+3].int:=align_state; 18022info(p+4):=cur_head; link(p+4):=cur_tail; 18023info(p+5):=cur_pre_head; link(p+5):=cur_pre_tail; 18024align_ptr:=p; 18025cur_head:=get_avail; 18026cur_pre_head:=get_avail; 18027end; 18028@# 18029procedure pop_alignment; 18030var p:pointer; {the top alignment stack node} 18031begin free_avail(cur_head); 18032free_avail(cur_pre_head); 18033p:=align_ptr; 18034cur_tail:=link(p+4); cur_head:=info(p+4); 18035cur_pre_tail:=link(p+5); cur_pre_head:=info(p+5); 18036align_state:=mem[p+3].int; cur_loop:=mem[p+2].int; 18037cur_span:=rlink(p); preamble:=llink(p); 18038cur_align:=info(p); align_ptr:=link(p); 18039free_node(p,align_stack_node_size); 18040end; 18041 18042@ \TeX\ has eight procedures that govern alignments: |init_align| and 18043|fin_align| are used at the very beginning and the very end; |init_row| and 18044|fin_row| are used at the beginning and end of individual rows; |init_span| 18045is used at the beginning of a sequence of spanned columns (possibly involving 18046only one column); |init_col| and |fin_col| are used at the beginning and 18047end of individual columns; and |align_peek| is used after \.{\\cr} to see 18048whether the next item is \.{\\noalign}. 18049 18050We shall consider these routines in the order they are first used during 18051the course of a complete \.{\\halign}, namely |init_align|, |align_peek|, 18052|init_row|, |init_span|, |init_col|, |fin_col|, |fin_row|, |fin_align|. 18053 18054@ When \.{\\halign} or \.{\\valign} has been scanned in an appropriate 18055mode, \TeX\ calls |init_align|, whose task is to get everything off to a 18056good start. This mostly involves scanning the preamble and putting its 18057information into the preamble list. 18058@^preamble@> 18059 18060@p @t\4@>@<Declare the procedure called |get_preamble_token|@>@t@>@/ 18061procedure@?align_peek; forward;@t\2@>@/ 18062procedure@?normal_paragraph; forward;@t\2@>@/ 18063procedure init_align; 18064label done, done1, done2, continue; 18065var save_cs_ptr:pointer; {|warning_index| value for error messages} 18066@!p:pointer; {for short-term temporary use} 18067begin save_cs_ptr:=cur_cs; {\.{\\halign} or \.{\\valign}, usually} 18068push_alignment; align_state:=-1000000; {enter a new alignment level} 18069@<Check for improper alignment in displayed math@>; 18070push_nest; {enter a new semantic level} 18071@<Change current mode to |-vmode| for \.{\\halign}, |-hmode| for \.{\\valign}@>; 18072scan_spec(align_group,false);@/ 18073@<Scan the preamble and record it in the |preamble| list@>; 18074new_save_level(align_group); 18075if every_cr<>null then begin_token_list(every_cr,every_cr_text); 18076align_peek; {look for \.{\\noalign} or \.{\\omit}} 18077end; 18078 18079@ In vertical modes, |prev_depth| already has the correct value. But 18080if we are in |mmode| (displayed formula mode), we reach out to the 18081enclosing vertical mode for the |prev_depth| value that produces the 18082correct baseline calculations. 18083 18084@<Change current mode...@>= 18085if mode=mmode then 18086 begin mode:=-vmode; prev_depth:=nest[nest_ptr-2].aux_field.sc; 18087 end 18088else if mode>0 then negate(mode) 18089 18090@ When \.{\\halign} is used as a displayed formula, there should be 18091no other pieces of mlists present. 18092 18093@<Check for improper alignment in displayed math@>= 18094if (mode=mmode)and((tail<>head)or(incompleat_noad<>null)) then 18095 begin print_err("Improper "); print_esc("halign"); print(" inside $$'s"); 18096@.Improper \\halign...@> 18097 help3("Displays can use special alignments (like \eqalignno)")@/ 18098 ("only if nothing but the alignment itself is between $$'s.")@/ 18099 ("So I've deleted the formulas that preceded this alignment."); 18100 error; flush_math; 18101 end 18102 18103@ @<Scan the preamble and record it in the |preamble| list@>= 18104preamble:=null; cur_align:=align_head; cur_loop:=null; scanner_status:=aligning; 18105warning_index:=save_cs_ptr; align_state:=-1000000; 18106 {at this point, |cur_cmd=left_brace|} 18107loop@+ begin @<Append the current tabskip glue to the preamble list@>; 18108 if cur_cmd=car_ret then goto done; {\.{\\cr} ends the preamble} 18109 @<Scan preamble text until |cur_cmd| is |tab_mark| or |car_ret|, 18110 looking for changes in the tabskip glue; append an 18111 alignrecord to the preamble list@>; 18112 end; 18113done: scanner_status:=normal 18114 18115@ @<Append the current tabskip glue to the preamble list@>= 18116link(cur_align):=new_param_glue(tab_skip_code); 18117cur_align:=link(cur_align) 18118 18119@ @<Scan preamble text until |cur_cmd| is |tab_mark| or |car_ret|...@>= 18120@<Scan the template \<u_j>, putting the resulting token list in |hold_head|@>; 18121link(cur_align):=new_null_box; cur_align:=link(cur_align); {a new alignrecord} 18122info(cur_align):=end_span; width(cur_align):=null_flag; 18123u_part(cur_align):=link(hold_head); 18124@<Scan the template \<v_j>, putting the resulting token list in |hold_head|@>; 18125v_part(cur_align):=link(hold_head) 18126 18127@ We enter `\.{\\span}' into |eqtb| with |tab_mark| as its command code, 18128and with |span_code| as the command modifier. This makes \TeX\ interpret it 18129essentially the same as an alignment delimiter like `\.\&', yet it is 18130recognizably different when we need to distinguish it from a normal delimiter. 18131It also turns out to be useful to give a special |cr_code| to `\.{\\cr}', 18132and an even larger |cr_cr_code| to `\.{\\crcr}'. 18133 18134The end of a template is represented by two ``frozen'' control sequences 18135called \.{\\endtemplate}. The first has the command code |end_template|, which 18136is |>outer_call|, so it will not easily disappear in the presence of errors. 18137The |get_x_token| routine converts the first into the second, which has |endv| 18138as its command code. 18139 18140@d span_code=special_char {distinct from any character} 18141@d cr_code=span_code+1 {distinct from |span_code| and from any character} 18142@d cr_cr_code=cr_code+1 {this distinguishes \.{\\crcr} from \.{\\cr}} 18143@d end_template_token==cs_token_flag+frozen_end_template 18144 18145@<Put each of \TeX's primitives into the hash table@>= 18146primitive("span",tab_mark,span_code);@/ 18147@!@:span_}{\.{\\span} primitive@> 18148primitive("cr",car_ret,cr_code); 18149@!@:cr_}{\.{\\cr} primitive@> 18150text(frozen_cr):="cr"; eqtb[frozen_cr]:=eqtb[cur_val];@/ 18151primitive("crcr",car_ret,cr_cr_code); 18152@!@:cr_cr_}{\.{\\crcr} primitive@> 18153text(frozen_end_template):="endtemplate"; text(frozen_endv):="endtemplate"; 18154eq_type(frozen_endv):=endv; equiv(frozen_endv):=null_list; 18155eq_level(frozen_endv):=level_one;@/ 18156eqtb[frozen_end_template]:=eqtb[frozen_endv]; 18157eq_type(frozen_end_template):=end_template; 18158 18159@ @<Cases of |print_cmd_chr|...@>= 18160tab_mark: if chr_code=span_code then print_esc("span") 18161 else chr_cmd("alignment tab character "); 18162car_ret: if chr_code=cr_code then print_esc("cr") 18163 else print_esc("crcr"); 18164 18165@ The preamble is copied directly, except that \.{\\tabskip} causes a change 18166to the tabskip glue, thereby possibly expanding macros that immediately 18167follow it. An appearance of \.{\\span} also causes such an expansion. 18168 18169Note that if the preamble contains `\.{\\global\\tabskip}', the `\.{\\global}' 18170token survives in the preamble and the `\.{\\tabskip}' defines new 18171tabskip glue (locally). 18172 18173@<Declare the procedure called |get_preamble_token|@>= 18174procedure get_preamble_token; 18175label restart; 18176begin restart: get_token; 18177while (cur_chr=span_code)and(cur_cmd=tab_mark) do 18178 begin get_token; {this token will be expanded once} 18179 if cur_cmd>max_command then 18180 begin expand; get_token; 18181 end; 18182 end; 18183if cur_cmd=endv then 18184 fatal_error("(interwoven alignment preambles are not allowed)"); 18185@.interwoven alignment preambles...@> 18186if (cur_cmd=assign_glue)and(cur_chr=glue_base+tab_skip_code) then 18187 begin scan_optional_equals; scan_glue(glue_val); 18188 if global_defs>0 then geq_define(glue_base+tab_skip_code,glue_ref,cur_val) 18189 else eq_define(glue_base+tab_skip_code,glue_ref,cur_val); 18190 goto restart; 18191 end; 18192end; 18193 18194@ Spaces are eliminated from the beginning of a template. 18195 18196@<Scan the template \<u_j>...@>= 18197p:=hold_head; link(p):=null; 18198loop@+ begin get_preamble_token; 18199 if cur_cmd=mac_param then goto done1; 18200 if (cur_cmd<=car_ret)and(cur_cmd>=tab_mark)and(align_state=-1000000) then 18201 if (p=hold_head)and(cur_loop=null)and(cur_cmd=tab_mark) 18202 then cur_loop:=cur_align 18203 else begin print_err("Missing # inserted in alignment preamble"); 18204@.Missing \# inserted...@> 18205 help3("There should be exactly one # between &'s, when an")@/ 18206 ("\halign or \valign is being set up. In this case you had")@/ 18207 ("none, so I've put one in; maybe that will work."); 18208 back_error; goto done1; 18209 end 18210 else if (cur_cmd<>spacer)or(p<>hold_head) then 18211 begin link(p):=get_avail; p:=link(p); info(p):=cur_tok; 18212 end; 18213 end; 18214done1: 18215 18216@ @<Scan the template \<v_j>...@>= 18217p:=hold_head; link(p):=null; 18218loop@+ begin continue: get_preamble_token; 18219 if (cur_cmd<=car_ret)and(cur_cmd>=tab_mark)and(align_state=-1000000) then 18220 goto done2; 18221 if cur_cmd=mac_param then 18222 begin print_err("Only one # is allowed per tab"); 18223@.Only one \# is allowed...@> 18224 help3("There should be exactly one # between &'s, when an")@/ 18225 ("\halign or \valign is being set up. In this case you had")@/ 18226 ("more than one, so I'm ignoring all but the first."); 18227 error; goto continue; 18228 end; 18229 link(p):=get_avail; p:=link(p); info(p):=cur_tok; 18230 end; 18231done2: link(p):=get_avail; p:=link(p); 18232info(p):=end_template_token {put \.{\\endtemplate} at the end} 18233 18234@ The tricky part about alignments is getting the templates into the 18235scanner at the right time, and recovering control when a row or column 18236is finished. 18237 18238We usually begin a row after each \.{\\cr} has been sensed, unless that 18239\.{\\cr} is followed by \.{\\noalign} or by the right brace that terminates 18240the alignment. The |align_peek| routine is used to look ahead and do 18241the right thing; it either gets a new row started, or gets a \.{\\noalign} 18242started, or finishes off the alignment. 18243 18244@<Declare the procedure called |align_peek|@>= 18245procedure align_peek; 18246label restart; 18247begin restart: align_state:=1000000; 18248repeat get_x_or_protected; 18249until cur_cmd<>spacer; 18250if cur_cmd=no_align then 18251 begin scan_left_brace; new_save_level(no_align_group); 18252 if mode=-vmode then normal_paragraph; 18253 end 18254else if cur_cmd=right_brace then fin_align 18255else if (cur_cmd=car_ret)and(cur_chr=cr_cr_code) then 18256 goto restart {ignore \.{\\crcr}} 18257else begin init_row; {start a new row} 18258 init_col; {start a new column and replace what we peeked at} 18259 end; 18260end; 18261 18262@ To start a row (i.e., a `row' that rhymes with `dough' but not with `bough'), 18263we enter a new semantic level, copy the first tabskip glue, and change 18264from internal vertical mode to restricted horizontal mode or vice versa. 18265The |space_factor| and |prev_depth| are not used on this semantic level, 18266but we clear them to zero just to be tidy. 18267 18268@p @t\4@>@<Declare the procedure called |init_span|@>@t@>@/ 18269procedure init_row; 18270begin push_nest; mode:=(-hmode-vmode)-mode; 18271if mode=-hmode then space_factor:=0 @+else prev_depth:=0; 18272tail_append(new_glue(glue_ptr(preamble))); 18273subtype(tail):=tab_skip_code+1;@/ 18274cur_align:=link(preamble); cur_tail:=cur_head; cur_pre_tail:=cur_pre_head; 18275init_span(cur_align); 18276end; 18277 18278@ The parameter to |init_span| is a pointer to the alignrecord where the 18279next column or group of columns will begin. A new semantic level is 18280entered, so that the columns will generate a list for subsequent packaging. 18281 18282@<Declare the procedure called |init_span|@>= 18283procedure init_span(@!p:pointer); 18284begin push_nest; 18285if mode=-hmode then space_factor:=1000 18286else begin prev_depth:=ignore_depth; normal_paragraph; 18287 end; 18288cur_span:=p; 18289end; 18290 18291@ When a column begins, we assume that |cur_cmd| is either |omit| or else 18292the current token should be put back into the input until the \<u_j> 18293template has been scanned. (Note that |cur_cmd| might be |tab_mark| or 18294|car_ret|.) We also assume that |align_state| is approximately 1000000 at 18295this time. We remain in the same mode, and start the template if it is 18296called for. 18297 18298@p procedure init_col; 18299begin extra_info(cur_align):=cur_cmd; 18300if cur_cmd=omit then align_state:=0 18301else begin back_input; begin_token_list(u_part(cur_align),u_template); 18302 end; {now |align_state=1000000|} 18303end; 18304 18305@ The scanner sets |align_state| to zero when the \<u_j> template ends. When 18306a subsequent \.{\\cr} or \.{\\span} or tab mark occurs with |align_state=0|, 18307the scanner activates the following code, which fires up the \<v_j> template. 18308We need to remember the |cur_chr|, which is either |cr_cr_code|, |cr_code|, 18309|span_code|, or a character code, depending on how the column text has ended. 18310 18311This part of the program had better not be activated when the preamble 18312to another alignment is being scanned, or when no alignment preamble is active. 18313 18314@<Insert the \(v)\<v_j>...@>= 18315begin if (scanner_status=aligning) or (cur_align=null) then 18316 fatal_error("(interwoven alignment preambles are not allowed)"); 18317@.interwoven alignment preambles...@> 18318cur_cmd:=extra_info(cur_align); extra_info(cur_align):=cur_chr; 18319if cur_cmd=omit then begin_token_list(omit_template,v_template) 18320else begin_token_list(v_part(cur_align),v_template); 18321align_state:=1000000; goto restart; 18322end 18323 18324@ The token list |omit_template| just referred to is a constant token 18325list that contains the special control sequence \.{\\endtemplate} only. 18326 18327@<Initialize the special...@>= 18328info(omit_template):=end_template_token; {|link(omit_template)=null|} 18329 18330@ When the |endv| command at the end of a \<v_j> template comes through the 18331scanner, things really start to happen; and it is the |fin_col| routine 18332that makes them happen. This routine returns |true| if a row as well as a 18333column has been finished. 18334 18335@p function fin_col:boolean; 18336label exit; 18337var p:pointer; {the alignrecord after the current one} 18338@!q,@!r:pointer; {temporary pointers for list manipulation} 18339@!s:pointer; {a new span node} 18340@!u:pointer; {a new unset box} 18341@!w:scaled; {natural width} 18342@!o:glue_ord; {order of infinity} 18343@!n:halfword; {span counter} 18344begin if cur_align=null then confusion("endv"); 18345q:=link(cur_align);@+if q=null then confusion("endv"); 18346@:this can't happen endv}{\quad endv@> 18347if align_state<500000 then 18348 fatal_error("(interwoven alignment preambles are not allowed)"); 18349@.interwoven alignment preambles...@> 18350p:=link(q); 18351@<If the preamble list has been traversed, check that the row has ended@>; 18352if extra_info(cur_align)<>span_code then 18353 begin unsave; new_save_level(align_group);@/ 18354 @<Package an unset box for the current column and record its width@>; 18355 @<Copy the tabskip glue between columns@>; 18356 if extra_info(cur_align)>=cr_code then 18357 begin fin_col:=true; return; 18358 end; 18359 init_span(p); 18360 end; 18361align_state:=1000000; 18362repeat get_x_or_protected; 18363until cur_cmd<>spacer; 18364cur_align:=p; 18365init_col; fin_col:=false; 18366exit: end; 18367 18368@ @<If the preamble list has been traversed, check that the row has ended@>= 18369if (p=null)and(extra_info(cur_align)<cr_code) then 18370 if cur_loop<>null then @<Lengthen the preamble periodically@> 18371 else begin print_err("Extra alignment tab has been changed to "); 18372@.Extra alignment tab...@> 18373 print_esc("cr"); 18374 help3("You have given more \span or & marks than there were")@/ 18375 ("in the preamble to the \halign or \valign now in progress.")@/ 18376 ("So I'll assume that you meant to type \cr instead."); 18377 extra_info(cur_align):=cr_code; error; 18378 end 18379 18380@ @<Lengthen the preamble...@>= 18381begin link(q):=new_null_box; p:=link(q); {a new alignrecord} 18382info(p):=end_span; width(p):=null_flag; cur_loop:=link(cur_loop); 18383@<Copy the templates from node |cur_loop| into node |p|@>; 18384cur_loop:=link(cur_loop); 18385link(p):=new_glue(glue_ptr(cur_loop)); 18386end 18387 18388@ @<Copy the templates from node |cur_loop| into node |p|@>= 18389q:=hold_head; r:=u_part(cur_loop); 18390while r<>null do 18391 begin link(q):=get_avail; q:=link(q); info(q):=info(r); r:=link(r); 18392 end; 18393link(q):=null; u_part(p):=link(hold_head); 18394q:=hold_head; r:=v_part(cur_loop); 18395while r<>null do 18396 begin link(q):=get_avail; q:=link(q); info(q):=info(r); r:=link(r); 18397 end; 18398link(q):=null; v_part(p):=link(hold_head) 18399 18400@ @<Copy the tabskip glue...@>= 18401tail_append(new_glue(glue_ptr(link(cur_align)))); 18402subtype(tail):=tab_skip_code+1 18403 18404@ @<Package an unset...@>= 18405begin if mode=-hmode then 18406 begin adjust_tail:=cur_tail; pre_adjust_tail:=cur_pre_tail; 18407 u:=hpack(link(head),natural); w:=width(u); 18408 cur_tail:=adjust_tail; adjust_tail:=null; 18409 cur_pre_tail:=pre_adjust_tail; pre_adjust_tail:=null; 18410 end 18411else begin u:=vpackage(link(head),natural,0); w:=height(u); 18412 end; 18413n:=min_quarterword; {this represents a span count of 1} 18414if cur_span<>cur_align then @<Update width entry for spanned columns@> 18415else if w>width(cur_align) then width(cur_align):=w; 18416type(u):=unset_node; span_count(u):=n;@/ 18417@<Determine the stretch order@>; 18418glue_order(u):=o; glue_stretch(u):=total_stretch[o];@/ 18419@<Determine the shrink order@>; 18420glue_sign(u):=o; glue_shrink(u):=total_shrink[o];@/ 18421pop_nest; link(tail):=u; tail:=u; 18422end 18423 18424@ A span node is a 2-word record containing |width|, |info|, and |link| 18425fields. The |link| field is not really a link, it indicates the number of 18426spanned columns; the |info| field points to a span node for the same 18427starting column, having a greater extent of spanning, or to |end_span|, 18428which has the largest possible |link| field; the |width| field holds the 18429largest natural width corresponding to a particular set of spanned columns. 18430 18431A list of the maximum widths so far, for spanned columns starting at a 18432given column, begins with the |info| field of the alignrecord for that 18433column. 18434 18435@d span_node_size=2 {number of |mem| words for a span node} 18436 18437@<Initialize the special list heads...@>= 18438link(end_span):=max_quarterword+1; info(end_span):=null; 18439 18440@ @<Update width entry for spanned columns@>= 18441begin q:=cur_span; 18442repeat incr(n); q:=link(link(q)); 18443until q=cur_align; 18444if n>max_quarterword then confusion("too many spans"); 18445 {this can happen, but won't} 18446@^system dependencies@> 18447@:this can't happen too many spans}{\quad too many spans@> 18448q:=cur_span; while link(info(q))<n do q:=info(q); 18449if link(info(q))>n then 18450 begin s:=get_node(span_node_size); info(s):=info(q); link(s):=n; 18451 info(q):=s; width(s):=w; 18452 end 18453else if width(info(q))<w then width(info(q)):=w; 18454end 18455 18456@ At the end of a row, we append an unset box to the current vlist (for 18457\.{\\halign}) or the current hlist (for \.{\\valign}). This unset box 18458contains the unset boxes for the columns, separated by the tabskip glue. 18459Everything will be set later. 18460 18461@p procedure fin_row; 18462var p:pointer; {the new unset box} 18463begin if mode=-hmode then 18464 begin p:=hpack(link(head),natural); 18465 pop_nest; 18466 if cur_pre_head <> cur_pre_tail then 18467 append_list(cur_pre_head)(cur_pre_tail); 18468 append_to_vlist(p); 18469 if cur_head <> cur_tail then 18470 append_list(cur_head)(cur_tail); 18471 end 18472else begin p:=vpack(link(head),natural); pop_nest; 18473 link(tail):=p; tail:=p; space_factor:=1000; 18474 end; 18475type(p):=unset_node; glue_stretch(p):=0; 18476if every_cr<>null then begin_token_list(every_cr,every_cr_text); 18477align_peek; 18478end; {note that |glue_shrink(p)=0| since |glue_shrink==shift_amount|} 18479 18480@ Finally, we will reach the end of the alignment, and we can breathe a 18481sigh of relief that memory hasn't overflowed. All the unset boxes will now be 18482set so that the columns line up, taking due account of spanned columns. 18483 18484@p procedure@?do_assignments; forward;@t\2@>@/ 18485procedure@?resume_after_display; forward;@t\2@>@/ 18486procedure@?build_page; forward;@t\2@>@/ 18487procedure fin_align; 18488var @!p,@!q,@!r,@!s,@!u,@!v: pointer; {registers for the list operations} 18489@!t,@!w:scaled; {width of column} 18490@!o:scaled; {shift offset for unset boxes} 18491@!n:halfword; {matching span amount} 18492@!rule_save:scaled; {temporary storage for |overfull_rule|} 18493@!aux_save:memory_word; {temporary storage for |aux|} 18494begin if cur_group<>align_group then confusion("align1"); 18495@:this can't happen align}{\quad align@> 18496unsave; {that |align_group| was for individual entries} 18497if cur_group<>align_group then confusion("align0"); 18498unsave; {that |align_group| was for the whole alignment} 18499if nest[nest_ptr-1].mode_field=mmode then o:=display_indent 18500 else o:=0; 18501@<Go through the preamble list, determining the column widths and 18502 changing the alignrecords to dummy unset boxes@>; 18503@<Package the preamble list, to determine the actual tabskip glue amounts, 18504 and let |p| point to this prototype box@>; 18505@<Set the glue in all the unset boxes of the current list@>; 18506flush_node_list(p); pop_alignment; 18507@<Insert the \(c)current list into its environment@>; 18508end;@/ 18509@t\4@>@<Declare the procedure called |align_peek|@> 18510 18511@ It's time now to dismantle the preamble list and to compute the column 18512widths. Let $w_{ij}$ be the maximum of the natural widths of all entries 18513that span columns $i$ through $j$, inclusive. The alignrecord for column~$i$ 18514contains $w_{ii}$ in its |width| field, and there is also a linked list of 18515the nonzero $w_{ij}$ for increasing $j$, accessible via the |info| field; 18516these span nodes contain the value $j-i+|min_quarterword|$ in their 18517|link| fields. The values of $w_{ii}$ were initialized to |null_flag|, which 18518we regard as $-\infty$. 18519 18520The final column widths are defined by the formula 18521$$w_j=\max_{1\L i\L j}\biggl( w_{ij}-\sum_{i\L k<j}(t_k+w_k)\biggr),$$ 18522where $t_k$ is the natural width of the tabskip glue between columns 18523$k$ and~$k+1$. However, if $w_{ij}=-\infty$ for all |i| in the range 18524|1<=i<=j| (i.e., if every entry that involved column~|j| also involved 18525column~|j+1|), we let $w_j=0$, and we zero out the tabskip glue after 18526column~|j|. 18527 18528\TeX\ computes these values by using the following scheme: First $w_1=w_{11}$. 18529Then replace $w_{2j}$ by $\max(w_{2j},w_{1j}-t_1-w_1)$, for all $j>1$. 18530Then $w_2=w_{22}$. Then replace $w_{3j}$ by $\max(w_{3j},w_{2j}-t_2-w_2)$ 18531for all $j>2$; and so on. If any $w_j$ turns out to be $-\infty$, its 18532value is changed to zero and so is the next tabskip. 18533 18534@<Go through the preamble list,...@>= 18535q:=link(preamble); 18536repeat flush_list(u_part(q)); flush_list(v_part(q)); 18537p:=link(link(q)); 18538if width(q)=null_flag then 18539 @<Nullify |width(q)| and the tabskip glue following this column@>; 18540if info(q)<>end_span then 18541 @<Merge the widths in the span nodes of |q| with those of |p|, 18542 destroying the span nodes of |q|@>; 18543type(q):=unset_node; span_count(q):=min_quarterword; height(q):=0; 18544depth(q):=0; glue_order(q):=normal; glue_sign(q):=normal; 18545glue_stretch(q):=0; glue_shrink(q):=0; q:=p; 18546until q=null 18547 18548@ @<Nullify |width(q)| and the tabskip glue following this column@>= 18549begin width(q):=0; r:=link(q); s:=glue_ptr(r); 18550if s<>zero_glue then 18551 begin add_glue_ref(zero_glue); delete_glue_ref(s); 18552 glue_ptr(r):=zero_glue; 18553 end; 18554end 18555 18556@ Merging of two span-node lists is a typical exercise in the manipulation of 18557linearly linked data structures. The essential invariant in the following 18558|repeat| loop is that we want to dispense with node |r|, in |q|'s list, 18559and |u| is its successor; all nodes of |p|'s list up to and including |s| 18560have been processed, and the successor of |s| matches |r| or precedes |r| 18561or follows |r|, according as |link(r)=n| or |link(r)>n| or |link(r)<n|. 18562 18563@<Merge the widths...@>= 18564begin t:=width(q)+width(glue_ptr(link(q))); 18565r:=info(q); s:=end_span; info(s):=p; n:=min_quarterword+1; 18566repeat width(r):=width(r)-t; u:=info(r); 18567while link(r)>n do 18568 begin s:=info(s); n:=link(info(s))+1; 18569 end; 18570if link(r)<n then 18571 begin info(r):=info(s); info(s):=r; decr(link(r)); s:=r; 18572 end 18573else begin if width(r)>width(info(s)) then width(info(s)):=width(r); 18574 free_node(r,span_node_size); 18575 end; 18576r:=u; 18577until r=end_span; 18578end 18579 18580@ Now the preamble list has been converted to a list of alternating unset 18581boxes and tabskip glue, where the box widths are equal to the final 18582column sizes. In case of \.{\\valign}, we change the widths to heights, 18583so that a correct error message will be produced if the alignment is 18584overfull or underfull. 18585 18586@<Package the preamble list...@>= 18587save_ptr:=save_ptr-2; pack_begin_line:=-mode_line; 18588if mode=-vmode then 18589 begin rule_save:=overfull_rule; 18590 overfull_rule:=0; {prevent rule from being packaged} 18591 p:=hpack(preamble,saved(1),saved(0)); overfull_rule:=rule_save; 18592 end 18593else begin q:=link(preamble); 18594 repeat height(q):=width(q); width(q):=0; q:=link(link(q)); 18595 until q=null; 18596 p:=vpack(preamble,saved(1),saved(0)); 18597 q:=link(preamble); 18598 repeat width(q):=height(q); height(q):=0; q:=link(link(q)); 18599 until q=null; 18600 end; 18601pack_begin_line:=0 18602 18603@ @<Set the glue in all the unset...@>= 18604q:=link(head); s:=head; 18605while q<>null do 18606 begin if not is_char_node(q) then 18607 if type(q)=unset_node then 18608 @<Set the unset box |q| and the unset boxes in it@> 18609 else if type(q)=rule_node then 18610 @<Make the running dimensions in rule |q| extend to the 18611 boundaries of the alignment@>; 18612 s:=q; q:=link(q); 18613 end 18614 18615@ @<Make the running dimensions in rule |q| extend...@>= 18616begin if is_running(width(q)) then width(q):=width(p); 18617if is_running(height(q)) then height(q):=height(p); 18618if is_running(depth(q)) then depth(q):=depth(p); 18619if o<>0 then 18620 begin r:=link(q); link(q):=null; q:=hpack(q,natural); 18621 shift_amount(q):=o; link(q):=r; link(s):=q; 18622 end; 18623end 18624 18625@ The unset box |q| represents a row that contains one or more unset boxes, 18626depending on how soon \.{\\cr} occurred in that row. 18627 18628@<Set the unset box |q| and the unset boxes in it@>= 18629begin if mode=-vmode then 18630 begin type(q):=hlist_node; width(q):=width(p); 18631 if nest[nest_ptr-1].mode_field=mmode then set_box_lr(q)(dlist); {for |ship_out|} 18632 end 18633else begin type(q):=vlist_node; height(q):=height(p); 18634 end; 18635glue_order(q):=glue_order(p); glue_sign(q):=glue_sign(p); 18636glue_set(q):=glue_set(p); shift_amount(q):=o; 18637r:=link(list_ptr(q)); s:=link(list_ptr(p)); 18638repeat @<Set the glue in node |r| and change it from an unset node@>; 18639r:=link(link(r)); s:=link(link(s)); 18640until r=null; 18641end 18642 18643@ A box made from spanned columns will be followed by tabskip glue nodes and 18644by empty boxes as if there were no spanning. This permits perfect alignment 18645of subsequent entries, and it prevents values that depend on floating point 18646arithmetic from entering into the dimensions of any boxes. 18647 18648@<Set the glue in node |r|...@>= 18649n:=span_count(r); t:=width(s); w:=t; u:=hold_head; 18650set_box_lr(r)(0); {for |ship_out|} 18651while n>min_quarterword do 18652 begin decr(n); 18653 @<Append tabskip glue and an empty box to list |u|, 18654 and update |s| and |t| as the prototype nodes are passed@>; 18655 end; 18656if mode=-vmode then 18657 @<Make the unset node |r| into an |hlist_node| of width |w|, 18658 setting the glue as if the width were |t|@> 18659else @<Make the unset node |r| into a |vlist_node| of height |w|, 18660 setting the glue as if the height were |t|@>; 18661shift_amount(r):=0; 18662if u<>hold_head then {append blank boxes to account for spanned nodes} 18663 begin link(u):=link(r); link(r):=link(hold_head); r:=u; 18664 end 18665 18666@ @<Append tabskip glue and an empty box to list |u|...@>= 18667s:=link(s); v:=glue_ptr(s); link(u):=new_glue(v); u:=link(u); 18668subtype(u):=tab_skip_code+1; t:=t+width(v); 18669if glue_sign(p)=stretching then 18670 begin if stretch_order(v)=glue_order(p) then 18671 t:=t+round(float(glue_set(p))*stretch(v)); 18672@^real multiplication@> 18673 end 18674else if glue_sign(p)=shrinking then 18675 begin if shrink_order(v)=glue_order(p) then 18676 t:=t-round(float(glue_set(p))*shrink(v)); 18677 end; 18678s:=link(s); link(u):=new_null_box; u:=link(u); t:=t+width(s); 18679if mode=-vmode then width(u):=width(s)@+else 18680 begin type(u):=vlist_node; height(u):=width(s); 18681 end 18682 18683@ @<Make the unset node |r| into an |hlist_node| of width |w|...@>= 18684begin height(r):=height(q); depth(r):=depth(q); 18685if t=width(r) then 18686 begin glue_sign(r):=normal; glue_order(r):=normal; 18687 set_glue_ratio_zero(glue_set(r)); 18688 end 18689else if t>width(r) then 18690 begin glue_sign(r):=stretching; 18691 if glue_stretch(r)=0 then set_glue_ratio_zero(glue_set(r)) 18692 else glue_set(r):=unfloat((t-width(r))/glue_stretch(r)); 18693@^real division@> 18694 end 18695else begin glue_order(r):=glue_sign(r); glue_sign(r):=shrinking; 18696 if glue_shrink(r)=0 then set_glue_ratio_zero(glue_set(r)) 18697 else if (glue_order(r)=normal)and(width(r)-t>glue_shrink(r)) then 18698 set_glue_ratio_one(glue_set(r)) 18699 else glue_set(r):=unfloat((width(r)-t)/glue_shrink(r)); 18700 end; 18701width(r):=w; type(r):=hlist_node; 18702end 18703 18704@ @<Make the unset node |r| into a |vlist_node| of height |w|...@>= 18705begin width(r):=width(q); 18706if t=height(r) then 18707 begin glue_sign(r):=normal; glue_order(r):=normal; 18708 set_glue_ratio_zero(glue_set(r)); 18709 end 18710else if t>height(r) then 18711 begin glue_sign(r):=stretching; 18712 if glue_stretch(r)=0 then set_glue_ratio_zero(glue_set(r)) 18713 else glue_set(r):=unfloat((t-height(r))/glue_stretch(r)); 18714@^real division@> 18715 end 18716else begin glue_order(r):=glue_sign(r); glue_sign(r):=shrinking; 18717 if glue_shrink(r)=0 then set_glue_ratio_zero(glue_set(r)) 18718 else if (glue_order(r)=normal)and(height(r)-t>glue_shrink(r)) then 18719 set_glue_ratio_one(glue_set(r)) 18720 else glue_set(r):=unfloat((height(r)-t)/glue_shrink(r)); 18721 end; 18722height(r):=w; type(r):=vlist_node; 18723end 18724 18725@ We now have a completed alignment, in the list that starts at |head| 18726and ends at |tail|. This list will be merged with the one that encloses 18727it. (In case the enclosing mode is |mmode|, for displayed formulas, 18728we will need to insert glue before and after the display; that part of the 18729program will be deferred until we're more familiar with such operations.) 18730 18731In restricted horizontal mode, the |clang| part of |aux| is undefined; 18732an over-cautious \PASCAL\ runtime system may complain about this. 18733@^dirty \PASCAL@> 18734 18735@<Insert the \(c)current list into its environment@>= 18736aux_save:=aux; p:=link(head); q:=tail; pop_nest; 18737if mode=mmode then @<Finish an alignment in a display@> 18738else begin aux:=aux_save; link(tail):=p; 18739 if p<>null then tail:=q; 18740 if mode=vmode then build_page; 18741 end 18742 18743@* \[38] Breaking paragraphs into lines. 18744We come now to what is probably the most interesting algorithm of \TeX: 18745the mechanism for choosing the ``best possible'' breakpoints that yield 18746the individual lines of a paragraph. \TeX's line-breaking algorithm takes 18747a given horizontal list and converts it to a sequence of boxes that are 18748appended to the current vertical list. In the course of doing this, it 18749creates a special data structure containing three kinds of records that are 18750not used elsewhere in \TeX. Such nodes are created while a paragraph is 18751being processed, and they are destroyed afterwards; thus, the other parts 18752of \TeX\ do not need to know anything about how line-breaking is done. 18753 18754The method used here is based on an approach devised by Michael F. Plass and 18755@^Plass, Michael Frederick@> 18756@^Knuth, Donald Ervin@> 18757the author in 1977, subsequently generalized and improved by the same two 18758people in 1980. A detailed discussion appears in {\sl SOFTWARE---Practice 18759\AM\ Experience \bf11} (1981), 1119--1184, where it is shown that the 18760line-breaking problem can be regarded as a special case of the problem of 18761computing the shortest path in an acyclic network. The cited paper includes 18762numerous examples and describes the history of line breaking as it has been 18763practiced by printers through the ages. The present implementation adds two 18764new ideas to the algorithm of 1980: Memory space requirements are considerably 18765reduced by using smaller records for inactive nodes than for active ones, 18766and arithmetic overflow is avoided by using ``delta distances'' instead of 18767keeping track of the total distance from the beginning of the paragraph to the 18768current point. 18769 18770@ The |line_break| procedure should be invoked only in horizontal mode; it 18771leaves that mode and places its output into the current vlist of the 18772enclosing vertical mode (or internal vertical mode). 18773There is one explicit parameter: |d| is true for partial paragraphs 18774preceding display math mode; in this case the amount of additional 18775penalty inserted before the final line is |display_widow_penalty| 18776instead of |widow_penalty|. 18777 18778There are also a number of implicit parameters: The hlist to be broken 18779starts at |link(head)|, and it is nonempty. The value of |prev_graf| in the 18780enclosing semantic level tells where the paragraph should begin in the 18781sequence of line numbers, in case hanging indentation or \.{\\parshape} 18782is in use; |prev_graf| is zero unless this paragraph is being continued 18783after a displayed formula. Other implicit parameters, such as the 18784|par_shape_ptr| and various penalties to use for hyphenation, etc., appear 18785in |eqtb|. 18786 18787After |line_break| has acted, it will have updated the current vlist and the 18788value of |prev_graf|. Furthermore, the global variable |just_box| will 18789point to the final box created by |line_break|, so that the width of this 18790line can be ascertained when it is necessary to decide whether to use 18791|above_display_skip| or |above_display_short_skip| before a displayed formula. 18792 18793@<Glob...@>= 18794@!just_box:pointer; {the |hlist_node| for the last line of the new paragraph} 18795 18796@ Since |line_break| is a rather lengthy procedure---sort of a small world unto 18797itself---we must build it up little by little, somewhat more cautiously 18798than we have done with the simpler procedures of \TeX. Here is the 18799general outline. 18800 18801@p@t\4@>@<Declare subprocedures for |line_break|@> 18802procedure line_break(@!d:boolean); 18803label done,done1,done2,done3,done4,done5,done6,continue, restart; 18804var @<Local variables for line breaking@>@; 18805begin pack_begin_line:=mode_line; {this is for over/underfull box messages} 18806@<Get ready to start line breaking@>; 18807@<Find optimal breakpoints@>; 18808@<Break the paragraph at the chosen breakpoints, justify the resulting lines 18809to the correct widths, and append them to the current vertical list@>; 18810@<Clean up the memory by removing the break nodes@>; 18811pack_begin_line:=0; 18812end; 18813@# 18814@t\4@>@<Declare \eTeX\ procedures for use by |main_control|@> 18815 18816@ The first task is to move the list from |head| to |temp_head| and go 18817into the enclosing semantic level. We also append the \.{\\parfillskip} 18818glue to the end of the paragraph, removing a space (or other glue node) if 18819it was there, since spaces usually precede blank lines and instances of 18820`\.{\$\$}'. The |par_fill_skip| is preceded by an infinite penalty, so 18821it will never be considered as a potential breakpoint. 18822 18823This code assumes that a |glue_node| and a |penalty_node| occupy the 18824same number of |mem|~words. 18825@^data structure assumptions@> 18826 18827@<Get ready to start...@>= 18828link(temp_head):=link(head); 18829if is_char_node(tail) then tail_append(new_penalty(inf_penalty)) 18830else if type(tail)<>glue_node then tail_append(new_penalty(inf_penalty)) 18831else begin type(tail):=penalty_node; delete_glue_ref(glue_ptr(tail)); 18832 flush_node_list(leader_ptr(tail)); penalty(tail):=inf_penalty; 18833 end; 18834link(tail):=new_param_glue(par_fill_skip_code); 18835last_line_fill:=link(tail); 18836init_cur_lang:=prev_graf mod @'200000; 18837init_l_hyf:=prev_graf div @'20000000; 18838init_r_hyf:=(prev_graf div @'200000) mod @'100; 18839pop_nest; 18840 18841@ When looking for optimal line breaks, \TeX\ creates a ``break node'' for 18842each break that is {\sl feasible}, in the sense that there is a way to end 18843a line at the given place without requiring any line to stretch more than 18844a given tolerance. A break node is characterized by three things: the position 18845of the break (which is a pointer to a |glue_node|, |math_node|, |penalty_node|, 18846or |disc_node|); the ordinal number of the line that will follow this 18847breakpoint; and the fitness classification of the line that has just 18848ended, i.e., |tight_fit|, |decent_fit|, |loose_fit|, or |very_loose_fit|. 18849 18850@d tight_fit=3 {fitness classification for lines shrinking 0.5 to 1.0 of their 18851 shrinkability} 18852@d loose_fit=1 {fitness classification for lines stretching 0.5 to 1.0 of their 18853 stretchability} 18854@d very_loose_fit=0 {fitness classification for lines stretching more than 18855 their stretchability} 18856@d decent_fit=2 {fitness classification for all other lines} 18857 18858@ The algorithm essentially determines the best possible way to achieve 18859each feasible combination of position, line, and fitness. Thus, it answers 18860questions like, ``What is the best way to break the opening part of the 18861paragraph so that the fourth line is a tight line ending at such-and-such 18862a place?'' However, the fact that all lines are to be the same length 18863after a certain point makes it possible to regard all sufficiently large 18864line numbers as equivalent, when the looseness parameter is zero, and this 18865makes it possible for the algorithm to save space and time. 18866 18867An ``active node'' and a ``passive node'' are created in |mem| for each 18868feasible breakpoint that needs to be considered. Active nodes are three 18869words long and passive nodes are two words long. We need active nodes only 18870for breakpoints near the place in the paragraph that is currently being 18871examined, so they are recycled within a comparatively short time after 18872they are created. 18873 18874@ An active node for a given breakpoint contains six fields: 18875 18876\yskip\hang|link| points to the next node in the list of active nodes; the 18877last active node has |link=last_active|. 18878 18879\yskip\hang|break_node| points to the passive node associated with this 18880breakpoint. 18881 18882\yskip\hang|line_number| is the number of the line that follows this 18883breakpoint. 18884 18885\yskip\hang|fitness| is the fitness classification of the line ending at this 18886breakpoint. 18887 18888\yskip\hang|type| is either |hyphenated| or |unhyphenated|, depending on 18889whether this breakpoint is a |disc_node|. 18890 18891\yskip\hang|total_demerits| is the minimum possible sum of demerits over all 18892lines leading from the beginning of the paragraph to this breakpoint. 18893 18894\yskip\noindent 18895The value of |link(active)| points to the first active node on a linked list 18896of all currently active nodes. This list is in order by |line_number|, 18897except that nodes with |line_number>easy_line| may be in any order relative 18898to each other. 18899 18900@d active_node_size_normal=3 {number of words in normal active nodes} 18901@d fitness==subtype {|very_loose_fit..tight_fit| on final line for this break} 18902@d break_node==rlink {pointer to the corresponding passive node} 18903@d line_number==llink {line that begins at this breakpoint} 18904@d total_demerits(#)==mem[#+2].int {the quantity that \TeX\ minimizes} 18905@d unhyphenated=0 {the |type| of a normal active break node} 18906@d hyphenated=1 {the |type| of an active node that breaks at a |disc_node|} 18907@d last_active==active {the active list ends where it begins} 18908 18909@ @<Initialize the special list heads...@>= 18910type(last_active):=hyphenated; line_number(last_active):=max_halfword; 18911subtype(last_active):=0; {the |subtype| is never examined by the algorithm} 18912 18913@ The passive node for a given breakpoint contains only four fields: 18914 18915\yskip\hang|link| points to the passive node created just before this one, 18916if any, otherwise it is |null|. 18917 18918\yskip\hang|cur_break| points to the position of this breakpoint in the 18919horizontal list for the paragraph being broken. 18920 18921\yskip\hang|prev_break| points to the passive node that should precede this 18922one in an optimal path to this breakpoint. 18923 18924\yskip\hang|serial| is equal to |n| if this passive node is the |n|th 18925one created during the current pass. (This field is used only when 18926printing out detailed statistics about the line-breaking calculations.) 18927 18928\yskip\noindent 18929There is a global variable called |passive| that points to the most 18930recently created passive node. Another global variable, |printed_node|, 18931is used to help print out the paragraph when detailed information about 18932the line-breaking computation is being displayed. 18933 18934@d passive_node_size=2 {number of words in passive nodes} 18935@d cur_break==rlink {in passive node, points to position of this breakpoint} 18936@d prev_break==llink {points to passive node that should precede this one} 18937@d serial==info {serial number for symbolic identification} 18938 18939@<Glob...@>= 18940@!passive:pointer; {most recent node on passive list} 18941@!printed_node:pointer; {most recent node that has been printed} 18942@!pass_number:halfword; {the number of passive nodes allocated on this pass} 18943 18944@ The active list also contains ``delta'' nodes that help the algorithm 18945compute the badness of individual lines. Such nodes appear only between two 18946active nodes, and they have |type=delta_node|. If |p| and |r| are active nodes 18947and if |q| is a delta node between them, so that |link(p)=q| and |link(q)=r|, 18948then |q| tells the space difference between lines in the horizontal list that 18949start after breakpoint |p| and lines that start after breakpoint |r|. In 18950other words, if we know the length of the line that starts after |p| and 18951ends at our current position, then the corresponding length of the line that 18952starts after |r| is obtained by adding the amounts in node~|q|. A delta node 18953contains six scaled numbers, since it must record the net change in glue 18954stretchability with respect to all orders of infinity. The natural width 18955difference appears in |mem[q+1].sc|; the stretch differences in units of 18956pt, fil, fill, and filll appear in |mem[q+2..q+5].sc|; and the shrink difference 18957appears in |mem[q+6].sc|. The |subtype| field of a delta node is not used. 18958 18959@d delta_node_size=7 {number of words in a delta node} 18960@d delta_node=2 {|type| field in a delta node} 18961 18962@ As the algorithm runs, it maintains a set of six delta-like registers 18963for the length of the line following the first active breakpoint to the 18964current position in the given hlist. When it makes a pass through the 18965active list, it also maintains a similar set of six registers for the 18966length following the active breakpoint of current interest. A third set 18967holds the length of an empty line (namely, the sum of \.{\\leftskip} and 18968\.{\\rightskip}); and a fourth set is used to create new delta nodes. 18969 18970When we pass a delta node we want to do operations like 18971$$\hbox{\ignorespaces|for 18972k:=1 to 6 do cur_active_width[k]:=cur_active_width[k]+mem[q+k].sc|};$$ and we 18973want to do this without the overhead of |for| loops. The |do_all_six| 18974macro makes such six-tuples convenient. 18975 18976@d do_all_six(#)==#(1);#(2);#(3);#(4);#(5);#(6) 18977 18978@<Glob...@>= 18979@!active_width:array[1..6] of scaled; 18980 {distance from first active node to~|cur_p|} 18981@!cur_active_width:array[1..6] of scaled; {distance from current active node} 18982@!background:array[1..6] of scaled; {length of an ``empty'' line} 18983@!break_width:array[1..6] of scaled; {length being computed after current break} 18984 18985@ Let's state the principles of the delta nodes more precisely and concisely, 18986so that the following programs will be less obscure. For each legal 18987breakpoint~|p| in the paragraph, we define two quantities $\alpha(p)$ and 18988$\beta(p)$ such that the length of material in a line from breakpoint~|p| 18989to breakpoint~|q| is $\gamma+\beta(q)-\alpha(p)$, for some fixed $\gamma$. 18990Intuitively, $\alpha(p)$ and $\beta(q)$ are the total length of material from 18991the beginning of the paragraph to a point ``after'' a break at |p| and to a 18992point ``before'' a break at |q|; and $\gamma$ is the width of an empty line, 18993namely the length contributed by \.{\\leftskip} and \.{\\rightskip}. 18994 18995Suppose, for example, that the paragraph consists entirely of alternating 18996boxes and glue skips; let the boxes have widths $x_1\ldots x_n$ and 18997let the skips have widths $y_1\ldots y_n$, so that the paragraph can be 18998represented by $x_1y_1\ldots x_ny_n$. Let $p_i$ be the legal breakpoint 18999at $y_i$; then $\alpha(p_i)=x_1+y_1+\cdots+x_i+y_i$, and $\beta(p_i)= 19000x_1+y_1+\cdots+x_i$. To check this, note that the length of material from 19001$p_2$ to $p_5$, say, is $\gamma+x_3+y_3+x_4+y_4+x_5=\gamma+\beta(p_5) 19002-\alpha(p_2)$. 19003 19004The quantities $\alpha$, $\beta$, $\gamma$ involve glue stretchability and 19005shrinkability as well as a natural width. If we were to compute $\alpha(p)$ 19006and $\beta(p)$ for each |p|, we would need multiple precision arithmetic, and 19007the multiprecise numbers would have to be kept in the active nodes. 19008\TeX\ avoids this problem by working entirely with relative differences 19009or ``deltas.'' Suppose, for example, that the active list contains 19010$a_1\,\delta_1\,a_2\,\delta_2\,a_3$, where the |a|'s are active breakpoints 19011and the $\delta$'s are delta nodes. Then $\delta_1=\alpha(a_1)-\alpha(a_2)$ 19012and $\delta_2=\alpha(a_2)-\alpha(a_3)$. If the line breaking algorithm is 19013currently positioned at some other breakpoint |p|, the |active_width| array 19014contains the value $\gamma+\beta(p)-\alpha(a_1)$. If we are scanning through 19015the list of active nodes and considering a tentative line that runs from 19016$a_2$ to~|p|, say, the |cur_active_width| array will contain the value 19017$\gamma+\beta(p)-\alpha(a_2)$. Thus, when we move from $a_2$ to $a_3$, 19018we want to add $\alpha(a_2)-\alpha(a_3)$ to |cur_active_width|; and this 19019is just $\delta_2$, which appears in the active list between $a_2$ and 19020$a_3$. The |background| array contains $\gamma$. The |break_width| array 19021will be used to calculate values of new delta nodes when the active 19022list is being updated. 19023 19024@ Glue nodes in a horizontal list that is being paragraphed are not supposed to 19025include ``infinite'' shrinkability; that is why the algorithm maintains 19026four registers for stretching but only one for shrinking. If the user tries to 19027introduce infinite shrinkability, the shrinkability will be reset to finite 19028and an error message will be issued. A boolean variable |no_shrink_error_yet| 19029prevents this error message from appearing more than once per paragraph. 19030 19031@d check_shrinkage(#)==if (shrink_order(#)<>normal)and(shrink(#)<>0) then 19032 begin #:=finite_shrink(#); 19033 end 19034 19035@<Glob...@>= 19036@!no_shrink_error_yet:boolean; {have we complained about infinite shrinkage?} 19037 19038@ @<Declare subprocedures for |line_break|@>= 19039function finite_shrink(@!p:pointer):pointer; {recovers from infinite shrinkage} 19040var q:pointer; {new glue specification} 19041begin if no_shrink_error_yet then 19042 begin no_shrink_error_yet:=false; 19043 print_err("Infinite glue shrinkage found in a paragraph"); 19044@.Infinite glue shrinkage...@> 19045 help5("The paragraph just ended includes some glue that has")@/ 19046 ("infinite shrinkability, e.g., `\hskip 0pt minus 1fil'.")@/ 19047 ("Such glue doesn't belong there---it allows a paragraph")@/ 19048 ("of any length to fit on one line. But it's safe to proceed,")@/ 19049 ("since the offensive shrinkability has been made finite."); 19050 error; 19051 end; 19052q:=new_spec(p); shrink_order(q):=normal; 19053delete_glue_ref(p); finite_shrink:=q; 19054end; 19055 19056@ @<Get ready to start...@>= 19057no_shrink_error_yet:=true;@/ 19058check_shrinkage(left_skip); check_shrinkage(right_skip);@/ 19059q:=left_skip; r:=right_skip; background[1]:=width(q)+width(r);@/ 19060background[2]:=0; background[3]:=0; background[4]:=0; background[5]:=0;@/ 19061background[2+stretch_order(q)]:=stretch(q);@/ 19062background[2+stretch_order(r)]:=@|background[2+stretch_order(r)]+stretch(r);@/ 19063background[6]:=shrink(q)+shrink(r); 19064@<Check for special treatment of last line of paragraph@>; 19065 19066@ A pointer variable |cur_p| runs through the given horizontal list as we look 19067for breakpoints. This variable is global, since it is used both by |line_break| 19068and by its subprocedure |try_break|. 19069 19070Another global variable called |threshold| is used to determine the feasibility 19071of individual lines: Breakpoints are feasible if there is a way to reach 19072them without creating lines whose badness exceeds |threshold|. (The 19073badness is compared to |threshold| before penalties are added, so that 19074penalty values do not affect the feasibility of breakpoints, except that 19075no break is allowed when the penalty is 10000 or more.) If |threshold| 19076is 10000 or more, all legal breaks are considered feasible, since the 19077|badness| function specified above never returns a value greater than~10000. 19078 19079Up to three passes might be made through the paragraph in an attempt to find at 19080least one set of feasible breakpoints. On the first pass, we have 19081|threshold=pretolerance| and |second_pass=final_pass=false|. 19082If this pass fails to find a 19083feasible solution, |threshold| is set to |tolerance|, |second_pass| is set 19084|true|, and an attempt is made to hyphenate as many words as possible. 19085If that fails too, we add |emergency_stretch| to the background 19086stretchability and set |final_pass=true|. 19087 19088@<Glob...@>= 19089@!cur_p:pointer; {the current breakpoint under consideration} 19090@!second_pass:boolean; {is this our second attempt to break this paragraph?} 19091@!final_pass:boolean; {is this our final attempt to break this paragraph?} 19092@!threshold:integer; {maximum badness on feasible lines} 19093 19094@ The heart of the line-breaking procedure is `|try_break|', a subroutine 19095that tests if the current breakpoint |cur_p| is feasible, by running 19096through the active list to see what lines of text can be made from active 19097nodes to~|cur_p|. If feasible breaks are possible, new break nodes are 19098created. If |cur_p| is too far from an active node, that node is 19099deactivated. 19100 19101The parameter |pi| to |try_break| is the penalty associated 19102with a break at |cur_p|; we have |pi=eject_penalty| if the break is forced, 19103and |pi=inf_penalty| if the break is illegal. 19104 19105The other parameter, |break_type|, is set to |hyphenated| or |unhyphenated|, 19106depending on whether or not the current break is at a |disc_node|. The 19107end of a paragraph is also regarded as `|hyphenated|'; this case is 19108distinguishable by the condition |cur_p=null|. 19109 19110@d copy_to_cur_active(#)==cur_active_width[#]:=active_width[#] 19111@d deactivate=60 {go here when node |r| should be deactivated} 19112 19113@d cp_skipable(#) == {skipable nodes at the margins during character protrusion} 19114( 19115 not is_char_node(#) and 19116 ( 19117 (type(#) = ins_node) 19118 or (type(#) = mark_node) 19119 or (type(#) = adjust_node) 19120 or (type(#) = penalty_node) 19121 or ((type(#) = disc_node) and 19122 (pre_break(#) = null) and 19123 (post_break(#) = null) and 19124 (replace_count(#) = 0)) {an empty |disc_node|} 19125 or ((type(#) = math_node) and (width(#) = 0)) 19126 or ((type(#) = kern_node) and 19127 ((width(#) = 0) or (subtype(#) = normal))) 19128 or ((type(#) = glue_node) and (glue_ptr(#) = zero_glue)) 19129 or ((type(#) = hlist_node) and (width(#) = 0) and (height(#) = 0) and 19130 (depth(#) = 0) and (list_ptr(#) = null)) 19131 ) 19132) 19133 19134@<Declare subprocedures for |line_break|@>= 19135procedure push_node(p: pointer); 19136begin 19137 if hlist_stack_level > max_hlist_stack then 19138 pdf_error("push_node", "stack overflow"); 19139 hlist_stack[hlist_stack_level]:=p; 19140 hlist_stack_level:=hlist_stack_level + 1; 19141end; 19142 19143function pop_node: pointer; 19144begin 19145 hlist_stack_level:=hlist_stack_level - 1; 19146 if hlist_stack_level < 0 then {would point to some bug} 19147 pdf_error("pop_node", "stack underflow (internal error)"); 19148 pop_node:=hlist_stack[hlist_stack_level]; 19149end; 19150 19151function find_protchar_left(l: pointer; d: boolean): pointer; 19152{searches left to right from list head |l|, returns 1st non-skipable item} 19153var t: pointer; 19154 run: boolean; 19155begin 19156 if (link(l) <> null) and (type(l) = hlist_node) and (width(l) = 0) 19157 and (height(l) = 0) and (depth(l) = 0) and (list_ptr(l) = null) then 19158 l:=link(l) {for paragraph start with \.{\\parindent = 0pt}} 19159 else if d then 19160 while (link(l) <> null) and (not (is_char_node(l) or non_discardable(l))) do 19161 l:=link(l); {std.\ discardables at line break, \TeX book, p 95} 19162 hlist_stack_level:=0; 19163 run:=true; 19164 repeat 19165 t:=l; 19166 while run and (type(l) = hlist_node) and (list_ptr(l) <> null) do begin 19167 push_node(l); 19168 l:=list_ptr(l); 19169 end; 19170 while run and cp_skipable(l) do begin 19171 while (link(l) = null) and (hlist_stack_level > 0) do begin 19172 l:=pop_node; {don't visit this node again} 19173 end; 19174 if link(l) <> null then 19175 l:=link(l) 19176 else if hlist_stack_level = 0 then run:=false 19177 end; 19178 until t = l; 19179 find_protchar_left:=l; 19180end; 19181 19182function find_protchar_right(l, r: pointer): pointer; 19183{searches right to left from list tail |r| to head |l|, returns 1st non-skipable item} 19184var t: pointer; 19185 run: boolean; 19186begin 19187 find_protchar_right:=null; 19188 if r = null then return; 19189 hlist_stack_level:=0; 19190 run:=true; 19191 repeat 19192 t:=r; 19193 while run and (type(r) = hlist_node) and (list_ptr(r) <> null) do begin 19194 push_node(l); 19195 push_node(r); 19196 l:=list_ptr(r); 19197 r:=l; 19198 while link(r) <> null do 19199 r:=link(r); 19200 end; 19201 while run and cp_skipable(r) do begin 19202 while (r = l) and (hlist_stack_level > 0) do begin 19203 r:=pop_node; {don't visit this node again} 19204 l:=pop_node; 19205 end; 19206 if (r <> l) and (r <> null) then 19207 r:=prev_rightmost(l, r) 19208 else if (r = l) and (hlist_stack_level = 0) then run:=false 19209 end; 19210 until t = r; 19211 find_protchar_right:=r; 19212end; 19213 19214function total_pw(q, p: pointer): scaled; 19215{returns the total width of character protrusion of a line; 19216|cur_break(break_node(q))| and |p| is the leftmost resp. rightmost node in the 19217horizontal list representing the actual line} 19218var l, r: pointer; 19219 n: integer; 19220begin 19221 if break_node(q) = null then 19222 l:=first_p 19223 else 19224 l:=cur_break(break_node(q)); 19225 r:=prev_rightmost(global_prev_p, p); {get |link(r)=p|} 19226 {let's look at the right margin first} 19227 if (p <> null) and (type(p) = disc_node) and (pre_break(p) <> null) then 19228 {a |disc_node| with non-empty |pre_break|, protrude the last char of |pre_break|} 19229 begin 19230 r:=pre_break(p); 19231 while link(r) <> null do 19232 r:=link(r); 19233 end else r:=find_protchar_right(l, r); 19234 {now the left margin} 19235 if (l <> null) and (type(l) = disc_node) then begin 19236 if post_break(l) <> null then begin 19237 l:=post_break(l); {protrude the first char} 19238 goto done; 19239 end else {discard |replace_count(l)| nodes} 19240 begin 19241 n:=replace_count(l); 19242 l:=link(l); 19243 while n > 0 do begin 19244 if link(l) <> null then 19245 l:=link(l); 19246 decr(n); 19247 end; 19248 end; 19249 end; 19250 l:=find_protchar_left(l, true); 19251done: 19252 total_pw:=left_pw(l) + right_pw(r); 19253end; 19254procedure try_break(@!pi:integer;@!break_type:small_number); 19255label exit,done,done1,continue,deactivate,found,not_found; 19256var r:pointer; {runs through the active list} 19257@!prev_r:pointer; {stays a step behind |r|} 19258@!old_l:halfword; {maximum line number in current equivalence class of lines} 19259@!no_break_yet:boolean; {have we found a feasible break at |cur_p|?} 19260@<Other local variables for |try_break|@>@; 19261begin @<Make sure that |pi| is in the proper range@>; 19262no_break_yet:=true; prev_r:=active; old_l:=0; 19263do_all_six(copy_to_cur_active); 19264loop@+ begin continue: r:=link(prev_r); 19265 @<If node |r| is of type |delta_node|, update |cur_active_width|, 19266 set |prev_r| and |prev_prev_r|, then |goto continue|@>; 19267 @<If a line number class has ended, create new active nodes for 19268 the best feasible breaks in that class; then |return| 19269 if |r=last_active|, otherwise compute the new |line_width|@>; 19270 @<Consider the demerits for a line from |r| to |cur_p|; 19271 deactivate node |r| if it should no longer be active; 19272 then |goto continue| if a line from |r| to |cur_p| is infeasible, 19273 otherwise record a new feasible break@>; 19274 end; 19275exit: @!stat @<Update the value of |printed_node| for 19276 symbolic displays@>@+tats@; 19277end; 19278 19279@ @<Other local variables for |try_break|@>= 19280@!prev_prev_r:pointer; {a step behind |prev_r|, if |type(prev_r)=delta_node|} 19281@!s:pointer; {runs through nodes ahead of |cur_p|} 19282@!q:pointer; {points to a new node being created} 19283@!v:pointer; {points to a glue specification or a node ahead of |cur_p|} 19284@!t:integer; {node count, if |cur_p| is a discretionary node} 19285@!f:internal_font_number; {used in character width calculation} 19286@!l:halfword; {line number of current active node} 19287@!node_r_stays_active:boolean; {should node |r| remain in the active list?} 19288@!line_width:scaled; {the current line will be justified to this width} 19289@!fit_class:very_loose_fit..tight_fit; {possible fitness class of test line} 19290@!b:halfword; {badness of test line} 19291@!d:integer; {demerits of test line} 19292@!artificial_demerits:boolean; {has |d| been forced to zero?} 19293@!save_link:pointer; {temporarily holds value of |link(cur_p)|} 19294@!shortfall:scaled; {used in badness calculations} 19295 19296@ @<Make sure that |pi| is in the proper range@>= 19297if abs(pi)>=inf_penalty then 19298 if pi>0 then return {this breakpoint is inhibited by infinite penalty} 19299 else pi:=eject_penalty {this breakpoint will be forced} 19300 19301@ The following code uses the fact that |type(last_active)<>delta_node|. 19302 19303@d update_width(#)==@| 19304 cur_active_width[#]:=cur_active_width[#]+mem[r+#].sc 19305 19306@<If node |r|...@>= 19307@^inner loop@> 19308if type(r)=delta_node then 19309 begin do_all_six(update_width); 19310 prev_prev_r:=prev_r; prev_r:=r; goto continue; 19311 end 19312 19313@ As we consider various ways to end a line at |cur_p|, in a given line number 19314class, we keep track of the best total demerits known, in an array with 19315one entry for each of the fitness classifications. For example, 19316|minimal_demerits[tight_fit]| contains the fewest total demerits of feasible 19317line breaks ending at |cur_p| with a |tight_fit| line; |best_place[tight_fit]| 19318points to the passive node for the break before~|cur_p| that achieves such 19319an optimum; and |best_pl_line[tight_fit]| is the |line_number| field in the 19320active node corresponding to |best_place[tight_fit]|. When no feasible break 19321sequence is known, the |minimal_demerits| entries will be equal to 19322|awful_bad|, which is $2^{30}-1$. Another variable, |minimum_demerits|, 19323keeps track of the smallest value in the |minimal_demerits| array. 19324 19325@d awful_bad==@'7777777777 {more than a billion demerits} 19326 19327@<Global...@>= 19328@!minimal_demerits:array[very_loose_fit..tight_fit] of integer; {best total 19329 demerits known for current line class and position, given the fitness} 19330@!minimum_demerits:integer; {best total demerits known for current line class 19331 and position} 19332@!best_place:array[very_loose_fit..tight_fit] of pointer; {how to achieve 19333 |minimal_demerits|} 19334@!best_pl_line:array[very_loose_fit..tight_fit] of halfword; {corresponding 19335 line number} 19336 19337@ @<Get ready to start...@>= 19338minimum_demerits:=awful_bad; 19339minimal_demerits[tight_fit]:=awful_bad; 19340minimal_demerits[decent_fit]:=awful_bad; 19341minimal_demerits[loose_fit]:=awful_bad; 19342minimal_demerits[very_loose_fit]:=awful_bad; 19343 19344@ The first part of the following code is part of \TeX's inner loop, so 19345we don't want to waste any time. The current active node, namely node |r|, 19346contains the line number that will be considered next. At the end of the 19347list we have arranged the data structure so that |r=last_active| and 19348|line_number(last_active)>old_l|. 19349@^inner loop@> 19350 19351@<If a line number class...@>= 19352begin l:=line_number(r); 19353if l>old_l then 19354 begin {now we are no longer in the inner loop} 19355 if (minimum_demerits<awful_bad)and@| 19356 ((old_l<>easy_line)or(r=last_active)) then 19357 @<Create new active nodes for the best feasible breaks 19358 just found@>; 19359 if r=last_active then return; 19360 @<Compute the new line width@>; 19361 end; 19362end 19363 19364@ It is not necessary to create new active nodes having |minimal_demerits| 19365greater than 19366|minimum_demerits+abs(adj_demerits)|, since such active nodes will never 19367be chosen in the final paragraph breaks. This observation allows us to 19368omit a substantial number of feasible breakpoints from further consideration. 19369 19370@<Create new active nodes...@>= 19371begin if no_break_yet then @<Compute the values of |break_width|@>; 19372@<Insert a delta node to prepare for breaks at |cur_p|@>; 19373if abs(adj_demerits)>=awful_bad-minimum_demerits then 19374 minimum_demerits:=awful_bad-1 19375else minimum_demerits:=minimum_demerits+abs(adj_demerits); 19376for fit_class:=very_loose_fit to tight_fit do 19377 begin if minimal_demerits[fit_class]<=minimum_demerits then 19378 @<Insert a new active node 19379 from |best_place[fit_class]| to |cur_p|@>; 19380 minimal_demerits[fit_class]:=awful_bad; 19381 end; 19382minimum_demerits:=awful_bad; 19383@<Insert a delta node to prepare for the next active node@>; 19384end 19385 19386@ When we insert a new active node for a break at |cur_p|, suppose this 19387new node is to be placed just before active node |a|; then we essentially 19388want to insert `$\delta\,|cur_p|\,\delta^\prime$' before |a|, where 19389$\delta=\alpha(a)-\alpha(|cur_p|)$ and $\delta^\prime=\alpha(|cur_p|)-\alpha(a)$ 19390in the notation explained above. The |cur_active_width| array now holds 19391$\gamma+\beta(|cur_p|)-\alpha(a)$; so $\delta$ can be obtained by 19392subtracting |cur_active_width| from the quantity $\gamma+\beta(|cur_p|)- 19393\alpha(|cur_p|)$. The latter quantity can be regarded as the length of a 19394line ``from |cur_p| to |cur_p|''; we call it the |break_width| at |cur_p|. 19395 19396The |break_width| is usually negative, since it consists of the background 19397(which is normally zero) minus the width of nodes following~|cur_p| that are 19398eliminated after a break. If, for example, node |cur_p| is a glue node, the 19399width of this glue is subtracted from the background; and we also look 19400ahead to eliminate all subsequent glue and penalty and kern and math 19401nodes, subtracting their widths as well. 19402 19403Kern nodes do not disappear at a line break unless they are |explicit|. 19404 19405@d set_break_width_to_background(#)==break_width[#]:=background[#] 19406 19407@<Compute the values of |break...@>= 19408begin no_break_yet:=false; do_all_six(set_break_width_to_background); 19409s:=cur_p; 19410if break_type>unhyphenated then if cur_p<>null then 19411 @<Compute the discretionary |break_width| values@>; 19412while s<>null do 19413 begin if is_char_node(s) then goto done; 19414 case type(s) of 19415 glue_node:@<Subtract glue from |break_width|@>; 19416 penalty_node: do_nothing; 19417 math_node: break_width[1]:=break_width[1]-width(s); 19418 kern_node: if subtype(s)<>explicit then goto done 19419 else break_width[1]:=break_width[1]-width(s); 19420 othercases goto done 19421 endcases;@/ 19422 s:=link(s); 19423 end; 19424done: end 19425 19426@ @<Subtract glue from |break...@>= 19427begin v:=glue_ptr(s); break_width[1]:=break_width[1]-width(v); 19428break_width[2+stretch_order(v)]:=break_width[2+stretch_order(v)]-stretch(v); 19429break_width[6]:=break_width[6]-shrink(v); 19430end 19431 19432@ When |cur_p| is a discretionary break, the length of a line ``from |cur_p| to 19433|cur_p|'' has to be defined properly so that the other calculations work out. 19434Suppose that the pre-break text at |cur_p| has length $l_0$, the post-break 19435text has length $l_1$, and the replacement text has length |l|. Suppose 19436also that |q| is the node following the replacement text. Then length of a 19437line from |cur_p| to |q| will be computed as $\gamma+\beta(q)-\alpha(|cur_p|)$, 19438where $\beta(q)=\beta(|cur_p|)-l_0+l$. The actual length will be the background 19439plus $l_1$, so the length from |cur_p| to |cur_p| should be $\gamma+l_0+l_1-l$. 19440If the post-break text of the discretionary is empty, a break may also 19441discard~|q|; in that unusual case we subtract the length of~|q| and any 19442other nodes that will be discarded after the discretionary break. 19443 19444The value of $l_0$ need not be computed, since |line_break| will put 19445it into the global variable |disc_width| before calling |try_break|. 19446 19447@<Glob...@>= 19448@!disc_width:scaled; {the length of discretionary material preceding a break} 19449 19450@ @<Compute the discretionary |break...@>= 19451begin t:=replace_count(cur_p); v:=cur_p; s:=post_break(cur_p); 19452while t>0 do 19453 begin decr(t); v:=link(v); 19454 @<Subtract the width of node |v| from |break_width|@>; 19455 end; 19456while s<>null do 19457 begin @<Add the width of node |s| to |break_width|@>; 19458 s:=link(s); 19459 end; 19460break_width[1]:=break_width[1]+disc_width; 19461if post_break(cur_p)=null then s:=link(v); 19462 {nodes may be discardable after the break} 19463end 19464 19465@ Replacement texts and discretionary texts are supposed to contain 19466only character nodes, kern nodes, ligature nodes, and box or rule nodes. 19467 19468@<Subtract the width of node |v|...@>= 19469if is_char_node(v) then 19470 begin f:=font(v); 19471 break_width[1]:=break_width[1]-char_width(f)(char_info(f)(character(v))); 19472 end 19473else case type(v) of 19474 ligature_node: begin f:=font(lig_char(v));@/ 19475 xtx_ligature_present:=true; 19476 break_width[1]:=@|break_width[1]- 19477 char_width(f)(char_info(f)(character(lig_char(v)))); 19478 end; 19479 hlist_node,vlist_node,rule_node,kern_node: 19480 break_width[1]:=break_width[1]-width(v); 19481 whatsit_node: 19482 if (subtype(v)=native_word_node) 19483 or (subtype(v)=glyph_node) 19484 or (subtype(v)=pic_node) 19485 or (subtype(v)=pdf_node) 19486 then break_width[1]:=break_width[1]-width(v) 19487 else confusion("disc1a"); 19488 othercases confusion("disc1") 19489@:this can't happen disc1}{\quad disc1@> 19490 endcases 19491 19492@ @<Add the width of node |s| to |b...@>= 19493if is_char_node(s) then 19494 begin f:=font(s); 19495 break_width[1]:=@|break_width[1]+char_width(f)(char_info(f)(character(s))); 19496 end 19497else case type(s) of 19498 ligature_node: begin f:=font(lig_char(s)); 19499 xtx_ligature_present:=true; 19500 break_width[1]:=break_width[1]+ 19501 char_width(f)(char_info(f)(character(lig_char(s)))); 19502 end; 19503 hlist_node,vlist_node,rule_node,kern_node: 19504 break_width[1]:=break_width[1]+width(s); 19505 whatsit_node: 19506 if (subtype(s)=native_word_node) 19507 or (subtype(s)=glyph_node) 19508 or (subtype(s)=pic_node) 19509 or (subtype(s)=pdf_node) 19510 then break_width[1]:=break_width[1]+width(s) 19511 else confusion("disc2a"); 19512 othercases confusion("disc2") 19513@:this can't happen disc2}{\quad disc2@> 19514 endcases 19515 19516@ We use the fact that |type(active)<>delta_node|. 19517 19518@d convert_to_break_width(#)==@| 19519 mem[prev_r+#].sc:=@|@t\hskip10pt@>mem[prev_r+#].sc 19520 -cur_active_width[#]+break_width[#] 19521@d store_break_width(#)==active_width[#]:=break_width[#] 19522@d new_delta_to_break_width(#)==@| 19523 mem[q+#].sc:=break_width[#]-cur_active_width[#] 19524 19525@<Insert a delta node to prepare for breaks at |cur_p|@>= 19526if type(prev_r)=delta_node then {modify an existing delta node} 19527 begin do_all_six(convert_to_break_width); 19528 end 19529else if prev_r=active then {no delta node needed at the beginning} 19530 begin do_all_six(store_break_width); 19531 end 19532else begin q:=get_node(delta_node_size); link(q):=r; type(q):=delta_node;@/ 19533 subtype(q):=0; {the |subtype| is not used} 19534 do_all_six(new_delta_to_break_width); 19535 link(prev_r):=q; prev_prev_r:=prev_r; prev_r:=q; 19536 end 19537 19538@ When the following code is performed, we will have just inserted at 19539least one active node before |r|, so |type(prev_r)<>delta_node|. 19540 19541@d new_delta_from_break_width(#)==@|mem[q+#].sc:= 19542 cur_active_width[#]-break_width[#] 19543 19544@<Insert a delta node to prepare for the next active node@>= 19545if r<>last_active then 19546 begin q:=get_node(delta_node_size); link(q):=r; type(q):=delta_node;@/ 19547 subtype(q):=0; {the |subtype| is not used} 19548 do_all_six(new_delta_from_break_width); 19549 link(prev_r):=q; prev_prev_r:=prev_r; prev_r:=q; 19550 end 19551 19552@ When we create an active node, we also create the corresponding 19553passive node. 19554 19555@<Insert a new active node from |best_place[fit_class]| to |cur_p|@>= 19556begin q:=get_node(passive_node_size); 19557link(q):=passive; passive:=q; cur_break(q):=cur_p; 19558@!stat incr(pass_number); serial(q):=pass_number;@+tats@;@/ 19559prev_break(q):=best_place[fit_class];@/ 19560q:=get_node(active_node_size); break_node(q):=passive; 19561line_number(q):=best_pl_line[fit_class]+1; 19562fitness(q):=fit_class; type(q):=break_type; 19563total_demerits(q):=minimal_demerits[fit_class]; 19564if do_last_line_fit then 19565 @<Store \(a)additional data in the new active node@>; 19566link(q):=r; link(prev_r):=q; prev_r:=q; 19567@!stat if tracing_paragraphs>0 then 19568 @<Print a symbolic description of the new break node@>; 19569tats@;@/ 19570end 19571 19572@ @<Print a symbolic description of the new break node@>= 19573begin print_nl("@@@@"); print_int(serial(passive)); 19574@.\AT!\AT!@> 19575print(": line "); print_int(line_number(q)-1); 19576print_char("."); print_int(fit_class); 19577if break_type=hyphenated then print_char("-"); 19578print(" t="); print_int(total_demerits(q)); 19579if do_last_line_fit then @<Print additional data in the new active node@>; 19580print(" -> @@@@"); 19581if prev_break(passive)=null then print_char("0") 19582else print_int(serial(prev_break(passive))); 19583end 19584 19585@ The length of lines depends on whether the user has specified 19586\.{\\parshape} or \.{\\hangindent}. If |par_shape_ptr| is not null, it 19587points to a $(2n+1)$-word record in |mem|, where the |info| in the first 19588word contains the value of |n|, and the other $2n$ words contain the left 19589margins and line lengths for the first |n| lines of the paragraph; the 19590specifications for line |n| apply to all subsequent lines. If 19591|par_shape_ptr=null|, the shape of the paragraph depends on the value of 19592|n=hang_after|; if |n>=0|, hanging indentation takes place on lines |n+1|, 19593|n+2|, \dots, otherwise it takes place on lines 1, \dots, $\vert 19594n\vert$. When hanging indentation is active, the left margin is 19595|hang_indent|, if |hang_indent>=0|, else it is 0; the line length is 19596$|hsize|-\vert|hang_indent|\vert$. The normal setting is 19597|par_shape_ptr=null|, |hang_after=1|, and |hang_indent=0|. 19598Note that if |hang_indent=0|, the value of |hang_after| is irrelevant. 19599@^length of lines@> @^hanging indentation@> 19600 19601@<Glob...@>= 19602@!easy_line:halfword; {line numbers |>easy_line| are equivalent in break nodes} 19603@!last_special_line:halfword; {line numbers |>last_special_line| all have 19604 the same width} 19605@!first_width:scaled; {the width of all lines |<=last_special_line|, if 19606 no \.{\\parshape} has been specified} 19607@!second_width:scaled; {the width of all lines |>last_special_line|} 19608@!first_indent:scaled; {left margin to go with |first_width|} 19609@!second_indent:scaled; {left margin to go with |second_width|} 19610 19611@ We compute the values of |easy_line| and the other local variables relating 19612to line length when the |line_break| procedure is initializing itself. 19613 19614@<Get ready to start...@>= 19615if par_shape_ptr=null then 19616 if hang_indent=0 then 19617 begin last_special_line:=0; second_width:=hsize; 19618 second_indent:=0; 19619 end 19620 else @<Set line length parameters in preparation for hanging indentation@> 19621else begin last_special_line:=info(par_shape_ptr)-1; 19622 second_width:=mem[par_shape_ptr+2*(last_special_line+1)].sc; 19623 second_indent:=mem[par_shape_ptr+2*last_special_line+1].sc; 19624 end; 19625if looseness=0 then easy_line:=last_special_line 19626else easy_line:=max_halfword 19627 19628@ @<Set line length parameters in preparation for hanging indentation@>= 19629begin last_special_line:=abs(hang_after); 19630if hang_after<0 then 19631 begin first_width:=hsize-abs(hang_indent); 19632 if hang_indent>=0 then first_indent:=hang_indent 19633 else first_indent:=0; 19634 second_width:=hsize; second_indent:=0; 19635 end 19636else begin first_width:=hsize; first_indent:=0; 19637 second_width:=hsize-abs(hang_indent); 19638 if hang_indent>=0 then second_indent:=hang_indent 19639 else second_indent:=0; 19640 end; 19641end 19642 19643@ When we come to the following code, we have just encountered the first 19644active node~|r| whose |line_number| field contains |l|. Thus we want to 19645compute the length of the $l\mskip1mu$th line of the current paragraph. Furthermore, 19646we want to set |old_l| to the last number in the class of line numbers 19647equivalent to~|l|. 19648 19649@<Compute the new line width@>= 19650if l>easy_line then 19651 begin line_width:=second_width; old_l:=max_halfword-1; 19652 end 19653else begin old_l:=l; 19654 if l>last_special_line then line_width:=second_width 19655 else if par_shape_ptr=null then line_width:=first_width 19656 else line_width:=mem[par_shape_ptr+2*l@,].sc; 19657 end 19658 19659@ The remaining part of |try_break| deals with the calculation of 19660demerits for a break from |r| to |cur_p|. 19661 19662The first thing to do is calculate the badness, |b|. This value will always 19663be between zero and |inf_bad+1|; the latter value occurs only in the 19664case of lines from |r| to |cur_p| that cannot shrink enough to fit the necessary 19665width. In such cases, node |r| will be deactivated. 19666We also deactivate node~|r| when a break at~|cur_p| is forced, since future 19667breaks must go through a forced break. 19668 19669@<Consider the demerits for a line from |r| to |cur_p|...@>= 19670begin artificial_demerits:=false;@/ 19671@^inner loop@> 19672shortfall:=line_width-cur_active_width[1]; {we're this much too short} 19673if XeTeX_protrude_chars > 1 then 19674 shortfall:=shortfall + total_pw(r, cur_p); 19675if shortfall>0 then 19676 @<Set the value of |b| to the badness for stretching the line, 19677 and compute the corresponding |fit_class|@> 19678else @<Set the value of |b| to the badness for shrinking the line, 19679 and compute the corresponding |fit_class|@>; 19680if do_last_line_fit then @<Adjust \(t)the additional data for last line@>; 19681found: 19682if (b>inf_bad)or(pi=eject_penalty) then 19683 @<Prepare to deactivate node~|r|, and |goto deactivate| unless 19684 there is a reason to consider lines of text from |r| to |cur_p|@> 19685else begin prev_r:=r; 19686 if b>threshold then goto continue; 19687 node_r_stays_active:=true; 19688 end; 19689@<Record a new feasible break@>; 19690if node_r_stays_active then goto continue; {|prev_r| has been set to |r|} 19691deactivate: @<Deactivate node |r|@>; 19692end 19693 19694@ When a line must stretch, the available stretchability can be found in the 19695subarray |cur_active_width[2..5]|, in units of points, fil, fill, and filll. 19696 19697The present section is part of \TeX's inner loop, and it is most often performed 19698when the badness is infinite; therefore it is worth while to make a quick 19699test for large width excess and small stretchability, before calling the 19700|badness| subroutine. 19701@^inner loop@> 19702 19703@<Set the value of |b| to the badness for stretching...@>= 19704if (cur_active_width[3]<>0)or(cur_active_width[4]<>0)or@| 19705 (cur_active_width[5]<>0) then 19706 begin if do_last_line_fit then 19707 begin if cur_p=null then {the last line of a paragraph} 19708 @<Perform computations for last line and |goto found|@>; 19709 shortfall:=0; 19710 end; 19711 b:=0; fit_class:=decent_fit; {infinite stretch} 19712 end 19713else begin if shortfall>7230584 then if cur_active_width[2]<1663497 then 19714 begin b:=inf_bad; fit_class:=very_loose_fit; goto done1; 19715 end; 19716 b:=badness(shortfall,cur_active_width[2]); 19717 if b>12 then 19718 if b>99 then fit_class:=very_loose_fit 19719 else fit_class:=loose_fit 19720 else fit_class:=decent_fit; 19721 done1: 19722 end 19723 19724@ Shrinkability is never infinite in a paragraph; 19725we can shrink the line from |r| to |cur_p| by at most |cur_active_width[6]|. 19726 19727@<Set the value of |b| to the badness for shrinking...@>= 19728begin if -shortfall>cur_active_width[6] then b:=inf_bad+1 19729else b:=badness(-shortfall,cur_active_width[6]); 19730if b>12 then fit_class:=tight_fit@+else fit_class:=decent_fit; 19731end 19732 19733@ During the final pass, we dare not lose all active nodes, lest we lose 19734touch with the line breaks already found. The code shown here makes sure 19735that such a catastrophe does not happen, by permitting overfull boxes as 19736a last resort. This particular part of \TeX\ was a source of several subtle 19737bugs before the correct program logic was finally discovered; readers 19738who seek to ``improve'' \TeX\ should therefore think thrice before daring 19739to make any changes here. 19740@^overfull boxes@> 19741 19742@<Prepare to deactivate node~|r|, and |goto deactivate| unless...@>= 19743begin if final_pass and (minimum_demerits=awful_bad) and@| 19744 (link(r)=last_active) and 19745 (prev_r=active) then 19746 artificial_demerits:=true {set demerits zero, this break is forced} 19747else if b>threshold then goto deactivate; 19748node_r_stays_active:=false; 19749end 19750 19751@ When we get to this part of the code, the line from |r| to |cur_p| is 19752feasible, its badness is~|b|, and its fitness classification is |fit_class|. 19753We don't want to make an active node for this break yet, but we will 19754compute the total demerits and record them in the |minimal_demerits| array, 19755if such a break is the current champion among all ways to get to |cur_p| 19756in a given line-number class and fitness class. 19757 19758@<Record a new feasible break@>= 19759if artificial_demerits then d:=0 19760else @<Compute the demerits, |d|, from |r| to |cur_p|@>; 19761@!stat if tracing_paragraphs>0 then 19762 @<Print a symbolic description of this feasible break@>; 19763tats@;@/ 19764d:=d+total_demerits(r); {this is the minimum total demerits 19765 from the beginning to |cur_p| via |r|} 19766if d<=minimal_demerits[fit_class] then 19767 begin minimal_demerits[fit_class]:=d; 19768 best_place[fit_class]:=break_node(r); best_pl_line[fit_class]:=l; 19769 if do_last_line_fit then 19770 @<Store \(a)additional data for this feasible break@>; 19771 if d<minimum_demerits then minimum_demerits:=d; 19772 end 19773 19774@ @<Print a symbolic description of this feasible break@>= 19775begin if printed_node<>cur_p then 19776 @<Print the list between |printed_node| and |cur_p|, 19777 then set |printed_node:=cur_p|@>; 19778print_nl("@@"); 19779@.\AT!@> 19780if cur_p=null then print_esc("par") 19781else if type(cur_p)<>glue_node then 19782 begin if type(cur_p)=penalty_node then print_esc("penalty") 19783 else if type(cur_p)=disc_node then print_esc("discretionary") 19784 else if type(cur_p)=kern_node then print_esc("kern") 19785 else print_esc("math"); 19786 end; 19787print(" via @@@@"); 19788if break_node(r)=null then print_char("0") 19789else print_int(serial(break_node(r))); 19790print(" b="); 19791if b>inf_bad then print_char("*")@+else print_int(b); 19792@.*\relax@> 19793print(" p="); print_int(pi); print(" d="); 19794if artificial_demerits then print_char("*")@+else print_int(d); 19795end 19796 19797@ @<Print the list between |printed_node| and |cur_p|...@>= 19798begin print_nl(""); 19799if cur_p=null then short_display(link(printed_node)) 19800else begin save_link:=link(cur_p); 19801 link(cur_p):=null; print_nl(""); short_display(link(printed_node)); 19802 link(cur_p):=save_link; 19803 end; 19804printed_node:=cur_p; 19805end 19806 19807@ When the data for a discretionary break is being displayed, we will have 19808printed the |pre_break| and |post_break| lists; we want to skip over the 19809third list, so that the discretionary data will not appear twice. The 19810following code is performed at the very end of |try_break|. 19811 19812@<Update the value of |printed_node|...@>= 19813if cur_p=printed_node then if cur_p<>null then if type(cur_p)=disc_node then 19814 begin t:=replace_count(cur_p); 19815 while t>0 do 19816 begin decr(t); printed_node:=link(printed_node); 19817 end; 19818 end 19819 19820@ @<Compute the demerits, |d|, from |r| to |cur_p|@>= 19821begin d:=line_penalty+b; 19822if abs(d)>=10000 then d:=100000000@+else d:=d*d; 19823if pi<>0 then 19824 if pi>0 then d:=d+pi*pi 19825 else if pi>eject_penalty then d:=d-pi*pi; 19826if (break_type=hyphenated)and(type(r)=hyphenated) then 19827 if cur_p<>null then d:=d+double_hyphen_demerits 19828 else d:=d+final_hyphen_demerits; 19829if abs(fit_class-fitness(r))>1 then d:=d+adj_demerits; 19830end 19831 19832@ When an active node disappears, we must delete an adjacent delta node if the 19833active node was at the beginning or the end of the active list, or if it 19834was surrounded by delta nodes. We also must preserve the property that 19835|cur_active_width| represents the length of material from |link(prev_r)| 19836to~|cur_p|. 19837 19838@d combine_two_deltas(#)==@|mem[prev_r+#].sc:=mem[prev_r+#].sc+mem[r+#].sc 19839@d downdate_width(#)==@|cur_active_width[#]:=cur_active_width[#]- 19840 mem[prev_r+#].sc 19841 19842@<Deactivate node |r|@>= 19843link(prev_r):=link(r); free_node(r,active_node_size); 19844if prev_r=active then @<Update the active widths, since the first active 19845 node has been deleted@> 19846else if type(prev_r)=delta_node then 19847 begin r:=link(prev_r); 19848 if r=last_active then 19849 begin do_all_six(downdate_width); 19850 link(prev_prev_r):=last_active; 19851 free_node(prev_r,delta_node_size); prev_r:=prev_prev_r; 19852 end 19853 else if type(r)=delta_node then 19854 begin do_all_six(update_width); 19855 do_all_six(combine_two_deltas); 19856 link(prev_r):=link(r); free_node(r,delta_node_size); 19857 end; 19858 end 19859 19860@ The following code uses the fact that |type(last_active)<>delta_node|. If the 19861active list has just become empty, we do not need to update the 19862|active_width| array, since it will be initialized when an active 19863node is next inserted. 19864 19865@d update_active(#)==active_width[#]:=active_width[#]+mem[r+#].sc 19866 19867@<Update the active widths,...@>= 19868begin r:=link(active); 19869if type(r)=delta_node then 19870 begin do_all_six(update_active); 19871 do_all_six(copy_to_cur_active); 19872 link(active):=link(r); free_node(r,delta_node_size); 19873 end; 19874end 19875 19876@* \[39] Breaking paragraphs into lines, continued. 19877So far we have gotten a little way into the |line_break| routine, having 19878covered its important |try_break| subroutine. Now let's consider the 19879rest of the process. 19880 19881The main loop of |line_break| traverses the given hlist, 19882starting at |link(temp_head)|, and calls |try_break| at each legal 19883breakpoint. A variable called |auto_breaking| is set to true except 19884within math formulas, since glue nodes are not legal breakpoints when 19885they appear in formulas. 19886 19887The current node of interest in the hlist is pointed to by |cur_p|. Another 19888variable, |prev_p|, is usually one step behind |cur_p|, but the real 19889meaning of |prev_p| is this: If |type(cur_p)=glue_node| then |cur_p| is a legal 19890breakpoint if and only if |auto_breaking| is true and |prev_p| does not 19891point to a glue node, penalty node, explicit kern node, or math node. 19892 19893The following declarations provide for a few other local variables that are 19894used in special calculations. 19895 19896@<Local variables for line breaking@>= 19897@!auto_breaking:boolean; {is node |cur_p| outside a formula?} 19898@!prev_p:pointer; {helps to determine when glue nodes are breakpoints} 19899@!q,@!r,@!s,@!prev_s:pointer; {miscellaneous nodes of temporary interest} 19900@!f:internal_font_number; {used when calculating character widths} 19901 19902@ The `\ignorespaces|loop|\unskip' in the following code is performed at most 19903thrice per call of |line_break|, since it is actually a pass over the 19904entire paragraph. 19905 19906@d update_prev_p == begin 19907 prev_p:=cur_p; 19908 global_prev_p:=cur_p; 19909end 19910 19911@<Find optimal breakpoints@>= 19912threshold:=pretolerance; 19913if threshold>=0 then 19914 begin @!stat if tracing_paragraphs>0 then 19915 begin begin_diagnostic; print_nl("@@firstpass");@+end;@;@+tats@;@/ 19916 second_pass:=false; final_pass:=false; 19917 end 19918else begin threshold:=tolerance; second_pass:=true; 19919 final_pass:=(emergency_stretch<=0); 19920 @!stat if tracing_paragraphs>0 then begin_diagnostic;@+tats@; 19921 end; 19922loop@+ begin if threshold>inf_bad then threshold:=inf_bad; 19923 if second_pass then @<Initialize for hyphenating a paragraph@>; 19924 @<Create an active breakpoint representing the beginning of the paragraph@>; 19925 cur_p:=link(temp_head); auto_breaking:=true;@/ 19926 update_prev_p; {glue at beginning is not a legal breakpoint} 19927 first_p:=cur_p; {to access the first node of paragraph as the first active 19928 node has |break_node=null|} 19929 while (cur_p<>null)and(link(active)<>last_active) do 19930 @<Call |try_break| if |cur_p| is a legal breakpoint; 19931 on the second pass, also try to hyphenate the next 19932 word, if |cur_p| is a glue node; 19933 then advance |cur_p| to the next node of the paragraph 19934 that could possibly be a legal breakpoint@>; 19935 if cur_p=null then 19936 @<Try the final line break at the end of the paragraph, 19937 and |goto done| if the desired breakpoints have been found@>; 19938 @<Clean up the memory by removing the break nodes@>; 19939 if not second_pass then 19940 begin@!stat if tracing_paragraphs>0 then print_nl("@@secondpass");@;@+tats@/ 19941 threshold:=tolerance; second_pass:=true; final_pass:=(emergency_stretch<=0); 19942 end {if at first you don't succeed, \dots} 19943 else begin @!stat if tracing_paragraphs>0 then 19944 print_nl("@@emergencypass");@;@+tats@/ 19945 background[2]:=background[2]+emergency_stretch; final_pass:=true; 19946 end; 19947 end; 19948done: @!stat if tracing_paragraphs>0 then 19949 begin end_diagnostic(true); normalize_selector; 19950 end;@+tats@/ 19951if do_last_line_fit then @<Adjust \(t)the final line of the paragraph@>; 19952 19953@ The active node that represents the starting point does not need a 19954corresponding passive node. 19955 19956@d store_background(#)==active_width[#]:=background[#] 19957 19958@<Create an active breakpoint representing the beginning of the paragraph@>= 19959q:=get_node(active_node_size); 19960type(q):=unhyphenated; fitness(q):=decent_fit; 19961link(q):=last_active; break_node(q):=null; 19962line_number(q):=prev_graf+1; total_demerits(q):=0; link(active):=q; 19963if do_last_line_fit then 19964 @<Initialize additional fields of the first active node@>; 19965do_all_six(store_background);@/ 19966passive:=null; printed_node:=temp_head; pass_number:=0; 19967font_in_short_display:=null_font 19968 19969@ @<Clean...@>= 19970q:=link(active); 19971while q<>last_active do 19972 begin cur_p:=link(q); 19973 if type(q)=delta_node then free_node(q,delta_node_size) 19974 else free_node(q,active_node_size); 19975 q:=cur_p; 19976 end; 19977q:=passive; 19978while q<>null do 19979 begin cur_p:=link(q); 19980 free_node(q,passive_node_size); 19981 q:=cur_p; 19982 end 19983 19984@ Here is the main switch in the |line_break| routine, where legal breaks 19985are determined. As we move through the hlist, we need to keep the |active_width| 19986array up to date, so that the badness of individual lines is readily calculated 19987by |try_break|. It is convenient to use the short name |act_width| for 19988the component of active width that represents real width as opposed to glue. 19989 19990@d act_width==active_width[1] {length from first active node to current node} 19991@d kern_break==begin if not is_char_node(link(cur_p)) and auto_breaking then 19992 if type(link(cur_p))=glue_node then try_break(0,unhyphenated); 19993 act_width:=act_width+width(cur_p); 19994 end 19995 19996@<Call |try_break| if |cur_p| is a legal breakpoint...@>= 19997begin if is_char_node(cur_p) then 19998 @<Advance \(c)|cur_p| to the node following the present 19999 string of characters@>; 20000case type(cur_p) of 20001hlist_node,vlist_node,rule_node: act_width:=act_width+width(cur_p); 20002whatsit_node: @<Advance \(p)past a whatsit node in the \(l)|line_break| loop@>; 20003glue_node: begin @<If node |cur_p| is a legal breakpoint, call |try_break|; 20004 then update the active widths by including the glue in |glue_ptr(cur_p)|@>; 20005 if second_pass and auto_breaking then 20006 @<Try to hyphenate the following word@>; 20007 end; 20008kern_node: if subtype(cur_p)=explicit then kern_break 20009 else act_width:=act_width+width(cur_p); 20010ligature_node: begin f:=font(lig_char(cur_p)); 20011 xtx_ligature_present:=true; 20012 act_width:=act_width+char_width(f)(char_info(f)(character(lig_char(cur_p)))); 20013 end; 20014disc_node: @<Try to break after a discretionary fragment, then |goto done5|@>; 20015math_node: begin if subtype(cur_p)<L_code then auto_breaking:=odd(subtype(cur_p)); 20016 kern_break; 20017 end; 20018penalty_node: try_break(penalty(cur_p),unhyphenated); 20019mark_node,ins_node,adjust_node: do_nothing; 20020othercases confusion("paragraph") 20021@:this can't happen paragraph}{\quad paragraph@> 20022endcases;@/ 20023update_prev_p; cur_p:=link(cur_p); 20024done5:end 20025 20026@ The code that passes over the characters of words in a paragraph is 20027part of \TeX's inner loop, so it has been streamlined for speed. We use 20028the fact that `\.{\\parfillskip}' glue appears at the end of each paragraph; 20029it is therefore unnecessary to check if |link(cur_p)=null| when |cur_p| is a 20030character node. 20031@^inner loop@> 20032 20033@<Advance \(c)|cur_p| to the node following the present string...@>= 20034begin update_prev_p; 20035repeat f:=font(cur_p); 20036act_width:=act_width+char_width(f)(char_info(f)(character(cur_p))); 20037cur_p:=link(cur_p); 20038until not is_char_node(cur_p); 20039end 20040 20041@ When node |cur_p| is a glue node, we look at |prev_p| to see whether or not 20042a breakpoint is legal at |cur_p|, as explained above. 20043 20044@<If node |cur_p| is a legal breakpoint, call...@>= 20045if auto_breaking then 20046 begin if is_char_node(prev_p) then try_break(0,unhyphenated) 20047 else if precedes_break(prev_p) then try_break(0,unhyphenated) 20048 else if (type(prev_p)=kern_node)and(subtype(prev_p)<>explicit) then 20049 try_break(0,unhyphenated); 20050 end; 20051check_shrinkage(glue_ptr(cur_p)); q:=glue_ptr(cur_p); 20052act_width:=act_width+width(q);@| 20053active_width[2+stretch_order(q)]:=@| 20054 active_width[2+stretch_order(q)]+stretch(q);@/ 20055active_width[6]:=active_width[6]+shrink(q) 20056 20057@ The following code knows that discretionary texts contain 20058only character nodes, kern nodes, box nodes, rule nodes, and ligature nodes. 20059 20060@<Try to break after a discretionary fragment...@>= 20061begin s:=pre_break(cur_p); disc_width:=0; 20062if s=null then try_break(ex_hyphen_penalty,hyphenated) 20063else begin repeat @<Add the width of node |s| to |disc_width|@>; 20064 s:=link(s); 20065 until s=null; 20066 act_width:=act_width+disc_width; 20067 try_break(hyphen_penalty,hyphenated); 20068 act_width:=act_width-disc_width; 20069 end; 20070r:=replace_count(cur_p); s:=link(cur_p); 20071while r>0 do 20072 begin @<Add the width of node |s| to |act_width|@>; 20073 decr(r); s:=link(s); 20074 end; 20075update_prev_p; cur_p:=s; goto done5; 20076end 20077 20078@ @<Add the width of node |s| to |disc_width|@>= 20079if is_char_node(s) then 20080 begin f:=font(s); 20081 disc_width:=disc_width+char_width(f)(char_info(f)(character(s))); 20082 end 20083else case type(s) of 20084 ligature_node: begin f:=font(lig_char(s)); 20085 xtx_ligature_present:=true; 20086 disc_width:=disc_width+ 20087 char_width(f)(char_info(f)(character(lig_char(s)))); 20088 end; 20089 hlist_node,vlist_node,rule_node,kern_node: 20090 disc_width:=disc_width+width(s); 20091 whatsit_node: 20092 if (subtype(s)=native_word_node) 20093 or (subtype(s)=glyph_node) 20094 or (subtype(s)=pic_node) 20095 or (subtype(s)=pdf_node) 20096 then disc_width:=disc_width+width(s) 20097 else confusion("disc3a"); 20098 othercases confusion("disc3") 20099@:this can't happen disc3}{\quad disc3@> 20100 endcases 20101 20102@ @<Add the width of node |s| to |act_width|@>= 20103if is_char_node(s) then 20104 begin f:=font(s); 20105 act_width:=act_width+char_width(f)(char_info(f)(character(s))); 20106 end 20107else case type(s) of 20108 ligature_node: begin f:=font(lig_char(s)); 20109 xtx_ligature_present:=true; 20110 act_width:=act_width+ 20111 char_width(f)(char_info(f)(character(lig_char(s)))); 20112 end; 20113 hlist_node,vlist_node,rule_node,kern_node: 20114 act_width:=act_width+width(s); 20115 whatsit_node: 20116 if (subtype(s)=native_word_node) 20117 or (subtype(s)=glyph_node) 20118 or (subtype(s)=pic_node) 20119 or (subtype(s)=pdf_node) 20120 then act_width:=act_width+width(s) 20121 else confusion("disc4a"); 20122 othercases confusion("disc4") 20123@:this can't happen disc4}{\quad disc4@> 20124 endcases 20125 20126@ The forced line break at the paragraph's end will reduce the list of 20127breakpoints so that all active nodes represent breaks at |cur_p=null|. 20128On the first pass, we insist on finding an active node that has the 20129correct ``looseness.'' On the final pass, there will be at least one active 20130node, and we will match the desired looseness as well as we can. 20131 20132The global variable |best_bet| will be set to the active node for the best 20133way to break the paragraph, and a few other variables are used to 20134help determine what is best. 20135 20136@<Glob...@>= 20137@!best_bet:pointer; {use this passive node and its predecessors} 20138@!fewest_demerits:integer; {the demerits associated with |best_bet|} 20139@!best_line:halfword; {line number following the last line of the new paragraph} 20140@!actual_looseness:integer; {the difference between |line_number(best_bet)| 20141 and the optimum |best_line|} 20142@!line_diff:integer; {the difference between the current line number and 20143 the optimum |best_line|} 20144 20145@ @<Try the final line break at the end of the paragraph...@>= 20146begin try_break(eject_penalty,hyphenated); 20147if link(active)<>last_active then 20148 begin @<Find an active node with fewest demerits@>; 20149 if looseness=0 then goto done; 20150 @<Find the best active node for the desired looseness@>; 20151 if (actual_looseness=looseness)or final_pass then goto done; 20152 end; 20153end 20154 20155@ @<Find an active node...@>= 20156r:=link(active); fewest_demerits:=awful_bad; 20157repeat if type(r)<>delta_node then if total_demerits(r)<fewest_demerits then 20158 begin fewest_demerits:=total_demerits(r); best_bet:=r; 20159 end; 20160r:=link(r); 20161until r=last_active; 20162best_line:=line_number(best_bet) 20163 20164@ The adjustment for a desired looseness is a slightly more complicated 20165version of the loop just considered. Note that if a paragraph is broken 20166into segments by displayed equations, each segment will be subject to the 20167looseness calculation, independently of the other segments. 20168 20169@<Find the best active node...@>= 20170begin r:=link(active); actual_looseness:=0; 20171repeat if type(r)<>delta_node then 20172 begin line_diff:=line_number(r)-best_line; 20173 if ((line_diff<actual_looseness)and(looseness<=line_diff))or@| 20174 ((line_diff>actual_looseness)and(looseness>=line_diff)) then 20175 begin best_bet:=r; actual_looseness:=line_diff; 20176 fewest_demerits:=total_demerits(r); 20177 end 20178 else if (line_diff=actual_looseness)and@| 20179 (total_demerits(r)<fewest_demerits) then 20180 begin best_bet:=r; fewest_demerits:=total_demerits(r); 20181 end; 20182 end; 20183r:=link(r); 20184until r=last_active; 20185best_line:=line_number(best_bet); 20186end 20187 20188@ Once the best sequence of breakpoints has been found (hurray), we call on the 20189procedure |post_line_break| to finish the remainder of the work. 20190(By introducing this subprocedure, we are able to keep |line_break| 20191from getting extremely long.) 20192 20193@<Break the paragraph at the chosen...@>= 20194post_line_break(d) 20195 20196@ The total number of lines that will be set by |post_line_break| 20197is |best_line-prev_graf-1|. The last breakpoint is specified by 20198|break_node(best_bet)|, and this passive node points to the other breakpoints 20199via the |prev_break| links. The finishing-up phase starts by linking the 20200relevant passive nodes in forward order, changing |prev_break| to 20201|next_break|. (The |next_break| fields actually reside in the same memory 20202space as the |prev_break| fields did, but we give them a new name because 20203of their new significance.) Then the lines are justified, one by one. 20204 20205@d next_break==prev_break {new name for |prev_break| after links are reversed} 20206 20207@<Declare subprocedures for |line_break|@>= 20208procedure post_line_break(@!d:boolean); 20209label done,done1; 20210var q,@!r,@!s:pointer; {temporary registers for list manipulation} 20211 p, k: pointer; 20212 w: scaled; 20213 glue_break: boolean; {was a break at glue?} 20214 ptmp: pointer; 20215@!disc_break:boolean; {was the current break at a discretionary node?} 20216@!post_disc_break:boolean; {and did it have a nonempty post-break part?} 20217@!cur_width:scaled; {width of line number |cur_line|} 20218@!cur_indent:scaled; {left margin of line number |cur_line|} 20219@!t:quarterword; {used for replacement counts in discretionary nodes} 20220@!pen:integer; {use when calculating penalties between lines} 20221@!cur_line: halfword; {the current line number being justified} 20222@!LR_ptr:pointer; {stack of LR codes} 20223begin LR_ptr:=LR_save; 20224@<Reverse the links of the relevant passive nodes, setting |cur_p| to the 20225 first breakpoint@>; 20226cur_line:=prev_graf+1; 20227repeat @<Justify the line ending at breakpoint |cur_p|, and append it to the 20228 current vertical list, together with associated penalties and other 20229 insertions@>; 20230incr(cur_line); cur_p:=next_break(cur_p); 20231if cur_p<>null then if not post_disc_break then 20232 @<Prune unwanted nodes at the beginning of the next line@>; 20233until cur_p=null; 20234if (cur_line<>best_line)or(link(temp_head)<>null) then 20235 confusion("line breaking"); 20236@:this can't happen line breaking}{\quad line breaking@> 20237prev_graf:=best_line-1; 20238LR_save:=LR_ptr; 20239end; 20240 20241@ The job of reversing links in a list is conveniently regarded as the job 20242of taking items off one stack and putting them on another. In this case we 20243take them off a stack pointed to by |q| and having |prev_break| fields; 20244we put them on a stack pointed to by |cur_p| and having |next_break| fields. 20245Node |r| is the passive node being moved from stack to stack. 20246 20247@<Reverse the links of the relevant passive nodes...@>= 20248q:=break_node(best_bet); cur_p:=null; 20249repeat r:=q; q:=prev_break(q); next_break(r):=cur_p; cur_p:=r; 20250until q=null 20251 20252@ Glue and penalty and kern and math nodes are deleted at the beginning of 20253a line, except in the anomalous case that the node to be deleted is actually 20254one of the chosen breakpoints. Otherwise 20255the pruning done here is designed to match 20256the lookahead computation in |try_break|, where the |break_width| values 20257are computed for non-discretionary breakpoints. 20258 20259@<Prune unwanted nodes at the beginning of the next line@>= 20260begin r:=temp_head; 20261loop@+ begin q:=link(r); 20262 if q=cur_break(cur_p) then goto done1; 20263 {|cur_break(cur_p)| is the next breakpoint} 20264 {now |q| cannot be |null|} 20265 if is_char_node(q) then goto done1; 20266 if non_discardable(q) then goto done1; 20267 if type(q)=kern_node then if subtype(q)<>explicit then goto done1; 20268 r:=q; {now |type(q)=glue_node|, |kern_node|, |math_node| or |penalty_node|} 20269 if type(q)=math_node then if TeXXeT_en then 20270 @<Adjust \(t)the LR stack for the |post_line_break| routine@>; 20271 end; 20272done1: if r<>temp_head then 20273 begin link(r):=null; flush_node_list(link(temp_head)); 20274 link(temp_head):=q; 20275 end; 20276end 20277 20278@ The current line to be justified appears in a horizontal list starting 20279at |link(temp_head)| and ending at |cur_break(cur_p)|. If |cur_break(cur_p)| is 20280a glue node, we reset the glue to equal the |right_skip| glue; otherwise 20281we append the |right_skip| glue at the right. If |cur_break(cur_p)| is a 20282discretionary node, we modify the list so that the discretionary break 20283is compulsory, and we set |disc_break| to |true|. We also append 20284the |left_skip| glue at the left of the line, unless it is zero. 20285 20286@<Justify the line ending at breakpoint |cur_p|, and append it...@>= 20287if TeXXeT_en then 20288 @<Insert LR nodes at the beginning of the current line and adjust 20289 the LR stack based on LR nodes in this line@>; 20290@<Modify the end of the line to reflect the nature of the break and to include 20291 \.{\\rightskip}; also set the proper value of |disc_break|@>; 20292if TeXXeT_en then @<Insert LR nodes at the end of the current line@>; 20293@<Put the \(l)\.{\\leftskip} glue at the left and detach this line@>; 20294@<Call the packaging subroutine, setting |just_box| to the justified box@>; 20295@<Append the new box to the current vertical list, followed by the list of 20296 special nodes taken out of the box by the packager@>; 20297@<Append a penalty node, if a nonzero penalty is appropriate@> 20298 20299@ At the end of the following code, |q| will point to the final node on the 20300list about to be justified. 20301 20302@<Modify the end of the line...@>= 20303q:=cur_break(cur_p); disc_break:=false; post_disc_break:=false; 20304glue_break:=false; 20305if q<>null then {|q| cannot be a |char_node|} 20306 if type(q)=glue_node then 20307 begin delete_glue_ref(glue_ptr(q)); 20308 glue_ptr(q):=right_skip; 20309 subtype(q):=right_skip_code+1; add_glue_ref(right_skip); 20310 glue_break:=true; 20311 goto done; 20312 end 20313 else begin if type(q)=disc_node then 20314 @<Change discretionary to compulsory and set 20315 |disc_break:=true|@> 20316 else if type(q)=kern_node then width(q):=0 20317 else if type(q)=math_node then 20318 begin width(q):=0; 20319 if TeXXeT_en then @<Adjust \(t)the LR stack for the |p...@>; 20320 end; 20321 end 20322else begin q:=temp_head; 20323 while link(q)<>null do q:=link(q); 20324 end; 20325done: 20326{at this point |q| is the rightmost breakpoint; the only exception is the case 20327of a discretionary break with non-empty |pre_break|, then |q| has been changed 20328to the last node of the |pre_break| list} 20329if XeTeX_protrude_chars > 0 then begin 20330 if disc_break and (is_char_node(q) or (type(q) <> disc_node)) 20331 {|q| has been reset to the last node of |pre_break|} 20332 then begin 20333 p:=q; 20334 ptmp:=p; 20335 end else begin 20336 p:=prev_rightmost(link(temp_head), q); {get |link(p) = q|} 20337 ptmp:=p; 20338 p:=find_protchar_right(link(temp_head), p); 20339 end; 20340 w:=right_pw(p); 20341 if w <> 0 then {we have found a marginal kern, append it after |ptmp|} 20342 begin 20343 k:=new_margin_kern(-w, last_rightmost_char, right_side); 20344 link(k):=link(ptmp); 20345 link(ptmp):=k; 20346 if (ptmp = q) then 20347 q:=link(q); 20348 end; 20349end; 20350{if |q| was not a breakpoint at glue and has been reset to |rightskip| then 20351 we append |rightskip| after |q| now} 20352if not glue_break then begin 20353 @<Put the \(r)\.{\\rightskip} glue after node |q|@>; 20354end; 20355 20356@ @<Change discretionary to compulsory...@>= 20357begin t:=replace_count(q); 20358@<Destroy the |t| nodes following |q|, and 20359 make |r| point to the following node@>; 20360if post_break(q)<>null then @<Transplant the post-break list@>; 20361if pre_break(q)<>null then @<Transplant the pre-break list@>; 20362link(q):=r; disc_break:=true; 20363end 20364 20365@ @<Destroy the |t| nodes following |q|...@>= 20366if t=0 then r:=link(q) 20367else begin r:=q; 20368 while t>1 do 20369 begin r:=link(r); decr(t); 20370 end; 20371 s:=link(r); 20372 r:=link(s); link(s):=null; 20373 flush_node_list(link(q)); replace_count(q):=0; 20374 end 20375 20376@ We move the post-break list from inside node |q| to the main list by 20377re\-attaching it just before the present node |r|, then resetting |r|. 20378 20379@<Transplant the post-break list@>= 20380begin s:=post_break(q); 20381while link(s)<>null do s:=link(s); 20382link(s):=r; r:=post_break(q); post_break(q):=null; post_disc_break:=true; 20383end 20384 20385@ We move the pre-break list from inside node |q| to the main list by 20386re\-attaching it just after the present node |q|, then resetting |q|. 20387 20388@<Transplant the pre-break list@>= 20389begin s:=pre_break(q); link(q):=s; 20390while link(s)<>null do s:=link(s); 20391pre_break(q):=null; q:=s; 20392end 20393 20394@ @<Put the \(r)\.{\\rightskip} glue after node |q|@>= 20395r:=new_param_glue(right_skip_code); link(r):=link(q); link(q):=r; q:=r 20396 20397@ The following code begins with |q| at the end of the list to be 20398justified. It ends with |q| at the beginning of that list, and with 20399|link(temp_head)| pointing to the remainder of the paragraph, if any. 20400 20401@<Put the \(l)\.{\\leftskip} glue at the left...@>= 20402r:=link(q); link(q):=null; q:=link(temp_head); link(temp_head):=r; 20403{at this point |q| is the leftmost node; all discardable nodes have been discarded} 20404if XeTeX_protrude_chars > 0 then begin 20405 p:=q; 20406 p:=find_protchar_left(p, false); {no more discardables} 20407 w:=left_pw(p); 20408 if w <> 0 then begin 20409 k:=new_margin_kern(-w, last_leftmost_char, left_side); 20410 link(k):=q; 20411 q:=k; 20412 end; 20413end; 20414if left_skip<>zero_glue then 20415 begin r:=new_param_glue(left_skip_code); 20416 link(r):=q; q:=r; 20417 end 20418 20419@ @<Append the new box to the current vertical list...@>= 20420if pre_adjust_head <> pre_adjust_tail then 20421 append_list(pre_adjust_head)(pre_adjust_tail); 20422pre_adjust_tail:=null; 20423append_to_vlist(just_box); 20424if adjust_head <> adjust_tail then 20425 append_list(adjust_head)(adjust_tail); 20426adjust_tail:=null 20427 20428@ Now |q| points to the hlist that represents the current line of the 20429paragraph. We need to compute the appropriate line width, pack the 20430line into a box of this size, and shift the box by the appropriate 20431amount of indentation. 20432 20433@<Call the packaging subroutine...@>= 20434if cur_line>last_special_line then 20435 begin cur_width:=second_width; cur_indent:=second_indent; 20436 end 20437else if par_shape_ptr=null then 20438 begin cur_width:=first_width; cur_indent:=first_indent; 20439 end 20440else begin cur_width:=mem[par_shape_ptr+2*cur_line].sc; 20441 cur_indent:=mem[par_shape_ptr+2*cur_line-1].sc; 20442 end; 20443adjust_tail:=adjust_head; 20444pre_adjust_tail:=pre_adjust_head; 20445just_box:=hpack(q,cur_width,exactly); 20446shift_amount(just_box):=cur_indent 20447 20448@ Penalties between the lines of a paragraph come from club and widow lines, 20449from the |inter_line_penalty| parameter, and from lines that end at 20450discretionary breaks. Breaking between lines of a two-line paragraph gets 20451both club-line and widow-line penalties. The local variable |pen| will 20452be set to the sum of all relevant penalties for the current line, except 20453that the final line is never penalized. 20454 20455@<Append a penalty node, if a nonzero penalty is appropriate@>= 20456if cur_line+1<>best_line then 20457 begin q:=inter_line_penalties_ptr; 20458 if q<>null then 20459 begin r:=cur_line; 20460 if r>penalty(q) then r:=penalty(q); 20461 pen:=penalty(q+r); 20462 end 20463 else pen:=inter_line_penalty; 20464 q:=club_penalties_ptr; 20465 if q<>null then 20466 begin r:=cur_line-prev_graf; 20467 if r>penalty(q) then r:=penalty(q); 20468 pen:=pen+penalty(q+r); 20469 end 20470 else if cur_line=prev_graf+1 then pen:=pen+club_penalty; 20471 if d then q:=display_widow_penalties_ptr 20472 else q:=widow_penalties_ptr; 20473 if q<>null then 20474 begin r:=best_line-cur_line-1; 20475 if r>penalty(q) then r:=penalty(q); 20476 pen:=pen+penalty(q+r); 20477 end 20478 else if cur_line+2=best_line then 20479 if d then pen:=pen+display_widow_penalty 20480 else pen:=pen+widow_penalty; 20481 if disc_break then pen:=pen+broken_penalty; 20482 if pen<>0 then 20483 begin r:=new_penalty(pen); 20484 link(tail):=r; tail:=r; 20485 end; 20486 end 20487 20488@* \[40] Pre-hyphenation. 20489When the line-breaking routine is unable to find a feasible sequence of 20490breakpoints, it makes a second pass over the paragraph, attempting to 20491hyphenate the hyphenatable words. The goal of hyphenation is to insert 20492discretionary material into the paragraph so that there are more 20493potential places to break. 20494 20495The general rules for hyphenation are somewhat complex and technical, 20496because we want to be able to hyphenate words that are preceded or 20497followed by punctuation marks, and because we want the rules to work 20498for languages other than English. We also must contend with the fact 20499that hyphens might radically alter the ligature and kerning structure 20500of a word. 20501 20502A sequence of characters will be considered for hyphenation only if it 20503belongs to a ``potentially hyphenatable part'' of the current paragraph. 20504This is a sequence of nodes $p_0p_1\ldots p_m$ where $p_0$ is a glue node, 20505$p_1\ldots p_{m-1}$ are either character or ligature or whatsit or 20506implicit kern or text direction nodes, and $p_m$ is a glue or penalty or 20507insertion or adjust 20508or mark or whatsit or explicit kern node. (Therefore hyphenation is 20509disabled by boxes, math formulas, and discretionary nodes already inserted 20510by the user.) The ligature nodes among $p_1\ldots p_{m-1}$ are effectively 20511expanded into the original non-ligature characters; the kern nodes and 20512whatsits are ignored. Each character |c| is now classified as either a 20513nonletter (if |lc_code(c)=0|), a lowercase letter (if 20514|lc_code(c)=c|), or an uppercase letter (otherwise); an uppercase letter 20515is treated as if it were |lc_code(c)| for purposes of hyphenation. The 20516characters generated by $p_1\ldots p_{m-1}$ may begin with nonletters; let 20517$c_1$ be the first letter that is not in the middle of a ligature. Whatsit 20518nodes preceding $c_1$ are ignored; a whatsit found after $c_1$ will be the 20519terminating node $p_m$. All characters that do not have the same font as 20520$c_1$ will be treated as nonletters. The |hyphen_char| for that font 20521must be between 0 and 255, otherwise hyphenation will not be attempted. 20522\TeX\ looks ahead for as many consecutive letters $c_1\ldots c_n$ as 20523possible; however, |n| must be less than 64, so a character that would 20524otherwise be $c_{64}$ is effectively not a letter. Furthermore $c_n$ must 20525not be in the middle of a ligature. In this way we obtain a string of 20526letters $c_1\ldots c_n$ that are generated by nodes $p_a\ldots p_b$, where 20527|1<=a<=b+1<=m|. If |n>=l_hyf+r_hyf|, this string qualifies for hyphenation; 20528however, |uc_hyph| must be positive, if $c_1$ is uppercase. 20529 20530The hyphenation process takes place in three stages. First, the candidate 20531sequence $c_1\ldots c_n$ is found; then potential positions for hyphens 20532are determined by referring to hyphenation tables; and finally, the nodes 20533$p_a\ldots p_b$ are replaced by a new sequence of nodes that includes the 20534discretionary breaks found. 20535 20536Fortunately, we do not have to do all this calculation very often, because 20537of the way it has been taken out of \TeX's inner loop. For example, when 20538the second edition of the author's 700-page book {\sl Seminumerical 20539Algorithms} was typeset by \TeX, only about 1.2 hyphenations needed to be 20540@^Knuth, Donald Ervin@> 20541tried per paragraph, since the line breaking algorithm needed to use two 20542passes on only about 5 per cent of the paragraphs. 20543 20544@<Initialize for hyphenating...@>= 20545begin @!init if trie_not_ready then init_trie;@+tini@;@/ 20546cur_lang:=init_cur_lang; l_hyf:=init_l_hyf; r_hyf:=init_r_hyf; 20547set_hyph_index; 20548end 20549 20550@ The letters $c_1\ldots c_n$ that are candidates for hyphenation are placed 20551into an array called |hc|; the number |n| is placed into |hn|; pointers to 20552nodes $p_{a-1}$ and~$p_b$ in the description above are placed into variables 20553|ha| and |hb|; and the font number is placed into |hf|. 20554 20555@<Glob...@>= 20556@!hc:array[0..66] of 0..number_usvs; {word to be hyphenated} 20557{ note that element 0 needs to be a full UnicodeScalar, even though we 20558 basically work in UTF16 } 20559@!hn:small_number; {the number of positions occupied in |hc|} 20560@!ha,@!hb:pointer; {nodes |ha..hb| should be replaced by the hyphenated result} 20561@!hf:internal_font_number; {font number of the letters in |hc|} 20562@!hu:array[0..64] of 0..too_big_char; 20563 {like |hc|, before conversion to lowercase} 20564@!hyf_char:integer; {hyphen character of the relevant font} 20565@!cur_lang,@!init_cur_lang:0..biggest_lang; 20566 {current hyphenation table of interest} 20567@!l_hyf,@!r_hyf,@!init_l_hyf,@!init_r_hyf:integer; {limits on fragment sizes} 20568@!hyf_bchar:halfword; {boundary character after $c_n$} 20569@!max_hyph_char:integer; 20570 20571@ @<Set initial values of key variables@>= 20572max_hyph_char:=too_big_lang; 20573 20574@ Hyphenation routines need a few more local variables. 20575 20576@<Local variables for line...@>= 20577@!j:small_number; {an index into |hc| or |hu|} 20578@!c:UnicodeScalar; {character being considered for hyphenation} 20579 20580@ When the following code is activated, the |line_break| procedure is in its 20581second pass, and |cur_p| points to a glue node. 20582 20583@<Try to hyphenate...@>= 20584begin prev_s:=cur_p; s:=link(prev_s); 20585if s<>null then 20586 begin @<Skip to node |ha|, or |goto done1| if no hyphenation 20587 should be attempted@>; 20588 if l_hyf+r_hyf>63 then goto done1; 20589if is_native_word_node(ha) then begin 20590 @<Check that nodes after |native_word| permit hyphenation; if not, |goto done1|@>; 20591 @<Prepare a |native_word_node| for hyphenation@>; 20592end else begin 20593 @<Skip to node |hb|, putting letters into |hu| and |hc|@>; 20594end; 20595 @<Check that the nodes following |hb| permit hyphenation and that at least 20596 |l_hyf+r_hyf| letters have been found, otherwise |goto done1|@>; 20597 hyphenate; 20598 end; 20599done1: end 20600 20601@ @<Declare subprocedures for |line_break|@>= 20602@t\4@>@<Declare the function called |reconstitute|@> 20603procedure hyphenate; 20604label common_ending,done,found,found1,found2,not_found,exit; 20605var @<Local variables for hyphenation@>@; 20606begin @<Find hyphen locations for the word in |hc|, or |return|@>; 20607@<If no hyphens were found, |return|@>; 20608@<Replace nodes |ha..hb| by a sequence of nodes that includes 20609 the discretionary hyphens@>; 20610exit:end; 20611 20612@ @<Check that nodes after |native_word| permit hyphenation; if not, |goto done1|@>= 20613s:=link(ha); 20614loop@+ begin if not(is_char_node(s)) then 20615 case type(s) of 20616 ligature_node: do_nothing; 20617 kern_node: if subtype(s)<>normal then goto done6; 20618 whatsit_node,glue_node,penalty_node,ins_node,adjust_node,mark_node: 20619 goto done6; 20620 othercases goto done1 20621 endcases; 20622 s:=link(s); 20623 end; 20624done6: 20625 20626@ @<Prepare a |native_word_node| for hyphenation@>= 20627{ note that if there are chars with |lccode = 0|, we split them out into separate |native_word| nodes } 20628hn:=0; 20629restart: 20630for l:=0 to native_length(ha)-1 do begin 20631 c:=get_native_usv(ha, l); 20632 set_lc_code(c); 20633 if (hc[0] = 0) then begin 20634 if (hn > 0) then begin 20635 { we've got some letters, and now found a non-letter, so break off the tail of the |native_word| 20636 and link it after this node, and goto done3 } 20637 @<Split the |native_word_node| at |l| and link the second part after |ha|@>; 20638 goto done3; 20639 end 20640 end else if (hn = 0) and (l > 0) then begin 20641 { we've found the first letter after some non-letters, so break off the head of the |native_word| and restart } 20642 @<Split the |native_word_node| at |l| and link the second part after |ha|@>; 20643 ha:=link(ha); 20644 goto restart; 20645 end else if (hn = 63) then 20646 { reached max hyphenatable length } 20647 goto done3 20648 else begin 20649 { found a letter that is part of a potentially hyphenatable sequence } 20650 incr(hn); 20651 if c<@"10000 then begin 20652 hu[hn]:=c; hc[hn]:=hc[0]; 20653 end 20654 else begin 20655 hu[hn]:=(c - @"10000) div @"400 + @"D800; 20656 hc[hn]:=(hc[0] - @"10000) div @"400 + @"D800; 20657 incr(hn); 20658 hu[hn]:=c mod @"400 + @"DC00; 20659 hc[hn]:=hc[0] mod @"400 + @"DC00; 20660 incr(l); 20661 end; 20662 hyf_bchar:=non_char; 20663 end 20664end; 20665 20666@ @<Split the |native_word_node| at |l| and link the second part after |ha|@>= 20667 q:=new_native_word_node(hf, native_length(ha) - l); 20668 for i:=l to native_length(ha) - 1 do 20669 set_native_char(q, i - l, get_native_char(ha, i)); 20670 set_native_metrics(q, XeTeX_use_glyph_metrics); 20671 link(q):=link(ha); 20672 link(ha):=q; 20673 { truncate text in node |ha| } 20674 native_length(ha):=l; 20675 set_native_metrics(ha, XeTeX_use_glyph_metrics); 20676 20677@ @<Local variables for line breaking@>= 20678l: integer; 20679i: integer; 20680 20681@ The first thing we need to do is find the node |ha| just before the 20682first letter. 20683 20684@<Skip to node |ha|, or |goto done1|...@>= 20685loop@+ begin if is_char_node(s) then 20686 begin c:=qo(character(s)); hf:=font(s); 20687 end 20688 else if type(s)=ligature_node then 20689 if lig_ptr(s)=null then goto continue 20690 else begin q:=lig_ptr(s); c:=qo(character(q)); hf:=font(q); 20691 end 20692 else if (type(s)=kern_node)and(subtype(s)=normal) then goto continue 20693 else if (type(s)=math_node)and(subtype(s)>=L_code) then goto continue 20694 else if type(s)=whatsit_node then 20695 begin 20696 if subtype(s) = native_word_node then begin 20697 { we only consider the node if it contains at least one letter, otherwise we'll skip it } 20698 for l:=0 to native_length(s) - 1 do begin 20699 c:=get_native_usv(s, l); 20700 if lc_code(c) <> 0 then begin 20701 hf:=native_font(s); 20702 prev_s:=s; 20703 goto done2; 20704 end; 20705 if c>=@"10000 then incr(l); 20706 end 20707 end; 20708 @<Advance \(p)past a whatsit node in the \(p)pre-hyphenation loop@>; 20709 goto continue 20710 end 20711 else goto done1; 20712 set_lc_code(c); 20713 if hc[0]<>0 then 20714 if (hc[0]=c)or(uc_hyph>0) then goto done2 20715 else goto done1; 20716continue: prev_s:=s; s:=link(prev_s); 20717 end; 20718done2: hyf_char:=hyphen_char[hf]; 20719if hyf_char<0 then goto done1; 20720if hyf_char>biggest_char then goto done1; 20721ha:=prev_s 20722 20723@ The word to be hyphenated is now moved to the |hu| and |hc| arrays. 20724 20725@<Skip to node |hb|, putting letters...@>= 20726hn:=0; 20727loop@+ begin if is_char_node(s) then 20728 begin if font(s)<>hf then goto done3; 20729 hyf_bchar:=character(s); c:=qo(hyf_bchar); 20730 set_lc_code(c); 20731 if hc[0]=0 then goto done3; 20732 if hc[0]>max_hyph_char then goto done3; 20733 if hn=63 then goto done3; 20734 hb:=s; incr(hn); hu[hn]:=c; hc[hn]:=hc[0]; hyf_bchar:=non_char; 20735 end 20736 else if type(s)=ligature_node then 20737 @<Move the characters of a ligature node to |hu| and |hc|; 20738 but |goto done3| if they are not all letters@> 20739 else if (type(s)=kern_node)and(subtype(s)=normal) then 20740 begin hb:=s; 20741 hyf_bchar:=font_bchar[hf]; 20742 end 20743 else goto done3; 20744 s:=link(s); 20745 end; 20746done3: 20747 20748@ We let |j| be the index of the character being stored when a ligature node 20749is being expanded, since we do not want to advance |hn| until we are sure 20750that the entire ligature consists of letters. Note that it is possible 20751to get to |done3| with |hn=0| and |hb| not set to any value. 20752 20753@<Move the characters of a ligature node to |hu| and |hc|...@>= 20754begin if font(lig_char(s))<>hf then goto done3; 20755j:=hn; q:=lig_ptr(s);@+if q>null then hyf_bchar:=character(q); 20756while q>null do 20757 begin c:=qo(character(q)); 20758 set_lc_code(c); 20759 if hc[0]=0 then goto done3; 20760 if hc[0]>max_hyph_char then goto done3; 20761 if j=63 then goto done3; 20762 incr(j); hu[j]:=c; hc[j]:=hc[0];@/ 20763 q:=link(q); 20764 end; 20765hb:=s; hn:=j; 20766if odd(subtype(s)) then hyf_bchar:=font_bchar[hf]@+else hyf_bchar:=non_char; 20767end 20768 20769@ @<Check that the nodes following |hb| permit hyphenation...@>= 20770if hn<l_hyf+r_hyf then goto done1; {|l_hyf| and |r_hyf| are |>=1|} 20771loop@+ begin if not(is_char_node(s)) then 20772 case type(s) of 20773 ligature_node: do_nothing; 20774 kern_node: if subtype(s)<>normal then goto done4; 20775 whatsit_node,glue_node,penalty_node,ins_node,adjust_node,mark_node: 20776 goto done4; 20777 math_node: if subtype(s)>=L_code then goto done4@+else goto done1; 20778 othercases goto done1 20779 endcases; 20780 s:=link(s); 20781 end; 20782done4: 20783 20784@* \[41] Post-hyphenation. 20785If a hyphen may be inserted between |hc[j]| and |hc[j+1]|, the hyphenation 20786procedure will set |hyf[j]| to some small odd number. But before we look 20787at \TeX's hyphenation procedure, which is independent of the rest of the 20788line-breaking algorithm, let us consider what we will do with the hyphens 20789it finds, since it is better to work on this part of the program before 20790forgetting what |ha| and |hb|, etc., are all about. 20791 20792@<Glob...@>= 20793@!hyf:array [0..64] of 0..9; {odd values indicate discretionary hyphens} 20794@!init_list:pointer; {list of punctuation characters preceding the word} 20795@!init_lig:boolean; {does |init_list| represent a ligature?} 20796@!init_lft:boolean; {if so, did the ligature involve a left boundary?} 20797 20798@ @<Local variables for hyphenation@>= 20799@!i,@!j,@!l:0..65; {indices into |hc| or |hu|} 20800@!q,@!r,@!s:pointer; {temporary registers for list manipulation} 20801@!bchar:halfword; {right boundary character of hyphenated word, or |non_char|} 20802 20803@ \TeX\ will never insert a hyphen that has fewer than 20804\.{\\lefthyphenmin} letters before it or fewer than 20805\.{\\righthyphenmin} after it; hence, a short word has 20806comparatively little chance of being hyphenated. If no hyphens have 20807been found, we can save time by not having to make any changes to the 20808paragraph. 20809 20810@<If no hyphens were found, |return|@>= 20811for j:=l_hyf to hn-r_hyf do if odd(hyf[j]) then goto found1; 20812return; 20813found1: 20814 20815@ If hyphens are in fact going to be inserted, \TeX\ first deletes the 20816subsequence of nodes between |ha| and~|hb|. An attempt is made to 20817preserve the effect that implicit boundary characters and punctuation marks 20818had on ligatures inside the hyphenated word, by storing a left boundary or 20819preceding character in |hu[0]| and by storing a possible right boundary 20820in |bchar|. We set |j:=0| if |hu[0]| is to be part of the reconstruction; 20821otherwise |j:=1|. 20822The variable |s| will point to the tail of the current hlist, and 20823|q| will point to the node following |hb|, so that 20824things can be hooked up after we reconstitute the hyphenated word. 20825 20826@<Replace nodes |ha..hb| by a sequence of nodes...@>= 20827if is_native_word_node(ha) then begin 20828 @<Hyphenate the |native_word_node| at |ha|@>; 20829end else begin 20830q:=link(hb); link(hb):=null; r:=link(ha); link(ha):=null; bchar:=hyf_bchar; 20831if is_char_node(ha) then 20832 if font(ha)<>hf then goto found2 20833 else begin init_list:=ha; init_lig:=false; hu[0]:=qo(character(ha)); 20834 end 20835else if type(ha)=ligature_node then 20836 if font(lig_char(ha))<>hf then goto found2 20837 else begin init_list:=lig_ptr(ha); init_lig:=true; init_lft:=(subtype(ha)>1); 20838 hu[0]:=qo(character(lig_char(ha))); 20839 if init_list=null then if init_lft then 20840 begin hu[0]:=max_hyph_char; init_lig:=false; 20841 end; {in this case a ligature will be reconstructed from scratch} 20842 free_node(ha,small_node_size); 20843 end 20844else begin {no punctuation found; look for left boundary} 20845 if not is_char_node(r) then if type(r)=ligature_node then 20846 if subtype(r)>1 then goto found2; 20847 j:=1; s:=ha; init_list:=null; goto common_ending; 20848 end; 20849s:=cur_p; {we have |cur_p<>ha| because |type(cur_p)=glue_node|} 20850while link(s)<>ha do s:=link(s); 20851j:=0; goto common_ending; 20852found2: s:=ha; j:=0; hu[0]:=max_hyph_char; init_lig:=false; init_list:=null; 20853common_ending: flush_node_list(r); 20854@<Reconstitute nodes for the hyphenated word, inserting discretionary hyphens@>; 20855flush_list(init_list); 20856end 20857 20858@ @<Hyphenate the |native_word_node| at |ha|@>= 20859{ find the node immediately before the word to be hyphenated } 20860s:=cur_p; {we have |cur_p<>ha| because |type(cur_p)=glue_node|} 20861while link(s) <> ha do s:=link(s); 20862 20863{ for each hyphen position, 20864 create a |native_word_node| fragment for the text before this point, 20865 and a |disc_node| for the break, with the |hyf_char| in the |pre_break| text 20866} 20867 20868hyphen_passed:=0; { location of last hyphen we saw } 20869 20870for j:=l_hyf to hn - r_hyf do begin 20871 { if this is a valid break.... } 20872 if odd(hyf[j]) then begin 20873 20874 { make a |native_word_node| for the fragment before the hyphen } 20875 q:=new_native_word_node(hf, j - hyphen_passed); 20876 for i:=0 to j - hyphen_passed - 1 do 20877 set_native_char(q, i, get_native_char(ha, i + hyphen_passed)); 20878 set_native_metrics(q, XeTeX_use_glyph_metrics); 20879 link(s):=q; { append the new node } 20880 s:=q; 20881 20882 { make the |disc_node| for the hyphenation point } 20883 q:=new_disc; 20884 pre_break(q):=new_native_character(hf, hyf_char); 20885 link(s):=q; 20886 s:=q; 20887 20888 hyphen_passed:=j; 20889 end 20890end; 20891 20892{ make a |native_word_node| for the last fragment of the word } 20893hn:=native_length(ha); { ensure trailing punctuation is not lost! } 20894q:=new_native_word_node(hf, hn - hyphen_passed); 20895for i:=0 to hn - hyphen_passed - 1 do 20896 set_native_char(q, i, get_native_char(ha, i + hyphen_passed)); 20897set_native_metrics(q, XeTeX_use_glyph_metrics); 20898link(s):=q; { append the new node } 20899s:=q; 20900 20901q:=link(ha); 20902link(s):=q; 20903link(ha):=null; 20904flush_node_list(ha); 20905 20906@ We must now face the fact that the battle is not over, even though the 20907{\def\!{\kern-1pt}% 20908hyphens have been found: The process of reconstituting a word can be nontrivial 20909because ligatures might change when a hyphen is present. {\sl The \TeX book\/} 20910discusses the difficulties of the word ``difficult'', and 20911the discretionary material surrounding a 20912hyphen can be considerably more complex than that. Suppose 20913\.{abcdef} is a word in a font for which the only ligatures are \.{b\!c}, 20914\.{c\!d}, \.{d\!e}, and \.{e\!f}. If this word permits hyphenation 20915between \.b and \.c, the two patterns with and without hyphenation are 20916$\.a\,\.b\,\.-\,\.{c\!d}\,\.{e\!f}$ and $\.a\,\.{b\!c}\,\.{d\!e}\,\.f$. 20917Thus the insertion of a hyphen might cause effects to ripple arbitrarily 20918far into the rest of the word. A further complication arises if additional 20919hyphens appear together with such rippling, e.g., if the word in the 20920example just given could also be hyphenated between \.c and \.d; \TeX\ 20921avoids this by simply ignoring the additional hyphens in such weird cases.} 20922 20923Still further complications arise in the presence of ligatures that do not 20924delete the original characters. When punctuation precedes the word being 20925hyphenated, \TeX's method is not perfect under all possible scenarios, 20926because punctuation marks and letters can propagate information back and forth. 20927For example, suppose the original pre-hyphenation pair 20928\.{*a} changes to \.{*y} via a \.{\?=:} ligature, which changes to \.{xy} 20929via a \.{=:\?} ligature; if $p_{a-1}=\.x$ and $p_a=\.y$, the reconstitution 20930procedure isn't smart enough to obtain \.{xy} again. In such cases the 20931font designer should include a ligature that goes from \.{xa} to \.{xy}. 20932 20933@ The processing is facilitated by a subroutine called |reconstitute|. Given 20934a string of characters $x_j\ldots x_n$, there is a smallest index $m\ge j$ 20935such that the ``translation'' of $x_j\ldots x_n$ by ligatures and kerning 20936has the form $y_1\ldots y_t$ followed by the translation of $x_{m+1}\ldots x_n$, 20937where $y_1\ldots y_t$ is some nonempty sequence of character, ligature, and 20938kern nodes. We call $x_j\ldots x_m$ a ``cut prefix'' of $x_j\ldots x_n$. 20939For example, if $x_1x_2x_3=\.{fly}$, and if the font contains `fl' as a 20940ligature and a kern between `fl' and `y', then $m=2$, $t=2$, and $y_1$ will 20941be a ligature node for `fl' followed by an appropriate kern node~$y_2$. 20942In the most common case, $x_j$~forms no ligature with $x_{j+1}$ and we 20943simply have $m=j$, $y_1=x_j$. If $m<n$ we can repeat the procedure on 20944$x_{m+1}\ldots x_n$ until the entire translation has been found. 20945 20946The |reconstitute| function returns the integer $m$ and puts the nodes 20947$y_1\ldots y_t$ into a linked list starting at |link(hold_head)|, 20948getting the input $x_j\ldots x_n$ from the |hu| array. If $x_j=256$, 20949we consider $x_j$ to be an implicit left boundary character; in this 20950case |j| must be strictly less than~|n|. There is a 20951parameter |bchar|, which is either 256 or an implicit right boundary character 20952assumed to be present just following~$x_n$. (The value |hu[n+1]| is never 20953explicitly examined, but the algorithm imagines that |bchar| is there.) 20954 20955If there exists an index |k| in the range $j\le k\le m$ such that |hyf[k]| 20956is odd and such that the result of |reconstitute| would have been different 20957if $x_{k+1}$ had been |hchar|, then |reconstitute| sets |hyphen_passed| 20958to the smallest such~|k|. Otherwise it sets |hyphen_passed| to zero. 20959 20960A special convention is used in the case |j=0|: Then we assume that the 20961translation of |hu[0]| appears in a special list of charnodes starting at 20962|init_list|; moreover, if |init_lig| is |true|, then |hu[0]| will be 20963a ligature character, involving a left boundary if |init_lft| is |true|. 20964This facility is provided for cases when a hyphenated 20965word is preceded by punctuation (like single or double quotes) that might 20966affect the translation of the beginning of the word. 20967 20968@<Glob...@>= 20969@!hyphen_passed:small_number; {first hyphen in a ligature, if any} 20970 20971@ @<Declare the function called |reconstitute|@>= 20972function reconstitute(@!j,@!n:small_number;@!bchar,@!hchar:halfword): 20973 small_number; 20974label continue,done; 20975var @!p:pointer; {temporary register for list manipulation} 20976@!t:pointer; {a node being appended to} 20977@!q:four_quarters; {character information or a lig/kern instruction} 20978@!cur_rh:halfword; {hyphen character for ligature testing} 20979@!test_char:halfword; {hyphen or other character for ligature testing} 20980@!w:scaled; {amount of kerning} 20981@!k:font_index; {position of current lig/kern instruction} 20982begin hyphen_passed:=0; t:=hold_head; w:=0; link(hold_head):=null; 20983 {at this point |ligature_present=lft_hit=rt_hit=false|} 20984@<Set up data structures with the cursor following position |j|@>; 20985continue:@<If there's a ligature or kern at the cursor position, update the data 20986 structures, possibly advancing~|j|; continue until the cursor moves@>; 20987@<Append a ligature and/or kern to the translation; 20988 |goto continue| if the stack of inserted ligatures is nonempty@>; 20989reconstitute:=j; 20990end; 20991 20992@ The reconstitution procedure shares many of the global data structures 20993by which \TeX\ has processed the words before they were hyphenated. 20994There is an implied ``cursor'' between characters |cur_l| and |cur_r|; 20995these characters will be tested for possible ligature activity. If 20996|ligature_present| then |cur_l| is a ligature character formed from the 20997original characters following |cur_q| in the current translation list. 20998There is a ``ligature stack'' between the cursor and character |j+1|, 20999consisting of pseudo-ligature nodes linked together by their |link| fields. 21000This stack is normally empty unless a ligature command has created a new 21001character that will need to be processed later. A pseudo-ligature is 21002a special node having a |character| field that represents a potential 21003ligature and a |lig_ptr| field that points to a |char_node| or is |null|. 21004We have 21005$$|cur_r|=\cases{|character(lig_stack)|,&if |lig_stack>null|;\cr 21006 |qi(hu[j+1])|,&if |lig_stack=null| and |j<n|;\cr 21007 bchar,&if |lig_stack=null| and |j=n|.\cr}$$ 21008 21009@<Glob...@>= 21010@!cur_l,@!cur_r:halfword; {characters before and after the cursor} 21011@!cur_q:pointer; {where a ligature should be detached} 21012@!lig_stack:pointer; {unfinished business to the right of the cursor} 21013@!ligature_present:boolean; {should a ligature node be made for |cur_l|?} 21014@!lft_hit,@!rt_hit:boolean; {did we hit a ligature with a boundary character?} 21015 21016@ @d append_charnode_to_t(#)== begin link(t):=get_avail; t:=link(t); 21017 font(t):=hf; character(t):=#; 21018 end 21019@d set_cur_r==begin if j<n then cur_r:=qi(hu[j+1])@+else cur_r:=bchar; 21020 if odd(hyf[j]) then cur_rh:=hchar@+else cur_rh:=non_char; 21021 end 21022 21023@<Set up data structures with the cursor following position |j|@>= 21024cur_l:=qi(hu[j]); cur_q:=t; 21025if j=0 then 21026 begin ligature_present:=init_lig; p:=init_list; 21027 if ligature_present then lft_hit:=init_lft; 21028 while p>null do 21029 begin append_charnode_to_t(character(p)); p:=link(p); 21030 end; 21031 end 21032else if cur_l<non_char then append_charnode_to_t(cur_l); 21033lig_stack:=null; set_cur_r 21034 21035@ We may want to look at the lig/kern program twice, once for a hyphen 21036and once for a normal letter. (The hyphen might appear after the letter 21037in the program, so we'd better not try to look for both at once.) 21038 21039@<If there's a ligature or kern at the cursor position, update...@>= 21040if cur_l=non_char then 21041 begin k:=bchar_label[hf]; 21042 if k=non_address then goto done@+else q:=font_info[k].qqqq; 21043 end 21044else begin q:=char_info(hf)(cur_l); 21045 if char_tag(q)<>lig_tag then goto done; 21046 k:=lig_kern_start(hf)(q); q:=font_info[k].qqqq; 21047 if skip_byte(q)>stop_flag then 21048 begin k:=lig_kern_restart(hf)(q); q:=font_info[k].qqqq; 21049 end; 21050 end; {now |k| is the starting address of the lig/kern program} 21051if cur_rh<non_char then test_char:=cur_rh@+else test_char:=cur_r; 21052loop@+begin if next_char(q)=test_char then if skip_byte(q)<=stop_flag then 21053 if cur_rh<non_char then 21054 begin hyphen_passed:=j; hchar:=non_char; cur_rh:=non_char; 21055 goto continue; 21056 end 21057 else begin if hchar<non_char then if odd(hyf[j]) then 21058 begin hyphen_passed:=j; hchar:=non_char; 21059 end; 21060 if op_byte(q)<kern_flag then 21061 @<Carry out a ligature replacement, updating the cursor structure 21062 and possibly advancing~|j|; |goto continue| if the cursor doesn't 21063 advance, otherwise |goto done|@>; 21064 w:=char_kern(hf)(q); goto done; {this kern will be inserted below} 21065 end; 21066 if skip_byte(q)>=stop_flag then 21067 if cur_rh=non_char then goto done 21068 else begin cur_rh:=non_char; goto continue; 21069 end; 21070 k:=k+qo(skip_byte(q))+1; q:=font_info[k].qqqq; 21071 end; 21072done: 21073 21074@ @d wrap_lig(#)==if ligature_present then 21075 begin p:=new_ligature(hf,cur_l,link(cur_q)); 21076 if lft_hit then 21077 begin subtype(p):=2; lft_hit:=false; 21078 end; 21079 if # then if lig_stack=null then 21080 begin incr(subtype(p)); rt_hit:=false; 21081 end; 21082 link(cur_q):=p; t:=p; ligature_present:=false; 21083 end 21084@d pop_lig_stack==begin if lig_ptr(lig_stack)>null then 21085 begin link(t):=lig_ptr(lig_stack); {this is a charnode for |hu[j+1]|} 21086 t:=link(t); incr(j); 21087 end; 21088 p:=lig_stack; lig_stack:=link(p); free_node(p,small_node_size); 21089 if lig_stack=null then set_cur_r@+else cur_r:=character(lig_stack); 21090 end {if |lig_stack| isn't |null| we have |cur_rh=non_char|} 21091 21092@<Append a ligature and/or kern to the translation...@>= 21093wrap_lig(rt_hit); 21094if w<>0 then 21095 begin link(t):=new_kern(w); t:=link(t); w:=0; 21096 end; 21097if lig_stack>null then 21098 begin cur_q:=t; cur_l:=character(lig_stack); ligature_present:=true; 21099 pop_lig_stack; goto continue; 21100 end 21101 21102@ @<Carry out a ligature replacement, updating the cursor structure...@>= 21103begin if cur_l=non_char then lft_hit:=true; 21104if j=n then if lig_stack=null then rt_hit:=true; 21105check_interrupt; {allow a way out in case there's an infinite ligature loop} 21106case op_byte(q) of 21107qi(1),qi(5):begin cur_l:=rem_byte(q); {\.{=:\?}, \.{=:\?>}} 21108 ligature_present:=true; 21109 end; 21110qi(2),qi(6):begin cur_r:=rem_byte(q); {\.{\?=:}, \.{\?=:>}} 21111 if lig_stack>null then character(lig_stack):=cur_r 21112 else begin lig_stack:=new_lig_item(cur_r); 21113 if j=n then bchar:=non_char 21114 else begin p:=get_avail; lig_ptr(lig_stack):=p; 21115 character(p):=qi(hu[j+1]); font(p):=hf; 21116 end; 21117 end; 21118 end; 21119qi(3):begin cur_r:=rem_byte(q); {\.{\?=:\?}} 21120 p:=lig_stack; lig_stack:=new_lig_item(cur_r); link(lig_stack):=p; 21121 end; 21122qi(7),qi(11):begin wrap_lig(false); {\.{\?=:\?>}, \.{\?=:\?>>}} 21123 cur_q:=t; cur_l:=rem_byte(q); ligature_present:=true; 21124 end; 21125othercases begin cur_l:=rem_byte(q); ligature_present:=true; {\.{=:}} 21126 if lig_stack>null then pop_lig_stack 21127 else if j=n then goto done 21128 else begin append_charnode_to_t(cur_r); incr(j); set_cur_r; 21129 end; 21130 end 21131endcases; 21132if op_byte(q)>qi(4) then if op_byte(q)<>qi(7) then goto done; 21133goto continue; 21134end 21135 21136@ Okay, we're ready to insert the potential hyphenations that were found. 21137When the following program is executed, we want to append the word 21138|hu[1..hn]| after node |ha|, and node |q| should be appended to the result. 21139During this process, the variable |i| will be a temporary 21140index into |hu|; the variable |j| will be an index to our current position 21141in |hu|; the variable |l| will be the counterpart of |j|, in a discretionary 21142branch; the variable |r| will point to new nodes being created; and 21143we need a few new local variables: 21144 21145@<Local variables for hyph...@>= 21146@!major_tail,@!minor_tail:pointer; {the end of lists in the main and 21147 discretionary branches being reconstructed} 21148@!c:UnicodeScalar; {character temporarily replaced by a hyphen} 21149@!c_loc:0..63; {where that character came from} 21150@!r_count:integer; {replacement count for discretionary} 21151@!hyf_node:pointer; {the hyphen, if it exists} 21152 21153@ When the following code is performed, |hyf[0]| and |hyf[hn]| will be zero. 21154 21155@<Reconstitute nodes for the hyphenated word...@>= 21156repeat l:=j; j:=reconstitute(j,hn,bchar,qi(hyf_char))+1; 21157if hyphen_passed=0 then 21158 begin link(s):=link(hold_head); 21159 while link(s)>null do s:=link(s); 21160 if odd(hyf[j-1]) then 21161 begin l:=j; hyphen_passed:=j-1; link(hold_head):=null; 21162 end; 21163 end; 21164if hyphen_passed>0 then 21165 @<Create and append a discretionary node as an alternative to the 21166 unhyphenated word, and continue to develop both branches until they 21167 become equivalent@>; 21168until j>hn; 21169link(s):=q 21170 21171@ In this repeat loop we will insert another discretionary if |hyf[j-1]| is 21172odd, when both branches of the previous discretionary end at position |j-1|. 21173Strictly speaking, we aren't justified in doing this, because we don't know 21174that a hyphen after |j-1| is truly independent of those branches. But in almost 21175all applications we would rather not lose a potentially valuable hyphenation 21176point. (Consider the word `difficult', where the letter `c' is in position |j|.) 21177 21178@d advance_major_tail==begin major_tail:=link(major_tail); incr(r_count); 21179 end 21180 21181@<Create and append a discretionary node as an alternative...@>= 21182repeat r:=get_node(small_node_size); 21183link(r):=link(hold_head); type(r):=disc_node; 21184major_tail:=r; r_count:=0; 21185while link(major_tail)>null do advance_major_tail; 21186i:=hyphen_passed; hyf[i]:=0; 21187@<Put the \(c)characters |hu[l..i]| and a hyphen into |pre_break(r)|@>; 21188@<Put the \(c)characters |hu[i+1..@,]| into |post_break(r)|, appending to this 21189 list and to |major_tail| until synchronization has been achieved@>; 21190@<Move pointer |s| to the end of the current list, and set |replace_count(r)| 21191 appropriately@>; 21192hyphen_passed:=j-1; link(hold_head):=null; 21193until not odd(hyf[j-1]) 21194 21195@ The new hyphen might combine with the previous character via ligature 21196or kern. At this point we have |l-1<=i<j| and |i<hn|. 21197 21198@<Put the \(c)characters |hu[l..i]| and a hyphen into |pre_break(r)|@>= 21199minor_tail:=null; pre_break(r):=null; hyf_node:=new_character(hf,hyf_char); 21200if hyf_node<>null then 21201 begin incr(i); c:=hu[i]; hu[i]:=hyf_char; free_avail(hyf_node); 21202 end; 21203while l<=i do 21204 begin l:=reconstitute(l,i,font_bchar[hf],non_char)+1; 21205 if link(hold_head)>null then 21206 begin if minor_tail=null then pre_break(r):=link(hold_head) 21207 else link(minor_tail):=link(hold_head); 21208 minor_tail:=link(hold_head); 21209 while link(minor_tail)>null do minor_tail:=link(minor_tail); 21210 end; 21211 end; 21212if hyf_node<>null then 21213 begin hu[i]:=c; {restore the character in the hyphen position} 21214 l:=i; decr(i); 21215 end 21216 21217@ The synchronization algorithm begins with |l=i+1<=j|. 21218 21219@<Put the \(c)characters |hu[i+1..@,]| into |post_break(r)|...@>= 21220minor_tail:=null; post_break(r):=null; c_loc:=0; 21221if bchar_label[hf]<>non_address then {put left boundary at beginning of new line} 21222 begin decr(l); c:=hu[l]; c_loc:=l; hu[l]:=max_hyph_char; 21223 end; 21224while l<j do 21225 begin repeat l:=reconstitute(l,hn,bchar,non_char)+1; 21226 if c_loc>0 then 21227 begin hu[c_loc]:=c; c_loc:=0; 21228 end; 21229 if link(hold_head)>null then 21230 begin if minor_tail=null then post_break(r):=link(hold_head) 21231 else link(minor_tail):=link(hold_head); 21232 minor_tail:=link(hold_head); 21233 while link(minor_tail)>null do minor_tail:=link(minor_tail); 21234 end; 21235 until l>=j; 21236 while l>j do 21237 @<Append characters of |hu[j..@,]| to |major_tail|, advancing~|j|@>; 21238 end 21239 21240@ @<Append characters of |hu[j..@,]|...@>= 21241begin j:=reconstitute(j,hn,bchar,non_char)+1; 21242link(major_tail):=link(hold_head); 21243while link(major_tail)>null do advance_major_tail; 21244end 21245 21246@ Ligature insertion can cause a word to grow exponentially in size. Therefore 21247we must test the size of |r_count| here, even though the hyphenated text 21248was at most 63 characters long. 21249 21250@<Move pointer |s| to the end of the current list...@>= 21251if r_count>127 then {we have to forget the discretionary hyphen} 21252 begin link(s):=link(r); link(r):=null; flush_node_list(r); 21253 end 21254else begin link(s):=r; replace_count(r):=r_count; 21255 end; 21256s:=major_tail 21257 21258@* \[42] Hyphenation. 21259When a word |hc[1..hn]| has been set up to contain a candidate for hyphenation, 21260\TeX\ first looks to see if it is in the user's exception dictionary. If not, 21261hyphens are inserted based on patterns that appear within the given word, 21262using an algorithm due to Frank~M. Liang. 21263@^Liang, Franklin Mark@> 21264 21265Let's consider Liang's method first, since it is much more interesting than the 21266exception-lookup routine. The algorithm begins by setting |hyf[j]| to zero 21267for all |j|, and invalid characters are inserted into |hc[0]| 21268and |hc[hn+1]| to serve as delimiters. Then a reasonably fast method is 21269used to see which of a given set of patterns occurs in the word 21270|hc[0..(hn+1)]|. Each pattern $p_1\ldots p_k$ of length |k| has an associated 21271sequence of |k+1| numbers $n_0\ldots n_k$; and if the pattern occurs in 21272|hc[(j+1)..(j+k)]|, \TeX\ will set |hyf[j+i]:=@tmax@>(hyf[j+i],@t$n_i$@>)| for 21273|0<=i<=k|. After this has been done for each pattern that occurs, a 21274discretionary hyphen will be inserted between |hc[j]| and |hc[j+1]| when 21275|hyf[j]| is odd, as we have already seen. 21276 21277The set of patterns $p_1\ldots p_k$ and associated numbers $n_0\ldots n_k$ 21278depends, of course, on the language whose words are being hyphenated, and 21279on the degree of hyphenation that is desired. A method for finding 21280appropriate |p|'s and |n|'s, from a given dictionary of words and acceptable 21281hyphenations, is discussed in Liang's Ph.D. thesis (Stanford University, 212821983); \TeX\ simply starts with the patterns and works from there. 21283 21284@ The patterns are stored in a compact table that is also efficient for 21285retrieval, using a variant of ``trie memory'' [cf.\ {\sl The Art of 21286Computer Programming \bf3} (1973), 481--505]. We can find each pattern 21287$p_1\ldots p_k$ by letting $z_0$ be one greater than the relevant language 21288index and then, for |1<=i<=k|, 21289setting |@t$z_i$@>:=trie_link@t$(z_{i-1})+p_i$@>|; the pattern will be 21290identified by the number $z_k$. Since all the pattern information is 21291packed together into a single |trie_link| array, it is necessary to 21292prevent confusion between the data from inequivalent patterns, so another 21293table is provided such that |trie_char@t$(z_i)=p_i$@>| for all |i|. There 21294is also a table |trie_op|$(z_k)$ to identify the numbers $n_0\ldots n_k$ 21295associated with $p_1\ldots p_k$. 21296 21297Comparatively few different number sequences $n_0\ldots n_k$ actually occur, 21298since most of the |n|'s are generally zero. Therefore the number sequences 21299are encoded in such a way that |trie_op|$(z_k)$ is only one byte long. 21300If |trie_op(@t$z_k$@>)<>min_quarterword|, when $p_1\ldots p_k$ has matched 21301the letters in |hc[(l-k+1)..l@,]| of language |t|, 21302we perform all of the required operations 21303for this pattern by carrying out the following little program: Set 21304|v:=trie_op(@t$z_k$@>)|. Then set |v:=v+op_start[t]|, 21305|hyf[l-hyf_distance[v]]:=@tmax@>(hyf[l-hyf_distance[v]], hyf_num[v])|, 21306and |v:=hyf_next[v]|; repeat, if necessary, until |v=min_quarterword|. 21307 21308@<Types...@>= 21309@!trie_pointer=0..trie_size; {an index into |trie|} 21310 21311@ @d trie_link(#)==trie[#].rh {``downward'' link in a trie} 21312@d trie_char(#)==trie[#].b1 {character matched at this trie location} 21313@d trie_op(#)==trie[#].b0 {program for hyphenation at this trie location} 21314 21315@<Glob...@>= 21316@!trie:array[trie_pointer] of two_halves; {|trie_link|, |trie_char|, |trie_op|} 21317@!hyf_distance:array[1..trie_op_size] of small_number; {position |k-j| of $n_j$} 21318@!hyf_num:array[1..trie_op_size] of small_number; {value of $n_j$} 21319@!hyf_next:array[1..trie_op_size] of quarterword; {continuation code} 21320@!op_start:array[0..biggest_lang] of 0..trie_op_size; {offset for current language} 21321 21322@ @<Local variables for hyph...@>= 21323@!z:trie_pointer; {an index into |trie|} 21324@!v:integer; {an index into |hyf_distance|, etc.} 21325 21326@ Assuming that these auxiliary tables have been set up properly, the 21327hyphenation algorithm is quite short. In the following code we set |hc[hn+2]| 21328to the impossible value 256, in order to guarantee that |hc[hn+3]| will 21329never be fetched. 21330 21331@<Find hyphen locations for the word in |hc|...@>= 21332for j:=0 to hn do hyf[j]:=0; 21333@<Look for the word |hc[1..hn]| in the exception table, and |goto found| (with 21334 |hyf| containing the hyphens) if an entry is found@>; 21335if trie_char(cur_lang+1)<>qi(cur_lang) then return; {no patterns for |cur_lang|} 21336hc[0]:=0; hc[hn+1]:=0; hc[hn+2]:=max_hyph_char; {insert delimiters} 21337for j:=0 to hn-r_hyf+1 do 21338 begin z:=trie_link(cur_lang+1)+hc[j]; l:=j; 21339 while hc[l]=qo(trie_char(z)) do 21340 begin if trie_op(z)<>min_quarterword then 21341 @<Store \(m)maximum values in the |hyf| table@>; 21342 incr(l); z:=trie_link(z)+hc[l]; 21343 end; 21344 end; 21345found: for j:=0 to l_hyf-1 do hyf[j]:=0; 21346for j:=0 to r_hyf-1 do hyf[hn-j]:=0 21347 21348@ @<Store \(m)maximum values in the |hyf| table@>= 21349begin v:=trie_op(z); 21350repeat v:=v+op_start[cur_lang]; i:=l-hyf_distance[v]; 21351if hyf_num[v]>hyf[i] then hyf[i]:=hyf_num[v]; 21352v:=hyf_next[v]; 21353until v=min_quarterword; 21354end 21355 21356@ The exception table that is built by \TeX's \.{\\hyphenation} primitive is 21357organized as an ordered hash table [cf.\ Amble and Knuth, {\sl The Computer 21358@^Amble, Ole@> @^Knuth, Donald Ervin@> 21359Journal\/ \bf17} (1974), 135--142] using linear probing. If $\alpha$ and 21360$\beta$ are words, we will say that $\alpha<\beta$ if $\vert\alpha\vert< 21361\vert\beta\vert$ or if $\vert\alpha\vert=\vert\beta\vert$ and 21362$\alpha$ is lexicographically smaller than $\beta$. (The notation $\vert 21363\alpha\vert$ stands for the length of $\alpha$.) The idea of ordered hashing 21364is to arrange the table so that a given word $\alpha$ can be sought by computing 21365a hash address $h=h(\alpha)$ and then looking in table positions |h|, |h-1|, 21366\dots, until encountering the first word $\L\alpha$. If this word is 21367different from $\alpha$, we can conclude that $\alpha$ is not in the table. 21368 21369The words in the table point to lists in |mem| that specify hyphen positions 21370in their |info| fields. The list for $c_1\ldots c_n$ contains the number |k| if 21371the word $c_1\ldots c_n$ has a discretionary hyphen between $c_k$ and 21372$c_{k+1}$. 21373 21374@<Types...@>= 21375@!hyph_pointer=0..hyph_size; {an index into the ordered hash table} 21376 21377@ @<Glob...@>= 21378@!hyph_word:array[hyph_pointer] of str_number; {exception words} 21379@!hyph_list:array[hyph_pointer] of pointer; {lists of hyphen positions} 21380@!hyph_count:hyph_pointer; {the number of words in the exception dictionary} 21381 21382@ @<Local variables for init...@>= 21383@!z:hyph_pointer; {runs through the exception dictionary} 21384 21385@ @<Set init...@>= 21386for z:=0 to hyph_size do 21387 begin hyph_word[z]:=0; hyph_list[z]:=null; 21388 end; 21389hyph_count:=0; 21390 21391@ The algorithm for exception lookup is quite simple, as soon as we have 21392a few more local variables to work with. 21393 21394@<Local variables for hyph...@>= 21395@!h:hyph_pointer; {an index into |hyph_word| and |hyph_list|} 21396@!k:str_number; {an index into |str_start|} 21397@!u:pool_pointer; {an index into |str_pool|} 21398 21399@ First we compute the hash code |h|, then we search until we either 21400find the word or we don't. Words from different languages are kept 21401separate by appending the language code to the string. 21402 21403@<Look for the word |hc[1...@>= 21404h:=hc[1]; incr(hn); hc[hn]:=cur_lang; 21405for j:=2 to hn do h:=(h+h+hc[j]) mod hyph_size; 21406loop@+ begin @<If the string |hyph_word[h]| is less than \(hc)|hc[1..hn]|, 21407 |goto not_found|; but if the two strings are equal, 21408 set |hyf| to the hyphen positions and |goto found|@>; 21409 if h>0 then decr(h)@+else h:=hyph_size; 21410 end; 21411not_found: decr(hn) 21412 21413@ @<If the string |hyph_word[h]| is less than \(hc)...@>= 21414k:=hyph_word[h]; if k=0 then goto not_found; 21415if length(k)<hn then goto not_found; 21416if length(k)=hn then 21417 begin j:=1; u:=str_start_macro(k); 21418 repeat if so(str_pool[u])<hc[j] then goto not_found; 21419 if so(str_pool[u])>hc[j] then goto done; 21420 incr(j); incr(u); 21421 until j>hn; 21422 @<Insert hyphens as specified in |hyph_list[h]|@>; 21423 decr(hn); goto found; 21424 end; 21425done: 21426 21427@ @<Insert hyphens as specified...@>= 21428s:=hyph_list[h]; 21429while s<>null do 21430 begin hyf[info(s)]:=1; s:=link(s); 21431 end 21432 21433@ @<Search |hyph_list| for pointers to |p|@>= 21434for q:=0 to hyph_size do 21435 begin if hyph_list[q]=p then 21436 begin print_nl("HYPH("); print_int(q); print_char(")"); 21437 end; 21438 end 21439 21440@ We have now completed the hyphenation routine, so the |line_break| procedure 21441is finished at last. Since the hyphenation exception table is fresh in our 21442minds, it's a good time to deal with the routine that adds new entries to it. 21443 21444When \TeX\ has scanned `\.{\\hyphenation}', it calls on a procedure named 21445|new_hyph_exceptions| to do the right thing. 21446 21447@d set_cur_lang==if language<=0 then cur_lang:=0 21448 else if language>biggest_lang then cur_lang:=0 21449 else cur_lang:=language 21450 21451@p procedure new_hyph_exceptions; {enters new exceptions} 21452label reswitch, exit, found, not_found, not_found1; 21453var n:0..64; {length of current word; not always a |small_number|} 21454@!j:0..64; {an index into |hc|} 21455@!h:hyph_pointer; {an index into |hyph_word| and |hyph_list|} 21456@!k:str_number; {an index into |str_start|} 21457@!p:pointer; {head of a list of hyphen positions} 21458@!q:pointer; {used when creating a new node for list |p|} 21459@!s,@!t:str_number; {strings being compared or stored} 21460@!u,@!v:pool_pointer; {indices into |str_pool|} 21461begin scan_left_brace; {a left brace must follow \.{\\hyphenation}} 21462set_cur_lang; 21463@!init if trie_not_ready then 21464 begin hyph_index:=0; goto not_found1; 21465 end; 21466tini@/ 21467set_hyph_index; 21468not_found1: 21469@<Enter as many hyphenation exceptions as are listed, 21470until coming to a right brace; then |return|@>; 21471exit:end; 21472 21473@ @<Enter as many...@>= 21474n:=0; p:=null; 21475loop@+ begin get_x_token; 21476 reswitch: case cur_cmd of 21477 letter,other_char,char_given:@<Append a new letter or hyphen@>; 21478 char_num: begin scan_char_num; cur_chr:=cur_val; cur_cmd:=char_given; 21479 goto reswitch; 21480 end; 21481 spacer,right_brace: begin if n>1 then @<Enter a hyphenation exception@>; 21482 if cur_cmd=right_brace then return; 21483 n:=0; p:=null; 21484 end; 21485 othercases @<Give improper \.{\\hyphenation} error@> 21486 endcases; 21487 end 21488 21489@ @<Give improper \.{\\hyph...@>= 21490begin print_err("Improper "); print_esc("hyphenation"); 21491@.Improper \\hyphenation...@> 21492 print(" will be flushed"); 21493help2("Hyphenation exceptions must contain only letters")@/ 21494 ("and hyphens. But continue; I'll forgive and forget."); 21495error; 21496end 21497 21498@ @<Append a new letter or hyphen@>= 21499if cur_chr="-" then @<Append the value |n| to list |p|@> 21500else begin set_lc_code(cur_chr); 21501 if hc[0]=0 then 21502 begin print_err("Not a letter"); 21503@.Not a letter@> 21504 help2("Letters in \hyphenation words must have \lccode>0.")@/ 21505 ("Proceed; I'll ignore the character I just read."); 21506 error; 21507 end 21508 else if n<63 then 21509 begin incr(n); 21510 if hc[0]<@"10000 then hc[n]:=hc[0] 21511 else begin 21512 hc[n]:=(hc[0] - @"10000) div @"400 + @"D800; 21513 incr(n); 21514 hc[n]:=hc[0] mod @"400 + @"DC00; 21515 end; 21516 end; 21517 end 21518 21519@ @<Append the value |n| to list |p|@>= 21520begin if n<63 then 21521 begin q:=get_avail; link(q):=p; info(q):=n; p:=q; 21522 end; 21523end 21524 21525@ @<Enter a hyphenation exception@>= 21526begin incr(n); hc[n]:=cur_lang; str_room(n); h:=0; 21527for j:=1 to n do 21528 begin h:=(h+h+hc[j]) mod hyph_size; 21529 append_char(hc[j]); 21530 end; 21531s:=make_string; 21532@<Insert the \(p)pair |(s,p)| into the exception table@>; 21533end 21534 21535@ @<Insert the \(p)pair |(s,p)|...@>= 21536if hyph_count=hyph_size then overflow("exception dictionary",hyph_size); 21537@:TeX capacity exceeded exception dictionary}{\quad exception dictionary@> 21538incr(hyph_count); 21539while hyph_word[h]<>0 do 21540 begin @<If the string |hyph_word[h]| is less than \(or)or equal to 21541 |s|, interchange |(hyph_word[h],hyph_list[h])| with |(s,p)|@>; 21542 if h>0 then decr(h)@+else h:=hyph_size; 21543 end; 21544hyph_word[h]:=s; hyph_list[h]:=p 21545 21546@ @<If the string |hyph_word[h]| is less than \(or)...@>= 21547k:=hyph_word[h]; 21548if length(k)<length(s) then goto found; 21549if length(k)>length(s) then goto not_found; 21550u:=str_start_macro(k); v:=str_start_macro(s); 21551repeat if str_pool[u]<str_pool[v] then goto found; 21552if str_pool[u]>str_pool[v] then goto not_found; 21553incr(u); incr(v); 21554until u=str_start_macro(k+1); 21555found:q:=hyph_list[h]; hyph_list[h]:=p; p:=q;@/ 21556t:=hyph_word[h]; hyph_word[h]:=s; s:=t; 21557not_found: 21558 21559@* \[43] Initializing the hyphenation tables. 21560The trie for \TeX's hyphenation algorithm is built from a sequence of 21561patterns following a \.{\\patterns} specification. Such a specification 21562is allowed only in \.{INITEX}, since the extra memory for auxiliary tables 21563and for the initialization program itself would only clutter up the 21564production version of \TeX\ with a lot of deadwood. 21565 21566The first step is to build a trie that is linked, instead of packed 21567into sequential storage, so that insertions are readily made. 21568After all patterns have been processed, \.{INITEX} 21569compresses the linked trie by identifying common subtries. Finally the 21570trie is packed into the efficient sequential form that the hyphenation 21571algorithm actually uses. 21572 21573@<Declare subprocedures for |line_break|@>= 21574@!init @<Declare procedures for preprocessing hyphenation patterns@>@; 21575tini 21576 21577@ Before we discuss trie building in detail, let's consider the simpler 21578problem of creating the |hyf_distance|, |hyf_num|, and |hyf_next| arrays. 21579 21580Suppose, for example, that \TeX\ reads the pattern `\.{ab2cde1}'. This is 21581a pattern of length 5, with $n_0\ldots n_5=0\,0\,2\,0\,0\,1$ in the 21582notation above. We want the corresponding |trie_op| code |v| to have 21583|hyf_distance[v]=3|, |hyf_num[v]=2|, and |hyf_next[v]=@t$v^\prime$@>|, 21584where the auxiliary |trie_op| code $v^\prime$ has 21585|hyf_distance[@t$v^\prime$@>]=0|, |hyf_num[@t$v^\prime$@>]=1|, and 21586|hyf_next[@t$v^\prime$@>]=min_quarterword|. 21587 21588\TeX\ computes an appropriate value |v| with the |new_trie_op| subroutine 21589below, by setting 21590$$\hbox{|@t$v^\prime$@>:=new_trie_op(0,1,min_quarterword)|,\qquad 21591|v:=new_trie_op(3,2,@t$v^\prime$@>)|.}$$ 21592This subroutine looks up its three 21593parameters in a special hash table, assigning a new value only if these 21594three have not appeared before for the current language. 21595 21596The hash table is called |trie_op_hash|, and the number of entries it contains 21597is |trie_op_ptr|. 21598 21599@<Glob...@>= 21600@!init@! trie_op_hash:array[-trie_op_size..trie_op_size] of 0..trie_op_size; 21601 {trie op codes for quadruples} 21602@!trie_used:array[ASCII_code] of quarterword; 21603 {largest opcode used so far for this language} 21604@!trie_op_lang:array[1..trie_op_size] of 0..biggest_lang; 21605 {language part of a hashed quadruple} 21606@!trie_op_val:array[1..trie_op_size] of quarterword; 21607 {opcode corresponding to a hashed quadruple} 21608@!trie_op_ptr:0..trie_op_size; {number of stored ops so far} 21609tini 21610 21611@ It's tempting to remove the |overflow| stops in the following procedure; 21612|new_trie_op| could return |min_quarterword| (thereby simply ignoring 21613part of a hyphenation pattern) instead of aborting the job. However, that would 21614lead to different hyphenation results on different installations of \TeX\ 21615using the same patterns. The |overflow| stops are necessary for portability 21616of patterns. 21617 21618@<Declare procedures for preprocessing hyph...@>= 21619function new_trie_op(@!d,@!n:small_number;@!v:quarterword):quarterword; 21620label exit; 21621var h:-trie_op_size..trie_op_size; {trial hash location} 21622@!u:quarterword; {trial op code} 21623@!l:0..trie_op_size; {pointer to stored data} 21624begin h:=abs(n+313*d+361*v+1009*cur_lang) mod (trie_op_size+trie_op_size) 21625 - trie_op_size; 21626loop@+ begin l:=trie_op_hash[h]; 21627 if l=0 then {empty position found for a new op} 21628 begin if trie_op_ptr=trie_op_size then 21629 overflow("pattern memory ops",trie_op_size); 21630 u:=trie_used[cur_lang]; 21631 if u=max_quarterword then 21632 overflow("pattern memory ops per language", 21633 max_quarterword-min_quarterword); 21634 incr(trie_op_ptr); incr(u); trie_used[cur_lang]:=u; 21635 hyf_distance[trie_op_ptr]:=d; 21636 hyf_num[trie_op_ptr]:=n; hyf_next[trie_op_ptr]:=v; 21637 trie_op_lang[trie_op_ptr]:=cur_lang; trie_op_hash[h]:=trie_op_ptr; 21638 trie_op_val[trie_op_ptr]:=u; new_trie_op:=u; return; 21639 end; 21640 if (hyf_distance[l]=d)and(hyf_num[l]=n)and(hyf_next[l]=v) 21641 and(trie_op_lang[l]=cur_lang) then 21642 begin new_trie_op:=trie_op_val[l]; return; 21643 end; 21644 if h>-trie_op_size then decr(h)@+else h:=trie_op_size; 21645 end; 21646exit:end; 21647 21648@ After |new_trie_op| has compressed the necessary opcode information, 21649plenty of information is available to unscramble the data into the 21650final form needed by our hyphenation algorithm. 21651 21652@<Sort \(t)the hyphenation op tables into proper order@>= 21653op_start[0]:=-min_quarterword; 21654for j:=1 to biggest_lang do op_start[j]:=op_start[j-1]+qo(trie_used[j-1]); 21655for j:=1 to trie_op_ptr do 21656 trie_op_hash[j]:=op_start[trie_op_lang[j]]+trie_op_val[j]; {destination} 21657for j:=1 to trie_op_ptr do while trie_op_hash[j]>j do 21658 begin k:=trie_op_hash[j];@/ 21659 t:=hyf_distance[k]; hyf_distance[k]:=hyf_distance[j]; hyf_distance[j]:=t;@/ 21660 t:=hyf_num[k]; hyf_num[k]:=hyf_num[j]; hyf_num[j]:=t;@/ 21661 t:=hyf_next[k]; hyf_next[k]:=hyf_next[j]; hyf_next[j]:=t;@/ 21662 trie_op_hash[j]:=trie_op_hash[k]; trie_op_hash[k]:=k; 21663 end 21664 21665@ Before we forget how to initialize the data structures that have been 21666mentioned so far, let's write down the code that gets them started. 21667 21668@<Initialize table entries...@>= 21669for k:=-trie_op_size to trie_op_size do trie_op_hash[k]:=0; 21670for k:=0 to 255 do trie_used[k]:=min_quarterword; 21671trie_op_ptr:=0; 21672 21673@ The linked trie that is used to preprocess hyphenation patterns appears 21674in several global arrays. Each node represents an instruction of the form 21675``if you see character |c|, then perform operation |o|, move to the 21676next character, and go to node |l|; otherwise go to node |r|.'' 21677The four quantities |c|, |o|, |l|, and |r| are stored in four arrays 21678|trie_c|, |trie_o|, |trie_l|, and |trie_r|. The root of the trie 21679is |trie_l[0]|, and the number of nodes is |trie_ptr|. Null trie 21680pointers are represented by zero. To initialize the trie, we simply 21681set |trie_l[0]| and |trie_ptr| to zero. We also set |trie_c[0]| to some 21682arbitrary value, since the algorithm may access it. 21683 21684The algorithms maintain the condition 21685$$\hbox{|trie_c[trie_r[z]]>trie_c[z]|\qquad 21686whenever |z<>0| and |trie_r[z]<>0|};$$ in other words, sibling nodes are 21687ordered by their |c| fields. 21688 21689@d trie_root==trie_l[0] {root of the linked trie} 21690 21691@<Glob...@>= 21692@!init @!trie_c:packed array[trie_pointer] of packed_ASCII_code; 21693 {characters to match} 21694@t\hskip10pt@>@!trie_o:packed array[trie_pointer] of quarterword; 21695 {operations to perform} 21696@t\hskip10pt@>@!trie_l:packed array[trie_pointer] of trie_pointer; 21697 {left subtrie links} 21698@t\hskip10pt@>@!trie_r:packed array[trie_pointer] of trie_pointer; 21699 {right subtrie links} 21700@t\hskip10pt@>@!trie_ptr:trie_pointer; {the number of nodes in the trie} 21701@t\hskip10pt@>@!trie_hash:packed array[trie_pointer] of trie_pointer; 21702 {used to identify equivalent subtries} 21703tini 21704 21705@ Let us suppose that a linked trie has already been constructed. 21706Experience shows that we can often reduce its size by recognizing common 21707subtries; therefore another hash table is introduced for this purpose, 21708somewhat similar to |trie_op_hash|. The new hash table will be 21709initialized to zero. 21710 21711The function |trie_node(p)| returns |p| if |p| is distinct from other nodes 21712that it has seen, otherwise it returns the number of the first equivalent 21713node that it has seen. 21714 21715Notice that we might make subtries equivalent even if they correspond to 21716patterns for different languages, in which the trie ops might mean quite 21717different things. That's perfectly all right. 21718 21719@<Declare procedures for preprocessing hyph...@>= 21720function trie_node(@!p:trie_pointer):trie_pointer; {converts 21721 to a canonical form} 21722label exit; 21723var h:trie_pointer; {trial hash location} 21724@!q:trie_pointer; {trial trie node} 21725begin h:=abs(trie_c[p]+1009*trie_o[p]+@| 21726 2718*trie_l[p]+3142*trie_r[p]) mod trie_size; 21727loop@+ begin q:=trie_hash[h]; 21728 if q=0 then 21729 begin trie_hash[h]:=p; trie_node:=p; return; 21730 end; 21731 if (trie_c[q]=trie_c[p])and(trie_o[q]=trie_o[p])and@| 21732 (trie_l[q]=trie_l[p])and(trie_r[q]=trie_r[p]) then 21733 begin trie_node:=q; return; 21734 end; 21735 if h>0 then decr(h)@+else h:=trie_size; 21736 end; 21737exit:end; 21738 21739@ A neat recursive procedure is now able to compress a trie by 21740traversing it and applying |trie_node| to its nodes in ``bottom up'' 21741fashion. We will compress the entire trie by clearing |trie_hash| to 21742zero and then saying `|trie_root:=compress_trie(trie_root)|'. 21743@^recursion@> 21744 21745@<Declare procedures for preprocessing hyph...@>= 21746function compress_trie(@!p:trie_pointer):trie_pointer; 21747begin if p=0 then compress_trie:=0 21748else begin trie_l[p]:=compress_trie(trie_l[p]); 21749 trie_r[p]:=compress_trie(trie_r[p]); 21750 compress_trie:=trie_node(p); 21751 end; 21752end; 21753 21754@ The compressed trie will be packed into the |trie| array using a 21755``top-down first-fit'' procedure. This is a little tricky, so the reader 21756should pay close attention: The |trie_hash| array is cleared to zero 21757again and renamed |trie_ref| for this phase of the operation; later on, 21758|trie_ref[p]| will be nonzero only if the linked trie node |p| is the 21759smallest character 21760in a family and if the characters |c| of that family have been allocated to 21761locations |trie_ref[p]+c| in the |trie| array. Locations of |trie| that 21762are in use will have |trie_link=0|, while the unused holes in |trie| 21763will be doubly linked with |trie_link| pointing to the next larger vacant 21764location and |trie_back| pointing to the next smaller one. This double 21765linking will have been carried out only as far as |trie_max|, where 21766|trie_max| is the largest index of |trie| that will be needed. 21767To save time at the low end of the trie, we maintain array entries 21768|trie_min[c]| pointing to the smallest hole that is greater than~|c|. 21769Another array |trie_taken| tells whether or not a given location is 21770equal to |trie_ref[p]| for some |p|; this array is used to ensure that 21771distinct nodes in the compressed trie will have distinct |trie_ref| 21772entries. 21773 21774@d trie_ref==trie_hash {where linked trie families go into |trie|} 21775@d trie_back(#)==trie[#].lh {backward links in |trie| holes} 21776 21777@<Glob...@>= 21778@!init@!trie_taken:packed array[1..trie_size] of boolean; 21779 {does a family start here?} 21780@t\hskip10pt@>@!trie_min:array[ASCII_code] of trie_pointer; 21781 {the first possible slot for each character} 21782@t\hskip10pt@>@!trie_max:trie_pointer; {largest location used in |trie|} 21783@t\hskip10pt@>@!trie_not_ready:boolean; {is the trie still in linked form?} 21784tini 21785 21786@ Each time \.{\\patterns} appears, it contributes further patterns to 21787the future trie, which will be built only when hyphenation is attempted or 21788when a format file is dumped. The boolean variable |trie_not_ready| 21789will change to |false| when the trie is compressed; this will disable 21790further patterns. 21791 21792@<Initialize table entries...@>= 21793trie_not_ready:=true; trie_root:=0; trie_c[0]:=si(0); trie_ptr:=0; 21794 21795@ Here is how the trie-compression data structures are initialized. 21796If storage is tight, it would be possible to overlap |trie_op_hash|, 21797|trie_op_lang|, and |trie_op_val| with |trie|, |trie_hash|, and |trie_taken|, 21798because we finish with the former just before we need the latter. 21799 21800@<Get ready to compress the trie@>= 21801@<Sort \(t)the hyphenation...@>; 21802for p:=0 to trie_size do trie_hash[p]:=0; 21803hyph_root:=compress_trie(hyph_root); 21804trie_root:=compress_trie(trie_root); {identify equivalent subtries} 21805for p:=0 to trie_ptr do trie_ref[p]:=0; 21806for p:=0 to biggest_char do trie_min[p]:=p+1; 21807trie_link(0):=1; trie_max:=0 21808 21809@ The |first_fit| procedure finds the smallest hole |z| in |trie| such that 21810a trie family starting at a given node |p| will fit into vacant positions 21811starting at |z|. If |c=trie_c[p]|, this means that location |z-c| must 21812not already be taken by some other family, and that |z-c+@t$c^\prime$@>| 21813must be vacant for all characters $c^\prime$ in the family. The procedure 21814sets |trie_ref[p]| to |z-c| when the first fit has been found. 21815 21816@<Declare procedures for preprocessing hyph...@>= 21817procedure first_fit(@!p:trie_pointer); {packs a family into |trie|} 21818label not_found,found; 21819var h:trie_pointer; {candidate for |trie_ref[p]|} 21820@!z:trie_pointer; {runs through holes} 21821@!q:trie_pointer; {runs through the family starting at |p|} 21822@!c:ASCII_code; {smallest character in the family} 21823@!l,@!r:trie_pointer; {left and right neighbors} 21824@!ll:1..too_big_char; {upper limit of |trie_min| updating} 21825begin c:=so(trie_c[p]); 21826z:=trie_min[c]; {get the first conceivably good hole} 21827loop@+ begin h:=z-c;@/ 21828 @<Ensure that |trie_max>=h+max_hyph_char|@>; 21829 if trie_taken[h] then goto not_found; 21830 @<If all characters of the family fit relative to |h|, then 21831 |goto found|,\30\ otherwise |goto not_found|@>; 21832 not_found: z:=trie_link(z); {move to the next hole} 21833 end; 21834found: @<Pack the family into |trie| relative to |h|@>; 21835end; 21836 21837@ By making sure that |trie_max| is at least |h+max_hyph_char|, 21838we can be sure that 21839|trie_max>z|, since |h=z-c|. It follows that location |trie_max| will 21840never be occupied in |trie|, and we will have |trie_max>=trie_link(z)|. 21841 21842@<Ensure that |trie_max>=h+max_hyph_char|@>= 21843if trie_max<h+max_hyph_char then 21844 begin if trie_size<=h+max_hyph_char then overflow("pattern memory",trie_size); 21845@:TeX capacity exceeded pattern memory}{\quad pattern memory@> 21846 repeat incr(trie_max); trie_taken[trie_max]:=false; 21847 trie_link(trie_max):=trie_max+1; trie_back(trie_max):=trie_max-1; 21848 until trie_max=h+max_hyph_char; 21849 end 21850 21851@ @<If all characters of the family fit relative to |h|...@>= 21852q:=trie_r[p]; 21853while q>0 do 21854 begin if trie_link(h+so(trie_c[q]))=0 then goto not_found; 21855 q:=trie_r[q]; 21856 end; 21857goto found 21858 21859@ @<Pack the family into |trie| relative to |h|@>= 21860trie_taken[h]:=true; trie_ref[p]:=h; q:=p; 21861repeat z:=h+so(trie_c[q]); l:=trie_back(z); r:=trie_link(z); 21862trie_back(r):=l; trie_link(l):=r; trie_link(z):=0; 21863if l<max_hyph_char then 21864 begin if z<max_hyph_char then ll:=z @+else ll:=max_hyph_char; 21865 repeat trie_min[l]:=r; incr(l); 21866 until l=ll; 21867 end; 21868q:=trie_r[q]; 21869until q=0 21870 21871@ To pack the entire linked trie, we use the following recursive procedure. 21872@^recursion@> 21873 21874@<Declare procedures for preprocessing hyph...@>= 21875procedure trie_pack(@!p:trie_pointer); {pack subtries of a family} 21876var q:trie_pointer; {a local variable that need not be saved on recursive calls} 21877begin repeat q:=trie_l[p]; 21878if (q>0)and(trie_ref[q]=0) then 21879 begin first_fit(q); trie_pack(q); 21880 end; 21881p:=trie_r[p]; 21882until p=0; 21883end; 21884 21885@ When the whole trie has been allocated into the sequential table, we 21886must go through it once again so that |trie| contains the correct 21887information. Null pointers in the linked trie will be represented by the 21888value~0, which properly implements an ``empty'' family. 21889 21890@<Move the data into |trie|@>= 21891h.rh:=0; h.b0:=min_quarterword; h.b1:=min_quarterword; {|trie_link:=0|, 21892 |trie_op:=min_quarterword|, |trie_char:=qi(0)|} 21893if trie_max=0 then {no patterns were given} 21894 begin for r:=0 to 256 do trie[r]:=h; 21895 trie_max:=256; 21896 end 21897else begin if hyph_root>0 then trie_fix(hyph_root); 21898 if trie_root>0 then trie_fix(trie_root); {this fixes the non-holes in |trie|} 21899 r:=0; {now we will zero out all the holes} 21900 repeat s:=trie_link(r); trie[r]:=h; r:=s; 21901 until r>trie_max; 21902 end; 21903trie_char(0):=qi("?"); {make |trie_char(c)<>c| for all |c|} 21904 21905@ The fixing-up procedure is, of course, recursive. Since the linked trie 21906usually has overlapping subtries, the same data may be moved several 21907times; but that causes no harm, and at most as much work is done as it 21908took to build the uncompressed trie. 21909@^recursion@> 21910 21911@<Declare procedures for preprocessing hyph...@>= 21912procedure trie_fix(@!p:trie_pointer); {moves |p| and its siblings into |trie|} 21913var q:trie_pointer; {a local variable that need not be saved on recursive calls} 21914@!c:ASCII_code; {another one that need not be saved} 21915@!z:trie_pointer; {|trie| reference; this local variable must be saved} 21916begin z:=trie_ref[p]; 21917repeat q:=trie_l[p]; c:=so(trie_c[p]); 21918trie_link(z+c):=trie_ref[q]; trie_char(z+c):=qi(c); trie_op(z+c):=trie_o[p]; 21919if q>0 then trie_fix(q); 21920p:=trie_r[p]; 21921until p=0; 21922end; 21923 21924@ Now let's go back to the easier problem, of building the linked 21925trie. When \.{INITEX} has scanned the `\.{\\patterns}' control 21926sequence, it calls on |new_patterns| to do the right thing. 21927 21928@<Declare procedures for preprocessing hyph...@>= 21929procedure new_patterns; {initializes the hyphenation pattern data} 21930label done, done1; 21931var k,@!l:0..64; {indices into |hc| and |hyf|; 21932 not always in |small_number| range} 21933@!digit_sensed:boolean; {should the next digit be treated as a letter?} 21934@!v:quarterword; {trie op code} 21935@!p,@!q:trie_pointer; {nodes of trie traversed during insertion} 21936@!first_child:boolean; {is |p=trie_l[q]|?} 21937@!c:ASCII_code; {character being inserted} 21938begin if trie_not_ready then 21939 begin set_cur_lang; scan_left_brace; {a left brace must follow \.{\\patterns}} 21940 @<Enter all of the patterns into a linked trie, until coming to a right 21941 brace@>; 21942 if saving_hyph_codes>0 then 21943 @<Store hyphenation codes for current language@>; 21944 end 21945else begin print_err("Too late for "); print_esc("patterns"); 21946 help1("All patterns must be given before typesetting begins."); 21947 error; link(garbage):=scan_toks(false,false); flush_list(def_ref); 21948 end; 21949end; 21950 21951@ Novices are not supposed to be using \.{\\patterns}, so the error 21952messages are terse. (Note that all error messages appear in \TeX's string 21953pool, even if they are used only by \.{INITEX}.) 21954 21955@<Enter all of the patterns into a linked trie...@>= 21956k:=0; hyf[0]:=0; digit_sensed:=false; 21957loop@+ begin get_x_token; 21958 case cur_cmd of 21959 letter,other_char:@<Append a new letter or a hyphen level@>; 21960 spacer,right_brace: begin if k>0 then 21961 @<Insert a new pattern into the linked trie@>; 21962 if cur_cmd=right_brace then goto done; 21963 k:=0; hyf[0]:=0; digit_sensed:=false; 21964 end; 21965 othercases begin print_err("Bad "); print_esc("patterns"); 21966@.Bad \\patterns@> 21967 help1("(See Appendix H.)"); error; 21968 end 21969 endcases; 21970 end; 21971done: 21972 21973@ @<Append a new letter or a hyphen level@>= 21974if digit_sensed or(cur_chr<"0")or(cur_chr>"9") then 21975 begin if cur_chr="." then cur_chr:=0 {edge-of-word delimiter} 21976 else begin cur_chr:=lc_code(cur_chr); 21977 if cur_chr=0 then 21978 begin print_err("Nonletter"); 21979@.Nonletter@> 21980 help1("(See Appendix H.)"); error; 21981 end; 21982 end; 21983 if cur_chr>max_hyph_char then max_hyph_char:=cur_chr; 21984 if k<63 then 21985 begin incr(k); hc[k]:=cur_chr; hyf[k]:=0; digit_sensed:=false; 21986 end; 21987 end 21988else if k<63 then 21989 begin hyf[k]:=cur_chr-"0"; digit_sensed:=true; 21990 end 21991 21992@ When the following code comes into play, the pattern $p_1\ldots p_k$ 21993appears in |hc[1..k]|, and the corresponding sequence of numbers $n_0\ldots 21994n_k$ appears in |hyf[0..k]|. 21995 21996@<Insert a new pattern into the linked trie@>= 21997begin @<Compute the trie op code, |v|, and set |l:=0|@>; 21998q:=0; hc[0]:=cur_lang; 21999while l<=k do 22000 begin c:=hc[l]; incr(l); p:=trie_l[q]; first_child:=true; 22001 while (p>0)and(c>so(trie_c[p])) do 22002 begin q:=p; p:=trie_r[q]; first_child:=false; 22003 end; 22004 if (p=0)or(c<so(trie_c[p])) then 22005 @<Insert a new trie node between |q| and |p|, and 22006 make |p| point to it@>; 22007 q:=p; {now node |q| represents $p_1\ldots p_{l-1}$} 22008 end; 22009if trie_o[q]<>min_quarterword then 22010 begin print_err("Duplicate pattern"); 22011@.Duplicate pattern@> 22012 help1("(See Appendix H.)"); error; 22013 end; 22014trie_o[q]:=v; 22015end 22016 22017@ @<Insert a new trie node between |q| and |p|...@>= 22018begin if trie_ptr=trie_size then overflow("pattern memory",trie_size); 22019@:TeX capacity exceeded pattern memory}{\quad pattern memory@> 22020incr(trie_ptr); trie_r[trie_ptr]:=p; p:=trie_ptr; trie_l[p]:=0; 22021if first_child then trie_l[q]:=p@+else trie_r[q]:=p; 22022trie_c[p]:=si(c); trie_o[p]:=min_quarterword; 22023end 22024 22025@ @<Compute the trie op code, |v|...@>= 22026if hc[1]=0 then hyf[0]:=0; 22027if hc[k]=0 then hyf[k]:=0; 22028l:=k; v:=min_quarterword; 22029loop@+ begin if hyf[l]<>0 then v:=new_trie_op(k-l,hyf[l],v); 22030 if l>0 then decr(l)@+else goto done1; 22031 end; 22032done1: 22033 22034@ Finally we put everything together: Here is how the trie gets to its 22035final, efficient form. 22036The following packing routine is rigged so that the root of the linked 22037tree gets mapped into location 1 of |trie|, as required by the hyphenation 22038algorithm. This happens because the first call of |first_fit| will 22039``take'' location~1. 22040 22041@<Declare procedures for preprocessing hyphenation patterns@>= 22042procedure init_trie; 22043var @!p:trie_pointer; {pointer for initialization} 22044@!j,@!k,@!t:integer; {all-purpose registers for initialization} 22045@!r,@!s:trie_pointer; {used to clean up the packed |trie|} 22046@!h:two_halves; {template used to zero out |trie|'s holes} 22047begin 22048incr(max_hyph_char); 22049@<Get ready to compress the trie@>; 22050if trie_root<>0 then 22051 begin first_fit(trie_root); trie_pack(trie_root); 22052 end; 22053if hyph_root<>0 then @<Pack all stored |hyph_codes|@>; 22054@<Move the data into |trie|@>; 22055trie_not_ready:=false; 22056end; 22057 22058@* \[44] Breaking vertical lists into pages. 22059The |vsplit| procedure, which implements \TeX's \.{\\vsplit} operation, 22060is considerably simpler than |line_break| because it doesn't have to 22061worry about hyphenation, and because its mission is to discover a single 22062break instead of an optimum sequence of breakpoints. But before we get 22063into the details of |vsplit|, we need to consider a few more basic things. 22064 22065@ A subroutine called |prune_page_top| takes a pointer to a vlist and 22066returns a pointer to a modified vlist in which all glue, kern, and penalty nodes 22067have been deleted before the first box or rule node. However, the first 22068box or rule is actually preceded by a newly created glue node designed so that 22069the topmost baseline will be at distance |split_top_skip| from the top, 22070whenever this is possible without backspacing. 22071 22072When the second argument |s| is |false| the deleted nodes are destroyed, 22073otherwise they are collected in a list starting at |split_disc|. 22074 22075In this routine and those that follow, we make use of the fact that a 22076vertical list contains no character nodes, hence the |type| field exists 22077for each node in the list. 22078@^data structure assumptions@> 22079 22080@p function prune_page_top(@!p:pointer;@!s:boolean):pointer; 22081 {adjust top after page break} 22082var prev_p:pointer; {lags one step behind |p|} 22083@!q,@!r:pointer; {temporary variables for list manipulation} 22084begin prev_p:=temp_head; link(temp_head):=p; 22085while p<>null do 22086 case type(p) of 22087 hlist_node,vlist_node,rule_node:@<Insert glue for |split_top_skip| 22088 and set~|p:=null|@>; 22089 whatsit_node,mark_node,ins_node: begin prev_p:=p; p:=link(prev_p); 22090 end; 22091 glue_node,kern_node,penalty_node: begin q:=p; p:=link(q); link(q):=null; 22092 link(prev_p):=p; 22093 if s then 22094 begin if split_disc=null then split_disc:=q@+else link(r):=q; 22095 r:=q; 22096 end 22097 else flush_node_list(q); 22098 end; 22099 othercases confusion("pruning") 22100@:this can't happen pruning}{\quad pruning@> 22101 endcases; 22102prune_page_top:=link(temp_head); 22103end; 22104 22105@ @<Insert glue for |split_top_skip|...@>= 22106begin q:=new_skip_param(split_top_skip_code); link(prev_p):=q; link(q):=p; 22107 {now |temp_ptr=glue_ptr(q)|} 22108if width(temp_ptr)>height(p) then width(temp_ptr):=width(temp_ptr)-height(p) 22109else width(temp_ptr):=0; 22110p:=null; 22111end 22112 22113@ The next subroutine finds the best place to break a given vertical list 22114so as to obtain a box of height~|h|, with maximum depth~|d|. 22115A pointer to the beginning of the vertical list is given, 22116and a pointer to the optimum breakpoint is returned. The list is effectively 22117followed by a forced break, i.e., a penalty node with the |eject_penalty|; 22118if the best break occurs at this artificial node, the value |null| is returned. 22119 22120An array of six |scaled| distances is used to keep track of the height 22121from the beginning of the list to the current place, just as in |line_break|. 22122In fact, we use one of the same arrays, only changing its name to reflect 22123its new significance. 22124 22125@d active_height==active_width {new name for the six distance variables} 22126@d cur_height==active_height[1] {the natural height} 22127@d set_height_zero(#)==active_height[#]:=0 {initialize the height to zero} 22128@# 22129@d update_heights=90 {go here to record glue in the |active_height| table} 22130 22131@p function vert_break(@!p:pointer; @!h,@!d:scaled):pointer; 22132 {finds optimum page break} 22133label done,not_found,update_heights; 22134var prev_p:pointer; {if |p| is a glue node, |type(prev_p)| determines 22135 whether |p| is a legal breakpoint} 22136@!q,@!r:pointer; {glue specifications} 22137@!pi:integer; {penalty value} 22138@!b:integer; {badness at a trial breakpoint} 22139@!least_cost:integer; {the smallest badness plus penalties found so far} 22140@!best_place:pointer; {the most recent break that leads to |least_cost|} 22141@!prev_dp:scaled; {depth of previous box in the list} 22142@!t:small_number; {|type| of the node following a kern} 22143begin prev_p:=p; {an initial glue node is not a legal breakpoint} 22144least_cost:=awful_bad; do_all_six(set_height_zero); prev_dp:=0; 22145loop@+ begin @<If node |p| is a legal breakpoint, check if this break is 22146 the best known, and |goto done| if |p| is null or 22147 if the page-so-far is already too full to accept more stuff@>; 22148 prev_p:=p; p:=link(prev_p); 22149 end; 22150done: vert_break:=best_place; 22151end; 22152 22153@ A global variable |best_height_plus_depth| will be set to the natural size 22154of the box that corresponds to the optimum breakpoint found by |vert_break|. 22155(This value is used by the insertion-splitting algorithm of the page builder.) 22156 22157@<Glob...@>= 22158@!best_height_plus_depth:scaled; {height of the best box, without stretching or 22159 shrinking} 22160 22161@ A subtle point to be noted here is that the maximum depth~|d| might be 22162negative, so |cur_height| and |prev_dp| might need to be corrected even 22163after a glue or kern node. 22164 22165@<If node |p| is a legal breakpoint, check...@>= 22166if p=null then pi:=eject_penalty 22167else @<Use node |p| to update the current height and depth measurements; 22168 if this node is not a legal breakpoint, |goto not_found| 22169 or |update_heights|, 22170 otherwise set |pi| to the associated penalty at the break@>; 22171@<Check if node |p| is a new champion breakpoint; then \(go)|goto done| 22172 if |p| is a forced break or if the page-so-far is already too full@>; 22173if (type(p)<glue_node)or(type(p)>kern_node) then goto not_found; 22174update_heights: @<Update the current height and depth measurements with 22175 respect to a glue or kern node~|p|@>; 22176not_found: if prev_dp>d then 22177 begin cur_height:=cur_height+prev_dp-d; 22178 prev_dp:=d; 22179 end; 22180 22181@ @<Use node |p| to update the current height and depth measurements...@>= 22182case type(p) of 22183hlist_node,vlist_node,rule_node: begin@t@>@;@/ 22184 cur_height:=cur_height+prev_dp+height(p); prev_dp:=depth(p); 22185 goto not_found; 22186 end; 22187whatsit_node:@<Process whatsit |p| in |vert_break| loop, |goto not_found|@>; 22188glue_node: if precedes_break(prev_p) then pi:=0 22189 else goto update_heights; 22190kern_node: begin if link(p)=null then t:=penalty_node 22191 else t:=type(link(p)); 22192 if t=glue_node then pi:=0@+else goto update_heights; 22193 end; 22194penalty_node: pi:=penalty(p); 22195mark_node,ins_node: goto not_found; 22196othercases confusion("vertbreak") 22197@:this can't happen vertbreak}{\quad vertbreak@> 22198endcases 22199 22200@ @d deplorable==100000 {more than |inf_bad|, but less than |awful_bad|} 22201 22202@<Check if node |p| is a new champion breakpoint; then \(go)...@>= 22203if pi<inf_penalty then 22204 begin @<Compute the badness, |b|, using |awful_bad| 22205 if the box is too full@>; 22206 if b<awful_bad then 22207 if pi<=eject_penalty then b:=pi 22208 else if b<inf_bad then b:=b+pi 22209 else b:=deplorable; 22210 if b<=least_cost then 22211 begin best_place:=p; least_cost:=b; 22212 best_height_plus_depth:=cur_height+prev_dp; 22213 end; 22214 if (b=awful_bad)or(pi<=eject_penalty) then goto done; 22215 end 22216 22217@ @<Compute the badness, |b|, using |awful_bad| if the box is too full@>= 22218if cur_height<h then 22219 if (active_height[3]<>0) or (active_height[4]<>0) or 22220 (active_height[5]<>0) then b:=0 22221 else b:=badness(h-cur_height,active_height[2]) 22222else if cur_height-h>active_height[6] then b:=awful_bad 22223else b:=badness(cur_height-h,active_height[6]) 22224 22225@ Vertical lists that are subject to the |vert_break| procedure should not 22226contain infinite shrinkability, since that would permit any amount of 22227information to ``fit'' on one page. 22228 22229@<Update the current height and depth measurements with...@>= 22230if type(p)=kern_node then q:=p 22231else begin q:=glue_ptr(p); 22232 active_height[2+stretch_order(q)]:=@| 22233 active_height[2+stretch_order(q)]+stretch(q);@/ 22234 active_height[6]:=active_height[6]+shrink(q); 22235 if (shrink_order(q)<>normal)and(shrink(q)<>0) then 22236 begin@t@>@;@/ 22237 print_err("Infinite glue shrinkage found in box being split");@/ 22238@.Infinite glue shrinkage...@> 22239 help4("The box you are \vsplitting contains some infinitely")@/ 22240 ("shrinkable glue, e.g., `\vss' or `\vskip 0pt minus 1fil'.")@/ 22241 ("Such glue doesn't belong there; but you can safely proceed,")@/ 22242 ("since the offensive shrinkability has been made finite."); 22243 error; r:=new_spec(q); shrink_order(r):=normal; delete_glue_ref(q); 22244 glue_ptr(p):=r; q:=r; 22245 end; 22246 end; 22247cur_height:=cur_height+prev_dp+width(q); prev_dp:=0 22248 22249@ Now we are ready to consider |vsplit| itself. Most of 22250its work is accomplished by the two subroutines that we have just considered. 22251 22252Given the number of a vlist box |n|, and given a desired page height |h|, 22253the |vsplit| function finds the best initial segment of the vlist and 22254returns a box for a page of height~|h|. The remainder of the vlist, if 22255any, replaces the original box, after removing glue and penalties and 22256adjusting for |split_top_skip|. Mark nodes in the split-off box are used to 22257set the values of |split_first_mark| and |split_bot_mark|; we use the 22258fact that |split_first_mark=null| if and only if |split_bot_mark=null|. 22259 22260The original box becomes ``void'' if and only if it has been entirely 22261extracted. The extracted box is ``void'' if and only if the original 22262box was void (or if it was, erroneously, an hlist box). 22263 22264@p @t\4@>@<Declare the function called |do_marks|@>@; 22265function vsplit(@!n:halfword; @!h:scaled):pointer; 22266 {extracts a page of height |h| from box |n|} 22267label exit,done; 22268var v:pointer; {the box to be split} 22269p:pointer; {runs through the vlist} 22270q:pointer; {points to where the break occurs} 22271begin cur_val:=n; fetch_box(v); 22272flush_node_list(split_disc); split_disc:=null; 22273if sa_mark<>null then 22274 if do_marks(vsplit_init,0,sa_mark) then sa_mark:=null; 22275if split_first_mark<>null then 22276 begin delete_token_ref(split_first_mark); split_first_mark:=null; 22277 delete_token_ref(split_bot_mark); split_bot_mark:=null; 22278 end; 22279@<Dispense with trivial cases of void or bad boxes@>; 22280q:=vert_break(list_ptr(v),h,split_max_depth); 22281@<Look at all the marks in nodes before the break, and set the final 22282 link to |null| at the break@>; 22283q:=prune_page_top(q,saving_vdiscards>0); 22284p:=list_ptr(v); free_node(v,box_node_size); 22285if q<>null then q:=vpack(q,natural); 22286change_box(q); {the |eq_level| of the box stays the same} 22287vsplit:=vpackage(p,h,exactly,split_max_depth); 22288exit: end; 22289 22290@ @<Dispense with trivial cases of void or bad boxes@>= 22291if v=null then 22292 begin vsplit:=null; return; 22293 end; 22294if type(v)<>vlist_node then 22295 begin print_err(""); print_esc("vsplit"); print(" needs a "); 22296 print_esc("vbox"); 22297@:vsplit_}{\.{\\vsplit needs a \\vbox}@> 22298 help2("The box you are trying to split is an \hbox.")@/ 22299 ("I can't split such a box, so I'll leave it alone."); 22300 error; vsplit:=null; return; 22301 end 22302 22303@ It's possible that the box begins with a penalty node that is the 22304``best'' break, so we must be careful to handle this special case correctly. 22305 22306@<Look at all the marks...@>= 22307p:=list_ptr(v); 22308if p=q then list_ptr(v):=null 22309else loop@+begin if type(p)=mark_node then 22310 if mark_class(p)<>0 then @<Update the current marks for |vsplit|@> 22311 else if split_first_mark=null then 22312 begin split_first_mark:=mark_ptr(p); 22313 split_bot_mark:=split_first_mark; 22314 token_ref_count(split_first_mark):=@| 22315 token_ref_count(split_first_mark)+2; 22316 end 22317 else begin delete_token_ref(split_bot_mark); 22318 split_bot_mark:=mark_ptr(p); 22319 add_token_ref(split_bot_mark); 22320 end; 22321 if link(p)=q then 22322 begin link(p):=null; goto done; 22323 end; 22324 p:=link(p); 22325 end; 22326done: 22327 22328@* \[45] The page builder. 22329When \TeX\ appends new material to its main vlist in vertical mode, it uses 22330a method something like |vsplit| to decide where a page ends, except that 22331the calculations are done ``on line'' as new items come in. 22332The main complication in this process is that insertions must be put 22333into their boxes and removed from the vlist, in a more-or-less optimum manner. 22334 22335We shall use the term ``current page'' for that part of the main vlist that 22336is being considered as a candidate for being broken off and sent to the 22337user's output routine. The current page starts at |link(page_head)|, and 22338it ends at |page_tail|. We have |page_head=page_tail| if this list is empty. 22339@^current page@> 22340 22341Utter chaos would reign if the user kept changing page specifications 22342while a page is being constructed, so the page builder keeps the pertinent 22343specifications frozen as soon as the page receives its first box or 22344insertion. The global variable |page_contents| is |empty| when the 22345current page contains only mark nodes and content-less whatsit nodes; it 22346is |inserts_only| if the page contains only insertion nodes in addition to 22347marks and whatsits. Glue nodes, kern nodes, and penalty nodes are 22348discarded until a box or rule node appears, at which time |page_contents| 22349changes to |box_there|. As soon as |page_contents| becomes non-|empty|, 22350the current |vsize| and |max_depth| are squirreled away into |page_goal| 22351and |page_max_depth|; the latter values will be used until the page has 22352been forwarded to the user's output routine. The \.{\\topskip} adjustment 22353is made when |page_contents| changes to |box_there|. 22354 22355Although |page_goal| starts out equal to |vsize|, it is decreased by the 22356scaled natural height-plus-depth of the insertions considered so far, and by 22357the \.{\\skip} corrections for those insertions. Therefore it represents 22358the size into which the non-inserted material should fit, assuming that 22359all insertions in the current page have been made. 22360 22361The global variables |best_page_break| and |least_page_cost| correspond 22362respectively to the local variables |best_place| and |least_cost| in the 22363|vert_break| routine that we have already studied; i.e., they record the 22364location and value of the best place currently known for breaking the 22365current page. The value of |page_goal| at the time of the best break is 22366stored in |best_size|. 22367 22368@d inserts_only=1 22369 {|page_contents| when an insert node has been contributed, but no boxes} 22370@d box_there=2 {|page_contents| when a box or rule has been contributed} 22371 22372@<Glob...@>= 22373@!page_tail:pointer; {the final node on the current page} 22374@!page_contents:empty..box_there; {what is on the current page so far?} 22375@!page_max_depth:scaled; {maximum box depth on page being built} 22376@!best_page_break:pointer; {break here to get the best page known so far} 22377@!least_page_cost:integer; {the score for this currently best page} 22378@!best_size:scaled; {its |page_goal|} 22379 22380@ The page builder has another data structure to keep track of insertions. 22381This is a list of four-word nodes, starting and ending at |page_ins_head|. 22382That is, the first element of the list is node |r@t$_1$@>=link(page_ins_head)|; 22383node $r_j$ is followed by |r@t$_{j+1}$@>=link(r@t$_j$@>)|; and if there are 22384|n| items we have |r@t$_{n+1}$@>=page_ins_head|. The |subtype| field of 22385each node in this list refers to an insertion number; for example, `\.{\\insert 22386250}' would correspond to a node whose |subtype| is |qi(250)| 22387(the same as the |subtype| field of the relevant |ins_node|). These |subtype| 22388fields are in increasing order, and |subtype(page_ins_head)= 22389qi(255)|, so |page_ins_head| serves as a convenient sentinel 22390at the end of the list. A record is present for each insertion number that 22391appears in the current page. 22392 22393The |type| field in these nodes distinguishes two possibilities that 22394might occur as we look ahead before deciding on the optimum page break. 22395If |type(r)=inserting|, then |height(r)| contains the total of the 22396height-plus-depth dimensions of the box and all its inserts seen so far. 22397If |type(r)=split_up|, then no more insertions will be made into this box, 22398because at least one previous insertion was too big to fit on the current 22399page; |broken_ptr(r)| points to the node where that insertion will be 22400split, if \TeX\ decides to split it, |broken_ins(r)| points to the 22401insertion node that was tentatively split, and |height(r)| includes also the 22402natural height plus depth of the part that would be split off. 22403 22404In both cases, |last_ins_ptr(r)| points to the last |ins_node| 22405encountered for box |qo(subtype(r))| that would be at least partially 22406inserted on the next page; and |best_ins_ptr(r)| points to the last 22407such |ins_node| that should actually be inserted, to get the page with 22408minimum badness among all page breaks considered so far. We have 22409|best_ins_ptr(r)=null| if and only if no insertion for this box should 22410be made to produce this optimum page. 22411 22412The data structure definitions here use the fact that the |@!height| field 22413appears in the fourth word of a box node. 22414@^data structure assumptions@> 22415 22416@d page_ins_node_size=4 {number of words for a page insertion node} 22417@d inserting=0 {an insertion class that has not yet overflowed} 22418@d split_up=1 {an overflowed insertion class} 22419@d broken_ptr(#)==link(#+1) 22420 {an insertion for this class will break here if anywhere} 22421@d broken_ins(#)==info(#+1) {this insertion might break at |broken_ptr|} 22422@d last_ins_ptr(#)==link(#+2) {the most recent insertion for this |subtype|} 22423@d best_ins_ptr(#)==info(#+2) {the optimum most recent insertion} 22424 22425@<Initialize the special list heads...@>= 22426subtype(page_ins_head):=qi(255); 22427type(page_ins_head):=split_up; link(page_ins_head):=page_ins_head; 22428 22429@ An array |page_so_far| records the heights and depths of everything 22430on the current page. This array contains six |scaled| numbers, like the 22431similar arrays already considered in |line_break| and |vert_break|; and it 22432also contains |page_goal| and |page_depth|, since these values are 22433all accessible to the user via |set_page_dimen| commands. The 22434value of |page_so_far[1]| is also called |page_total|. The stretch 22435and shrink components of the \.{\\skip} corrections for each insertion are 22436included in |page_so_far|, but the natural space components of these 22437corrections are not, since they have been subtracted from |page_goal|. 22438 22439The variable |page_depth| records the depth of the current page; it has been 22440adjusted so that it is at most |page_max_depth|. The variable 22441|last_glue| points to the glue specification of the most recent node 22442contributed from the contribution list, if this was a glue node; otherwise 22443|last_glue=max_halfword|. (If the contribution list is nonempty, 22444however, the value of |last_glue| is not necessarily accurate.) 22445The variables |last_penalty|, |last_kern|, and |last_node_type| 22446are similar. And 22447finally, |insert_penalties| holds the sum of the penalties associated with 22448all split and floating insertions. 22449 22450@d page_goal==page_so_far[0] {desired height of information on page being built} 22451@d page_total==page_so_far[1] {height of the current page} 22452@d page_shrink==page_so_far[6] {shrinkability of the current page} 22453@d page_depth==page_so_far[7] {depth of the current page} 22454 22455@<Glob...@>= 22456@!page_so_far:array [0..7] of scaled; {height and glue of the current page} 22457@!last_glue:pointer; {used to implement \.{\\lastskip}} 22458@!last_penalty:integer; {used to implement \.{\\lastpenalty}} 22459@!last_kern:scaled; {used to implement \.{\\lastkern}} 22460@!last_node_type:integer; {used to implement \.{\\lastnodetype}} 22461@!insert_penalties:integer; {sum of the penalties for held-over insertions} 22462 22463@ @<Put each...@>= 22464primitive("pagegoal",set_page_dimen,0); 22465@!@:page_goal_}{\.{\\pagegoal} primitive@> 22466primitive("pagetotal",set_page_dimen,1); 22467@!@:page_total_}{\.{\\pagetotal} primitive@> 22468primitive("pagestretch",set_page_dimen,2); 22469@!@:page_stretch_}{\.{\\pagestretch} primitive@> 22470primitive("pagefilstretch",set_page_dimen,3); 22471@!@:page_fil_stretch_}{\.{\\pagefilstretch} primitive@> 22472primitive("pagefillstretch",set_page_dimen,4); 22473@!@:page_fill_stretch_}{\.{\\pagefillstretch} primitive@> 22474primitive("pagefilllstretch",set_page_dimen,5); 22475@!@:page_filll_stretch_}{\.{\\pagefilllstretch} primitive@> 22476primitive("pageshrink",set_page_dimen,6); 22477@!@:page_shrink_}{\.{\\pageshrink} primitive@> 22478primitive("pagedepth",set_page_dimen,7); 22479@!@:page_depth_}{\.{\\pagedepth} primitive@> 22480 22481@ @<Cases of |print_cmd_chr|...@>= 22482set_page_dimen: case chr_code of 224830: print_esc("pagegoal"); 224841: print_esc("pagetotal"); 224852: print_esc("pagestretch"); 224863: print_esc("pagefilstretch"); 224874: print_esc("pagefillstretch"); 224885: print_esc("pagefilllstretch"); 224896: print_esc("pageshrink"); 22490othercases print_esc("pagedepth") 22491endcases; 22492 22493@ @d print_plus_end(#)==print(#);@+end 22494@d print_plus(#)==if page_so_far[#]<>0 then 22495 begin print(" plus "); print_scaled(page_so_far[#]); print_plus_end 22496 22497@p procedure print_totals; 22498begin print_scaled(page_total); 22499print_plus(2)(""); 22500print_plus(3)("fil"); 22501print_plus(4)("fill"); 22502print_plus(5)("filll"); 22503if page_shrink<>0 then 22504 begin print(" minus "); print_scaled(page_shrink); 22505 end; 22506end; 22507 22508@ @<Show the status of the current page@>= 22509if page_head<>page_tail then 22510 begin print_nl("### current page:"); 22511 if output_active then print(" (held over for next output)"); 22512@.held over for next output@> 22513 show_box(link(page_head)); 22514 if page_contents>empty then 22515 begin print_nl("total height "); print_totals; 22516@:total_height}{\.{total height}@> 22517 print_nl(" goal height "); print_scaled(page_goal); 22518@.goal height@> 22519 r:=link(page_ins_head); 22520 while r<>page_ins_head do 22521 begin print_ln; print_esc("insert"); t:=qo(subtype(r)); 22522 print_int(t); print(" adds "); 22523 if count(t)=1000 then t:=height(r) 22524 else t:=x_over_n(height(r),1000)*count(t); 22525 print_scaled(t); 22526 if type(r)=split_up then 22527 begin q:=page_head; t:=0; 22528 repeat q:=link(q); 22529 if (type(q)=ins_node)and(subtype(q)=subtype(r)) then incr(t); 22530 until q=broken_ins(r); 22531 print(", #"); print_int(t); print(" might split"); 22532 end; 22533 r:=link(r); 22534 end; 22535 end; 22536 end 22537 22538@ Here is a procedure that is called when the |page_contents| is changing 22539from |empty| to |inserts_only| or |box_there|. 22540 22541@d set_page_so_far_zero(#)==page_so_far[#]:=0 22542 22543@p procedure freeze_page_specs(@!s:small_number); 22544begin page_contents:=s; 22545page_goal:=vsize; page_max_depth:=max_depth; 22546page_depth:=0; do_all_six(set_page_so_far_zero); 22547least_page_cost:=awful_bad; 22548@!stat if tracing_pages>0 then 22549 begin begin_diagnostic; 22550 print_nl("%% goal height="); print_scaled(page_goal); 22551@.goal height@> 22552 print(", max depth="); print_scaled(page_max_depth); 22553 end_diagnostic(false); 22554 end;@;@+tats@;@/ 22555end; 22556 22557@ Pages are built by appending nodes to the current list in \TeX's 22558vertical mode, which is at the outermost level of the semantic nest. This 22559vlist is split into two parts; the ``current page'' that we have been 22560talking so much about already, and the ``contribution list'' that receives 22561new nodes as they are created. The current page contains everything that 22562the page builder has accounted for in its data structures, as described 22563above, while the contribution list contains other things that have been 22564generated by other parts of \TeX\ but have not yet been 22565seen by the page builder. 22566The contribution list starts at |link(contrib_head)|, and it ends at the 22567current node in \TeX's vertical mode. 22568 22569When \TeX\ has appended new material in vertical mode, it calls the procedure 22570|build_page|, which tries to catch up by moving nodes from the contribution 22571list to the current page. This procedure will succeed in its goal of 22572emptying the contribution list, unless a page break is discovered, i.e., 22573unless the current page has grown to the point where the optimum next 22574page break has been determined. In the latter case, the nodes after the 22575optimum break will go back onto the contribution list, and control will 22576effectively pass to the user's output routine. 22577 22578We make |type(page_head)=glue_node|, so that an initial glue node on 22579the current page will not be considered a valid breakpoint. 22580 22581@<Initialize the special list...@>= 22582type(page_head):=glue_node; subtype(page_head):=normal; 22583 22584@ The global variable |output_active| is true during the time the 22585user's output routine is driving \TeX. 22586 22587@<Glob...@>= 22588@!output_active:boolean; {are we in the midst of an output routine?} 22589 22590@ @<Set init...@>= 22591output_active:=false; insert_penalties:=0; 22592 22593@ The page builder is ready to start a fresh page if we initialize 22594the following state variables. (However, the page insertion list is initialized 22595elsewhere.) 22596 22597@<Start a new current page@>= 22598page_contents:=empty; page_tail:=page_head; link(page_head):=null;@/ 22599last_glue:=max_halfword; last_penalty:=0; last_kern:=0; 22600last_node_type:=-1; 22601page_depth:=0; page_max_depth:=0 22602 22603@ At certain times box 255 is supposed to be void (i.e., |null|), 22604or an insertion box is supposed to be ready to accept a vertical list. 22605If not, an error message is printed, and the following subroutine 22606flushes the unwanted contents, reporting them to the user. 22607 22608@p procedure box_error(@!n:eight_bits); 22609begin error; begin_diagnostic; 22610print_nl("The following box has been deleted:"); 22611@.The following...deleted@> 22612show_box(box(n)); end_diagnostic(true); 22613flush_node_list(box(n)); box(n):=null; 22614end; 22615 22616@ The following procedure guarantees that a given box register 22617does not contain an \.{\\hbox}. 22618 22619@p procedure ensure_vbox(@!n:eight_bits); 22620var p:pointer; {the box register contents} 22621begin p:=box(n); 22622if p<>null then if type(p)=hlist_node then 22623 begin print_err("Insertions can only be added to a vbox"); 22624@.Insertions can only...@> 22625 help3("Tut tut: You're trying to \insert into a")@/ 22626 ("\box register that now contains an \hbox.")@/ 22627 ("Proceed, and I'll discard its present contents."); 22628 box_error(n); 22629 end; 22630end; 22631 22632@ \TeX\ is not always in vertical mode at the time |build_page| 22633is called; the current mode reflects what \TeX\ should return to, after 22634the contribution list has been emptied. A call on |build_page| should 22635be immediately followed by `|goto big_switch|', which is \TeX's central 22636control point. 22637 22638@d contribute=80 {go here to link a node into the current page} 22639 22640@p @t\4@>@<Declare the procedure called |fire_up|@>@;@/ 22641procedure build_page; {append contributions to the current page} 22642label exit,done,done1,continue,contribute,update_heights; 22643var p:pointer; {the node being appended} 22644@!q,@!r:pointer; {nodes being examined} 22645@!b,@!c:integer; {badness and cost of current page} 22646@!pi:integer; {penalty to be added to the badness} 22647@!n:min_quarterword..biggest_reg; {insertion box number} 22648@!delta,@!h,@!w:scaled; {sizes used for insertion calculations} 22649begin if (link(contrib_head)=null)or output_active then return; 22650repeat continue: p:=link(contrib_head);@/ 22651@<Update the values of |last_glue|, |last_penalty|, and |last_kern|@>; 22652@<Move node |p| to the current page; if it is time for a page break, 22653 put the nodes following the break back onto the contribution list, 22654 and |return| to the user's output routine if there is one@>; 22655until link(contrib_head)=null; 22656@<Make the contribution list empty by setting its tail to |contrib_head|@>; 22657exit:end; 22658 22659@ @d contrib_tail==nest[0].tail_field {tail of the contribution list} 22660 22661@<Make the contribution list empty...@>= 22662if nest_ptr=0 then tail:=contrib_head {vertical mode} 22663else contrib_tail:=contrib_head {other modes} 22664 22665@ @<Update the values of |last_glue|...@>= 22666if last_glue<>max_halfword then delete_glue_ref(last_glue); 22667last_penalty:=0; last_kern:=0; 22668last_node_type:=type(p)+1; 22669if type(p)=glue_node then 22670 begin last_glue:=glue_ptr(p); add_glue_ref(last_glue); 22671 end 22672else begin last_glue:=max_halfword; 22673 if type(p)=penalty_node then last_penalty:=penalty(p) 22674 else if type(p)=kern_node then last_kern:=width(p); 22675 end 22676 22677@ The code here is an example of a many-way switch into routines that 22678merge together in different places. Some people call this unstructured 22679programming, but the author doesn't see much wrong with it, as long as 22680@^Knuth, Donald Ervin@> 22681the various labels have a well-understood meaning. 22682 22683@<Move node |p| to the current page; ...@>= 22684@<If the current page is empty and node |p| is to be deleted, |goto done1|; 22685 otherwise use node |p| to update the state of the current page; 22686 if this node is an insertion, |goto contribute|; otherwise if this node 22687 is not a legal breakpoint, |goto contribute| or |update_heights|; 22688 otherwise set |pi| to the penalty associated with this breakpoint@>; 22689@<Check if node |p| is a new champion breakpoint; then \(if)if it is time for 22690 a page break, prepare for output, and either fire up the user's 22691 output routine and |return| or ship out the page and |goto done|@>; 22692if (type(p)<glue_node)or(type(p)>kern_node) then goto contribute; 22693update_heights:@<Update the current page measurements with respect to the 22694 glue or kern specified by node~|p|@>; 22695contribute: @<Make sure that |page_max_depth| is not exceeded@>; 22696@<Link node |p| into the current page and |goto done|@>; 22697done1:@<Recycle node |p|@>; 22698done: 22699 22700@ @<Link node |p| into the current page and |goto done|@>= 22701link(page_tail):=p; page_tail:=p; 22702link(contrib_head):=link(p); link(p):=null; goto done 22703 22704@ @<Recycle node |p|@>= 22705link(contrib_head):=link(p); link(p):=null; 22706if saving_vdiscards>0 then 22707 begin if page_disc=null then page_disc:=p@+else link(tail_page_disc):=p; 22708 tail_page_disc:=p; 22709 end 22710else flush_node_list(p) 22711 22712@ The title of this section is already so long, it seems best to avoid 22713making it more accurate but still longer, by mentioning the fact that a 22714kern node at the end of the contribution list will not be contributed until 22715we know its successor. 22716 22717@<If the current page is empty...@>= 22718case type(p) of 22719hlist_node,vlist_node,rule_node: if page_contents<box_there then 22720 @<Initialize the current page, insert the \.{\\topskip} glue 22721 ahead of |p|, and |goto continue|@> 22722 else @<Prepare to move a box or rule node to the current page, 22723 then |goto contribute|@>; 22724whatsit_node: @<Prepare to move whatsit |p| to the current page, 22725 then |goto contribute|@>; 22726glue_node: if page_contents<box_there then goto done1 22727 else if precedes_break(page_tail) then pi:=0 22728 else goto update_heights; 22729kern_node: if page_contents<box_there then goto done1 22730 else if link(p)=null then return 22731 else if type(link(p))=glue_node then pi:=0 22732 else goto update_heights; 22733penalty_node: if page_contents<box_there then goto done1@+else pi:=penalty(p); 22734mark_node: goto contribute; 22735ins_node: @<Append an insertion to the current page and |goto contribute|@>; 22736othercases confusion("page") 22737@:this can't happen page}{\quad page@> 22738endcases 22739 22740@ @<Initialize the current page, insert the \.{\\topskip} glue...@>= 22741begin if page_contents=empty then freeze_page_specs(box_there) 22742else page_contents:=box_there; 22743q:=new_skip_param(top_skip_code); {now |temp_ptr=glue_ptr(q)|} 22744if width(temp_ptr)>height(p) then width(temp_ptr):=width(temp_ptr)-height(p) 22745else width(temp_ptr):=0; 22746link(q):=p; link(contrib_head):=q; goto continue; 22747end 22748 22749@ @<Prepare to move a box or rule node to the current page...@>= 22750begin page_total:=page_total+page_depth+height(p); 22751page_depth:=depth(p); 22752goto contribute; 22753end 22754 22755@ @<Make sure that |page_max_depth| is not exceeded@>= 22756if page_depth>page_max_depth then 22757 begin page_total:=@| 22758 page_total+page_depth-page_max_depth;@/ 22759 page_depth:=page_max_depth; 22760 end; 22761 22762@ @<Update the current page measurements with respect to the glue...@>= 22763if type(p)=kern_node then q:=p 22764else begin q:=glue_ptr(p); 22765 page_so_far[2+stretch_order(q)]:=@| 22766 page_so_far[2+stretch_order(q)]+stretch(q);@/ 22767 page_shrink:=page_shrink+shrink(q); 22768 if (shrink_order(q)<>normal)and(shrink(q)<>0) then 22769 begin@t@>@;@/ 22770 print_err("Infinite glue shrinkage found on current page");@/ 22771@.Infinite glue shrinkage...@> 22772 help4("The page about to be output contains some infinitely")@/ 22773 ("shrinkable glue, e.g., `\vss' or `\vskip 0pt minus 1fil'.")@/ 22774 ("Such glue doesn't belong there; but you can safely proceed,")@/ 22775 ("since the offensive shrinkability has been made finite."); 22776 error; 22777 r:=new_spec(q); shrink_order(r):=normal; delete_glue_ref(q); 22778 glue_ptr(p):=r; q:=r; 22779 end; 22780 end; 22781page_total:=page_total+page_depth+width(q); page_depth:=0 22782 22783@ @<Check if node |p| is a new champion breakpoint; then \(if)...@>= 22784if pi<inf_penalty then 22785 begin @<Compute the badness, |b|, of the current page, 22786 using |awful_bad| if the box is too full@>; 22787 if b<awful_bad then 22788 if pi<=eject_penalty then c:=pi 22789 else if b<inf_bad then c:=b+pi+insert_penalties 22790 else c:=deplorable 22791 else c:=b; 22792 if insert_penalties>=10000 then c:=awful_bad; 22793 @!stat if tracing_pages>0 then @<Display the page break cost@>;@+tats@;@/ 22794 if c<=least_page_cost then 22795 begin best_page_break:=p; best_size:=page_goal; 22796 least_page_cost:=c; 22797 r:=link(page_ins_head); 22798 while r<>page_ins_head do 22799 begin best_ins_ptr(r):=last_ins_ptr(r); 22800 r:=link(r); 22801 end; 22802 end; 22803 if (c=awful_bad)or(pi<=eject_penalty) then 22804 begin fire_up(p); {output the current page at the best place} 22805 if output_active then return; {user's output routine will act} 22806 goto done; {the page has been shipped out by default output routine} 22807 end; 22808 end 22809 22810@ @<Display the page break cost@>= 22811begin begin_diagnostic; print_nl("%"); 22812print(" t="); print_totals;@/ 22813print(" g="); print_scaled(page_goal);@/ 22814print(" b="); 22815if b=awful_bad then print_char("*")@+else print_int(b); 22816@.*\relax@> 22817print(" p="); print_int(pi); 22818print(" c="); 22819if c=awful_bad then print_char("*")@+else print_int(c); 22820if c<=least_page_cost then print_char("#"); 22821end_diagnostic(false); 22822end 22823 22824@ @<Compute the badness, |b|, of the current page...@>= 22825if page_total<page_goal then 22826 if (page_so_far[3]<>0) or (page_so_far[4]<>0) or@| 22827 (page_so_far[5]<>0) then b:=0 22828 else b:=badness(page_goal-page_total,page_so_far[2]) 22829else if page_total-page_goal>page_shrink then b:=awful_bad 22830else b:=badness(page_total-page_goal,page_shrink) 22831 22832@ @<Append an insertion to the current page and |goto contribute|@>= 22833begin if page_contents=empty then freeze_page_specs(inserts_only); 22834n:=subtype(p); r:=page_ins_head; 22835while n>=subtype(link(r)) do r:=link(r); 22836n:=qo(n); 22837if subtype(r)<>qi(n) then 22838 @<Create a page insertion node with |subtype(r)=qi(n)|, and 22839 include the glue correction for box |n| in the 22840 current page state@>; 22841if type(r)=split_up then insert_penalties:=insert_penalties+float_cost(p) 22842else begin last_ins_ptr(r):=p; 22843 delta:=page_goal-page_total-page_depth+page_shrink; 22844 {this much room is left if we shrink the maximum} 22845 if count(n)=1000 then h:=height(p) 22846 else h:=x_over_n(height(p),1000)*count(n); {this much room is needed} 22847 if ((h<=0)or(h<=delta))and(height(p)+height(r)<=dimen(n)) then 22848 begin page_goal:=page_goal-h; height(r):=height(r)+height(p); 22849 end 22850 else @<Find the best way to split the insertion, and change 22851 |type(r)| to |split_up|@>; 22852 end; 22853goto contribute; 22854end 22855 22856@ We take note of the value of \.{\\skip} |n| and the height plus depth 22857of \.{\\box}~|n| only when the first \.{\\insert}~|n| node is 22858encountered for a new page. A user who changes the contents of \.{\\box}~|n| 22859after that first \.{\\insert}~|n| had better be either extremely careful 22860or extremely lucky, or both. 22861 22862@<Create a page insertion node...@>= 22863begin q:=get_node(page_ins_node_size); link(q):=link(r); link(r):=q; r:=q; 22864subtype(r):=qi(n); type(r):=inserting; ensure_vbox(n); 22865if box(n)=null then height(r):=0 22866else height(r):=height(box(n))+depth(box(n)); 22867best_ins_ptr(r):=null;@/ 22868q:=skip(n); 22869if count(n)=1000 then h:=height(r) 22870else h:=x_over_n(height(r),1000)*count(n); 22871page_goal:=page_goal-h-width(q);@/ 22872page_so_far[2+stretch_order(q)]:=@|page_so_far[2+stretch_order(q)]+stretch(q);@/ 22873page_shrink:=page_shrink+shrink(q); 22874if (shrink_order(q)<>normal)and(shrink(q)<>0) then 22875 begin print_err("Infinite glue shrinkage inserted from "); print_esc("skip"); 22876@.Infinite glue shrinkage...@> 22877 print_int(n); 22878 help3("The correction glue for page breaking with insertions")@/ 22879 ("must have finite shrinkability. But you may proceed,")@/ 22880 ("since the offensive shrinkability has been made finite."); 22881 error; 22882 end; 22883end 22884 22885@ Here is the code that will split a long footnote between pages, in an 22886emergency. The current situation deserves to be recapitulated: Node |p| 22887is an insertion into box |n|; the insertion will not fit, in its entirety, 22888either because it would make the total contents of box |n| greater than 22889\.{\\dimen} |n|, or because it would make the incremental amount of growth 22890|h| greater than the available space |delta|, or both. (This amount |h| has 22891been weighted by the insertion scaling factor, i.e., by \.{\\count} |n| 22892over 1000.) Now we will choose the best way to break the vlist of the 22893insertion, using the same criteria as in the \.{\\vsplit} operation. 22894 22895@<Find the best way to split the insertion...@>= 22896begin if count(n)<=0 then w:=max_dimen 22897else begin w:=page_goal-page_total-page_depth; 22898 if count(n)<>1000 then w:=x_over_n(w,count(n))*1000; 22899 end; 22900if w>dimen(n)-height(r) then w:=dimen(n)-height(r); 22901q:=vert_break(ins_ptr(p),w,depth(p)); 22902height(r):=height(r)+best_height_plus_depth; 22903@!stat if tracing_pages>0 then @<Display the insertion split cost@>;@+tats@;@/ 22904if count(n)<>1000 then 22905 best_height_plus_depth:=x_over_n(best_height_plus_depth,1000)*count(n); 22906page_goal:=page_goal-best_height_plus_depth; 22907type(r):=split_up; broken_ptr(r):=q; broken_ins(r):=p; 22908if q=null then insert_penalties:=insert_penalties+eject_penalty 22909else if type(q)=penalty_node then insert_penalties:=insert_penalties+penalty(q); 22910end 22911 22912@ @<Display the insertion split cost@>= 22913begin begin_diagnostic; print_nl("% split"); print_int(n); 22914@.split@> 22915print(" to "); print_scaled(w); 22916print_char(","); print_scaled(best_height_plus_depth);@/ 22917print(" p="); 22918if q=null then print_int(eject_penalty) 22919else if type(q)=penalty_node then print_int(penalty(q)) 22920else print_char("0"); 22921end_diagnostic(false); 22922end 22923 22924@ When the page builder has looked at as much material as could appear before 22925the next page break, it makes its decision. The break that gave minimum 22926badness will be used to put a completed ``page'' into box 255, with insertions 22927appended to their other boxes. 22928 22929We also set the values of |top_mark|, |first_mark|, and |bot_mark|. The 22930program uses the fact that |bot_mark<>null| implies |first_mark<>null|; 22931it also knows that |bot_mark=null| implies |top_mark=first_mark=null|. 22932 22933The |fire_up| subroutine prepares to output the current page at the best 22934place; then it fires up the user's output routine, if there is one, 22935or it simply ships out the page. There is one parameter, |c|, which represents 22936the node that was being contributed to the page when the decision to 22937force an output was made. 22938 22939@<Declare the procedure called |fire_up|@>= 22940procedure fire_up(@!c:pointer); 22941label exit; 22942var p,@!q,@!r,@!s:pointer; {nodes being examined and/or changed} 22943@!prev_p:pointer; {predecessor of |p|} 22944@!n:min_quarterword..biggest_reg; {insertion box number} 22945@!wait:boolean; {should the present insertion be held over?} 22946@!save_vbadness:integer; {saved value of |vbadness|} 22947@!save_vfuzz: scaled; {saved value of |vfuzz|} 22948@!save_split_top_skip: pointer; {saved value of |split_top_skip|} 22949begin @<Set the value of |output_penalty|@>; 22950if sa_mark<>null then 22951 if do_marks(fire_up_init,0,sa_mark) then sa_mark:=null; 22952if bot_mark<>null then 22953 begin if top_mark<>null then delete_token_ref(top_mark); 22954 top_mark:=bot_mark; add_token_ref(top_mark); 22955 delete_token_ref(first_mark); first_mark:=null; 22956 end; 22957@<Put the \(o)optimal current page into box 255, update |first_mark| and 22958 |bot_mark|, append insertions to their boxes, and put the 22959 remaining nodes back on the contribution list@>; 22960if sa_mark<>null then 22961 if do_marks(fire_up_done,0,sa_mark) then sa_mark:=null; 22962if (top_mark<>null)and(first_mark=null) then 22963 begin first_mark:=top_mark; add_token_ref(top_mark); 22964 end; 22965if output_routine<>null then 22966 if dead_cycles>=max_dead_cycles then 22967 @<Explain that too many dead cycles have occurred in a row@> 22968 else @<Fire up the user's output routine and |return|@>; 22969@<Perform the default output routine@>; 22970exit:end; 22971 22972@ @<Set the value of |output_penalty|@>= 22973if type(best_page_break)=penalty_node then 22974 begin geq_word_define(int_base+output_penalty_code,penalty(best_page_break)); 22975 penalty(best_page_break):=inf_penalty; 22976 end 22977else geq_word_define(int_base+output_penalty_code,inf_penalty) 22978 22979@ As the page is finally being prepared for output, 22980pointer |p| runs through the vlist, with |prev_p| trailing behind; 22981pointer |q| is the tail of a list of insertions that 22982are being held over for a subsequent page. 22983 22984@<Put the \(o)optimal current page into box 255...@>= 22985if c=best_page_break then best_page_break:=null; {|c| not yet linked in} 22986@<Ensure that box 255 is empty before output@>; 22987insert_penalties:=0; {this will count the number of insertions held over} 22988save_split_top_skip:=split_top_skip; 22989if holding_inserts<=0 then 22990 @<Prepare all the boxes involved in insertions to act as queues@>; 22991q:=hold_head; link(q):=null; prev_p:=page_head; p:=link(prev_p); 22992while p<>best_page_break do 22993 begin if type(p)=ins_node then 22994 begin if holding_inserts<=0 then 22995 @<Either insert the material specified by node |p| into the 22996 appropriate box, or hold it for the next page; 22997 also delete node |p| from the current page@>; 22998 end 22999 else if type(p)=mark_node then 23000 if mark_class(p)<>0 then @<Update the current marks for |fire_up|@> 23001 else @<Update the values of 23002 |first_mark| and |bot_mark|@>; 23003 prev_p:=p; p:=link(prev_p); 23004 end; 23005split_top_skip:=save_split_top_skip; 23006@<Break the current page at node |p|, put it in box~255, 23007 and put the remaining nodes on the contribution list@>; 23008@<Delete \(t)the page-insertion nodes@> 23009 23010@ @<Ensure that box 255 is empty before output@>= 23011if box(255)<>null then 23012 begin print_err(""); print_esc("box"); print("255 is not void"); 23013@:box255}{\.{\\box255 is not void}@> 23014 help2("You shouldn't use \box255 except in \output routines.")@/ 23015 ("Proceed, and I'll discard its present contents."); 23016 box_error(255); 23017 end 23018 23019@ @<Update the values of |first_mark| and |bot_mark|@>= 23020begin if first_mark=null then 23021 begin first_mark:=mark_ptr(p); 23022 add_token_ref(first_mark); 23023 end; 23024if bot_mark<>null then delete_token_ref(bot_mark); 23025bot_mark:=mark_ptr(p); add_token_ref(bot_mark); 23026end 23027 23028@ When the following code is executed, the current page runs from node 23029|link(page_head)| to node |prev_p|, and the nodes from |p| to |page_tail| 23030are to be placed back at the front of the contribution list. Furthermore 23031the heldover insertions appear in a list from |link(hold_head)| to |q|; we 23032will put them into the current page list for safekeeping while the user's 23033output routine is active. We might have |q=hold_head|; and |p=null| if 23034and only if |prev_p=page_tail|. Error messages are suppressed within 23035|vpackage|, since the box might appear to be overfull or underfull simply 23036because the stretch and shrink from the \.{\\skip} registers for inserts 23037are not actually present in the box. 23038 23039@<Break the current page at node |p|, put it...@>= 23040if p<>null then 23041 begin if link(contrib_head)=null then 23042 if nest_ptr=0 then tail:=page_tail 23043 else contrib_tail:=page_tail; 23044 link(page_tail):=link(contrib_head); 23045 link(contrib_head):=p; 23046 link(prev_p):=null; 23047 end; 23048save_vbadness:=vbadness; vbadness:=inf_bad; 23049save_vfuzz:=vfuzz; vfuzz:=max_dimen; {inhibit error messages} 23050box(255):=vpackage(link(page_head),best_size,exactly,page_max_depth); 23051vbadness:=save_vbadness; vfuzz:=save_vfuzz; 23052if last_glue<>max_halfword then delete_glue_ref(last_glue); 23053@<Start a new current page@>; {this sets |last_glue:=max_halfword|} 23054if q<>hold_head then 23055 begin link(page_head):=link(hold_head); page_tail:=q; 23056 end 23057 23058@ If many insertions are supposed to go into the same box, we want to know 23059the position of the last node in that box, so that we don't need to waste time 23060when linking further information into it. The |last_ins_ptr| fields of the 23061page insertion nodes are therefore used for this purpose during the 23062packaging phase. 23063 23064@<Prepare all the boxes involved in insertions to act as queues@>= 23065begin r:=link(page_ins_head); 23066while r<>page_ins_head do 23067 begin if best_ins_ptr(r)<>null then 23068 begin n:=qo(subtype(r)); ensure_vbox(n); 23069 if box(n)=null then box(n):=new_null_box; 23070 p:=box(n)+list_offset; 23071 while link(p)<>null do p:=link(p); 23072 last_ins_ptr(r):=p; 23073 end; 23074 r:=link(r); 23075 end; 23076end 23077 23078@ @<Delete \(t)the page-insertion nodes@>= 23079r:=link(page_ins_head); 23080while r<>page_ins_head do 23081 begin q:=link(r); free_node(r,page_ins_node_size); r:=q; 23082 end; 23083link(page_ins_head):=page_ins_head 23084 23085@ We will set |best_ins_ptr:=null| and package the box corresponding to 23086insertion node~|r|, just after making the final insertion into that box. 23087If this final insertion is `|split_up|', the remainder after splitting 23088and pruning (if any) will be carried over to the next page. 23089 23090@<Either insert the material specified by node |p| into...@>= 23091begin r:=link(page_ins_head); 23092while subtype(r)<>subtype(p) do r:=link(r); 23093if best_ins_ptr(r)=null then wait:=true 23094else begin wait:=false; s:=last_ins_ptr(r); link(s):=ins_ptr(p); 23095 if best_ins_ptr(r)=p then 23096 @<Wrap up the box specified by node |r|, splitting node |p| if 23097 called for; set |wait:=true| if node |p| holds a remainder after 23098 splitting@> 23099 else begin while link(s)<>null do s:=link(s); 23100 last_ins_ptr(r):=s; 23101 end; 23102 end; 23103@<Either append the insertion node |p| after node |q|, and remove it 23104 from the current page, or delete |node(p)|@>; 23105end 23106 23107@ @<Wrap up the box specified by node |r|, splitting node |p| if...@>= 23108begin if type(r)=split_up then 23109 if (broken_ins(r)=p)and(broken_ptr(r)<>null) then 23110 begin while link(s)<>broken_ptr(r) do s:=link(s); 23111 link(s):=null; 23112 split_top_skip:=split_top_ptr(p); 23113 ins_ptr(p):=prune_page_top(broken_ptr(r),false); 23114 if ins_ptr(p)<>null then 23115 begin temp_ptr:=vpack(ins_ptr(p),natural); 23116 height(p):=height(temp_ptr)+depth(temp_ptr); 23117 free_node(temp_ptr,box_node_size); wait:=true; 23118 end; 23119 end; 23120best_ins_ptr(r):=null; 23121n:=qo(subtype(r)); 23122temp_ptr:=list_ptr(box(n)); 23123free_node(box(n),box_node_size); 23124box(n):=vpack(temp_ptr,natural); 23125end 23126 23127@ @<Either append the insertion node |p|...@>= 23128link(prev_p):=link(p); link(p):=null; 23129if wait then 23130 begin link(q):=p; q:=p; incr(insert_penalties); 23131 end 23132else begin delete_glue_ref(split_top_ptr(p)); 23133 free_node(p,ins_node_size); 23134 end; 23135p:=prev_p 23136 23137@ The list of heldover insertions, running from |link(page_head)| to 23138|page_tail|, must be moved to the contribution list when the user has 23139specified no output routine. 23140 23141@<Perform the default output routine@>= 23142begin if link(page_head)<>null then 23143 begin if link(contrib_head)=null then 23144 if nest_ptr=0 then tail:=page_tail@+else contrib_tail:=page_tail 23145 else link(page_tail):=link(contrib_head); 23146 link(contrib_head):=link(page_head); 23147 link(page_head):=null; page_tail:=page_head; 23148 end; 23149flush_node_list(page_disc); page_disc:=null; 23150ship_out(box(255)); box(255):=null; 23151end 23152 23153@ @<Explain that too many dead cycles have occurred in a row@>= 23154begin print_err("Output loop---"); print_int(dead_cycles); 23155@.Output loop...@> 23156print(" consecutive dead cycles"); 23157help3("I've concluded that your \output is awry; it never does a")@/ 23158("\shipout, so I'm shipping \box255 out myself. Next time")@/ 23159("increase \maxdeadcycles if you want me to be more patient!"); error; 23160end 23161 23162@ @<Fire up the user's output routine and |return|@>= 23163begin output_active:=true; 23164incr(dead_cycles); 23165push_nest; mode:=-vmode; prev_depth:=ignore_depth; mode_line:=-line; 23166begin_token_list(output_routine,output_text); 23167new_save_level(output_group); normal_paragraph; 23168scan_left_brace; 23169return; 23170end 23171 23172@ When the user's output routine finishes, it has constructed a vlist 23173in internal vertical mode, and \TeX\ will do the following: 23174 23175@<Resume the page builder after an output routine has come to an end@>= 23176begin if (loc<>null) or 23177 ((token_type<>output_text)and(token_type<>backed_up)) then 23178 @<Recover from an unbalanced output routine@>; 23179end_token_list; {conserve stack space in case more outputs are triggered} 23180end_graf; unsave; output_active:=false; insert_penalties:=0;@/ 23181@<Ensure that box 255 is empty after output@>; 23182if tail<>head then {current list goes after heldover insertions} 23183 begin link(page_tail):=link(head); 23184 page_tail:=tail; 23185 end; 23186if link(page_head)<>null then {and both go before heldover contributions} 23187 begin if link(contrib_head)=null then contrib_tail:=page_tail; 23188 link(page_tail):=link(contrib_head); 23189 link(contrib_head):=link(page_head); 23190 link(page_head):=null; page_tail:=page_head; 23191 end; 23192flush_node_list(page_disc); page_disc:=null; 23193pop_nest; build_page; 23194end 23195 23196@ @<Recover from an unbalanced output routine@>= 23197begin print_err("Unbalanced output routine"); 23198@.Unbalanced output routine@> 23199help2("Your sneaky output routine has problematic {'s and/or }'s.")@/ 23200("I can't handle that very well; good luck."); error; 23201repeat get_token; 23202until loc=null; 23203end {loops forever if reading from a file, since |null=min_halfword<=0|} 23204 23205@ @<Ensure that box 255 is empty after output@>= 23206if box(255)<>null then 23207 begin print_err("Output routine didn't use all of "); 23208 print_esc("box"); print_int(255); 23209@.Output routine didn't use...@> 23210 help3("Your \output commands should empty \box255,")@/ 23211 ("e.g., by saying `\shipout\box255'.")@/ 23212 ("Proceed; I'll discard its present contents."); 23213 box_error(255); 23214 end 23215 23216@* \[46] The chief executive. 23217We come now to the |main_control| routine, which contains the master 23218switch that causes all the various pieces of \TeX\ to do their things, 23219in the right order. 23220 23221In a sense, this is the grand climax of the program: It applies all the 23222tools that we have worked so hard to construct. In another sense, this is 23223the messiest part of the program: It necessarily refers to other pieces 23224of code all over the place, so that a person can't fully understand what is 23225going on without paging back and forth to be reminded of conventions that 23226are defined elsewhere. We are now at the hub of the web, the central nervous 23227system that touches most of the other parts and ties them together. 23228@^brain@> 23229 23230The structure of |main_control| itself is quite simple. There's a label 23231called |big_switch|, at which point the next token of input is fetched 23232using |get_x_token|. Then the program branches at high speed into one of 23233about 100 possible directions, based on the value of the current 23234mode and the newly fetched command code; the sum |abs(mode)+cur_cmd| 23235indicates what to do next. For example, the case `|vmode+letter|' arises 23236when a letter occurs in vertical mode (or internal vertical mode); this 23237case leads to instructions that initialize a new paragraph and enter 23238horizontal mode. 23239 23240The big |case| statement that contains this multiway switch has been labeled 23241|reswitch|, so that the program can |goto reswitch| when the next token 23242has already been fetched. Most of the cases are quite short; they call 23243an ``action procedure'' that does the work for that case, and then they 23244either |goto reswitch| or they ``fall through'' to the end of the |case| 23245statement, which returns control back to |big_switch|. Thus, |main_control| 23246is not an extremely large procedure, in spite of the multiplicity of things 23247it must do; it is small enough to be handled by \PASCAL\ compilers that put 23248severe restrictions on procedure size. 23249@!@^action procedure@> 23250 23251One case is singled out for special treatment, because it accounts for most 23252of \TeX's activities in typical applications. The process of reading simple 23253text and converting it into |char_node| records, while looking for ligatures 23254and kerns, is part of \TeX's ``inner loop''; the whole program runs 23255efficiently when its inner loop is fast, so this part has been written 23256with particular care. 23257 23258@ We shall concentrate first on the inner loop of |main_control|, deferring 23259consideration of the other cases until later. 23260@^inner loop@> 23261 23262@d big_switch=60 {go here to branch on the next token of input} 23263@d main_loop=70 {go here to typeset a string of consecutive characters} 23264@d collect_native=71 {go here to collect characters in a "native" font string} 23265@d collected=72 23266@d main_loop_wrapup=80 {go here to finish a character or ligature} 23267@d main_loop_move=90 {go here to advance the ligature cursor} 23268@d main_loop_move_lig=95 {same, when advancing past a generated ligature} 23269@d main_loop_lookahead=100 {go here to bring in another character, if any} 23270@d main_lig_loop=110 {go here to check for ligatures or kerning} 23271@d append_normal_space=120 {go here to append a normal space between words} 23272@# 23273@d pdfbox_crop=1 {|pdf_box_type| passed to |find_pic_file|} 23274@d pdfbox_media=2 23275@d pdfbox_bleed=3 23276@d pdfbox_trim=4 23277@d pdfbox_art=5 23278 23279@p @t\4@>@<Declare action procedures for use by |main_control|@>@; 23280@t\4@>@<Declare the procedure called |handle_right_brace|@>@; 23281procedure main_control; {governs \TeX's activities} 23282label big_switch,reswitch,main_loop,main_loop_wrapup, 23283 main_loop_move,main_loop_move+1,main_loop_move+2,main_loop_move_lig, 23284 main_loop_lookahead,main_loop_lookahead+1, 23285 main_lig_loop,main_lig_loop+1,main_lig_loop+2, 23286 collect_native,collected, 23287 append_normal_space,exit; 23288var@!t:integer; {general-purpose temporary variable} 23289begin if every_job<>null then begin_token_list(every_job,every_job_text); 23290big_switch: get_x_token;@/ 23291reswitch: @<Give diagnostic information, if requested@>; 23292case abs(mode)+cur_cmd of 23293hmode+letter,hmode+other_char,hmode+char_given: goto main_loop; 23294hmode+char_num: begin scan_usv_num; cur_chr:=cur_val; goto main_loop;@+end; 23295hmode+no_boundary: begin get_x_token; 23296 if (cur_cmd=letter)or(cur_cmd=other_char)or(cur_cmd=char_given)or 23297 (cur_cmd=char_num) then cancel_boundary:=true; 23298 goto reswitch; 23299 end; 23300othercases begin 23301 if abs(mode)=hmode then check_for_post_char_toks(big_switch); 23302 case abs(mode)+cur_cmd of 23303 hmode+spacer: if space_factor=1000 then goto append_normal_space 23304 else app_space; 23305 hmode+ex_space,mmode+ex_space: goto append_normal_space; 23306 @t\4@>@<Cases of |main_control| that are not part of the inner loop@> 23307 end 23308 end 23309endcases; {of the big |case| statement} 23310goto big_switch; 23311main_loop:@<Append character |cur_chr| and the following characters (if~any) 23312 to the current hlist in the current font; |goto reswitch| when 23313 a non-character has been fetched@>; 23314append_normal_space:check_for_post_char_toks(big_switch); 23315@<Append a normal inter-word space to the current list, 23316 then |goto big_switch|@>; 23317exit:end; 23318 23319@ When a new token has just been fetched at |big_switch|, we have an 23320ideal place to monitor \TeX's activity. 23321@^debugging@> 23322 23323@<Give diagnostic information, if requested@>= 23324if interrupt<>0 then if OK_to_interrupt then 23325 begin back_input; check_interrupt; goto big_switch; 23326 end; 23327@!debug if panicking then check_mem(false);@+@;@+gubed 23328if tracing_commands>0 then show_cur_cmd_chr 23329 23330@ The following part of the program was first written in a structured 23331manner, according to the philosophy that ``premature optimization is 23332the root of all evil.'' Then it was rearranged into pieces of 23333spaghetti so that the most common actions could proceed with little or 23334no redundancy. 23335 23336The original unoptimized form of this algorithm resembles the 23337|reconstitute| procedure, which was described earlier in connection with 23338hyphenation. Again we have an implied ``cursor'' between characters 23339|cur_l| and |cur_r|. The main difference is that the |lig_stack| can now 23340contain a charnode as well as pseudo-ligatures; that stack is now 23341usually nonempty, because the next character of input (if any) has been 23342appended to it. In |main_control| we have 23343$$|cur_r|=\cases{|character(lig_stack)|,&if |lig_stack>null|;\cr 23344 |font_bchar[cur_font]|,&otherwise;\cr}$$ 23345except when |character(lig_stack)=font_false_bchar[cur_font]|. 23346Several additional global variables are needed. 23347 23348@<Glob...@>= 23349@!main_f:internal_font_number; {the current font} 23350@!main_i:four_quarters; {character information bytes for |cur_l|} 23351@!main_j:four_quarters; {ligature/kern command} 23352@!main_k:font_index; {index into |font_info|} 23353@!main_p:pointer; {temporary register for list manipulation} 23354@!main_pp,@!main_ppp:pointer; {more temporary registers for list manipulation} 23355@!main_h:pointer; {temp for hyphen offset in native-font text} 23356@!is_hyph:boolean; {whether the last char seen is the font's hyphenchar} 23357@!space_class:integer; 23358@!prev_class:integer; 23359@!main_s:integer; {space factor value} 23360@!bchar:halfword; {right boundary character of current font, or |non_char|} 23361@!false_bchar:halfword; {nonexistent character matching |bchar|, or |non_char|} 23362@!cancel_boundary:boolean; {should the left boundary be ignored?} 23363@!ins_disc:boolean; {should we insert a discretionary node?} 23364 23365@ The boolean variables of the main loop are normally false, and always reset 23366to false before the loop is left. That saves us the extra work of initializing 23367each time. 23368 23369@<Set init...@>= 23370ligature_present:=false; cancel_boundary:=false; lft_hit:=false; rt_hit:=false; 23371ins_disc:=false; 23372 23373@ We leave the |space_factor| unchanged if |sf_code(cur_chr)=0|; otherwise we 23374set it equal to |sf_code(cur_chr)|, except that it should never change 23375from a value less than 1000 to a value exceeding 1000. The most common 23376case is |sf_code(cur_chr)=1000|, so we want that case to be fast. 23377 23378The overall structure of the main loop is presented here. Some program labels 23379are inside the individual sections. 23380@^inner loop@> 23381 23382@d adjust_space_factor==@t@>@;@/ 23383 main_s:=sf_code(cur_chr) mod @"10000; 23384 if main_s=1000 then space_factor:=1000 23385 else if main_s<1000 then 23386 begin if main_s>0 then space_factor:=main_s; 23387 end 23388 else if space_factor<1000 then space_factor:=1000 23389 else space_factor:=main_s 23390 23391@d check_for_inter_char_toks(#)=={check for a spacing token list, goto |#| if found, 23392 or |big_switch| in case of the initial letter of a run} 23393 cur_ptr:=null; 23394 space_class:=sf_code(cur_chr) div @"10000; 23395 23396 if XeTeX_inter_char_tokens_en and space_class <> 256 then begin {class 256 = ignored (for combining marks etc)} 23397 if prev_class = 255 then begin {boundary} 23398 if (state<>token_list) or (token_type<>backed_up_char) then begin 23399 find_sa_element(inter_char_val, 255*@"100 + space_class, false); 23400 if cur_ptr<>null then begin 23401 if cur_cmd<>letter then cur_cmd:=other_char; 23402 cur_tok:=(cur_cmd*max_char_val)+cur_chr; 23403 back_input; token_type:=backed_up_char; 23404 begin_token_list(sa_ptr(cur_ptr), inter_char_text); 23405 goto big_switch; 23406 end 23407 end 23408 end else begin 23409 find_sa_element(inter_char_val, prev_class*@"100 + space_class, false); 23410 if cur_ptr<>null then begin 23411 if cur_cmd<>letter then cur_cmd:=other_char; 23412 cur_tok:=(cur_cmd*max_char_val)+cur_chr; 23413 back_input; token_type:=backed_up_char; 23414 begin_token_list(sa_ptr(cur_ptr), inter_char_text); 23415 prev_class:=255; 23416 goto #; 23417 end; 23418 end; 23419 prev_class:=space_class; 23420 end 23421 23422@d check_for_post_char_toks(#)== 23423 if XeTeX_inter_char_tokens_en and (space_class<>256) and (prev_class<>255) then begin 23424 prev_class:=255; 23425 find_sa_element(inter_char_val, space_class*@"100 + 255, false); {boundary} 23426 if cur_ptr<>null then begin 23427 if cur_cs=0 then begin 23428 if cur_cmd=char_num then cur_cmd:=other_char; 23429 cur_tok:=(cur_cmd*max_char_val)+cur_chr; 23430 end else cur_tok:=cs_token_flag+cur_cs; 23431 back_input; 23432 begin_token_list(sa_ptr(cur_ptr), inter_char_text); 23433 goto #; 23434 end; 23435 end 23436 23437@<Append character |cur_chr|...@>= 23438 23439prev_class:=255; {boundary} 23440 23441{ added code for native font support } 23442if is_native_font(cur_font) then begin 23443 if mode>0 then if language<>clang then fix_language; 23444 23445 main_h:=0; 23446 main_f:=cur_font; 23447 native_len:=0; 23448 23449collect_native: 23450 adjust_space_factor; 23451 23452 check_for_inter_char_toks(collected); 23453 23454 if (cur_chr > @"FFFF) then begin 23455 native_room(2); 23456 append_native((cur_chr - @"10000) div 1024 + @"D800); 23457 append_native((cur_chr - @"10000) mod 1024 + @"DC00); 23458 end else begin 23459 native_room(1); 23460 append_native(cur_chr); 23461 end; 23462 is_hyph:=(cur_chr = hyphen_char[main_f]) 23463 or (XeTeX_dash_break_en and ((cur_chr = @"2014) or (cur_chr = @"2013))); 23464 if (main_h = 0) and is_hyph then main_h:=native_len; 23465 23466 {try to collect as many chars as possible in the same font} 23467 get_next; 23468 if (cur_cmd=letter) or (cur_cmd=other_char) or (cur_cmd=char_given) then goto collect_native; 23469 x_token; 23470 if (cur_cmd=letter) or (cur_cmd=other_char) or (cur_cmd=char_given) then goto collect_native; 23471 if cur_cmd=char_num then begin 23472 scan_usv_num; 23473 cur_chr:=cur_val; 23474 goto collect_native; 23475 end; 23476 23477 check_for_post_char_toks(collected); 23478 23479collected: 23480 if (font_mapping[main_f] <> 0) then begin 23481 main_k:=apply_mapping(font_mapping[main_f], native_text, native_len); 23482 native_len:=0; 23483 native_room(main_k); 23484 main_h:=0; 23485 for main_p:=0 to main_k - 1 do begin 23486 append_native(mapped_text[main_p]); 23487 if (main_h = 0) and ((mapped_text[main_p] = hyphen_char[main_f]) 23488 or (XeTeX_dash_break_en and ((mapped_text[main_p] = @"2014) or (mapped_text[main_p] = @"2013)) ) ) 23489 then main_h:=native_len; 23490 end 23491 end; 23492 23493 if tracing_lost_chars > 0 then begin 23494 temp_ptr:=0; 23495 while (temp_ptr < native_len) do begin 23496 main_k:=native_text[temp_ptr]; 23497 incr(temp_ptr); 23498 if (main_k >= @"D800) and (main_k < @"DC00) then begin 23499 main_k:=@"10000 + (main_k - @"D800) * 1024; 23500 main_k:=main_k + native_text[temp_ptr] - @"DC00; 23501 incr(temp_ptr); 23502 end; 23503 if map_char_to_glyph(main_f, main_k) = 0 then 23504 char_warning(main_f, main_k); 23505 end 23506 end; 23507 23508 main_k:=native_len; 23509 main_pp:=tail; 23510 23511 if mode=hmode then begin 23512 main_ppp:=head; 23513 if main_ppp<>main_pp then { find node preceding tail, skipping discretionaries } 23514 while (link(main_ppp)<>main_pp) do begin 23515 if (not is_char_node(main_ppp)) and (type(main_ppp=disc_node)) then begin 23516 temp_ptr:=main_ppp; 23517 for main_p:=1 to replace_count(temp_ptr) do main_ppp:=link(main_ppp); 23518 end; 23519 if main_ppp<>main_pp then main_ppp:=link(main_ppp); 23520 end; 23521 23522 temp_ptr:=0; 23523 repeat 23524 if main_h = 0 then main_h:=main_k; 23525 23526 if is_native_word_node(main_pp) 23527 and (native_font(main_pp)=main_f) 23528 and (main_ppp<>main_pp) 23529 and (not is_char_node(main_ppp)) 23530 and (type(main_ppp)<>disc_node) 23531 then begin 23532 { make a new temp string that contains the concatenated text of |tail| + the current word/fragment } 23533 main_k:=main_h + native_length(main_pp); 23534 native_room(main_k); 23535 23536 save_native_len:=native_len; 23537 for main_p:=0 to native_length(main_pp) - 1 do 23538 append_native(get_native_char(main_pp, main_p)); 23539 for main_p:=0 to main_h - 1 do 23540 append_native(native_text[temp_ptr + main_p]); 23541 23542 do_locale_linebreaks(save_native_len, main_k); 23543 23544 native_len:=save_native_len; { discard the temp string } 23545 main_k:=native_len - main_h - temp_ptr; { and set |main_k| to remaining length of new word } 23546 temp_ptr:=main_h; { pointer to remaining fragment } 23547 23548 main_h:=0; 23549 while (main_h < main_k) and (native_text[temp_ptr + main_h] <> hyphen_char[main_f]) 23550 and ( (not XeTeX_dash_break_en) 23551 or ((native_text[temp_ptr + main_h] <> @"2014) and (native_text[temp_ptr + main_h] <> @"2013)) ) 23552 do incr(main_h); { look for next hyphen or end of text } 23553 if (main_h < main_k) then incr(main_h); 23554 23555 { remove the preceding node from the list } 23556 link(main_ppp):=link(main_pp); 23557 link(main_pp):=null; 23558 flush_node_list(main_pp); 23559 main_pp:=tail; 23560 while (link(main_ppp)<>main_pp) do 23561 main_ppp:=link(main_ppp); 23562 end else begin 23563 do_locale_linebreaks(temp_ptr, main_h); { append fragment of current word } 23564 23565 temp_ptr:=temp_ptr + main_h; { advance ptr to remaining fragment } 23566 main_k:=main_k - main_h; { decrement remaining length } 23567 23568 main_h:=0; 23569 while (main_h < main_k) and (native_text[temp_ptr + main_h] <> hyphen_char[main_f]) 23570 and ( (not XeTeX_dash_break_en) 23571 or ((native_text[temp_ptr + main_h] <> @"2014) and (native_text[temp_ptr + main_h] <> @"2013)) ) 23572 do incr(main_h); { look for next hyphen or end of text } 23573 if (main_h < main_k) then incr(main_h); 23574 end; 23575 23576 if (main_k > 0) or is_hyph then begin 23577 tail_append(new_disc); { add a break if we aren't at end of text (must be a hyphen), 23578 or if last char in original text was a hyphen } 23579 main_pp:=tail; 23580 end; 23581 until main_k = 0; 23582 end else begin 23583 { must be restricted hmode, so no need for line-breaking or discretionaries } 23584 { but there might already be explicit |disc_node|s in the list } 23585 main_ppp:=head; 23586 if main_ppp<>main_pp then { find node preceding tail, skipping discretionaries } 23587 while (link(main_ppp)<>main_pp) do begin 23588 if (not is_char_node(main_ppp)) and (type(main_ppp=disc_node)) then begin 23589 temp_ptr:=main_ppp; 23590 for main_p:=1 to replace_count(temp_ptr) do main_ppp:=link(main_ppp); 23591 end; 23592 if main_ppp<>main_pp then main_ppp:=link(main_ppp); 23593 end; 23594 if is_native_word_node(main_pp) 23595 and (native_font(main_pp)=main_f) 23596 and (main_ppp<>main_pp) 23597 and (not is_char_node(main_ppp)) 23598 and (type(main_ppp)<>disc_node) 23599 then begin 23600 { total string length for the new merged whatsit } 23601 link(main_pp):=new_native_word_node(main_f, main_k + native_length(main_pp)); 23602 tail:=link(main_pp); 23603 23604 { copy text from the old one into the new } 23605 for main_p:=0 to native_length(main_pp) - 1 do 23606 set_native_char(tail, main_p, get_native_char(main_pp, main_p)); 23607 { append the new text } 23608 for main_p:=0 to main_k - 1 do 23609 set_native_char(tail, main_p + native_length(main_pp), native_text[main_p]); 23610 set_native_metrics(tail, XeTeX_use_glyph_metrics); 23611 23612 { remove the preceding node from the list } 23613 main_p:=head; 23614 if main_p<>main_pp then 23615 while link(main_p)<>main_pp do 23616 main_p:=link(main_p); 23617 link(main_p):=link(main_pp); 23618 link(main_pp):=null; 23619 flush_node_list(main_pp); 23620 end else begin 23621 { package the current string into a |native_word| whatsit } 23622 link(main_pp):=new_native_word_node(main_f, main_k); 23623 tail:=link(main_pp); 23624 for main_p:=0 to main_k - 1 do 23625 set_native_char(tail, main_p, native_text[main_p]); 23626 set_native_metrics(tail, XeTeX_use_glyph_metrics); 23627 end 23628 end; 23629 23630 if cur_ptr<>null then goto big_switch 23631 else goto reswitch; 23632end; 23633{ End of added code for native fonts } 23634 23635adjust_space_factor;@/ 23636check_for_inter_char_toks(big_switch); 23637main_f:=cur_font; 23638bchar:=font_bchar[main_f]; false_bchar:=font_false_bchar[main_f]; 23639if mode>0 then if language<>clang then fix_language; 23640fast_get_avail(lig_stack); font(lig_stack):=main_f; cur_l:=qi(cur_chr); 23641character(lig_stack):=cur_l;@/ 23642cur_q:=tail; 23643if cancel_boundary then 23644 begin cancel_boundary:=false; main_k:=non_address; 23645 end 23646else main_k:=bchar_label[main_f]; 23647if main_k=non_address then goto main_loop_move+2; {no left boundary processing} 23648cur_r:=cur_l; cur_l:=non_char; 23649goto main_lig_loop+1; {begin with cursor after left boundary} 23650@# 23651main_loop_wrapup:@<Make a ligature node, if |ligature_present|; 23652 insert a null discretionary, if appropriate@>; 23653main_loop_move:@<If the cursor is immediately followed by the right boundary, 23654 |goto reswitch|; if it's followed by an invalid character, |goto big_switch|; 23655 otherwise move the cursor one step to the right and |goto main_lig_loop|@>; 23656main_loop_lookahead:@<Look ahead for another character, or leave |lig_stack| 23657 empty if there's none there@>; 23658main_lig_loop:@<If there's a ligature/kern command relevant to |cur_l| and 23659 |cur_r|, adjust the text appropriately; exit to |main_loop_wrapup|@>; 23660main_loop_move_lig:@<Move the cursor past a pseudo-ligature, then 23661 |goto main_loop_lookahead| or |main_lig_loop|@> 23662 23663@ If |link(cur_q)| is nonnull when |wrapup| is invoked, |cur_q| points to 23664the list of characters that were consumed while building the ligature 23665character~|cur_l|. 23666 23667A discretionary break is not inserted for an explicit hyphen when we are in 23668restricted horizontal mode. In particular, this avoids putting discretionary 23669nodes inside of other discretionaries. 23670@^inner loop@> 23671 23672@d pack_lig(#)== {the parameter is either |rt_hit| or |false|} 23673 begin main_p:=new_ligature(main_f,cur_l,link(cur_q)); 23674 if lft_hit then 23675 begin subtype(main_p):=2; lft_hit:=false; 23676 end; 23677 if # then if lig_stack=null then 23678 begin incr(subtype(main_p)); rt_hit:=false; 23679 end; 23680 link(cur_q):=main_p; tail:=main_p; ligature_present:=false; 23681 end 23682 23683@d wrapup(#)==if cur_l<non_char then 23684 begin if link(cur_q)>null then 23685 if character(tail)=qi(hyphen_char[main_f]) then ins_disc:=true; 23686 if ligature_present then pack_lig(#); 23687 if ins_disc then 23688 begin ins_disc:=false; 23689 if mode>0 then tail_append(new_disc); 23690 end; 23691 end 23692 23693@<Make a ligature node, if |ligature_present|;...@>= 23694wrapup(rt_hit) 23695 23696@ @<If the cursor is immediately followed by the right boundary...@>= 23697@^inner loop@> 23698if lig_stack=null then goto reswitch; 23699cur_q:=tail; cur_l:=character(lig_stack); 23700main_loop_move+1:if not is_char_node(lig_stack) then goto main_loop_move_lig; 23701main_loop_move+2:if(cur_chr<font_bc[main_f])or(cur_chr>font_ec[main_f]) then 23702 begin char_warning(main_f,cur_chr); free_avail(lig_stack); goto big_switch; 23703 end; 23704main_i:=char_info(main_f)(cur_l); 23705if not char_exists(main_i) then 23706 begin char_warning(main_f,cur_chr); free_avail(lig_stack); goto big_switch; 23707 end; 23708link(tail):=lig_stack; tail:=lig_stack {|main_loop_lookahead| is next} 23709 23710@ Here we are at |main_loop_move_lig|. 23711When we begin this code we have |cur_q=tail| and |cur_l=character(lig_stack)|. 23712 23713@<Move the cursor past a pseudo-ligature...@>= 23714main_p:=lig_ptr(lig_stack); 23715if main_p>null then tail_append(main_p); {append a single character} 23716temp_ptr:=lig_stack; lig_stack:=link(temp_ptr); 23717free_node(temp_ptr,small_node_size); 23718main_i:=char_info(main_f)(cur_l); ligature_present:=true; 23719if lig_stack=null then 23720 if main_p>null then goto main_loop_lookahead 23721 else cur_r:=bchar 23722else cur_r:=character(lig_stack); 23723goto main_lig_loop 23724 23725@ The result of \.{\\char} can participate in a ligature or kern, so we must 23726look ahead for it. 23727 23728@<Look ahead for another character...@>= 23729get_next; {set only |cur_cmd| and |cur_chr|, for speed} 23730if cur_cmd=letter then goto main_loop_lookahead+1; 23731if cur_cmd=other_char then goto main_loop_lookahead+1; 23732if cur_cmd=char_given then goto main_loop_lookahead+1; 23733x_token; {now expand and set |cur_cmd|, |cur_chr|, |cur_tok|} 23734if cur_cmd=letter then goto main_loop_lookahead+1; 23735if cur_cmd=other_char then goto main_loop_lookahead+1; 23736if cur_cmd=char_given then goto main_loop_lookahead+1; 23737if cur_cmd=char_num then 23738 begin scan_char_num; cur_chr:=cur_val; goto main_loop_lookahead+1; 23739 end; 23740if cur_cmd=no_boundary then bchar:=non_char; 23741cur_r:=bchar; lig_stack:=null; goto main_lig_loop; 23742main_loop_lookahead+1: adjust_space_factor; 23743check_for_inter_char_toks(big_switch); 23744fast_get_avail(lig_stack); font(lig_stack):=main_f; 23745cur_r:=qi(cur_chr); character(lig_stack):=cur_r; 23746if cur_r=false_bchar then cur_r:=non_char {this prevents spurious ligatures} 23747 23748@ Even though comparatively few characters have a lig/kern program, several 23749of the instructions here count as part of \TeX's inner loop, since a 23750@^inner loop@> 23751potentially long sequential search must be performed. For example, tests with 23752Computer Modern Roman showed that about 40 per cent of all characters 23753actually encountered in practice had a lig/kern program, and that about four 23754lig/kern commands were investigated for every such character. 23755 23756At the beginning of this code we have |main_i=char_info(main_f)(cur_l)|. 23757 23758@<If there's a ligature/kern command...@>= 23759if char_tag(main_i)<>lig_tag then goto main_loop_wrapup; 23760if cur_r=non_char then goto main_loop_wrapup; 23761main_k:=lig_kern_start(main_f)(main_i); main_j:=font_info[main_k].qqqq; 23762if skip_byte(main_j)<=stop_flag then goto main_lig_loop+2; 23763main_k:=lig_kern_restart(main_f)(main_j); 23764main_lig_loop+1:main_j:=font_info[main_k].qqqq; 23765main_lig_loop+2:if next_char(main_j)=cur_r then 23766 if skip_byte(main_j)<=stop_flag then 23767 @<Do ligature or kern command, returning to |main_lig_loop| 23768 or |main_loop_wrapup| or |main_loop_move|@>; 23769if skip_byte(main_j)=qi(0) then incr(main_k) 23770else begin if skip_byte(main_j)>=stop_flag then goto main_loop_wrapup; 23771 main_k:=main_k+qo(skip_byte(main_j))+1; 23772 end; 23773goto main_lig_loop+1 23774 23775@ When a ligature or kern instruction matches a character, we know from 23776|read_font_info| that the character exists in the font, even though we 23777haven't verified its existence in the normal way. 23778 23779This section could be made into a subroutine, if the code inside 23780|main_control| needs to be shortened. 23781 23782\chardef\?='174 % vertical line to indicate character retention 23783 23784@<Do ligature or kern command...@>= 23785begin if op_byte(main_j)>=kern_flag then 23786 begin wrapup(rt_hit); 23787 tail_append(new_kern(char_kern(main_f)(main_j))); goto main_loop_move; 23788 end; 23789if cur_l=non_char then lft_hit:=true 23790else if lig_stack=null then rt_hit:=true; 23791check_interrupt; {allow a way out in case there's an infinite ligature loop} 23792case op_byte(main_j) of 23793qi(1),qi(5):begin cur_l:=rem_byte(main_j); {\.{=:\?}, \.{=:\?>}} 23794 main_i:=char_info(main_f)(cur_l); ligature_present:=true; 23795 end; 23796qi(2),qi(6):begin cur_r:=rem_byte(main_j); {\.{\?=:}, \.{\?=:>}} 23797 if lig_stack=null then {right boundary character is being consumed} 23798 begin lig_stack:=new_lig_item(cur_r); bchar:=non_char; 23799 end 23800 else if is_char_node(lig_stack) then {|link(lig_stack)=null|} 23801 begin main_p:=lig_stack; lig_stack:=new_lig_item(cur_r); 23802 lig_ptr(lig_stack):=main_p; 23803 end 23804 else character(lig_stack):=cur_r; 23805 end; 23806qi(3):begin cur_r:=rem_byte(main_j); {\.{\?=:\?}} 23807 main_p:=lig_stack; lig_stack:=new_lig_item(cur_r); 23808 link(lig_stack):=main_p; 23809 end; 23810qi(7),qi(11):begin wrapup(false); {\.{\?=:\?>}, \.{\?=:\?>>}} 23811 cur_q:=tail; cur_l:=rem_byte(main_j); 23812 main_i:=char_info(main_f)(cur_l); ligature_present:=true; 23813 end; 23814othercases begin cur_l:=rem_byte(main_j); ligature_present:=true; {\.{=:}} 23815 if lig_stack=null then goto main_loop_wrapup 23816 else goto main_loop_move+1; 23817 end 23818endcases; 23819if op_byte(main_j)>qi(4) then 23820 if op_byte(main_j)<>qi(7) then goto main_loop_wrapup; 23821if cur_l<non_char then goto main_lig_loop; 23822main_k:=bchar_label[main_f]; goto main_lig_loop+1; 23823end 23824 23825@ The occurrence of blank spaces is almost part of \TeX's inner loop, 23826@^inner loop@> 23827since we usually encounter about one space for every five non-blank characters. 23828Therefore |main_control| gives second-highest priority to ordinary spaces. 23829 23830When a glue parameter like \.{\\spaceskip} is set to `\.{0pt}', we will 23831see to it later that the corresponding glue specification is precisely 23832|zero_glue|, not merely a pointer to some specification that happens 23833to be full of zeroes. Therefore it is simple to test whether a glue parameter 23834is zero or~not. 23835 23836@<Append a normal inter-word space...@>= 23837if space_skip=zero_glue then 23838 begin @<Find the glue specification, |main_p|, for 23839 text spaces in the current font@>; 23840 temp_ptr:=new_glue(main_p); 23841 end 23842else temp_ptr:=new_param_glue(space_skip_code); 23843link(tail):=temp_ptr; tail:=temp_ptr; 23844goto big_switch 23845 23846@ Having |font_glue| allocated for each text font saves both time and memory. 23847If any of the three spacing parameters are subsequently changed by the 23848use of \.{\\fontdimen}, the |find_font_dimen| procedure deallocates the 23849|font_glue| specification allocated here. 23850 23851@<Find the glue specification...@>= 23852begin main_p:=font_glue[cur_font]; 23853if main_p=null then 23854 begin main_p:=new_spec(zero_glue); main_k:=param_base[cur_font]+space_code; 23855 width(main_p):=font_info[main_k].sc; {that's |space(cur_font)|} 23856 stretch(main_p):=font_info[main_k+1].sc; {and |space_stretch(cur_font)|} 23857 shrink(main_p):=font_info[main_k+2].sc; {and |space_shrink(cur_font)|} 23858 font_glue[cur_font]:=main_p; 23859 end; 23860end 23861 23862@ @<Declare act...@>= 23863procedure app_space; {handle spaces when |space_factor<>1000|} 23864var@!q:pointer; {glue node} 23865begin if (space_factor>=2000)and(xspace_skip<>zero_glue) then 23866 q:=new_param_glue(xspace_skip_code) 23867else begin if space_skip<>zero_glue then main_p:=space_skip 23868 else @<Find the glue specification...@>; 23869 main_p:=new_spec(main_p); 23870 @<Modify the glue specification in |main_p| according to the space factor@>; 23871 q:=new_glue(main_p); glue_ref_count(main_p):=null; 23872 end; 23873link(tail):=q; tail:=q; 23874end; 23875 23876@ @<Modify the glue specification in |main_p| according to the space factor@>= 23877if space_factor>=2000 then width(main_p):=width(main_p)+extra_space(cur_font); 23878stretch(main_p):=xn_over_d(stretch(main_p),space_factor,1000); 23879shrink(main_p):=xn_over_d(shrink(main_p),1000,space_factor) 23880 23881@ Whew---that covers the main loop. We can now proceed at a leisurely 23882pace through the other combinations of possibilities. 23883 23884@d any_mode(#)==vmode+#,hmode+#,mmode+# {for mode-independent commands} 23885 23886@<Cases of |main_control| that are not part of the inner loop@>= 23887any_mode(relax),vmode+spacer,mmode+spacer,mmode+no_boundary:do_nothing; 23888any_mode(ignore_spaces): begin 23889 if cur_chr = 0 then begin 23890 @<Get the next non-blank non-call...@>; 23891 goto reswitch; 23892 end 23893 else begin 23894 t:=scanner_status; 23895 scanner_status:=normal; 23896 get_next; 23897 scanner_status:=t; 23898 if cur_cs < hash_base then 23899 cur_cs:=prim_lookup(cur_cs-257) 23900 else 23901 cur_cs :=prim_lookup(text(cur_cs)); 23902 if cur_cs<>undefined_primitive then begin 23903 cur_cmd:=prim_eq_type(cur_cs); 23904 cur_chr:=prim_equiv(cur_cs); 23905 goto reswitch; 23906 end; 23907 end; 23908 end; 23909vmode+stop: if its_all_over then return; {this is the only way out} 23910@t\4@>@<Forbidden cases detected in |main_control|@>@+@,any_mode(mac_param): 23911 report_illegal_case; 23912@<Math-only cases in non-math modes, or vice versa@>: insert_dollar_sign; 23913@t\4@>@<Cases of |main_control| that build boxes and lists@>@; 23914@t\4@>@<Cases of |main_control| that don't depend on |mode|@>@; 23915@t\4@>@<Cases of |main_control| that are for extensions to \TeX@>@; 23916 23917@ Here is a list of cases where the user has probably gotten into or out of math 23918mode by mistake. \TeX\ will insert a dollar sign and rescan the current token. 23919 23920@d non_math(#)==vmode+#,hmode+# 23921 23922@<Math-only cases in non-math modes...@>= 23923non_math(sup_mark), non_math(sub_mark), non_math(math_char_num), 23924non_math(math_given), non_math(XeTeX_math_given), non_math(math_comp), non_math(delim_num), 23925non_math(left_right), non_math(above), non_math(radical), 23926non_math(math_style), non_math(math_choice), non_math(vcenter), 23927non_math(non_script), non_math(mkern), non_math(limit_switch), 23928non_math(mskip), non_math(math_accent), 23929mmode+endv, mmode+par_end, mmode+stop, mmode+vskip, mmode+un_vbox, 23930mmode+valign, mmode+hrule 23931 23932@ @<Declare action...@>= 23933procedure insert_dollar_sign; 23934begin back_input; cur_tok:=math_shift_token+"$"; 23935print_err("Missing $ inserted"); 23936@.Missing \$ inserted@> 23937help2("I've inserted a begin-math/end-math symbol since I think")@/ 23938("you left one out. Proceed, with fingers crossed."); ins_error; 23939end; 23940 23941@ When erroneous situations arise, \TeX\ usually issues an error message 23942specific to the particular error. For example, `\.{\\noalign}' should 23943not appear in any mode, since it is recognized by the |align_peek| routine 23944in all of its legitimate appearances; a special error message is given 23945when `\.{\\noalign}' occurs elsewhere. But sometimes the most appropriate 23946error message is simply that the user is not allowed to do what he or she 23947has attempted. For example, `\.{\\moveleft}' is allowed only in vertical mode, 23948and `\.{\\lower}' only in non-vertical modes. Such cases are enumerated 23949here and in the other sections referred to under `See also \dots.' 23950 23951@<Forbidden cases...@>= 23952vmode+vmove,hmode+hmove,mmode+hmove,any_mode(last_item), 23953 23954@ The `|you_cant|' procedure prints a line saying that the current command 23955is illegal in the current mode; it identifies these things symbolically. 23956 23957@<Declare action...@>= 23958procedure you_cant; 23959begin print_err("You can't use `"); 23960@.You can't use x in y mode@> 23961print_cmd_chr(cur_cmd,cur_chr); 23962print("' in "); print_mode(mode); 23963end; 23964 23965@ @<Declare act...@>= 23966procedure report_illegal_case; 23967begin you_cant; 23968help4("Sorry, but I'm not programmed to handle this case;")@/ 23969("I'll just pretend that you didn't ask for it.")@/ 23970("If you're in the wrong mode, you might be able to")@/ 23971("return to the right one by typing `I}' or `I$' or `I\par'.");@/ 23972error; 23973end; 23974 23975@ Some operations are allowed only in privileged modes, i.e., in cases 23976that |mode>0|. The |privileged| function is used to detect violations 23977of this rule; it issues an error message and returns |false| if the 23978current |mode| is negative. 23979 23980@<Declare act...@>= 23981function privileged:boolean; 23982begin if mode>0 then privileged:=true 23983else begin report_illegal_case; privileged:=false; 23984 end; 23985end; 23986 23987@ Either \.{\\dump} or \.{\\end} will cause |main_control| to enter the 23988endgame, since both of them have `|stop|' as their command code. 23989 23990@<Put each...@>= 23991primitive("end",stop,0);@/ 23992@!@:end_}{\.{\\end} primitive@> 23993primitive("dump",stop,1);@/ 23994@!@:dump_}{\.{\\dump} primitive@> 23995 23996@ @<Cases of |print_cmd_chr|...@>= 23997stop:if chr_code=1 then print_esc("dump")@+else print_esc("end"); 23998 23999@ We don't want to leave |main_control| immediately when a |stop| command 24000is sensed, because it may be necessary to invoke an \.{\\output} routine 24001several times before things really grind to a halt. (The output routine 24002might even say `\.{\\gdef\\end\{...\}}', to prolong the life of the job.) 24003Therefore |its_all_over| is |true| only when the current page 24004and contribution list are empty, and when the last output was not a 24005``dead cycle.'' 24006 24007@<Declare act...@>= 24008function its_all_over:boolean; {do this when \.{\\end} or \.{\\dump} occurs} 24009label exit; 24010begin if privileged then 24011 begin if (page_head=page_tail)and(head=tail)and(dead_cycles=0) then 24012 begin its_all_over:=true; return; 24013 end; 24014 back_input; {we will try to end again after ejecting residual material} 24015 tail_append(new_null_box); 24016 width(tail):=hsize; 24017 tail_append(new_glue(fill_glue)); 24018 tail_append(new_penalty(-@'10000000000));@/ 24019 build_page; {append \.{\\hbox to \\hsize\{\}\\vfill\\penalty-'10000000000}} 24020 end; 24021its_all_over:=false; 24022exit:end; 24023 24024@* \[47] Building boxes and lists. 24025The most important parts of |main_control| are concerned with \TeX's 24026chief mission of box-making. We need to control the activities that put 24027entries on vlists and hlists, as well as the activities that convert 24028those lists into boxes. All of the necessary machinery has already been 24029developed; it remains for us to ``push the buttons'' at the right times. 24030 24031@ As an introduction to these routines, let's consider one of the simplest 24032cases: What happens when `\.{\\hrule}' occurs in vertical mode, or 24033`\.{\\vrule}' in horizontal mode or math mode? The code in |main_control| 24034is short, since the |scan_rule_spec| routine already does most of what is 24035required; thus, there is no need for a special action procedure. 24036 24037Note that baselineskip calculations are disabled after a rule in vertical 24038mode, by setting |prev_depth:=ignore_depth|. 24039 24040@<Cases of |main_control| that build...@>= 24041vmode+hrule,hmode+vrule,mmode+vrule: begin tail_append(scan_rule_spec); 24042 if abs(mode)=vmode then prev_depth:=ignore_depth 24043 else if abs(mode)=hmode then space_factor:=1000; 24044 end; 24045 24046@ The processing of things like \.{\\hskip} and \.{\\vskip} is slightly 24047more complicated. But the code in |main_control| is very short, since 24048it simply calls on the action routine |append_glue|. Similarly, \.{\\kern} 24049activates |append_kern|. 24050 24051@<Cases of |main_control| that build...@>= 24052vmode+vskip,hmode+hskip,mmode+hskip,mmode+mskip: append_glue; 24053any_mode(kern),mmode+mkern: append_kern; 24054 24055@ The |hskip| and |vskip| command codes are used for control sequences 24056like \.{\\hss} and \.{\\vfil} as well as for \.{\\hskip} and \.{\\vskip}. 24057The difference is in the value of |cur_chr|. 24058 24059@d fil_code=0 {identifies \.{\\hfil} and \.{\\vfil}} 24060@d fill_code=1 {identifies \.{\\hfill} and \.{\\vfill}} 24061@d ss_code=2 {identifies \.{\\hss} and \.{\\vss}} 24062@d fil_neg_code=3 {identifies \.{\\hfilneg} and \.{\\vfilneg}} 24063@d skip_code=4 {identifies \.{\\hskip} and \.{\\vskip}} 24064@d mskip_code=5 {identifies \.{\\mskip}} 24065 24066@<Put each...@>= 24067primitive("hskip",hskip,skip_code);@/ 24068@!@:hskip_}{\.{\\hskip} primitive@> 24069primitive("hfil",hskip,fil_code); 24070@!@:hfil_}{\.{\\hfil} primitive@> 24071primitive("hfill",hskip,fill_code);@/ 24072@!@:hfill_}{\.{\\hfill} primitive@> 24073primitive("hss",hskip,ss_code); 24074@!@:hss_}{\.{\\hss} primitive@> 24075primitive("hfilneg",hskip,fil_neg_code);@/ 24076@!@:hfil_neg_}{\.{\\hfilneg} primitive@> 24077primitive("vskip",vskip,skip_code);@/ 24078@!@:vskip_}{\.{\\vskip} primitive@> 24079primitive("vfil",vskip,fil_code); 24080@!@:vfil_}{\.{\\vfil} primitive@> 24081primitive("vfill",vskip,fill_code);@/ 24082@!@:vfill_}{\.{\\vfill} primitive@> 24083primitive("vss",vskip,ss_code); 24084@!@:vss_}{\.{\\vss} primitive@> 24085primitive("vfilneg",vskip,fil_neg_code);@/ 24086@!@:vfil_neg_}{\.{\\vfilneg} primitive@> 24087primitive("mskip",mskip,mskip_code);@/ 24088@!@:mskip_}{\.{\\mskip} primitive@> 24089primitive("kern",kern,explicit); 24090@!@:kern_}{\.{\\kern} primitive@> 24091primitive("mkern",mkern,mu_glue);@/ 24092@!@:mkern_}{\.{\\mkern} primitive@> 24093 24094@ @<Cases of |print_cmd_chr|...@>= 24095hskip: case chr_code of 24096 skip_code:print_esc("hskip"); 24097 fil_code:print_esc("hfil"); 24098 fill_code:print_esc("hfill"); 24099 ss_code:print_esc("hss"); 24100 othercases print_esc("hfilneg") 24101 endcases; 24102vskip: case chr_code of 24103 skip_code:print_esc("vskip"); 24104 fil_code:print_esc("vfil"); 24105 fill_code:print_esc("vfill"); 24106 ss_code:print_esc("vss"); 24107 othercases print_esc("vfilneg") 24108 endcases; 24109mskip: print_esc("mskip"); 24110kern: print_esc("kern"); 24111mkern: print_esc("mkern"); 24112 24113@ All the work relating to glue creation has been relegated to the 24114following subroutine. It does not call |build_page|, because it is 24115used in at least one place where that would be a mistake. 24116 24117@<Declare action...@>= 24118procedure append_glue; 24119var s:small_number; {modifier of skip command} 24120begin s:=cur_chr; 24121case s of 24122fil_code: cur_val:=fil_glue; 24123fill_code: cur_val:=fill_glue; 24124ss_code: cur_val:=ss_glue; 24125fil_neg_code: cur_val:=fil_neg_glue; 24126skip_code: scan_glue(glue_val); 24127mskip_code: scan_glue(mu_val); 24128end; {now |cur_val| points to the glue specification} 24129tail_append(new_glue(cur_val)); 24130if s>=skip_code then 24131 begin decr(glue_ref_count(cur_val)); 24132 if s>skip_code then subtype(tail):=mu_glue; 24133 end; 24134end; 24135 24136@ @<Declare act...@>= 24137procedure append_kern; 24138var s:quarterword; {|subtype| of the kern node} 24139begin s:=cur_chr; scan_dimen(s=mu_glue,false,false); 24140tail_append(new_kern(cur_val)); subtype(tail):=s; 24141end; 24142 24143@ Many of the actions related to box-making are triggered by the appearance 24144of braces in the input. For example, when the user says `\.{\\hbox} 24145\.{to} \.{100pt\{$\langle\,\hbox{hlist}\,\rangle$\}}' in vertical mode, 24146the information about the box size (100pt, |exactly|) is put onto |save_stack| 24147with a level boundary word just above it, and |cur_group:=adjusted_hbox_group|; 24148\TeX\ enters restricted horizontal mode to process the hlist. The right 24149brace eventually causes |save_stack| to be restored to its former state, 24150at which time the information about the box size (100pt, |exactly|) is 24151available once again; a box is packaged and we leave restricted horizontal 24152mode, appending the new box to the current list of the enclosing mode 24153(in this case to the current list of vertical mode), followed by any 24154vertical adjustments that were removed from the box by |hpack|. 24155 24156The next few sections of the program are therefore concerned with the 24157treatment of left and right curly braces. 24158 24159@ If a left brace occurs in the middle of a page or paragraph, it simply 24160introduces a new level of grouping, and the matching right brace will not have 24161such a drastic effect. Such grouping affects neither the mode nor the 24162current list. 24163 24164@<Cases of |main_control| that build...@>= 24165non_math(left_brace): new_save_level(simple_group); 24166any_mode(begin_group): new_save_level(semi_simple_group); 24167any_mode(end_group): if cur_group=semi_simple_group then unsave 24168 else off_save; 24169 24170@ We have to deal with errors in which braces and such things are not 24171properly nested. Sometimes the user makes an error of commission by 24172inserting an extra symbol, but sometimes the user makes an error of omission. 24173\TeX\ can't always tell one from the other, so it makes a guess and tries 24174to avoid getting into a loop. 24175 24176The |off_save| routine is called when the current group code is wrong. It tries 24177to insert something into the user's input that will help clean off 24178the top level. 24179 24180@<Declare act...@>= 24181procedure off_save; 24182var p:pointer; {inserted token} 24183begin if cur_group=bottom_level then 24184 @<Drop current token and complain that it was unmatched@> 24185else begin back_input; p:=get_avail; link(temp_head):=p; 24186 print_err("Missing "); 24187 @<Prepare to insert a token that matches |cur_group|, 24188 and print what it is@>; 24189 print(" inserted"); ins_list(link(temp_head)); 24190 help5("I've inserted something that you may have forgotten.")@/ 24191 ("(See the <inserted text> above.)")@/ 24192 ("With luck, this will get me unwedged. But if you")@/ 24193 ("really didn't forget anything, try typing `2' now; then")@/ 24194 ("my insertion and my current dilemma will both disappear."); 24195 error; 24196 end; 24197end; 24198 24199@ At this point, |link(temp_head)=p|, a pointer to an empty one-word node. 24200 24201@<Prepare to insert a token that matches |cur_group|...@>= 24202case cur_group of 24203semi_simple_group: begin info(p):=cs_token_flag+frozen_end_group; 24204 print_esc("endgroup"); 24205@.Missing \\endgroup inserted@> 24206 end; 24207math_shift_group: begin info(p):=math_shift_token+"$"; print_char("$"); 24208@.Missing \$ inserted@> 24209 end; 24210math_left_group: begin info(p):=cs_token_flag+frozen_right; link(p):=get_avail; 24211 p:=link(p); info(p):=other_token+"."; print_esc("right."); 24212@.Missing \\right\hbox{.} inserted@> 24213@^null delimiter@> 24214 end; 24215othercases begin info(p):=right_brace_token+"}"; print_char("}"); 24216@.Missing \} inserted@> 24217 end 24218endcases 24219 24220@ @<Drop current token and complain that it was unmatched@>= 24221begin print_err("Extra "); print_cmd_chr(cur_cmd,cur_chr); 24222@.Extra x@> 24223help1("Things are pretty mixed up, but I think the worst is over.");@/ 24224error; 24225end 24226 24227@ The routine for a |right_brace| character branches into many subcases, 24228since a variety of things may happen, depending on |cur_group|. Some 24229types of groups are not supposed to be ended by a right brace; error 24230messages are given in hopes of pinpointing the problem. Most branches 24231of this routine will be filled in later, when we are ready to understand 24232them; meanwhile, we must prepare ourselves to deal with such errors. 24233 24234@<Cases of |main_control| that build...@>= 24235any_mode(right_brace): handle_right_brace; 24236 24237@ @<Declare the procedure called |handle_right_brace|@>= 24238procedure handle_right_brace; 24239var p,@!q:pointer; {for short-term use} 24240@!d:scaled; {holds |split_max_depth| in |insert_group|} 24241@!f:integer; {holds |floating_penalty| in |insert_group|} 24242begin case cur_group of 24243simple_group: unsave; 24244bottom_level: begin print_err("Too many }'s"); 24245@.Too many \}'s@> 24246 help2("You've closed more groups than you opened.")@/ 24247 ("Such booboos are generally harmless, so keep going."); error; 24248 end; 24249semi_simple_group,math_shift_group,math_left_group: extra_right_brace; 24250@t\4@>@<Cases of |handle_right_brace| where a |right_brace| triggers 24251 a delayed action@>@; 24252othercases confusion("rightbrace") 24253@:this can't happen rightbrace}{\quad rightbrace@> 24254endcases; 24255end; 24256 24257@ @<Declare act...@>= 24258procedure extra_right_brace; 24259begin print_err("Extra }, or forgotten "); 24260@.Extra \}, or forgotten x@> 24261case cur_group of 24262semi_simple_group: print_esc("endgroup"); 24263math_shift_group: print_char("$"); 24264math_left_group: print_esc("right"); 24265end;@/ 24266help5("I've deleted a group-closing symbol because it seems to be")@/ 24267("spurious, as in `$x}$'. But perhaps the } is legitimate and")@/ 24268("you forgot something else, as in `\hbox{$x}'. In such cases")@/ 24269("the way to recover is to insert both the forgotten and the")@/ 24270("deleted material, e.g., by typing `I$}'."); error; 24271incr(align_state); 24272end; 24273 24274@ Here is where we clear the parameters that are supposed to revert to their 24275default values after every paragraph and when internal vertical mode is entered. 24276 24277@<Declare act...@>= 24278procedure normal_paragraph; 24279begin if looseness<>0 then eq_word_define(int_base+looseness_code,0); 24280if hang_indent<>0 then eq_word_define(dimen_base+hang_indent_code,0); 24281if hang_after<>1 then eq_word_define(int_base+hang_after_code,1); 24282if par_shape_ptr<>null then eq_define(par_shape_loc,shape_ref,null); 24283if inter_line_penalties_ptr<>null then 24284 eq_define(inter_line_penalties_loc,shape_ref,null); 24285end; 24286 24287@ Now let's turn to the question of how \.{\\hbox} is treated. We actually 24288need to consider also a slightly larger context, since constructions like 24289`\.{\\setbox3=}\penalty0\.{\\hbox...}' and 24290`\.{\\leaders}\penalty0\.{\\hbox...}' and 24291`\.{\\lower3.8pt\\hbox...}' 24292are supposed to invoke quite 24293different actions after the box has been packaged. Conversely, 24294constructions like `\.{\\setbox3=}' can be followed by a variety of 24295different kinds of boxes, and we would like to encode such things in an 24296efficient way. 24297 24298In other words, there are two problems: to represent the context of a box, 24299and to represent its type. 24300 24301The first problem is solved by putting a ``context code'' on the |save_stack|, 24302just below the two entries that give the dimensions produced by |scan_spec|. 24303The context code is either a (signed) shift amount, or it is a large 24304integer |>=box_flag|, where |box_flag=@t$2^{30}$@>|. Codes |box_flag| through 24305|global_box_flag-1| represent `\.{\\setbox0}' through `\.{\\setbox32767}'; 24306codes |global_box_flag| through |ship_out_flag-1| represent 24307`\.{\\global\\setbox0}' through `\.{\\global\\setbox32767}'; 24308code |ship_out_flag| represents `\.{\\shipout}'; and codes |leader_flag| 24309through |leader_flag+2| represent `\.{\\leaders}', `\.{\\cleaders}', 24310and `\.{\\xleaders}'. 24311 24312The second problem is solved by giving the command code |make_box| to all 24313control sequences that produce a box, and by using the following |chr_code| 24314values to distinguish between them: |box_code|, |copy_code|, |last_box_code|, 24315|vsplit_code|, |vtop_code|, |vtop_code+vmode|, and |vtop_code+hmode|, where 24316the latter two are used to denote \.{\\vbox} and \.{\\hbox}, respectively. 24317 24318@d box_flag==@'10000000000 {context code for `\.{\\setbox0}'} 24319@d global_box_flag==@'10000100000 {context code for `\.{\\global\\setbox0}'} 24320@d ship_out_flag==@'10000200000 {context code for `\.{\\shipout}'} 24321@d leader_flag==@'10000200001 {context code for `\.{\\leaders}'} 24322@d box_code=0 {|chr_code| for `\.{\\box}'} 24323@d copy_code=1 {|chr_code| for `\.{\\copy}'} 24324@d last_box_code=2 {|chr_code| for `\.{\\lastbox}'} 24325@d vsplit_code=3 {|chr_code| for `\.{\\vsplit}'} 24326@d vtop_code=4 {|chr_code| for `\.{\\vtop}'} 24327 24328@<Put each...@>= 24329primitive("moveleft",hmove,1); 24330@!@:move_left_}{\.{\\moveleft} primitive@> 24331primitive("moveright",hmove,0);@/ 24332@!@:move_right_}{\.{\\moveright} primitive@> 24333primitive("raise",vmove,1); 24334@!@:raise_}{\.{\\raise} primitive@> 24335primitive("lower",vmove,0); 24336@!@:lower_}{\.{\\lower} primitive@> 24337@# 24338primitive("box",make_box,box_code); 24339@!@:box_}{\.{\\box} primitive@> 24340primitive("copy",make_box,copy_code); 24341@!@:copy_}{\.{\\copy} primitive@> 24342primitive("lastbox",make_box,last_box_code); 24343@!@:last_box_}{\.{\\lastbox} primitive@> 24344primitive("vsplit",make_box,vsplit_code); 24345@!@:vsplit_}{\.{\\vsplit} primitive@> 24346primitive("vtop",make_box,vtop_code);@/ 24347@!@:vtop_}{\.{\\vtop} primitive@> 24348primitive("vbox",make_box,vtop_code+vmode); 24349@!@:vbox_}{\.{\\vbox} primitive@> 24350primitive("hbox",make_box,vtop_code+hmode);@/ 24351@!@:hbox_}{\.{\\hbox} primitive@> 24352primitive("shipout",leader_ship,a_leaders-1); {|ship_out_flag=leader_flag-1|} 24353@!@:ship_out_}{\.{\\shipout} primitive@> 24354primitive("leaders",leader_ship,a_leaders); 24355@!@:leaders_}{\.{\\leaders} primitive@> 24356primitive("cleaders",leader_ship,c_leaders); 24357@!@:c_leaders_}{\.{\\cleaders} primitive@> 24358primitive("xleaders",leader_ship,x_leaders); 24359@!@:x_leaders_}{\.{\\xleaders} primitive@> 24360 24361@ @<Cases of |print_cmd_chr|...@>= 24362hmove: if chr_code=1 then print_esc("moveleft")@+else print_esc("moveright"); 24363vmove: if chr_code=1 then print_esc("raise")@+else print_esc("lower"); 24364make_box: case chr_code of 24365 box_code: print_esc("box"); 24366 copy_code: print_esc("copy"); 24367 last_box_code: print_esc("lastbox"); 24368 vsplit_code: print_esc("vsplit"); 24369 vtop_code: print_esc("vtop"); 24370 vtop_code+vmode: print_esc("vbox"); 24371 othercases print_esc("hbox") 24372 endcases; 24373leader_ship: if chr_code=a_leaders then print_esc("leaders") 24374 else if chr_code=c_leaders then print_esc("cleaders") 24375 else if chr_code=x_leaders then print_esc("xleaders") 24376 else print_esc("shipout"); 24377 24378@ Constructions that require a box are started by calling |scan_box| with 24379a specified context code. The |scan_box| routine verifies 24380that a |make_box| command comes next and then it calls |begin_box|. 24381 24382@<Cases of |main_control| that build...@>= 24383vmode+hmove,hmode+vmove,mmode+vmove: begin t:=cur_chr; 24384 scan_normal_dimen; 24385 if t=0 then scan_box(cur_val)@+else scan_box(-cur_val); 24386 end; 24387any_mode(leader_ship): scan_box(leader_flag-a_leaders+cur_chr); 24388any_mode(make_box): begin_box(0); 24389 24390@ The global variable |cur_box| will point to a newly made box. If the box 24391is void, we will have |cur_box=null|. Otherwise we will have 24392|type(cur_box)=hlist_node| or |vlist_node| or |rule_node|; the |rule_node| 24393case can occur only with leaders. 24394 24395@<Glob...@>= 24396@!cur_box:pointer; {box to be placed into its context} 24397 24398@ The |box_end| procedure does the right thing with |cur_box|, if 24399|box_context| represents the context as explained above. 24400 24401@<Declare act...@>= 24402procedure box_end(@!box_context:integer); 24403var p:pointer; {|ord_noad| for new box in math mode} 24404@!a:small_number; {global prefix} 24405begin if box_context<box_flag then @<Append box |cur_box| to the current list, 24406 shifted by |box_context|@> 24407else if box_context<ship_out_flag then @<Store \(c)|cur_box| in a box register@> 24408else if cur_box<>null then 24409 if box_context>ship_out_flag then @<Append a new leader node that 24410 uses |cur_box|@> 24411 else ship_out(cur_box); 24412end; 24413 24414@ The global variable |adjust_tail| will be non-null if and only if the 24415current box might include adjustments that should be appended to the 24416current vertical list. 24417 24418@<Append box |cur_box| to the current...@>= 24419begin if cur_box<>null then 24420 begin shift_amount(cur_box):=box_context; 24421 if abs(mode)=vmode then 24422 begin 24423 if pre_adjust_tail <> null then begin 24424 if pre_adjust_head <> pre_adjust_tail then 24425 append_list(pre_adjust_head)(pre_adjust_tail); 24426 pre_adjust_tail:=null; 24427 end; 24428 append_to_vlist(cur_box); 24429 if adjust_tail <> null then begin 24430 if adjust_head <> adjust_tail then 24431 append_list(adjust_head)(adjust_tail); 24432 adjust_tail:=null; 24433 end; 24434 if mode>0 then build_page; 24435 end 24436 else begin if abs(mode)=hmode then space_factor:=1000 24437 else begin p:=new_noad; 24438 math_type(nucleus(p)):=sub_box; 24439 info(nucleus(p)):=cur_box; cur_box:=p; 24440 end; 24441 link(tail):=cur_box; tail:=cur_box; 24442 end; 24443 end; 24444end 24445 24446@ @<Store \(c)|cur_box| in a box register@>= 24447begin if box_context<global_box_flag then 24448 begin cur_val:=box_context-box_flag; a:=0; 24449 end 24450else begin cur_val:=box_context-global_box_flag; a:=4; 24451 end; 24452if cur_val<256 then define(box_base+cur_val,box_ref,cur_box) 24453else sa_def_box; 24454end 24455 24456@ @<Append a new leader node ...@>= 24457begin @<Get the next non-blank non-relax...@>; 24458if ((cur_cmd=hskip)and(abs(mode)<>vmode))or@| 24459 ((cur_cmd=vskip)and(abs(mode)=vmode)) then 24460 begin append_glue; subtype(tail):=box_context-(leader_flag-a_leaders); 24461 leader_ptr(tail):=cur_box; 24462 end 24463else begin print_err("Leaders not followed by proper glue"); 24464@.Leaders not followed by...@> 24465 help3("You should say `\leaders <box or rule><hskip or vskip>'.")@/ 24466 ("I found the <box or rule>, but there's no suitable")@/ 24467 ("<hskip or vskip>, so I'm ignoring these leaders."); back_error; 24468 flush_node_list(cur_box); 24469 end; 24470end 24471 24472@ Now that we can see what eventually happens to boxes, we can consider 24473the first steps in their creation. The |begin_box| routine is called when 24474|box_context| is a context specification, |cur_chr| specifies the type of 24475box desired, and |cur_cmd=make_box|. 24476 24477@<Declare act...@>= 24478procedure begin_box(@!box_context:integer); 24479label exit, done; 24480var @!p,@!q:pointer; {run through the current list} 24481@!r:pointer; {running behind |p|} 24482@!fm:boolean; {a final \.{\\beginM} \.{\\endM} node pair?} 24483@!tx:pointer; {effective tail node} 24484@!m:quarterword; {the length of a replacement list} 24485@!k:halfword; {0 or |vmode| or |hmode|} 24486@!n:halfword; {a box number} 24487begin case cur_chr of 24488box_code: begin scan_register_num; fetch_box(cur_box); 24489 change_box(null); {the box becomes void, at the same level} 24490 end; 24491copy_code: begin scan_register_num; fetch_box(q); cur_box:=copy_node_list(q); 24492 end; 24493last_box_code: @<If the current list ends with a box node, delete it from 24494 the list and make |cur_box| point to it; otherwise set |cur_box:=null|@>; 24495vsplit_code: @<Split off part of a vertical box, make |cur_box| point to it@>; 24496othercases @<Initiate the construction of an hbox or vbox, then |return|@> 24497endcases;@/ 24498box_end(box_context); {in simple cases, we use the box immediately} 24499exit:end; 24500 24501@ Note that the condition |not is_char_node(tail)| implies that |head<>tail|, 24502since |head| is a one-word node. 24503 24504@d fetch_effective_tail_eTeX(#)== {extract |tx|, 24505 drop \.{\\beginM} \.{\\endM} pair} 24506q:=head; p:=null; 24507repeat r:=p; p:=q; fm:=false; 24508if not is_char_node(q) then 24509 if type(q)=disc_node then 24510 begin for m:=1 to replace_count(q) do p:=link(p); 24511 if p=tx then #; 24512 end 24513 else if (type(q)=math_node)and(subtype(q)=begin_M_code) then fm:=true; 24514q:=link(p); 24515until q=tx; {found |r|$\to$|p|$\to$|q=tx|} 24516q:=link(tx); link(p):=q; link(tx):=null; 24517if q=null then if fm then confusion("tail1") 24518@:this can't happen tail1}{\quad tail1@> 24519 else tail:=p 24520else if fm then {|r|$\to$|p=begin_M|$\to$|q=end_M|} 24521 begin tail:=r; link(r):=null; flush_node_list(p);@+end 24522@# 24523@d check_effective_tail(#)==find_effective_tail_eTeX 24524@d fetch_effective_tail==fetch_effective_tail_eTeX 24525 24526@<If the current list ends with a box node, delete it...@>= 24527begin cur_box:=null; 24528if abs(mode)=mmode then 24529 begin you_cant; help1("Sorry; this \lastbox will be void."); error; 24530 end 24531else if (mode=vmode)and(head=tail) then 24532 begin you_cant; 24533 help2("Sorry...I usually can't take things from the current page.")@/ 24534 ("This \lastbox will therefore be void."); error; 24535 end 24536else begin check_effective_tail(goto done); 24537 if not is_char_node(tx) then 24538 if (type(tx)=hlist_node)or(type(tx)=vlist_node) then 24539 @<Remove the last box, unless it's part of a discretionary@>; 24540 done:end; 24541end 24542 24543@ @<Remove the last box...@>= 24544begin fetch_effective_tail(goto done); 24545cur_box:=tx; shift_amount(cur_box):=0; 24546end 24547 24548@ Here we deal with things like `\.{\\vsplit 13 to 100pt}'. 24549 24550@<Split off part of a vertical box, make |cur_box| point to it@>= 24551begin scan_register_num; n:=cur_val; 24552if not scan_keyword("to") then 24553@.to@> 24554 begin print_err("Missing `to' inserted"); 24555@.Missing `to' inserted@> 24556 help2("I'm working on `\vsplit<box number> to <dimen>';")@/ 24557 ("will look for the <dimen> next."); error; 24558 end; 24559scan_normal_dimen; 24560cur_box:=vsplit(n,cur_val); 24561end 24562 24563@ Here is where we enter restricted horizontal mode or internal vertical 24564mode, in order to make a box. 24565 24566@<Initiate the construction of an hbox or vbox, then |return|@>= 24567begin k:=cur_chr-vtop_code; saved(0):=box_context; 24568if k=hmode then 24569 if (box_context<box_flag)and(abs(mode)=vmode) then 24570 scan_spec(adjusted_hbox_group,true) 24571 else scan_spec(hbox_group,true) 24572else begin if k=vmode then scan_spec(vbox_group,true) 24573 else begin scan_spec(vtop_group,true); k:=vmode; 24574 end; 24575 normal_paragraph; 24576 end; 24577push_nest; mode:=-k; 24578if k=vmode then 24579 begin prev_depth:=ignore_depth; 24580 if every_vbox<>null then begin_token_list(every_vbox,every_vbox_text); 24581 end 24582else begin space_factor:=1000; 24583 if every_hbox<>null then begin_token_list(every_hbox,every_hbox_text); 24584 end; 24585return; 24586end 24587 24588@ @<Declare act...@>= 24589procedure scan_box(@!box_context:integer); 24590 {the next input should specify a box or perhaps a rule} 24591begin @<Get the next non-blank non-relax...@>; 24592if cur_cmd=make_box then begin_box(box_context) 24593else if (box_context>=leader_flag)and((cur_cmd=hrule)or(cur_cmd=vrule)) then 24594 begin cur_box:=scan_rule_spec; box_end(box_context); 24595 end 24596else begin@t@>@;@/ 24597 print_err("A <box> was supposed to be here");@/ 24598@.A <box> was supposed to...@> 24599 help3("I was expecting to see \hbox or \vbox or \copy or \box or")@/ 24600 ("something like that. So you might find something missing in")@/ 24601 ("your output. But keep trying; you can fix this later."); back_error; 24602 end; 24603end; 24604 24605@ When the right brace occurs at the end of an \.{\\hbox} or \.{\\vbox} or 24606\.{\\vtop} construction, the |package| routine comes into action. We might 24607also have to finish a paragraph that hasn't ended. 24608 24609@<Cases of |handle...@>= 24610hbox_group: package(0); 24611adjusted_hbox_group: begin adjust_tail:=adjust_head; 24612 pre_adjust_tail:=pre_adjust_head; package(0); 24613 end; 24614vbox_group: begin end_graf; package(0); 24615 end; 24616vtop_group: begin end_graf; package(vtop_code); 24617 end; 24618 24619@ @<Declare action...@>= 24620procedure package(@!c:small_number); 24621var h:scaled; {height of box} 24622@!p:pointer; {first node in a box} 24623@!d:scaled; {max depth} 24624@!u,v:integer; {saved values for upwards mode flag} 24625begin d:=box_max_depth; u:=XeTeX_upwards_state; unsave; save_ptr:=save_ptr-3; 24626v:=XeTeX_upwards_state; XeTeX_upwards_state:=u; 24627if mode=-hmode then cur_box:=hpack(link(head),saved(2),saved(1)) 24628else begin cur_box:=vpackage(link(head),saved(2),saved(1),d); 24629 if c=vtop_code then @<Readjust the height and depth of |cur_box|, 24630 for \.{\\vtop}@>; 24631 end; 24632XeTeX_upwards_state:=v; 24633pop_nest; box_end(saved(0)); 24634end; 24635 24636@ The height of a `\.{\\vtop}' box is inherited from the first item on its list, 24637if that item is an |hlist_node|, |vlist_node|, or |rule_node|; otherwise 24638the \.{\\vtop} height is zero. 24639 24640 24641@<Readjust the height...@>= 24642begin h:=0; p:=list_ptr(cur_box); 24643if p<>null then if type(p)<=rule_node then h:=height(p); 24644depth(cur_box):=depth(cur_box)-h+height(cur_box); height(cur_box):=h; 24645end 24646 24647@ A paragraph begins when horizontal-mode material occurs in vertical mode, 24648or when the paragraph is explicitly started by `\.{\\indent}' or 24649`\.{\\noindent}'. 24650 24651@<Put each...@>= 24652primitive("indent",start_par,1); 24653@!@:indent_}{\.{\\indent} primitive@> 24654primitive("noindent",start_par,0); 24655@!@:no_indent_}{\.{\\noindent} primitive@> 24656 24657@ @<Cases of |print_cmd_chr|...@>= 24658start_par: if chr_code=0 then print_esc("noindent")@+ else print_esc("indent"); 24659 24660@ @<Cases of |main_control| that build...@>= 24661vmode+start_par: new_graf(cur_chr>0); 24662vmode+letter,vmode+other_char,vmode+char_num,vmode+char_given, 24663 vmode+math_shift,vmode+un_hbox,vmode+vrule, 24664 vmode+accent,vmode+discretionary,vmode+hskip,vmode+valign, 24665 vmode+ex_space,vmode+no_boundary:@t@>@;@/ 24666 begin back_input; new_graf(true); 24667 end; 24668 24669@ @<Declare act...@>= 24670function norm_min(@!h:integer):small_number; 24671begin if h<=0 then norm_min:=1@+else if h>=63 then norm_min:=63@+ 24672else norm_min:=h; 24673end; 24674@# 24675procedure new_graf(@!indented:boolean); 24676begin prev_graf:=0; 24677if (mode=vmode)or(head<>tail) then 24678 tail_append(new_param_glue(par_skip_code)); 24679push_nest; mode:=hmode; space_factor:=1000; set_cur_lang; clang:=cur_lang; 24680prev_graf:=(norm_min(left_hyphen_min)*@'100+norm_min(right_hyphen_min)) 24681 *@'200000+cur_lang; 24682if indented then 24683 begin tail:=new_null_box; link(head):=tail; width(tail):=par_indent;@+ 24684 end; 24685if every_par<>null then begin_token_list(every_par,every_par_text); 24686if nest_ptr=1 then build_page; {put |par_skip| glue on current page} 24687end; 24688 24689@ @<Cases of |main_control| that build...@>= 24690hmode+start_par,mmode+start_par: indent_in_hmode; 24691 24692@ @<Declare act...@>= 24693procedure indent_in_hmode; 24694var p,@!q:pointer; 24695begin if cur_chr>0 then {\.{\\indent}} 24696 begin p:=new_null_box; width(p):=par_indent; 24697 if abs(mode)=hmode then space_factor:=1000 24698 else begin q:=new_noad; math_type(nucleus(q)):=sub_box; 24699 info(nucleus(q)):=p; p:=q; 24700 end; 24701 tail_append(p); 24702 end; 24703end; 24704 24705@ A paragraph ends when a |par_end| command is sensed, or when we are in 24706horizontal mode when reaching the right brace of vertical-mode routines 24707like \.{\\vbox}, \.{\\insert}, or \.{\\output}. 24708 24709@<Cases of |main_control| that build...@>= 24710vmode+par_end: begin normal_paragraph; 24711 if mode>0 then build_page; 24712 end; 24713hmode+par_end: begin if align_state<0 then off_save; {this tries to 24714 recover from an alignment that didn't end properly} 24715 end_graf; {this takes us to the enclosing mode, if |mode>0|} 24716 if mode=vmode then build_page; 24717 end; 24718hmode+stop,hmode+vskip,hmode+hrule,hmode+un_vbox,hmode+halign: head_for_vmode; 24719 24720@ @<Declare act...@>= 24721procedure head_for_vmode; 24722begin if mode<0 then 24723 if cur_cmd<>hrule then off_save 24724 else begin print_err("You can't use `"); 24725 print_esc("hrule"); print("' here except with leaders"); 24726@.You can't use \\hrule...@> 24727 help2("To put a horizontal rule in an hbox or an alignment,")@/ 24728 ("you should use \leaders or \hrulefill (see The TeXbook)."); 24729 error; 24730 end 24731else begin back_input; cur_tok:=par_token; back_input; token_type:=inserted; 24732 end; 24733end; 24734 24735@ @<Declare act...@>= 24736procedure end_graf; 24737begin if mode=hmode then 24738 begin if head=tail then pop_nest {null paragraphs are ignored} 24739 else line_break(false); 24740 if LR_save<>null then 24741 begin flush_list(LR_save); LR_save:=null; 24742 end; 24743 normal_paragraph; 24744 error_count:=0; 24745 end; 24746end; 24747 24748@ Insertion and adjustment and mark nodes are constructed by the following 24749pieces of the program. 24750 24751@<Cases of |main_control| that build...@>= 24752any_mode(insert),hmode+vadjust,mmode+vadjust: begin_insert_or_adjust; 24753any_mode(mark): make_mark; 24754 24755@ @<Forbidden...@>= 24756vmode+vadjust, 24757 24758@ @<Declare act...@>= 24759procedure begin_insert_or_adjust; 24760begin if cur_cmd=vadjust then cur_val:=255 24761else begin scan_eight_bit_int; 24762 if cur_val=255 then 24763 begin print_err("You can't "); print_esc("insert"); print_int(255); 24764@.You can't \\insert255@> 24765 help1("I'm changing to \insert0; box 255 is special."); 24766 error; cur_val:=0; 24767 end; 24768 end; 24769saved(0):=cur_val; 24770if (cur_cmd = vadjust) and scan_keyword("pre") then 24771 saved(1):=1 24772else 24773 saved(1):=0; 24774save_ptr:=save_ptr + 2; 24775new_save_level(insert_group); scan_left_brace; normal_paragraph; 24776push_nest; mode:=-vmode; prev_depth:=ignore_depth; 24777end; 24778 24779@ @<Cases of |handle...@>= 24780insert_group: begin end_graf; q:=split_top_skip; add_glue_ref(q); 24781 d:=split_max_depth; f:=floating_penalty; unsave; save_ptr:=save_ptr-2; 24782 {now |saved(0)| is the insertion number, or 255 for |vadjust|} 24783 p:=vpack(link(head),natural); pop_nest; 24784 if saved(0)<255 then 24785 begin tail_append(get_node(ins_node_size)); 24786 type(tail):=ins_node; subtype(tail):=qi(saved(0)); 24787 height(tail):=height(p)+depth(p); ins_ptr(tail):=list_ptr(p); 24788 split_top_ptr(tail):=q; depth(tail):=d; float_cost(tail):=f; 24789 end 24790 else begin tail_append(get_node(small_node_size)); 24791 type(tail):=adjust_node;@/ 24792 adjust_pre(tail):=saved(1); {the |subtype| is used for |adjust_pre|} 24793 adjust_ptr(tail):=list_ptr(p); delete_glue_ref(q); 24794 end; 24795 free_node(p,box_node_size); 24796 if nest_ptr=0 then build_page; 24797 end; 24798output_group: @<Resume the page builder...@>; 24799 24800@ @<Declare act...@>= 24801procedure make_mark; 24802var p:pointer; {new node} 24803@!c:halfword; {the mark class} 24804begin if cur_chr=0 then c:=0 24805else begin scan_register_num; c:=cur_val; 24806 end; 24807p:=scan_toks(false,true); p:=get_node(small_node_size); 24808mark_class(p):=c; 24809type(p):=mark_node; subtype(p):=0; {the |subtype| is not used} 24810mark_ptr(p):=def_ref; link(tail):=p; tail:=p; 24811end; 24812 24813@ Penalty nodes get into a list via the |break_penalty| command. 24814@^penalties@> 24815 24816@<Cases of |main_control| that build...@>= 24817any_mode(break_penalty): append_penalty; 24818 24819@ @<Declare action...@>= 24820procedure append_penalty; 24821begin scan_int; tail_append(new_penalty(cur_val)); 24822if mode=vmode then build_page; 24823end; 24824 24825@ The |remove_item| command removes a penalty, kern, or glue node if it 24826appears at the tail of the current list, using a brute-force linear scan. 24827Like \.{\\lastbox}, this command is not allowed in vertical mode (except 24828internal vertical mode), since the current list in vertical mode is sent 24829to the page builder. But if we happen to be able to implement it in 24830vertical mode, we do. 24831 24832@<Cases of |main_control| that build...@>= 24833any_mode(remove_item): delete_last; 24834 24835@ When |delete_last| is called, |cur_chr| is the |type| of node that 24836will be deleted, if present. 24837 24838@<Declare action...@>= 24839procedure delete_last; 24840label exit; 24841var @!p,@!q:pointer; {run through the current list} 24842@!r:pointer; {running behind |p|} 24843@!fm:boolean; {a final \.{\\beginM} \.{\\endM} node pair?} 24844@!tx:pointer; {effective tail node} 24845@!m:quarterword; {the length of a replacement list} 24846begin if (mode=vmode)and(tail=head) then 24847 @<Apologize for inability to do the operation now, 24848 unless \.{\\unskip} follows non-glue@> 24849else begin check_effective_tail(return); 24850 if not is_char_node(tx) then if type(tx)=cur_chr then 24851 begin fetch_effective_tail(return); 24852 flush_node_list(tx); 24853 end; 24854 end; 24855exit:end; 24856 24857@ @<Apologize for inability to do the operation...@>= 24858begin if (cur_chr<>glue_node)or(last_glue<>max_halfword) then 24859 begin you_cant; 24860 help2("Sorry...I usually can't take things from the current page.")@/ 24861 ("Try `I\vskip-\lastskip' instead."); 24862 if cur_chr=kern_node then help_line[0]:= 24863 ("Try `I\kern-\lastkern' instead.") 24864 else if cur_chr<>glue_node then help_line[0]:=@| 24865 ("Perhaps you can make the output routine do it."); 24866 error; 24867 end; 24868end 24869 24870@ @<Put each...@>= 24871primitive("unpenalty",remove_item,penalty_node);@/ 24872@!@:un_penalty_}{\.{\\unpenalty} primitive@> 24873primitive("unkern",remove_item,kern_node);@/ 24874@!@:un_kern_}{\.{\\unkern} primitive@> 24875primitive("unskip",remove_item,glue_node);@/ 24876@!@:un_skip_}{\.{\\unskip} primitive@> 24877primitive("unhbox",un_hbox,box_code);@/ 24878@!@:un_hbox_}{\.{\\unhbox} primitive@> 24879primitive("unhcopy",un_hbox,copy_code);@/ 24880@!@:un_hcopy_}{\.{\\unhcopy} primitive@> 24881primitive("unvbox",un_vbox,box_code);@/ 24882@!@:un_vbox_}{\.{\\unvbox} primitive@> 24883primitive("unvcopy",un_vbox,copy_code);@/ 24884@!@:un_vcopy_}{\.{\\unvcopy} primitive@> 24885 24886@ @<Cases of |print_cmd_chr|...@>= 24887remove_item: if chr_code=glue_node then print_esc("unskip") 24888 else if chr_code=kern_node then print_esc("unkern") 24889 else print_esc("unpenalty"); 24890un_hbox: if chr_code=copy_code then print_esc("unhcopy") 24891 else print_esc("unhbox"); 24892un_vbox: if chr_code=copy_code then print_esc("unvcopy") 24893 @<Cases of |un_vbox| for |print_cmd_chr|@>@/ 24894 else print_esc("unvbox"); 24895 24896@ The |un_hbox| and |un_vbox| commands unwrap one of the 256 current boxes. 24897 24898@<Cases of |main_control| that build...@>= 24899vmode+un_vbox,hmode+un_hbox,mmode+un_hbox: unpackage; 24900 24901@ @<Declare act...@>= 24902procedure unpackage; 24903label done, exit; 24904var p:pointer; {the box} 24905 r: pointer; {to remove marginal kern nodes} 24906@!c:box_code..copy_code; {should we copy?} 24907begin if cur_chr>copy_code then @<Handle saved items and |goto done|@>; 24908c:=cur_chr; scan_register_num; fetch_box(p); 24909if p=null then return; 24910if (abs(mode)=mmode)or((abs(mode)=vmode)and(type(p)<>vlist_node))or@| 24911 ((abs(mode)=hmode)and(type(p)<>hlist_node)) then 24912 begin print_err("Incompatible list can't be unboxed"); 24913@.Incompatible list...@> 24914 help3("Sorry, Pandora. (You sneaky devil.)")@/ 24915 ("I refuse to unbox an \hbox in vertical mode or vice versa.")@/ 24916 ("And I can't open any boxes in math mode.");@/ 24917 error; return; 24918 end; 24919if c=copy_code then link(tail):=copy_node_list(list_ptr(p)) 24920else begin link(tail):=list_ptr(p); change_box(null); 24921 free_node(p,box_node_size); 24922 end; 24923done: 24924while link(tail) <> null do begin 24925 r:=link(tail); 24926 if not is_char_node(r) and (type(r) = margin_kern_node) then begin 24927 link(tail):=link(r); 24928 free_node(r, margin_kern_node_size); 24929 end; 24930 tail:=link(tail); 24931end; 24932exit:end; 24933 24934@ @<Forbidden...@>=vmode+ital_corr, 24935 24936@ Italic corrections are converted to kern nodes when the |ital_corr| command 24937follows a character. In math mode the same effect is achieved by appending 24938a kern of zero here, since italic corrections are supplied later. 24939 24940@<Cases of |main_control| that build...@>= 24941hmode+ital_corr: append_italic_correction; 24942mmode+ital_corr: tail_append(new_kern(0)); 24943 24944@ @<Declare act...@>= 24945procedure append_italic_correction; 24946label exit; 24947var p:pointer; {|char_node| at the tail of the current list} 24948@!f:internal_font_number; {the font in the |char_node|} 24949begin if tail<>head then 24950 begin if is_char_node(tail) then p:=tail 24951 else if type(tail)=ligature_node then p:=lig_char(tail) 24952 else if (type(tail)=whatsit_node) then begin 24953 if (subtype(tail)=native_word_node) then begin 24954 tail_append(new_kern(get_native_italic_correction(tail))); subtype(tail):=explicit; 24955 end 24956 else if (subtype(tail)=glyph_node) then begin 24957 tail_append(new_kern(get_native_glyph_italic_correction(tail))); subtype(tail):=explicit; 24958 end; 24959 return; 24960 end 24961 else return; 24962 f:=font(p); 24963 tail_append(new_kern(char_italic(f)(char_info(f)(character(p))))); 24964 subtype(tail):=explicit; 24965 end; 24966exit:end; 24967 24968@ Discretionary nodes are easy in the common case `\.{\\-}', but in the 24969general case we must process three braces full of items. 24970 24971@<Put each...@>= 24972primitive("-",discretionary,1); 24973@!@:Single-character primitives -}{\quad\.{\\-}@> 24974primitive("discretionary",discretionary,0); 24975@!@:discretionary_}{\.{\\discretionary} primitive@> 24976 24977@ @<Cases of |print_cmd_chr|...@>= 24978discretionary: if chr_code=1 then 24979 print_esc("-")@+else print_esc("discretionary"); 24980 24981@ @<Cases of |main_control| that build...@>= 24982hmode+discretionary,mmode+discretionary: append_discretionary; 24983 24984@ The space factor does not change when we append a discretionary node, 24985but it starts out as 1000 in the subsidiary lists. 24986 24987@<Declare act...@>= 24988procedure append_discretionary; 24989var c:integer; {hyphen character} 24990begin tail_append(new_disc); 24991if cur_chr=1 then 24992 begin c:=hyphen_char[cur_font]; 24993 if c>=0 then if c<=biggest_char then pre_break(tail):=new_character(cur_font,c); 24994 end 24995else begin incr(save_ptr); saved(-1):=0; new_save_level(disc_group); 24996 scan_left_brace; push_nest; mode:=-hmode; space_factor:=1000; 24997 end; 24998end; 24999 25000@ The three discretionary lists are constructed somewhat as if they were 25001hboxes. A~subroutine called |build_discretionary| handles the transitions. 25002(This is sort of fun.) 25003 25004@<Cases of |handle...@>= 25005disc_group: build_discretionary; 25006 25007@ @<Declare act...@>= 25008procedure build_discretionary; 25009label done,exit; 25010var p,@!q:pointer; {for link manipulation} 25011@!n:integer; {length of discretionary list} 25012begin unsave; 25013@<Prune the current list, if necessary, until it contains only 25014 |char_node|, |kern_node|, |hlist_node|, |vlist_node|, |rule_node|, 25015 and |ligature_node| items; set |n| to the length of the list, 25016 and set |q| to the list's tail@>; 25017p:=link(head); pop_nest; 25018case saved(-1) of 250190:pre_break(tail):=p; 250201:post_break(tail):=p; 250212:@<Attach list |p| to the current list, and record its length; 25022 then finish up and |return|@>; 25023end; {there are no other cases} 25024incr(saved(-1)); new_save_level(disc_group); scan_left_brace; 25025push_nest; mode:=-hmode; space_factor:=1000; 25026exit:end; 25027 25028@ @<Attach list |p| to the current...@>= 25029begin if (n>0)and(abs(mode)=mmode) then 25030 begin print_err("Illegal math "); print_esc("discretionary"); 25031@.Illegal math \\disc...@> 25032 help2("Sorry: The third part of a discretionary break must be")@/ 25033 ("empty, in math formulas. I had to delete your third part."); 25034 flush_node_list(p); n:=0; error; 25035 end 25036else link(tail):=p; 25037if n<=max_quarterword then replace_count(tail):=n 25038else begin print_err("Discretionary list is too long"); 25039@.Discretionary list is too long@> 25040 help2("Wow---I never thought anybody would tweak me here.")@/ 25041 ("You can't seriously need such a huge discretionary list?"); 25042 error; 25043 end; 25044if n>0 then tail:=q; 25045decr(save_ptr); return; 25046end 25047 25048@ During this loop, |p=link(q)| and there are |n| items preceding |p|. 25049 25050@<Prune the current list, if necessary...@>= 25051q:=head; p:=link(q); n:=0; 25052while p<>null do 25053 begin if not is_char_node(p) then if type(p)>rule_node then 25054 if type(p)<>kern_node then if type(p)<>ligature_node then 25055 if (type(p)<>whatsit_node) or ((subtype(p)<>native_word_node) 25056 and (subtype(p)<>glyph_node)) then 25057 begin print_err("Improper discretionary list"); 25058@.Improper discretionary list@> 25059 help1("Discretionary lists must contain only boxes and kerns.");@/ 25060 error; 25061 begin_diagnostic; 25062 print_nl("The following discretionary sublist has been deleted:"); 25063@.The following...deleted@> 25064 show_box(p); 25065 end_diagnostic(true); 25066 flush_node_list(p); link(q):=null; goto done; 25067 end; 25068 q:=p; p:=link(q); incr(n); 25069 end; 25070done: 25071 25072@ We need only one more thing to complete the horizontal mode routines, namely 25073the \.{\\accent} primitive. 25074 25075@<Cases of |main_control| that build...@>= 25076hmode+accent: make_accent; 25077 25078@ The positioning of accents is straightforward but tedious. Given an accent 25079of width |a|, designed for characters of height |x| and slant |s|; 25080and given a character of width |w|, height |h|, and slant |t|: We will shift 25081the accent down by |x-h|, and we will insert kern nodes that have the effect of 25082centering the accent over the character and shifting the accent to the 25083right by $\delta={1\over2}(w-a)+h\cdot t-x\cdot s$. If either character is 25084absent from the font, we will simply use the other, without shifting. 25085 25086@<Declare act...@>= 25087procedure make_accent; 25088var s,@!t: real; {amount of slant} 25089@!p,@!q,@!r:pointer; {character, box, and kern nodes} 25090@!f:internal_font_number; {relevant font} 25091@!a,@!h,@!x,@!w,@!delta,@!lsb,@!rsb:scaled; {heights and widths, as explained above} 25092@!i:four_quarters; {character information} 25093begin scan_char_num; f:=cur_font; p:=new_character(f,cur_val); 25094if p<>null then 25095 begin x:=x_height(f); s:=slant(f)/float_constant(65536); 25096@^real division@> 25097 if is_native_font(f) then 25098 begin a:=width(p); 25099 if a=0 then get_native_char_sidebearings(f, cur_val, addressof(lsb), addressof(rsb)) 25100 end 25101 else a:=char_width(f)(char_info(f)(character(p)));@/ 25102 do_assignments;@/ 25103 @<Create a character node |q| for the next character, 25104 but set |q:=null| if problems arise@>; 25105 if q<>null then @<Append the accent with appropriate kerns, 25106 then set |p:=q|@>; 25107 link(tail):=p; tail:=p; space_factor:=1000; 25108 end; 25109end; 25110 25111@ @<Create a character node |q| for the next...@>= 25112q:=null; f:=cur_font; 25113if (cur_cmd=letter)or(cur_cmd=other_char)or(cur_cmd=char_given) then 25114 begin q:=new_character(f,cur_chr); cur_val:=cur_chr 25115 end 25116else if cur_cmd=char_num then 25117 begin scan_char_num; q:=new_character(f,cur_val); 25118 end 25119else back_input 25120 25121@ The kern nodes appended here must be distinguished from other kerns, lest 25122they be wiped away by the hyphenation algorithm or by a previous line break. 25123 25124The two kerns are computed with (machine-dependent) |real| arithmetic, but 25125their sum is machine-independent; the net effect is machine-independent, 25126because the user cannot remove these nodes nor access them via \.{\\lastkern}. 25127 25128@<Append the accent with appropriate kerns...@>= 25129begin t:=slant(f)/float_constant(65536); 25130@^real division@> 25131if is_native_font(f) then begin 25132 w:=width(q); 25133 get_native_char_height_depth(f, cur_val, addressof(h), addressof(delta)) 25134 {using delta as scratch space for the unneeded depth value} 25135end else begin 25136 i:=char_info(f)(character(q)); 25137 w:=char_width(f)(i); h:=char_height(f)(height_depth(i)) 25138end; 25139if h<>x then {the accent must be shifted up or down} 25140 begin p:=hpack(p,natural); shift_amount(p):=x-h; 25141 end; 25142if is_native_font(f) and (a=0) then { special case for non-spacing marks } 25143 delta:=round((w-lsb+rsb)/float_constant(2)+h*t-x*s) 25144else delta:=round((w-a)/float_constant(2)+h*t-x*s); 25145@^real multiplication@> 25146@^real addition@> 25147r:=new_kern(delta); subtype(r):=acc_kern; link(tail):=r; link(r):=p; 25148tail:=new_kern(-a-delta); subtype(tail):=acc_kern; link(p):=tail; p:=q; 25149end 25150 25151@ When `\.{\\cr}' or `\.{\\span}' or a tab mark comes through the scanner 25152into |main_control|, it might be that the user has foolishly inserted 25153one of them into something that has nothing to do with alignment. But it is 25154far more likely that a left brace or right brace has been omitted, since 25155|get_next| takes actions appropriate to alignment only when `\.{\\cr}' 25156or `\.{\\span}' or tab marks occur with |align_state=0|. The following 25157program attempts to make an appropriate recovery. 25158 25159@<Cases of |main_control| that build...@>= 25160any_mode(car_ret), any_mode(tab_mark): align_error; 25161any_mode(no_align): no_align_error; 25162any_mode(omit): omit_error; 25163 25164@ @<Declare act...@>= 25165procedure align_error; 25166begin if abs(align_state)>2 then 25167 @<Express consternation over the fact that no alignment is in progress@> 25168else begin back_input; 25169 if align_state<0 then 25170 begin print_err("Missing { inserted"); 25171@.Missing \{ inserted@> 25172 incr(align_state); cur_tok:=left_brace_token+"{"; 25173 end 25174 else begin print_err("Missing } inserted"); 25175@.Missing \} inserted@> 25176 decr(align_state); cur_tok:=right_brace_token+"}"; 25177 end; 25178 help3("I've put in what seems to be necessary to fix")@/ 25179 ("the current column of the current alignment.")@/ 25180 ("Try to go on, since this might almost work."); ins_error; 25181 end; 25182end; 25183 25184@ @<Express consternation...@>= 25185begin print_err("Misplaced "); print_cmd_chr(cur_cmd,cur_chr); 25186@.Misplaced \&@> 25187@.Misplaced \\span@> 25188@.Misplaced \\cr@> 25189if cur_tok=tab_token+"&" then 25190 begin help6("I can't figure out why you would want to use a tab mark")@/ 25191 ("here. If you just want an ampersand, the remedy is")@/ 25192 ("simple: Just type `I\&' now. But if some right brace")@/ 25193 ("up above has ended a previous alignment prematurely,")@/ 25194 ("you're probably due for more error messages, and you")@/ 25195 ("might try typing `S' now just to see what is salvageable."); 25196 end 25197else begin help5("I can't figure out why you would want to use a tab mark")@/ 25198 ("or \cr or \span just now. If something like a right brace")@/ 25199 ("up above has ended a previous alignment prematurely,")@/ 25200 ("you're probably due for more error messages, and you")@/ 25201 ("might try typing `S' now just to see what is salvageable."); 25202 end; 25203error; 25204end 25205 25206@ The help messages here contain a little white lie, since \.{\\noalign} 25207and \.{\\omit} are allowed also after `\.{\\noalign\{...\}}'. 25208 25209@<Declare act...@>= 25210procedure no_align_error; 25211begin print_err("Misplaced "); print_esc("noalign"); 25212@.Misplaced \\noalign@> 25213help2("I expect to see \noalign only after the \cr of")@/ 25214 ("an alignment. Proceed, and I'll ignore this case."); error; 25215end; 25216procedure omit_error; 25217begin print_err("Misplaced "); print_esc("omit"); 25218@.Misplaced \\omit@> 25219help2("I expect to see \omit only after tab marks or the \cr of")@/ 25220 ("an alignment. Proceed, and I'll ignore this case."); error; 25221end; 25222 25223@ We've now covered most of the abuses of \.{\\halign} and \.{\\valign}. 25224Let's take a look at what happens when they are used correctly. 25225 25226@<Cases of |main_control| that build...@>= 25227vmode+halign:init_align; 25228hmode+valign:@<Cases of |main_control| for |hmode+valign|@>@; init_align; 25229mmode+halign: if privileged then 25230 if cur_group=math_shift_group then init_align 25231 else off_save; 25232vmode+endv,hmode+endv: do_endv; 25233 25234@ An |align_group| code is supposed to remain on the |save_stack| 25235during an entire alignment, until |fin_align| removes it. 25236 25237A devious user might force an |endv| command to occur just about anywhere; 25238we must defeat such hacks. 25239 25240@<Declare act...@>= 25241procedure do_endv; 25242begin base_ptr:=input_ptr; input_stack[base_ptr]:=cur_input; 25243while (input_stack[base_ptr].index_field<>v_template) and 25244 (input_stack[base_ptr].loc_field=null) and 25245 (input_stack[base_ptr].state_field=token_list) do decr(base_ptr); 25246if (input_stack[base_ptr].index_field<>v_template) or 25247 (input_stack[base_ptr].loc_field<>null) or 25248 (input_stack[base_ptr].state_field<>token_list) then 25249 fatal_error("(interwoven alignment preambles are not allowed)"); 25250@.interwoven alignment preambles...@> 25251 if cur_group=align_group then 25252 begin end_graf; 25253 if fin_col then fin_row; 25254 end 25255else off_save; 25256end; 25257 25258@ @<Cases of |handle_right_brace|...@>= 25259align_group: begin back_input; cur_tok:=cs_token_flag+frozen_cr; 25260 print_err("Missing "); print_esc("cr"); print(" inserted"); 25261@.Missing \\cr inserted@> 25262 help1("I'm guessing that you meant to end an alignment here."); 25263 ins_error; 25264 end; 25265 25266@ @<Cases of |handle_right_brace|...@>= 25267no_align_group: begin end_graf; unsave; align_peek; 25268 end; 25269 25270@ Finally, \.{\\endcsname} is not supposed to get through to |main_control|. 25271 25272@<Cases of |main_control| that build...@>= 25273any_mode(end_cs_name): cs_error; 25274 25275@ @<Declare act...@>= 25276procedure cs_error; 25277begin print_err("Extra "); print_esc("endcsname"); 25278@.Extra \\endcsname@> 25279help1("I'm ignoring this, since I wasn't doing a \csname."); 25280error; 25281end; 25282 25283@* \[48] Building math lists. 25284The routines that \TeX\ uses to create mlists are similar to those we have 25285just seen for the generation of hlists and vlists. But it is necessary to 25286make ``noads'' as well as nodes, so the reader should review the 25287discussion of math mode data structures before trying to make sense out of 25288the following program. 25289 25290Here is a little routine that needs to be done whenever a subformula 25291is about to be processed. The parameter is a code like |math_group|. 25292 25293@<Declare act...@>= 25294procedure push_math(@!c:group_code); 25295begin push_nest; mode:=-mmode; incompleat_noad:=null; new_save_level(c); 25296end; 25297 25298@ We get into math mode from horizontal mode when a `\.\$' (i.e., a 25299|math_shift| character) is scanned. We must check to see whether this 25300`\.\$' is immediately followed by another, in case display math mode is 25301called for. 25302 25303@<Cases of |main_control| that build...@>= 25304hmode+math_shift:init_math; 25305 25306@ @<Declare act...@>= 25307@t\4@>@<Declare subprocedures for |init_math|@>@; 25308procedure init_math; 25309label reswitch,found,not_found,done; 25310var w:scaled; {new or partial |pre_display_size|} 25311@!j:pointer; {prototype box for display} 25312@!x:integer; {new |pre_display_direction|} 25313@!l:scaled; {new |display_width|} 25314@!s:scaled; {new |display_indent|} 25315@!p:pointer; {current node when calculating |pre_display_size|} 25316@!q:pointer; {glue specification when calculating |pre_display_size|} 25317@!f:internal_font_number; {font in current |char_node|} 25318@!n:integer; {scope of paragraph shape specification} 25319@!v:scaled; {|w| plus possible glue amount} 25320@!d:scaled; {increment to |v|} 25321begin get_token; {|get_x_token| would fail on \.{\\ifmmode}\thinspace!} 25322if (cur_cmd=math_shift)and(mode>0) then @<Go into display math mode@> 25323else begin back_input; @<Go into ordinary math mode@>; 25324 end; 25325end; 25326 25327@ @<Go into ordinary math mode@>= 25328begin push_math(math_shift_group); eq_word_define(int_base+cur_fam_code,-1); 25329if every_math<>null then begin_token_list(every_math,every_math_text); 25330end 25331 25332@ We get into ordinary math mode from display math mode when `\.{\\eqno}' or 25333`\.{\\leqno}' appears. In such cases |cur_chr| will be 0 or~1, respectively; 25334the value of |cur_chr| is placed onto |save_stack| for safe keeping. 25335 25336@<Cases of |main_control| that build...@>= 25337mmode+eq_no: if privileged then 25338 if cur_group=math_shift_group then start_eq_no 25339 else off_save; 25340 25341@ @<Put each...@>= 25342primitive("eqno",eq_no,0); 25343@!@:eq_no_}{\.{\\eqno} primitive@> 25344primitive("leqno",eq_no,1); 25345@!@:leq_no_}{\.{\\leqno} primitive@> 25346 25347@ When \TeX\ is in display math mode, |cur_group=math_shift_group|, 25348so it is not necessary for the |start_eq_no| procedure to test for 25349this condition. 25350 25351@<Declare act...@>= 25352procedure start_eq_no; 25353begin saved(0):=cur_chr; incr(save_ptr); 25354@<Go into ordinary math mode@>; 25355end; 25356 25357@ @<Cases of |print_cmd_chr|...@>= 25358eq_no:if chr_code=1 then print_esc("leqno")@+else print_esc("eqno"); 25359 25360@ @<Forbidden...@>=non_math(eq_no), 25361 25362@ When we enter display math mode, we need to call |line_break| to 25363process the partial paragraph that has just been interrupted by the 25364display. Then we can set the proper values of |display_width| and 25365|display_indent| and |pre_display_size|. 25366 25367@<Go into display math mode@>= 25368begin j:=null; w:=-max_dimen; 25369if head=tail then {`\.{\\noindent\$\$}' or `\.{\$\${ }\$\$}'} 25370 @<Prepare for display after an empty paragraph@> 25371else begin line_break(true);@/ 25372 @<Calculate the natural width, |w|, by which the characters of the 25373 final line extend to the right of the reference point, 25374 plus two ems; or set |w:=max_dimen| if the non-blank information 25375 on that line is affected by stretching or shrinking@>; 25376 end; 25377{now we are in vertical mode, working on the list that will contain the display} 25378@<Calculate the length, |l|, and the shift amount, |s|, of the display lines@>; 25379push_math(math_shift_group); mode:=mmode; 25380eq_word_define(int_base+cur_fam_code,-1);@/ 25381eq_word_define(dimen_base+pre_display_size_code,w); 25382LR_box:=j; 25383if eTeX_ex then eq_word_define(int_base+pre_display_direction_code,x); 25384eq_word_define(dimen_base+display_width_code,l); 25385eq_word_define(dimen_base+display_indent_code,s); 25386if every_display<>null then begin_token_list(every_display,every_display_text); 25387if nest_ptr=1 then build_page; 25388end 25389 25390@ @<Calculate the natural width, |w|, by which...@>= 25391@<Prepare for display after a non-empty paragraph@>; 25392while p<>null do 25393 begin @<Let |d| be the natural width of node |p|; 25394 if the node is ``visible,'' |goto found|; 25395 if the node is glue that stretches or shrinks, set |v:=max_dimen|@>; 25396 if v<max_dimen then v:=v+d; 25397 goto not_found; 25398 found: if v<max_dimen then 25399 begin v:=v+d; w:=v; 25400 end 25401 else begin w:=max_dimen; goto done; 25402 end; 25403 not_found: p:=link(p); 25404 end; 25405done: 25406@<Finish the natural width computation@> 25407 25408@ @<Let |d| be the natural width of node |p|...@>= 25409reswitch: if is_char_node(p) then 25410 begin f:=font(p); d:=char_width(f)(char_info(f)(character(p))); 25411 goto found; 25412 end; 25413case type(p) of 25414hlist_node,vlist_node,rule_node: begin d:=width(p); goto found; 25415 end; 25416ligature_node:@<Make node |p| look like a |char_node|...@>; 25417kern_node: d:=width(p); 25418margin_kern_node: d:=width(p); 25419@t\4@>@<Cases of `Let |d| be the natural width' that need special treatment@>@; 25420glue_node:@<Let |d| be the natural width of this glue; if stretching 25421 or shrinking, set |v:=max_dimen|; |goto found| in the case of leaders@>; 25422whatsit_node: @<Let |d| be the width of the whatsit |p|, and |goto found| if ``visible''@>; 25423othercases d:=0 25424endcases 25425 25426@ We need to be careful that |w|, |v|, and |d| do not depend on any |glue_set| 25427values, since such values are subject to system-dependent rounding. 25428System-dependent numbers are not allowed to infiltrate parameters like 25429|pre_display_size|, since \TeX82 is supposed to make the same decisions on all 25430machines. 25431 25432@<Let |d| be the natural width of this glue...@>= 25433begin q:=glue_ptr(p); d:=width(q); 25434if glue_sign(just_box)=stretching then 25435 begin if (glue_order(just_box)=stretch_order(q))and@| 25436 (stretch(q)<>0) then 25437 v:=max_dimen; 25438 end 25439else if glue_sign(just_box)=shrinking then 25440 begin if (glue_order(just_box)=shrink_order(q))and@| 25441 (shrink(q)<>0) then 25442 v:=max_dimen; 25443 end; 25444if subtype(p)>=a_leaders then goto found; 25445end 25446 25447@ A displayed equation is considered to be three lines long, so we 25448calculate the length and offset of line number |prev_graf+2|. 25449 25450@<Calculate the length, |l|, ...@>= 25451if par_shape_ptr=null then 25452 if (hang_indent<>0)and@| 25453 (((hang_after>=0)and(prev_graf+2>hang_after))or@| 25454 (prev_graf+1<-hang_after)) then 25455 begin l:=hsize-abs(hang_indent); 25456 if hang_indent>0 then s:=hang_indent@+else s:=0; 25457 end 25458 else begin l:=hsize; s:=0; 25459 end 25460else begin n:=info(par_shape_ptr); 25461 if prev_graf+2>=n then p:=par_shape_ptr+2*n 25462 else p:=par_shape_ptr+2*(prev_graf+2); 25463 s:=mem[p-1].sc; l:=mem[p].sc; 25464 end 25465 25466@ Subformulas of math formulas cause a new level of math mode to be entered, 25467on the semantic nest as well as the save stack. These subformulas arise in 25468several ways: (1)~A left brace by itself indicates the beginning of a 25469subformula that will be put into a box, thereby freezing its glue and 25470preventing line breaks. (2)~A subscript or superscript is treated as a 25471subformula if it is not a single character; the same applies to 25472the nucleus of things like \.{\\underline}. (3)~The \.{\\left} primitive 25473initiates a subformula that will be terminated by a matching \.{\\right}. 25474The group codes placed on |save_stack| in these three cases are 25475|math_group|, |math_group|, and |math_left_group|, respectively. 25476 25477Here is the code that handles case (1); the other cases are not quite as 25478trivial, so we shall consider them later. 25479 25480@<Cases of |main_control| that build...@>= 25481mmode+left_brace: begin tail_append(new_noad); 25482 back_input; scan_math(nucleus(tail)); 25483 end; 25484 25485@ Recall that the |nucleus|, |subscr|, and |supscr| fields in a noad are 25486broken down into subfields called |math_type| and either |info| or 25487|(fam,character)|. The job of |scan_math| is to figure out what to place 25488in one of these principal fields; it looks at the subformula that 25489comes next in the input, and places an encoding of that subformula 25490into a given word of |mem|. 25491 25492@d fam_in_range==((cur_fam>=0)and(cur_fam<number_math_families)) 25493 25494@<Declare act...@>= 25495procedure scan_math(@!p:pointer); 25496label restart,reswitch,exit; 25497var c:integer; {math character code} 25498begin restart:@<Get the next non-blank non-relax...@>; 25499reswitch:case cur_cmd of 25500letter,other_char,char_given: begin c:=ho(math_code(cur_chr)); 25501 if is_active_math_char(c) then 25502 begin @<Treat |cur_chr| as an active character@>; 25503 goto restart; 25504 end; 25505 end; 25506char_num: begin scan_char_num; cur_chr:=cur_val; cur_cmd:=char_given; 25507 goto reswitch; 25508 end; 25509math_char_num: 25510 if cur_chr = 2 then begin {\.{\\Umathchar}} 25511 scan_math_class_int; c:=set_class_field(cur_val); 25512 scan_math_fam_int; c:=c + set_family_field(cur_val); 25513 scan_usv_num; c:=c + cur_val; 25514 end else if cur_chr = 1 then begin {\.{\\Umathcharnum}} 25515 scan_xetex_math_char_int; c:=cur_val; 25516 end else begin scan_fifteen_bit_int; 25517 c:=set_class_field(cur_val div @"1000) + 25518 set_family_field((cur_val mod @"1000) div @"100) + 25519 (cur_val mod @"100); 25520 end; 25521math_given: begin 25522 c:=set_class_field(cur_chr div @"1000) + 25523 set_family_field((cur_chr mod @"1000) div @"100) + 25524 (cur_chr mod @"100); 25525 end; 25526XeTeX_math_given: c:=cur_chr; 25527delim_num: begin 25528 if cur_chr=1 then begin {\.{\\Udelimiter <class> <fam> <usv>}} 25529 scan_math_class_int; c:=set_class_field(cur_val); 25530 scan_math_fam_int; c:=c + set_family_field(cur_val); 25531 scan_usv_num; c:=c + cur_val; 25532 end else begin {\.{\\delimiter <27-bit delcode>}} 25533 scan_delimiter_int; 25534 c:=cur_val div @'10000; {get the `small' delimiter field} 25535 c:=set_class_field(c div @"1000) + 25536 set_family_field((c mod @"1000) div @"100) + 25537 (c mod @"100); {and convert it to a \XeTeX\ mathchar code} 25538 end; 25539 end; 25540othercases @<Scan a subformula enclosed in braces and |return|@> 25541endcases;@/ 25542math_type(p):=math_char; character(p):=qi(c mod @"10000); 25543if (is_var_family(c)) and fam_in_range then plane_and_fam_field(p):=cur_fam 25544else plane_and_fam_field(p):=(math_fam_field(c)); 25545plane_and_fam_field(p):=plane_and_fam_field(p) + (math_char_field(c) div @"10000) * @"100; 25546exit:end; 25547 25548@ An active character that is an |outer_call| is allowed here. 25549 25550@<Treat |cur_chr|...@>= 25551begin cur_cs:=cur_chr+active_base; 25552cur_cmd:=eq_type(cur_cs); cur_chr:=equiv(cur_cs); 25553x_token; back_input; 25554end 25555 25556@ The pointer |p| is placed on |save_stack| while a complex subformula 25557is being scanned. 25558 25559@<Scan a subformula...@>= 25560begin back_input; scan_left_brace;@/ 25561saved(0):=p; incr(save_ptr); push_math(math_group); return; 25562end 25563 25564@ The simplest math formula is, of course, `\.{\${ }\$}', when no noads are 25565generated. The next simplest cases involve a single character, e.g., 25566`\.{\$x\$}'. Even though such cases may not seem to be very interesting, 25567the reader can perhaps understand how happy the author was when `\.{\$x\$}' 25568was first properly typeset by \TeX. The code in this section was used. 25569@^Knuth, Donald Ervin@> 25570 25571@<Cases of |main_control| that build...@>= 25572mmode+letter,mmode+other_char,mmode+char_given: 25573 set_math_char(ho(math_code(cur_chr))); 25574mmode+char_num: begin scan_char_num; cur_chr:=cur_val; 25575 set_math_char(ho(math_code(cur_chr))); 25576 end; 25577mmode+math_char_num: if cur_chr = 2 then begin {\.{\\Umathchar}} 25578 scan_math_class_int; t:=set_class_field(cur_val); 25579 scan_math_fam_int; t:=t + set_family_field(cur_val); 25580 scan_usv_num; t:=t + cur_val; 25581 set_math_char(t); 25582 end else if cur_chr = 1 then begin {\.{\\Umathcharnum}} 25583 scan_xetex_math_char_int; set_math_char(cur_val); 25584 end else begin scan_fifteen_bit_int; 25585 set_math_char(set_class_field(cur_val div @"1000) + 25586 set_family_field((cur_val mod @"1000) div @"100) + 25587 (cur_val mod @"100)); 25588 end; 25589mmode+math_given: begin 25590 set_math_char(set_class_field(cur_chr div @"1000) + 25591 set_family_field((cur_chr mod @"1000) div @"100) + 25592 (cur_chr mod @"100)); 25593 end; 25594mmode+XeTeX_math_given: set_math_char(cur_chr); 25595mmode+delim_num: begin 25596 if cur_chr=1 then begin {\.{\\Udelimiter}} 25597 scan_math_class_int; t:=set_class_field(cur_val); 25598 scan_math_fam_int; t:=t + set_family_field(cur_val); 25599 scan_usv_num; t:=t + cur_val; 25600 set_math_char(t); 25601 end else begin 25602 scan_delimiter_int; 25603 cur_val:=cur_val div @'10000; {discard the large delimiter code} 25604 set_math_char(set_class_field(cur_val div @"1000) + 25605 set_family_field((cur_val mod @"1000) div @"100) + 25606 (cur_val mod @"100)); 25607 end; 25608 end; 25609 25610@ The |set_math_char| procedure creates a new noad appropriate to a given 25611math code, and appends it to the current mlist. However, if the math code 25612is sufficiently large, the |cur_chr| is treated as an active character and 25613nothing is appended. 25614 25615@<Declare act...@>= 25616procedure set_math_char(@!c:integer); 25617var p:pointer; {the new noad} 25618 ch: UnicodeScalar; 25619begin if is_active_math_char(c) then 25620 @<Treat |cur_chr|...@> 25621else begin p:=new_noad; math_type(nucleus(p)):=math_char; 25622 ch:=math_char_field(c); 25623 character(nucleus(p)):=qi(ch mod @"10000); 25624 plane_and_fam_field(nucleus(p)):=math_fam_field(c); 25625 if is_var_family(c) then 25626 begin if fam_in_range then plane_and_fam_field(nucleus(p)):=cur_fam; 25627 type(p):=ord_noad; 25628 end 25629 else type(p):=ord_noad+math_class_field(c); 25630 plane_and_fam_field(nucleus(p)):=plane_and_fam_field(nucleus(p)) + (ch div @"10000) * @"100; 25631 link(tail):=p; tail:=p; 25632 end; 25633end; 25634 25635@ Primitive math operators like \.{\\mathop} and \.{\\underline} are given 25636the command code |math_comp|, supplemented by the noad type that they 25637generate. 25638 25639@<Put each...@>= 25640primitive("mathord",math_comp,ord_noad); 25641@!@:math_ord_}{\.{\\mathord} primitive@> 25642primitive("mathop",math_comp,op_noad); 25643@!@:math_op_}{\.{\\mathop} primitive@> 25644primitive("mathbin",math_comp,bin_noad); 25645@!@:math_bin_}{\.{\\mathbin} primitive@> 25646primitive("mathrel",math_comp,rel_noad); 25647@!@:math_rel_}{\.{\\mathrel} primitive@> 25648primitive("mathopen",math_comp,open_noad); 25649@!@:math_open_}{\.{\\mathopen} primitive@> 25650primitive("mathclose",math_comp,close_noad); 25651@!@:math_close_}{\.{\\mathclose} primitive@> 25652primitive("mathpunct",math_comp,punct_noad); 25653@!@:math_punct_}{\.{\\mathpunct} primitive@> 25654primitive("mathinner",math_comp,inner_noad); 25655@!@:math_inner_}{\.{\\mathinner} primitive@> 25656primitive("underline",math_comp,under_noad); 25657@!@:underline_}{\.{\\underline} primitive@> 25658primitive("overline",math_comp,over_noad);@/ 25659@!@:overline_}{\.{\\overline} primitive@> 25660primitive("displaylimits",limit_switch,normal); 25661@!@:display_limits_}{\.{\\displaylimits} primitive@> 25662primitive("limits",limit_switch,limits); 25663@!@:limits_}{\.{\\limits} primitive@> 25664primitive("nolimits",limit_switch,no_limits); 25665@!@:no_limits_}{\.{\\nolimits} primitive@> 25666 25667@ @<Cases of |print_cmd_chr|...@>= 25668math_comp: case chr_code of 25669 ord_noad: print_esc("mathord"); 25670 op_noad: print_esc("mathop"); 25671 bin_noad: print_esc("mathbin"); 25672 rel_noad: print_esc("mathrel"); 25673 open_noad: print_esc("mathopen"); 25674 close_noad: print_esc("mathclose"); 25675 punct_noad: print_esc("mathpunct"); 25676 inner_noad: print_esc("mathinner"); 25677 under_noad: print_esc("underline"); 25678 othercases print_esc("overline") 25679 endcases; 25680limit_switch: if chr_code=limits then print_esc("limits") 25681 else if chr_code=no_limits then print_esc("nolimits") 25682 else print_esc("displaylimits"); 25683 25684@ @<Cases of |main_control| that build...@>= 25685mmode+math_comp: begin tail_append(new_noad); 25686 type(tail):=cur_chr; scan_math(nucleus(tail)); 25687 end; 25688mmode+limit_switch: math_limit_switch; 25689 25690@ @<Declare act...@>= 25691procedure math_limit_switch; 25692label exit; 25693begin if head<>tail then if type(tail)=op_noad then 25694 begin subtype(tail):=cur_chr; return; 25695 end; 25696print_err("Limit controls must follow a math operator"); 25697@.Limit controls must follow...@> 25698help1("I'm ignoring this misplaced \limits or \nolimits command."); error; 25699exit:end; 25700 25701@ Delimiter fields of noads are filled in by the |scan_delimiter| routine. 25702The first parameter of this procedure is the |mem| address where the 25703delimiter is to be placed; the second tells if this delimiter follows 25704\.{\\radical} or not. 25705 25706@<Declare act...@>= 25707procedure scan_delimiter(@!p:pointer;@!r:boolean); 25708begin 25709 if r then begin 25710 if cur_chr=1 then begin {\.{\\Uradical}} 25711 cur_val1:=@"40000000; {extended delimiter code flag} 25712 scan_math_fam_int; cur_val1:=cur_val1 + cur_val * @"200000; 25713 scan_usv_num; cur_val:=cur_val1 + cur_val; 25714 end else {radical} 25715 scan_delimiter_int; 25716 end 25717else begin @<Get the next non-blank non-relax...@>; 25718 case cur_cmd of 25719 letter,other_char: begin 25720 cur_val:=del_code(cur_chr); 25721 end; 25722 delim_num: if cur_chr=1 then begin {\.{\\Udelimiter}} 25723 cur_val1:=@"40000000; {extended delimiter code flag} 25724 scan_math_class_int; {discarded} 25725 scan_math_fam_int; cur_val1:=cur_val1 + cur_val * @"200000; 25726 scan_usv_num; cur_val:=cur_val1 + cur_val; 25727 end else scan_delimiter_int; {normal delimiter} 25728 othercases begin cur_val:=-1; end; 25729 endcases; 25730 end; 25731if cur_val<0 then begin @<Report that an invalid delimiter code is being changed 25732 to null; set~|cur_val:=0|@>; 25733 end; 25734if cur_val>=@"40000000 then begin {extended delimiter code, only one size} 25735 small_plane_and_fam_field(p):=((cur_val mod @"200000) div @"10000) * @"100 {plane} 25736 + (cur_val div @"200000) mod @"100; {family} 25737 small_char_field(p):=qi(cur_val mod @"10000); 25738 large_plane_and_fam_field(p):=0; 25739 large_char_field(p):=0; 25740end else begin {standard delimiter code, 4-bit families and 8-bit char codes} 25741 small_plane_and_fam_field(p):=(cur_val div @'4000000) mod 16; 25742 small_char_field(p):=qi((cur_val div @'10000) mod 256); 25743 large_plane_and_fam_field(p):=(cur_val div 256) mod 16; 25744 large_char_field(p):=qi(cur_val mod 256); 25745end; 25746end; 25747 25748@ @<Report that an invalid delimiter...@>= 25749begin print_err("Missing delimiter (. inserted)"); 25750@.Missing delimiter...@> 25751help6("I was expecting to see something like `(' or `\{' or")@/ 25752 ("`\}' here. If you typed, e.g., `{' instead of `\{', you")@/ 25753 ("should probably delete the `{' by typing `1' now, so that")@/ 25754 ("braces don't get unbalanced. Otherwise just proceed.")@/ 25755 ("Acceptable delimiters are characters whose \delcode is")@/ 25756 ("nonnegative, or you can use `\delimiter <delimiter code>'."); 25757back_error; cur_val:=0; 25758end 25759 25760@ @<Cases of |main_control| that build...@>= 25761mmode+radical:math_radical; 25762 25763@ @<Declare act...@>= 25764procedure math_radical; 25765begin tail_append(get_node(radical_noad_size)); 25766type(tail):=radical_noad; subtype(tail):=normal; 25767mem[nucleus(tail)].hh:=empty_field; 25768mem[subscr(tail)].hh:=empty_field; 25769mem[supscr(tail)].hh:=empty_field; 25770scan_delimiter(left_delimiter(tail),true); scan_math(nucleus(tail)); 25771end; 25772 25773@ @<Cases of |main_control| that build...@>= 25774mmode+accent,mmode+math_accent:math_ac; 25775 25776@ @<Declare act...@>= 25777procedure math_ac; 25778var c: integer; 25779begin if cur_cmd=accent then 25780 @<Complain that the user should have said \.{\\mathaccent}@>; 25781tail_append(get_node(accent_noad_size)); 25782type(tail):=accent_noad; subtype(tail):=normal; 25783mem[nucleus(tail)].hh:=empty_field; 25784mem[subscr(tail)].hh:=empty_field; 25785mem[supscr(tail)].hh:=empty_field; 25786math_type(accent_chr(tail)):=math_char; 25787if cur_chr=1 then begin 25788 if scan_keyword("fixed") then 25789 subtype(tail):=fixed_acc 25790 else if scan_keyword("bottom") then begin 25791 if scan_keyword("fixed") then 25792 subtype(tail):=bottom_acc+fixed_acc 25793 else 25794 subtype(tail):=bottom_acc; 25795 end; 25796 scan_math_class_int; c:=set_class_field(cur_val); 25797 scan_math_fam_int; c:=c + set_family_field(cur_val); 25798 scan_usv_num; cur_val:=cur_val + c; 25799end 25800else begin 25801 scan_fifteen_bit_int; 25802 cur_val:=set_class_field(cur_val div @"1000) + 25803 set_family_field((cur_val mod @"1000) div @"100) + 25804 (cur_val mod @"100); 25805end; 25806character(accent_chr(tail)):=qi(cur_val mod @"10000); 25807if (is_var_family(cur_val))and fam_in_range then plane_and_fam_field(accent_chr(tail)):=cur_fam 25808else plane_and_fam_field(accent_chr(tail)):=math_fam_field(cur_val); 25809plane_and_fam_field(accent_chr(tail)) 25810 :=plane_and_fam_field(accent_chr(tail)) + (math_char_field(cur_val) div @"10000) * @"100; 25811scan_math(nucleus(tail)); 25812end; 25813 25814@ @<Complain that the user should have said \.{\\mathaccent}@>= 25815begin print_err("Please use "); print_esc("mathaccent"); 25816print(" for accents in math mode"); 25817@.Please use \\mathaccent...@> 25818help2("I'm changing \accent to \mathaccent here; wish me luck.")@/ 25819 ("(Accents are not the same in formulas as they are in text.)"); 25820error; 25821end 25822 25823@ @<Cases of |main_control| that build...@>= 25824mmode+vcenter: begin scan_spec(vcenter_group,false); normal_paragraph; 25825 push_nest; mode:=-vmode; prev_depth:=ignore_depth; 25826 if every_vbox<>null then begin_token_list(every_vbox,every_vbox_text); 25827 end; 25828 25829@ @<Cases of |handle...@>= 25830vcenter_group: begin end_graf; unsave; save_ptr:=save_ptr-2; 25831 p:=vpack(link(head),saved(1),saved(0)); pop_nest; 25832 tail_append(new_noad); type(tail):=vcenter_noad; 25833 math_type(nucleus(tail)):=sub_box; info(nucleus(tail)):=p; 25834 end; 25835 25836@ The routine that inserts a |style_node| holds no surprises. 25837 25838@<Put each...@>= 25839primitive("displaystyle",math_style,display_style); 25840@!@:display_style_}{\.{\\displaystyle} primitive@> 25841primitive("textstyle",math_style,text_style); 25842@!@:text_style_}{\.{\\textstyle} primitive@> 25843primitive("scriptstyle",math_style,script_style); 25844@!@:script_style_}{\.{\\scriptstyle} primitive@> 25845primitive("scriptscriptstyle",math_style,script_script_style); 25846@!@:script_script_style_}{\.{\\scriptscriptstyle} primitive@> 25847 25848@ @<Cases of |print_cmd_chr|...@>= 25849math_style: print_style(chr_code); 25850 25851@ @<Cases of |main_control| that build...@>= 25852mmode+math_style: tail_append(new_style(cur_chr)); 25853mmode+non_script: begin tail_append(new_glue(zero_glue)); 25854 subtype(tail):=cond_math_glue; 25855 end; 25856mmode+math_choice: append_choices; 25857 25858@ The routine that scans the four mlists of a \.{\\mathchoice} is very 25859much like the routine that builds discretionary nodes. 25860 25861@<Declare act...@>= 25862procedure append_choices; 25863begin tail_append(new_choice); incr(save_ptr); saved(-1):=0; 25864push_math(math_choice_group); scan_left_brace; 25865end; 25866 25867@ @<Cases of |handle_right_brace|...@>= 25868math_choice_group: build_choices; 25869 25870@ @<Declare act...@>= 25871@t\4@>@<Declare the function called |fin_mlist|@>@t@>@;@/ 25872procedure build_choices; 25873label exit; 25874var p:pointer; {the current mlist} 25875begin unsave; p:=fin_mlist(null); 25876case saved(-1) of 258770:display_mlist(tail):=p; 258781:text_mlist(tail):=p; 258792:script_mlist(tail):=p; 258803:begin script_script_mlist(tail):=p; decr(save_ptr); return; 25881 end; 25882end; {there are no other cases} 25883incr(saved(-1)); push_math(math_choice_group); scan_left_brace; 25884exit:end; 25885 25886@ Subscripts and superscripts are attached to the previous nucleus by the 25887@^superscripts@>@^subscripts@> 25888action procedure called |sub_sup|. We use the facts that |sub_mark=sup_mark+1| 25889and |subscr(p)=supscr(p)+1|. 25890 25891@<Cases of |main_control| that build...@>= 25892mmode+sub_mark,mmode+sup_mark: sub_sup; 25893 25894@ @<Declare act...@>= 25895procedure sub_sup; 25896var t:small_number; {type of previous sub/superscript} 25897@!p:pointer; {field to be filled by |scan_math|} 25898begin t:=empty; p:=null; 25899if tail<>head then if scripts_allowed(tail) then 25900 begin p:=supscr(tail)+cur_cmd-sup_mark; {|supscr| or |subscr|} 25901 t:=math_type(p); 25902 end; 25903if (p=null)or(t<>empty) then @<Insert a dummy noad to be sub/superscripted@>; 25904scan_math(p); 25905end; 25906 25907@ @<Insert a dummy...@>= 25908begin tail_append(new_noad); 25909p:=supscr(tail)+cur_cmd-sup_mark; {|supscr| or |subscr|} 25910if t<>empty then 25911 begin if cur_cmd=sup_mark then 25912 begin print_err("Double superscript"); 25913@.Double superscript@> 25914 help1("I treat `x^1^2' essentially like `x^1{}^2'."); 25915 end 25916 else begin print_err("Double subscript"); 25917@.Double subscript@> 25918 help1("I treat `x_1_2' essentially like `x_1{}_2'."); 25919 end; 25920 error; 25921 end; 25922end 25923 25924@ An operation like `\.{\\over}' causes the current mlist to go into a 25925state of suspended animation: |incompleat_noad| points to a |fraction_noad| 25926that contains the mlist-so-far as its numerator, while the denominator 25927is yet to come. Finally when the mlist is finished, the denominator will 25928go into the incompleat fraction noad, and that noad will become the 25929whole formula, unless it is surrounded by `\.{\\left}' and `\.{\\right}' 25930delimiters. 25931 25932@d above_code=0 { `\.{\\above}' } 25933@d over_code=1 { `\.{\\over}' } 25934@d atop_code=2 { `\.{\\atop}' } 25935@d delimited_code=3 { `\.{\\abovewithdelims}', etc.} 25936 25937@<Put each...@>= 25938primitive("above",above,above_code);@/ 25939@!@:above_}{\.{\\above} primitive@> 25940primitive("over",above,over_code);@/ 25941@!@:over_}{\.{\\over} primitive@> 25942primitive("atop",above,atop_code);@/ 25943@!@:atop_}{\.{\\atop} primitive@> 25944primitive("abovewithdelims",above,delimited_code+above_code);@/ 25945@!@:above_with_delims_}{\.{\\abovewithdelims} primitive@> 25946primitive("overwithdelims",above,delimited_code+over_code);@/ 25947@!@:over_with_delims_}{\.{\\overwithdelims} primitive@> 25948primitive("atopwithdelims",above,delimited_code+atop_code); 25949@!@:atop_with_delims_}{\.{\\atopwithdelims} primitive@> 25950 25951@ @<Cases of |print_cmd_chr|...@>= 25952above: case chr_code of 25953 over_code:print_esc("over"); 25954 atop_code:print_esc("atop"); 25955 delimited_code+above_code:print_esc("abovewithdelims"); 25956 delimited_code+over_code:print_esc("overwithdelims"); 25957 delimited_code+atop_code:print_esc("atopwithdelims"); 25958 othercases print_esc("above") 25959 endcases; 25960 25961@ @<Cases of |main_control| that build...@>= 25962mmode+above: math_fraction; 25963 25964@ @<Declare act...@>= 25965procedure math_fraction; 25966var c:small_number; {the type of generalized fraction we are scanning} 25967begin c:=cur_chr; 25968if incompleat_noad<>null then 25969 @<Ignore the fraction operation and complain about this ambiguous case@> 25970else begin incompleat_noad:=get_node(fraction_noad_size); 25971 type(incompleat_noad):=fraction_noad; 25972 subtype(incompleat_noad):=normal; 25973 math_type(numerator(incompleat_noad)):=sub_mlist; 25974 info(numerator(incompleat_noad)):=link(head); 25975 mem[denominator(incompleat_noad)].hh:=empty_field; 25976 mem[left_delimiter(incompleat_noad)].qqqq:=null_delimiter; 25977 mem[right_delimiter(incompleat_noad)].qqqq:=null_delimiter;@/ 25978 link(head):=null; tail:=head; 25979 @<Use code |c| to distinguish between generalized fractions@>; 25980 end; 25981end; 25982 25983@ @<Use code |c|...@>= 25984if c>=delimited_code then 25985 begin scan_delimiter(left_delimiter(incompleat_noad),false); 25986 scan_delimiter(right_delimiter(incompleat_noad),false); 25987 end; 25988case c mod delimited_code of 25989above_code: begin scan_normal_dimen; 25990 thickness(incompleat_noad):=cur_val; 25991 end; 25992over_code: thickness(incompleat_noad):=default_code; 25993atop_code: thickness(incompleat_noad):=0; 25994end {there are no other cases} 25995 25996@ @<Ignore the fraction...@>= 25997begin if c>=delimited_code then 25998 begin scan_delimiter(garbage,false); scan_delimiter(garbage,false); 25999 end; 26000if c mod delimited_code=above_code then scan_normal_dimen; 26001print_err("Ambiguous; you need another { and }"); 26002@.Ambiguous...@> 26003help3("I'm ignoring this fraction specification, since I don't")@/ 26004 ("know whether a construction like `x \over y \over z'")@/ 26005 ("means `{x \over y} \over z' or `x \over {y \over z}'."); 26006error; 26007end 26008 26009@ At the end of a math formula or subformula, the |fin_mlist| routine is 26010called upon to return a pointer to the newly completed mlist, and to 26011pop the nest back to the enclosing semantic level. The parameter to 26012|fin_mlist|, if not null, points to a |right_noad| that ends the 26013current mlist; this |right_noad| has not yet been appended. 26014 26015@<Declare the function called |fin_mlist|@>= 26016function fin_mlist(@!p:pointer):pointer; 26017var q:pointer; {the mlist to return} 26018begin if incompleat_noad<>null then @<Compleat the incompleat noad@> 26019else begin link(tail):=p; q:=link(head); 26020 end; 26021pop_nest; fin_mlist:=q; 26022end; 26023 26024@ @<Compleat...@>= 26025begin math_type(denominator(incompleat_noad)):=sub_mlist; 26026info(denominator(incompleat_noad)):=link(head); 26027if p=null then q:=incompleat_noad 26028else begin q:=info(numerator(incompleat_noad)); 26029 if (type(q)<>left_noad)or(delim_ptr=null) then confusion("right"); 26030@:this can't happen right}{\quad right@> 26031 info(numerator(incompleat_noad)):=link(delim_ptr); 26032 link(delim_ptr):=incompleat_noad; link(incompleat_noad):=p; 26033 end; 26034end 26035 26036@ Now at last we're ready to see what happens when a right brace occurs 26037in a math formula. Two special cases are simplified here: Braces are effectively 26038removed when they surround a single Ord without sub/superscripts, or when they 26039surround an accent that is the nucleus of an Ord atom. 26040 26041@<Cases of |handle...@>= 26042math_group: begin unsave; decr(save_ptr);@/ 26043 math_type(saved(0)):=sub_mlist; p:=fin_mlist(null); info(saved(0)):=p; 26044 if p<>null then if link(p)=null then 26045 if type(p)=ord_noad then 26046 begin if math_type(subscr(p))=empty then 26047 if math_type(supscr(p))=empty then 26048 begin mem[saved(0)].hh:=mem[nucleus(p)].hh; 26049 free_node(p,noad_size); 26050 end; 26051 end 26052 else if type(p)=accent_noad then if saved(0)=nucleus(tail) then 26053 if type(tail)=ord_noad then @<Replace the tail of the list by |p|@>; 26054 end; 26055 26056@ @<Replace the tail...@>= 26057begin q:=head; while link(q)<>tail do q:=link(q); 26058link(q):=p; free_node(tail,noad_size); tail:=p; 26059end 26060 26061@ We have dealt with all constructions of math mode except `\.{\\left}' and 26062`\.{\\right}', so the picture is completed by the following sections of 26063the program. 26064 26065@<Put each...@>= 26066primitive("left",left_right,left_noad); 26067@!@:left_}{\.{\\left} primitive@> 26068primitive("right",left_right,right_noad); 26069@!@:right_}{\.{\\right} primitive@> 26070text(frozen_right):="right"; eqtb[frozen_right]:=eqtb[cur_val]; 26071 26072@ @<Cases of |print_cmd_chr|...@>= 26073left_right: if chr_code=left_noad then print_esc("left") 26074@/@<Cases of |left_right| for |print_cmd_chr|@>@/ 26075else print_esc("right"); 26076 26077@ @<Cases of |main_control| that build...@>= 26078mmode+left_right: math_left_right; 26079 26080@ @<Declare act...@>= 26081procedure math_left_right; 26082var t:small_number; {|left_noad| or |right_noad|} 26083@!p:pointer; {new noad} 26084@!q:pointer; {resulting mlist} 26085begin t:=cur_chr; 26086if (t<>left_noad)and(cur_group<>math_left_group) then 26087 @<Try to recover from mismatched \.{\\right}@> 26088else begin p:=new_noad; type(p):=t; 26089 scan_delimiter(delimiter(p),false); 26090 if t=middle_noad then 26091 begin type(p):=right_noad; subtype(p):=middle_noad; 26092 end; 26093 if t=left_noad then q:=p 26094 else begin q:=fin_mlist(p); unsave; {end of |math_left_group|} 26095 end; 26096 if t<>right_noad then 26097 begin push_math(math_left_group); link(head):=q; tail:=p; 26098 delim_ptr:=p; 26099 end 26100 else begin 26101 tail_append(new_noad); type(tail):=inner_noad; 26102 math_type(nucleus(tail)):=sub_mlist; 26103 info(nucleus(tail)):=q; 26104 end; 26105 end; 26106end; 26107 26108@ @<Try to recover from mismatch...@>= 26109begin if cur_group=math_shift_group then 26110 begin scan_delimiter(garbage,false); 26111 print_err("Extra "); 26112 if t=middle_noad then 26113 begin print_esc("middle"); 26114@.Extra \\middle.@> 26115 help1("I'm ignoring a \middle that had no matching \left."); 26116 end 26117 else begin print_esc("right"); 26118@.Extra \\right.@> 26119 help1("I'm ignoring a \right that had no matching \left."); 26120 end; 26121 error; 26122 end 26123else off_save; 26124end 26125 26126@ Here is the only way out of math mode. 26127 26128@<Cases of |main_control| that build...@>= 26129mmode+math_shift: if cur_group=math_shift_group then after_math 26130 else off_save; 26131 26132@ @<Declare act...@>= 26133@t\4@>@<Declare subprocedures for |after_math|@>@; 26134procedure after_math; 26135var l:boolean; {`\.{\\leqno}' instead of `\.{\\eqno}'} 26136@!danger:boolean; {not enough symbol fonts are present} 26137@!m:integer; {|mmode| or |-mmode|} 26138@!p:pointer; {the formula} 26139@!a:pointer; {box containing equation number} 26140@<Local variables for finishing a displayed formula@>@; 26141begin danger:=false; 26142@<Retrieve the prototype box@>; 26143@<Check that the necessary fonts for math symbols are present; 26144 if not, flush the current math lists and set |danger:=true|@>; 26145m:=mode; l:=false; p:=fin_mlist(null); {this pops the nest} 26146if mode=-m then {end of equation number} 26147 begin @<Check that another \.\$ follows@>; 26148 cur_mlist:=p; cur_style:=text_style; mlist_penalties:=false; 26149 mlist_to_hlist; a:=hpack(link(temp_head),natural); 26150 set_box_lr(a)(dlist); 26151 unsave; decr(save_ptr); {now |cur_group=math_shift_group|} 26152 if saved(0)=1 then l:=true; 26153 danger:=false; 26154 @<Retrieve the prototype box@>; 26155 @<Check that the necessary fonts for math symbols are present; 26156 if not, flush the current math lists and set |danger:=true|@>; 26157 m:=mode; p:=fin_mlist(null); 26158 end 26159else a:=null; 26160if m<0 then @<Finish math in text@> 26161else begin if a=null then @<Check that another \.\$ follows@>; 26162 @<Finish displayed math@>; 26163 end; 26164end; 26165 26166@ @<Check that the necessary fonts...@>= 26167if ((font_params[fam_fnt(2+text_size)]<total_mathsy_params) 26168 and (not is_new_mathfont(fam_fnt(2+text_size)))) or@| 26169 ((font_params[fam_fnt(2+script_size)]<total_mathsy_params) 26170 and (not is_new_mathfont(fam_fnt(2+script_size)))) or@| 26171 ((font_params[fam_fnt(2+script_script_size)]<total_mathsy_params) 26172 and (not is_new_mathfont(fam_fnt(2+script_script_size)))) then 26173 begin print_err("Math formula deleted: Insufficient symbol fonts");@/ 26174@.Math formula deleted...@> 26175 help3("Sorry, but I can't typeset math unless \textfont 2")@/ 26176 ("and \scriptfont 2 and \scriptscriptfont 2 have all")@/ 26177 ("the \fontdimen values needed in math symbol fonts."); 26178 error; flush_math; danger:=true; 26179 end 26180else if ((font_params[fam_fnt(3+text_size)]<total_mathex_params) 26181 and (not is_new_mathfont(fam_fnt(3+text_size)))) or@| 26182 ((font_params[fam_fnt(3+script_size)]<total_mathex_params) 26183 and (not is_new_mathfont(fam_fnt(3+script_size)))) or@| 26184 ((font_params[fam_fnt(3+script_script_size)]<total_mathex_params) 26185 and (not is_new_mathfont(fam_fnt(3+script_script_size)))) then 26186 begin print_err("Math formula deleted: Insufficient extension fonts");@/ 26187 help3("Sorry, but I can't typeset math unless \textfont 3")@/ 26188 ("and \scriptfont 3 and \scriptscriptfont 3 have all")@/ 26189 ("the \fontdimen values needed in math extension fonts."); 26190 error; flush_math; danger:=true; 26191 end 26192 26193@ The |unsave| is done after everything else here; hence an appearance of 26194`\.{\\mathsurround}' inside of `\.{\$...\$}' affects the spacing at these 26195particular \.\$'s. This is consistent with the conventions of 26196`\.{\$\$...\$\$}', since `\.{\\abovedisplayskip}' inside a display affects the 26197space above that display. 26198 26199@<Finish math in text@>= 26200begin tail_append(new_math(math_surround,before)); 26201cur_mlist:=p; cur_style:=text_style; mlist_penalties:=(mode>0); mlist_to_hlist; 26202link(tail):=link(temp_head); 26203while link(tail)<>null do tail:=link(tail); 26204tail_append(new_math(math_surround,after)); 26205space_factor:=1000; unsave; 26206end 26207 26208@ \TeX\ gets to the following part of the program when the first `\.\$' ending 26209a display has been scanned. 26210 26211@<Check that another \.\$ follows@>= 26212begin get_x_token; 26213if cur_cmd<>math_shift then 26214 begin print_err("Display math should end with $$"); 26215@.Display math...with \$\$@> 26216 help2("The `$' that I just saw supposedly matches a previous `$$'.")@/ 26217 ("So I shall assume that you typed `$$' both times."); 26218 back_error; 26219 end; 26220end 26221 26222@ We have saved the worst for last: The fussiest part of math mode processing 26223occurs when a displayed formula is being centered and placed with an optional 26224equation number. 26225 26226@<Local variables for finishing...@>= 26227@!b:pointer; {box containing the equation} 26228@!w:scaled; {width of the equation} 26229@!z:scaled; {width of the line} 26230@!e:scaled; {width of equation number} 26231@!q:scaled; {width of equation number plus space to separate from equation} 26232@!d:scaled; {displacement of equation in the line} 26233@!s:scaled; {move the line right this much} 26234@!g1,@!g2:small_number; {glue parameter codes for before and after} 26235@!r:pointer; {kern node used to position the display} 26236@!t:pointer; {tail of adjustment list} 26237@!pre_t:pointer; {tail of pre-adjustment list} 26238 26239@ At this time |p| points to the mlist for the formula; |a| is either 26240|null| or it points to a box containing the equation number; and we are in 26241vertical mode (or internal vertical mode). 26242 26243@<Finish displayed math@>= 26244cur_mlist:=p; cur_style:=display_style; mlist_penalties:=false; 26245mlist_to_hlist; p:=link(temp_head);@/ 26246adjust_tail:=adjust_head; pre_adjust_tail:=pre_adjust_head; 26247b:=hpack(p,natural); p:=list_ptr(b); 26248t:=adjust_tail; adjust_tail:=null;@/ 26249pre_t:=pre_adjust_tail; pre_adjust_tail:=null;@/ 26250w:=width(b); z:=display_width; s:=display_indent; 26251if pre_display_direction<0 then s:=-s-z; 26252if (a=null)or danger then 26253 begin e:=0; q:=0; 26254 end 26255else begin e:=width(a); q:=e+math_quad(text_size); 26256 end; 26257if w+q>z then 26258 @<Squeeze the equation as much as possible; if there is an equation 26259 number that should go on a separate line by itself, 26260 set~|e:=0|@>; 26261@<Determine the displacement, |d|, of the left edge of the equation, with 26262 respect to the line size |z|, assuming that |l=false|@>; 26263@<Append the glue or equation number preceding the display@>; 26264@<Append the display and perhaps also the equation number@>; 26265@<Append the glue or equation number following the display@>; 26266@<Flush the prototype box@>; 26267resume_after_display 26268 26269@ @<Declare act...@>= 26270procedure resume_after_display; 26271begin if cur_group<>math_shift_group then confusion("display"); 26272@:this can't happen display}{\quad display@> 26273unsave; prev_graf:=prev_graf+3; 26274push_nest; mode:=hmode; space_factor:=1000; set_cur_lang; clang:=cur_lang; 26275prev_graf:=(norm_min(left_hyphen_min)*@'100+norm_min(right_hyphen_min)) 26276 *@'200000+cur_lang; 26277@<Scan an optional space@>; 26278if nest_ptr=1 then build_page; 26279end; 26280 26281@ The user can force the equation number to go on a separate line 26282by causing its width to be zero. 26283 26284@<Squeeze the equation as much as possible...@>= 26285begin if (e<>0)and((w-total_shrink[normal]+q<=z)or@| 26286 (total_shrink[fil]<>0)or(total_shrink[fill]<>0)or 26287 (total_shrink[filll]<>0)) then 26288 begin free_node(b,box_node_size); 26289 b:=hpack(p,z-q,exactly); 26290 end 26291else begin e:=0; 26292 if w>z then 26293 begin free_node(b,box_node_size); 26294 b:=hpack(p,z,exactly); 26295 end; 26296 end; 26297w:=width(b); 26298end 26299 26300@ We try first to center the display without regard to the existence of 26301the equation number. If that would make it too close (where ``too close'' 26302means that the space between display and equation number is less than the 26303width of the equation number), we either center it in the remaining space 26304or move it as far from the equation number as possible. The latter alternative 26305is taken only if the display begins with glue, since we assume that the 26306user put glue there to control the spacing precisely. 26307 26308@<Determine the displacement, |d|, of the left edge of the equation...@>= 26309set_box_lr(b)(dlist); 26310d:=half(z-w); 26311if (e>0)and(d<2*e) then {too close} 26312 begin d:=half(z-w-e); 26313 if p<>null then if not is_char_node(p) then if type(p)=glue_node then d:=0; 26314 end 26315 26316@ If the equation number is set on a line by itself, either before or 26317after the formula, we append an infinite penalty so that no page break will 26318separate the display from its number; and we use the same size and 26319displacement for all three potential lines of the display, even though 26320`\.{\\parshape}' may specify them differently. 26321 26322@<Append the glue or equation number preceding the display@>= 26323tail_append(new_penalty(pre_display_penalty));@/ 26324if (d+s<=pre_display_size)or l then {not enough clearance} 26325 begin g1:=above_display_skip_code; g2:=below_display_skip_code; 26326 end 26327else begin g1:=above_display_short_skip_code; 26328 g2:=below_display_short_skip_code; 26329 end; 26330if l and(e=0) then {it follows that |type(a)=hlist_node|} 26331 begin app_display(j,a,0); 26332 tail_append(new_penalty(inf_penalty)); 26333 end 26334else tail_append(new_param_glue(g1)) 26335 26336@ @<Append the display and perhaps also the equation number@>= 26337if e<>0 then 26338 begin r:=new_kern(z-w-e-d); 26339 if l then 26340 begin link(a):=r; link(r):=b; b:=a; d:=0; 26341 end 26342 else begin link(b):=r; link(r):=a; 26343 end; 26344 b:=hpack(b,natural); 26345 end; 26346app_display(j,b,d) 26347 26348@ @<Append the glue or equation number following the display@>= 26349if (a<>null)and(e=0)and not l then 26350 begin tail_append(new_penalty(inf_penalty)); 26351 app_display(j,a,z-width(a)); 26352 g2:=0; 26353 end; 26354if t<>adjust_head then {migrating material comes after equation number} 26355 begin link(tail):=link(adjust_head); tail:=t; 26356 end; 26357if pre_t<>pre_adjust_head then 26358 begin link(tail):=link(pre_adjust_head); tail:=pre_t; 26359 end; 26360tail_append(new_penalty(post_display_penalty)); 26361if g2>0 then tail_append(new_param_glue(g2)) 26362 26363@ When \.{\\halign} appears in a display, the alignment routines operate 26364essentially as they do in vertical mode. Then the following program is 26365activated, with |p| and |q| pointing to the beginning and end of the 26366resulting list, and with |aux_save| holding the |prev_depth| value. 26367 26368@<Finish an alignment in a display@>= 26369begin do_assignments; 26370if cur_cmd<>math_shift then @<Pontificate about improper alignment in display@> 26371else @<Check that another \.\$ follows@>; 26372flush_node_list(LR_box); 26373pop_nest; 26374tail_append(new_penalty(pre_display_penalty)); 26375tail_append(new_param_glue(above_display_skip_code)); 26376link(tail):=p; 26377if p<>null then tail:=q; 26378tail_append(new_penalty(post_display_penalty)); 26379tail_append(new_param_glue(below_display_skip_code)); 26380prev_depth:=aux_save.sc; resume_after_display; 26381end 26382 26383@ @<Pontificate...@>= 26384begin print_err("Missing $$ inserted"); 26385@.Missing {\$\$} inserted@> 26386help2("Displays can use special alignments (like \eqalignno)")@/ 26387 ("only if nothing but the alignment itself is between $$'s."); 26388back_error; 26389end 26390 26391@* \[49] Mode-independent processing. 26392The long |main_control| procedure has now been fully specified, except for 26393certain activities that are independent of the current mode. These activities 26394do not change the current vlist or hlist or mlist; if they change anything, 26395it is the value of a parameter or the meaning of a control sequence. 26396 26397Assignments to values in |eqtb| can be global or local. Furthermore, a 26398control sequence can be defined to be `\.{\\long}', `\.{\\protected}', 26399or `\.{\\outer}', and it might or might not be expanded. The prefixes 26400`\.{\\global}', `\.{\\long}', `\.{\\protected}', 26401and `\.{\\outer}' can occur in any order. Therefore we assign binary numeric 26402codes, making it possible to accumulate the union of all specified prefixes 26403by adding the corresponding codes. (\PASCAL's |set| operations could also 26404have been used.) 26405 26406@<Put each...@>= 26407primitive("long",prefix,1); 26408@!@:long_}{\.{\\long} primitive@> 26409primitive("outer",prefix,2); 26410@!@:outer_}{\.{\\outer} primitive@> 26411primitive("global",prefix,4); 26412@!@:global_}{\.{\\global} primitive@> 26413primitive("def",def,0); 26414@!@:def_}{\.{\\def} primitive@> 26415primitive("gdef",def,1); 26416@!@:gdef_}{\.{\\gdef} primitive@> 26417primitive("edef",def,2); 26418@!@:edef_}{\.{\\edef} primitive@> 26419primitive("xdef",def,3); 26420@!@:xdef_}{\.{\\xdef} primitive@> 26421 26422@ @<Cases of |print_cmd_chr|...@>= 26423prefix: if chr_code=1 then print_esc("long") 26424 else if chr_code=2 then print_esc("outer") 26425 @/@<Cases of |prefix| for |print_cmd_chr|@>@/ 26426 else print_esc("global"); 26427def: if chr_code=0 then print_esc("def") 26428 else if chr_code=1 then print_esc("gdef") 26429 else if chr_code=2 then print_esc("edef") 26430 else print_esc("xdef"); 26431 26432@ Every prefix, and every command code that might or might not be prefixed, 26433calls the action procedure |prefixed_command|. This routine accumulates 26434a sequence of prefixes until coming to a non-prefix, then it carries out 26435the command. 26436 26437@<Cases of |main_control| that don't...@>= 26438any_mode(toks_register), 26439any_mode(assign_toks), 26440any_mode(assign_int), 26441any_mode(assign_dimen), 26442any_mode(assign_glue), 26443any_mode(assign_mu_glue), 26444any_mode(assign_font_dimen), 26445any_mode(assign_font_int), 26446any_mode(set_aux), 26447any_mode(set_prev_graf), 26448any_mode(set_page_dimen), 26449any_mode(set_page_int), 26450any_mode(set_box_dimen), 26451any_mode(set_shape), 26452any_mode(def_code), 26453any_mode(XeTeX_def_code), 26454any_mode(def_family), 26455any_mode(set_font), 26456any_mode(def_font), 26457any_mode(register), 26458any_mode(advance), 26459any_mode(multiply), 26460any_mode(divide), 26461any_mode(prefix), 26462any_mode(let), 26463any_mode(shorthand_def), 26464any_mode(read_to_cs), 26465any_mode(def), 26466any_mode(set_box), 26467any_mode(hyph_data), 26468any_mode(set_interaction):prefixed_command; 26469 26470@ If the user says, e.g., `\.{\\global\\global}', the redundancy is 26471silently accepted. 26472 26473@<Declare act...@>= 26474@t\4@>@<Declare subprocedures for |prefixed_command|@>@t@>@;@/ 26475procedure prefixed_command; 26476label done,exit; 26477var a:small_number; {accumulated prefix codes so far} 26478@!f:internal_font_number; {identifies a font} 26479@!j:halfword; {index into a \.{\\parshape} specification} 26480@!k:font_index; {index into |font_info|} 26481@!p,@!q:pointer; {for temporary short-term use} 26482@!n:integer; {ditto} 26483@!e:boolean; {should a definition be expanded? or was \.{\\let} not done?} 26484begin a:=0; 26485while cur_cmd=prefix do 26486 begin if not odd(a div cur_chr) then a:=a+cur_chr; 26487 @<Get the next non-blank non-relax...@>; 26488 if cur_cmd<=max_non_prefixed_command then 26489 @<Discard erroneous prefixes and |return|@>; 26490 if tracing_commands>2 then if eTeX_ex then show_cur_cmd_chr; 26491 end; 26492@<Discard the prefixes \.{\\long} and \.{\\outer} if they are irrelevant@>; 26493@<Adjust \(f)for the setting of \.{\\globaldefs}@>; 26494case cur_cmd of 26495@t\4@>@<Assignments@>@; 26496othercases confusion("prefix") 26497@:this can't happen prefix}{\quad prefix@> 26498endcases; 26499done: @<Insert a token saved by \.{\\afterassignment}, if any@>; 26500exit:end; 26501 26502@ @<Discard erroneous...@>= 26503begin print_err("You can't use a prefix with `"); 26504@.You can't use a prefix with x@> 26505print_cmd_chr(cur_cmd,cur_chr); print_char("'"); 26506help1("I'll pretend you didn't say \long or \outer or \global."); 26507if eTeX_ex then help_line[0]:=@| 26508 "I'll pretend you didn't say \long or \outer or \global or \protected."; 26509back_error; return; 26510end 26511 26512@ @<Discard the prefixes...@>= 26513if a>=8 then 26514 begin j:=protected_token; a:=a-8; 26515 end 26516else j:=0; 26517if (cur_cmd<>def)and((a mod 4<>0)or(j<>0)) then 26518 begin print_err("You can't use `"); print_esc("long"); print("' or `"); 26519 print_esc("outer"); 26520 help1("I'll pretend you didn't say \long or \outer here."); 26521 if eTeX_ex then 26522 begin help_line[0]:=@| 26523 "I'll pretend you didn't say \long or \outer or \protected here."; 26524 print("' or `"); print_esc("protected"); 26525 end; 26526 print("' with `"); 26527@.You can't use \\long...@> 26528 print_cmd_chr(cur_cmd,cur_chr); print_char("'"); 26529 error; 26530 end 26531 26532@ The previous routine does not have to adjust |a| so that |a mod 4=0|, 26533since the following routines test for the \.{\\global} prefix as follows. 26534 26535@d global==(a>=4) 26536@d define(#)==if global then geq_define(#)@+else eq_define(#) 26537@d word_define(#)==if global then geq_word_define(#)@+else eq_word_define(#) 26538@d word_define1(#)==if global then geq_word_define1(#)@+else eq_word_define1(#) 26539 26540@<Adjust \(f)for the setting of \.{\\globaldefs}@>= 26541if global_defs<>0 then 26542 if global_defs<0 then 26543 begin if global then a:=a-4; 26544 end 26545 else begin if not global then a:=a+4; 26546 end 26547 26548@ When a control sequence is to be defined, by \.{\\def} or \.{\\let} or 26549something similar, the |get_r_token| routine will substitute a special 26550control sequence for a token that is not redefinable. 26551 26552@<Declare subprocedures for |prefixed_command|@>= 26553procedure get_r_token; 26554label restart; 26555begin restart: repeat get_token; 26556until cur_tok<>space_token; 26557if (cur_cs=0)or(cur_cs>frozen_control_sequence) then 26558 begin print_err("Missing control sequence inserted"); 26559@.Missing control...@> 26560 help5("Please don't say `\def cs{...}', say `\def\cs{...}'.")@/ 26561 ("I've inserted an inaccessible control sequence so that your")@/ 26562 ("definition will be completed without mixing me up too badly.")@/ 26563 ("You can recover graciously from this error, if you're")@/ 26564 ("careful; see exercise 27.2 in The TeXbook."); 26565@:TeXbook}{\sl The \TeX book@> 26566 if cur_cs=0 then back_input; 26567 cur_tok:=cs_token_flag+frozen_protection; ins_error; goto restart; 26568 end; 26569end; 26570 26571@ @<Initialize table entries...@>= 26572text(frozen_protection):="inaccessible"; 26573 26574@ Here's an example of the way many of the following routines operate. 26575(Unfortunately, they aren't all as simple as this.) 26576 26577@<Assignments@>= 26578set_font: define(cur_font_loc,data,cur_chr); 26579 26580@ When a |def| command has been scanned, 26581|cur_chr| is odd if the definition is supposed to be global, and 26582|cur_chr>=2| if the definition is supposed to be expanded. 26583 26584@<Assignments@>= 26585def: begin if odd(cur_chr)and not global and(global_defs>=0) then a:=a+4; 26586 e:=(cur_chr>=2); get_r_token; p:=cur_cs; 26587 q:=scan_toks(true,e); 26588 if j<>0 then 26589 begin q:=get_avail; info(q):=j; link(q):=link(def_ref); 26590 link(def_ref):=q; 26591 end; 26592 define(p,call+(a mod 4),def_ref); 26593 end; 26594 26595@ Both \.{\\let} and \.{\\futurelet} share the command code |let|. 26596 26597@<Put each...@>= 26598primitive("let",let,normal);@/ 26599@!@:let_}{\.{\\let} primitive@> 26600primitive("futurelet",let,normal+1);@/ 26601@!@:future_let_}{\.{\\futurelet} primitive@> 26602 26603@ @<Cases of |print_cmd_chr|...@>= 26604let: if chr_code<>normal then print_esc("futurelet")@+else print_esc("let"); 26605 26606@ @<Assignments@>= 26607let: begin n:=cur_chr; 26608 get_r_token; p:=cur_cs; 26609 if n=normal then 26610 begin repeat get_token; 26611 until cur_cmd<>spacer; 26612 if cur_tok=other_token+"=" then 26613 begin get_token; 26614 if cur_cmd=spacer then get_token; 26615 end; 26616 end 26617 else begin get_token; q:=cur_tok; get_token; back_input; 26618 cur_tok:=q; back_input; {look ahead, then back up} 26619 end; {note that |back_input| doesn't affect |cur_cmd|, |cur_chr|} 26620 if cur_cmd>=call then add_token_ref(cur_chr) 26621 else if (cur_cmd=register)or(cur_cmd=toks_register) then 26622 if (cur_chr<mem_bot)or(cur_chr>lo_mem_stat_max) then 26623 add_sa_ref(cur_chr); 26624 define(p,cur_cmd,cur_chr); 26625 end; 26626 26627@ A \.{\\chardef} creates a control sequence whose |cmd| is |char_given|; 26628a \.{\\mathchardef} creates a control sequence whose |cmd| is |math_given|; 26629and the corresponding |chr| is the character code or math code. A \.{\\countdef} 26630or \.{\\dimendef} or \.{\\skipdef} or \.{\\muskipdef} creates a control 26631sequence whose |cmd| is |assign_int| or \dots\ or |assign_mu_glue|, and the 26632corresponding |chr| is the |eqtb| location of the internal register in question. 26633 26634@d char_def_code=0 {|shorthand_def| for \.{\\chardef}} 26635@d math_char_def_code=1 {|shorthand_def| for \.{\\mathchardef}} 26636@d count_def_code=2 {|shorthand_def| for \.{\\countdef}} 26637@d dimen_def_code=3 {|shorthand_def| for \.{\\dimendef}} 26638@d skip_def_code=4 {|shorthand_def| for \.{\\skipdef}} 26639@d mu_skip_def_code=5 {|shorthand_def| for \.{\\muskipdef}} 26640@d toks_def_code=6 {|shorthand_def| for \.{\\toksdef}} 26641@d XeTeX_math_char_num_def_code=8 26642@d XeTeX_math_char_def_code=9 26643 26644@<Put each...@>= 26645primitive("chardef",shorthand_def,char_def_code);@/ 26646@!@:char_def_}{\.{\\chardef} primitive@> 26647primitive("mathchardef",shorthand_def,math_char_def_code);@/ 26648@!@:math_char_def_}{\.{\\mathchardef} primitive@> 26649primitive("XeTeXmathcharnumdef",shorthand_def,XeTeX_math_char_num_def_code);@/ 26650primitive("Umathcharnumdef",shorthand_def,XeTeX_math_char_num_def_code);@/ 26651@!@:U_math_char_num_def_}{\.{\\Umathcharnumdef} primitive@> 26652primitive("XeTeXmathchardef",shorthand_def,XeTeX_math_char_def_code);@/ 26653primitive("Umathchardef",shorthand_def,XeTeX_math_char_def_code);@/ 26654@!@:U_math_char_def_}{\.{\\Umathchardef} primitive@> 26655primitive("countdef",shorthand_def,count_def_code);@/ 26656@!@:count_def_}{\.{\\countdef} primitive@> 26657primitive("dimendef",shorthand_def,dimen_def_code);@/ 26658@!@:dimen_def_}{\.{\\dimendef} primitive@> 26659primitive("skipdef",shorthand_def,skip_def_code);@/ 26660@!@:skip_def_}{\.{\\skipdef} primitive@> 26661primitive("muskipdef",shorthand_def,mu_skip_def_code);@/ 26662@!@:mu_skip_def_}{\.{\\muskipdef} primitive@> 26663primitive("toksdef",shorthand_def,toks_def_code);@/ 26664@!@:toks_def_}{\.{\\toksdef} primitive@> 26665 26666@ @<Cases of |print_cmd_chr|...@>= 26667shorthand_def: case chr_code of 26668 char_def_code: print_esc("chardef"); 26669 math_char_def_code: print_esc("mathchardef"); 26670 XeTeX_math_char_def_code: print_esc("Umathchardef"); 26671 XeTeX_math_char_num_def_code: print_esc("Umathcharnumdef"); 26672 count_def_code: print_esc("countdef"); 26673 dimen_def_code: print_esc("dimendef"); 26674 skip_def_code: print_esc("skipdef"); 26675 mu_skip_def_code: print_esc("muskipdef"); 26676 othercases print_esc("toksdef") 26677 endcases; 26678char_given: begin print_esc("char"); print_hex(chr_code); 26679 end; 26680math_given: begin print_esc("mathchar"); print_hex(chr_code); 26681 end; 26682XeTeX_math_given: begin print_esc("Umathchar"); 26683 print_hex(math_class_field(chr_code)); 26684 print_hex(math_fam_field(chr_code)); 26685 print_hex(math_char_field(chr_code)); 26686 end; 26687 26688@ We temporarily define |p| to be |relax|, so that an occurrence of |p| 26689while scanning the definition will simply stop the scanning instead of 26690producing an ``undefined control sequence'' error or expanding the 26691previous meaning. This allows, for instance, `\.{\\chardef\\foo=123\\foo}'. 26692 26693@<Assignments@>= 26694shorthand_def: begin n:=cur_chr; get_r_token; p:=cur_cs; define(p,relax,256); 26695 scan_optional_equals; 26696 case n of 26697 char_def_code: begin scan_usv_num; define(p,char_given,cur_val); 26698 end; 26699 math_char_def_code: begin scan_fifteen_bit_int; define(p,math_given,cur_val); 26700 end; 26701 XeTeX_math_char_num_def_code: begin scan_xetex_math_char_int; 26702 define(p, XeTeX_math_given, cur_val); 26703 end; 26704 XeTeX_math_char_def_code: begin 26705 scan_math_class_int; n:=set_class_field(cur_val); 26706 scan_math_fam_int; n:=n + set_family_field(cur_val); 26707 scan_usv_num; n:=n + cur_val; 26708 define(p, XeTeX_math_given, n); 26709 end; 26710 othercases begin scan_register_num; 26711 if cur_val>255 then 26712 begin j:=n-count_def_code; {|int_val..box_val|} 26713 if j>mu_val then j:=tok_val; {|int_val..mu_val| or |tok_val|} 26714 find_sa_element(j,cur_val,true); add_sa_ref(cur_ptr); 26715 if j=tok_val then j:=toks_register@+else j:=register; 26716 define(p,j,cur_ptr); 26717 end 26718 else 26719 case n of 26720 count_def_code: define(p,assign_int,count_base+cur_val); 26721 dimen_def_code: define(p,assign_dimen,scaled_base+cur_val); 26722 skip_def_code: define(p,assign_glue,skip_base+cur_val); 26723 mu_skip_def_code: define(p,assign_mu_glue,mu_skip_base+cur_val); 26724 toks_def_code: define(p,assign_toks,toks_base+cur_val); 26725 end; {there are no other cases} 26726 end 26727 endcases; 26728 end; 26729 26730@ @<Assignments@>= 26731read_to_cs: begin j:=cur_chr; scan_int; n:=cur_val; 26732 if not scan_keyword("to") then 26733@.to@> 26734 begin print_err("Missing `to' inserted"); 26735@.Missing `to'...@> 26736 help2("You should have said `\read<number> to \cs'.")@/ 26737 ("I'm going to look for the \cs now."); error; 26738 end; 26739 get_r_token; 26740 p:=cur_cs; read_toks(n,p,j); define(p,call,cur_val); 26741 end; 26742 26743@ The token-list parameters, \.{\\output} and \.{\\everypar}, etc., receive 26744their values in the following way. (For safety's sake, we place an 26745enclosing pair of braces around an \.{\\output} list.) 26746 26747@<Assignments@>= 26748toks_register,assign_toks: begin q:=cur_cs; 26749 e:=false; {just in case, will be set |true| for sparse array elements} 26750 if cur_cmd=toks_register then 26751 if cur_chr=mem_bot then 26752 begin scan_register_num; 26753 if cur_val>255 then 26754 begin find_sa_element(tok_val,cur_val,true); 26755 cur_chr:=cur_ptr; e:=true; 26756 end 26757 else cur_chr:=toks_base+cur_val; 26758 end 26759 else e:=true 26760 else if cur_chr=XeTeX_inter_char_loc then begin 26761 scan_eight_bit_int; cur_ptr:=cur_val; 26762 scan_eight_bit_int; 26763 find_sa_element(inter_char_val, cur_ptr*@"100 + cur_val, true); 26764 cur_chr:=cur_ptr; e:=true; 26765 end; 26766 p:=cur_chr; {|p=every_par_loc| or |output_routine_loc| or \dots} 26767 scan_optional_equals; 26768 @<Get the next non-blank non-relax non-call token@>; 26769 if cur_cmd<>left_brace then @<If the right-hand side is a token parameter 26770 or token register, finish the assignment and |goto done|@>; 26771 back_input; cur_cs:=q; q:=scan_toks(false,false); 26772 if link(def_ref)=null then {empty list: revert to the default} 26773 begin sa_define(p,null)(p,undefined_cs,null); free_avail(def_ref); 26774 end 26775 else begin if (p=output_routine_loc)and not e then {enclose in curlies} 26776 begin link(q):=get_avail; q:=link(q); 26777 info(q):=right_brace_token+"}"; 26778 q:=get_avail; info(q):=left_brace_token+"{"; 26779 link(q):=link(def_ref); link(def_ref):=q; 26780 end; 26781 sa_define(p,def_ref)(p,call,def_ref); 26782 end; 26783 end; 26784 26785@ @<If the right-hand side is a token parameter...@>= 26786if (cur_cmd=toks_register)or(cur_cmd=assign_toks) then 26787 begin if cur_cmd=toks_register then 26788 if cur_chr=mem_bot then 26789 begin scan_register_num; 26790 if cur_val<256 then q:=equiv(toks_base+cur_val) 26791 else begin find_sa_element(tok_val,cur_val,false); 26792 if cur_ptr=null then q:=null 26793 else q:=sa_ptr(cur_ptr); 26794 end; 26795 end 26796 else q:=sa_ptr(cur_chr) 26797 else if cur_chr=XeTeX_inter_char_loc then begin 26798 scan_eight_bit_int; cur_ptr:=cur_val; 26799 scan_eight_bit_int; 26800 find_sa_element(inter_char_val, cur_ptr*@"100 + cur_val, false); 26801 if cur_ptr=null then q:=null 26802 else q:=sa_ptr(cur_ptr); 26803 end else q:=equiv(cur_chr); 26804 if q=null then sa_define(p,null)(p,undefined_cs,null) 26805 else begin add_token_ref(q); sa_define(p,q)(p,call,q); 26806 end; 26807 goto done; 26808 end 26809 26810@ Similar routines are used to assign values to the numeric parameters. 26811 26812@<Assignments@>= 26813assign_int: begin p:=cur_chr; scan_optional_equals; scan_int; 26814 word_define(p,cur_val); 26815 end; 26816assign_dimen: begin p:=cur_chr; scan_optional_equals; 26817 scan_normal_dimen; word_define(p,cur_val); 26818 end; 26819assign_glue,assign_mu_glue: begin p:=cur_chr; n:=cur_cmd; scan_optional_equals; 26820 if n=assign_mu_glue then scan_glue(mu_val)@+else scan_glue(glue_val); 26821 trap_zero_glue; 26822 define(p,glue_ref,cur_val); 26823 end; 26824 26825@ When a glue register or parameter becomes zero, it will always point to 26826|zero_glue| because of the following procedure. (Exception: The tabskip 26827glue isn't trapped while preambles are being scanned.) 26828 26829@<Declare subprocedures for |prefixed_command|@>= 26830procedure trap_zero_glue; 26831begin if (width(cur_val)=0)and(stretch(cur_val)=0)and(shrink(cur_val)=0) then 26832 begin add_glue_ref(zero_glue); 26833 delete_glue_ref(cur_val); cur_val:=zero_glue; 26834 end; 26835end; 26836 26837@ The various character code tables are changed by the |def_code| commands, 26838and the font families are declared by |def_family|. 26839 26840@<Put each...@>= 26841primitive("catcode",def_code,cat_code_base); 26842@!@:cat_code_}{\.{\\catcode} primitive@> 26843primitive("mathcode",def_code,math_code_base); 26844@!@:math_code_}{\.{\\mathcode} primitive@> 26845primitive("XeTeXmathcodenum",XeTeX_def_code,math_code_base); 26846primitive("Umathcodenum",XeTeX_def_code,math_code_base); 26847@!@:U_math_code_num_}{\.{\\Umathcodenum} primitive@> 26848primitive("XeTeXmathcode",XeTeX_def_code,math_code_base+1); 26849primitive("Umathcode",XeTeX_def_code,math_code_base+1); 26850@!@:U_math_code_}{\.{\\Umathcode} primitive@> 26851primitive("lccode",def_code,lc_code_base); 26852@!@:lc_code_}{\.{\\lccode} primitive@> 26853primitive("uccode",def_code,uc_code_base); 26854@!@:uc_code_}{\.{\\uccode} primitive@> 26855primitive("sfcode",def_code,sf_code_base); 26856@!@:sf_code_}{\.{\\sfcode} primitive@> 26857primitive("XeTeXcharclass",XeTeX_def_code,sf_code_base); 26858@!@:XeTeX_char_class_}{\.{\\XeTeXcharclass} primitive@> 26859primitive("delcode",def_code,del_code_base); 26860@!@:del_code_}{\.{\\delcode} primitive@> 26861primitive("XeTeXdelcodenum",XeTeX_def_code,del_code_base); 26862primitive("Udelcodenum",XeTeX_def_code,del_code_base); 26863@!@:U_del_code_num_}{\.{\\Udelcodenum} primitive@> 26864primitive("XeTeXdelcode",XeTeX_def_code,del_code_base+1); 26865primitive("Udelcode",XeTeX_def_code,del_code_base+1); 26866@!@:U_del_code_}{\.{\\Udelcode} primitive@> 26867primitive("textfont",def_family,math_font_base); 26868@!@:text_font_}{\.{\\textfont} primitive@> 26869primitive("scriptfont",def_family,math_font_base+script_size); 26870@!@:script_font_}{\.{\\scriptfont} primitive@> 26871primitive("scriptscriptfont",def_family,math_font_base+script_script_size); 26872@!@:script_script_font_}{\.{\\scriptscriptfont} primitive@> 26873 26874@ @<Cases of |print_cmd_chr|...@>= 26875def_code: if chr_code=cat_code_base then print_esc("catcode") 26876 else if chr_code=math_code_base then print_esc("mathcode") 26877 else if chr_code=lc_code_base then print_esc("lccode") 26878 else if chr_code=uc_code_base then print_esc("uccode") 26879 else if chr_code=sf_code_base then print_esc("sfcode") 26880 else print_esc("delcode"); 26881XeTeX_def_code: 26882 if chr_code=sf_code_base then print_esc("XeTeXcharclass") 26883 else if chr_code=math_code_base then print_esc("Umathcodenum") 26884 else if chr_code=math_code_base+1 then print_esc("Umathcode") 26885 else if chr_code=del_code_base then print_esc("Udelcodenum") 26886 else print_esc("Udelcode"); 26887def_family: print_size(chr_code-math_font_base); 26888 26889@ The different types of code values have different legal ranges; the 26890following program is careful to check each case properly. 26891 26892@<Assignments@>= 26893XeTeX_def_code: begin 26894 if cur_chr = sf_code_base then begin 26895 p:=cur_chr; scan_usv_num; 26896 p:=p+cur_val; 26897 n:=sf_code(cur_val) mod @"10000; 26898 scan_optional_equals; 26899 scan_char_class; 26900 define(p,data,cur_val*@"10000 + n); 26901 end 26902 else if cur_chr = math_code_base then begin 26903 p:=cur_chr; scan_usv_num; 26904 p:=p+cur_val; 26905 scan_optional_equals; 26906 scan_xetex_math_char_int; 26907 define(p,data,hi(cur_val)); 26908 end 26909 else if cur_chr = math_code_base+1 then begin 26910 p:=cur_chr-1; scan_usv_num; 26911 p:=p+cur_val; 26912 scan_optional_equals; 26913 scan_math_class_int; n:=set_class_field(cur_val); 26914 scan_math_fam_int; n:=n + set_family_field(cur_val); 26915 scan_usv_num; n:=n + cur_val; 26916 define(p,data,hi(n)); 26917 end 26918 else if cur_chr = del_code_base then begin 26919 p:=cur_chr; scan_usv_num; 26920 p:=p+cur_val; 26921 scan_optional_equals; 26922 scan_int; {|scan_xetex_del_code_int|; !!FIXME!!} 26923 word_define(p,hi(cur_val)); 26924 end else begin 26925 p:=cur_chr-1; scan_usv_num; 26926 p:=p+cur_val; 26927 scan_optional_equals; 26928 n:=@"40000000; {extended delimiter code flag} 26929 scan_math_fam_int; n:=n + cur_val * @"200000; {extended delimiter code family} 26930 scan_usv_num; n:=n + cur_val; {extended delimiter code USV} 26931 word_define(p,hi(n)); 26932 end; 26933 end; 26934def_code: begin @<Let |n| be the largest legal code value, based on |cur_chr|@>; 26935 p:=cur_chr; scan_usv_num; p:=p+cur_val; scan_optional_equals; 26936 scan_int; 26937 if ((cur_val<0)and(p<del_code_base))or(cur_val>n) then 26938 begin print_err("Invalid code ("); print_int(cur_val); 26939@.Invalid code@> 26940 if p<del_code_base then print("), should be in the range 0..") 26941 else print("), should be at most "); 26942 print_int(n); 26943 help1("I'm going to use 0 instead of that illegal code value.");@/ 26944 error; cur_val:=0; 26945 end; 26946 if p<math_code_base then begin 26947 if p>=sf_code_base then begin 26948 n:=equiv(p) div @"10000; 26949 define(p,data,n*@"10000 + cur_val); 26950 end else 26951 define(p,data,cur_val) 26952 end else if p<del_code_base then begin 26953 if cur_val=@"8000 then cur_val:=active_math_char 26954 else cur_val:=set_class_field(cur_val div @"1000) + 26955 set_family_field((cur_val mod @"1000) div @"100) + 26956 (cur_val mod @"100); {!!FIXME!! check how this is used} 26957 define(p,data,hi(cur_val)); 26958 end 26959 else word_define(p,cur_val); 26960 end; 26961 26962@ @<Let |n| be the largest...@>= 26963if cur_chr=cat_code_base then n:=max_char_code 26964else if cur_chr=math_code_base then n:=@'100000 26965else if cur_chr=sf_code_base then n:=@'77777 26966else if cur_chr=del_code_base then n:=@'77777777 26967else n:=biggest_usv 26968 26969@ @<Assignments@>= 26970def_family: begin p:=cur_chr; scan_math_fam_int; p:=p+cur_val; 26971 scan_optional_equals; scan_font_ident; define(p,data,cur_val); 26972 end; 26973 26974@ Next we consider changes to \TeX's numeric registers. 26975 26976@<Assignments@>= 26977register,advance,multiply,divide: do_register_command(a); 26978 26979@ We use the fact that |register<advance<multiply<divide|. 26980 26981@<Declare subprocedures for |prefixed_command|@>= 26982procedure do_register_command(@!a:small_number); 26983label found,exit; 26984var l,@!q,@!r,@!s:pointer; {for list manipulation} 26985@!p:int_val..mu_val; {type of register involved} 26986@!e:boolean; {does |l| refer to a sparse array element?} 26987@!w:integer; {integer or dimen value of |l|} 26988begin q:=cur_cmd; 26989e:=false; {just in case, will be set |true| for sparse array elements} 26990@<Compute the register location |l| and its type |p|; but |return| if invalid@>; 26991if q=register then scan_optional_equals 26992else if scan_keyword("by") then do_nothing; {optional `\.{by}'} 26993@.by@> 26994arith_error:=false; 26995if q<multiply then @<Compute result of |register| or 26996 |advance|, put it in |cur_val|@> 26997else @<Compute result of |multiply| or |divide|, put it in |cur_val|@>; 26998if arith_error then 26999 begin print_err("Arithmetic overflow"); 27000@.Arithmetic overflow@> 27001 help2("I can't carry out that multiplication or division,")@/ 27002 ("since the result is out of range."); 27003 if p>=glue_val then delete_glue_ref(cur_val); 27004 error; return; 27005 end; 27006if p<glue_val then sa_word_define(l,cur_val) 27007else begin trap_zero_glue; sa_define(l,cur_val)(l,glue_ref,cur_val); 27008 end; 27009exit: end; 27010 27011@ Here we use the fact that the consecutive codes |int_val..mu_val| and 27012|assign_int..assign_mu_glue| correspond to each other nicely. 27013 27014@<Compute the register location |l| and its type |p|...@>= 27015begin if q<>register then 27016 begin get_x_token; 27017 if (cur_cmd>=assign_int)and(cur_cmd<=assign_mu_glue) then 27018 begin l:=cur_chr; p:=cur_cmd-assign_int; goto found; 27019 end; 27020 if cur_cmd<>register then 27021 begin print_err("You can't use `"); print_cmd_chr(cur_cmd,cur_chr); 27022@.You can't use x after ...@> 27023 print("' after "); print_cmd_chr(q,0); 27024 help1("I'm forgetting what you said and not changing anything."); 27025 error; return; 27026 end; 27027 end; 27028if (cur_chr<mem_bot)or(cur_chr>lo_mem_stat_max) then 27029 begin l:=cur_chr; p:=sa_type(l); e:=true; 27030 end 27031else begin p:=cur_chr-mem_bot; scan_register_num; 27032 if cur_val>255 then 27033 begin find_sa_element(p,cur_val,true); l:=cur_ptr; e:=true; 27034 end 27035 else 27036case p of 27037int_val: l:=cur_val+count_base; 27038dimen_val: l:=cur_val+scaled_base; 27039glue_val: l:=cur_val+skip_base; 27040mu_val: l:=cur_val+mu_skip_base; 27041end; {there are no other cases} 27042 end; 27043end; 27044found: if p<glue_val then@+if e then w:=sa_int(l)@+else w:=eqtb[l].int 27045else if e then s:=sa_ptr(l)@+else s:=equiv(l) 27046 27047@ @<Compute result of |register| or |advance|...@>= 27048if p<glue_val then 27049 begin if p=int_val then scan_int@+else scan_normal_dimen; 27050 if q=advance then cur_val:=cur_val+w; 27051 end 27052else begin scan_glue(p); 27053 if q=advance then @<Compute the sum of two glue specs@>; 27054 end 27055 27056@ @<Compute the sum of two glue specs@>= 27057begin q:=new_spec(cur_val); r:=s; 27058delete_glue_ref(cur_val); 27059width(q):=width(q)+width(r); 27060if stretch(q)=0 then stretch_order(q):=normal; 27061if stretch_order(q)=stretch_order(r) then stretch(q):=stretch(q)+stretch(r) 27062else if (stretch_order(q)<stretch_order(r))and(stretch(r)<>0) then 27063 begin stretch(q):=stretch(r); stretch_order(q):=stretch_order(r); 27064 end; 27065if shrink(q)=0 then shrink_order(q):=normal; 27066if shrink_order(q)=shrink_order(r) then shrink(q):=shrink(q)+shrink(r) 27067else if (shrink_order(q)<shrink_order(r))and(shrink(r)<>0) then 27068 begin shrink(q):=shrink(r); shrink_order(q):=shrink_order(r); 27069 end; 27070cur_val:=q; 27071end 27072 27073@ @<Compute result of |multiply| or |divide|...@>= 27074begin scan_int; 27075if p<glue_val then 27076 if q=multiply then 27077 if p=int_val then cur_val:=mult_integers(w,cur_val) 27078 else cur_val:=nx_plus_y(w,cur_val,0) 27079 else cur_val:=x_over_n(w,cur_val) 27080else begin r:=new_spec(s); 27081 if q=multiply then 27082 begin width(r):=nx_plus_y(width(s),cur_val,0); 27083 stretch(r):=nx_plus_y(stretch(s),cur_val,0); 27084 shrink(r):=nx_plus_y(shrink(s),cur_val,0); 27085 end 27086 else begin width(r):=x_over_n(width(s),cur_val); 27087 stretch(r):=x_over_n(stretch(s),cur_val); 27088 shrink(r):=x_over_n(shrink(s),cur_val); 27089 end; 27090 cur_val:=r; 27091 end; 27092end 27093 27094@ The processing of boxes is somewhat different, because we may need 27095to scan and create an entire box before we actually change the value of the old 27096one. 27097 27098@<Assignments@>= 27099set_box: begin scan_register_num; 27100 if global then n:=global_box_flag+cur_val@+else n:=box_flag+cur_val; 27101 scan_optional_equals; 27102 if set_box_allowed then scan_box(n) 27103 else begin print_err("Improper "); print_esc("setbox"); 27104@.Improper \\setbox@> 27105 help2("Sorry, \setbox is not allowed after \halign in a display,")@/ 27106 ("or between \accent and an accented character."); error; 27107 end; 27108 end; 27109 27110@ The |space_factor| or |prev_depth| settings are changed when a |set_aux| 27111command is sensed. Similarly, |prev_graf| is changed in the presence of 27112|set_prev_graf|, and |dead_cycles| or |insert_penalties| in the presence of 27113|set_page_int|. These definitions are always global. 27114 27115When some dimension of a box register is changed, the change isn't exactly 27116global; but \TeX\ does not look at the \.{\\global} switch. 27117 27118@<Assignments@>= 27119set_aux:alter_aux; 27120set_prev_graf:alter_prev_graf; 27121set_page_dimen:alter_page_so_far; 27122set_page_int:alter_integer; 27123set_box_dimen:alter_box_dimen; 27124 27125@ @<Declare subprocedures for |prefixed_command|@>= 27126procedure alter_aux; 27127var c:halfword; {|hmode| or |vmode|} 27128begin if cur_chr<>abs(mode) then report_illegal_case 27129else begin c:=cur_chr; scan_optional_equals; 27130 if c=vmode then 27131 begin scan_normal_dimen; prev_depth:=cur_val; 27132 end 27133 else begin scan_int; 27134 if (cur_val<=0)or(cur_val>32767) then 27135 begin print_err("Bad space factor"); 27136@.Bad space factor@> 27137 help1("I allow only values in the range 1..32767 here."); 27138 int_error(cur_val); 27139 end 27140 else space_factor:=cur_val; 27141 end; 27142 end; 27143end; 27144 27145@ @<Declare subprocedures for |prefixed_command|@>= 27146procedure alter_prev_graf; 27147var p:0..nest_size; {index into |nest|} 27148begin nest[nest_ptr]:=cur_list; p:=nest_ptr; 27149while abs(nest[p].mode_field)<>vmode do decr(p); 27150scan_optional_equals; scan_int; 27151if cur_val<0 then 27152 begin print_err("Bad "); print_esc("prevgraf"); 27153@.Bad \\prevgraf@> 27154 help1("I allow only nonnegative values here."); 27155 int_error(cur_val); 27156 end 27157else begin nest[p].pg_field:=cur_val; cur_list:=nest[nest_ptr]; 27158 end; 27159end; 27160 27161@ @<Declare subprocedures for |prefixed_command|@>= 27162procedure alter_page_so_far; 27163var c:0..7; {index into |page_so_far|} 27164begin c:=cur_chr; scan_optional_equals; scan_normal_dimen; 27165page_so_far[c]:=cur_val; 27166end; 27167 27168@ @<Declare subprocedures for |prefixed_command|@>= 27169procedure alter_integer; 27170var c:small_number; 27171 {0 for \.{\\deadcycles}, 1 for \.{\\insertpenalties}, etc.} 27172begin c:=cur_chr; scan_optional_equals; scan_int; 27173if c=0 then dead_cycles:=cur_val 27174@/@<Cases for |alter_integer|@>@/ 27175else insert_penalties:=cur_val; 27176end; 27177 27178@ @<Declare subprocedures for |prefixed_command|@>= 27179procedure alter_box_dimen; 27180var c:small_number; {|width_offset| or |height_offset| or |depth_offset|} 27181@!b:pointer; {box register} 27182begin c:=cur_chr; scan_register_num; fetch_box(b); scan_optional_equals; 27183scan_normal_dimen; 27184if b<>null then mem[b+c].sc:=cur_val; 27185end; 27186 27187@ Paragraph shapes are set up in the obvious way. 27188 27189@<Assignments@>= 27190set_shape: begin q:=cur_chr; scan_optional_equals; scan_int; n:=cur_val; 27191 if n<=0 then p:=null 27192 else if q>par_shape_loc then 27193 begin n:=(cur_val div 2)+1; p:=get_node(2*n+1); info(p):=n; 27194 n:=cur_val; mem[p+1].int:=n; {number of penalties} 27195 for j:=p+2 to p+n+1 do 27196 begin scan_int; mem[j].int:=cur_val; {penalty values} 27197 end; 27198 if not odd(n) then mem[p+n+2].int:=0; {unused} 27199 end 27200 else begin p:=get_node(2*n+1); info(p):=n; 27201 for j:=1 to n do 27202 begin scan_normal_dimen; 27203 mem[p+2*j-1].sc:=cur_val; {indentation} 27204 scan_normal_dimen; 27205 mem[p+2*j].sc:=cur_val; {width} 27206 end; 27207 end; 27208 define(q,shape_ref,p); 27209 end; 27210 27211@ Here's something that isn't quite so obvious. It guarantees that 27212|info(par_shape_ptr)| can hold any positive~|n| for which |get_node(2*n+1)| 27213doesn't overflow the memory capacity. 27214 27215@<Check the ``constant''...@>= 27216if 2*max_halfword<mem_top-mem_min then bad:=41; 27217 27218@ New hyphenation data is loaded by the |hyph_data| command. 27219 27220@<Put each...@>= 27221primitive("hyphenation",hyph_data,0); 27222@!@:hyphenation_}{\.{\\hyphenation} primitive@> 27223primitive("patterns",hyph_data,1); 27224@!@:patterns_}{\.{\\patterns} primitive@> 27225 27226@ @<Cases of |print_cmd_chr|...@>= 27227hyph_data: if chr_code=1 then print_esc("patterns") 27228 else print_esc("hyphenation"); 27229 27230@ @<Assignments@>= 27231hyph_data: if cur_chr=1 then 27232 begin @!init new_patterns; goto done;@;@+tini@/ 27233 print_err("Patterns can be loaded only by INITEX"); 27234@.Patterns can be...@> 27235 help0; error; 27236 repeat get_token; until cur_cmd=right_brace; {flush the patterns} 27237 return; 27238 end 27239 else begin new_hyph_exceptions; goto done; 27240 end; 27241 27242@ All of \TeX's parameters are kept in |eqtb| except the font information, 27243the interaction mode, and the hyphenation tables; these are strictly global. 27244 27245@<Assignments@>= 27246assign_font_dimen: begin find_font_dimen(true); k:=cur_val; 27247 scan_optional_equals; scan_normal_dimen; font_info[k].sc:=cur_val; 27248 end; 27249assign_font_int: begin n:=cur_chr; scan_font_ident; f:=cur_val; 27250 if n < lp_code_base then begin 27251 scan_optional_equals; scan_int; 27252 if n=0 then hyphen_char[f]:=cur_val@+else skew_char[f]:=cur_val; 27253 end else begin 27254 if is_native_font(f) then 27255 scan_glyph_number(f) {for native fonts, the value is a glyph id} 27256 else scan_char_num; {for |tfm| fonts it's the same like pdftex} 27257 p:=cur_val; 27258 scan_optional_equals; scan_int; 27259 case n of 27260 lp_code_base: set_cp_code(f, p, left_side, cur_val); 27261 rp_code_base: set_cp_code(f, p, right_side, cur_val); 27262 endcases; 27263 end; 27264end; 27265 27266@ @<Put each...@>= 27267primitive("hyphenchar",assign_font_int,0); 27268@!@:hyphen_char_}{\.{\\hyphenchar} primitive@> 27269primitive("skewchar",assign_font_int,1); 27270@!@:skew_char_}{\.{\\skewchar} primitive@> 27271primitive("lpcode",assign_font_int,lp_code_base); 27272@!@:lp_code_}{\.{\\lpcode} primitive@> 27273primitive("rpcode",assign_font_int,rp_code_base); 27274@!@:rp_code_}{\.{\\rpcode} primitive@> 27275 27276@ @<Cases of |print_cmd_chr|...@>= 27277assign_font_int: case chr_code of 272780: print_esc("hyphenchar"); 272791: print_esc("skewchar"); 27280lp_code_base: print_esc("lpcode"); 27281rp_code_base: print_esc("rpcode"); 27282endcases; 27283 27284@ Here is where the information for a new font gets loaded. 27285 27286@<Assignments@>= 27287def_font: new_font(a); 27288 27289@ @<Declare subprocedures for |prefixed_command|@>= 27290procedure new_font(@!a:small_number); 27291label common_ending; 27292var u:pointer; {user's font identifier} 27293@!s:scaled; {stated ``at'' size, or negative of scaled magnification} 27294@!f:internal_font_number; {runs through existing fonts} 27295@!t:str_number; {name for the frozen font identifier} 27296@!old_setting:0..max_selector; {holds |selector| setting} 27297@!flushable_string:str_number; {string not yet referenced} 27298begin if job_name=0 then open_log_file; 27299 {avoid confusing \.{texput} with the font name} 27300@.texput@> 27301get_r_token; u:=cur_cs; 27302if u>=hash_base then t:=text(u) 27303else if u>=single_base then 27304 if u=null_cs then t:="FONT"@+else t:=u-single_base 27305else begin old_setting:=selector; selector:=new_string; 27306 print("FONT"); print(u-active_base); selector:=old_setting; 27307@.FONTx@> 27308 str_room(1); t:=make_string; 27309 end; 27310define(u,set_font,null_font); scan_optional_equals; scan_file_name; 27311@<Scan the font size specification@>; 27312@<If this font has already been loaded, set |f| to the internal 27313 font number and |goto common_ending|@>; 27314f:=read_font_info(u,cur_name,cur_area,s); 27315common_ending: define(u,set_font,f); eqtb[font_id_base+f]:=eqtb[u]; font_id_text(f):=t; 27316end; 27317 27318@ @<Scan the font size specification@>= 27319name_in_progress:=true; {this keeps |cur_name| from being changed} 27320if scan_keyword("at") then @<Put the \(p)(positive) `at' size into |s|@> 27321@.at@> 27322else if scan_keyword("scaled") then 27323@.scaled@> 27324 begin scan_int; s:=-cur_val; 27325 if (cur_val<=0)or(cur_val>32768) then 27326 begin print_err("Illegal magnification has been changed to 1000");@/ 27327@.Illegal magnification...@> 27328 help1("The magnification ratio must be between 1 and 32768."); 27329 int_error(cur_val); s:=-1000; 27330 end; 27331 end 27332else s:=-1000; 27333name_in_progress:=false 27334 27335@ @<Put the \(p)(positive) `at' size into |s|@>= 27336begin scan_normal_dimen; s:=cur_val; 27337if (s<=0)or(s>=@'1000000000) then 27338 begin print_err("Improper `at' size ("); 27339 print_scaled(s); print("pt), replaced by 10pt"); 27340@.Improper `at' size...@> 27341 help2("I can only handle fonts at positive sizes that are")@/ 27342 ("less than 2048pt, so I've changed what you said to 10pt."); 27343 error; s:=10*unity; 27344 end; 27345end 27346 27347@ When the user gives a new identifier to a font that was previously loaded, 27348the new name becomes the font identifier of record. Font names `\.{xyz}' and 27349`\.{XYZ}' are considered to be different. 27350 27351@<If this font has already been loaded...@>= 27352flushable_string:=str_ptr-1; 27353for f:=font_base+1 to font_ptr do begin 27354 if str_eq_str(font_name[f],cur_name) and 27355 (((cur_area = "") and is_native_font(f)) or str_eq_str(font_area[f],cur_area)) then 27356 begin if cur_name=flushable_string then 27357 begin flush_string; cur_name:=font_name[f]; 27358 end; 27359 if s>0 then 27360 begin if s=font_size[f] then goto common_ending; 27361 end 27362 else if font_size[f]=xn_over_d(font_dsize[f],-s,1000) then 27363 goto common_ending; 27364 end; 27365 { could be a native font whose "name" ended up partly in area or extension } 27366 append_str(cur_area); append_str(cur_name); append_str(cur_ext); 27367 if str_eq_str(font_name[f], make_string) then begin 27368 flush_string; 27369 if is_native_font(f) then 27370 begin if s>0 then 27371 begin if s=font_size[f] then goto common_ending; 27372 end 27373 else if font_size[f]=xn_over_d(font_dsize[f],-s,1000) then 27374 goto common_ending; 27375 end 27376 end 27377 else flush_string; 27378 end 27379 27380@ @<Cases of |print_cmd_chr|...@>= 27381set_font:begin print("select font "); 27382 font_name_str:=font_name[chr_code]; 27383 if is_native_font(chr_code) then begin 27384 quote_char:=""""; 27385 for n:=0 to length(font_name_str) - 1 do 27386 if str_pool[str_start_macro(font_name_str) + n] = """" then quote_char:="'"; 27387 print_char(quote_char); 27388 slow_print(font_name_str); 27389 print_char(quote_char); 27390 end else 27391 slow_print(font_name_str); 27392 if font_size[chr_code]<>font_dsize[chr_code] then 27393 begin print(" at "); print_scaled(font_size[chr_code]); 27394 print("pt"); 27395 end; 27396 end; 27397 27398@ @<Put each...@>= 27399primitive("batchmode",set_interaction,batch_mode); 27400@!@:batch_mode_}{\.{\\batchmode} primitive@> 27401primitive("nonstopmode",set_interaction,nonstop_mode); 27402@!@:nonstop_mode_}{\.{\\nonstopmode} primitive@> 27403primitive("scrollmode",set_interaction,scroll_mode); 27404@!@:scroll_mode_}{\.{\\scrollmode} primitive@> 27405primitive("errorstopmode",set_interaction,error_stop_mode); 27406@!@:error_stop_mode_}{\.{\\errorstopmode} primitive@> 27407 27408@ @<Cases of |print_cmd_chr|...@>= 27409set_interaction: case chr_code of 27410 batch_mode: print_esc("batchmode"); 27411 nonstop_mode: print_esc("nonstopmode"); 27412 scroll_mode: print_esc("scrollmode"); 27413 othercases print_esc("errorstopmode") 27414 endcases; 27415 27416@ @<Assignments@>= 27417set_interaction: new_interaction; 27418 27419@ @<Declare subprocedures for |prefixed_command|@>= 27420procedure new_interaction; 27421begin print_ln; 27422interaction:=cur_chr; 27423@<Initialize the print |selector| based on |interaction|@>; 27424if log_opened then selector:=selector+2; 27425end; 27426 27427@ The \.{\\afterassignment} command puts a token into the global 27428variable |after_token|. This global variable is examined just after 27429every assignment has been performed. 27430 27431@<Glob...@>= 27432@!after_token:halfword; {zero, or a saved token} 27433 27434@ @<Set init...@>= 27435after_token:=0; 27436 27437@ @<Cases of |main_control| that don't...@>= 27438any_mode(after_assignment):begin get_token; after_token:=cur_tok; 27439 end; 27440 27441@ @<Insert a token saved by \.{\\afterassignment}, if any@>= 27442if after_token<>0 then 27443 begin cur_tok:=after_token; back_input; after_token:=0; 27444 end 27445 27446@ Here is a procedure that might be called `Get the next non-blank non-relax 27447non-call non-assignment token'. 27448 27449@<Declare act...@>= 27450procedure do_assignments; 27451label exit; 27452begin loop begin @<Get the next non-blank non-relax...@>; 27453 if cur_cmd<=max_non_prefixed_command then return; 27454 set_box_allowed:=false; prefixed_command; set_box_allowed:=true; 27455 end; 27456exit:end; 27457 27458@ @<Cases of |main_control| that don't...@>= 27459any_mode(after_group):begin get_token; save_for_after(cur_tok); 27460 end; 27461 27462@ Files for \.{\\read} are opened and closed by the |in_stream| command. 27463 27464@<Put each...@>= 27465primitive("openin",in_stream,1); 27466@!@:open_in_}{\.{\\openin} primitive@> 27467primitive("closein",in_stream,0); 27468@!@:close_in_}{\.{\\closein} primitive@> 27469 27470@ @<Cases of |print_cmd_chr|...@>= 27471in_stream: if chr_code=0 then print_esc("closein") 27472 else print_esc("openin"); 27473 27474@ @<Cases of |main_control| that don't...@>= 27475any_mode(in_stream): open_or_close_in; 27476 27477@ @<Declare act...@>= 27478procedure open_or_close_in; 27479var c:0..1; {1 for \.{\\openin}, 0 for \.{\\closein}} 27480@!n:0..15; {stream number} 27481begin c:=cur_chr; scan_four_bit_int; n:=cur_val; 27482if read_open[n]<>closed then 27483 begin u_close(read_file[n]); read_open[n]:=closed; 27484 end; 27485if c<>0 then 27486 begin scan_optional_equals; scan_file_name; 27487 if cur_ext="" then cur_ext:=".tex"; 27488 pack_cur_name; 27489 if a_open_in(read_file[n]) then read_open[n]:=just_open; 27490 end; 27491end; 27492 27493@ The user can issue messages to the terminal, regardless of the 27494current mode. 27495 27496@<Cases of |main_control| that don't...@>= 27497any_mode(message):issue_message; 27498 27499@ @<Put each...@>= 27500primitive("message",message,0); 27501@!@:message_}{\.{\\message} primitive@> 27502primitive("errmessage",message,1); 27503@!@:err_message_}{\.{\\errmessage} primitive@> 27504 27505@ @<Cases of |print_cmd_chr|...@>= 27506message: if chr_code=0 then print_esc("message") 27507 else print_esc("errmessage"); 27508 27509@ @<Declare act...@>= 27510procedure issue_message; 27511var old_setting:0..max_selector; {holds |selector| setting} 27512@!c:0..1; {identifies \.{\\message} and \.{\\errmessage}} 27513@!s:str_number; {the message} 27514begin c:=cur_chr; link(garbage):=scan_toks(false,true); 27515old_setting:=selector; selector:=new_string; 27516token_show(def_ref); selector:=old_setting; 27517flush_list(def_ref); 27518str_room(1); s:=make_string; 27519if c=0 then @<Print string |s| on the terminal@> 27520else @<Print string |s| as an error message@>; 27521flush_string; 27522end; 27523 27524@ @<Print string |s| on the terminal@>= 27525begin if term_offset+length(s)>max_print_line-2 then print_ln 27526else if (term_offset>0)or(file_offset>0) then print_char(" "); 27527slow_print(s); update_terminal; 27528end 27529 27530@ If \.{\\errmessage} occurs often in |scroll_mode|, without user-defined 27531\.{\\errhelp}, we don't want to give a long help message each time. So we 27532give a verbose explanation only once. 27533 27534@<Glob...@>= 27535@!long_help_seen:boolean; {has the long \.{\\errmessage} help been used?} 27536 27537@ @<Set init...@>=long_help_seen:=false; 27538 27539@ @<Print string |s| as an error message@>= 27540begin print_err(""); slow_print(s); 27541if err_help<>null then use_err_help:=true 27542else if long_help_seen then help1("(That was another \errmessage.)") 27543else begin if interaction<error_stop_mode then long_help_seen:=true; 27544 help4("This error message was generated by an \errmessage")@/ 27545 ("command, so I can't give any explicit help.")@/ 27546 ("Pretend that you're Hercule Poirot: Examine all clues,")@/ 27547@^Poirot, Hercule@> 27548 ("and deduce the truth by order and method."); 27549 end; 27550error; use_err_help:=false; 27551end 27552 27553@ The |error| routine calls on |give_err_help| if help is requested from 27554the |err_help| parameter. 27555 27556@p procedure give_err_help; 27557begin token_show(err_help); 27558end; 27559 27560@ The \.{\\uppercase} and \.{\\lowercase} commands are implemented by 27561building a token list and then changing the cases of the letters in it. 27562 27563@<Cases of |main_control| that don't...@>= 27564any_mode(case_shift):shift_case; 27565 27566@ @<Put each...@>= 27567primitive("lowercase",case_shift,lc_code_base); 27568@!@:lowercase_}{\.{\\lowercase} primitive@> 27569primitive("uppercase",case_shift,uc_code_base); 27570@!@:uppercase_}{\.{\\uppercase} primitive@> 27571 27572@ @<Cases of |print_cmd_chr|...@>= 27573case_shift:if chr_code=lc_code_base then print_esc("lowercase") 27574 else print_esc("uppercase"); 27575 27576@ @<Declare act...@>= 27577procedure shift_case; 27578var b:pointer; {|lc_code_base| or |uc_code_base|} 27579@!p:pointer; {runs through the token list} 27580@!t:halfword; {token} 27581@!c:integer; {character code} 27582begin b:=cur_chr; p:=scan_toks(false,false); p:=link(def_ref); 27583while p<>null do 27584 begin @<Change the case of the token in |p|, if a change is appropriate@>; 27585 p:=link(p); 27586 end; 27587back_list(link(def_ref)); free_avail(def_ref); {omit reference count} 27588end; 27589 27590@ When the case of a |chr_code| changes, we don't change the |cmd|. 27591We also change active characters, using the fact that 27592|cs_token_flag+active_base| is a multiple of~256. 27593@^data structure assumptions@> 27594 27595@<Change the case of the token in |p|, if a change is appropriate@>= 27596t:=info(p); 27597if t<cs_token_flag+single_base then 27598 begin c:=t mod max_char_val; 27599 if equiv(b+c)<>0 then info(p):=t-c+equiv(b+c); 27600 end 27601 27602@ We come finally to the last pieces missing from |main_control|, namely the 27603`\.{\\show}' commands that are useful when debugging. 27604 27605@<Cases of |main_control| that don't...@>= 27606any_mode(xray): show_whatever; 27607 27608@ @d show_code=0 { \.{\\show} } 27609@d show_box_code=1 { \.{\\showbox} } 27610@d show_the_code=2 { \.{\\showthe} } 27611@d show_lists=3 { \.{\\showlists} } 27612 27613@<Put each...@>= 27614primitive("show",xray,show_code); 27615@!@:show_}{\.{\\show} primitive@> 27616primitive("showbox",xray,show_box_code); 27617@!@:show_box_}{\.{\\showbox} primitive@> 27618primitive("showthe",xray,show_the_code); 27619@!@:show_the_}{\.{\\showthe} primitive@> 27620primitive("showlists",xray,show_lists); 27621@!@:show_lists_}{\.{\\showlists} primitive@> 27622 27623@ @<Cases of |print_cmd_chr|...@>= 27624xray: case chr_code of 27625 show_box_code:print_esc("showbox"); 27626 show_the_code:print_esc("showthe"); 27627 show_lists:print_esc("showlists"); 27628 @<Cases of |xray| for |print_cmd_chr|@>@;@/ 27629 othercases print_esc("show") 27630 endcases; 27631 27632@ @<Declare act...@>= 27633procedure show_whatever; 27634label common_ending; 27635var p:pointer; {tail of a token list to show} 27636@!t:small_number; {type of conditional being shown} 27637@!m:normal..or_code; {upper bound on |fi_or_else| codes} 27638@!l:integer; {line where that conditional began} 27639@!n:integer; {level of \.{\\if...\\fi} nesting} 27640begin case cur_chr of 27641show_lists: begin begin_diagnostic; show_activities; 27642 end; 27643show_box_code: @<Show the current contents of a box@>; 27644show_code: @<Show the current meaning of a token, then |goto common_ending|@>; 27645@<Cases for |show_whatever|@>@;@/ 27646othercases @<Show the current value of some parameter or register, 27647 then |goto common_ending|@> 27648endcases;@/ 27649@<Complete a potentially long \.{\\show} command@>; 27650common_ending: if interaction<error_stop_mode then 27651 begin help0; decr(error_count); 27652 end 27653else if tracing_online>0 then 27654 begin@t@>@;@/ 27655 help3("This isn't an error message; I'm just \showing something.")@/ 27656 ("Type `I\show...' to show more (e.g., \show\cs,")@/ 27657 ("\showthe\count10, \showbox255, \showlists)."); 27658 end 27659else begin@t@>@;@/ 27660 help5("This isn't an error message; I'm just \showing something.")@/ 27661 ("Type `I\show...' to show more (e.g., \show\cs,")@/ 27662 ("\showthe\count10, \showbox255, \showlists).")@/ 27663 ("And type `I\tracingonline=1\show...' to show boxes and")@/ 27664 ("lists on your terminal as well as in the transcript file."); 27665 end; 27666error; 27667end; 27668 27669@ @<Show the current meaning of a token...@>= 27670begin get_token; 27671if interaction=error_stop_mode then wake_up_terminal; 27672print_nl("> "); 27673if cur_cs<>0 then 27674 begin sprint_cs(cur_cs); print_char("="); 27675 end; 27676print_meaning; goto common_ending; 27677end 27678 27679@ @<Cases of |print_cmd_chr|...@>= 27680undefined_cs: print("undefined"); 27681call,long_call,outer_call,long_outer_call: begin n:=cmd-call; 27682 if info(link(chr_code))=protected_token then n:=n+4; 27683 if odd(n div 4) then print_esc("protected"); 27684 if odd(n) then print_esc("long"); 27685 if odd(n div 2) then print_esc("outer"); 27686 if n>0 then print_char(" "); 27687 print("macro"); 27688 end; 27689end_template: print_esc("outer endtemplate"); 27690 27691@ @<Show the current contents of a box@>= 27692begin scan_register_num; fetch_box(p); begin_diagnostic; 27693print_nl("> \box"); print_int(cur_val); print_char("="); 27694if p=null then print("void")@+else show_box(p); 27695end 27696 27697@ @<Show the current value of some parameter...@>= 27698begin p:=the_toks; 27699if interaction=error_stop_mode then wake_up_terminal; 27700print_nl("> "); token_show(temp_head); 27701flush_list(link(temp_head)); goto common_ending; 27702end 27703 27704@ @<Complete a potentially long \.{\\show} command@>= 27705end_diagnostic(true); print_err("OK"); 27706@.OK@> 27707if selector=term_and_log then if tracing_online<=0 then 27708 begin selector:=term_only; print(" (see the transcript file)"); 27709 selector:=term_and_log; 27710 end 27711 27712@* \[50] Dumping and undumping the tables. 27713After \.{INITEX} has seen a collection of fonts and macros, it 27714can write all the necessary information on an auxiliary file so 27715that production versions of \TeX\ are able to initialize their 27716memory at high speed. The present section of the program takes 27717care of such output and input. We shall consider simultaneously 27718the processes of storing and restoring, 27719so that the inverse relation between them is clear. 27720@.INITEX@> 27721 27722The global variable |format_ident| is a string that is printed right 27723after the |banner| line when \TeX\ is ready to start. For \.{INITEX} this 27724string says simply `\.{(INITEX)}'; for other versions of \TeX\ it says, 27725for example, `\.{(preloaded format=plain 1982.11.19)}', showing the year, 27726month, and day that the format file was created. We have |format_ident=0| 27727before \TeX's tables are loaded. 27728 27729@<Glob...@>= 27730@!format_ident:str_number; 27731 27732@ @<Set init...@>= 27733format_ident:=0; 27734 27735@ @<Initialize table entries...@>= 27736format_ident:=" (INITEX)"; 27737 27738@ @<Declare act...@>= 27739@!init procedure store_fmt_file; 27740label found1,found2,done1,done2; 27741var j,@!k,@!l:integer; {all-purpose indices} 27742@!p,@!q: pointer; {all-purpose pointers} 27743@!x: integer; {something to dump} 27744@!w: four_quarters; {four ASCII codes} 27745begin @<If dumping is not allowed, abort@>; 27746@<Create the |format_ident|, open the format file, 27747 and inform the user that dumping has begun@>; 27748@<Dump constants for consistency check@>; 27749@<Dump the string pool@>; 27750@<Dump the dynamic memory@>; 27751@<Dump the table of equivalents@>; 27752@<Dump the font information@>; 27753@<Dump the hyphenation tables@>; 27754@<Dump a couple more things and the closing check word@>; 27755@<Close the format file@>; 27756end; 27757tini 27758 27759@ Corresponding to the procedure that dumps a format file, we have a function 27760that reads one in. The function returns |false| if the dumped format is 27761incompatible with the present \TeX\ table sizes, etc. 27762 27763@d bad_fmt=6666 {go here if the format file is unacceptable} 27764@d too_small(#)==begin wake_up_terminal; 27765 wterm_ln('---! Must increase the ',#); 27766@.Must increase the x@> 27767 goto bad_fmt; 27768 end 27769 27770@p @t\4@>@<Declare the function called |open_fmt_file|@>@; 27771function load_fmt_file:boolean; 27772label bad_fmt,exit; 27773var j,@!k:integer; {all-purpose indices} 27774@!p,@!q: pointer; {all-purpose pointers} 27775@!x: integer; {something undumped} 27776@!w: four_quarters; {four ASCII codes} 27777begin @<Undump constants for consistency check@>; 27778@<Undump the string pool@>; 27779@<Undump the dynamic memory@>; 27780@<Undump the table of equivalents@>; 27781@<Undump the font information@>; 27782@<Undump the hyphenation tables@>; 27783@<Undump a couple more things and the closing check word@>; 27784load_fmt_file:=true; return; {it worked!} 27785bad_fmt: wake_up_terminal; 27786 wterm_ln('(Fatal format file error; I''m stymied)'); 27787@.Fatal format file error@> 27788load_fmt_file:=false; 27789exit:end; 27790 27791@ The user is not allowed to dump a format file unless |save_ptr=0|. 27792This condition implies that |cur_level=level_one|, hence 27793the |xeq_level| array is constant and it need not be dumped. 27794 27795@<If dumping is not allowed, abort@>= 27796if save_ptr<>0 then 27797 begin print_err("You can't dump inside a group"); 27798@.You can't dump...@> 27799 help1("`{...\dump}' is a no-no."); succumb; 27800 end 27801 27802@ Format files consist of |memory_word| items, and we use the following 27803macros to dump words of different types: 27804 27805@d dump_wd(#)==begin fmt_file^:=#; put(fmt_file);@+end 27806@d dump_int(#)==begin fmt_file^.int:=#; put(fmt_file);@+end 27807@d dump_hh(#)==begin fmt_file^.hh:=#; put(fmt_file);@+end 27808@d dump_qqqq(#)==begin fmt_file^.qqqq:=#; put(fmt_file);@+end 27809 27810@<Glob...@>= 27811@!fmt_file:word_file; {for input or output of format information} 27812 27813@ The inverse macros are slightly more complicated, since we need to check 27814the range of the values we are reading in. We say `|undump(a)(b)(x)|' to 27815read an integer value |x| that is supposed to be in the range |a<=x<=b|. 27816 27817@d undump_wd(#)==begin get(fmt_file); #:=fmt_file^;@+end 27818@d undump_int(#)==begin get(fmt_file); #:=fmt_file^.int;@+end 27819@d undump_hh(#)==begin get(fmt_file); #:=fmt_file^.hh;@+end 27820@d undump_qqqq(#)==begin get(fmt_file); #:=fmt_file^.qqqq;@+end 27821@d undump_end_end(#)==#:=x;@+end 27822@d undump_end(#)==(x>#) then goto bad_fmt@+else undump_end_end 27823@d undump(#)==begin undump_int(x); if (x<#) or undump_end 27824@d undump_size_end_end(#)==too_small(#)@+else undump_end_end 27825@d undump_size_end(#)==if x># then undump_size_end_end 27826@d undump_size(#)==begin undump_int(x); 27827 if x<# then goto bad_fmt; undump_size_end 27828 27829@ The next few sections of the program should make it clear how we use the 27830dump/undump macros. 27831 27832@<Dump constants for consistency check@>= 27833dump_int(@$);@/ 27834@<Dump the \eTeX\ state@>@/ 27835dump_int(mem_bot);@/ 27836dump_int(mem_top);@/ 27837dump_int(eqtb_size);@/ 27838dump_int(hash_prime);@/ 27839dump_int(hyph_size) 27840 27841@ Sections of a \.{WEB} program that are ``commented out'' still contribute 27842strings to the string pool; therefore \.{INITEX} and \TeX\ will have 27843the same strings. (And it is, of course, a good thing that they do.) 27844@.WEB@> 27845@^string pool@> 27846 27847@<Undump constants for consistency check@>= 27848x:=fmt_file^.int; 27849if x<>@$ then goto bad_fmt; {check that strings are the same} 27850@/@<Undump the \eTeX\ state@>@/ 27851undump_int(x); 27852if x<>mem_bot then goto bad_fmt; 27853undump_int(x); 27854if x<>mem_top then goto bad_fmt; 27855undump_int(x); 27856if x<>eqtb_size then goto bad_fmt; 27857undump_int(x); 27858if x<>hash_prime then goto bad_fmt; 27859undump_int(x); 27860if x<>hyph_size then goto bad_fmt 27861 27862@ @d dump_four_ASCII== 27863 w.b0:=qi(so(str_pool[k])); w.b1:=qi(so(str_pool[k+1])); 27864 w.b2:=qi(so(str_pool[k+2])); w.b3:=qi(so(str_pool[k+3])); 27865 dump_qqqq(w) 27866 27867@<Dump the string pool@>= 27868dump_int(pool_ptr); 27869dump_int(str_ptr); 27870for k:=0 to str_ptr do dump_int(str_start[k]); 27871k:=0; 27872while k+4<pool_ptr do 27873 begin dump_four_ASCII; k:=k+4; 27874 end; 27875k:=pool_ptr-4; dump_four_ASCII; 27876print_ln; print_int(str_ptr); print(" strings of total length "); 27877print_int(pool_ptr) 27878 27879@ @d undump_four_ASCII== 27880 undump_qqqq(w); 27881 str_pool[k]:=si(qo(w.b0)); str_pool[k+1]:=si(qo(w.b1)); 27882 str_pool[k+2]:=si(qo(w.b2)); str_pool[k+3]:=si(qo(w.b3)) 27883 27884@<Undump the string pool@>= 27885undump_size(0)(pool_size)('string pool size')(pool_ptr); 27886undump_size(0)(max_strings)('max strings')(str_ptr); 27887for k:=0 to str_ptr do undump(0)(pool_ptr)(str_start[k]); 27888k:=0; 27889while k+4<pool_ptr do 27890 begin undump_four_ASCII; k:=k+4; 27891 end; 27892k:=pool_ptr-4; undump_four_ASCII; 27893init_str_ptr:=str_ptr; init_pool_ptr:=pool_ptr 27894 27895@ By sorting the list of available spaces in the variable-size portion of 27896|mem|, we are usually able to get by without having to dump very much 27897of the dynamic memory. 27898 27899We recompute |var_used| and |dyn_used|, so that \.{INITEX} dumps valid 27900information even when it has not been gathering statistics. 27901 27902@<Dump the dynamic memory@>= 27903sort_avail; var_used:=0; 27904dump_int(lo_mem_max); dump_int(rover); 27905if eTeX_ex then for k:=int_val to inter_char_val do dump_int(sa_root[k]); 27906p:=mem_bot; q:=rover; x:=0; 27907repeat for k:=p to q+1 do dump_wd(mem[k]); 27908x:=x+q+2-p; var_used:=var_used+q-p; 27909p:=q+node_size(q); q:=rlink(q); 27910until q=rover; 27911var_used:=var_used+lo_mem_max-p; dyn_used:=mem_end+1-hi_mem_min;@/ 27912for k:=p to lo_mem_max do dump_wd(mem[k]); 27913x:=x+lo_mem_max+1-p; 27914dump_int(hi_mem_min); dump_int(avail); 27915for k:=hi_mem_min to mem_end do dump_wd(mem[k]); 27916x:=x+mem_end+1-hi_mem_min; 27917p:=avail; 27918while p<>null do 27919 begin decr(dyn_used); p:=link(p); 27920 end; 27921dump_int(var_used); dump_int(dyn_used); 27922print_ln; print_int(x); 27923print(" memory locations dumped; current usage is "); 27924print_int(var_used); print_char("&"); print_int(dyn_used) 27925 27926@ @<Undump the dynamic memory@>= 27927undump(lo_mem_stat_max+1000)(hi_mem_stat_min-1)(lo_mem_max); 27928undump(lo_mem_stat_max+1)(lo_mem_max)(rover); 27929if eTeX_ex then for k:=int_val to inter_char_val do 27930 undump(null)(lo_mem_max)(sa_root[k]); 27931p:=mem_bot; q:=rover; 27932repeat for k:=p to q+1 do undump_wd(mem[k]); 27933p:=q+node_size(q); 27934if (p>lo_mem_max)or((q>=rlink(q))and(rlink(q)<>rover)) then goto bad_fmt; 27935q:=rlink(q); 27936until q=rover; 27937for k:=p to lo_mem_max do undump_wd(mem[k]); 27938if mem_min<mem_bot-2 then {make more low memory available} 27939 begin p:=llink(rover); q:=mem_min+1; 27940 link(mem_min):=null; info(mem_min):=null; {we don't use the bottom word} 27941 rlink(p):=q; llink(rover):=q;@/ 27942 rlink(q):=rover; llink(q):=p; link(q):=empty_flag; 27943 node_size(q):=mem_bot-q; 27944 end; 27945undump(lo_mem_max+1)(hi_mem_stat_min)(hi_mem_min); 27946undump(null)(mem_top)(avail); mem_end:=mem_top; 27947for k:=hi_mem_min to mem_end do undump_wd(mem[k]); 27948undump_int(var_used); undump_int(dyn_used) 27949 27950@ @<Dump the table of equivalents@>= 27951@<Dump regions 1 to 4 of |eqtb|@>; 27952@<Dump regions 5 and 6 of |eqtb|@>; 27953dump_int(par_loc); dump_int(write_loc);@/ 27954@<Dump the hash table@> 27955 27956@ @<Undump the table of equivalents@>= 27957@<Undump regions 1 to 6 of |eqtb|@>; 27958undump(hash_base)(frozen_control_sequence)(par_loc); 27959par_token:=cs_token_flag+par_loc;@/ 27960undump(hash_base)(frozen_control_sequence)(write_loc);@/ 27961@<Undump the hash table@> 27962 27963@ The table of equivalents usually contains repeated information, so we dump it 27964in compressed form: The sequence of $n+2$ values $(n,x_1,\ldots,x_n,m)$ in the 27965format file represents $n+m$ consecutive entries of |eqtb|, with |m| extra 27966copies of $x_n$, namely $(x_1,\ldots,x_n,x_n,\ldots,x_n)$. 27967 27968@<Dump regions 1 to 4 of |eqtb|@>= 27969k:=active_base; 27970repeat j:=k; 27971while j<int_base-1 do 27972 begin if (equiv(j)=equiv(j+1))and(eq_type(j)=eq_type(j+1))and@| 27973 (eq_level(j)=eq_level(j+1)) then goto found1; 27974 incr(j); 27975 end; 27976l:=int_base; goto done1; {|j=int_base-1|} 27977found1: incr(j); l:=j; 27978while j<int_base-1 do 27979 begin if (equiv(j)<>equiv(j+1))or(eq_type(j)<>eq_type(j+1))or@| 27980 (eq_level(j)<>eq_level(j+1)) then goto done1; 27981 incr(j); 27982 end; 27983done1:dump_int(l-k); 27984while k<l do 27985 begin dump_wd(eqtb[k]); incr(k); 27986 end; 27987k:=j+1; dump_int(k-l); 27988until k=int_base 27989 27990@ @<Dump regions 5 and 6 of |eqtb|@>= 27991repeat j:=k; 27992while j<eqtb_size do 27993 begin if eqtb[j].int=eqtb[j+1].int then goto found2; 27994 incr(j); 27995 end; 27996l:=eqtb_size+1; goto done2; {|j=eqtb_size|} 27997found2: incr(j); l:=j; 27998while j<eqtb_size do 27999 begin if eqtb[j].int<>eqtb[j+1].int then goto done2; 28000 incr(j); 28001 end; 28002done2:dump_int(l-k); 28003while k<l do 28004 begin dump_wd(eqtb[k]); incr(k); 28005 end; 28006k:=j+1; dump_int(k-l); 28007until k>eqtb_size 28008 28009@ @<Undump regions 1 to 6 of |eqtb|@>= 28010k:=active_base; 28011repeat undump_int(x); 28012if (x<1)or(k+x>eqtb_size+1) then goto bad_fmt; 28013for j:=k to k+x-1 do undump_wd(eqtb[j]); 28014k:=k+x; 28015undump_int(x); 28016if (x<0)or(k+x>eqtb_size+1) then goto bad_fmt; 28017for j:=k to k+x-1 do eqtb[j]:=eqtb[k-1]; 28018k:=k+x; 28019until k>eqtb_size 28020 28021@ A different scheme is used to compress the hash table, since its lower 28022region is usually sparse. When |text(p)<>0| for |p<=hash_used|, we output 28023two words, |p| and |hash[p]|. The hash table is, of course, densely packed 28024for |p>=hash_used|, so the remaining entries are output in a~block. 28025 28026@<Dump the hash table@>= 28027for p:=0 to prim_size do dump_hh(prim[p]); 28028for p:=0 to prim_size do dump_wd(prim_eqtb[p]); 28029dump_int(hash_used); cs_count:=frozen_control_sequence-1-hash_used; 28030for p:=hash_base to hash_used do if text(p)<>0 then 28031 begin dump_int(p); dump_hh(hash[p]); incr(cs_count); 28032 end; 28033for p:=hash_used+1 to undefined_control_sequence-1 do dump_hh(hash[p]); 28034dump_int(cs_count);@/ 28035print_ln; print_int(cs_count); print(" multiletter control sequences") 28036 28037@ @<Undump the hash table@>= 28038for p:=0 to prim_size do undump_hh(prim[p]); 28039for p:=0 to prim_size do undump_wd(prim_eqtb[p]); 28040undump(hash_base)(frozen_control_sequence)(hash_used); p:=hash_base-1; 28041repeat undump(p+1)(hash_used)(p); undump_hh(hash[p]); 28042until p=hash_used; 28043for p:=hash_used+1 to undefined_control_sequence-1 do undump_hh(hash[p]); 28044undump_int(cs_count) 28045 28046@ @<Dump the font information@>= 28047dump_int(fmem_ptr); 28048for k:=0 to fmem_ptr-1 do dump_wd(font_info[k]); 28049dump_int(font_ptr); 28050for k:=null_font to font_ptr do 28051 @<Dump the array info for internal font number |k|@>; 28052print_ln; print_int(fmem_ptr-7); print(" words of font info for "); 28053print_int(font_ptr-font_base); print(" preloaded font"); 28054if font_ptr<>font_base+1 then print_char("s") 28055 28056@ @<Undump the font information@>= 28057undump_size(7)(font_mem_size)('font mem size')(fmem_ptr); 28058for k:=0 to fmem_ptr-1 do undump_wd(font_info[k]); 28059undump_size(font_base)(font_max)('font max')(font_ptr); 28060for k:=null_font to font_ptr do 28061 @<Undump the array info for internal font number |k|@> 28062 28063@ @<Dump the array info for internal font number |k|@>= 28064begin dump_qqqq(font_check[k]); 28065dump_int(font_size[k]); 28066dump_int(font_dsize[k]); 28067dump_int(font_params[k]);@/ 28068dump_int(hyphen_char[k]); 28069dump_int(skew_char[k]);@/ 28070dump_int(font_name[k]); 28071dump_int(font_area[k]);@/ 28072dump_int(font_bc[k]); 28073dump_int(font_ec[k]);@/ 28074dump_int(char_base[k]); 28075dump_int(width_base[k]); 28076dump_int(height_base[k]);@/ 28077dump_int(depth_base[k]); 28078dump_int(italic_base[k]); 28079dump_int(lig_kern_base[k]);@/ 28080dump_int(kern_base[k]); 28081dump_int(exten_base[k]); 28082dump_int(param_base[k]);@/ 28083dump_int(font_glue[k]);@/ 28084dump_int(bchar_label[k]); 28085dump_int(font_bchar[k]); 28086dump_int(font_false_bchar[k]);@/ 28087print_nl("\font"); print_esc(font_id_text(k)); print_char("="); 28088print_file_name(font_name[k],font_area[k],""); 28089if font_size[k]<>font_dsize[k] then 28090 begin print(" at "); print_scaled(font_size[k]); print("pt"); 28091 end; 28092end 28093 28094@ @<Undump the array info for internal font number |k|@>= 28095begin undump_qqqq(font_check[k]);@/ 28096undump_int(font_size[k]); 28097undump_int(font_dsize[k]); 28098undump(min_halfword)(max_halfword)(font_params[k]);@/ 28099undump_int(hyphen_char[k]); 28100undump_int(skew_char[k]);@/ 28101undump(0)(str_ptr)(font_name[k]); 28102undump(0)(str_ptr)(font_area[k]);@/ 28103undump(0)(255)(font_bc[k]); 28104undump(0)(255)(font_ec[k]);@/ 28105undump_int(char_base[k]); 28106undump_int(width_base[k]); 28107undump_int(height_base[k]);@/ 28108undump_int(depth_base[k]); 28109undump_int(italic_base[k]); 28110undump_int(lig_kern_base[k]);@/ 28111undump_int(kern_base[k]); 28112undump_int(exten_base[k]); 28113undump_int(param_base[k]);@/ 28114undump(min_halfword)(lo_mem_max)(font_glue[k]);@/ 28115undump(0)(fmem_ptr-1)(bchar_label[k]); 28116undump(min_quarterword)(non_char)(font_bchar[k]); 28117undump(min_quarterword)(non_char)(font_false_bchar[k]); 28118end 28119 28120@ @<Dump the hyphenation tables@>= 28121dump_int(hyph_count); 28122for k:=0 to hyph_size do if hyph_word[k]<>0 then 28123 begin dump_int(k); dump_int(hyph_word[k]); dump_int(hyph_list[k]); 28124 end; 28125print_ln; print_int(hyph_count); print(" hyphenation exception"); 28126if hyph_count<>1 then print_char("s"); 28127if trie_not_ready then init_trie; 28128dump_int(trie_max); 28129dump_int(hyph_start); 28130for k:=0 to trie_max do dump_hh(trie[k]); 28131dump_int(max_hyph_char); 28132dump_int(trie_op_ptr); 28133for k:=1 to trie_op_ptr do 28134 begin dump_int(hyf_distance[k]); 28135 dump_int(hyf_num[k]); 28136 dump_int(hyf_next[k]); 28137 end; 28138print_nl("Hyphenation trie of length "); print_int(trie_max); 28139@.Hyphenation trie...@> 28140print(" has "); print_int(trie_op_ptr); print(" op"); 28141if trie_op_ptr<>1 then print_char("s"); 28142print(" out of "); print_int(trie_op_size); 28143for k:=biggest_lang downto 0 do if trie_used[k]>min_quarterword then 28144 begin print_nl(" "); print_int(qo(trie_used[k])); 28145 print(" for language "); print_int(k); 28146 dump_int(k); dump_int(qo(trie_used[k])); 28147 end 28148 28149@ Only ``nonempty'' parts of |op_start| need to be restored. 28150 28151@<Undump the hyphenation tables@>= 28152undump(0)(hyph_size)(hyph_count); 28153for k:=1 to hyph_count do 28154 begin undump(0)(hyph_size)(j); 28155 undump(0)(str_ptr)(hyph_word[j]); 28156 undump(min_halfword)(max_halfword)(hyph_list[j]); 28157 end; 28158undump_size(0)(trie_size)('trie size')(j); @+init trie_max:=j;@+tini 28159undump(0)(j)(hyph_start); 28160for k:=0 to j do undump_hh(trie[k]); 28161undump_int(max_hyph_char); 28162undump_size(0)(trie_op_size)('trie op size')(j); @+init trie_op_ptr:=j;@+tini 28163for k:=1 to j do 28164 begin undump(0)(63)(hyf_distance[k]); {a |small_number|} 28165 undump(0)(63)(hyf_num[k]); 28166 undump(min_quarterword)(max_quarterword)(hyf_next[k]); 28167 end; 28168init for k:=0 to biggest_lang do trie_used[k]:=min_quarterword;@+tini@;@/ 28169k:=biggest_lang+1; 28170while j>0 do 28171 begin undump(0)(k-1)(k); undump(1)(j)(x);@+init trie_used[k]:=qi(x);@+tini@;@/ 28172 j:=j-x; op_start[k]:=qo(j); 28173 end; 28174@!init trie_not_ready:=false @+tini 28175 28176@ We have already printed a lot of statistics, so we set |tracing_stats:=0| 28177to prevent them from appearing again. 28178 28179@<Dump a couple more things and the closing check word@>= 28180dump_int(interaction); dump_int(format_ident); dump_int(69069); 28181tracing_stats:=0 28182 28183@ @<Undump a couple more things and the closing check word@>= 28184undump(batch_mode)(error_stop_mode)(interaction); 28185undump(0)(str_ptr)(format_ident); 28186undump_int(x); 28187if (x<>69069)or eof(fmt_file) then goto bad_fmt 28188 28189@ @<Create the |format_ident|...@>= 28190selector:=new_string; 28191print(" (preloaded format="); print(job_name); print_char(" "); 28192print_int(year); print_char("."); 28193print_int(month); print_char("."); print_int(day); print_char(")"); 28194if interaction=batch_mode then selector:=log_only 28195else selector:=term_and_log; 28196str_room(1); 28197format_ident:=make_string; 28198pack_job_name(format_extension); 28199while not w_open_out(fmt_file) do 28200 prompt_file_name("format file name",format_extension); 28201print_nl("Beginning to dump on file "); 28202@.Beginning to dump...@> 28203slow_print(w_make_name_string(fmt_file)); flush_string; 28204print_nl(""); slow_print(format_ident) 28205 28206@ @<Close the format file@>= 28207w_close(fmt_file) 28208 28209@* \[51] The main program. 28210This is it: the part of \TeX\ that executes all those procedures we have 28211written. 28212 28213Well---almost. Let's leave space for a few more routines that we may 28214have forgotten. 28215 28216@p @<Last-minute procedures@> 28217 28218@ We have noted that there are two versions of \TeX82. One, called \.{INITEX}, 28219@.INITEX@> 28220has to be run first; it initializes everything from scratch, without 28221reading a format file, and it has the capability of dumping a format file. 28222The other one is called `\.{VIRTEX}'; it is a ``virgin'' program that needs 28223@.VIRTEX@> 28224to input a format file in order to get started. \.{VIRTEX} typically has 28225more memory capacity than \.{INITEX}, because it does not need the space 28226consumed by the auxiliary hyphenation tables and the numerous calls on 28227|primitive|, etc. 28228 28229The \.{VIRTEX} program cannot read a format file instantaneously, of course; 28230the best implementations therefore allow for production versions of \TeX\ that 28231not only avoid the loading routine for \PASCAL\ object code, they also have 28232a format file pre-loaded. This is impossible to do if we stick to standard 28233\PASCAL; but there is a simple way to fool many systems into avoiding the 28234initialization, as follows:\quad(1)~We declare a global integer variable 28235called |ready_already|. The probability is negligible that this 28236variable holds any particular value like 314159 when \.{VIRTEX} is first 28237loaded.\quad(2)~After we have read in a format file and initialized 28238everything, we set |ready_already:=314159|.\quad(3)~Soon \.{VIRTEX} 28239will print `\.*', waiting for more input; and at this point we 28240interrupt the program and save its core image in some form that the 28241operating system can reload speedily.\quad(4)~When that core image is 28242activated, the program starts again at the beginning; but now 28243|ready_already=314159| and all the other global variables have 28244their initial values too. The former chastity has vanished! 28245 28246In other words, if we allow ourselves to test the condition 28247|ready_already=314159|, before |ready_already| has been 28248assigned a value, we can avoid the lengthy initialization. Dirty tricks 28249rarely pay off so handsomely. 28250@^dirty \PASCAL@> 28251@^system dependencies@> 28252 28253On systems that allow such preloading, the standard program called \.{TeX} 28254should be the one that has \.{plain} format preloaded, since that agrees 28255with {\sl The \TeX book}. Other versions, e.g., \.{AmSTeX}, should also 28256@:TeXbook}{\sl The \TeX book@> 28257@.AmSTeX@> 28258@.plain@> 28259be provided for commonly used formats. 28260 28261@<Glob...@>= 28262@!ready_already:integer; {a sacrifice of purity for economy} 28263 28264@ Now this is really it: \TeX\ starts and ends here. 28265 28266The initial test involving |ready_already| should be deleted if the 28267\PASCAL\ runtime system is smart enough to detect such a ``mistake.'' 28268@^system dependencies@> 28269 28270@p begin @!{|start_here|} 28271history:=fatal_error_stop; {in case we quit during initialization} 28272t_open_out; {open the terminal for output} 28273if ready_already=314159 then goto start_of_TEX; 28274@<Check the ``constant'' values...@>@; 28275if bad>0 then 28276 begin wterm_ln('Ouch---my internal constants have been clobbered!', 28277 '---case ',bad:1); 28278@.Ouch...clobbered@> 28279 goto final_end; 28280 end; 28281initialize; {set global variables to their starting values} 28282@!init if not get_strings_started then goto final_end; 28283init_prim; {call |primitive| for each primitive} 28284init_str_ptr:=str_ptr; init_pool_ptr:=pool_ptr; fix_date_and_time; 28285tini@/ 28286ready_already:=314159; 28287start_of_TEX: @<Initialize the output routines@>; 28288@<Get the first line of input and prepare to start@>; 28289history:=spotless; {ready to go!} 28290main_control; {come to life} 28291final_cleanup; {prepare for death} 28292end_of_TEX: close_files_and_terminate; 28293final_end: ready_already:=0; 28294end. 28295 28296@ Here we do whatever is needed to complete \TeX's job gracefully on the 28297local operating system. The code here might come into play after a fatal 28298error; it must therefore consist entirely of ``safe'' operations that 28299cannot produce error messages. For example, it would be a mistake to call 28300|str_room| or |make_string| at this time, because a call on |overflow| 28301might lead to an infinite loop. 28302@^system dependencies@> 28303 28304Actually there's one way to get error messages, via |prepare_mag|; 28305but that can't cause infinite recursion. 28306@^recursion@> 28307 28308This program doesn't bother to close the input files that may still be open. 28309 28310@<Last-minute...@>= 28311procedure close_files_and_terminate; 28312var k:integer; {all-purpose index} 28313begin @<Finish the extensions@>; 28314@!stat if tracing_stats>0 then @<Output statistics about this job@>;@;@+tats@/ 28315wake_up_terminal; @<Finish the \.{DVI} file@>; 28316if log_opened then 28317 begin wlog_cr; a_close(log_file); selector:=selector-2; 28318 if selector=term_only then 28319 begin print_nl("Transcript written on "); 28320@.Transcript written...@> 28321 slow_print(log_name); print_char("."); 28322 end; 28323 end; 28324end; 28325 28326@ The present section goes directly to the log file instead of using 28327|print| commands, because there's no need for these strings to take 28328up |str_pool| memory when a non-{\bf stat} version of \TeX\ is being used. 28329 28330@<Output statistics...@>= 28331if log_opened then 28332 begin wlog_ln(' '); 28333 wlog_ln('Here is how much of TeX''s memory',' you used:'); 28334@.Here is how much...@> 28335 wlog(' ',str_ptr-init_str_ptr:1,' string'); 28336 if str_ptr<>init_str_ptr+1 then wlog('s'); 28337 wlog_ln(' out of ', max_strings-init_str_ptr:1);@/ 28338 wlog_ln(' ',pool_ptr-init_pool_ptr:1,' string characters out of ', 28339 pool_size-init_pool_ptr:1);@/ 28340 wlog_ln(' ',lo_mem_max-mem_min+mem_end-hi_mem_min+2:1,@| 28341 ' words of memory out of ',mem_end+1-mem_min:1);@/ 28342 wlog_ln(' ',cs_count:1,' multiletter control sequences out of ', 28343 hash_size:1);@/ 28344 wlog(' ',fmem_ptr:1,' words of font info for ', 28345 font_ptr-font_base:1,' font'); 28346 if font_ptr<>font_base+1 then wlog('s'); 28347 wlog_ln(', out of ',font_mem_size:1,' for ',font_max-font_base:1);@/ 28348 wlog(' ',hyph_count:1,' hyphenation exception'); 28349 if hyph_count<>1 then wlog('s'); 28350 wlog_ln(' out of ',hyph_size:1);@/ 28351 wlog_ln(' ',max_in_stack:1,'i,',max_nest_stack:1,'n,',@| 28352 max_param_stack:1,'p,',@| 28353 max_buf_stack+1:1,'b,',@| 28354 max_save_stack+6:1,'s stack positions out of ',@| 28355 stack_size:1,'i,', 28356 nest_size:1,'n,', 28357 param_size:1,'p,', 28358 buf_size:1,'b,', 28359 save_size:1,'s'); 28360 end 28361 28362@ We get to the |final_cleanup| routine when \.{\\end} or \.{\\dump} has 28363been scanned and |its_all_over|\kern-2pt. 28364 28365@<Last-minute...@>= 28366procedure final_cleanup; 28367label exit; 28368var c:small_number; {0 for \.{\\end}, 1 for \.{\\dump}} 28369begin c:=cur_chr; 28370if job_name=0 then open_log_file; 28371while input_ptr>0 do 28372 if state=token_list then end_token_list@+else end_file_reading; 28373while open_parens>0 do 28374 begin print(" )"); decr(open_parens); 28375 end; 28376if cur_level>level_one then 28377 begin print_nl("("); print_esc("end occurred "); 28378 print("inside a group at level "); 28379@:end_}{\.{(\\end occurred...)}@> 28380 print_int(cur_level-level_one); print_char(")"); 28381 if eTeX_ex then show_save_groups; 28382 end; 28383while cond_ptr<>null do 28384 begin print_nl("("); print_esc("end occurred "); 28385 print("when "); print_cmd_chr(if_test,cur_if); 28386 if if_line<>0 then 28387 begin print(" on line "); print_int(if_line); 28388 end; 28389 print(" was incomplete)"); 28390 if_line:=if_line_field(cond_ptr); 28391 cur_if:=subtype(cond_ptr); temp_ptr:=cond_ptr; 28392 cond_ptr:=link(cond_ptr); free_node(temp_ptr,if_node_size); 28393 end; 28394if history<>spotless then 28395 if ((history=warning_issued)or(interaction<error_stop_mode)) then 28396 if selector=term_and_log then 28397 begin selector:=term_only; 28398 print_nl("(see the transcript file for additional information)"); 28399@.see the transcript file...@> 28400 selector:=term_and_log; 28401 end; 28402if c=1 then 28403 begin @!init for c:=top_mark_code to split_bot_mark_code do 28404 if cur_mark[c]<>null then delete_token_ref(cur_mark[c]); 28405 if sa_mark<>null then 28406 if do_marks(destroy_marks,0,sa_mark) then sa_mark:=null; 28407 for c:=last_box_code to vsplit_code do flush_node_list(disc_ptr[c]); 28408 if last_glue<>max_halfword then delete_glue_ref(last_glue); 28409 store_fmt_file; return;@+tini@/ 28410 print_nl("(\dump is performed only by INITEX)"); return; 28411@:dump_}{\.{\\dump...only by INITEX}@> 28412 end; 28413exit:end; 28414 28415@ @<Last-minute...@>= 28416@!init procedure init_prim; {initialize all the primitives} 28417begin no_new_control_sequence:=false; 28418first:=0; 28419@<Put each...@>; 28420no_new_control_sequence:=true; 28421end; 28422tini 28423 28424@ When we begin the following code, \TeX's tables may still contain garbage; 28425the strings might not even be present. Thus we must proceed cautiously to get 28426bootstrapped in. 28427 28428But when we finish this part of the program, \TeX\ is ready to call on the 28429|main_control| routine to do its work. 28430 28431@<Get the first line...@>= 28432begin @<Initialize the input routines@>; 28433@<Enable \eTeX, if requested@>@;@/ 28434if (format_ident=0)or(buffer[loc]="&") then 28435 begin if format_ident<>0 then initialize; {erase preloaded format} 28436 if not open_fmt_file then goto final_end; 28437 if not load_fmt_file then 28438 begin w_close(fmt_file); goto final_end; 28439 end; 28440 w_close(fmt_file); 28441 while (loc<limit)and(buffer[loc]=" ") do incr(loc); 28442 end; 28443if eTeX_ex then wterm_ln('entering extended mode'); 28444if end_line_char_inactive then decr(limit) 28445else buffer[limit]:=end_line_char; 28446fix_date_and_time;@/ 28447@<Compute the magic offset@>; 28448@<Initialize the print |selector|...@>; 28449if (loc<limit)and(cat_code(buffer[loc])<>escape) then start_input; 28450 {\.{\\input} assumed} 28451end 28452 28453@* \[52] Debugging. 28454Once \TeX\ is working, you should be able to diagnose most errors with 28455the \.{\\show} commands and other diagnostic features. But for the initial 28456stages of debugging, and for the revelation of really deep mysteries, you 28457can compile \TeX\ with a few more aids, including the \PASCAL\ runtime 28458checks and its debugger. An additional routine called |debug_help| 28459will also come into play when you type `\.D' after an error message; 28460|debug_help| also occurs just before a fatal error causes \TeX\ to succumb. 28461@^debugging@> 28462@^system dependencies@> 28463 28464The interface to |debug_help| is primitive, but it is good enough when used 28465with a \PASCAL\ debugger that allows you to set breakpoints and to read 28466variables and change their values. After getting the prompt `\.{debug \#}', you 28467type either a negative number (this exits |debug_help|), or zero (this 28468goes to a location where you can set a breakpoint, thereby entering into 28469dialog with the \PASCAL\ debugger), or a positive number |m| followed by 28470an argument |n|. The meaning of |m| and |n| will be clear from the 28471program below. (If |m=13|, there is an additional argument, |l|.) 28472@.debug \#@> 28473 28474@d breakpoint=888 {place where a breakpoint is desirable} 28475 28476@<Last-minute...@>= 28477@!debug procedure debug_help; {routine to display various things} 28478label breakpoint,exit; 28479var k,@!l,@!m,@!n:integer; 28480begin loop begin wake_up_terminal; 28481 print_nl("debug # (-1 to exit):"); update_terminal; 28482@.debug \#@> 28483 read(term_in,m); 28484 if m<0 then return 28485 else if m=0 then 28486 begin goto breakpoint;@\ {go to every label at least once} 28487 breakpoint: m:=0; @{'BREAKPOINT'@}@\ 28488 end 28489 else begin read(term_in,n); 28490 case m of 28491 @t\4@>@<Numbered cases for |debug_help|@>@; 28492 othercases print("?") 28493 endcases; 28494 end; 28495 end; 28496exit:end; 28497gubed 28498 28499@ @<Numbered cases...@>= 285001: print_word(mem[n]); {display |mem[n]| in all forms} 285012: print_int(info(n)); 285023: print_int(link(n)); 285034: print_word(eqtb[n]); 285045: print_word(font_info[n]); 285056: print_word(save_stack[n]); 285067: show_box(n); 28507 {show a box, abbreviated by |show_box_depth| and |show_box_breadth|} 285088: begin breadth_max:=10000; depth_threshold:=pool_size-pool_ptr-10; 28509 show_node_list(n); {show a box in its entirety} 28510 end; 285119: show_token_list(n,null,1000); 2851210: slow_print(n); 2851311: check_mem(n>0); {check wellformedness; print new busy locations if |n>0|} 2851412: search_mem(n); {look for pointers to |n|} 2851513: begin read(term_in,l); print_cmd_chr(n,l); 28516 end; 2851714: for k:=0 to n do print(buffer[k]); 2851815: begin font_in_short_display:=null_font; short_display(n); 28519 end; 2852016: panicking:=not panicking; 28521 28522@* \[53] Extensions. 28523The program above includes a bunch of ``hooks'' that allow further 28524capabilities to be added without upsetting \TeX's basic structure. 28525Most of these hooks are concerned with ``whatsit'' nodes, which are 28526intended to be used for special purposes; whenever a new extension to 28527\TeX\ involves a new kind of whatsit node, a corresponding change needs 28528to be made to the routines below that deal with such nodes, 28529but it will usually be unnecessary to make many changes to the 28530other parts of this program. 28531 28532In order to demonstrate how extensions can be made, we shall treat 28533`\.{\\write}', `\.{\\openout}', `\.{\\closeout}', `\.{\\immediate}', 28534`\.{\\special}', and `\.{\\setlanguage}' as if they were extensions. 28535These commands are actually primitives of \TeX, and they should 28536appear in all implementations of the system; but let's try to imagine 28537that they aren't. Then the program below illustrates how a person 28538could add them. 28539 28540Sometimes, of course, an extension will require changes to \TeX\ itself; 28541no system of hooks could be complete enough for all conceivable extensions. 28542The features associated with `\.{\\write}' are almost all confined to the 28543following paragraphs, but there are small parts of the |print_ln| and 28544|print_char| procedures that were introduced specifically to \.{\\write} 28545characters. Furthermore one of the token lists recognized by the scanner 28546is a |write_text|; and there are a few other miscellaneous places where we 28547have already provided for some aspect of \.{\\write}. The goal of a \TeX\ 28548extender should be to minimize alterations to the standard parts of the 28549program, and to avoid them completely if possible. He or she should also 28550be quite sure that there's no easy way to accomplish the desired goals 28551with the standard features that \TeX\ already has. ``Think thrice before 28552extending,'' because that may save a lot of work, and it will also keep 28553incompatible extensions of \TeX\ from proliferating. 28554@^system dependencies@> 28555@^extensions to \TeX@> 28556 28557@ First let's consider the format of whatsit nodes that are used to represent 28558the data associated with \.{\\write} and its relatives. Recall that a whatsit 28559has |type=whatsit_node|, and the |subtype| is supposed to distinguish 28560different kinds of whatsits. Each node occupies two or more words; the 28561exact number is immaterial, as long as it is readily determined from the 28562|subtype| or other data. 28563 28564We shall introduce five |subtype| values here, corresponding to the 28565control sequences \.{\\openout}, \.{\\write}, \.{\\closeout}, \.{\\special}, and 28566\.{\\setlanguage}. The second word of I/O whatsits has a |write_stream| field 28567that identifies the write-stream number (0 to 15, or 16 for out-of-range and 28568positive, or 17 for out-of-range and negative). 28569In the case of \.{\\write} and \.{\\special}, there is also a field that 28570points to the reference count of a token list that should be sent. In the 28571case of \.{\\openout}, we need three words and three auxiliary subfields 28572to hold the string numbers for name, area, and extension. 28573 28574@d write_node_size=2 {number of words in a write/whatsit node} 28575@d open_node_size=3 {number of words in an open/whatsit node} 28576@d open_node=0 {|subtype| in whatsits that represent files to \.{\\openout}} 28577@d write_node=1 {|subtype| in whatsits that represent things to \.{\\write}} 28578@d close_node=2 {|subtype| in whatsits that represent streams to \.{\\closeout}} 28579@d special_node=3 {|subtype| in whatsits that represent \.{\\special} things} 28580@d language_node=4 {|subtype| in whatsits that change the current language} 28581@d what_lang(#)==link(#+1) {language number, in the range |0..255|} 28582@d what_lhm(#)==type(#+1) {minimum left fragment, in the range |1..63|} 28583@d what_rhm(#)==subtype(#+1) {minimum right fragment, in the range |1..63|} 28584@d write_tokens(#) == link(#+1) {reference count of token list to write} 28585@d write_stream(#) == info(#+1) {stream number (0 to 17)} 28586@d open_name(#) == link(#+1) {string number of file name to open} 28587@d open_area(#) == info(#+2) {string number of file area for |open_name|} 28588@d open_ext(#) == link(#+2) {string number of file extension for |open_name|} 28589 28590@ The sixteen possible \.{\\write} streams are represented by the |write_file| 28591array. The |j|th file is open if and only if |write_open[j]=true|. The last 28592two streams are special; |write_open[16]| represents a stream number 28593greater than 15, while |write_open[17]| represents a negative stream number, 28594and both of these variables are always |false|. 28595 28596@<Glob...@>= 28597@!write_file:array[0..15] of alpha_file; 28598@!write_open:array[0..17] of boolean; 28599 28600@ @<Set init...@>= 28601for k:=0 to 17 do write_open[k]:=false; 28602 28603@ Extensions might introduce new command codes; but it's best to use 28604|extension| with a modifier, whenever possible, so that |main_control| 28605stays the same. 28606 28607@d immediate_code=4 {command modifier for \.{\\immediate}} 28608@d set_language_code=5 {command modifier for \.{\\setlanguage}} 28609 28610@d pdftex_first_extension_code = 6 28611@d pdf_save_pos_node == pdftex_first_extension_code + 0 28612 28613@d pic_file_code=41 { command modifier for \.{\\XeTeXpicfile}, skipping codes pdfTeX might use } 28614@d pdf_file_code=42 { command modifier for \.{\\XeTeXpdffile} } 28615@d glyph_code=43 { command modifier for \.{\\XeTeXglyph} } 28616 28617@d XeTeX_input_encoding_extension_code=44 28618@d XeTeX_default_encoding_extension_code=45 28619@d XeTeX_linebreak_locale_extension_code=46 28620 28621@<Put each...@>= 28622primitive("openout",extension,open_node);@/ 28623@!@:open_out_}{\.{\\openout} primitive@> 28624primitive("write",extension,write_node); write_loc:=cur_val;@/ 28625@!@:write_}{\.{\\write} primitive@> 28626primitive("closeout",extension,close_node);@/ 28627@!@:close_out_}{\.{\\closeout} primitive@> 28628primitive("special",extension,special_node);@/ 28629@!@:special_}{\.{\\special} primitive@> 28630primitive("immediate",extension,immediate_code);@/ 28631@!@:immediate_}{\.{\\immediate} primitive@> 28632primitive("setlanguage",extension,set_language_code);@/ 28633@!@:set_language_}{\.{\\setlanguage} primitive@> 28634 28635@ The \.{\\XeTeXpicfile} and \.{\\XeTeXpdffile} primitives are only defined in extended mode. 28636 28637@<Generate all \eTeX\ primitives@>= 28638primitive("XeTeXpicfile",extension,pic_file_code);@/ 28639@!@:XeTeX_pic_file_}{\.{\\XeTeXpicfile} primitive@> 28640primitive("XeTeXpdffile",extension,pdf_file_code);@/ 28641@!@:XeTeX_pdf_file_}{\.{\\XeTeXpdffile} primitive@> 28642primitive("XeTeXglyph",extension,glyph_code);@/ 28643@!@:XeTeX_glyph_}{\.{\\XeTeXglyph} primitive@> 28644primitive("XeTeXlinebreaklocale", extension, XeTeX_linebreak_locale_extension_code);@/ 28645@!@:XeTeX_linebreak_locale_}{\.{\\XeTeXlinebreaklocale} primitive@> 28646primitive("XeTeXinterchartoks",assign_toks,XeTeX_inter_char_loc); 28647@!@:XeTeX_inter_char_toks_}{\.{\\XeTeXinterchartoks} primitive@> 28648@# 28649primitive("pdfsavepos",extension,pdf_save_pos_node);@/ 28650@!@:pdf_save_pos_}{\.{\\pdfsavepos} primitive@> 28651 28652@ The variable |write_loc| just introduced is used to provide an 28653appropriate error message in case of ``runaway'' write texts. 28654 28655@<Glob...@>= 28656@!write_loc:pointer; {|eqtb| address of \.{\\write}} 28657 28658@ @<Cases of |print_cmd_chr|...@>= 28659extension: case chr_code of 28660 open_node:print_esc("openout"); 28661 write_node:print_esc("write"); 28662 close_node:print_esc("closeout"); 28663 special_node:print_esc("special"); 28664 immediate_code:print_esc("immediate"); 28665 set_language_code:print_esc("setlanguage"); 28666 pic_file_code:print_esc("XeTeXpicfile"); 28667 pdf_file_code:print_esc("XeTeXpdffile"); 28668 glyph_code:print_esc("XeTeXglyph"); 28669 XeTeX_linebreak_locale_extension_code:print_esc("XeTeXlinebreaklocale"); 28670 XeTeX_input_encoding_extension_code:print_esc("XeTeXinputencoding"); 28671 XeTeX_default_encoding_extension_code:print_esc("XeTeXdefaultencoding"); 28672 pdf_save_pos_node: print_esc("pdfsavepos"); 28673 othercases print("[unknown extension!]") 28674 endcases; 28675 28676@ When an |extension| command occurs in |main_control|, in any mode, 28677the |do_extension| routine is called. 28678 28679@<Cases of |main_control| that are for extensions...@>= 28680any_mode(extension):do_extension; 28681 28682@ @<Declare act...@>= 28683@t\4@>@<Declare procedures needed in |do_extension|@>@; 28684procedure do_extension; 28685var i,@!j,@!k:integer; {all-purpose integers} 28686@!p,@!q,@!r:pointer; {all-purpose pointers} 28687begin case cur_chr of 28688open_node:@<Implement \.{\\openout}@>; 28689write_node:@<Implement \.{\\write}@>; 28690close_node:@<Implement \.{\\closeout}@>; 28691special_node:@<Implement \.{\\special}@>; 28692immediate_code:@<Implement \.{\\immediate}@>; 28693set_language_code:@<Implement \.{\\setlanguage}@>; 28694pic_file_code:@<Implement \.{\\XeTeXpicfile}@>; 28695pdf_file_code:@<Implement \.{\\XeTeXpdffile}@>; 28696glyph_code:@<Implement \.{\\XeTeXglyph}@>; 28697XeTeX_input_encoding_extension_code:@<Implement \.{\\XeTeXinputencoding}@>; 28698XeTeX_default_encoding_extension_code:@<Implement \.{\\XeTeXdefaultencoding}@>; 28699XeTeX_linebreak_locale_extension_code:@<Implement \.{\\XeTeXlinebreaklocale}@>; 28700 28701pdf_save_pos_node: @<Implement \.{\\pdfsavepos}@>; 28702othercases confusion("ext1") 28703@:this can't happen ext1}{\quad ext1@> 28704endcases; 28705end; 28706 28707@ Here is a subroutine that creates a whatsit node having a given |subtype| 28708and a given number of words. It initializes only the first word of the whatsit, 28709and appends it to the current list. 28710 28711@<Declare procedures needed in |do_extension|@>= 28712procedure new_whatsit(@!s:small_number;@!w:small_number); 28713var p:pointer; {the new node} 28714begin p:=get_node(w); type(p):=whatsit_node; subtype(p):=s; 28715link(tail):=p; tail:=p; 28716end; 28717 28718@ The next subroutine uses |cur_chr| to decide what sort of whatsit is 28719involved, and also inserts a |write_stream| number. 28720 28721@<Declare procedures needed in |do_ext...@>= 28722procedure new_write_whatsit(@!w:small_number); 28723begin new_whatsit(cur_chr,w); 28724if w<>write_node_size then scan_four_bit_int 28725else begin scan_int; 28726 if cur_val<0 then cur_val:=17 28727 else if cur_val>15 then cur_val:=16; 28728 end; 28729write_stream(tail):=cur_val; 28730end; 28731 28732@ @<Implement \.{\\openout}@>= 28733begin new_write_whatsit(open_node_size); 28734scan_optional_equals; scan_file_name;@/ 28735open_name(tail):=cur_name; open_area(tail):=cur_area; open_ext(tail):=cur_ext; 28736end 28737 28738@ When `\.{\\write 12\{...\}}' appears, we scan the token list `\.{\{...\}}' 28739without expanding its macros; the macros will be expanded later when this 28740token list is rescanned. 28741 28742@<Implement \.{\\write}@>= 28743begin k:=cur_cs; new_write_whatsit(write_node_size);@/ 28744cur_cs:=k; p:=scan_toks(false,false); write_tokens(tail):=def_ref; 28745end 28746 28747@ @<Implement \.{\\closeout}@>= 28748begin new_write_whatsit(write_node_size); write_tokens(tail):=null; 28749end 28750 28751@ When `\.{\\special\{...\}}' appears, we expand the macros in the token 28752list as in \.{\\xdef} and \.{\\mark}. 28753 28754@<Implement \.{\\special}@>= 28755begin new_whatsit(special_node,write_node_size); write_stream(tail):=null; 28756p:=scan_toks(false,true); write_tokens(tail):=def_ref; 28757end 28758 28759@ @d call_func(#) == begin if # <> 0 then do_nothing end 28760@d flushable(#) == (# = str_ptr - 1) 28761 28762@p procedure flush_str(s: str_number); {flush a string if possible} 28763begin 28764 if flushable(s) then 28765 flush_string; 28766end; 28767 28768function tokens_to_string(p: pointer): str_number; {return a string from tokens 28769list} 28770begin 28771 if selector = new_string then 28772 pdf_error("tokens", "tokens_to_string() called while selector = new_string"); 28773 old_setting:=selector; selector:=new_string; 28774 show_token_list(link(p),null,pool_size-pool_ptr); 28775 selector:=old_setting; 28776 tokens_to_string:=make_string; 28777end; 28778 28779procedure compare_strings; {to implement \.{\\strcmp}} 28780label done; 28781var s1, s2: str_number; 28782 i1, i2, j1, j2: pool_pointer; 28783begin 28784 call_func(scan_toks(false, true)); 28785 s1:=tokens_to_string(def_ref); 28786 delete_token_ref(def_ref); 28787 call_func(scan_toks(false, true)); 28788 s2:=tokens_to_string(def_ref); 28789 delete_token_ref(def_ref); 28790 i1:=str_start_macro(s1); 28791 j1:=str_start_macro(s1 + 1); 28792 i2:=str_start_macro(s2); 28793 j2:=str_start_macro(s2 + 1); 28794 while (i1 < j1) and (i2 < j2) do begin 28795 if str_pool[i1] < str_pool[i2] then begin 28796 cur_val:=-1; 28797 goto done; 28798 end; 28799 if str_pool[i1] > str_pool[i2] then begin 28800 cur_val:=1; 28801 goto done; 28802 end; 28803 incr(i1); 28804 incr(i2); 28805 end; 28806 if (i1 = j1) and (i2 = j2) then 28807 cur_val:=0 28808 else if i1 < j1 then 28809 cur_val:=1 28810 else 28811 cur_val:=-1; 28812done: 28813 flush_str(s2); 28814 flush_str(s1); 28815 cur_val_level:=int_val; 28816end; 28817 28818@ Each new type of node that appears in our data structure must be capable 28819of being displayed, copied, destroyed, and so on. The routines that we 28820need for write-oriented whatsits are somewhat like those for mark nodes; 28821other extensions might, of course, involve more subtlety here. 28822 28823@<Basic printing...@>= 28824procedure print_write_whatsit(@!s:str_number;@!p:pointer); 28825begin print_esc(s); 28826if write_stream(p)<16 then print_int(write_stream(p)) 28827else if write_stream(p)=16 then print_char("*") 28828@.*\relax@> 28829else print_char("-"); 28830end; 28831 28832procedure print_native_word(@!p:pointer); 28833var i,c,cc:integer; 28834begin 28835 for i:=0 to native_length(p) - 1 do begin 28836 c:=get_native_char(p,i); 28837 if (c >= @"D800) and (c <= @"DBFF) then begin 28838 if i < native_length(p) - 1 then begin 28839 cc:=get_native_char(p, i+1); 28840 if (cc >= @"DC00) and (cc <= @"DFFF) then begin 28841 c:=@"10000 + (c - @"D800) * @"400 + (cc - @"DC00); 28842 print_char(c); 28843 incr(i); 28844 end else 28845 print("."); 28846 end else 28847 print("."); 28848 end else 28849 print_char(c); 28850 end 28851end; 28852 28853@ @<Display the whatsit...@>= 28854case subtype(p) of 28855open_node:begin print_write_whatsit("openout",p); 28856 print_char("="); print_file_name(open_name(p),open_area(p),open_ext(p)); 28857 end; 28858write_node:begin print_write_whatsit("write",p); 28859 print_mark(write_tokens(p)); 28860 end; 28861close_node:print_write_whatsit("closeout",p); 28862special_node:begin print_esc("special"); 28863 print_mark(write_tokens(p)); 28864 end; 28865language_node:begin print_esc("setlanguage"); 28866 print_int(what_lang(p)); print(" (hyphenmin "); 28867 print_int(what_lhm(p)); print_char(","); 28868 print_int(what_rhm(p)); print_char(")"); 28869 end; 28870native_word_node:begin print_esc(font_id_text(native_font(p))); 28871 print_char(" "); 28872 print_native_word(p); 28873 end; 28874glyph_node:begin print_esc(font_id_text(native_font(p))); 28875 print(" glyph#"); 28876 print_int(native_glyph(p)); 28877 end; 28878pic_node,pdf_node: begin 28879 if subtype(p) = pic_node then print_esc("XeTeXpicfile") 28880 else print_esc("XeTeXpdffile"); 28881 print(" """); 28882 for i:=0 to pic_path_length(p)-1 do 28883 print_visible_char(pic_path_byte(p, i)); 28884 print(""""); 28885 end; 28886pdf_save_pos_node: print_esc("pdfsavepos"); 28887othercases print("whatsit?") 28888endcases 28889 28890@ Picture nodes are tricky in that they are variable size. 28891@d total_pic_node_size(#) == (pic_node_size + (pic_path_length(#) + sizeof(memory_word) - 1) div sizeof(memory_word)) 28892 28893@<Make a partial copy of the whatsit...@>= 28894case subtype(p) of 28895open_node: begin r:=get_node(open_node_size); words:=open_node_size; 28896 end; 28897write_node,special_node: begin r:=get_node(write_node_size); 28898 add_token_ref(write_tokens(p)); words:=write_node_size; 28899 end; 28900close_node,language_node: begin r:=get_node(small_node_size); 28901 words:=small_node_size; 28902 end; 28903native_word_node: begin words:=native_size(p); 28904 r:=get_node(words); 28905 while words > 0 do 28906 begin decr(words); mem[r+words]:=mem[p+words]; end; 28907 native_glyph_info_ptr(r):=null_ptr; native_glyph_count(r):=0; 28908 copy_native_glyph_info(p, r); 28909 end; 28910glyph_node: begin r:=get_node(glyph_node_size); 28911 words:=glyph_node_size; 28912 end; 28913pic_node,pdf_node: begin words:=total_pic_node_size(p); 28914 r:=get_node(words); 28915 end; 28916pdf_save_pos_node: 28917 r:=get_node(small_node_size); 28918othercases confusion("ext2") 28919@:this can't happen ext2}{\quad ext2@> 28920endcases 28921 28922@ @<Wipe out the whatsit...@>= 28923begin case subtype(p) of 28924open_node: free_node(p,open_node_size); 28925write_node,special_node: begin delete_token_ref(write_tokens(p)); 28926 free_node(p,write_node_size); goto done; 28927 end; 28928close_node,language_node: free_node(p,small_node_size); 28929native_word_node: begin free_native_glyph_info(p); free_node(p,native_size(p)); end; 28930glyph_node: free_node(p,glyph_node_size); 28931pic_node,pdf_node: free_node(p,total_pic_node_size(p)); 28932pdf_save_pos_node: 28933 free_node(p, small_node_size); 28934othercases confusion("ext3") 28935@:this can't happen ext3}{\quad ext3@> 28936endcases;@/ 28937goto done; 28938end 28939 28940@ @<Incorporate a whatsit node into a vbox@>= 28941begin 28942 if (subtype(p)=pic_node) 28943 or (subtype(p)=pdf_node) 28944 then begin 28945 x:=x + d + height(p); 28946 d:=depth(p); 28947 if width(p) > w then w:=width(p); 28948 end; 28949end 28950 28951@ @<Incorporate a whatsit node into an hbox@>= 28952begin 28953 case subtype(p) of 28954 native_word_node: begin 28955 { merge with any following word fragments in same font, discarding discretionary breaks } 28956 if type(q) = disc_node then k:=replace_count(q) else k:=0; 28957 while (link(q) <> p) do begin 28958 decr(k); 28959 q:=link(q); { bring q up in preparation for deletion of nodes starting at p } 28960 if type(q) = disc_node then k:=replace_count(q); 28961 end; 28962 pp:=link(p); 28963 restart: 28964 if (k <= 0) and (pp <> null) and (not is_char_node(pp)) then begin 28965 if (type(pp) = whatsit_node) 28966 and (subtype(pp) = native_word_node) 28967 and (native_font(pp) = native_font(p)) then begin 28968 pp:=link(pp); 28969 goto restart; 28970 end 28971 else if (type(pp) = disc_node) then begin 28972 ppp:=link(pp); 28973 if is_native_word_node(ppp) and (native_font(ppp) = native_font(p)) then begin 28974 pp:=link(ppp); 28975 goto restart; 28976 end 28977 end 28978 end; 28979 28980 { now pp points to the non-|native_word| node that ended the chain, or null } 28981 28982 { we can just check type(p)=|whatsit_node| below, as we know that the chain 28983 contains only discretionaries and |native_word| nodes, no other whatsits or |char_node|s } 28984 28985 if (pp <> link(p)) then begin 28986 { found a chain of at least two pieces starting at p } 28987 total_chars:=0; 28988 p:=link(q); { the first fragment } 28989 while (p <> pp) do begin 28990 if (type(p) = whatsit_node) then 28991 total_chars:=total_chars + native_length(p); { accumulate char count } 28992 ppp:=p; { remember last node seen } 28993 p:=link(p); { point to next fragment or discretionary or terminator } 28994 end; 28995 28996 p:=link(q); { the first fragment again } 28997 pp:=new_native_word_node(native_font(p), total_chars); { make new node for merged word } 28998 link(q):=pp; { link to preceding material } 28999 link(pp):=link(ppp); { attach remainder of hlist to it } 29000 link(ppp):=null; { and detach from the old fragments } 29001 29002 { copy the chars into new node } 29003 total_chars:=0; 29004 ppp:=p; 29005 repeat 29006 if (type(ppp) = whatsit_node) then 29007 for k:=0 to native_length(ppp)-1 do begin 29008 set_native_char(pp, total_chars, get_native_char(ppp, k)); 29009 incr(total_chars); 29010 end; 29011 ppp:=link(ppp); 29012 until (ppp = null); 29013 29014 flush_node_list(p); { delete the fragments } 29015 p:=link(q); { update p to point to the new node } 29016 set_native_metrics(p, XeTeX_use_glyph_metrics); { and measure it (i.e., re-do the OT layout) } 29017 end; 29018 29019 { now incorporate the |native_word| node measurements into the box we're packing } 29020 if height(p) > h then 29021 h:=height(p); 29022 if depth(p) > d then 29023 d:=depth(p); 29024 x:=x + width(p); 29025 end; 29026 glyph_node, pic_node, pdf_node: begin 29027 if height(p) > h then 29028 h:=height(p); 29029 if depth(p) > d then 29030 d:=depth(p); 29031 x:=x + width(p); 29032 end; 29033 othercases do_nothing 29034 endcases; 29035end 29036 29037@ @<Let |d| be the width of the whatsit |p|, and |goto found| if ``visible''@>= 29038if (subtype(p)=native_word_node) 29039or (subtype(p)=glyph_node) 29040or (subtype(p)=pic_node) 29041or (subtype(p)=pdf_node) 29042then begin 29043 d:=width(p); 29044 goto found; 29045end else 29046 d:=0 29047 29048@ @d adv_past_linebreak(#)==@+if subtype(#)=language_node then 29049 begin cur_lang:=what_lang(#); l_hyf:=what_lhm(#); r_hyf:=what_rhm(#); 29050 set_hyph_index; 29051 end 29052 else if (subtype(#)=native_word_node) 29053 or (subtype(#)=glyph_node) 29054 or (subtype(#)=pic_node) 29055 or (subtype(#)=pdf_node) 29056 then 29057 begin act_width:=act_width+width(#); end 29058 29059@<Advance \(p)past a whatsit node in the \(l)|line_break| loop@>=@+ 29060adv_past_linebreak(cur_p) 29061 29062@ @d adv_past_prehyph(#)==@+if subtype(#)=language_node then 29063 begin cur_lang:=what_lang(#); l_hyf:=what_lhm(#); r_hyf:=what_rhm(#); 29064 set_hyph_index; 29065 end 29066 29067@<Advance \(p)past a whatsit node in the \(p)pre-hyphenation loop@>=@+ 29068adv_past_prehyph(s) 29069 29070@ @<Prepare to move whatsit |p| to the current page, then |goto contribute|@>= 29071begin 29072 if (subtype(p)=pic_node) 29073 or (subtype(p)=pdf_node) 29074 then begin 29075 page_total:=page_total + page_depth + height(p); 29076 page_depth:=depth(p); 29077 end; 29078 goto contribute; 29079end 29080 29081@ @<Process whatsit |p| in |vert_break| loop, |goto not_found|@>= 29082begin 29083 if (subtype(p)=pic_node) 29084 or (subtype(p)=pdf_node) 29085 then begin 29086 cur_height:=cur_height + prev_dp + height(p); prev_dp:=depth(p); 29087 end; 29088 goto not_found; 29089end 29090 29091@ @<Output the whatsit node |p| in a vlist@>= 29092begin 29093 case subtype(p) of 29094 glyph_node: begin 29095 cur_v:=cur_v+height(p); 29096 cur_h:=left_edge; 29097 synch_h; synch_v; {Sync DVI state to TeX state} 29098 f:=native_font(p); 29099 if f<>dvi_f then @<Change font |dvi_f| to |f|@>; 29100 dvi_out(set_glyphs); 29101 dvi_four(0); { width } 29102 dvi_two(1); { glyph count } 29103 dvi_four(0); { x-offset as fixed point } 29104 dvi_four(0); { y-offset as fixed point } 29105 dvi_two(native_glyph(p)); 29106 cur_v:=cur_v+depth(p); 29107 cur_h:=left_edge; 29108 end; 29109 pic_node, pdf_node: begin 29110 save_h:=dvi_h; save_v:=dvi_v; 29111 cur_v:=cur_v+height(p); 29112 pic_out(p); 29113 dvi_h:=save_h; dvi_v:=save_v; 29114 cur_v:=save_v+depth(p); cur_h:=left_edge; 29115 end; 29116 pdf_save_pos_node: @<Save current position to |pdf_last_x_pos|, |pdf_last_y_pos|@>; 29117 othercases out_what(p) 29118 endcases 29119end 29120 29121@ @<Save current position to |pdf_last_x_pos|, |pdf_last_y_pos|@>= 29122begin 29123 pdf_last_x_pos:=cur_h + cur_h_offset; 29124 pdf_last_y_pos:=cur_page_height - cur_v - cur_v_offset 29125end 29126 29127@ @<Calculate page dimensions and margins@>= 29128cur_h_offset:=h_offset + (unity * 7227) / 100; 29129cur_v_offset:=v_offset + (unity * 7227) / 100; 29130if pdf_page_width <> 0 then 29131 cur_page_width:=pdf_page_width 29132else 29133 cur_page_width:=width(p) + 2*cur_h_offset; 29134if pdf_page_height <> 0 then 29135 cur_page_height:=pdf_page_height 29136else 29137 cur_page_height:=height(p) + depth(p) + 2*cur_v_offset 29138 29139@ @<Glob...@>= 29140@!cur_page_width: scaled; {width of page being shipped} 29141@!cur_page_height: scaled; {height of page being shipped} 29142@!cur_h_offset: scaled; {horizontal offset of page being shipped} 29143@!cur_v_offset: scaled; {vertical offset of page being shipped} 29144 29145@ @<Output the whatsit node |p| in an hlist@>= 29146begin 29147 case subtype(p) of 29148 native_word_node, glyph_node: begin 29149 synch_h; synch_v; {Sync DVI state to TeX state} 29150 f:=native_font(p); 29151 if f<>dvi_f then @<Change font |dvi_f| to |f|@>; 29152 if subtype(p) = glyph_node then begin 29153 dvi_out(set_glyphs); 29154 dvi_four(width(p)); 29155 dvi_two(1); { glyph count } 29156 dvi_four(0); { x-offset as fixed point } 29157 dvi_four(0); { y-offset as fixed point } 29158 dvi_two(native_glyph(p)); 29159 cur_h:=cur_h + width(p); 29160 end else begin 29161 if native_glyph_info_ptr(p) <> null_ptr then begin 29162 dvi_out(set_glyphs); 29163 len:=make_xdv_glyph_array_data(p); 29164 for k:=0 to len-1 do 29165 dvi_out(xdv_buffer_byte(k)); 29166 end; 29167 cur_h:=cur_h + width(p); 29168 end; 29169 dvi_h:=cur_h; 29170 end; 29171 pic_node, pdf_node: begin 29172 save_h:=dvi_h; save_v:=dvi_v; 29173 cur_v:=base_line; 29174 edge:=cur_h+width(p); 29175 pic_out(p); 29176 dvi_h:=save_h; dvi_v:=save_v; 29177 cur_h:=edge; cur_v:=base_line; 29178 end; 29179 pdf_save_pos_node: @<Save current position to |pdf_last_x_pos|, |pdf_last_y_pos|@>; 29180 othercases out_what(p) 29181 endcases 29182end 29183 29184@ After all this preliminary shuffling, we come finally to the routines 29185that actually send out the requested data. Let's do \.{\\special} first 29186(it's easier). 29187 29188@<Declare procedures needed in |hlist_out|, |vlist_out|@>= 29189procedure special_out(@!p:pointer); 29190var old_setting:0..max_selector; {holds print |selector|} 29191@!k:pool_pointer; {index into |str_pool|} 29192begin synch_h; synch_v;@/ 29193doing_special:=true; 29194old_setting:=selector; selector:=new_string; 29195show_token_list(link(write_tokens(p)),null,pool_size-pool_ptr); 29196selector:=old_setting; 29197str_room(1); 29198if cur_length<256 then 29199 begin dvi_out(xxx1); dvi_out(cur_length); 29200 end 29201else begin dvi_out(xxx4); dvi_four(cur_length); 29202 end; 29203for k:=str_start_macro(str_ptr) to pool_ptr-1 do dvi_out(so(str_pool[k])); 29204pool_ptr:=str_start_macro(str_ptr); {erase the string} 29205doing_special:=false; 29206end; 29207 29208@ To write a token list, we must run it through \TeX's scanner, expanding 29209macros and \.{\\the} and \.{\\number}, etc. This might cause runaways, 29210if a delimited macro parameter isn't matched, and runaways would be 29211extremely confusing since we are calling on \TeX's scanner in the middle 29212of a \.{\\shipout} command. Therefore we will put a dummy control sequence as 29213a ``stopper,'' right after the token list. This control sequence is 29214artificially defined to be \.{\\outer}. 29215@:end_write_}{\.{\\endwrite}@> 29216 29217@<Initialize table...@>= 29218text(end_write):="endwrite"; eq_level(end_write):=level_one; 29219eq_type(end_write):=outer_call; equiv(end_write):=null; 29220 29221@ @<Declare procedures needed in |hlist_out|, |vlist_out|@>= 29222procedure write_out(@!p:pointer); 29223var old_setting:0..max_selector; {holds print |selector|} 29224@!old_mode:integer; {saved |mode|} 29225@!j:small_number; {write stream number} 29226@!k:integer; 29227@!q,@!r:pointer; {temporary variables for list manipulation} 29228begin @<Expand macros in the token list 29229 and make |link(def_ref)| point to the result@>; 29230old_setting:=selector; j:=write_stream(p); 29231if write_open[j] then selector:=j 29232else begin {write to the terminal if file isn't open} 29233 if (j=17)and(selector=term_and_log) then selector:=log_only; 29234 print_nl(""); 29235 end; 29236token_show(def_ref); print_ln; 29237flush_list(def_ref); selector:=old_setting; 29238end; 29239 29240@ The final line of this routine is slightly subtle; at least, the author 29241didn't think about it until getting burnt! There is a used-up token list 29242@^Knuth, Donald Ervin@> 29243on the stack, namely the one that contained |end_write_token|. (We 29244insert this artificial `\.{\\endwrite}' to prevent runaways, as explained 29245above.) If it were not removed, and if there were numerous writes on a 29246single page, the stack would overflow. 29247 29248@d end_write_token==cs_token_flag+end_write 29249 29250@<Expand macros in the token list and...@>= 29251q:=get_avail; info(q):=right_brace_token+"}";@/ 29252r:=get_avail; link(q):=r; info(r):=end_write_token; ins_list(q);@/ 29253begin_token_list(write_tokens(p),write_text);@/ 29254q:=get_avail; info(q):=left_brace_token+"{"; ins_list(q); 29255{now we're ready to scan 29256 `\.\{$\langle\,$token list$\,\rangle$\.{\} \\endwrite}'} 29257old_mode:=mode; mode:=0; 29258 {disable \.{\\prevdepth}, \.{\\spacefactor}, \.{\\lastskip}, \.{\\prevgraf}} 29259cur_cs:=write_loc; q:=scan_toks(false,true); {expand macros, etc.} 29260get_token;@+if cur_tok<>end_write_token then 29261 @<Recover from an unbalanced write command@>; 29262mode:=old_mode; 29263end_token_list {conserve stack space} 29264 29265@ @<Recover from an unbalanced write command@>= 29266begin print_err("Unbalanced write command"); 29267@.Unbalanced write...@> 29268help2("On this page there's a \write with fewer real {'s than }'s.")@/ 29269("I can't handle that very well; good luck."); error; 29270repeat get_token; 29271until cur_tok=end_write_token; 29272end 29273 29274@ The |out_what| procedure takes care of outputting whatsit nodes for 29275|vlist_out| and |hlist_out|\kern-.3pt. 29276 29277@<Declare procedures needed in |hlist_out|, |vlist_out|@>= 29278procedure pic_out(@!p:pointer); 29279var old_setting:0..max_selector; {holds print |selector|} 29280 i:integer; 29281 k:pool_pointer; {index into |str_pool|} 29282begin 29283synch_h; synch_v; 29284old_setting:=selector; selector:=new_string; 29285print("pdf:image "); 29286print("matrix "); 29287print_scaled(pic_transform1(p)); print(" "); 29288print_scaled(pic_transform2(p)); print(" "); 29289print_scaled(pic_transform3(p)); print(" "); 29290print_scaled(pic_transform4(p)); print(" "); 29291print_scaled(pic_transform5(p)); print(" "); 29292print_scaled(pic_transform6(p)); print(" "); 29293print("page "); print_int(pic_page(p)); print(" "); 29294print("("); 29295for i:=0 to pic_path_length(p)-1 do 29296 print_visible_char(pic_path_byte(p, i)); 29297print(")"); 29298selector:=old_setting; 29299if cur_length<256 then 29300 begin dvi_out(xxx1); dvi_out(cur_length); 29301 end 29302else begin dvi_out(xxx4); dvi_four(cur_length); 29303 end; 29304for k:=str_start_macro(str_ptr) to pool_ptr-1 do dvi_out(so(str_pool[k])); 29305pool_ptr:=str_start_macro(str_ptr); {erase the string} 29306end; 29307 29308procedure out_what(@!p:pointer); 29309var j:small_number; {write stream number} 29310begin case subtype(p) of 29311open_node,write_node,close_node:@<Do some work that has been queued up 29312 for \.{\\write}@>; 29313special_node:special_out(p); 29314language_node:do_nothing; 29315othercases confusion("ext4") 29316@:this can't happen ext4}{\quad ext4@> 29317endcases; 29318end; 29319 29320@ We don't implement \.{\\write} inside of leaders. (The reason is that 29321the number of times a leader box appears might be different in different 29322implementations, due to machine-dependent rounding in the glue calculations.) 29323@^leaders@> 29324 29325@<Do some work that has been queued up...@>= 29326if not doing_leaders then 29327 begin j:=write_stream(p); 29328 if subtype(p)=write_node then write_out(p) 29329 else begin if write_open[j] then a_close(write_file[j]); 29330 if subtype(p)=close_node then write_open[j]:=false 29331 else if j<16 then 29332 begin cur_name:=open_name(p); cur_area:=open_area(p); 29333 cur_ext:=open_ext(p); 29334 if cur_ext="" then cur_ext:=".tex"; 29335 pack_cur_name; 29336 while not a_open_out(write_file[j]) do 29337 prompt_file_name("output file name",".tex"); 29338 write_open[j]:=true; 29339 end; 29340 end; 29341 end 29342 29343@ The presence of `\.{\\immediate}' causes the |do_extension| procedure 29344to descend to one level of recursion. Nothing happens unless \.{\\immediate} 29345is followed by `\.{\\openout}', `\.{\\write}', or `\.{\\closeout}'. 29346@^recursion@> 29347 29348@<Implement \.{\\immediate}@>= 29349begin get_x_token; 29350if (cur_cmd=extension)and(cur_chr<=close_node) then 29351 begin p:=tail; do_extension; {append a whatsit node} 29352 out_what(tail); {do the action immediately} 29353 flush_node_list(tail); tail:=p; link(p):=null; 29354 end 29355else back_input; 29356end 29357 29358@ The \.{\\language} extension is somewhat different. 29359We need a subroutine that comes into play when a character of 29360a non-|clang| language is being appended to the current paragraph. 29361 29362@<Declare action...@>= 29363procedure fix_language; 29364var @!l:ASCII_code; {the new current language} 29365begin if language<=0 then l:=0 29366else if language>255 then l:=0 29367else l:=language; 29368if l<>clang then 29369 begin new_whatsit(language_node,small_node_size); 29370 what_lang(tail):=l; clang:=l;@/ 29371 what_lhm(tail):=norm_min(left_hyphen_min); 29372 what_rhm(tail):=norm_min(right_hyphen_min); 29373 end; 29374end; 29375 29376@ @<Implement \.{\\setlanguage}@>= 29377if abs(mode)<>hmode then report_illegal_case 29378else begin new_whatsit(language_node,small_node_size); 29379 scan_int; 29380 if cur_val<=0 then clang:=0 29381 else if cur_val>255 then clang:=0 29382 else clang:=cur_val; 29383 what_lang(tail):=clang; 29384 what_lhm(tail):=norm_min(left_hyphen_min); 29385 what_rhm(tail):=norm_min(right_hyphen_min); 29386 end 29387 29388@ @<Finish the extensions@>= 29389terminate_font_manager; 29390for k:=0 to 15 do if write_open[k] then a_close(write_file[k]) 29391 29392@ @<Implement \.{\\XeTeXpicfile}@>= 29393if abs(mode)=mmode then report_illegal_case 29394else load_picture(false) 29395 29396@ @<Implement \.{\\XeTeXpdffile}@>= 29397if abs(mode)=mmode then report_illegal_case 29398else load_picture(true) 29399 29400@ @<Implement \.{\\XeTeXglyph}@>= 29401begin 29402 if abs(mode)=vmode then begin 29403 back_input; 29404 new_graf(true); 29405 end else if abs(mode)=mmode then report_illegal_case 29406 else begin 29407 if is_native_font(cur_font) then begin 29408 new_whatsit(glyph_node,glyph_node_size); 29409 scan_int; 29410 if (cur_val<0)or(cur_val>65535) then 29411 begin print_err("Bad glyph number"); 29412 help2("A glyph number must be between 0 and 65535.")@/ 29413 ("I changed this one to zero."); int_error(cur_val); cur_val:=0; 29414 end; 29415 native_font(tail):=cur_font; 29416 native_glyph(tail):=cur_val; 29417 set_native_glyph_metrics(tail, XeTeX_use_glyph_metrics); 29418 end else not_native_font_error(extension, glyph_code, cur_font); 29419 end 29420end 29421 29422@ Load a picture file and handle following keywords. 29423 29424@d calc_min_and_max== 29425 begin 29426 xmin:=1000000.0; 29427 xmax:=-xmin; 29428 ymin:=xmin; 29429 ymax:=xmax; 29430 for i:=0 to 3 do begin 29431 if xCoord(corners[i]) < xmin then xmin:=xCoord(corners[i]); 29432 if xCoord(corners[i]) > xmax then xmax:=xCoord(corners[i]); 29433 if yCoord(corners[i]) < ymin then ymin:=yCoord(corners[i]); 29434 if yCoord(corners[i]) > ymax then ymax:=yCoord(corners[i]); 29435 end; 29436 end 29437 29438@d update_corners== 29439 for i:=0 to 3 do 29440 transform_point(addressof(corners[i]), addressof(t2)) 29441 29442@d do_size_requests==begin 29443 { calculate current width and height } 29444 calc_min_and_max; 29445 if x_size_req = 0.0 then begin 29446 make_scale(addressof(t2), y_size_req / (ymax - ymin), y_size_req / (ymax - ymin)); 29447 end else if y_size_req = 0.0 then begin 29448 make_scale(addressof(t2), x_size_req / (xmax - xmin), x_size_req / (xmax - xmin)); 29449 end else begin 29450 make_scale(addressof(t2), x_size_req / (xmax - xmin), y_size_req / (ymax - ymin)); 29451 end; 29452 update_corners; 29453 x_size_req:=0.0; 29454 y_size_req:=0.0; 29455 transform_concat(addressof(t), addressof(t2)); 29456end 29457 29458@<Declare procedures needed in |do_extension|@>= 29459procedure load_picture(@!is_pdf:boolean); 29460var 29461 pic_path: ^char; 29462 bounds: real_rect; 29463 t, t2: transform; 29464 corners: array[0..3] of real_point; 29465 x_size_req,y_size_req: real; 29466 check_keywords: boolean; 29467 xmin,xmax,ymin,ymax: real; 29468 i: small_number; 29469 page: integer; 29470 pdf_box_type: integer; 29471 result: integer; 29472begin 29473 { scan the filename and pack into |name_of_file| } 29474 scan_file_name; 29475 pack_cur_name; 29476 29477 pdf_box_type:=0; 29478 page:=0; 29479 if is_pdf then begin 29480 if scan_keyword("page") then begin 29481 scan_int; 29482 page:=cur_val; 29483 end; 29484 pdf_box_type:=pdfbox_crop; 29485 if scan_keyword("crop") then do_nothing 29486 else if scan_keyword("media") then pdf_box_type:=pdfbox_media 29487 else if scan_keyword("bleed") then pdf_box_type:=pdfbox_bleed 29488 else if scan_keyword("trim") then pdf_box_type:=pdfbox_trim 29489 else if scan_keyword("art") then pdf_box_type:=pdfbox_art; 29490 end; 29491 29492 { access the picture file and check its size } 29493 result:=find_pic_file(addressof(pic_path), addressof(bounds), pdf_box_type, page); 29494 29495 setPoint(corners[0], xField(bounds), yField(bounds)); 29496 setPoint(corners[1], xField(corners[0]), yField(bounds) + htField(bounds)); 29497 setPoint(corners[2], xField(bounds) + wdField(bounds), yField(corners[1])); 29498 setPoint(corners[3], xField(corners[2]), yField(corners[0])); 29499 29500 x_size_req:=0.0; 29501 y_size_req:=0.0; 29502 29503 { look for any scaling requests for this picture } 29504 make_identity(addressof(t)); 29505 29506 check_keywords:=true; 29507 while check_keywords do begin 29508 if scan_keyword("scaled") then begin 29509 scan_int; 29510 if (x_size_req = 0.0) and (y_size_req = 0.0) then begin 29511 make_scale(addressof(t2), float(cur_val) / 1000.0, float(cur_val) / 1000.0); 29512 update_corners; 29513 transform_concat(addressof(t), addressof(t2)); 29514 end 29515 end else if scan_keyword("xscaled") then begin 29516 scan_int; 29517 if (x_size_req = 0.0) and (y_size_req = 0.0) then begin 29518 make_scale(addressof(t2), float(cur_val) / 1000.0, 1.0); 29519 update_corners; 29520 transform_concat(addressof(t), addressof(t2)); 29521 end 29522 end else if scan_keyword("yscaled") then begin 29523 scan_int; 29524 if (x_size_req = 0.0) and (y_size_req = 0.0) then begin 29525 make_scale(addressof(t2), 1.0, float(cur_val) / 1000.0); 29526 update_corners; 29527 transform_concat(addressof(t), addressof(t2)); 29528 end 29529 end else if scan_keyword("width") then begin 29530 scan_normal_dimen; 29531 if cur_val <= 0 then begin 29532 print_err("Improper image "); 29533 print("size ("); 29534 print_scaled(cur_val); 29535 print("pt) will be ignored"); 29536 help2("I can't scale images to zero or negative sizes,")@/ 29537 ("so I'm ignoring this."); 29538 error; 29539 end else 29540 x_size_req:=Fix2D(cur_val); 29541 end else if scan_keyword("height") then begin 29542 scan_normal_dimen; 29543 if cur_val <= 0 then begin 29544 print_err("Improper image "); 29545 print("size ("); 29546 print_scaled(cur_val); 29547 print("pt) will be ignored"); 29548 help2("I can't scale images to zero or negative sizes,")@/ 29549 ("so I'm ignoring this."); 29550 error; 29551 end else 29552 y_size_req:=Fix2D(cur_val); 29553 end else if scan_keyword("rotated") then begin 29554 scan_decimal; 29555 if (x_size_req <> 0.0) or (y_size_req <> 0.0) then do_size_requests; 29556 make_rotation(addressof(t2), Fix2D(cur_val) * 3.141592653589793 / 180.0); 29557 update_corners; 29558 calc_min_and_max; 29559 setPoint(corners[0], xmin, ymin); 29560 setPoint(corners[1], xmin, ymax); 29561 setPoint(corners[2], xmax, ymax); 29562 setPoint(corners[3], xmax, ymin); 29563 transform_concat(addressof(t), addressof(t2)); 29564 end else 29565 check_keywords:=false; 29566 end; 29567 29568 if (x_size_req <> 0.0) or (y_size_req <> 0.0) then do_size_requests; 29569 29570 calc_min_and_max; 29571 make_translation(addressof(t2), -xmin * 72 / 72.27, -ymin * 72 / 72.27); 29572 transform_concat(addressof(t), addressof(t2)); 29573 29574 if result = 0 then begin 29575 new_whatsit(pic_node, pic_node_size + (strlen(pic_path) + sizeof(memory_word) - 1) div sizeof(memory_word)); 29576 if is_pdf then begin 29577 subtype(tail):=pdf_node; 29578 end; 29579 pic_path_length(tail):=strlen(pic_path); 29580 pic_page(tail):=page; 29581 29582 width(tail):=D2Fix(xmax - xmin); 29583 height(tail):=D2Fix(ymax - ymin); 29584 depth(tail):=0; 29585 29586 pic_transform1(tail):=D2Fix(aField(t)); 29587 pic_transform2(tail):=D2Fix(bField(t)); 29588 pic_transform3(tail):=D2Fix(cField(t)); 29589 pic_transform4(tail):=D2Fix(dField(t)); 29590 pic_transform5(tail):=D2Fix(xField(t)); 29591 pic_transform6(tail):=D2Fix(yField(t)); 29592 29593 memcpy(addressof(mem[tail + pic_node_size]), pic_path, strlen(pic_path)); 29594 libc_free(pic_path); 29595 end else begin 29596 print_err("Unable to load picture or PDF file '"); 29597 print_file_name(cur_name,cur_area,cur_ext); print("'"); 29598 if result = -43 then begin { Mac OS file not found error } 29599 help2("The requested image couldn't be read because")@/ 29600 ("the file was not found."); 29601 end 29602 else begin { otherwise assume GraphicImport failed } 29603 help2("The requested image couldn't be read because")@/ 29604 ("it was not a recognized image format."); 29605 end; 29606 error; 29607 end; 29608end; 29609 29610@ @<Implement \.{\\XeTeXinputencoding}@>= 29611begin 29612 scan_and_pack_name; {scan a filename-like arg for the input encoding} 29613 29614 i:=get_encoding_mode_and_info(addressof(j)); {convert it to |mode| and |encoding| values} 29615 if i = XeTeX_input_mode_auto then begin 29616 print_err("Encoding mode `auto' is not valid for \XeTeXinputencoding"); 29617 help2("You can't use `auto' encoding here, only for \XeTeXdefaultencoding.")@/ 29618 ("I'll ignore this and leave the current encoding unchanged.");@/ 29619 error; 29620 end else set_input_file_encoding(input_file[in_open], i, j); {apply them to the current input file} 29621end 29622 29623@ @<Implement \.{\\XeTeXdefaultencoding}@>= 29624begin 29625 scan_and_pack_name; {scan a filename-like arg for the input encoding} 29626 29627 i:=get_encoding_mode_and_info(addressof(j)); {convert it to |mode| and |encoding| values} 29628 XeTeX_default_input_mode:=i; {store them as defaults for new input files} 29629 XeTeX_default_input_encoding:=j; 29630end 29631 29632@ @<Implement \.{\\XeTeXlinebreaklocale}@>= 29633begin 29634 scan_file_name; {scan a filename-like arg for the locale name} 29635 if length(cur_name) = 0 then XeTeX_linebreak_locale:=0 29636 else XeTeX_linebreak_locale:=cur_name; {we ignore the area and extension!} 29637end 29638 29639@ @<Glob...@>= 29640@!pdf_last_x_pos: integer; 29641@!pdf_last_y_pos: integer; 29642 29643@ @<Implement \.{\\pdfsavepos}@>= 29644begin 29645 new_whatsit(pdf_save_pos_node, small_node_size); 29646end 29647 29648 29649@* \[53a] The extended features of \eTeX. 29650The program has two modes of operation: (1)~In \TeX\ compatibility mode 29651it fully deserves the name \TeX\ and there are neither extended features 29652nor additional primitive commands. There are, however, a few 29653modifications that would be legitimate in any implementation of \TeX\ 29654such as, e.g., preventing inadequate results of the glue to \.{DVI} 29655unit conversion during |ship_out|. (2)~In extended mode there are 29656additional primitive commands and the extended features of \eTeX\ are 29657available. 29658 29659The distinction between these two modes of operation initially takes 29660place when a `virgin' \.{eINITEX} starts without reading a format file. 29661Later on the values of all \eTeX\ state variables are inherited when 29662\.{eVIRTEX} (or \.{eINITEX}) reads a format file. 29663 29664The code below is designed to work for cases where `$|init|\ldots|tini|$' 29665is a run-time switch. 29666 29667@<Enable \eTeX, if requested@>= 29668@!init if (buffer[loc]="*")and(format_ident=" (INITEX)") then 29669 begin no_new_control_sequence:=false; 29670 @<Generate all \eTeX\ primitives@>@; 29671 incr(loc); eTeX_mode:=1; {enter extended mode} 29672 @<Initialize variables for \eTeX\ extended mode@>@; 29673 end; 29674tini@;@/ 29675if not no_new_control_sequence then {just entered extended mode ?} 29676 no_new_control_sequence:=true@+else 29677 29678@ The \eTeX\ features available in extended mode are grouped into two 29679categories: (1)~Some of them are permanently enabled and have no 29680semantic effect as long as none of the additional primitives are 29681executed. (2)~The remaining \eTeX\ features are optional and can be 29682individually enabled and disabled. For each optional feature there is 29683an \eTeX\ state variable named \.{\\...state}; the feature is enabled, 29684resp.\ disabled by assigning a positive, resp.\ non-positive value to 29685that integer. 29686 29687@d eTeX_state_base=int_base+eTeX_state_code 29688@d eTeX_state(#)==eqtb[eTeX_state_base+#].int {an \eTeX\ state variable} 29689@# 29690@d eTeX_version_code=eTeX_int {code for \.{\\eTeXversion}} 29691 29692@<Generate all \eTeX...@>= 29693primitive("lastnodetype",last_item,last_node_type_code); 29694@!@:last_node_type_}{\.{\\lastnodetype} primitive@> 29695primitive("eTeXversion",last_item,eTeX_version_code); 29696@!@:eTeX_version_}{\.{\\eTeXversion} primitive@> 29697primitive("eTeXrevision",convert,eTeX_revision_code);@/ 29698@!@:eTeX_revision_}{\.{\\eTeXrevision} primitive@> 29699 29700primitive("XeTeXversion",last_item,XeTeX_version_code); 29701@!@:XeTeX_version_}{\.{\\XeTeXversion} primitive@> 29702primitive("XeTeXrevision",convert,XeTeX_revision_code);@/ 29703@!@:XeTeXrevision_}{\.{\\XeTeXrevision} primitive@> 29704 29705primitive("XeTeXcountglyphs",last_item,XeTeX_count_glyphs_code); 29706 29707primitive("XeTeXcountvariations",last_item,XeTeX_count_variations_code); 29708primitive("XeTeXvariation",last_item,XeTeX_variation_code); 29709primitive("XeTeXfindvariationbyname",last_item,XeTeX_find_variation_by_name_code); 29710primitive("XeTeXvariationmin",last_item,XeTeX_variation_min_code); 29711primitive("XeTeXvariationmax",last_item,XeTeX_variation_max_code); 29712primitive("XeTeXvariationdefault",last_item,XeTeX_variation_default_code); 29713 29714primitive("XeTeXcountfeatures",last_item,XeTeX_count_features_code); 29715primitive("XeTeXfeaturecode",last_item,XeTeX_feature_code_code); 29716primitive("XeTeXfindfeaturebyname",last_item,XeTeX_find_feature_by_name_code); 29717primitive("XeTeXisexclusivefeature",last_item,XeTeX_is_exclusive_feature_code); 29718primitive("XeTeXcountselectors",last_item,XeTeX_count_selectors_code); 29719primitive("XeTeXselectorcode",last_item,XeTeX_selector_code_code); 29720primitive("XeTeXfindselectorbyname",last_item,XeTeX_find_selector_by_name_code); 29721primitive("XeTeXisdefaultselector",last_item,XeTeX_is_default_selector_code); 29722 29723primitive("XeTeXvariationname",convert,XeTeX_variation_name_code); 29724primitive("XeTeXfeaturename",convert,XeTeX_feature_name_code); 29725primitive("XeTeXselectorname",convert,XeTeX_selector_name_code); 29726 29727primitive("XeTeXOTcountscripts",last_item,XeTeX_OT_count_scripts_code); 29728primitive("XeTeXOTcountlanguages",last_item,XeTeX_OT_count_languages_code); 29729primitive("XeTeXOTcountfeatures",last_item,XeTeX_OT_count_features_code); 29730primitive("XeTeXOTscripttag",last_item,XeTeX_OT_script_code); 29731primitive("XeTeXOTlanguagetag",last_item,XeTeX_OT_language_code); 29732primitive("XeTeXOTfeaturetag",last_item,XeTeX_OT_feature_code); 29733 29734primitive("XeTeXcharglyph", last_item, XeTeX_map_char_to_glyph_code); 29735primitive("XeTeXglyphindex", last_item, XeTeX_glyph_index_code); 29736primitive("XeTeXglyphbounds", last_item, XeTeX_glyph_bounds_code); 29737 29738primitive("XeTeXglyphname",convert,XeTeX_glyph_name_code); 29739 29740primitive("XeTeXfonttype", last_item, XeTeX_font_type_code); 29741 29742primitive("XeTeXfirstfontchar", last_item, XeTeX_first_char_code); 29743primitive("XeTeXlastfontchar", last_item, XeTeX_last_char_code); 29744 29745primitive("pdflastxpos",last_item,pdf_last_x_pos_code); 29746primitive("pdflastypos",last_item,pdf_last_y_pos_code); 29747primitive("strcmp",convert,pdf_strcmp_code); 29748primitive("shellescape",last_item,pdf_shell_escape_code); 29749 29750primitive("XeTeXpdfpagecount",last_item,XeTeX_pdf_page_count_code); 29751 29752@ @<Cases of |last_item| for |print_cmd_chr|@>= 29753last_node_type_code: print_esc("lastnodetype"); 29754eTeX_version_code: print_esc("eTeXversion"); 29755XeTeX_version_code: print_esc("XeTeXversion"); 29756 29757XeTeX_count_glyphs_code: print_esc("XeTeXcountglyphs"); 29758 29759XeTeX_count_variations_code: print_esc("XeTeXcountvariations"); 29760XeTeX_variation_code: print_esc("XeTeXvariation"); 29761XeTeX_find_variation_by_name_code: print_esc("XeTeXfindvariationbyname"); 29762XeTeX_variation_min_code: print_esc("XeTeXvariationmin"); 29763XeTeX_variation_max_code: print_esc("XeTeXvariationmax"); 29764XeTeX_variation_default_code: print_esc("XeTeXvariationdefault"); 29765 29766XeTeX_count_features_code: print_esc("XeTeXcountfeatures"); 29767XeTeX_feature_code_code: print_esc("XeTeXfeaturecode"); 29768XeTeX_find_feature_by_name_code: print_esc("XeTeXfindfeaturebyname"); 29769XeTeX_is_exclusive_feature_code: print_esc("XeTeXisexclusivefeature"); 29770XeTeX_count_selectors_code: print_esc("XeTeXcountselectors"); 29771XeTeX_selector_code_code: print_esc("XeTeXselectorcode"); 29772XeTeX_find_selector_by_name_code: print_esc("XeTeXfindselectorbyname"); 29773XeTeX_is_default_selector_code: print_esc("XeTeXisdefaultselector"); 29774 29775XeTeX_OT_count_scripts_code: print_esc("XeTeXOTcountscripts"); 29776XeTeX_OT_count_languages_code: print_esc("XeTeXOTcountlanguages"); 29777XeTeX_OT_count_features_code: print_esc("XeTeXOTcountfeatures"); 29778XeTeX_OT_script_code: print_esc("XeTeXOTscripttag"); 29779XeTeX_OT_language_code: print_esc("XeTeXOTlanguagetag"); 29780XeTeX_OT_feature_code: print_esc("XeTeXOTfeaturetag"); 29781 29782XeTeX_map_char_to_glyph_code: print_esc("XeTeXcharglyph"); 29783XeTeX_glyph_index_code: print_esc("XeTeXglyphindex"); 29784XeTeX_glyph_bounds_code: print_esc("XeTeXglyphbounds"); 29785 29786XeTeX_font_type_code: print_esc("XeTeXfonttype"); 29787 29788XeTeX_first_char_code: print_esc("XeTeXfirstfontchar"); 29789XeTeX_last_char_code: print_esc("XeTeXlastfontchar"); 29790 29791 pdf_last_x_pos_code: print_esc("pdflastxpos"); 29792 pdf_last_y_pos_code: print_esc("pdflastypos"); 29793 29794XeTeX_pdf_page_count_code: print_esc("XeTeXpdfpagecount"); 29795 29796@ @<Cases for fetching an integer value@>= 29797eTeX_version_code: cur_val:=eTeX_version; 29798XeTeX_version_code: cur_val:=XeTeX_version; 29799 29800XeTeX_count_glyphs_code: 29801 begin 29802 scan_font_ident; n:=cur_val; 29803 if is_aat_font(n) then 29804 cur_val:=aat_font_get(m - XeTeX_int, font_layout_engine[n]) 29805 else if is_otgr_font(n) then 29806 cur_val:=ot_font_get(m - XeTeX_int, font_layout_engine[n]) 29807 else 29808 cur_val:=0; 29809 end; 29810 29811XeTeX_count_features_code: 29812 begin 29813 scan_font_ident; n:=cur_val; 29814 if is_aat_font(n) then 29815 cur_val:=aat_font_get(m - XeTeX_int, font_layout_engine[n]) 29816 else if is_gr_font(n) then 29817 cur_val:=ot_font_get(m - XeTeX_int, font_layout_engine[n]) 29818 else 29819 cur_val:=0; 29820 end; 29821 29822XeTeX_variation_code, 29823XeTeX_variation_min_code, 29824XeTeX_variation_max_code, 29825XeTeX_variation_default_code, 29826XeTeX_count_variations_code: 29827 begin 29828 scan_font_ident; n:=cur_val; 29829 cur_val:=0; {Deprecated} 29830 end; 29831 29832XeTeX_feature_code_code, 29833XeTeX_is_exclusive_feature_code, 29834XeTeX_count_selectors_code: 29835 begin 29836 scan_font_ident; n:=cur_val; 29837 if is_aat_font(n) then begin 29838 scan_int; k:=cur_val; 29839 cur_val:=aat_font_get_1(m - XeTeX_int, font_layout_engine[n], k); 29840 end else if is_gr_font(n) then begin 29841 scan_int; k:=cur_val; 29842 cur_val:=ot_font_get_1(m - XeTeX_int, font_layout_engine[n], k); 29843 end else begin 29844 not_aat_gr_font_error(last_item, m, n); cur_val:=-1; 29845 end; 29846 end; 29847 29848XeTeX_selector_code_code, 29849XeTeX_is_default_selector_code: 29850 begin 29851 scan_font_ident; n:=cur_val; 29852 if is_aat_font(n) then begin 29853 scan_int; k:=cur_val; scan_int; 29854 cur_val:=aat_font_get_2(m - XeTeX_int, font_layout_engine[n], k, cur_val); 29855 end else if is_gr_font(n) then begin 29856 scan_int; k:=cur_val; scan_int; 29857 cur_val:=ot_font_get_2(m - XeTeX_int, font_layout_engine[n], k, cur_val); 29858 end else begin 29859 not_aat_gr_font_error(last_item, m, n); cur_val:=-1; 29860 end; 29861 end; 29862 29863XeTeX_find_variation_by_name_code: 29864 begin 29865 scan_font_ident; n:=cur_val; 29866 if is_aat_font(n) then begin 29867 scan_and_pack_name; 29868 cur_val:=aat_font_get_named(m - XeTeX_int, font_layout_engine[n]); 29869 end else begin 29870 not_aat_font_error(last_item, m, n); cur_val:=-1; 29871 end; 29872 end; 29873 29874XeTeX_find_feature_by_name_code: 29875 begin 29876 scan_font_ident; n:=cur_val; 29877 if is_aat_font(n) then begin 29878 scan_and_pack_name; 29879 cur_val:=aat_font_get_named(m - XeTeX_int, font_layout_engine[n]); 29880 end else if is_gr_font(n) then begin 29881 scan_and_pack_name; 29882 cur_val:=gr_font_get_named(m - XeTeX_int, font_layout_engine[n]); 29883 end else begin 29884 not_aat_gr_font_error(last_item, m, n); cur_val:=-1; 29885 end; 29886 end; 29887 29888XeTeX_find_selector_by_name_code: 29889 begin 29890 scan_font_ident; n:=cur_val; 29891 if is_aat_font(n) then begin 29892 scan_int; k:=cur_val; scan_and_pack_name; 29893 cur_val:=aat_font_get_named_1(m - XeTeX_int, font_layout_engine[n], k); 29894 end else if is_gr_font(n) then begin 29895 scan_int; k:=cur_val; scan_and_pack_name; 29896 cur_val:=gr_font_get_named_1(m - XeTeX_int, font_layout_engine[n], k); 29897 end else begin 29898 not_aat_gr_font_error(last_item, m, n); cur_val:=-1; 29899 end; 29900 end; 29901 29902XeTeX_OT_count_scripts_code: 29903 begin 29904 scan_font_ident; n:=cur_val; 29905 if is_ot_font(n) then 29906 cur_val:=ot_font_get(m - XeTeX_int, font_layout_engine[n]) 29907 else begin 29908 cur_val:=0; 29909 end; 29910 end; 29911 29912XeTeX_OT_count_languages_code, 29913XeTeX_OT_script_code: 29914 begin 29915 scan_font_ident; n:=cur_val; 29916 if is_ot_font(n) then begin 29917 scan_int; 29918 cur_val:=ot_font_get_1(m - XeTeX_int, font_layout_engine[n], cur_val); 29919 end else begin 29920 not_ot_font_error(last_item, m, n); cur_val:=-1; 29921 end; 29922 end; 29923 29924XeTeX_OT_count_features_code, 29925XeTeX_OT_language_code: 29926 begin 29927 scan_font_ident; n:=cur_val; 29928 if is_ot_font(n) then begin 29929 scan_int; k:=cur_val; scan_int; 29930 cur_val:=ot_font_get_2(m - XeTeX_int, font_layout_engine[n], k, cur_val); 29931 end else begin 29932 not_ot_font_error(last_item, m, n); cur_val:=-1; 29933 end; 29934 end; 29935 29936XeTeX_OT_feature_code: 29937 begin 29938 scan_font_ident; n:=cur_val; 29939 if is_ot_font(n) then begin 29940 scan_int; k:=cur_val; scan_int; kk:=cur_val; scan_int; 29941 cur_val:=ot_font_get_3(m - XeTeX_int, font_layout_engine[n], k, kk, cur_val); 29942 end else begin 29943 not_ot_font_error(last_item, m, n); cur_val:=-1; 29944 end; 29945 end; 29946 29947XeTeX_map_char_to_glyph_code: 29948 begin 29949 if is_native_font(cur_font) then begin 29950 scan_int; n:=cur_val; cur_val:=map_char_to_glyph(cur_font, n) 29951 end else begin 29952 not_native_font_error(last_item, m, cur_font); cur_val:=0 29953 end 29954 end; 29955 29956XeTeX_glyph_index_code: 29957 begin 29958 if is_native_font(cur_font) then begin 29959 scan_and_pack_name; 29960 cur_val:=map_glyph_to_index(cur_font) 29961 end else begin 29962 not_native_font_error(last_item, m, cur_font); cur_val:=0 29963 end 29964 end; 29965 29966XeTeX_font_type_code: 29967 begin 29968 scan_font_ident; n:=cur_val; 29969 if is_aat_font(n) then cur_val:=1 29970 else if is_ot_font(n) then cur_val:=2 29971 else if is_gr_font(n) then cur_val:=3 29972 else cur_val:=0; 29973 end; 29974 29975XeTeX_first_char_code,XeTeX_last_char_code: 29976 begin 29977 scan_font_ident; n:=cur_val; 29978 if is_native_font(n) then 29979 cur_val:=get_font_char_range(n, m = XeTeX_first_char_code) 29980 else begin 29981 if m = XeTeX_first_char_code then cur_val:=font_bc[n] 29982 else cur_val:=font_ec[n]; 29983 end 29984 end; 29985 29986 pdf_last_x_pos_code: cur_val:=pdf_last_x_pos; 29987 pdf_last_y_pos_code: cur_val:=pdf_last_y_pos; 29988 29989XeTeX_pdf_page_count_code: 29990 begin 29991 scan_and_pack_name; 29992 cur_val:=count_pdf_file_pages; 29993 end; 29994 29995@ Slip in an extra procedure here and there.... 29996 29997@<Error hand...@>= 29998procedure scan_and_pack_name; forward; 29999 30000@ @<Declare procedures needed in |do_extension|@>= 30001procedure scan_and_pack_name; 30002begin 30003 scan_file_name; pack_cur_name; 30004end; 30005 30006@ @<Declare the procedure called |print_cmd_chr|@>= 30007procedure not_aat_font_error(cmd, c: integer; f: integer); 30008begin 30009 print_err("Cannot use "); print_cmd_chr(cmd, c); 30010 print(" with "); print(font_name[f]); 30011 print("; not an AAT font"); 30012 error; 30013end; 30014 30015procedure not_aat_gr_font_error(cmd, c: integer; f: integer); 30016begin 30017 print_err("Cannot use "); print_cmd_chr(cmd, c); 30018 print(" with "); print(font_name[f]); 30019 print("; not an AAT or Graphite font"); 30020 error; 30021end; 30022 30023procedure not_ot_font_error(cmd, c: integer; f: integer); 30024begin 30025 print_err("Cannot use "); print_cmd_chr(cmd, c); 30026 print(" with "); print(font_name[f]); 30027 print("; not an OpenType Layout font"); 30028 error; 30029end; 30030 30031procedure not_native_font_error(cmd, c: integer; f: integer); 30032begin 30033 print_err("Cannot use "); print_cmd_chr(cmd, c); 30034 print(" with "); print(font_name[f]); 30035 print("; not a native platform font"); 30036 error; 30037end; 30038 30039@ @<Cases for fetching a dimension value@>= 30040XeTeX_glyph_bounds_code: 30041 begin 30042 if is_native_font(cur_font) then begin 30043 scan_int; n:=cur_val; { which edge: 1=left, 2=top, 3=right, 4=bottom } 30044 if (n < 1) or (n > 4) then begin 30045 print_err("\\XeTeXglyphbounds requires an edge index from 1 to 4;"); 30046 print_nl("I don't know anything about edge "); print_int(n); 30047 error; 30048 cur_val:=0; 30049 end else begin 30050 scan_int; { glyph number } 30051 cur_val:=get_glyph_bounds(cur_font, n, cur_val); 30052 end 30053 end else begin 30054 not_native_font_error(last_item, m, cur_font); cur_val:=0 30055 end 30056 end; 30057 30058@ @<Cases of |convert| for |print_cmd_chr|@>= 30059eTeX_revision_code: print_esc("eTeXrevision"); 30060XeTeX_revision_code: print_esc("XeTeXrevision"); 30061 30062XeTeX_variation_name_code: print_esc("XeTeXvariationname"); 30063XeTeX_feature_name_code: print_esc("XeTeXfeaturename"); 30064XeTeX_selector_name_code: print_esc("XeTeXselectorname"); 30065XeTeX_glyph_name_code: print_esc("XeTeXglyphname"); 30066 30067XeTeX_Uchar_code: print_esc("Uchar"); 30068XeTeX_Ucharcat_code: print_esc("Ucharcat"); 30069 30070@ @<Cases of `Scan the argument for command |c|'@>= 30071eTeX_revision_code: do_nothing; 30072pdf_strcmp_code: 30073 begin 30074 save_scanner_status:=scanner_status; 30075 save_warning_index:=warning_index; 30076 save_def_ref:=def_ref; 30077 save_cur_string; 30078 compare_strings; 30079 def_ref:=save_def_ref; 30080 warning_index:=save_warning_index; 30081 scanner_status:=save_scanner_status; 30082 restore_cur_string; 30083 end; 30084XeTeX_revision_code: do_nothing; 30085 30086XeTeX_variation_name_code: 30087 begin 30088 scan_font_ident; fnt:=cur_val; 30089 if is_aat_font(fnt) then begin 30090 scan_int; arg1:=cur_val; arg2:=0; 30091 end else 30092 not_aat_font_error(convert, c, fnt); 30093 end; 30094 30095XeTeX_feature_name_code: 30096 begin 30097 scan_font_ident; fnt:=cur_val; 30098 if is_aat_font(fnt) or is_gr_font(fnt) then begin 30099 scan_int; arg1:=cur_val; arg2:=0; 30100 end else 30101 not_aat_gr_font_error(convert, c, fnt); 30102 end; 30103 30104XeTeX_selector_name_code: 30105 begin 30106 scan_font_ident; fnt:=cur_val; 30107 if is_aat_font(fnt) or is_gr_font(fnt) then begin 30108 scan_int; arg1:=cur_val; scan_int; arg2:=cur_val; 30109 end else 30110 not_aat_gr_font_error(convert, c, fnt); 30111 end; 30112 30113XeTeX_glyph_name_code: 30114 begin 30115 scan_font_ident; fnt:=cur_val; 30116 if is_native_font(fnt) then begin 30117 scan_int; arg1:=cur_val; 30118 end else 30119 not_native_font_error(convert, c, fnt); 30120 end; 30121 30122left_margin_kern_code, right_margin_kern_code: begin 30123 scan_register_num; 30124 fetch_box(p); 30125 if (p = null) or (type(p) <> hlist_node) then 30126 pdf_error("marginkern", "a non-empty hbox expected") 30127end; 30128 30129@ @<Cases of `Print the result of command |c|'@>= 30130eTeX_revision_code: print(eTeX_revision); 30131pdf_strcmp_code: print_int(cur_val); 30132XeTeX_revision_code: print(XeTeX_revision); 30133 30134XeTeX_variation_name_code: 30135 if is_aat_font(fnt) then 30136 aat_print_font_name(c, font_layout_engine[fnt], arg1, arg2); 30137 30138XeTeX_feature_name_code, 30139XeTeX_selector_name_code: 30140 if is_aat_font(fnt) then 30141 aat_print_font_name(c, font_layout_engine[fnt], arg1, arg2) 30142 else if is_gr_font(fnt) then 30143 gr_print_font_name(c, font_layout_engine[fnt], arg1, arg2); 30144 30145XeTeX_glyph_name_code: 30146 if is_native_font(fnt) then print_glyph_name(fnt, arg1); 30147 30148left_margin_kern_code: begin 30149 p:=list_ptr(p); 30150 while (p <> null) and 30151 (cp_skipable(p) or 30152 ((not is_char_node(p)) and (type(p) = glue_node) and (subtype(p) = left_skip_code + 1))) 30153 do 30154 p:=link(p); 30155 if (p <> null) and (not is_char_node(p)) and (type(p) = margin_kern_node) and (subtype(p) = left_side) then 30156 print_scaled(width(p)) 30157 else 30158 print("0"); 30159 print("pt"); 30160end; 30161 30162right_margin_kern_code: begin 30163 q:=list_ptr(p); 30164 p:=prev_rightmost(q, null); 30165 while (p <> null) and 30166 (cp_skipable(p) or 30167 ((not is_char_node(p)) and (type(p) = glue_node) and (subtype(p) = right_skip_code + 1))) 30168 do 30169 p:=prev_rightmost(q, p); 30170 if (p <> null) and (not is_char_node(p)) and (type(p) = margin_kern_node) and (subtype(p) = right_side) then 30171 print_scaled(width(p)) 30172 else 30173 print("0"); 30174 print("pt"); 30175end; 30176 30177@ @d eTeX_ex==(eTeX_mode=1) {is this extended mode?} 30178 30179@<Glob...@>= 30180@!eTeX_mode: 0..1; {identifies compatibility and extended mode} 30181 30182@ @<Initialize table entries...@>= 30183eTeX_mode:=0; {initially we are in compatibility mode} 30184@<Initialize variables for \eTeX\ compatibility mode@>@; 30185 30186@ @<Dump the \eTeX\ state@>= 30187dump_int(eTeX_mode); 30188{ in a deliberate change from e-TeX, we allow non-zero state variables to be dumped } 30189 30190@ @<Undump the \eTeX\ state@>= 30191undump(0)(1)(eTeX_mode); 30192if eTeX_ex then 30193 begin @<Initialize variables for \eTeX\ extended mode@>@; 30194 end 30195else begin @<Initialize variables for \eTeX\ compatibility mode@>@; 30196 end; 30197 30198@ The |eTeX_enabled| function simply returns its first argument as 30199result. This argument is |true| if an optional \eTeX\ feature is 30200currently enabled; otherwise, if the argument is |false|, the function 30201gives an error message. 30202 30203@<Declare \eTeX\ procedures for use...@>= 30204function eTeX_enabled(@!b:boolean;@!j:quarterword;@!k:halfword):boolean; 30205begin if not b then 30206 begin print_err("Improper "); print_cmd_chr(j,k); 30207 help1("Sorry, this optional e-TeX feature has been disabled."); error; 30208 end; 30209eTeX_enabled:=b; 30210end; 30211 30212@ First we implement the additional \eTeX\ parameters in the table of 30213equivalents. 30214 30215@<Generate all \eTeX...@>= 30216primitive("everyeof",assign_toks,every_eof_loc); 30217@!@:every_eof_}{\.{\\everyeof} primitive@> 30218primitive("tracingassigns",assign_int,int_base+tracing_assigns_code);@/ 30219@!@:tracing_assigns_}{\.{\\tracingassigns} primitive@> 30220primitive("tracinggroups",assign_int,int_base+tracing_groups_code);@/ 30221@!@:tracing_groups_}{\.{\\tracinggroups} primitive@> 30222primitive("tracingifs",assign_int,int_base+tracing_ifs_code);@/ 30223@!@:tracing_ifs_}{\.{\\tracingifs} primitive@> 30224primitive("tracingscantokens",assign_int,int_base+tracing_scan_tokens_code);@/ 30225@!@:tracing_scan_tokens_}{\.{\\tracingscantokens} primitive@> 30226primitive("tracingnesting",assign_int,int_base+tracing_nesting_code);@/ 30227@!@:tracing_nesting_}{\.{\\tracingnesting} primitive@> 30228primitive("predisplaydirection", 30229 assign_int,int_base+pre_display_direction_code);@/ 30230@!@:pre_display_direction_}{\.{\\predisplaydirection} primitive@> 30231primitive("lastlinefit",assign_int,int_base+last_line_fit_code);@/ 30232@!@:last_line_fit_}{\.{\\lastlinefit} primitive@> 30233primitive("savingvdiscards",assign_int,int_base+saving_vdiscards_code);@/ 30234@!@:saving_vdiscards_}{\.{\\savingvdiscards} primitive@> 30235primitive("savinghyphcodes",assign_int,int_base+saving_hyph_codes_code);@/ 30236@!@:saving_hyph_codes_}{\.{\\savinghyphcodes} primitive@> 30237 30238@ @d every_eof==equiv(every_eof_loc) 30239 30240@<Cases of |assign_toks| for |print_cmd_chr|@>= 30241every_eof_loc: print_esc("everyeof"); 30242XeTeX_inter_char_loc: print_esc("XeTeXinterchartoks"); 30243 30244@ @<Cases for |print_param|@>= 30245tracing_assigns_code:print_esc("tracingassigns"); 30246tracing_groups_code:print_esc("tracinggroups"); 30247tracing_ifs_code:print_esc("tracingifs"); 30248tracing_scan_tokens_code:print_esc("tracingscantokens"); 30249tracing_nesting_code:print_esc("tracingnesting"); 30250pre_display_direction_code:print_esc("predisplaydirection"); 30251last_line_fit_code:print_esc("lastlinefit"); 30252saving_vdiscards_code:print_esc("savingvdiscards"); 30253saving_hyph_codes_code:print_esc("savinghyphcodes"); 30254 30255@ In order to handle \.{\\everyeof} we need an array |eof_seen| of 30256boolean variables. 30257 30258@<Glob...@>= 30259@!eof_seen : array[1..max_in_open] of boolean; {has eof been seen?} 30260 30261@ The |print_group| procedure prints the current level of grouping and 30262the name corresponding to |cur_group|. 30263 30264@<Declare \eTeX\ procedures for tr...@>= 30265procedure print_group(@!e:boolean); 30266label exit; 30267begin case cur_group of 30268 bottom_level: begin print("bottom level"); return; 30269 end; 30270 simple_group,semi_simple_group: 30271 begin if cur_group=semi_simple_group then print("semi "); 30272 print("simple"); 30273 end; 30274 hbox_group,adjusted_hbox_group: 30275 begin if cur_group=adjusted_hbox_group then print("adjusted "); 30276 print("hbox"); 30277 end; 30278 vbox_group: print("vbox"); 30279 vtop_group: print("vtop"); 30280 align_group,no_align_group: 30281 begin if cur_group=no_align_group then print("no "); 30282 print("align"); 30283 end; 30284 output_group: print("output"); 30285 disc_group: print("disc"); 30286 insert_group: print("insert"); 30287 vcenter_group: print("vcenter"); 30288 math_group,math_choice_group,math_shift_group,math_left_group: 30289 begin print("math"); 30290 if cur_group=math_choice_group then print(" choice") 30291 else if cur_group=math_shift_group then print(" shift") 30292 else if cur_group=math_left_group then print(" left"); 30293 end; 30294 end; {there are no other cases} 30295print(" group (level "); print_int(qo(cur_level)); print_char(")"); 30296if saved(-1)<>0 then 30297 begin if e then print(" entered at line ") else print(" at line "); 30298 print_int(saved(-1)); 30299 end; 30300exit:end; 30301 30302@ The |group_trace| procedure is called when a new level of grouping 30303begins (|e=false|) or ends (|e=true|) with |saved(-1)| containing the 30304line number. 30305 30306@<Declare \eTeX\ procedures for tr...@>= 30307@!stat procedure group_trace(@!e:boolean); 30308begin begin_diagnostic; print_char("{"); 30309if e then print("leaving ") else print("entering "); 30310print_group(e); print_char("}"); end_diagnostic(false); 30311end; 30312tats 30313 30314@ The \.{\\currentgrouplevel} and \.{\\currentgrouptype} commands return 30315the current level of grouping and the type of the current group 30316respectively. 30317 30318@d current_group_level_code=eTeX_int+1 {code for \.{\\currentgrouplevel}} 30319@d current_group_type_code=eTeX_int+2 {code for \.{\\currentgrouptype}} 30320 30321@<Generate all \eTeX...@>= 30322primitive("currentgrouplevel",last_item,current_group_level_code); 30323@!@:current_group_level_}{\.{\\currentgrouplevel} primitive@> 30324primitive("currentgrouptype",last_item,current_group_type_code); 30325@!@:current_group_type_}{\.{\\currentgrouptype} primitive@> 30326 30327@ @<Cases of |last_item| for |print_cmd_chr|@>= 30328current_group_level_code: print_esc("currentgrouplevel"); 30329current_group_type_code: print_esc("currentgrouptype"); 30330 30331@ @<Cases for fetching an integer value@>= 30332current_group_level_code: cur_val:=cur_level-level_one; 30333current_group_type_code: cur_val:=cur_group; 30334 30335@ The \.{\\currentiflevel}, \.{\\currentiftype}, and 30336\.{\\currentifbranch} commands return the current level of conditionals 30337and the type and branch of the current conditional. 30338 30339@d current_if_level_code=eTeX_int+3 {code for \.{\\currentiflevel}} 30340@d current_if_type_code=eTeX_int+4 {code for \.{\\currentiftype}} 30341@d current_if_branch_code=eTeX_int+5 {code for \.{\\currentifbranch}} 30342 30343@<Generate all \eTeX...@>= 30344primitive("currentiflevel",last_item,current_if_level_code); 30345@!@:current_if_level_}{\.{\\currentiflevel} primitive@> 30346primitive("currentiftype",last_item,current_if_type_code); 30347@!@:current_if_type_}{\.{\\currentiftype} primitive@> 30348primitive("currentifbranch",last_item,current_if_branch_code); 30349@!@:current_if_branch_}{\.{\\currentifbranch} primitive@> 30350 30351@ @<Cases of |last_item| for |print_cmd_chr|@>= 30352current_if_level_code: print_esc("currentiflevel"); 30353current_if_type_code: print_esc("currentiftype"); 30354current_if_branch_code: print_esc("currentifbranch"); 30355 30356@ @<Cases for fetching an integer value@>= 30357current_if_level_code: begin q:=cond_ptr; cur_val:=0; 30358 while q<>null do 30359 begin incr(cur_val); q:=link(q); 30360 end; 30361 end; 30362current_if_type_code: if cond_ptr=null then cur_val:=0 30363 else if cur_if<unless_code then cur_val:=cur_if+1 30364 else cur_val:=-(cur_if-unless_code+1); 30365current_if_branch_code: 30366 if (if_limit=or_code)or(if_limit=else_code) then cur_val:=1 30367 else if if_limit=fi_code then cur_val:=-1 30368 else cur_val:=0; 30369 30370@ The \.{\\fontcharwd}, \.{\\fontcharht}, \.{\\fontchardp}, and 30371\.{\\fontcharic} commands return information about a character in a 30372font. 30373 30374@d font_char_wd_code=eTeX_dim {code for \.{\\fontcharwd}} 30375@d font_char_ht_code=eTeX_dim+1 {code for \.{\\fontcharht}} 30376@d font_char_dp_code=eTeX_dim+2 {code for \.{\\fontchardp}} 30377@d font_char_ic_code=eTeX_dim+3 {code for \.{\\fontcharic}} 30378 30379@<Generate all \eTeX...@>= 30380primitive("fontcharwd",last_item,font_char_wd_code); 30381@!@:font_char_wd_}{\.{\\fontcharwd} primitive@> 30382primitive("fontcharht",last_item,font_char_ht_code); 30383@!@:font_char_ht_}{\.{\\fontcharht} primitive@> 30384primitive("fontchardp",last_item,font_char_dp_code); 30385@!@:font_char_dp_}{\.{\\fontchardp} primitive@> 30386primitive("fontcharic",last_item,font_char_ic_code); 30387@!@:font_char_ic_}{\.{\\fontcharic} primitive@> 30388 30389@ @<Cases of |last_item| for |print_cmd_chr|@>= 30390font_char_wd_code: print_esc("fontcharwd"); 30391font_char_ht_code: print_esc("fontcharht"); 30392font_char_dp_code: print_esc("fontchardp"); 30393font_char_ic_code: print_esc("fontcharic"); 30394 30395@ @<Cases for fetching a dimension value@>= 30396font_char_wd_code, 30397font_char_ht_code, 30398font_char_dp_code, 30399font_char_ic_code: begin scan_font_ident; q:=cur_val; scan_usv_num; 30400 if is_native_font(q) then begin 30401 case m of 30402 font_char_wd_code: cur_val:=getnativecharwd(q, cur_val); 30403 font_char_ht_code: cur_val:=getnativecharht(q, cur_val); 30404 font_char_dp_code: cur_val:=getnativechardp(q, cur_val); 30405 font_char_ic_code: cur_val:=getnativecharic(q, cur_val); 30406 end; {there are no other cases} 30407 end else begin 30408 if (font_bc[q]<=cur_val)and(font_ec[q]>=cur_val) then 30409 begin i:=char_info(q)(qi(cur_val)); 30410 case m of 30411 font_char_wd_code: cur_val:=char_width(q)(i); 30412 font_char_ht_code: cur_val:=char_height(q)(height_depth(i)); 30413 font_char_dp_code: cur_val:=char_depth(q)(height_depth(i)); 30414 font_char_ic_code: cur_val:=char_italic(q)(i); 30415 end; {there are no other cases} 30416 end 30417 else cur_val:=0; 30418 end 30419 end; 30420 30421@ The \.{\\parshapedimen}, \.{\\parshapeindent}, and \.{\\parshapelength} 30422commands return the indent and length parameters of the current 30423\.{\\parshape} specification. 30424 30425@d par_shape_length_code=eTeX_dim+4 {code for \.{\\parshapelength}} 30426@d par_shape_indent_code=eTeX_dim+5 {code for \.{\\parshapeindent}} 30427@d par_shape_dimen_code=eTeX_dim+6 {code for \.{\\parshapedimen}} 30428 30429@<Generate all \eTeX...@>= 30430primitive("parshapelength",last_item,par_shape_length_code); 30431@!@:par_shape_length_}{\.{\\parshapelength} primitive@> 30432primitive("parshapeindent",last_item,par_shape_indent_code); 30433@!@:par_shape_indent_}{\.{\\parshapeindent} primitive@> 30434primitive("parshapedimen",last_item,par_shape_dimen_code); 30435@!@:par_shape_dimen_}{\.{\\parshapedimen} primitive@> 30436 30437@ @<Cases of |last_item| for |print_cmd_chr|@>= 30438par_shape_length_code: print_esc("parshapelength"); 30439par_shape_indent_code: print_esc("parshapeindent"); 30440par_shape_dimen_code: print_esc("parshapedimen"); 30441 30442@ @<Cases for fetching a dimension value@>= 30443par_shape_length_code, 30444par_shape_indent_code, 30445par_shape_dimen_code: begin q:=cur_chr-par_shape_length_code; scan_int; 30446 if (par_shape_ptr=null)or(cur_val<=0) then cur_val:=0 30447 else begin if q=2 then 30448 begin q:=cur_val mod 2; cur_val:=(cur_val+q)div 2; 30449 end; 30450 if cur_val>info(par_shape_ptr) then cur_val:=info(par_shape_ptr); 30451 cur_val:=mem[par_shape_ptr+2*cur_val-q].sc; 30452 end; 30453 cur_val_level:=dimen_val; 30454 end; 30455 30456@ The \.{\\showgroups} command displays all currently active grouping 30457levels. 30458 30459@d show_groups=4 { \.{\\showgroups} } 30460 30461@<Generate all \eTeX...@>= 30462primitive("showgroups",xray,show_groups); 30463@!@:show_groups_}{\.{\\showgroups} primitive@> 30464 30465@ @<Cases of |xray| for |print_cmd_chr|@>= 30466show_groups:print_esc("showgroups"); 30467 30468@ @<Cases for |show_whatever|@>= 30469show_groups: begin begin_diagnostic; show_save_groups; 30470 end; 30471 30472@ @<Types...@>= 30473@!save_pointer=0..save_size; {index into |save_stack|} 30474 30475@ The modifications of \TeX\ required for the display produced by the 30476|show_save_groups| procedure were first discussed by Donald~E. Knuth in 30477{\sl TUGboat\/} {\bf 11}, 165--170 and 499--511, 1990. 30478@^Knuth, Donald Ervin@> 30479 30480In order to understand a group type we also have to know its mode. 30481Since unrestricted horizontal modes are not associated with grouping, 30482they are skipped when traversing the semantic nest. 30483 30484@<Declare \eTeX\ procedures for use...@>= 30485procedure show_save_groups; 30486label found1,found2,found,done; 30487var p:0..nest_size; {index into |nest|} 30488@!m:-mmode..mmode; {mode} 30489@!v:save_pointer; {saved value of |save_ptr|} 30490@!l:quarterword; {saved value of |cur_level|} 30491@!c:group_code; {saved value of |cur_group|} 30492@!a:-1..1; {to keep track of alignments} 30493@!i:integer; 30494@!j:quarterword; 30495@!s:str_number; 30496begin p:=nest_ptr; nest[p]:=cur_list; {put the top level into the array} 30497v:=save_ptr; l:=cur_level; c:=cur_group; 30498save_ptr:=cur_boundary; decr(cur_level);@/ 30499a:=1; 30500print_nl(""); print_ln; 30501loop@+begin print_nl("### "); print_group(true); 30502 if cur_group=bottom_level then goto done; 30503 repeat m:=nest[p].mode_field; 30504 if p>0 then decr(p) else m:=vmode; 30505 until m<>hmode; 30506 print(" ("); 30507 case cur_group of 30508 simple_group: begin incr(p); goto found2; 30509 end; 30510 hbox_group,adjusted_hbox_group: s:="hbox"; 30511 vbox_group: s:="vbox"; 30512 vtop_group: s:="vtop"; 30513 align_group: if a=0 then 30514 begin if m=-vmode then s:="halign" else s:="valign"; 30515 a:=1; goto found1; 30516 end 30517 else begin if a=1 then print("align entry") else print_esc("cr"); 30518 if p>=a then p:=p-a; 30519 a:=0; goto found; 30520 end; 30521 no_align_group: 30522 begin incr(p); a:=-1; print_esc("noalign"); goto found2; 30523 end; 30524 output_group: 30525 begin print_esc("output"); goto found; 30526 end; 30527 math_group: goto found2; 30528 disc_group,math_choice_group: 30529 begin if cur_group=disc_group then print_esc("discretionary") 30530 else print_esc("mathchoice"); 30531 for i:=1 to 3 do if i<=saved(-2) then print("{}"); 30532 goto found2; 30533 end; 30534 insert_group: 30535 begin if saved(-2)=255 then print_esc("vadjust") 30536 else begin print_esc("insert"); print_int(saved(-2)); 30537 end; 30538 goto found2; 30539 end; 30540 vcenter_group: begin s:="vcenter"; goto found1; 30541 end; 30542 semi_simple_group: begin incr(p); print_esc("begingroup"); goto found; 30543 end; 30544 math_shift_group: 30545 begin if m=mmode then print_char("$") 30546 else if nest[p].mode_field=mmode then 30547 begin print_cmd_chr(eq_no,saved(-2)); goto found; 30548 end; 30549 print_char("$"); goto found; 30550 end; 30551 math_left_group: 30552 begin if type(nest[p+1].eTeX_aux_field)=left_noad then print_esc("left") 30553 else print_esc("middle"); 30554 goto found; 30555 end; 30556 end; {there are no other cases} 30557 @<Show the box context@>; 30558 found1: print_esc(s); @<Show the box packaging info@>; 30559 found2: print_char("{"); 30560 found: print_char(")"); decr(cur_level); 30561 cur_group:=save_level(save_ptr); save_ptr:=save_index(save_ptr) 30562 end; 30563done: save_ptr:=v; cur_level:=l; cur_group:=c; 30564end; 30565 30566@ @<Show the box packaging info@>= 30567if saved(-2)<>0 then 30568 begin print_char(" "); 30569 if saved(-3)=exactly then print("to") else print("spread"); 30570 print_scaled(saved(-2)); print("pt"); 30571 end 30572 30573@ @<Show the box context@>= 30574i:=saved(-4); 30575if i<>0 then 30576 if i<box_flag then 30577 begin if abs(nest[p].mode_field)=vmode then j:=hmove else j:=vmove; 30578 if i>0 then print_cmd_chr(j,0) else print_cmd_chr(j,1); 30579 print_scaled(abs(i)); print("pt"); 30580 end 30581 else if i<ship_out_flag then 30582 begin if i>=global_box_flag then 30583 begin print_esc("global"); i:=i-(global_box_flag-box_flag); 30584 end; 30585 print_esc("setbox"); print_int(i-box_flag); print_char("="); 30586 end 30587 else print_cmd_chr(leader_ship,i-(leader_flag-a_leaders)) 30588 30589@ The |scan_general_text| procedure is much like |scan_toks(false,false)|, 30590but will be invoked via |expand|, i.e., recursively. 30591@^recursion@> 30592 30593@<Declare \eTeX\ procedures for sc...@>= 30594procedure@?scan_general_text; forward;@t\2@> 30595 30596@ The token list (balanced text) created by |scan_general_text| begins 30597at |link(temp_head)| and ends at |cur_val|. (If |cur_val=temp_head|, 30598the list is empty.) 30599 30600@<Declare \eTeX\ procedures for tok...@>= 30601procedure scan_general_text; 30602label found; 30603var s:normal..absorbing; {to save |scanner_status|} 30604@!w:pointer; {to save |warning_index|} 30605@!d:pointer; {to save |def_ref|} 30606@!p:pointer; {tail of the token list being built} 30607@!q:pointer; {new node being added to the token list via |store_new_token|} 30608@!unbalance:halfword; {number of unmatched left braces} 30609begin s:=scanner_status; w:=warning_index; d:=def_ref; 30610scanner_status:=absorbing; warning_index:=cur_cs; 30611def_ref:=get_avail; token_ref_count(def_ref):=null; p:=def_ref; 30612scan_left_brace; {remove the compulsory left brace} 30613unbalance:=1; 30614loop@+ begin get_token; 30615 if cur_tok<right_brace_limit then 30616 if cur_cmd<right_brace then incr(unbalance) 30617 else begin decr(unbalance); 30618 if unbalance=0 then goto found; 30619 end; 30620 store_new_token(cur_tok); 30621 end; 30622found: q:=link(def_ref); free_avail(def_ref); {discard reference count} 30623if q=null then cur_val:=temp_head @+ else cur_val:=p; 30624link(temp_head):=q; 30625scanner_status:=s; warning_index:=w; def_ref:=d; 30626end; 30627 30628@ The \.{\\showtokens} command displays a token list. 30629 30630@d show_tokens=5 { \.{\\showtokens} , must be odd! } 30631 30632@<Generate all \eTeX...@>= 30633primitive("showtokens",xray,show_tokens); 30634@!@:show_tokens_}{\.{\\showtokens} primitive@> 30635 30636@ @<Cases of |xray| for |print_cmd_chr|@>= 30637show_tokens:print_esc("showtokens"); 30638 30639@ The \.{\\unexpanded} primitive prevents expansion of tokens much as 30640the result from \.{\\the} applied to a token variable. The 30641\.{\\detokenize} primitive converts a token list into a list of 30642character tokens much as if the token list were written to a file. We 30643use the fact that the command modifiers for \.{\\unexpanded} and 30644\.{\\detokenize} are odd whereas those for \.{\\the} and \.{\\showthe} 30645are even. 30646 30647@<Generate all \eTeX...@>= 30648primitive("unexpanded",the,1);@/ 30649@!@:unexpanded_}{\.{\\unexpanded} primitive@> 30650primitive("detokenize",the,show_tokens);@/ 30651@!@:detokenize_}{\.{\\detokenize} primitive@> 30652 30653@ @<Cases of |the| for |print_cmd_chr|@>= 30654else if chr_code=1 then print_esc("unexpanded") 30655else print_esc("detokenize") 30656 30657@ @<Handle \.{\\unexpanded} or \.{\\detokenize} and |return|@>= 30658if odd(cur_chr) then 30659 begin c:=cur_chr; scan_general_text; 30660 if c=1 then the_toks:=cur_val 30661 else begin old_setting:=selector; selector:=new_string; b:=pool_ptr; 30662 p:=get_avail; link(p):=link(temp_head); 30663 token_show(p); flush_list(p); 30664 selector:=old_setting; the_toks:=str_toks(b); 30665 end; 30666 return; 30667 end 30668 30669@ The \.{\\showifs} command displays all currently active conditionals. 30670 30671@d show_ifs=6 { \.{\\showifs} } 30672 30673@<Generate all \eTeX...@>= 30674primitive("showifs",xray,show_ifs); 30675@!@:show_ifs_}{\.{\\showifs} primitive@> 30676 30677@ @<Cases of |xray| for |print_cmd_chr|@>= 30678show_ifs:print_esc("showifs"); 30679 30680@ 30681@d print_if_line(#)==if #<>0 then 30682 begin print(" entered on line "); print_int(#); 30683 end 30684 30685@<Cases for |show_whatever|@>= 30686show_ifs: begin begin_diagnostic; print_nl(""); print_ln; 30687 if cond_ptr=null then 30688 begin print_nl("### "); print("no active conditionals"); 30689 end 30690 else begin p:=cond_ptr; n:=0; 30691 repeat incr(n); p:=link(p);@+until p=null; 30692 p:=cond_ptr; t:=cur_if; l:=if_line; m:=if_limit; 30693 repeat print_nl("### level "); print_int(n); print(": "); 30694 print_cmd_chr(if_test,t); 30695 if m=fi_code then print_esc("else"); 30696 print_if_line(l); 30697 decr(n); t:=subtype(p); l:=if_line_field(p); m:=type(p); p:=link(p); 30698 until p=null; 30699 end; 30700 end; 30701 30702@ The \.{\\interactionmode} primitive allows to query and set the 30703interaction mode. 30704 30705@<Generate all \eTeX...@>= 30706primitive("interactionmode",set_page_int,2); 30707@!@:interaction_mode_}{\.{\\interactionmode} primitive@> 30708 30709@ @<Cases of |set_page_int| for |print_cmd_chr|@>= 30710else if chr_code=2 then print_esc("interactionmode") 30711 30712@ @<Cases for `Fetch the |dead_cycles| or the |insert_penalties|'@>= 30713else if m=2 then cur_val:=interaction 30714 30715@ @<Declare \eTeX\ procedures for use...@>= 30716procedure@?new_interaction; forward;@t\2@> 30717 30718@ @<Cases for |alter_integer|@>= 30719else if c=2 then 30720 begin if (cur_val<batch_mode)or(cur_val>error_stop_mode) then 30721 begin print_err("Bad interaction mode"); 30722@.Bad interaction mode@> 30723 help2("Modes are 0=batch, 1=nonstop, 2=scroll, and")@/ 30724 ("3=errorstop. Proceed, and I'll ignore this case."); 30725 int_error(cur_val); 30726 end 30727 else begin cur_chr:=cur_val; new_interaction; 30728 end; 30729 end 30730 30731@ The |middle| feature of \eTeX\ allows one ore several \.{\\middle} 30732delimiters to appear between \.{\\left} and \.{\\right}. 30733 30734@<Generate all \eTeX...@>= 30735primitive("middle",left_right,middle_noad); 30736@!@:middle_}{\.{\\middle} primitive@> 30737 30738@ @<Cases of |left_right| for |print_cmd_chr|@>= 30739else if chr_code=middle_noad then print_esc("middle") 30740 30741@ In constructions such as 30742$$\vbox{\halign{\.{#}\hfil\cr 30743{}\\hbox to \\hsize\{\cr 30744\hskip 25pt \\hskip 0pt plus 0.0001fil\cr 30745\hskip 25pt ...\cr 30746\hskip 25pt \\hfil\\penalty-200\\hfilneg\cr 30747\hskip 25pt ...\}\cr}}$$ 30748the stretch components of \.{\\hfil} and \.{\\hfilneg} compensate; they may, 30749however, get modified in order to prevent arithmetic overflow during 30750|hlist_out| when each of them is multiplied by a large |glue_set| value. 30751 30752Since this ``glue rounding'' depends on state variables |cur_g| and 30753|cur_glue| and \TeXXeT\ is supposed to emulate the behaviour of \TeXeT\ 30754(plus a suitable postprocessor) as close as possible the glue rounding 30755cannot be postponed until (segments of) an hlist has been reversed. 30756 30757The code below is invoked after the effective width, |rule_wd|, of a glue 30758node has been computed. The glue node is either converted into a kern node 30759or, for leaders, the glue specification is replaced by an equivalent rigid 30760one; the subtype of the glue node remains unchanged. 30761 30762@<Handle a glue node for mixed...@>= 30763if (((g_sign=stretching) and (stretch_order(g)=g_order)) or 30764 ((g_sign=shrinking) and (shrink_order(g)=g_order))) then 30765 begin fast_delete_glue_ref(g); 30766 if subtype(p)<a_leaders then 30767 begin type(p):=kern_node; width(p):=rule_wd; 30768 end 30769 else begin g:=get_node(glue_spec_size);@/ 30770 stretch_order(g):=filll+1; shrink_order(g):=filll+1; {will never match} 30771 width(g):=rule_wd; stretch(g):=0; shrink(g):=0; glue_ptr(p):=g; 30772 end; 30773 end 30774 30775@ The optional |TeXXeT| feature of \eTeX\ contains the code for mixed 30776left-to-right and right-to-left typesetting. This code is inspired by 30777but different from \TeXeT\ as presented by Donald~E. Knuth and Pierre 30778MacKay in {\sl TUGboat\/} {\bf 8}, 14--25, 1987. 30779@^Knuth, Donald Ervin@> 30780@^MacKay, Pierre@> 30781 30782In order to avoid confusion with \TeXeT\ the present implementation of 30783mixed direction typesetting is called \TeXXeT. It differs from \TeXeT\ 30784in several important aspects: (1)~Right-to-left text is reversed 30785explicitly by the |ship_out| routine and is written to a normal \.{DVI} 30786file without any |begin_reflect| or |end_reflect| commands; (2)~a 30787|math_node| is (ab)used instead of a |whatsit_node| to record the 30788\.{\\beginL}, \.{\\endL}, \.{\\beginR}, and \.{\\endR} text direction 30789primitives in order to keep the influence on the line breaking algorithm 30790for pure left-to-right text as small as possible; (3)~right-to-left text 30791interrupted by a displayed equation is automatically resumed after that 30792equation; and (4)~the |valign| command code with a non-zero command 30793modifier is (ab)used for the text direction primitives. 30794 30795Nevertheless there is a subtle difference between \TeX\ and \TeXXeT\ 30796that may influence the line breaking algorithm for pure left-to-right 30797text. When a paragraph containing math mode material is broken into 30798lines \TeX\ may generate lines where math mode material is not enclosed 30799by properly nested \.{\\mathon} and \.{\\mathoff} nodes. Unboxing such 30800lines as part of a new paragraph may have the effect that hyphenation is 30801attempted for `words' originating from math mode or that hyphenation is 30802inhibited for words originating from horizontal mode. 30803 30804In \TeXXeT\ additional \.{\\beginM}, resp.\ \.{\\endM} math nodes are 30805supplied at the start, resp.\ end of lines such that math mode material 30806inside a horizontal list always starts with either \.{\\mathon} or 30807\.{\\beginM} and ends with \.{\\mathoff} or \.{\\endM}. These 30808additional nodes are transparent to operations such as \.{\\unskip}, 30809\.{\\lastpenalty}, or \.{\\lastbox} but they do have the effect that 30810hyphenation is never attempted for `words' originating from math mode 30811and is never inhibited for words originating from horizontal mode. 30812 30813@d TeXXeT_state==eTeX_state(TeXXeT_code) 30814@d TeXXeT_en==(TeXXeT_state>0) {is \TeXXeT\ enabled?} 30815 30816@d XeTeX_upwards_state==eTeX_state(XeTeX_upwards_code) 30817@d XeTeX_upwards==(XeTeX_upwards_state>0) 30818 30819@d XeTeX_use_glyph_metrics_state==eTeX_state(XeTeX_use_glyph_metrics_code) 30820@d XeTeX_use_glyph_metrics==(XeTeX_use_glyph_metrics_state>0) 30821 30822@d XeTeX_inter_char_tokens_state==eTeX_state(XeTeX_inter_char_tokens_code) 30823@d XeTeX_inter_char_tokens_en==(XeTeX_inter_char_tokens_state>0) 30824 30825@d XeTeX_dash_break_state == eTeX_state(XeTeX_dash_break_code) 30826@d XeTeX_dash_break_en == (XeTeX_dash_break_state>0) 30827 30828@d XeTeX_input_normalization_state == eTeX_state(XeTeX_input_normalization_code) 30829@d XeTeX_tracing_fonts_state == eTeX_state(XeTeX_tracing_fonts_code) 30830 30831@d XeTeX_default_input_mode == eTeX_state(XeTeX_default_input_mode_code) 30832@d XeTeX_default_input_encoding == eTeX_state(XeTeX_default_input_encoding_code) 30833 30834@<Cases for |print_param|@>= 30835suppress_fontnotfound_error_code:print_esc("suppressfontnotfounderror"); 30836eTeX_state_code+TeXXeT_code:print_esc("TeXXeTstate"); 30837eTeX_state_code+XeTeX_upwards_code:print_esc("XeTeXupwardsmode"); 30838eTeX_state_code+XeTeX_use_glyph_metrics_code:print_esc("XeTeXuseglyphmetrics"); 30839eTeX_state_code+XeTeX_inter_char_tokens_code:print_esc("XeTeXinterchartokenstate"); 30840eTeX_state_code+XeTeX_dash_break_code:print_esc("XeTeXdashbreakstate"); 30841eTeX_state_code+XeTeX_input_normalization_code:print_esc("XeTeXinputnormalization"); 30842eTeX_state_code+XeTeX_tracing_fonts_code:print_esc("XeTeXtracingfonts"); 30843 30844@ @<Generate all \eTeX...@>= 30845primitive("suppressfontnotfounderror",assign_int,int_base+suppress_fontnotfound_error_code);@/ 30846primitive("TeXXeTstate",assign_int,eTeX_state_base+TeXXeT_code); 30847@!@:TeXXeT_state_}{\.{\\TeXXeT_state} primitive@> 30848primitive("XeTeXupwardsmode",assign_int,eTeX_state_base+XeTeX_upwards_code); 30849@!@:XeTeX_upwards_mode_}{\.{\\XeTeXupwardsmode} primitive@> 30850primitive("XeTeXuseglyphmetrics",assign_int,eTeX_state_base+XeTeX_use_glyph_metrics_code); 30851@!@:XeTeX_use_glyph_metrics_}{\.{\\XeTeXuseglyphmetrics} primitive@> 30852primitive("XeTeXinterchartokenstate",assign_int,eTeX_state_base+XeTeX_inter_char_tokens_code); 30853@!@:XeTeX_use_inter_char_tokens_}{\.{\\XeTeXinterchartokenstate} primitive@> 30854 30855primitive("XeTeXdashbreakstate",assign_int,eTeX_state_base+XeTeX_dash_break_code); 30856@!@:XeTeX_dash_break_state_}{\.{\\XeTeXdashbreakstate} primitive@> 30857 30858primitive("XeTeXinputnormalization",assign_int,eTeX_state_base+XeTeX_input_normalization_code); 30859@!@:XeTeX_input_normalization_}{\.{\\XeTeXinputnormalization} primitive@> 30860 30861primitive("XeTeXtracingfonts",assign_int,eTeX_state_base+XeTeX_tracing_fonts_code); 30862 30863primitive("XeTeXinputencoding",extension,XeTeX_input_encoding_extension_code); 30864primitive("XeTeXdefaultencoding",extension,XeTeX_default_encoding_extension_code); 30865primitive("beginL",valign,begin_L_code); 30866@!@:beginL_}{\.{\\beginL} primitive@> 30867primitive("endL",valign,end_L_code); 30868@!@:endL_}{\.{\\endL} primitive@> 30869primitive("beginR",valign,begin_R_code); 30870@!@:beginR_}{\.{\\beginR} primitive@> 30871primitive("endR",valign,end_R_code); 30872@!@:endR_}{\.{\\endR} primitive@> 30873 30874@ @<Cases of |valign| for |print_cmd_chr|@>= 30875else case chr_code of 30876 begin_L_code: print_esc("beginL"); 30877 end_L_code: print_esc("endL"); 30878 begin_R_code: print_esc("beginR"); 30879 othercases print_esc("endR") 30880 endcases 30881 30882@ @<Cases of |main_control| for |hmode+valign|@>= 30883if cur_chr>0 then 30884 begin if eTeX_enabled(TeXXeT_en,cur_cmd,cur_chr) then 30885@.Improper \\beginL@> 30886@.Improper \\endL@> 30887@.Improper \\beginR@> 30888@.Improper \\endR@> 30889 tail_append(new_math(0,cur_chr)); 30890 end 30891else 30892 30893@ An hbox with subtype dlist will never be reversed, even when embedded 30894in right-to-left text. 30895 30896@<Display if this box is never to be reversed@>= 30897if (type(p)=hlist_node)and(box_lr(p)=dlist) then print(", display") 30898 30899@ A number of routines are based on a stack of one-word nodes whose 30900|info| fields contain |end_M_code|, |end_L_code|, or |end_R_code|. The 30901top of the stack is pointed to by |LR_ptr|. 30902 30903When the stack manipulation macros of this section are used below, 30904variable |LR_ptr| might be the global variable declared here for |hpack| 30905and |ship_out|, or might be local to |post_line_break|. 30906 30907@d put_LR(#)==begin temp_ptr:=get_avail; info(temp_ptr):=#; 30908 link(temp_ptr):=LR_ptr; LR_ptr:=temp_ptr; 30909 end 30910@# 30911@d push_LR(#)==put_LR(end_LR_type(#)) 30912@# 30913@d pop_LR==begin temp_ptr:=LR_ptr; LR_ptr:=link(temp_ptr); 30914 free_avail(temp_ptr); 30915 end 30916 30917@<Glob...@>= 30918@!LR_ptr:pointer; {stack of LR codes for |hpack|, |ship_out|, and |init_math|} 30919@!LR_problems:integer; {counts missing begins and ends} 30920@!cur_dir:small_number; {current text direction} 30921 30922@ @<Set init...@>= 30923LR_ptr:=null; LR_problems:=0; cur_dir:=left_to_right; 30924 30925@ @<Insert LR nodes at the beg...@>= 30926begin q:=link(temp_head); 30927if LR_ptr<>null then 30928 begin temp_ptr:=LR_ptr; r:=q; 30929 repeat s:=new_math(0,begin_LR_type(info(temp_ptr))); link(s):=r; r:=s; 30930 temp_ptr:=link(temp_ptr); 30931 until temp_ptr=null; 30932 link(temp_head):=r; 30933 end; 30934while q<>cur_break(cur_p) do 30935 begin if not is_char_node(q) then 30936 if type(q)=math_node then @<Adjust \(t)the LR stack for the |p...@>; 30937 q:=link(q); 30938 end; 30939end 30940 30941@ @<Adjust \(t)the LR stack for the |p...@>= 30942if end_LR(q) then 30943 begin if LR_ptr<>null then if info(LR_ptr)=end_LR_type(q) then pop_LR; 30944 end 30945else push_LR(q) 30946 30947@ We use the fact that |q| now points to the node with \.{\\rightskip} glue. 30948 30949@<Insert LR nodes at the end...@>= 30950if LR_ptr<>null then 30951 begin s:=temp_head; r:=link(s); 30952 while r<>q do 30953 begin s:=r; r:=link(s); 30954 end; 30955 r:=LR_ptr; 30956 while r<>null do 30957 begin temp_ptr:=new_math(0,info(r)); 30958 link(s):=temp_ptr; s:=temp_ptr; r:=link(r); 30959 end; 30960 link(s):=q; 30961 end 30962 30963@ @<Initialize the LR stack@>= 30964put_LR(before) {this will never match} 30965 30966@ @<Adjust \(t)the LR stack for the |hp...@>= 30967if end_LR(p) then 30968 if info(LR_ptr)=end_LR_type(p) then pop_LR 30969 else begin incr(LR_problems); type(p):=kern_node; subtype(p):=explicit; 30970 end 30971else push_LR(p) 30972 30973@ @<Check for LR anomalies at the end of |hp...@>= 30974begin if info(LR_ptr)<>before then 30975 begin while link(q)<>null do q:=link(q); 30976 repeat temp_ptr:=q; q:=new_math(0,info(LR_ptr)); link(temp_ptr):=q; 30977 LR_problems:=LR_problems+10000; pop_LR; 30978 until info(LR_ptr)=before; 30979 end; 30980if LR_problems>0 then 30981 begin @<Report LR problems@>; goto common_ending; 30982 end; 30983pop_LR; 30984if LR_ptr<>null then confusion("LR1"); 30985@:this can't happen LR1}{\quad LR1@> 30986end 30987 30988@ @<Report LR problems@>= 30989begin print_ln; print_nl("\endL or \endR problem (");@/ 30990print_int(LR_problems div 10000); print(" missing, ");@/ 30991print_int(LR_problems mod 10000); print(" extra");@/ 30992LR_problems:=0; 30993end 30994 30995@ @<Initialize |hlist_out| for mixed...@>= 30996if eTeX_ex then 30997 begin @<Initialize the LR stack@>; 30998 if box_lr(this_box)=dlist then 30999 if cur_dir=right_to_left then 31000 begin cur_dir:=left_to_right; cur_h:=cur_h-width(this_box); 31001 end 31002 else set_box_lr(this_box)(0); 31003 if (cur_dir=right_to_left)and(box_lr(this_box)<>reversed) then 31004 @<Reverse the complete hlist and set the subtype to |reversed|@>; 31005 end 31006 31007@ @<Finish |hlist_out| for mixed...@>= 31008if eTeX_ex then 31009 begin @<Check for LR anomalies at the end of |hlist_out|@>; 31010 if box_lr(this_box)=dlist then cur_dir:=right_to_left; 31011 end 31012 31013@ @<Handle a math node in |hlist_out|@>= 31014begin if eTeX_ex then 31015 @<Adjust \(t)the LR stack for the |hlist_out| routine; if necessary 31016 reverse an hlist segment and |goto reswitch|@>; 31017 cur_h:=cur_h+width(p); 31018 end 31019 31020@ Breaking a paragraph into lines while \TeXXeT\ is disabled may result 31021in lines whith unpaired math nodes. Such hlists are silently accepted 31022in the absence of text direction directives. 31023 31024@d LR_dir(#)==(subtype(#) div R_code) {text direction of a `math node'} 31025 31026@<Adjust \(t)the LR stack for the |hl...@>= 31027begin if end_LR(p) then 31028 if info(LR_ptr)=end_LR_type(p) then pop_LR 31029 else begin if subtype(p)>L_code then incr(LR_problems); 31030 end 31031else begin push_LR(p); 31032 if LR_dir(p)<>cur_dir then 31033 @<Reverse an hlist segment and |goto reswitch|@>; 31034 end; 31035type(p):=kern_node; 31036end 31037 31038@ @<Check for LR anomalies at the end of |hl...@>= 31039begin while info(LR_ptr)<>before do 31040 begin if info(LR_ptr)>L_code then LR_problems:=LR_problems+10000; 31041 pop_LR; 31042 end; 31043pop_LR; 31044end 31045 31046@ @d edge_node=style_node {a |style_node| does not occur in hlists} 31047@d edge_node_size=style_node_size {number of words in an edge node} 31048@d edge_dist(#)==depth(#) {new |left_edge| position relative to |cur_h| 31049 (after |width| has been taken into account)} 31050 31051@<Declare procedures needed in |hlist_out|, |vlist_out|@>= 31052function new_edge(@!s:small_number;@!w:scaled):pointer; 31053 {create an edge node} 31054var p:pointer; {the new node} 31055begin p:=get_node(edge_node_size); type(p):=edge_node; subtype(p):=s; 31056width(p):=w; edge_dist(p):=0; {the |edge_dist| field will be set later} 31057new_edge:=p; 31058end; 31059 31060@ @<Cases of |hlist_out| that arise...@>= 31061edge_node: begin cur_h:=cur_h+width(p); 31062 left_edge:=cur_h+edge_dist(p); cur_dir:=subtype(p); 31063 end; 31064 31065@ We detach the hlist, start a new one consisting of just one kern node, 31066append the reversed list, and set the width of the kern node. 31067 31068@<Reverse the complete hlist...@>= 31069begin save_h:=cur_h; temp_ptr:=p; p:=new_kern(0); link(prev_p):=p; 31070cur_h:=0; link(p):=reverse(this_box,null,cur_g,cur_glue); width(p):=-cur_h; 31071cur_h:=save_h; set_box_lr(this_box)(reversed); 31072end 31073 31074@ We detach the remainder of the hlist, replace the math node by 31075an edge node, and append the reversed hlist segment to it; the tail of 31076the reversed segment is another edge node and the remainder of the 31077original list is attached to it. 31078 31079@<Reverse an hlist segment...@>= 31080begin save_h:=cur_h; temp_ptr:=link(p); rule_wd:=width(p); 31081free_node(p,small_node_size); 31082cur_dir:=reflected; p:=new_edge(cur_dir,rule_wd); link(prev_p):=p; 31083cur_h:=cur_h-left_edge+rule_wd; 31084link(p):=reverse(this_box,new_edge(reflected,0),cur_g,cur_glue); 31085edge_dist(p):=cur_h; cur_dir:=reflected; cur_h:=save_h; 31086goto reswitch; 31087end 31088 31089@ The |reverse| function defined here is responsible to reverse the 31090nodes of an hlist (segment). The first parameter |this_box| is the enclosing 31091hlist node, the second parameter |t| is to become the tail of the reversed 31092list, and the global variable |temp_ptr| is the head of the list to be 31093reversed. Finally |cur_g| and |cur_glue| are the current glue rounding state 31094variables, to be updated by this function. We remove nodes from the original 31095list and add them to the head of the new one. 31096 31097@<Declare procedures needed in |hlist_out|, |vlist_out|@>= 31098function reverse(@!this_box,@!t:pointer; var cur_g:scaled; 31099 var cur_glue:real):pointer; 31100label reswitch,next_p,done; 31101var l:pointer; {the new list} 31102@!p:pointer; {the current node} 31103@!q:pointer; {the next node} 31104@!g_order: glue_ord; {applicable order of infinity for glue} 31105@!g_sign: normal..shrinking; {selects type of glue} 31106@!glue_temp:real; {glue value before rounding} 31107@!m,@!n:halfword; {count of unmatched math nodes} 31108begin g_order:=glue_order(this_box); g_sign:=glue_sign(this_box); 31109l:=t; p:=temp_ptr; m:=min_halfword; n:=min_halfword; 31110loop@+ begin while p<>null do 31111 @<Move node |p| to the new list and go to the next node; 31112 or |goto done| if the end of the reflected segment has been reached@>; 31113 if (t=null)and(m=min_halfword)and(n=min_halfword) then goto done; 31114 p:=new_math(0,info(LR_ptr)); LR_problems:=LR_problems+10000; 31115 {manufacture one missing math node} 31116 end; 31117done:reverse:=l; 31118end; 31119 31120@ @<Move node |p| to the new list...@>= 31121reswitch: if is_char_node(p) then 31122 repeat f:=font(p); c:=character(p); 31123 cur_h:=cur_h+char_width(f)(char_info(f)(c)); 31124 q:=link(p); link(p):=l; l:=p; p:=q; 31125 until not is_char_node(p) 31126else @<Move the non-|char_node| |p| to the new list@> 31127 31128@ @<Move the non-|char_node| |p| to the new list@>= 31129begin q:=link(p); 31130case type(p) of 31131hlist_node,vlist_node,rule_node,kern_node: rule_wd:=width(p); 31132@t\4@>@<Cases of |reverse| that need special treatment@>@; 31133edge_node: confusion("LR2"); 31134@:this can't happen LR2}{\quad LR2@> 31135othercases goto next_p 31136endcases;@/ 31137cur_h:=cur_h+rule_wd; 31138next_p: link(p):=l; 31139if type(p)=kern_node then if (rule_wd=0)or(l=null) then 31140 begin free_node(p,small_node_size); p:=l; 31141 end; 31142l:=p; p:=q; 31143end 31144 31145@ Need to measure |native_word| and picture nodes when reversing! 31146@<Cases of |reverse|...@>= 31147whatsit_node: 31148 if (subtype(p)=native_word_node) 31149 or (subtype(p)=glyph_node) 31150 or (subtype(p)=pic_node) 31151 or (subtype(p)=pdf_node) 31152 then 31153 rule_wd:=width(p) 31154 else 31155 goto next_p; 31156 31157@ Here we compute the effective width of a glue node as in |hlist_out|. 31158 31159@<Cases of |reverse|...@>= 31160glue_node: begin round_glue; 31161 @<Handle a glue node for mixed...@>; 31162 end; 31163 31164@ A ligature node is replaced by a char node. 31165 31166@<Cases of |reverse|...@>= 31167ligature_node: begin flush_node_list(lig_ptr(p)); 31168 temp_ptr:=p; p:=get_avail; mem[p]:=mem[lig_char(temp_ptr)]; link(p):=q; 31169 free_node(temp_ptr,small_node_size); goto reswitch; 31170 end; 31171 31172@ Math nodes in an inner reflected segment are modified, those at the 31173outer level are changed into kern nodes. 31174 31175@<Cases of |reverse|...@>= 31176math_node: begin rule_wd:=width(p); 31177if end_LR(p) then 31178 if info(LR_ptr)<>end_LR_type(p) then 31179 begin type(p):=kern_node; incr(LR_problems); 31180 end 31181 else begin pop_LR; 31182 if n>min_halfword then 31183 begin decr(n); decr(subtype(p)); {change |after| into |before|} 31184 end 31185 else begin type(p):=kern_node; 31186 if m>min_halfword then decr(m) 31187 else @<Finish the reversed hlist segment and |goto done|@>; 31188 end; 31189 end 31190else begin push_LR(p); 31191 if (n>min_halfword)or(LR_dir(p)<>cur_dir) then 31192 begin incr(n); incr(subtype(p)); {change |before| into |after|} 31193 end 31194 else begin type(p):=kern_node; incr(m); 31195 end; 31196 end; 31197end; 31198 31199@ Finally we have found the end of the hlist segment to be reversed; the 31200final math node is released and the remaining list attached to the 31201edge node terminating the reversed segment. 31202 31203@<Finish the reversed...@>= 31204begin free_node(p,small_node_size); 31205link(t):=q; width(t):=rule_wd; edge_dist(t):=-cur_h-rule_wd; goto done; 31206end 31207 31208@ @<Check for LR anomalies at the end of |s...@>= 31209begin if LR_problems>0 then 31210 begin @<Report LR problems@>; print_char(")"); print_ln; 31211 end; 31212if (LR_ptr<>null)or(cur_dir<>left_to_right) then confusion("LR3"); 31213@:this can't happen LR3}{\quad LR3@> 31214end 31215 31216@ Some special actions are required for displayed equation in paragraphs 31217with mixed direction texts. First of all we have to set the text 31218direction preceding the display. 31219 31220@<Set the value of |x| to the text direction before the display@>= 31221if LR_save=null then x:=0 31222else if info(LR_save)>=R_code then x:=-1@+else x:=1 31223 31224@ @<Prepare for display after an empty...@>= 31225begin pop_nest; @<Set the value of |x|...@>; 31226end 31227 31228@ When calculating the natural width, |w|, of the final line preceding 31229the display, we may have to copy all or part of its hlist. We copy, 31230however, only those parts of the original list that are relevant for the 31231computation of |pre_display_size|. 31232@^data structure assumptions@> 31233 31234@<Declare subprocedures for |init_math|@>= 31235procedure just_copy(@!p,@!h,@!t:pointer); 31236label found,not_found; 31237var @!r:pointer; {current node being fabricated for new list} 31238@!words:0..5; {number of words remaining to be copied} 31239begin while p<>null do 31240 begin words:=1; {this setting occurs in more branches than any other} 31241 if is_char_node(p) then r:=get_avail 31242 else case type(p) of 31243 hlist_node,vlist_node: begin r:=get_node(box_node_size); 31244 mem[r+6]:=mem[p+6]; mem[r+5]:=mem[p+5]; {copy the last two words} 31245 words:=5; list_ptr(r):=null; {this affects |mem[r+5]|} 31246 end; 31247 rule_node: begin r:=get_node(rule_node_size); words:=rule_node_size; 31248 end; 31249 ligature_node: begin r:=get_avail; {only |font| and |character| are needed} 31250 mem[r]:=mem[lig_char(p)]; goto found; 31251 end; 31252 kern_node,math_node: begin r:=get_node(small_node_size); 31253 words:=small_node_size; 31254 end; 31255 glue_node: begin r:=get_node(small_node_size); add_glue_ref(glue_ptr(p)); 31256 glue_ptr(r):=glue_ptr(p); leader_ptr(r):=null; 31257 end; 31258 whatsit_node:@<Make a partial copy of the whatsit...@>; 31259 othercases goto not_found 31260 endcases; 31261 while words>0 do 31262 begin decr(words); mem[r+words]:=mem[p+words]; 31263 end; 31264 found: link(h):=r; h:=r; 31265 not_found: p:=link(p); 31266 end; 31267link(h):=t; 31268end; 31269 31270@ When the final line ends with R-text, the value |w| refers to the line 31271reflected with respect to the left edge of the enclosing vertical list. 31272 31273@<Prepare for display after a non-empty...@>= 31274if eTeX_ex then @<Let |j| be the prototype box for the display@>; 31275v:=shift_amount(just_box); 31276@<Set the value of |x|...@>; 31277if x>=0 then 31278 begin p:=list_ptr(just_box); link(temp_head):=null; 31279 end 31280else begin v:=-v-width(just_box); 31281 p:=new_math(0,begin_L_code); link(temp_head):=p; 31282 just_copy(list_ptr(just_box),p,new_math(0,end_L_code)); 31283 cur_dir:=right_to_left; 31284 end; 31285v:=v+2*quad(cur_font); 31286if TeXXeT_en then @<Initialize the LR stack@> 31287 31288@ @<Finish the natural width computation@>= 31289if TeXXeT_en then 31290 begin while LR_ptr<>null do pop_LR; 31291 if LR_problems<>0 then 31292 begin w:=max_dimen; LR_problems:=0; 31293 end; 31294 end; 31295cur_dir:=left_to_right; flush_node_list(link(temp_head)) 31296 31297@ In the presence of text direction directives we assume that any LR 31298problems have been fixed by the |hpack| routine. If the final line 31299contains, however, text direction directives while \TeXXeT\ is disabled, 31300then we set |w:=max_dimen|. 31301 31302@<Cases of `Let |d| be the natural...@>= 31303math_node: begin d:=width(p); 31304 if TeXXeT_en then @<Adjust \(t)the LR stack for the |init_math| routine@> 31305 else if subtype(p)>=L_code then 31306 begin w:=max_dimen; goto done; 31307 end; 31308 end; 31309edge_node: begin d:=width(p); cur_dir:=subtype(p); 31310 end; 31311 31312@ @<Adjust \(t)the LR stack for the |i...@>= 31313if end_LR(p) then 31314 begin if info(LR_ptr)=end_LR_type(p) then pop_LR 31315 else if subtype(p)>L_code then 31316 begin w:=max_dimen; goto done; 31317 end 31318 end 31319else begin push_LR(p); 31320 if LR_dir(p)<>cur_dir then 31321 begin just_reverse(p); p:=temp_head; 31322 end; 31323 end 31324 31325@ @<Declare subprocedures for |init_math|@>= 31326procedure just_reverse(@!p:pointer); 31327label found,done; 31328var l:pointer; {the new list} 31329@!t:pointer; {tail of reversed segment} 31330@!q:pointer; {the next node} 31331@!m,@!n:halfword; {count of unmatched math nodes} 31332begin m:=min_halfword; n:=min_halfword; 31333if link(temp_head)=null then 31334 begin just_copy(link(p),temp_head,null); q:=link(temp_head); 31335 end 31336else begin q:=link(p); link(p):=null; flush_node_list(link(temp_head)); 31337 end; 31338t:=new_edge(cur_dir,0); l:=t; cur_dir:=reflected; 31339while q<>null do 31340 if is_char_node(q) then 31341 repeat p:=q; q:=link(p); link(p):=l; l:=p; 31342 until not is_char_node(q) 31343 else begin p:=q; q:=link(p); 31344 if type(p)=math_node then 31345 @<Adjust \(t)the LR stack for the |just_reverse| routine@>; 31346 link(p):=l; l:=p; 31347 end; 31348goto done; 31349found:width(t):=width(p); link(t):=q; free_node(p,small_node_size); 31350done:link(temp_head):=l; 31351end; 31352 31353@ @<Adjust \(t)the LR stack for the |j...@>= 31354if end_LR(p) then 31355 if info(LR_ptr)<>end_LR_type(p) then 31356 begin type(p):=kern_node; incr(LR_problems); 31357 end 31358 else begin pop_LR; 31359 if n>min_halfword then 31360 begin decr(n); decr(subtype(p)); {change |after| into |before|} 31361 end 31362 else begin if m>min_halfword then decr(m)@+else goto found; 31363 type(p):=kern_node; 31364 end; 31365 end 31366else begin push_LR(p); 31367 if (n>min_halfword)or(LR_dir(p)<>cur_dir) then 31368 begin incr(n); incr(subtype(p)); {change |before| into |after|} 31369 end 31370 else begin type(p):=kern_node; incr(m); 31371 end; 31372 end 31373 31374@ The prototype box is an hlist node with the width, glue set, and shift 31375amount of |just_box|, i.e., the last line preceding the display. Its 31376hlist reflects the current \.{\\leftskip} and \.{\\rightskip}. 31377 31378@<Let |j| be the prototype box for the display@>= 31379begin if right_skip=zero_glue then j:=new_kern(0) 31380else j:=new_param_glue(right_skip_code); 31381if left_skip=zero_glue then p:=new_kern(0) 31382else p:=new_param_glue(left_skip_code); 31383link(p):=j; j:=new_null_box; width(j):=width(just_box); 31384shift_amount(j):=shift_amount(just_box); list_ptr(j):=p; 31385glue_order(j):=glue_order(just_box); glue_sign(j):=glue_sign(just_box); 31386glue_set(j):=glue_set(just_box); 31387end 31388 31389@ At the end of a displayed equation we retrieve the prototype box. 31390 31391@<Local variables for finishing...@>= 31392@!j:pointer; {prototype box} 31393 31394@ @<Retrieve the prototype box@>= 31395if mode=mmode then j:=LR_box 31396 31397@ @<Flush the prototype box@>= 31398flush_node_list(j) 31399 31400@ The |app_display| procedure used to append the displayed equation 31401and\slash or equation number to the current vertical list has three 31402parameters: the prototype box, the hbox to be appended, and the 31403displacement of the hbox in the display line. 31404 31405@<Declare subprocedures for |after_math|@>= 31406procedure app_display(@!j,@!b:pointer;@!d:scaled); 31407var z:scaled; {width of the line} 31408@!s:scaled; {move the line right this much} 31409@!e:scaled; {distance from right edge of box to end of line} 31410@!x:integer; {|pre_display_direction|} 31411@!p,@!q,@!r,@!t,@!u:pointer; {for list manipulation} 31412begin s:=display_indent; x:=pre_display_direction; 31413if x=0 then shift_amount(b):=s+d 31414else begin z:=display_width; p:=b; 31415 @<Set up the hlist for the display line@>; 31416 @<Package the display line@>; 31417 end; 31418append_to_vlist(b); 31419end; 31420 31421@ Here we construct the hlist for the display, starting with node |p| 31422and ending with node |q|. We also set |d| and |e| to the amount of 31423kerning to be added before and after the hlist (adjusted for the 31424prototype box). 31425 31426@<Set up the hlist for the display line@>= 31427if x>0 then e:=z-d-width(p) 31428else begin e:=d; d:=z-e-width(p); 31429 end; 31430if j<>null then 31431 begin b:=copy_node_list(j); height(b):=height(p); depth(b):=depth(p); 31432 s:=s-shift_amount(b); d:=d+s; e:=e+width(b)-z-s; 31433 end; 31434if box_lr(p)=dlist then q:=p {display or equation number} 31435else begin {display and equation number} 31436 r:=list_ptr(p); free_node(p,box_node_size); 31437 if r=null then confusion("LR4"); 31438 if x>0 then 31439 begin p:=r; 31440 repeat q:=r; r:=link(r); {find tail of list} 31441 until r=null; 31442 end 31443 else begin p:=null; q:=r; 31444 repeat t:=link(r); link(r):=p; p:=r; r:=t; {reverse list} 31445 until r=null; 31446 end; 31447 end 31448 31449@ In the presence of a prototype box we use its shift amount and width 31450to adjust the values of kerning and add these values to the glue nodes 31451inserted to cancel the \.{\\leftskip} and \.{\\rightskip}. If there is 31452no prototype box (because the display is preceded by an empty 31453paragraph), or if the skip parameters are zero, we just add kerns. 31454 31455The |cancel_glue| macro creates and links a glue node that is, together 31456with another glue node, equivalent to a given amount of kerning. We can 31457use |j| as temporary pointer, since all we need is |j<>null|. 31458 31459@d cancel_glue(#)==j:=new_skip_param(#); cancel_glue_cont 31460@d cancel_glue_cont(#)==link(#):=j; cancel_glue_cont_cont 31461@d cancel_glue_cont_cont(#)==link(j):=#; cancel_glue_end 31462@d cancel_glue_end(#)==j:=glue_ptr(#); cancel_glue_end_end 31463@d cancel_glue_end_end(#)== 31464stretch_order(temp_ptr):=stretch_order(j); 31465shrink_order(temp_ptr):=shrink_order(j); width(temp_ptr):=#-width(j); 31466stretch(temp_ptr):=-stretch(j); shrink(temp_ptr):=-shrink(j) 31467 31468@<Package the display line@>= 31469if j=null then 31470 begin r:=new_kern(0); t:=new_kern(0); {the widths will be set later} 31471 end 31472else begin r:=list_ptr(b); t:=link(r); 31473 end; 31474u:=new_math(0,end_M_code); 31475if type(t)=glue_node then {|t| is \.{\\rightskip} glue} 31476 begin cancel_glue(right_skip_code)(q)(u)(t)(e); link(u):=t; 31477 end 31478else begin width(t):=e; link(t):=u; link(q):=t; 31479 end; 31480u:=new_math(0,begin_M_code); 31481if type(r)=glue_node then {|r| is \.{\\leftskip} glue} 31482 begin cancel_glue(left_skip_code)(u)(p)(r)(d); link(r):=u; 31483 end 31484else begin width(r):=d; link(r):=p; link(u):=r; 31485 if j=null then 31486 begin b:=hpack(u,natural); shift_amount(b):=s; 31487 end 31488 else list_ptr(b):=u; 31489 end 31490 31491@ The |scan_tokens| feature of \eTeX\ defines the \.{\\scantokens} 31492primitive. 31493 31494@<Generate all \eTeX...@>= 31495primitive("scantokens",input,2); 31496@!@:scan_tokens_}{\.{\\scantokens} primitive@> 31497 31498@ @<Cases of |input| for |print_cmd_chr|@>= 31499else if chr_code=2 then print_esc("scantokens") 31500 31501@ @<Cases for |input|@>= 31502else if cur_chr=2 then pseudo_start 31503 31504@ The global variable |pseudo_files| is used to maintain a stack of 31505pseudo files. The |info| field of each pseudo file points to a linked 31506list of variable size nodes representing lines not yet processed: the 31507|info| field of the first word contains the size of this node, all the 31508following words contain ASCII codes. 31509 31510@<Glob...@>= 31511@!pseudo_files:pointer; {stack of pseudo files} 31512 31513@ @<Set init...@>= 31514pseudo_files:=null; 31515 31516@ The |pseudo_start| procedure initiates reading from a pseudo file. 31517 31518@<Declare \eTeX\ procedures for ex...@>= 31519procedure@?pseudo_start; forward;@t\2@> 31520 31521@ @<Declare \eTeX\ procedures for tok...@>= 31522procedure pseudo_start; 31523var old_setting:0..max_selector; {holds |selector| setting} 31524@!s:str_number; {string to be converted into a pseudo file} 31525@!l,@!m:pool_pointer; {indices into |str_pool|} 31526@!p,@!q,@!r:pointer; {for list construction} 31527@!w: four_quarters; {four ASCII codes} 31528@!nl,@!sz:integer; 31529begin scan_general_text; 31530old_setting:=selector; selector:=new_string; 31531token_show(temp_head); selector:=old_setting; 31532flush_list(link(temp_head)); 31533str_room(1); s:=make_string; 31534@<Convert string |s| into a new pseudo file@>; 31535flush_string; 31536@<Initiate input from new pseudo file@>; 31537end; 31538 31539@ @<Convert string |s| into a new pseudo file@>= 31540str_pool[pool_ptr]:=si(" "); l:=str_start_macro(s); 31541nl:=si(new_line_char); 31542p:=get_avail; q:=p; 31543while l<pool_ptr do 31544 begin m:=l; 31545 while (l<pool_ptr)and(str_pool[l]<>nl) do incr(l); 31546 sz:=(l-m+7)div 4; 31547 if sz=1 then sz:=2; 31548 r:=get_node(sz); link(q):=r; q:=r; info(q):=hi(sz); 31549 while sz>2 do 31550 begin decr(sz); incr(r); 31551 w.b0:=qi(so(str_pool[m])); w.b1:=qi(so(str_pool[m+1])); 31552 w.b2:=qi(so(str_pool[m+2])); w.b3:=qi(so(str_pool[m+3])); 31553 mem[r].qqqq:=w; m:=m+4; 31554 end; 31555 w.b0:=qi(" "); w.b1:=qi(" "); w.b2:=qi(" "); w.b3:=qi(" "); 31556 if l>m then 31557 begin w.b0:=qi(so(str_pool[m])); 31558 if l>m+1 then 31559 begin w.b1:=qi(so(str_pool[m+1])); 31560 if l>m+2 then 31561 begin w.b2:=qi(so(str_pool[m+2])); 31562 if l>m+3 then w.b3:=qi(so(str_pool[m+3])); 31563 end; 31564 end; 31565 end; 31566 mem[r+1].qqqq:=w; 31567 if str_pool[l]=nl then incr(l); 31568 end; 31569info(p):=link(p); link(p):=pseudo_files; pseudo_files:=p 31570 31571@ @<Initiate input from new pseudo file@>= 31572begin_file_reading; {set up |cur_file| and new level of input} 31573line:=0; limit:=start; loc:=limit+1; {force line read} 31574if tracing_scan_tokens>0 then 31575 begin if term_offset>max_print_line-3 then print_ln 31576 else if (term_offset>0)or(file_offset>0) then print_char(" "); 31577 name:=19; print("( "); incr(open_parens); update_terminal; 31578 end 31579else name:=18 31580 31581@ Here we read a line from the current pseudo file into |buffer|. 31582 31583@<Declare \eTeX\ procedures for tr...@>= 31584function pseudo_input: boolean; {inputs the next line or returns |false|} 31585var p:pointer; {current line from pseudo file} 31586@!sz:integer; {size of node |p|} 31587@!w:four_quarters; {four ASCII codes} 31588@!r:pointer; {loop index} 31589begin last:=first; {cf.\ Matthew 19\thinspace:\thinspace30} 31590p:=info(pseudo_files); 31591if p=null then pseudo_input:=false 31592else begin info(pseudo_files):=link(p); sz:=ho(info(p)); 31593 if 4*sz-3>=buf_size-last then 31594 @<Report overflow of the input buffer, and abort@>; 31595 last:=first; 31596 for r:=p+1 to p+sz-1 do 31597 begin w:=mem[r].qqqq; 31598 buffer[last]:=w.b0; buffer[last+1]:=w.b1; 31599 buffer[last+2]:=w.b2; buffer[last+3]:=w.b3; 31600 last:=last+4; 31601 end; 31602 if last>=max_buf_stack then max_buf_stack:=last+1; 31603 while (last>first)and(buffer[last-1]=" ") do decr(last); 31604 free_node(p,sz); 31605 pseudo_input:=true; 31606 end; 31607end; 31608 31609@ When we are done with a pseudo file we `close' it. 31610 31611@<Declare \eTeX\ procedures for tr...@>= 31612procedure pseudo_close; {close the top level pseudo file} 31613var p,@!q: pointer; 31614begin p:=link(pseudo_files); q:=info(pseudo_files); 31615free_avail(pseudo_files); pseudo_files:=p; 31616while q<>null do 31617 begin p:=q; q:=link(p); free_node(p,ho(info(p))); 31618 end; 31619end; 31620 31621@ @<Dump the \eTeX\ state@>= 31622while pseudo_files<>null do pseudo_close; {flush pseudo files} 31623 31624@ @<Generate all \eTeX...@>= 31625primitive("readline",read_to_cs,1);@/ 31626@!@:read_line_}{\.{\\readline} primitive@> 31627 31628@ @<Cases of |read| for |print_cmd_chr|@>= 31629else print_esc("readline") 31630 31631@ @<Handle \.{\\readline} and |goto done|@>= 31632if j=1 then 31633 begin while loc<=limit do {current line not yet finished} 31634 begin cur_chr:=buffer[loc]; incr(loc); 31635 if cur_chr=" " then cur_tok:=space_token 31636 @+else cur_tok:=cur_chr+other_token; 31637 store_new_token(cur_tok); 31638 end; 31639 goto done; 31640 end 31641 31642@ Here we define the additional conditionals of \eTeX\ as well as the 31643\.{\\unless} prefix. 31644 31645@d if_def_code=17 { `\.{\\ifdefined}' } 31646@d if_cs_code=18 { `\.{\\ifcsname}' } 31647@d if_font_char_code=19 { `\.{\\iffontchar}' } 31648@d if_in_csname_code=20 { `\.{\\ifincsname}' } 31649 31650@<Generate all \eTeX...@>= 31651primitive("unless",expand_after,1);@/ 31652@!@:unless_}{\.{\\unless} primitive@> 31653primitive("ifdefined",if_test,if_def_code); 31654@!@:if_defined_}{\.{\\ifdefined} primitive@> 31655primitive("ifcsname",if_test,if_cs_code); 31656@!@:if_cs_name_}{\.{\\ifcsname} primitive@> 31657primitive("iffontchar",if_test,if_font_char_code); 31658@!@:if_font_char_}{\.{\\iffontchar} primitive@> 31659primitive("ifincsname",if_test,if_in_csname_code); 31660@!@:if_in_csname_}{\.{\\ifincsname} primitive@> 31661 31662@ @<Cases of |expandafter| for |print_cmd_chr|@>= 31663else print_esc("unless") 31664 31665@ @<Cases of |if_test| for |print_cmd_chr|@>= 31666if_def_code:print_esc("ifdefined"); 31667if_cs_code:print_esc("ifcsname"); 31668if_font_char_code:print_esc("iffontchar"); 31669if_in_csname_code:print_esc("ifincsname"); 31670 31671@ The result of a boolean condition is reversed when the conditional is 31672preceded by \.{\\unless}. 31673 31674@<Negate a boolean conditional and |goto reswitch|@>= 31675begin get_token; 31676if (cur_cmd=if_test)and(cur_chr<>if_case_code) then 31677 begin cur_chr:=cur_chr+unless_code; goto reswitch; 31678 end; 31679print_err("You can't use `"); print_esc("unless"); print("' before `"); 31680@.You can't use \\unless...@> 31681print_cmd_chr(cur_cmd,cur_chr); print_char("'"); 31682help1("Continue, and I'll forget that it ever happened."); 31683back_error; 31684end 31685 31686@ The conditional \.{\\ifdefined} tests if a control sequence is 31687defined. 31688 31689We need to reset |scanner_status|, since \.{\\outer} control sequences 31690are allowed, but we might be scanning a macro definition or preamble. 31691 31692@<Cases for |conditional|@>= 31693if_def_code:begin save_scanner_status:=scanner_status; 31694 scanner_status:=normal; 31695 get_next; b:=(cur_cmd<>undefined_cs); 31696 scanner_status:=save_scanner_status; 31697 end; 31698 31699@ The conditional \.{\\ifcsname} is equivalent to \.{\{\\expandafter} 31700\.{\}\\expandafter} \.{\\ifdefined} \.{\\csname}, except that no new 31701control sequence will be entered into the hash table (once all tokens 31702preceding the mandatory \.{\\endcsname} have been expanded). 31703 31704@<Cases for |conditional|@>= 31705if_cs_code:begin n:=get_avail; p:=n; {head of the list of characters} 31706 e:=is_in_csname; is_in_csname:=true; 31707 repeat get_x_token; 31708 if cur_cs=0 then store_new_token(cur_tok); 31709 until cur_cs<>0; 31710 if cur_cmd<>end_cs_name then @<Complain about missing \.{\\endcsname}@>; 31711 @<Look up the characters of list |n| in the hash table, and set |cur_cs|@>; 31712 flush_list(n); 31713 b:=(eq_type(cur_cs)<>undefined_cs); 31714 is_in_csname:=e; 31715 end; 31716 31717@ @<Look up the characters of list |n| in the hash table...@>= 31718m:=first; p:=link(n); 31719while p<>null do 31720 begin if m>=max_buf_stack then 31721 begin max_buf_stack:=m+1; 31722 if max_buf_stack=buf_size then 31723 overflow("buffer size",buf_size); 31724@:TeX capacity exceeded buffer size}{\quad buffer size@> 31725 end; 31726 buffer[m]:=info(p) mod max_char_val; incr(m); p:=link(p); 31727 end; 31728if m>first+1 then 31729 cur_cs:=id_lookup(first,m-first) {|no_new_control_sequence| is |true|} 31730else if m=first then cur_cs:=null_cs {the list is empty} 31731else cur_cs:=single_base+buffer[first] {the list has length one} 31732 31733@ The conditional \.{\\iffontchar} tests the existence of a character in 31734a font. 31735 31736@<Cases for |conditional|@>= 31737if_in_csname_code: b:=is_in_csname; 31738if_font_char_code:begin scan_font_ident; n:=cur_val; scan_usv_num; 31739 if is_native_font(n) then 31740 b:=(map_char_to_glyph(n, cur_val) > 0) 31741 else begin 31742 if (font_bc[n]<=cur_val)and(font_ec[n]>=cur_val) then 31743 b:=char_exists(char_info(n)(qi(cur_val))) 31744 else b:=false; 31745 end; 31746 end; 31747 31748@ The |protected| feature of \eTeX\ defines the \.{\\protected} prefix 31749command for macro definitions. Such macros are protected against 31750expansions when lists of expanded tokens are built, e.g., for \.{\\edef} 31751or during \.{\\write}. 31752 31753@<Generate all \eTeX...@>= 31754primitive("protected",prefix,8); 31755@!@:protected_}{\.{\\protected} primitive@> 31756 31757@ @<Cases of |prefix| for |print_cmd_chr|@>= 31758else if chr_code=8 then print_esc("protected") 31759 31760@ The |get_x_or_protected| procedure is like |get_x_token| except that 31761protected macros are not expanded. 31762 31763@<Declare \eTeX\ procedures for sc...@>= 31764procedure get_x_or_protected; {sets |cur_cmd|, |cur_chr|, |cur_tok|, 31765 and expands non-protected macros} 31766label exit; 31767begin loop@+begin get_token; 31768 if cur_cmd<=max_command then return; 31769 if (cur_cmd>=call)and(cur_cmd<end_template) then 31770 if info(link(cur_chr))=protected_token then return; 31771 expand; 31772 end; 31773exit:end; 31774 31775@ A group entered (or a conditional started) in one file may end in a 31776different file. Such slight anomalies, although perfectly legitimate, 31777may cause errors that are difficult to locate. In order to be able to 31778give a warning message when such anomalies occur, \eTeX\ uses the 31779|grp_stack| and |if_stack| arrays to record the initial |cur_boundary| 31780and |cond_ptr| values for each input file. 31781 31782@<Glob...@>= 31783@!grp_stack : array[0..max_in_open] of save_pointer; {initial |cur_boundary|} 31784@!if_stack : array[0..max_in_open] of pointer; {initial |cond_ptr|} 31785 31786@ When a group ends that was apparently entered in a different input 31787file, the |group_warning| procedure is invoked in order to update the 31788|grp_stack|. If moreover \.{\\tracingnesting} is positive we want to 31789give a warning message. The situation is, however, somewhat complicated 31790by two facts: (1)~There may be |grp_stack| elements without a 31791corresponding \.{\\input} file or \.{\\scantokens} pseudo file (e.g., 31792error insertions from the terminal); and (2)~the relevant information is 31793recorded in the |name_field| of the |input_stack| only loosely 31794synchronized with the |in_open| variable indexing |grp_stack|. 31795 31796@<Declare \eTeX\ procedures for tr...@>= 31797procedure group_warning; 31798var i:0..max_in_open; {index into |grp_stack|} 31799@!w:boolean; {do we need a warning?} 31800begin base_ptr:=input_ptr; input_stack[base_ptr]:=cur_input; 31801 {store current state} 31802i:=in_open; w:=false; 31803while (grp_stack[i]=cur_boundary)and(i>0) do 31804 begin @<Set variable |w| to indicate if this case should be reported@>; 31805 grp_stack[i]:=save_index(save_ptr); decr(i); 31806 end; 31807if w then 31808 begin print_nl("Warning: end of "); print_group(true); 31809@.Warning: end of...@> 31810 print(" of a different file"); print_ln; 31811 if tracing_nesting>1 then show_context; 31812 if history=spotless then history:=warning_issued; 31813 end; 31814end; 31815 31816@ This code scans the input stack in order to determine the type of the 31817current input file. 31818 31819@<Set variable |w| to...@>= 31820if tracing_nesting>0 then 31821 begin while (input_stack[base_ptr].state_field=token_list)or@| 31822 (input_stack[base_ptr].index_field>i) do decr(base_ptr); 31823 if input_stack[base_ptr].name_field>17 then w:=true; 31824 end 31825 31826@ When a conditional ends that was apparently started in a different 31827input file, the |if_warning| procedure is invoked in order to update the 31828|if_stack|. If moreover \.{\\tracingnesting} is positive we want to 31829give a warning message (with the same complications as above). 31830 31831@<Declare \eTeX\ procedures for tr...@>= 31832procedure if_warning; 31833var i:0..max_in_open; {index into |if_stack|} 31834@!w:boolean; {do we need a warning?} 31835begin base_ptr:=input_ptr; input_stack[base_ptr]:=cur_input; 31836 {store current state} 31837i:=in_open; w:=false; 31838while if_stack[i]=cond_ptr do 31839 begin @<Set variable |w| to...@>; 31840 if_stack[i]:=link(cond_ptr); decr(i); 31841 end; 31842if w then 31843 begin print_nl("Warning: end of "); print_cmd_chr(if_test,cur_if); 31844@.Warning: end of...@> 31845 print_if_line(if_line); print(" of a different file"); print_ln; 31846 if tracing_nesting>1 then show_context; 31847 if history=spotless then history:=warning_issued; 31848 end; 31849end; 31850 31851@ Conversely, the |file_warning| procedure is invoked when a file ends 31852and some groups entered or conditionals started while reading from that 31853file are still incomplete. 31854 31855@<Declare \eTeX\ procedures for tr...@>= 31856procedure file_warning; 31857var p:pointer; {saved value of |save_ptr| or |cond_ptr|} 31858@!l:quarterword; {saved value of |cur_level| or |if_limit|} 31859@!c:quarterword; {saved value of |cur_group| or |cur_if|} 31860@!i:integer; {saved value of |if_line|} 31861begin p:=save_ptr; l:=cur_level; c:=cur_group; save_ptr:=cur_boundary; 31862while grp_stack[in_open]<>save_ptr do 31863 begin decr(cur_level); 31864 print_nl("Warning: end of file when "); 31865@.Warning: end of file when...@> 31866 print_group(true); print(" is incomplete");@/ 31867 cur_group:=save_level(save_ptr); save_ptr:=save_index(save_ptr) 31868 end; 31869save_ptr:=p; cur_level:=l; cur_group:=c; {restore old values} 31870p:=cond_ptr; l:=if_limit; c:=cur_if; i:=if_line; 31871while if_stack[in_open]<>cond_ptr do 31872 begin print_nl("Warning: end of file when "); 31873@.Warning: end of file when...@> 31874 print_cmd_chr(if_test,cur_if); 31875 if if_limit=fi_code then print_esc("else"); 31876 print_if_line(if_line); print(" is incomplete");@/ 31877 if_line:=if_line_field(cond_ptr); cur_if:=subtype(cond_ptr); 31878 if_limit:=type(cond_ptr); cond_ptr:=link(cond_ptr); 31879 end; 31880cond_ptr:=p; if_limit:=l; cur_if:=c; if_line:=i; {restore old values} 31881print_ln; 31882if tracing_nesting>1 then show_context; 31883if history=spotless then history:=warning_issued; 31884end; 31885 31886@ Here are the additional \eTeX\ primitives for expressions. 31887 31888@<Generate all \eTeX...@>= 31889primitive("numexpr",last_item,eTeX_expr-int_val+int_val); 31890@!@:num_expr_}{\.{\\numexpr} primitive@> 31891primitive("dimexpr",last_item,eTeX_expr-int_val+dimen_val); 31892@!@:dim_expr_}{\.{\\dimexpr} primitive@> 31893primitive("glueexpr",last_item,eTeX_expr-int_val+glue_val); 31894@!@:glue_expr_}{\.{\\glueexpr} primitive@> 31895primitive("muexpr",last_item,eTeX_expr-int_val+mu_val); 31896@!@:mu_expr_}{\.{\\muexpr} primitive@> 31897 31898@ @<Cases of |last_item| for |print_cmd_chr|@>= 31899eTeX_expr-int_val+int_val: print_esc("numexpr"); 31900eTeX_expr-int_val+dimen_val: print_esc("dimexpr"); 31901eTeX_expr-int_val+glue_val: print_esc("glueexpr"); 31902eTeX_expr-int_val+mu_val: print_esc("muexpr"); 31903 31904@ This code for reducing |cur_val_level| and\slash or negating the 31905result is similar to the one for all the other cases of 31906|scan_something_internal|, with the difference that |scan_expr| has 31907already increased the reference count of a glue specification. 31908 31909@<Process an expression and |return|@>= 31910begin if m<eTeX_mu then 31911 begin case m of 31912 @/@<Cases for fetching a glue value@>@/ 31913 end; {there are no other cases} 31914 cur_val_level:=glue_val; 31915 end 31916else if m<eTeX_expr then 31917 begin case m of 31918 @/@<Cases for fetching a mu value@>@/ 31919 end; {there are no other cases} 31920 cur_val_level:=mu_val; 31921 end 31922else begin cur_val_level:=m-eTeX_expr+int_val; scan_expr; 31923 end; 31924while cur_val_level>level do 31925 begin if cur_val_level=glue_val then 31926 begin m:=cur_val; cur_val:=width(m); delete_glue_ref(m); 31927 end 31928 else if cur_val_level=mu_val then mu_error; 31929 decr(cur_val_level); 31930 end; 31931if negative then 31932 if cur_val_level>=glue_val then 31933 begin m:=cur_val; cur_val:=new_spec(m); delete_glue_ref(m); 31934 @<Negate all three glue components of |cur_val|@>; 31935 end 31936 else negate(cur_val); 31937return; 31938end 31939 31940@ @<Declare \eTeX\ procedures for sc...@>= 31941procedure@?scan_expr; forward;@t\2@> 31942 31943@ The |scan_expr| procedure scans and evaluates an expression. 31944 31945@<Declare procedures needed for expressions@>= 31946@t\4@>@<Declare subprocedures for |scan_expr|@> 31947procedure scan_expr; {scans and evaluates an expression} 31948label restart, continue, found; 31949var a,@!b:boolean; {saved values of |arith_error|} 31950@!l:small_number; {type of expression} 31951@!r:small_number; {state of expression so far} 31952@!s:small_number; {state of term so far} 31953@!o:small_number; {next operation or type of next factor} 31954@!e:integer; {expression so far} 31955@!t:integer; {term so far} 31956@!f:integer; {current factor} 31957@!n:integer; {numerator of combined multiplication and division} 31958@!p:pointer; {top of expression stack} 31959@!q:pointer; {for stack manipulations} 31960begin l:=cur_val_level; a:=arith_error; b:=false; p:=null; 31961@<Scan and evaluate an expression |e| of type |l|@>; 31962if b then 31963 begin print_err("Arithmetic overflow"); 31964@.Arithmetic overflow@> 31965 help2("I can't evaluate this expression,")@/ 31966 ("since the result is out of range."); 31967 error; 31968 if l>=glue_val then 31969 begin delete_glue_ref(e); e:=zero_glue; add_glue_ref(e); 31970 end 31971 else e:=0; 31972 end; 31973arith_error:=a; cur_val:=e; cur_val_level:=l; 31974end; 31975 31976@ Evaluating an expression is a recursive process: When the left 31977parenthesis of a subexpression is scanned we descend to the next level 31978of recursion; the previous level is resumed with the matching right 31979parenthesis. 31980 31981@d expr_none=0 {\.( seen, or \.( $\langle\it expr\rangle$ \.) seen} 31982@d expr_add=1 {\.( $\langle\it expr\rangle$ \.+ seen} 31983@d expr_sub=2 {\.( $\langle\it expr\rangle$ \.- seen} 31984@d expr_mult=3 {$\langle\it term\rangle$ \.* seen} 31985@d expr_div=4 {$\langle\it term\rangle$ \./ seen} 31986@d expr_scale=5 {$\langle\it term\rangle$ \.* 31987 $\langle\it factor\rangle$ \./ seen} 31988 31989@<Scan and eval...@>= 31990restart: r:=expr_none; e:=0; s:=expr_none; t:=0; n:=0; 31991continue: if s=expr_none then o:=l@+else o:=int_val; 31992@<Scan a factor |f| of type |o| or start a subexpression@>; 31993found: @<Scan the next operator and set |o|@>; 31994arith_error:=b; 31995@<Make sure that |f| is in the proper range@>; 31996case s of @<Cases for evaluation of the current term@>@; 31997end; {there are no other cases} 31998if o>expr_sub then s:=o@+else @<Evaluate the current expression@>; 31999b:=arith_error; 32000if o<>expr_none then goto continue; 32001if p<>null then @<Pop the expression stack and |goto found|@> 32002 32003@ @<Scan the next op...@>= 32004@<Get the next non-blank non-call token@>; 32005if cur_tok=other_token+"+" then o:=expr_add 32006else if cur_tok=other_token+"-" then o:=expr_sub 32007else if cur_tok=other_token+"*" then o:=expr_mult 32008else if cur_tok=other_token+"/" then o:=expr_div 32009else begin o:=expr_none; 32010 if p=null then 32011 begin if cur_cmd<>relax then back_input; 32012 end 32013 else if cur_tok<>other_token+")" then 32014 begin print_err("Missing ) inserted for expression"); 32015@.Missing ) inserted@> 32016 help1("I was expecting to see `+', `-', `*', `/', or `)'. Didn't."); 32017 back_error; 32018 end; 32019 end 32020 32021@ @<Scan a factor...@>= 32022@<Get the next non-blank non-call token@>; 32023if cur_tok=other_token+"(" then 32024 @<Push the expression stack and |goto restart|@>; 32025back_input; 32026if o=int_val then scan_int 32027else if o=dimen_val then scan_normal_dimen 32028else if o=glue_val then scan_normal_glue 32029else scan_mu_glue; 32030f:=cur_val 32031 32032@ @<Declare \eTeX\ procedures for sc...@>= 32033procedure@?scan_normal_glue; forward;@t\2@>@/ 32034procedure@?scan_mu_glue; forward;@t\2@> 32035 32036@ Here we declare two trivial procedures in order to avoid mutually 32037recursive procedures with parameters. 32038 32039@<Declare procedures needed for expressions@>= 32040procedure scan_normal_glue; 32041begin scan_glue(glue_val); 32042end; 32043@# 32044procedure scan_mu_glue; 32045begin scan_glue(mu_val); 32046end; 32047 32048@ Parenthesized subexpressions can be inside expressions, and this 32049nesting has a stack. Seven local variables represent the top of the 32050expression stack: |p| points to pushed-down entries, if any; |l| 32051specifies the type of expression currently beeing evaluated; |e| is the 32052expression so far and |r| is the state of its evaluation; |t| is the 32053term so far and |s| is the state of its evaluation; finally |n| is the 32054numerator for a combined multiplication and division, if any. 32055 32056@d expr_node_size=4 {number of words in stack entry for subexpressions} 32057@d expr_e_field(#)==mem[#+1].int {saved expression so far} 32058@d expr_t_field(#)==mem[#+2].int {saved term so far} 32059@d expr_n_field(#)==mem[#+3].int {saved numerator} 32060 32061@<Push the expression...@>= 32062begin q:=get_node(expr_node_size); link(q):=p; type(q):=l; 32063subtype(q):=4*s+r; 32064expr_e_field(q):=e; expr_t_field(q):=t; expr_n_field(q):=n; 32065p:=q; l:=o; goto restart; 32066end 32067 32068@ @<Pop the expression...@>= 32069begin f:=e; q:=p; 32070e:=expr_e_field(q); t:=expr_t_field(q); n:=expr_n_field(q); 32071s:=subtype(q) div 4; r:=subtype(q) mod 4; 32072l:=type(q); p:=link(q); free_node(q,expr_node_size); 32073goto found; 32074end 32075 32076@ We want to make sure that each term and (intermediate) result is in 32077the proper range. Integer values must not exceed |infinity| 32078($2^{31}-1$) in absolute value, dimensions must not exceed |max_dimen| 32079($2^{30}-1$). We avoid the absolute value of an integer, because this 32080might fail for the value $-2^{31}$ using 32-bit arithmetic. 32081 32082@d num_error(#)== {clear a number or dimension and set |arith_error|} 32083 begin arith_error:=true; #:=0; 32084 end 32085@d glue_error(#)== {clear a glue spec and set |arith_error|} 32086 begin arith_error:=true; delete_glue_ref(#); #:=new_spec(zero_glue); 32087 end 32088 32089@<Make sure that |f|...@>= 32090if (l=int_val)or(s>expr_sub) then 32091 begin if (f>infinity)or(f<-infinity) then num_error(f); 32092 end 32093else if l=dimen_val then 32094 begin if abs(f)>max_dimen then num_error(f); 32095 end 32096else begin if (abs(width(f))>max_dimen)or@| 32097 (abs(stretch(f))>max_dimen)or@| 32098 (abs(shrink(f))>max_dimen) then glue_error(f); 32099 end 32100 32101@ Applying the factor |f| to the partial term |t| (with the operator 32102|s|) is delayed until the next operator |o| has been scanned. Here we 32103handle the first factor of a partial term. A glue spec has to be copied 32104unless the next operator is a right parenthesis; this allows us later on 32105to simply modify the glue components. 32106 32107@d normalize_glue(#)== 32108 if stretch(#)=0 then stretch_order(#):=normal; 32109 if shrink(#)=0 then shrink_order(#):=normal 32110 32111@<Cases for evaluation of the current term@>= 32112expr_none: if (l>=glue_val)and(o<>expr_none) then 32113 begin t:=new_spec(f); delete_glue_ref(f); normalize_glue(t); 32114 end 32115 else t:=f; 32116 32117@ When a term |t| has been completed it is copied to, added to, or 32118subtracted from the expression |e|. 32119 32120@d expr_add_sub(#)==add_or_sub(#,r=expr_sub) 32121@d expr_a(#)==expr_add_sub(#,max_dimen) 32122 32123@<Evaluate the current expression@>= 32124begin s:=expr_none; 32125if r=expr_none then e:=t 32126else if l=int_val then e:=expr_add_sub(e,t,infinity) 32127else if l=dimen_val then e:=expr_a(e,t) 32128else @<Compute the sum or difference of two glue specs@>; 32129r:=o; 32130end 32131 32132@ The function |add_or_sub(x,y,max_answer,negative)| computes the sum 32133(for |negative=false|) or difference (for |negative=true|) of |x| and 32134|y|, provided the absolute value of the result does not exceed 32135|max_answer|. 32136 32137@<Declare subprocedures for |scan_expr|@>= 32138function add_or_sub(@!x,@!y,@!max_answer:integer;@!negative:boolean):integer; 32139var a:integer; {the answer} 32140begin if negative then negate(y); 32141if x>=0 then 32142 if y<=max_answer-x then a:=x+y@+else num_error(a) 32143else if y>=-max_answer-x then a:=x+y@+else num_error(a); 32144add_or_sub:=a; 32145end; 32146 32147@ We know that |stretch_order(e)>normal| implies |stretch(e)<>0| and 32148|shrink_order(e)>normal| implies |shrink(e)<>0|. 32149 32150@<Compute the sum or diff...@>= 32151begin width(e):=expr_a(width(e),width(t)); 32152if stretch_order(e)=stretch_order(t) then 32153 stretch(e):=expr_a(stretch(e),stretch(t)) 32154else if (stretch_order(e)<stretch_order(t))and(stretch(t)<>0) then 32155 begin stretch(e):=stretch(t); stretch_order(e):=stretch_order(t); 32156 end; 32157if shrink_order(e)=shrink_order(t) then 32158 shrink(e):=expr_a(shrink(e),shrink(t)) 32159else if (shrink_order(e)<shrink_order(t))and(shrink(t)<>0) then 32160 begin shrink(e):=shrink(t); shrink_order(e):=shrink_order(t); 32161 end; 32162delete_glue_ref(t); normalize_glue(e); 32163end 32164 32165@ If a multiplication is followed by a division, the two operations are 32166combined into a `scaling' operation. Otherwise the term |t| is 32167multiplied by the factor |f|. 32168 32169@d expr_m(#)==#:=nx_plus_y(#,f,0) 32170 32171@<Cases for evaluation of the current term@>= 32172expr_mult: if o=expr_div then 32173 begin n:=f; o:=expr_scale; 32174 end 32175 else if l=int_val then t:=mult_integers(t,f) 32176 else if l=dimen_val then expr_m(t) 32177 else begin expr_m(width(t)); expr_m(stretch(t)); expr_m(shrink(t)); 32178 end; 32179 32180@ Here we divide the term |t| by the factor |f|. 32181 32182@d expr_d(#)==#:=quotient(#,f) 32183 32184@<Cases for evaluation of the current term@>= 32185expr_div: if l<glue_val then expr_d(t) 32186 else begin expr_d(width(t)); expr_d(stretch(t)); expr_d(shrink(t)); 32187 end; 32188 32189@ The function |quotient(n,d)| computes the rounded quotient 32190$q=\lfloor n/d+{1\over2}\rfloor$, when $n$ and $d$ are positive. 32191 32192@<Declare subprocedures for |scan_expr|@>= 32193function quotient(@!n,@!d:integer):integer; 32194var negative:boolean; {should the answer be negated?} 32195@!a:integer; {the answer} 32196begin if d=0 then num_error(a) 32197else begin if d>0 then negative:=false 32198 else begin negate(d); negative:=true; 32199 end; 32200 if n<0 then 32201 begin negate(n); negative:=not negative; 32202 end; 32203 a:=n div d; n:=n-a*d; d:=n-d; {avoid certain compiler optimizations!} 32204 if d+n>=0 then incr(a); 32205 if negative then negate(a); 32206 end; 32207quotient:=a; 32208end; 32209 32210@ Here the term |t| is multiplied by the quotient $n/f$. 32211 32212@d expr_s(#)==#:=fract(#,n,f,max_dimen) 32213 32214@<Cases for evaluation of the current term@>= 32215expr_scale: if l=int_val then t:=fract(t,n,f,infinity) 32216 else if l=dimen_val then expr_s(t) 32217 else begin expr_s(width(t)); expr_s(stretch(t)); expr_s(shrink(t)); 32218 end; 32219 32220@ Finally, the function |fract(x,n,d,max_answer)| computes the integer 32221$q=\lfloor xn/d+{1\over2}\rfloor$, when $x$, $n$, and $d$ are positive 32222and the result does not exceed |max_answer|. We can't use floating 32223point arithmetic since the routine must produce identical results in all 32224cases; and it would be too dangerous to multiply by~|n| and then divide 32225by~|d|, in separate operations, since overflow might well occur. Hence 32226this subroutine simulates double precision arithmetic, somewhat 32227analogous to \MF's |make_fraction| and |take_fraction| routines. 32228 32229@d too_big=88 {go here when the result is too big} 32230 32231@<Declare subprocedures for |scan_expr|@>= 32232function fract(@!x,@!n,@!d,@!max_answer:integer):integer; 32233label found, found1, too_big, done; 32234var negative:boolean; {should the answer be negated?} 32235@!a:integer; {the answer} 32236@!f:integer; {a proper fraction} 32237@!h:integer; {smallest integer such that |2*h>=d|} 32238@!r:integer; {intermediate remainder} 32239@!t:integer; {temp variable} 32240begin if d=0 then goto too_big; 32241a:=0; 32242if d>0 then negative:=false 32243else begin negate(d); negative:=true; 32244 end; 32245if x<0 then 32246 begin negate(x); negative:=not negative; 32247 end 32248else if x=0 then goto done; 32249if n<0 then 32250 begin negate(n); negative:=not negative; 32251 end; 32252t:=n div d; 32253if t>max_answer div x then goto too_big; 32254a:=t*x; n:=n-t*d; 32255if n=0 then goto found; 32256t:=x div d; 32257if t>(max_answer-a) div n then goto too_big; 32258a:=a+t*n; x:=x-t*d; 32259if x=0 then goto found; 32260if x<n then 32261 begin t:=x; x:=n; n:=t; 32262 end; {now |0<n<=x<d|} 32263@<Compute \(f)$f=\lfloor xn/d+{1\over2}\rfloor$@>@; 32264if f>(max_answer-a) then goto too_big; 32265a:=a+f; 32266found: if negative then negate(a); 32267goto done; 32268too_big: num_error(a); 32269done: fract:=a; 32270end; 32271 32272@ The loop here preserves the following invariant relations 32273between |f|, |x|, |n|, and~|r|: 32274(i)~$f+\lfloor(xn+(r+d))/d\rfloor=\lfloor x_0n_0/d+{1\over2}\rfloor$; 32275(ii)~|-d<=r<0<n<=x<d|, where $x_0$, $n_0$ are the original values of~$x$ 32276and $n$. 32277 32278Notice that the computation specifies |(x-d)+x| instead of |(x+x)-d|, 32279because the latter could overflow. 32280 32281@<Compute \(f)$f=\lfloor xn/d+{1\over2}\rfloor$@>= 32282f:=0; r:=(d div 2)-d; h:=-r; 32283loop@+begin if odd(n) then 32284 begin r:=r+x; 32285 if r>=0 then 32286 begin r:=r-d; incr(f); 32287 end; 32288 end; 32289 n:=n div 2; 32290 if n=0 then goto found1; 32291 if x<h then x:=x+x 32292 else begin t:=x-d; x:=t+x; f:=f+n; 32293 if x<n then 32294 begin if x=0 then goto found1; 32295 t:=x; x:=n; n:=t; 32296 end; 32297 end; 32298 end; 32299found1: 32300 32301@ The \.{\\gluestretch}, \.{\\glueshrink}, \.{\\gluestretchorder}, and 32302\.{\\glueshrinkorder} commands return the stretch and shrink components 32303and their orders of ``infinity'' of a glue specification. 32304 32305@d glue_stretch_order_code=eTeX_int+6 {code for \.{\\gluestretchorder}} 32306@d glue_shrink_order_code=eTeX_int+7 {code for \.{\\glueshrinkorder}} 32307@d glue_stretch_code=eTeX_dim+7 {code for \.{\\gluestretch}} 32308@d glue_shrink_code=eTeX_dim+8 {code for \.{\\glueshrink}} 32309 32310@<Generate all \eTeX...@>= 32311primitive("gluestretchorder",last_item,glue_stretch_order_code); 32312@!@:glue_stretch_order_}{\.{\\gluestretchorder} primitive@> 32313primitive("glueshrinkorder",last_item,glue_shrink_order_code); 32314@!@:glue_shrink_order_}{\.{\\glueshrinkorder} primitive@> 32315primitive("gluestretch",last_item,glue_stretch_code); 32316@!@:glue_stretch_}{\.{\\gluestretch} primitive@> 32317primitive("glueshrink",last_item,glue_shrink_code); 32318@!@:glue_shrink_}{\.{\\glueshrink} primitive@> 32319 32320@ @<Cases of |last_item| for |print_cmd_chr|@>= 32321glue_stretch_order_code: print_esc("gluestretchorder"); 32322glue_shrink_order_code: print_esc("glueshrinkorder"); 32323glue_stretch_code: print_esc("gluestretch"); 32324glue_shrink_code: print_esc("glueshrink"); 32325 32326@ @<Cases for fetching an integer value@>= 32327glue_stretch_order_code, glue_shrink_order_code: 32328 begin scan_normal_glue; q:=cur_val; 32329 if m=glue_stretch_order_code then cur_val:=stretch_order(q) 32330 else cur_val:=shrink_order(q); 32331 delete_glue_ref(q); 32332 end; 32333 32334@ @<Cases for fetching a dimension value@>= 32335glue_stretch_code, glue_shrink_code: 32336 begin scan_normal_glue; q:=cur_val; 32337 if m=glue_stretch_code then cur_val:=stretch(q) 32338 else cur_val:=shrink(q); 32339 delete_glue_ref(q); 32340 end; 32341 32342@ The \.{\\mutoglue} and \.{\\gluetomu} commands convert ``math'' glue 32343into normal glue and vice versa; they allow to manipulate math glue with 32344\.{\\gluestretch} etc. 32345 32346@d mu_to_glue_code=eTeX_glue {code for \.{\\mutoglue}} 32347@d glue_to_mu_code=eTeX_mu {code for \.{\\gluetomu}} 32348 32349@<Generate all \eTeX...@>= 32350primitive("mutoglue",last_item,mu_to_glue_code); 32351@!@:mu_to_glue_}{\.{\\mutoglue} primitive@> 32352primitive("gluetomu",last_item,glue_to_mu_code); 32353@!@:glue_to_mu_}{\.{\\gluetomu} primitive@> 32354 32355@ @<Cases of |last_item| for |print_cmd_chr|@>= 32356mu_to_glue_code: print_esc("mutoglue"); 32357glue_to_mu_code: print_esc("gluetomu"); 32358 32359@ @<Cases for fetching a glue value@>= 32360mu_to_glue_code: scan_mu_glue; 32361 32362@ @<Cases for fetching a mu value@>= 32363glue_to_mu_code: scan_normal_glue; 32364 32365@ \eTeX\ (in extended mode) supports 32768 (i.e., $2^{15}$) count, 32366dimen, skip, muskip, box, and token registers. As in \TeX\ the first 32367256 registers of each kind are realized as arrays in the table of 32368equivalents; the additional registers are realized as tree structures 32369built from variable-size nodes with individual registers existing only 32370when needed. Default values are used for nonexistent registers: zero 32371for count and dimen values, |zero_glue| for glue (skip and muskip) 32372values, void for boxes, and |null| for token lists (and current marks 32373discussed below). 32374 32375Similarly there are 32768 mark classes; the command \.{\\marks}|n| 32376creates a mark node for a given mark class |0<=n<=32767| (where 32377\.{\\marks0} is synonymous to \.{\\mark}). The page builder (actually 32378the |fire_up| routine) and the |vsplit| routine maintain the current 32379values of |top_mark|, |first_mark|, |bot_mark|, |split_first_mark|, and 32380|split_bot_mark| for each mark class. They are accessed as 32381\.{\\topmarks}|n| etc., and \.{\\topmarks0} is again synonymous to 32382\.{\\topmark}. As in \TeX\ the five current marks for mark class zero 32383are realized as |cur_mark| array. The additional current marks are 32384again realized as tree structure with individual mark classes existing 32385only when needed. 32386 32387@<Generate all \eTeX...@>= 32388primitive("marks",mark,marks_code); 32389@!@:marks_}{\.{\\marks} primitive@> 32390primitive("topmarks",top_bot_mark,top_mark_code+marks_code); 32391@!@:top_marks_}{\.{\\topmarks} primitive@> 32392primitive("firstmarks",top_bot_mark,first_mark_code+marks_code); 32393@!@:first_marks_}{\.{\\firstmarks} primitive@> 32394primitive("botmarks",top_bot_mark,bot_mark_code+marks_code); 32395@!@:bot_marks_}{\.{\\botmarks} primitive@> 32396primitive("splitfirstmarks",top_bot_mark,split_first_mark_code+marks_code); 32397@!@:split_first_marks_}{\.{\\splitfirstmarks} primitive@> 32398primitive("splitbotmarks",top_bot_mark,split_bot_mark_code+marks_code); 32399@!@:split_bot_marks_}{\.{\\splitbotmarks} primitive@> 32400 32401@ The |scan_register_num| procedure scans a register number that must 32402not exceed 255 in compatibility mode resp.\ 32767 in extended mode. 32403 32404@<Declare \eTeX\ procedures for ex...@>= 32405procedure@?scan_register_num; forward;@t\2@> 32406 32407@ @<Declare procedures that scan restricted classes of integers@>= 32408procedure scan_register_num; 32409begin scan_int; 32410if (cur_val<0)or(cur_val>max_reg_num) then 32411 begin print_err("Bad register code"); 32412@.Bad register code@> 32413 help2(max_reg_help_line)("I changed this one to zero."); 32414 int_error(cur_val); cur_val:=0; 32415 end; 32416end; 32417 32418@ @<Initialize variables for \eTeX\ comp...@>= 32419max_reg_num:=255; 32420max_reg_help_line:="A register number must be between 0 and 255."; 32421 32422@ @<Initialize variables for \eTeX\ ext...@>= 32423max_reg_num:=32767; 32424max_reg_help_line:="A register number must be between 0 and 32767."; 32425 32426@ @<Glob...@>= 32427@!max_reg_num: halfword; {largest allowed register number} 32428@!max_reg_help_line: str_number; {first line of help message} 32429 32430@ There are seven almost identical doubly linked trees, one for the 32431sparse array of the up to 32512 additional registers of each kind and 32432one for the sparse array of the up to 32767 additional mark classes. 32433The root of each such tree, if it exists, is an index node containing 16 32434pointers to subtrees for 4096 consecutive array elements. Similar index 32435nodes are the starting points for all nonempty subtrees for 4096, 256, 32436and 16 consecutive array elements. These four levels of index nodes are 32437followed by a fifth level with nodes for the individual array elements. 32438 32439Each index node is nine words long. The pointers to the 16 possible 32440subtrees or are kept in the |info| and |link| fields of the last eight 32441words. (It would be both elegant and efficient to declare them as 32442array, unfortunately \PASCAL\ doesn't allow this.) 32443 32444The fields in the first word of each index node and in the nodes for the 32445array elements are closely related. The |link| field points to the next 32446lower index node and the |sa_index| field contains four bits (one 32447hexadecimal digit) of the register number or mark class. For the lowest 32448index node the |link| field is |null| and the |sa_index| field indicates 32449the type of quantity (|int_val|, |dimen_val|, |glue_val|, |mu_val|, 32450|box_val|, |tok_val|, or |mark_val|). The |sa_used| field in the index 32451nodes counts how many of the 16 pointers are non-null. 32452 32453The |sa_index| field in the nodes for array elements contains the four 32454bits plus 16 times the type. Therefore such a node represents a count 32455or dimen register if and only if |sa_index<dimen_val_limit|; it 32456represents a skip or muskip register if and only if 32457|dimen_val_limit<=sa_index<mu_val_limit|; it represents a box register 32458if and only if |mu_val_limit<=sa_index<box_val_limit|; it represents a 32459token list register if and only if 32460|box_val_limit<=sa_index<tok_val_limit|; finally it represents a mark 32461class if and only if |tok_val_limit<=sa_index|. 32462 32463The |new_index| procedure creates an index node (returned in |cur_ptr|) 32464having given contents of the |sa_index| and |link| fields. 32465 32466@d box_val==4 {the additional box registers} 32467@d mark_val=7 {the additional mark classes} 32468@# 32469@d dimen_val_limit=@"20 {$2^4\cdot(|dimen_val|+1)$} 32470@d mu_val_limit=@"40 {$2^4\cdot(|mu_val|+1)$} 32471@d box_val_limit=@"50 {$2^4\cdot(|box_val|+1)$} 32472@d tok_val_limit=@"60 {$2^4\cdot(|tok_val|+1)$} 32473@# 32474@d index_node_size=9 {size of an index node} 32475@d sa_index==type {a four-bit address or a type or both} 32476@d sa_used==subtype {count of non-null pointers} 32477 32478@<Declare \eTeX\ procedures for ex...@>= 32479procedure new_index(@!i:quarterword; @!q:pointer); 32480var k:small_number; {loop index} 32481begin cur_ptr:=get_node(index_node_size); sa_index(cur_ptr):=i; 32482sa_used(cur_ptr):=0; link(cur_ptr):=q; 32483for k:=1 to index_node_size-1 do {clear all 16 pointers} 32484 mem[cur_ptr+k]:=sa_null; 32485end; 32486 32487@ The roots of the seven trees for the additional registers and mark 32488classes are kept in the |sa_root| array. The first six locations must 32489be dumped and undumped; the last one is also known as |sa_mark|. 32490 32491@d sa_mark==sa_root[mark_val] {root for mark classes} 32492 32493@<Glob...@>= 32494@!sa_root:array[int_val..mark_val] of pointer; {roots of sparse arrays} 32495@!cur_ptr:pointer; {value returned by |new_index| and |find_sa_element|} 32496@!sa_null:memory_word; {two |null| pointers} 32497 32498@ @<Set init...@>= 32499sa_mark:=null; sa_null.hh.lh:=null; sa_null.hh.rh:=null; 32500 32501@ @<Initialize table...@>= 32502for i:=int_val to inter_char_val do sa_root[i]:=null; 32503 32504@ Given a type |t| and a sixteen-bit number |n|, the |find_sa_element| 32505procedure returns (in |cur_ptr|) a pointer to the node for the 32506corresponding array element, or |null| when no such element exists. The 32507third parameter |w| is set |true| if the element must exist, e.g., 32508because it is about to be modified. The procedure has two main 32509branches: one follows the existing tree structure, the other (only used 32510when |w| is |true|) creates the missing nodes. 32511 32512We use macros to extract the four-bit pieces from a sixteen-bit register 32513number or mark class and to fetch or store one of the 16 pointers from 32514an index node. 32515 32516@d if_cur_ptr_is_null_then_return_or_goto(#)== {some tree element is missing} 32517 begin if cur_ptr=null then 32518 if w then goto #@+else return; 32519 end 32520@# 32521@d hex_dig1(#)==# div 4096 {the fourth lowest hexadecimal digit} 32522@d hex_dig2(#)==(# div 256) mod 16 {the third lowest hexadecimal digit} 32523@d hex_dig3(#)==(# div 16) mod 16 {the second lowest hexadecimal digit} 32524@d hex_dig4(#)==# mod 16 {the lowest hexadecimal digit} 32525@# 32526@d get_sa_ptr==if odd(i) then cur_ptr:=link(q+(i div 2)+1) 32527 else cur_ptr:=info(q+(i div 2)+1) 32528 {set |cur_ptr| to the pointer indexed by |i| from index node |q|} 32529@d put_sa_ptr(#)==if odd(i) then link(q+(i div 2)+1):=# 32530 else info(q+(i div 2)+1):=# 32531 {store the pointer indexed by |i| in index node |q|} 32532@d add_sa_ptr==begin put_sa_ptr(cur_ptr); incr(sa_used(q)); 32533 end {add |cur_ptr| as the pointer indexed by |i| in index node |q|} 32534@d delete_sa_ptr==begin put_sa_ptr(null); decr(sa_used(q)); 32535 end {delete the pointer indexed by |i| in index node |q|} 32536 32537@<Declare \eTeX\ procedures for ex...@>= 32538procedure find_sa_element(@!t:small_number;@!n:halfword;@!w:boolean); 32539 {sets |cur_val| to sparse array element location or |null|} 32540label not_found,not_found1,not_found2,not_found3,not_found4,exit; 32541var q:pointer; {for list manipulations} 32542@!i:small_number; {a four bit index} 32543begin cur_ptr:=sa_root[t]; 32544if_cur_ptr_is_null_then_return_or_goto(not_found);@/ 32545q:=cur_ptr; i:=hex_dig1(n); get_sa_ptr; 32546if_cur_ptr_is_null_then_return_or_goto(not_found1);@/ 32547q:=cur_ptr; i:=hex_dig2(n); get_sa_ptr; 32548if_cur_ptr_is_null_then_return_or_goto(not_found2);@/ 32549q:=cur_ptr; i:=hex_dig3(n); get_sa_ptr; 32550if_cur_ptr_is_null_then_return_or_goto(not_found3);@/ 32551q:=cur_ptr; i:=hex_dig4(n); get_sa_ptr; 32552if (cur_ptr=null)and w then goto not_found4; 32553return; 32554not_found: new_index(t,null); {create first level index node} 32555sa_root[t]:=cur_ptr; q:=cur_ptr; i:=hex_dig1(n); 32556not_found1: new_index(i,q); {create second level index node} 32557add_sa_ptr; q:=cur_ptr; i:=hex_dig2(n); 32558not_found2: new_index(i,q); {create third level index node} 32559add_sa_ptr; q:=cur_ptr; i:=hex_dig3(n); 32560not_found3: new_index(i,q); {create fourth level index node} 32561add_sa_ptr; q:=cur_ptr; i:=hex_dig4(n); 32562not_found4: @<Create a new array element of type |t| with index |i|@>; 32563link(cur_ptr):=q; add_sa_ptr; 32564exit:end; 32565 32566@ The array elements for registers are subject to grouping and have an 32567|sa_lev| field (quite analogous to |eq_level|) instead of |sa_used|. 32568Since saved values as well as shorthand definitions (created by e.g., 32569\.{\\countdef}) refer to the location of the respective array element, 32570we need a reference count that is kept in the |sa_ref| field. An array 32571element can be deleted (together with all references to it) when its 32572|sa_ref| value is |null| and its value is the default value. 32573@^reference counts@> 32574 32575Skip, muskip, box, and token registers use two word nodes, their values 32576are stored in the |sa_ptr| field. 32577Count and dimen registers use three word nodes, their 32578values are stored in the |sa_int| resp.\ |sa_dim| field in the third 32579word; the |sa_ptr| field is used under the name |sa_num| to store 32580the register number. Mark classes use four word nodes. The last three 32581words contain the five types of current marks 32582 32583@d sa_lev==sa_used {grouping level for the current value} 32584@d pointer_node_size=2 {size of an element with a pointer value} 32585@d sa_type(#)==(sa_index(#) div 16) {type part of combined type/index} 32586@d sa_ref(#)==info(#+1) {reference count of a sparse array element} 32587@d sa_ptr(#)==link(#+1) {a pointer value} 32588@# 32589@d word_node_size=3 {size of an element with a word value} 32590@d sa_num==sa_ptr {the register number} 32591@d sa_int(#)==mem[#+2].int {an integer} 32592@d sa_dim(#)==mem[#+2].sc {a dimension (a somewhat esotheric distinction)} 32593@# 32594@d mark_class_node_size=4 {size of an element for a mark class} 32595@# 32596@d fetch_box(#)== {fetch |box(cur_val)|} 32597 if cur_val<256 then #:=box(cur_val) 32598 else begin find_sa_element(box_val,cur_val,false); 32599 if cur_ptr=null then #:=null@+else #:=sa_ptr(cur_ptr); 32600 end 32601 32602@<Create a new array element...@>= 32603if t=mark_val then {a mark class} 32604 begin cur_ptr:=get_node(mark_class_node_size); 32605 mem[cur_ptr+1]:=sa_null; mem[cur_ptr+2]:=sa_null; mem[cur_ptr+3]:=sa_null; 32606 end 32607else begin if t<=dimen_val then {a count or dimen register} 32608 begin cur_ptr:=get_node(word_node_size); sa_int(cur_ptr):=0; 32609 sa_num(cur_ptr):=n; 32610 end 32611 else begin cur_ptr:=get_node(pointer_node_size); 32612 if t<=mu_val then {a skip or muskip register} 32613 begin sa_ptr(cur_ptr):=zero_glue; add_glue_ref(zero_glue); 32614 end 32615 else sa_ptr(cur_ptr):=null; {a box or token list register} 32616 end; 32617 sa_ref(cur_ptr):=null; {all registers have a reference count} 32618 end; 32619sa_index(cur_ptr):=16*t+i; sa_lev(cur_ptr):=level_one 32620 32621@ The |delete_sa_ref| procedure is called when a pointer to an array 32622element representing a register is being removed; this means that the 32623reference count should be decreased by one. If the reduced reference 32624count is |null| and the register has been (globally) assigned its 32625default value the array element should disappear, possibly together with 32626some index nodes. This procedure will never be used for mark class 32627nodes. 32628@^reference counts@> 32629 32630@d add_sa_ref(#)==incr(sa_ref(#)) {increase reference count} 32631@# 32632@d change_box(#)== {change |box(cur_val)|, the |eq_level| stays the same} 32633 if cur_val<256 then box(cur_val):=#@+else set_sa_box(#) 32634@# 32635@d set_sa_box(#)==begin find_sa_element(box_val,cur_val,false); 32636 if cur_ptr<>null then 32637 begin sa_ptr(cur_ptr):=#; add_sa_ref(cur_ptr); delete_sa_ref(cur_ptr); 32638 end; 32639 end 32640 32641@<Declare \eTeX\ procedures for tr...@>= 32642procedure delete_sa_ref(@!q:pointer); {reduce reference count} 32643label exit; 32644var p:pointer; {for list manipulations} 32645@!i:small_number; {a four bit index} 32646@!s:small_number; {size of a node} 32647begin decr(sa_ref(q)); 32648if sa_ref(q)<>null then return; 32649if sa_index(q)<dimen_val_limit then 32650 if sa_int(q)=0 then s:=word_node_size 32651 else return 32652else begin if sa_index(q)<mu_val_limit then 32653 if sa_ptr(q)=zero_glue then delete_glue_ref(zero_glue) 32654 else return 32655 else if sa_ptr(q)<>null then return; 32656 s:=pointer_node_size; 32657 end; 32658repeat i:=hex_dig4(sa_index(q)); p:=q; q:=link(p); free_node(p,s); 32659if q=null then {the whole tree has been freed} 32660 begin sa_root[i]:=null; return; 32661 end; 32662delete_sa_ptr; s:=index_node_size; {node |q| is an index node} 32663until sa_used(q)>0; 32664exit:end; 32665 32666@ The |print_sa_num| procedure prints the register number corresponding 32667to an array element. 32668 32669@<Basic print...@>= 32670procedure print_sa_num(@!q:pointer); {print register number} 32671var @!n:halfword; {the register number} 32672begin if sa_index(q)<dimen_val_limit then n:=sa_num(q) {the easy case} 32673else begin n:=hex_dig4(sa_index(q)); q:=link(q); n:=n+16*sa_index(q); 32674 q:=link(q); n:=n+256*(sa_index(q)+16*sa_index(link(q))); 32675 end; 32676print_int(n); 32677end; 32678 32679@ Here is a procedure that displays the contents of an array element 32680symbolically. It is used under similar circumstances as is 32681|restore_trace| (together with |show_eqtb|) for the quantities kept in 32682the |eqtb| array. 32683 32684@<Declare \eTeX\ procedures for tr...@>= 32685@!stat procedure show_sa(@!p:pointer;@!s:str_number); 32686var t:small_number; {the type of element} 32687begin begin_diagnostic; print_char("{"); print(s); print_char(" "); 32688if p=null then print_char("?") {this can't happen} 32689else begin t:=sa_type(p); 32690 if t<box_val then print_cmd_chr(register,p) 32691 else if t=box_val then 32692 begin print_esc("box"); print_sa_num(p); 32693 end 32694 else if t=tok_val then print_cmd_chr(toks_register,p) 32695 else print_char("?"); {this can't happen either} 32696 print_char("="); 32697 if t=int_val then print_int(sa_int(p)) 32698 else if t=dimen_val then 32699 begin print_scaled(sa_dim(p)); print("pt"); 32700 end 32701 else begin p:=sa_ptr(p); 32702 if t=glue_val then print_spec(p,"pt") 32703 else if t=mu_val then print_spec(p,"mu") 32704 else if t=box_val then 32705 if p=null then print("void") 32706 else begin depth_threshold:=0; breadth_max:=1; show_node_list(p); 32707 end 32708 else if t=tok_val then 32709 begin if p<>null then show_token_list(link(p),null,32); 32710 end 32711 else print_char("?"); {this can't happen either} 32712 end; 32713 end; 32714print_char("}"); end_diagnostic(false); 32715end; 32716tats 32717 32718@ Here we compute the pointer to the current mark of type |t| and mark 32719class |cur_val|. 32720 32721@<Compute the mark pointer...@>= 32722begin find_sa_element(mark_val,cur_val,false); 32723if cur_ptr<>null then 32724 if odd(t) then cur_ptr:=link(cur_ptr+(t div 2)+1) 32725 else cur_ptr:=info(cur_ptr+(t div 2)+1); 32726end 32727 32728@ The current marks for all mark classes are maintained by the |vsplit| 32729and |fire_up| routines and are finally destroyed (for \.{INITEX} only) 32730@.INITEX@> 32731by the |final_cleanup| routine. Apart from updating the current marks 32732when mark nodes are encountered, these routines perform certain actions 32733on all existing mark classes. The recursive |do_marks| procedure walks 32734through the whole tree or a subtree of existing mark class nodes and 32735preforms certain actions indicted by its first parameter |a|, the action 32736code. The second parameter |l| indicates the level of recursion (at 32737most four); the third parameter points to a nonempty tree or subtree. 32738The result is |true| if the complete tree or subtree has been deleted. 32739 32740@d vsplit_init==0 {action code for |vsplit| initialization} 32741@d fire_up_init==1 {action code for |fire_up| initialization} 32742@d fire_up_done==2 {action code for |fire_up| completion} 32743@d destroy_marks==3 {action code for |final_cleanup|} 32744@# 32745@d sa_top_mark(#)==info(#+1) {\.{\\topmarks}|n|} 32746@d sa_first_mark(#)==link(#+1) {\.{\\firstmarks}|n|} 32747@d sa_bot_mark(#)==info(#+2) {\.{\\botmarks}|n|} 32748@d sa_split_first_mark(#)==link(#+2) {\.{\\splitfirstmarks}|n|} 32749@d sa_split_bot_mark(#)==info(#+3) {\.{\\splitbotmarks}|n|} 32750 32751@<Declare the function called |do_marks|@>= 32752function do_marks(@!a,@!l:small_number;@!q:pointer):boolean; 32753var i:small_number; {a four bit index} 32754begin if l<4 then {|q| is an index node} 32755 begin for i:=0 to 15 do 32756 begin get_sa_ptr; 32757 if cur_ptr<>null then if do_marks(a,l+1,cur_ptr) then delete_sa_ptr; 32758 end; 32759 if sa_used(q)=0 then 32760 begin free_node(q,index_node_size); q:=null; 32761 end; 32762 end 32763else {|q| is the node for a mark class} 32764 begin case a of 32765 @<Cases for |do_marks|@>@; 32766 end; {there are no other cases} 32767 if sa_bot_mark(q)=null then if sa_split_bot_mark(q)=null then 32768 begin free_node(q,mark_class_node_size); q:=null; 32769 end; 32770 end; 32771do_marks:=(q=null); 32772end; 32773 32774@ At the start of the |vsplit| routine the existing |split_fist_mark| 32775and |split_bot_mark| are discarded. 32776 32777@<Cases for |do_marks|@>= 32778vsplit_init: if sa_split_first_mark(q)<>null then 32779 begin delete_token_ref(sa_split_first_mark(q)); sa_split_first_mark(q):=null; 32780 delete_token_ref(sa_split_bot_mark(q)); sa_split_bot_mark(q):=null; 32781 end; 32782 32783@ We use again the fact that |split_first_mark=null| if and only if 32784|split_bot_mark=null|. 32785 32786@<Update the current marks for |vsplit|@>= 32787begin find_sa_element(mark_val,mark_class(p),true); 32788if sa_split_first_mark(cur_ptr)=null then 32789 begin sa_split_first_mark(cur_ptr):=mark_ptr(p); 32790 add_token_ref(mark_ptr(p)); 32791 end 32792else delete_token_ref(sa_split_bot_mark(cur_ptr)); 32793sa_split_bot_mark(cur_ptr):=mark_ptr(p); 32794add_token_ref(mark_ptr(p)); 32795end 32796 32797@ At the start of the |fire_up| routine the old |top_mark| and 32798|first_mark| are discarded, whereas the old |bot_mark| becomes the new 32799|top_mark|. An empty new |top_mark| token list is, however, discarded 32800as well in order that mark class nodes can eventually be released. We 32801use again the fact that |bot_mark<>null| implies |first_mark<>null|; it 32802also knows that |bot_mark=null| implies |top_mark=first_mark=null|. 32803 32804@<Cases for |do_marks|@>= 32805fire_up_init: if sa_bot_mark(q)<>null then 32806 begin if sa_top_mark(q)<>null then delete_token_ref(sa_top_mark(q)); 32807 delete_token_ref(sa_first_mark(q)); sa_first_mark(q):=null; 32808 if link(sa_bot_mark(q))=null then {an empty token list} 32809 begin delete_token_ref(sa_bot_mark(q)); sa_bot_mark(q):=null; 32810 end 32811 else add_token_ref(sa_bot_mark(q)); 32812 sa_top_mark(q):=sa_bot_mark(q); 32813 end; 32814 32815@ @<Cases for |do_marks|@>= 32816fire_up_done: if (sa_top_mark(q)<>null)and(sa_first_mark(q)=null) then 32817 begin sa_first_mark(q):=sa_top_mark(q); add_token_ref(sa_top_mark(q)); 32818 end; 32819 32820@ @<Update the current marks for |fire_up|@>= 32821begin find_sa_element(mark_val,mark_class(p),true); 32822if sa_first_mark(cur_ptr)=null then 32823 begin sa_first_mark(cur_ptr):=mark_ptr(p); 32824 add_token_ref(mark_ptr(p)); 32825 end; 32826if sa_bot_mark(cur_ptr)<>null then delete_token_ref(sa_bot_mark(cur_ptr)); 32827sa_bot_mark(cur_ptr):=mark_ptr(p); add_token_ref(mark_ptr(p)); 32828end 32829 32830@ Here we use the fact that the five current mark pointers in a mark 32831class node occupy the same locations as the the first five pointers of 32832an index node. For systems using a run-time switch to distinguish 32833between \.{VIRTEX} and \.{INITEX}, the codewords `$|init|\ldots|tini|$' 32834surrounding the following piece of code should be removed. 32835@.INITEX@> 32836@^system dependencies@> 32837 32838@<Cases for |do_marks|@>= 32839@!init destroy_marks: for i:=top_mark_code to split_bot_mark_code do 32840 begin get_sa_ptr; 32841 if cur_ptr<>null then 32842 begin delete_token_ref(cur_ptr); put_sa_ptr(null); 32843 end; 32844 end; 32845tini 32846 32847@ The command code |register| is used for `\.{\\count}', `\.{\\dimen}', 32848etc., as well as for references to sparse array elements defined by 32849`\.{\\countdef}', etc. 32850 32851@<Cases of |register| for |print_cmd_chr|@>= 32852begin if (chr_code<mem_bot)or(chr_code>lo_mem_stat_max) then 32853 cmd:=sa_type(chr_code) 32854else begin cmd:=chr_code-mem_bot; chr_code:=null; 32855 end; 32856if cmd=int_val then print_esc("count") 32857else if cmd=dimen_val then print_esc("dimen") 32858else if cmd=glue_val then print_esc("skip") 32859else print_esc("muskip"); 32860if chr_code<>null then print_sa_num(chr_code); 32861end 32862 32863@ Similarly the command code |toks_register| is used for `\.{\\toks}' as 32864well as for references to sparse array elements defined by 32865`\.{\\toksdef}'. 32866 32867@<Cases of |toks_register| for |print_cmd_chr|@>= 32868begin print_esc("toks"); 32869if chr_code<>mem_bot then print_sa_num(chr_code); 32870end 32871 32872@ When a shorthand definition for an element of one of the sparse arrays 32873is destroyed, we must reduce the reference count. 32874 32875@<Cases for |eq_destroy|@>= 32876toks_register,register: 32877 if (equiv_field(w)<mem_bot)or(equiv_field(w)>lo_mem_stat_max) then 32878 delete_sa_ref(equiv_field(w)); 32879 32880@ The task to maintain (change, save, and restore) register values is 32881essentially the same when the register is realized as sparse array 32882element or entry in |eqtb|. The global variable |sa_chain| is the head 32883of a linked list of entries saved at the topmost level |sa_level|; the 32884lists for lowel levels are kept in special save stack entries. 32885 32886@<Glob...@>= 32887@!sa_chain: pointer; {chain of saved sparse array entries} 32888@!sa_level: quarterword; {group level for |sa_chain|} 32889 32890@ @<Set init...@>= 32891sa_chain:=null; sa_level:=level_zero; 32892 32893@ The individual saved items are kept in pointer or word nodes similar 32894to those used for the array elements: a word node with value zero is, 32895however, saved as pointer node with the otherwise impossible |sa_index| 32896value |tok_val_limit|. 32897 32898@d sa_loc==sa_ref {location of saved item} 32899 32900@<Declare \eTeX\ procedures for tr...@>= 32901procedure sa_save(@!p:pointer); {saves value of |p|} 32902var q:pointer; {the new save node} 32903@!i:quarterword; {index field of node} 32904begin if cur_level<>sa_level then 32905 begin check_full_save_stack; save_type(save_ptr):=restore_sa; 32906 save_level(save_ptr):=sa_level; save_index(save_ptr):=sa_chain; 32907 incr(save_ptr); sa_chain:=null; sa_level:=cur_level; 32908 end; 32909i:=sa_index(p); 32910if i<dimen_val_limit then 32911 begin if sa_int(p)=0 then 32912 begin q:=get_node(pointer_node_size); i:=tok_val_limit; 32913 end 32914 else begin q:=get_node(word_node_size); sa_int(q):=sa_int(p); 32915 end; 32916 sa_ptr(q):=null; 32917 end 32918else begin q:=get_node(pointer_node_size); sa_ptr(q):=sa_ptr(p); 32919 end; 32920sa_loc(q):=p; sa_index(q):=i; sa_lev(q):=sa_lev(p); 32921link(q):=sa_chain; sa_chain:=q; add_sa_ref(p); 32922end; 32923 32924@ @<Declare \eTeX\ procedures for tr...@>= 32925procedure sa_destroy(@!p:pointer); {destroy value of |p|} 32926begin if sa_index(p)<mu_val_limit then delete_glue_ref(sa_ptr(p)) 32927else if sa_ptr(p)<>null then 32928 if sa_index(p)<box_val_limit then flush_node_list(sa_ptr(p)) 32929 else delete_token_ref(sa_ptr(p)); 32930end; 32931 32932@ The procedure |sa_def| assigns a new value to sparse array elements, 32933and saves the former value if appropriate. This procedure is used only 32934for skip, muskip, box, and token list registers. The counterpart of 32935|sa_def| for count and dimen registers is called |sa_w_def|. 32936 32937@d sa_define(#)==if e then 32938 if global then gsa_def(#)@+else sa_def(#) 32939 else define 32940@# 32941@d sa_def_box== {assign |cur_box| to |box(cur_val)|} 32942 begin find_sa_element(box_val,cur_val,true); 32943 if global then gsa_def(cur_ptr,cur_box)@+else sa_def(cur_ptr,cur_box); 32944 end 32945@# 32946@d sa_word_define(#)==if e then 32947 if global then gsa_w_def(#)@+else sa_w_def(#) 32948 else word_define(#) 32949 32950@<Declare \eTeX\ procedures for tr...@>= 32951procedure sa_def(@!p:pointer;@!e:halfword); 32952 {new data for sparse array elements} 32953begin add_sa_ref(p); 32954if sa_ptr(p)=e then 32955 begin @!stat if tracing_assigns>0 then show_sa(p,"reassigning");@+tats@;@/ 32956 sa_destroy(p); 32957 end 32958else begin @!stat if tracing_assigns>0 then show_sa(p,"changing");@+tats@;@/ 32959 if sa_lev(p)=cur_level then sa_destroy(p)@+else sa_save(p); 32960 sa_lev(p):=cur_level; sa_ptr(p):=e; 32961 @!stat if tracing_assigns>0 then show_sa(p,"into");@+tats@;@/ 32962 end; 32963delete_sa_ref(p); 32964end; 32965@# 32966procedure sa_w_def(@!p:pointer;@!w:integer); 32967begin add_sa_ref(p); 32968if sa_int(p)=w then 32969 begin @!stat if tracing_assigns>0 then show_sa(p,"reassigning");@+tats@;@/ 32970 end 32971else begin @!stat if tracing_assigns>0 then show_sa(p,"changing");@+tats@;@/ 32972 if sa_lev(p)<>cur_level then sa_save(p); 32973 sa_lev(p):=cur_level; sa_int(p):=w; 32974 @!stat if tracing_assigns>0 then show_sa(p,"into");@+tats@;@/ 32975 end; 32976delete_sa_ref(p); 32977end; 32978 32979@ The |sa_def| and |sa_w_def| routines take care of local definitions. 32980@^global definitions@> 32981Global definitions are done in almost the same way, but there is no need 32982to save old values, and the new value is associated with |level_one|. 32983 32984@<Declare \eTeX\ procedures for tr...@>= 32985procedure gsa_def(@!p:pointer;@!e:halfword); {global |sa_def|} 32986begin add_sa_ref(p); 32987@!stat if tracing_assigns>0 then show_sa(p,"globally changing");@+tats@;@/ 32988sa_destroy(p); sa_lev(p):=level_one; sa_ptr(p):=e; 32989@!stat if tracing_assigns>0 then show_sa(p,"into");@+tats@;@/ 32990delete_sa_ref(p); 32991end; 32992@# 32993procedure gsa_w_def(@!p:pointer;@!w:integer); {global |sa_w_def|} 32994begin add_sa_ref(p); 32995@!stat if tracing_assigns>0 then show_sa(p,"globally changing");@+tats@;@/ 32996sa_lev(p):=level_one; sa_int(p):=w; 32997@!stat if tracing_assigns>0 then show_sa(p,"into");@+tats@;@/ 32998delete_sa_ref(p); 32999end; 33000 33001@ The |sa_restore| procedure restores the sparse array entries pointed 33002at by |sa_chain| 33003 33004@<Declare \eTeX\ procedures for tr...@>= 33005procedure sa_restore; 33006var p:pointer; {sparse array element} 33007begin repeat p:=sa_loc(sa_chain); 33008if sa_lev(p)=level_one then 33009 begin if sa_index(p)>=dimen_val_limit then sa_destroy(sa_chain); 33010 @!stat if tracing_restores>0 then show_sa(p,"retaining");@+tats@;@/ 33011 end 33012else begin if sa_index(p)<dimen_val_limit then 33013 if sa_index(sa_chain)<dimen_val_limit then sa_int(p):=sa_int(sa_chain) 33014 else sa_int(p):=0 33015 else begin sa_destroy(p); sa_ptr(p):=sa_ptr(sa_chain); 33016 end; 33017 sa_lev(p):=sa_lev(sa_chain); 33018 @!stat if tracing_restores>0 then show_sa(p,"restoring");@+tats@;@/ 33019 end; 33020delete_sa_ref(p); 33021p:=sa_chain; sa_chain:=link(p); 33022if sa_index(p)<dimen_val_limit then free_node(p,word_node_size) 33023else free_node(p,pointer_node_size); 33024until sa_chain=null; 33025end; 33026 33027@ When the value of |last_line_fit| is positive, the last line of a 33028(partial) paragraph is treated in a special way and we need additional 33029fields in the active nodes. 33030 33031@d active_node_size_extended=5 {number of words in extended active nodes} 33032@d active_short(#)==mem[#+3].sc {|shortfall| of this line} 33033@d active_glue(#)==mem[#+4].sc {corresponding glue stretch or shrink} 33034 33035@<Glob...@>= 33036@!last_line_fill:pointer; {the |par_fill_skip| glue node of the new paragraph} 33037@!do_last_line_fit:boolean; {special algorithm for last line of paragraph?} 33038@!active_node_size:small_number; {number of words in active nodes} 33039@!fill_width:array[0..2] of scaled; {infinite stretch components of 33040 |par_fill_skip|} 33041@!best_pl_short:array[very_loose_fit..tight_fit] of scaled; {|shortfall| 33042 corresponding to |minimal_demerits|} 33043@!best_pl_glue:array[very_loose_fit..tight_fit] of scaled; {corresponding 33044 glue stretch or shrink} 33045 33046@ The new algorithm for the last line requires that the stretchability of 33047|par_fill_skip| is infinite and the stretchability of |left_skip| plus 33048|right_skip| is finite. 33049 33050@<Check for special...@>= 33051do_last_line_fit:=false; active_node_size:=active_node_size_normal; 33052 {just in case} 33053if last_line_fit>0 then 33054 begin q:=glue_ptr(last_line_fill); 33055 if (stretch(q)>0)and(stretch_order(q)>normal) then 33056 if (background[3]=0)and(background[4]=0)and(background[5]=0) then 33057 begin do_last_line_fit:=true; 33058 active_node_size:=active_node_size_extended; 33059 fill_width[0]:=0; fill_width[1]:=0; fill_width[2]:=0; 33060 fill_width[stretch_order(q)-1]:=stretch(q); 33061 end; 33062 end 33063 33064@ @<Other local variables for |try_break|@>= 33065@!g:scaled; {glue stretch or shrink of test line, adjustment for last line} 33066 33067@ Here we initialize the additional fields of the first active node 33068representing the beginning of the paragraph. 33069 33070@<Initialize additional fields of the first active node@>= 33071begin active_short(q):=0; active_glue(q):=0; 33072end 33073 33074@ Here we compute the adjustment |g| and badness |b| for a line from |r| 33075to the end of the paragraph. When any of the criteria for adjustment is 33076violated we fall through to the normal algorithm. 33077 33078The last line must be too short, and have infinite stretch entirely due 33079to |par_fill_skip|. 33080 33081@<Perform computations for last line and |goto found|@>= 33082begin if (active_short(r)=0)or(active_glue(r)<=0) then goto not_found; 33083 {previous line was neither stretched nor shrunk, or was infinitely bad} 33084if (cur_active_width[3]<>fill_width[0])or@| 33085 (cur_active_width[4]<>fill_width[1])or@| 33086 (cur_active_width[5]<>fill_width[2]) then goto not_found; 33087 {infinite stretch of this line not entirely due to |par_fill_skip|} 33088if active_short(r)>0 then g:=cur_active_width[2] 33089else g:=cur_active_width[6]; 33090if g<=0 then goto not_found; {no finite stretch resp.\ no shrink} 33091arith_error:=false; g:=fract(g,active_short(r),active_glue(r),max_dimen); 33092if last_line_fit<1000 then g:=fract(g,last_line_fit,1000,max_dimen); 33093if arith_error then 33094 if active_short(r)>0 then g:=max_dimen@+else g:=-max_dimen; 33095if g>0 then 33096 @<Set the value of |b| to the badness of the last line for stretching, 33097 compute the corresponding |fit_class|, and |goto found|@> 33098else if g<0 then 33099 @<Set the value of |b| to the badness of the last line for shrinking, 33100 compute the corresponding |fit_class|, and |goto found|@>; 33101not_found:end 33102 33103@ These badness computations are rather similar to those of the standard 33104algorithm, with the adjustment amount |g| replacing the |shortfall|. 33105 33106@<Set the value of |b| to the badness of the last line for str...@>= 33107begin if g>shortfall then g:=shortfall; 33108if g>7230584 then if cur_active_width[2]<1663497 then 33109 begin b:=inf_bad; fit_class:=very_loose_fit; goto found; 33110 end; 33111b:=badness(g,cur_active_width[2]); 33112if b>12 then 33113 if b>99 then fit_class:=very_loose_fit 33114 else fit_class:=loose_fit 33115else fit_class:=decent_fit; 33116goto found; 33117end 33118 33119@ @<Set the value of |b| to the badness of the last line for shr...@>= 33120begin if -g>cur_active_width[6] then g:=-cur_active_width[6]; 33121b:=badness(-g,cur_active_width[6]); 33122if b>12 then fit_class:=tight_fit@+else fit_class:=decent_fit; 33123goto found; 33124end 33125 33126@ Vanishing values of |shortfall| and |g| indicate that the last line is 33127not adjusted. 33128 33129@<Adjust \(t)the additional data for last line@>= 33130begin if cur_p=null then shortfall:=0; 33131if shortfall>0 then g:=cur_active_width[2] 33132else if shortfall<0 then g:=cur_active_width[6] 33133else g:=0; 33134end 33135 33136@ For each feasible break we record the shortfall and glue stretch or 33137shrink (or adjustment). 33138 33139@<Store \(a)additional data for this feasible break@>= 33140begin best_pl_short[fit_class]:=shortfall; best_pl_glue[fit_class]:=g; 33141end 33142 33143@ Here we save these data in the active node representing a potential 33144line break. 33145 33146@<Store \(a)additional data in the new active node@>= 33147begin active_short(q):=best_pl_short[fit_class]; 33148active_glue(q):=best_pl_glue[fit_class]; 33149end 33150 33151@ @<Print additional data in the new active node@>= 33152begin print(" s="); print_scaled(active_short(q)); 33153if cur_p=null then print(" a=")@+else print(" g="); 33154print_scaled(active_glue(q)); 33155end 33156 33157@ Here we either reset |do_last_line_fit| or adjust the |par_fill_skip| 33158glue. 33159 33160@<Adjust \(t)the final line of the paragraph@>= 33161if active_short(best_bet)=0 then do_last_line_fit:=false 33162else begin q:=new_spec(glue_ptr(last_line_fill)); 33163 delete_glue_ref(glue_ptr(last_line_fill)); 33164 width(q):=width(q)+active_short(best_bet)-active_glue(best_bet); 33165 stretch(q):=0; glue_ptr(last_line_fill):=q; 33166 end 33167 33168@ When reading \.{\\patterns} while \.{\\savinghyphcodes} is positive 33169the current |lc_code| values are stored together with the hyphenation 33170patterns for the current language. They will later be used instead of 33171the |lc_code| values for hyphenation purposes. 33172 33173The |lc_code| values are stored in the linked trie analogous to patterns 33174$p_1$ of length~1, with |hyph_root=trie_r[0]| replacing |trie_root| and 33175|lc_code(p_1)| replacing the |trie_op| code. This allows to compress 33176and pack them together with the patterns with minimal changes to the 33177existing code. 33178 33179@d hyph_root==trie_r[0] {root of the linked trie for |hyph_codes|} 33180 33181@<Initialize table entries...@>= 33182 33183@ @<Store hyphenation codes for current language@>= 33184begin c:=cur_lang; first_child:=false; p:=0; 33185repeat q:=p; p:=trie_r[q]; 33186until (p=0)or(c<=so(trie_c[p])); 33187if (p=0)or(c<so(trie_c[p])) then 33188 @<Insert a new trie node between |q| and |p|, and 33189 make |p| point to it@>; 33190q:=p; {now node |q| represents |cur_lang|} 33191@<Store all current |lc_code| values@>; 33192end 33193 33194@ We store all nonzero |lc_code| values, overwriting any previously 33195stored values (and possibly wasting a few trie nodes that were used 33196previously and are not needed now). We always store at least one 33197|lc_code| value such that |hyph_index| (defined below) will not be zero. 33198 33199@<Store all current |lc_code| values@>= 33200p:=trie_l[q]; first_child:=true; 33201for c:=0 to 255 do 33202 if (lc_code(c)>0)or((c=255)and first_child) then 33203 begin if p=0 then 33204 @<Insert a new trie node between |q| and |p|, and 33205 make |p| point to it@> 33206 else trie_c[p]:=si(c); 33207 trie_o[p]:=qi(lc_code(c)); 33208 q:=p; p:=trie_r[q]; first_child:=false; 33209 end; 33210if first_child then trie_l[q]:=0@+else trie_r[q]:=0 33211 33212@ We must avoid to ``take'' location~1, in order to distinguish between 33213|lc_code| values and patterns. 33214 33215@<Pack all stored |hyph_codes|@>= 33216begin if trie_root=0 then for p:=0 to 255 do trie_min[p]:=p+2; 33217first_fit(hyph_root); trie_pack(hyph_root); 33218hyph_start:=trie_ref[hyph_root]; 33219end 33220 33221@ The global variable |hyph_index| will point to the hyphenation codes 33222for the current language. 33223 33224@d set_hyph_index== {set |hyph_index| for current language} 33225 if trie_char(hyph_start+cur_lang)<>qi(cur_lang) 33226 then hyph_index:=0 {no hyphenation codes for |cur_lang|} 33227 else hyph_index:=trie_link(hyph_start+cur_lang) 33228@# 33229@d set_lc_code(#)== {set |hc[0]| to hyphenation or lc code for |#|} 33230 if (hyph_index=0) or ((#)>255) then hc[0]:=lc_code(#) 33231 else if trie_char(hyph_index+#)<>qi(#) then hc[0]:=0 33232 else hc[0]:=qo(trie_op(hyph_index+#)) 33233 33234@<Glob...@>= 33235@!hyph_start:trie_pointer; {root of the packed trie for |hyph_codes|} 33236@!hyph_index:trie_pointer; {pointer to hyphenation codes for |cur_lang|} 33237 33238@ When |saving_vdiscards| is positive then the glue, kern, and penalty 33239nodes removed by the page builder or by \.{\\vsplit} from the top of a 33240vertical list are saved in special lists instead of being discarded. 33241 33242@d tail_page_disc==disc_ptr[copy_code] {last item removed by page builder} 33243@d page_disc==disc_ptr[last_box_code] {first item removed by page builder} 33244@d split_disc==disc_ptr[vsplit_code] {first item removed by \.{\\vsplit}} 33245 33246@<Glob...@>= 33247@!disc_ptr:array[copy_code..vsplit_code] of pointer; {list pointers} 33248 33249@ @<Set init...@>= 33250page_disc:=null; split_disc:=null; 33251 33252@ The \.{\\pagediscards} and \.{\\splitdiscards} commands share the 33253command code |un_vbox| with \.{\\unvbox} and \.{\\unvcopy}, they are 33254distinguished by their |chr_code| values |last_box_code| and 33255|vsplit_code|. These |chr_code| values are larger than |box_code| and 33256|copy_code|. 33257 33258@<Generate all \eTeX...@>= 33259primitive("pagediscards",un_vbox,last_box_code);@/ 33260@!@:page_discards_}{\.{\\pagediscards} primitive@> 33261primitive("splitdiscards",un_vbox,vsplit_code);@/ 33262@!@:split_discards_}{\.{\\splitdiscards} primitive@> 33263 33264@ @<Cases of |un_vbox| for |print_cmd_chr|@>= 33265else if chr_code=last_box_code then print_esc("pagediscards") 33266else if chr_code=vsplit_code then print_esc("splitdiscards") 33267 33268@ @<Handle saved items and |goto done|@>= 33269begin link(tail):=disc_ptr[cur_chr]; disc_ptr[cur_chr]:=null; 33270goto done; 33271end 33272 33273@ The \.{\\interlinepenalties}, \.{\\clubpenalties}, \.{\\widowpenalties}, 33274and \.{\\displaywidowpenalties} commands allow to define arrays of 33275penalty values to be used instead of the corresponding single values. 33276 33277@d inter_line_penalties_ptr==equiv(inter_line_penalties_loc) 33278@d club_penalties_ptr==equiv(club_penalties_loc) 33279@d widow_penalties_ptr==equiv(widow_penalties_loc) 33280@d display_widow_penalties_ptr==equiv(display_widow_penalties_loc) 33281 33282@<Generate all \eTeX...@>= 33283primitive("interlinepenalties",set_shape,inter_line_penalties_loc);@/ 33284@!@:inter_line_penalties_}{\.{\\interlinepenalties} primitive@> 33285primitive("clubpenalties",set_shape,club_penalties_loc);@/ 33286@!@:club_penalties_}{\.{\\clubpenalties} primitive@> 33287primitive("widowpenalties",set_shape,widow_penalties_loc);@/ 33288@!@:widow_penalties_}{\.{\\widowpenalties} primitive@> 33289primitive("displaywidowpenalties",set_shape,display_widow_penalties_loc);@/ 33290@!@:display_widow_penalties_}{\.{\\displaywidowpenalties} primitive@> 33291 33292@ @<Cases of |set_shape| for |print_cmd_chr|@>= 33293inter_line_penalties_loc: print_esc("interlinepenalties"); 33294club_penalties_loc: print_esc("clubpenalties"); 33295widow_penalties_loc: print_esc("widowpenalties"); 33296display_widow_penalties_loc: print_esc("displaywidowpenalties"); 33297 33298@ @<Fetch a penalties array element@>= 33299begin scan_int; 33300if (equiv(m)=null)or(cur_val<0) then cur_val:=0 33301else begin if cur_val>penalty(equiv(m)) then cur_val:=penalty(equiv(m)); 33302 cur_val:=penalty(equiv(m)+cur_val); 33303 end; 33304end 33305 33306@* \[54] System-dependent changes. 33307This section should be replaced, if necessary, by any special 33308modifications of the program 33309that are necessary to make \TeX\ work at a particular installation. 33310It is usually best to design your change file so that all changes to 33311previous sections preserve the section numbering; then everybody's version 33312will be consistent with the published program. More extensive changes, 33313which introduce new sections, can be inserted here; then only the index 33314itself will get a new section number. 33315@^system dependencies@> 33316 33317@* \[55] Index. 33318Here is where you can find all uses of each identifier in the program, 33319with underlined entries pointing to where the identifier was defined. 33320If the identifier is only one letter long, however, you get to see only 33321the underlined entries. {\sl All references are to section numbers instead of 33322page numbers.} 33323 33324This index also lists error messages and other aspects of the program 33325that you might want to look up some day. For example, the entry 33326for ``system dependencies'' lists all sections that should receive 33327special attention from people who are installing \TeX\ in a new 33328operating environment. A list of various things that can't happen appears 33329under ``this can't happen''. Approximately 40 sections are listed under 33330``inner loop''; these account for about 60\pct! of \TeX's running time, 33331exclusive of input and output. 33332