<Chapter Label="ch:intro"><Heading>Introduction and Example</Heading>

The main purpose of the &GAPDoc; package is to define a file format for
documentation of &GAP; programs and packages (see <Cite Key="GAP4" />). The
difficulty is that such documentation should be readable in several output
formats. For example, it should be possible to read the documentation inside
the terminal in which &GAP; is running (a text mode) and there should be a
printable version in high typesetting quality (produced by some version of
&TeX;). It is also popular to view &GAP;'s online help with a Web browser
via an HTML version of the documentation. Nowadays one can use &LaTeX; and
standard viewer programs to produce and view on the screen <C>dvi</C>- or
<C>pdf</C>-files with full support of internal and external hyperlinks.
Certainly there will be other interesting document formats and tools in this
direction in the future. <P/>

Our aim is to find a <Emph>format for writing</Emph> the documentation which
allows a relatively easy translation into the output formats just mentioned
and which hopefully makes it easy to translate to future output formats as
well. <P/>

To make documentation written in the &GAPDoc; format directly usable, we
also provide a set of programs, called converters, which produce text-,
hyperlinked &LaTeX;- and HTML-output versions of a &GAPDoc; document. These
programs are developed by the first named author. They run completely inside
&GAP;, i.e., no external programs are needed for the conversion itself. You
only need <C>latex</C> and <C>pdflatex</C> to process the &LaTeX; output.
The converters are described in Chapter <Ref Chap="ch:conv"/>.

<Section Label="sec:XML"><Heading>XML</Heading>
<Index >XML</Index>

The definition of the &GAPDoc; format uses XML, the <Q>eXtensible Markup
Language</Q>.
This is a standard (defined by the W3C consortium, see
<URL>http://www.w3c.org</URL>) which lays down a syntax for adding markup to
a document or to some data. It allows one to define document structures by
introducing markup <E>elements</E> and certain relations between them. This
is done in a <E>document type definition</E>. The file <F>gapdoc.dtd</F>
contains such a document type definition and is the central part of the
&GAPDoc; package. <P/>

The easiest way to get a good idea of this is probably to look at an
example. Appendix <Ref Appendix="app:3k+1" /> contains a short but
complete &GAPDoc; document for a fictitious share package. In the next
section we will go through this document, explain basic facts about XML and
the &GAPDoc; document type, and give pointers to more details in later parts
of this documentation. <P/>

In Section <Ref Sect="sec:faq" />, the last section of this introductory
chapter, we try to answer some general questions about the decisions which
led to the &GAPDoc; package.

</Section>

<Section Label="sec:3k+1expl"><Heading>A complete example</Heading>

In this section we recall the lines from the example document in
Appendix <Ref Appendix="app:3k+1" /> and give some explanations.

<Listing Type="from 3k+1.xml">
<![CDATA[<?xml version="1.0" encoding="UTF-8"?> ]]>
</Listing>

This line just tells a human reader and computer programs that the file
is a document with XML markup and that the text is encoded in the UTF-8
character set (other common encodings are ASCII or ISO-8859-X encodings).

<Listing Type="from 3k+1.xml">
<![CDATA[<!-- A complete "fake package" documentation
-->
]]></Listing>

Everything in an XML file between <Q><C>&lt;!--</C></Q> and
<Q><C>--&gt;</C></Q> is a comment and not part of the document content.
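<P/>
As a small illustration, which is not taken from <F>3k+1.xml</F>, the
following sketch shows a comment spanning two lines and a comment that
follows markup on the same line. Note that the XML standard does not allow
the string <C>--</C> inside a comment, so comments cannot be nested.

<Listing Type="illustrative sketch">
<![CDATA[<!-- This comment
     spans two lines. -->
<P/> <!-- A comment can follow markup on the same line. -->
]]></Listing>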

<Listing Type="from 3k+1.xml">
<![CDATA[<!DOCTYPE Book SYSTEM "gapdoc.dtd">
]]></Listing>

This line says that the document contains markup which is defined in
the system file <F>gapdoc.dtd</F> and that the markup obeys certain
rules defined in that file (the ending <F>dtd</F> means <Q>document type
definition</Q>). It further says that the actual content of the document
consists of an element with name <Q>Book</Q>. And indeed we can see that the
remaining part of the file is enclosed as follows:

<Listing Type="from 3k+1.xml">
<![CDATA[<Book Name="3k+1">
  [...] (content omitted)
</Book>
]]></Listing>

This demonstrates the basics of the markup in XML. This part of the document
is an <Q>element</Q>. It consists of the <Q>start tag</Q> <C><![CDATA[<Book
Name="3k+1">]]></C>, the <Q>element content</Q> and the <Q>end tag</Q>
<C><![CDATA[</Book>]]></C> (end tags always start with <C>&lt;/</C>). This
element also has an <Q>attribute</Q> <C>Name</C> whose <Q>value</Q> is
<C>3k+1</C>.
<P/>

If you know HTML, this will look familiar to you. But there are some
important differences: The element name <C>Book</C> and attribute name
<C>Name</C> are <E>case sensitive</E>. The value of an attribute must
<E>always</E> be enclosed in quotes. In XML <E>every</E> element has a start
and end tag (which can be combined for elements defined as <Q>empty</Q>, see
for example <C>&lt;TableOfContents/&gt;</C> below).
<P/>

If you know &LaTeX;, you are familiar with quite different
types of markup, for example: The equivalent of the <C>Book</C>
element in &LaTeX; is <C>\begin{document} ...
\end{document}</C>. The sectioning in &LaTeX; is not
done by explicit start and end markup, but implicitly via heading
commands like <C>\section</C>. Other markup is done by using
braces <C>{}</C> and putting some commands inside.
And for
mathematical formulae one can use the <C>$</C> for the start
<E>and</E> the end of the markup. In XML <E>all</E> markup looks similar to
that of the <C>Book</C> element. <P/>

The content of the book starts with a title page.

<Listing Type="from 3k+1.xml">
<![CDATA[<TitlePage>
  <Title>The <Package>ThreeKPlusOne</Package> Package</Title>
  <Version>Version 42</Version>
  <Author>Dummy Authör
    <Email>3kplusone@dev.null</Email>
  </Author>

  <Copyright>&copyright; 2000 The Author. <P/>
    You can do with this package what you want.<P/> Really.
  </Copyright>
</TitlePage>
]]></Listing>

The content of the <C>TitlePage</C> element consists again of elements. In
Chapter <Ref Chap="DTD" /> we describe which elements are allowed
within a <C>TitlePage</C> and that their ordering is prescribed in this
case. In the (stupid) name of the author you see that a German umlaut is
used directly (in the UTF-8 encoding declared in the first line of the file).
<P/>

Contrary to &LaTeX;- or HTML-files this markup does not say anything about
the actual layout of the title page in any output version of the document.
It just adds information about the <E>meaning</E> of pieces of text. <P/>

Within the <C>Copyright</C> element there are two more things to learn about
XML markup. The <C>&lt;P/&gt;</C> is a complete element. It is a combined
start and end tag. This shortcut is allowed for elements which are defined
to be always <Q>empty</Q>, i.e., to have no content. You may have already
guessed that <C>&lt;P/&gt;</C> is used as a paragraph separator. Note that
empty lines do not separate paragraphs (contrary to &LaTeX;). <P/>

The other construct we see here is <C>&amp;copyright;</C>. This is an
example of an <Q>entity</Q> in XML and is a macro for some substitution
text.
Here we use an entity as a shortcut for a complicated expression which
ensures that the term <E>copyright</E> is printed as some text
like <C>(C)</C> in text terminal output and as a copyright character in
other output formats. In &GAPDoc; we predefine some entities.
Certain <Q>special characters</Q> must be typed via entities, for example
<Q>&lt;</Q>, <Q>&gt;</Q> and <Q>&amp;</Q>, to avoid a misinterpretation as
XML markup. It is possible to define
additional entities for your document inside the <C>&lt;!DOCTYPE ...&gt;</C>
declaration, see <Ref Subsect="GDent" />. <P/>

Note that elements in XML must always be properly nested, as in this
example. A construct like <C><![CDATA[<a><b>...</a></b>]]></C> is <E>not</E>
allowed.

<Listing Type="from 3k+1.xml">
<![CDATA[<TableOfContents/>
]]></Listing>

This is another example of an <Q>empty element</Q>. It just means that a
table of contents for the whole document should be included in any output
version of the document.
<P/>
After this the main text of the document follows inside certain sectioning
elements:

<Listing Type="from 3k+1.xml">
<![CDATA[<Body>
  <Chapter> <Heading>The <M>3k+1</M> Problem</Heading>
    <Section Label="sec:theory"> <Heading>Theory</Heading>
      [...] (content omitted)
    </Section>
    <Section> <Heading>Program</Heading>
      [...] (content omitted)
    </Section>
  </Chapter>
</Body>
]]></Listing>

These elements are used similarly to <Q>\chapter</Q> and
<Q>\section</Q> in &LaTeX;. But note that the explicit end tags are
necessary here.
<P/>
The sectioning elements accept an optional attribute <Q>Label</Q>.
This can be used for referring to a section inside the document.
<P/>
The text of the first section starts as follows. The whitespace in the text
is unimportant and the indenting is not necessary.

<Listing Type="from 3k+1.xml">

<![CDATA[      Let <M>k \in &NN;</M> be a natural number. We consider the
      sequence <M>n(i, k), i \in &NN;,</M> with <M>n(1, k) = k</M> and
      else
]]></Listing>

Here we come to the interesting question of how to type mathematical formulae
in a &GAPDoc; document. We did not find any alternative to writing formulae
in &TeX; syntax. (There is MathML, but even simple formulae contain a lot
of markup, become quite unreadable and are cumbersome to type.
Furthermore there seem to be no tools available which translate such
formulae in a nice way into &TeX; and text.) So, formulae are essentially
typed as in &LaTeX;. (Actually, it is also possible to type Unicode
characters of some mathematical symbols directly, or via an entity like the
<C>&amp;NN;</C> above.) There are three types of elements containing
formulae: <Q>M</Q>, <Q>Math</Q> and <Q>Display</Q>. The first two are for
in-text formulae and the third is for displayed formulae. Here <Q>M</Q> and
<Q>Math</Q> are equivalent when translating a &GAPDoc; document into
&LaTeX;. But they are handled differently for terminal text (and HTML)
output. For the content of an <Q>M</Q>-element there are defined rules for a
translation into well readable terminal text. More complicated formulae are
in <Q>Math</Q> or <Q>Display</Q> elements and they are just printed as they
are typed in text output. So, to make a section well readable inside a
terminal window you should try to put as many formulae as possible into
<Q>M</Q>-elements. In our example text we used the notation <C>n(i, k)</C>
instead of <C>n_i(k)</C> because it is easier to read in text mode. See
Sections <Ref Sect="GDformulae"/> and <Ref Sect="sec:misc" /> for
more details. <P/>

A few lines further on we find two non-internal references.

<Listing Type="from 3k+1.xml">
<![CDATA[      problem, see <Cite Key="Wi98"/> or
      <URL>http://mathsrv.ku-eichstaett.de/MGF/homes/wirsching/</URL>
]]></Listing>

The first, within the <Q>Cite</Q>-element, is the citation of a book. In
&GAPDoc; we use the widely used &BibTeX; database format for reference
lists. This does not use XML but has a well documented structure which is
easy to parse. And many people have collections of references readily
available in this format. The reference list in an output version of the
document is produced with the empty element

<Listing Type="from 3k+1.xml">
<![CDATA[<Bibliography Databases="3k+1" />
]]></Listing>

close to the end of our example file. The attribute <Q>Databases</Q>
gives the name(s) of the database (<F>.bib</F>) files which contain the
references.
<P/>

Putting a Web address into a <Q>URL</Q>-element allows one to create a
hyperlink in output formats which allow this.
<P/>

The second section of our example contains a special kind of subsection
defined in &GAPDoc;.

<Listing Type="from 3k+1.xml">
<![CDATA[    <ManSection>
      <Func Name="ThreeKPlusOneSequence" Arg="k[, max]"/>
      <Description>
        This function computes for a natural number <A>k</A> the
        beginning of the sequence <M>n(i, k)</M> defined in section
        <Ref Sect="sec:theory"/>. The sequence stops at the first
        <M>1</M> or at <M>n(<A>max</A>, k)</M>, if <A>max</A> is
        given.
<Example>
gap> ThreeKPlusOneSequence(101);
"Sorry, not yet implemented. Wait for Version 84 of the package"
</Example>
      </Description>
    </ManSection>
]]></Listing>

A <Q>ManSection</Q> contains the description of some function, operation,
method, filter and so on.
The <Q>Func</Q>-element describes the name of a
<E>function</E> (there are also similar elements <Q>Oper</Q>, <Q>Meth</Q>,
<Q>Filt</Q> and so on) and names for its arguments, with optional arguments
enclosed in square brackets. See Section <Ref Sect="sec:mansect" /> for
more details. <P/>

In the <Q>Description</Q> we write the argument names as <Q>A</Q>-elements.
A good description of a function should usually contain an example of its
use. For this there are some verbatim-like elements in &GAPDoc;, like
<Q>Example</Q> above (here whitespace clearly matters, which causes the
slightly strange indentation). <P/>

The text contains an internal reference to the first section via the
explicitly defined label <C>sec:theory</C>.
<P/>

The first section also contains a <Q>Ref</Q>-element which refers to the
function described here. Note that there is no explicit label for such a
reference. The pair <C><![CDATA[<Func Name="ThreeKPlusOneSequence" Arg="k[,
max]"/>]]></C> and <C><![CDATA[<Ref Func="ThreeKPlusOneSequence"/>]]></C>
does the cross referencing (and hyperlinking if possible) implicitly via the
name of the function.
<P/>

Here is one further element from our example document which we want to
explain.


<Listing Type="from 3k+1.xml">
<![CDATA[<TheIndex/>
]]></Listing>

This is again an empty element which just says that an output version of the
document should contain an index. Many entries for the index are generated
automatically because the <Q>Func</Q> and similar elements implicitly
produce such entries. It is also possible to include explicit additional
entries in the index.

</Section>


<Section Label="sec:faq"><Heading>Some questions</Heading>

<List>
  <Mark>Are those XML files too ugly to read and edit?</Mark>
  <Item>
    Just have a look and decide for yourself. The markup needs more characters
    than most &TeX; or &LaTeX; markup.
    But the structure of the document is
    easier to see. If you configure your favorite editor well, you do not need
    more key strokes for typing the markup than in &LaTeX;.
  </Item>

  <Mark>Why do we not use &LaTeX; alone?</Mark>
  <Item>
    &LaTeX; is good for writing books. But &LaTeX; files are generally
    difficult to parse and to process to other output formats like text
    for browsing in a terminal window or HTML (or new formats which may
    become popular in the future). &GAPDoc; markup is one step more
    abstract than &LaTeX; insofar as it describes meaning instead of
    appearance of text. The inner workings of &LaTeX; are too complicated
    to learn without pain, which makes it difficult to overcome problems
    that occur occasionally.
  </Item>

  <Mark>Why XML and not a newly defined markup language?</Mark>
  <Item>
    XML is a well defined standard that is more and more widely used. Lots
    of people have thought about it. Years of experience with SGML went into the
    design. It is easy to explain and easy to parse, and lots of tools are
    available, with more to come in the future.
  </Item>
</List>


</Section>

</Chapter>