1%%File intro.tex 2\Chapter{Introduction} 3\Section{What is KBMAG?} 4{\KBMAG} (pronounced ``Kay-bee-mag\'\') 5stands for *Knuth--Bendix on Monoids, and Automatic Groups*. 6The purpose of this manual is to provide instructions for its use as a 7stand-alone package. It is also usable from within Version 3 of the {\GAP} 8system (see \cite{Sch92}), and instructions for its use in that setting appear 9as a chapter of the {\GAP} manual in the directory 'gapdoc'. 10Some information on the interface with {\GAP} is also described in Chapter 11"The Interface with GAP" of this manual. 12 13The applications of {\KBMAG} can be divided into six inter-relating 14categories. These are covered in detail in the Chapters 15"The Knuth--Bendix Program for Monoids", 16"The Automatic Groups Package", 17"The Knuth--Bendix Program for Cosets of Subgroups", 18"The Automatic Coset System and Subgroup Presentation Package", 19"Creating Subgroup Word Acceptors", and 20"Programs for Manipulating Finite State Automata", 21but we will summarize them here. 22Note that the third, fourth and fifth applications, relating to subgroups and 23cosets, were new in Version 2. 24 25Firstly, the program 'kbprog' can be used by itself to carry out the 26Knuth--Bendix completion procedure on monoids defined by a finite presentation 27and, in particular, on finitely presented groups. The user has a choice 28between the short-lex ordering and the recursive-path ordering on strings. 29Weighted lex and wreath-product orderings, are also available. 30The latter are defined on pages 46 -- 50 of \cite{Sims94}. 31(It would be easy to make other orderings available if there were ever any 32demand.) The implementation is designed more with speed of execution in mind 33than with minimizing space requirements; thus, a finite state automaton is 34always used to carry out word reduction, which can be space-consuming, 35particularly when the number of generators is large. For a more flexible 36Knuth--Bendix package, with more general orderings available, and a 37choice of word-reduction procedures, the user could try the Rutgers 38Knuth--Bendix package {\rkbp} written by Charles Sims. 39After running the program, the current set of reduction rules can be used to 40reduce words in the monoid generators. If the rewriting-system produced is 41confluent, then words will be correctly reduced to their irreducible normal 42form under the given ordering, and so the word problem can be solved efficiently 43in the monoid. 44 45Secondly, the package can be used to compute the finite state automata 46that constitute the automatic structure of a short-lex automatic group. 47This supercedes the olderexisting Warwick {\Automata} package for this purpose; 48the current program is generally faster than {\Automata}, and successful with 49more examples. For general information about automatic groups, see 50\cite{ECHLPT92}, and for more detailed information about the algorithms 51used in {\Automata}, see \cite{EHR91} or \cite{Holt94}. There are 52no fundamentally new algorithms employed in {\KBMAG}, but several 53improvements have been made to the various components. The most 54noticeable change is that a single multiplier automaton is now computed, 55with the states labeled to indicate for which generators they are success 56states, rather than a separate multiplier for each generator (although the 57separate multipliers can still be computed if desired). 58 59Computing an automatic structure is done in several steps, which are 60carried out by a number of individual 'C'-programs. A Bourne Shell script, 61called 'autgroup' has been provided to run these programs in the correct 62sequence with, hopefully, a sensible default choice of options. As an 63alternative to the use of this shell script, the individual programs can of 64course be run separately. The first step is to run the program 65'kbprog', but with the ``word-difference\'\'\ option, which is required 66for automatic group calculations. The next program (which can itself be 67divided up into different parts if required) computes the word-acceptor 68and multiplier automata for the group. The final program (which can again 69be split into parts), is the so-called axiom-checking process, which proves 70that the automata that have been calculated are correct. 71If the process runs to completion, then the automata can be used to reduce 72words in the group generators to their irreducible normal forms under the 73short-lex ordering, and so the word problem in the group can be solved 74efficiently. It is also possible to compute the rational growth function 75for the group by applying the program 'fsagrowth' (see Chapter 8) to its 76word-acceptor. 77 78If the group happens to be word-hyperbolic then, 79after computing the automatic structure as described above, it is 80possible to run a program 'gpgeowa', which computes an automaton that accepts 81all geodesic words in the group generators and a 2-variable automaton that 82accepts all pairs of geodesic words which represent the same group element. 83In fact, by a result of Papsoglu~(\cite{PAP}), if 'gpgeowa' 84succeeds, then it effectively verifies the word-hyperbolicity of the 85group. 86 87Thirdly, there is a variant of 'kbprog' named 'kbprogcos' that is designed to 88work on the right cosets of a subgroup of the input group, at the same time 89as on the group itself. (The input structure could, more generally, be a 90monoid, but in order for the calculations carried out by the program to 91make any mathematical sense, it is probably necessary for the substructure to be 92a group.) In case of completion, a presentation of the subgroup 93can be calculated on the given subgroup generators. This will always work in 94case of a subgroup of finite index, but the better known Reidemeister-Schreier 95and Todd-Coxeter related methods for subgroup presentations are likely to be 96superior in this case (see, for example, \cite {Neu81}). 97 98Fourthly, there are programs for calculating the finite state automata 99that constitute an automatic coset system for certain subgroups $H$ of 100short-lex automatic groups $G$. In particular, the coset word-acceptor, that 101accepts the unique short-lex minimal word that lies in each right coset 102of $H$ in $G$, and the coset general multiplier, that accepts the pair of words 103$(w_1,w_2)$ if and only if $w_1$ and $w_2$ are accepted by the coset 104word-acceptor and $Hw_1x=Hw_2$ for some monoid generator $x$ of $G$, are 105constructed. In successful cases, a presentation of $H$ can also be computed. 106 107The first programs to calculate these structures 108were written by Ian Redfern and described by him in his Ph.D.thesis 109(\cite{Red93}), together with some nice graphical applications to drawing 110limit sets of Kleinian groups. He also proved that automatic coset systems 111exist for all quasiconvex subgroups of word-hyperbolic groups. (Apart from 112this, there appear to be no theoretical results that will predict for 113which subgroups $H$ automatic coset systems will exist but, by running the 114programs, we find that they do indeed exist in many cases.) 115The programs in {\KBMAG} employ a different method that uses finite state 116automata that have deterministic transition tables, but may have more than one 117initial state. It has the advantage over Redfern\'s that the structures 118computed can be proved correct. 119 120As in the case of ordinary automatic structures, these calculations are 121done in several steps, and a Bourne Shell script is provided, called 122'autcos', that runs the appropriate programs in the correct order, with 123a sensible default choice of options. The steps correspond roughly to those of 124'autgroup', the first being 'kbprogcos' with the ``word-difference\'\' 125option. There is an optional additional step at the end to compute a 126presentation of the subgroup $H$. This usually contains many redundant 127generators and relators, and is output as a file that can be read immediately 128by {\GAP}, so that the presenation can be simplified. It is planned, at some 129stage in the future, the facility to compute a presentation using the original 130given generators of $H$ will be provided. 131 132The fifth application is for the construction of a finite state automaton 133that accepts a unique word for each element of a suitable subgroup $H$ of 134a short-lex automatic group $G$. An automatic structure must first be computed 135for $G$ by using 'autcos' or its constituents, and the word-acceptor for $H$ 136will accept precisely those words that lie in $H$ and are accepted by the 137word-acceptor for $G$. Theoretically, it will work whenever $H$ is a 138quasiconvex subgroup of $G$. The algorithm employed has some features in 139common with that described in Section 2 of \cite{Kar94} (see there also for 140theoretical details and further references). In particular, the 141correctness verification part of it is identical to that described in 142\cite{Kar94}, but for the construction of the word-acceptor for $H$, 143{\KBMAG} uses Knuth-Bendix based methods rather than Todd-Coxeter coset 144enumeration. 145 146The sixth and final application of the package is for general manipulation of 147finite state automata. Most of the routines provided operate on 148deterministic automata. However, there is a program 'nfadeterminize', 149which inputs a non-deterministic automaton and outputs a deterministic 150one with the same language. 151There is also a program 'midfadeterminize' that handles the 152important special case in which the input automaton is 153a multiple-initial-state automaton with deterministic transition table 154(see above). 155 156There are programs for carrying out logical operations 157on automata (in fact, some of the functions that they call are also used 158in the automatic group calculations). There are also programs to count 159and to enumerate the language of a finite state automaton. These are 160likely to be interesting to apply to the automata associated with 161an automatic group, or to the reduction automaton output by 'kbprog' in 162its stand-alone mode. There are a number of other, more dedicated, 163packages that are available for manipulating finite state automata and 164regular expressions. One such is {\Automate} (see \cite{Rie87} or 165\cite{ChH91}), and another {\Grail} (see ??) developed at Ontario. 166 167\Section{File formats} 168The programs in {\KBMAG} generally do all of their serious input and output 169from files, and only print diagnostics to 'stdout'. These files conform to 170the format of the Version 3 of the {\GAP} system \cite{Sch92} and so are 171readable from within {\GAP}. See the {\GAP} manual chapter in the 'gapdoc' 172directory, or Chapter "The Interface with GAP" for details of how to use 173{\KBMAG} from within {\GAP}. 174 175Two principal types of objects are handled, rewriting-systems and 176finite state automata. Each file contains a single {\GAP} declaration, which 177defines one object of one of these two types. This takes the general form 178 179'<identifier> \:= rec(<list>);' 180 181where <list> is a comma-separated list 182of field-definitions that specify the values of the fields of the object 183being defined. The two types of object are distinguished by the first 184such definition, which must be either 185 186|isRWS := true| \hspace{1cm} or \hspace{1cm} |isFSA := true| 187 188for rewriting-systems and finite state automata, respectively. 189Subgroups of the group or monoid defined by a rewriting system can also be 190defined in a separate file, as described in Chapter 191"Subgroup and Coset Files". 192 193To use the interface with {\GAP} in its current form, the name of a 194rewriting-system (i.e. the value of <identifier>) has to be '\_RWS'. 195This is because the relevant {\GAP} functions have '\_RWS' defined as an 196external variable, and expect a declaration of this form. 197 198Formal definitions of these formats have not yet been written down. 199There is a text document giving an informal, and not completely up-to-data 200description of the finite state automaton format in the {\KBMAG} 201directory 'doc'. Examples of files containing rewriting-systems can be 202found in the directories 'kb\_data' and 'ag\_data', and examples of 203finite state automata in 'fsa\_data'. Rather than attempt a description 204here, we suggest that the reader takes a look at some of these files. 205 206{\sf Handy Tip\:} Reading the transition table directly from a file that 207contains the definition of an automaton with a large number of states 208can be difficult, because the rows of the table are not numbered in the 209file. Such numbering can be introduced as comments, by running the 210program 'fsafilter' with the '-csn' option; see Section "fsafilter". 211An alternative and more versatile way of accessing the information 212about the automaton is to read it into {\GAP} and then use the {\GAP} 213functions provided; see Chapter "The Interface with GAP". 214 215For the Knuth--Bendix and automatic group applications, the user has to 216supply a rewriting-system that defines a monoid or group in a file as input. 217For applications related to cosets of subgroups, an additional file 218defining generators of the subgroup must also be supplied. 219The programs produce new files containing rewriting-systems or automata. 220In general, the user\'s file will contain a declaration of the above type, 221and the computed files will contain a declaration of form 222 223'<identifier>.<suffix> \:= rec(<list>);' 224 225for the same <identifier>, so that if the user reads these files from 226within {\GAP}, and the input file is read first, then, from {\GAP}\'s 227viewpoint, the computed files will define new components of the original 228record. (As mentioned above, for use with {\GAP}, the value of 229<identifier> should always be '\_RWS'.) 230 231The best method (and the one always employed by the author) 232of creating an input file is to copy an existing one and edit it. 233Let us briefly look at an example and discuss it. 234 235| 236#Free nilpotent group of rank 2 and class 2 237_RWS := rec( 238 isRWS := true, 239 generatorOrder := [c,C,b,B,a,A], 240 inverses := [C,c,B,b,A,a], 241 ordering := "recursive", 242 equations := [ 243 [b*a,a*b*c], [c*a,a*c], [c*b,b*c] 244 ] 245); 246| 247 248The first line is a comment, and is ignored by programs. (In general, 249comments are preceded by the `\#\'\ symbol and last to the end of the line.) 250To comply with the current {\GAP} interface, we name our rewriting-system 251'\_RWS'. 252As we saw above, the first field definition merely states that this is a 253definition of a rewriting system. 254 255The |ordering|\ field specifies which 256ordering on strings in the input alphabet is to be used by the Knuth--Bendix 257program. Although there is a default ordering, which is '\"shortlex\"', 258this field is required by the {\GAP} interface, so it is recommended that 259it should always be included. 260It is also possible 261to define the ordering by means of a command-line option to 'kbprog' 262however. The convention for this, and various other fields, is that 263command-line options override settings in the file in case of conflict. 264Full details of these options are given in Chapters 265"The Knuth--Bendix Program for Monoids" and "The Automatic Groups Package". 266 267The remaining three fields provide the definition of the group or monoid. 268First comes a list of generators. They must generate the structure as a monoid, 269even if it is a group; this means inverses should be included in the generator 270list. The field is named |generatorOrder|\ to emphasize the fact that the 271order is relevant - it will affect the ordering of strings in the alphabet. 272The names of the generators should be alphanumeric strings. In fact, 273dots and underscores are also allowed, for compatability with {\GAP}. 274Case is significant. It is recommended to use single letters, and use 275case-change for inversion, as we have done in this example. Another 276recommended policy is to use a common prefix followed by numbers; for 277example, a file output by {\GAP} might have its generators named 278'G.1, G.2, ..., G.n' for some $n \ge 0$. It is also permissible to name a 279generator <name>'\^-1', where <name> is the name of another generator, and 280the two are mutually inverse. 281 282The |inverses|\ field supplies the list of two-sided inverses of the 283generators, in the same order as the generators. This field must be present, 284but, in general, not all generators need have inverses, so the list could 285be empty, or contain gaps. For example, if only the first and third 286generators have inverses, and these are named 'i1' and 'i2', then the list 287would be written as '[i1,,i2]'. However, if generator 'A' is listed as the 288inverse of 'a', then 'a' must also be listed as the inverse of 'A'. 289There is currently no mechanism for inputting one-sided inverses (although 290that would be useful information for 'kbprog' under some circumstances, and so 291it may be introduced in the future). 292In the automatic groups applications, the structure must be a group, and all 293generators must have inverses specified in the list. 294(Currently, there is no way of specifying a default convention, such as 295inversion equals case-change. We may introduce such a convention in 296future, but this will depend on whether it can be made meaningful to 297{\GAP}.) 298 299Finally, there comes the |equations|\ field. This consists of a list of 300lists. Each inner list constitutes a defining relation for the 301monoid, and should have two entries, which are words in the generators, 302and are the left and right hand sides of the relation. 303The empty word is denoted (as in {\GAP}) by 'IdWord'. 304The word may contain brackets to any level, and positive powers. 305So, for example 'a\*(b\*(A\*c)\^4)\^3\*c\^12' is a valid word in the generators 306in the example above. 307Since the word is in the monoid generators, not all of which will 308necessarily have inverses, negative powers are not permitted. 309However, if a generator is named 'a\^-1', for example, then the <n>-th 310power of it should be written as 'a\^-'<n> rather than as 'a\^-1\^'<n>. 311 312It is not necessary to include defining relations of type 313'[a\*A,IdWord]' in the list of equations, where the generators 'a' and 'A' 314have been already been specified as mutually inverse in the ``|inverses|\'\' 315field, and this has not been done in the example above. On the other hand 316it does no harm if you do include them, and they will be included in 317lists of equations output by 'kbprog'. 318 319There are a number of other optional fields, which do not occur in this example, 320and provide further detailed instructions for the Knuth--Bendix program. 321See the description of this program in Section "kbprog" for details. 322 323For a description of the format of files defining subgroups, see 324Chapter "Subgroup and Coset Files". 325 326\Section{Exit Status of Programs and Meanings of Some Options} 327 328The exit status of nearly all of the programs is 0 if successful and 3291 if unsuccessful, and the program aborts with an error message 330to 'stderr', without outputting to file. 331One or two of the programs can also exit with status 2, which means that 332something unusual but non-fatal has occurred. The two most important are 333'kbprog' and 'gpaxioms'; see Sections "kbprog", "kbprog -wd", and 334"gpaxioms". 335 336Many of the options to the individual programs in {\KBMAG} have the same 337meaning wherever they occur. To avoid repeating them over and over again, we 338list some of them here. 339\begin{description} 340\item[|-v |] 341The verbose option. A certain amount of extra information on progress of 342calculation is printed out to stdout. 343(By default, only the main results of calculations are printed out as 344comments to stdout.) 345\item[|-silent|] 346There is no output at all to 'stdout'. 347The only output to the terminal is error messages when the program 348aborts for some reason. 349\item[|-vv |] 350The very-verbose option. A huge amount of diagnostic information is printed out, 351much of which may seem incomprehensible. 352\item[|-l |] 353Large hash-tables. When constructing finite state automata, the states 354are identified as sequences of integers, sometimes of varying length. 355The sequences are stored in open hash-tables. Space is allocated as 356required in blocks. The default size of the block is $2^{18}$ bytes; 357with the '-l' option it becomes $2^{21}$ bytes. This makes things 358run more efficiently when constructing very large automata. 359There is also a '-h' option (huge hash-tables). 360\item[|-ip d|] 361Store finite state automata in dense format, which means that the 362transition table is stored as an $ne \times ns$ array, where $ne$ and 363$ns$ are the sizes of the alphabet and state-set, respectively. 364This is always the default, and is the fastest. It can be expensive on 365space, however, particularly when the alphabet is large. If you run out 366of space, or a program starts to swap heavily, then it may be worth trying 367sparse storage. 368\item[|-ip s|[<dr>]] 369Store finite state automata in sparse format. This means that the 370transitions from each state are stored as a sequence of edge-target 371pairs. With large automata, with large alphabet (which means size more 372than about 5 or 6), it normally requires significantly less space than 373dense format. The '[<dr>]' option (dense rows) is a compromise. 374Here <dr> should be a positive integer (something like 1000 might be a good 375choice). The transitions from the first <dr> states are stored in dense 376format, and the remainder in sparse format. 377\item[|-op d|] 378Automata are written to files in dense format. This is the default 379for one-variable automata (such as the word acceptor in an automatic 380group). 381\item[|-op s|] 382Automata are written to files in sparse format. This is the default 383for two-variable automata (such as the multiplier in an automatic 384group). 385\end{description} 386