1%%File intro.tex
2\Chapter{Introduction}
3\Section{What is KBMAG?}
4{\KBMAG} (pronounced ``Kay-bee-mag\'\')
5stands for *Knuth--Bendix on Monoids, and Automatic Groups*.
6The purpose of this manual is to provide instructions for its use as a
7stand-alone package.  It is also usable from within Version 3 of the {\GAP}
8system (see \cite{Sch92}), and instructions for its use in that setting appear
9as a chapter of the {\GAP} manual in the directory 'gapdoc'.
10Some information on the interface with {\GAP} is also described in Chapter
11"The Interface with GAP" of this manual.
12
13The applications of {\KBMAG} can be divided into six inter-relating
14categories. These are covered in detail in the Chapters
15"The Knuth--Bendix Program for Monoids",
16"The Automatic Groups Package",
17"The Knuth--Bendix Program for Cosets of Subgroups",
18"The Automatic Coset System and Subgroup Presentation Package",
19"Creating Subgroup Word Acceptors", and
20"Programs for Manipulating Finite State Automata",
21but we will summarize them here.
22Note that the third, fourth and fifth applications, relating to subgroups and
23cosets, were new in Version 2.
24
25Firstly, the program 'kbprog' can be used by itself to carry out the
26Knuth--Bendix completion procedure on monoids defined by a finite presentation
27and, in particular, on finitely presented groups. The user has a choice
28between the short-lex ordering and the recursive-path ordering on strings.
29Weighted lex and wreath-product orderings, are also available.
30The latter are defined on pages 46 -- 50 of \cite{Sims94}.
31(It would be easy to make other orderings available if there were ever any
32demand.) The implementation is designed more with speed of execution in mind
33than with minimizing space requirements; thus, a finite state automaton is
34always used to carry out word reduction, which can be space-consuming,
35particularly when the number of generators is large. For a more flexible
36Knuth--Bendix package, with more general orderings available, and a
37choice of word-reduction procedures, the user could try the Rutgers
38Knuth--Bendix package {\rkbp} written by Charles Sims.
39After running the program, the current set of reduction rules can be used to
40reduce words in the monoid generators. If the rewriting-system produced is
41confluent, then words will be correctly reduced to their irreducible normal
42form under the given ordering, and so the word problem can be solved efficiently
43in the monoid.
44
45Secondly, the package can be used to compute the finite state automata
46that constitute the automatic structure of a short-lex automatic group.
47This supercedes the olderexisting Warwick {\Automata} package for this purpose;
48the current program is generally faster than {\Automata}, and successful with
49more examples. For general information about automatic groups, see
50\cite{ECHLPT92}, and for more detailed information about the algorithms
51used in {\Automata}, see \cite{EHR91} or \cite{Holt94}. There are
52no fundamentally new algorithms employed in {\KBMAG}, but several
53improvements have been made to the various components. The most
54noticeable change is that a single multiplier automaton is now computed,
55with the states labeled to indicate for which generators they are success
56states, rather than a separate multiplier for each generator (although the
57separate multipliers can still be computed if desired).
58
59Computing an automatic structure is done in several steps, which are
60carried out by a number of individual 'C'-programs. A Bourne Shell script,
61called 'autgroup' has been provided to run these programs in the correct
62sequence with, hopefully, a sensible default choice of options. As an
63alternative to the use of this shell script, the individual programs can of
64course be run separately. The first step is to run the program
65'kbprog', but with the ``word-difference\'\'\ option, which is required
66for automatic group calculations. The next program (which can itself be
67divided up into different parts if required) computes the word-acceptor
68and multiplier automata for the group. The final program (which can again
69be split into parts), is the so-called axiom-checking process, which proves
70that the automata that have been calculated are correct.
71If the process runs to completion, then the automata can be used to reduce
72words in the group generators to their irreducible normal forms under the
73short-lex ordering, and so the word problem in the group can be solved
74efficiently. It is also possible to compute the rational growth function
75for the group by applying the program 'fsagrowth' (see Chapter 8) to its
76word-acceptor.
77
78If the group happens to be word-hyperbolic then,
79after computing the automatic structure as described above, it is
80possible to run a program 'gpgeowa', which computes an automaton that accepts
81all geodesic words in the group generators and a 2-variable automaton that
82accepts all pairs of geodesic words which represent the same group element.
83In fact, by a result of Papsoglu~(\cite{PAP}), if 'gpgeowa'
84succeeds, then it effectively verifies the word-hyperbolicity of the
85group.
86
87Thirdly, there is a variant of 'kbprog' named 'kbprogcos' that is designed to
88work on the right cosets of a subgroup of the input group, at the same time
89as on the group itself. (The input structure could, more generally, be a
90monoid, but in order for the calculations carried out by the program to
91make any mathematical sense, it is probably necessary for the substructure to be
92a group.) In case of completion, a presentation of the subgroup
93can be calculated on the given subgroup generators.  This will always work in
94case of a subgroup of finite index, but the better known Reidemeister-Schreier
95and Todd-Coxeter related methods for subgroup presentations are likely to be
96superior in this case (see, for example, \cite {Neu81}).
97
98Fourthly, there are programs for calculating the finite state automata
99that constitute an automatic coset system for certain subgroups $H$ of
100short-lex automatic groups $G$. In particular, the coset word-acceptor, that
101accepts the unique short-lex minimal word that lies in each right coset
102of $H$ in $G$, and the coset general multiplier, that accepts the pair of words
103$(w_1,w_2)$ if and only if $w_1$ and $w_2$ are accepted by the coset
104word-acceptor and $Hw_1x=Hw_2$ for some monoid generator $x$ of $G$, are
105constructed. In successful cases, a presentation of $H$ can also be computed.
106
107The first programs to calculate these structures
108were written by Ian Redfern and described by him in his Ph.D.thesis
109(\cite{Red93}), together with some nice graphical applications to drawing
110limit sets of Kleinian groups. He also proved that automatic coset systems
111exist for all quasiconvex subgroups of word-hyperbolic groups. (Apart from
112this, there appear to be no theoretical results that will predict for
113which subgroups $H$ automatic coset systems will exist but, by running the
114programs, we find that they do indeed exist in many cases.)
115The programs in {\KBMAG} employ a different method that uses finite state
116automata that have deterministic transition tables, but may have more than one
117initial state. It has the advantage over Redfern\'s that the structures
118computed can be proved correct.
119
120As in the case of ordinary automatic structures, these calculations are
121done in several steps, and a Bourne Shell script is provided, called
122'autcos', that  runs the appropriate programs in the correct order, with
123a sensible default choice of options. The  steps correspond roughly to those of
124'autgroup', the first being 'kbprogcos' with the ``word-difference\'\'
125option.  There is an optional additional step at the end to compute a
126presentation of the subgroup $H$. This usually contains many redundant
127generators and relators, and is output as a file that can be read immediately
128by {\GAP}, so that the presenation can be simplified. It is planned, at some
129stage in the future, the facility to compute a presentation using the original
130given generators of $H$ will be provided.
131
132The fifth application is for the construction of a finite state automaton
133that accepts a unique word for each element of a suitable subgroup $H$ of
134a short-lex automatic group $G$. An automatic structure must first be computed
135for $G$ by using 'autcos' or its constituents, and the word-acceptor for $H$
136will accept precisely those words that lie in $H$ and are accepted by the
137word-acceptor for $G$. Theoretically, it will work whenever $H$ is a
138quasiconvex subgroup of $G$. The algorithm employed has some features in
139common with that described in Section 2 of \cite{Kar94} (see there also for
140theoretical details and further references). In particular, the
141correctness verification part of it is identical to that described in
142\cite{Kar94}, but for the construction of the word-acceptor for $H$,
143{\KBMAG} uses Knuth-Bendix based methods rather than Todd-Coxeter coset
144enumeration.
145
146The sixth and final application of the package is for general manipulation of
147finite state automata. Most of the routines provided operate on
148deterministic automata. However, there is a program 'nfadeterminize',
149which inputs a non-deterministic automaton and outputs a deterministic
150one with the same language.
151There is also a program 'midfadeterminize' that handles the
152important special case in which the input automaton is
153a multiple-initial-state automaton with deterministic transition table
154(see above).
155
156There are programs for carrying out logical operations
157on automata (in fact, some of the functions that they call are also used
158in the automatic group calculations).  There are also programs to count
159and to enumerate the language of a finite state automaton. These are
160likely to be interesting to apply to the automata associated with
161an automatic group, or to the reduction automaton output by 'kbprog' in
162its stand-alone mode. There are a number of other, more dedicated,
163packages that are available for manipulating finite state automata and
164regular expressions. One such is {\Automate} (see \cite{Rie87} or
165\cite{ChH91}), and another {\Grail} (see ??) developed at Ontario.
166
167\Section{File formats}
168The programs in {\KBMAG} generally do all of their serious input and output
169from files, and only print diagnostics to 'stdout'. These files conform to
170the format of the Version 3 of the {\GAP} system \cite{Sch92} and so are
171readable from within {\GAP}. See the {\GAP} manual chapter in the 'gapdoc'
172directory, or Chapter "The Interface with GAP" for details of how to use
173{\KBMAG} from within {\GAP}.
174
175Two principal types of objects are handled, rewriting-systems and
176finite state automata. Each file contains a single {\GAP} declaration, which
177defines one object of one of these two types. This takes the general form
178
179'<identifier> \:= rec(<list>);'
180
181where <list> is a comma-separated list
182of field-definitions that specify the values of the fields of the object
183being defined. The two types of object are distinguished by the first
184such definition, which must be either
185
186|isRWS := true| \hspace{1cm}  or \hspace{1cm}  |isFSA := true|
187
188for rewriting-systems and finite state automata, respectively.
189Subgroups of the group or monoid defined by a rewriting system can also be
190defined in a separate file, as described in Chapter
191"Subgroup and Coset Files".
192
193To use the interface with {\GAP} in its current form, the name of a
194rewriting-system (i.e. the value of <identifier>) has to be '\_RWS'.
195This is because the relevant {\GAP} functions have '\_RWS' defined as an
196external variable, and expect a declaration of this form.
197
198Formal definitions of these formats have not yet been written down.
199There is a text document giving an informal, and not completely up-to-data
200description of the finite state automaton format in the {\KBMAG}
201directory 'doc'. Examples of files containing rewriting-systems can be
202found in the directories 'kb\_data' and 'ag\_data', and examples of
203finite state automata in 'fsa\_data'. Rather than attempt a description
204here, we suggest that the reader takes a look at some of these files.
205
206{\sf Handy Tip\:} Reading the transition table directly from a file that
207contains the definition of an automaton with a large number of states
208can be difficult, because the rows of the table are not numbered in the
209file. Such numbering can be introduced as comments, by running the
210program 'fsafilter' with the '-csn' option; see Section "fsafilter".
211An alternative and more versatile way of accessing the information
212about the automaton is to read it into {\GAP} and then use the {\GAP}
213functions provided; see Chapter "The Interface with GAP".
214
215For the Knuth--Bendix and automatic group applications, the user has to
216supply a rewriting-system that defines a monoid or group in a file as input.
217For applications related to cosets of subgroups, an additional file
218defining generators of the subgroup must also be supplied.
219The programs produce new files containing rewriting-systems or automata.
220In general, the user\'s file will contain a declaration of the above type,
221and the computed files will contain a declaration of form
222
223'<identifier>.<suffix> \:= rec(<list>);'
224
225for the same <identifier>, so that if the user reads these files from
226within {\GAP}, and the input file is read first, then, from {\GAP}\'s
227viewpoint, the computed files will define new components of the original
228record. (As mentioned above, for use with {\GAP}, the value of
229<identifier> should always be '\_RWS'.)
230
231The best method (and the one always employed by the author)
232of creating an input file is to copy an existing one and edit it.
233Let us briefly look at an example and discuss it.
234
235|
236#Free nilpotent group of rank 2 and class 2
237_RWS := rec(
238           isRWS := true,
239  generatorOrder := [c,C,b,B,a,A],
240        inverses := [C,c,B,b,A,a],
241        ordering := "recursive",
242       equations := [
243         [b*a,a*b*c], [c*a,a*c], [c*b,b*c]
244       ]
245);
246|
247
248The first line is a comment, and is ignored by programs. (In general,
249comments are preceded by the `\#\'\ symbol and last to the end of the line.)
250To comply with the current {\GAP} interface, we name our rewriting-system
251'\_RWS'.
252As we saw above, the first field definition merely states that this is a
253definition of a rewriting system.
254
255The |ordering|\ field specifies which
256ordering on strings in the input alphabet is to be used by the Knuth--Bendix
257program. Although there is a default ordering, which is '\"shortlex\"',
258this field is required by the {\GAP} interface, so it is recommended that
259it should always be included.
260It is also possible
261to define the ordering by means of a command-line option to 'kbprog'
262however. The convention for this, and various other fields, is that
263command-line options override settings in the file in case of conflict.
264Full details of these options are given in Chapters
265"The Knuth--Bendix Program for Monoids" and "The Automatic Groups Package".
266
267The remaining three fields provide the definition of the group or monoid.
268First comes a list of generators. They must generate the structure as a monoid,
269even if it is a group; this means inverses should be included in the generator
270list. The field is named |generatorOrder|\ to emphasize the fact that the
271order is relevant - it will affect the ordering of strings in the alphabet.
272The names of the generators should be alphanumeric strings. In fact,
273dots and underscores are also allowed, for compatability with {\GAP}.
274Case is significant. It is recommended to use single letters, and use
275case-change for inversion, as we have done in this example. Another
276recommended policy is to use a common prefix followed by numbers; for
277example, a file output by {\GAP} might have its generators named
278'G.1, G.2, ..., G.n' for some $n \ge 0$. It is also permissible to name a
279generator <name>'\^-1', where <name> is the name of another generator, and
280the two are mutually inverse.
281
282The |inverses|\ field supplies the list of two-sided inverses of the
283generators, in the same order as the generators. This field must be present,
284but, in general, not all generators need have inverses, so the list could
285be empty, or contain gaps. For example, if only the first and third
286generators have inverses, and these are named 'i1' and 'i2', then the list
287would be written as '[i1,,i2]'.  However, if generator 'A' is listed as the
288inverse of 'a', then 'a' must also be listed as the inverse of 'A'.
289There is currently no mechanism for inputting one-sided inverses (although
290that would be useful information for 'kbprog' under some circumstances, and so
291it may be introduced in the future).
292In the automatic groups applications, the structure must be a group, and all
293generators must have inverses specified in the list.
294(Currently, there is no way of specifying a default convention, such as
295inversion equals case-change. We may introduce such a convention in
296future, but this will depend on whether it can be made meaningful to
297{\GAP}.)
298
299Finally, there comes the |equations|\ field. This consists of a list of
300lists. Each inner list constitutes a defining relation for the
301monoid, and should have two entries, which are words in the generators,
302and are the left and right hand sides of the relation.
303The empty word is denoted (as in {\GAP}) by 'IdWord'.
304The word may contain brackets to any level, and positive powers.
305So, for example 'a\*(b\*(A\*c)\^4)\^3\*c\^12' is a valid word in the generators
306in the example above.
307Since the word is in the monoid generators, not all of which will
308necessarily have inverses, negative powers are not permitted.
309However, if a generator is named 'a\^-1', for example, then the <n>-th
310power of it should be written as 'a\^-'<n> rather than as 'a\^-1\^'<n>.
311
312It is not necessary to include defining relations of type
313'[a\*A,IdWord]' in the list of equations, where the generators 'a' and 'A'
314have been already been specified as mutually inverse in the ``|inverses|\'\'
315field, and this has not been done in the example above. On the other hand
316it does no harm if you do include them, and they will be included in
317lists of equations output by 'kbprog'.
318
319There are a number of other optional fields, which do not occur in this example,
320and provide further detailed instructions for the Knuth--Bendix program.
321See the description of this program in Section "kbprog" for details.
322
323For a description of the format of files defining subgroups, see
324Chapter "Subgroup and Coset Files".
325
326\Section{Exit Status of Programs and Meanings of Some Options}
327
328The exit status of nearly all of the programs is 0 if successful and
3291 if unsuccessful, and the program aborts with an error message
330to 'stderr', without outputting to file.
331One or two of the programs can also exit with status 2, which means that
332something unusual but non-fatal has occurred. The two most important are
333'kbprog' and 'gpaxioms'; see Sections "kbprog", "kbprog -wd", and
334"gpaxioms".
335
336Many of the options to the individual programs in {\KBMAG} have the same
337meaning wherever they occur. To avoid repeating them over and over again, we
338list some of them here.
339\begin{description}
340\item[|-v |]
341The verbose option. A certain amount of extra information on progress of
342calculation is printed out to stdout.
343(By default, only the main results of calculations are printed out as
344comments to stdout.)
345\item[|-silent|]
346There is no output at all to 'stdout'.
347The only output to the terminal is error messages when the program
348aborts for some reason.
349\item[|-vv |]
350The very-verbose option. A huge amount of diagnostic information is printed out,
351much of which may seem incomprehensible.
352\item[|-l |]
353Large hash-tables. When constructing finite state automata, the states
354are identified as sequences of integers, sometimes of varying length.
355The sequences are stored in open hash-tables. Space is allocated as
356required in blocks. The default size of the block is $2^{18}$ bytes;
357with the '-l' option it becomes $2^{21}$ bytes. This makes things
358run more efficiently when constructing very large automata.
359There is also a '-h' option (huge hash-tables).
360\item[|-ip d|]
361Store finite state automata in dense format, which means that the
362transition table is stored as an $ne \times ns$ array, where $ne$ and
363$ns$ are the sizes of the alphabet  and state-set, respectively.
364This is always the default, and is the fastest. It can be expensive on
365space, however, particularly when the alphabet is large. If you run out
366of space, or a program starts to swap heavily, then it may be worth trying
367sparse storage.
368\item[|-ip s|[<dr>]]
369Store finite state automata in sparse format. This means that the
370transitions from each state are stored as a sequence of edge-target
371pairs. With large automata, with large alphabet (which means size more
372than about 5 or 6), it normally requires significantly less space than
373dense format. The '[<dr>]' option (dense rows) is a compromise.
374Here <dr> should be a positive integer (something like 1000 might be a good
375choice). The transitions from the first <dr> states are stored in dense
376format, and the remainder in sparse format.
377\item[|-op d|]
378Automata are written to files in dense format. This is the default
379for one-variable automata (such as the word acceptor in an automatic
380group).
381\item[|-op s|]
382Automata are written to files in sparse format. This is the default
383for two-variable automata (such as the multiplier in an automatic
384group).
385\end{description}
386