1#% -*- mode: textmac; mode: fold -*-
2
3#% text-macro definitions #%{{{
4#i linuxdoc.tm
5
6#d slang \bf{S-Lang}
7#d slrn \bf{slrn}
8#d jed \bf{jed}
9#d kw#1 \tt{$1}
10#d exmp#1 \tt{$1}
11#d var#1 \tt{$1}
12#d ldots ...
13#d times *
14#d math#1 $1
15#d sc#1 \tt{$1}
16#d verb#1 \tt{$1}
17#d sldxe \bf{sldxe}
18#d url#1 <htmlurl url="$1" name="$1">
19#d slang-library-reference \bf{The \slang Library Reference}
20#d chapter#1 <chapt>$1<p>
21#d preface <preface>
22#d tag#1 <tag>$1</tag>
23#d appendix <appendix>
24
25#d NULL <tt>NULL</tt>
26#d kbd#1 <tt>$1</tt>
27
28#% d documentstyle article
29#% d sect1 \section
30#% d sect2 \subsection
31#% d sect3 \subsubsection
32#% d sect4 \subsubsection
33
34#d documentstyle book
35#d sect1 \chapter
36#d sect2 \section
37#d sect3 \subsection
38#d sect4 \subsubsection
39
40
41#%}}}
42
43\linuxdoc
44
45\begin{\documentstyle}
46
47\title A Guide to the S-Lang Language
48\author John E. Davis, \tt{davis@space.mit.edu}
49\date \__today__
50
51\toc
52
53#i preface.tm
54
55\sect1{Introduction} #%{{{
56
57   \slang is a powerful interpreted language that may be embedded into
58   an application to make the application extensible.  This enables
59   the application to be used in ways not envisioned by the programmer,
60   thus providing the application with much more flexibility and
61   power.  Examples of applications that take advantage of the
62   interpreter in this way include the \jed editor and the \slrn
63   newsreader.
64
65\sect2{Language Features}
66
67   The language features both global and local variables, branching
68   and looping constructs, user-defined functions, structures,
69   datatypes, and arrays.  In addition, there is limited support for
70   pointer types.  The concise array syntax rivals that of commercial
71   array-based numerical computing environments.
72
73\sect2{Data Types and Operators} #%{{{
74
75   The language provides built-in support for string, integer (signed
76   and unsigned long and short), double precision floating point, and
77   double precision complex numbers.  In addition, it supports user
78   defined structure types, multi-dimensional array types, and
79   associative arrays.  To facilitate the construction of
80   sophisticated data structures such as linked lists and trees, a
81   `reference' type was added to the language.  The reference type
82   provides much of the same flexibility as pointers in other
83   languages.  Finally, applications embedding the interpreter may
84   also provide special application specific types, such as the
85   \var{Mark_Type} that the \jed editor provides.
86
87   The language provides standard arithmetic operations such as
88   addition, subtraction, multiplication, and division.  It also
89   provides support for modulo arithmetic as well as operations at
90   the bit level, e.g., exclusive-or.  Any binary or unary operator
91   may be extended to work with any data type.  For example, the
92   addition operator (\var{+}) has been extended to work between
93   string types to permit string concatenation.
94
95   The binary and unary operators work transparently with array types.
96   For example, if \var{a} and \var{b} are arrays, then \exmp{a + b}
97   produces an array whose elements are the result of element by
98   element addition of \var{a} and \var{b}.  This permits one to do
99   vector operations without explicitly looping over the array
100   indices.
101
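   As a brief illustration (the variable names are hypothetical and
   chosen only for this sketch), the following statements apply the
   \var{+} operator first to arrays and then to strings:
#v+
     variable a = [1, 2, 3];
     variable b = [10, 20, 30];
     variable c = a + b;              % element by element: [11, 22, 33]
     variable s = "S-" + "Lang";      % string concatenation: "S-Lang"
#v-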
102#%}}}
103
104\sect2{Statements and Functions} #%{{{
105
106   The \slang language supports several types of looping constructs and
107   conditional statements.  The looping constructs include \kw{while},
108   \kw{do...while}, \kw{for}, \kw{forever}, \kw{loop}, \kw{foreach},
109   and \kw{_for}. The conditional statements include \kw{if},
110   \kw{if-then-else}, and \kw{!if}.
111
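   The following sketch exercises a few of these looping constructs.
   The function name \var{loop_demo} is hypothetical, and the sketch
   assumes the form of \kw{foreach} that leaves each element on the
   stack:
#v+
     define loop_demo ()
     {
        variable i, x, total = 0;

        for (i = 1; i <= 3; i++)      % C-like for loop: adds 1 + 2 + 3
          {
             total += i;
          }

        i = 0;
        while (i < 3)                 % while loop: adds 1 three times
          {
             total += 1;
             i++;
          }

        loop (3)                      % executes its body exactly 3 times
          {
             total += 1;
          }

        foreach ([10, 20, 30])        % each element is left on the stack
          {
             x = ();
             total += x;
          }

        return total;                 % 6 + 3 + 3 + 60 = 72
     }
#v-
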
112   User defined functions may be defined to return zero, one, or more
113   values.  Functions that return zero values are similar to
114   `procedures' in languages such as PASCAL.  The local variables of a
115   function are always created on a stack allowing one to create
116   recursive functions.  Parameters to a function are always passed by
117   value and never by reference. However, the language supports a
118   \em{reference} data type that allows one to simulate pass by
119   reference.
120
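   As a minimal sketch of a function that returns more than one value
   (the name \var{min_max} is hypothetical), consider
#v+
     define min_max (a, b)
     {
        if (a < b)
          return a, b;
        return b, a;
     }

     variable lo, hi;
     (lo, hi) = min_max (7, 3);    % lo is 3 and hi is 7
#v-
   Here the two return values are assigned to \var{lo} and \var{hi}
   in the order in which they were returned.
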
121   Unlike many interpreted languages, \slang allows functions to be
122   dynamically loaded (function autoloading).  It also provides
123   constructs specifically designed for error handling and recovery as
124   well as debugging aids (e.g., function tracebacks).
125
   Functions and variables may be declared private to a namespace
   associated with the compilation unit that defines them.  The ideas
   behind the namespace implementation stem from the C language and
   should be quite familiar to anyone who knows C.
131
132#%}}}
133
134\sect2{Error Handling} #%{{{
135
136   The \slang language defines a construct called an \em{error-block}
137   that may be used for error handling and recovery.  When a non-fatal
138   run-time error is encountered, any error blocks that have been
139   defined are executed as the run-time stack unwinds.  An error block
140   can optionally clear the error and the program will continue
141   running after the statement that triggered the error.  This
142   mechanism is somewhat similar to try-catch in C++.
143
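   The following sketch shows roughly how such an error-block might
   look.  It assumes a version of the interpreter that provides the
   \kw{ERROR_BLOCK} construct and the \var{_clear_error} intrinsic;
   the function names are hypothetical:
#v+
     define make_error ()
     {
        error ("something went wrong");    % triggers a non-fatal error
     }

     define recover_demo ()
     {
        ERROR_BLOCK
          {
             _clear_error ();              % clear the error
          }

        make_error ();                     % the error unwinds to here
        message ("execution resumed");     % runs once the error is cleared
     }
#v-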
144#%}}}
145
146\sect2{Run-Time Library} #%{{{
147
148   Functions that compose the \slang run-time library are called
149   \em{intrinsics}.  Examples of \slang intrinsic functions available
150   to every \slang application include string manipulation functions
151   such as \var{strcat}, \var{strchop}, and \var{strcmp}.  The \slang
152   library also provides mathematical functions such as \var{sin},
153   \var{cos}, and \var{tan}; however, not all applications enable the
154   use of these intrinsics.  For example, to conserve memory, the 16
155   bit version of the \jed editor does not provide support for any
156   mathematics other than simple integer arithmetic, whereas other
157   versions of the editor do support these functions.
158
   Most applications embedding the language will also provide a set of
160   application specific intrinsic functions.  For example, the \jed
161   editor adds over 100 application specific intrinsic functions to
162   the language.  Consult your application specific documentation to
163   see what additional intrinsics are supported.
164
165#%}}}
166
167\sect2{Input/Output}
168
169   The language supports C-like stdio input/output functions such as
   \var{fopen}, \var{fgets}, \var{fputs}, and \var{fclose}.  In
   addition, it provides two functions, \var{message} and \var{error},
   for writing to the standard output device and standard error,
   respectively.
173   Specific applications may provide other I/O mechanisms, e.g.,
174   the \jed editor supports I/O to files via the editor's
175   buffers.
176
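   A minimal sketch of these functions in use follows.  The function
   name \var{write_greeting} is hypothetical, and the sketch assumes
   that \var{fopen} returns \var{NULL} on failure and that \var{fputs}
   and \var{fclose} return \exmp{-1} on failure:
#v+
     define write_greeting (file)
     {
        variable fp = fopen (file, "w");
        if (fp == NULL)
          error ("Unable to open " + file);

        if (-1 == fputs ("Hello, world\n", fp))
          error ("Write failed");

        if (-1 == fclose (fp))
          error ("Close failed");

        message ("Wrote " + file);
     }
#v-
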
177\sect2{Obtaining \slang} #%{{{
178
179  Comprehensive information about the library may be obtained via the
180  World Wide Web from \tt{http://www.s-lang.org}.
181
182  \slang as well as some programs that embed it are freely available
183  via anonymous ftp in the United States from
184\begin{itemize}
185  \item \url{ftp://space.mit.edu/pub/davis}.
186\end{itemize}
187  It is also available outside the United States from the following
188  mirror sites:
189\begin{itemize}
190    \item \url{ftp://ftp.uni-stuttgart.de/pub/unix/misc/slang/}
191    \item \url{ftp://ftp.fu-berlin.de/pub/unix/news/slrn/}
192    \item \url{ftp://ftp.ntua.gr/pub/lang/slang/}
193\end{itemize}
194
195  The Usenet newsgroup \var{alt.lang.s-lang} was created for \slang
196  programmers to exchange information and share macros for the various
  programs that embed the language.  The newsgroup \var{comp.editors}
198  can be a useful resource for \slang macros for the \jed editor.
199  Similarly, \slrn users will find \var{news.software.readers} to be a
200  valuable source of information.
201
202  Finally, two mailing lists dealing with the \slang library have been
203  created:
204\begin{itemize}
205     \item \tt{slang-announce@babayaga.math.fu-berlin.de}
206     \item \tt{slang-workers@babayaga.math.fu-berlin.de}
207\end{itemize}
208  The first list is for announcements of new releases of the library, while the
209  second list is intended for those who use the library for their own code
210  development.  To subscribe to the announcement list, send an email to
211  \tt{slang-announce-subscribe@babayaga.math.fu-berlin.de} and include
212  the word \tt{subscribe} in the body of the message.  To subscribe to
213  the developers list, use the address
214  \tt{slang-workers-subscribe@babayaga.math.fu-berlin.de}.
215
216#%}}}
217
218#%}}}
219
220\sect1{Overview of the Language} #%{{{
221
   The purpose of this section is to give the reader a feel for the
223   \slang language, its syntax, and its capabilities.  The information
224   and examples presented in this section should be sufficient to
225   provide the reader with the necessary background to understand the
226   rest of the document.
227
228\sect2{Variables and Functions} #%{{{
229
230   \slang is different from many other interpreted languages in the
231   sense that all variables and functions must be declared before they
232   can be used.
233
234   Variables are declared using the \kw{variable} keyword, e.g.,
235#v+
236     variable x, y, z;
237#v-
238   declares three variables, \var{x}, \var{y}, and \var{z}.  Note the
239   semicolon at the end of the statement.  \em{All \slang statements must
   end in a semicolon.}
241
242   Unlike compiled languages such as C, it is not necessary to specify
243   the data type of a \slang variable.  The data type of a \slang
244   variable is determined upon assignment.  For example, after
245   execution of the statements
246#v+
247     x = 3;
248     y = sin (5.6);
249     z = "I think, therefore I am.";
250#v-
251   \var{x} will be an integer, \var{y} will be a
252   double, and \var{z} will be a string.  In fact, it is even possible
253   to re-assign \var{x} to a string:
254#v+
255     x = "x was an integer, but now is a string";
256#v-
257   Finally, one can combine variable declarations and assignments in
258   the same statement:
259#v+
260     variable x = 3, y = sin(5.6), z = "I think, therefore I am.";
261#v-
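   One way to watch the type change upon assignment is to print the
   result of the \var{typeof} intrinsic.  This is only a sketch; the
   variable \var{v} is hypothetical, and the comments show the type
   names that would be expected:
#v+
     variable v = 3;
     message (string (typeof (v)));     % Integer_Type
     v = 3.0;
     message (string (typeof (v)));     % Double_Type
     v = "three";
     message (string (typeof (v)));     % String_Type
#v-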
262
263   Most functions are declared using the \kw{define} keyword.  A
264   simple example is
265#v+
266      define compute_average (x, y)
267      {
268         variable s = x + y;
269         return s / 2.0;
270      }
271#v-
272   which defines a function that simply computes the average of two
273   numbers and returns the result.  This example shows that a function
274   consists of three parts: the function name, a parameter list, and
275   the function body.
276
277   The parameter list consists of a comma separated list of variable
278   names.  It is not necessary to declare variables within a parameter
279   list; they are implicitly declared.  However, all other \em{local}
280   variables used in the function must be declared.  If the function
281   takes no parameters, then the parameter list must still be present,
282   but empty:
283#v+
284      define go_left_5 ()
285      {
286         go_left (5);
287      }
288#v-
289   The last example is a function that takes no arguments and returns
290   no value.  Some languages such as PASCAL distinguish such objects
291   from functions that return values by calling these objects
292   \em{procedures}.  However, \slang, like C, does not make such a
293   distinction.
294
295   The language permits \em{recursive} functions, i.e., functions that
296   call themselves.  The way to do this in \slang is to first declare
297   the function using the form:
298\begin{tscreen}
299     define \em{function-name} ();
300\end{tscreen}
301   It is not necessary to declare a parameter list when declaring a
302   function in this way.
303
304   The most famous example of a recursive function is the factorial
305   function.  Here is how to implement it using \slang:
306#v+
307     define factorial ();   % declare it for recursion
308
309     define factorial (n)
310     {
311        if (n < 2) return 1;
312        return n * factorial (n - 1);
313     }
314#v-
315   This example also shows how to mix comments with code.  \slang uses
316   the `\var{%}' character to start a comment and all characters from
317   the comment character to the end of the line are ignored.
318
319#%}}}
320
321\sect2{Strings} #%{{{
322
323   Perhaps the most appealing feature of any interpreted language is
324   that it frees the user from the responsibility of memory management.
325   This is particularly evident when contrasting how
326   \slang handles string variables with a lower level language such as
327   C.  Consider a function that concatenates three strings.  An
328   example in \slang is:
329#v+
330     define concat_3_strings (a, b, c)
331     {
332        return strcat (a, strcat (b, c));
333     }
334#v-
335   This function uses the built-in
336   \var{strcat} function for concatenating two strings.  In C, the
337   simplest such function would look like:
338#v+
339     char *concat_3_strings (char *a, char *b, char *c)
340     {
341        unsigned int len;
342        char *result;
343        len = strlen (a) + strlen (b) + strlen (c);
344        if (NULL == (result = (char *) malloc (len + 1)))
345          exit (1);
346        strcpy (result, a);
347        strcat (result, b);
348        strcat (result, c);
349        return result;
350     }
351#v-
352   Even this C example is misleading since none of the issues of memory
353   management of the strings has been dealt with.  The \slang language
354   hides all these issues from the user.
355
356   Binary operators have been defined to work with the string data
357   type.  In particular the \var{+} operator may be used to perform
358   string concatenation.  That is, one can use the
359   \var{+} operator as an alternative to \var{strcat}:
360#v+
361      define concat_3_strings (a, b, c)
362      {
363         return a + b + c;
364      }
365#v-
366   See section ??? for more information about string variables.
367
368#%}}}
369
370\sect2{Referencing and Dereferencing} #%{{{
371   The unary prefix operator, \var{&}, may be used to create a
372   \em{reference} to an object, which is similar to a pointer
373   in other languages.  References are commonly used as a mechanism to
374   pass a function as an argument to another function as the following
375   example illustrates:
376#v+
377       define compute_functional_sum (funct)
378       {
379          variable i, sum;
380
381          sum = 0;
382          for (i = 0; i < 10; i++)
383           {
384              sum += (@funct)(i);
385           }
386          return sum;
387       }
388
389       variable sin_sum = compute_functional_sum (&sin);
390       variable cos_sum = compute_functional_sum (&cos);
391#v-
392   Here, the function \var{compute_functional_sum} applies the
393   function specified by the parameter \var{funct} to the first
394   \exmp{10} integers and returns the sum.  The two statements
395   following the function definition show how the \var{sin} and
396   \var{cos} functions may be used.
397
398   Note the \var{@} operator in the definition of
399   \var{compute_functional_sum}.  It is known as the \em{dereference}
400   operator and is the inverse of the reference operator.
401
402   Another use of the reference operator is in the context of the
403   \var{fgets} function.  For example,
404#v+
      define read_nth_line (file, n)
      {
         variable fp, line = NULL;

         fp = fopen (file, "r");
         if (fp == NULL)
           return NULL;             % the file could not be opened

         while (n > 0)
           {
              if (-1 == fgets (&line, fp))
                return NULL;        % fewer than n lines in the file
              n--;
           }
         return line;
      }
418#v-
419   uses the \var{fgets} function to read the nth line of a file.
420   In particular, a reference to the local variable \var{line} is
421   passed to \var{fgets}, and upon return \var{line} will be set to
422   the character string read by \var{fgets}.
423
424   Finally, references may be used as an alternative to multiple
425   return values by passing information back via the parameter list.
426   The example involving \var{fgets} presented above provided an
427   illustration of this.  Another example is
428#v+
429       define set_xyz (x, y, z)
430       {
431          @x = 1;
432          @y = 2;
433          @z = 3;
434       }
435       variable X, Y, Z;
436       set_xyz (&X, &Y, &Z);
437#v-
438   which, after execution, results in \var{X} set to \exmp{1}, \var{Y}
439   set to \exmp{2}, and \var{Z} set to \exmp{3}.  A C programmer will
440   note the similarity of \var{set_xyz} to the following C
441   implementation:
442#v+
443      void set_xyz (int *x, int *y, int *z)
444      {
445         *x = 1;
446         *y = 2;
447         *z = 3;
448      }
449#v-
450#%}}}
451
452\sect2{Arrays} #%{{{
453   The \slang language supports multi-dimensional arrays of all
454   datatypes.  For example, one can define arrays of references to
455   functions as well as arrays of arrays.  Here are a few examples of
456   creating arrays:
457#v+
458       variable A = Integer_Type [10];
459       variable B = Integer_Type [10, 3];
460       variable C = [1, 3, 5, 7, 9];
461#v-
462   The first example creates an array of \var{10} integers and assigns
463   it to the variable \var{A}.  The second example creates a 2-d array
464   of \var{30} integers arranged in \var{10} rows and \var{3} columns
465   and assigns the result to \var{B}.  In the last example, an array
466   of \var{5} integers is assigned to the variable \var{C}.  However,
467   in this case the elements of the array are initialized to the
468   values specified.  This is known as an \em{inline-array}.
469
   \slang also supports something called a
   \em{range-array}.  An example of such an array is
472#v+
473      variable C = [1:9:2];
474#v-
475   This will produce an array of 5 integers running from \exmp{1}
476   through \exmp{9} in increments of \exmp{2}.
477
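   A short sketch of the two range forms follows (the variable names
   are hypothetical).  When the increment is omitted, it defaults to
   \exmp{1}:
#v+
     variable r1 = [0:4];        % 0, 1, 2, 3, 4
     variable r2 = [1:9:2];      % 1, 3, 5, 7, 9
     variable n = length (r2);   % 5
#v-
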
478   Arrays are passed by reference to functions and never by value.
479   This permits one to write functions which can initialize arrays.
480   For example,
481#v+
482      define init_array (a)
483      {
484         variable i, imax;
485
486         imax = length (a);
487         for (i = 0; i < imax; i++)
488           {
489              a[i] = 7;
490           }
491      }
492
493      variable A = Integer_Type [10];
494      init_array (A);
495#v-
496   creates an array of \var{10} integers and initializes all its
497   elements to \var{7}.
498
499   There are more concise ways of accomplishing the result of the
500   previous example.  These include:
501#v+
502      variable A = [7, 7, 7, 7, 7, 7, 7, 7, 7, 7];
503      variable A = Integer_Type [10];  A[[0:9]] = 7;
504      variable A = Integer_Type [10];  A[*] = 7;
505#v-
506   The second and third methods use an array of indices to index the array
507   \var{A}.  In the second, the range of indices has been explicitly
508   specified, whereas the third example uses a wildcard form.  See
509   section ??? for more information about array indexing.
510
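   As a small sketch of indexing with an array of indices (the
   variable names \var{V}, \var{Odd}, and \var{Mid} are hypothetical),
   an index array may be used both to extract and to assign several
   elements at once:
#v+
     variable V = [10, 20, 30, 40, 50];
     variable Odd = V[[0, 2, 4]];     % [10, 30, 50]
     variable Mid = V[[1:3]];         % [20, 30, 40]
     V[[0, 1]] = 0;                   % V is now [0, 0, 30, 40, 50]
#v-
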
511   Although the examples have pertained to integer arrays, the fact is
512   that \slang arrays can be of any type, e.g.,
513#v+
514       variable A = Double_Type [10];
515       variable B = Complex_Type [10];
516       variable C = String_Type [10];
517       variable D = Ref_Type [10];
518#v-
519   create \var{10} element arrays of double, complex, string, and
520   reference types, respectively.  The last example may be used to
521   create an array of functions, e.g.,
522#v+
523      D[0] = &sin;
524      D[1] = &cos;
525#v-
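   The functions stored in such an array may then be called through
   the dereference operator.  The following is a sketch only; the
   names \var{funcs}, \var{k}, \var{f}, and \var{y} are hypothetical,
   and each reference is copied to a scalar variable before it is
   dereferenced:
#v+
     variable funcs = Ref_Type [2];
     funcs[0] = &sin;
     funcs[1] = &cos;

     variable k, f, y;
     for (k = 0; k < 2; k++)
       {
          f = funcs[k];
          y = (@f) (0.0);       % 0.0 for sin, then 1.0 for cos
       }
#v-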
526
527   The language also defines unary, binary, and mathematical
528   operations on arrays.  For example, if \var{A} and \var{B} are
529   integer arrays, then \exmp{A + B} is an array whose elements are
530   the sum of the elements of \var{A} and \var{B}.  A trivial example
531   that illustrates the power of this capability is
532#v+
533        variable X, Y;
534        X = [0:2*PI:0.01];
535        Y = 20 * sin (X);
536#v-
537   which is equivalent to the highly simplified C code:
538#v+
539        double *X, *Y;
540        unsigned int i, n;
541
542        n = (2 * PI) / 0.01 + 1;
543        X = (double *) malloc (n * sizeof (double));
544        Y = (double *) malloc (n * sizeof (double));
545        for (i = 0; i < n; i++)
546          {
547            X[i] = i * 0.01;
548            Y[i] = 20 * sin (X[i]);
549          }
550#v-
551
552
553#%}}}
554
555\sect2{Structures and User-Defined Types} #%{{{
556
557   A \em{structure} is similar to an array in the sense that it is a
558   container object.  However, the elements of an array must all be of
559   the same type (or of \var{Any_Type}), whereas a structure is
560   heterogeneous.  As an example, consider
561#v+
562      variable person = struct
563      {
564         first_name, last_name, age
565      };
566      variable bill = @person;
567      bill.first_name = "Bill";
568      bill.last_name = "Clinton";
569      bill.age = 51;
570#v-
   In this example a structure consisting of three fields has been
572   created and assigned to the variable \var{person}.  Then an
573   \em{instance} of this structure has been created using the
574   dereference operator and assigned to \var{bill}.  Finally, the
575   individual fields of \var{bill} were initialized.  This is an
576   example of an \em{anonymous} structure.
577
578   A \em{named} structure is really a new data type and may be created
579   using the \kw{typedef} keyword:
580#v+
581      typedef struct
582      {
583         first_name, last_name, age
584      }
585      Person_Type;
586
587      variable bill = @Person_Type;
588      bill.first_name = "Bill";
589      bill.last_name = "Clinton";
590      bill.age = 51;
591#v-
592   The big advantage of creating a new type is that one can go on to
   create arrays of the data type:
594#v+
595      variable People = Person_Type [100];
596      People[0].first_name = "Bill";
597      People[1].first_name = "Hillary";
598#v-
599
600   The creation and initialization of a structure may be facilitated
601   by a function such as
602#v+
603      define create_person (first, last, age)
604      {
605          variable person = @Person_Type;
606          person.first_name = first;
607          person.last_name = last;
608          person.age = age;
609          return person;
610      }
611      variable Bill = create_person ("Bill", "Clinton", 51);
612#v-
613
   Other common uses of structures include the creation of linked
   lists, binary trees, etc.  For more information about these and
   other features of structures, see section ???.
617
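   A singly linked list might be sketched as follows.  The type
   \var{Node_Type} and the functions \var{push_node} and
   \var{list_length} are hypothetical names introduced only for this
   illustration; the end of the list is marked by \var{NULL}:
#v+
     typedef struct
     {
        value, next
     }
     Node_Type;

     define push_node (head, value)
     {
        variable node = @Node_Type;
        node.value = value;
        node.next = head;           % the old head follows the new node
        return node;
     }

     define list_length (head)
     {
        variable n = 0;
        while (head != NULL)
          {
             n++;
             head = head.next;
          }
        return n;
     }

     variable list = NULL;          % an empty list
     list = push_node (list, 1);
     list = push_node (list, 2);
     list = push_node (list, 3);
     variable len = list_length (list);    % 3
#v-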
618
619#%}}}
620
621\sect2{Namespaces}
622
623   In addition to the global namespace, each compilation unit (e.g., a
624   file) is given a private namespace.  A variable or function name
625   that is declared using the \var{static} keyword will be placed in
   the private namespace associated with the compilation unit.  For
627   example,
628#v+
629       variable i;
630       static variable i;
631#v-
632   defines two variables called \var{i}.  The first declaration
633   defines \var{i} in the global namespace, but the second declaration
634   defines \var{i} in the private namespace.
635
636   The \exmp{->} operator may be used in conjunction with the name of
637   the namespace to access objects in the name space.  In the above
638   example, to access the variable \var{i} in the global namespace,
639   one would use \exmp{Global->i}.  Unless otherwise specified, a
640   private namespace has no name and its objects may not be accessed
641   from outside the compilation unit.  However, the \var{implements}
   function may be used to give the private namespace a name, allowing
643   access to its objects.  For example, if the file \exmp{t.sl} contains
644#v+
645      implements ("A");
646      static variable i;
647#v-
648   then another file may access the variable \var{i} via \exmp{A->i}.
649
650#%}}}
651
652\sect1{Data Types and Literal Constants} #%{{{
653
654   The current implementation of the \slang language permits up to 256
655   distinct data types, including predefined data types such as integer and
   floating point, as well as specialized application-specific data
657   types.  It is also possible to create new data types in the
658   language using the \kw{typedef} mechanism.
659
660   Literal constants are objects such as the integer \exmp{3} or the
661   string \exmp{"hello"}.  The actual data type given to a literal
662   constant depends upon the syntax of the constant.  The following
663   sections describe the syntax of literals of specific data types.
664
665\sect2{Predefined Data Types} #%{{{
666
667   The current version of \slang defines integer, floating point,
668   complex, and string types. It also defines special purpose data
669   types such as \var{Null_Type}, \var{DataType_Type}, and
670   \var{Ref_Type}.  These types are discussed below.
671
672\sect3{Integers} #%{{{
673
674   The \slang language supports both signed and unsigned characters,
675   short integer, long integer, and plain integer types. On most 32
676   bit systems, there is no difference between an integer and a long
677   integer; however, they may differ on 16 and 64 bit systems.
678   Generally speaking, on a 16 bit system, plain integers are 16 bit
   quantities with a range of -32768 to 32767.  On a 32 bit system,
680   plain integers range from -2147483648 to 2147483647.
681
   A plain integer \em{literal} can be specified in one of several ways:
683\begin{itemize}
684\item As a decimal (base 10) integer consisting of the characters
685      \var{0} through \var{9}, e.g., \var{127}.  An integer specified
686      this way cannot begin with a leading \var{0}.  That is,
687      \var{0127} is \em{not} the same as \var{127}.
688
689\item Using hexadecimal (base 16) notation consisting of the characters
690      \var{0} to \var{9} and \var{A} through \var{F}.  The hexadecimal
691      number must be preceded by the characters \var{0x}.  For example,
692      \var{0x7F} specifies an integer using hexadecimal notation and has
693      the same value as decimal \var{127}.
694
\item In octal notation using characters \var{0} through \var{7}.  The octal
      number must begin with a leading \var{0}.  For example,
      \var{0177} and \var{127} represent the same integer.
\end{itemize}

   Short, long, and unsigned types may be specified by using the
   proper suffixes: \var{L} indicates that the integer is a long
   integer, \var{h} indicates that the integer is a short integer, and
   \var{U} indicates that it is unsigned.  For example, \exmp{1UL}
   specifies an unsigned long integer.

   Finally, a character literal may be specified using a notation
   containing a character enclosed in single quotes as \exmp{'a'}.
   The value of the character specified this way will lie in the
   range 0 to 255 and will be determined by the ASCII value of the
   character in quotes.  For example,
#v+
              i = '0';
#v-
   assigns to \var{i} the value 48 since the \exmp{'0'} character
   has an ASCII value of 48.
716
717    Any integer may be preceded by a minus sign to indicate that it is a
718    negative integer.
719
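   The literal forms described above may be summarized by the
   following sketch, in which the variable names are hypothetical:
#v+
     variable a = 127;       % decimal
     variable b = 0x7F;      % hexadecimal, same value as a
     variable c = 0177;      % octal, same value as a
     variable d = '0';       % character literal with value 48
     variable e = 1UL;       % unsigned long integer
     variable f = -127;      % negative integer
#v-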
720#%}}}
721
722\sect3{Floating Point Numbers} #%{{{
723
724    Single and double precision floating point literals must contain either a
725    decimal point or an exponent (or both). Here are examples of
    specifying the same double precision floating point number:
727#v+
728         12.    12.0    12e0   1.2e1   120e-1   .12e2   0.12e2
729#v-
730    Note that \var{12} is \em{not} a floating point number since it
731    contains neither a decimal point nor an exponent.  In fact,
732    \var{12} is an integer.
733
734    One may append the \var{f} character to the end of the number to
735    indicate that the number is a single precision literal.
736
737#%}}}
738
739\sect3{Complex Numbers} #%{{{
740
741    The language implements complex numbers as a pair of double
742    precision floating point numbers.  The first number in the pair
743    forms the \em{real} part, while the second number forms the
744    \em{imaginary} part.  That is, a complex number may be regarded as the
745    sum of a real number and an imaginary number.
746
    Strictly speaking, the current implementation of \slang does
748    not support generic complex literals.  However, it does support
749    imaginary literals and a more generic complex number with a non-zero
750    real part may be constructed from the imaginary literal via
751    addition of a real number.
752
753    An imaginary literal is specified in the same way as a floating
754    point literal except that \var{i} or \var{j} is appended.  For
755    example,
756#v+
757         12i    12.0i   12e0j
758#v-
759    all represent the same imaginary number.  Actually, \var{12i} is
760    really an imaginary integer except that \slang automatically
761    promotes it to a double precision imaginary number.
762
763    A more generic complex number may be constructed from an imaginary
764    literal via addition, e.g.,
765#v+
766        3.0 + 4.0i
767#v-
768    produces a complex number whose real part is \exmp{3.0} and whose
769    imaginary part is \exmp{4.0}.
770
771    The intrinsic functions \var{Real} and \var{Imag} may be used to
772    retrieve the real and imaginary parts of a complex number,
773    respectively.
774
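    As a short sketch (the variable names are hypothetical, and the
    application is assumed to enable the mathematical intrinsics),
    the parts of a complex number may be extracted and combined as
    follows:
#v+
        variable z = 3.0 + 4.0i;
        variable re = Real (z);                     % 3.0
        variable im = Imag (z);                     % 4.0
        variable r = sqrt (re * re + im * im);      % modulus: 5.0
#v-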
775#%}}}
776
777\sect3{Strings} #%{{{
778
779    A string literal must be enclosed in double quotes as in:
780#v+
781      "This is a string".
782#v-
783    Although there is no imposed limit on the length of a string,
784    string literals must be less than 256 characters in length.  It is
785    possible to go beyond this limit by string concatenation, e.g.,
786#v+
787     "This is the first part of a long string"
788       + "and this is the second half"
789#v-
790    Any character except a newline (ASCII 10) or the null character
791    (ASCII 0) may appear explicitly in a string literal.  However,
792    these characters may be used implicitly using the mechanism
793    described below.
794
795    The backslash character is a special character and is used to
796    include other special characters (such as a newline character) in
797    the string. The special characters recognized are:
798#v+
799       \"    --  double quote
800       \'    --  single quote
801       \\    --  backslash
802       \a    --  bell character (ASCII 7)
803       \t    --  tab character (ASCII 9)
804       \n    --  newline character (ASCII 10)
805       \e    --  escape character (ASCII 27)
806       \xhhh --  character expressed in HEXADECIMAL notation
807       \ooo  --  character expressed in OCTAL notation
808       \dnnn --  character expressed in DECIMAL
809#v-
810    For example, to include the double quote character as part of the
811    string, it must be preceded by a backslash character, e.g.,
812#v+
813       "This is a \"quote\""
814#v-
    Similarly, the next example illustrates how a newline character may be
816    included:
817#v+
818       "This is the first line\nand this is the second"
819#v-
820#%}}}
821
822
823\sect3{Null_Type}
824
825   Objects of type \var{Null_Type} can have only one value:
826   \var{NULL}.  About the only thing that you can do with this data
827   type is to assign it to variables and test for equality with
828   other objects.  Nevertheless, \var{Null_Type} is an important and
829   extremely useful data type.  Its main use stems from the fact that
830   since it can be compared for equality with any other data type, it
831   is ideal to represent the value of an object which does not yet
832   have a value, or has an illegal value.
833
834   As a trivial example of its use, consider
835#v+
836      define add_numbers (a, b)
837      {
838         if (a == NULL) a = 0;
839         if (b == NULL) b = 0;
840         return a + b;
841      }
842      variable c = add_numbers (1, 2);
843      variable d = add_numbers (1, NULL);
844      variable e = add_numbers (1,);
845      variable f = add_numbers (,);
846#v-
847   It should be clear that after these statements have been executed,
848   \var{c} will have a value of \exmp{3}.  It should also be clear
849   that \var{d} will have a value of \exmp{1} because \var{NULL} has
850   been passed as the second parameter.  One feature of the language
851   is that if a parameter has been omitted from a function call, the
852   variable associated with that parameter will be set to \var{NULL}.
853   Hence, \var{e} and \var{f} will be set to \exmp{1} and \exmp{0},
854   respectively.
855
856   The \var{Null_Type} data type also plays an important role in the
857   context of \em{structures}.
858
859\sect3{Ref_Type}
860   Objects of \var{Ref_Type} are created using the unary
861   \em{reference} operator \var{&}.  Such objects may be
862   \em{dereferenced} using the dereference operator \var{@}.  For
863   example,
864#v+
865      variable sin_ref = &sin;
866      variable y = (@sin_ref) (1.0);
867#v-
868   creates a reference to the \var{sin} function and assigns it to
869   \var{sin_ref}.  The second statement uses the dereference operator
870   to call the function that \var{sin_ref} references.
871
872   The \var{Ref_Type} is useful for passing functions as arguments to
873   other functions, or for returning information from a function via
874   its parameter list.  The dereference operator is also used to create
875   an instance of a structure.  For these reasons, further discussion
876   of this important type can be found in section ??? and section ???.
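
   As a sketch of both uses, the following hypothetical functions (the
   names \exmp{apply_twice} and \exmp{set_to_zero} are illustrative
   and not part of the library) take references as arguments:
#v+
      define apply_twice (f, x)
      {
         % f is a reference to a function of one argument
         return (@f) ((@f) (x));
      }

      define set_to_zero (ref)
      {
         % Assigning through the reference changes the caller's variable
         @ref = 0;
      }

      variable y = apply_twice (&sqrt, 16.0);   % sqrt (sqrt (16.0))
      variable n = 10;
      set_to_zero (&n);                         % n is now 0
#v-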
877
878\sect3{Array_Type and Struct_Type}
879
880   Variables of type \var{Array_Type} and \var{Struct_Type} are known
881   as \em{container objects}.  They are much more complicated than the
882   simple data types discussed so far and each obeys a special syntax.
   For these reasons they are discussed in separate chapters.
884   See ???.
885
886\sect3{DataType_Type Type} #%{{{
887
888   \slang defines a type called \var{DataType_Type}.  Objects of
889   this type have values that are type names.  For example, an integer
890   is an object of type \var{Integer_Type}.  The literals of
891   \var{DataType_Type} include:
892#v+
893     Char_Type            (signed character)
894     UChar_Type           (unsigned character)
895     Short_Type           (short integer)
896     UShort_Type          (unsigned short integer)
897     Integer_Type         (plain integer)
898     UInteger_Type        (plain unsigned integer)
899     Long_Type            (long integer)
900     ULong_Type           (unsigned long integer)
901     Float_Type           (single precision real)
902     Double_Type          (double precision real)
903     Complex_Type         (complex numbers)
904     String_Type          (strings, C strings)
905     BString_Type         (binary strings)
906     Struct_Type          (structures)
907     Ref_Type             (references)
908     Null_Type            (NULL)
909     Array_Type           (arrays)
910     DataType_Type        (data types)
911#v-
912   as well as the names of any other types that an application
913   defines.
914
915   The built-in function \var{typeof} returns the data type of
   its argument, i.e., a \var{DataType_Type}.  For instance,
   \exmp{typeof(7)} returns \var{Integer_Type} and
   \exmp{typeof(Integer_Type)} returns \var{DataType_Type}.  One can use this
919   function as in the following example:
920#v+
921      if (Integer_Type == typeof (x)) message ("x is an integer");
922#v-
923   The literals of \var{DataType_Type} have other uses as well.  One
924   of the most common uses of these literals is to create arrays, e.g.,
925#v+
926        x = Complex_Type [100];
927#v-
928   creates an array of \exmp{100} complex numbers and assigns it to
929   \var{x}.
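
   The following sketch combines these uses; the variable names are
   arbitrary:
#v+
      variable a = Double_Type [3];      % array of three doubles
      variable t = typeof (a);           % Array_Type
      variable u = typeof (a[0]);        % Double_Type
      if (typeof (t) == DataType_Type)
        message ("t holds a type name");
#v-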
930#%}}}
931
932#%}}}
933
934\sect2{Typecasting: Converting from one Type to Another}
935
936   Occasionally, it is necessary to convert from one data type to
937   another.  For example, if you need to print an object as a string,
938   it may be necessary to convert it to a \var{String_Type}.  The
939   \var{typecast} function may be used to perform such conversions.
940   For example, consider
941#v+
942      variable x = 10, y;
943      y = typecast (x, Double_Type);
944#v-
945   After execution of these statements, \var{x} will have the integer
946   value \exmp{10} and \var{y} will have the double precision floating
947   point value \exmp{10.0}.  If the object to be converted is an
948   array, the \var{typecast} function will act upon all elements of
949   the array.  For example,
950#v+
951      variable x = [1:10];       % Array of integers
952      variable y = typecast (x, Double_Type);
953#v-
954   will create an array of \exmp{10} double precision values and
955   assign it to \var{y}.  One should also realize that it is not
956   always possible to perform a typecast.  For example, any attempt to
957   convert an \var{Integer_Type} to a \var{Null_Type} will result in a
958   run-time error.
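
   As a sketch of a conversion that discards information, a floating
   point value may be cast to an integer.  The result noted in the
   comment assumes that the fractional part is truncated, as it is by
   the \var{int} function:
#v+
      variable r = 3.7;
      variable n = typecast (r, Integer_Type);   % n is expected to be 3
#v-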
959
960   Often the interpreter will perform implicit type conversions as necessary
961   to complete calculations.  For example, when multiplying an
962   \var{Integer_Type} with a \var{Double_Type}, it will convert the
963   \var{Integer_Type} to a \var{Double_Type} for the purpose of the
964   calculation.  Thus, the example involving the conversion of an
965   array of integers to an array of doubles could have been performed
966   by multiplication by \exmp{1.0}, i.e.,
967#v+
968      variable x = [1:10];       % Array of integers
969      variable y = 1.0 * x;
970#v-
971
972   The \var{string} intrinsic function is similar to the typecast
973   function except that it converts an object to a string
   representation.  It is important to understand that a typecast from
   some type to \var{String_Type} is \em{not} the same as converting
   an object to its string representation.  That is,
   \exmp{typecast(x,String_Type)} is not equivalent to
   \exmp{string(x)}.  The reason for this is that when given an array,
   the \var{typecast} function acts on each element of the array to
   produce another array, whereas the \var{string} function produces a
   single string.
982
983   The \var{string} function is useful for printing the value of an
984   object.  This use is illustrated in the following simple example:
985#v+
986      define print_object (x)
987      {
988         message (string (x));
989      }
990#v-
991   Here, the \var{message} function has been used because it writes a
   string to the display.  If the \var{string} function had not been used
   and the \var{message} function had been passed an integer, a
   type-mismatch error would have resulted.
995
996#%}}}
997
998\sect1{Identifiers} #%{{{
999
1000   The names given to variables, functions, and data types are called
1001   \em{identifiers}.  There are some restrictions upon the actual
1002   characters that make up an identifier.  An identifier name must
1003   start with a letter (\var{[A-Za-z]}), an underscore character, or a
1004   dollar sign.  The rest of the characters in the name can be any
1005   combination of letters, digits, dollar signs, or underscore
1006   characters.  However, all identifiers whose name begins with two
1007   underscore characters are reserved for internal use by the
1008   interpreter and declarations of objects with such names should be
1009   avoided.
1010
1011   Examples of valid identifiers include:
1012#v+
1013      mary    _3    _this_is_ok
1014      a7e1    $44   _44$_Three
1015#v-
1016   However, the following are not legal:
1017#v+
1018      7abc   2e0    #xx
1019#v-
1020   In fact, \exmp{2e0} actually specifies the real number
1021   \exmp{2.0}.
1022
1023   Although the maximum length of identifiers is unspecified by the
1024   language, the length should be kept below \exmp{64} characters.
1025
1026   The following identifiers are reserved by the language for use as
1027   keywords:
1028#v+
1029        !if            _for        do         mod       sign       xor
1030        ERROR_BLOCK    abs         do_while   mul2      sqr        public
1031        EXIT_BLOCK     and         else       not       static     private
1032        USER_BLOCK0    andelse     exch       or        struct
1033        USER_BLOCK1    break       for        orelse    switch
1034        USER_BLOCK2    case        foreach    pop       typedef
1035        USER_BLOCK3    chs         forever    return    using
1036        USER_BLOCK4    continue    if         shl       variable
1037        __tmp          define      loop       shr       while
1038#v-
   In addition, the next major \slang release (v2.0) will reserve
   \exmp{try} and \exmp{catch}, so it is a good idea to avoid using
   those words as identifiers.
1042
1043#%}}}
1044
1045\sect1{Variables} #%{{{
1046
1047   A variable must be declared before it can be used, otherwise an
1048   undefined name error will be generated.  A variable is declared
   using the \kw{variable} keyword, e.g.,
1050#v+
1051      variable x, y, z;
1052#v-
1053   declares three variables, \exmp{x}, \exmp{y}, and \exmp{z}.  This
1054   is an example of a variable declaration statement, and like all
1055   statements, it must end in a semi-colon.
1056
1057   Variables declared this way are untyped and inherit a type upon
1058   assignment.  The actual type checking is performed at run-time.  For
1059   example,
1060#v+
1061        x = "This is a string";
1062        x = 1.2;
1063        x = 3;
1064        x = 2i;
1065#v-
   results in \var{x} being set successively to a string, a float, an
1067   integer, and to a complex number (\exmp{0+2i}).  Any attempt to use
1068   a variable before it has acquired a type will result in an
1069   uninitialized variable error.
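
   The \var{typeof} function may be used to watch the type of a
   variable change as values are assigned to it.  A minimal sketch:
#v+
      variable x;
      x = "This is a string";
      message (string (typeof (x)));     % String_Type
      x = 3;
      message (string (typeof (x)));     % Integer_Type
      x = 2i;
      message (string (typeof (x)));     % Complex_Type
#v-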
1070
1071   It is legal to put executable code in a variable declaration list.
1072   That is,
1073#v+
1074         variable x = 1, y = sin (x);
1075#v-
1076   are legal variable declarations.  This also provides a convenient way
1077   of initializing a variable.
1078
1079   Variables are classified as either \em{global} or \em{local}. A
1080   variable declared inside a function is said to be local and has no
1081   meaning outside the function.  A variable is said to be global if
1082   it was declared outside a function.  Global variables are further
1083   classified as being \var{public}, \var{static}, or \var{private},
1084   according to the name space where they were defined.
1085   See chapter ??? for more information about name spaces.
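
   The distinction can be sketched as follows, where the names
   \exmp{counter}, \exmp{step}, and \exmp{increment} are chosen only
   for illustration:
#v+
      variable counter = 0;        % global: declared outside a function

      define increment ()
      {
         variable step = 1;        % local: visible only inside increment
         counter += step;
      }

      increment ();                % counter is now 1
      % step = 2;                  % would fail here: step is not defined
#v-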
1086
1087   The following global variables are predefined by the language and
1088   are mainly used as convenience variables:
1089#v+
1090      $0 $1 $2 $3 $4 $5 $6 $7 $8 $9
1091#v-
1092
1093   An \em{intrinsic} variable is another type of global variable.
1094   Such variables have a definite type which cannot be altered.
1095   Variables of this type may also be defined to be read-only, or
1096   constant variables.  An example of an intrinsic variable is
1097   \var{PI} which is a read-only double precision variable with a value
1098   of approximately \exmp{3.14159265358979323846}.
1099
1100#%}}}
1101
1102\sect1{Operators} #%{{{
1103
1104   \slang supports a variety of operators that are grouped into three
1105   classes: assignment operators, binary operators, and unary operators.
1106
1107   An assignment operator is used to assign a value to a variable.
1108   They will be discussed more fully in the context of the assignment
1109   statement in section ???.
1110
   A unary operator acts upon a single quantity, while a binary
   operator acts upon two quantities.  The boolean
   operator \var{not} is an example of a unary operator.  Examples of
   binary operators include the usual arithmetic operators
   \var{+}, \var{-}, \var{*}, and \var{/}.  The operator given by
   \var{-} can be either a unary operator (negation) or a binary operator
   (subtraction); the actual operation is determined from the context
   in which it is used.
1119
1120   Binary operators are used in algebraic forms, e.g., \exmp{a + b}.
1121   Unary operators fall in one of two classes: postfix-unary or
1122   prefix-unary.  For example, in the expression \exmp{-x}, the minus
1123   sign is a prefix-unary operator.
1124
1125   Not all data types have binary or unary operations defined.  For
1126   example, while \var{String_Type} objects support the \var{+}
1127   operator, they do not admit the \var{*} operator.
1128
1129\sect2{Unary Operators}
1130
1131   The \bf{unary} operators operate only upon a single operand.  They
1132   include: \var{not}, \var{~}, \var{-}, \var{@}, \var{&}, as well as the
1133   increment and decrement operators \var{++} and \var{--},
1134   respectively.
1135
1136   The boolean operator \var{not} acts only upon integers and produces
1137   \var{0} if its operand is non-zero, otherwise it produces \var{1}.
1138
1139   The bit-level not operator \var{~} performs a similar function,
1140   except that it operates on the individual bits of its integer
1141   operand.
1142
1143   The arithmetic negation operator \var{-} is the most well-known
1144   unary operator.  It simply reverses the sign of its operand.
1145
1146   The reference (\var{&}) and dereference (\var{@}) operators will be
1147   discussed in greater detail in section ???.  Similarly, the
1148   increment (\var{++}) and decrement (\var{--}) operators will be
1149   discussed in the context of the assignment operator.
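
   A brief sketch of the first three operators follows.  The comment
   on the \var{~} example assumes the usual two's complement
   representation of integers:
#v+
      variable a = not (5);     % 0, since 5 is non-zero
      variable b = not (0);     % 1
      variable c = ~0;          % all bits set, i.e., -1
      variable d = -(3);        % -3
#v-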
1150
1151\sect2{Binary Operators} #%{{{
1152
1153   The binary operators may be grouped according to several classes:
1154   arithmetic operators, relational operators, boolean operators, and
1155   bitwise operators.
1156
1157   All binary and unary operators may be overloaded.  For example, the
1158   arithmetic plus operator has been overloaded by the
1159   \var{String_Type} data type to permit concatenation between strings.
1160
1161\sect3{Arithmetic Operators} #%{{{
1162
1163   The arithmetic operators include \var{+}, \var{-}, \var{*}, \var{/},
1164   which perform addition, subtraction, multiplication, and division,
1165   respectively.  In addition to these, \slang supports the \var{mod}
1166   operator as well as the power operator \var{^}.
1167
1168   The data type of the result produced by the use of one of these
1169   operators depends upon the data types of the binary participants.
1170   If they are both integers, the result will be an integer.  However,
1171   if the operands are not of the same type, they will be converted to
1172   a common type before the operation is performed.  For example, if
1173   one is a floating point value and the other is an integer, the
1174   integer will be converted to a float. In general, the promotion
1175   from one type to another is such that no information is lost, if
1176   possible.  As an example, consider the expression \exmp{8/5} which
1177   indicates division of the integer \var{8} by the integer \var{5}.
1178   The result will be the integer \var{1} and \em{not} the floating
1179   point value \var{1.6}.  However, \exmp{8/5.0} will produce
1180   \var{1.6} because \exmp{5.0} is a floating point number.
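
   The following lines sketch these rules together with the \var{mod}
   and \var{^} operators:
#v+
      variable q = 8 / 5;       % 1    (integer division)
      variable r = 8 mod 5;     % 3    (remainder)
      variable f = 8 / 5.0;     % 1.6  (5.0 forces floating point)
      variable p = 2.0 ^ 3;     % 2.0 raised to the power 3
#v-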
1181
1182#%}}}
1183
1184\sect3{Relational Operators} #%{{{
1185
1186   The relational operators are \var{>}, \var{>=}, \var{<}, \var{<=},
1187   \var{==}, and \var{!=}.  These perform the comparisons greater
1188   than, greater than or equal, less than, less than or equal, equal,
1189   and not equal, respectively.  The result of one of these
1190   comparisons is the integer \var{1} if the comparison is true, or
1191   \var{0} if the comparison is false.  For example, \exmp{6 >= 5}
   returns \var{1}, but \exmp{6 == 5} produces
1193   \var{0}.
1194
1195#%}}}
1196
1197\sect3{Boolean Operators} #%{{{
1198   There are only two boolean binary operators: \var{or} and
1199   \var{and}.  These operators are defined only for integers and
1200   produce an integer result.  The \var{or} operator returns \var{1}
   if either of its operands is non-zero, otherwise it produces
1202   \var{0}.  The \var{and} operator produces \var{1} if and only if
1203   both its operands are non-zero, otherwise it produces \var{0}.
1204
   Neither of these operators performs the so-called boolean
1206   short-circuit evaluation.  For example, consider the expression:
1207#v+
1208      (x != 0) and (1/x > 10)
1209#v-
1210   Here, if \var{x} were to have a value of zero, a division by zero error
1211   would occur because even though \var{x!=0} evaluates to zero, the
1212   \var{and} operator is not short-circuited and the \var{1/x} expression
1213   would still be evaluated.  Although these operators are not
1214   short-circuited, \slang does have another mechanism of performing
1215   short-circuit boolean evaluation via the \kw{orelse} and
1216   \kw{andelse} expressions.  See below for information about these
1217   constructs.
1218
1219#%}}}
1220
1221\sect3{Bitwise Operators} #%{{{
1222
1223   The bitwise binary operators are defined only with integer operands
1224   and are used for bit-level operations.  Operators that fall in this
1225   class include \var{&}, \var{|}, \var{shl}, \var{shr}, and
1226   \var{xor}.  The \var{&} operator performs a boolean AND operation
1227   between the corresponding bits of the operands.  Similarly, the
1228   \var{|} operator performs the boolean OR operation on the bits.
1229   The bit-shifting operators \var{shl} and \var{shr} shift the bits
1230   of the first operand by the number given by the second operand to
1231   the left or right, respectively.  Finally, the \var{xor} performs
1232   an EXCLUSIVE-OR operation.
1233
1234   These operators are commonly used to manipulate variables whose
1235   individual bits have distinct meanings.  In particular, \var{&} is
1236   usually used to test bits, \var{|} can be used to set bits, and
1237   \var{xor} may be used to flip a bit.
1238
1239   As an example of using \var{&} to perform tests on bits, consider
1240   the following: The \jed text editor stores some of the information
1241   about a buffer in a bitmapped integer variable.  The value of this
1242   variable may be retrieved using the \jed intrinsic function
1243   \var{getbuf_info}, which actually returns four quantities: the
1244   buffer flags, the name of the buffer, directory name, and file
1245   name.  For the purposes of this section, only the buffer flags are
1246   of interest and can be retrieved via a function such as
1247#v+
1248      define get_buffer_flags ()
1249      {
1250         variable flags;
1251         (,,,flags) = getbuf_info ();
1252         return flags;
1253      }
1254#v-
   The buffer flags are a bitmapped quantity in which the zeroth bit
   indicates whether or not the buffer has been modified, the first
   bit indicates whether or not autosave has been enabled for the
   buffer, and so on.  Consider for the moment the task of determining
   whether the buffer has been modified.  This can be
   determined by looking at the zeroth bit: if it is \var{0}, the
   buffer has not been modified; otherwise it has.  Thus we can create
   the function
1263#v+
1264     define is_buffer_modified ()
1265     {
1266        variable flags = get_buffer_flags ();
1267        return (flags & 1);
1268     }
1269#v-
1270   where the integer \exmp{1} has been used since it has all of its
1271   bits set to \var{0}, except for the zeroth one, which is set to
1272   \var{1}.  (At this point, it should also be apparent that bits are
1273   numbered from zero, thus an \var{8} bit integer consists of bits
1274   \var{0} to \var{7}, where \var{0} is the least significant bit and
1275   \var{7} is the most significant one.)  Similarly, we can create another
1276   function
1277#v+
1278     define is_autosave_on ()
1279     {
1280        variable flags = get_buffer_flags ();
1281        return (flags & 2);
1282     }
1283#v-
1284   to determine whether or not autosave has been turned on for the
1285   buffer.
1286
1287   The \var{shl} operator may be used to form the integer with only
1288   the \em{nth} bit set.  For example, \exmp{1 shl 6} produces an
1289   integer with all bits set to zero except the sixth bit, which is
1290   set to one.  The following example exploits this fact:
1291#v+
1292     define test_nth_bit (flags, nth)
1293     {
1294        return flags & (1 shl nth);
1295     }
1296#v-
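   In the same spirit, companion functions for setting, clearing,
   and flipping a bit may be sketched as follows.  These functions
   are illustrative and are not part of the library:
#v+
     define set_nth_bit (flags, nth)
     {
        return flags | (1 shl nth);
     }

     define clear_nth_bit (flags, nth)
     {
        return flags & ~(1 shl nth);
     }

     define flip_nth_bit (flags, nth)
     {
        return flags xor (1 shl nth);
     }
#v-
   For example, \exmp{set_nth_bit (0, 6)} returns \exmp{64}.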
1297
1298#%}}}
1299
1300\sect3{Namespace operator}
   The operator \var{->} is used in conjunction with the name of a
1302   namespace to access an object within the namespace.  For example,
1303   if \exmp{A} is the name of a namespace containing the variable
1304   \var{v}, then \exmp{A->v} refers to that variable.
1305
1306\sect3{Operator Precedence}
1307
1308\sect3{Binary Operators and Functions Returning Multiple Values} #%{{{
   Care must be exercised when using binary operators with an operand
   that returns multiple values.  In fact, the current implementation
   of the \slang language will produce incorrect results if both
   operands of a binary expression return multiple values.  \em{At
   most, only one of the operands of a binary expression can return
   multiple values, and that operand must be the first one, not the
   second.}  For example,
1316#v+
1317    define read_line (fp)
1318    {
1319       variable line, status;
1320
1321       status = fgets (&line, fp);
1322       if (status == -1)
1323         return -1;
1324       return (line, status);
1325    }
1326#v-
   defines a function, \var{read_line}, that takes a single argument, a
1328   handle to an open file, and returns one or two values, depending
1329   upon the return value of \var{fgets}.  Now consider
1330#v+
1331        while (read_line (fp) > 0)
1332          {
1333             text = ();
1334             % Do something with text
1335             .
1336             .
1337          }
1338#v-
1339   Here the relational binary operator \var{>} forms a comparison
1340   between one of the return values (the one at the top of the stack)
1341   and \var{0}.  In accordance with the above rule, since \var{read_line}
1342   returns multiple values, it occurs as the left binary operand.
1343   Putting it on the right as in
1344#v+
1345        while (0 < read_line (fp))    % Incorrect
1346          {
1347             text = ();
1348             % Do something with text
1349             .
1350             .
1351          }
1352#v-
1353   violates the rule and will result in the wrong answer.
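
   One way to sidestep the issue is to return the text through a
   reference, so that the function returns exactly one value.  The
   following is a sketch under that assumption; \exmp{read_line_to}
   and \exmp{count_lines} are hypothetical names:
#v+
    define read_line_to (fp, line_ref)
    {
       variable line, status;

       status = fgets (&line, fp);
       if (status == -1)
         return -1;
       @line_ref = line;          % hand the text back to the caller
       return status;
    }

    define count_lines (fp)
    {
       variable text, n = 0;

       while (read_line_to (fp, &text) > 0)
         n++;                     % or do something with text
       return n;
    }
#v-
   Since \exmp{read_line_to} always returns a single value, it may
   appear on either side of a binary operator.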
1354
1355#%}}}
1356
1357#%}}}
1358
1359\sect2{Mixing Integer and Floating Point Arithmetic}
1360
   If a binary operation (\var{+}, \var{-}, \var{*}, \var{/}) is
1362   performed on two integers, the result is an integer.  If at least
1363   one of the operands is a float, the other is converted to float and
1364   the result is float.  For example:
1365#v+
1366      11 / 2           --> 5   (integer)
1367      11 / 2.0         --> 5.5 (float)
1368      11.0 / 2         --> 5.5 (float)
1369      11.0 / 2.0       --> 5.5 (float)
1370#v-
   Finally, note that only integers may be used as array indices,
   as loop control variables, and in bit operations.  The conversion
   functions, \var{int} and \var{float}, may be used to convert between
   floats and ints where appropriate, e.g.,
1375#v+
1376      int (1.5)         --> 1 (integer)
1377      float(1.5)        --> 1.5 (float)
1378      float (1)         --> 1.0 (float)
1379#v-
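   For instance, a floating point result must be converted before it
   can serve as an array index.  A minimal sketch:
#v+
      variable a = [10, 20, 30];
      variable i = int (5 / 2.0);     % 5 / 2.0 is 2.5; int gives 2
      variable v = a[i];              % 30
#v-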
1380
1381\sect2{Short Circuit Boolean Evaluation}
1382
1383   The boolean operators \var{or} and \var{and} \em{are not short
1384   circuited} as they are in some languages.  \slang uses
1385   \var{orelse} and \var{andelse} expressions for short circuit boolean
1386   evaluation.  However, these are not binary operators. Expressions
1387   of the form:
1388\begin{tscreen}
1389        \em{expr-1} and \em{expr-2} and ... \em{expr-n}
1390\end{tscreen}
1391   can be replaced by the short circuited version using \var{andelse}:
1392\begin{tscreen}
1393        andelse {\em{expr-1}} {\em{expr-2}} ... {\em{expr-n}}
1394\end{tscreen}
   A similar syntax holds for \var{orelse}.  For example, consider
1396   the statement:
1397#v+
1398      if ((x != 0) and (1/x > 10)) do_something ();
1399#v-
   Here, if \var{x} were to have a value of zero, a division by zero error
   would occur: even though \var{x!=0} evaluates to zero, the
   \var{and} operator is not short circuited, so the \var{1/x} expression
   would still be evaluated.  For this case, the
   \var{andelse} expression could be used to avoid the problem:
1405#v+
1406      if (andelse
1407          {x != 0}
1408          {1 / x > 10})  do_something ();
1409#v-
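   A corresponding sketch for \var{orelse}, in which the second test
   is skipped whenever the first one succeeds (the function name
   \exmp{check_arg} is hypothetical):
#v+
      define check_arg (x)
      {
         if (orelse
             {x == NULL}
             {x < 0})  error ("x must be a non-negative number");
         return x;
      }
#v-
   Here \exmp{x < 0} is never evaluated when \exmp{x} is \var{NULL}.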
1410
1411#%}}}
1412
1413\sect1{Statements} #%{{{
1414
1415   Loosely speaking, a \em{statement} is composed of \em{expressions}
1416   that are grouped according to the syntax or grammar of the language
1417   to express a complete computation.  Statements are analogous to
1418   sentences in a human language and expressions are like phrases.
1419   All statements in the \slang language must end in a semi-colon.
1420
1421   A statement that occurs within a function is executed only during
1422   execution of the function.  However, statements that occur outside
1423   the context of a function are evaluated immediately.
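
   The difference may be seen in the following sketch, where
   \exmp{greet} is a hypothetical function:
#v+
      message ("evaluated as soon as it is read");

      define greet ()
      {
         message ("evaluated only when greet is called");
      }

      greet ();
#v-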
1424
1425   The language supports several different types of statements such as
1426   assignment statements, conditional statements, and so forth.  These
1427   are described in detail in the following sections.
1428
1429\sect2{Variable Declaration Statements}
1430   Variable declarations were already discussed in chapter ???.  For
1431   the sake of completeness, a variable declaration is a statement of
1432   the form
1433\begin{tscreen}
1434     variable \em{variable-declaration-list} ;
1435\end{tscreen}
1436   where the \em{variable-declaration-list} is a comma separated list
1437   of one or more variable names with optional initializations, e.g.,
1438#v+
1439     variable x, y = 2, z;
1440#v-
1441\sect2{Assignment Statements} #%{{{
1442
1443   Perhaps the most well known form of statement is the \em{assignment
1444   statement}.  Statements of this type consist of a left-hand side,
1445   an assignment operator, and a right-hand side.  The left-hand side
1446   must be something to which an assignment can be performed.  Such
1447   an object is called an \em{lvalue}.
1448
1449   The most common assignment operator is the simple assignment
   operator \var{=}.  Simple examples of its use include
1451#v+
1452      x = 3;
1453      x = some_function (10);
1454      x = 34 + 27/y + some_function (z);
1455      x = x + 3;
1456#v-
1457   In addition to the simple assignment operator, \slang
1458   also supports the assignment operators \var{+=} and \var{-=}.
1459   Internally, \slang transforms
1460#v+
1461       a += b;
1462#v-
1463   to
1464#v+
1465       a = a + b;
1466#v-
1467   Similarly, \exmp{a -= b} is transformed to \exmp{a = a - b}.  It is
1468   extremely important to realize that, in general, \exmp{a+b} is not
1469   equal to \exmp{b+a}.  This means that \exmp{a+=b} is not the same
1470   as \exmp{a=b+a}.  As an example consider
1471#v+
1472      a = "hello"; a += "world";
1473#v-
1474   After execution of these two statements, \var{a} will have the
1475   value \exmp{"helloworld"} and not \exmp{"worldhello"}.
1476
1477   Since adding or subtracting \exmp{1} from a variable is quite
1478   common, \slang also supports the unary increment and decrement
1479   operators \exmp{++}, and \exmp{--}, respectively.  That is, for
1480   numeric data types,
1481#v+
1482       x = x + 1;
1483       x += 1;
1484       x++;
1485#v-
1486   are all equivalent.  Similarly,
1487#v+
1488       x = x - 1;
1489       x -= 1;
1490       x--;
1491#v-
1492   are also equivalent.
1493
1494   Strictly speaking, \var{++} and \var{--} are unary operators.  When
1495   used as \var{x++}, the \var{++} operator is said to be a
1496   \em{postfix-unary} operator.  However, when used as \var{++x} it is
1497   said to be a \em{prefix-unary} operator.  The current
1498   implementation does not distinguish between the two forms, thus
1499   \var{x++} and \var{++x} are equivalent.  The reason for this
1500   equivalence is \em{that assignment expressions do not return a value in
1501   the \slang language} as they do in C.  Thus one should exercise care
1502   and not try to write C-like code such as
1503#v+
1504      x = 10;
1505      while (--x) do_something (x);     % Ok in C, but not in S-Lang
1506#v-
1507   The closest valid \slang form involves a \em{comma-expression}:
1508#v+
1509      x = 10;
1510      while (x--, x) do_something (x);  % Ok in S-Lang and in C
1511#v-
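   The same loop may also be written without relying on the value of
   an expression that modifies \exmp{x}.  A sketch, assuming as above
   that \exmp{do_something} is defined elsewhere:
#v+
      x = 10;
      forever
        {
           x--;
           if (x == 0) break;
           do_something (x);
        }
#v-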
1512
1513   \slang also supports a \em{multiple-assignment} statement.  It is
1514   discussed in detail in section ???.
1515
1516#%}}}
1517
1518\sect2{Conditional and Looping Statements} #%{{{
1519
1520  \slang supports a wide variety of conditional and looping
1521  statements.  These constructs operate on statements grouped together
1522  in \em{blocks}.  A block is a sequence of \slang statements enclosed
1523  in braces and may contain other blocks. However, a block cannot
1524  include function declarations.  In the following,
1525  \em{statement-or-block} refers to either a single
1526  \slang statement or to a block of statements, and
1527  \em{integer-expression} is an integer-valued expression.
1528  \em{next-statement} represents the statement following the form
1529   under discussion.
1530
1531\sect3{Conditional Forms} #%{{{
1532\sect4{if}
   The simplest conditional statement is the \kw{if} statement.  It
1534   follows the syntax
1535\begin{tscreen}
1536        if (\em{integer-expression}) \em{statement-or-block}
1537        \em{next-statement}
1538\end{tscreen}
1539   If \em{integer-expression} evaluates to a non-zero result, then the
   statement or group of statements implied by \em{statement-or-block}
1541   will get executed.  Otherwise, control will proceed to
1542   \em{next-statement}.
1543
1544   An example of the use of this type of conditional statement is
1545#v+
1546       if (x != 0)
1547         {
1548            y = 1.0 / x;
1549            if (x > 0) z = log (x);
1550         }
1551#v-
1552   This example illustrates two \var{if} statements where the second
1553   \var{if} statement is part of the block of statements that belong to
1554   the first.
1555
1556\sect4{if-else}
1557   Another form of \kw{if} statement is the \em{if-else} statement.
1558   It follows the syntax:
1559\begin{tscreen}
1560      if (\em{integer-expression}) \em{statement-or-block-1}
1561      else \em{statement-or-block-2}
1562      \em{next-statement}
1563\end{tscreen}
   Here, if \em{integer-expression} returns non-zero,
   \em{statement-or-block-1} will get executed and control will pass
   on to \em{next-statement}. However, if \em{integer-expression} returns zero,
1567   \em{statement-or-block-2} will get executed before continuing with
1568   \em{next-statement}.  A simple example of this form is
1569#v+
1570     if (x > 0) z = log (x); else error ("x must be positive");
1571#v-
1572   Consider the more complex example:
1573#v+
1574     if (city == "Boston")
1575       if (street == "Beacon") found = 1;
1576     else if (city == "Madrid")
1577       if (street == "Calle Mayor") found = 1;
1578     else found = 0;
1579#v-
   This example illustrates a problem that beginners have with
   \em{if-else} statements.  Because an \kw{else} always pairs with
   the nearest preceding unmatched \kw{if}, the grammar presented
   above shows that this example is equivalent to
1583#v+
1584     if (city == "Boston")
1585       {
1586         if (street == "Beacon") found = 1;
1587         else if (city == "Madrid")
1588           {
1589             if (street == "Calle Mayor") found = 1;
1590             else found = 0;
1591           }
1592       }
1593#v-
1594   It is important to understand the grammar and not be seduced by the
1595   indentation!
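
   If the pairing suggested by the indentation is what is actually
   wanted, braces can be used to force it.  The following is a sketch
   of one way to do so, reusing the variables from the example above:
#v+
     if (city == "Boston")
       {
          if (street == "Beacon") found = 1;
       }
     else if (city == "Madrid")
       {
          if (street == "Calle Mayor") found = 1;
       }
     else found = 0;
#v-
   Here each \kw{else} can only pair with one of the outer \kw{if}
   statements because the inner ones are enclosed in their own blocks.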
1596
1597\sect4{!if}
1598
1599   One often encounters \kw{if} statements similar to
1600\begin{tscreen}
1601     if (\em{integer-expression} == 0) \em{statement-or-block}
1602\end{tscreen}
1603   or equivalently,
1604\begin{tscreen}
1605     if (not(\em{integer-expression})) \em{statement-or-block}
1606\end{tscreen}
1607   The \kw{!if} statement was added to the language to simplify the
1608   handling of such statements.  It obeys the syntax
1609\begin{tscreen}
1610     !if (\em{integer-expression}) \em{statement-or-block}
1611\end{tscreen}
1612   and is functionally equivalent to
1613\begin{tscreen}
     if (not (\em{integer-expression})) \em{statement-or-block}
1615\end{tscreen}
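
   As a brief illustration, a guard such as the following reads
   naturally as `if there are no items'.  The variable \var{n_items}
   is hypothetical and is assumed to hold an integer:
#v+
     !if (n_items) message ("The list is empty.");
#v-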
1616
1617\sect4{orelse, andelse}
1618
1619  These constructs were discussed earlier.  The syntax for the
1620  \var{orelse} statement is:
1621\begin{tscreen}
1622     orelse {\em{integer-expression-1}} ... {\em{integer-expression-n}}
1623\end{tscreen}
1624  This causes each of the blocks to be executed in turn until one of
1625  them returns a non-zero integer value.  The result of this statement
1626  is the integer value returned by the last block executed.  For
1627  example,
1628#v+
1629     orelse { 0 } { 6 } { 2 } { 3 }
1630#v-
  returns \var{6} since the second block is the first to return a
  non-zero result.  The last two blocks will not get executed.
1633
1634  The syntax for the \var{andelse} statement is:
1635\begin{tscreen}
1636     andelse {\em{integer-expression-1}} ... {\em{integer-expression-n}}
1637\end{tscreen}
1638  Each of the blocks will be executed in turn until one of
1639  them returns a zero value.  The result of this statement is the
1640  integer value returned by the last block executed.  For example,
1641#v+
1642     andelse { 6 } { 2 } { 0 } { 4 }
1643#v-
1644  returns \var{0} since the third block will be the last to execute.
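
  Because evaluation stops as soon as the result is known, these
  forms can be used to guard a later test with an earlier one.  As a
  sketch, assuming \var{x} is a numeric variable and \var{big} is a
  variable introduced here only for illustration:
#v+
     if (andelse {x != 0} {1.0 / x > 2.0})
       big = 1;
     else
       big = 0;
#v-
  The division in the second block is never attempted when \var{x} is
  zero, since the first block has already returned zero.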
1645
1646\sect4{switch}
1647  The switch statement deviates the most from its C counterpart.  The
1648  syntax is:
1649#v+
1650          switch (x)
1651            { ...  :  ...}
1652              .
1653              .
1654            { ...  :  ...}
1655#v-
1656   The `\var{:}' operator is a special symbol which means to test
1657   the top item on the stack, and if it is non-zero, the rest of the block
1658   will get executed and control will pass out of the switch statement.
1659   Otherwise, the execution of the block will be terminated and the process
1660   will be repeated for the next block.  If a block contains no
   \var{:} operator, the entire block is executed and control will
   pass on to the next statement following the \kw{switch} statement.
1663   Such a block is known as the \em{default} case.
1664
1665   As a simple example, consider the following:
1666#v+
1667      switch (x)
1668        { x == 1 : message("Number is one.");}
1669        { x == 2 : message("Number is two.");}
1670        { x == 3 : message("Number is three.");}
1671        { x == 4 : message("Number is four.");}
1672        { x == 5 : message("Number is five.");}
1673        { message ("Number is greater than five.");}
1674#v-
1675   Suppose \var{x} has an integer value of \exmp{3}.  The first two
1676   blocks will terminate at the `\var{:}' character because each of the
1677   comparisons with \var{x} will produce zero.  However, the third
1678   block will execute to completion.  Similarly, if \var{x} is
1679   \exmp{7}, only the last block will execute in full.
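
   In other words, for an integer \var{x} this \kw{switch} statement
   behaves much like the following chain of \em{if-else} statements.
   This is only a sketch for comparison and does not describe how the
   interpreter implements \kw{switch}:
#v+
      if (x == 1) message ("Number is one.");
      else if (x == 2) message ("Number is two.");
      else if (x == 3) message ("Number is three.");
      else if (x == 4) message ("Number is four.");
      else if (x == 5) message ("Number is five.");
      else message ("Number is greater than five.");
#v-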
1680
   A more familiar way to write the previous example uses the
   \kw{case} keyword:
1683#v+
1684      switch (x)
1685        { case 1 : print("Number is one.");}
1686        { case 2 : print("Number is two.");}
1687        { case 3 : print("Number is three.");}
1688        { case 4 : print("Number is four.");}
1689        { case 5 : print("Number is five.");}
1690        { print ("Number is greater than five.");}
1691#v-
1692   The \var{case} keyword is a more useful comparison operator because
1693   it can perform a comparison between different data types while
1694   using \var{==} may result in a type-mismatch error.  For example,
1695#v+
1696      switch (x)
1697        { (x == 1) or (x == "one") : print("Number is one.");}
1698        { (x == 2) or (x == "two") : print("Number is two.");}
1699        { (x == 3) or (x == "three") : print("Number is three.");}
1700        { (x == 4) or (x == "four") : print("Number is four.");}
1701        { (x == 5) or (x == "five") : print("Number is five.");}
1702        { print ("Number is greater than five.");}
1703#v-
  will fail because the \var{==} operation is not defined between
  strings and integers.  The correct way to write this is to use the
  \var{case} keyword:
1707#v+
1708      switch (x)
1709        { case 1 or case "one" : print("Number is one.");}
1710        { case 2 or case "two" : print("Number is two.");}
1711        { case 3 or case "three" : print("Number is three.");}
1712        { case 4 or case "four" : print("Number is four.");}
1713        { case 5 or case "five" : print("Number is five.");}
1714        { print ("Number is greater than five.");}
1715#v-
1716
1717#%}}}
1718
1719\sect3{Looping Forms} #%{{{
1720
1721\sect4{while}
1722   The \kw{while} statement follows the syntax
1723\begin{tscreen}
1724      while (\em{integer-expression}) \em{statement-or-block}
1725      \em{next-statement}
1726\end{tscreen}
1727   It simply causes \em{statement-or-block} to get executed as long as
1728   \em{integer-expression} evaluates to a non-zero result.  For
1729   example,
1730#v+
1731      i = 10;
1732      while (i)
1733        {
1734          i--;
1735          newline ();
1736        }
1737#v-
1738   will cause the \var{newline} function to get called 10 times.
1739   However,
1740#v+
1741      i = -10;
1742      while (i)
1743        {
1744          i--;
1745          newline ();
1746        }
1747#v-
1748   would loop forever (or until \var{i} wraps from the most negative
1749   integer value to the most positive and then decrements to zero).
1750
1751
1752   If you are a C programmer, do not let the syntax of the language
1753   seduce you into writing this example as you would in C:
1754#v+
1755      i = 10;
1756      while (i--) newline ();
1757#v-
1758   The fact is that expressions such as \var{i--} do not return a
1759   value in \slang as they do in C.  If you must write this way, use
1760   the comma operator as in
1761#v+
1762      i = 10;
1763      while (i, i--) newline ();
1764#v-
1765
1766\sect4{do...while}
1767   The \kw{do...while} statement follows the syntax
1768\begin{tscreen}
1769      do
1770         \em{statement-or-block}
1771      while (\em{integer-expression});
1772\end{tscreen}
1773   The main difference between this statement and the \var{while}
1774   statement is that the \kw{do...while} form performs the test
1775   involving \em{integer-expression} after each execution
1776   of \em{statement-or-block} rather than before.  This guarantees that
1777   \em{statement-or-block} will get executed at least once.
1778
1779   A simple example from the \jed editor follows:
1780#v+
1781     bob ();      % Move to beginning of buffer
1782     do
1783       {
1784          indent_line ();
1785       }
1786     while (down (1));
1787#v-
1788   This will cause all lines in the buffer to get indented via the
1789   \jed intrinsic function \var{indent_line}.
1790
1791\sect4{for}
1792   Perhaps the most complex looping statement is the \kw{for}
1793   statement; nevertheless, it is a favorite of many programmers.
1794   This statement obeys the syntax
1795\begin{tscreen}
1796    for (\em{init-expression}; \em{integer-expression}; \em{end-expression})
1797      \em{statement-or-block}
1798    \em{next-statement}
1799\end{tscreen}
1800   In addition to \em{statement-or-block}, its specification requires
1801   three other expressions.  When executed, the \kw{for} statement
1802   evaluates \em{init-expression}, then it tests
1803   \em{integer-expression}.  If \em{integer-expression} returns zero,
1804   control passes to \em{next-statement}.  Otherwise, it executes
1805   \em{statement-or-block} as long as \em{integer-expression}
1806   evaluates to a non-zero result.  After every execution of
1807   \em{statement-or-block}, \em{end-expression} will get evaluated.
1808
1809   This statement is \em{almost} equivalent to
1810\begin{tscreen}
1811    \em{init-expression};
1812    while (\em{integer-expression})
1813      {
1814         \em{statement-or-block}
1815         \em{end-expression};
1816      }
1817\end{tscreen}
1818   The reason that they are not fully equivalent involves what happens
1819   when \em{statement-or-block} contains a \kw{continue} statement.
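
   The difference can be sketched as follows, under the assumption
   that a \kw{continue} inside a \kw{for} loop still causes
   \em{end-expression} to be evaluated before the next test.  The
   variables \var{i} and \var{count} are used only for illustration:
#v+
     variable i, count = 0;
     for (i = 0; i < 5; i++)
       {
          if (i == 2) continue;   % i++ is still evaluated before the next test
          count++;
       }
     % count is now 4

     i = 0; count = 0;
     while (i < 5)
       {
          if (i == 2)
            {
               i++;               % without this, i would stay at 2 forever
               continue;
            }
          count++;
          i++;
       }
#v-
   In the \kw{while} form, a \kw{continue} jumps straight back to the
   test, so the increment has to be repeated by hand before it.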
1820
1821   Despite the apparent complexity of the \kw{for} statement, it is
1822   very easy to use.  As an example, consider
1823#v+
1824     sum = 0;
1825     for (i = 1; i <= 10; i++) sum += i;
1826#v-
   which computes the sum of the integers from 1 to 10.
1828
1829\sect4{loop}
1830   The \kw{loop} statement simply executes a block of code a fixed
1831   number of times.  It follows the syntax
1832\begin{tscreen}
1833      loop (\em{integer-expression}) \em{statement-or-block}
1834      \em{next-statement}
1835\end{tscreen}
1836   If the \em{integer-expression} evaluates to a positive integer,
1837   \em{statement-or-block} will get executed that many times.
1838   Otherwise, control will pass to \em{next-statement}.
1839
1840   For example,
1841#v+
1842      loop (10) newline ();
1843#v-
1844   will cause the function \var{newline} to get called 10 times.
1845
1846\sect4{forever}
1847   The \kw{forever} statement is similar to the \kw{loop} statement
1848   except that it loops forever, or until a \kw{break} or a
1849   \kw{return} statement is executed.  It obeys the syntax
1850\begin{tscreen}
1851     forever \em{statement-or-block}
1852\end{tscreen}
1853   A trivial example of this statement is
1854#v+
1855     n = 10;
1856     forever
1857       {
1858          if (n == 0) break;
1859          newline ();
1860          n--;
1861       }
1862#v-
1863
1864\sect4{foreach}
1865   The \kw{foreach} statement is used to loop over one or more
1866   statements for every element in a container object.  A container
1867   object is a data type that consists of other types.  Examples
1868   include both ordinary and associative arrays, structures, and
1869   strings.  Every time through the loop the current member of the
1870   object is pushed onto the stack.
1871
1872   The simple type of \kw{foreach} statement obeys the syntax
1873\begin{tscreen}
1874     foreach (\em{container-object}) \em{statement-or-block}
1875\end{tscreen}
1876   Here \em{container-object} can be an expression that returns a
1877   container object.  A simple example is
1878#v+
1879     foreach (["apple", "peach", "pear"])
1880      {
1881         fruit = ();
1882         process_fruit (fruit);
1883      }
1884#v-
1885   This example shows that if the container object is an array, then
1886   successive elements of the array are pushed onto the stack prior to
1887   each execution cycle.  If the container object is a string, then
1888   successive characters of the string are pushed onto the stack.
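
   As a small sketch of the string case, the following loop counts the
   commas in a string.  It assumes that each character arrives as a
   value that can be compared with a character constant; \var{ch} and
   \var{n_commas} are names chosen here for illustration:
#v+
     variable ch, n_commas = 0;
     foreach ("a,b,c")
       {
          ch = ();
          if (ch == ',') n_commas++;
       }
#v-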
1889
1890   What actually gets pushed onto the stack may be controlled via the
1891   \kw{using} form of the \kw{foreach} statement.  This more complex
1892   type of \kw{foreach} statement follows the syntax
1893\begin{tscreen}
1894     foreach ( \em{container-object} ) using ( \em{control-list} )
1895       \em{statement-or-block}
1896\end{tscreen}
1897   The allowed values of \em{control-list} will depend upon the type
1898   of container object.  For associative arrays (\var{Assoc_Type}),
   \em{control-list} specifies whether \em{keys}, \em{values}, or both
1900   are pushed onto the stack.  For example,
1901#v+
1902     foreach (a) using ("keys")
1903       {
1904          k = ();
1905           .
1906           .
1907       }
1908#v-
   results in the keys of the associative array \var{a} being pushed
   onto the stack.  However,
1911#v+
1912     foreach (a) using ("values")
1913       {
1914          v = ();
1915           .
1916           .
1917       }
1918#v-
1919   will cause the values to be used, and
1920#v+
1921     foreach (a) using ("keys", "values")
1922       {
1923          (k,v) = ();
1924           .
1925           .
1926       }
1927#v-
1928  will use both the keys and values of the array.
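
  To make this concrete, the following sketch builds a small
  associative array and sums its values.  The array contents and the
  variable \var{total} are invented for the example:
#v+
     variable a = Assoc_Type [Integer_Type];
     a["apple"] = 3;
     a["pear"] = 5;

     variable k, v, total = 0;
     foreach (a) using ("keys", "values")
       {
          (k, v) = ();
          total += v;      % k holds the key that goes with v
       }
#v-
  After the loop, \var{total} holds \exmp{8}, the sum of the two
  values stored in the array.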
1929
1930  Similarly, for linked-lists of structures, one may walk the list via
1931  code like
1932#v+
1933     foreach (linked_list) using ("next")
1934       {
1935          s = ();
1936            .
1937            .
1938       }
1939#v-
  This \kw{foreach} statement is equivalent to
1941#v+
1942     s = linked_list;
1943     while (s != NULL)
1944       {
1945          .
1946          .
1947         s = s.next;
1948       }
1949#v-
1950  Consult the type-specific documentation for a discussion of the
1951  \kw{using} control words, if any, appropriate for a given type.
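
  A complete, if artificial, linked-list example might look like the
  following sketch.  The helper \var{make_node} and the field name
  \var{value} are hypothetical; only the \var{next} field is required
  by the \kw{using} form shown above:
#v+
     define make_node (value, next)
     {
        variable s = struct { value, next };
        s.value = value;
        s.next = next;
        return s;
     }

     variable linked_list = make_node (1, make_node (2, make_node (3, NULL)));
     variable s, total = 0;
     foreach (linked_list) using ("next")
       {
          s = ();
          total += s.value;
       }
#v-
  The loop visits the three nodes in order and stops when the
  \var{next} field is \var{NULL}, leaving \exmp{6} in \var{total}.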
1952
1953\sect2{break, return, continue}
1954
   \slang also includes the non-local transfer statements \var{return},
   \var{break}, and \var{continue}.  The \var{return} statement causes
   control to return to the calling function, while the \var{break} and
   \var{continue} statements are used in the context of loop structures.
   Consider:
1959#v+
1960       define fun ()
1961       {
1962          forever
1963            {
1964               s1;
1965               s2;
1966               ..
1967               if (condition_1) break;
1968               if (condition_2) return;
1969               if (condition_3) continue;
1970               ..
1971               s3;
1972            }
1973          s4;
1974          ..
1975       }
1976#v-
1977   Here, a function \var{fun} has been defined that contains a \var{forever}
1978   loop consisting of statements \var{s1}, \var{s2},\ldots,\var{s3}, and
1979   three \var{if} statements.  As long as the expressions \var{condition_1},
1980   \var{condition_2}, and \var{condition_3} evaluate to zero, the statements
1981   \var{s1}, \var{s2},\ldots,\var{s3} will be repeatedly executed.  However,
1982   if \var{condition_1} returns a non-zero value, the \var{break} statement
1983   will get executed, and control will pass out of the \var{forever} loop to
   the statement immediately following the loop, which in this case is
   \var{s4}.  Similarly, if \var{condition_2} returns a non-zero value,
   the \var{return} statement will cause control to pass back to the
   caller of \var{fun}.  Finally, if \var{condition_3} returns a
   non-zero value, the \var{continue} statement will cause control to
   pass back to the start of the loop, skipping the statement \var{s3}
   altogether.
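
   A more concrete sketch of the same ideas follows.  The function
   \var{sum_until_zero} is hypothetical; it adds up the positive
   elements of an array, skipping negative ones and stopping at the
   first zero:
#v+
       define sum_until_zero (a)
       {
          variable i, total = 0;
          for (i = 0; i < length (a); i++)
            {
               if (a[i] == 0) break;      % leave the loop entirely
               if (a[i] < 0) continue;    % skip to the next element
               total += a[i];
            }
          return total;                   % hand the result to the caller
       }
       variable t = sum_until_zero ([3, -1, 4, 0, 5]);
#v-
   For this array, \var{t} is \exmp{7}: the \exmp{-1} is skipped by
   \var{continue} and the \exmp{0} triggers the \var{break} before the
   \exmp{5} is reached.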
1990
1991
1992#%}}}
1993
1994#%}}}
1995
1996#%}}}
1997
1998\sect1{Functions} #%{{{
1999
2000   A function may be thought of as a group of statements that work
2001   together to perform a computation.  While there are no imposed
   limits upon the number of statements that may occur within a function,
   it is considered poor programming practice if a function contains
   many statements.  This notion stems from the belief that a function
   should have a simple, well-defined purpose.
2006
2007\sect2{Declaring Functions} #%{{{
2008
2009   Like variables, functions must be declared before they can be used. The
2010   \kw{define} keyword is used for this purpose.  For example,
2011#v+
2012      define factorial ();
2013#v-
2014   is sufficient to declare a function named \var{factorial}.  Unlike
2015   the \var{variable} keyword used for declaring variables, the
2016   \var{define} keyword does not accept a list of names.
2017
   Usually, the above form is used only for recursive functions.  In
   most cases, the function name is followed by a parameter list and
   the body of the function:
2021\begin{tscreen}
2022      define \em{function-name} (\em{parameter-list})
2023      {
2024         \em{statement-list}
2025      }
2026\end{tscreen}
2027   The \em{function-name} is an identifier and must conform to the
2028   naming scheme for identifiers discussed in chapter ???.
2029   The \em{parameter-list} is a comma-separated list of variable names
2030   that represent parameters passed to the function, and
2031   may be empty if no parameters are to be passed.
2032   The body of the function is enclosed in braces and consists of zero
2033   or more statements (\em{statement-list}).
2034
   The variables in the \em{parameter-list} are implicitly declared;
   thus, there is no need to declare them via a variable declaration
   statement.  In fact, any attempt to do so will result in a syntax
   error.
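
   Putting the two forms together, a recursive version of the
   \var{factorial} function declared earlier might be sketched as
   follows.  The body is an illustration only:
#v+
      define factorial ();      % declare the name so the body may call it

      define factorial (n)
      {
         if (n < 2) return 1;
         return n * factorial (n - 1);
      }
#v-
   Note that the parameter \var{n} is used without any \var{variable}
   declaration, in keeping with the rule just described.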
2039
2040#%}}}
2041
2042\sect2{Parameter Passing Mechanism} #%{{{
2043
2044   Parameters to a function are always passed by value and never by
2045   reference.  To see what this means, consider
2046#v+
2047     define add_10 (a)
2048     {
2049        a = a + 10;
2050     }
2051     variable b = 0;
2052     add_10 (b);
2053#v-
2054   Here a function \var{add_10} has been defined, which when executed,
2055   adds \exmp{10} to its parameter.  A variable \var{b} has also been
2056   declared and initialized to zero before it is passed to
2057   \var{add_10}.  What will be the value of \var{b} after the call to
2058   \var{add_10}?  If \slang were a language that passed parameters by
2059   reference, the value of \var{b} would be changed to
2060   \var{10}.  However, \slang always passes by value, which means that
2061   \var{b} would retain its value of zero after the function call.
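
   Since the caller's variable is left untouched, one conventional way
   to obtain the new value is to return it and let the caller perform
   the assignment.  The function name \var{plus_10} below is
   hypothetical:
#v+
     define plus_10 (a)
     {
        return a + 10;
     }
     variable b = 0;
     b = plus_10 (b);      % b is now 10
#v-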
2062
2063   \slang does provide a mechanism for simulating pass by reference
2064   via the reference operator.  See the next section for more details.
2065
2066   If a function is called with a parameter in the parameter list
2067   omitted, the corresponding variable in the function will be set to
2068   \var{NULL}.  To make this clear, consider the function
2069#v+
2070     define add_two_numbers (a, b)
2071     {
2072        if (a == NULL) a = 0;
2073        if (b == NULL) b = 0;
2074        return a + b;
2075     }
2076#v-
2077   This function must be called with two parameters.  However, we can
2078   omit one or both of the parameters by calling it in one of the
2079   following ways:
2080#v+
2081     variable s = add_two_numbers (2,3);
2082     variable s = add_two_numbers (2,);
2083     variable s = add_two_numbers (,3);
2084     variable s = add_two_numbers (,);
2085#v-
2086   The first example calls the function using both parameters;
2087   however, at least one of the parameters was omitted in the other
2088   examples.  The interpreter will implicitly convert the last three
2089   examples to
2090#v+
2091     variable s = add_two_numbers (2, NULL);
2092     variable s = add_two_numbers (NULL, 3);
2093     variable s = add_two_numbers (NULL, NULL);
2094#v-
2095   It is important to note that this mechanism is available only for
2096   function calls that specify more than one parameter.  That is,
2097#v+
2098     variable s = add_10 ();
2099#v-
2100  is \em{not} equivalent to \exmp{add_10(NULL)}.  The reason for this
2101  is simple: the parser can only tell whether or not \var{NULL} should
2102  be substituted by looking at the position of the comma character in
2103  the parameter list, and only function calls that indicate more than
2104  one parameter will use a comma.  A mechanism for handling single
2105  parameter function calls is described in the next section.
2106
2107#%}}}
2108
2109\sect2{Referencing Variables} #%{{{
2110
2111   One can achieve the effect of passing by reference by using the
2112   reference (\var{&}) and dereference (\var{@}) operators. Consider
2113   again the \var{add_10} function presented in the previous section.
2114   This time we write it as
2115#v+
2116     define add_10 (a)
2117     {
2118        @a = @a + 10;
2119     }
2120     variable b = 0;
2121     add_10 (&b);
2122#v-
2123   The expression \var{&b} creates a \em{reference} to the variable
2124   \var{b} and it is the reference that gets passed to \var{add_10}.
2125   When the function \var{add_10} is called, the value of \var{a} will
2126   be a reference to \var{b}.  It is only by \em{dereferencing} this
   value that \var{b} can be accessed and changed.  So, the statement
   \exmp{@a=@a+10;} should be read `add \exmp{10} to the value of the
   object that \var{a} references and assign the result to the object
   that \var{a} references'.
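
   The same technique extends to more than one parameter.  As a
   sketch, a hypothetical \var{swap} function that exchanges the
   values of two variables could be written as:
#v+
     define swap (a, b)
     {
        variable tmp = @a;    % value of the first referenced variable
        @a = @b;
        @b = tmp;
     }
     variable x = 1, y = 2;
     swap (&x, &y);           % x is now 2 and y is now 1
#v-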
2131
2132   The reader familiar with C will note the similarity between
2133   \em{references} in \slang and \em{pointers} in C.
2134
   One of the main purposes for references is that this mechanism
   allows references to functions to be passed to other functions.  As
2137   a simple example from elementary calculus, consider the following
2138   function which returns an approximation to the derivative of another
2139   function at a specified point:
2140#v+
2141     define derivative (f, x)
2142     {
2143        variable h = 1e-6;
2144        return ((@f)(x+h) - (@f)(x)) / h;
2145     }
2146#v-
2147   It can be used to differentiate the function
2148#v+
2149     define x_squared (x)
2150     {
2151        return x^2;
2152     }
2153#v-
2154   at the point \exmp{x = 3} via the expression
2155   \exmp{derivative(&x_squared,3)}.
2156
2157
2158#%}}}
2159
2160\sect2{Functions with a Variable Number of Arguments} #%{{{
2161
2162  \slang functions may be defined to take a variable number of
2163  arguments.  The reason for this is that the calling routine pushes
2164  the arguments onto the stack before making a function call, and it
2165  is up to the called function to pop the values off the stack and
2166  make assignments to the variables in the parameter list.  These
2167  details are, for the most part, hidden from the programmer.
  However, they are important when a variable number of arguments is
  passed.
2170
2171  Consider the \var{add_10} example presented earlier.  This time it
2172  is written
2173#v+
2174     define add_10 ()
2175     {
2176        variable x;
2177        x = ();
2178        return x + 10;
2179     }
2180     variable s = add_10 (12);  % ==> s = 22;
2181#v-
2182  For the uninitiated, this example looks as if it
2183  is destined for disaster.  The \var{add_10} function looks like it
2184  accepts zero arguments, yet it was called with a single argument.
2185  On top of that, the assignment to \var{x} looks strange.  The truth
2186  is, the code presented in this example makes perfect sense, once you
2187  realize what is happening.
2188
  First, consider what happens when \var{add_10} is called with the
  parameter \exmp{12}.  Internally, \exmp{12} is
  pushed onto the stack and then the function is called.  Now,
2192  consider the function itself.  \var{x} is a variable local to the
2193  function.  The strange looking assignment `\exmp{x=()}' simply
2194  takes whatever is on the stack and assigns it to \var{x}.  In
2195  other words, after this statement, the value of \var{x} will be
2196  \exmp{12}, since \exmp{12} will be at the top of the stack.
2197
2198  A generic function of the form
2199#v+
2200    define function_name (x, y, ..., z)
2201    {
2202       .
2203       .
2204    }
2205#v-
2206  is internally transformed by the interpreter to
2207#v+
2208    define function_name ()
2209    {
2210       variable x, y, ..., z;
2211       z = ();
2212       .
2213       .
2214       y = ();
2215       x = ();
2216       .
2217       .
2218    }
2219#v-
2220  before further parsing.  (The \var{add_10} function, as defined above, is
2221  already in this form.)  With this knowledge in hand, one can write a
2222  function that accepts a variable number of arguments.  Consider the
2223  function:
2224#v+
2225    define average_n (n)
2226    {
2227       variable x, y;
2228       variable sum;
2229
2230       if (n == 1)
2231         {
2232            x = ();
2233            sum = x;
2234         }
2235       else if (n == 2)
2236         {
2237            y = ();
2238            x = ();
2239            sum = x + y;
2240         }
2241       else error ("average_n: only one or two values supported");
2242
2243       return sum / n;
2244   }
2245   variable ave1 = average_n (3.0, 1);        % ==> 3.0
2246   variable ave2 = average_n (3.0, 5.0, 2);   % ==> 4.0
2247#v-
2248  Here, the last argument passed to \var{average_n} is an integer
2249  reflecting the number of quantities to be averaged.  Although this
2250  example works fine, its principal limitation is obvious: it only
2251  supports one or two values.  Extending it to three or more values
2252  by adding more \exmp{else if} constructs is rather straightforward but
2253  hardly worth the effort.  There must be a better way, and there is:
2254#v+
2255   define average_n (n)
2256   {
2257      variable sum, x;
2258      sum = 0;
2259      loop (n)
2260        {
2261           x = ();    % get next value from stack
2262           sum += x;
2263        }
2264      return sum / n;
2265   }
2266#v-
2267  The principal limitation of this approach is that one must still
2268  pass an integer that specifies how many values are to be averaged.
2269
2270  Fortunately, a special variable exists that is local to every function
2271  and contains the number of values that were passed to the function.
2272  That variable has the name \var{_NARGS} and may be used as follows:
2273#v+
2274   define average_n ()
2275   {
2276      variable x, sum = 0;
2277
2278      if (_NARGS == 0) error ("Usage: ave = average_n (x, ...);");
2279
2280      loop (_NARGS)
2281        {
2282           x = ();
2283           sum += x;
2284        }
2285      return sum / _NARGS;
2286   }
2287#v-
2288  Here, if no arguments are passed to the function, a simple message
2289  that indicates how it is to be used is printed out.
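
  The same pattern works for other reductions.  As a sketch, a
  hypothetical \var{max_n} function that returns the largest of its
  arguments could pop the first value as a starting point and then
  loop over the remaining ones:
#v+
   define max_n ()
   {
      variable x, m;

      if (_NARGS == 0) error ("Usage: m = max_n (x, ...);");

      m = ();                  % the last argument passed
      loop (_NARGS - 1)
        {
           x = ();
           if (x > m) m = x;
        }
      return m;
   }
   variable biggest = max_n (3, 9, 4);
#v-
  Here \var{biggest} is \exmp{9}.  When only one argument is passed,
  the \kw{loop} count is zero and the body is never executed.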
2290
2291
2292#%}}}
2293
2294
2295\sect2{Returning Values}
2296
2297   As stated earlier, the usual way to return values from a function
2298   is via the \kw{return} statement.  This statement has the
2299   simple syntax
2300\begin{tscreen}
2301      return \em{expression-list} ;
2302\end{tscreen}
   where \em{expression-list} is a comma-separated list of expressions.
2304   If the function does not return any values, the expression list
2305   will be empty.  As an example of a function that can return
2306   multiple values, consider
2307#v+
2308        define sum_and_diff (x, y)
2309        {
2310            variable sum, diff;
2311
2312            sum = x + y;  diff = x - y;
2313            return sum, diff;
2314        }
2315#v-
2316   which is a function returning two values.
2317
2318   It is extremely important to note that \em{the calling routine must
2319   explicitly handle all values returned by a function}.  Although
2320   some languages such as C do not have this restriction, \slang does
2321   and it is a direct result of a \slang function's ability to return
2322   many values and accept a variable number of parameters.  Examples
2323   of properly handling the above function include
2324#v+
2325       variable sum, diff;
2326       (sum, diff) = sum_and_diff (5, 4);  % ignore neither
2327       (sum, ) = sum_and_diff (5, 4);      % ignore diff
2328       (,) = sum_and_diff (5, 4);          % ignore both sum and diff
2329#v-
2330   See the section below on assignment statements for more information
2331   about this important point.
2332
2333\sect2{Multiple Assignment Statement} #%{{{
2334
2335   \slang functions can return more than one value, e.g.,
2336#v+
2337       define sum_and_diff (x, y)
2338       {
2339          return x + y, x - y;
2340       }
2341#v-
2342   returns two values.  It accomplishes this by placing both values on
2343   the stack before returning.  If you understand how \slang functions
   handle a variable number of parameters (section ???), then it
   should be rather obvious how one assigns such values to variables.
2346   One way is to use, e.g.,
2347#v+
2348      sum_and_diff (9, 4);
2349      d = ();
2350      s = ();
2351#v-
2352
2353   However, the most convenient way to accomplish this is to use a
2354   \em{multiple assignment statement} such as
2355#v+
2356       (s, d) = sum_and_diff (9, 4);
2357#v-
2358   The most general form of the multiple assignment statement is
2359#v+
2360     ( var_1, var_2, ..., var_n ) = expression;
2361#v-
2362   In fact, internally the interpreter transforms this statement into
2363   the form
2364#v+
2365     expression; var_n = (); ... var_2 = (); var_1 = ();
2366#v-
2367   for further processing.
2368
   If you do not care about one of the return values, simply omit the
   variable name from the list.  For example,
2371#v+
2372        (s, ) = sum_and_diff (9, 4);
2373#v-
2374   assigns the sum of \exmp{9} and \exmp{4} to \var{s} and the
2375   difference (\exmp{9-4}) will be removed from the stack.
2376
2377   As another example, the \jed editor provides a function called
2378   \var{down} that takes an integer argument and returns an integer.
2379   It is used to move the current editing position down the number of
2380   lines specified by the argument passed to it.  It returns the number
2381   of lines it successfully moved the editing position.  Often one does
2382   not care about the return value from this function.  Although it is
2383   always possible to handle the return value via
2384#v+
2385       variable dummy = down (10);
2386#v-
2387   it is more convenient to use a multiple assignment expression and
2388   omit the variable name, e.g.,
2389#v+
2390       () = down (10);
2391#v-
2392
2393   Some functions return a \em{variable number} of values instead of a
2394   \em{fixed number}.  Usually, the value at the top of the stack will
2395   indicate the actual number of return values.  For such functions,
2396   the multiple assignment statement cannot directly be used.  To see
2397   how such functions can be dealt with, consider the following
2398   function:
2399#v+
2400     define read_line (fp)
2401     {
2402        variable line;
2403        if (-1 == fgets (&line, fp))
2404          return -1;
2405        return (line, 0);
2406     }
2407#v-
2408   This function returns either one or two values, depending upon the
2409   return value of \var{fgets}.  Such a function may be handled as in
2410   the following example:
2411#v+
2412      status = read_line (fp);
2413      if (status != -1)
2414        {
2415           s = ();
2416           .
2417           .
2418        }
2419#v-
   In this example, the \em{last} value returned by \var{read_line} is
   assigned to \var{status} and then tested.  If it is not \exmp{-1},
   the second return value is assigned to \var{s}.  In particular, note
   the empty set of parentheses in the assignment to \var{s}.  This
   simply indicates that whatever is on the top of the stack when the
   statement is executed will be assigned to \var{s}.
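
   The same idea can be placed in a loop.  The following sketch uses
   the \var{read_line} function defined above to count the lines
   available from \var{fp}; the name \var{count_lines} is hypothetical:
#v+
     define count_lines (fp)
     {
        variable status, line, n = 0;
        forever
          {
             status = read_line (fp);
             if (status == -1) break;
             line = ();        % the line text is still on the stack
             n++;
          }
        return n;
     }
#v-
   Each successful call leaves the line on the stack, so it is popped
   into \var{line} even though its contents are not used here.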
2426
2427   Before leaving this section it is important to reiterate the fact
2428   that if a function returns a value, the caller must deal with that
   return value.  Otherwise, the value will remain on the
   stack and may eventually lead to a stack overflow error.
2431   Failing to handle the return value of a function is the
2432   most common mistake that inexperienced \slang programmers make.
2433   For example, the \var{fflush} function returns a value that many C
   programmers never check.  Instead of writing
2435#v+
2436      fflush (fp);
2437#v-
2438   as one could in C, a \slang programmer should write
2439#v+
2440      () = fflush (fp);
2441#v-
   instead.  (Many good C programmers write \exmp{(void)fflush(fp)}
   to indicate that the return value is being ignored.)
2444
2445#%}}}
2446
2447\sect2{Exit-Blocks}
2448
   An \em{exit-block} is a set of statements that is executed when a
   function returns.  Exit-blocks are very useful for cleaning up when a
   function returns via an explicit \kw{return} statement from deep
   within a function.
2453
2454   An exit-block is created by using the \kw{EXIT_BLOCK} keyword
2455   according to the syntax
2456\begin{tscreen}
2457      EXIT_BLOCK { \em{statement-list} }
2458\end{tscreen}
2459   where \em{statement-list} represents the list of statements that
2460   comprise the exit-block.  The following example illustrates the use
2461   of an exit-block:
2462#v+
2463      define simple_demo ()
2464      {
2465         variable n = 0;
2466
2467         EXIT_BLOCK { message ("Exit block called."); }
2468
2469         forever
2470          {
2471            if (n == 10) return;
2472            n++;
2473          }
2474      }
2475#v-
2476   Here, the function contains an exit-block and a \var{forever} loop.
2477   The loop will terminate via the \kw{return} statement when \var{n}
2478   is 10.  Before it returns, the exit-block will get executed.
2479
2480   A function can contain multiple exit-blocks, but only the last
2481   one encountered during execution will actually get executed.  For
2482   example,
2483#v+
2484      define simple_demo (n)
2485      {
2486         EXIT_BLOCK { return 1; }
2487
2488         if (n != 1)
2489           {
2490              EXIT_BLOCK { return 2; }
2491           }
2492         return;
2493      }
2494#v-
2495   If \var{1} is passed to this function, the first exit-block will
2496   get executed because the second one would not have been encountered
2497   during the execution.  However, if some other value is passed, the
2498   second exit-block would get executed.  This example also
2499   illustrates that it is possible to explicitly return from an
2500   exit-block, although nested exit-blocks are illegal.
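
   The cleanup role of an exit-block may be sketched as follows.  The
   function name \exmp{first_line} is hypothetical; the example assumes
   the standard \var{fopen}, \var{fgets}, and \var{fclose} intrinsics:
#v+
      define first_line (file)
      {
         variable fp, line;

         fp = fopen (file, "r");
         if (fp == NULL) return NULL;    % no exit-block encountered yet

         EXIT_BLOCK { () = fclose (fp); }

         if (-1 == fgets (&line, fp))
           return NULL;                  % the file is closed here
         return line;                    % and also here
      }
#v-
   Both \kw{return} statements that follow the exit-block cause the
   file to be closed, without the call to \var{fclose} having to be
   repeated.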
2501
2502#%}}}
2503
2504\sect1{Name Spaces} #%{{{
2505
2506  By default, all global variables and functions are defined in the
2507  global namespace.  In addition to the global namespace, every
2508  compilation unit (e.g., a file containing \slang code) has an
2509  anonymous namespace.  Objects may be defined in the anonymous
2510  namespace via the \var{static} declaration keyword.  For example,
2511#v+
2512     static variable x;
2513     static define hello () { message ("hello"); }
2514#v-
2515  defines a variable \var{x} and a function \var{hello} in the
2516  anonymous namespace.  This is useful when one wants to define
2517  functions and variables that are only to be used within the file, or
2518  more precisely the compilation unit, that defines them.
2519
2520  The \var{implements} function may be used to give the anonymous
2521  namespace a name to allow access to its objects from outside the
2522  compilation unit that defines them.  For example,
2523#v+
2524     implements ("foo");
2525     static variable x;
2526#v-
2527  allows the variable \var{x} to be accessed via \var{foo->x}, e.g.,
2528#v+
2529     if (foo->x == 1) foo->x = 2;
2530#v-
2531
  The \var{implements} function does more than simply give the
  anonymous namespace a name.  It also changes the default variable
  and function declaration mode from \var{public} to \var{static}.
2535  That is,
2536#v+
2537     implements ("foo");
2538     variable x;
2539#v-
2540  and
2541#v+
2542     implements ("foo");
2543     static variable x;
2544#v-
2545  are equivalent.  Then to create a public object within the
2546  namespace, one must explicitly use the \var{public} keyword.
2547
2548  Finally, the \var{private} keyword may be used to create an object
2549  that is truly private within the compilation unit.  For example,
2550#v+
2551    implements ("foo");
2552    variable x;
2553    private variable y;
2554#v-
  allows \var{x} to be accessed from outside the namespace via
  \var{foo->x}; however, \var{y} cannot be accessed.
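
  The three declaration modes may be combined in a single file.  The
  following sketch uses hypothetical names (\exmp{counter},
  \exmp{bump}, \exmp{counter_value}) and relies only on the rules
  described above:
#v+
     implements ("counter");

     private variable count = 0;     % visible only within this file

     define bump ()                  % static by default after implements
     {
        count++;
     }

     public define counter_value ()  % placed in the global namespace
     {
        return count;
     }
#v-
  Code in another file could then call \exmp{counter->bump ()} and
  \exmp{counter_value ()}, but would have no way to refer to
  \var{count} directly.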
2557
2558#%}}}
2559
2560\sect1{Arrays} #%{{{
2561
2562   An array is a container object that can contain many values of one
2563   data type.  Arrays are very useful objects and are indispensable
2564   for certain types of programming.  The purpose of this chapter is
2565   to describe how arrays are defined and used in the \slang language.
2566
2567\sect2{Creating Arrays} #%{{{
2568
2569   The \slang language supports multi-dimensional arrays of all data
2570   types.  Since the \var{Array_Type} is a data type, one can even
2571   have arrays of arrays.  To create a multi-dimensional array of
2572   \em{SomeType} use the syntax
2573#v+
2574      SomeType [dim0, dim1, ..., dimN]
2575#v-
   Here \em{dim0}, \em{dim1}, ... \em{dimN} specify the size of
   the individual dimensions of the array.  The current implementation
   permits arrays to consist of up to \var{7} dimensions.  When a
   numeric array is created, all its elements are initialized to zero.
   The initialization of other array types depends upon the data type,
   e.g., \var{String_Type} and \var{Struct_Type} arrays are
   initialized to \var{NULL}.
2583
2584   As a concrete example, consider
2585#v+
2586     a = Integer_Type [10];
2587#v-
2588   which creates a one-dimensional array of \exmp{10} integers and
2589   assigns it to \var{a}.
2590   Similarly,
2591#v+
2592     b = Double_Type [10, 3];
2593#v-
2594   creates a \var{30} element array of double precision numbers
2595   arranged in \var{10} rows and \var{3} columns, and assigns it to
2596   \var{b}.
2597
2598\sect3{Range Arrays}
2599
   There is a more convenient syntax for creating and initializing
   1-d arrays.  For example, to create an array of ten
   integers whose elements run from \exmp{1} through \exmp{10}, one
   may simply use:
2604#v+
2605     a = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
2606#v-
2607   Similarly,
2608#v+
2609     b = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0];
2610#v-
2611   specifies an array of ten doubles.
2612
2613   An even more compact way of specifying a numeric array is to use a
2614   \em{range-array}.  For example,
2615#v+
2616     a = [0:9];
2617#v-
2618   specifies an array of 10 integers whose elements range from \var{0}
2619   through \var{9}.  The most general form of a range array is
2620#v+
2621     [first-value : last-value : increment]
2622#v-
2623   where the \em{increment} is optional and defaults to \exmp{1}. This
2624   creates an array whose first element is \em{first-value} and whose
2625   successive values differ by \em{increment}.  \em{last-value} sets
2626   an upper limit upon the last value of the array as described below.
2627
   If the range array \var{[a:b:c]} is integer valued, then the
   interval specified by \var{a} and \var{b} is closed.  That is, the
   kth element of the array \math{x_k} is given by \math{x_k=a+ck} and
   must satisfy \math{a<=x_k<=b} (or \math{b<=x_k<=a} when \var{c} is
   negative).  Hence, the number of elements in a non-empty integer
   range array is given by the expression \math{1 + (b-a)/c}, where
   the division is integer division.
2633
2634   The situation is somewhat more complicated for floating point range
2635   arrays.  The interval specified by a floating point range array
2636   \var{[a:b:c]} is semi-open such that \var{b} is not contained in
2637   the interval.  In particular, the kth element of \var{[a:b:c]} is
2638   given by \math{x_k=a+kc} such that \math{a<=x_k<b} when
2639   \math{c>=0}, and \math{b<x_k<=a} otherwise.  The number of elements
   in the array is one greater than the largest \math{k} that
   satisfies the semi-open interval constraint.
2642
2643   Here are a few examples that illustrate the above comments:
2644#v+
2645       [1:5:1]         ==> [1,2,3,4,5]
2646       [1.0:5.0:1.0]   ==> [1.0, 2.0, 3.0, 4.0]
2647       [5:1:-1]        ==> [5,4,3,2,1]
       [5.0:1.0:-1.0]  ==> [5.0, 4.0, 3.0, 2.0]
2649       [1:1]           ==> [1]
2650       [1.0:1.0]       ==> []
2651       [1:-3]          ==> []
2652#v-
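
   The element counts implied by these rules can be worked out by
   hand.  The ranges below are additional illustrations chosen for
   this purpose, not output captured from the interpreter:
#v+
       [0:9:2]         ==> [0,2,4,6,8]       % 1 + (9-0)/2 = 5 elements
       [1:10:3]        ==> [1,4,7,10]        % 1 + (10-1)/3 = 4 elements
       [0.0:1.0:0.25]  ==> [0.0, 0.25, 0.5, 0.75]   % 1.0 is excluded
#v-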
2653
2654\sect3{Creating arrays via the dereference operator}
2655
   Another way to create an array is to apply the dereference operator
   \var{@} to the \var{DataType_Type} literal \var{Array_Type}.  The
   actual syntax for this operation resembles a function call:
2659\begin{tscreen}
2660     variable a = @Array_Type (\em{data-type}, \em{integer-array});
2661\end{tscreen}
2662  where \em{data-type} is of type \var{DataType_Type} and
2663  \em{integer-array} is a 1-d array of integers that specify the size
2664  of each dimension.  For example,
2665#v+
2666     variable a = @Array_Type (Double_Type, [10, 20]);
2667#v-
  will create a \exmp{10} by \exmp{20} array of doubles and assign it
  to \var{a}.  This method of creating arrays is more flexible than
  the other methods discussed in this section because both the data
  type and the list of dimension sizes are ordinary values that may be
  determined at run-time.  We shall encounter it again in section ???
  in the context of the \var{array_info} function.
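
  As a sketch of this flexibility, the hypothetical function below
  creates an array whose type and number of dimensions are not known
  until it is called:
#v+
     define make_array (type, dims)
     {
        return @Array_Type (type, dims);
     }

     variable v = make_array (Integer_Type, [5]);         % 1-d, 5 elements
     variable m = make_array (Double_Type, [10, 20]);     % 2-d, 10 by 20
     variable c = make_array (Double_Type, [4, 4, 4]);    % 3-d
#v-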
2673
2674#%}}}
2675
2676\sect2{Reshaping Arrays} #%{{{
   It is sometimes possible to change the `shape' of an array using
   the \var{reshape} function.  For example, a 1-d 10 element array
   may be reshaped into a 2-d array consisting of 5 rows and 2
   columns.  The only restriction on the operation is that the arrays
   must be commensurate, i.e., the new shape must contain the same
   total number of elements as the old one.  The \var{reshape}
   function follows the syntax
2683\begin{tscreen}
2684       reshape (\em{array-name}, \em{integer-array});
2685\end{tscreen}
   where \em{array-name} specifies the array to be reshaped to have
   the dimensions given by \em{integer-array}, a 1-dimensional array of
   integers.  It is important to note that this does \em{not} create a
   new array; it simply reshapes the existing array.  Thus,
2690#v+
2691       variable a = Double_Type [100];
2692       reshape (a, [10, 10]);
2693#v-
2694   turns \var{a} into a \exmp{10} by \exmp{10} array.
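
   Because \var{reshape} acts on the existing array, every variable
   that refers to that array sees the new shape.  A minimal sketch,
   in which the variable names are arbitrary:
#v+
       variable a = Double_Type [10];
       variable b = a;            % b refers to the same array as a
       reshape (a, [5, 2]);       % 5 * 2 == 10, so the shapes are commensurate
       b[4, 1] = 1.0;             % valid: b is now 5 by 2 as well
#v-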
2695
2696#%}}}
2697
2698\sect2{Indexing Arrays} #%{{{
2699   An individual element of an array may be referred to by its
2700   \em{index}.  For example, \exmp{a[0]} specifies the zeroth element
2701   of the one dimensional array \var{a}, and \exmp{b[3,2]} specifies
2702   the element in the third row and second column of the two
   dimensional array \var{b}.  As in C, array indices are numbered from
2704   \var{0}.  Thus if \var{a} is a one-dimensional array of ten
2705   integers, the last element of the array is given by \var{a[9]}.
2706   Using \var{a[10]} would result in a range error.
2707
2708   A negative index may be used to index from the end of the array,
2709   with \exmp{a[-1]} referring to the last element of \var{a},
2710   \exmp{a[-2]} referring to the next to the last element, and so on.
2711
2712   One may use the indexed value like any other variable.  For
2713   example, to set the third element of an integer array to \var{6}, use
2714#v+
2715     a[2] = 6;
2716#v-
2717   Similarly, that element may be used in an expression, such as
2718#v+
2719     y = a[2] + 7;
2720#v-
2721   Unlike other \slang variables which inherit a type upon assignment,
2722   array elements already have a type.  For example, an attempt to
2723   assign a string value to an element of an integer array will result
2724   in a type-mismatch error.
2725
   One may use any integer expression to index an array.  A simple
   example that computes the sum of the elements of a 10 element 1-d
   array is
2729#v+
2730      variable i, sum;
2731      sum = 0;
2732      for (i = 0; i < 10; i++) sum += a[i];
2733#v-
2734
2735   Unlike many other languages, \slang permits arrays to be indexed by
2736   other integer arrays.   Suppose that \var{a} is a 1-d array of 10
2737   doubles.  Now consider:
2738#v+
2739      i = [6:8];
2740      b = a[i];
2741#v-
2742   Here, \var{i} is a 1-dimensional range array of three integers with
2743   \exmp{i[0]} equal to \exmp{6}, \exmp{i[1]} equal to \exmp{7},
2744   and \exmp{i[2]} equal to \exmp{8}.  The statement \var{b = a[i];}
2745   will create a 1-d array of three doubles and assign it to \var{b}.
2746   The zeroth element of \var{b}, \exmp{b[0]} will be set to the sixth
2747   element of \var{a}, or \exmp{a[6]}, and so on.  In fact, these two simple
2748   statements are equivalent to
2749#v+
2750     b = Double_Type [3];
2751     b[0] = a[6];
2752     b[1] = a[7];
2753     b[2] = a[8];
2754#v-
2755   except that using an array of indices is not only much more
2756   convenient, but executes much faster.
2757
2758   More generally, one may use an index array to specify which
2759   elements are to participate in a calculation.  For example, consider
2760#v+
2761     a = Double_Type [1000];
2762     i = [0:499];
2763     j = [500:999];
2764     a[i] = -1.0;
2765     a[j] = 1.0;
2766#v-
2767   This creates an array of \exmp{1000} doubles and sets the first
2768   \exmp{500} elements to \exmp{-1.0} and the last \exmp{500} to
2769   \var{1.0}.  Actually, one may do away with the \var{i} and \var{j}
2770   variables altogether and use
2771#v+
2772     a = Double_Type [1000];
2773     a [[0:499]] = -1.0;
2774     a [[500:999]] = 1.0;
2775#v-
2776   It is important to understand the syntax used and, in particular,
2777   to note that \exmp{a[[0:499]]} is \em{not} the same as
2778   \exmp{a[0:499]}.  In fact, the latter will generate a syntax error.
2779
2780   Often, it is convenient to use a \em{rubber} range to specify
2781   indices.  For example, \exmp{a[[500:]]} specifies all elements of
2782   \var{a} whose index is greater than or equal to \var{500}.  Similarly,
2783   \exmp{a[[:499]]} specifies the first 500 elements of \var{a}.
2784   Finally, \exmp{a[[:]]} specifies all the elements of \var{a};
2785   however, using \exmp{a[*]} is more convenient.
2786
   One should be careful when using index arrays with negative
   elements.  As pointed out above, a negative index is used to index
   from the end of the array.  That is, \exmp{a[-1]} refers to the
   last element of \exmp{a}.  How should \exmp{a[[0:-1]]} be
   interpreted?  By itself, \var{[0:-1]} is an empty array; hence, one
   might expect \exmp{a[[0:-1]]} to refer to no elements.  However,
   when used in an array indexing context, \exmp{[0:-1]} is
   interpreted as an array indexing the first through the last
   elements of the array.  While this makes it very convenient
   to specify the last 3 elements of an array using
   \exmp{a[[-3:-1]]}, it is very easy to forget these semantics.
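
   The following fragment sketches these indexing forms on a small
   array; the values noted in the comments follow from the rules
   described above:
#v+
     variable a = [10, 20, 30, 40, 50];
     variable x = a[-1];          % 50, the last element
     variable h = a[[:1]];        % [10, 20]
     variable t = a[[3:]];        % [40, 50]
     variable l = a[[-3:-1]];     % [30, 40, 50]
#v-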
2798
2799   Now consider a multi-dimensional array.  For simplicity, suppose
2800   that \var{a} is a \exmp{100} by \exmp{100} array of doubles.  Then
2801   the expression \var{a[0, *]} specifies all elements in the zeroth
2802   row.  Similarly, \var{a[*, 7]} specifies all elements in the
   seventh column.  Finally, \var{a[[3:5],[6:12]]} specifies the
2804   \exmp{3} by \exmp{7} region consisting of rows \exmp{3}, \exmp{4},
2805   and \exmp{5}, and columns \exmp{6} through \exmp{12} of \var{a}.
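
   A short sketch of these forms, using \var{length} to confirm the
   number of elements selected in each case:
#v+
     variable a = Double_Type [100, 100];
     variable row = a[0, *];              % length (row) is 100
     variable col = a[*, 7];              % length (col) is 100
     variable blk = a[[3:5],[6:12]];      % length (blk) is 3 * 7 = 21
#v-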
2806
2807   We conclude this section with a few examples.
2808
2809   Here is a function that computes the trace (sum of the diagonal
2810   elements) of a square 2 dimensional \var{n} by \var{n} array:
2811#v+
2812      define array_trace (a, n)
2813      {
2814         variable sum = 0, i;
2815         for (i = 0; i < n; i++) sum += a[i, i];
2816         return sum;
2817      }
2818#v-
2819   This fragment creates a \exmp{10} by \exmp{10} integer array, sets
2820   its diagonal elements to \exmp{5}, and then computes the trace of
2821   the array:
2822#v+
2823      a = Integer_Type [10, 10];
2824      for (j = 0; j < 10; j++) a[j, j] = 5;
2825      the_trace = array_trace(a, 10);
2826#v-
2827   We can get rid of the \kw{for} loop as follows:
2828#v+
2829      j = Integer_Type [10, 2];
2830      j[*,0] = [0:9];
2831      j[*,1] = [0:9];
2832      a[j] = 5;
2833#v-
2834   Here, the goal was to construct a 2-d array of indices that
2835   correspond to the diagonal elements of \var{a}, and then use that
2836   array to index \var{a}.  To understand how
2837   this works, consider the middle statements.  They are equivalent
2838   to the following \var{for} loops:
2839#v+
2840      variable i;
2841      for (i = 0; i < 10; i++) j[i, 0] = i;
2842      for (i = 0; i < 10; i++) j[i, 1] = i;
2843#v-
2844   Thus, row \var{n} of \var{j} will have the value \exmp{(n,n)},
2845   which is precisely what was sought.
2846
2847   Another example of this technique is the function:
2848#v+
2849      define unit_matrix (n)
2850      {
2851         variable a = Integer_Type [n, n];
2852         variable j = Integer_Type [n, 2];
2853         j[*,0] = [0:n - 1];
2854         j[*,1] = [0:n - 1];
2855
2856         a[j] = 1;
2857         return a;
2858      }
2859#v-
2860   This function creates an \var{n} by \var{n} unit matrix,
2861   that is a 2-d \var{n} by \var{n} array whose elements are all zero
2862   except on the diagonal where they have a value of \exmp{1}.
2863
2864
2865#%}}}
2866
2867\sect2{Arrays and Variables}
2868
2869   When an array is created and assigned to a variable, the
2870   interpreter allocates the proper amount of space for the array,
2871   initializes it, and then assigns to the variable a \em{reference}
2872   to the array.   So, a variable that represents an array has a value
2873   that is really a reference to the array.  This has several
2874   consequences, some good and some bad.  It is believed that the
2875   advantages of this representation outweigh the disadvantages.
2876   First, we shall look at the positive aspects.
2877
2878   When a variable is passed to a function, it is always the value of
2879   the variable that gets passed.  Since the value of a variable
2880   representing an array is a reference, a reference to the array gets
2881   passed.  One major advantage of this is rather obvious: it is a
2882   fast and efficient way to pass the array.  This also has another
2883   consequence that is illustrated by the function
2884#v+
2885      define init_array (a, n)
2886      {
2887         variable i;
2888
2889         for (i = 0; i < n; i++) a[i] = some_function (i);
2890      }
2891#v-
2892   where \var{some_function} is a function that generates a scalar
2893   value to initialize the \em{ith} element.  This function can be
2894   used in the following way:
2895#v+
2896      variable X = Double_Type [100000];
2897      init_array (X, 100000);
2898#v-
2899   Since the array is passed to the function by reference, there is no
2900   need to make a separate copy of the \var{100000} element array. As
2901   pointed out above, this saves both execution time and memory. The
2902   other salient feature to note is that any changes made to the
2903   elements of the array within the function will be manifested in the
2904   array outside the function.  Of course, in this case, this is a
2905   desirable side-effect.
2906
2907   To see the downside of this representation, consider:
2908#v+
2909      variable a, b;
2910      a = Double_Type [10];
2911      b = a;
2912      a[0] = 7;
2913#v-
2914   What will be the value of \exmp{b[0]}?  Since the value of \var{a}
2915   is really a reference to the array of ten doubles, and that
2916   reference was assigned to \var{b}, \var{b} also refers to the same
   array.  Thus any changes made to the elements of \var{a} will also
   be made implicitly to \var{b}.
2919
   This raises the question: if the assignment of one variable that
   represents an array to another variable results in the assignment
2922   of a reference to the array, then how does one make separate copies
2923   of the array?  There are several answers including using an index
2924   array, e.g., \exmp{b = a[*]}; however, the most natural method is
2925   to use the dereference operator:
2926#v+
2927      variable a, b;
2928      a = Double_Type [10];
2929      b = @a;
2930      a[0] = 7;
2931#v-
2932   In this example, a separate copy of \var{a} will be created and
2933   assigned to \var{b}.  It is very important to note that \slang
2934   never implicitly dereferences an object.  So, one must explicitly use
2935   the dereference operator.  This means that the elements of a
2936   dereferenced array are not themselves dereferenced.  For example,
2937   consider dereferencing an array of arrays, e.g.,
2938#v+
2939      variable a, b;
2940      a = Array_Type [2];
2941      a[0] = Double_Type [10];
2942      a[1] = Double_Type [10];
2943      b = @a;
2944#v-
2945   In this example, \exmp{b[0]} will be a reference to the array that
2946   \exmp{a[0]} references because \exmp{a[0]} was not explicitly
2947   dereferenced.
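
   If independent copies of the inner arrays are also required, they
   must be dereferenced one at a time.  A minimal sketch, assuming
   that every element of \var{a} has already been assigned an array
   (the function name is hypothetical):
#v+
      define copy_array_of_arrays (a)
      {
         variable i, n, b, tmp;

         n = length (a);
         b = @a;                   % copies only the outer array
         for (i = 0; i < n; i++)
           {
              tmp = a[i];
              b[i] = @tmp;         % copy each inner array explicitly
           }
         return b;
      }
#v-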
2948
2949\sect2{Using Arrays in Computations} #%{{{
2950
2951   Many functions and operations work transparently with arrays.
2952   For example, if \var{a} and \var{b} are arrays, then the sum
2953   \exmp{a + b} is an array whose elements are formed from the sum of
2954   the corresponding elements of \var{a} and \var{b}.  A similar
2955   statement holds for all other binary and unary operations.
2956
   Let's consider a simple example.  Suppose that we wish to solve a
   set of \var{n} quadratic equations whose coefficients are given by
2959   the 1-d arrays \var{a}, \var{b}, and \var{c}.  In general, the
2960   solution of a quadratic equation will be two complex numbers.  For
2961   simplicity, suppose that all we really want is to know what subset of
2962   the coefficients, \var{a}, \var{b}, \var{c}, correspond to
2963   real-valued solutions.  In terms of \var{for} loops, we can write:
2964#v+
2965     variable i, d, index_array;
2966     index_array = Integer_Type [n];
2967     for (i = 0; i < n; i++)
2968       {
2969          d = b[i]^2 - 4 * a[i] * c[i];
2970          index_array [i] = (d >= 0.0);
2971       }
2972#v-
2973   In this example, the array \var{index_array} will contain a
2974   non-zero value if the corresponding set of coefficients has a
2975   real-valued solution.  This code may be written much more compactly
2976   and with more clarity as follows:
2977#v+
2978     variable index_array = ((b^2 - 4 * a * c) >= 0.0);
2979#v-
2980
2981   \slang has a powerful built-in function called \var{where}.  This
2982   function takes an array of integers and returns a 2-d array of
2983   indices that correspond to where the elements of the input array
2984   are non-zero.  This simple operation is extremely useful. For
2985   example, suppose \var{a} is a 1-d array of \var{n} doubles, and it
2986   is desired to set to zero all elements of the array whose value is
2987   less than zero. One way is to use a \var{for} loop:
2988#v+
2989     for (i = 0; i < n; i++)
2990       if (a[i] < 0.0) a[i] = 0.0;
2991#v-
2992   If \var{n} is a large number, this statement can take some time to
2993   execute.  The optimal way to achieve the same result is to use the
2994   \var{where} function:
2995#v+
2996     a[where (a < 0.0)] = 0;
2997#v-
2998   Here, the expression \exmp{(a < 0.0)} returns an array whose
2999   dimensions are the same size as \var{a} but whose elements are
3000   either \exmp{1} or \exmp{0}, according to whether or not the
3001   corresponding element of \var{a} is less than zero.  This array of
3002   zeros and ones is then passed to \var{where} which returns a 2-d
3003   integer array of indices that indicate where the elements of
3004   \var{a} are less than zero.  Finally, those elements of \var{a} are
3005   set to zero.
3006
3007   As a final example, consider once more the example involving the set of
3008   \var{n} quadratic equations presented above.  Suppose that we wish
3009   to get rid of the coefficients of the previous example that
3010   generated non-real solutions.  Using an explicit \var{for} loop requires
3011   code such as:
3012#v+
3013     variable i, j, nn, tmp_a, tmp_b, tmp_c;
3014
3015     nn = 0;
3016     for (i = 0; i < n; i++)
3017       if (index_array [i]) nn++;
3018
3019     tmp_a = Double_Type [nn];
3020     tmp_b = Double_Type [nn];
3021     tmp_c = Double_Type [nn];
3022
3023     j = 0;
3024     for (i = 0; i < n; i++)
3025       {
3026          if (index_array [i])
3027            {
3028               tmp_a [j] = a[i];
3029               tmp_b [j] = b[i];
3030               tmp_c [j] = c[i];
3031               j++;
3032            }
3033       }
3034     a = tmp_a;
3035     b = tmp_b;
3036     c = tmp_c;
3037#v-
3038   Not only is this a lot of code, it is also clumsy and error-prone.
3039   Using the \var{where} function, this task is trivial:
3040#v+
3041     variable i;
3042     i = where (index_array != 0);
3043     a = a[i];
3044     b = b[i];
3045     c = c[i];
3046#v-
3047
3048   All the examples up to now assumed that the dimensions of the array
3049   were known.  Although the intrinsic function \var{length} may be
3050   used to get the total number of elements of an array, it cannot be
3051   used to get the individual dimensions of a multi-dimensional array.
3052   However, the function \var{array_info} may be used to
3053   get information about an array, such as its data type and size.
   The function returns three values: an integer array containing the
   size of each dimension, the number of dimensions, and the data
   type.  It may be used to determine the number of rows
   of an array as follows:
3058#v+
3059     define num_rows (a)
3060     {
3061        variable dims, type, num_dims;
3062
3063        (dims, num_dims, type) = array_info (a);
3064        return dims[0];
3065     }
3066#v-
3067   The number of columns may be obtained in a similar manner:
3068#v+
3069     define num_cols (a)
3070     {
3071        variable dims, type, num_dims;
3072
3073        (dims, num_dims, type) = array_info (a);
3074        if (num_dims > 1) return dims[1];
3075        return 1;
3076     }
3077#v-
3078
   Another use of \var{array_info} is to create an array that has the
   same dimensions as another array:
3081#v+
3082     define make_int_array (a)
3083     {
3084        variable dims, num_dims, type;
3085
3086        (dims, num_dims, type) = array_info (a);
3087        return @Array_Type (Integer_Type, dims);
3088     }
3089#v-
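
   In the same spirit, the \var{array_trace} function presented
   earlier can be written so that the caller need not pass the size of
   the array.  The following is a sketch; the name \exmp{array_trace2}
   and the error messages are illustrative:
#v+
     define array_trace2 (a)
     {
        variable dims, num_dims, type, i, total;

        (dims, num_dims, type) = array_info (a);
        if (num_dims != 2)
          error ("array_trace2: expecting a 2-d array");
        if (dims[0] != dims[1])
          error ("array_trace2: expecting a square array");

        total = 0;
        for (i = 0; i < dims[0]; i++) total += a[i, i];
        return total;
     }
#v-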
3090
3091#%}}}
3092
3093#%}}}
3094
3095\sect1{Associative Arrays} #%{{{
3096
   An associative array differs from an ordinary array in the sense
   that its size is not fixed and that it is indexed by a string, called
   the \em{key}. For example, consider:
3100#v+
3101       variable A = Assoc_Type [Integer_Type];
3102       A["alpha"] = 1;
3103       A["beta"] = 2;
3104       A["gamma"] = 3;
3105#v-
3106   Here, \var{A} represents an associative array of integers
3107   (\var{Integer_Type}) and three keys have been added to the array.
3108
3109   As the example suggests, an associative array may be created using
3110   one of the following forms:
3111\begin{tscreen}
3112      Assoc_Type [\em{type}]
3113      Assoc_Type [\em{type}, \em{default-value}]
3114      Assoc_Type []
3115\end{tscreen}
   The last form returns an associative array of \var{Any_Type}
   objects, allowing any type of object to be stored in
   the array.
3119
3120   The form involving a \em{default-value} is useful for associating a
3121   default value for non-existent array members.  This feature is
3122   explained in more detail below.
3123
3124   There are several functions that are specially designed to work
3125   with associative arrays.  These include:
3126\begin{itemize}
3127\item \var{assoc_get_keys}, which returns an ordinary array of strings
3128      containing the keys in the array.
3129
3130\item \var{assoc_get_values}, which returns an ordinary array of the
3131      values of the associative array.
3132
3133\item \var{assoc_key_exists}, which can be used to determine whether
3134      or not a key exists in the array.
3135
3136\item \var{assoc_delete_key}, which may be used to remove a key (and
3137      its value) from the array.
3138\end{itemize}
3139
3140   To illustrate the use of an associative array, consider the problem
3141   of counting the number of repeated occurrences of words in a list.
3142   Let the word list be represented as an array of strings given by
3143   \var{word_list}.  The number of occurrences of each word may be
3144   stored in an associative array as follows:
3145#v+
3146     variable a, word;
3147     a = Assoc_Type [Integer_Type];
3148     foreach (word_list)
3149       {
3150          word = ();
3151          if (0 == assoc_key_exists (a, word))
3152            a[word] = 0;
3153          a[word]++;  % same as a[word] = a[word] + 1;
3154       }
3155#v-
3156   Note that \var{assoc_key_exists} was necessary to determine whether
3157   or not a word was already added to the array in order to properly
3158   initialize it.  However, by creating the associative array with a
3159   default value of \exmp{0}, the above code may be simplified to
3160#v+
3161     variable a, word;
3162     a = Assoc_Type [Integer_Type, 0];
3163     foreach (word_list)
3164       {
3165          word = ();
3166          a[word]++;
3167       }
3168#v-
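
   Once the counts have been accumulated, they may be reported by
   looping over the keys.  A sketch using \var{assoc_get_keys},
   continuing from the code above (the output format is arbitrary):
#v+
     variable keys, i;
     keys = assoc_get_keys (a);
     for (i = 0; i < length (keys); i++)
       message (sprintf ("%s: %d", keys[i], a[keys[i]]));
#v-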
3169
3170
3171#%}}}
3172
3173\sect1{Structures and User-Defined Types} #%{{{
3174
3175   A \em{structure} is a heterogeneous container object, i.e., it is
3176   an object with elements whose values do not have to be of the same
3177   data type.  The elements or fields of a structure are named, and
3178   one accesses a particular field of the structure via the field
3179   name. This should be contrasted with an array whose values are of
3180   the same type, and whose elements are accessed via array indices.
3181
3182   A \em{user-defined} data type is a structure with a fixed set of
3183   fields defined by the user.
3184
3185\sect2{Defining a Structure}
3186
3187   The \kw{struct} keyword is used to define a structure.  The syntax
3188   for this operation is:
3189\begin{tscreen}
3190     struct {\em{field-name-1}, \em{field-name-2}, ... \em{field-name-N}};
3191\end{tscreen}
3192   This creates and returns a structure with \em{N} fields whose names
3193   are specified by \em{field-name-1}, \em{field-name-2}, ...,
3194   \em{field-name-N}.  When a structure is created, all its fields are
3195   initialized to \var{NULL}.
3196
3197   For example,
3198#v+
3199     variable t = struct { city_name, population, next };
3200#v-
3201   creates a structure with three fields and assigns it to the
3202   variable \var{t}.
3203
3204   Alternatively, a structure may be created by dereferencing
3205   \var{Struct_Type}.  For example, the above structure may also be
3206   created using one of the two forms:
3207#v+
3208      t = @Struct_Type ("city_name", "population", "next");
3209      t = @Struct_Type (["city_name", "population", "next"]);
3210#v-
3211   These are useful when creating structures dynamically where one does
3212   not know the name of the fields until run-time.
3213
   Like arrays, structures are passed around via references.  Thus,
3215   in the above example, the value of \var{t} is a reference to the
3216   structure.  This means that after execution of
3217#v+
3218     variable u = t;
3219#v-
3220   \em{both} \var{t} and \var{u} refer to the \em{same} structure,
3221   since only the reference was used in the assignment.  To actually
3222   create a new copy of the structure, use the \em{dereference}
3223   operator, e.g.,
3224#v+
3225     variable u = @t;
3226#v-
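   The difference between the two assignments can be seen in the
   following sketch, which uses the same three-field structure.  The
   city name is an arbitrary value chosen for illustration:
#v+
     variable t = struct { city_name, population, next };
     variable u = t;      % u refers to the same structure as t
     variable v = @t;     % v is a new copy of the structure

     t.city_name = "Boston";

     if (u.city_name == "Boston")
       message ("u sees the change made through t");
     if (v.city_name == NULL)
       message ("v is unaffected by the change");
#v-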
3227
3228\sect2{Accessing the Fields of a Structure}
3229
   The dot (\var{.}) operator is used to specify a particular
   field of a structure.  If \var{s} is a structure and \var{field_name}
3232   is a field of the structure, then \exmp{s.field_name} specifies
3233   that field of \var{s}.  This specification can be used in
3234   expressions just like ordinary variables.  Again, consider
3235#v+
3236     variable t = struct { city_name, population, next };
3237#v-
3238   described in the last section.  Then,
3239#v+
3240     t.city_name = "New York";
3241     t.population = 13000000;
3242     if (t.population > 200) t = t.next;
3243#v-
3244   are all valid statements involving the fields of \var{t}.
3245
3246\sect2{Linked Lists}
3247
3248  One of the most important uses of structures is to create a
3249  \em{dynamic} data structure such as a \em{linked-list}.  A
3250  linked-list is simply a chain of structures that are linked together
3251  such that one structure in the chain is the value of a field of the
3252  previous structure in the chain.  To be concrete, consider the
3253  structure discussed earlier:
3254#v+
3255     variable t = struct { city_name, population, next };
3256#v-
3257  and suppose that we desire to create a list of such structures.
3258  The purpose of the \var{next} field is to provide the link to the
3259  next structure in the chain.  Suppose that there exists a function,
3260  \var{read_next_city}, that reads city names and populations from a
3261  file.  Then we can create the list via:
3262#v+
3263     define create_population_list ()
3264     {
3265        variable city_name, population, list_root, list_tail;
3266        variable next;
3267
3268        list_root = NULL;
3269        while (read_next_city (&city_name, &population))
3270          {
3271             next = struct {city_name, population, next };
3272
3273             next.city_name = city_name;
3274             next.population = population;
3275             next.next = NULL;
3276
3277             if (list_root == NULL)
3278               list_root = next;
3279             else
3280               list_tail.next = next;
3281
3282             list_tail = next;
3283          }
3284        return list_root;
3285     }
3286#v-
3287  In this function, the variables \var{list_root} and \var{list_tail}
3288  represent the beginning and end of the list, respectively. As long
3289  as \var{read_next_city} returns a non-zero value, a new structure is
3290  created, initialized, and then appended to the list via the
3291  \var{next} field of the \var{list_tail} structure.  On the first
3292  time through the loop, the list is created via the assignment to the
3293  \var{list_root} variable.
3294
3295  This function may be used as follows:
3296#v+
3297    variable Population_List = create_population_list ();
3298    if (Population_List == NULL) error ("List is empty");
3299#v-
3300  We can create other functions that manipulate the list.  An example is
3301  a function that finds the city with the largest population:
3302#v+
3303    define get_largest_city (list)
3304    {
3305       variable largest;
3306
3307       largest = list;
3308       while (list != NULL)
3309         {
3310            if (list.population > largest.population)
3311              largest = list;
3312            list = list.next;
3313         }
3314       return largest.city_name;
3315    }
3316
    vmessage ("%s is the largest city in the list",
               get_largest_city (Population_List));
3319#v-
  The \var{get_largest_city} function is a typical example of how one
  traverses a linear linked-list: start at the head of the list and
  successively move to the next element of the list via the
  \var{next} field.
3324
3325  In the previous example, a \kw{while} loop was used to traverse the
3326  linked list.  It is faster to use a \kw{foreach} loop for this:
3327#v+
3328    define get_largest_city (list)
3329    {
3330       variable largest, elem;
3331
3332       largest = list;
3333       foreach (list)
3334         {
3335            elem = ();
            if (elem.population > largest.population)
              largest = elem;
3338         }
3339       return largest.city_name;
3340    }
3341#v-
3342  Here a \kw{foreach} loop has been used to walk the list via its
3343  \exmp{next} field.  If the field name was not \exmp{next}, then it
3344  would have been necessary to use the \kw{using} form of the
3345  \kw{foreach} statement.  For example, if the field name implementing the
3346  linked list was \exmp{next_item}, then
3347#v+
3348     foreach (list) using ("next_item")
3349     {
3350        elem = ();
3351        .
3352        .
3353     }
3354#v-
3355  would have been used.  In other words, unless otherwise indicated
3356  via the \kw{using} clause, \kw{foreach} walks the list using a field
3357  named \exmp{next}.
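
  As a further illustration, here is a sketch of a function that
  counts the elements of the list using \kw{foreach}.  The name
  \var{population_list_length} is introduced here only for the purpose
  of the example, and the explicit test for \var{NULL} is there so
  that the function does not depend on how \kw{foreach} treats an
  empty list:
#v+
    define population_list_length (list)
    {
       variable n = 0, elem;

       if (list == NULL)
         return 0;

       foreach (list)
         {
            elem = ();     % remove the current element from the stack
            n++;
         }
       return n;
    }
#v-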
3358
3359  Now consider a function that sorts the list according to population.
  To illustrate the technique, a \em{bubble-sort} will be used, not
  because it is efficient (it is not), but because it is simple and
  intuitive.
3363#v+
3364    define sort_population_list (list)
3365    {
3366       variable changed;
3367       variable node, next_node, last_node;
3368       do
3369         {
3370            changed = 0;
3371            node = list;
3372            next_node = node.next;
3373            last_node = NULL;
3374            while (next_node != NULL)
3375              {
3376                 if (node.population < next_node.population)
3377                   {
3378                      % swap node and next_node
3379                      node.next = next_node.next;
3380                      next_node.next = node;
3381                      if (last_node != NULL)
3382                        last_node.next = next_node;
3383
3384                      if (list == node) list = next_node;
3385                      node = next_node;
3386                      next_node = node.next;
3387                      changed++;
3388                   }
3389                 last_node = node;
3390                 node = next_node;
3391                 next_node = next_node.next;
3392              }
3393         }
3394       while (changed);
3395
3396       return list;
3397    }
3398#v-
3399   Note the test for equality between \var{list} and \var{node}, i.e.,
3400#v+
3401                      if (list == node) list = next_node;
3402#v-
3403   It is important to appreciate the fact that the values of these
3404   variables are references to structures, and that the
3405   comparison only compares the references and \em{not} the actual
3406   structures they reference.  If it were not for this, the algorithm
3407   would fail.
3408
3409\sect2{Defining New Types}
3410
3411   A user-defined data type may be defined using the \kw{typedef}
3412   keyword.  In the current implementation, a user-defined data type
3413   is essentially a structure with a user-defined set of fields. For
3414   example, in the previous section a structure was used to represent
3415   a city/population pair.  We can define a data type called
3416   \var{Population_Type} to represent the same information:
3417#v+
3418      typedef struct
3419      {
3420         city_name,
3421         population
3422      } Population_Type;
3423#v-
3424   This data type can be used like all other data types.  For example,
   an array of \var{Population_Type} objects can be created,
3426#v+
3427      variable a = Population_Type[10];
3428#v-
3429   and `populated' via expressions such as
3430#v+
3431      a[0].city_name = "Boston";
3432      a[0].population = 2500000;
3433#v-
3434   The new type \var{Population_Type} may also be used with the
3435   \var{typeof} function:
3436#v+
      if (Population_Type == typeof (a)) city = a.city_name;
3438#v-
   The dereference operator \var{@} may be used to create an instance
   of the new type:
3441#v+
3442     a = @Population_Type;
3443     a.city_name = "Calcutta";
3444     a.population = 13000000;
3445#v-
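   When many instances are needed, the creation and initialization
   steps may be wrapped in a function.  The following is only a
   sketch: the function name \var{new_population} is hypothetical, and
   \var{Population_Type} is assumed to have been defined via the
   \kw{typedef} shown above:
#v+
     define new_population (city_name, population)
     {
        variable p = @Population_Type;   % new instance, fields are NULL
        p.city_name = city_name;
        p.population = population;
        return p;
     }

     variable boston = new_population ("Boston", 2500000);
     vmessage ("%s: %d", boston.city_name, boston.population);
#v-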
3446
3447
3448#%}}}
3449
3450\sect1{Error Handling} #%{{{
3451
3452   Many intrinsic functions signal errors in the event of failure.
   User-defined functions may also generate an error condition via the
3454   \var{error} function.  Depending upon the severity of the error, it
3455   can be caught and cleared using a construct called an
3456   \em{error-block}.
3457
3458\sect2{Error-Blocks}
3459
3460   When the interpreter encounters a recoverable run-time error, it
3461   will return to top-level by \em{unwinding} its function call
3462   stack.  Any error-blocks that it encounters as part of this
3463   unwinding process will get executed.  Errors such as syntax errors
3464   and memory allocation errors are not recoverable, and error-blocks
3465   will not get executed when such errors are encountered.
3466
3467   An error-block is defined using the syntax
3468#v+
3469       ERROR_BLOCK { statement-list }
3470#v-
3471   where \em{statement-list} represents a list of statements that
3472   comprise the error-block.  A simple example of an error-block is
3473#v+
3474       define simple (a)
3475       {
3476          ERROR_BLOCK { message ("error-block executed"); }
3477          if (a) error ("Triggering Error");
3478          message ("hello");
3479       }
3480#v-
3481   Executing this function via \exmp{simple(0)} will result in the
3482   message \exmp{"hello"}.  However, calling it using \exmp{simple(1)}
3483   will generate an error that will be caught, but not cleared, by
3484   the error-block and the \exmp{"error-block executed"} message will
3485   result.
3486
3487   Error-blocks are never executed unless triggered by an error.  The
3488   only exception to this is when the user explicitly indicates that
3489   the error-block in scope should execute.  This is indicated by the
3490   special keyword \var{EXECUTE_ERROR_BLOCK}.  For example,
3491   \var{simple} could be recoded as
3492#v+
3493       define simple (a)
3494       {
3495          variable err_string = "error-block executed";
3496          ERROR_BLOCK { message (err_string); }
3497          if (a) error ("Triggering Error");
3498          err_string = "hello";
3499          EXECUTE_ERROR_BLOCK;
3500       }
3501#v-
   Please note that \var{EXECUTE_ERROR_BLOCK} does not initiate an
   error condition; it simply causes the error-block to be executed,
   after which control passes to the statement following the
   \var{EXECUTE_ERROR_BLOCK} statement.
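
   One use of this feature is to share clean-up code between the error
   path and the normal path of a function.  The sketch below is
   illustrative only; the function name is hypothetical, and the
   \var{fopen}, \var{fgets}, and \var{fclose} functions that it uses
   are described in the chapter on file input/output:
#v+
       define count_lines_with_cleanup (file)
       {
          variable fp, line, count = 0;

          fp = fopen (file, "r");
          if (fp == NULL)
            verror ("Unable to open %s", file);

          % Runs if an error occurs below; the error is not cleared
          ERROR_BLOCK { () = fclose (fp); }

          while (-1 != fgets (&line, fp))
            count++;

          EXECUTE_ERROR_BLOCK;     % normal path: run the same clean-up
          return count;
       }
#v-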
3506
3507\sect2{Clearing Errors}
3508
3509   Once an error has been caught by an error-block, the error can be cleared
3510   by the \var{_clear_error} function.  After the error has been cleared,
3511   execution will resume at the next statement at the level of the error block
3512   following the statement that generated the error.  For example, consider:
3513#v+
3514       define make_error ()
3515       {
3516           error ("Error condition created.");
3517           message ("This statement is not executed.");
3518       }
3519
3520       define test ()
3521       {
3522           ERROR_BLOCK
3523             {
3524                _clear_error ();
3525             }
3526           make_error ();
3527           message ("error cleared.");
3528       }
3529#v-
   Calling \var{test} will trigger an error in the \var{make_error}
   function, but the error will get cleared in the \var{test} function.  The
3532   call-stack will unwind from \var{make_error} back into \var{test}
3533   where the error-block will get executed.  As a result, execution
3534   resumes after the statement that makes the call to \var{make_error}
3535   since this statement is at the same level as the error-block that
3536   cleared the error.
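
   The same technique may be used to turn an error into a status
   value.  In the following sketch, \var{safe_call}, \var{ok}, and
   \var{fail} are hypothetical names; the function passed to
   \var{safe_call} is assumed to take no arguments and to return
   nothing:
#v+
       define safe_call (f)
       {
          variable status = 0;
          ERROR_BLOCK
            {
               _clear_error ();
               status = -1;
            }
          @f ();              % on error, execution resumes at the return
          return status;
       }

       define ok ()   { message ("ok called"); }
       define fail () { error ("fail called"); }

       if (0 == safe_call (&ok)) message ("ok completed normally");
       if (-1 == safe_call (&fail)) message ("fail signaled an error");
#v-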
3537
3538   Here is another example that illustrates how multiple error-blocks
3539   work:
3540#v+
3541       define example ()
3542       {
3543          variable n = 0, s = "";
3544          variable str;
3545
3546          ERROR_BLOCK {
3547              str = sprintf ("s=%s,n=%d", s, n);
3548              _clear_error ();
3549          }
3550
3551          forever
3552            {
3553              ERROR_BLOCK {
3554               s += "0";
3555               _clear_error ();
3556              }
3557
3558              if (n == 0) error ("");
3559
3560              ERROR_BLOCK {
3561               s += "1";
3562              }
3563
3564              if (n == 1) error ("");
3565              n++;
3566            }
3567          return str;
3568       }
3569#v-
3570   Here, three error-blocks have been declared.  One has been declared
3571   outside the \var{forever} loop and the other two have been declared
3572   inside the \var{forever} loop.  Each time through the loop, the variable
3573   \var{n} is incremented and a different error-block is triggered.  The
3574   error-block that gets triggered is the last one encountered, since
3575   that will be the one in scope.  On the first time through the loop,
3576   \var{n} will be zero and the first error-block in the loop will get
3577   executed.  This error block clears the error and execution resumes
3578   following the \var{if} statement that triggered the error. The
3579   variable \var{n} will get incremented to \exmp{1} and, on the
3580   second cycle through the loop the second \var{if} statement
3581   will trigger an error causing the second error-block to execute.
3582   This time, the error is not cleared and the call-stack unwinds out
3583   of the \var{forever} loop, at which point the error-block outside
3584   the loop is in scope, causing it to execute. This error-block
   stores the values of the variables \var{s} and \var{n} in the
   string \var{str}.  It then clears the error, and execution resumes
   on the statement \em{following} the \var{forever} loop.  The result of this
3588   complicated series of events is that the function will return the
3589   string \exmp{"s=01,n=1"}.
3590
3591#%}}}
3592
3593\sect1{Loading Files: evalfile and autoload}
3594
3595\sect1{File Input/Output} #%{{{
3596
 \slang provides built-in support for two different I/O facilities.
3598 The simplest interface is modeled upon the C language \var{stdio}
3599 streams interface and consists of functions such as \var{fopen},
3600 \var{fgets}, etc.  The other interface is modeled on a lower level
3601 POSIX interface consisting of functions such as \var{open},
3602 \var{read}, etc.  In addition to permitting more control, the lower
3603 level interface permits one to access network objects as well as disk
3604 files.
3605
3606\sect2{Input/Output via stdio}
3607\sect3{Stdio Overview}
3608 The \var{stdio} interface consists of the following functions:
3609\begin{itemize}
\item \var{fopen}, which opens a file for reading or writing.
3611
3612\item \var{fclose}, which closes a file opened by \var{fopen}.
3613
3614\item \var{fgets}, used to read a line from the file.
3615
3616\item \var{fputs}, which writes text to the file.
3617
3618\item \var{fprintf}, used to write formatted text to the file.
3619
3620\item \var{fwrite}, which may be used to write objects to the
3621       file.
3622
3623\item \var{fread}, which reads a specified number of objects from
3624       the file.
3625
3626\item \var{feof}, which is used to test whether the file pointer is at the
3627       end of the file.
3628
3629\item \var{ferror}, which is used to see whether or not the stream
3630       associated with the file has an error.
3631
3632\item \var{clearerr}, which clears the end-of-file and error
3633       indicators for the stream.
3634
3635\item \var{fflush}, used to force all buffered data associated with
3636       the stream to be written out.
3637
3638\item \var{ftell}, which is used to query the file position indicator
3639       of the stream.
3640
3641\item \var{fseek}, which is used to set the position of the file
3642      position indicator of the stream.
3643
3644\item \var{fgetslines}, which reads all the lines in a text file and
3645      returns them as an array of strings.
3646
3647\end{itemize}
3648
3649 In addition, the interface supports the \var{popen} and \var{pclose}
3650 functions on systems where the corresponding C functions are available.
3651
3652 Before reading or writing to a file, it must first be opened using
 the \var{fopen} function.  The only exception to this rule involves
3654 use of the pre-opened streams: \var{stdin}, \var{stdout}, and
3655 \var{stderr}.  \var{fopen} accepts two arguments: a file name and a
3656 string argument that indicates how the file is to be opened, e.g.,
3657 for reading, writing, update, etc.  It returns a \var{File_Type}
3658 stream object that is used as an argument to all other functions of
3659 the \var{stdio} interface.  Upon failure, it returns \NULL.  See the
3660 reference manual for more information about \var{fopen}.
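
 The following sketch illustrates the calling sequence for writing a
 file.  The function name and the text written are hypothetical, and
 the checks assume that \var{fputs}, \var{fprintf}, and \var{fclose}
 return \exmp{-1} upon failure; consult the reference manual for the
 precise return values:
#v+
    define write_report (file, city, population)
    {
       variable fp = fopen (file, "w");    % Open the file for writing
       if (fp == NULL)
         verror ("Unable to open %s for writing", file);

       if (-1 == fputs ("City report\n", fp))
         verror ("Write to %s failed", file);

       if (-1 == fprintf (fp, "%s: %d\n", city, population))
         verror ("Write to %s failed", file);

       if (-1 == fclose (fp))
         verror ("Unable to close %s", file);
    }
#v-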
3661
3662\sect3{Stdio Examples}
3663
3664 In this section, some simple examples of the use of the \var{stdio}
 interface are presented.  It is important to realize that all the
3666 functions of the interface return something, and that return value
3667 must be dealt with.
3668
3669 The first example involves writing a function to count the number of
3670 lines in a text file.  To do this, we shall read in the lines, one by
3671 one, and count them:
3672#v+
3673    define count_lines_in_file (file)
3674    {
3675       variable fp, line, count;
3676
3677       fp = fopen (file, "r");    % Open the file for reading
3678       if (fp == NULL)
3679         verror ("%s failed to open", file);
3680
3681       count = 0;
3682       while (-1 != fgets (&line, fp))
3683         count++;
3684
3685       () = fclose (fp);
3686       return count;
3687    }
3688#v-
3689 Note that \exmp{&line} was passed to the \var{fgets} function.  When
3690 \var{fgets} returns, \var{line} will contain the line of text read in
3691 from the file.  Also note how the return value from \var{fclose} was
3692 handled.
3693
3694 Although the preceding example closed the file via \var{fclose},
3695 there is no need to explicitly close a file because \slang will
3696 automatically close the file when it is no longer referenced.  Since
3697 the only variable to reference the file is \var{fp}, it would have
3698 automatically been closed when the function returned.
3699
3700 Suppose that it is desired to count the number of characters in the
3701 file instead of the number of lines.  To do this, the \var{while}
3702 loop could be modified to count the characters as follows:
3703#v+
3704      while (-1 != fgets (&line, fp))
3705        count += strlen (line);
3706#v-
3707 The main difficulty with this approach is that it will not work for
3708 binary files, i.e., files that contain null characters.  For such
3709 files, the file should be opened in \em{binary} mode via
3710#v+
3711      fp = fopen (file, "rb");
3712#v-
3713 and then the data read in using the \var{fread} function:
3714#v+
3715      while (-1 != fread (&line, Char_Type, 1024, fp))
3716           count += bstrlen (line);
3717#v-
3718 The \var{fread} function requires two additional arguments: the type
 of object to read (\var{Char_Type} in this case), and the number of
3720 such objects to read.  The function returns the number of objects
3721 actually read, or -1 upon failure.  The \var{bstrlen} function was
3722 used to compute the length of \var{line} because for \var{Char_Type}
3723 or \var{UChar_Type} objects, the \var{fread} function assigns a
3724 \em{binary} string (\var{BString_Type}) to \var{line}.
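
 Putting these pieces together gives the following sketch of a
 complete function; the name \var{count_bytes_in_file} is chosen here
 only for illustration:
#v+
    define count_bytes_in_file (file)
    {
       variable fp, buf, count;

       fp = fopen (file, "rb");   % Open the file in binary mode
       if (fp == NULL)
         verror ("%s failed to open", file);

       count = 0;
       while (-1 != fread (&buf, Char_Type, 1024, fp))
         count += bstrlen (buf);  % buf is a BString_Type

       () = fclose (fp);
       return count;
    }
#v-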
3725
3726 The \kw{foreach} construct also works with \var{File_Type} objects.
3727 For example, the number of characters in a file may be counted via
3728#v+
3729     foreach (fp) using ("char")
3730     {
3731        ch = ();
3732        count++;
3733     }
3734#v-
 To count the number of lines as well as the characters, one can use:
3736#v+
3737     foreach (fp) using ("line")
3738     {
3739        line = ();
3740        num_lines++;
3741        count += strlen (line);
3742     }
3743#v-
3744
3745 Finally, it should be mentioned that neither of these examples should
3746 be used to count the number of characters in a file when that
3747 information is more readily accessible by another means.  For
3748 example, it is preferable to get this information via the
3749 \var{stat_file} function:
3750#v+
3751     define count_chars_in_file (file)
3752     {
3753        variable st;
3754
3755        st = stat_file (file);
3756        if (st == NULL)
3757          error ("stat_file failed.");
3758        return st.st_size;
3759     }
3760#v-
3761
3762\sect2{POSIX I/O}
3763
3764\sect2{Advanced I/O techniques}
3765
3766  The previous examples illustrate how to read and write objects of a
3767  single data-type from a file, e.g.,
3768#v+
3769      num = fread (&a, Double_Type, 20, fp);
3770#v-
3771  would result in a \exmp{Double_Type[num]} array being assigned to
3772  \var{a} if successful.  However, suppose that the binary data file
3773  consists of numbers in a specified byte-order.  How can one read
3774  such objects with the proper byte swapping?  The answer is to use
3775  the \var{fread} function to read the objects as \var{Char_Type} and
3776  then \em{unpack} the resulting string into the specified data type,
3777  or types.  This process is facilitated using the \var{pack} and
3778  \var{unpack} functions.
3779
3780  The \var{pack} function follows the syntax
3781\begin{tscreen}
3782    BString_Type pack (\em{format-string}, \em{item-list});
3783\end{tscreen}
3784  and combines the objects in the \em{item-list} according to
3785  \em{format-string} into a binary string and returns the result.
3786  Likewise, the \var{unpack} function may be used to convert a binary
3787  string into separate data objects:
3788\begin{tscreen}
3789   (\em{variable-list}) = unpack (\em{format-string}, \em{binary-string});
3790\end{tscreen}
3791
3792  The format string consists of one or more data-type specification
3793  characters, and each may be followed by an optional decimal length
3794  specifier. Specifically, the data-types are specified according to
3795  the following table:
3796#v+
3797     c     char
3798     C     unsigned char
3799     h     short
3800     H     unsigned short
3801     i     int
3802     I     unsigned int
3803     l     long
3804     L     unsigned long
3805     j     16 bit int
     J     16 bit unsigned int
3807     k     32 bit int
3808     K     32 bit unsigned int
3809     f     float
3810     d     double
3811     F     32 bit float
3812     D     64 bit float
3813     s     character string, null padded
3814     S     character string, space padded
3815     x     a null pad character
3816#v-
3817  A decimal length specifier may follow the data-type specifier. With
3818  the exception of the \var{s} and \var{S} specifiers, the length
3819  specifier indicates how many objects of that data type are to be
3820  packed or unpacked from the string.  When used with the \var{s} or
3821  \var{S} specifiers, it indicates the field width to be used.  If the
3822  length specifier is not present, the length defaults to one.
3823
3824  With the exception of \var{c}, \var{C}, \var{s}, \var{S}, and
3825  \var{x}, each of these may be prefixed by a character that indicates
3826  the byte-order of the object:
3827#v+
3828     >    big-endian order (network order)
3829     <    little-endian order
3830     =    native byte-order
3831#v-
3832  The default is native byte order.
3833
3834  Here are a few examples that should make this more clear:
3835#v+
3836     a = pack ("cc", 'A', 'B');         % ==> a = "AB";
3837     a = pack ("c2", 'A', 'B');         % ==> a = "AB";
3838     a = pack ("xxcxxc", 'A', 'B');     % ==> a = "\0\0A\0\0B";
     a = pack ("h2", 'A', 'B');         % ==> a = "\0A\0B" or "A\0B\0"
     a = pack (">h2", 'A', 'B');        % ==> a = "\0A\0B"
     a = pack ("<h2", 'A', 'B');        % ==> a = "A\0B\0"
     a = pack ("s4", "AB");             % ==> a = "AB\0\0"
     a = pack ("s4s2", "AB", "CD");     % ==> a = "AB\0\0CD"
     a = pack ("S4", "AB");             % ==> a = "AB  "
3845     a = pack ("S4S2", "AB", "CD");     % ==> a = "AB  CD"
3846#v-
3847
3848  When unpacking, if the length specifier is greater than one, then an
3849  array of that length will be returned.  In addition, trailing
  whitespace and null characters are stripped when unpacking an object
3851  given by the \var{S} specifier.  Here are a few examples:
3852#v+
3853    (x,y) = unpack ("cc", "AB");         % ==> x = 'A', y = 'B'
3854    x = unpack ("c2", "AB");             % ==> x = ['A', 'B']
3855    x = unpack ("x<H", "\0\xAB\xCD");    % ==> x = 0xCDABuh
3856    x = unpack ("xxs4", "a b c\0d e f");  % ==> x = "b c\0"
3857    x = unpack ("xxS4", "a b c\0d e f");  % ==> x = "b c"
3858#v-
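  The effect of the byte-order prefix can be checked by packing an
  integer in one order and unpacking it in both.  The value used in
  this sketch is arbitrary:
#v+
    variable b = pack (">k", 0x01020304);   % bytes 0x01 0x02 0x03 0x04
    variable n = unpack (">k", b);          % ==> n = 0x01020304
    variable m = unpack ("<k", b);          % ==> m = 0x04030201
    variable len = bstrlen (b);             % ==> len = 4
#v-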
3859
\sect3{Example: Reading /var/log/utmp}
3861
3862  Consider the task of reading the Unix system file
3863  \var{/var/log/utmp}, which contains login records about who logged
3864  onto the system.  This file format is documented in section 5 of the
3865  online Unix man pages, and consists of a sequence of entries
3866  formatted according to the C structure \var{utmp} defined in the
3867  \var{utmp.h} C header file.  The actual details of the structure
3868  may vary from one version of Unix to the other.  For the purposes of
3869  this example, consider its definition under the Linux operating
3870  system running on an Intel processor:
3871#v+
3872    struct utmp {
3873       short ut_type;              /* type of login */
3874       pid_t ut_pid;               /* pid of process */
3875       char ut_line[12];           /* device name of tty - "/dev/" */
3876       char ut_id[2];              /* init id or abbrev. ttyname */
3877       time_t ut_time;             /* login time */
3878       char ut_user[8];            /* user name */
3879       char ut_host[16];           /* host name for remote login */
3880       long ut_addr;               /* IP addr of remote host */
3881    };
3882#v-
3883  On this system, \var{pid_t} is defined to be an \var{int} and
3884  \var{time_t} is a \var{long}.  Hence, a format specifier for the
3885  \var{pack} and \var{unpack} functions is easily constructed to be:
3886#v+
3887     "h i S12 S2 l S8 S16 l"
3888#v-
3889  However, this particular definition is naive because it does not
3890  allow for structure padding performed by the C compiler in order to
3891  align the data types on suitable word boundaries.  Fortunately, the
3892  intrinsic function \var{pad_pack_format} may be used to modify a
3893  format by adding the correct amount of padding in the right places.
3894  In fact, \var{pad_pack_format} applied to the above format on an
3895  Intel-based Linux system produces the result:
3896#v+
3897     "h x2 i S12 S2 x2 l S8 S16 l"
3898#v-
3899  Here we see that 4 bytes of padding were added.
3900
3901  The other missing piece of information is the size of the structure.
3902  This is useful because we would like to read in one structure at a
3903  time using the \var{fread} function.  Knowing the size of the
3904  various data types makes this easy; however it is even easier to use
3905  the \var{sizeof_pack} intrinsic function, which returns the size (in
3906  bytes) of the structure described by the pack format.
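
  For example, the following sketch prints the padded format and the
  size of the resulting structure.  No output is shown because both
  values depend upon the system:
#v+
    variable fmt = pad_pack_format ("h i S12 S2 l S8 S16 l");
    vmessage ("format: %s, size: %d bytes", fmt, sizeof_pack (fmt));
#v-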
3907
3908  So, with all the pieces in place, it is rather straightforward to
3909  write the code:
3910#v+
3911    variable format, size, fp, buf;
3912
3913    typedef struct
3914    {
3915       ut_type, ut_pid, ut_line, ut_id,
3916       ut_time, ut_user, ut_host, ut_addr
3917    } UTMP_Type;
3918
3919    format = pad_pack_format ("h i S12 S2 l S8 S16 l");
3920    size = sizeof_pack (format);
3921
    define print_utmp (u)
    {
       () = fprintf (stdout, "%-16s %-12s %-16s %s\n",
                     u.ut_user, u.ut_line, u.ut_host, ctime (u.ut_time));
    }
3928
3929
3930   fp = fopen ("/var/log/utmp", "rb");
3931   if (fp == NULL)
3932     error ("Unable to open utmp file");
3933
3934   () = fprintf (stdout, "%-16s %-12s %-16s %s\n",
3935                          "USER", "TTY", "FROM", "LOGIN@");
3936
3937   variable U = @UTMP_Type;
3938
3939   while (-1 != fread (&buf, Char_Type, size, fp))
3940     {
3941       set_struct_fields (U, unpack (format, buf));
3942       print_utmp (U);
3943     }
3944
3945   () = fclose (fp);
3946#v-
3947  A few comments about this example are in order.  First of all, note
3948  that a new data type called \var{UTMP_Type} was created, although
3949  this was not really necessary.  We also opened the file in binary
3950  mode, but this too is optional under a Unix system where there is no
3951  distinction between binary and text modes. The \var{print_utmp}
  function does not print all of the structure fields.  Finally, the
  return values from \var{fprintf} and \var{fclose} were dealt with.
3955
3956#%}}}
3957
3958\sect1{Debugging} #%{{{
3959
3960 The current implementation provides no support for an interactive
3961 debugger, although a future version will.  Nevertheless, \slang has
3962 several features that aid the programmer in tracking down problems,
3963 including function call tracebacks and the tracing of function calls.
3964 However, the biggest debugging aid stems from the fact that the
 language is interpreted, permitting one to easily add debugging
3966 statements to the code.
3967
3968 To enable debugging information, add the lines
3969#v+
3970    _debug_info = 1;
3971    _traceback = 1;
3972#v-
 to the top of the source file of the code containing the bug and
 then reload the file.  Setting the \var{_debug_info} variable to
3975 \exmp{1} causes line number information to be compiled into the
3976 functions when the file is loaded.  The \var{_traceback} variable
3977 controls whether or not traceback information should be generated.
3978 If it is set to \exmp{1}, the values of local variables will be
3979 dumped when the traceback is generated.  Setting this variable
3980 to \exmp{-1} will cause only function names to be reported in the
3981 traceback.
3982
3983 Here is an example of a traceback report:
3984#v+
3985    S-Lang Traceback: error
3986    S-Lang Traceback: verror
3987    S-Lang Traceback: (Error occurred on line 65)
3988    S-Lang Traceback: search_generic_search
3989      Local Variables:
3990        $0: Type: String_Type,  Value:  "Search forward:"
3991        $1: Type: Integer_Type, Value:  1
3992        $2: Type: Ref_Type,     Value:  _function_return_1
3993        $3: Type: String_Type,  Value:  "abcdefg"
3994        $4: Type: Integer_Type, Value:  1
3995    S-Lang Traceback: (Error occurred on line 72)
3996    S-Lang Traceback: search_forward
3997#v-
3998 There are several ways to read this report; perhaps the simplest is
3999 to read it from the bottom.  This report says that on line \exmp{72},
4000 the \var{search_forward} function called the
 \var{search_generic_search} function.  On line \exmp{65} it called the
 \verb{verror} function, which called \var{error}.  The
 \var{search_generic_search} function contains \exmp{5} local variables,
 which are represented symbolically as \exmp{$0} through \exmp{$4}.
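
 To experiment with these settings, one might load a small file such
 as the following sketch, in which an error raised in \var{inner} is
 reached through \var{outer}.  The function names are invented for
 this example, and the report produced will differ from the one shown
 above:
#v+
    _debug_info = 1;
    _traceback = 1;

    define inner (x)
    {
       if (x < 0)
         verror ("inner: unexpected value %d", x);
    }

    define outer ()
    {
       variable y = -1;    % local variable that may appear in the report
       inner (y);
    }

    outer ();
#v-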
4005
4006
4007#%}}}
4008
4009#i regexp.tm
4010
4011\sect1{Future Directions} #%{{{
4012
4013 Several new features or enhancements to the \slang language are
4014 planned for the next major release.  In no particular order, these
4015 include:
4016\begin{itemize}
4017  \item An interactive debugging facility.
4018  \item Function qualifiers.  These entities should already be
4019  familiar to VMS users or to those who are familiar with the IDL
4020  language.  Basically, a qualifier is an optional argument that is
4021  passed to a function, e.g., \exmp{plot(X,Y,/logx)}.  Here
4022  \exmp{/logx} is a qualifier that specifies that the plot function
4023  should use a log scale for \exmp{x}.
4024  \item File local variables and functions.  A file local variable or
4025  function is an object that is global to the file that defines it.
4026  \item Multi-threading.  Currently the language does not support
4027  multiple threads.
4028\end{itemize}
4029
4030
4031#%}}}
4032
4033\appendix
4034
4035#i copyright.tm
4036
4037\end{\documentstyle}
4038