#% -*- mode: textmac; mode: fold -*-

#% text-macro definitions #%{{{
#i linuxdoc.tm

#d slang \bf{S-Lang}
#d slrn \bf{slrn}
#d jed \bf{jed}
#d kw#1 \tt{$1}
#d exmp#1 \tt{$1}
#d var#1 \tt{$1}
#d ldots ...
#d times *
#d math#1 $1
#d sc#1 \tt{$1}
#d verb#1 \tt{$1}
#d sldxe \bf{sldxe}
#d url#1 <htmlurl url="$1" name="$1">
#d slang-library-reference \bf{The \slang Library Reference}
#d chapter#1 <chapt>$1<p>
#d preface <preface>
#d tag#1 <tag>$1</tag>
#d appendix <appendix>

#d NULL <tt>NULL</tt>
#d kbd#1 <tt>$1</tt>

#% d documentstyle article
#% d sect1 \section
#% d sect2 \subsection
#% d sect3 \subsubsection
#% d sect4 \subsubsection

#d documentstyle book
#d sect1 \chapter
#d sect2 \section
#d sect3 \subsection
#d sect4 \subsubsection


#%}}}

\linuxdoc

\begin{\documentstyle}

\title A Guide to the S-Lang Language
\author John E. Davis, \tt{davis@space.mit.edu}
\date \__today__

\toc

#i preface.tm

\sect1{Introduction} #%{{{

   \slang is a powerful interpreted language that may be embedded into
   an application to make the application extensible. This enables
   the application to be used in ways not envisioned by the programmer,
   thus providing the application with much more flexibility and
   power. Examples of applications that take advantage of the
   interpreter in this way include the \jed editor and the \slrn
   newsreader.

\sect2{Language Features}

   The language features both global and local variables, branching
   and looping constructs, user-defined functions, structures,
   datatypes, and arrays. In addition, there is limited support for
   pointer types. The concise array syntax rivals that of commercial
   array-based numerical computing environments.

\sect2{Data Types and Operators} #%{{{

   The language provides built-in support for string, integer (signed
   and unsigned long and short), double precision floating point, and
   double precision complex numbers. In addition, it supports user
   defined structure types, multi-dimensional array types, and
   associative arrays. To facilitate the construction of
   sophisticated data structures such as linked lists and trees, a
   `reference' type was added to the language. The reference type
   provides much of the same flexibility as pointers in other
   languages. Finally, applications embedding the interpreter may
   also provide special application specific types, such as the
   \var{Mark_Type} that the \jed editor provides.

   The language provides standard arithmetic operations such as
   addition, subtraction, multiplication, and division. It also
   provides support for modulo arithmetic as well as operations at
   the bit level, e.g., exclusive-or. Any binary or unary operator
   may be extended to work with any data type. For example, the
   addition operator (\var{+}) has been extended to work between
   string types to permit string concatenation.

   The binary and unary operators work transparently with array types.
   For example, if \var{a} and \var{b} are arrays, then \exmp{a + b}
   produces an array whose elements are the result of element by
   element addition of \var{a} and \var{b}. This permits one to do
   vector operations without explicitly looping over the array
   indices.

#%}}}

\sect2{Statements and Functions} #%{{{

   The \slang language supports several types of looping constructs and
   conditional statements. The looping constructs include \kw{while},
   \kw{do...while}, \kw{for}, \kw{forever}, \kw{loop}, \kw{foreach},
   and \kw{_for}. The conditional statements include \kw{if},
   \kw{if-then-else}, and \kw{!if}.

   User defined functions may be defined to return zero, one, or more
   values. Functions that return zero values are similar to
   `procedures' in languages such as PASCAL. The local variables of a
   function are always created on a stack allowing one to create
   recursive functions. Parameters to a function are always passed by
   value and never by reference. However, the language supports a
   \em{reference} data type that allows one to simulate pass by
   reference.

   Unlike many interpreted languages, \slang allows functions to be
   dynamically loaded (function autoloading). It also provides
   constructs specifically designed for error handling and recovery as
   well as debugging aids (e.g., function tracebacks).

   Functions and variables may be declared as private belonging to a
   namespace associated with the compilation unit that defines the
   function or variable. The ideas behind the namespace implementation
   stem from the C language and should be quite familiar to anyone
   familiar with C.

#%}}}

\sect2{Error Handling} #%{{{

   The \slang language defines a construct called an \em{error-block}
   that may be used for error handling and recovery. When a non-fatal
   run-time error is encountered, any error blocks that have been
   defined are executed as the run-time stack unwinds. An error block
   can optionally clear the error and the program will continue
   running after the statement that triggered the error. This
   mechanism is somewhat similar to try-catch in C++.

#%}}}

\sect2{Run-Time Library} #%{{{

   Functions that compose the \slang run-time library are called
   \em{intrinsics}. Examples of \slang intrinsic functions available
   to every \slang application include string manipulation functions
   such as \var{strcat}, \var{strchop}, and \var{strcmp}.
   The \slang
   library also provides mathematical functions such as \var{sin},
   \var{cos}, and \var{tan}; however, not all applications enable the
   use of these intrinsics. For example, to conserve memory, the 16
   bit version of the \jed editor does not provide support for any
   mathematics other than simple integer arithmetic, whereas other
   versions of the editor do support these functions.

   Most applications embedding the language will also provide a set of
   application specific intrinsic functions. For example, the \jed
   editor adds over 100 application specific intrinsic functions to
   the language. Consult your application specific documentation to
   see what additional intrinsics are supported.

#%}}}

\sect2{Input/Output}

   The language supports C-like stdio input/output functions such as
   \var{fopen}, \var{fgets}, \var{fputs}, and \var{fclose}. In
   addition it provides two functions, \var{message} and \var{error},
   for writing to the standard output device and standard error.
   Specific applications may provide other I/O mechanisms, e.g.,
   the \jed editor supports I/O to files via the editor's
   buffers.

\sect2{Obtaining \slang} #%{{{

   Comprehensive information about the library may be obtained via the
   World Wide Web from \tt{http://www.s-lang.org}.

   \slang as well as some programs that embed it are freely available
   via anonymous ftp in the United States from
\begin{itemize}
   \item \url{ftp://space.mit.edu/pub/davis}.
\end{itemize}
   It is also available outside the United States from the following
   mirror sites:
\begin{itemize}
   \item \url{ftp://ftp.uni-stuttgart.de/pub/unix/misc/slang/}
   \item \url{ftp://ftp.fu-berlin.de/pub/unix/news/slrn/}
   \item \url{ftp://ftp.ntua.gr/pub/lang/slang/}
\end{itemize}

   The Usenet newsgroup \var{alt.lang.s-lang} was created for \slang
   programmers to exchange information and share macros for the various
   programs that embed the language. The newsgroup \var{comp.editors}
   can be a useful resource for \slang macros for the \jed editor.
   Similarly, \slrn users will find \var{news.software.readers} to be a
   valuable source of information.

   Finally, two mailing lists dealing with the \slang library have been
   created:
\begin{itemize}
   \item \tt{slang-announce@babayaga.math.fu-berlin.de}
   \item \tt{slang-workers@babayaga.math.fu-berlin.de}
\end{itemize}
   The first list is for announcements of new releases of the library, while the
   second list is intended for those who use the library for their own code
   development. To subscribe to the announcement list, send an email to
   \tt{slang-announce-subscribe@babayaga.math.fu-berlin.de} and include
   the word \tt{subscribe} in the body of the message. To subscribe to
   the developers list, use the address
   \tt{slang-workers-subscribe@babayaga.math.fu-berlin.de}.

#%}}}

#%}}}

\sect1{Overview of the Language} #%{{{

   The purpose of this section is to give the reader a feel for the
   \slang language, its syntax, and its capabilities. The information
   and examples presented in this section should be sufficient to
   provide the reader with the necessary background to understand the
   rest of the document.

\sect2{Variables and Functions} #%{{{

   \slang is different from many other interpreted languages in the
   sense that all variables and functions must be declared before they
   can be used.

   Variables are declared using the \kw{variable} keyword, e.g.,
#v+
      variable x, y, z;
#v-
   declares three variables, \var{x}, \var{y}, and \var{z}. Note the
   semicolon at the end of the statement. \em{All \slang statements must
   end in a semi-colon.}

   Unlike compiled languages such as C, it is not necessary to specify
   the data type of a \slang variable. The data type of a \slang
   variable is determined upon assignment. For example, after
   execution of the statements
#v+
      x = 3;
      y = sin (5.6);
      z = "I think, therefore I am.";
#v-
   \var{x} will be an integer, \var{y} will be a
   double, and \var{z} will be a string. In fact, it is even possible
   to re-assign \var{x} to a string:
#v+
      x = "x was an integer, but now is a string";
#v-
   Finally, one can combine variable declarations and assignments in
   the same statement:
#v+
      variable x = 3, y = sin(5.6), z = "I think, therefore I am.";
#v-

   Most functions are declared using the \kw{define} keyword. A
   simple example is
#v+
      define compute_average (x, y)
      {
         variable s = x + y;
         return s / 2.0;
      }
#v-
   which defines a function that simply computes the average of two
   numbers and returns the result. This example shows that a function
   consists of three parts: the function name, a parameter list, and
   the function body.

   The parameter list consists of a comma separated list of variable
   names. It is not necessary to declare variables within a parameter
   list; they are implicitly declared. However, all other \em{local}
   variables used in the function must be declared.
   If the function
   takes no parameters, then the parameter list must still be present,
   but empty:
#v+
      define go_left_5 ()
      {
         go_left (5);
      }
#v-
   The last example is a function that takes no arguments and returns
   no value. Some languages such as PASCAL distinguish such objects
   from functions that return values by calling these objects
   \em{procedures}. However, \slang, like C, does not make such a
   distinction.

   The language permits \em{recursive} functions, i.e., functions that
   call themselves. The way to do this in \slang is to first declare
   the function using the form:
\begin{tscreen}
     define \em{function-name} ();
\end{tscreen}
   It is not necessary to declare a parameter list when declaring a
   function in this way.

   The most famous example of a recursive function is the factorial
   function. Here is how to implement it using \slang:
#v+
      define factorial ();   % declare it for recursion

      define factorial (n)
      {
         if (n < 2) return 1;
         return n * factorial (n - 1);
      }
#v-
   This example also shows how to mix comments with code. \slang uses
   the `\var{%}' character to start a comment and all characters from
   the comment character to the end of the line are ignored.

#%}}}

\sect2{Strings} #%{{{

   Perhaps the most appealing feature of any interpreted language is
   that it frees the user from the responsibility of memory management.
   This is particularly evident when contrasting how
   \slang handles string variables with a lower level language such as
   C. Consider a function that concatenates three strings. An
   example in \slang is:
#v+
      define concat_3_strings (a, b, c)
      {
         return strcat (a, strcat (b, c));
      }
#v-
   This function uses the built-in
   \var{strcat} function for concatenating two strings.
   In C, the
   simplest such function would look like:
#v+
      char *concat_3_strings (char *a, char *b, char *c)
      {
         unsigned int len;
         char *result;
         len = strlen (a) + strlen (b) + strlen (c);
         if (NULL == (result = (char *) malloc (len + 1)))
           exit (1);
         strcpy (result, a);
         strcat (result, b);
         strcat (result, c);
         return result;
      }
#v-
   Even this C example is misleading since none of the issues of memory
   management of the strings has been dealt with. The \slang language
   hides all these issues from the user.

   Binary operators have been defined to work with the string data
   type. In particular the \var{+} operator may be used to perform
   string concatenation. That is, one can use the
   \var{+} operator as an alternative to \var{strcat}:
#v+
      define concat_3_strings (a, b, c)
      {
         return a + b + c;
      }
#v-
   See section ??? for more information about string variables.

#%}}}

\sect2{Referencing and Dereferencing} #%{{{
   The unary prefix operator, \var{&}, may be used to create a
   \em{reference} to an object, which is similar to a pointer
   in other languages. References are commonly used as a mechanism to
   pass a function as an argument to another function as the following
   example illustrates:
#v+
       define compute_functional_sum (funct)
       {
          variable i, sum;

          sum = 0;
          for (i = 0; i < 10; i++)
            {
               sum += (@funct)(i);
            }
          return sum;
       }

       variable sin_sum = compute_functional_sum (&sin);
       variable cos_sum = compute_functional_sum (&cos);
#v-
   Here, the function \var{compute_functional_sum} applies the
   function specified by the parameter \var{funct} to the first
   \exmp{10} integers and returns the sum. The two statements
   following the function definition show how the \var{sin} and
   \var{cos} functions may be used.
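
   For readers coming from C, the same pattern can be sketched with a
   function pointer. This is only an illustrative C analogue and not
   part of \slang: the helper \var{square} is a hypothetical function
   added so that the sketch needs no math library, and calling through
   the pointer plays the role of applying a \slang reference.

```c
/* Hypothetical helper, used only so the sketch is self-contained. */
double square (double x)
{
   return x * x;
}

/* C analogue of the S-Lang compute_functional_sum example: apply the
 * function that funct points to to the integers 0 through 9 and return
 * the sum of the results. */
double compute_functional_sum (double (*funct) (double))
{
   double sum = 0.0;
   int i;

   for (i = 0; i < 10; i++)
     sum += (*funct) ((double) i);   /* call through the pointer */

   return sum;
}
```

   With \exmp{<math.h>} included, \exmp{compute_functional_sum (sin)}
   would correspond to the \exmp{&sin} call shown above. In C a
   function name used as an argument already yields a pointer, so no
   explicit \var{&} is required.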

   Note the \var{@} operator in the definition of
   \var{compute_functional_sum}. It is known as the \em{dereference}
   operator and is the inverse of the reference operator.

   Another use of the reference operator is in the context of the
   \var{fgets} function. For example,
#v+
      define read_nth_line (file, n)
      {
         variable fp, line;
         fp = fopen (file, "r");
         if (fp == NULL)       % the file could not be opened
           return NULL;

         while (n > 0)
           {
              if (-1 == fgets (&line, fp))
                return NULL;
              n--;
           }
         return line;
      }
#v-
   uses the \var{fgets} function to read the nth line of a file.
   In particular, a reference to the local variable \var{line} is
   passed to \var{fgets}, and upon return \var{line} will be set to
   the character string read by \var{fgets}.

   Finally, references may be used as an alternative to multiple
   return values by passing information back via the parameter list.
   The example involving \var{fgets} presented above provided an
   illustration of this. Another example is
#v+
       define set_xyz (x, y, z)
       {
          @x = 1;
          @y = 2;
          @z = 3;
       }
       variable X, Y, Z;
       set_xyz (&X, &Y, &Z);
#v-
   which, after execution, results in \var{X} set to \exmp{1}, \var{Y}
   set to \exmp{2}, and \var{Z} set to \exmp{3}. A C programmer will
   note the similarity of \var{set_xyz} to the following C
   implementation:
#v+
      void set_xyz (int *x, int *y, int *z)
      {
         *x = 1;
         *y = 2;
         *z = 3;
      }
#v-
#%}}}

\sect2{Arrays} #%{{{
   The \slang language supports multi-dimensional arrays of all
   datatypes. For example, one can define arrays of references to
   functions as well as arrays of arrays. Here are a few examples of
   creating arrays:
#v+
       variable A = Integer_Type [10];
       variable B = Integer_Type [10, 3];
       variable C = [1, 3, 5, 7, 9];
#v-
   The first example creates an array of \var{10} integers and assigns
   it to the variable \var{A}.
   The second example creates a 2-d array
   of \var{30} integers arranged in \var{10} rows and \var{3} columns
   and assigns the result to \var{B}. In the last example, an array
   of \var{5} integers is assigned to the variable \var{C}. However,
   in this case the elements of the array are initialized to the
   values specified. This is known as an \em{inline-array}.

   \slang also supports something called a
   \em{range-array}. An example of such an array is
#v+
      variable C = [1:9:2];
#v-
   This will produce an array of 5 integers running from \exmp{1}
   through \exmp{9} in increments of \exmp{2}.

   Arrays are passed by reference to functions and never by value.
   This permits one to write functions which can initialize arrays.
   For example,
#v+
      define init_array (a)
      {
         variable i, imax;

         imax = length (a);
         for (i = 0; i < imax; i++)
           {
              a[i] = 7;
           }
      }

      variable A = Integer_Type [10];
      init_array (A);
#v-
   creates an array of \var{10} integers and initializes all its
   elements to \var{7}.

   There are more concise ways of accomplishing the result of the
   previous example. These include:
#v+
      variable A = [7, 7, 7, 7, 7, 7, 7, 7, 7, 7];
      variable A = Integer_Type [10];  A[[0:9]] = 7;
      variable A = Integer_Type [10];  A[*] = 7;
#v-
   The second and third methods use an array of indices to index the array
   \var{A}. In the second, the range of indices has been explicitly
   specified, whereas the third example uses a wildcard form. See
   section ??? for more information about array indexing.
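
   What it means to index an array with an array of indices can be
   sketched in C as an explicit loop. The functions
   \var{make_index_range} and \var{assign_at_indices} below are
   hypothetical names chosen for this illustration. They are not
   \slang intrinsics, and they say nothing about how the interpreter
   implements the operation.

```c
#include <stddef.h>

/* Fill idx with first, first+1, ..., last (inclusive), mimicking the
 * integer range [first:last], and return the number of indices stored.
 * The caller must make idx large enough; no bounds checking is done. */
size_t make_index_range (size_t *idx, size_t first, size_t last)
{
   size_t n = 0;
   size_t i;

   for (i = first; i <= last; i++)
     idx[n++] = i;

   return n;
}

/* Store value in each element of a selected by the index array idx,
 * which is the effect of an assignment such as A[[0:9]] = 7. */
void assign_at_indices (int *a, const size_t *idx, size_t num_idx, int value)
{
   size_t k;

   for (k = 0; k < num_idx; k++)
     a[idx[k]] = value;
}
```

   Building the range \exmp{0} through \exmp{9} and then assigning
   \exmp{7} through it sets all ten elements of a ten element array,
   which is what the one-line \slang forms above express directly.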

   Although the examples have pertained to integer arrays, the fact is
   that \slang arrays can be of any type, e.g.,
#v+
      variable A = Double_Type [10];
      variable B = Complex_Type [10];
      variable C = String_Type [10];
      variable D = Ref_Type [10];
#v-
   create \var{10} element arrays of double, complex, string, and
   reference types, respectively. The last example may be used to
   create an array of functions, e.g.,
#v+
      D[0] = &sin;
      D[1] = &cos;
#v-

   The language also defines unary, binary, and mathematical
   operations on arrays. For example, if \var{A} and \var{B} are
   integer arrays, then \exmp{A + B} is an array whose elements are
   the sum of the elements of \var{A} and \var{B}. A trivial example
   that illustrates the power of this capability is
#v+
        variable X, Y;
        X = [0:2*PI:0.01];
        Y = 20 * sin (X);
#v-
   which is equivalent to the highly simplified C code:
#v+
        double *X, *Y;
        unsigned int i, n;

        n = (2 * PI) / 0.01 + 1;
        X = (double *) malloc (n * sizeof (double));
        Y = (double *) malloc (n * sizeof (double));
        for (i = 0; i < n; i++)
          {
            X[i] = i * 0.01;
            Y[i] = 20 * sin (X[i]);
          }
#v-


#%}}}

\sect2{Structures and User-Defined Types} #%{{{

   A \em{structure} is similar to an array in the sense that it is a
   container object. However, the elements of an array must all be of
   the same type (or of \var{Any_Type}), whereas a structure is
   heterogeneous. As an example, consider
#v+
      variable person = struct
      {
         first_name, last_name, age
      };
      variable bill = @person;
      bill.first_name = "Bill";
      bill.last_name = "Clinton";
      bill.age = 51;
#v-
   In this example a structure consisting of the three fields has been
   created and assigned to the variable \var{person}.
   Then an
   \em{instance} of this structure has been created using the
   dereference operator and assigned to \var{bill}. Finally, the
   individual fields of \var{bill} were initialized. This is an
   example of an \em{anonymous} structure.

   A \em{named} structure is really a new data type and may be created
   using the \kw{typedef} keyword:
#v+
      typedef struct
      {
         first_name, last_name, age
      }
      Person_Type;

      variable bill = @Person_Type;
      bill.first_name = "Bill";
      bill.last_name = "Clinton";
      bill.age = 51;
#v-
   The big advantage of creating a new type is that one can go on to
   create arrays of the data type
#v+
      variable People = Person_Type [100];
      People[0].first_name = "Bill";
      People[1].first_name = "Hillary";
#v-

   The creation and initialization of a structure may be facilitated
   by a function such as
#v+
      define create_person (first, last, age)
      {
          variable person = @Person_Type;
          person.first_name = first;
          person.last_name = last;
          person.age = age;
          return person;
      }
      variable Bill = create_person ("Bill", "Clinton", 51);
#v-

   Another common use of structures is the creation of linked lists,
   binary trees, etc. For more information about these and other
   features of structures, see section ???.


#%}}}

\sect2{Namespaces}

   In addition to the global namespace, each compilation unit (e.g., a
   file) is given a private namespace. A variable or function name
   that is declared using the \var{static} keyword will be placed in
   the private namespace associated with the compilation unit. For
   example,
#v+
       variable i;
       static variable i;
#v-
   defines two variables called \var{i}. The first declaration
   defines \var{i} in the global namespace, but the second declaration
   defines \var{i} in the private namespace.

   The \exmp{->} operator may be used in conjunction with the name of
   the namespace to access objects in the namespace. In the above
   example, to access the variable \var{i} in the global namespace,
   one would use \exmp{Global->i}. Unless otherwise specified, a
   private namespace has no name and its objects may not be accessed
   from outside the compilation unit. However, the \var{implements}
   function may be used to give the private namespace a name, allowing
   access to its objects. For example, if the file \exmp{t.sl} contains
#v+
      implements ("A");
      static variable i;
#v-
   then another file may access the variable \var{i} via \exmp{A->i}.

#%}}}

\sect1{Data Types and Literal Constants} #%{{{

   The current implementation of the \slang language permits up to 256
   distinct data types, including predefined data types such as integer and
   floating point, as well as specialized application-specific data
   types. It is also possible to create new data types in the
   language using the \kw{typedef} mechanism.

   Literal constants are objects such as the integer \exmp{3} or the
   string \exmp{"hello"}. The actual data type given to a literal
   constant depends upon the syntax of the constant. The following
   sections describe the syntax of literals of specific data types.

\sect2{Predefined Data Types} #%{{{

   The current version of \slang defines integer, floating point,
   complex, and string types. It also defines special purpose data
   types such as \var{Null_Type}, \var{DataType_Type}, and
   \var{Ref_Type}. These types are discussed below.

\sect3{Integers} #%{{{

   The \slang language supports both signed and unsigned characters,
   short integer, long integer, and plain integer types. On most 32
   bit systems, there is no difference between an integer and a long
   integer; however, they may differ on 16 and 64 bit systems.
   Generally speaking, on a 16 bit system, plain integers are 16 bit
   quantities with a range of -32768 to 32767. On a 32 bit system,
   plain integers range from -2147483648 to 2147483647.

   A plain integer \em{literal} can be specified in one of several ways:
\begin{itemize}
\item As a decimal (base 10) integer consisting of the characters
      \var{0} through \var{9}, e.g., \var{127}. An integer specified
      this way cannot begin with a leading \var{0}. That is,
      \var{0127} is \em{not} the same as \var{127}.

\item Using hexadecimal (base 16) notation consisting of the characters
      \var{0} to \var{9} and \var{A} through \var{F}. The hexadecimal
      number must be preceded by the characters \var{0x}. For example,
      \var{0x7F} specifies an integer using hexadecimal notation and has
      the same value as decimal \var{127}.

\item In Octal notation using characters \var{0} through \var{7}. The Octal
      number must begin with a leading \var{0}. For example,
      \var{0177} and \var{127} represent the same integer.

      Short, long, and unsigned types may be specified by using the
      proper suffixes: \var{L} indicates that the integer is a long
      integer, \var{h} indicates that the integer is a short integer, and
      \var{U} indicates that it is unsigned. For example, \exmp{1UL}
      specifies an unsigned long integer.

      Finally, a character literal may be specified using a notation
      containing a character enclosed in single quotes as \exmp{'a'}.
      The value of the character specified this way will lie in the
      range 0 to 255 and will be determined by the ASCII value of the
      character in quotes. For example,
#v+
              i = '0';
#v-
      assigns to \var{i} the character 48 since the \exmp{'0'} character
      has an ASCII value of 48.
\end{itemize}

   Any integer may be preceded by a minus sign to indicate that it is a
   negative integer.
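
   The decimal, hexadecimal, octal, and character notations described
   above are shared with C, so their values can be checked with a short
   C sketch. The function names below are hypothetical and exist only
   to give each literal a label. The \var{h} suffix has no C
   counterpart and is therefore left out, and the character value
   assumes an ASCII system.

```c
/* Each function returns one literal written in a notation that S-Lang
 * shares with C. */
int decimal_literal (void)
{
   return 127;
}

int hex_literal (void)
{
   return 0x7F;     /* base 16: 7*16 + 15 = 127 */
}

int octal_literal (void)
{
   return 0177;     /* base 8: 1*64 + 7*8 + 7 = 127 */
}

int leading_zero_literal (void)
{
   return 0127;     /* octal, not decimal: 1*64 + 2*8 + 7 = 87 */
}

int char_literal (void)
{
   return '0';      /* code of the character '0'; 48 in ASCII */
}
```

   The fourth function shows why a decimal literal must not begin with
   a leading \var{0}: \var{0127} is read as octal and therefore differs
   from \var{127}.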

#%}}}

\sect3{Floating Point Numbers} #%{{{

   Single and double precision floating point literals must contain either a
   decimal point or an exponent (or both). Here are examples of
   specifying the same double precision floating point number:
#v+
         12.    12.0    12e0   1.2e1   120e-1   .12e2   0.12e2
#v-
   Note that \var{12} is \em{not} a floating point number since it
   contains neither a decimal point nor an exponent. In fact,
   \var{12} is an integer.

   One may append the \var{f} character to the end of the number to
   indicate that the number is a single precision literal.

#%}}}

\sect3{Complex Numbers} #%{{{

   The language implements complex numbers as a pair of double
   precision floating point numbers. The first number in the pair
   forms the \em{real} part, while the second number forms the
   \em{imaginary} part. That is, a complex number may be regarded as the
   sum of a real number and an imaginary number.

   Strictly speaking, the current implementation of \slang does
   not support generic complex literals. However, it does support
   imaginary literals and a more generic complex number with a non-zero
   real part may be constructed from the imaginary literal via
   addition of a real number.

   An imaginary literal is specified in the same way as a floating
   point literal except that \var{i} or \var{j} is appended. For
   example,
#v+
         12i    12.0i   12e0j
#v-
   all represent the same imaginary number. Actually, \var{12i} is
   really an imaginary integer except that \slang automatically
   promotes it to a double precision imaginary number.

   A more generic complex number may be constructed from an imaginary
   literal via addition, e.g.,
#v+
        3.0 + 4.0i
#v-
   produces a complex number whose real part is \exmp{3.0} and whose
   imaginary part is \exmp{4.0}.
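
   The idea of a complex number as a pair of doubles, built by adding a
   real number to an imaginary one, can be sketched in C. The type
   \var{Complex_Pair} and the three functions below are hypothetical
   names used for illustration only. They model the description above
   and do not show how the interpreter itself stores complex values.

```c
/* Hypothetical type: a complex number stored as a pair of doubles. */
typedef struct
{
   double re;   /* real part */
   double im;   /* imaginary part */
}
Complex_Pair;

/* A real number viewed as a complex number with zero imaginary part. */
Complex_Pair from_real (double x)
{
   Complex_Pair z;
   z.re = x;
   z.im = 0.0;
   return z;
}

/* An imaginary literal such as 4.0i: zero real part. */
Complex_Pair from_imaginary (double y)
{
   Complex_Pair z;
   z.re = 0.0;
   z.im = y;
   return z;
}

/* Complex addition adds the two parts independently. */
Complex_Pair complex_add (Complex_Pair a, Complex_Pair b)
{
   Complex_Pair z;
   z.re = a.re + b.re;
   z.im = a.im + b.im;
   return z;
}
```

   In these terms the expression \exmp{3.0 + 4.0i} corresponds to
   \exmp{complex_add (from_real (3.0), from_imaginary (4.0))}, which
   yields the pair \exmp{(3.0, 4.0)}.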

   The intrinsic functions \var{Real} and \var{Imag} may be used to
   retrieve the real and imaginary parts of a complex number,
   respectively.

#%}}}

\sect3{Strings} #%{{{

   A string literal must be enclosed in double quotes as in:
#v+
      "This is a string"
#v-
   Although there is no imposed limit on the length of a string,
   string literals must be less than 256 characters in length. It is
   possible to go beyond this limit by string concatenation, e.g.,
#v+
      "This is the first part of a long string"
       + "and this is the second half"
#v-
   Any character except a newline (ASCII 10) or the null character
   (ASCII 0) may appear explicitly in a string literal. However,
   these characters may be used implicitly using the mechanism
   described below.

   The backslash character is a special character and is used to
   include other special characters (such as a newline character) in
   the string. The special characters recognized are:
#v+
       \"    --  double quote
       \'    --  single quote
       \\    --  backslash
       \a    --  bell character (ASCII 7)
       \t    --  tab character (ASCII 9)
       \n    --  newline character (ASCII 10)
       \e    --  escape character (ASCII 27)
       \xhhh --  character expressed in HEXADECIMAL notation
       \ooo  --  character expressed in OCTAL notation
       \dnnn --  character expressed in DECIMAL
#v-
   For example, to include the double quote character as part of the
   string, it must be preceded by a backslash character, e.g.,
#v+
      "This is a \"quote\""
#v-
   Similarly, the next illustrates how a newline character may be
   included:
#v+
      "This is the first line\nand this is the second"
#v-
#%}}}


\sect3{Null_Type}

   Objects of type \var{Null_Type} can have only one value:
   \var{NULL}. About the only thing that you can do with this data
   type is to assign it to variables and test for equality with
   other objects.
   Nevertheless, \var{Null_Type} is an important and
   extremely useful data type. Its main use stems from the fact that
   since it can be compared for equality with any other data type, it
   is ideal to represent the value of an object which does not yet
   have a value, or has an illegal value.

   As a trivial example of its use, consider
#v+
      define add_numbers (a, b)
      {
         if (a == NULL) a = 0;
         if (b == NULL) b = 0;
         return a + b;
      }
      variable c = add_numbers (1, 2);
      variable d = add_numbers (1, NULL);
      variable e = add_numbers (1,);
      variable f = add_numbers (,);
#v-
   It should be clear that after these statements have been executed,
   \var{c} will have a value of \exmp{3}. It should also be clear
   that \var{d} will have a value of \exmp{1} because \var{NULL} has
   been passed as the second parameter. One feature of the language
   is that if a parameter has been omitted from a function call, the
   variable associated with that parameter will be set to \var{NULL}.
   Hence, \var{e} and \var{f} will be set to \exmp{1} and \exmp{0},
   respectively.

   The \var{Null_Type} data type also plays an important role in the
   context of \em{structures}.

\sect3{Ref_Type}
   Objects of \var{Ref_Type} are created using the unary
   \em{reference} operator \var{&}. Such objects may be
   \em{dereferenced} using the dereference operator \var{@}. For
   example,
#v+
      variable sin_ref = &sin;
      variable y = (@sin_ref) (1.0);
#v-
   creates a reference to the \var{sin} function and assigns it to
   \var{sin_ref}. The second statement uses the dereference operator
   to call the function that \var{sin_ref} references.

   The \var{Ref_Type} is useful for passing functions as arguments to
   other functions, or for returning information from a function via
   its parameter list. The dereference operator is also used to create
   an instance of a structure.
   For these reasons, further discussion
   of this important type can be found in section ??? and section ???.

\sect3{Array_Type and Struct_Type}

   Variables of type \var{Array_Type} and \var{Struct_Type} are known
   as \em{container objects}. They are much more complicated than the
   simple data types discussed so far and each obeys a special syntax.
   For these reasons they are discussed in separate chapters.
   See ???.

\sect3{DataType_Type Type} #%{{{

   \slang defines a type called \var{DataType_Type}. Objects of
   this type have values that are type names. For example, an integer
   is an object of type \var{Integer_Type}. The literals of
   \var{DataType_Type} include:
#v+
     Char_Type      (signed character)
     UChar_Type     (unsigned character)
     Short_Type     (short integer)
     UShort_Type    (unsigned short integer)
     Integer_Type   (plain integer)
     UInteger_Type  (plain unsigned integer)
     Long_Type      (long integer)
     ULong_Type     (unsigned long integer)
     Float_Type     (single precision real)
     Double_Type    (double precision real)
     Complex_Type   (complex numbers)
     String_Type    (strings, C strings)
     BString_Type   (binary strings)
     Struct_Type    (structures)
     Ref_Type       (references)
     Null_Type      (NULL)
     Array_Type     (arrays)
     DataType_Type  (data types)
#v-
   as well as the names of any other types that an application
   defines.

   The built-in function \var{typeof} returns the data type of
   its argument, i.e., a \var{DataType_Type}. For instance
   \exmp{typeof(7)} returns \var{Integer_Type} and
   \exmp{typeof(Integer_Type)} returns \var{DataType_Type}. One can use this
   function as in the following example:
#v+
      if (Integer_Type == typeof (x)) message ("x is an integer");
#v-
   The literals of \var{DataType_Type} have other uses as well.
One 924 of the most common uses of these literals is to create arrays, e.g., 925#v+ 926 x = Complex_Type [100]; 927#v- 928 creates an array of \exmp{100} complex numbers and assigns it to 929 \var{x}. 930#%}}} 931 932#%}}} 933 934\sect2{Typecasting: Converting from one Type to Another} 935 936 Occasionally, it is necessary to convert from one data type to 937 another. For example, if you need to print an object as a string, 938 it may be necessary to convert it to a \var{String_Type}. The 939 \var{typecast} function may be used to perform such conversions. 940 For example, consider 941#v+ 942 variable x = 10, y; 943 y = typecast (x, Double_Type); 944#v- 945 After execution of these statements, \var{x} will have the integer 946 value \exmp{10} and \var{y} will have the double precision floating 947 point value \exmp{10.0}. If the object to be converted is an 948 array, the \var{typecast} function will act upon all elements of 949 the array. For example, 950#v+ 951 variable x = [1:10]; % Array of integers 952 variable y = typecast (x, Double_Type); 953#v- 954 will create an array of \exmp{10} double precision values and 955 assign it to \var{y}. One should also realize that it is not 956 always possible to perform a typecast. For example, any attempt to 957 convert an \var{Integer_Type} to a \var{Null_Type} will result in a 958 run-time error. 959 960 Often the interpreter will perform implicit type conversions as necessary 961 to complete calculations. For example, when multiplying an 962 \var{Integer_Type} with a \var{Double_Type}, it will convert the 963 \var{Integer_Type} to a \var{Double_Type} for the purpose of the 964 calculation. 
Thus, the example involving the conversion of an 965 array of integers to an array of doubles could have been performed 966 by multiplication by \exmp{1.0}, i.e., 967#v+ 968 variable x = [1:10]; % Array of integers 969 variable y = 1.0 * x; 970#v- 971 972 The \var{string} intrinsic function is similar to the typecast 973 function except that it converts an object to a string 974 representation. It is important to understand that a typecast from 975 some type to \var{String_Type} is \em{not} the same as converting 976 an object to its string representation. That is, 977 \exmp{typecast(x,String_Type)} is not equivalent to 978 \exmp{string(x)}. The reason for this is that when given an array, 979 the \var{typecast} function acts on each element of the array to 980 produce another array, whereas the \var{string} function produces 981 a string. 982 983 The \var{string} function is useful for printing the value of an 984 object. This use is illustrated in the following simple example: 985#v+ 986 define print_object (x) 987 { 988 message (string (x)); 989 } 990#v- 991 Here, the \var{message} function has been used because it writes a 992 string to the display. If the \var{string} function were not used 993 and the \var{message} function were passed an integer, a 994 type-mismatch error would result. 995 996#%}}} 997 998\sect1{Identifiers} #%{{{ 999 1000 The names given to variables, functions, and data types are called 1001 \em{identifiers}. There are some restrictions upon the actual 1002 characters that make up an identifier. An identifier name must 1003 start with a letter (\var{[A-Za-z]}), an underscore character, or a 1004 dollar sign. The rest of the characters in the name can be any 1005 combination of letters, digits, dollar signs, or underscore 1006 characters. However, all identifiers whose names begin with two 1007 underscore characters are reserved for internal use by the 1008 interpreter and declarations of objects with such names should be 1009 avoided.
1010 1011 Examples of valid identifiers include: 1012#v+ 1013 mary _3 _this_is_ok 1014 a7e1 $44 _44$_Three 1015#v- 1016 However, the following are not legal: 1017#v+ 1018 7abc 2e0 #xx 1019#v- 1020 In fact, \exmp{2e0} actually specifies the real number 1021 \exmp{2.0}. 1022 1023 Although the maximum length of identifiers is unspecified by the 1024 language, the length should be kept below \exmp{64} characters. 1025 1026 The following identifiers are reserved by the language for use as 1027 keywords: 1028#v+ 1029 !if _for do mod sign xor 1030 ERROR_BLOCK abs do_while mul2 sqr public 1031 EXIT_BLOCK and else not static private 1032 USER_BLOCK0 andelse exch or struct 1033 USER_BLOCK1 break for orelse switch 1034 USER_BLOCK2 case foreach pop typedef 1035 USER_BLOCK3 chs forever return using 1036 USER_BLOCK4 continue if shl variable 1037 __tmp define loop shr while 1038#v- 1039 In addition, the next major \slang release (v2.0) will reserve 1040 \exmp{try} and \exmp{catch}, so it is probably a good idea to avoid 1041 those words until then. 1042 1043#%}}} 1044 1045\sect1{Variables} #%{{{ 1046 1047 A variable must be declared before it can be used, otherwise an 1048 undefined name error will be generated. A variable is declared 1049 using the \kw{variable} keyword, e.g., 1050#v+ 1051 variable x, y, z; 1052#v- 1053 declares three variables, \exmp{x}, \exmp{y}, and \exmp{z}. This 1054 is an example of a variable declaration statement, and like all 1055 statements, it must end in a semi-colon. 1056 1057 Variables declared this way are untyped and inherit a type upon 1058 assignment. The actual type checking is performed at run-time. For 1059 example, 1060#v+ 1061 x = "This is a string"; 1062 x = 1.2; 1063 x = 3; 1064 x = 2i; 1065#v- 1066 results in \var{x} being set successively to a string, a float, an 1067 integer, and to a complex number (\exmp{0+2i}). Any attempt to use 1068 a variable before it has acquired a type will result in an 1069 uninitialized variable error.
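
 The following minimal sketch makes the successive types visible.
 It assumes the intrinsic functions \var{typeof}, \var{string}, and
 \var{message} that are used elsewhere in this guide; reporting the
 type after each assignment is an illustrative choice and not part of
 the example above.
#v+
     variable x;                       % declared, but no type yet
     x = "This is a string";
     message (string (typeof (x)));    % String_Type
     x = 1.2;
     message (string (typeof (x)));    % Double_Type
     x = 3;
     message (string (typeof (x)));    % Integer_Type
     x = 2i;
     message (string (typeof (x)));    % Complex_Type
#v-
 Note that the literal \exmp{1.2} is a double precision value, so
 \var{typeof} reports \var{Double_Type} for it.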
1070 1071 It is legal to put executable code in a variable declaration list. 1072 That is, 1073#v+ 1074 variable x = 1, y = sin (x); 1075#v- 1076 are legal variable declarations. This also provides a convenient way 1077 of initializing a variable. 1078 1079 Variables are classified as either \em{global} or \em{local}. A 1080 variable declared inside a function is said to be local and has no 1081 meaning outside the function. A variable is said to be global if 1082 it was declared outside a function. Global variables are further 1083 classified as being \var{public}, \var{static}, or \var{private}, 1084 according to the name space where they were defined. 1085 See chapter ??? for more information about name spaces. 1086 1087 The following global variables are predefined by the language and 1088 are mainly used as convenience variables: 1089#v+ 1090 $0 $1 $2 $3 $4 $5 $6 $7 $8 $9 1091#v- 1092 1093 An \em{intrinsic} variable is another type of global variable. 1094 Such variables have a definite type which cannot be altered. 1095 Variables of this type may also be defined to be read-only, or 1096 constant variables. An example of an intrinsic variable is 1097 \var{PI} which is a read-only double precision variable with a value 1098 of approximately \exmp{3.14159265358979323846}. 1099 1100#%}}} 1101 1102\sect1{Operators} #%{{{ 1103 1104 \slang supports a variety of operators that are grouped into three 1105 classes: assignment operators, binary operators, and unary operators. 1106 1107 An assignment operator is used to assign a value to a variable. 1108 They will be discussed more fully in the context of the assignment 1109 statement in section ???. 1110 1111 An unary operator acts only upon a single quantity while a binary 1112 operation is an operation between two quantities. The boolean 1113 operator \var{not} is an example of an unary operator. Examples of 1114 binary operators include the usual arithmetic operators 1115 \var{+}, \var{-}, \var{*}, and \var{/}. 
The operator given by 1116 \var{-} can be either an unary operator (negation) or a binary operator 1117 (subtraction); the actual operation is determined from the context 1118 in which it is used. 1119 1120 Binary operators are used in algebraic forms, e.g., \exmp{a + b}. 1121 Unary operators fall in one of two classes: postfix-unary or 1122 prefix-unary. For example, in the expression \exmp{-x}, the minus 1123 sign is a prefix-unary operator. 1124 1125 Not all data types have binary or unary operations defined. For 1126 example, while \var{String_Type} objects support the \var{+} 1127 operator, they do not admit the \var{*} operator. 1128 1129\sect2{Unary Operators} 1130 1131 The \bf{unary} operators operate only upon a single operand. They 1132 include: \var{not}, \var{~}, \var{-}, \var{@}, \var{&}, as well as the 1133 increment and decrement operators \var{++} and \var{--}, 1134 respectively. 1135 1136 The boolean operator \var{not} acts only upon integers and produces 1137 \var{0} if its operand is non-zero, otherwise it produces \var{1}. 1138 1139 The bit-level not operator \var{~} performs a similar function, 1140 except that it operates on the individual bits of its integer 1141 operand. 1142 1143 The arithmetic negation operator \var{-} is the most well-known 1144 unary operator. It simply reverses the sign of its operand. 1145 1146 The reference (\var{&}) and dereference (\var{@}) operators will be 1147 discussed in greater detail in section ???. Similarly, the 1148 increment (\var{++}) and decrement (\var{--}) operators will be 1149 discussed in the context of the assignment operator. 1150 1151\sect2{Binary Operators} #%{{{ 1152 1153 The binary operators may be grouped according to several classes: 1154 arithmetic operators, relational operators, boolean operators, and 1155 bitwise operators. 1156 1157 All binary and unary operators may be overloaded. 
For example, the 1158 arithmetic plus operator has been overloaded by the 1159 \var{String_Type} data type to permit concatenation between strings. 1160 1161\sect3{Arithmetic Operators} #%{{{ 1162 1163 The arithmetic operators include \var{+}, \var{-}, \var{*}, \var{/}, 1164 which perform addition, subtraction, multiplication, and division, 1165 respectively. In addition to these, \slang supports the \var{mod} 1166 operator as well as the power operator \var{^}. 1167 1168 The data type of the result produced by the use of one of these 1169 operators depends upon the data types of the binary participants. 1170 If they are both integers, the result will be an integer. However, 1171 if the operands are not of the same type, they will be converted to 1172 a common type before the operation is performed. For example, if 1173 one is a floating point value and the other is an integer, the 1174 integer will be converted to a float. In general, the promotion 1175 from one type to another is such that no information is lost, if 1176 possible. As an example, consider the expression \exmp{8/5} which 1177 indicates division of the integer \var{8} by the integer \var{5}. 1178 The result will be the integer \var{1} and \em{not} the floating 1179 point value \var{1.6}. However, \exmp{8/5.0} will produce 1180 \var{1.6} because \exmp{5.0} is a floating point number. 1181 1182#%}}} 1183 1184\sect3{Relational Operators} #%{{{ 1185 1186 The relational operators are \var{>}, \var{>=}, \var{<}, \var{<=}, 1187 \var{==}, and \var{!=}. These perform the comparisons greater 1188 than, greater than or equal, less than, less than or equal, equal, 1189 and not equal, respectively. The result of one of these 1190 comparisons is the integer \var{1} if the comparison is true, or 1191 \var{0} if the comparison is false. For example, \exmp{6 >= 5} 1192 returns \var{1}, but \var{6 == 5} produces 1193 \var{0}. 
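
 Because the result of a comparison is an ordinary number, it may
 take part in further arithmetic. The following is a minimal sketch
 of this idea; the function name \var{count_positive} is
 hypothetical and is introduced here only for illustration.
#v+
     define count_positive (x, y, z)
     {
        % Each comparison contributes 1 if true and 0 if false.
        return (x > 0) + (y > 0) + (z > 0);
     }
     variable n = count_positive (3, -1, 7);    % n is 2
#v-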
1194 1195#%}}} 1196 1197\sect3{Boolean Operators} #%{{{ 1198 There are only two boolean binary operators: \var{or} and 1199 \var{and}. These operators are defined only for integers and 1200 produce an integer result. The \var{or} operator returns \var{1} 1201 if either of its operands are non-zero, otherwise it produces 1202 \var{0}. The \var{and} operator produces \var{1} if and only if 1203 both its operands are non-zero, otherwise it produces \var{0}. 1204 1205 Neither of these operators perform the so-called boolean 1206 short-circuit evaluation. For example, consider the expression: 1207#v+ 1208 (x != 0) and (1/x > 10) 1209#v- 1210 Here, if \var{x} were to have a value of zero, a division by zero error 1211 would occur because even though \var{x!=0} evaluates to zero, the 1212 \var{and} operator is not short-circuited and the \var{1/x} expression 1213 would still be evaluated. Although these operators are not 1214 short-circuited, \slang does have another mechanism of performing 1215 short-circuit boolean evaluation via the \kw{orelse} and 1216 \kw{andelse} expressions. See below for information about these 1217 constructs. 1218 1219#%}}} 1220 1221\sect3{Bitwise Operators} #%{{{ 1222 1223 The bitwise binary operators are defined only with integer operands 1224 and are used for bit-level operations. Operators that fall in this 1225 class include \var{&}, \var{|}, \var{shl}, \var{shr}, and 1226 \var{xor}. The \var{&} operator performs a boolean AND operation 1227 between the corresponding bits of the operands. Similarly, the 1228 \var{|} operator performs the boolean OR operation on the bits. 1229 The bit-shifting operators \var{shl} and \var{shr} shift the bits 1230 of the first operand by the number given by the second operand to 1231 the left or right, respectively. Finally, the \var{xor} performs 1232 an EXCLUSIVE-OR operation. 1233 1234 These operators are commonly used to manipulate variables whose 1235 individual bits have distinct meanings. 
In particular, \var{&} is 1236 usually used to test bits, \var{|} can be used to set bits, and 1237 \var{xor} may be used to flip a bit. 1238 1239 As an example of using \var{&} to perform tests on bits, consider 1240 the following: The \jed text editor stores some of the information 1241 about a buffer in a bitmapped integer variable. The value of this 1242 variable may be retrieved using the \jed intrinsic function 1243 \var{getbuf_info}, which actually returns four quantities: the 1244 buffer flags, the name of the buffer, directory name, and file 1245 name. For the purposes of this section, only the buffer flags are 1246 of interest and can be retrieved via a function such as 1247#v+ 1248 define get_buffer_flags () 1249 { 1250 variable flags; 1251 (,,,flags) = getbuf_info (); 1252 return flags; 1253 } 1254#v- 1255 The buffer flags is a bitmapped quantity where the 0th bit 1256 indicates whether or not the buffer has been modified, the first 1257 bit indicates whether or not autosave has been enabled for the 1258 buffer, and so on. Consider for the moment the task of determining 1259 if the buffer has been modified. This can be 1260 determined by looking at the zeroth bit, if it is \var{0} the 1261 buffer has not been modified, otherwise it has. Thus we can create 1262 the function, 1263#v+ 1264 define is_buffer_modified () 1265 { 1266 variable flags = get_buffer_flags (); 1267 return (flags & 1); 1268 } 1269#v- 1270 where the integer \exmp{1} has been used since it has all of its 1271 bits set to \var{0}, except for the zeroth one, which is set to 1272 \var{1}. (At this point, it should also be apparent that bits are 1273 numbered from zero, thus an \var{8} bit integer consists of bits 1274 \var{0} to \var{7}, where \var{0} is the least significant bit and 1275 \var{7} is the most significant one.) 
Similarly, we can create another 1276 function 1277#v+ 1278 define is_autosave_on () 1279 { 1280 variable flags = get_buffer_flags (); 1281 return (flags & 2); 1282 } 1283#v- 1284 to determine whether or not autosave has been turned on for the 1285 buffer. 1286 1287 The \var{shl} operator may be used to form the integer with only 1288 the \em{nth} bit set. For example, \exmp{1 shl 6} produces an 1289 integer with all bits set to zero except the sixth bit, which is 1290 set to one. The following example exploits this fact: 1291#v+ 1292 define test_nth_bit (flags, nth) 1293 { 1294 return flags & (1 shl nth); 1295 } 1296#v- 1297 1298#%}}} 1299 1300\sect3{Namespace operator} 1301 The operator \var{->} is used in conjunction with the name of a 1302 namespace to access an object within the namespace. For example, 1303 if \exmp{A} is the name of a namespace containing the variable 1304 \var{v}, then \exmp{A->v} refers to that variable. 1305 1306\sect3{Operator Precedence} 1307 1308\sect3{Binary Operators and Functions Returning Multiple Values} #%{{{ 1309 Care must be exercised when using binary operators with an operand 1310 that returns multiple values. In fact, the current implementation 1311 of the \slang language will produce incorrect results if both 1312 operands of a binary expression return multiple values. \em{At 1313 most, only one of the operands of a binary expression can return 1314 multiple values, and that operand must be the first one, not the 1315 second.} For example, 1316#v+ 1317 define read_line (fp) 1318 { 1319 variable line, status; 1320 1321 status = fgets (&line, fp); 1322 if (status == -1) 1323 return -1; 1324 return (line, status); 1325 } 1326#v- 1327 defines a function, \var{read_line}, that takes a single argument, a 1328 handle to an open file, and returns one or two values, depending 1329 upon the return value of \var{fgets}. Now consider 1330#v+ 1331 while (read_line (fp) > 0) 1332 { 1333 text = (); 1334 % Do something with text 1335 . 1336 .
1337 } 1338#v- 1339 Here the relational binary operator \var{>} forms a comparison 1340 between one of the return values (the one at the top of the stack) 1341 and \var{0}. In accordance with the above rule, since \var{read_line} 1342 returns multiple values, it occurs as the left binary operand. 1343 Putting it on the right as in 1344#v+ 1345 while (0 < read_line (fp)) % Incorrect 1346 { 1347 text = (); 1348 % Do something with text 1349 . 1350 . 1351 } 1352#v- 1353 violates the rule and will result in the wrong answer. 1354 1355#%}}} 1356 1357#%}}} 1358 1359\sect2{Mixing Integer and Floating Point Arithmetic} 1360 1361 If a binary operation (\var{+}, \var{-}, \var{*}, \var{/}) is 1362 performed on two integers, the result is an integer. If at least 1363 one of the operands is a float, the other is converted to float and 1364 the result is float. For example: 1365#v+ 1366 11 / 2 --> 5 (integer) 1367 11 / 2.0 --> 5.5 (float) 1368 11.0 / 2 --> 5.5 (float) 1369 11.0 / 2.0 --> 5.5 (float) 1370#v- 1371 Finally, note that only integers may be used as array indices, 1372 as loop control variables, and in bit operations. The conversion 1373 functions, \var{int} and \var{float}, may be used to convert between 1374 floats and ints where appropriate, e.g., 1375#v+ 1376 int (1.5) --> 1 (integer) 1377 float(1.5) --> 1.5 (float) 1378 float (1) --> 1.0 (float) 1379#v- 1380 1381\sect2{Short Circuit Boolean Evaluation} 1382 1383 The boolean operators \var{or} and \var{and} \em{are not short 1384 circuited} as they are in some languages. \slang uses 1385 \var{orelse} and \var{andelse} expressions for short circuit boolean 1386 evaluation. However, these are not binary operators. Expressions 1387 of the form: 1388\begin{tscreen} 1389 \em{expr-1} and \em{expr-2} and ... \em{expr-n} 1390\end{tscreen} 1391 can be replaced by the short circuited version using \var{andelse}: 1392\begin{tscreen} 1393 andelse {\em{expr-1}} {\em{expr-2}} ...
{\em{expr-n}} 1394\end{tscreen} 1395 A similar syntax holds for the \var{orelse} operator. For example, consider 1396 the statement: 1397#v+ 1398 if ((x != 0) and (1/x > 10)) do_something (); 1399#v- 1400 Here, if \var{x} were to have a value of zero, a division by zero error 1401 would occur because even though \var{x!=0} evaluates to zero, the 1402 \var{and} operator is not short circuited and the \var{1/x} expression 1403 would be evaluated causing division by zero. For this case, the 1404 \var{andelse} expression could be used to avoid the problem: 1405#v+ 1406 if (andelse 1407 {x != 0} 1408 {1 / x > 10}) do_something (); 1409#v- 1410 1411#%}}} 1412 1413\sect1{Statements} #%{{{ 1414 1415 Loosely speaking, a \em{statement} is composed of \em{expressions} 1416 that are grouped according to the syntax or grammar of the language 1417 to express a complete computation. Statements are analogous to 1418 sentences in a human language and expressions are like phrases. 1419 All statements in the \slang language must end in a semi-colon. 1420 1421 A statement that occurs within a function is executed only during 1422 execution of the function. However, statements that occur outside 1423 the context of a function are evaluated immediately. 1424 1425 The language supports several different types of statements such as 1426 assignment statements, conditional statements, and so forth. These 1427 are described in detail in the following sections. 1428 1429\sect2{Variable Declaration Statements} 1430 Variable declarations were already discussed in chapter ???. 
For 1431 the sake of completeness, a variable declaration is a statement of 1432 the form 1433\begin{tscreen} 1434 variable \em{variable-declaration-list} ; 1435\end{tscreen} 1436 where the \em{variable-declaration-list} is a comma separated list 1437 of one or more variable names with optional initializations, e.g., 1438#v+ 1439 variable x, y = 2, z; 1440#v- 1441\sect2{Assignment Statements} #%{{{ 1442 1443 Perhaps the most well known form of statement is the \em{assignment 1444 statement}. Statements of this type consist of a left-hand side, 1445 an assignment operator, and a right-hand side. The left-hand side 1446 must be something to which an assignment can be performed. Such 1447 an object is called an \em{lvalue}. 1448 1449 The most common assignment operator is the simple assignment 1450 operator \var{=}. Simple examples of its use include 1451#v+ 1452 x = 3; 1453 x = some_function (10); 1454 x = 34 + 27/y + some_function (z); 1455 x = x + 3; 1456#v- 1457 In addition to the simple assignment operator, \slang 1458 also supports the assignment operators \var{+=} and \var{-=}. 1459 Internally, \slang transforms 1460#v+ 1461 a += b; 1462#v- 1463 to 1464#v+ 1465 a = a + b; 1466#v- 1467 Similarly, \exmp{a -= b} is transformed to \exmp{a = a - b}. It is 1468 extremely important to realize that, in general, \exmp{a+b} is not 1469 equal to \exmp{b+a}. This means that \exmp{a+=b} is not the same 1470 as \exmp{a=b+a}. As an example, consider 1471#v+ 1472 a = "hello"; a += "world"; 1473#v- 1474 After execution of these two statements, \var{a} will have the 1475 value \exmp{"helloworld"} and not \exmp{"worldhello"}. 1476 1477 Since adding \exmp{1} to or subtracting \exmp{1} from a variable is quite 1478 common, \slang also supports the unary increment and decrement 1479 operators \exmp{++} and \exmp{--}, respectively. That is, for 1480 numeric data types, 1481#v+ 1482 x = x + 1; 1483 x += 1; 1484 x++; 1485#v- 1486 are all equivalent.
Similarly, 1487#v+ 1488 x = x - 1; 1489 x -= 1; 1490 x--; 1491#v- 1492 are also equivalent. 1493 1494 Strictly speaking, \var{++} and \var{--} are unary operators. When 1495 used as \var{x++}, the \var{++} operator is said to be a 1496 \em{postfix-unary} operator. However, when used as \var{++x} it is 1497 said to be a \em{prefix-unary} operator. The current 1498 implementation does not distinguish between the two forms, thus 1499 \var{x++} and \var{++x} are equivalent. The reason for this 1500 equivalence is \em{that assignment expressions do not return a value in 1501 the \slang language} as they do in C. Thus one should exercise care 1502 and not try to write C-like code such as 1503#v+ 1504 x = 10; 1505 while (--x) do_something (x); % Ok in C, but not in S-Lang 1506#v- 1507 The closest valid \slang form involves a \em{comma-expression}: 1508#v+ 1509 x = 10; 1510 while (x--, x) do_something (x); % Ok in S-Lang and in C 1511#v- 1512 1513 \slang also supports a \em{multiple-assignment} statement. It is 1514 discussed in detail in section ???. 1515 1516#%}}} 1517 1518\sect2{Conditional and Looping Statements} #%{{{ 1519 1520 \slang supports a wide variety of conditional and looping 1521 statements. These constructs operate on statements grouped together 1522 in \em{blocks}. A block is a sequence of \slang statements enclosed 1523 in braces and may contain other blocks. However, a block cannot 1524 include function declarations. In the following, 1525 \em{statement-or-block} refers to either a single 1526 \slang statement or to a block of statements, and 1527 \em{integer-expression} is an integer-valued expression. 1528 \em{next-statement} represents the statement following the form 1529 under discussion. 1530 1531\sect3{Conditional Forms} #%{{{ 1532\sect4{if} 1533 The simplest condition statement is the \kw{if} statement. 
It 1534 follows the syntax 1535\begin{tscreen} 1536 if (\em{integer-expression}) \em{statement-or-block} 1537 \em{next-statement} 1538\end{tscreen} 1539 If \em{integer-expression} evaluates to a non-zero result, then the 1540 statement or group of statements implied by \em{statement-or-block} 1541 will get executed. Otherwise, control will proceed to 1542 \em{next-statement}. 1543 1544 An example of the use of this type of conditional statement is 1545#v+ 1546 if (x != 0) 1547 { 1548 y = 1.0 / x; 1549 if (x > 0) z = log (x); 1550 } 1551#v- 1552 This example illustrates two \var{if} statements where the second 1553 \var{if} statement is part of the block of statements that belongs to 1554 the first. 1555 1556\sect4{if-else} 1557 Another form of \kw{if} statement is the \em{if-else} statement. 1558 It follows the syntax: 1559\begin{tscreen} 1560 if (\em{integer-expression}) \em{statement-or-block-1} 1561 else \em{statement-or-block-2} 1562 \em{next-statement} 1563\end{tscreen} 1564 Here, if \em{integer-expression} returns non-zero, 1565 \em{statement-or-block-1} will get executed and control will pass 1566 on to \em{next-statement}. However, if \em{integer-expression} returns zero, 1567 \em{statement-or-block-2} will get executed before continuing with 1568 \em{next-statement}. A simple example of this form is 1569#v+ 1570 if (x > 0) z = log (x); else error ("x must be positive"); 1571#v- 1572 Consider the more complex example: 1573#v+ 1574 if (city == "Boston") 1575 if (street == "Beacon") found = 1; 1576 else if (city == "Madrid") 1577 if (street == "Calle Mayor") found = 1; 1578 else found = 0; 1579#v- 1580 This example illustrates a problem that beginners have with 1581 \em{if-else} statements.
The grammar presented above shows that 1582 this example is equivalent to 1583#v+ 1584 if (city == "Boston") 1585 { 1586 if (street == "Beacon") found = 1; 1587 else if (city == "Madrid") 1588 { 1589 if (street == "Calle Mayor") found = 1; 1590 else found = 0; 1591 } 1592 } 1593#v- 1594 It is important to understand the grammar and not be seduced by the 1595 indentation! 1596 1597\sect4{!if} 1598 1599 One often encounters \kw{if} statements similar to 1600\begin{tscreen} 1601 if (\em{integer-expression} == 0) \em{statement-or-block} 1602\end{tscreen} 1603 or equivalently, 1604\begin{tscreen} 1605 if (not(\em{integer-expression})) \em{statement-or-block} 1606\end{tscreen} 1607 The \kw{!if} statement was added to the language to simplify the 1608 handling of such statements. It obeys the syntax 1609\begin{tscreen} 1610 !if (\em{integer-expression}) \em{statement-or-block} 1611\end{tscreen} 1612 and is functionally equivalent to 1613\begin{tscreen} 1614 if (not (\em{integer-expression})) \em{statement-or-block} 1615\end{tscreen} 1616 1617\sect4{orelse, andelse} 1618 1619 These constructs were discussed earlier. The syntax for the 1620 \var{orelse} statement is: 1621\begin{tscreen} 1622 orelse {\em{integer-expression-1}} ... {\em{integer-expression-n}} 1623\end{tscreen} 1624 This causes each of the blocks to be executed in turn until one of 1625 them returns a non-zero integer value. The result of this statement 1626 is the integer value returned by the last block executed. For 1627 example, 1628#v+ 1629 orelse { 0 } { 6 } { 2 } { 3 } 1630#v- 1631 returns \var{6} since the second block is the first to return a 1632 non-zero result. The last two blocks will not get executed. 1633 1634 The syntax for the \var{andelse} statement is: 1635\begin{tscreen} 1636 andelse {\em{integer-expression-1}} ... {\em{integer-expression-n}} 1637\end{tscreen} 1638 Each of the blocks will be executed in turn until one of 1639 them returns a zero value.
The result of this statement is the 1640 integer value returned by the last block executed. For example, 1641#v+ 1642 andelse { 6 } { 2 } { 0 } { 4 } 1643#v- 1644 returns \var{0} since the third block will be the last to execute. 1645 1646\sect4{switch} 1647 The switch statement deviates the most from its C counterpart. The 1648 syntax is: 1649#v+ 1650 switch (x) 1651 { ... : ...} 1652 . 1653 . 1654 { ... : ...} 1655#v- 1656 The `\var{:}' operator is a special symbol which means to test 1657 the top item on the stack, and if it is non-zero, the rest of the block 1658 will get executed and control will pass out of the switch statement. 1659 Otherwise, the execution of the block will be terminated and the process 1660 will be repeated for the next block. If a block contains no 1661 \var{:} operator, the entire block is executed and control will 1662 pass onto the next statement following the \kw{switch} statement. 1663 Such a block is known as the \em{default} case. 1664 1665 As a simple example, consider the following: 1666#v+ 1667 switch (x) 1668 { x == 1 : message("Number is one.");} 1669 { x == 2 : message("Number is two.");} 1670 { x == 3 : message("Number is three.");} 1671 { x == 4 : message("Number is four.");} 1672 { x == 5 : message("Number is five.");} 1673 { message ("Number is greater than five.");} 1674#v- 1675 Suppose \var{x} has an integer value of \exmp{3}. The first two 1676 blocks will terminate at the `\var{:}' character because each of the 1677 comparisons with \var{x} will produce zero. However, the third 1678 block will execute to completion. Similarly, if \var{x} is 1679 \exmp{7}, only the last block will execute in full. 
1680 1681 A more familiar way to write the previous example uses the 1682 \kw{case} keyword: 1683#v+ 1684 switch (x) 1685 { case 1 : print("Number is one.");} 1686 { case 2 : print("Number is two.");} 1687 { case 3 : print("Number is three.");} 1688 { case 4 : print("Number is four.");} 1689 { case 5 : print("Number is five.");} 1690 { print ("Number is greater than five.");} 1691#v- 1692 The \var{case} keyword is a more useful comparison operator because 1693 it can perform a comparison between different data types while 1694 using \var{==} may result in a type-mismatch error. For example, 1695#v+ 1696 switch (x) 1697 { (x == 1) or (x == "one") : print("Number is one.");} 1698 { (x == 2) or (x == "two") : print("Number is two.");} 1699 { (x == 3) or (x == "three") : print("Number is three.");} 1700 { (x == 4) or (x == "four") : print("Number is four.");} 1701 { (x == 5) or (x == "five") : print("Number is five.");} 1702 { print ("Number is greater than five.");} 1703#v- 1704 will fail because the \var{==} operation is not defined between 1705 strings and integers. The correct way to write this is to use the 1706 \var{case} keyword: 1707#v+ 1708 switch (x) 1709 { case 1 or case "one" : print("Number is one.");} 1710 { case 2 or case "two" : print("Number is two.");} 1711 { case 3 or case "three" : print("Number is three.");} 1712 { case 4 or case "four" : print("Number is four.");} 1713 { case 5 or case "five" : print("Number is five.");} 1714 { print ("Number is greater than five.");} 1715#v- 1716 1717#%}}} 1718 1719\sect3{Looping Forms} #%{{{ 1720 1721\sect4{while} 1722 The \kw{while} statement follows the syntax 1723\begin{tscreen} 1724 while (\em{integer-expression}) \em{statement-or-block} 1725 \em{next-statement} 1726\end{tscreen} 1727 It simply causes \em{statement-or-block} to get executed as long as 1728 \em{integer-expression} evaluates to a non-zero result.
For 1729 example, 1730#v+ 1731 i = 10; 1732 while (i) 1733 { 1734 i--; 1735 newline (); 1736 } 1737#v- 1738 will cause the \var{newline} function to get called 10 times. 1739 However, 1740#v+ 1741 i = -10; 1742 while (i) 1743 { 1744 i--; 1745 newline (); 1746 } 1747#v- 1748 would loop forever (or until \var{i} wraps from the most negative 1749 integer value to the most positive and then decrements to zero). 1750 1751 1752 If you are a C programmer, do not let the syntax of the language 1753 seduce you into writing this example as you would in C: 1754#v+ 1755 i = 10; 1756 while (i--) newline (); 1757#v- 1758 The fact is that expressions such as \var{i--} do not return a 1759 value in \slang as they do in C. If you must write this way, use 1760 the comma operator as in 1761#v+ 1762 i = 10; 1763 while (i, i--) newline (); 1764#v- 1765 1766\sect4{do...while} 1767 The \kw{do...while} statement follows the syntax 1768\begin{tscreen} 1769 do 1770 \em{statement-or-block} 1771 while (\em{integer-expression}); 1772\end{tscreen} 1773 The main difference between this statement and the \var{while} 1774 statement is that the \kw{do...while} form performs the test 1775 involving \em{integer-expression} after each execution 1776 of \em{statement-or-block} rather than before. This guarantees that 1777 \em{statement-or-block} will get executed at least once. 1778 1779 A simple example from the \jed editor follows: 1780#v+ 1781 bob (); % Move to beginning of buffer 1782 do 1783 { 1784 indent_line (); 1785 } 1786 while (down (1)); 1787#v- 1788 This will cause all lines in the buffer to get indented via the 1789 \jed intrinsic function \var{indent_line}. 1790 1791\sect4{for} 1792 Perhaps the most complex looping statement is the \kw{for} 1793 statement; nevertheless, it is a favorite of many programmers. 
   This statement obeys the syntax
\begin{tscreen}
     for (\em{init-expression}; \em{integer-expression}; \em{end-expression})
       \em{statement-or-block}
     \em{next-statement}
\end{tscreen}
   In addition to \em{statement-or-block}, its specification requires
   three other expressions.  When executed, the \kw{for} statement
   evaluates \em{init-expression}, then it tests
   \em{integer-expression}.  If \em{integer-expression} returns zero,
   control passes to \em{next-statement}.  Otherwise, it executes
   \em{statement-or-block} as long as \em{integer-expression}
   evaluates to a non-zero result.  After every execution of
   \em{statement-or-block}, \em{end-expression} will get evaluated.

   This statement is \em{almost} equivalent to
\begin{tscreen}
     \em{init-expression};
     while (\em{integer-expression})
       {
          \em{statement-or-block}
          \em{end-expression};
       }
\end{tscreen}
   The reason that they are not fully equivalent involves what happens
   when \em{statement-or-block} contains a \kw{continue} statement.

   Despite the apparent complexity of the \kw{for} statement, it is
   very easy to use.  As an example, consider
#v+
     sum = 0;
     for (i = 1; i <= 10; i++) sum += i;
#v-
   which computes the sum of the first 10 integers.

\sect4{loop}
   The \kw{loop} statement simply executes a block of code a fixed
   number of times.  It follows the syntax
\begin{tscreen}
     loop (\em{integer-expression}) \em{statement-or-block}
     \em{next-statement}
\end{tscreen}
   If the \em{integer-expression} evaluates to a positive integer,
   \em{statement-or-block} will get executed that many times.
   Otherwise, control will pass to \em{next-statement}.

   For example,
#v+
     loop (10) newline ();
#v-
   will cause the function \var{newline} to get called 10 times.
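
   As a further illustration, the count given to \kw{loop} may be any
   integer expression, not just a literal.  The following is only a
   minimal sketch; the variable names \var{n} and \var{p} are chosen
   here purely for illustration.  It doubles \var{p} three times and
   then shows that a count of zero skips the body entirely:
#v+
     variable n = 3, p = 1;
     loop (n) p = 2 * p;      % body runs 3 times: p becomes 2, 4, 8
     loop (0) p = 0;          % count is not positive: body is skipped
#v-
   After both statements, \var{p} should still have the value \exmp{8}.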

\sect4{forever}
   The \kw{forever} statement is similar to the \kw{loop} statement
   except that it loops forever, or until a \kw{break} or a
   \kw{return} statement is executed.  It obeys the syntax
\begin{tscreen}
     forever \em{statement-or-block}
\end{tscreen}
   A trivial example of this statement is
#v+
     n = 10;
     forever
       {
          if (n == 0) break;
          newline ();
          n--;
       }
#v-

\sect4{foreach}
   The \kw{foreach} statement is used to loop over one or more
   statements for every element in a container object.  A container
   object is a data type that consists of other types.  Examples
   include both ordinary and associative arrays, structures, and
   strings.  Every time through the loop the current member of the
   object is pushed onto the stack.

   The simple type of \kw{foreach} statement obeys the syntax
\begin{tscreen}
     foreach (\em{container-object}) \em{statement-or-block}
\end{tscreen}
   Here \em{container-object} can be an expression that returns a
   container object.  A simple example is
#v+
     foreach (["apple", "peach", "pear"])
       {
          fruit = ();
          process_fruit (fruit);
       }
#v-
   This example shows that if the container object is an array, then
   successive elements of the array are pushed onto the stack prior to
   each execution cycle.  If the container object is a string, then
   successive characters of the string are pushed onto the stack.

   What actually gets pushed onto the stack may be controlled via the
   \kw{using} form of the \kw{foreach} statement.  This more complex
   type of \kw{foreach} statement follows the syntax
\begin{tscreen}
     foreach ( \em{container-object} ) using ( \em{control-list} )
       \em{statement-or-block}
\end{tscreen}
   The allowed values of \em{control-list} will depend upon the type
   of container object.
   For associative arrays (\var{Assoc_Type}),
   \em{control-list} specifies whether \em{keys}, \em{values}, or both
   are pushed onto the stack.  For example,
#v+
     foreach (a) using ("keys")
       {
          k = ();
          .
          .
       }
#v-
   results in the keys of the associative array \var{a} being pushed
   onto the stack.  However,
#v+
     foreach (a) using ("values")
       {
          v = ();
          .
          .
       }
#v-
   will cause the values to be used, and
#v+
     foreach (a) using ("keys", "values")
       {
          (k,v) = ();
          .
          .
       }
#v-
   will use both the keys and values of the array.

   Similarly, for linked-lists of structures, one may walk the list via
   code like
#v+
     foreach (linked_list) using ("next")
       {
          s = ();
          .
          .
       }
#v-
   This \kw{foreach} statement is equivalent to
#v+
     s = linked_list;
     while (s != NULL)
       {
          .
          .
          s = s.next;
       }
#v-
   Consult the type-specific documentation for a discussion of the
   \kw{using} control words, if any, appropriate for a given type.

\sect2{break, return, continue}

   \slang also includes the non-local transfer functions \var{return}, \var{break},
   and \var{continue}.  The \var{return} statement causes control to return to the
   calling function while the \var{break} and \var{continue} statements are used in
   the context of loop structures.  Consider:
#v+
     define fun ()
     {
        forever
          {
             s1;
             s2;
             ..
             if (condition_1) break;
             if (condition_2) return;
             if (condition_3) continue;
             ..
             s3;
          }
        s4;
        ..
     }
#v-
   Here, a function \var{fun} has been defined that contains a \var{forever}
   loop consisting of statements \var{s1}, \var{s2},\ldots,\var{s3}, and
   three \var{if} statements.
   As long as the expressions \var{condition_1},
   \var{condition_2}, and \var{condition_3} evaluate to zero, the statements
   \var{s1}, \var{s2},\ldots,\var{s3} will be repeatedly executed.  However,
   if \var{condition_1} returns a non-zero value, the \var{break} statement
   will get executed, and control will pass out of the \var{forever} loop to
   the statement immediately following the loop, which in this case is
   \var{s4}.  Similarly, if \var{condition_2} returns a non-zero number,
   the \var{return} statement will cause control to pass back to the
   caller of \var{fun}.  Finally, the \var{continue} statement will
   cause control to pass back to the start of the loop, skipping the
   statement \var{s3} altogether.


#%}}}

#%}}}

#%}}}

\sect1{Functions} #%{{{

   A function may be thought of as a group of statements that work
   together to perform a computation.  While there are no imposed
   limits upon the number of statements that may occur within a function,
   it is considered poor programming practice if a function contains
   many statements.  This notion stems from the belief that a function
   should have a simple, well defined purpose.

\sect2{Declaring Functions} #%{{{

   Like variables, functions must be declared before they can be used.  The
   \kw{define} keyword is used for this purpose.  For example,
#v+
     define factorial ();
#v-
   is sufficient to declare a function named \var{factorial}.  Unlike
   the \var{variable} keyword used for declaring variables, the
   \var{define} keyword does not accept a list of names.

   Usually, the above form is used only for recursive functions.
   In most cases, the function name is followed by a
   parameter list and the body of the function:
\begin{tscreen}
     define \em{function-name} (\em{parameter-list})
     {
        \em{statement-list}
     }
\end{tscreen}
   The \em{function-name} is an identifier and must conform to the
   naming scheme for identifiers discussed in chapter ???.
   The \em{parameter-list} is a comma-separated list of variable names
   that represent parameters passed to the function, and
   may be empty if no parameters are to be passed.
   The body of the function is enclosed in braces and consists of zero
   or more statements (\em{statement-list}).

   The variables in the \em{parameter-list} are implicitly declared;
   thus, there is no need to declare them via a variable declaration
   statement.  In fact, any attempt to do so will result in a syntax
   error.

#%}}}

\sect2{Parameter Passing Mechanism} #%{{{

   Parameters to a function are always passed by value and never by
   reference.  To see what this means, consider
#v+
     define add_10 (a)
     {
        a = a + 10;
     }
     variable b = 0;
     add_10 (b);
#v-
   Here a function \var{add_10} has been defined, which when executed,
   adds \exmp{10} to its parameter.  A variable \var{b} has also been
   declared and initialized to zero before it is passed to
   \var{add_10}.  What will be the value of \var{b} after the call to
   \var{add_10}?  If \slang were a language that passed parameters by
   reference, the value of \var{b} would be changed to
   \var{10}.  However, \slang always passes by value, which means that
   \var{b} would retain its value of zero after the function call.

   \slang does provide a mechanism for simulating pass by reference
   via the reference operator.  See the next section for more details.
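
   As a minimal sketch of an alternative that does not involve
   references at all, the function can return the new value and the
   caller can assign it back.  The name \var{add_10_value} is
   hypothetical and is used here only to keep this variant distinct
   from the \var{add_10} function above:
#v+
     define add_10_value (a)
     {
        return a + 10;          % the caller's variable is not touched
     }
     variable b = 0;
     b = add_10_value (b);      % b is now 10
#v-
   Because the caller performs the assignment explicitly, pass by value
   is not an obstacle in this form.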

   If a function is called with a parameter in the parameter list
   omitted, the corresponding variable in the function will be set to
   \var{NULL}.  To make this clear, consider the function
#v+
     define add_two_numbers (a, b)
     {
        if (a == NULL) a = 0;
        if (b == NULL) b = 0;
        return a + b;
     }
#v-
   This function must be called with two parameters.  However, we can
   omit one or both of the parameters by calling it in one of the
   following ways:
#v+
     variable s = add_two_numbers (2,3);
     variable s = add_two_numbers (2,);
     variable s = add_two_numbers (,3);
     variable s = add_two_numbers (,);
#v-
   The first example calls the function using both parameters;
   however, at least one of the parameters was omitted in the other
   examples.  The interpreter will implicitly convert the last three
   examples to
#v+
     variable s = add_two_numbers (2, NULL);
     variable s = add_two_numbers (NULL, 3);
     variable s = add_two_numbers (NULL, NULL);
#v-
   It is important to note that this mechanism is available only for
   function calls that specify more than one parameter.  That is,
#v+
     variable s = add_10 ();
#v-
   is \em{not} equivalent to \exmp{add_10(NULL)}.  The reason for this
   is simple: the parser can only tell whether or not \var{NULL} should
   be substituted by looking at the position of the comma character in
   the parameter list, and only function calls that indicate more than
   one parameter will use a comma.  A mechanism for handling single
   parameter function calls is described later in this chapter.

#%}}}

\sect2{Referencing Variables} #%{{{

   One can achieve the effect of passing by reference by using the
   reference (\var{&}) and dereference (\var{@}) operators.  Consider
   again the \var{add_10} function presented in the previous section.
   This time we write it as
#v+
     define add_10 (a)
     {
        @a = @a + 10;
     }
     variable b = 0;
     add_10 (&b);
#v-
   The expression \var{&b} creates a \em{reference} to the variable
   \var{b} and it is the reference that gets passed to \var{add_10}.
   When the function \var{add_10} is called, the value of \var{a} will
   be a reference to \var{b}.  It is only by \em{dereferencing} this
   value that \var{b} can be accessed and changed.  So, the statement
   \exmp{@a=@a+10;} should be read `add \exmp{10} to the value of the
   object that \var{a} references and assign the result to the object
   that \var{a} references'.

   The reader familiar with C will note the similarity between
   \em{references} in \slang and \em{pointers} in C.

   One of the main purposes for references is that this mechanism
   allows references to functions to be passed to other functions.  As
   a simple example from elementary calculus, consider the following
   function which returns an approximation to the derivative of another
   function at a specified point:
#v+
     define derivative (f, x)
     {
        variable h = 1e-6;
        return ((@f)(x+h) - (@f)(x)) / h;
     }
#v-
   It can be used to differentiate the function
#v+
     define x_squared (x)
     {
        return x^2;
     }
#v-
   at the point \exmp{x = 3} via the expression
   \exmp{derivative(&x_squared,3)}.


#%}}}

\sect2{Functions with a Variable Number of Arguments} #%{{{

   \slang functions may be defined to take a variable number of
   arguments.  The reason for this is that the calling routine pushes
   the arguments onto the stack before making a function call, and it
   is up to the called function to pop the values off the stack and
   make assignments to the variables in the parameter list.  These
   details are, for the most part, hidden from the programmer.
   However, they are important when a variable number of arguments are
   passed.

   Consider the \var{add_10} example presented earlier.  This time it
   is written
#v+
     define add_10 ()
     {
        variable x;
        x = ();
        return x + 10;
     }
     variable s = add_10 (12);  % ==> s = 22;
#v-
   For the uninitiated, this example looks as if it
   is destined for disaster.  The \var{add_10} function looks like it
   accepts zero arguments, yet it was called with a single argument.
   On top of that, the assignment to \var{x} looks strange.  The truth
   is, the code presented in this example makes perfect sense, once you
   realize what is happening.

   First, consider what happens when \var{add_10} is called with
   the parameter \exmp{12}.  Internally, \exmp{12} is
   pushed onto the stack and then the function is called.  Now,
   consider the function itself.  \var{x} is a variable local to the
   function.  The strange looking assignment `\exmp{x=()}' simply
   takes whatever is on the stack and assigns it to \var{x}.  In
   other words, after this statement, the value of \var{x} will be
   \exmp{12}, since \exmp{12} will be at the top of the stack.

   A generic function of the form
#v+
     define function_name (x, y, ..., z)
     {
        .
        .
     }
#v-
   is internally transformed by the interpreter to
#v+
     define function_name ()
     {
        variable x, y, ..., z;
        z = ();
        .
        .
        y = ();
        x = ();
        .
        .
     }
#v-
   before further parsing.  (The \var{add_10} function, as defined above, is
   already in this form.)  With this knowledge in hand, one can write a
   function that accepts a variable number of arguments.
   Consider the function:
#v+
     define average_n (n)
     {
        variable x, y;
        variable sum;

        if (n == 1)
          {
             x = ();
             sum = x;
          }
        else if (n == 2)
          {
             y = ();
             x = ();
             sum = x + y;
          }
        else error ("average_n: only one or two values supported");

        return sum / n;
     }
     variable ave1 = average_n (3.0, 1);        % ==> 3.0
     variable ave2 = average_n (3.0, 5.0, 2);   % ==> 4.0
#v-
   Here, the last argument passed to \var{average_n} is an integer
   reflecting the number of quantities to be averaged.  Although this
   example works fine, its principal limitation is obvious: it only
   supports one or two values.  Extending it to three or more values
   by adding more \exmp{else if} constructs is rather straightforward but
   hardly worth the effort.  There must be a better way, and there is:
#v+
     define average_n (n)
     {
        variable sum, x;
        sum = 0;
        loop (n)
          {
             x = ();    % get next value from stack
             sum += x;
          }
        return sum / n;
     }
#v-
   The principal limitation of this approach is that one must still
   pass an integer that specifies how many values are to be averaged.

   Fortunately, a special variable exists that is local to every function
   and contains the number of values that were passed to the function.
   That variable has the name \var{_NARGS} and may be used as follows:
#v+
     define average_n ()
     {
        variable x, sum = 0;

        if (_NARGS == 0) error ("Usage: ave = average_n (x, ...);");

        loop (_NARGS)
          {
             x = ();
             sum += x;
          }
        return sum / _NARGS;
     }
#v-
   Here, if no arguments are passed to the function, a simple message
   that indicates how it is to be used is printed out.
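
   The same pattern should carry over to any computation that treats
   all of its arguments alike.  The following is only a sketch under
   that assumption; the function name \var{max_of} is hypothetical and
   does not come from the \slang library.  It pops one value to seed
   the result and then pops the remaining \exmp{_NARGS - 1} values:
#v+
     define max_of ()
     {
        variable x, m;

        if (_NARGS == 0) error ("Usage: m = max_of (x, ...);");

        m = ();                 % last argument passed is popped first
        loop (_NARGS - 1)
          {
             x = ();            % pop the next value from the stack
             if (x > m) m = x;
          }
        return m;
     }
     variable biggest = max_of (3, 9, 4);   % ==> 9
#v-
   Note that the values come off the stack in the reverse of the order
   in which they were passed.  That does not matter for a maximum or
   an average, but it would for an order-dependent computation.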


#%}}}


\sect2{Returning Values}

   As stated earlier, the usual way to return values from a function
   is via the \kw{return} statement.  This statement has the
   simple syntax
\begin{tscreen}
     return \em{expression-list} ;
\end{tscreen}
   where \em{expression-list} is a comma separated list of expressions.
   If the function does not return any values, the expression list
   will be empty.  As an example of a function that can return
   multiple values, consider
#v+
     define sum_and_diff (x, y)
     {
        variable sum, diff;

        sum = x + y;  diff = x - y;
        return sum, diff;
     }
#v-
   which is a function returning two values.

   It is extremely important to note that \em{the calling routine must
   explicitly handle all values returned by a function}.  Although
   some languages such as C do not have this restriction, \slang does
   and it is a direct result of a \slang function's ability to return
   many values and accept a variable number of parameters.  Examples
   of properly handling the above function include
#v+
     variable sum, diff;
     (sum, diff) = sum_and_diff (5, 4);  % ignore neither
     (sum, ) = sum_and_diff (5, 4);      % ignore diff
     (,) = sum_and_diff (5, 4);          % ignore both sum and diff
#v-
   See the section below on assignment statements for more information
   about this important point.

\sect2{Multiple Assignment Statement} #%{{{

   \slang functions can return more than one value, e.g.,
#v+
     define sum_and_diff (x, y)
     {
        return x + y, x - y;
     }
#v-
   returns two values.  It accomplishes this by placing both values on
   the stack before returning.  If you understand how \slang functions
   handle a variable number of parameters (section ???), then it
   should be rather obvious how one assigns such values to variables.
   One way is to use, e.g.,
#v+
     sum_and_diff (9, 4);
     d = ();
     s = ();
#v-

   However, the most convenient way to accomplish this is to use a
   \em{multiple assignment statement} such as
#v+
     (s, d) = sum_and_diff (9, 4);
#v-
   The most general form of the multiple assignment statement is
#v+
     ( var_1, var_2, ..., var_n ) = expression;
#v-
   In fact, internally the interpreter transforms this statement into
   the form
#v+
     expression; var_n = (); ... var_2 = (); var_1 = ();
#v-
   for further processing.

   If you do not care about one of the return values, simply omit the
   variable name from the list.  For example,
#v+
     (s, ) = sum_and_diff (9, 4);
#v-
   assigns the sum of \exmp{9} and \exmp{4} to \var{s} and the
   difference (\exmp{9-4}) will be removed from the stack.

   As another example, the \jed editor provides a function called
   \var{down} that takes an integer argument and returns an integer.
   It is used to move the current editing position down the number of
   lines specified by the argument passed to it.  It returns the number
   of lines it successfully moved the editing position.  Often one does
   not care about the return value from this function.  Although it is
   always possible to handle the return value via
#v+
     variable dummy = down (10);
#v-
   it is more convenient to use a multiple assignment expression and
   omit the variable name, e.g.,
#v+
     () = down (10);
#v-

   Some functions return a \em{variable number} of values instead of a
   \em{fixed number}.  Usually, the value at the top of the stack will
   indicate the actual number of return values.  For such functions,
   the multiple assignment statement cannot directly be used.
   To see how such functions can be dealt with, consider the following
   function:
#v+
     define read_line (fp)
     {
        variable line;
        if (-1 == fgets (&line, fp))
          return -1;
        return (line, 0);
     }
#v-
   This function returns either one or two values, depending upon the
   return value of \var{fgets}.  Such a function may be handled as in
   the following example:
#v+
     status = read_line (fp);
     if (status != -1)
       {
          s = ();
          .
          .
       }
#v-
   In this example, the \em{last} value returned by \var{read_line} is
   assigned to \var{status} and then tested.  If it is not \exmp{-1}, the
   other return value is assigned to \var{s}.  In particular, note the
   empty set of parentheses in the assignment to \var{s}.  This simply
   indicates that whatever is on the top of the stack when the
   statement is executed will be assigned to \var{s}.

   Before leaving this section it is important to reiterate the fact
   that if a function returns a value, the caller must deal with that
   return value.  Otherwise, the value will continue to live on the
   stack and may eventually lead to a stack overflow error.
   Failing to handle the return value of a function is the
   most common mistake that inexperienced \slang programmers make.
   For example, the \var{fflush} function returns a value that many C
   programmers never check.  Instead of writing
#v+
     fflush (fp);
#v-
   as one could in C, a \slang programmer should write
#v+
     () = fflush (fp);
#v-
   in \slang.  (Many good C programmers write \exmp{(void)fflush(fp)}
   to indicate that the return value is being ignored).

#%}}}

\sect2{Exit-Blocks}

   An \em{exit-block} is a set of statements that get executed when a
   function returns.
   They are very useful for cleaning up when a
   function returns via an explicit call to \var{return} from deep
   within a function.

   An exit-block is created by using the \kw{EXIT_BLOCK} keyword
   according to the syntax
\begin{tscreen}
     EXIT_BLOCK { \em{statement-list} }
\end{tscreen}
   where \em{statement-list} represents the list of statements that
   comprise the exit-block.  The following example illustrates the use
   of an exit-block:
#v+
     define simple_demo ()
     {
        variable n = 0;

        EXIT_BLOCK { message ("Exit block called."); }

        forever
          {
             if (n == 10) return;
             n++;
          }
     }
#v-
   Here, the function contains an exit-block and a \var{forever} loop.
   The loop will terminate via the \kw{return} statement when \var{n}
   is 10.  Before it returns, the exit-block will get executed.

   A function can contain multiple exit-blocks, but only the last
   one encountered during execution will actually get executed.  For
   example,
#v+
     define simple_demo (n)
     {
        EXIT_BLOCK { return 1; }

        if (n != 1)
          {
             EXIT_BLOCK { return 2; }
          }
        return;
     }
#v-
   If \var{1} is passed to this function, the first exit-block will
   get executed because the second one would not have been encountered
   during the execution.  However, if some other value is passed, the
   second exit-block would get executed.  This example also
   illustrates that it is possible to explicitly return from an
   exit-block, although nested exit-blocks are illegal.

#%}}}

\sect1{Name Spaces} #%{{{

   By default, all global variables and functions are defined in the
   global namespace.  In addition to the global namespace, every
   compilation unit (e.g., a file containing \slang code) has an
   anonymous namespace.  Objects may be defined in the anonymous
   namespace via the \var{static} declaration keyword.
   For example,
#v+
     static variable x;
     static define hello () { message ("hello"); }
#v-
   defines a variable \var{x} and a function \var{hello} in the
   anonymous namespace.  This is useful when one wants to define
   functions and variables that are only to be used within the file, or
   more precisely the compilation unit, that defines them.

   The \var{implements} function may be used to give the anonymous
   namespace a name to allow access to its objects from outside the
   compilation unit that defines them.  For example,
#v+
     implements ("foo");
     static variable x;
#v-
   allows the variable \var{x} to be accessed via \var{foo->x}, e.g.,
#v+
     if (foo->x == 1) foo->x = 2;
#v-

   The \var{implements} function does more than simply give the
   anonymous namespace a name.  It also changes the default variable
   and function declaration mode from \var{public} to \var{static}.
   That is,
#v+
     implements ("foo");
     variable x;
#v-
   and
#v+
     implements ("foo");
     static variable x;
#v-
   are equivalent.  Then to create a public object within the
   namespace, one must explicitly use the \var{public} keyword.

   Finally, the \var{private} keyword may be used to create an object
   that is truly private within the compilation unit.  For example,
#v+
     implements ("foo");
     variable x;
     private variable y;
#v-
   allows \var{x} to be accessed from outside the namespace via
   \var{foo->x}; however, \var{y} cannot be accessed.

#%}}}

\sect1{Arrays} #%{{{

   An array is a container object that can contain many values of one
   data type.  Arrays are very useful objects and are indispensable
   for certain types of programming.  The purpose of this chapter is
   to describe how arrays are defined and used in the \slang language.

\sect2{Creating Arrays} #%{{{

   The \slang language supports multi-dimensional arrays of all data
   types.  Since the \var{Array_Type} is a data type, one can even
   have arrays of arrays.  To create a multi-dimensional array of
   \em{SomeType} use the syntax
#v+
     SomeType [dim0, dim1, ..., dimN]
#v-
   Here \em{dim0}, \em{dim1}, ... \em{dimN} specify the size of
   the individual dimensions of the array.  The current implementation
   permits arrays to consist of up to \var{7} dimensions.  When a
   numeric array is created, all its elements are initialized to zero.
   The initialization of other array types depends upon the data type,
   e.g., \var{String_Type} and \var{Struct_Type} arrays are
   initialized to \var{NULL}.

   As a concrete example, consider
#v+
     a = Integer_Type [10];
#v-
   which creates a one-dimensional array of \exmp{10} integers and
   assigns it to \var{a}.
   Similarly,
#v+
     b = Double_Type [10, 3];
#v-
   creates a \var{30} element array of double precision numbers
   arranged in \var{10} rows and \var{3} columns, and assigns it to
   \var{b}.

\sect3{Range Arrays}

   There is a more convenient syntax for creating and initializing
   1-d arrays.  For example, to create an array of ten
   integers whose elements run from \exmp{1} through \exmp{10}, one
   may simply use:
#v+
     a = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
#v-
   Similarly,
#v+
     b = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0];
#v-
   specifies an array of ten doubles.

   An even more compact way of specifying a numeric array is to use a
   \em{range-array}.  For example,
#v+
     a = [0:9];
#v-
   specifies an array of 10 integers whose elements range from \var{0}
   through \var{9}.
   The most general form of a range array is
#v+
     [first-value : last-value : increment]
#v-
   where the \em{increment} is optional and defaults to \exmp{1}.  This
   creates an array whose first element is \em{first-value} and whose
   successive values differ by \em{increment}.  \em{last-value} sets
   an upper limit upon the last value of the array as described below.

   If the range array \var{[a:b:c]} is integer valued, then the
   interval specified by \var{a} and \var{b} is closed.  That is, the
   kth element of the array \math{x_k} is given by \math{x_k=a+ck} and
   must satisfy \math{a<=x_k<=b}.  Hence, the number of elements in an
   integer range array is given by the expression \math{1 + (b-a)/c}.

   The situation is somewhat more complicated for floating point range
   arrays.  The interval specified by a floating point range array
   \var{[a:b:c]} is semi-open such that \var{b} is not contained in
   the interval.  In particular, the kth element of \var{[a:b:c]} is
   given by \math{x_k=a+kc} such that \math{a<=x_k<b} when
   \math{c>=0}, and \math{b<x_k<=a} otherwise.  The number of elements
   in the array is one greater than the largest \math{k} that
   satisfies the semi-open interval constraint.

   Here are a few examples that illustrate the above comments:
#v+
     [1:5:1]         ==> [1,2,3,4,5]
     [1.0:5.0:1.0]   ==> [1.0, 2.0, 3.0, 4.0]
     [5:1:-1]        ==> [5,4,3,2,1]
     [5.0:1.0:-1.0]  ==> [5.0, 4.0, 3.0, 2.0]
     [1:1]           ==> [1]
     [1.0:1.0]       ==> []
     [1:-3]          ==> []
#v-

\sect3{Creating arrays via the dereference operator}

   Another way to create an array is to apply the dereference operator
   \var{@} to the \var{DataType_Type} literal \var{Array_Type}.
   The actual syntax for this operation resembles a function call
\begin{tscreen}
     variable a = @Array_Type (\em{data-type}, \em{integer-array});
\end{tscreen}
   where \em{data-type} is of type \var{DataType_Type} and
   \em{integer-array} is a 1-d array of integers that specify the size
   of each dimension.  For example,
#v+
     variable a = @Array_Type (Double_Type, [10, 20]);
#v-
   will create a \exmp{10} by \var{20} array of doubles and assign it
   to \var{a}.  This method of creating arrays derives its power from
   the fact that it is more flexible than the other methods discussed
   in this section.  We shall encounter it again in section ??? in the
   context of the \var{array_info} function.

#%}}}

\sect2{Reshaping Arrays} #%{{{
   It is sometimes possible to change the `shape' of an array using
   the \var{reshape} function.  For example, a 1-d 10 element array
   may be reshaped into a 2-d array consisting of 5 rows and 2
   columns.  The only restriction on the operation is that the arrays
   must be commensurate, that is, the old and new shapes must contain
   the same number of elements.  The \var{reshape} function follows the
   syntax
\begin{tscreen}
     reshape (\em{array-name}, \em{integer-array});
\end{tscreen}
   where \em{array-name} specifies the array to be reshaped to have
   the dimensions given by \em{integer-array}, a 1-dimensional array of
   integers.  It is important to note that this does \em{not} create a
   new array; it simply reshapes the existing array.  Thus,
#v+
     variable a = Double_Type [100];
     reshape (a, [10, 10]);
#v-
   turns \var{a} into a \exmp{10} by \exmp{10} array.

#%}}}

\sect2{Indexing Arrays} #%{{{
   An individual element of an array may be referred to by its
   \em{index}.
   For example, \exmp{a[0]} specifies the zeroth element
   of the one dimensional array \var{a}, and \exmp{b[3,2]} specifies
   the element in the third row and second column of the two
   dimensional array \var{b}.  As in C, array indices are numbered from
   \var{0}.  Thus if \var{a} is a one-dimensional array of ten
   integers, the last element of the array is given by \var{a[9]}.
   Using \var{a[10]} would result in a range error.

   A negative index may be used to index from the end of the array,
   with \exmp{a[-1]} referring to the last element of \var{a},
   \exmp{a[-2]} referring to the next to the last element, and so on.

   One may use the indexed value like any other variable.  For
   example, to set the third element of an integer array to \var{6}, use
#v+
     a[2] = 6;
#v-
   Similarly, that element may be used in an expression, such as
#v+
     y = a[2] + 7;
#v-
   Unlike other \slang variables which inherit a type upon assignment,
   array elements already have a type.  For example, an attempt to
   assign a string value to an element of an integer array will result
   in a type-mismatch error.

   One may use any integer expression to index an array.  A simple
   example that computes the sum of the elements of a 10 element 1-d
   array is
#v+
     variable i, sum;
     sum = 0;
     for (i = 0; i < 10; i++) sum += a[i];
#v-

   Unlike many other languages, \slang permits arrays to be indexed by
   other integer arrays.  Suppose that \var{a} is a 1-d array of 10
   doubles.  Now consider:
#v+
     i = [6:8];
     b = a[i];
#v-
   Here, \var{i} is a 1-dimensional range array of three integers with
   \exmp{i[0]} equal to \exmp{6}, \exmp{i[1]} equal to \exmp{7},
   and \exmp{i[2]} equal to \exmp{8}.  The statement \var{b = a[i];}
   will create a 1-d array of three doubles and assign it to \var{b}.
   The zeroth element of \var{b}, \exmp{b[0]}, will be set to
   \exmp{a[6]}, and so on.  In fact, these two simple
   statements are equivalent to
#v+
      b = Double_Type [3];
      b[0] = a[6];
      b[1] = a[7];
      b[2] = a[8];
#v-
   except that using an array of indices is not only much more
   convenient, but also executes much faster.

   More generally, one may use an index array to specify which
   elements are to participate in a calculation.  For example, consider
#v+
      a = Double_Type [1000];
      i = [0:499];
      j = [500:999];
      a[i] = -1.0;
      a[j] = 1.0;
#v-
   This creates an array of \exmp{1000} doubles and sets the first
   \exmp{500} elements to \exmp{-1.0} and the last \exmp{500} to
   \var{1.0}.  Actually, one may do away with the \var{i} and \var{j}
   variables altogether and use
#v+
      a = Double_Type [1000];
      a [[0:499]] = -1.0;
      a [[500:999]] = 1.0;
#v-
   It is important to understand the syntax used and, in particular,
   to note that \exmp{a[[0:499]]} is \em{not} the same as
   \exmp{a[0:499]}.  In fact, the latter will generate a syntax error.

   Often, it is convenient to use a \em{rubber} range to specify
   indices.  For example, \exmp{a[[500:]]} specifies all elements of
   \var{a} whose index is greater than or equal to \var{500}.  Similarly,
   \exmp{a[[:499]]} specifies the first 500 elements of \var{a}.
   Finally, \exmp{a[[:]]} specifies all the elements of \var{a};
   however, using \exmp{a[*]} is more convenient.

   One should be careful when using index arrays with negative
   elements.  As pointed out above, a negative index is used to index
   from the end of the array.  That is, \exmp{a[-1]} refers to the
   last element of \exmp{a}.  How should \exmp{a[[0:-1]]} be
   interpreted?  By itself, \var{[0:-1]} is an empty array; hence, one
   might expect \exmp{a[[0:-1]]} to refer to no elements.
However,
   when used in an array indexing context, \exmp{[0:-1]} is
   interpreted as an array indexing the first through the last
   elements of the array.  While this makes it very convenient to
   specify, e.g., the last 3 elements of an array using
   \exmp{a[[-3:-1]]}, it is very easy to forget these semantics.

   Now consider a multi-dimensional array.  For simplicity, suppose
   that \var{a} is a \exmp{100} by \exmp{100} array of doubles.  Then
   the expression \var{a[0, *]} specifies all elements in the zeroth
   row.  Similarly, \var{a[*, 7]} specifies all elements in the
   seventh column.  Finally, \var{a[[3:5],[6:12]]} specifies the
   \exmp{3} by \exmp{7} region consisting of rows \exmp{3}, \exmp{4},
   and \exmp{5}, and columns \exmp{6} through \exmp{12} of \var{a}.

   We conclude this section with a few examples.

   Here is a function that computes the trace (sum of the diagonal
   elements) of a square 2 dimensional \var{n} by \var{n} array:
#v+
      define array_trace (a, n)
      {
         variable sum = 0, i;
         for (i = 0; i < n; i++) sum += a[i, i];
         return sum;
      }
#v-
   This fragment creates a \exmp{10} by \exmp{10} integer array, sets
   its diagonal elements to \exmp{5}, and then computes the trace of
   the array:
#v+
      a = Integer_Type [10, 10];
      for (j = 0; j < 10; j++) a[j, j] = 5;
      the_trace = array_trace(a, 10);
#v-
   We can get rid of the \kw{for} loop as follows:
#v+
      j = Integer_Type [10, 2];
      j[*,0] = [0:9];
      j[*,1] = [0:9];
      a[j] = 5;
#v-
   Here, the goal was to construct a 2-d array of indices that
   correspond to the diagonal elements of \var{a}, and then use that
   array to index \var{a}.  To understand how
   this works, consider the middle statements.
They are equivalent
   to the following \var{for} loops:
#v+
      variable i;
      for (i = 0; i < 10; i++) j[i, 0] = i;
      for (i = 0; i < 10; i++) j[i, 1] = i;
#v-
   Thus, row \var{n} of \var{j} will have the value \exmp{(n,n)},
   which is precisely what was sought.

   Another example of this technique is the function:
#v+
      define unit_matrix (n)
      {
         variable a = Integer_Type [n, n];
         variable j = Integer_Type [n, 2];
         j[*,0] = [0:n - 1];
         j[*,1] = [0:n - 1];

         a[j] = 1;
         return a;
      }
#v-
   This function creates an \var{n} by \var{n} unit matrix,
   that is, a 2-d \var{n} by \var{n} array whose elements are all zero
   except on the diagonal where they have a value of \exmp{1}.


#%}}}

\sect2{Arrays and Variables}

   When an array is created and assigned to a variable, the
   interpreter allocates the proper amount of space for the array,
   initializes it, and then assigns to the variable a \em{reference}
   to the array.  So, a variable that represents an array has a value
   that is really a reference to the array.  This has several
   consequences, some good and some bad.  It is believed that the
   advantages of this representation outweigh the disadvantages.
   First, we shall look at the positive aspects.

   When a variable is passed to a function, it is always the value of
   the variable that gets passed.  Since the value of a variable
   representing an array is a reference, a reference to the array gets
   passed.  One major advantage of this is rather obvious: it is a
   fast and efficient way to pass the array.
This also has another
   consequence that is illustrated by the function
#v+
      define init_array (a, n)
      {
         variable i;

         for (i = 0; i < n; i++) a[i] = some_function (i);
      }
#v-
   where \var{some_function} is a function that generates a scalar
   value to initialize the \em{ith} element.  This function can be
   used in the following way:
#v+
      variable X = Double_Type [100000];
      init_array (X, 100000);
#v-
   Since the array is passed to the function by reference, there is no
   need to make a separate copy of the \var{100000} element array.  As
   pointed out above, this saves both execution time and memory.  The
   other salient feature to note is that any changes made to the
   elements of the array within the function will be manifested in the
   array outside the function.  Of course, in this case, this is a
   desirable side-effect.

   To see the downside of this representation, consider:
#v+
      variable a, b;
      a = Double_Type [10];
      b = a;
      a[0] = 7;
#v-
   What will be the value of \exmp{b[0]}?  Since the value of \var{a}
   is really a reference to the array of ten doubles, and that
   reference was assigned to \var{b}, \var{b} also refers to the same
   array.  Thus any changes made to the elements of \var{a} will also
   be made implicitly to \var{b}.

   This raises the question: if assigning a variable that represents
   an array to another variable merely assigns a reference to the
   array, then how does one make a separate copy of the array?
   There are several answers including using an index
   array, e.g., \exmp{b = a[*]}; however, the most natural method is
   to use the dereference operator:
#v+
      variable a, b;
      a = Double_Type [10];
      b = @a;
      a[0] = 7;
#v-
   In this example, a separate copy of \var{a} will be created and
   assigned to \var{b}.
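   As a quick check of this behavior, the fragment below is a minimal
   sketch that extends the example above; the call to \var{vmessage}
   is added here purely for illustration.  Assuming that the elements
   of a newly created \var{Double_Type} array start out as zero, the
   change made to \exmp{a[0]} should not show up in \exmp{b[0]}:
#v+
      variable a, b;
      a = Double_Type [10];
      b = @a;                % b refers to a separate copy of the array
      a[0] = 7;              % modifies only the array that a refers to
      vmessage ("a[0] = %g, b[0] = %g", a[0], b[0]);
#v-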
It is very important to note that \slang
   never implicitly dereferences an object.  So, one must explicitly use
   the dereference operator.  This means that the elements of a
   dereferenced array are not themselves dereferenced.  For example,
   consider dereferencing an array of arrays, e.g.,
#v+
      variable a, b;
      a = Array_Type [2];
      a[0] = Double_Type [10];
      a[1] = Double_Type [10];
      b = @a;
#v-
   In this example, \exmp{b[0]} will be a reference to the array that
   \exmp{a[0]} references because \exmp{a[0]} was not explicitly
   dereferenced.

\sect2{Using Arrays in Computations} #%{{{

   Many functions and operations work transparently with arrays.
   For example, if \var{a} and \var{b} are arrays, then the sum
   \exmp{a + b} is an array whose elements are formed from the sum of
   the corresponding elements of \var{a} and \var{b}.  A similar
   statement holds for all other binary and unary operations.

   Let's consider a simple example.  Suppose that we wish to solve a
   set of \var{n} quadratic equations whose coefficients are given by
   the 1-d arrays \var{a}, \var{b}, and \var{c}.  In general, the
   solution of a quadratic equation will be two complex numbers.  For
   simplicity, suppose that all we really want is to know what subset of
   the coefficients, \var{a}, \var{b}, \var{c}, correspond to
   real-valued solutions.  In terms of \var{for} loops, we can write:
#v+
      variable i, d, index_array;
      index_array = Integer_Type [n];
      for (i = 0; i < n; i++)
        {
           d = b[i]^2 - 4 * a[i] * c[i];
           index_array [i] = (d >= 0.0);
        }
#v-
   In this example, the array \var{index_array} will contain a
   non-zero value if the corresponding set of coefficients has a
   real-valued solution.
This code may be written much more compactly
   and with more clarity as follows:
#v+
      variable index_array = ((b^2 - 4 * a * c) >= 0.0);
#v-

   \slang has a powerful built-in function called \var{where}.  This
   function takes an array of integers and returns a 2-d array of
   indices that correspond to where the elements of the input array
   are non-zero.  This simple operation is extremely useful.  For
   example, suppose \var{a} is a 1-d array of \var{n} doubles, and it
   is desired to set to zero all elements of the array whose value is
   less than zero.  One way is to use a \var{for} loop:
#v+
      for (i = 0; i < n; i++)
        if (a[i] < 0.0) a[i] = 0.0;
#v-
   If \var{n} is a large number, this statement can take some time to
   execute.  The optimal way to achieve the same result is to use the
   \var{where} function:
#v+
      a[where (a < 0.0)] = 0;
#v-
   Here, the expression \exmp{(a < 0.0)} returns an array whose
   dimensions are the same size as \var{a} but whose elements are
   either \exmp{1} or \exmp{0}, according to whether or not the
   corresponding element of \var{a} is less than zero.  This array of
   zeros and ones is then passed to \var{where} which returns a 2-d
   integer array of indices that indicate where the elements of
   \var{a} are less than zero.  Finally, those elements of \var{a} are
   set to zero.

   As a final example, consider once more the example involving the set of
   \var{n} quadratic equations presented above.  Suppose that we wish
   to get rid of the coefficients of the previous example that
   generated non-real solutions.
Using an explicit \var{for} loop requires
   code such as:
#v+
      variable i, j, nn, tmp_a, tmp_b, tmp_c;

      nn = 0;
      for (i = 0; i < n; i++)
        if (index_array [i]) nn++;

      tmp_a = Double_Type [nn];
      tmp_b = Double_Type [nn];
      tmp_c = Double_Type [nn];

      j = 0;
      for (i = 0; i < n; i++)
        {
           if (index_array [i])
             {
                tmp_a [j] = a[i];
                tmp_b [j] = b[i];
                tmp_c [j] = c[i];
                j++;
             }
        }
      a = tmp_a;
      b = tmp_b;
      c = tmp_c;
#v-
   Not only is this a lot of code, it is also clumsy and error-prone.
   Using the \var{where} function, this task is trivial:
#v+
      variable i;
      i = where (index_array != 0);
      a = a[i];
      b = b[i];
      c = c[i];
#v-

   All the examples up to now assumed that the dimensions of the array
   were known.  Although the intrinsic function \var{length} may be
   used to get the total number of elements of an array, it cannot be
   used to get the individual dimensions of a multi-dimensional array.
   However, the function \var{array_info} may be used to
   get information about an array, such as its data type and size.
   The function returns three values: an integer array containing the
   size of each dimension, the number of dimensions, and the data
   type.
It may be used to determine the number of rows
   of an array as follows:
#v+
      define num_rows (a)
      {
         variable dims, type, num_dims;

         (dims, num_dims, type) = array_info (a);
         return dims[0];
      }
#v-
   The number of columns may be obtained in a similar manner:
#v+
      define num_cols (a)
      {
         variable dims, type, num_dims;

         (dims, num_dims, type) = array_info (a);
         if (num_dims > 1) return dims[1];
         return 1;
      }
#v-

   Another use of \var{array_info} is to create an array that has the
   same dimensions as another array:
#v+
      define make_int_array (a)
      {
         variable dims, num_dims, type;

         (dims, num_dims, type) = array_info (a);
         return @Array_Type (Integer_Type, dims);
      }
#v-

#%}}}

#%}}}

\sect1{Associative Arrays} #%{{{

   An associative array differs from an ordinary array in the sense
   that its size is not fixed and that it is indexed by a string, called
   the \em{key}.  For example, consider:
#v+
      variable A = Assoc_Type [Integer_Type];
      A["alpha"] = 1;
      A["beta"] = 2;
      A["gamma"] = 3;
#v-
   Here, \var{A} represents an associative array of integers
   (\var{Integer_Type}) and three keys have been added to the array.

   As the example suggests, an associative array may be created using
   one of the following forms:
\begin{tscreen}
      Assoc_Type [\em{type}]
      Assoc_Type [\em{type}, \em{default-value}]
      Assoc_Type []
\end{tscreen}
   The last form returns an associative array of \var{Any_Type}
   objects, allowing any type of object to be stored in
   the array.

   The form involving a \em{default-value} is useful for associating a
   default value for non-existent array members.  This feature is
   explained in more detail below.
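
   As a brief preview of the \em{default-value} form, here is a small
   sketch; the key names and the default of \exmp{-1} are arbitrary
   choices made only for illustration.  Reading a key that has never
   been assigned should yield the default value rather than an error:
#v+
      variable A = Assoc_Type [Integer_Type, -1];
      A["alpha"] = 1;
      % "omega" was never assigned, so the default value is used for it
      vmessage ("alpha=%d, omega=%d", A["alpha"], A["omega"]);
#v-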

   There are several functions that are specially designed to work
   with associative arrays.  These include:
\begin{itemize}
\item \var{assoc_get_keys}, which returns an ordinary array of strings
      containing the keys in the array.

\item \var{assoc_get_values}, which returns an ordinary array of the
      values of the associative array.

\item \var{assoc_key_exists}, which can be used to determine whether
      or not a key exists in the array.

\item \var{assoc_delete_key}, which may be used to remove a key (and
      its value) from the array.
\end{itemize}

   To illustrate the use of an associative array, consider the problem
   of counting the number of repeated occurrences of words in a list.
   Let the word list be represented as an array of strings given by
   \var{word_list}.  The number of occurrences of each word may be
   stored in an associative array as follows:
#v+
      variable a, word;
      a = Assoc_Type [Integer_Type];
      foreach (word_list)
        {
           word = ();
           if (0 == assoc_key_exists (a, word))
             a[word] = 0;
           a[word]++;  % same as a[word] = a[word] + 1;
        }
#v-
   Note that \var{assoc_key_exists} was necessary to determine whether
   or not a word was already added to the array in order to properly
   initialize it.  However, by creating the associative array with a
   default value of \exmp{0}, the above code may be simplified to
#v+
      variable a, word;
      a = Assoc_Type [Integer_Type, 0];
      foreach (word_list)
        {
           word = ();
           a[word]++;
        }
#v-


#%}}}

\sect1{Structures and User-Defined Types} #%{{{

   A \em{structure} is a heterogeneous container object, i.e., it is
   an object with elements whose values do not have to be of the same
   data type.  The elements or fields of a structure are named, and
   one accesses a particular field of the structure via the field
   name.
This should be contrasted with an array whose values are of
   the same type, and whose elements are accessed via array indices.

   A \em{user-defined} data type is a structure with a fixed set of
   fields defined by the user.

\sect2{Defining a Structure}

   The \kw{struct} keyword is used to define a structure.  The syntax
   for this operation is:
\begin{tscreen}
      struct {\em{field-name-1}, \em{field-name-2}, ... \em{field-name-N}};
\end{tscreen}
   This creates and returns a structure with \em{N} fields whose names
   are specified by \em{field-name-1}, \em{field-name-2}, ...,
   \em{field-name-N}.  When a structure is created, all its fields are
   initialized to \var{NULL}.

   For example,
#v+
      variable t = struct { city_name, population, next };
#v-
   creates a structure with three fields and assigns it to the
   variable \var{t}.

   Alternatively, a structure may be created by dereferencing
   \var{Struct_Type}.  For example, the above structure may also be
   created using one of the two forms:
#v+
      t = @Struct_Type ("city_name", "population", "next");
      t = @Struct_Type (["city_name", "population", "next"]);
#v-
   These are useful when creating structures dynamically where one does
   not know the names of the fields until run-time.

   Like arrays, structures are passed around via references.  Thus,
   in the above example, the value of \var{t} is a reference to the
   structure.  This means that after execution of
#v+
      variable u = t;
#v-
   \em{both} \var{t} and \var{u} refer to the \em{same} structure,
   since only the reference was used in the assignment.
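   The following fragment is a small sketch of what this sharing
   means in practice; the city name and the call to \var{message} are
   chosen here only for illustration.  A field set through \var{u}
   is visible through \var{t}:
#v+
      variable t = struct { city_name, population, next };
      variable u = t;           % u and t refer to the same structure
      u.city_name = "Boston";   % set the field through u
      message (t.city_name);    % the same field, read through t
#v-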
To actually
   create a new copy of the structure, use the \em{dereference}
   operator, e.g.,
#v+
      variable u = @t;
#v-

\sect2{Accessing the Fields of a Structure}

   The dot (\var{.}) operator is used to specify a particular
   field of a structure.  If \var{s} is a structure and \var{field_name}
   is a field of the structure, then \exmp{s.field_name} specifies
   that field of \var{s}.  This specification can be used in
   expressions just like ordinary variables.  Again, consider
#v+
      variable t = struct { city_name, population, next };
#v-
   described in the last section.  Then,
#v+
      t.city_name = "New York";
      t.population = 13000000;
      if (t.population > 200) t = t.next;
#v-
   are all valid statements involving the fields of \var{t}.

\sect2{Linked Lists}

   One of the most important uses of structures is to create a
   \em{dynamic} data structure such as a \em{linked-list}.  A
   linked-list is simply a chain of structures that are linked together
   such that one structure in the chain is the value of a field of the
   previous structure in the chain.  To be concrete, consider the
   structure discussed earlier:
#v+
      variable t = struct { city_name, population, next };
#v-
   and suppose that we desire to create a list of such structures.
   The purpose of the \var{next} field is to provide the link to the
   next structure in the chain.  Suppose that there exists a function,
   \var{read_next_city}, that reads city names and populations from a
   file.
Then we can create the list via:
#v+
      define create_population_list ()
      {
         variable city_name, population, list_root, list_tail;
         variable next;

         list_root = NULL;
         while (read_next_city (&city_name, &population))
           {
              next = struct {city_name, population, next };

              next.city_name = city_name;
              next.population = population;
              next.next = NULL;

              if (list_root == NULL)
                list_root = next;
              else
                list_tail.next = next;

              list_tail = next;
           }
         return list_root;
      }
#v-
   In this function, the variables \var{list_root} and \var{list_tail}
   represent the beginning and end of the list, respectively.  As long
   as \var{read_next_city} returns a non-zero value, a new structure is
   created, initialized, and then appended to the list via the
   \var{next} field of the \var{list_tail} structure.  On the first
   time through the loop, the list is created via the assignment to the
   \var{list_root} variable.

   This function may be used as follows:
#v+
      variable Population_List = create_population_list ();
      if (Population_List == NULL) error ("List is empty");
#v-
   We can create other functions that manipulate the list.  An example is
   a function that finds the city with the largest population:
#v+
      define get_largest_city (list)
      {
         variable largest;

         largest = list;
         while (list != NULL)
           {
              if (list.population > largest.population)
                largest = list;
              list = list.next;
           }
         return largest.city_name;
      }

      vmessage ("%s is the largest city in the list",
                get_largest_city (Population_List));
#v-
   The \var{get_largest_city} function is a typical example of how one
   traverses a linear linked-list: one starts at the head of the list
   and successively moves to the next element of the list via the
   \var{next} field.
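
   The same traversal pattern carries over to other tasks.  As a
   further sketch, the hypothetical function \var{count_cities} below
   (the name is introduced here only for illustration) uses the same
   kind of \kw{while} loop to count the elements of the list:
#v+
      define count_cities (list)
      {
         variable n = 0;

         while (list != NULL)
           {
              n++;
              list = list.next;   % move on to the next element
           }
         return n;
      }
#v-
   For an empty list, i.e., when \var{list} is \var{NULL}, the loop
   body is never executed and the function returns \exmp{0}.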

   In the previous example, a \kw{while} loop was used to traverse the
   linked list.  It is faster to use a \kw{foreach} loop for this:
#v+
      define get_largest_city (list)
      {
         variable largest, elem;

         largest = list;
         foreach (list)
           {
              elem = ();
              if (elem.population > largest.population)
                largest = elem;
           }
         return largest.city_name;
      }
#v-
   Here a \kw{foreach} loop has been used to walk the list via its
   \exmp{next} field.  If the field name was not \exmp{next}, then it
   would have been necessary to use the \kw{using} form of the
   \kw{foreach} statement.  For example, if the field name implementing the
   linked list was \exmp{next_item}, then
#v+
      foreach (list) using ("next_item")
        {
           elem = ();
           .
           .
        }
#v-
   would have been used.  In other words, unless otherwise indicated
   via the \kw{using} clause, \kw{foreach} walks the list using a field
   named \exmp{next}.

   Now consider a function that sorts the list according to population.
   To illustrate the technique, a \em{bubble-sort} will be used, not
   because it is efficient, it is not, but because it is simple and
   intuitive.
#v+
      define sort_population_list (list)
      {
         variable changed;
         variable node, next_node, last_node;
         do
           {
              changed = 0;
              node = list;
              next_node = node.next;
              last_node = NULL;
              while (next_node != NULL)
                {
                   if (node.population < next_node.population)
                     {
                        % swap node and next_node
                        node.next = next_node.next;
                        next_node.next = node;
                        if (last_node != NULL)
                          last_node.next = next_node;

                        if (list == node) list = next_node;
                        node = next_node;
                        next_node = node.next;
                        changed++;
                     }
                   last_node = node;
                   node = next_node;
                   next_node = next_node.next;
                }
           }
         while (changed);

         return list;
      }
#v-
   Note the test for equality between \var{list} and \var{node}, i.e.,
#v+
      if (list == node) list = next_node;
#v-
   It is important to appreciate the fact that the values of these
   variables are references to structures, and that the
   comparison only compares the references and \em{not} the actual
   structures they reference.  If it were not for this, the algorithm
   would fail.

\sect2{Defining New Types}

   A user-defined data type may be defined using the \kw{typedef}
   keyword.  In the current implementation, a user-defined data type
   is essentially a structure with a user-defined set of fields.  For
   example, in the previous section a structure was used to represent
   a city/population pair.  We can define a data type called
   \var{Population_Type} to represent the same information:
#v+
      typedef struct
      {
         city_name,
         population
      } Population_Type;
#v-
   This data type can be used like all other data types.
For example,
   an array of \var{Population_Type} objects can be created,
#v+
      variable a = Population_Type[10];
#v-
   and `populated' via expressions such as
#v+
      a[0].city_name = "Boston";
      a[0].population = 2500000;
#v-
   The new type \var{Population_Type} may also be used with the
   \var{typeof} function:
#v+
      if (Population_Type == typeof (a)) city = a.city_name;
#v-
   The dereference operator \var{@} may be used to create an instance
   of the new type:
#v+
      a = @Population_Type;
      a.city_name = "Calcutta";
      a.population = 13000000;
#v-


#%}}}

\sect1{Error Handling} #%{{{

   Many intrinsic functions signal errors in the event of failure.
   User defined functions may also generate an error condition via the
   \var{error} function.  Depending upon the severity of the error, it
   can be caught and cleared using a construct called an
   \em{error-block}.

\sect2{Error-Blocks}

   When the interpreter encounters a recoverable run-time error, it
   will return to top-level by \em{unwinding} its function call
   stack.  Any error-blocks that it encounters as part of this
   unwinding process will get executed.  Errors such as syntax errors
   and memory allocation errors are not recoverable, and error-blocks
   will not get executed when such errors are encountered.

   An error-block is defined using the syntax
#v+
      ERROR_BLOCK { statement-list }
#v-
   where \em{statement-list} represents a list of statements that
   comprise the error-block.  A simple example of an error-block is
#v+
      define simple (a)
      {
         ERROR_BLOCK { message ("error-block executed"); }
         if (a) error ("Triggering Error");
         message ("hello");
      }
#v-
   Executing this function via \exmp{simple(0)} will result in the
   message \exmp{"hello"}.
However, calling it using \exmp{simple(1)}
   will generate an error that will be caught, but not cleared, by
   the error-block and the \exmp{"error-block executed"} message will
   result.

   Error-blocks are never executed unless triggered by an error.  The
   only exception to this is when the user explicitly indicates that
   the error-block in scope should execute.  This is indicated by the
   special keyword \var{EXECUTE_ERROR_BLOCK}.  For example,
   \var{simple} could be recoded as
#v+
      define simple (a)
      {
         variable err_string = "error-block executed";
         ERROR_BLOCK { message (err_string); }
         if (a) error ("Triggering Error");
         err_string = "hello";
         EXECUTE_ERROR_BLOCK;
      }
#v-
   Please note that \var{EXECUTE_ERROR_BLOCK} does not initiate an
   error condition; it simply causes the error-block to be executed
   and control will pass onto the next statement following the
   \var{EXECUTE_ERROR_BLOCK} statement.

\sect2{Clearing Errors}

   Once an error has been caught by an error-block, the error can be cleared
   by the \var{_clear_error} function.  After the error has been cleared,
   execution will resume at the next statement at the level of the error block
   following the statement that generated the error.  For example, consider:
#v+
      define make_error ()
      {
         error ("Error condition created.");
         message ("This statement is not executed.");
      }

      define test ()
      {
         ERROR_BLOCK
           {
              _clear_error ();
           }
         make_error ();
         message ("error cleared.");
      }
#v-
   Calling \var{test} will trigger an error in the \var{make_error}
   function, but will get cleared in the \var{test} function.  The
   call-stack will unwind from \var{make_error} back into \var{test}
   where the error-block will get executed.
As a result, execution
   resumes after the statement that makes the call to \var{make_error}
   since this statement is at the same level as the error-block that
   cleared the error.

   Here is another example that illustrates how multiple error-blocks
   work:
#v+
      define example ()
      {
         variable n = 0, s = "";
         variable str;

         ERROR_BLOCK {
            str = sprintf ("s=%s,n=%d", s, n);
            _clear_error ();
         }

         forever
           {
              ERROR_BLOCK {
                 s += "0";
                 _clear_error ();
              }

              if (n == 0) error ("");

              ERROR_BLOCK {
                 s += "1";
              }

              if (n == 1) error ("");
              n++;
           }
         return str;
      }
#v-
   Here, three error-blocks have been declared.  One has been declared
   outside the \var{forever} loop and the other two have been declared
   inside the \var{forever} loop.  Each time through the loop, the variable
   \var{n} is incremented and a different error-block is triggered.  The
   error-block that gets triggered is the last one encountered, since
   that will be the one in scope.  On the first time through the loop,
   \var{n} will be zero and the first error-block in the loop will get
   executed.  This error block clears the error and execution resumes
   following the \var{if} statement that triggered the error.  The
   variable \var{n} will get incremented to \exmp{1} and, on the
   second cycle through the loop, the second \var{if} statement
   will trigger an error causing the second error-block to execute.
   This time, the error is not cleared and the call-stack unwinds out
   of the \var{forever} loop, at which point the error-block outside
   the loop is in scope, causing it to execute.  This error-block
   formats the values of the variables \var{s} and \var{n} into the
   string \var{str}.  It
   will clear the error and execution resumes on the statement
   \em{following} the \var{forever} loop.
The result of this
   complicated series of events is that the function will return the
   string \exmp{"s=01,n=1"}.

#%}}}

\sect1{Loading Files: evalfile and autoload}

\sect1{File Input/Output} #%{{{

   \slang provides built-in support for two different I/O facilities.
   The simplest interface is modeled upon the C language \var{stdio}
   streams interface and consists of functions such as \var{fopen},
   \var{fgets}, etc.  The other interface is modeled on a lower level
   POSIX interface consisting of functions such as \var{open},
   \var{read}, etc.  In addition to permitting more control, the lower
   level interface permits one to access network objects as well as disk
   files.

\sect2{Input/Output via stdio}
\sect3{Stdio Overview}
   The \var{stdio} interface consists of the following functions:
\begin{itemize}
\item \var{fopen}, which opens a file for reading or writing.

\item \var{fclose}, which closes a file opened by \var{fopen}.

\item \var{fgets}, used to read a line from the file.

\item \var{fputs}, which writes text to the file.

\item \var{fprintf}, used to write formatted text to the file.

\item \var{fwrite}, which may be used to write objects to the
      file.

\item \var{fread}, which reads a specified number of objects from
      the file.

\item \var{feof}, which is used to test whether the file pointer is at the
      end of the file.

\item \var{ferror}, which is used to see whether or not the stream
      associated with the file has an error.

\item \var{clearerr}, which clears the end-of-file and error
      indicators for the stream.

\item \var{fflush}, used to force all buffered data associated with
      the stream to be written out.

\item \var{ftell}, which is used to query the file position indicator
      of the stream.

\item \var{fseek}, which is used to set the file position indicator
      of the stream.

\item \var{fgetslines}, which reads all the lines in a text file and
      returns them as an array of strings.

\end{itemize}

   In addition, the interface supports the \var{popen} and \var{pclose}
   functions on systems where the corresponding C functions are available.

   Before reading or writing to a file, it must first be opened using
   the \var{fopen} function.  The only exception to this rule involves
   use of the pre-opened streams: \var{stdin}, \var{stdout}, and
   \var{stderr}.  \var{fopen} accepts two arguments: a file name and a
   string argument that indicates how the file is to be opened, e.g.,
   for reading, writing, update, etc.  It returns a \var{File_Type}
   stream object that is used as an argument to all other functions of
   the \var{stdio} interface.  Upon failure, it returns \NULL.  See the
   reference manual for more information about \var{fopen}.

\sect3{Stdio Examples}

   In this section, some simple examples of the use of the \var{stdio}
   interface are presented.  It is important to realize that all the
   functions of the interface return something, and that return value
   must be dealt with.

   The first example involves writing a function to count the number of
   lines in a text file.  To do this, we shall read in the lines, one by
   one, and count them:
#v+
    define count_lines_in_file (file)
    {
       variable fp, line, count;

       fp = fopen (file, "r");    % Open the file for reading
       if (fp == NULL)
         verror ("%s failed to open", file);

       count = 0;
       while (-1 != fgets (&line, fp))
         count++;

       () = fclose (fp);
       return count;
    }
#v-
   Note that \exmp{&line} was passed to the \var{fgets} function.
When
   \var{fgets} returns, \var{line} will contain the line of text read in
   from the file.  Also note how the return value from \var{fclose} was
   handled.

   Although the preceding example closed the file via \var{fclose},
   there is no need to explicitly close a file because \slang will
   automatically close the file when it is no longer referenced.  Since
   the only variable to reference the file is \var{fp}, it would have
   automatically been closed when the function returned.

   Suppose that it is desired to count the number of characters in the
   file instead of the number of lines.  To do this, the \var{while}
   loop could be modified to count the characters as follows:
#v+
      while (-1 != fgets (&line, fp))
        count += strlen (line);
#v-
   The main difficulty with this approach is that it will not work for
   binary files, i.e., files that contain null characters.  For such
   files, the file should be opened in \em{binary} mode via
#v+
      fp = fopen (file, "rb");
#v-
   and then the data read in using the \var{fread} function:
#v+
      while (-1 != fread (&line, Char_Type, 1024, fp))
        count += bstrlen (line);
#v-
   The \var{fread} function requires two additional arguments: the type
   of object to read (\var{Char_Type} in this case), and the number of
   such objects to read.  The function returns the number of objects
   actually read, or -1 upon failure.  The \var{bstrlen} function was
   used to compute the length of \var{line} because for \var{Char_Type}
   or \var{UChar_Type} objects, the \var{fread} function assigns a
   \em{binary} string (\var{BString_Type}) to \var{line}.

   The \kw{foreach} construct also works with \var{File_Type} objects.
   For example, the number of characters in a file may be counted via
#v+
      foreach (fp) using ("char")
        {
           ch = ();
           count++;
        }
#v-
   To count the number of lines, one can use:
#v+
      foreach (fp) using ("line")
        {
           line = ();
           num_lines++;
           count += strlen (line);
        }
#v-

   Finally, it should be mentioned that neither of these examples should
   be used to count the number of characters in a file when that
   information is more readily accessible by another means.  For
   example, it is preferable to get this information via the
   \var{stat_file} function:
#v+
     define count_chars_in_file (file)
     {
        variable st;

        st = stat_file (file);
        if (st == NULL)
          error ("stat_file failed.");
        return st.st_size;
     }
#v-

\sect2{POSIX I/O}

\sect2{Advanced I/O techniques}

   The previous examples illustrate how to read and write objects of a
   single data-type from a file, e.g.,
#v+
      num = fread (&a, Double_Type, 20, fp);
#v-
   would result in a \exmp{Double_Type[num]} array being assigned to
   \var{a} if successful.  However, suppose that the binary data file
   consists of numbers in a specified byte-order.  How can one read
   such objects with the proper byte swapping?  The answer is to use
   the \var{fread} function to read the objects as \var{Char_Type} and
   then \em{unpack} the resulting string into the specified data type,
   or types.  This process is facilitated using the \var{pack} and
   \var{unpack} functions.

   The \var{pack} function follows the syntax
\begin{tscreen}
    BString_Type pack (\em{format-string}, \em{item-list});
\end{tscreen}
   and combines the objects in the \em{item-list} according to
   \em{format-string} into a binary string and returns the result.
   Likewise, the \var{unpack} function may be used to convert a binary
   string into separate data objects:
\begin{tscreen}
   (\em{variable-list}) = unpack (\em{format-string}, \em{binary-string});
\end{tscreen}

   The format string consists of one or more data-type specification
   characters, and each may be followed by an optional decimal length
   specifier.  Specifically, the data-types are specified according to
   the following table:
#v+
     c     char
     C     unsigned char
     h     short
     H     unsigned short
     i     int
     I     unsigned int
     l     long
     L     unsigned long
     j     16 bit int
     J     16 bit unsigned int
     k     32 bit int
     K     32 bit unsigned int
     f     float
     d     double
     F     32 bit float
     D     64 bit float
     s     character string, null padded
     S     character string, space padded
     x     a null pad character
#v-
   A decimal length specifier may follow the data-type specifier.  With
   the exception of the \var{s} and \var{S} specifiers, the length
   specifier indicates how many objects of that data type are to be
   packed or unpacked from the string.  When used with the \var{s} or
   \var{S} specifiers, it indicates the field width to be used.  If the
   length specifier is not present, the length defaults to one.

   With the exception of \var{c}, \var{C}, \var{s}, \var{S}, and
   \var{x}, each of these may be prefixed by a character that indicates
   the byte-order of the object:
#v+
     >    big-endian order (network order)
     <    little-endian order
     =    native byte-order
#v-
   The default is native byte order.
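
   As a rough sketch of how a byte-order prefix and a length specifier
   combine to answer the byte-swapping question posed above, the
   following function reads big-endian doubles from a stream opened in
   binary mode.  The name \var{read_be_doubles} and its arguments are
   hypothetical and serve only as an illustration; the sketch assumes
   that a short read should be treated as a failure:
#v+
    % Hypothetical helper (not part of the library): read n big-endian
    % doubles from the open binary stream fp.
    define read_be_doubles (fp, n)
    {
       variable fmt, size, buf;

       fmt = sprintf (">d%d", n);     % e.g., ">d20"
       size = sizeof_pack (fmt);      % number of bytes described by fmt

       if (-1 == fread (&buf, Char_Type, size, fp))
         return NULL;
       if (bstrlen (buf) != size)     % short read: incomplete data
         return NULL;

       return unpack (fmt, buf);
    }
#v-
   Since the length specifier determines whether an array is returned,
   a caller passing \exmp{n = 1} should expect a single value rather
   than an array.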

   Here are a few examples that should make this clearer:
#v+
     a = pack ("cc", 'A', 'B');         % ==> a = "AB";
     a = pack ("c2", 'A', 'B');         % ==> a = "AB";
     a = pack ("xxcxxc", 'A', 'B');     % ==> a = "\0\0A\0\0B";
     a = pack ("h2", 'A', 'B');         % ==> a = "\0A\0B" or "A\0B\0"
     a = pack (">h2", 'A', 'B');        % ==> a = "\0A\0B"
     a = pack ("<h2", 'A', 'B');        % ==> a = "A\0B\0"
     a = pack ("s4", "AB", "CD");       % ==> a = "AB\0\0"
     a = pack ("s4s2", "AB", "CD");     % ==> a = "AB\0\0CD"
     a = pack ("S4", "AB", "CD");       % ==> a = "AB  "
     a = pack ("S4S2", "AB", "CD");     % ==> a = "AB  CD"
#v-

   When unpacking, if the length specifier is greater than one, then an
   array of that length will be returned.  In addition, trailing
   whitespace and null characters are stripped when unpacking an object
   given by the \var{S} specifier.  Here are a few examples:
#v+
    (x,y) = unpack ("cc", "AB");         % ==> x = 'A', y = 'B'
    x = unpack ("c2", "AB");             % ==> x = ['A', 'B']
    x = unpack ("x<H", "\0\xAB\xCD");    % ==> x = 0xCDABuh
    x = unpack ("xxs4", "a b c\0d e f"); % ==> x = "b c\0"
    x = unpack ("xxS4", "a b c\0d e f"); % ==> x = "b c"
#v-

\sect3{Example: Reading /var/log/wtmp}

   Consider the task of reading the Unix system file
   \var{/var/log/utmp}, which contains login records about who logged
   onto the system.  This file format is documented in section 5 of the
   online Unix man pages, and consists of a sequence of entries
   formatted according to the C structure \var{utmp} defined in the
   \var{utmp.h} C header file.  The actual details of the structure
   may vary from one version of Unix to another.
For the purposes of
   this example, consider its definition under the Linux operating
   system running on an Intel processor:
#v+
    struct utmp {
       short ut_type;              /* type of login */
       pid_t ut_pid;               /* pid of process */
       char ut_line[12];           /* device name of tty - "/dev/" */
       char ut_id[2];              /* init id or abbrev. ttyname */
       time_t ut_time;             /* login time */
       char ut_user[8];            /* user name */
       char ut_host[16];           /* host name for remote login */
       long ut_addr;               /* IP addr of remote host */
    };
#v-
   On this system, \var{pid_t} is defined to be an \var{int} and
   \var{time_t} is a \var{long}.  Hence, a format specifier for the
   \var{pack} and \var{unpack} functions is easily constructed to be:
#v+
     "h i S12 S2 l S8 S16 l"
#v-
   However, this particular definition is naive because it does not
   allow for structure padding performed by the C compiler in order to
   align the data types on suitable word boundaries.  Fortunately, the
   intrinsic function \var{pad_pack_format} may be used to modify a
   format by adding the correct amount of padding in the right places.
   In fact, \var{pad_pack_format} applied to the above format on an
   Intel-based Linux system produces the result:
#v+
     "h x2 i S12 S2 x2 l S8 S16 l"
#v-
   Here we see that 4 bytes of padding were added.

   The other missing piece of information is the size of the structure.
   This is useful because we would like to read in one structure at a
   time using the \var{fread} function.  Knowing the size of the
   various data types makes this easy; however it is even easier to use
   the \var{sizeof_pack} intrinsic function, which returns the size (in
   bytes) of the structure described by the pack format.
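
   As a minimal sketch, the two intrinsics may be combined to compare
   the naive and padded formats.  The variable names \var{naive} and
   \var{padded} are merely illustrative, and the sizes reported depend
   upon the platform:
#v+
    variable naive  = "h i S12 S2 l S8 S16 l";
    variable padded = pad_pack_format (naive);

    % Report the size in bytes described by each format
    () = fprintf (stdout, "naive:  %d bytes\n", sizeof_pack (naive));
    () = fprintf (stdout, "padded: %d bytes (%s)\n",
                  sizeof_pack (padded), padded);
#v-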

   So, with all the pieces in place, it is rather straightforward to
   write the code:
#v+
    variable format, size, fp, buf;

    typedef struct
    {
       ut_type, ut_pid, ut_line, ut_id,
       ut_time, ut_user, ut_host, ut_addr
    } UTMP_Type;

    format = pad_pack_format ("h i S12 S2 l S8 S16 l");
    size = sizeof_pack (format);

    define print_utmp (u)
    {
       () = fprintf (stdout, "%-16s %-12s %-16s %s\n",
                     u.ut_user, u.ut_line, u.ut_host, ctime (u.ut_time));
    }


    fp = fopen ("/var/log/utmp", "rb");
    if (fp == NULL)
      error ("Unable to open utmp file");

    () = fprintf (stdout, "%-16s %-12s %-16s %s\n",
                  "USER", "TTY", "FROM", "LOGIN@");

    variable U = @UTMP_Type;

    while (-1 != fread (&buf, Char_Type, size, fp))
      {
         set_struct_fields (U, unpack (format, buf));
         print_utmp (U);
      }

    () = fclose (fp);
#v-
   A few comments about this example are in order.  First of all, note
   that a new data type called \var{UTMP_Type} was created, although
   this was not really necessary.  We also opened the file in binary
   mode, but this too is optional under a Unix system where there is no
   distinction between binary and text modes.  The \var{print_utmp}
   function does not print all of the structure fields.  Finally, the
   return values from \var{fprintf} and \var{fclose} were dealt with.

#%}}}

\sect1{Debugging} #%{{{

   The current implementation provides no support for an interactive
   debugger, although a future version will.  Nevertheless, \slang has
   several features that aid the programmer in tracking down problems,
   including function call tracebacks and the tracing of function calls.
   However, the biggest debugging aid stems from the fact that the
   language is interpreted, permitting one to easily add debugging
   statements to the code.
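
   For instance, a temporary debugging statement might look something
   like the following sketch.  The function \var{scale_values} is
   hypothetical and exists only to show where such a statement would be
   placed:
#v+
    % Hypothetical function with a temporary debugging statement
    define scale_values (a, factor)
    {
       () = fprintf (stderr, "scale_values: %d elements, factor=%s\n",
                     length (a), string (factor));
       return a * factor;
    }
#v-
   Writing to \var{stderr} keeps the debugging output separate from
   anything the function's callers write to \var{stdout}.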

   To enable debugging information, add the lines
#v+
    _debug_info = 1;
    _traceback = 1;
#v-
   to the top of the source file of the code containing the bug and
   then reload the file.  Setting the \var{_debug_info} variable to
   \exmp{1} causes line number information to be compiled into the
   functions when the file is loaded.  The \var{_traceback} variable
   controls whether or not traceback information should be generated.
   If it is set to \exmp{1}, the values of local variables will be
   dumped when the traceback is generated.  Setting this variable
   to \exmp{-1} will cause only function names to be reported in the
   traceback.

   Here is an example of a traceback report:
#v+
    S-Lang Traceback: error
    S-Lang Traceback: verror
    S-Lang Traceback: (Error occurred on line 65)
    S-Lang Traceback: search_generic_search
      Local Variables:
        $0: Type: String_Type, Value: "Search forward:"
        $1: Type: Integer_Type, Value: 1
        $2: Type: Ref_Type, Value: _function_return_1
        $3: Type: String_Type, Value: "abcdefg"
        $4: Type: Integer_Type, Value: 1
    S-Lang Traceback: (Error occurred on line 72)
    S-Lang Traceback: search_forward
#v-
   There are several ways to read this report; perhaps the simplest is
   to read it from the bottom.  This report says that on line \exmp{72},
   the \var{search_forward} function called the
   \var{search_generic_search} function.  On line \exmp{65},
   \var{search_generic_search} called the \verb{verror} function, which
   in turn called \var{error}.  The \var{search_generic_search} function
   contains \exmp{5} local variables, which are represented symbolically
   as \exmp{$0} through \exmp{$4}.


#%}}}

#i regexp.tm

\sect1{Future Directions} #%{{{

   Several new features or enhancements to the \slang language are
   planned for the next major release.
In no particular order, these
   include:
\begin{itemize}
  \item An interactive debugging facility.
  \item Function qualifiers.  These entities should already be
    familiar to VMS users or to those who are familiar with the IDL
    language.  Basically, a qualifier is an optional argument that is
    passed to a function, e.g., \exmp{plot(X,Y,/logx)}.  Here
    \exmp{/logx} is a qualifier that specifies that the plot function
    should use a log scale for \exmp{x}.
  \item File local variables and functions.  A file local variable or
    function is an object that is global to the file that defines it.
  \item Multi-threading.  Currently the language does not support
    multiple threads.
\end{itemize}


#%}}}

\appendix

#i copyright.tm

\end{\documentstyle}