This is kawa.info, produced by makeinfo version 6.6 from kawa.texi.

START-INFO-DIR-ENTRY
* kawa: (kawa).         The Kawa Scheme language
END-INFO-DIR-ENTRY


File: kawa.info,  Node: Datum syntax,  Next: Hash-prefixed forms,  Prev: Lexical syntax,  Up: Syntax

7.4 Datum syntax
================

The datum syntax describes the syntax of syntactic data in terms of a
sequence of LEXEMEs, as defined in the lexical syntax.

   The following grammar defines syntactic data in terms of the various
kinds of lexemes defined in the grammar in section “Lexical Syntax”:

     DATUM ::= DEFINING-DATUM
              | NONDEFINING-DATUM
              | DEFINED-DATUM
     NONDEFINING-DATUM ::= LEXEME-DATUM
              | COMPOUND-DATUM

     LEXEME-DATUM ::= BOOLEAN | NUMBER
              | CHARACTER | STRING | SYMBOL
     SYMBOL ::= IDENTIFIER
     COMPOUND-DATUM ::= LIST | VECTOR | UNIFORM-VECTOR | ARRAY-LITERAL | EXTENDED-STRING-LITERAL | XML-LITERAL
     LIST ::= ‘(’DATUM^{*}‘)’
              | ‘(’DATUM^{+} ‘.’ DATUM‘)’
              | ABBREVIATION
     VECTOR ::= ‘#(’DATUM^{*}‘)’
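
   As a minimal sketch of the grammar above (the particular values are
illustrative only), each quoted datum below is a COMPOUND-DATUM built
from LEXEME-DATUMs:

     '(1 "two" #\3)      ; LIST, first alternative
     '(1 2 . 3)          ; LIST with a dotted final DATUM
     '#(a #t (b c))      ; VECTOR containing a nested LIST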

7.4.1 Datum labels
------------------

     DATUM-LABEL ::= ‘#’INDEXNUM‘=’
     DEFINING-DATUM ::= DATUM-LABEL^{+}NONDEFINING-DATUM
     DEFINED-DATUM ::= ‘#’INDEXNUM‘#’
     INDEXNUM ::= DIGIT^{+}

   The lexical syntax ‘#N=DATUM’ reads the same as DATUM, but also
results in DATUM being labelled by N, which must be a sequence of digits.

   The lexical syntax ‘#N#’ serves as a reference to some object
labelled by ‘#N=’; the result is the same object (in the sense of ‘eq?’)
as the object labelled by ‘#N=’.

   Together, these syntaxes permit the notation of structures with
shared or circular substructure.

     (let ((x (list 'a 'b 'c)))
       (set-cdr! (cddr x) x)
       x)    ⇒ #0=(a b c . #0#)

   The scope of a datum label is the portion of the outermost datum in
which it appears that is to the right of the label.  Consequently, a
reference ‘#N#’ can occur only after a label ‘#N=’; it is an error to
attempt a forward reference.  In addition, it is an error if the
reference appears as the labelled object itself (as in ‘#N=#N#’),
because the object labelled by ‘#N=’ is not well defined in this case.
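
   The sharing described above can be observed with ‘read’.  This is a
small sketch, and the variable name ‘shared’ is only illustrative:

     (define shared
       (read (open-input-string "(#0=(a b) #0#)")))
     (eq? (car shared) (cadr shared))    ⇒ #t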

7.4.2 Abbreviations
-------------------

     ABBREVIATION ::= R6RS-ABBREVIATION | KAWA-ABBREVIATION
     R6RS-ABBREVIATION ::= ABBREV-PREFIX DATUM
     ABBREV-PREFIX ::= ‘’’ | ‘‘’ | ‘,’ | ‘,@’
              | ‘#’’ | ‘#‘’
     KAWA-ABBREVIATION ::= XXX

   The following abbreviations are expanded at read-time:

‘’’DATUM
     means ‘(quote’ DATUM‘)’.

‘‘’DATUM
     means ‘(quasiquote’ DATUM‘)’.

‘,’DATUM
     means ‘(unquote’ DATUM‘)’.

‘,@’DATUM
     means ‘(unquote-splicing’ DATUM‘)’.

‘#’’DATUM
     means ‘(syntax’ DATUM‘)’.

‘#‘’DATUM
     means ‘(quasisyntax’ DATUM‘)’.

‘#,’DATUM
     means ‘(unsyntax’ DATUM‘)’.  This abbreviation is currently only
     recognized when nested inside an explicit ‘#‘’DATUM form, because
     of a conflict with SRFI-10 named constructors.

‘#,@’DATUM
     means ‘(unsyntax-splicing’ DATUM‘)’.

DATUM1‘:’DATUM2
     means ‘($lookup$’ DATUM1 ‘(quasiquote’ DATUM2‘))’.  *Note Colon
     notation::.

‘[’EXPRESSION ...‘]’
     means ‘($bracket-list$’ EXPRESSION ...‘)’.

OPERATOR‘[’EXPRESSION ...‘]’
     means ‘($bracket-apply$’ OPERATOR EXPRESSION ...‘)’.
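
   Because these abbreviations are expanded by the reader, the expanded
list structure is visible in quoted data.  A short sketch using only the
standard abbreviations:

     (equal? ''a '(quote a))        ⇒ #t
     (car ''a)                      ⇒ quote
     (car (cadr '(x ,y)))           ⇒ unquote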


File: kawa.info,  Node: Hash-prefixed forms,  Next: Primitive expression syntax,  Prev: Datum syntax,  Up: Syntax

7.5 Hash-prefixed forms
=======================

A number of different special forms are indicated by an initial hash
(number) symbol (‘#’).  Here is a table summarizing them.

   Case is ignored for the character following the ‘#’.  Thus ‘#x’ and
‘#X’ are the same.

‘#:’KEYWORD
     Guile-style *note keyword: Keywords. syntax.
‘#\’
     *note Character literals: meta-character.
‘#!’
     *Note Special named constants::.
‘#‘’DATUM
     Equivalent to ‘(quasisyntax DATUM)’.  Convenience syntax for
     syntax-case macros.
‘#’’DATUM
     Equivalent to ‘(syntax DATUM)’.  Convenience syntax for syntax-case
     macros.
‘#,’DATUM
     Equivalent to ‘(unsyntax DATUM)’.  Currently only recognized when
     inside a ‘#`TEMPLATE’ form.  Convenience syntax for syntax-case
     macros.
‘#,(’NAME DATUM ...‘)’
     Special named constructors.  This syntax is deprecated, because it
     conflicts with ‘unsyntax’.  It is only recognized when _not_ in a
     ‘#`TEMPLATE’ form.
‘#,@’DATUM
     Equivalent to ‘(unsyntax-splicing DATUM)’.
‘#(’
     A vector.
‘#|’
     Start of nested-comment.
‘#/’REGEX‘/’
     *Note Regular expressions::.
‘#<’
     *Note XML literals::.
‘#;’DATUM
     A datum comment - the DATUM is ignored.  (An INTERLEXEME-SPACE may
     appear before the DATUM.)
‘#’NUMBER‘=’DATUM
     A reference definition, allowing cyclic and shared structure.
     Equivalent to the DATUM, but also defines an association between
     the integer NUMBER and that DATUM, which can be used by a
     subsequent ‘#NUMBER#’ form.
‘#’NUMBER‘#’
     A back-reference, allowing cyclic and shared structure.
‘#’R‘a’DATUM
     An *note array literal: array-literals, for a multi-dimensional
     array of rank R.
‘#b’
     A binary (base-2) number.
‘#d’
     A decimal (base-10) number.
‘#e’
     A prefix to treat the following number as exact.
‘#f’
‘#false’
     The standard boolean false object.
‘#f’N‘(’NUMBER ...‘)’
     A uniform vector of floating-point numbers.  The parameter N is a
     precision, which can be 32 or 64.  *Note Uniform vectors::.
‘#i’
     A prefix to treat the following number as inexact.
‘#o’
     An octal (base-8) number.
‘#’BASE‘r’
     A number in the specified BASE (radix).
‘#s’N‘(’NUMBER ...‘)’
     A uniform vector of signed integers.  The parameter N is a
     precision, which can be 8, 16, 32, or 64.  *Note Uniform vectors::.
‘#t’
‘#true’
     The standard boolean true object.
‘#u’N‘(’NUMBER ...‘)’
     A uniform vector of unsigned integers.  The parameter N is a
     precision, which can be 8, 16, 32, or 64.  *Note Uniform vectors::.
‘#x’
     A hexadecimal (base-16) number.
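
   A few of the forms above in use.  This is an illustrative sketch;
the result shown for ‘#e1.5’ assumes Kawa’s exact rational numbers:

     #x1F                        ⇒ 31
     #b101                       ⇒ 5
     #o17                        ⇒ 15
     #e1.5                       ⇒ 3/2
     #i1/4                       ⇒ 0.25
     (list 1 #;(ignored) 3)      ⇒ (1 3)
     (vector-length '#(a b c))   ⇒ 3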

   The following named constructor forms are supported:

‘#,(path’ PATH‘)’
‘#,(filepath’ PATH‘)’
‘#,(URI’ PATH‘)’
‘#,(symbol’ LOCAL-NAME [URI [PREFIX]]‘)’
‘#,(symbol’ LOCAL-NAME NAMESPACE‘)’
‘#,(namespace’ URI [PREFIX]‘)’
‘#,(duration’ DURATION‘)’


File: kawa.info,  Node: Primitive expression syntax,  Next: Colon notation,  Prev: Hash-prefixed forms,  Up: Syntax

7.6 Primitive expression syntax
===============================

     EXPRESSION ::= LITERAL-EXPRESSION | VARIABLE-REFERENCE
       | PROCEDURE-CALL | TODO

7.6.1 Literal expressions
-------------------------

     LITERAL-EXPRESSION ::= ‘(quote’ DATUM‘)’
       | ‘’’ DATUM
       | CONSTANT
     CONSTANT ::= NUMBER | BOOLEAN | CHARACTER | STRING

   ‘(quote DATUM)’ evaluates to DATUM, which may be any external
representation of a Scheme object.  This notation is used to include
literal constants in Scheme code.
     (quote a)               ⇒  a
     (quote #(a b c))        ⇒  #(a b c)
     (quote (+ 1 2))         ⇒  (+ 1 2)

   ‘(quote DATUM)’ may be abbreviated as ‘'DATUM’.  The two notations
are equivalent in all respects.
     'a                      ⇒  a
     '#(a b c)               ⇒  #(a b c)
     '()                     ⇒  ()
     '(+ 1 2)                ⇒  (+ 1 2)
     '(quote a)              ⇒  (quote a)
     ''a                     ⇒  (quote a)

   Numerical constants, string constants, character constants,
bytevector constants, and boolean constants evaluate to themselves; they
need not be quoted.

     145932          ⇒  145932
     #t              ⇒  #t
     "abc"           ⇒  "abc"

   Note that *note keywords: Keywords. need to be quoted, unlike in some
other Lisp/Scheme dialects, including Common Lisp, and in earlier
versions of Kawa.  (Kawa currently evaluates a non-quoted keyword as
itself, but that will change.)

7.6.2 Variable references
-------------------------

     VARIABLE-REFERENCE ::= IDENTIFIER
   An expression consisting of a variable is a variable reference if it
is not a macro use (see below).  The value of the variable reference is
the value stored in the location to which the variable is bound.  It is
a syntax violation to reference an unbound variable.

   The following example assumes the base library has been imported:

     (define x 28)
     x   ⇒  28

7.6.3 Procedure calls
---------------------

     PROCEDURE-CALL ::= ‘(’OPERATOR OPERAND ...‘)’
     OPERATOR ::= EXPRESSION
     OPERAND ::= EXPRESSION
       | KEYWORD EXPRESSION
       | ‘@’ EXPRESSION
       | ‘@:’ EXPRESSION

   A procedure call consists of expressions for the procedure to be
called and the arguments to be passed to it, with enclosing parentheses.
A form in an expression context is a procedure call if OPERATOR is not
an identifier bound as a syntactic keyword.

   When a procedure call is evaluated, the operator and operand
expressions are evaluated (in an unspecified order) and the resulting
procedure is passed the resulting arguments.

     (+ 3 4)                ⇒  7
     ((if #f + *) 3 4)      ⇒  12

   The syntax KEYWORD EXPRESSION is a “keyword argument”.  This is a
mechanism for specifying arguments using a name rather than position,
and is especially useful for procedures with many optional parameters.
Note that KEYWORD must be literal, and cannot be the result from
evaluating a non-literal expression.  (This is a change from previous
versions of Kawa, and is different from Common Lisp and some other
Scheme dialects.)

   An expression prefixed by ‘@’ or ‘@:’ is a splice argument.  The
following expression must evaluate to an “argument list” (see *note
Application and Arguments Lists:: for details); each element of the
argument list becomes a separate argument when calling the OPERATOR.
(This is very similar to the “spread” operator in ECMAScript 6.)
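
   The following sketch illustrates both kinds of operand.  The
procedure ‘make-point’ is hypothetical, and its parameter list assumes
Kawa’s ‘#!key’ syntax for declaring keyword parameters:

     (define (make-point #!key (x 0) (y 0))
       (list x y))
     (make-point y: 5)           ; y is passed by name, x keeps its default
     (define args (list 1 2 3))
     (+ @args)                   ; same call as (+ 1 2 3)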


File: kawa.info,  Node: Colon notation,  Next: Bodies,  Prev: Primitive expression syntax,  Up: Syntax

7.7 Property access using colon notation
========================================

The “colon notation” accesses named parts (properties) of a value.  It
is used to get and set fields, call methods, construct compound symbols,
and more.  Evaluating the form ‘OWNER:PROPERTY’ evaluates the ‘OWNER’
and then extracts the named ‘PROPERTY’ of the result.

     PROPERTY-ACCESS-ABBREVIATION ::= PROPERTY-OWNER-EXPRESSION‘:’PROPERTY-NAME
     PROPERTY-OWNER-EXPRESSION ::= EXPRESSION
     PROPERTY-NAME ::= IDENTIFIER | ‘,’EXPRESSION

   The PROPERTY-NAME is usually a literal name, but it can be an
unquoted EXPRESSION (i.e.  following a ‘,’), in which case the name is
evaluated at run-time.  No separators are allowed on either side of the
colon.

   The input syntax ‘OWNER:PART’ is translated by the Scheme reader to
the internal representation ‘($lookup$ OWNER (quasiquote PART))’.

7.7.1 Part lookup rules
-----------------------

Evaluation proceeds as follows.  First PROPERTY-OWNER-EXPRESSION is
evaluated to yield an OWNER object.  Evaluating the PROPERTY-NAME yields
a PART name, which is a simple symbol: Either the literal IDENTIFIER, or
the result of evaluating the property-name EXPRESSION.  If the
EXPRESSION evaluates to a string, it is converted to a symbol, as if
using ‘string->symbol’.

   • If the OWNER implements ‘gnu.mapping.HasNamedParts’, then the
     result is that of invoking the ‘get’ method of the OWNER with the
     PART name as a parameter.

     As a special case of this rule, if OWNER is a
     ‘gnu.mapping.Namespace’, then the result is the *note compound
     symbol in that namespace: Namespaces.
   • If OWNER is a ‘java.lang.Class’ or a ‘gnu.bytecode.ObjectType’, the
     result is the static member named PART (i.e.  a static field,
     method, or member class).
   • If OWNER is a ‘java.lang.Package’ object, we get the member class
     or sub-package named PART.
   • Otherwise, we look for a named member (instance member or field).

     Note you can’t use colon notation to invoke instance methods of a
     ‘Class’, because it will match a previous rule.  For example if you
     want to invoke the ‘getDeclaredMethod’ method of
     ‘java.util.List’, you can’t write
     ‘(java.util.List:getDeclaredMethod’ because that will look for a
     static method in ‘java.util.List’.  Instead, use the ‘invoke’ or
     ‘invoke-static’ procedure.  For example: ‘(invoke java.util.List
     'getDeclaredMethod)’.

   If the colon form is on the left-hand-side of an assignment (‘set!’),
then the named part is modified as appropriate.
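
   A sketch of how these rules apply, using standard Java classes (the
variables ‘sb’ and ‘p’ are illustrative):

     java.lang.Integer:MAX_VALUE         ; static field of a class
     (java.lang.Integer:parseInt "42")   ; static method of a class
     (define sb (java.lang.StringBuilder "ab"))
     (sb:append "cd")                    ; instance method of sb
     (sb:toString)                       ; the string "abcd"
     (define p (java.awt.Point 1 2))
     (set! p:x 10)                       ; assigns the instance field x
     p:x                                 ; now 10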

7.7.2 Specific cases
--------------------

Some of these are deprecated; more compact and readable forms are
usually preferred.

7.7.2.1 Invoking methods
........................

     ‘(’INSTANCE‘:’METHOD-NAME ARG ...‘)’
     ‘(’CLASS‘:’METHOD-NAME INSTANCE ARG ...‘)’
     ‘(’CLASS‘:’METHOD-NAME ARG ...‘)’
     ‘(*:’METHOD-NAME INSTANCE ARG ...‘)’

   For details *note Method operations::.

7.7.2.2 Accessing fields
........................

     CLASS‘:’FIELD-NAME
     INSTANCE‘:’FIELD-NAME
     ‘(’PREFIX‘:.’FIELD-NAME INSTANCE‘)’

   For details *note Field operations::.

7.7.2.3 Type literal
....................

     ‘(’TYPE‘:<>)’
   Returns the TYPE.  Deprecated; usually you can just write:
     TYPE

7.7.2.4 Type cast
.................

     ‘(’TYPE‘:’‘@’ EXPRESSION‘)’
   Performs a cast.  Deprecated; usually you can just write:
     ->TYPE

7.7.2.5 Type test
.................

     ‘(’TYPE‘:instanceof?’ EXPRESSION‘)’

   Deprecated; usually you can just write:
     (TYPE? EXPRESSION)

7.7.2.6 New object construction
...............................

     ‘(’TYPE‘:new’ ARG ...‘)’

   Deprecated; usually you can just write:
     ‘(’TYPE ARG ...‘)’

7.7.2.7 Getting array length
............................

     EXPRESSION‘:length’
     ‘(’EXPRESSION‘:.length)’


File: kawa.info,  Node: Bodies,  Next: Syntax and conditional compilation,  Prev: Colon notation,  Up: Syntax

7.8 Programs and Bodies
=======================

Program units
-------------

A PROGRAM-UNIT consists of a sequence of definitions and expressions.

     PROGRAM-UNIT ::= LIBRARY-DEFINITION^{+} [STATEMENTS]
       | STATEMENTS
     STATEMENTS ::= STATEMENT^{+}
     STATEMENT ::= DEFINITION | EXPRESSION | ‘(begin’ STATEMENT^{*} ‘)’

   Typically a PROGRAM-UNIT corresponds to a single source file (i.e. a
named file in the file system).  Evaluating a PROGRAM-UNIT first
requires the Kawa processor to analyze the whole PROGRAM-UNIT to
determine which names are defined by the definitions, and then evaluates
each STATEMENT in order in the context of the defined names.  The value
of an EXPRESSION is normally discarded, but may be printed out instead,
depending on the evaluating context.

   The read-eval-print-loop (REPL) reads one or more lines until it gets
a valid PROGRAM-UNIT, and evaluates it as above, except that the values
of expressions are printed to the console (as if using the ‘display’
function).  Then the REPL reads and evaluates another PROGRAM-UNIT, and
so on.  A definition in an earlier PROGRAM-UNIT is remembered and is
visible in a later PROGRAM-UNIT unless it is overridden.

   A comment in the first 2 lines of a source file may contain an
encoding specification.  This can be used to tell the reader what kind
of character set encoding is used for the file.  This only works for a
character encoding that is compatible with ASCII (in the sense that if
the high-order bit is clear then it’s an ASCII character), and only if
there are no non-ASCII characters in the lines up to and including the
encoding specification.  A basic example is:
     ;; -*- coding: utf-8 -*-
   In general any string that matches the following regular expression
works:
     coding[:=]\s*([-a-zA-Z0-9]+)

Libraries
---------

A PROGRAM-UNIT may contain LIBRARY-DEFINITIONS.  In addition, any
STATEMENTS in PROGRAM-UNIT comprise an “implicit library”, in that it
can be given a name, and referenced from other libraries.  Certain names
defined in the PROGRAM-UNIT can be exported, and then they can be
imported by other libraries.  For more information *note Module
classes::.

   It is recommended but not required that:
   • There should be at most one LIBRARY-DEFINITION in a PROGRAM-UNIT.
   • The LIBRARY-NAME of the LIBRARY-DEFINITION should match the name of
     the source file.  For example:
          (define-library (foo bar) ...)
     should be in a file named ‘foo/bar.scm’.
   • If there is a LIBRARY-DEFINITION, there should be no extra
     STATEMENTS - i.e. no implicit library definition.  (It is disallowed
     to ‘export’ any definitions from the implicit library if there is
     also a LIBRARY-DEFINITION.)
   Following these recommendations makes it easier to locate and
organize libraries.  However, having multiple libraries in a single
PROGRAM-UNIT is occasionally useful for source distribution and for
testing.
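
   A minimal sketch following these recommendations; the library name
‘(foo bar)’ and the procedure ‘twice’ are illustrative.  This would be
the entire contents of ‘foo/bar.scm’:

     (define-library (foo bar)
       (export twice)
       (import (scheme base))
       (begin
         (define (twice x) (* 2 x))))

   Another PROGRAM-UNIT could then use it with:

     (import (foo bar))
     (twice 21)    ⇒ 42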

Bodies
------

The BODY of a ‘lambda’, ‘let’, ‘let*’, ‘let-values’, ‘let*-values’,
‘letrec’, or ‘letrec*’ expression, or that of a definition with a body
consists of zero or more definitions or expressions followed by a final
expression.  (Standard Scheme requires that all definitions precede all
expressions.)

     BODY ::= STATEMENT^{*}

   Each identifier defined by a definition is local to the BODY.  That
is, the identifier is bound, and the region of the binding is the entire
BODY.  Example:

     (let ((x 5))
       (define foo (lambda (y) (bar x y)))
       (define bar (lambda (a b) (+ (* a b) a)))
       (foo (+ x 3)))
     ⇒ 45

   When ‘begin’, ‘let-syntax’, or ‘letrec-syntax’ forms occur in a body
prior to the first expression, they are spliced into the body.  Some or
all of the body, including portions wrapped in ‘begin’, ‘let-syntax’, or
‘letrec-syntax’ forms, may be specified by a macro use.

   An expanded BODY containing variable definitions can be converted
into an equivalent ‘letrec*’ expression.  (If there is a definition
following expressions you may need to convert the expressions to dummy
definitions.)  For example, the ‘let’ expression in the above example is
equivalent to

     (let ((x 5))
       (letrec* ((foo (lambda (y) (bar x y)))
                 (bar (lambda (a b) (+ (* a b) a))))
         (foo (+ x 3))))
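
   As a sketch of the dummy-definition conversion mentioned above (the
names ‘f’ and ‘dummy’ are illustrative), a body such as

     (define (f x)
       (display "computing")
       (define y (* x x))
       (+ y 1))

can be rewritten as

     (define (f x)
       (letrec* ((dummy (display "computing"))
                 (y (* x x)))
         (+ y 1)))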


File: kawa.info,  Node: Syntax and conditional compilation,  Next: Macros,  Prev: Bodies,  Up: Syntax

7.9 Syntax and conditional compilation
======================================

Feature testing
---------------

 -- Syntax: cond-expand COND-EXPAND-CLAUSE^{*} [‘(else’
          COMMAND-OR-DEFINITION^{*}‘)’]
          COND-EXPAND-CLAUSE ::= ‘(’FEATURE-REQUIREMENT COMMAND-OR-DEFINITION^{*}‘)’
          FEATURE-REQUIREMENT ::= FEATURE-IDENTIFIER
            | ‘(and’ FEATURE-REQUIREMENT^{*}‘)’
            | ‘(or’ FEATURE-REQUIREMENT^{*}‘)’
            | ‘(not’ FEATURE-REQUIREMENT‘)’
            | ‘(library’ LIBRARY-NAME‘)’
          FEATURE-IDENTIFIER ::= a symbol which is the name or alias of a SRFI

     The ‘cond-expand’ form tests for the existence of features at
     macro-expansion time.  It either expands into the body of one of
     its clauses or signals an error during syntactic processing.
     ‘cond-expand’ expands into the body of the first clause whose
     feature requirement is currently satisfied; the ‘else’ clause, if
     present, is selected if none of the previous clauses is selected.

     The implementation has a set of feature identifiers which are
     “present”, as well as a set of libraries which can be imported.
     The value of a FEATURE-REQUIREMENT is determined by replacing each
     FEATURE-IDENTIFIER by ‘#t’ if it is present (and ‘#f’ otherwise);
     replacing ‘(library LIBRARY-NAME)’ by ‘#t’ if LIBRARY-NAME is
     importable (and ‘#f’ otherwise); and then evaluating the resulting
     expression as a Scheme boolean expression under the normal
     interpretation of ‘and’, ‘or’, and ‘not’.

     Examples:
          (cond-expand
              ((and srfi-1 srfi-10)
               (write 1))
              ((or srfi-1 srfi-10)
               (write 2))
              (else))

          (cond-expand
            (command-line
             (define (program-name) (car (argv)))))

     The second example assumes that ‘command-line’ is an alias for some
     feature which gives access to command line arguments.  Note that an
     error will be signaled at macro-expansion time if this feature is
     not present.

     You can use ‘java-6’, ‘java-7’, ‘java-8’, or ‘java-9’ to check if
     the underlying Java is a specific version or newer.  For example
     the name ‘java-7’ matches for either Java 7, Java 8, or newer, as
     reported by ‘System’ property ‘"java.version"’.

     You can use ‘class-exists:CLASSNAME’ to check if ‘CLASSNAME’ exists
     at compile-time.  The identifier ‘class-exists:org.example.MyClass’
     is roughly equivalent to the test ‘(library (org example
     MyClass))’.  (The latter has some special handling for ‘(srfi ...)’
     as well as builtin Kawa classes.)

     The feature ‘in-http-server’ is defined in *note self-configuring
     web page scripts: Self-configuring page scripts, and more
     specifically ‘in-servlet’ is defined in a *note servlet container:
     Servlets.
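
     A sketch combining the ‘class-exists:CLASSNAME’ and ‘java-N’
     feature identifiers described above; the procedure name
     ‘timestamp’ is hypothetical:
          (cond-expand
            (class-exists:java.time.Instant
             (define (timestamp) (java.time.Instant:now)))
            (java-6
             (define (timestamp) (java.util.Date)))
            (else
             (define (timestamp) #f)))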

 -- Procedure: features
     Returns a list of feature identifiers which ‘cond-expand’ treats as
     true.  This is not a complete list - for example
     ‘class-exists:CLASSNAME’ feature identifiers are not included.  It
     is an error to modify this list.  Here is an example of what
     ‘features’ might return:
          (features)  ⇒
          (complex exact-complex full-unicode java-7 java-6 kawa
           ratios srfi-0 srfi-4 srfi-6 srfi-8 srfi-9 srfi-11
           srfi-16 srfi-17 srfi-23 srfi-25 srfi-26 srfi-28 srfi-30
           srfi-39 string-normalize-unicode threads)

File inclusion
--------------

 -- Syntax: include path^{+}
 -- Syntax: include-relative path^{+}
 -- Syntax: include-ci path^{+}
     These take one or more path names expressed as string literals,
     find corresponding files, read the contents of the files in the
     specified order as if by repeated applications of ‘read’, and
     effectively replace the ‘include’ with a ‘begin’ form containing
     what was read from the files.

     You can control the search path used for ‘include’ by setting the
     ‘kawa.include.path’ property.  For example:
          $ kawa -Dkawa.include.path="|:/opt/kawa-includes"
     The special ‘"|"’ path element means to search relative to the
     directory containing the including source file.  The default search
     path is ‘"|:."’ which means to first search the directory
     containing the including source file, and then search the directory
     specified by ‘(current-path)’.

     The search path for ‘include-relative’ prepends ‘"|"’ before the
     search path used by ‘include’, so it always searches first the
     directory containing the including source file.  Note that if the
     default search path is used then ‘include’ and ‘include-relative’
     are equivalent; there is only a difference if the
     ‘kawa.include.path’ property changes the default.

     Using ‘include-ci’ is like ‘include’, except that it reads each
     file as if it began with the ‘#!fold-case’ directive.
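
     A brief sketch of the three forms; the file names are hypothetical:
          (include "helpers.scm")
          (include-relative "macros.scm" "constants.scm")
          (include-ci "LEGACY.SCM")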


File: kawa.info,  Node: Macros,  Next: Named quasi-literals,  Prev: Syntax and conditional compilation,  Up: Syntax

7.10 Macros
===========

Libraries and top-level programs can define and use new kinds of derived
expressions and definitions called _syntactic abstractions_ or _macros_.
A syntactic abstraction is created by binding a keyword to a _macro
transformer_ or, simply, _transformer_.

   The transformer determines how a use of the macro (called a _macro
use_) is transcribed into a more primitive form.

   Most macro uses have the form:

     (KEYWORD DATUM ...)
where KEYWORD is an identifier that uniquely determines the kind of
form.  This identifier is called the _syntactic keyword_, or simply
_keyword_.  The number of DATUMs and the syntax of each depends on the
syntactic abstraction.

   Macro uses can also take the form of improper lists, singleton
identifiers, or ‘set!’ forms, where the second subform of the ‘set!’ is
the keyword:

     (KEYWORD DATUM ... . DATUM)
     KEYWORD
     (set! KEYWORD DATUM)

   The ‘define-syntax’, ‘let-syntax’ and ‘letrec-syntax’ forms create
bindings for keywords, associate them with macro transformers, and
control the scope within which they are visible.

   The ‘syntax-rules’ and ‘identifier-syntax’ forms create transformers
via a pattern language.  Moreover, the ‘syntax-case’ form allows
creating transformers via arbitrary Scheme code.

   Keywords occupy the same name space as variables.  That is, within
the same scope, an identifier can be bound as a variable or keyword, or
neither, but not both, and local bindings of either kind may shadow
other bindings of either kind.

   Macros defined using ‘syntax-rules’ and ‘identifier-syntax’ are
“hygienic” and “referentially transparent” and thus preserve Scheme’s
lexical scoping.

   • If a macro transformer inserts a binding for an identifier
     (variable or keyword) not appearing in the macro use, the
     identifier is in effect renamed throughout its scope to avoid
     conflicts with other identifiers.

   • If a macro transformer inserts a free reference to an identifier,
     the reference refers to the binding that was visible where the
     transformer was specified, regardless of any local bindings that
     may surround the use of the macro.
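
   A small sketch of the first property (the macro ‘swap!’ is
illustrative).  The ‘tmp’ introduced by the transformer is in effect
renamed, so it does not capture the user’s own ‘tmp’:

     (define-syntax swap!
       (syntax-rules ()
         ((_ a b)
          (let ((tmp a))
            (set! a b)
            (set! b tmp)))))
     (let ((tmp 1) (other 2))
       (swap! tmp other)
       (list tmp other))    ⇒ (2 1)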

   Macros defined using the ‘syntax-case’ facility are also hygienic
unless ‘datum->syntax’ is used.

   Kawa supports most of the ‘syntax-case’ feature.

   Syntax definitions are valid wherever definitions are.  They have the
following form:

 -- Syntax: define-syntax keyword TRANSFORMER-SPEC
     The KEYWORD is an identifier, and TRANSFORMER-SPEC is a function
     that maps syntax forms to syntax forms, usually an instance of
     ‘syntax-rules’.  If the ‘define-syntax’ occurs at the top level,
     then the top-level syntactic environment is extended by binding the
     KEYWORD to the specified transformer, but existing references to
     any top-level binding for KEYWORD remain unchanged.  Otherwise, it
     is an “internal syntax definition”, and is local to the BODY in
     which it is defined.

          (let ((x 1) (y 2))
             (define-syntax swap!
               (syntax-rules ()
                 ((swap! a b)
                  (let ((tmp a))
                    (set! a b)
                    (set! b tmp)))))
             (swap! x y)
             (list x y))  ⇒ (2 1)

     Macros can expand into definitions in any context that permits
     them.  However, it is an error for a definition to define an
     identifier whose binding has to be known in order to determine the
     meaning of the definition itself, or of any preceding definition
     that belongs to the same group of internal definitions.

 -- Syntax: define-syntax-case name ‘(’literals‘)’ ‘(’pattern expr‘)’
          ...
     A convenience macro to make it easy to define ‘syntax-case’-style
     macros.  Defines a macro with the given NAME and list of LITERALS.
     Each PATTERN has the form of a ‘syntax-rules’-style pattern, and it
     is matched against the macro invocation syntax form.  When a match
     is found, the corresponding EXPR is evaluated.  It must evaluate to
     a syntax form, which replaces the macro invocation.
          (define-syntax-case macro-name (literals)
            (pat1 result1)
            (pat2 result2))
     is equivalent to:
          (define-syntax macro-name
            (lambda (form)
              (syntax-case form (literals)
                (pat1 result1)
                (pat2 result2))))
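
     A concrete sketch (the macro name ‘my-unless’ is illustrative):
          (define-syntax-case my-unless ()
            ((_ test body ...)
             #'(if test #f (begin body ...))))
          (my-unless (> 1 2) 'ok)    ⇒ ok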

 -- Syntax: define-macro ‘(’name lambda-list‘)’ form ...
     _This form is deprecated._  Functionally equivalent to ‘defmacro’.

 -- Syntax: defmacro name lambda-list form ...
     _This form is deprecated._  Instead of
          (defmacro (NAME ...)
            (let ... `(... ,EXP ...)))
     you should probably do:
          (define-syntax-case NAME ()
            ((_ ...) (let #`(... #,EXP ...))))
     and instead of
          (defmacro (NAME ... VAR ...) `(... VAR ...))
     you should probably do:
          (define-syntax-case NAME ()
            ((_ ... VAR ...) #`(... VAR ...)))

     Defines an old-style macro a la Common Lisp, and installs ‘(lambda
     LAMBDA-LIST FORM ...)’ as the expansion function for NAME.  When
     the translator sees an application of NAME, the expansion function
     is called with the rest of the application as the actual arguments.
     The resulting object must be a Scheme source form that is further
     processed (it may be repeatedly macro-expanded).

 -- Procedure: gentemp
     Returns a new (interned) symbol each time it is called.  The symbol
     names are implementation-dependent.  (This is not directly
     macro-related, but is often used in conjunction with ‘defmacro’ to
     get a fresh unique identifier.)

 -- Procedure: expand form
     The result of evaluating FORM is treated as a Scheme expression,
     syntax-expanded to internal form, and then converted back to
     (roughly) the equivalent expanded Scheme form.

     This can be useful for debugging macros.

     To access this function, you must first ‘(require 'syntax-utils)’.
          (require 'syntax-utils)
          (expand '(cond ((> x y) 0) (else 1))) ⇒ (if (> x y) 0 1)

7.10.1 Pattern language
-----------------------

A TRANSFORMER-SPEC is an expression that evaluates to a transformer
procedure, which takes an input form and returns a resulting form.  You
can do general macro-time compilation with such a procedure, commonly
using ‘syntax-case’ (which is documented in the R6RS library
specification).  However, when possible it is better to use the simpler
pattern language of ‘syntax-rules’:

     TRANSFORMER-SPEC ::=
       ‘(syntax-rules (’ TR-LITERAL^{*} ‘)’ SYNTAX-RULE^{*}‘)’
       | ‘(syntax-rules’ ELLIPSIS ‘(’ TR-LITERAL^{*} ‘)’ SYNTAX-RULE^{*}‘)’
       | EXPRESSION
     SYNTAX-RULE ::= ‘(’LIST-PATTERN SYNTAX-TEMPLATE‘)’
     TR-LITERAL ::= IDENTIFIER
     ELLIPSIS ::= IDENTIFIER

   An instance of ‘syntax-rules’ produces a new macro transformer by
specifying a sequence of hygienic rewrite rules.  A use of a macro whose
keyword is associated with a transformer specified by ‘syntax-rules’ is
matched against the patterns contained in the SYNTAX-RULEs beginning
with the leftmost syntax rule.  When a match is found, the macro use is
transcribed hygienically according to the template.  The optional
ELLIPSIS specifies a symbol used to indicate repetition; it defaults to
‘...’ (3 periods).
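
   A sketch of the optional ELLIPSIS argument; the macro name ‘my-list’
and the choice of ‘***’ as the ellipsis identifier are illustrative:

     (define-syntax my-list
       (syntax-rules *** ()
         ((_ x ***) (list x ***))))
     (my-list 1 2 3)       ; transcribed to (list 1 2 3)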

     SYNTAX-PATTERN ::=
       IDENTIFIER | CONSTANT | LIST-PATTERN | VECTOR-PATTERN
     LIST-PATTERN ::= ‘(’ SYNTAX-PATTERN^{*} ‘)’
       | ‘(’ SYNTAX-PATTERN SYNTAX-PATTERN^{*} ‘.’ SYNTAX-PATTERN ‘)’
       | ‘(’ SYNTAX-PATTERN^{*} SYNTAX-PATTERN ELLIPSIS SYNTAX-PATTERN^{*} ‘)’
       | ‘(’ SYNTAX-PATTERN^{*} SYNTAX-PATTERN ELLIPSIS SYNTAX-PATTERN^{*} ‘.’ SYNTAX-PATTERN‘)’
     VECTOR-PATTERN ::= ‘#(’ SYNTAX-PATTERN^{*} ‘)’
       | ‘#(’ SYNTAX-PATTERN^{*} SYNTAX-PATTERN ELLIPSIS SYNTAX-PATTERN^{*} ‘)’

   An identifier appearing within a pattern can be an underscore (‘_’),
a literal identifier listed in the list of TR-LITERALs, or the ELLIPSIS.
All other identifiers appearing within a pattern are pattern variables.

   The outer LIST-PATTERN of the pattern in a SYNTAX-RULE must start with
an identifier.  It is not involved in the matching and is considered
neither a pattern variable nor a literal identifier.

   Pattern variables match arbitrary input elements and are used to
refer to elements of the input in the template.  It is an error for the
same pattern variable to appear more than once in a SYNTAX-PATTERN.

   Underscores also match arbitrary input elements but are not pattern
variables and so cannot be used to refer to those elements.  If an
underscore appears in the literals list, then that takes precedence and
underscores in the pattern match as literals.  Multiple underscores can
appear in a SYNTAX-PATTERN.

   Identifiers that appear in ‘(TR-LITERAL^{*})’ are interpreted as
literal identifiers to be matched against corresponding elements of the
input.  An element in the input matches a literal identifier if and only
if it is an identifier and either both its occurrence in the macro
expression and its occurrence in the macro definition have the same
lexical binding, or the two identifiers are the same and both have no
lexical binding.

   A subpattern followed by ellipsis can match zero or more elements of
the input, unless ellipsis appears in the literals, in which case it is
matched as a literal.

   More formally, an input expression E matches a pattern P if and only
if:
   • P is an underscore (‘_’); or
   • P is a non-literal identifier; or
   • P is a literal identifier and E is an identifier with the same
     binding; or
   • P is a list ‘(’P_{1} ...  P_{N}‘)’ and E is a list of N elements
     that match P_{1} through P_{N}, respectively; or
   • P is an improper list ‘(’P_{1} ...  P_{N} ‘.’  P_{N+1}‘)’ and E is
     a list or improper list of N or more elements that match P_{1}
     through P_{N}, respectively, and whose Nth tail matches P_{N+1}; or
   • P is of the form ‘(’P_{1} ...  P_{K} P_{E} ELLIPSIS P_{K+1} ...
     P_{K+L}‘)’ where E is a proper list of N elements, the first K of
     which match P_{1} through P_{K}, respectively, whose next N-K-L
     elements each match P_{E}, and whose remaining L elements match
     P_{K+1} through P_{K+L}; or
   • P is of the form ‘(’P_{1} ...  P_{K} P_{E} ELLIPSIS P_{K+1} ...
     P_{K+L} ‘.’  P_{X}‘)’ where E is a list or improper list of N
     elements, the first K of which match P_{1} through P_{K}, whose
     next N-K-L elements each match P_{E}, and whose remaining L
     elements match P_{K+1} through P_{K+L}, and whose Nth and final
     ‘cdr’ matches P_{X}; or
   • P is a vector of the form ‘#(’P_{1} ...  P_{N}‘)’ and E is a vector
     of N elements that match P_{1} through P_{N}; or
   • P is of the form ‘#(’P_{1} ...  P_{K} P_{E} ELLIPSIS P_{K+1} ...
     P_{K+L}‘)’ where E is a vector of N elements the first K of which
     match P_{1} through P_{K}, whose next N-K-L elements each match
     P_{E}, and whose remaining L elements match P_{K+1} through
     P_{K+L}; or
   • P is a constant and E is equal to P in the sense of the ‘equal?’
     procedure.
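
   The rule for an ellipsis followed by further patterns can be
exercised with a short sketch (the macro name ‘last-of’ is
hypothetical).  The input form has N = 4 elements: the first (K = 1) is
the ignored keyword position, the next N-K-L = 2 match ‘first’, and the
remaining L = 1 matches ‘last’:

     (define-syntax last-of
       (syntax-rules ()
         ((_ first ... last) 'last)))

     (last-of a b c)    ⇒ c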

   It is an error to use a macro keyword, within the scope of its
binding, in an expression that does not match any of the patterns.
     SYNTAX-TEMPLATE ::= IDENTIFIER | CONSTANT
        | ‘(’TEMPLATE-ELEMENT^{*}‘)’
        | ‘(’TEMPLATE-ELEMENT TEMPLATE-ELEMENT^{*} ‘.’ SYNTAX-TEMPLATE ‘)’
        | ‘(’ ELLIPSIS SYNTAX-TEMPLATE‘)’
     TEMPLATE-ELEMENT ::= SYNTAX-TEMPLATE [ELLIPSIS]

   When a macro use is transcribed according to the template of the
matching SYNTAX-RULE, pattern variables that occur in the template are
replaced by the elements they match in the input.  Pattern variables
that occur in subpatterns followed by one or more instances of the
identifier ELLIPSIS are allowed only in subtemplates that are followed
by as many instances of ELLIPSIS.  They are replaced in the output by
all of the elements they match in the input, distributed as indicated.
It is an error if the output cannot be built up as specified.

   Identifiers that appear in the template but are not pattern variables
or the identifier ELLIPSIS are inserted into the output as literal
identifiers.  If a literal identifier is inserted as a free identifier
then it refers to the binding of that identifier within whose scope the
instance of ‘syntax-rules’ appears.  If a literal identifier is inserted
as a bound identifier then it is in effect renamed to prevent
inadvertent captures of free identifiers.
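
   The effect of this renaming can be seen in a short sketch based on
the classic ‘or’ macro (the name ‘my-or’ is used here only for
illustration):

     (define-syntax my-or
       (syntax-rules ()
         ((_) #f)
         ((_ e) e)
         ((_ e1 e2 ...)
          (let ((temp e1))      ; temp is introduced by the template
            (if temp temp (my-or e2 ...))))))

     (let ((temp 5))
       (my-or #f temp))    ⇒ 5

   The ‘temp’ bound by the template is renamed, so it does not capture
the ‘temp’ supplied in the macro use.  Without renaming, the result
would be ‘#f’.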

   A template of the form ‘(’ELLIPSIS TEMPLATE‘)’ is identical to
TEMPLATE, except that ELLIPSES within the template have no special
meaning.  That is, any ELLIPSES contained within TEMPLATE are treated as
ordinary identifiers.  In particular, the template ‘(’ELLIPSIS
ELLIPSIS‘)’ produces a single ELLIPSIS.  This allows syntactic
abstractions to expand into code containing ellipses.

     (define-syntax be-like-begin
       (syntax-rules ()
         ((be-like-begin name)
          (define-syntax name
            (syntax-rules ()
              ((name expr (... ...))
               (begin expr (... ...))))))))

     (be-like-begin sequence)
     (sequence 1 2 3 4) ⇒ 4

7.10.2 Identifier predicates
----------------------------

 -- Procedure: identifier? OBJ
     Return ‘#t’ if OBJ is an identifier, i.e., a syntax object
     representing an identifier, and ‘#f’ otherwise.

     The ‘identifier?’ procedure is often used within a fender to verify
     that certain subforms of an input form are identifiers, as in the
     definition of ‘rec’, which creates self-contained recursive
     objects, below.

          (define-syntax rec
            (lambda (x)
              (syntax-case x ()
                ((_ x e)
                 (identifier? #'x)
                 #'(letrec ((x e)) x)))))

          (map (rec fact
                 (lambda (n)
                   (if (= n 0)
                       1
                       (* n (fact (- n 1))))))
               '(1 2 3 4 5))    ⇒ (1 2 6 24 120)

          (rec 5 (lambda (x) x))  ⇒ exception

   The procedures ‘bound-identifier=?’ and ‘free-identifier=?’ each take
two identifier arguments and return ‘#t’ if their arguments are
equivalent and ‘#f’ otherwise.  These predicates are used to compare
identifiers according to their _intended use_ as free references or
bound identifiers in a given context.

 -- Procedure: bound-identifier=? ID1 ID2
     ID1 and ID2 must be identifiers.

     The procedure ‘bound-identifier=?’ returns ‘#t’ if a binding for
     one would capture a reference to the other in the output of the
     transformer, assuming that the reference appears within the scope
     of the binding, and ‘#f’ otherwise.

     In general, two identifiers are ‘bound-identifier=?’ only if both
     are present in the original program or both are introduced by the
     same transformer application (perhaps implicitly, see
     ‘datum->syntax’).

     The ‘bound-identifier=?’ procedure can be used for detecting
     duplicate identifiers in a binding construct or for other
     preprocessing of a binding construct that requires detecting
     instances of the bound identifiers.

 -- Procedure: free-identifier=? ID1 ID2
     ID1 and ID2 must be identifiers.

     The ‘free-identifier=?’ procedure returns ‘#t’ if and only if the
     two identifiers would resolve to the same binding if both were to
     appear in the output of a transformer outside of any bindings
     inserted by the transformer.  (If neither of two like-named
     identifiers resolves to a binding, i.e., both are unbound, they are
     considered to resolve to the same binding.)

     Operationally, two identifiers are considered equivalent by
     ‘free-identifier=?’ if and only if the topmost matching
     substitution for each maps to the same binding or the identifiers
     have the same name and no matching substitution.

     The ‘syntax-case’ and ‘syntax-rules’ forms internally use
     ‘free-identifier=?’ to compare identifiers listed in the literals
     list against input identifiers.

          (let ((fred 17))
            (define-syntax a
              (lambda (x)
                (syntax-case x ()
                  ((_ id) #'(b id fred)))))
            (define-syntax b
              (lambda (x)
                (syntax-case x ()
                  ((_ id1 id2)
                   #`(list
                       #,(free-identifier=? #'id1 #'id2)
                       #,(bound-identifier=? #'id1 #'id2))))))
            (a fred))
              ⇒ (#t #f)

     The following definition of unnamed ‘let’ uses ‘bound-identifier=?’
     to detect duplicate identifiers.

          (define-syntax let
            (lambda (x)
              (define unique-ids?
                (lambda (ls)
                  (or (null? ls)
                      (and (let notmem? ((x (car ls)) (ls (cdr ls)))
                             (or (null? ls)
                                 (and (not (bound-identifier=? x (car ls)))
                                      (notmem? x (cdr ls)))))
                           (unique-ids? (cdr ls))))))
              (syntax-case x ()
                ((_ ((i v) ...) e1 e2 ...)
                 (unique-ids? #'(i ...))
                 #'((lambda (i ...) e1 e2 ...) v ...)))))

     The argument ‘#'(i ...)’ to ‘unique-ids?’ is guaranteed to be a
     list by the rules given in the description of ‘syntax’ above.

     With this definition of ‘let’:

          (let ((a 3) (a 4)) (+ a a))    ⇒ syntax error

     However,
          (let-syntax
            ((dolet (lambda (x)
                      (syntax-case x ()
                        ((_ b)
                         #'(let ((a 3) (b 4)) (+ a b)))))))
            (dolet a))
          ⇒ 7

     since the identifier ‘a’ introduced by ‘dolet’ and the identifier
     ‘a’ extracted from the input form are not ‘bound-identifier=?’.

     Rather than including ‘else’ in the literals list as before, this
     version of ‘case’ explicitly tests for ‘else’ using
     ‘free-identifier=?’.

          (define-syntax case
            (lambda (x)
              (syntax-case x ()
                ((_ e0 ((k ...) e1 e2 ...) ...
                    (else-key else-e1 else-e2 ...))
                 (and (identifier? #'else-key)
                      (free-identifier=? #'else-key #'else))
                 #'(let ((t e0))
                     (cond
                      ((memv t '(k ...)) e1 e2 ...)
                      ...
                      (else else-e1 else-e2 ...))))
                ((_ e0 ((ka ...) e1a e2a ...)
                    ((kb ...) e1b e2b ...) ...)
                 #'(let ((t e0))
                     (cond
                      ((memv t '(ka ...)) e1a e2a ...)
                      ((memv t '(kb ...)) e1b e2b ...)
                      ...))))))

     With either definition of ‘case’, ‘else’ is not recognized as an
     auxiliary keyword if an enclosing lexical binding for ‘else’
     exists.  For example,

          (let ((else #f))
            (case 0 (else (write "oops"))))    ⇒ syntax error

     since ‘else’ is bound lexically and is therefore not the same
     ‘else’ that appears in the definition of ‘case’.

7.10.3 Syntax-object and datum conversions
------------------------------------------

 -- Procedure: syntax->datum SYNTAX-OBJECT
 -- Deprecated procedure: syntax-object->datum SYNTAX-OBJECT
     Strips all syntactic information from a syntax object and returns
     the corresponding Scheme datum.

     Identifiers stripped in this manner are converted to their symbolic
     names, which can then be compared with ‘eq?’.  Thus, a predicate
     ‘symbolic-identifier=?’ might be defined as follows.

          (define symbolic-identifier=?
            (lambda (x y)
              (eq? (syntax->datum x)
                   (syntax->datum y))))

 -- Procedure: datum->syntax TEMPLATE-ID DATUM [SRCLOC]
 -- Deprecated procedure: datum->syntax-object TEMPLATE-ID DATUM
     TEMPLATE-ID must be a template identifier and DATUM should be a
     datum value.

     The ‘datum->syntax’ procedure returns a syntax-object
     representation of DATUM that contains the same contextual
     information as TEMPLATE-ID, with the effect that the syntax object
     behaves as if it were introduced into the code when TEMPLATE-ID was
     introduced.

     If SRCLOC is specified (and neither ‘#f’ nor ‘#!null’), it specifies
     the file position (including line number) for the result.  In that
     case it should be a syntax object representing a list; otherwise it
     is currently ignored, though future extensions may support other
     ways of specifying the position.

     The ‘datum->syntax’ procedure allows a transformer to “bend”
     lexical scoping rules by creating _implicit identifiers_ that
     behave as if they were present in the input form, thus permitting
     the definition of macros that introduce visible bindings for or
     references to identifiers that do not appear explicitly in the
     input form.  For example, the following defines a ‘loop’ expression
     that uses this controlled form of identifier capture to bind the
     variable ‘break’ to an escape procedure within the loop body.  (The
     derived ‘with-syntax’ form is like ‘let’ but binds pattern
     variables.)

          (define-syntax loop
            (lambda (x)
              (syntax-case x ()
                ((k e ...)
                 (with-syntax
                     ((break (datum->syntax #'k 'break)))
                   #'(call-with-current-continuation
                       (lambda (break)
                         (let f () e ... (f)))))))))

          (let ((n 3) (ls '()))
            (loop
              (if (= n 0) (break ls))
              (set! ls (cons 'a ls))
              (set! n (- n 1))))
          ⇒ (a a a)

     Were ‘loop’ to be defined as:

          (define-syntax loop
            (lambda (x)
              (syntax-case x ()
                ((_ e ...)
                 #'(call-with-current-continuation
                     (lambda (break)
                       (let f () e ... (f))))))))

     the variable ‘break’ would not be visible in ‘e ...’.

     The datum argument DATUM may also represent an arbitrary Scheme
     form, as demonstrated by the following definition of ‘include’.

          (define-syntax include
            (lambda (x)
              (define read-file
                (lambda (fn k)
                  (let ((p (open-file-input-port fn)))
                    (let f ((x (get-datum p)))
                      (if (eof-object? x)
                          (begin (close-port p) '())
                          (cons (datum->syntax k x)
                                (f (get-datum p))))))))
              (syntax-case x ()
                ((k filename)
                 (let ((fn (syntax->datum #'filename)))
                   (with-syntax (((exp ...)
                                  (read-file fn #'k)))
                     #'(begin exp ...)))))))

     ‘(include "filename")’ expands into a ‘begin’ expression containing
     the forms found in the file named by ‘"filename"’.  For example, if
     the file ‘flib.ss’ contains:

          (define f (lambda (x) (g (* x x))))

     and the file ‘glib.ss’ contains:

          (define g (lambda (x) (+ x x)))

     the expression:

          (let ()
            (include "flib.ss")
            (include "glib.ss")
            (f 5))

     evaluates to ‘50’.

     The definition of ‘include’ uses ‘datum->syntax’ to convert the
     objects read from the file into syntax objects in the proper
     lexical context, so that identifier references and definitions
     within those expressions are scoped where the ‘include’ form
     appears.

     Using ‘datum->syntax’, it is even possible to break hygiene
     entirely and write macros in the style of old Lisp macros.  The
     ‘lisp-transformer’ procedure defined below creates a transformer
     that converts its input into a datum, calls the programmer’s
     procedure on this datum, and converts the result back into a syntax
     object scoped where the original macro use appeared.

          (define lisp-transformer
            (lambda (p)
              (lambda (x)
                (syntax-case x ()
                  ((kwd . rest)
                   (datum->syntax #'kwd
                     (p (syntax->datum x))))))))

7.10.4 Signaling errors in macro transformers
---------------------------------------------

 -- Syntax: syntax-error message args^{*}
     The MESSAGE and ARGS are treated similarly to those of the ‘error’
     procedure.  However, the error is reported when the ‘syntax-error’
     is expanded.  This can be used as a ‘syntax-rules’ template for a
     pattern that is an invalid use of the macro, which can provide more
     descriptive error messages.  The MESSAGE should be a string
     literal, and the ARGS arbitrary (non-evaluated) expressions
     providing additional information.

          (define-syntax simple-let
            (syntax-rules ()
              ((_ (head ... ((x . y) val) . tail)
                 body1 body2 ...)
               (syntax-error "expected an identifier but got" (x . y)))
              ((_ ((name val) ...) body1 body2 ...)
               ((lambda (name ...) body1 body2 ...)
                val ...))))
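
     As a sketch of the intended use of this definition: a well-formed
     use falls through to the second rule, while a binding whose name
     position holds a pair matches the first rule and is reported
     during expansion.

          (simple-let ((a 1) (b 2)) (+ a b))      ⇒ 3
          (simple-let ((a 1) ((b) 2)) (+ a b))    ⇒ syntax error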

 -- Procedure: report-syntax-error location message
     This is a procedure that can be called at macro-expansion time by a
     syntax transformer function.  (In contrast ‘syntax-error’ is a
     syntax form used in the expansion result.)  The MESSAGE is reported
     as a compile-time error message.  The LOCATION is used for the
     source location (file name and line/column numbers): In general it
     can be a ‘SourceLocator’ value; most commonly it is a syntax object
     for a sub-list of the input form that is erroneous.  The value
     returned by ‘report-syntax-error’ is an instance of ‘ErrorExp’,
     which suppresses further compilation.

          (define-syntax if
            (lambda (x)
              (syntax-case x ()
                           ((_ test then)
                            (make-if-exp #'test #'then #!null))
                           ((_ test then else)
                            (make-if-exp #'test #'then #'else))
                           ((_ e1 e2 e3 . rest)
                            (report-syntax-error #'rest
                             "too many expressions for 'if'"))
                           ((_ . rest)
                            (report-syntax-error #'rest
                             "too few expressions for 'if'")))))
     In the above example, one could use the source form ‘x’ for the
     location, but using ‘#'rest’ is more accurate.  Note that the
     following is incorrect, because ‘e1’ might not be a pair, in which
     case we don’t have location information for it (due to a Kawa
     limitation):
              (syntax-case x ()
                           ...
                           ((_ e1)
                            (report-syntax-error
                             #'e1 ;; poor location specifier
                             "too few expressions for 'if'")))))

7.10.5 Convenience forms
------------------------

 -- Syntax: with-syntax ((PATTERN EXPRESSION) ...) BODY
     The ‘with-syntax’ form is used to bind pattern variables, just as
     ‘let’ is used to bind variables.  This allows a transformer to
     construct its output in separate pieces, then put the pieces
     together.

     Each PATTERN is identical in form to a ‘syntax-case’ pattern.  The
     value of each EXPRESSION is computed and destructured according to
     the corresponding PATTERN, and pattern variables within the PATTERN
     are bound as with ‘syntax-case’ to the corresponding portions of
     the value within BODY.
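
     As a minimal sketch (the macro name ‘twice-sum’ is invented for
     this illustration), ‘with-syntax’ can name an intermediate piece
     of output before it is placed in the final template:

          (define-syntax twice-sum
            (lambda (x)
              (syntax-case x ()
                ((_ a b)
                 ;; bind the pattern variable sum to the form (+ a b)
                 (with-syntax ((sum #'(+ a b)))
                   #'(* 2 sum))))))

          (twice-sum 3 4)    ⇒ 14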

     The ‘with-syntax’ form may be defined in terms of ‘syntax-case’ as
     follows.

          (define-syntax with-syntax
            (lambda (x)
              (syntax-case x ()
                ((_ ((p e0) ...) e1 e2 ...)
                 (syntax (syntax-case (list e0 ...) ()
                           ((p ...) (let () e1 e2 ...))))))))

     The following definition of ‘cond’ demonstrates the use of
     ‘with-syntax’ to support transformers that employ recursion
     internally to construct their output.  It handles all ‘cond’ clause
     variations and takes care to produce one-armed ‘if’ expressions
     where appropriate.

          (define-syntax cond
            (lambda (x)
              (syntax-case x ()
                ((_ c1 c2 ...)
                 (let f ((c1 #'c1) (c2* #'(c2 ...)))
                   (syntax-case c2* ()
                     (()
                      (syntax-case c1 (else =>)
                        ((else e1 e2 ...) #'(begin e1 e2 ...))
                        ((e0) #'e0)
                        ((e0 => e1)
                         #'(let ((t e0)) (if t (e1 t))))
                        ((e0 e1 e2 ...)
                         #'(if e0 (begin e1 e2 ...)))))
                     ((c2 c3 ...)
                      (with-syntax ((rest (f #'c2 #'(c3 ...))))
                        (syntax-case c1 (=>)
                          ((e0) #'(let ((t e0)) (if t t rest)))
                          ((e0 => e1)
                           #'(let ((t e0)) (if t (e1 t) rest)))
                          ((e0 e1 e2 ...)
                           #'(if e0
                                  (begin e1 e2 ...)
                                  rest)))))))))))

 -- Syntax: quasisyntax TEMPLATE
 -- Auxiliary Syntax: unsyntax
 -- Auxiliary Syntax: unsyntax-splicing
     The ‘quasisyntax’ form is similar to ‘syntax’, but it allows parts
     of the quoted text to be evaluated, in a manner similar to the
     operation of ‘quasiquote’.

     Within a ‘quasisyntax’ TEMPLATE, subforms of ‘unsyntax’ and
     ‘unsyntax-splicing’ forms are evaluated, and everything else is
     treated as ordinary template material, as with ‘syntax’.

     The value of each ‘unsyntax’ subform is inserted into the output in
     place of the ‘unsyntax’ form, while the value of each
     ‘unsyntax-splicing’ subform is spliced into the surrounding list or
     vector structure.  Uses of ‘unsyntax’ and ‘unsyntax-splicing’ are
     valid only within ‘quasisyntax’ expressions.
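
     The following sketch shows both forms (the macro names
     ‘const-square’ and ‘rev-list’ are hypothetical).  The first
     computes a value at expansion time and inserts it with ‘#,’; the
     second splices a reversed list of subforms with ‘#,@’.

          (define-syntax const-square
            (lambda (x)
              (syntax-case x ()
                ((_ n)
                 (number? (syntax->datum #'n))   ; fender: numeric literals only
                 (let ((v (syntax->datum #'n)))
                   ;; 'n is ordinary template material; (* v v) is evaluated
                   #`(list 'n #,(* v v)))))))

          (const-square 7)    ⇒ (7 49)

          (define-syntax rev-list
            (lambda (x)
              (syntax-case x ()
                ((_ e ...)
                 ;; #'(e ...) is a list of syntax objects, so reverse applies
                 #`(list #,@(reverse #'(e ...)))))))

          (rev-list 1 2 3)    ⇒ (3 2 1)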

     A ‘quasisyntax’ expression may be nested, with each ‘quasisyntax’
     introducing a new level of syntax quotation and each ‘unsyntax’ or
     ‘unsyntax-splicing’ taking away a level of quotation.  An
     expression nested within _n_ ‘quasisyntax’ expressions must be
     within _n_ _unsyntax_ or ‘unsyntax-splicing’ expressions to be
     evaluated.

     As noted in ABBREVIATION, ‘#`TEMPLATE’ is equivalent to
     ‘(quasisyntax TEMPLATE)’, ‘#,TEMPLATE’ is equivalent to ‘(unsyntax
     TEMPLATE)’, and ‘#,@TEMPLATE’ is equivalent to ‘(unsyntax-splicing
     TEMPLATE)’.  _Note_ that for backwards compatibility, you should
     only use ‘#,TEMPLATE’ inside a literal ‘#`TEMPLATE’ form.

     The ‘quasisyntax’ keyword can be used in place of ‘with-syntax’ in
     many cases.  For example, a definition of ‘case’ in the recursive
     style of the ‘cond’ definition shown under the description of
     ‘with-syntax’ above can be written using ‘quasisyntax’ as follows.

          (define-syntax case
            (lambda (x)
              (syntax-case x ()
                ((_ e c1 c2 ...)
                 #`(let ((t e))
                     #,(let f ((c1 #'c1) (cmore #'(c2 ...)))
                         (if (null? cmore)
                             (syntax-case c1 (else)
                               ((else e1 e2 ...)
                                #'(begin e1 e2 ...))
                               (((k ...) e1 e2 ...)
                                #'(if (memv t '(k ...))
                                      (begin e1 e2 ...))))
                             (syntax-case c1 ()
                               (((k ...) e1 e2 ...)
                                #`(if (memv t '(k ...))
                                      (begin e1 e2 ...)
                                      #,(f (car cmore)
                                            (cdr cmore))))))))))))

     _Note:_ Any ‘syntax-rules’ form can be expressed with ‘syntax-case’
     by making the ‘lambda’ expression and ‘syntax’ expressions
     explicit, and ‘syntax-rules’ may be defined in terms of
     ‘syntax-case’ as follows.

          (define-syntax syntax-rules
            (lambda (x)
              (syntax-case x ()
                ((_ (lit ...) ((k . p) t) ...)
                 (for-all identifier? #'(lit ... k ...))
                 #'(lambda (x)
                     (syntax-case x (lit ...)
                       ((_ . p) #'t) ...))))))


File: kawa.info,  Node: Named quasi-literals,  Prev: Macros,  Up: Syntax

7.11 Named quasi-literals
=========================

Traditional Scheme has only a few kinds of values, and thus only a few
builtin kinds of literals.  Modern Scheme allows defining new types, so
it is desirable to have a mechanism for defining literal values for the
new types.

   Consider the ‘*note URI: URI-type.’ type.  You can create a new
instance of a ‘URI’ using a constructor function:
     (URI "http://example.com/")
   This isn’t too bad, though the double-quote characters are an ugly
distraction.  However, if you need to construct the string it gets
messy:
     (URI (string-append base-uri "icon.png"))

   Instead you can write:
     &URI{http://example.com/}
   or:
     &URI{&[base-uri]icon.png}

   This syntax is translated by the Scheme reader to the more familiar
but more verbose equivalent forms:
     ($construct$:URI "http://example.com/")
     ($construct$:URI $<<$ base-uri $>>$ "icon.png")
   So for this to work there just needs to be a definition of
‘$construct$:URI’, usually a macro.  Normal scope rules apply; typically
you’d define ‘$construct$:URI’ in a module.

   The names ‘$<<$’ and ‘$>>$’ are bound to unique zero-length strings.
They are used to allow the implementation of ‘$construct$:URI’ to
determine which arguments are literal and which come from escaped
expressions.

   If you want to define your own ‘$construct$:TAG’, or to read
motivation and details, see the SRFI 108
(http://srfi.schemers.org/srfi-108/srfi-108.html) specification.

     EXTENDED-DATUM-LITERAL ::=
         ‘&’ CNAME ‘{’ [INITIAL-IGNORED] NAMED-LITERAL-PART^{*} ‘}’
       | ‘&’ CNAME ‘[’ EXPRESSION^{*} ‘]{’ [INITIAL-IGNORED] NAMED-LITERAL-PART^{*} ‘}’
     CNAME ::= IDENTIFIER
     NAMED-LITERAL-PART ::=
         any character except ‘&’, ‘{’ or ‘}’
       | ‘{’ NAMED-LITERAL-PART^{+} ‘}’
       | CHAR-REF
       | ENTITY-REF
       | SPECIAL-ESCAPE
       | ENCLOSED-PART
       | EXTENDED-DATUM-LITERAL


File: kawa.info,  Node: Program structure,  Next: Control features,  Prev: Syntax,  Up: Top

8 Program structure
*******************

See *note program units:: for some notes on structure of an entire
source file.

* Menu:

* Boolean values::
* Conditionals::
* Variables and Patterns::
* Definitions::
* Local binding constructs::
* Lazy evaluation::
* Repeat forms::         Repeat patterns and expressions
* Threads::
* Exceptions::           Exception handling


File: kawa.info,  Node: Boolean values,  Next: Conditionals,  Up: Program structure

8.1 Boolean values
==================

The standard boolean objects for true and false are written as ‘#t’ and
‘#f’.  Alternatively, they may be written ‘#true’ and ‘#false’,
respectively.

     BOOLEAN ::= ‘#t’ | ‘#f’ | ‘#true’ | ‘#false’

     TEST-EXPRESSION ::= EXPRESSION

   What really matters, though, are the objects that the Scheme
conditional expressions (‘if’, ‘cond’, ‘and’, ‘or’, ‘when’, ‘unless’,
‘do’) treat as true or false.  The phrase “a true value” (or sometimes
just “true”) means any object treated as true by the conditional
expressions, and the phrase “a false value” (or “false”) means any
object treated as false by the conditional expressions.  In this
document, TEST-EXPRESSION is an expression that is evaluated, but we
only care about whether the result is a true or a false value.

   Of all the standard Scheme values, only ‘#f’ counts as false in
conditional expressions.  All other Scheme values, including ‘#t’, count
as true.  A TEST-EXPRESSION is an expression evaluated in this manner
for whether it is true or false.

   In addition the null value ‘#!null’ (in Java written as ‘null’) is
also considered false.  Also, if you for some strange reason create a
fresh ‘java.lang.Boolean’ object whose ‘booleanValue()’ returns ‘false’,
that is also considered false.

   _Note:_ Unlike some other dialects of Lisp, Scheme distinguishes ‘#f’
and the empty list from each other and from the symbol ‘nil’.
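
   A few consequences of these rules, shown as a sketch (the symbols
‘true’ and ‘false’ are just markers chosen for the illustration):

     (if 0 'true 'false)         ⇒ true
     (if '() 'true 'false)       ⇒ true
     (if "" 'true 'false)        ⇒ true
     (if #!null 'true 'false)    ⇒ false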

   Boolean constants evaluate to themselves, so they do not need to be
quoted in programs.

     #t       ⇒  #t
     #true    ⇒  #t
     #f       ⇒  #f
     #false   ⇒  #f
     '#f      ⇒  #f

 -- Type: boolean
     The type of boolean values.  As a type conversion, a true value is
     converted to ‘#t’, while a false value is converted to ‘#f’.
     Represented as a primitive Java ‘boolean’ or ‘java.lang.Boolean’
     when converted to an object.

 -- Procedure: boolean? obj
     The ‘boolean?’ predicate returns ‘#t’ if OBJ is either ‘#t’ or
     ‘#f’, and returns ‘#f’ otherwise.
          (boolean? #f)   ⇒  #t
          (boolean? 0)    ⇒  #f
          (boolean? '())  ⇒  #f

 -- Procedure: boolean=? boolean1 boolean2 boolean3 ...
     Returns ‘#t’ if all the arguments are booleans and all are ‘#t’ or
     all are ‘#f’.
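
     For example, following the description above:

          (boolean=? #t #t)       ⇒  #t
          (boolean=? #f #f #f)    ⇒  #t
          (boolean=? #t #f)       ⇒  #f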


File: kawa.info,  Node: Conditionals,  Next: Variables and Patterns,  Prev: Boolean values,  Up: Program structure

8.2 Conditionals
================

Kawa Scheme has the usual conditional expression forms, such as ‘if’,
‘case’, ‘and’, and ‘or’:
     (if (> 3 2) 'yes 'no)          ⇒ yes

   Kawa also allows you to bind variables in the condition, using the ‘?’
operator.
     (if (and (? x ::integer (get-value)) (> x 0))
       (* x 10)
       'invalid)
   In the above, if ‘(get-value)’ evaluates to an integer, that integer
is bound to the variable ‘x’, which is visible in the following
sub-expression of ‘and’ as well as in the true-part of the ‘if’.

   Specifically, the first sub-expression of an ‘if’ is a TEST-OR-MATCH,
which can be a TEST-EXPRESSION, or a ‘?’ match expression, or a
combination using ‘and’:

     TEST-OR-MATCH ::= TEST-EXPRESSION
       | ‘(?’ PATTERN EXPRESSION ‘)’
       | ‘(and’ TEST-OR-MATCH^{*}‘)’

   A TEST-OR-MATCH is true if every nested TEST-EXPRESSION is true, and
every ‘?’ operation succeeds.  It produces a set of variable bindings
which is the union of the bindings produced by all the PATTERNs.  In an
‘and’ form, bindings produced by a PATTERN are visible to all subsequent
TEST-OR-MATCH sub-expressions.
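
   A small sketch of these rules, assuming a hypothetical procedure
‘describe’ (the pattern syntax follows the example above).  The match
must succeed and the nested test must be true for the bindings to be
used:

     (define (describe v)
       (if (and (? n ::integer v) (> n 0))
           (* n 10)       ; n is bound here
           'invalid))

     (describe 4)      ⇒ 40
     (describe -4)     ⇒ invalid
     (describe "x")    ⇒ invalid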

 -- Syntax: ? PATTERN EXPRESSION
     The form ‘(? P V)’ informally is true if the value of V matches the
     pattern P.  Any variables bound in P are in scope in the “true”
     path of the containing conditional.

     This has the form of an expression, but it can only be used in
     places where a TEST-OR-MATCH is required.  For example it can be
     used as the first clause of an ‘if’ expression, in which case the
     scope of the variables bound in the PATTERN includes the second
     (CONSEQUENT) sub-expression.  On the other hand, a ‘?’ form may not
     be used as an argument to a procedure application.

 -- Syntax: if TEST-OR-MATCH CONSEQUENT ALTERNATE
 -- Syntax: if TEST-OR-MATCH CONSEQUENT

          CONSEQUENT ::= EXPRESSION
          ALTERNATE ::= EXPRESSION

     An ‘if’ expression is evaluated as follows: first, the
     TEST-OR-MATCH is evaluated.  If it is true, then CONSEQUENT is
     evaluated and its values are returned.  Otherwise ALTERNATE is
     evaluated and its values are returned.  If the TEST-OR-MATCH is
     false and no ALTERNATE is specified, then the result of the
     expression is void.

          (if (> 2 3) 'yes 'no)          ⇒ no
          (if (> 3 2)
              (- 3 2)
              (+ 3 2))                   ⇒ 1
          (if #f #f)                     ⇒ #!void
          (if (? x::integer 3)
            (+ x 1)
            'invalid)                    ⇒ 4
          (if (? x::integer 3.4)
            (+ x 1)
            'invalid)                    ⇒ invalid

     The CONSEQUENT and ALTERNATE expressions are in tail context if the
     ‘if’ expression itself is.

 -- Syntax: cond COND-CLAUSE^{+}
 -- Syntax: cond COND-CLAUSE^{*} ‘(else’ EXPRESSION...‘)’

          COND-CLAUSE ::= ‘(’TEST-OR-MATCH BODY‘)’
              | ‘(’TEST ‘=>’ EXPRESSION‘)’

     A ‘cond’ expression is evaluated by evaluating the TEST-OR-MATCHs
     of successive COND-CLAUSEs in order until one of them evaluates to
     a true value.  When a TEST-OR-MATCH is true, then the
     remaining EXPRESSIONs in its COND-CLAUSE are evaluated in order,
     and the results of the last EXPRESSION in the COND-CLAUSE are
     returned as the results of the entire ‘cond’ expression.  Variables
     bound by the TEST-OR-MATCH are visible in BODY.  If the selected
     COND-CLAUSE contains only the TEST-OR-MATCH and no EXPRESSIONs,
     then the value of the last TEST-EXPRESSION is returned as the
     result.  If the selected COND-CLAUSE uses the ‘=>’ alternate form,
     then the EXPRESSION is evaluated.  Its value must be a procedure.
     This procedure should accept one argument; it is called on the
     value of the TEST-EXPRESSION and the values returned by this
     procedure are returned by the ‘cond’ expression.

     If all TEST-OR-MATCHs evaluate to ‘#f’, and there is no ‘else’
     clause, then the conditional expression returns unspecified values;
     if there is an ‘else’ clause, then its EXPRESSIONs are evaluated,
     and the values of the last one are returned.

          (cond ((> 3 2) 'greater)
                ((< 3 2) 'less))         ⇒ greater

          (cond ((> 3 3) 'greater)
                ((< 3 3) 'less)
                (else 'equal))           ⇒ equal

          (cond ('(1 2 3) => cadr)
                (else #f))               ⇒ 2

     For a COND-CLAUSE of one of the following forms:
          (TEST EXPRESSION^{*})
          (else EXPRESSION EXPRESSION^{*})

     the last EXPRESSION is in tail context if the ‘cond’ form itself
     is.  For a COND-CLAUSE of the form:

          (TEST => EXPRESSION)

     the (implied) call to the procedure that results from the
     evaluation of EXPRESSION is in tail context if the ‘cond’ form
     itself is.
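
     A common use of the ‘=>’ form is to act on the value of a lookup
     without naming it.  The following is only a sketch; the procedure
     name ‘lookup’ and the sample association list are hypothetical.

          ;; lookup is an illustrative helper, not a Kawa built-in.
          (define (lookup key alist)
            (cond ((assv key alist) => cdr)   ; cdr receives the found pair
                  (else 'not-found)))

          (lookup 2 '((1 . one) (2 . two)))   ⇒ two
          (lookup 3 '((1 . one) (2 . two)))   ⇒ not-found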

 -- Syntax: case CASE-KEY CASE-CLAUSE^{+}
 -- Syntax: case CASE-KEY CASE-CLAUSE^{*} CASE-ELSE-CLAUSE

          CASE-KEY ::= EXPRESSION
          CASE-CLAUSE ::= ‘((’DATUM^{*}‘)’ EXPRESSION^{+}‘)’
              | ‘((’DATUM^{*}‘)’ ‘=>’ EXPRESSION‘)’
          CASE-ELSE-CLAUSE ::= ‘(else’  EXPRESSION^{+}‘)’
              | ‘(else =>’ EXPRESSION‘)’

     Each DATUM is an external representation of some object.  Each
     DATUM in the entire ‘case’ expression should be distinct.

     A ‘case’ expression is evaluated as follows.

       1. The CASE-KEY is evaluated and its result is compared using
          ‘eqv?’ against the data represented by the DATUMs of each
          CASE-CLAUSE in turn, proceeding in order from left to right
          through the set of clauses.

       2. If the result of evaluating CASE-KEY is equivalent to a datum
          of a CASE-CLAUSE, the corresponding EXPRESSIONs are evaluated
          from left to right and the results of the last expression in
          the CASE-CLAUSE are returned as the results of the ‘case’
          expression.  Otherwise, the comparison process continues.

       3. If the result of evaluating CASE-KEY is different from every
          datum in each set, then if there is a CASE-ELSE-CLAUSE its
          expressions are evaluated and the results of the last are the
          results of the ‘case’ expression; otherwise the result of the
          ‘case’ expression is unspecified.

     If the selected CASE-CLAUSE or CASE-ELSE-CLAUSE uses the ‘=>’
     alternate form, then the EXPRESSION is evaluated.  It is an error
     if its value is not a procedure accepting one argument.  This
     procedure is then called on the value of the CASE-KEY and the values
     returned by this procedure are returned by the ‘case’ expression.

          (case (* 2 3)
            ((2 3 5 7) 'prime)
            ((1 4 6 8 9) 'composite))    ⇒ composite
          (case (car '(c d))
            ((a) 'a)
            ((b) 'b))                    ⇒ unspecified
          (case (car '(c d))
            ((a e i o u) 'vowel)
            ((w y) 'semivowel)
            (else => (lambda (x) x)))    ⇒ c

     The last EXPRESSION of a CASE-CLAUSE is in tail context if the
     ‘case’ expression itself is.
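
     The ‘=>’ form may also appear in an ordinary CASE-CLAUSE, not only
     in the CASE-ELSE-CLAUSE.  A small sketch, in which the procedure
     name ‘classify’ is hypothetical:

          ;; classify is an illustrative name only.
          (define (classify n)
            (case n
              ((0) 'zero)
              ((1 3 5 7 9) => (lambda (k) (list 'odd k)))
              ((2 4 6 8)   => (lambda (k) (list 'even k)))
              (else 'out-of-range)))

          (classify 0)                   ⇒ zero
          (classify 7)                   ⇒ (odd 7)
          (classify 12)                  ⇒ out-of-range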

 -- Syntax: match MATCH-KEY MATCH-CLAUSE^{+}
     The ‘match’ form is a generalization of ‘case’ using PATTERNs:
          MATCH-KEY ::= EXPRESSION
          MATCH-CLAUSE ::=
            ‘(’ PATTERN [GUARD] BODY ‘)’
     The MATCH-KEY is evaluated.  Then the MATCH-CLAUSEs are tried in
     order.  The first MATCH-CLAUSE whose PATTERN matches (and whose
     GUARD, if any, is true) is selected, and the corresponding BODY
     is evaluated.  It is an error if no MATCH-CLAUSE matches.
          (match value
            (0 (found-zero))
            (x #!if (> x 0) (found-positive x))
            (x #!if (< x 0) (found-negative x))
            (x::symbol (found-symbol x))
            (_ (found-other)))

     One ‘case’ feature is not (yet) directly supported by ‘match’:
     Matching against a list of values.  However, this is easy to
     simulate using a guard using ‘memq’, ‘memv’, or ‘member’:
          ;; compare similar example under case
          (match (car '(c d))
            (x #!if (memv x '(a e i o u)) 'vowel)
            (x #!if (memv x '(w y)) 'semivowel)
            (x x))

 -- Syntax: and TEST-OR-MATCH^{*}

     If there are no TEST-OR-MATCH forms, ‘#t’ is returned.

     If the ‘and’ is not in TEST-OR-MATCH context, then the last
     sub-expression (if any) must be a TEST-EXPRESSION, and not a ‘?’
     form.  In this case the TEST-OR-MATCH expressions are evaluated
     from left to right until either one of them is false (a
     TEST-EXPRESSION is false or a ‘?’ match fails), or the last
     TEST-EXPRESSION is reached.  In the former case, the ‘and’
     expression returns ‘#f’ without evaluating the remaining
     expressions.  In the latter case, the last expression is evaluated
     and its values are returned.

     If the ‘and’ is in TEST-OR-MATCH context, then the last sub-form
     can be a ‘?’ form.  The sub-forms are evaluated in order: if one of
     them is false, the entire ‘and’ is false; otherwise the ‘and’ is
     true.

     Regardless, any bindings made by earlier ‘?’ forms are visible in
     later TEST-OR-MATCH forms.

          (and (= 2 2) (> 2 1))          ⇒  #t
          (and (= 2 2) (< 2 1))          ⇒  #f
          (and 1 2 'c '(f g))            ⇒  (f g)
          (and)                          ⇒  #t
          (and (? x ::int 23) (> x 0))   ⇒  #t

     The ‘and’ keyword could be defined in terms of ‘if’ using
     ‘syntax-rules’ as follows:

          (define-syntax and
            (syntax-rules ()
              ((and) #t)
              ((and test) test)
              ((and test1 test2 ...)
               (if test1 (and test2 ...) #f))))

     The last TEST-EXPRESSION is in tail context if the ‘and’ expression
     itself is.
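
     Because evaluation stops at the first false sub-expression, ‘and’
     is often used to guard an operation that would otherwise be an
     error.  A minimal sketch, with the hypothetical name ‘safe-car’:

          ;; safe-car is an illustrative name only.
          (define (safe-car x)
            (and (pair? x) (car x)))     ; car is reached only for pairs

          (safe-car '(1 2))              ⇒  1
          (safe-car 'a)                  ⇒  #f
          (safe-car '())                 ⇒  #f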

 -- Syntax: or TEST-EXPRESSION ...
     If there are no TEST-EXPRESSIONs, ‘#f’ is returned.  Otherwise, the
     TEST-EXPRESSIONs are evaluated from left to right until a
     TEST-EXPRESSION returns a true value VAL or the last
     TEST-EXPRESSION is reached.  In the former case, the ‘or’
     expression returns VAL without evaluating the remaining
     expressions.  In the latter case, the last expression is evaluated
     and its values are returned.

          (or (= 2 2) (> 2 1))           ⇒ #t
          (or (= 2 2) (< 2 1))           ⇒ #t
          (or #f #f #f)                  ⇒ #f
          (or '(b c) (/ 3 0))            ⇒ (b c)

     The ‘or’ keyword could be defined in terms of ‘if’ using
     ‘syntax-rules’ as follows:

          (define-syntax or
            (syntax-rules ()
              ((or) #f)
              ((or test) test)
              ((or test1 test2 ...)
               (let ((x test1))
                 (if x x (or test2 ...))))))

     The last TEST-EXPRESSION is in tail context if the ‘or’ expression
     itself is.
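
     The ‘let’ in the definition above means that each TEST-EXPRESSION
     is evaluated at most once, even though its value is used both as
     the test and as the result.  The following sketch makes that
     visible with a side effect; ‘calls’ and ‘probe’ are hypothetical
     names.

          (define calls 0)               ; counts evaluations of (probe)
          (define (probe)
            (set! calls (+ calls 1))
            'found)

          (or (probe) 'fallback)         ⇒ found
          calls                          ⇒ 1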

 -- Procedure: not test-expression
     The ‘not’ procedure returns ‘#t’ if TEST-EXPRESSION is false, and
     returns ‘#f’ otherwise.

          (not #t)         ⇒  #f
          (not 3)          ⇒  #f
          (not (list 3))   ⇒  #f
          (not #f)         ⇒  #t
          (not '())        ⇒  #f
          (not (list))     ⇒  #f
          (not 'nil)       ⇒  #f
          (not #!null)     ⇒  #t

 -- Syntax: when TEST-EXPRESSION form...
     If TEST-EXPRESSION is true, evaluate each FORM in order, returning
     the value of the last one.

 -- Syntax: unless TEST-EXPRESSION form...
     If TEST-EXPRESSION is false, evaluate each FORM in order, returning
     the value of the last one.
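
     A short sketch of both forms follows.  The names ‘events’ and
     ‘note!’ are hypothetical helpers used only to record which forms
     were evaluated.

          (define events '())
          (define (note! msg)
            (set! events (cons msg events)))

          (when (> 3 2)
            (note! 'first)
            (note! 'second)
            'done)                       ⇒ done
          (unless (> 3 2)
            (note! 'skipped))            ; test is true: body not evaluated
          events                         ⇒ (second first)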


File: kawa.info,  Node: Variables and Patterns,  Next: Definitions,  Prev: Conditionals,  Up: Program structure

8.3 Variables and Patterns
==========================

An identifier can name either a type of syntax or a location where a
value can be stored.  An identifier that names a type of syntax is
called a “syntactic keyword” (informally called a “macro”), and is said
to be “bound” to a transformer for that syntax.  An identifier that
names a location is called a “variable” and is said to be “bound” to
that location.  The set of all visible bindings in effect at some point
in a program is known as the “environment” in effect at that point.  The
value stored in the location to which a variable is bound is called the
variable’s value.  By abuse of terminology, the variable is sometimes
said to name the value or to be bound to the value.  This is not quite
accurate, but confusion rarely results from this practice.

   Certain expression types are used to create new kinds of syntax and
to bind syntactic keywords to those new syntaxes, while other expression
types create new locations and bind variables to those locations.  These
expression types are called “binding constructs”.  Those that bind
syntactic keywords are discussed in *note Macros::.  The most
fundamental of the variable binding constructs is the *note ‘lambda’
expression: meta-lambda-expression, because all other variable binding
constructs can be explained in terms of ‘lambda’ expressions.  Other
binding constructs include the *note ‘define’ family: Definitions, and
the *note ‘let’ family: Local binding constructs.

   Scheme is a language with block structure.  To each place where an
identifier is bound in a program there corresponds a “region” of the
program text within which the binding is visible.  The region is
determined by the particular binding construct that establishes the
binding; if the binding is established by a ‘lambda’ expression, for
example, then its region is the entire ‘lambda’ expression.  Every
mention of an identifier refers to the binding of the identifier that
established the innermost of the regions containing the use.

   If there is no binding of the identifier whose region contains the
use, then the use refers to the binding for the variable in the global
environment, if any; if there is no binding for the identifier, it is
said to be “unbound”.
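
   The region rule can be illustrated with a small sketch (the names ‘x’
and ‘show-x’ are hypothetical).  The use of ‘x’ inside ‘show-x’ lies
outside the region of the ‘let’ binding, so it still refers to the
outer binding:

     (define x 'outer)
     (define (show-x) x)          ; refers to the outer binding of x

     (let ((x 'inner))            ; new binding; its region is the let body
       (list x (show-x)))         ⇒ (inner outer)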

8.3.1 Patterns
--------------

The usual way to bind variables is to match an incoming value against a
“pattern”.  The pattern contains variables that are bound to some value
derived from the value.
     (! [x::double y::double] (some-expression))
   In the above example, the pattern ‘[x::double y::double]’ is matched
against the incoming value that results from evaluating
‘(some-expression)’.  That value is required to be a two-element
sequence.  Then the sub-pattern ‘x::double’ is matched against element 0
of the sequence, which means it is coerced to a ‘double’ and then the
coerced value is matched against the sub-pattern ‘x’ (which trivially
succeeds).  Similarly, ‘y::double’ is matched against element 1.

   The syntax of patterns is a work-in-progress.  (The focus until now
has been in designing and implementing how patterns work in general,
rather than the details of the pattern syntax.)

     PATTERN ::= IDENTIFIER
       | ‘_’
       | PATTERN-LITERAL
       | ‘’’DATUM
       | PATTERN ‘::’ TYPE
       | ‘[’ LPATTERN^{*} ‘]’
     LPATTERN ::= PATTERN
       | ‘@’ PATTERN
       | PATTERN ‘...’
       | GUARD
     PATTERN-LITERAL ::=
         BOOLEAN | number | CHARACTER | STRING
     GUARD ::= ‘#!if’ EXPRESSION

   This is how the specific patterns work:

IDENTIFIER
     This is the simplest and most common form of pattern.  The
     IDENTIFIER is bound to a new variable that is initialized to the
     incoming value.

‘_’
     This pattern just discards the incoming value.  It is equivalent to
     a unique otherwise-unused IDENTIFIER.

PATTERN-LITERAL
     Matches if the value is ‘equal?’ to the PATTERN-LITERAL.

‘’’DATUM
     Matches if the value is ‘equal?’ to the quoted DATUM.

PATTERN ‘::’ TYPE
     The incoming value is coerced to a value of the specified TYPE, and
     then the coerced value is matched against the sub-PATTERN.  Most
     commonly the sub-PATTERN is a plain IDENTIFIER, so the latter match
     is trivial.

‘[’ LPATTERN^{*} ‘]’
     The incoming value must be a sequence (a list, vector or similar).
     In the case where each sub-pattern is a plain PATTERN, then the
     number of sub-patterns must match the size of the sequence, and
     each sub-pattern is matched against the corresponding element of
     the sequence.  More generally, each sub-pattern may match zero or
     more consecutive elements of the incoming sequence.

‘#!if’ EXPRESSION
     No incoming value is used.  Instead the EXPRESSION is evaluated.
     If the result is true, matching succeeds (so far); otherwise the
     match fails.  This form is called a “guard”
     (https://en.wikipedia.org/wiki/Guard_(computer_science)).

‘@’ PATTERN
     A “splice pattern” may match multiple (zero or more) elements of a
     sequence.  The PATTERN is matched against the resulting
     sub-sequence.
          (! [x @r] [2 3 5 7 11])
     This binds ‘x’ to 2 and ‘r’ to ‘[3 5 7 11]’.

PATTERN ‘...’
     Similar to ‘@PATTERN’ in that it matches multiple elements of a
     sequence.  However, each individual element is matched against the
     PATTERN, rather than the elements as a sequence.  This is a *note
     repeat pattern: Repeat forms.
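
   The pattern forms can be combined.  The following is a sketch that
assumes the grammar above behaves as described: ‘_’ discards element 0,
and ‘y::double’ coerces element 1 in the same way as the ‘x::double’
example at the start of this section.  The variable name ‘y’ is chosen
only for illustration.

     (! [_ y::double] (vector 1 2))   ; element 0 ignored, element 1 coerced
     y                                ⇒ 2.0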


File: kawa.info,  Node: Definitions,  Next: Local binding constructs,  Prev: Variables and Patterns,  Up: Program structure

8.4 Definitions
===============

A variable definition binds one or more identifiers and specifies an
initial value for each of them.  The simplest kind of variable
definition takes one of the following forms:

 -- Syntax: ! PATTERN EXPRESSION
     Evaluate EXPRESSION, and match the result against PATTERN.  The
     variables defined in PATTERN become bound in the current
     (surrounding) scope.

     This is similar to ‘define-constant’ except generalized to a
     PATTERN.

          (! [x y] (vector 3 4))
          (format "x is ~w and y is ~w" x y) ⇒ "x is 3 and y is 4"

 -- Syntax: define name [‘::’ TYPE] EXPRESSION
     Evaluate the EXPRESSION, optionally converting it to TYPE, and bind
     the NAME to the result.

 -- Syntax: define (name FORMAL-ARGUMENTS) (ANNOTATION |
          OPTION-PAIR)^{*} OPT-RETURN-TYPE BODY
 -- Syntax: define (name ‘.’ REST-ARG) (ANNOTATION | OPTION-PAIR)^{*}
          OPT-RETURN-TYPE BODY

     Bind the NAME to a function definition.  The form:
          (define (NAME FORMAL-ARGUMENTS) OPTION-PAIR^{*} OPT-RETURN-TYPE BODY)
     is equivalent to:
          (define NAME (lambda (FORMAL-ARGUMENTS) name: NAME OPTION-PAIR^{*} OPT-RETURN-TYPE BODY))
     while the form:
          (define (NAME . REST-ARG) OPTION-PAIR^{*} OPT-RETURN-TYPE BODY)
     is equivalent to:
          (define NAME (lambda REST-ARG name: NAME OPTION-PAIR^{*} OPT-RETURN-TYPE BODY))

     You can associate *note annotations: Annotations. with NAME.  A
     field annotation will be associated with the generated field; a
     method annotation will be associated with the generated method(s).
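
     The two equivalences can be seen in a small sketch.  All four
     names are hypothetical, and the ‘lambda’ versions omit the ‘name:’
     option, so they differ from the ‘define’ versions only in the name
     recorded for the procedure.

          (define (square x) (* x x))
          (define square* (lambda (x) (* x x)))        ; same behavior

          (define (count-args . args) (length args))
          (define count-args* (lambda args (length args)))

          (square 7)                     ⇒ 49
          (square* 7)                    ⇒ 49
          (count-args 'a 'b 'c)          ⇒ 3
          (count-args*)                  ⇒ 0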

   In addition to ‘define’ (which can take an optional type specifier),
Kawa has some extra definition forms.

 -- Syntax: define-private name [‘::’ TYPE] value
 -- Syntax: define-private (name formals) body
     Same as ‘define’, except that ‘name’ is not exported.

 -- Syntax: define-constant name [‘::’ TYPE] value
 -- Syntax: define-early-constant name [‘::’ TYPE] value
     Defines NAME to have the given VALUE.  The value is readonly, and
     you cannot assign to it.  (This is not fully enforced.)

     If ‘define-early-constant’ is used _or_ the VALUE is a compile-time
     constant, then the compiler will create a ‘final’ field with the
     given name and type, and evaluate VALUE in the module’s class
     initializer (if the definition is static) or constructor (if the
     definition is non-static), before other definitions and
     expressions.  Otherwise, the VALUE is evaluated in the module body
     where it appears.

     If the VALUE is a compile-time constant, then the definition
     defaults to being static.

 -- Syntax: define-variable name [‘::’ TYPE] [init]
     If INIT is specified and NAME does not have a global variable
     binding, then INIT is evaluated, and NAME bound to the result.
     Otherwise, the value bound to NAME does not change.  (Note that
     INIT is not evaluated if NAME does have a global variable binding.)

     Also, declares to the compiler that NAME will be looked up in the
     per-thread dynamic environment.  This can be useful for shutting up
     warnings from ‘--warn-undefined-variable’.

     This is similar to the Common Lisp ‘defvar’ form.  However, the
     Kawa version is (currently) only allowed at module level.

   For ‘define-namespace’ and ‘define-private-namespace’ see *note
Namespaces::.


File: kawa.info,  Node: Local binding constructs,  Next: Lazy evaluation,  Prev: Definitions,  Up: Program structure

8.5 Local binding constructs
============================

The binding constructs ‘let’, ‘let*’, ‘letrec’, and ‘letrec*’ give
Scheme a block structure, like Algol 60.  The syntax of these four
constructs is identical, but they differ in the regions they establish
for their variable bindings.  In a ‘let’ expression, the initial values
are computed before any of the variables become bound; in a ‘let*’
expression, the bindings and evaluations are performed sequentially;
while in ‘letrec’ and ‘letrec*’ expressions, all the bindings are in
effect while their initial values are being computed, thus allowing
mutually recursive definitions.

 -- Syntax: let ‘((’PATTERN INIT‘)’ ...‘)’ BODY
     Declare new local variables as found in the PATTERNs.  Each PATTERN
     is matched against the corresponding INIT.  The INITs are evaluated
     in the current environment (in left-to-right order), the VARIABLEs
     in the PATTERNs are bound to fresh locations holding the matched
     results, the BODY is evaluated in the extended environment, and the
     values of the last expression of BODY are returned.  Each binding
     of a variable has BODY as its region.

          (let ((x 2) (y 3))
            (* x y)) ⇒ 6

          (let ((x 2) (y 3))
            (let ((x 7)
                  (z (+ x y)))
              (* z x)))   ⇒ 35

     An example with a non-trivial pattern:

          (let (([a::double b::integer] (vector 4 5)))
            (cons b a))  ⇒ (5 . 4.0)

 -- Syntax: let* ‘((’PATTERN init‘)’ ...‘)’ BODY

     The ‘let*’ binding construct is similar to ‘let’, but the bindings
     are performed sequentially from left to right, and the region of
     the VARIABLEs in a PATTERN is that part of the ‘let*’ expression to
     the right of the PATTERN.  Thus the second pattern is matched in an
     environment in which the bindings from the first pattern are
     visible, and so on.

          (let ((x 2) (y 3))
            (let* ((x 7)
                   (z (+ x y)))
              (* z x)))  ⇒ 70

 -- Syntax: letrec ‘((’variable [‘::’ TYPE] init‘)’ ...‘)’ BODY
 -- Syntax: letrec* ‘((’variable [‘::’ TYPE] init‘)’ ...‘)’ BODY
     The VARIABLEs are bound to fresh locations, each VARIABLE is
     assigned in left-to-right order to the result of the corresponding
     INIT, the BODY is evaluated in the resulting environment, and the
     values of the last expression in body are returned.  Despite the
     left-to-right evaluation and assignment order, each binding of a
     VARIABLE has the entire ‘letrec’ or ‘letrec*’ expression as its
     region, making it possible to define mutually recursive procedures.

     In Kawa ‘letrec’ is defined as the same as ‘letrec*’.  In standard
     Scheme the order of evaluation of the INITs is undefined, as is the
     order of assignments.  If the order matters, you should use
     ‘letrec*’.

     If it is not possible to evaluate each INIT without assigning or
     referring to the value of the corresponding VARIABLE or the
     variables that follow it, it is an error.
          (letrec ((even?
                    (lambda (n)
                      (if (zero? n)
                          #t
                          (odd? (- n 1)))))
                   (odd?
                    (lambda (n)
                      (if (zero? n)
                          #f
                          (even? (- n 1))))))
            (even? 88))
               ⇒ #t


File: kawa.info,  Node: Lazy evaluation,  Next: Repeat forms,  Prev: Local binding constructs,  Up: Program structure

8.6 Lazy evaluation
===================

“Lazy evaluation” (or call-by-need) delays evaluating an expression
until it is actually needed; when it is evaluated, the result is saved
so repeated evaluation is not needed.  Lazy evaluation
(http://en.wikipedia.org/wiki/Lazy_evaluation) is a technique that can
make some algorithms easier to express compactly or much more
efficiently, or both.  It is the normal evaluation mechanism for pure
functional (side-effect-free) languages such as Haskell
(http://www.haskell.org).  However, automatic lazy evaluation is awkward
to combine with side-effects such as input-output.  It can also be
difficult to implement lazy evaluation efficiently, as it requires more
book-keeping.

   Kawa, like other Schemes, uses “eager evaluation” - an expression is
normally evaluated immediately, unless it is wrapped in a special form.
Standard Scheme has some basic building blocks for “manual” lazy
evaluation, using an explicit ‘delay’ operator to indicate that an
expression is to be evaluated lazily, yielding a “promise”, and a
‘force’ function to force evaluation of a promise.  This functionality
is enhanced in SRFI 45 (http://srfi.schemers.org/srfi-45/srfi-45.html),
in R7RS-draft (based on SRFI 45), and SRFI 41
(http://srfi.schemers.org/srfi-41/srfi-41.html) (lazy lists aka
streams).

   Kawa makes lazy evaluation easier to use, by “implicit forcing”: The
promise is automatically evaluated (forced) when used in a context that
requires a normal value, such as arithmetic needing a number.  Kawa
enhances lazy evaluation in other ways, including support for safe
multi-threaded programming.

8.6.1 Delayed evaluation
------------------------

 -- Syntax: delay EXPRESSION
     The ‘delay’ construct is used together with the procedure ‘force’
     to implement _lazy evaluation_ or _call by need_.

     The result of ‘(delay EXPRESSION)’ is a _promise_ which at some
     point in the future may be asked (by the ‘force’ procedure) to
     evaluate EXPRESSION, and deliver the resulting value.  The effect
     of EXPRESSION returning multiple values is unspecified.

 -- Syntax: delay-force EXPRESSION
 -- Syntax: lazy EXPRESSION
     The ‘delay-force’ construct is similar to ‘delay’, but it is
     expected that its argument evaluates to a promise.  (Kawa treats a
     non-promise value as if it were a forced promise.)  The returned
     promise, when forced, will evaluate to whatever the original
     promise would have evaluated to if it had been forced.

     The expression ‘(delay-force EXPRESSION)’ is conceptually similar
     to ‘(delay (force EXPRESSION))’, with the difference that forcing
     the result of ‘delay-force’ will in effect result in a tail call to
     ‘(force EXPRESSION)’, while forcing the result of ‘(delay (force
     EXPRESSION))’ might not.  Thus iterative lazy algorithms that might
     result in a long series of chains of ‘delay’ and ‘force’ can be
     rewritten using delay-force to prevent consuming unbounded space
     during evaluation.

     Using ‘delay-force’ or ‘lazy’ is equivalent.  The name
     ‘delay-force’ is from R7RS; the name ‘lazy’ is from the older
     SRFI-45.
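
     As a sketch of the iterative case, the following loop builds a
     long chain of promises, each of which evaluates to the next one.
     The procedure name ‘count-down’ is hypothetical.  Written with
     ‘delay-force’, forcing the outermost promise is expected to run in
     bounded space; with ‘(delay (force ...))’ in its place the chain
     of pending forces could grow with N.

          ;; count-down is an illustrative name only.
          (define (count-down n)
            (delay-force
             (if (= n 0)
                 (delay 'done)              ; final promise in the chain
                 (count-down (- n 1)))))    ; a promise for the next step

          (force (count-down 100000))    ⇒ done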

 -- Procedure: eager obj
     Returns a promise that when forced will return OBJ.  It is similar
     to ‘delay’, but does not delay its argument; it is a procedure
     rather than syntax.

     The Kawa implementation just returns OBJ as-is.  This is because
     Kawa treats as equivalent a value and forced promise evaluating to
     the value.

 -- Procedure: force promise
     The ‘force’ procedure forces the value of PROMISE.  As a Kawa
     extension, if the PROMISE is not a promise (a value that does not
     implement ‘gnu.mapping.Lazy’) then the argument is returned
     unchanged.  If no value has been computed for the promise, then a
     value is computed and returned.  The value of the promise is cached
     (or “memoized”) so that if it is forced a second time, the
     previously computed value is returned.
          (force (delay (+ 1 2)))                ⇒  3

          (let ((p (delay (+ 1 2))))
            (list (force p) (force p)))          ⇒  (3 3)

          (define integers
            (letrec ((next
                      (lambda (n)
                        (cons n (delay (next (+ n 1)))))))
              (next 0)))
          (define head
            (lambda (stream) (car (force stream))))
          (define tail
            (lambda (stream) (cdr (force stream))))

          (head (tail (tail integers)))          ⇒  2

     The following example is a mechanical transformation of a lazy
     stream-filtering algorithm into Scheme.  Each call to a constructor
     is wrapped in ‘delay’, and each argument passed to a deconstructor
     is wrapped in ‘force’.  The use of ‘(lazy ...)’ instead of ‘(delay
     (force ...))’ around the body of the procedure ensures that an
     ever-growing sequence of pending promises does not exhaust the
     heap.

          (define (stream-filter p? s)
            (lazy
             (if (null? (force s))
                 (delay '())
                 (let ((h (car (force s)))
                       (t (cdr (force s))))
                   (if (p? h)
                       (delay (cons h (stream-filter p? t)))
                       (stream-filter p? t))))))

          (head (tail (tail (stream-filter odd? integers))))
              ⇒ 5

 -- Procedure: force* promise
     Does ‘force’ as many times as necessary to produce a non-promise.
     (A non-promise is a value that does not implement
     ‘gnu.mapping.Lazy’, or if it does implement ‘gnu.mapping.Lazy’ then
     forcing the value using the ‘getValue’ method yields the receiver.)

     The ‘force*’ function is a Kawa extension.  Kawa will add implicit
     calls to ‘force*’ in most contexts that need it, but you can also
     call it explicitly.

   The following examples are not intended to illustrate good
programming style, as ‘delay’, ‘lazy’, and ‘force’ are mainly intended
for programs written in the functional style.  However, they do
illustrate the property that only one value is computed for a promise,
no matter how many times it is forced.

     (define count 0)
     (define p
       (delay (begin (set! count (+ count 1))
                     (if (> count x)
                         count
                         (force p)))))
     (define x 5)
     p                  ⇒ _a promise_
     (force p)          ⇒ 6
     p                  ⇒ _a promise, still_
     (begin (set! x 10)
            (force p))  ⇒ 6

8.6.2 Implicit forcing
----------------------

If you pass a promise as an argument to a function like ‘sqrt’, it must
first be forced to a number.  In general, Kawa does this automatically
(implicitly) as needed, depending on the context.  For example:
     (+ (delay (* 3 7)) 13)   ⇒ 34

   Other functions, like ‘cons’ have no problems with promises, and
automatic forcing would be undesirable.

   Generally, implicit forcing happens for arguments that require a
specific type, and does not happen for arguments that work on _any_ type
(or ‘Object’).

   Implicit forcing happens for:
   • arguments to arithmetic functions;
   • the sequence and the index in indexing operations, like
     ‘string-ref’;
   • the operands to ‘eqv?’ and ‘equal?’ are forced, though the operands
     to ‘eq?’ are not;
   • port operands to port functions;
   • the value to be emitted by a ‘display’ but _not_ the value to be
     emitted by a ‘write’;
   • the function in an application.

   Type membership tests, such as the ‘instance?’ operation, generally
do not force their values.
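
   The following sketch restates a few of the rules above as code.  The
names ‘p’ and ‘cell’ are hypothetical, and the results shown are what
the rules predict rather than a guarantee for every Kawa version:

     (define p (delay (* 3 7)))

     (+ p 13)                       ⇒ 34    ; arithmetic forces p
     (string-ref (delay "abc") 1)   ⇒ #\b   ; the sequence operand is forced
     (define cell (cons p '()))             ; cons stores the promise as-is
     (force (car cell))             ⇒ 21    ; an explicit force still works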

   The exact behavior for when implicit forcing happens is a
work-in-progress: There are certainly places where implicit forcing
doesn’t happen while it should; there are also likely to be places where
implicit forcing happens while it is undesirable.

   Most Scheme implementations are such that a forced promise behaves
differently from its forced value, but some Scheme implementations are
such that there is no means by which a promise can be operationally
distinguished from its forced value.  Kawa is a hybrid: Kawa tries to
minimize the difference between a forced promise and its forced value,
and may freely optimize and replace a forced promise with its value.

8.6.3 Blank promises
--------------------

A “blank promise” is a promise that doesn’t (yet) have a value _or_ a
rule for calculating the value.  Forcing a blank promise will wait
forever, until some other thread makes the promise non-blank.

   Blank promises are useful as a synchronization mechanism - you can
use them to safely pass data from one thread (the producer) to another
thread (the consumer).  Note that you can only pass one value for a
given promise: To pass multiple values, you need multiple promises.

     (define p (promise))
     (future ;; Consumer thread
       (begin
         (do-stuff)
         (define v (force p)) ; waits until promise-set-value!
         (do-stuff-with v)))
     ;; Producer thread
     ... do stuff ...
     (promise-set-value! p (calculate-value))

 -- Constructor: promise
     Calling ‘promise’ as a zero-argument constructor creates a new
     blank promise.

     This calls the constructor for ‘gnu.mapping.Promise’.  You can also
     create a non-blank promise, by setting one of the ‘value’, ‘alias’,
     ‘thunk’, or ‘exception’ properties.  Doing so is equivalent to
     calling ‘promise-set-value!’, ‘promise-set-alias!’,
     ‘promise-set-thunk!’, or ‘promise-set-exception!’ on the resulting
     promise.  For example: ‘(delay exp)’ is equivalent to:
          (promise thunk: (lambda() exp))

   The following four procedures require that their first arguments be
blank promises.  When the procedure returns, the promise is no longer
blank, and cannot be changed.  This is because a promise is conceptually
a placeholder for a single “not-yet-known” value; it is not a location
that can be assigned multiple times.  The former enables clean and safe
(“declarative”) use of multiple threads; the latter is much trickier.

 -- Procedure: promise-set-value! promise value
     Sets the value of the PROMISE to VALUE, which makes the PROMISE
     forced.

 -- Procedure: promise-set-exception! promise exception
     Associate EXCEPTION with the PROMISE.  When the PROMISE is forced
     the EXCEPTION gets thrown.

 -- Procedure: promise-set-alias! promise other
     Bind the PROMISE to be an alias of OTHER.  Forcing PROMISE will
     cause OTHER to be forced.

 -- Procedure: promise-set-thunk! promise thunk
     Associate THUNK (a zero-argument procedure) with the PROMISE.  The
     first time the PROMISE is forced, the THUNK is called, and the
     result (a value or an exception) is saved for future calls.
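
     As a minimal sketch of two of these setters, based only on the
     behaviour described above (the names ‘p1’, ‘p2’ and ‘calls’ are
     illustrative, not part of Kawa):

```scheme
;; A blank promise becomes forced once a value is set.
(define p1 (promise))
(promise-set-value! p1 42)
(force p1)                      ; ⇒ 42

;; A thunk runs only on the first force; its result is then saved.
(define calls 0)
(define p2 (promise))
(promise-set-thunk! p2
  (lambda ()
    (set! calls (+ calls 1))
    'done))
(force p2)                      ; ⇒ done
(force p2)                      ; ⇒ done (the thunk is not called again)
calls                           ; ⇒ 1
```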

 -- Procedure: make-promise obj
     The ‘make-promise’ procedure returns a promise which, when forced,
     will return OBJ.  It is similar to ‘delay’, but does not delay its
     argument: it is a procedure rather than syntax.  If OBJ is already
     a promise, it is returned.

     Because of Kawa’s implicit forcing, there is seldom a need to use
     ‘make-promise’, except for portability.
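
     The difference from ‘delay’ can be sketched as follows.  The
     helper ‘next!’ and the variable names are hypothetical; only
     ‘make-promise’, ‘delay’ and ‘force’ come from the text above.

```scheme
(define count 0)
(define (next!)
  (set! count (+ count 1))
  count)

(define p-now   (make-promise (next!)))  ; argument is evaluated right away
(define p-later (delay (next!)))         ; evaluation is postponed

count              ; ⇒ 1, only the make-promise argument has run
(force p-later)    ; ⇒ 2
(force p-now)      ; ⇒ 1
```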

8.6.4 Lazy and eager types
--------------------------

 -- Type: promise[T]
     This parameterized type is the type of promises that evaluate to a
     value of type ‘T’.  It is equivalent to the Java interface
     ‘gnu.mapping.Lazy<T>’.  The implementation class for promises is
     usually ‘gnu.mapping.Promise’, though there are other classes that
     implement ‘Lazy’, most notably ‘gnu.mapping.Future’, used for
     futures, which are promises evaluated in a separate thread.

   Note the distinction between the types ‘integer’ (the type of actual
(eager) integer values), and ‘promise[integer]’ (the type of (lazy)
promises that evaluate to integer).  The two are compatible: if a
‘promise[integer]’ value is provided in a context requiring an ‘integer’
then it is automatically evaluated (forced).  If an ‘integer’ value is
provided in a context requiring a ‘promise[integer]’, that conversion is
basically a no-op (though the compiler may wrap the ‘integer’ in a
pre-forced promise).

   In a fully-lazy language there would be no distinction, or at least
the promise type would be the default.  However, Kawa is a mostly-eager
language, so the eager type is the default.  This makes efficient
code-generation easier: If an expression has an eager type, then the
compiler can generate code that works on its values directly, without
having to check for laziness.


File: kawa.info,  Node: Repeat forms,  Next: Threads,  Prev: Lazy evaluation,  Up: Program structure

8.7 Repeat patterns and expressions
===================================

Many programming languages have some variant of list comprehension
syntax (https://en.wikipedia.org/wiki/List_comprehension).  Kawa splits
this into two separate forms, that can be in separate parts of the
program:
   • A “repeat pattern”, as you might guess, repeats a pattern by
     matching the pattern once for each element of a sequence.  For
     example, assume ‘A’ is some sequence-valued expression.  Then:
          #|kawa:3|# (! [a::integer ...] A)
     Here ‘a::integer ...’ is a REPEAT PATTERN that matches all the
     elements of ‘A’.  We call ‘a::integer’ the “repeated pattern” - it
     matches an individual element of ‘A’.  Any variable defined in a
     repeated pattern is a “repeat variable”.  In the example, that
     would be ‘a’.
   • A “repeat expression” creates a sequence by repeating an expression
     for each element of the result.
          #|kawa:4|# [(* 2 a) ...]
          [4 6 10 14 22]
     In this case ‘(* 2 a) ...’ is the repeat expression.  The “repeated
     expression” is ‘(* 2 a)’.  The repeated expression is evaluated
     once for each element of any contained repeat variable.  If there
     is more than one repeat variable, they are repeated in parallel, as
     many times as the “shortest” repeat variable, similar to the ‘map’
     procedure.  (If there is no repeat variable, the repeated
     expression is potentially evaluated infinitely many times, which is
     not allowed.  A planned extension will allow it for lazy repeated
     expressions.)

   The use of ‘...’ for repeat patterns and expressions mirrors exactly
their use in ‘syntax-rules’ patterns and templates.

   It is an error to use a repeat variable outside of repeat context:
     #|kawa:5|# a
     /dev/stdin:2:1: using repeat variable 'a' while not in repeat context

   The repeat form feature is not yet complete.  It is missing
functionality such as selecting only some elements from a repeat
sequence, lazy sequences, and it could be optimized more.

   A repeat variable can be used multiple times in the same repeat
expressions, or different repeat expressions:
     #|kawa:7|# [a ... a ...]
     [2 3 5 7 11 2 3 5 7 11]
     #|kawa:8|# [(* a a) ...]
     [4 9 25 49 121]

   Repeat expressions are useful not just in sequence literals, but in
the argument list of a procedure call, where the resulting sequence is
spliced into the argument list.  This is especially useful for functions
that take a variable number of arguments, because that enables a
convenient way to do fold/accumulate/reduce
(https://en.wikipedia.org/wiki/Fold_(higher-order_function)) operations.
For example:

     #|kawa:9|# (+ a ...)
     28

   because 28 is the result of ‘(+ 2 3 5 7 11)’.

   An elegant way to implement dot product
(https://en.wikipedia.org/wiki/Dot_product):
     (define (dot-product [x ...] [y ...])
       (+ (* x y) ...))

   When a repeat expression references two or more distinct repeat
variables then they are processed “in parallel”.  That does not
(necessarily) imply multiple threads, but that the first element of the
repeat result is evaluated using the first element of all the repeat
sequences, the second element of the result uses the second element of
all the repeat sequences, and so on.
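
   For comparison, the same element-wise pairing can be sketched with
the standard ‘map’ and ‘apply’ procedures.  This is a portable analogue
rather than the repeat-form implementation, and the name
‘dot-product-map’ is hypothetical:

```scheme
;; Portable analogue of (+ (* x y) ...) using standard procedures.
(define (dot-product-map xs ys)
  (apply + (map * xs ys)))

(dot-product-map '(1 2 3) '(4 5 6))    ; ⇒ 32

;; Like repeat variables, map stops at the shortest sequence.
(map + '(1 2 3) '(10 20))              ; ⇒ (11 22)
```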

Sub-patterns in repeat patterns
...............................

While the repeated pattern before the ‘...’ is commonly an identifier,
it may be a more complex pattern.  We showed earlier the repeated
pattern with a type specifier, which applies to each element:
     #|kawa:11|# (define (isum [x::integer ...]) (+ x ...))
     #|kawa:12|# (isum [4 5 6])
     15
     #|kawa:12|# (isum [4 5.1 6])
     Argument #1 (null) to 'isum' has wrong type
     	at gnu.mapping.CallContext.matchError(CallContext.java:189)
     	at atInteractiveLevel-6.isum$check(stdin:11)
     	...
   (The stack trace line number ‘stdin:11’ is that of the ‘isum’
definition.)

   You can nest repeat patterns, allowing matching against sequences
whose elements are sequences.
     #|kawa:31|# (define (fun2 [[x ...] ...] [y ...])
     #|.....32|#   [[(+ x y) ...] ...])
     #|kawa:33|# (fun2 [[1 2 3] [10 11 12]] [100 200])
     [[101 102 103] [210 211 212]]
   Note that ‘x’ is double-nested, while ‘y’ is singly-nested.

   Here each element is constrained to be a pair (a 2-element sequence):
     #|kawa:1|# (! [[x y] ...] [[11 12] [21 22] [31 32]])
     #|kawa:2|# [(+ x y) ...]
     #(23 43 63)
     #|kawa:3|# [[x ...] [y ...]]
     #(#(11 21 31) #(12 22 32))


File: kawa.info,  Node: Threads,  Next: Exceptions,  Prev: Repeat forms,  Up: Program structure

8.8 Threads
===========

There is a very preliminary interface to create parallel threads.  The
interface is similar to the standard ‘delay’/‘force’, where a thread is
basically the same as a promise, except that evaluation may be in
parallel.

 -- Syntax: future expression
     Creates a new thread that evaluates EXPRESSION.

     (The result extends ‘java.lang.Thread’ and implements
     ‘gnu.mapping.Lazy’.)

 -- Procedure: force thread
     The standard ‘force’ function is generalized to also work on
     threads.  It waits for the thread’s EXPRESSION to finish executing,
     and returns the result.

 -- Procedure: runnable function
     Creates a new ‘Runnable’ instance from a function.  Useful for
     passing to Java code that expects a ‘Runnable’.  You can get the
     result (a value or a thrown exception) using the ‘getResult’
     method.

 -- Syntax: synchronized object form ...
     Synchronize on the given OBJECT.  (This means getting an exclusive
     lock on the object, by acquiring its “monitor”.)  Then execute the
     FORMs while holding the lock.  When the FORMs finish (normally or
     abnormally by throwing an exception), the lock is released.
     Returns the result of the last FORM.  Equivalent to the Java
     ‘synchronized’ statement, except that it may return a result.
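
     A minimal sketch combining ‘future’, ‘force’ and ‘synchronized’.
     The procedure ‘count-with-lock’ and its local names are
     hypothetical, and the lock is assumed to be a plain
     ‘java.lang.Object’ created by calling the class as a constructor:

```scheme
;; Evaluate an expression in another thread and wait for its result.
(define f (future (* 6 7)))
(force f)                               ; ⇒ 42

;; Two threads increment a shared counter while holding the same lock.
(define (count-with-lock)
  (let ((lock (java.lang.Object))
        (counter 0))
    (let ((workers
           (list (future (synchronized lock
                           (set! counter (+ counter 1))))
                 (future (synchronized lock
                           (set! counter (+ counter 1)))))))
      (for-each force workers)          ; wait for both threads to finish
      counter)))

(count-with-lock)
```

     Each increment runs while the lock is held, so the two updates
     cannot interleave.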


File: kawa.info,  Node: Exceptions,  Prev: Threads,  Up: Program structure

8.9 Exception handling
======================

An “exception” is an object used to signal an error or other exceptional
situation.  The program or run-time system can “throw” the exception
when an error is discovered.  An exception handler is a program
construct that registers an action to handle exceptions when the handler
is active.

   If an exception is thrown and not handled then the
read-eval-print-loop will print a stack trace, and bring you back to the
top level prompt.  When not running interactively, an unhandled
exception will normally cause Kawa to be exited.

   In the Scheme exception model (as of R6RS and R7RS), exception
handlers are one-argument procedures that determine the action the
program takes when an exceptional situation is signaled.  The system
implicitly maintains a current exception handler in the dynamic
environment.  The program raises an exception by invoking the current
exception handler, passing it an object encapsulating information about
the exception.  Any procedure accepting one argument can serve as an
exception handler and any object can be used to represent an exception.

   The Scheme exception model is implemented on top of the Java VM’s
native exception model where the only objects that can be thrown are
instances of ‘java.lang.Throwable’.  Kawa also provides direct access to
this native model, as well as older Scheme exception models.

 -- Procedure: with-exception-handler handler thunk
     It is an error if HANDLER does not accept one argument.  It is also
     an error if THUNK does not accept zero arguments.  The
     ‘with-exception-handler’ procedure returns the results of invoking
     THUNK.  The HANDLER is installed as the current exception handler
     in the dynamic environment used for the invocation of THUNK.

          (call-with-current-continuation
            (lambda (k)
             (with-exception-handler
              (lambda (x)
               (display "condition: ")
               (write x)
               (newline)
               (k 'exception))
              (lambda ()
               (+ 1 (raise 'an-error))))))
                 ⇒ exception
                 and prints condition: an-error

          (with-exception-handler
           (lambda (x)
            (display "something went wrong\n"))
           (lambda ()
            (+ 1 (raise 'an-error))))
              prints something went wrong

     After printing, the second example then raises another exception.

     _Performance note:_ The THUNK is inlined if it is a lambda
     expression.  However, the HANDLER cannot be inlined even if it is a
     lambda expression, because it could be called by
     ‘raise-continuable’.  Using the ‘guard’ form is usually more
     efficient.
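
     As a sketch of that alternative, the first example above could be
     written with ‘guard’ instead of an explicit continuation and
     handler procedure:

```scheme
(guard (x (#t
           (display "condition: ")
           (write x)
           (newline)
           'exception))
  (+ 1 (raise 'an-error)))
;; ⇒ exception
;; and prints condition: an-error
```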

 -- Procedure: raise obj
     Raises an exception by invoking the current exception handler on
     OBJ.  The handler is called with the same dynamic environment as
     that of the call to raise, except that the current exception
     handler is the one that was in place when the handler being called
     was installed.  If the handler returns, then OBJ is re-raised in
     the same dynamic environment as the handler.

     If OBJ is an instance of ‘java.lang.Throwable’, then ‘raise’ has
     the same effect as ‘primitive-throw’.

 -- Procedure: raise-continuable obj
     Raises an exception by invoking the current exception handler on
     OBJ.  The handler is called with the same dynamic environment as
     the call to ‘raise-continuable’, except that: (1) the current
     exception handler is the one that was in place when the handler
     being called was installed, and (2) if the handler being called
     returns, then it will again become the current exception handler.
     If the handler returns, the values it returns become the values
     returned by the call to ‘raise-continuable’.

          (with-exception-handler
            (lambda (con)
              (cond
                ((string? con)
                 (display con))
                (else
                 (display "a warning has been issued")))
              42)
            (lambda ()
              (+ (raise-continuable "should be a number")
                 23)))
                prints: should be a number
                ⇒ 65

 -- Syntax: guard VARIABLE COND-CLAUSE^{+} BODY
     The BODY is evaluated with an exception handler that binds the
     raised object to VARIABLE and, within the scope of that binding,
     evaluates the clauses as if they were the clauses of a ‘cond’
     expression.  That implicit ‘cond’ expression is evaluated with the
     continuation and dynamic environment of the ‘guard’ expression.  If
     every cond-clause’s test evaluates to ‘#f’ and there is no ‘else’
     clause, then ‘raise-continuable’ is invoked on the raised object
     within the dynamic environment of the original call to ‘raise’ or
     ‘raise-continuable’, except that the current exception handler is
     that of the ‘guard’ expression.

          (guard (condition
                   ((assq 'a condition) => cdr)
                   ((assq 'b condition)))
            (raise (list (cons 'a 42))))
                ⇒ 42

          (guard (condition
                   ((assq 'a condition) => cdr)
                   ((assq 'b condition)))
            (raise (list (cons 'b 23))))
                ⇒ (b . 23)

     _Performance note:_ Using ‘guard’ is moderately efficient: there is
     some overhead compared to using native exception handling, but both
     the BODY and the handlers in the COND-CLAUSE are inlined.

 -- Procedure: dynamic-wind in-guard thunk out-guard
     All three arguments must be 0-argument procedures.  First calls
     IN-GUARD, then THUNK, then OUT-GUARD.  The result of the expression
     is that of THUNK.  If THUNK is exited abnormally (by throwing an
     exception or invoking a continuation), OUT-GUARD is called.

     If the continuation of the dynamic-wind is re-entered (which is not
     yet possible in Kawa), the IN-GUARD is called again.

     This function was added in R5RS.
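
     The ordering on an abnormal exit can be sketched by recording each
     step.  The names ‘trace’ and ‘note!’ are hypothetical helpers:

```scheme
(define trace '())
(define (note! x) (set! trace (cons x trace)))

(guard (e (#t (note! 'handled)))
  (dynamic-wind
    (lambda () (note! 'in))
    (lambda () (note! 'body) (raise 'oops))
    (lambda () (note! 'out))))

;; OUT-GUARD runs before the guard clause, because the clause is
;; evaluated in the continuation of the guard expression.
(reverse trace)    ; ⇒ (in body out handled)
```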

 -- Procedure: read-error? obj
     Returns #t if OBJ is an object raised by the ‘read’ procedure.
     (That is if OBJ is a ‘gnu.text.SyntaxException’.)

 -- Procedure: file-error? obj
     Returns #t if OBJ is an object raised by inability to open an input
     or output port on a file.  (This includes
     ‘java.io.FileNotFoundException’ as well as certain other
     exceptions.)

8.9.1 Simple error objects
--------------------------

 -- Procedure: error message obj ...
     Raises an exception as if by calling ‘raise’ on a newly allocated
     “simple error object”, which encapsulates the information provided
     by MESSAGE (which should be a string), as well as any OBJ arguments,
     known as the irritants.

     The string representation of a simple error object is as if calling
     ‘(format "#<ERROR ~a~{ ~w~}>" MESSAGE IRRITANTS)’.  (That is, the
     MESSAGE is formatted as if with ‘display’ while each irritant OBJ
     is formatted as if with ‘write’.)

     This procedure is part of SRFI-23, and R7RS. It differs from (and
     is incompatible with) R6RS’s ‘error’ procedure.

 -- Procedure: error-object? obj
     Returns ‘#t’ if OBJ is a simple error object.  Specifically, that
     OBJ is an instance of ‘kawa.lang.NamedException’.  Otherwise, it
     returns ‘#f’.

 -- Procedure: error-object-message error-object
     Returns the message encapsulated by error-object, which must be a
     simple error object.

 -- Procedure: error-object-irritants error-object
     Returns a list of the irritants (other arguments) encapsulated by
     error-object, which must be a simple error object.
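
     A short sketch of how these accessors fit together with ‘error’
     and ‘guard’ (the message and irritants are arbitrary examples):

```scheme
(guard (e ((error-object? e)
           (list (error-object-message e)
                 (error-object-irritants e))))
  (error "bad value:" 42 'foo))
;; ⇒ ("bad value:" (42 foo))
```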

8.9.2 Named exceptions
----------------------

These functions associate a symbol with exceptions and handlers: A
handler catches an exception if the symbol matches.

 -- Procedure: catch key thunk handler
     Invoke THUNK in the dynamic context of HANDLER for exceptions
     matching KEY.  If thunk throws to the symbol KEY, then HANDLER is
     invoked this way:

          (handler key args ...)

     KEY may be a symbol.  The THUNK takes no arguments.  If THUNK
     returns normally, that is the return value of ‘catch’.

     HANDLER is invoked outside the scope of its own ‘catch’.  If
     HANDLER again throws to the same key, a new handler from further up
     the call chain is invoked.

     If the key is ‘#t’, then a throw to _any_ symbol will match this
     call to ‘catch’.

 -- Procedure: throw key arg ...
     Invoke the catch form matching KEY, passing the ARGs to the current
     HANDLER.

     If the key is a symbol it will match catches of the same symbol or
     of ‘#t’.

     If there is no handler at all, an error is signaled.
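
     A minimal sketch of ‘catch’ and ‘throw’, assuming the calling
     convention ‘(handler key args ...)’ described above.  The key
     ‘my-key’ is an arbitrary example:

```scheme
;; The thunk throws to my-key, so the handler receives the key
;; followed by the thrown arguments.
(catch 'my-key
  (lambda ()
    (throw 'my-key 1 2)
    'not-reached)
  (lambda (key . args)
    (list key args)))
;; ⇒ (my-key (1 2))

;; If the thunk returns normally, its value is the value of catch.
(catch 'my-key
  (lambda () 'ok)
  (lambda (key . args) 'caught))
;; ⇒ ok
```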

8.9.3 Native exception handling
-------------------------------

 -- Procedure: primitive-throw exception
     Throws the EXCEPTION, which must be an instance of a sub-class of
     ‘java.lang.Throwable’.

 -- Syntax: try-finally body handler
     Evaluate BODY, and return its result.  However, before it returns,
     evaluate HANDLER.  Even if BODY returns abnormally (by throwing an
     exception), HANDLER is evaluated.

     (This is implemented just like Java’s ‘try’-‘finally’.  However,
     the current implementation does not duplicate the HANDLER.)

 -- Syntax: try-catch body handler ...
     Evaluate BODY, in the context of the given HANDLER specifications.
     Each HANDLER has the form:
          VAR TYPE EXP ...
     If an exception is thrown in BODY, the first HANDLER is selected
     such that the thrown exception is an instance of the HANDLER’s
     TYPE.  If no HANDLER is selected, the exception is propagated
     through the dynamic execution context until a matching HANDLER is
     found.  (If no matching HANDLER is found, then an error message is
     printed, and the computation terminated.)

     Once a HANDLER is selected, the VAR is bound to the thrown
     exception, and the EXPs in the HANDLER are executed.  The result of
     the ‘try-catch’ is the result of BODY if no exception is thrown, or
     the value of the last EXP in the selected HANDLER if an exception
     is thrown.

     (This is implemented just like Java’s ‘try’-‘catch’.)
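
     A sketch of both native forms.  It assumes that each HANDLER is
     written as a parenthesized list ‘(VAR TYPE EXP ...)’ and that a
     Java class name can be called as a constructor; the exception
     classes and result symbols are arbitrary examples:

```scheme
;; The first handler whose TYPE matches the thrown exception is selected.
(try-catch
  (primitive-throw (java.lang.IllegalStateException "boom"))
  (ex java.lang.IllegalArgumentException 'bad-argument)
  (ex java.lang.IllegalStateException 'bad-state)
  (ex java.lang.Throwable 'something-else))
;; ⇒ bad-state

;; The cleanup form runs whether or not the body throws.
(try-finally
  (begin (display "working") (newline) 'result)
  (begin (display "cleanup") (newline)))
;; ⇒ result, after printing working and then cleanup
```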


File: kawa.info,  Node: Control features,  Next: Symbols and namespaces,  Prev: Program structure,  Up: Top

9 Control features
******************

* Menu:

* Mapping functions::
* Multiple values::


File: kawa.info,  Node: Mapping functions,  Next: Multiple values,  Up: Control features

9.1 Mapping functions
=====================

The procedures ‘string-for-each’ and ‘string-map’ are documented under
*note Strings::.

   The procedure ‘string-cursor-for-each’ is documented under *note
String Cursor API::.

 -- Procedure: map PROC sequence_{1} sequence_{2} ...
 -- Procedure: for-each PROC sequence_{1} sequence_{2} ...
     The ‘map’ procedure applies PROC element-wise to the elements of
     the SEQUENCEs and returns a list of the results, in order.  The
     dynamic order in which PROC is applied to the elements of the
     SEQUENCEs is unspecified.

     The ‘for-each’ procedure does the same, but is executed for the
     side-effects of PROC, whose result (if any) is discarded.  Unlike
     ‘map’, ‘for-each’ is guaranteed to call PROC on the elements of the
     SEQUENCEs in order from the first element(s) to the last.  The
     value returned by ‘for-each’ is the void value.

     Each SEQUENCE must be a generalized sequence.  (Traditionally,
     these arguments were restricted to lists, but Kawa allows
     sequences, including vectors, Java arrays, and strings.)  If more
     than one SEQUENCE is given and not all SEQUENCEs have the same
     length, the procedure terminates when the shortest SEQUENCE runs
     out.  The SEQUENCEs can be infinite (for example circular lists),
     but it is an error if all of them are infinite.

     The PROC must be a procedure that accepts as many arguments as
     there are SEQUENCE arguments.  It is an error for PROC to mutate
     any of the SEQUENCEs.  In the case of ‘map’, PROC must return a
     single value.

          (map cadr '((a b) (d e) (g h)))
              ⇒ (b e h)

          (map (lambda (n) (expt n n))
               '(1 2 3 4 5))
              ⇒ (1 4 27 256 3125)

          (map + '(1 2 3) '(4 5 6 7))  ⇒ (5 7 9)

          (let ((count 0))
            (map (lambda (ignored)
                   (set! count (+ count 1))
                   count)
                 '(a b)))
              ⇒ (1 2) or (2 1)

     The result of ‘map’ is a list, even if the arguments are non-lists:
          (map +
               #(3 4 5)
               (float[] 0.5 1.5))
              ⇒ (3.5 5.5)

     To get a vector result, use ‘vector-map’.

          (let ((v (make-vector 5)))
            (for-each (lambda (i)
                        (vector-set! v i (* i i)))
                      '(0 1 2 3 4))
            v)
              ⇒  #(0 1 4 9 16)

     A string is considered a sequence of ‘character’ values (not 16-bit
     ‘char’ values):

          (let ((v (make-vector 10 #\-)))
            (for-each (lambda (i ch)
                        (vector-set! v i ch))
                      [0 <: ]
                      "Smile 😃!")
             v)
              ⇒ #(#\S #\m #\i #\l #\e #\space #\x1f603 #\! #\- #\-)

     _Performance note:_ These procedures are pretty well optimized.
     For each SEQUENCE the compiler will by default create an iterator.
     However, if the type of the SEQUENCE is known, the compiler will
     inline the iteration code.

 -- Procedure: vector-map PROC SEQUENCE1 SEQUENCE2 ...
     Same as the ‘map’ procedure, except the result is a vector.
     (Traditionally, these arguments were restricted to vectors, but
     Kawa allows sequences, including lists, Java arrays, and strings.)

          (vector-map cadr '#((a b) (d e) (g h)))
              ⇒ #(b e h)

          (vector-map (lambda (n) (expt n n))
                      '#(1 2 3 4 5))
              ⇒ #(1 4 27 256 3125)

          (vector-map + '#(1 2 3) '#(4 5 6 7))
              ⇒ #(5 7 9)

          (let ((count 0))
            (vector-map
              (lambda (ignored)
                (set! count (+ count 1))
                count)
              '#(a b)))
              ⇒ #(1 2) or #(2 1)

 -- Procedure: vector-for-each PROC VECTOR1 VECTOR2 ...
     Mostly the same as ‘for-each’, however the arguments should be
     generalized vectors.  Specifically, they should implement
     ‘java.util.List’ (which both regular vectors and uniform vectors
     do).  The VECTORS should also be efficiently indexable.

     (Traditionally, these arguments were restricted to vectors, but
     Kawa allows sequences, including lists, Java arrays, and strings.)

          (let ((v (make-list 5)))
            (vector-for-each
              (lambda (i) (list-set! v i (* i i)))
              '#(0 1 2 3 4))
            v)
              ⇒ (0 1 4 9 16)


File: kawa.info,  Node: Multiple values,  Prev: Mapping functions,  Up: Control features

9.2 Multiple values
===================

The multiple-value feature was added in R5RS.

 -- Procedure: values object ...
     Delivers all of its arguments to its continuation.

 -- Procedure: call-with-values producer consumer
     Calls its PRODUCER argument with no arguments and a continuation
     that, when passed some values, calls the CONSUMER procedure with
     those values as arguments.

          (call-with-values (lambda () (values 4 5))
                            (lambda (a b) b))
                                   ⇒ 5

          (call-with-values * -)   ⇒ -1

     _Performance note:_ If either the PRODUCER or CONSUMER is a
     fixed-arity lambda expression, it is inlined.

 -- Syntax: define-values FORMALS EXPRESSION
     It is an error if a variable appears more than once in the set of
     FORMALS.

     The EXPRESSION is evaluated, and the FORMALS are bound to the
     return values in the same way that the FORMALS in a ‘lambda’
     expression are matched to the arguments in a procedure call.

          (define-values (x y) (exact-integer-sqrt 17))
          (list x y)    ⇒ (4 1)
          (let ()
            (define-values (x y) (values 1 2))
            (+ x y))
                        ⇒  3

 -- Syntax: let-values ‘((’FORMALS EXPRESSION‘)’ ...‘)’ BODY
     Each FORMALS should be a formal arguments list, as for a ‘lambda’.

     The EXPRESSIONs are evaluated in the current environment, the
     variables of the FORMALS are bound to fresh locations, the return
     values of the EXPRESSIONs are stored in the variables, the BODY is
     evaluated in the extended environment, and the values of the last
     expression of BODY are returned.  The BODY is a “tail body”, cf.
     section 3.5 of the R5RS.

     The matching of each FORMALS to values is as for the matching of
     FORMALS to arguments in a ‘lambda’ expression, and it is an error
     for an EXPRESSION to return a number of values that does not match
     its corresponding FORMALS.
          (let-values (((a b . c) (values 1 2 3 4)))
            (list a b c))            ⇒ (1 2 (3 4))

          (let ((a 'a) (b 'b) (x 'x) (y 'y))
            (let-values (((a b) (values x y))
                         ((x y) (values a b)))
              (list a b x y)))       ⇒ (x y a b)

 -- Syntax: let*-values ‘((’FORMALS EXPRESSION‘)’ ...‘)’ BODY

     Each FORMALS should be a formal arguments list as for a ‘lambda’
     expression.

     ‘let*-values’ is similar to ‘let-values’, but the bindings are
     performed sequentially from left to right, and the region of a
     binding indicated by (FORMALS EXPRESSION) is that part of the
     ‘let*-values’ expression to the right of the binding.  Thus the
     second binding is done in an environment in which the first binding
     is visible, and so on.
          (let ((a 'a) (b 'b) (x 'x) (y 'y))
            (let*-values (((a b) (values x y))
                          ((x y) (values a b)))
              (list a b x y)))       ⇒ (x y x y)

 -- Syntax: receive FORMALS EXPRESSION BODY
     This convenience form (from SRFI-8
     (http://srfi.schemers.org/srfi-8/srfi-8.html)) is equivalent to:
          (let-values ((FORMALS EXPRESSION)) BODY)
     For example:
          (receive a (values 1 2 3 4)
            (reverse a)) ⇒ (4 3 2 1)

          (receive (a b . c) (values 1 2 3 4)
            (list a b c))            ⇒ (1 2 (3 4))

          (let ((a 'a) (b 'b) (x 'x) (y 'y))
            (receive (a b) (values x y)
              (receive (x y) (values a b)
                (list a b x y))))    ⇒ (x y x y)

 -- Procedure: values-append arg1 ...
     The values resulting from evaluating each argument are appended
     together.
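
     As a hedged sketch, assuming Kawa passes a multiple-values result
     through as a single argument, the appended values can be collected
     with ‘call-with-values’:

```scheme
;; Each argument contributes zero or more values to the result;
;; `list` then receives all of them as separate arguments.
(call-with-values
  (lambda () (values-append (values 1 2) 3 (values 4 5)))
  list)
```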


File: kawa.info,  Node: Symbols and namespaces,  Next: Procedures,  Prev: Control features,  Up: Top

10 Symbols and namespaces
*************************

An identifier is a name that appears in a program.

   A symbol is an object representing a string that cannot be modified.
This string is called the symbol’s name.  Unlike strings, two symbols
whose names are spelled the same way are indistinguishable.  A symbol is
immutable (unmodifiable) and normally viewed as atomic.  Symbols are
useful for many applications; for instance, they may be used the way
enumerated values are used in other languages.

   In addition to the simple symbols of standard Scheme, Kawa also has
compound (two-part) symbols.

* Menu:

* Simple symbols::
* Namespaces::
* Keywords::
* Special named constants::


File: kawa.info,  Node: Simple symbols,  Next: Namespaces,  Up: Symbols and namespaces

10.1 Simple symbols
===================

Simple symbols have no properties other than their name, an immutable
string.  They have the useful property that two simple symbols are
identical (in the sense of ‘eq?’, ‘eqv?’ and ‘equal?’) if and only if
their names are spelled the same way.  A symbol literal is formed using
‘quote’.

 -- Procedure: symbol? OBJ
     Return ‘#t’ if OBJ is a symbol, ‘#f’ otherwise.

          (symbol? 'foo)          ⇒ #t
          (symbol? (car '(a b)))  ⇒ #t
          (symbol? "bar")         ⇒ #f
          (symbol? 'nil)          ⇒ #t
          (symbol? '())           ⇒ #f
          (symbol? #f)            ⇒ #f

 -- Procedure: symbol->string SYMBOL
     Return the name of SYMBOL as an immutable string.

          (symbol->string 'flying-fish)                   ⇒  "flying-fish"
          (symbol->string 'Martin)                        ⇒  "Martin"
          (symbol->string (string->symbol "Malvina"))     ⇒  "Malvina"

 -- Procedure: string->symbol STRING
     Return the symbol whose name is STRING.

          (eq? 'mISSISSIppi 'mississippi)
          ⇒ #f

          (string->symbol "mISSISSIppi")
          ⇒ the symbol with name "mISSISSIppi"

          (eq? 'bitBlt (string->symbol "bitBlt"))
          ⇒ #t

          (eq? 'JollyWog (string->symbol (symbol->string 'JollyWog)))
          ⇒ #t

          (string=? "K. Harper, M.D."
                    (symbol->string (string->symbol "K. Harper, M.D.")))
          ⇒ #t


File: kawa.info,  Node: Namespaces,  Next: Keywords,  Prev: Simple symbols,  Up: Symbols and namespaces

10.2 Namespaces and compound symbols
====================================

Different applications may want to use the same symbol to mean different
things.  To avoid such “name clashes” we can use “compound symbols”,
which have two string parts: a “local name” and a “namespace URI”. The
namespace-uri can be any string, but it is recommended that it have the
form of an absolute URI
(http://en.wikipedia.org/wiki/Uniform_Resource_Identifier).  It would be
too verbose to write the full URI all the time, so one usually uses a
“namespace prefix” (namespace alias) as a short local alias to refer to
a namespace URI.

   Compound symbols are usually written using the infix colon operator:
     PREFIX:LOCAL-NAME
   where PREFIX is a namespace alias bound to some (lexically-known)
namespace URI.

   Compound symbols are used for namespace-aware XML processing.

10.2.1 Namespace objects
------------------------

A “namespace” is a mapping from strings to symbols.  The string is the
local-name of the resulting symbol.  A namespace is similar to a Common
Lisp “package”.

   A namespace has a namespace-uri, which is a string; it is recommended
that it have the form of an absolute URI. A namespace may optionally
have a prefix, which is a string used when printing out symbols
belonging to the namespace.  (If you want “equivalent symbols” (i.e.
those that have the same local-name and same uri) to be the identical
symbol object, then you should use namespaces whose prefix is the empty
string.)

 -- Constructor: namespace name [prefix]
     Return a namespace with the given NAME and PREFIX.  If no such
     namespace exists, create it.  The NAMESPACE-NAME is commonly a URI,
     especially when working with XML, in which case it is called a
     NAMESPACE-URI.  However, any non-empty string is allowed.  The
     prefix can be a string or a simple symbol.  (If a symbol is used,
     then the symbol’s local-name is used.)  The default for PREFIX is
     the empty string.  Multiple calls with the same arguments will
     yield the same namespace object.

   The reader macro ‘#,namespace’ is equivalent to the ‘namespace’
function, but it is invoked at read-time:
     #,(namespace "http://www.w3.org/1999/XSL/Transform" xsl)
     (eq? #,(namespace "foo") (namespace "foo")) ⇒ #t

   The form ‘#,(namespace "" "")’ returns the default “empty namespace”,
which is used for simple symbols.

 -- Procedure: namespace-uri namespace
     Return the namespace-uri of the argument NAMESPACE, as a string.

 -- Procedure: namespace-prefix namespace
     Return the namespace prefix of the argument NAMESPACE, as a string.

10.2.2 Compound symbols
-----------------------

A compound symbol is one that belongs to a namespace other than the
default empty namespace, and (normally) has a non-empty namespace uri.
(It is possible for a symbol to belong to a non-default namespace and
have an empty namespace uri, but that is not recommended.)

 -- Constructor: symbol local-name namespace-spec
 -- Constructor: symbol local-name [uri [prefix]]
     Construct a symbol with the given LOCAL-NAME and namespace.  If
     NAMESPACE-SPEC is a namespace object, then find (or, if needed,
     construct) a symbol with the given LOCAL-NAME belonging to the
     namespace.  Multiple calls to ‘symbol’ with the same namespace and
     LOCAL-NAME will yield the same symbol object.

     If uri is a string (optionally followed by a prefix), then:
          (symbol lname uri [prefix])
     is equivalent to:
          (symbol lname (namespace uri [prefix]))

     Using ‘#t’ for the NAMESPACE-SPEC is equivalent to using the empty
     namespace ‘#,(namespace "")’.

     Using ‘#!null’ or ‘#f’ for the NAMESPACE-SPEC creates an UNINTERNED
     symbol, which does not belong to any namespace.

 -- Procedure: symbol-local-name symbol
     Return the local name of the argument symbol, as an immutable
     string.  (The string is interned, except in the case of an
     uninterned symbol.)

 -- Procedure: symbol-prefix symbol
     Return the prefix of the argument symbol, as an immutable (and
     interned) string.

 -- Procedure: symbol-namespace-uri symbol
     Return the namespace uri of the argument symbol, as an immutable
     (and interned) string.

 -- Procedure: symbol-namespace symbol
     Return the namespace object (if any) of the argument symbol.
     Returns ‘#!null’ if the symbol is uninterned.

 -- Procedure: symbol=? SYMBOL1 SYMBOL2 SYMBOL3 ...
     Return ‘#t’ if the symbols are equivalent as symbols, i.e., if
     their local-names and namespace-uris are the same.  They may have
     different values of ‘symbol-prefix’ and ‘symbol-namespace’.  If a
     symbol is uninterned (or is ‘#!null’) then ‘symbol=?’ returns the
     same result as ‘eq?’.

   Two symbols are ‘equal?’ or ‘eqv?’ if they’re ‘symbol=?’.
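
   The following sketch illustrates these procedures.  The names ‘s1’
and ‘s2’ and the prefixes are made up for the example, and the comments
state what the descriptions above imply:
     (define s1 (symbol "em" "http://www.w3.org/1999/xhtml" "h"))
     (define s2 (symbol "em" "http://www.w3.org/1999/xhtml" "x"))
     (symbol-local-name s1)     ; "em"
     (symbol-namespace-uri s1)  ; "http://www.w3.org/1999/xhtml"
     (symbol-prefix s1)         ; "h"
     (symbol=? s1 s2)           ; same local-name and uri: expected #t
     (eq? s1 s2)                ; prefixes differ: not expected to be identical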

10.2.3 Namespace aliases
------------------------

A namespace is usually referenced using a shorter “namespace alias”,
which is a lexical definition that binds a namespace prefix to a
namespace object (and thus a namespace uri).  This allows using compound
symbols as identifiers in Scheme programs.

 -- Syntax: define-namespace name namespace-name
     Defines NAME as a “namespace prefix” - a lexically scoped
     "nickname" for the namespace whose full name is NAMESPACE-NAME,
     which should be a non-empty string literal.  It is customary for
     the string to have the syntactic form of an absolute URI
     (http://en.wikipedia.org/wiki/Uniform_Resource_Identifier), but any
     non-empty string is acceptable and is used without further
     interpretation.

     Any symbols in the scope of this definition that contain a colon,
     and where the part before the colon matches the NAME will be
     treated as being in the package/namespace whose global unique name
     is the NAMESPACE-NAME.

     Has mostly the same effect as:
          (define-constant NAME #,(namespace NAMESPACE-NAME))

     However, using ‘define-namespace’ (rather than ‘define-constant’)
     is recommended if you want to use compound symbols as names of
     variables, especially local variables, or if you want to quote
     compound symbols.

     Note that the prefix is only visible lexically: it is not part of
     the namespace, or thus indirectly the symbols, and so is not
     available when printing the symbol.  You might consider using
     ‘define-xml-namespace’ as an alternative.

     A namespace is similar to a Common Lisp package, and the
     NAMESPACE-NAME is like the name of the package.  However, a
     namespace alias belongs to the lexical scope, while a Common Lisp
     package nickname is global and belongs to the package itself.

     If the namespace-name starts with the string ‘"class:"’, then the
     NAME can be used for invoking Java methods (*note Method
     operations::) and accessing fields (*note Field operations::).

     You can use a namespace as an abbreviation or renaming of a class
     name, but as a matter of style ‘define-alias’ is preferred.
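
     As a sketch of the ‘"class:"’ form (the alias name ‘Int32’ is
     arbitrary), a static Java method can be called through the prefix:
          (define-namespace Int32 "class:java.lang.Integer")
          ;; Calls the static method java.lang.Integer.toHexString:
          (Int32:toHexString 255) ⇒ "ff"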

 -- Syntax: define-private-namespace name namespace-name
     Same as ‘define-namespace’, but the prefix NAME is local to the
     current module.

   For example, you might have a set of geometry definitions defined
under the namespace-uri ‘"http://foo.org/lib/geometry"’:

     (define-namespace geom "http://foo.org/lib/geometry")
     (define (geom:translate x y)
       (java.awt.geom.AffineTransform:getTranslateInstance x y))
     (define geom:zero (geom:translate 0 0))
     geom:zero
       ⇒ AffineTransform[[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]

   You could have some other definitions for complex math:
     (define-namespace complex "http://foo.org/lib/math/complex")
     (define complex:zero +0+0i)

   You can use a namespace-value directly in a compound name:
     (namespace "http://foo.org/lib/geometry"):zero
       ⇒ AffineTransform[[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]

   The variation ‘define-xml-namespace’ is used for *note Creating XML
nodes::.

 -- Syntax: define-xml-namespace prefix "namespace-uri"
     Defines a namespace with prefix PREFIX and URI NAMESPACE-URI.  This
     is similar to ‘define-namespace’ but with two important
     differences:
        • Every symbol in the namespace automatically maps to an
          element-constructor-type, as with the ‘html’ namespace.
        • The PREFIX is a component of the namespace object, and hence
          indirectly of any symbols belonging to the namespace.

     Thus the definition is roughly equivalent to:
          (define-constant NAME #,(namespace NAMESPACE-NAME NAME))
     along with an infinite set of definitions, for every possible TAG:
          (define (name:TAG . rest) (apply make-element 'name:TAG rest))

     $ kawa --output-format xml
     #|kawa:1|# (define-xml-namespace na "Namespace1")
     #|kawa:2|# (define-xml-namespace nb "Namespace1")
     #|kawa:3|# (define xa (na:em "Info"))
     #|kawa:4|# xa
     <na:em xmlns:na="Namespace1">Info</na:em>
     #|kawa:5|# (define xb (nb:em "Info"))
     #|kawa:6|# xb
     <nb:em xmlns:nb="Namespace1">Info</nb:em>

   Note that the prefix is part of the qualified name (it is actually
part of the namespace object), and it is used when printing the tag.
Two qualified names (symbols) that have the same local-name and the same
namespace-name are considered equal, even if they have different
prefixes.  They are the same object only if they also have the same
prefix.  You can think of the prefix as an annotation used when
printing, but not otherwise part of the “meaning” of a compound symbol.
This is an important difference from traditional Lisp/Scheme symbols,
but it is how XML QNames work.
     #|kawa:7|# (instance? xb na:em)
     true
     #|kawa:8|# (eq? 'na:em 'nb:em)
     false
     #|kawa:9|# (equal? 'na:em 'nb:em)
     true
     #|kawa:10|# (eqv? 'na:em 'nb:em)
     true
   (Note that ‘#t’ is printed as ‘true’ when using XML formatting.)

   The predefined ‘html’ prefix could be defined thus:
     (define-xml-namespace html "http://www.w3.org/1999/xhtml")


File: kawa.info,  Node: Keywords,  Next: Special named constants,  Prev: Namespaces,  Up: Symbols and namespaces

10.3 Keywords
=============

Keywords are similar to symbols.  They are used mainly for specifying
keyword arguments.

   Historically keywords have been self-evaluating (you did not need to
quote them).  This has changed: you must quote a keyword if you want a
literal keyword value, and not quote it if it is used as a keyword
argument.

     KEYWORD ::= IDENTIFIER‘:’
       | ‘#:’IDENTIFIER

   The two syntaxes have the same meaning: The former is nicer-looking;
the latter is more portable (and required if you use the ‘--r7rs’
command-line flag).

     _Details:_ In r7rs and other Scheme standards the colon character
     does not have any special meaning, so ‘foo:’ or ‘foo:bar’ are just
     regular identifiers.  Therefore some other Scheme variants that
     have keywords (including Guile and Racket) use the ‘#:’ syntax.
     Kawa has some hacks so that _most_ standard Scheme programs that
     have colons in identifiers will work.  However, for best
     compatibility, use the ‘--r7rs’ command-line flag (which turns
     colon into a regular character in a symbol), and the ‘#:’ syntax.

   A keyword is a single token; therefore no whitespace is allowed
between the IDENTIFIER and the colon or after the ‘#:’; these characters
are not considered part of the name of the keyword.

 -- Procedure: keyword? obj
     Return ‘#t’ if OBJ is a keyword, and otherwise return ‘#f’.

 -- Procedure: keyword->string keyword
     Returns the name of KEYWORD as a string.  The name does not include
     the final ‘#\:’.

 -- Procedure: string->keyword string
     Returns the keyword whose name is STRING.  (The STRING does not
     include a final ‘#\:’.)
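
   For illustration, assuming the default (non-‘--r7rs’) reader so that
both keyword syntaxes are accepted, these procedures can be exercised as
follows.  The comments restate the descriptions above:
     (keyword? 'name:)           ; a quoted keyword literal
     (keyword? '#:name)          ; the same keyword in the portable syntax
     (keyword? 'name)            ; a plain symbol, so #f is expected
     (keyword->string 'name:)    ; the name without the colon: "name"
     (string->keyword "name")    ; the keyword written name: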


File: kawa.info,  Node: Special named constants,  Prev: Keywords,  Up: Symbols and namespaces

10.4 Special named constants
============================

 -- Constant: #!optional
     Special self-evaluating literal used in lambda parameter lists
     before optional parameters.

 -- Constant: #!rest
     Special self-evaluating literal used in lambda parameter lists
     before the rest parameter.

 -- Constant: #!key
     Special self-evaluating literal used in lambda parameter lists
     before keyword parameters.

 -- Constant: #!eof
     The end-of-file object.

     Note that if the Scheme reader sees this literal at top-level, it
     is returned literally.  This is indistinguishable from coming to
     the end of the input file.  If you do not want to end reading, but
     want the actual value of ‘#!eof’, you should quote it.

 -- Constant: #!void
     The void value.  Same as ‘(values)’.  If this is the value of an
     expression in a read-eval-print loop, nothing is printed.

 -- Constant: #!null
     The Java ‘null’ value.  This is not really a Scheme value, but is
     useful when interfacing to low-level Java code.


File: kawa.info,  Node: Procedures,  Next: Numbers,  Prev: Symbols and namespaces,  Up: Top

11 Procedures
*************

* Menu:

* Application and Arguments Lists::
* Extended formals::
* Procedure properties::
* Generic procedures::
* Partial application::


File: kawa.info,  Node: Application and Arguments Lists,  Next: Extended formals,  Up: Procedures

11.1 Application and Arguments Lists
====================================

When a procedure is called, the actual argument expressions are
evaluated, and the resulting values become the actual argument list.
This is then matched against the formal parameter list in the procedure
definition, and (assuming they match) the procedure body is called.

11.1.1 Arguments lists
----------------------

An argument list has three parts:
   • Zero or more “prefix arguments”, each of which is a value.  These
     typically get bound to named required or optional formal
     parameters, but can also get bound to patterns.
   • Zero or more “keyword arguments”, each of which is a keyword (an
     identifier specified with keyword syntax) combined with a value.
     These are bound to either named keyword formal parameters, or
     bundled in with a rest parameter.
   • Zero or more “postfix arguments”, each of which is a value.  These
     are usually bound to a “rest” formal parameter, which receives any
     remaining arguments.

     If there are no keyword arguments, then it is ambiguous where
     prefix arguments end and where postfix arguments start.  This is
     normally not a problem: the called procedure can split them up
     however it wishes.

   Note that all keyword arguments have to be grouped together: It is
not allowed to have a keyword argument followed by a plain argument
followed by a keyword argument.

   The argument list is constructed by evaluating each OPERAND of the
PROCEDURE-CALL in order:
EXPRESSION
     The EXPRESSION is evaluated, yielding a single value that becomes a
     prefix or postfix argument.
KEYWORD EXPRESSION
     The EXPRESSION is evaluated.  The resulting value combined with the
     KEYWORD becomes a keyword argument.
‘@’EXPRESSION
     The EXPRESSION is evaluated.  The result must be a sequence - a
     list, vector, or primitive array.  The values of the sequence are
     appended to the resulting argument list.  Keyword arguments are not
     allowed.
‘@:’EXPRESSION
     The EXPRESSION is evaluated.  The result can be a sequence; a hash
     table (viewed as a collection of (keyword,value) pairs); or an
     “explicit argument list” object, which is a sequence of values _or_
     keyword arguments.  The values and keyword arguments are appended
     to the resulting argument list, though subject to the restriction
     that keyword arguments must be adjacent in the resulting argument
     list.

11.1.2 Explicit argument list objects
-------------------------------------

Sometimes it is useful to create an argument list out of pieces, take
argument lists apart, iterate over them, and generally treat an argument
list as an actual first-class value.

   Explicit argument list objects can take multiple forms.  The simplest
is a sequence: a list, vector, or primitive array.  Each element of the
list becomes a value in the resulting argument list.

     (define v1 '(a b c))
     (define v2 (int[] 10 11 12 13))
     (list "X" @v1 "Y" @v2 "Z")
       ⇒ ("X" a b c "Y" 10 11 12 13 "Z")

   Things get more complicated once keywords are involved.  An explicit
argument list with keywords is only allowed when using the ‘@:’ splicing
form, not the ‘@’ form.  It can be either a hash table, or the types
‘arglist’ or ‘argvector’.

     _Design note:_ An argument list with keywords is straightforward in
     Common Lisp and some Scheme implementations (including older
     versions of Kawa): It’s just a list some of whose ‘car’ cells are
     keyword objects.  The problem with this model is that neither a
     human nor the compiler can reliably tell when an argument is a
     keyword, since any variable might have been assigned a keyword.
     This limits performance and error checking.

   A hash table (anything that implements ‘java.util.Map’) whose keys are
strings or keyword objects is interpreted as a sequence of keyword
arguments, using the hash-table keys and values.

 -- Type: argvector
 -- Constructor: argvector OPERAND^{*}
     List of arguments represented as an immutable vector.  A keyword
     argument takes two elements in this vector: A keyword object,
     followed by the value.

          (define v1 (argvector 1 2 k1: 10 k2: 11 98 99))
          (v1 4) ⇒ 'k2:
          (v1 5) ⇒ 11
     When ‘v1’ is viewed as a vector it is equivalent to ‘(vector 1 2
     'k1: 10 'k2: 11 98 99)’.  (Note in this case the keywords need to
     be quoted, since the ‘vector’ constructor does not take keyword
     arguments.)  However, the ‘argvector’ “knows” which arguments are
     actually keyword arguments, and can be examined using the ‘(kawa
     arglist)’ library discussed below:

          (arglist-key-count (argvector 1 x: 2 3)) ⇒ 1
          (arglist-key-count (argvector 1 'x: 2 3)) ⇒ 0
          (arglist-key-count (vector 1 'x: 2 3)) ⇒ 0

     In this case:
          (fun 'a @:v1)
     is equivalent to:
          (fun 'a 1 2 k1: 10 k2: 11 98 99)

 -- Type: arglist
 -- Constructor: arglist OPERAND^{*}
     Similar to ‘argvector’, but compatible with ‘list’.  If there are
     no keyword arguments, returns a plain list.  If there is at least
     one keyword argument creates a special ‘gnu.mapping.ArgListPair’
     object that implements the usual ‘list’ properties but internally
     wraps an ‘argvector’.
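
     The following sketch (the procedure ‘show’ and the variable ‘extra’
     are hypothetical) builds an argument list with ‘arglist’ and
     splices it into a call with ‘@:’:
          (define (show x #!key k1 k2 #!rest r)
            (list x k1 k2 r))
          (define extra (arglist k2: 12 100 101))
          ;; By the splicing rules above, this call is equivalent to
          ;; (show 3 k2: 12 100 101).
          (show 3 @:extra)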

11.1.3 Argument list library
----------------------------

     (import (kawa arglist))

   In the following, ARGS is an ‘arglist’ or ‘argvector’ (or in general
any object that implements ‘gnu.mapping.ArgList’).  Also, ARGS can be any
sequence, in which case it behaves like an ‘argvector’ that has no
keyword arguments.

 -- Procedure: arglist-walk args proc
     Call PROC once, in order, for each argument in ARGS.  The PROC is
     called with two arguments, corresponding to ‘(arglist-key-ref ARGS
     I)’ and ‘(arglist-arg-ref ARGS I)’ for each I from 0 up to
     ‘(arglist-arg-count ARGS)’ (exclusive).  I.e.  the first argument
     is either ‘#!null’ or the keyword (as a string); the second
     argument is the corresponding argument value.

          (define (print-arguments args #!optional (out (current-output-port)))
            (arglist-walk args
                          (lambda (key value)
                            (if key (format out "key: ~a value: ~w~%" key value)
                                (format out "value: ~w~%" value)))))

 -- Procedure: arglist-key-count args
     Return the number of keyword arguments.

 -- Procedure: arglist-key-start args
     Number of prefix arguments, which is the number of arguments before
     the first keyword argument.

 -- Procedure: arglist-arg-count args
     Return the number of arguments.  The count includes the number of
     keyword arguments, but not the actual keywords.
          (arglist-arg-count (arglist 10 11 k1: -1 19)) ⇒ 4

 -- Procedure: arglist-arg-ref args index
     Get the INDEX’th argument value.  The INDEX counts keyword argument
     values, but not the keywords themselves.
          (arglist-arg-ref (arglist 10 11 k1: -1 19) 2) ⇒ -1
          (arglist-arg-ref (arglist 10 11 k1: -1 19) 3) ⇒ 19

 -- Procedure: arglist-key-ref args index
     The INDEX counts arguments like ‘arglist-arg-ref’ does.  If this is
     a keyword argument, return the corresponding keyword (as a string);
     otherwise, return ‘#!null’ (which counts as false).
          (arglist-key-ref (argvector 10 11 k1: -1 k2: -2 19) 3) ⇒ "k2"
          (arglist-key-ref (argvector 10 11 k1: -1 k2: -2 19) 4) ⇒ #!null

 -- Procedure: arglist-key-index args key
     Search for a keyword matching KEY (which must be an interned
     string).  If there is no such keyword, return -1.  Otherwise return
     the keyword’s index as an argument to ‘arglist-key-ref’.

 -- Procedure: arglist-key-value args key default
     Search for a keyword matching KEY (which must be an interned
     string).  If there is no such keyword, return the DEFAULT.
     Otherwise return the corresponding keyword argument’s value.
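
   As a small sketch combining several of these procedures (the helper
‘count-positional’ is not part of the library), the following counts the
arguments that carry no keyword:
     (import (kawa arglist))
     (define (count-positional args)
       (let ((n 0))
         (arglist-walk args
                       (lambda (key value)
                         ;; KEY is #!null (false) for a non-keyword argument.
                         (if key #f (set! n (+ n 1)))))
         n))
     (define args (argvector 10 11 k1: -1 k2: -2 19))
     (count-positional args)
     ;; By the definitions above this should equal:
     (- (arglist-arg-count args) (arglist-key-count args))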

11.1.4 Apply procedures
-----------------------

 -- Procedure: apply proc argi^{*} argrest
     ARGREST must be a sequence (list, vector, or string) or a primitive
     Java array.  (This is an extension over standard Scheme, which
     requires that ARGREST be a list.)  Calls the PROC (which must be a
     procedure), using as arguments the ARGI...  values plus all the
     elements of ARGREST.

     Equivalent to: ‘(’PROC ARGI^{*} ‘@’ARGREST‘)’.
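
     For example, as a sketch of the extension to non-list sequences:
          (apply + 1 2 (vector 3 4))   ; adds 1, 2, 3 and 4
          (apply list 'a '(b c))       ; standard usage with a list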

 -- Syntax: constant-fold proc arg1 ...
     Same as ‘(PROC ARG1 ...)’, unless PROC and all the following
     arguments are compile-time constants.  (That is: They are either
     constants, or symbols that have a global binding and no lexical
     binding.)  In that case, PROC is applied to the arguments at
     compile-time, and the result replaces the ‘constant-fold’ form.  If
     the application raises an exception, a compile-time error is
     reported.  For example:
          (constant-fold vector 'a 'b 'c)
     is equivalent to ‘(quote #(a b c))’, assuming ‘vector’ has not been
     re-bound.


File: kawa.info,  Node: Extended formals,  Next: Procedure properties,  Prev: Application and Arguments Lists,  Up: Procedures

11.2 Lambda Expressions and Formal Parameters
=============================================

A ‘lambda’ expression evaluates to a procedure.  The environment in
effect when the ‘lambda’ expression was evaluated is remembered as part
of the procedure.  When the procedure is later called with some actual
arguments, the environment in which the ‘lambda’ expression was
evaluated will be extended by binding the variables in the formal
argument list to fresh locations, and the corresponding actual argument
values will be stored in those locations.  (A “fresh location” is one
that is distinct from every previously existing location.)  Next, the
expressions in the body of the lambda expression will be evaluated
sequentially in the extended environment.  The results of the last
expression in the body will be returned as the results of the procedure
call.

     (lambda (x) (+ x x))   ⇒ _a procedure_
     ((lambda (x) (+ x x)) 4)  ⇒ 8

     (define reverse-subtract
       (lambda (x y) (- y x)))
     (reverse-subtract 7 10) ⇒ 3

     (define add4
       (let ((x 4))
        (lambda (y) (+ x y))))
     (add4 6) ⇒ 10

   The formal arguments list of a lambda expression has some extensions
over standard Scheme: Kawa borrows the extended formal argument list of
DSSSL, and allows you to declare the type of the parameter.  More
generally, you can use *note patterns: Variables and Patterns.

     LAMBDA-EXPRESSION ::= ‘(lambda’ FORMALS OPTION-PAIR^{*} OPT-RETURN-TYPE BODY‘)’
     OPT-RETURN-TYPE ::= [‘::’ TYPE]
     FORMALS ::= ‘(’FORMAL-ARGUMENTS‘)’ | REST-ARG

   An OPT-RETURN-TYPE specifies the return type of the procedure: The
result of evaluating the BODY is coerced to the specified TYPE.

   _Deprecated_: If the first form of the function body is an unbound
identifier of the form ‘<TYPE>’ (that is the first character is ‘<’ and
the last is ‘>’), then that is another way to specify the function’s
return type.

   See *note properties: Procedure properties. for how to set and use an
OPTION-PAIR.

   The *note ‘define’: Definitions. form has a short-hand that combines
a lambda definition with binding the lambda to a variable:
     (define (NAME FORMAL-ARGUMENTS) OPT-RETURN-TYPE BODY)

     FORMAL-ARGUMENTS ::= REQUIRED-OR-GUARD^{*} [‘#!optional’ OPTIONAL-ARG ...] REST-KEY-ARGS
     REST-KEY-ARGS ::= [‘#!rest’ REST-ARG] [‘#!key’ KEY-ARG ...] [GUARD]
       | [‘#!key’ KEY-ARG ...] [REST-PARAMETER] [GUARD]
       | ‘.’ REST-ARG

   When the procedure is applied to an *note argument list::, the latter
is matched against formal parameters.  This may involve some complex
rules and pattern matching.

Required parameters
...................

     REQUIRED-OR-GUARD ::= REQUIRED-ARG | GUARD
     REQUIRED-ARG ::= PATTERN
       | ‘(’ PATTERN ‘::’ TYPE‘)’

   The REQUIRED-ARGs are matched against the actual (pre-keyword)
arguments in order, starting with the first actual argument.  It is an
error if there are fewer pre-keyword arguments than there are
REQUIRED-ARGs.  While a PATTERN is most commonly an identifier, more
complicated patterns are possible, thus more (or fewer) variables may be
bound than there are arguments.

   Note a PATTERN may include an OPT-TYPE-SPECIFIER.  For example:
     (define (isquare x::integer)
       (* x x))
   In this case the actual argument is coerced to an ‘integer’ and then
the result matched against the pattern ‘x’.  This is how parameter types
are specified.

   The PATTERN may be enclosed in parentheses for clarity (just like for
optional parameters), but in that case the type specifier is required to
avoid ambiguity.

Optional parameters
...................

     OPTIONAL-ARG ::= VARIABLE OPT-TYPE-SPECIFIER
       | ‘(’ PATTERN OPT-TYPE-SPECIFIER [INITIALIZER [SUPPLIED-VAR]]‘)’
     SUPPLIED-VAR ::= VARIABLE

   Next the OPTIONAL-ARGs are bound to remaining pre-keyword arguments.
If there are fewer remaining pre-keyword arguments than there are
OPTIONAL-ARGs, then the remaining VARIABLEs are bound to the
corresponding INITIALIZER.  If no INITIALIZER was specified, it defaults
to ‘#f’.  (TODO: If a TYPE is specified the default for INITIALIZER is
the default value of the TYPE.)  The INITIALIZER is evaluated in an
environment in which all the previous formal parameters have been bound.
If a SUPPLIED-VAR is specified, it has type boolean, and is set to true
if there was an actual corresponding argument, and false if the
initializer was evaluated.
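
   A brief sketch of these rules follows.  The procedure ‘greet’ and its
parameter names are invented for the example, and the comments restate
the rules above:
     (define (greet name
                    #!optional (greeting "Hello") (mark "!" mark-given))
       (list greeting name mark mark-given))
     ;; GREETING and MARK take their initializers; MARK-GIVEN is false:
     (greet "Ann")
     ;; Both optional arguments are supplied; MARK-GIVEN is true:
     (greet "Ann" "Hi" "?")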

Keyword parameters
..................

     KEY-ARG ::= VARIABLE OPT-TYPE-SPECIFIER
         | ‘(’ VARIABLE OPT-TYPE-SPECIFIER [INITIALIZER [SUPPLIED-VAR]] ‘)’

   Keyword parameters follow ‘#!key’.  For each VARIABLE if there is an
actual keyword parameter whose keyword matches VARIABLE, then VARIABLE
is bound to the corresponding value.  If there is no matching actual
argument, then the INITIALIZER is evaluated and its value is bound to
VARIABLE.  If INITIALIZER is not specified, it defaults to ‘#f’.  The
INITIALIZER is evaluated in an environment in which all the previous
formal parameters have been bound.

     (define (fun x #!key (foo 1) (bar 2) (baz 3))
       (list x foo bar baz))
     (fun 9 baz: 10 foo: 11) ⇒ (9 11 2 10)

   The following cause a match failure, _unless_ there is a rest
parameter:
   • There may not be extra non-keyword arguments (prefix or postfix)
     beyond those matched by required and optional parameters.
   • There may not be any duplicated keyword arguments.
   • All keywords in the actual argument list must match one of the
     keyword formal parameters.

   It is not recommended to use both keyword parameters and a rest
parameter that can match keyword arguments.  Currently, the rest
parameter will include any arguments that match the explicit keyword
parameters, as well as any that don’t, though this may change.

   On the other hand, it is fine to have both keyword parameters and a
rest parameter that does not accept keywords.  In that case the rest
parameter will match any “postfix” arguments:

     (define (fun x #!key k1 k2 #!rest r)
       (format "x:~w k1:~w k2:~w r:~w" x k1 k2 r))
     (fun 3 k2: 12 100 101) ⇒ x:3 k1:#f k2:12 r:(100 101)

   The SUPPLIED-VAR argument is as for optional arguments.

   _Performance note:_ Keyword parameters are implemented very
efficiently and compactly when explicit in the code.  The parameters are
sorted by the compiler, and the actual keyword arguments at the call
site are also sorted at compile-time.  So keyword matching just
requires a fast linear scan comparing the two sorted lists.  This
implementation is also very compact, compared to say a hash table.

   If a TYPE is specified, the corresponding actual argument (or the
INITIALIZER default value) is coerced to the specified TYPE.  In the
function body, the parameter has the specified type.

Rest parameters
...............

A “rest parameter” matches any arguments not matched by other
parameters.  You can write it using any of the following ways:

     REST-PARAMETER ::=
       ‘#!rest’ REST-ARG [‘::’ TYPE]
       | ‘@’REST-ARG
       | ‘@:’REST-ARG
     REST-ARG ::= VARIABLE

   In addition, if FORMALS is just a REST-ARG identifier, or a
FORMAL-ARGUMENTS ends with ‘. REST-ARG’ (i.e.  is a dotted list) that is
equivalent to using ‘#!rest’.

   These forms are similar but differ in the type of the REST-ARG and
whether keywords are allowed (as part of the REST-ARG):

   • If ‘#!rest’ REST-ARG is used with no TYPE specifier (or a TYPE
     specifier of ‘list’) then REST-ARG is a list.  Keywords are not
     allowed if ‘#!key’ has been seen.  (For backward compatibility, it
     is allowed to have extra keywords if ‘#!rest’ is followed by
     ‘#!key’.)  If there are any keywords, then REST-ARG is more
     specifically an ‘arglist’.
   • If ‘#!rest’ REST-ARG is used with a TYPE specifier that is a Java
     array (for example ‘#!rest r::string[]’) then REST-ARG has that
     type.  Each argument must be compatible with the element type of
     the array.  Keywords are not allowed (even if TYPE is ‘object[]’).

     The generated method will be compiled like a Java varargs method
     if possible (i.e.  no non-trivial patterns or keyword parameters).
   • Using ‘@’REST-ARG is equivalent to ‘#!rest REST-ARG::object[]’:
     Keywords are not allowed; the type of REST-ARG is a Java array; the
     method is compiled like a Java varargs method.

   • With ‘@:’REST-ARG, the REST-ARG is a vector, specifically an
     ‘argvector’.  Keywords are allowed.
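
   The forms can be compared with a sketch such as the following.  The
procedure names are hypothetical, and the comments restate the rules
above rather than record actual output:
     (import (kawa arglist))
     ;; REST is a list:
     (define (rest-as-list x #!rest rest) rest)
     (rest-as-list 1 2 3)
     ;; REST is a Java array (object[]); keywords are not allowed:
     (define (rest-as-array x @rest) rest:length)
     (rest-as-array 1 2 3)
     ;; REST is an argvector, which may carry keyword arguments:
     (define (rest-key-count @:rest) (arglist-key-count rest))
     (rest-key-count 1 k: 2 3)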

Guards (conditional expressions)
................................

A GUARD is evaluated when it appears in the formal parameter list.  If
it evaluates to false, then matching fails.  Guards can appear before
or after required arguments, or at the very end, after all other formal
parameters.


File: kawa.info,  Node: Procedure properties,  Next: Generic procedures,  Prev: Extended formals,  Up: Procedures

11.3 Procedure properties
=========================

You can associate arbitrary “properties” with any procedure.  Each
property is a (KEY, VALUE)-pair.  Usually the KEY is a symbol, but it
can be any object.

   The preferred way to set a property is using an OPTION-PAIR in a
LAMBDA-EXPRESSION.  For example, to set the ‘setter’ property of a
procedure to ‘my-set-car’ do the following:
     (define my-car
       (lambda (arg) setter: my-set-car (primitive-car arg)))

   The system uses certain internal properties: ‘'name’ refers to the
name used when a procedure is printed; ‘'emacs-interactive’ is used to
implement Emacs ‘interactive’ specification; ‘'setter’ is used to
associate a ‘setter’ procedure.

 -- Procedure: procedure-property proc key [default]
     Get the property value corresponding to the given KEY.  If PROC has
     no property with the given KEY, return DEFAULT (which defaults to
     ‘#f’) instead.

 -- Procedure: set-procedure-property! proc key value
     Associate the given VALUE with the KEY property of PROC.

   To change the print name of the standard ‘+’ procedure (probably not
a good idea!), you could do:
     (set-procedure-property! + 'name 'PLUS)
   Note this _only_ changes the name property used for printing:
     + ⇒ #<procedure PLUS>
     (+ 2 3) ⇒ 5
     (PLUS 3 4) ⇒ ERROR
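
   Arbitrary keys work the same way.  In this sketch the procedure
‘square’ and the key ‘'units’ are made up for illustration:
     (define (square x) (* x x))
     (set-procedure-property! square 'units "metres")
     (procedure-property square 'units)           ; the stored value
     (procedure-property square 'missing 'none)   ; no such key: DEFAULT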

   As a matter of style, it is cleaner to use the ‘define-procedure’
form, as it is a more declarative interface.

 -- Syntax: define-procedure name [propname: propvalue] ... method ...
     Defines NAME as a compound procedure consisting of the specified
     METHODs, with the associated properties.  Applying NAME selects the
     "best" METHOD, and applies that.  See the following section on
     generic procedures.

     For example, the standard ‘vector-ref’ procedure specifies one
     method, as well as the ‘setter’ property:
          (define-procedure vector-ref
            setter: vector-set!
            (lambda (vector::vector k ::int)
              (invoke vector 'get k)))

   You can also specify properties in the lambda body:

     (define (vector-ref vector::vector k ::int)
         setter: vector-set!
         (invoke vector 'get k))

11.3.1 Standard properties
--------------------------

‘name’
     The name of a procedure (as a symbol), which is used when the
     procedure is printed.
‘setter’
     The setter procedure associated with the procedure.
‘validate-apply’
‘validate-xapply’
     Used during the validation phase of the compiler.
‘compile-apply’
     Used during the bytecode-generation phase of the compiler: If we
     see a call to a known function with this property, we can emit
     custom bytecode for the call.


File: kawa.info,  Node: Generic procedures,  Next: Partial application,  Prev: Procedure properties,  Up: Procedures

11.4 Generic (dynamically overloaded) procedures
================================================

A “generic procedure” is a collection of “method procedures”.  (A
"method procedure" is not the same as a Java method, but the terms are
related.)  You can call a generic procedure, which selects the "closest
match" among the component method procedures: I.e.  the most specific
method procedure that is applicable given the actual arguments.

     *Warning:* The current implementation of selecting the "best"
     method is not reliable if there is more than one method.  It can
     select depending on argument count, and it can select between
     primitive Java methods.  However, selecting between different
     Scheme procedures based on parameter types should be considered
     experimental.  The main problem is we can’t determine the most
     specific method, so Kawa just tries the methods in order.

 -- Procedure: make-procedure [keyword: value]... method...
     Create a generic procedure given the specific methods.  You can
     also specify property values for the result.

     The KEYWORDs specify how the arguments are used.  A ‘method:’
     keyword is optional and specifies that the following argument is a
     method.  A ‘name:’ keyword specifies the name of the resulting
     procedure, when used for printing.  Unrecognized keywords are used
     to set the procedure properties of the result.
          (define plus10 (make-procedure foo: 33 name: 'Plus10
                                      method: (lambda (x y) (+ x y 10))
                                      method: (lambda () 10)))
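
     As a sketch of how the resulting procedure behaves (assuming the
     ‘plus10’ definition above), the method is chosen by argument
     count, which the warning above describes as a supported case:
          (plus10 3 4)    ⇒ 17   ; two-argument method: 3 + 4 + 10
          (plus10)        ⇒ 10   ; zero-argument method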


File: kawa.info,  Node: Partial application,  Prev: Generic procedures,  Up: Procedures

11.5 Partial application
========================

 -- Syntax: cut slot-or-expr slot-or-expr* [‘<...>’]
     where each SLOT-OR-EXPR is either an EXPRESSION or the literal
     symbol ‘<>’.

     It is frequently necessary to specialize some of the parameters of
     a multi-parameter procedure.  For example, from the binary
     operation ‘cons’ one might want to obtain the unary operation
     ‘(lambda (x) (cons 1 x))’.  This specialization of parameters is
     also known as “partial application”, “operator section”, or
     “projection”.  The macro ‘cut’ specializes some of the parameters
     of its first argument.  The parameters that are to show up as
     formal variables of the result are indicated by the symbol ‘<>’,
     pronounced as "slot".  In addition, the symbol ‘<...>’, pronounced
     as "rest-slot", matches all residual arguments of a variable
     argument procedure.

     A ‘cut’-expression is transformed into a LAMBDA EXPRESSION with as
     many formal variables as there are slots in the list SLOT-OR-EXPR*.
     The body of the resulting LAMBDA EXPRESSION calls the first
     SLOT-OR-EXPR with arguments from the SLOT-OR-EXPR* list in the
     order they appear.  In case there is a rest-slot symbol, the
     resulting procedure is also of variable arity, and the body calls
     the first SLOT-OR-EXPR with remaining arguments provided to the
     actual call of the specialized procedure.

     Here are some examples:

     ‘(cut cons (+ a 1) <>)’ is the same as
     ‘(lambda (x2) (cons (+ a 1) x2))’

     ‘(cut list 1 <> 3 <> 5)’ is the same as
     ‘(lambda (x2 x4) (list 1 x2 3 x4 5))’

     ‘(cut list)’ is the same as ‘(lambda () (list))’

     ‘(cut list 1 <> 3 <...>)’ is the same as
     ‘(lambda (x2 . xs) (apply list 1 x2 3 xs))’

     The first argument can also be a slot, as one should expect in
     Scheme: ‘(cut <> a b)’ is the same as ‘(lambda (f) (f a b))’

 -- Syntax: cute slot-or-expr slot-or-expr* [‘<...>’]
     The macro ‘cute’ (a mnemonic for "cut with evaluated non-slots") is
     similar to ‘cut’, but it evaluates the non-slot expressions at the
     time the procedure is specialized, not at the time the specialized
     procedure is called.

     For example ‘(cute cons (+ a 1) <>)’ is the same as
     ‘(let ((a1 (+ a 1))) (lambda (x2) (cons a1 x2)))’

     As you see from comparing this example with the first example
     above, the ‘cute’-variant will evaluate ‘(+ a 1)’ once, while the
     ‘cut’-variant will evaluate it during every invocation of the
     resulting procedure.
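
     The difference can be sketched with a side-effecting expression.
     The names ‘counter’, ‘next!’, ‘f’, and ‘g’ are hypothetical, and the
     results assume the forms are evaluated in the order shown:
          (define counter 0)
          (define (next!) (set! counter (+ counter 1)) counter)
          (define f (cut list (next!) <>))   ; (next!) runs on each call
          (define g (cute list (next!) <>))  ; (next!) runs once, here
          (f 'a)    ⇒ (2 a)
          (g 'a)    ⇒ (1 a)
          (f 'b)    ⇒ (3 b)
          (g 'b)    ⇒ (1 b)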


File: kawa.info,  Node: Numbers,  Next: Characters and text,  Prev: Procedures,  Up: Top

12 Quantities and Numbers
*************************

Kawa supports the full Scheme set of number operations with some
extensions.

   Kawa converts between Scheme number types and Java number types as
appropriate.

* Menu:

* Numerical types::
* Arithmetic operations::
* Numerical input and output::
* Quaternions::
* Quantities::
* Logical Number Operations::
* Performance of numeric operations::


File: kawa.info,  Node: Numerical types,  Next: Arithmetic operations,  Up: Numbers

12.1 Numerical types
====================

Mathematically, numbers are arranged into a tower of subtypes in which
each level is a subset of the level before it: number; complex number;
real number; rational number; integer.

   For example, ‘3’ is an integer.  Therefore ‘3’ is also a rational, a
real, and a complex number.  The same is true of the Scheme numbers that
model 3.  For Scheme numbers, these types are defined by the predicates
‘number?’, ‘complex?’, ‘real?’, ‘rational?’, and ‘integer?’.
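
   As an illustration, the predicates nest as the tower suggests.  These
results follow from the standard definitions of the predicates:

     (integer? 3)       ⇒ #t
     (rational? 3)      ⇒ #t
     (real? 3)          ⇒ #t
     (complex? 3)       ⇒ #t
     (integer? 1/2)     ⇒ #f
     (rational? 1/2)    ⇒ #t
     (real? 1+2i)       ⇒ #f
     (complex? 1+2i)    ⇒ #t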

   There is no simple relationship between a number’s type and its
representation inside a computer.  Although most implementations of
Scheme will offer at least two different representations of 3, these
different representations denote the same integer.

   Scheme’s numerical operations treat numbers as abstract data, as
independent of their representation as possible.  Although an
implementation of Scheme may use multiple internal representations of
numbers, this ought not to be apparent to a casual programmer writing
simple programs.

 -- Type: number
     The type of Scheme numbers.

 -- Type: quantity
     The type of quantities optionally with units.  This is a sub-type
     of ‘number’.

 -- Type: complex
     The type of complex numbers.  This is a sub-type of ‘quantity’.

 -- Type: real
     The type of real numbers.  This is a sub-type of ‘complex’.

 -- Type: rational
     The type of exact rational numbers.  This is a sub-type of ‘real’.

 -- Type: integer
     The type of exact Scheme integers.  This is a sub-type of
     ‘rational’.

   Kawa allows working with expressions of “primitive” types, which are
supported by the JVM without object allocation, and using builtin
arithmetic.  Using these types may be much faster, assuming the compiler
is able to infer that the variable or expression has primitive type.

 -- Type: long
 -- Type: int
 -- Type: short
 -- Type: byte
     These are fixed-sized primitive signed exact integer types, of
     respectively 64, 32, 16, and 8 bits.  If a value of one of these
     types needs to be converted to an object, the standard classes
     ‘java.lang.Long’, ‘java.lang.Integer’, ‘java.lang.Short’, or
     ‘java.lang.Byte’, respectively, are used.

 -- Type: ulong
 -- Type: uint
 -- Type: ushort
 -- Type: ubyte
     These are fixed-sized primitive unsigned exact integer types, of
     respectively 64, 32, 16, and 8 bits.  These are represented at
     runtime using the corresponding signed types (‘long’, ‘int’,
     ‘short’, or ‘byte’).  However, for arithmetic the Kawa compiler
     generates code to perform the “mathematically correct” result,
     truncated to an unsigned result rather than signed.  If a value of
     one of these types needs to be converted to an object, the classes
     ‘gnu.math.ULong’, ‘gnu.math.UInt’, ‘gnu.math.UShort’, or
     ‘gnu.math.UByte’ is used.

 -- Type: double
 -- Type: float
     These are fixed-size primitive inexact floating-point real types,
     using the standard 64-bit or 32-bit IEEE representation.  If a
     value of one of these types needs to be converted to an object, the
     standard class ‘java.lang.Double’ or ‘java.lang.Float’ is used.

12.1.1 Exactness
----------------

It is useful to distinguish between numbers that are represented exactly
and those that might not be.  For example, indexes into data structures
must be known exactly, as must some polynomial coefficients in a
symbolic algebra system.  On the other hand, the results of measurements
are inherently inexact, and irrational numbers may be approximated by
rational and therefore inexact approximations.  In order to catch uses
of inexact numbers where exact numbers are required, Scheme explicitly
distinguishes exact from inexact numbers.  This distinction is
orthogonal to the dimension of type.

   A Scheme number is “exact” if it was written as an exact constant or
was derived from exact numbers using only exact operations.  A number is
“inexact” if it was written as an inexact constant, if it was derived
using inexact ingredients, or if it was derived using inexact
operations.  Thus inexactness is a contagious property of a number.  In
particular, an “exact complex number” has an exact real part and an
exact imaginary part; all other complex numbers are “inexact complex
numbers”.
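
   A brief sketch of how exactness propagates, using the standard
‘exact?’, ‘inexact?’, ‘exact’, and ‘inexact’ procedures:

     (exact? 1/3)            ⇒ #t
     (inexact? 0.5)          ⇒ #t
     (exact? (+ 1/2 0.5))    ⇒ #f   ; one inexact operand is enough
     (exact 0.5)             ⇒ 1/2
     (inexact 1/4)           ⇒ 0.25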

   If two implementations produce exact results for a computation that
did not involve inexact intermediate results, the two ultimate results
will be mathematically equal.  This is generally not true of
computations involving inexact numbers since approximate methods such as
floating-point arithmetic may be used, but it is the duty of the
implementation to make the result as close as practical to the
mathematically ideal result.

   Rational operations such as ‘+’ should always produce exact results
when given exact arguments.  If the operation is unable to produce an
exact result, then it may either report the violation of an
implementation restriction or it may silently coerce its result to an
inexact value.

   Except for ‘exact’, the operations described in this section must
generally return inexact results when given any inexact arguments.  An
operation may, however, return an exact result if it can prove that the
value of the result is unaffected by the inexactness of its arguments.
For example, multiplication of any number by an exact zero may produce
an exact zero result, even if the other argument is inexact.

   Specifically, the expression ‘(* 0 +inf.0)’ may return ‘0’, or
‘+nan.0’, or report that inexact numbers are not supported, or report
that non-rational real numbers are not supported, or fail silently or
noisily in other implementation-specific ways.

   The procedures listed below will always return exact integer results
provided all their arguments are exact integers and the mathematically
expected results are representable as exact integers within the
implementation: ‘-’, ‘*’, ‘+’, ‘abs’, ‘ceiling’, ‘denominator’,
‘exact-integer-sqrt’, ‘expt’, ‘floor’, ‘floor/’, ‘floor-quotient’,
‘floor-remainder’, ‘gcd’, ‘lcm’, ‘max’, ‘min’, ‘modulo’, ‘numerator’,
‘quotient’, ‘rationalize’, ‘remainder’, ‘square’, ‘truncate’,
‘truncate/’, ‘truncate-quotient’, ‘truncate-remainder’.

12.1.2 Numerical promotion and conversion
-----------------------------------------

When combining two values of different numeric types, the values are
converted to the first type in the list below that subsumes (follows)
both types.  The computation is done using values of that type, and so
is the result.  For example adding a ‘long’ and a ‘float’ converts the
former to the latter, yielding a ‘float’.

   Note that ‘short’, ‘byte’, ‘ushort’, ‘ubyte’ are converted to ‘int’
regardless, even in the case of a single-operand operation, such as
unary negation.  Another exception is transcendental functions (such as
‘cos’), where integer operands are converted to ‘double’.

   • ‘int’ subsumes ‘short’, ‘byte’, ‘ushort’, ‘ubyte’.
   • ‘uint’
   • ‘long’
   • ‘ulong’
   • ‘java.lang.BigInteger’
   • ‘integer’ (i.e.  ‘gnu.math.IntNum’)
   • ‘rational’ (i.e.  ‘gnu.math.RatNum’)
   • ‘float’
   • ‘double’
   • ‘gnu.math.FloNum’
   • ‘real’ (i.e.  ‘gnu.math.RealNum’)
   • ‘number’
   • ‘complex’
   • ‘quantity’

   When comparing a primitive signed integer value with a primitive
unsigned integer (for example ‘<’ applied to an ‘int’ and a ‘ulong’) the
mathematically correct result is computed, as if by converting both
operands to ‘integer’.


File: kawa.info,  Node: Arithmetic operations,  Next: Numerical input and output,  Prev: Numerical types,  Up: Numbers

12.2 Arithmetic operations
==========================

 -- Procedure: real-valued? OBJ
 -- Procedure: rational-valued? OBJ
 -- Procedure: integer-valued? OBJ
     These numerical type predicates can be applied to any kind of
     argument.  The ‘real-valued?’ procedure returns ‘#t’ if the object
     is a number object and is equal in the sense of ‘=’ to some real
     number object, or if the object is a NaN, or a complex number
     object whose real part is a NaN and whose imaginary part is zero in
     the sense of ‘zero?’.  The ‘rational-valued?’ and ‘integer-valued?’
     procedures return ‘#t’ if the object is a number object and is
     equal in the sense of ‘=’ to some object of the named type, and
     otherwise they return ‘#f’.

          (real-valued? +nan.0)                  ⇒ #t
          (real-valued? +nan.0+0i)               ⇒ #t
          (real-valued? -inf.0)                  ⇒ #t
          (real-valued? 3)                       ⇒ #t
          (real-valued? -2.5+0.0i)               ⇒ #t

          (real-valued? -2.5+0i)                 ⇒ #t
          (real-valued? -2.5)                    ⇒ #t
          (real-valued? #e1e10)                  ⇒ #t

          (rational-valued? +nan.0)              ⇒ #f
          (rational-valued? -inf.0)              ⇒ #f
          (rational-valued? 6/10)                ⇒ #t
          (rational-valued? 6/10+0.0i)           ⇒ #t
          (rational-valued? 6/10+0i)             ⇒ #t
          (rational-valued? 6/3)                 ⇒ #t

          (integer-valued? 3+0i)                 ⇒ #t
          (integer-valued? 3+0.0i)               ⇒ #t
          (integer-valued? 3.0)                  ⇒ #t
          (integer-valued? 3.0+0.0i)             ⇒ #t
          (integer-valued? 8/4)                  ⇒ #t

          _Note:_ These procedures test whether a given number object
          can be coerced to the specified type without loss of numerical
          accuracy.  Specifically, the behavior of these predicates
          differs from the behavior of ‘real?’, ‘rational?’, and
          ‘integer?’ on complex number objects whose imaginary part is
          inexact zero.

          _Note:_ The behavior of these type predicates on inexact
          number objects is unreliable, because any inaccuracy may
          affect the result.

 -- Procedure: exact-integer? z
     Returns ‘#t’ if Z is both exact and an integer; otherwise returns
     ‘#f’.
          (exact-integer? 32)                    ⇒ #t
          (exact-integer? 32.0)                  ⇒ #f
          (exact-integer? 32/5)                  ⇒ #f

 -- Procedure: finite? Z
     Returns ‘#t’ if Z is a finite real number (i.e.  neither an infinity
     nor a NaN), or if Z is a complex number whose real and imaginary
     parts are both finite.
          (finite? 3)             ⇒ #t
          (finite? +inf.0)        ⇒ #f
          (finite? 3.0+inf.0i)    ⇒ #f

 -- Procedure: infinite? Z
     Returns ‘#t’ if Z is an infinite real number (‘+inf.0’ or ‘-inf.0’),
     or if Z is a complex number where either real or imaginary parts or
     both are infinite.
          (infinite? 5.0)         ⇒ #f
          (infinite? +inf.0)      ⇒ #t
          (infinite? +nan.0)      ⇒ #f
          (infinite? 3.0+inf.0i)  ⇒ #t

 -- Procedure: nan? Z
     For a real number, returns whether it is a NaN; for a complex
     number, returns whether the real part, the imaginary part, or both
     are NaN.
          (nan? +nan.0)           ⇒ #t
          (nan? 32)               ⇒ #f
          (nan? +nan.0+5.0i)      ⇒ #t
          (nan? 1+2i)             ⇒ #f

 -- Procedure: + Z ...
 -- Procedure: * Z ...
     These procedures return the sum or product of their arguments.

          (+ 3 4)                          ⇒  7
          (+ 3)                            ⇒  3
          (+)                              ⇒  0
          (+ +inf.0 +inf.0)                ⇒  +inf.0
          (+ +inf.0 -inf.0)                ⇒  +nan.0

          (* 4)                            ⇒  4
          (*)                              ⇒  1
          (* 5 +inf.0)                     ⇒  +inf.0
          (* -5 +inf.0)                    ⇒  -inf.0
          (* +inf.0 +inf.0)                ⇒  +inf.0
          (* +inf.0 -inf.0)                ⇒  -inf.0
          (* 0 +inf.0)                     ⇒  +nan.0
          (* 0 +nan.0)                     ⇒  +nan.0
          (* 1.0 0)                        ⇒  0.0

     For any real number object X that is neither infinite nor NaN:

          (+ +inf.0 X)                   ⇒  +inf.0
          (+ -inf.0 X)                   ⇒  -inf.0

     For any real number object X:

          (+ +nan.0 X)                   ⇒  +nan.0

     For any real number object X that is not an exact 0:

          (* +nan.0 X)                   ⇒  +nan.0

     The behavior of ‘-0.0’ is illustrated by the following examples:

          (+  0.0 -0.0)  ⇒  0.0
          (+ -0.0  0.0)  ⇒  0.0
          (+  0.0  0.0)  ⇒  0.0
          (+ -0.0 -0.0)  ⇒ -0.0

 -- Procedure: - Z
 -- Procedure: - Z1 Z2 Z3 ...
     With two or more arguments, this procedure returns the difference
     of its arguments, associating to the left.  With one argument,
     however, it returns the negation (additive inverse) of its
     argument.

          (- 3 4)                               ⇒  -1
          (- 3 4 5)                             ⇒  -6
          (- 3)                                 ⇒  -3
          (- +inf.0 +inf.0)                     ⇒  +nan.0

     The behavior of ‘-0.0’ is illustrated by the following examples:

          (-  0.0)       ⇒ -0.0
          (- -0.0)       ⇒  0.0
          (-  0.0 -0.0)  ⇒  0.0
          (- -0.0  0.0)  ⇒ -0.0
          (-  0.0  0.0)  ⇒  0.0
          (- -0.0 -0.0)  ⇒  0.0

 -- Procedure: / Z
 -- Procedure: / Z1 Z2 Z3 ...
     If all of the arguments are exact, then the divisors must all be
     nonzero.  With two or more arguments, this procedure returns the
     quotient of its arguments, associating to the left.  With one
     argument, however, it returns the multiplicative inverse of its
     argument.

          (/ 3 4 5)                         ⇒  3/20
          (/ 3)                             ⇒  1/3
          (/ 0.0)                           ⇒  +inf.0
          (/ 1.0 0)                         ⇒  +inf.0
          (/ -1 0.0)                        ⇒  -inf.0
          (/ +inf.0)                        ⇒  0.0
          (/ 0 0)                           ⇒  exception &assertion
          (/ 3 0)                           ⇒  exception &assertion
          (/ 0 3.5)                         ⇒  0.0
          (/ 0 0.0)                         ⇒  +nan.0
          (/ 0.0 0)                         ⇒  +nan.0
          (/ 0.0 0.0)                       ⇒  +nan.0

     If this procedure is applied to mixed non–rational real and
     non–real complex arguments, it either raises an exception with
     condition type ‘&implementation-restriction’ or returns an
     unspecified number object.

 -- Procedure: floor/ x y
 -- Procedure: truncate/ x y
 -- Procedure: div-and-mod x y
 -- Procedure: div0-and-mod0 x y
     These procedures implement number–theoretic integer division.  They
     accept two real numbers X and Y as operands, where Y must be
     nonzero.  In all cases the result is two values Q (an integer) and
     R (a real) that satisfy the equations:
          X = Q * Y + R
          Q = ROUNDING-OP(X/Y)
     The result is inexact if either argument is inexact.

     For ‘floor/’ the ROUNDING-OP is the ‘floor’ function (below).
          (floor/ 123 10)         ⇒  12 3
          (floor/ 123 -10)        ⇒  -13 -7
          (floor/ -123 10)        ⇒  -13 7
          (floor/ -123 -10)       ⇒  12 -3

     For ‘truncate/’ the ROUNDING-OP is the ‘truncate’ function.
          (truncate/ 123 10)      ⇒  12 3
          (truncate/ 123 -10)     ⇒  -12 3
          (truncate/ -123 10)     ⇒  -12 -3
          (truncate/ -123 -10)    ⇒  12 -3

     For ‘div-and-mod’ the ROUNDING-OP is either ‘floor’ (if Y is
     positive) or ‘ceiling’ (if Y is negative).  We have:
          0  <= R < |Y|
          (div-and-mod 123 10)    ⇒  12 3
          (div-and-mod 123 -10)   ⇒  -12 3
          (div-and-mod -123 10)   ⇒  -13 7
          (div-and-mod -123 -10)  ⇒  13 7

     For ‘div0-and-mod0’ the ROUNDING-OP is the ‘round’ function, and
     ‘r’ lies within a half–open interval centered on zero.
          -|Y/2| <= R < |Y/2|

          (div0-and-mod0 123 10)   ⇒  12 3
          (div0-and-mod0 123 -10)  ⇒  -12 3
          (div0-and-mod0 -123 10)  ⇒  -12 -3
          (div0-and-mod0 -123 -10) ⇒  12 -3
          (div0-and-mod0 127 10)   ⇒  13 -3
          (div0-and-mod0 127 -10)  ⇒  -13 -3
          (div0-and-mod0 -127 10)  ⇒  -13 3
          (div0-and-mod0 -127 -10) ⇒  13 3

     The inconsistent naming is for historical reasons: ‘div-and-mod’
     and ‘div0-and-mod0’ are from R6RS, while ‘floor/’ and ‘truncate/’
     are from R7RS.
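
     The defining equation can be checked directly by receiving both
     results with ‘call-with-values’.  A minimal sketch, where
     ‘check-division’ is a hypothetical helper name:
          (define (check-division op x y)
            (call-with-values (lambda () (op x y))
              (lambda (q r) (= x (+ (* q y) r)))))
          (check-division floor/ -123 10)          ⇒ #t
          (check-division truncate/ -123 10)       ⇒ #t
          (check-division div0-and-mod0 127 -10)   ⇒ #t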

 -- Procedure: floor-quotient x y
 -- Procedure: truncate-quotient x y
 -- Procedure: div x y
 -- Procedure: div0 x y
     These procedures return the quotient part (first value) of
     respectively ‘floor/’, ‘truncate/’, ‘div-and-mod’, and
     ‘div0-and-mod0’.

 -- Procedure: floor-remainder x y
 -- Procedure: truncate-remainder x y
 -- Procedure: mod x y
 -- Procedure: mod0 x y
     These procedures return the remainder part (second value) of
     respectively ‘floor/’, ‘truncate/’, ‘div-and-mod’, and
     ‘div0-and-mod0’.

     As a Kawa extension Y may be zero, in which case the result is X:
          (mod 123 0)     ⇒  123 ;; Kawa extension
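
     For instance, taking the parts of the two-value results shown
     earlier for ‘floor/’, ‘div-and-mod’, and ‘div0-and-mod0’:
          (floor-quotient -123 10)    ⇒  -13
          (floor-remainder -123 10)   ⇒  7
          (div 123 -10)               ⇒  -12
          (mod 123 -10)               ⇒  3
          (div0 127 10)               ⇒  13
          (mod0 127 10)               ⇒  -3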

 -- Procedure: quotient x y
 -- Procedure: remainder x y
 -- Procedure: modulo x y
     These are equivalent to ‘truncate-quotient’, ‘truncate-remainder’,
     and ‘floor-remainder’, respectively.  These are provided for
     backward compatibility.
          (remainder 13 4)     ⇒ 1
          (remainder -13 4)    ⇒ -1
          (remainder 13 -4)    ⇒ 1
          (remainder -13 -4)   ⇒ -1
          (remainder -13 -4.0) ⇒ -1.0
          (modulo 13 4)   ⇒ 1
          (modulo -13 4)  ⇒ 3
          (modulo 13 -4)  ⇒ -3
          (modulo -13 -4) ⇒ -1

 -- Procedure: abs X
     Returns the absolute value of its argument.

          (abs -7)                         ⇒  7
          (abs -inf.0)                     ⇒  +inf.0

 -- Procedure: gcd N1 ...
 -- Procedure: lcm N1 ...
     These procedures return the greatest common divisor or least common
     multiple of their arguments.  The result is always non–negative.
     The arguments must be integers; if an argument is inexact, so is
     the result.

          (gcd 32 -36)                     ⇒  4
          (gcd)                            ⇒  0
          (lcm 32 -36)                     ⇒  288
          (lcm 32.0 -36)                   ⇒  288.0 ; inexact
          (lcm)                            ⇒  1

 -- Procedure: numerator Q
 -- Procedure: denominator Q
     These procedures return the numerator or denominator of their
     argument; the result is computed as if the argument was represented
     as a fraction in lowest terms.  The denominator is always positive.
     The denominator of ‘0’ is defined to be ‘1’.  The argument must be
     rational; if it is inexact, so is the result.

          (numerator   (/ 6 4))            ⇒  3
          (denominator (/ 6 4))            ⇒  2
          (denominator (inexact (/ 6 4)))        ⇒  2.0

 -- Procedure: floor X
 -- Procedure: ceiling X
 -- Procedure: truncate X
 -- Procedure: round X
     These procedures return inexact integer objects for inexact
     arguments that are not infinities or NaNs, and exact integer
     objects for exact rational arguments.

     ‘floor’
          Returns the largest integer object not larger than X.

     ‘ceiling’
          Returns the smallest integer object not smaller than X.
     ‘truncate’
          Returns the integer object closest to X whose absolute value
          is not larger than the absolute value of X.

     ‘round’
          Returns the closest integer object to X, rounding to even when
          X represents a number halfway between two integers.

     If the argument to one of these procedures is inexact, then the
     result is also inexact.  If an exact value is needed, the result
     should be passed to the ‘exact’ procedure.

     Although infinities and NaNs are not integer objects, these
     procedures return an infinity when given an infinity as an
     argument, and a NaN when given a NaN.

          (floor -4.3)                     ⇒  -5.0
          (ceiling -4.3)                   ⇒  -4.0
          (truncate -4.3)                  ⇒  -4.0
          (round -4.3)                     ⇒  -4.0

          (floor 3.5)                      ⇒  3.0
          (ceiling 3.5)                    ⇒  4.0
          (truncate 3.5)                   ⇒  3.0
          (round 3.5)                      ⇒  4.0

          (round 7/2)                      ⇒  4
          (round 7)                        ⇒  7

          (floor +inf.0)                   ⇒  +inf.0
          (ceiling -inf.0)                 ⇒  -inf.0
          (round +nan.0)                   ⇒  +nan.0
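
     The round-to-even rule of ‘round’ shows up when the argument lies
     exactly halfway between two integers, as in this short sketch:
          (round 2.5)                      ⇒  2.0
          (round 5/2)                      ⇒  2
          (round -2.5)                     ⇒  -2.0
          (round -3.5)                     ⇒  -4.0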

 -- Procedure: rationalize X1 X2
     The ‘rationalize’ procedure returns a number object representing
     the _simplest_ rational number differing from X1 by no more than
     X2.

     A rational number _r_1_ is _simpler_ than another rational number
     _r_2_ if ‘r_1 = p_1/q_1’ and ‘r_2 = p_2/q_2’ (in lowest terms) and
     ‘|p_1| <= |p_2|’ and ‘|q_1| <= |q_2|’.  Thus ‘3/5’ is simpler than
     ‘4/7’.

     Although not all rationals are comparable in this ordering
     (consider ‘2/7’ and ‘3/5’) any interval contains a rational number
     that is simpler than every other rational number in that interval
     (the simpler ‘2/5’ lies between ‘2/7’ and ‘3/5’).

     Note that ‘0 = 0/1’ is the simplest rational of all.
          (rationalize (exact .3) 1/10)          ⇒ 1/3
          (rationalize .3 1/10)                  ⇒ #i1/3  ; approximately

          (rationalize +inf.0 3)                 ⇒  +inf.0
          (rationalize +inf.0 +inf.0)            ⇒  +nan.0

     The first two examples hold only in implementations whose inexact
     real number objects have sufficient precision.

 -- Procedure: exp Z
 -- Procedure: log Z
 -- Procedure: log Z1 Z2
 -- Procedure: sin Z
 -- Procedure: cos Z
 -- Procedure: tan Z
 -- Procedure: asin Z
 -- Procedure: acos Z
 -- Procedure: atan Z
 -- Procedure: atan X1 X2
     These procedures compute the usual transcendental functions.

     The ‘exp’ procedure computes the base–E exponential of Z.  The
     ‘log’ procedure with a single argument computes the natural
     logarithm of Z (*not* the base–10 logarithm); ‘(log Z1 Z2)’
     computes the base–Z2 logarithm of Z1.

     The ‘asin’, ‘acos’, and ‘atan’ procedures compute arcsine,
     arccosine, and arctangent, respectively.  The two–argument variant
     of ‘atan’ computes:

          (angle (make-rectangular X2 X1))

     These procedures may return inexact results even when given exact
     arguments.
          (exp +inf.0)    ⇒ +inf.0
          (exp -inf.0)    ⇒ 0.0
          (log +inf.0)    ⇒ +inf.0
          (log 0.0)       ⇒ -inf.0
          (log 0)         ⇒ exception &assertion
          (log -inf.0)    ⇒ +inf.0+3.141592653589793i    ; approximately
          (atan -inf.0)   ⇒ -1.5707963267948965          ; approximately
          (atan +inf.0)   ⇒ 1.5707963267948965           ; approximately
          (log -1.0+0.0i) ⇒ 0.0+3.141592653589793i       ; approximately
          (log -1.0-0.0i) ⇒ 0.0-3.141592653589793i       ; approximately
                                                          ; if -0.0 is distinguished

 -- Procedure: sinh z
 -- Procedure: cosh z
 -- Procedure: tanh z
 -- Procedure: asinh z
 -- Procedure: acosh z
 -- Procedure: atanh z
     The hyperbolic functions.

 -- Procedure: square z
     Returns the square of Z.  This is equivalent to ‘(* Z Z)’.
          (square 42)    ⇒ 1764
          (square 2.0)   ⇒ 4.0

 -- Procedure: sqrt Z
     Returns the principal square root of Z.  For rational Z, the result
     has either positive real part, or zero real part and non–negative
     imaginary part.  The value of ‘(sqrt Z)’ could be expressed as:

          e^((log z)/2)

     The ‘sqrt’ procedure may return an inexact result even when given
     an exact argument.

          (sqrt -5)                   ⇒  0.0+2.23606797749979i ; approximately
          (sqrt +inf.0)               ⇒  +inf.0
          (sqrt -inf.0)               ⇒  +inf.0i

     Note that if the argument is a primitive number (such as ‘double’)
     or an instance of the corresponding boxed class (such as
     ‘java.lang.Double’) then we use the real-number version of ‘sqrt’:
          (sqrt (->double -5))        ⇒  NaN
     That is, we get a different result for ‘java.lang.Double’ and
     ‘gnu.math.DFloNum’, even for arguments that are numerically equal
     in the sense of ‘=’.  This is so that the compiler can use the
     ‘java.lang.Math.sqrt’ method without object allocation when the
     argument is a ‘double’ (and because we want ‘double’ and
     ‘java.lang.Double’ to behave consistently).

 -- Procedure: exact-integer-sqrt K
     The ‘exact-integer-sqrt’ procedure returns two non–negative exact
     integer objects _s_ and _r_ where ‘K = s^2 + r’ and ‘K < (s+1)^2’.

          (exact-integer-sqrt 4)  ⇒ 2 0 ; two return values
          (exact-integer-sqrt 5)  ⇒ 2 1 ; two return values

 -- Procedure: expt Z1 Z2
     Returns Z1 raised to the power Z2.  For nonzero Z1, this is Z1^{Z2}
     = E^{Z2 log Z1}.  The value of 0^{Z} is 1 if ‘(zero? Z)’, 0 if
     ‘(real-part Z)’ is positive, and an error otherwise.  Similarly for
     0.0^{z}, with inexact results.
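
     Some illustrative cases, following the rules above and the
     exactness rules for exact integer arguments:
          (expt 2 10)     ⇒ 1024
          (expt 2 -1)     ⇒ 1/2
          (expt 0 0)      ⇒ 1
          (expt 0 2)      ⇒ 0
          (expt 2.0 3)    ⇒ 8.0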


File: kawa.info,  Node: Numerical input and output,  Next: Quaternions,  Prev: Arithmetic operations,  Up: Numbers

12.3 Numerical input and output
===============================

 -- Procedure: number->string z [radix]

     The procedure ‘number->string’ takes a number and a radix and
     returns as a string an external representation of the given number
     in the given radix such that
          (let ((number number)
                (radix radix))
            (eqv? number
                  (string->number (number->string number radix)
                                  radix)))
     is true.  It is an error if no possible result makes this
     expression true.

     If present, RADIX must be an exact integer in the range 2 to 36,
     inclusive.  If omitted, RADIX defaults to 10.

     If Z is inexact, the RADIX is 10, and the above expression can be
     satisfied by a result that contains a decimal point, then the
     result contains a decimal point and is expressed using the minimum
     number of digits (exclusive of exponent and trailing zeroes) needed
     to make the above expression true; otherwise the format of the
     result is unspecified.

     The result returned by ‘number->string’ never contains an explicit
     radix prefix.

     _Note:_ The error case can occur only when Z is not a complex
     number or is a complex number with a non-rational real or imaginary
     part.

     _Rationale:_ If Z is an inexact number and the RADIX is 10, then
     the above expression is normally satisfied by a result containing a
     decimal point.  The unspecified case allows for infinities, NaNs,
     and unusual representations.

 -- Procedure: string->number string [radix]
     Returns a number of the maximally precise representation expressed
     by the given STRING.  It is an error if RADIX is not an exact
     integer in the range 2 to 36, inclusive.

     If supplied, RADIX is a default radix that will be overridden if an
     explicit radix prefix is present in the string (e.g.  ‘"#o177"’).
     If RADIX is not supplied, then the default RADIX is 10.  If STRING
     is not a syntactically valid notation for a number, or would result
     in a number that the implementation cannot represent, then
     ‘string->number’ returns ‘#f’.  An error is never signaled due to
     the content of STRING.

          (string->number "100")      ⇒  100
          (string->number "100" 16)   ⇒  256
          (string->number "1e2")      ⇒  100.0
          (string->number "#x100" 10) ⇒  256
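
     The round-trip property stated for ‘number->string’ can be checked
     directly.  The following is a minimal sketch; ‘round-trips?’ is a
     hypothetical helper name, not part of Kawa.

```scheme
;; Hypothetical helper (not part of Kawa): checks the round-trip
;; property that number->string is specified to satisfy.
(define (round-trips? z radix)
  (eqv? z (string->number (number->string z radix) radix)))

(round-trips? 255 16)          ; ⇒ #t
(round-trips? 1.5 10)          ; ⇒ #t
(string->number "FF" 16)       ; ⇒ 255
(string->number "#b101" 16)    ; ⇒ 5, the prefix overrides RADIX
(string->number "hello")       ; ⇒ #f, not valid number syntax
```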


File: kawa.info,  Node: Quaternions,  Next: Quantities,  Prev: Numerical input and output,  Up: Numbers

12.4 Quaternions
================

Kawa extends the Scheme numeric tower to include quaternions
(http://en.wikipedia.org/wiki/Quaternion) as a proper superset of the
complex numbers.  Quaternions provide a convenient notation to represent
rotations in three-dimensional space
(http://en.wikipedia.org/wiki/Quaternions_and_spatial_rotation), and are
therefore commonly found in applications such as computer graphics,
robotics, and spacecraft engineering.  The Kawa quaternion API is
modeled after this (http://www.ccs.neu.edu/home/dorai/squat/squat.html)
with some additions.

   A quaternion is a number that can be expressed in the form
‘w+xi+yj+zk’, where ‘w’, ‘x’, ‘y’, and ‘z’ are real, and ‘i’, ‘j’, and
‘k’ are imaginary units satisfying i^{2} = j^{2} = k^{2} = ijk = -1.
The magnitude of a quaternion is defined to be its Euclidean norm when
viewed as a point in R^{4}.

   The real–part of a quaternion is also called its ‘scalar’, while the
i–part, j–part, and k–part taken together are also called its ‘vector’.
A quaternion with zero j–part and k–part is an ordinary complex number.
(If the i–part is also zero, then it is a real).  A quaternion with zero
real–part is called a ‘vector quaternion’.

   The reader syntax for number literals has been extended to support
both rectangular and polar (hyperspherical) notation for quaternions.
The rectangular notation is as above, i.e.  ‘w+xi+yj+zk’.  The polar
notation takes the form ‘r@t%u&v’, where ‘r’ is the magnitude, ‘t’ is
the first angle, and ‘u’ and ‘v’ are two other angles called the
“colatitude” and “longitude”.

   The rectangular coordinates and polar coordinates are related by the
equations:
     W = R * cos T
     X = R * sin T * cos U
     Y = R * sin T * sin U * cos V
     Z = R * sin T * sin U * sin V
   With either notation, zero elements may be omitted.
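
   As a rough illustration of these equations, the sketch below
computes the four rectangular components from polar coordinates using
only standard procedures.  The names ‘polar->wxyz’ and ‘norm4’ are
hypothetical helpers introduced here, not part of Kawa.

```scheme
;; Hypothetical helper: returns the list (w x y z) for the polar
;; coordinates R, T, U, V, using the equations above.
(define (polar->wxyz r t u v)
  (list (* r (cos t))
        (* r (sin t) (cos u))
        (* r (sin t) (sin u) (cos v))
        (* r (sin t) (sin u) (sin v))))

;; Euclidean norm of a list of four components.
(define (norm4 c)
  (sqrt (apply + (map (lambda (e) (* e e)) c))))

(polar->wxyz 2.0 0.0 0.0 0.0)            ; ⇒ (2.0 0.0 0.0 0.0)
(norm4 (polar->wxyz 2.0 0.3 0.8 -0.6))   ; approximately 2.0
```

   The norm of the result recovers R (up to rounding), which matches
the definition of the magnitude given earlier.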

 -- Procedure: make-rectangular W X
 -- Procedure: make-rectangular W X Y Z
     These procedures construct quaternions from Cartesian coordinates.

 -- Procedure: make-polar R T
 -- Procedure: make-polar R T U V
     These procedures construct quaternions from polar coordinates.

 -- Procedure: + Q ...
 -- Procedure: - Q ...
 -- Procedure: * Q ...
 -- Procedure: / Q
 -- Procedure: / Q1 Q2 Q3 ...
 -- Procedure: expt Q1 Q2
 -- Procedure: exp Q
 -- Procedure: log Q
 -- Procedure: sqrt Q
 -- Procedure: sin Q
 -- Procedure: cos Q
 -- Procedure: tan Q
 -- Procedure: asin Q
 -- Procedure: acos Q
 -- Procedure: atan Q
     All of the arithmetic and transcendental functions defined for
     complex arguments have been extended to support quaternions.

     Quaternion multiplication is not commutative, so there are two
     possible interpretations of ‘(/ q1 q2)’ which would yield different
     results: either ‘(* q1 (/ q2))’, or ‘(* (/ q2) q1)’.  Division in
     this implementation has been defined such that ‘(/ q1 q2 ...)’ is
     equivalent to ‘(* q1 (/ q2) ...)’, but it is recommended to use
     reciprocals (unary ‘/’) and multiplication.
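
     The following sketch shows why the order matters, using the unit
     quaternions ‘+i’ and ‘+j’.  The comments give the mathematical
     values; the printed form of the results may differ.

```scheme
(* +i +j)        ; the quaternion k
(* +j +i)        ; the quaternion -k
(/ +i +j)        ; same as (* +i (/ +j)), i.e. -k
(* (/ +j) +i)    ; the other candidate interpretation, k
```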

 -- Procedure: real-part Q
     Return the real–part of Q.

          (real-part 0)          ⇒  0
          (real-part -i)         ⇒  0
          (real-part 1+2i-3j+4k) ⇒  1

 -- Procedure: imag-part Q
     Return the i–part of Q.

          (imag-part 0)          ⇒  0
          (imag-part -i)         ⇒  -1
          (imag-part 1+2i-3j+4k) ⇒  2

 -- Procedure: magnitude Q
     Return the Euclidean norm of Q.  If Q is ‘a+bi+cj+dk’, then
     ‘(magnitude q)’ is ‘(sqrt (apply + (map square (list a b c d))))’.

 -- Procedure: angle Q
     Return the angle of Q.

12.4.1 The ‘(kawa quaternions)’ module
--------------------------------------

The following additional functionality is made available by doing one
of:
     (require 'quaternions) ;; or
     (import (kawa quaternions))

 -- Alias: quaternion
     An alias for ‘gnu.math.Quaternion’, useful for type declarations.
 -- Procedure: quaternion? X
     Return ‘#t’ if X is a quaternion, i.e.  an ordinary number, and
     ‘#f’ otherwise.

          (quaternion? 0)          ⇒  #t
          (quaternion? -i)         ⇒  #t
          (quaternion? 1+2i-3j+4k) ⇒  #t
          (quaternion? 10.0m)      ⇒  #f
          (quaternion? "x")        ⇒  #f

 -- Procedure: jmag-part Q
     Return the j–part of Q.

          (jmag-part 0)          ⇒  0
          (jmag-part -i)         ⇒  0
          (jmag-part 1+2i-3j+4k) ⇒  -3
 -- Procedure: kmag-part Q
     Return the k–part of Q.

          (kmag-part 0)          ⇒  0
          (kmag-part -i)         ⇒  0
          (kmag-part 1+2i-3j+4k) ⇒  4

 -- Procedure: complex-part Q
     Return the projection of Q into the complex plane: ‘(+ (real-part
     q) (* +i (imag-part q)))’

          (complex-part 0)          ⇒  0
          (complex-part -i)         ⇒  -1i
          (complex-part 1+2i-3j+4k) ⇒  1+2i
 -- Procedure: vector-part Q
     Return the vector–part of Q.

          (vector-part 0)          ⇒  0
          (vector-part -i)         ⇒  -1i
          (vector-part 1+2i-3j+4k) ⇒  +2i-3j+4k

 -- Procedure: unit-quaternion Q
     Return a quaternion of unit magnitude with the same direction as Q.
     If Q is zero, return zero.  This is like a 4D version of a signum
     function.

          (unit-quaternion 0)          ⇒  0
          (unit-quaternion -i)         ⇒  -1i
          (unit-quaternion 1+2i-3j+4k) ⇒  0.18257418583505536+0.3651483716701107i-0.5477225575051661j+0.7302967433402214k

 -- Procedure: unit-vector Q
     Return the vector–part of Q, scaled to have magnitude 1.  If the
     vector–part is zero, then return zero.

          (unit-vector 0)          ⇒  0
          (unit-vector -i)         ⇒  -1i
          (unit-vector 1+2i-3j+4k) ⇒  +0.3713906763541037i-0.5570860145311556j+0.7427813527082074k

 -- Procedure: colatitude Q
     Return the colatitude of Q.

 -- Procedure: longitude Q
     Return the longitude of Q.

 -- Procedure: vector-quaternion? OBJ
     Return ‘#t’ if OBJ is a vector quaternion, i.e.  a quaternion with
     zero real–part.

 -- Procedure: make-vector-quaternion X Y Z
     Construct vector quaternion ‘xi+yj+zk’.  This is equivalent to
     ‘(make-rectangular 0 x y z)’.

 -- Procedure: vector-quaternion->list VQ
     Return a newly allocated list of the x, y, and z components of VQ.
     This is equivalent to ‘(list (imag-part vq) (jmag-part vq)
     (kmag-part vq))’.

 -- Procedure: dot-product Q1 Q2
     For two vector quaternions Q1 = ‘ai+bj+ck’ and Q2 = ‘di+ej+fk’,
     return ‘ad + be + cf’.  This is equal to the R^3 dot product for
     vectors (a,b,c) and (d,e,f), and is also equal to ‘(- (real-part (*
     q1 q2)))’.  It is an error if either Q1 or Q2 has a non-zero
     real–part.

 -- Procedure: cross-product Q1 Q2
     For two vector quaternions Q1 = ‘ai+bj+ck’ and Q2 = ‘di+ej+fk’,
     return the R^3 cross product for vectors (a,b,c) and (d,e,f), which
     is equal to ‘(vector-part (* q1 q2))’.  It is an error if either Q1
     or Q2 has a non-zero real–part.
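
     A small worked example, assuming the ‘(kawa quaternions)’ module
     has been imported.  The comments give the mathematical values; the
     printed form may differ.

```scheme
(import (kawa quaternions))

(define q1 +i+2j+3k)
(define q2 +4i+5j+6k)

(dot-product q1 q2)      ; 1*4 + 2*5 + 3*6 = 32
(cross-product q1 q2)    ; the vector quaternion -3i+6j-3k
(* q1 q2)                ; -32-3i+6j-3k: minus the dot product
                         ; plus the cross product
```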

 -- Procedure: conjugate Q
     Return ‘(+ (real-part q) (* -1 (vector-part q)))’.

          (conjugate 0)          ⇒  0
          (conjugate -i)         ⇒  +1i
          (conjugate 1+2i-3j+4k) ⇒  1-2i+3j-4k
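
     The conjugate ties multiplication to the magnitude: the product of
     a quaternion with its conjugate is the square of its magnitude.  A
     minimal sketch, assuming the ‘(kawa quaternions)’ module is
     imported:

```scheme
(import (kawa quaternions))

(define q 1+2i-3j+4k)

(* q (conjugate q))     ; 30, i.e. 1 + 4 + 9 + 16
(/ (conjugate q) 30)    ; equal to the reciprocal (/ q)
```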

12.4.2 The ‘(kawa rotations)’ module
------------------------------------

The ‘(kawa rotations)’ library provides a set of functions which use
unit quaternions to represent 3D spatial rotations.  To use these
functions, the library must be imported:
     (import (kawa rotations))

   These functions normalize their quaternion inputs as needed to be of
length 1.

12.4.2.1 Rotation Representation Conversions
............................................

Conversions to and from several alternate representations of rotations
are supported.

   The set of unit quaternions provides a double covering of all
possible 3D rotations: ‘q’ and ‘-q’ represent the same rotation.  Most
other representations also have multiple numerical values which map to
the same rotation (for example, the rotation about ‘axis-vec’ by ‘angle’
is the same as the rotation about ‘-axis-vec’ by ‘-angle+2pi’).
Therefore, these functions do not necessarily act as inverses in the
sense of ‘equal?’.  Furthermore, rotations involve trigonometric
functions, so there will typically be some floating point error: ‘(acos
(cos 0.1))’ returns 0.09999999999999945, which is very close to 0.1 but
not exact.

Rotation Matrices
.................

 -- Procedure: quaternion->rotation-matrix Q
 -- Procedure: rotation-matrix->quaternion M

     The ‘quaternion->rotation-matrix’ procedure returns a 3x3 rotation
     matrix representing the same rotation as Q.  The rotation matrix is
     instantiated as a *note SRFI-25 multi-dimensional array: Arrays.
     backed by an *note f64vector: Uniform vectors.

     The ‘rotation-matrix->quaternion’ procedure performs the reverse
     operation, producing an equivalent unit quaternion for the rotation
     matrix (multi-dimensional array) M.

          (rotation-matrix->quaternion (quaternion->rotation-matrix -1)) ⇒ 1.0

Axis-Angle Representation
.........................

 -- Procedure: rotation-axis Q
 -- Procedure: rotation-angle Q
 -- Procedure: rotation-axis/angle Q

     The ‘rotation-axis’ procedure returns the axis of rotation of the
     quaternion Q as a unit-length vector quaternion.  If the axis of
     rotation is not well-defined (the angle of rotation is 0), then
     ‘+i’ is arbitrarily chosen as the axis.

     The ‘rotation-angle’ procedure returns the corresponding angle of
     rotation.  Note that this is not the same as the result of the
     ‘angle’ procedure.

     The ‘rotation-axis/angle’ procedure returns the rotation axis and
     angle as multiple values.

          (let* ((q 1/2+1/2i+1/2j+1/2k)
                 (ar (rotation-angle q))
                 (ad (java.lang.Math:toDegrees ar))
                 (exact-ad (exact ad)))
            (rationalize exact-ad 1/10)) ⇒ 120

 -- Procedure: make-axis/angle AXIS-VEC ANGLE
 -- Procedure: make-axis/angle AXIS-X AXIS-Y AXIS-Z ANGLE

     The ‘make-axis/angle’ procedure returns a quaternion representing
     the given axis/angle rotation.  The axis is specified as either a
     single vector quaternion argument AXIS-VEC, or as three reals
     AXIS-X, AXIS-Y, and AXIS-Z.

 -- Procedure: rotx ANGLE
 -- Procedure: roty ANGLE
 -- Procedure: rotz ANGLE

     The procedures ‘rotx’, ‘roty’, and ‘rotz’ return quaternions
     representing rotations about the X-, Y-, and Z-axes.

Intrinsic Angle Sets
....................

The intrinsic angle sets represent arbitrary rotations as a sequence of
three rotations about coordinate frame axes attached to the rotating
body (i.e.  the axes rotate with the body).

   There are twelve possible angle sets which neatly divide into two
groups of six.  The six with same first and third axes are also known as
“Euler angles”.  The six with different first and third axes are also
known as “Tait-Bryan angles”.

 -- Procedure: intrinsic-xyx Q
 -- Procedure: intrinsic-xzx Q
 -- Procedure: intrinsic-yxy Q
 -- Procedure: intrinsic-yzy Q
 -- Procedure: intrinsic-zxz Q
 -- Procedure: intrinsic-zyz Q

     These functions decompose the rotation represented by Q into Euler
     angles of the given set (XYX, XZX, YXY, YZY, ZXZ, or ZYZ) and
     return the three angles as multiple values.  The middle angle will
     be in the range [0,pi].  If it is on the edges of that range
     (within 1.0E-12 of 0 or pi), such that the first and third axes are
     colinear, then the first angle will be set to 0.

          (intrinsic-zyz (* (rotz 0.3) (roty 0.8) (rotz -0.6))) ⇒ 0.3000000000000001 0.7999999999999999 -0.5999999999999999

 -- Alias: euler-xyx
 -- Alias: euler-xzx
 -- Alias: euler-yxy
 -- Alias: euler-yzy
 -- Alias: euler-zxz
 -- Alias: euler-zyz
     Aliases for the corresponding ‘intrinsic-’ procedures.

 -- Procedure: intrinsic-xyz Q
 -- Procedure: intrinsic-xzy Q
 -- Procedure: intrinsic-yxz Q
 -- Procedure: intrinsic-yzx Q
 -- Procedure: intrinsic-zxy Q
 -- Procedure: intrinsic-zyx Q

     These functions decompose the rotation represented by Q into
     Tait-Bryan angles of the given set (XYZ, XZY, YXZ, YZX, ZXY, or
     ZYX) and return the three angles as multiple values.  The middle
     angle will be in the range [-pi/2,pi/2].  If it is on the edges of
     that range, such that the first and third axes are colinear, then
     the first angle will be set to 0.

 -- Alias: tait-bryan-xyz
 -- Alias: tait-bryan-xzy
 -- Alias: tait-bryan-yxz
 -- Alias: tait-bryan-yzx
 -- Alias: tait-bryan-zxy
 -- Alias: tait-bryan-zyx
     Aliases for the corresponding ‘intrinsic-’ procedures.

 -- Procedure: make-intrinsic-xyx ALPHA BETA GAMMA
 -- Procedure: make-intrinsic-xzx ALPHA BETA GAMMA
 -- Procedure: make-intrinsic-yxy ALPHA BETA GAMMA
 -- Procedure: make-intrinsic-yzy ALPHA BETA GAMMA
 -- Procedure: make-intrinsic-zxz ALPHA BETA GAMMA
 -- Procedure: make-intrinsic-zyz ALPHA BETA GAMMA

     These functions return quaternions representing the given Euler
     angle rotations.

 -- Alias: make-euler-xyx
 -- Alias: make-euler-xzx
 -- Alias: make-euler-yxy
 -- Alias: make-euler-yzy
 -- Alias: make-euler-zxz
 -- Alias: make-euler-zyz
     Aliases for the corresponding ‘make-intrinsic-’ procedures.

          (let-values (((a b c) (euler-xyx (make-euler-xyx 1.0 0.0 2.0))))
            (list a b c)) ⇒ (0.0 0.0 3.0)

 -- Procedure: make-intrinsic-xyz ALPHA BETA GAMMA
 -- Procedure: make-intrinsic-xzy ALPHA BETA GAMMA
 -- Procedure: make-intrinsic-yxz ALPHA BETA GAMMA
 -- Procedure: make-intrinsic-yzx ALPHA BETA GAMMA
 -- Procedure: make-intrinsic-zxy ALPHA BETA GAMMA
 -- Procedure: make-intrinsic-zyx ALPHA BETA GAMMA

     These functions return quaternions representing the given
     Tait-Bryan angle rotations.

 -- Alias: make-tait-bryan-xyz
 -- Alias: make-tait-bryan-xzy
 -- Alias: make-tait-bryan-yxz
 -- Alias: make-tait-bryan-yzx
 -- Alias: make-tait-bryan-zxy
 -- Alias: make-tait-bryan-zyx
     Aliases for the corresponding ‘make-intrinsic-’ procedures.

Extrinsic Angle Sets
....................

The extrinsic angle sets represent arbitrary rotations as a sequence of
three rotations about fixed-frame axes (i.e.  the axes do not rotate
with the body).

   There are twelve possible extrinsic angle sets, and each is the dual
of an intrinsic set.  The extrinsic rotation about axes ‘A’, ‘B’, and
‘C’ by angles ‘a’, ‘b’, and ‘c’ is the same as the intrinsic rotation
about axes ‘C’, ‘B’, and ‘A’ by angles ‘c’, ‘b’, and ‘a’, with the order
of the three axes reversed.

 -- Procedure: extrinsic-xyx Q
 -- Procedure: extrinsic-xyz Q
 -- Procedure: extrinsic-xzx Q
 -- Procedure: extrinsic-xzy Q
 -- Procedure: extrinsic-yxy Q
 -- Procedure: extrinsic-yxz Q
 -- Procedure: extrinsic-yzx Q
 -- Procedure: extrinsic-yzy Q
 -- Procedure: extrinsic-zxy Q
 -- Procedure: extrinsic-zxz Q
 -- Procedure: extrinsic-zyx Q
 -- Procedure: extrinsic-zyz Q

     These functions decompose the rotation represented by Q into
     extrinsic angles of the given set and return the three angles as
     multiple values.

 -- Procedure: make-extrinsic-xyx GAMMA BETA ALPHA
 -- Procedure: make-extrinsic-xyz GAMMA BETA ALPHA
 -- Procedure: make-extrinsic-xzx GAMMA BETA ALPHA
 -- Procedure: make-extrinsic-xzy GAMMA BETA ALPHA
 -- Procedure: make-extrinsic-yxy GAMMA BETA ALPHA
 -- Procedure: make-extrinsic-yxz GAMMA BETA ALPHA
 -- Procedure: make-extrinsic-yzx GAMMA BETA ALPHA
 -- Procedure: make-extrinsic-yzy GAMMA BETA ALPHA
 -- Procedure: make-extrinsic-zxy GAMMA BETA ALPHA
 -- Procedure: make-extrinsic-zxz GAMMA BETA ALPHA
 -- Procedure: make-extrinsic-zyx GAMMA BETA ALPHA
 -- Procedure: make-extrinsic-zyz GAMMA BETA ALPHA

     These functions return quaternions representing the given extrinsic
     angle rotations.
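
     The duality with the intrinsic sets can be sketched as follows.
     This is an illustration under the assumption that the argument
     order follows the signatures listed in this section; the test
     vector is arbitrary.

```scheme
(import (kawa rotations))

;; Extrinsic X-Y-Z by the angles (0.12 -0.23 0.34) ...
(define r1 (make-extrinsic-xyz 0.12 -0.23 0.34))
;; ... and intrinsic Z-Y-X with the angles in reverse order.
(define r2 (make-intrinsic-zyx 0.34 -0.23 0.12))

;; Both should describe the same rotation, so rotating a test
;; vector with either should give (nearly) the same result.
(rotate-vector r1 +i+2j+3k)
(rotate-vector r2 +i+2j+3k)
```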

 -- Alias: rpy
 -- Alias: make-rpy
     Aliases for ‘extrinsic-xyz’ and ‘make-extrinsic-xyz’.

          (let ((r (make-rpy 0.12 -0.23 0.34)))
            (let-values (((a b c) (tait-bryan-zyx r)))
              (list a b c))) ⇒ (0.3400000000000001 -0.2300000000000001 0.12000000000000002)

12.4.2.2 Rotation Operations
............................

 -- Procedure: rotate-vector RQ VQ
     Applies the rotation represented by quaternion RQ to the vector
     represented by vector quaternion VQ, and returns the rotated
     vector.  This is equivalent to ‘(* rq vq (conjugate rq))’ for
     normalized RQ.

          (rotate-vector +k +2i)                      ⇒ -2i
          (rotate-vector 1/2+1/2i+1/2j+1/2k +i+2j+3k) ⇒ +3.0i+1.0j+2.0k

 -- Procedure: make-rotation-procedure RQ
     A partial application of ‘rotate-vector’.  Returns a
     single-argument procedure which will take a vector quaternion
     argument and rotate it by RQ.  The returned procedure closes over
     both RQ and its conjugate, so this will likely be more efficient
     than ‘rotate-vector’ at rotating many vectors by the same rotation.
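
     The idea can be sketched in a few lines.  This is a hypothetical
     re-implementation for illustration only (‘my-rotation-procedure’
     is not part of Kawa); the library’s own procedure may differ in
     detail.

```scheme
(import (kawa quaternions) (kawa rotations))

;; Normalize RQ and compute its conjugate once, then reuse both
;; for every vector that is rotated.
(define (my-rotation-procedure rq)
  (let* ((u  (unit-quaternion rq))
         (u* (conjugate u)))
    (lambda (vq) (* u vq u*))))

(define rot (my-rotation-procedure 1/2+1/2i+1/2j+1/2k))
(rot +i+2j+3k)    ; the vector quaternion 3i+j+2k, as in the
                  ; rotate-vector example above
```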


File: kawa.info,  Node: Quantities,  Next: Logical Number Operations,  Prev: Quaternions,  Up: Numbers

12.5 Quantities and Units
=========================

As a super-class of numbers, Kawa also provides quantities.  A
“quantity” is a product of a “unit” and a pure number.  The number part
can be an arbitrary complex number.  The unit is a product of integer
powers of base units, such as meter or second.

   Quantity literals have the following syntax:
     QUANTITY ::= OPTIONAL-SIGN DECIMAL UNIT-TERM [‘*’ UNIT-TERM]... [‘/’ UNIT-TERM]
     UNIT-TERM ::= UNIT-NAME [‘^’ DIGIT+]
     UNIT-NAME ::= LETTER+
   Some examples are ‘10pt’ (10 points), ‘5s’ (5 seconds), and ‘4cm^2’
(4 square centimeters).

   Note the QUANTITY syntax is not recognized by the reader.  Instead
these are read as symbols.  Assuming there is no lexical binding for the
symbol, it will be rewritten at compile-time into an expression.
For example ‘4cm^2’ is transformed into:
     (* 4.0 (expt unit:cm 2))

 -- Procedure: quantity? object
     True iff OBJECT is a quantity.  Note that all numbers are
     quantities, but not the other way round.  Currently, there are no
     quantities that are not numbers.  To distinguish a plain unit-less
     number from a quantity, you can use ‘complex?’.

 -- Procedure: quantity->number q
     Returns the pure number part of the quantity Q, relative to
     primitive (base) units.  If Q is a number, returns Q.  If Q is a
     unit, yields the magnitude of Q relative to base units.

 -- Procedure: quantity->unit q
     Returns the unit of the quantity Q.  If Q is a number, returns the
     empty unit.

 -- Procedure: make-quantity x unit
     Returns the product of X (a pure number) and UNIT.  You can specify
     a string instead of UNIT, such as ‘"cm"’ or ‘"s"’ (seconds).

 -- Syntax: define-base-unit unit-name dimension
     Define UNIT-NAME as a base (primitive) unit, which is used to
     measure along the specified DIMENSION.
          (define-base-unit dollar "Money")

 -- Syntax: define-unit unit-name expression
     Define UNIT-NAME as a unit (that can be used in literals) equal to
     the quantity EXPRESSION.
          (define-unit cent 0.01dollar)
     The UNIT-NAME is declared in the ‘unit’ namespace, so the above is
     equivalent to:
          (define-constant unit:cent (* 0.01 unit:dollar))
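
     Putting the pieces together, a short sketch using the units
     defined above.  The variable ‘price’ is a name chosen for this
     example, and results other than the ‘quantity?’ test are described
     rather than shown, since their printed form is not specified here.

```scheme
(define-base-unit dollar "Money")
(define-unit cent 0.01dollar)

;; Build a quantity explicitly from the unit namespace.
(define price (* 250 unit:cent))

(quantity? price)            ; ⇒ #t
(quantity->unit price)       ; the unit part of PRICE
(quantity->number price)     ; pure number relative to base units
```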

Angles
------

The following angle units are dimensionless, with no base unit.

   Some procedures treat a unit-less real number as if it were in
radians (which mathematicians prefer); some procedures (such as
‘rotate’) treat a unit-less real number as if it were in degrees (which
is common in Web and other standards).

 -- Unit: rad
     A unit for angles specified in radians.  A full circle is 2*pi
     radians.  Note that ‘(= 1.5 1.5rad)’ is true, while ‘(eqv? 1.5
     1.5rad)’ is false.

 -- Unit: deg
     A unit for angles specified in degrees.  A full circle is 360
     degrees.

 -- Unit: grad
     A unit for angles specified in gradians.  A full circle is 400
     gradians.


File: kawa.info,  Node: Logical Number Operations,  Next: Performance of numeric operations,  Prev: Quantities,  Up: Numbers

12.6 Logical Number Operations
==============================

These functions operate on the 2’s complement binary representation of
an exact integer.

 -- Procedure: bitwise-not i
     Returns the bit-wise logical inverse of the argument.  More
     formally, returns the exact integer whose two’s complement
     representation is the one’s complement of the two’s complement
     representation of I.

 -- Procedure: bitwise-and i ...
 -- Procedure: bitwise-ior i ...
 -- Procedure: bitwise-xor i ...
     These procedures return the exact integer that is the bit-wise
     “and”, “inclusive or”, or “exclusive or” of the two’s complement
     representations of their arguments.  If they are passed only one
     argument, they return that argument.  If they are passed no
     arguments, they return the integer that acts as identity for the
     operation: -1, 0, or 0, respectively.

 -- Procedure: bitwise-if i1 i2 i3

     Returns the exact integer that is the bit-wise “if” of the two’s
     complement representations of its arguments, i.e.  for each bit, if
     it is 1 in i1, the corresponding bit in i2 becomes the value of the
     corresponding bit in the result, and if it is 0, the corresponding
     bit in i3 becomes the corresponding bit in the value of the result.
     This is equivalent to the following computation:
          (bitwise-ior (bitwise-and i1 i2)
                       (bitwise-and (bitwise-not i1) i3))
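
     A worked example: the mask ‘#b1100’ selects the two high bits from
     the second argument and the two low bits from the third.

```scheme
(bitwise-if #b1100 #b1010 #b0101)            ; ⇒ 9, i.e. #b1001

;; The same value via the equivalent computation:
(bitwise-ior (bitwise-and #b1100 #b1010)
             (bitwise-and (bitwise-not #b1100) #b0101))   ; ⇒ 9
```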

 -- Procedure: bitwise-bit-count i
     If i is non-negative, returns the number of 1 bits in the two’s
     complement representation of i.  Otherwise it returns the result of
     the following computation:
          (bitwise-not (bitwise-bit-count (bitwise-not i)))
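
     Some sample values, worked out from the definition above:

```scheme
(bitwise-bit-count 0)     ; ⇒ 0
(bitwise-bit-count 7)     ; ⇒ 3
(bitwise-bit-count -1)    ; ⇒ -1, since (bitwise-not -1) is 0
(bitwise-bit-count -8)    ; ⇒ -4, since (bitwise-not -8) is 7
```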

 -- Procedure: bitwise-length i
     Returns the number of bits needed to represent i if it is positive,
     and the number of bits needed to represent ‘(bitwise-not I)’ if it
     is negative, which is the exact integer that is the result of the
     following computation:
          (do ((result 0 (+ result 1))
               (bits (if (negative? i)
                         (bitwise-not i)
                         i)
                     (bitwise-arithmetic-shift bits -1)))
              ((zero? bits)
               result))
     This is the number of bits needed to represent I in an unsigned
     field.
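
     Some sample values, worked out from the computation above:

```scheme
(bitwise-length 0)      ; ⇒ 0
(bitwise-length 1)      ; ⇒ 1
(bitwise-length 255)    ; ⇒ 8
(bitwise-length 256)    ; ⇒ 9
(bitwise-length -1)     ; ⇒ 0, since (bitwise-not -1) is 0
(bitwise-length -256)   ; ⇒ 8, since (bitwise-not -256) is 255
```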

 -- Procedure: bitwise-first-bit-set i
     Returns the index of the least significant 1 bit in the two’s
     complement representation of i.  If i is 0, then -1 is returned.
          (bitwise-first-bit-set 0) ⇒ -1
          (bitwise-first-bit-set 1) ⇒ 0
          (bitwise-first-bit-set -4) ⇒ 2

 -- Procedure: bitwise-bit-set? i1 i2
     Returns ‘#t’ if the i2’th bit (where I2 must be non-negative) is 1
     in the two’s complement representation of I1, and ‘#f’ otherwise.
     This is the result of the following computation:
          (not (zero?
                 (bitwise-and
                   (bitwise-arithmetic-shift-left 1 i2)
                   i1)))

 -- Procedure: bitwise-copy-bit i bitno replacement-bit
     Returns the result of replacing the BITNO’th bit of I by
     REPLACEMENT-BIT, where BITNO must be non-negative, and
     REPLACEMENT-BIT must be either 0 or 1.  This is the result of the
     following computation:
          (let* ((mask (bitwise-arithmetic-shift-left 1 bitno)))
            (bitwise-if mask
                      (bitwise-arithmetic-shift-left replacement-bit bitno)
                      i))

 -- Procedure: bitwise-bit-field n start end
     Returns the integer formed from the (unsigned) bit-field starting
     at START and ending just before END.  Same as:
          (let ((mask
                 (bitwise-not
                  (bitwise-arithmetic-shift-left -1 END))))
            (bitwise-arithmetic-shift-right
              (bitwise-and N mask)
              START))
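
     Some sample values, worked out from the computation above:

```scheme
(bitwise-bit-field #b110110 1 4)   ; ⇒ 3, bits 1 to 3 are #b011
(bitwise-bit-field #b110110 0 3)   ; ⇒ 6, bits 0 to 2 are #b110
(bitwise-bit-field -1 4 8)         ; ⇒ 15, four 1 bits
```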

 -- Procedure: bitwise-copy-bit-field to start end from
     Returns the result of replacing in TO the bits at positions from
     START (inclusive) to END (exclusive) by the bits in FROM from
     position 0 (inclusive) to position END - START (exclusive).  Both
     START and END must be non-negative, and START must be less than
     or equal to END.

     This is the result of the following computation:
          (let* ((mask1
                   (bitwise-arithmetic-shift-left -1 start))
                 (mask2
                   (bitwise-not
                     (bitwise-arithmetic-shift-left -1 end)))
                 (mask (bitwise-and mask1 mask2)))
            (bitwise-if mask
                        (bitwise-arithmetic-shift-left from
                                                       start)
                        to))
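
     Two sample values, worked out from the computation above:

```scheme
;; Clear bits 1 and 2 of #b1111 by copying in two 0 bits.
(bitwise-copy-bit-field #b1111 1 3 #b00)    ; ⇒ 9, i.e. #b1001

;; Place #b1010 into bits 4 to 7 of 0.
(bitwise-copy-bit-field 0 4 8 #b1010)       ; ⇒ 160, i.e. #b10100000
```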

 -- Procedure: bitwise-arithmetic-shift i j
     Shifts I by J.  It is a “left” shift if ‘J>0’, and a “right” shift
     if ‘J<0’.  The result is equal to ‘(floor (* I (expt 2 J)))’.

     Examples:
          (bitwise-arithmetic-shift -6 -1) ⇒ -3
          (bitwise-arithmetic-shift -5 -1) ⇒ -3
          (bitwise-arithmetic-shift -4 -1) ⇒ -2
          (bitwise-arithmetic-shift -3 -1) ⇒ -2
          (bitwise-arithmetic-shift -2 -1) ⇒ -1
          (bitwise-arithmetic-shift -1 -1) ⇒ -1

 -- Procedure: bitwise-arithmetic-shift-left i amount
 -- Procedure: bitwise-arithmetic-shift-right i amount
     The AMOUNT must be non-negative.  The
     ‘bitwise-arithmetic-shift-left’ procedure returns the same result as
     ‘bitwise-arithmetic-shift’, and ‘(bitwise-arithmetic-shift-right I
     AMOUNT)’ returns the same result as ‘(bitwise-arithmetic-shift I (-
     AMOUNT))’.

     If I is a primitive integer type, then AMOUNT must be less than the
     number of bits in the promoted type of I (32 or 64).  If the type
     is unsigned, an unsigned (logical) shift is done for
     ‘bitwise-arithmetic-shift-right’, rather than a signed (arithmetic)
     shift.

 -- Procedure: bitwise-rotate-bit-field n start end count
     Returns the result of cyclically permuting in N the bits at
     positions from START (inclusive) to END (exclusive) by COUNT bits
     towards the more significant bits.  START and END must be
     non-negative, and START must be less than or equal to END.  This is
     the result of the following computation:
          (let* ((width (- end start)))
            (if (positive? width)
                (let* ((count (mod count width))
                       (field0
                         (bitwise-bit-field n start end))
                       (field1 (bitwise-arithmetic-shift-left
                                 field0 count))
                       (field2 (bitwise-arithmetic-shift-right
                                 field0
                                 (- width count)))
                       (field (bitwise-ior field1 field2)))
                  (bitwise-copy-bit-field n start end field))
                n))
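
     Two sample values, worked out from the computation above.  The
     field in bits 2 to 5 of ‘#b110100’ is ‘#b1101’; rotating it by one
     position towards the more significant bits gives ‘#b1011’.

```scheme
(bitwise-rotate-bit-field #b110100 2 6 1)   ; ⇒ 44, i.e. #b101100

;; Rotating by the field width leaves the number unchanged.
(bitwise-rotate-bit-field #b110100 2 6 4)   ; ⇒ 52, i.e. #b110100
```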

 -- Procedure: bitwise-reverse-bit-field i start end
     Returns the result obtained from I by reversing the order of the
     bits at positions from START (inclusive) to END (exclusive), where
     START and END must be non-negative, and START must be less than or
     equal to END.
          (bitwise-reverse-bit-field #b1010010 1 4) ⇒  88 ; #b1011000

 -- Procedure: logop op x y
     Perform one of the 16 bitwise operations of X and Y, depending on
     OP.

 -- Procedure: logtest i j
     Returns true if the arguments have any bits in common.  Same as
     ‘(not (zero? (bitwise-and I J)))’, but is more efficient.

12.6.1 SRFI-60 Logical Number Operations
----------------------------------------

Kawa supports SRFI-60 “Integers as Bits” as well, although we generally
recommend using the R6RS-compatible functions instead when possible.
Unless noted as being a builtin function, to use these you must first
‘(require 'srfi-60)’ or ‘(import (srfi :60))’ (or ‘(import (srfi :60
integer-bits))’).

 -- Procedure: logand i ...
     Equivalent to ‘(bitwise-and I ...)’.  Builtin.

 -- Procedure: logior i ...
     Equivalent to ‘(bitwise-ior I ...)’.  Builtin.

 -- Procedure: logxor i ...
     Equivalent to ‘(bitwise-xor I ...)’.  Builtin.

 -- Procedure: lognot i
     Equivalent to ‘(bitwise-not I)’.  Builtin.

 -- Procedure: bitwise-merge mask i j
     Equivalent to ‘(bitwise-if MASK I J)’.

 -- Procedure: any-bits-set? i j
     Equivalent to ‘(logtest I J)’.

 -- Procedure: logcount i
 -- Procedure: bit-count i
     Count the number of 1-bits in I, if it is non-negative.  If I is
     negative, count number of 0-bits.  Same as ‘(bitwise-bit-count I)’
     if I is non-negative.  Builtin as ‘logcount’.

 -- Procedure: integer-length i
     Equivalent to ‘(bitwise-length I)’.  Builtin.

 -- Procedure: log2-binary-factors i
 -- Procedure: first-set-bit i
     Equivalent to ‘(bitwise-first-bit-set I)’.

 -- Procedure: logbit? pos i
 -- Procedure: bit-set? pos i
     Equivalent to ‘(bitwise-bit-set? I POS)’.

 -- Procedure: copy-bit bitno i bool
     Equivalent to ‘(bitwise-copy-bit I BITNO (if BOOL 1 0))’.

 -- Procedure: bit-field n start end
     Equivalent to ‘(bitwise-bit-field N START END)’.

 -- Procedure: copy-bit-field to from start end
     Equivalent to ‘(bitwise-copy-bit-field TO START END FROM)’.

 -- Procedure: arithmetic-shift i j
     Equivalent to ‘(bitwise-arithmetic-shift I J)’.  Builtin.

 -- Procedure: ash i j
     Alias for ‘arithmetic-shift’.  Builtin.

 -- Procedure: rotate-bit-field n count start end
     Equivalent to ‘(bitwise-rotate-bit-field N START END COUNT)’.

 -- Procedure: reverse-bit-field i start end
     Equivalent to ‘(bitwise-reverse-bit-field I START END)’.

 -- Procedure: integer->list K [LENGTH]
 -- Procedure: list->integer LIST
     The ‘integer->list’ procedure returns a list of LENGTH booleans
     corresponding to the bits of the non-negative integer K, with ‘#t’
     for ‘1’ and ‘#f’ for ‘0’.  LENGTH defaults to ‘(bitwise-length K)’.
     The list will be in order from MSB to LSB, with the value of ‘(odd?
     K)’ in the last car.

     The ‘list->integer’ procedure returns the integer corresponding to
     the booleans in the list LIST.  The ‘integer->list’ and
     ‘list->integer’ procedures are inverses so far as ‘equal?’ is
     concerned.

 -- Procedure: booleans->integer bool1 ...
     Returns the integer coded by the BOOL1 ...  arguments.  Equivalent
     to ‘(list->integer (list BOOL1 ...))’.
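
     A short sketch of these conversions, assuming the SRFI-60 library
     has been loaded as described at the start of this section:

```scheme
(require 'srfi-60)

(integer->list 6)              ; ⇒ (#t #t #f)
(integer->list 6 5)            ; ⇒ (#f #f #t #t #f)
(list->integer '(#t #t #f))    ; ⇒ 6
(booleans->integer #t #f #t)   ; ⇒ 5
```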

12.6.2 Deprecated Logical Number Operations
-------------------------------------------

This older function is still available, but we recommend using the
R6RS-compatible function.

 -- Procedure: bit-extract n start end
     Equivalent to ‘(bitwise-bit-field N START END)’.


File: kawa.info,  Node: Performance of numeric operations,  Prev: Logical Number Operations,  Up: Numbers

12.7 Performance of numeric operations
======================================

Kawa can generally do a pretty good job of generating efficient code for
numeric operations, at least when it knows or can figure out the types
of the operands.

   The basic operations ‘+’, ‘-’, and ‘*’ are compiled to
single-instruction bytecode if both operands are ‘int’ or ‘long’.
Likewise, if both operands are floating-point (or one is floating-point
and the other is rational), then single-instruction ‘double’ or ‘float’
instructions are emitted.

   A binary operation involving an infinite-precision ‘integer’ and a
fixed-size ‘int’ or ‘long’ is normally evaluated by expanding the latter
to ‘integer’ and using ‘integer’ arithmetic.  An exception is an integer
literal whose value fits in an ‘int’ or ‘long’ - in that case the
operation is done using ‘int’ or ‘long’ arithmetic.

   In general, integer literals have amorphous type.  When used to infer
the type of a variable, they have ‘integer’ type:
     (let ((v1 0))
       ... v1 has type integer ... )
   However, a literal whose value fits in the ‘int’ or ‘long’ range is
implicitly viewed as ‘int’ or ‘long’ in certain contexts, primarily
method overload resolution and binary arithmetic (as mentioned above).
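
   As a sketch of how a declaration changes this (the variable names
are only illustrative), an explicit type specifier on a binding
overrides the inferred ‘integer’ type:
     (let ((v1 0)          ; inferred type: integer
           (v2 ::int 0)    ; declared type: int
           (v3 ::long 0))  ; declared type: long
       (+ v2 1))           ; both operands can be treated as int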

   The comparison functions ‘<’, ‘<=’, ‘=’, ‘>’, and ‘>=’ are also
optimized to single-instruction operations if the operands have
appropriate type.  However, the functions ‘zero?’, ‘positive?’, and
‘negative?’ have not yet been optimized.  Instead of ‘(positive? x)’
write ‘(> x 0)’.

   There are a number of integer division and modulo operations.  If the
operands are ‘int’ or ‘long’, it is faster to use ‘quotient’ and
‘remainder’ rather than ‘div’ and ‘mod’ (or ‘modulo’).  If you know the
first operand is non-negative and the second is positive, the two
families give the same results, so use ‘quotient’ and ‘remainder’.  (If
an operand is an arbitrary-precision ‘integer’, then it doesn’t really
matter.)
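
   The two families differ only when an operand is negative.  The
following values follow from the R6RS and R7RS definitions of these
procedures:
     (quotient 7 2)     ⇒ 3
     (div 7 2)          ⇒ 3
     (quotient -7 2)    ⇒ -3
     (remainder -7 2)   ⇒ -1
     (div -7 2)         ⇒ -4
     (mod -7 2)         ⇒ 1
     (modulo -7 2)      ⇒ 1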

   The logical operations ‘bitwise-and’, ‘bitwise-ior’, ‘bitwise-xor’,
‘bitwise-not’, ‘bitwise-arithmetic-shift-left’, and
‘bitwise-arithmetic-shift-right’ are compiled to single bytecode
instructions if the operands are ‘int’ or ‘long’.  Avoid
‘bitwise-arithmetic-shift’ if the sign of the shift is known; use the
‘-left’ or ‘-right’ variant instead.  If the operands are
arbitrary-precision ‘integer’, a library call is needed, but run-time
type dispatch is avoided.
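
   As a small illustration (the variable ‘x’ is assumed to be declared
as an ‘int’), each general form below can be replaced by the directional
form that follows it, which yields the same value:
     (bitwise-arithmetic-shift x 4)         ; general form
     (bitwise-arithmetic-shift-left x 4)    ; shift direction is explicit
     (bitwise-arithmetic-shift x -4)        ; general form
     (bitwise-arithmetic-shift-right x 4)   ; shift direction is explicit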


File: kawa.info,  Node: Characters and text,  Next: Data structures,  Prev: Numbers,  Up: Top

13 Characters and text
**********************

* Menu:

* Characters::
* Character sets::
* Strings::
* String literals::
* Unicode::              Unicode character classes and conversions
* Regular expressions::


File: kawa.info,  Node: Characters,  Next: Character sets,  Up: Characters and text

13.1 Characters
===============

Characters are objects that represent human-readable characters such as
letters and digits.  More precisely, a character represents a Unicode
scalar value (http://www.unicode.org/glossary/#unicode_scalar_value).
Each character has an integer value in the range ‘0’ to ‘#x10FFFF’
(excluding the range ‘#xD800’ to ‘#xDFFF’ used for Surrogate Code Points
(http://www.unicode.org/glossary/#surrogate_code_point)).

     _Note:_ Unicode distinguishes between glyphs, which are printed for
     humans to read, and characters, which are abstract entities that
     map to glyphs (sometimes in a way that’s sensitive to surrounding
     characters).  Furthermore, different sequences of scalar values
     sometimes correspond to the same character.  The relationships
     among scalar values, characters, and glyphs are subtle and complex.

     Despite this complexity, most things that a literate human would
     call a “character” can be represented by a single Unicode scalar
     value (although several sequences of Unicode scalar values may
     represent that same character).  For example, Roman letters,
     Cyrillic letters, Hebrew consonants, and most Chinese characters
     fall into this category.

     Unicode scalar values exclude the range ‘#xD800’ to ‘#xDFFF’, which
     are part of the range of Unicode “code points”.  However, the
     Unicode code points in this range, the so-called “surrogates”, are
     an artifact of the UTF-16 encoding, and can only appear in specific
     Unicode encodings, and even then only in pairs that encode scalar
     values.  Consequently, all characters represent code points, but
     the surrogate code points do not have representations as
     characters.

 -- Type: character
     A Unicode code point - normally a Unicode scalar value, but could
     be a surrogate.  This is implemented using a 32-bit ‘int’.  When an
     object is needed (i.e.  the “boxed” representation), it is
     implemented as an instance of ‘gnu.text.Char’.

 -- Type: character-or-eof
     A ‘character’ or the special ‘#!eof’ value (used to indicate
     end-of-file when reading from a port).  This is implemented using a
     32-bit ‘int’, where the value -1 indicates end-of-file.  When an
     object is needed, it is implemented as an instance of
     ‘gnu.text.Char’ or the special ‘#!eof’ object.

 -- Type: char
     A UTF-16 code unit.  Same as Java primitive ‘char’ type.
     Considered to be a sub-type of ‘character’.  When an object is
     needed, it is implemented as an instance of ‘java.lang.Character’.
     Note the unfortunate inconsistency (for historical reasons) of
     ‘char’ boxed as ‘Character’ vs ‘character’ boxed as ‘Char’.

   Characters are written using the notation ‘#\’CHARACTER (which stands
for the given CHARACTER); ‘#\x’HEX-SCALAR-VALUE (the character whose
scalar value is the given hex integer); or ‘#\’CHARACTER-NAME (a
character with a given name):

     CHARACTER ::= ‘#\’ANY-CHARACTER
             | ‘#\’ CHARACTER-NAME
             | ‘#\x’ HEX-SCALAR-VALUE
             | ‘#\X’ HEX-SCALAR-VALUE

   The following CHARACTER-NAME forms are recognized:
‘#\alarm’
     ‘#\x0007’ - the alarm (bell) character
‘#\backspace’
     ‘#\x0008’
‘#\delete’
‘#\del’
‘#\rubout’
     ‘#\x007f’ - the delete or rubout character
‘#\escape’
‘#\esc’
     ‘#\x001b’
‘#\newline’
‘#\linefeed’
     ‘#\x000a’ - the linefeed character
‘#\null’
‘#\nul’
     ‘#\x0000’ - the null character
‘#\page’
     ‘#\x000c’ - the formfeed character
‘#\return’
     ‘#\x000d’ - the carriage return character
‘#\space’
     ‘#\x0020’ - the preferred way to write a space
‘#\tab’
     ‘#\x0009’ - the tab character
‘#\vtab’
     ‘#\x000b’ - the vertical tabulation character
‘#\ignorable-char’
     A special ‘character’ value, but it is not a Unicode code point.
     It is a special value returned when an index refers to the second
     ‘char’ (code unit) of a surrogate pair, and which should be
     ignored.  (When writing a ‘character’ to a string or file, it will
     be written as one or two ‘char’ values.  The exception is
     ‘#\ignorable-char’, for which zero ‘char’ values are written.)
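
   A few checks that follow from the standard scalar values of these
characters (a brief illustration):
     (char->integer #\newline)    ⇒ 10
     (char->integer #\x41)        ⇒ 65
     (char=? #\nul #\null)        ⇒ #t
     (char=? #\space #\x20)       ⇒ #t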

 -- Procedure: char? OBJ
     Return ‘#t’ if OBJ is a character, ‘#f’ otherwise.  (The OBJ can be
     any character, not just a 16-bit ‘char’.)

 -- Procedure: char->integer CHAR
 -- Procedure: integer->char SV
     SV should be a Unicode scalar value, i.e., a non-negative exact
     integer object in ‘[0, #xD7FF] union [#xE000, #x10FFFF]’.  (Kawa
     also allows values in the surrogate range.)

     Given a character, ‘char->integer’ returns its Unicode scalar value
     as an exact integer object.  For a Unicode scalar value SV,
     ‘integer->char’ returns its associated character.

          (integer->char 32)                     ⇒ #\space
          (char->integer (integer->char 5000))   ⇒ 5000
          (integer->char #\xD800)                ⇒ throws ClassCastException

     _Performance note:_ A call to ‘char->integer’ is compiled as
     casting the argument to a ‘character’, and then re-interpreting
     that value as an ‘int’.  A call to ‘integer->char’ is compiled as
     casting the argument to an ‘int’, and then re-interpreting that
     value as a ‘character’.  If the argument is the right type, no
     code is emitted: the value is just re-interpreted as the result
     type.

 -- Procedure: char=? CHAR1 CHAR2 CHAR3 ...
 -- Procedure: char<? CHAR1 CHAR2 CHAR3 ...
 -- Procedure: char>? CHAR1 CHAR2 CHAR3 ...
 -- Procedure: char<=? CHAR1 CHAR2 CHAR3 ...
 -- Procedure: char>=? CHAR1 CHAR2 CHAR3 ...
     These procedures impose a total ordering on the set of characters
     according to their Unicode scalar values.

          (char<? #\z #\ß)      ⇒ #t
          (char<? #\z #\Z)      ⇒ #f

     _Performance note:_ This is compiled as if converting each argument
     using ‘char->integer’ (which requires no code) and then using the
     corresponding ‘int’ comparison.

 -- Procedure: digit-value char
     This procedure returns the numeric value (0 to 9) of its argument
     if it is a numeric digit (that is, if ‘char-numeric?’ returns
     ‘#t’), or ‘#f’ on any other character.

          (digit-value #\3)        ⇒ 3
          (digit-value #\x0664)    ⇒ 4
          (digit-value #\x0AE6)    ⇒ 0
          (digit-value #\x0EA6)    ⇒ #f


File: kawa.info,  Node: Character sets,  Next: Strings,  Prev: Characters,  Up: Characters and text

13.2 Character sets
===================

Sets of characters are useful for text-processing code, including
parsing, lexing, and pattern-matching.  SRFI 14
(http://srfi.schemers.org/srfi-14/srfi-14.html) specifies a ‘char-set’
type for such uses.  Some examples:

     (import (srfi :14 char-sets))
     (define vowel (char-set #\a #\e #\i #\o #\u))
     (define vowely (char-set-adjoin vowel #\y))
     (char-set-contains? vowel #\y) ⇒  #f
     (char-set-contains? vowely #\y) ⇒  #t

   See the SRFI 14 specification
(http://srfi.schemers.org/srfi-14/srfi-14.html) for details.

 -- Type: char-set
     The type of character sets.  In Kawa ‘char-set’ is a type that can
     be used in type specifiers:
          (define vowely ::char-set (char-set-adjoin vowel #\y))

   Kawa uses inversion lists
(https://en.wikipedia.org/wiki/Inversion_list) for an efficient
implementation, using Java ‘int’ arrays to represent character ranges
(inversions).  The ‘char-set-contains?’ function uses binary search, so
it takes time proportional to the logarithm of the number of inversions.
Other operations may take time proportional to the number of inversions.


File: kawa.info,  Node: Strings,  Next: String literals,  Prev: Character sets,  Up: Characters and text

13.3 Strings
============

Strings are sequences of characters.  The _length_ of a string is the
number of characters that it contains, as an exact non-negative integer.
The _valid indices_ of a string are the exact non-negative integers less
than the length of the string.  The first character of a string has
index 0, the second has index 1, and so on.

   Strings are _implemented_ as a sequence of 16-bit ‘char’ values, even
though they’re semantically a sequence of 32-bit Unicode code points.  A
character whose value is greater than ‘#xffff’ is represented using two
surrogate characters.  The implementation allows for natural
interoperability with Java APIs.  However it does make certain
operations (indexing or counting based on character counts) difficult to
implement efficiently.  Luckily one rarely needs to index or count based
on character counts; alternatives are discussed below.

   There are different kinds of strings:
   • An “istring” is _immutable_: It is fixed, and cannot be modified.
     On the other hand, indexing (e.g.  ‘string-ref’) is efficient
     (constant-time), while indexing of other string implementations
     takes time proportional to the index.

     String literals are istrings, as are the return values of most of
     the procedures in this chapter.

     An “istring” is an instance of the ‘gnu.lists.IString’ class.

   • An “mstring” is _mutable_: You can replace individual characters
     (using ‘string-set!’).  You can also change the mstring’s length by
     inserting or removing characters (using ‘string-append!’ or
     ‘string-replace!’).

     An “mstring” is an instance of the ‘gnu.lists.FString’ class.

   • Any other object that implements the ‘java.lang.CharSequence’
     interface is also a string.  This includes standard Java
     ‘java.lang.String’ and ‘java.lang.StringBuilder’ objects.
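
   The following sketch shows how these kinds can be told apart.  It
assumes the Kawa convention of calling a class name as a constructor;
the variable names are only illustrative:
     (define lit "abc")                            ; a literal: an istring
     (define sb (java.lang.StringBuilder "abc"))   ; a Java CharSequence
     (istring? lit)   ⇒ #t
     (string? sb)     ⇒ #t
     (istring? sb)    ⇒ #f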

   Some of the procedures that operate on strings ignore the difference
between upper and lower case.  The names of the versions that ignore
case end with “‘-ci’” (for “case insensitive”).

   _Compatibility:_ Many of the following procedures (for example
‘string-append’) return an immutable istring in Kawa, but return a
“freshly allocated” mutable string in standard Scheme (including R7RS)
as well as most Scheme implementations (including previous versions of
Kawa).  To get the “compatibility mode” versions of those procedures
(which return mstrings), invoke Kawa with one of the ‘--r5rs’,
‘--r6rs’, or ‘--r7rs’ options, or ‘import’ a standard library like
‘(scheme base)’.

 -- Type: string
     The type of string objects.  The underlying type is the interface
     ‘java.lang.CharSequence’.  Immutable strings are
     ‘gnu.lists.IString’ or ‘java.lang.String’, while mutable strings
     are ‘gnu.lists.FString’.

13.3.1 Basic string procedures
------------------------------

 -- Procedure: string? OBJ
     Return ‘#t’ if OBJ is a string, ‘#f’ otherwise.

 -- Procedure: istring? OBJ
     Return ‘#t’ if OBJ is an istring (an immutable,
     constant-time-indexable string); ‘#f’ otherwise.

 -- Constructor: string CHAR ...
     Return a string composed of the arguments.  This is analogous to
     LIST.

     _Compatibility:_ The result is an istring, except in compatibility
     mode, when it is a newly allocated mstring.

 -- Procedure: string-length STRING
     Return the number of characters in the given STRING as an exact
     integer object.

     _Performance note:_ If the STRING is not an istring, then calling
     ‘string-length’ may take time proportional to the length of the
     STRING, because of the need to scan for surrogate pairs.

 -- Procedure: string-ref STRING K
     K must be a valid index of STRING.  The ‘string-ref’ procedure
     returns character K of STRING using zero-origin indexing.

     _Performance note:_ If the STRING is not an istring, then calling
     ‘string-ref’ may take time proportional to K because of the need to
     check for surrogate pairs.  An alternative is to use
     ‘string-cursor-ref’.  If iterating through a string, use
     ‘string-for-each’.
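
     For instance, a character count can be written without any
     indexing (a minimal sketch; ‘count-spaces’ is a hypothetical
     helper, not part of Kawa):
          (define (count-spaces str)
            (let ((n 0))
              ;; Visit each character once, in order.
              (string-for-each
               (lambda (ch) (if (char=? ch #\space) (set! n (+ n 1))))
               str)
              n))
          (count-spaces "a b c")   ⇒ 2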

 -- Procedure: string-null? STRING
     Is STRING the empty string?  Same result as ‘(= (string-length
     STRING) 0)’ but executes in O(1) time.

 -- Procedure: string-every pred string [start end]
 -- Procedure: string-any pred string [start end]

     Checks to see if every/any character in STRING satisfies PRED,
     proceeding from left (index START) to right (index END).  These
     procedures are short-circuiting: if PRED returns false,
     ‘string-every’ does not call PRED on subsequent characters; if PRED
     returns true, ‘string-any’ does not call PRED on subsequent
     characters.  Both procedures are “witness-generating”:
        • If ‘string-every’ is given an empty interval (with START =
          END), it returns ‘#t’.
        • If ‘string-every’ returns true for a non-empty interval (with
          START < END), the returned true value is the one returned by
          the final call to the predicate on ‘(string-ref STRING (- END
          1))’.
        • If ‘string-any’ returns true, the returned true value is the
          one returned by the predicate.

     _Note:_ The names of these procedures do not end with a question
     mark.  This indicates a general value is returned instead of a
     simple boolean (‘#t’ or ‘#f’).
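
     Some examples (the last one shows a witness value, rather than
     ‘#t’, being returned):
          (string-every char-numeric? "12345")     ⇒ #t
          (string-every char-numeric? "12a45")     ⇒ #f
          (string-every char-numeric? "")          ⇒ #t
          (string-any char-numeric? "abc")         ⇒ #f
          (string-any digit-value "ab7c")          ⇒ 7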

13.3.2 Immutable String Constructors
------------------------------------

 -- Procedure: string-tabulate proc len

     Constructs a string of size LEN by calling PROC on each value from
     0 (inclusive) to LEN (exclusive) to produce the corresponding
     element of the string.  The procedure PROC accepts an exact integer
     as its argument and returns a character.  The order in which PROC
     is called on those indexes is not specified.

     _Rationale:_ Although ‘string-unfold’ is more general,
     ‘string-tabulate’ is likely to run faster for the common special
     case it implements.
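
     For example, the first five lower-case letters can be generated
     from their scalar values (97 is the scalar value of ‘#\a’):
          (string-tabulate (lambda (i) (integer->char (+ i 97))) 5)
              ⇒ "abcde"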

 -- Procedure: string-unfold stop? mapper successor seed [base
          make-final]
 -- Procedure: string-unfold-right stop? mapper successor seed [base
          make-final]
     This is a fundamental and powerful constructor for strings.
        • SUCCESSOR is used to generate a series of “seed” values from
          the initial seed: SEED, ‘(’SUCCESSOR SEED‘)’, ‘(’SUCCESSOR^{2}
          SEED‘)’, ‘(’SUCCESSOR^{3} SEED‘)’, ...
        • STOP? tells us when to stop — when it returns true when
          applied to one of these seed values.
        • MAPPER maps each seed value to the corresponding character(s)
          in the result string, which are assembled into that string in
          left-to-right order.  It is an error for MAPPER to return
          anything other than a character or string.
        • BASE is the optional initial/leftmost portion of the
          constructed string, which defaults to the empty string ‘""’.
          It is an error if BASE is anything other than a character or
          string.
        • MAKE-FINAL is applied to the terminal seed value (on which
          STOP? returns true) to produce the final/rightmost portion of
          the constructed string.  It defaults to ‘(lambda (x) "")’.  It
          is an error for MAKE-FINAL to return anything other than a
          character or string.

     ‘string-unfold-right’ is the same as ‘string-unfold’ except the
     results of MAPPER are assembled into the string in right-to-left
     order, BASE is the optional rightmost portion of the constructed
     string, and MAKE-FINAL produces the leftmost portion of the
     constructed string.

     You can use ‘string-unfold’ to convert a list to a string, read
     a port into a string, reverse a string, copy a string, and so
     forth.  Examples:
          (define (port->string p)
            (string-unfold eof-object? values
                           (lambda (x) (read-char p))
                           (read-char p)))

          (define (list->string lis)
            (string-unfold null? car cdr lis))

          (define (string-tabulate f size)
            (string-unfold (lambda (i) (= i size)) f
                           (lambda (i) (+ i 1)) 0))
     To map F over a list LIS, producing a string:
          (string-unfold null? (compose F car) cdr LIS)

     Interested functional programmers may enjoy noting that
     ‘string-fold-right’ and ‘string-unfold’ are in some sense inverses.
     That is, given operations KNULL?, KAR, KDR, KONS, and KNIL
     satisfying
          (KONS (KAR x) (KDR x)) = x  and  (KNULL? KNIL) = #t
     then
          (string-fold-right KONS KNIL (string-unfold KNULL? KAR KDR X)) = X
     and
          (string-unfold KNULL? KAR KDR (string-fold-right KONS KNIL STRING)) = STRING.

     This combinator pattern is sometimes called an “anamorphism.”

13.3.3 Selection
----------------

 -- Procedure: substring STRING START END
     STRING must be a string, and START and END must be exact integer
     objects satisfying:

          0 <= START <= END <= (string-length STRING)

     The ‘substring’ procedure returns a newly allocated string formed
     from the characters of STRING beginning with index START
     (inclusive) and ending with index END (exclusive).
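
     A few illustrative calls; note that START = END yields the empty
     string:
          (substring "hello world" 6 11)   ⇒ "world"
          (substring "hello world" 0 5)    ⇒ "hello"
          (substring "hello world" 3 3)    ⇒ ""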

 -- Procedure: string-take string nchars
 -- Procedure: string-drop string nchars
 -- Procedure: string-take-right string nchars
 -- Procedure: string-drop-right string nchars
     ‘string-take’ returns an immutable string containing the first
     NCHARS of STRING; ‘string-drop’ returns a string containing all but
     the first NCHARS of STRING.  ‘string-take-right’ returns a string
     containing the last NCHARS of STRING; ‘string-drop-right’ returns a
     string containing all but the last NCHARS of STRING.

          (string-take "Pete Szilagyi" 6) ⇒ "Pete S"
          (string-drop "Pete Szilagyi" 6) ⇒ "zilagyi"

          (string-take-right "Beta rules" 5) ⇒ "rules"
          (string-drop-right "Beta rules" 5) ⇒ "Beta "
     It is an error to take or drop more characters than are in the
     string:
          (string-take "foo" 37) ⇒ error

 -- Procedure: string-pad string len [char start end]
 -- Procedure: string-pad-right string len [char start end]
     Returns an istring of length LEN comprised of the characters drawn
     from the given subrange of STRING, padded on the left (right) by as
     many occurrences of the character CHAR as needed.  If STRING has
     more than LEN chars, it is truncated on the left (right) to length
     LEN.  The CHAR defaults to ‘#\space’.
          (string-pad     "325" 5) ⇒ "  325"
          (string-pad   "71325" 5) ⇒ "71325"
          (string-pad "8871325" 5) ⇒ "71325"
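
     The right-hand variant and an explicit padding character behave
     analogously, following the definition above:
          (string-pad-right "325" 5)      ⇒ "325  "
          (string-pad-right "8871325" 5)  ⇒ "88713"
          (string-pad "325" 5 #\0)        ⇒ "00325"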

 -- Procedure: string-trim string [pred start end]
 -- Procedure: string-trim-right string [pred start end]
 -- Procedure: string-trim-both string [pred start end]
     Returns an istring obtained from the given subrange of STRING by
     skipping over all characters on the left / on the right / on both
     sides that satisfy the second argument PRED: PRED defaults to
     ‘char-whitespace?’.
          (string-trim-both "  The outlook wasn't brilliant,  \n\r")
              ⇒ "The outlook wasn't brilliant,"

13.3.4 String Comparisons
-------------------------

 -- Procedure: string=? STRING1 STRING2 STRING3 ...
     Return ‘#t’ if the strings are the same length and contain the same
     characters in the same positions.  Otherwise, the ‘string=?’
     procedure returns ‘#f’.

          (string=? "Straße" "Strasse")    ⇒ #f

 -- Procedure: string<? STRING1 STRING2 STRING3 ...
 -- Procedure: string>? STRING1 STRING2 STRING3 ...
 -- Procedure: string<=? STRING1 STRING2 STRING3 ...
 -- Procedure: string>=? STRING1 STRING2 STRING3 ...
     These procedures return ‘#t’ if their arguments are (respectively):
     monotonically increasing, monotonically decreasing, monotonically
     non-decreasing, or monotonically non-increasing.  These predicates
     are required to be transitive.

     These procedures are the lexicographic extensions to strings of the
     corresponding orderings on characters.  For example, ‘string<?’ is
     the lexicographic ordering on strings induced by the ordering
     ‘char<?’ on characters.  If two strings differ in length but are
     the same up to the length of the shorter string, the shorter string
     is considered to be lexicographically less than the longer string.

          (string<? "z" "ß")      ⇒ #t
          (string<? "z" "zz")     ⇒ #t
          (string<? "z" "Z")      ⇒ #f

 -- Procedure: string-ci=? STRING1 STRING2 STRING3 ...
 -- Procedure: string-ci<? STRING1 STRING2 STRING3 ...
 -- Procedure: string-ci>? STRING1 STRING2 STRING3 ...
 -- Procedure: string-ci<=? STRING1 STRING2 STRING3 ...
 -- Procedure: string-ci>=? STRING1 STRING2 STRING3 ...
     These procedures are similar to ‘string=?’, etc., but behave as if
     they applied ‘string-foldcase’ to their arguments before invoking
     the corresponding procedures without ‘-ci’.

          (string-ci<? "z" "Z")                   ⇒ #f
          (string-ci=? "z" "Z")                   ⇒ #t
          (string-ci=? "Straße" "Strasse")        ⇒ #t
          (string-ci=? "Straße" "STRASSE")        ⇒ #t
          (string-ci=? "ΧΑΟΣ" "χαοσ")             ⇒ #t

13.3.5 Conversions
------------------

 -- Procedure: list->string LIST
     The ‘list->string’ procedure returns an istring formed from the
     characters in LIST, in order.  It is an error if any element of
     LIST is not a character.

     _Compatibility:_ The result is an istring, except in compatibility
     mode, when it is an mstring.

 -- Procedure: reverse-list->string LIST
     An efficient implementation of ‘(compose list->string reverse)’:
          (reverse-list->string '(#\a #\B #\c))  ⇒ "cBa"
     This is a common idiom in the epilogue of string-processing loops
     that accumulate their result using a list in reverse order.  (See
     also ‘string-concatenate-reverse’ for the “chunked” variant.)

 -- Procedure: string->list STRING [START [END]]
     The ‘string->list’ procedure returns a newly allocated list of the
     characters of STRING between START and END, in order.  The
     ‘string->list’ and ‘list->string’ procedures are inverses so far as
     ‘equal?’ is concerned.

 -- Procedure: vector->string vector [start [end]]
     The ‘vector->string’ procedure returns a newly allocated string of
     the objects contained in the elements of VECTOR between START and
     END.  It is an error if any element of VECTOR between START and END
     is not a character, or is a character forbidden in strings.
          (vector->string #(#\1 #\2 #\3))             ⇒ "123"
          (vector->string #(#\1 #\2 #\3 #\4 #\5) 2 4) ⇒ "34"

 -- Procedure: string->vector string [start [end]]
     The ‘string->vector’ procedure returns a newly created vector
     initialized to the elements of the string STRING between START and
     END.
          (string->vector "ABC")       ⇒ #(#\A #\B #\C)
          (string->vector "ABCDE" 1 3) ⇒ #(#\B #\C)

 -- Procedure: string-upcase STRING
 -- Procedure: string-downcase STRING
 -- Procedure: string-titlecase STRING
 -- Procedure: string-foldcase STRING
     These procedures take a string argument and return a string result.
     They are defined in terms of Unicode’s locale-independent case
     mappings from Unicode scalar-value sequences to scalar-value
     sequences.  In particular, the length of the result string can be
     different from the length of the input string.  When the specified
     result is equal in the sense of ‘string=?’ to the argument, these
     procedures may return the argument instead of a newly allocated
     string.

     The ‘string-upcase’ procedure converts a string to upper case;
     ‘string-downcase’ converts a string to lower case.  The
     ‘string-foldcase’ procedure converts the string to its case-folded
     counterpart, using the full case-folding mapping, but without the
     special mappings for Turkic languages.  The ‘string-titlecase’
     procedure converts the first cased character of each word to title
     case, and downcases all other cased characters.

          (string-upcase "Hi")              ⇒ "HI"
          (string-downcase "Hi")            ⇒ "hi"
          (string-foldcase "Hi")            ⇒ "hi"

          (string-upcase "Straße")          ⇒ "STRASSE"
          (string-downcase "Straße")        ⇒ "straße"
          (string-foldcase "Straße")        ⇒ "strasse"
          (string-downcase "STRASSE")       ⇒ "strasse"

          (string-downcase "Σ")             ⇒ "σ"
          ; Chi Alpha Omicron Sigma:
          (string-upcase "ΧΑΟΣ")            ⇒ "ΧΑΟΣ"
          (string-downcase "ΧΑΟΣ")          ⇒ "χαος"
          (string-downcase "ΧΑΟΣΣ")         ⇒ "χαοσς"
          (string-downcase "ΧΑΟΣ Σ")        ⇒ "χαος σ"
          (string-foldcase "ΧΑΟΣΣ")         ⇒ "χαοσσ"
          (string-upcase "χαος")            ⇒ "ΧΑΟΣ"
          (string-upcase "χαοσ")            ⇒ "ΧΑΟΣ"

          (string-titlecase "kNock KNoCK")  ⇒ "Knock Knock"
          (string-titlecase "who's there?") ⇒ "Who's There?"
          (string-titlecase "r6rs")         ⇒ "R6rs"
          (string-titlecase "R6RS")         ⇒ "R6rs"

     Since these procedures are locale-independent, they may not be
     appropriate for some locales.

     _Kawa Note:_ The implementation of ‘string-titlecase’ does not
     correctly handle the case where an initial character needs to be
     converted to multiple characters, such as “LATIN SMALL LIGATURE FL”
     which should be converted to the two letters ‘"Fl"’.

     _Compatibility:_ The result is an istring, except in compatibility
     mode, when it is an mstring.

 -- Procedure: string-normalize-nfd STRING
 -- Procedure: string-normalize-nfkd STRING
 -- Procedure: string-normalize-nfc STRING
 -- Procedure: string-normalize-nfkc STRING
     These procedures take a string argument and return a string result,
     which is the input string normalized to Unicode normalization form
     D, KD, C, or KC, respectively.  When the specified result is equal
     in the sense of ‘string=?’ to the argument, these procedures may
     return the argument instead of a newly allocated string.

          (string-normalize-nfd "\xE9;")          ⇒ "\x65;\x301;"
          (string-normalize-nfc "\xE9;")          ⇒ "\xE9;"
          (string-normalize-nfd "\x65;\x301;")    ⇒ "\x65;\x301;"
          (string-normalize-nfc "\x65;\x301;")    ⇒ "\xE9;"

13.3.6 Searching and matching
-----------------------------

 -- Procedure: string-prefix-length string_{1} string_{2} [start_{1}
          end_{1} start_{2} end_{2}]
 -- Procedure: string-suffix-length string_{1} string_{2} [start_{1}
          end_{1} start_{2} end_{2}]
     Return the length of the longest common prefix/suffix of STRING_{1}
     and STRING_{2}.  For prefixes, this is equivalent to their
     “mismatch index” (relative to the start indexes).

     The optional START/END indexes restrict the comparison to the
     indicated substrings of STRING_{1} and STRING_{2}.

 -- Procedure: string-prefix? string_{1} string_{2} [start_{1} end_{1}
          start_{2} end_{2}]
 -- Procedure: string-suffix? string_{1} string_{2} [start_{1} end_{1}
          start_{2} end_{2}]
     Is STRING_{1} a prefix/suffix of STRING_{2}?

     The optional START/END indexes restrict the comparison to the
     indicated substrings of STRING_{1} and STRING_{2}.
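
     Some illustrative calls, covering both the predicates and the
     length procedures described above:
          (string-prefix? "foo" "foobar")              ⇒ #t
          (string-suffix? "bar" "foobar")              ⇒ #t
          (string-prefix? "bar" "foobar")              ⇒ #f
          (string-prefix-length "foobar" "fooqux")     ⇒ 3
          (string-suffix-length "testing" "running")   ⇒ 3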

 -- Procedure: string-index string pred [start end]
 -- Procedure: string-index-right string pred [start end]
 -- Procedure: string-skip string pred [start end]
 -- Procedure: string-skip-right string pred [start end]

     ‘string-index’ searches through the given substring from the left,
     returning the index of the leftmost character satisfying the
     predicate PRED.  ‘string-index-right’ searches from the right,
     returning the index of the rightmost character satisfying the
     predicate PRED.  If no match is found, these procedures return
     ‘#f’.

     The START and END arguments specify the beginning and end of the
     search; the valid indexes relevant to the search include START but
     exclude END.  Beware of “fencepost” errors: when searching
     right-to-left, the first index considered is ‘(- END 1)’, whereas
     when searching left-to-right, the first index considered is START.
     That is, the start/end indexes describe the same half-open interval
     ‘[START,END)’ in these procedures that they do in other string
     procedures.

     The ‘-skip’ functions are similar, but use the complement of the
     criterion: they search for the first char that _doesn’t_ satisfy
     PRED.  To skip over initial whitespace, for example, say

          (substring string
                      (or (string-skip string char-whitespace?)
                          (string-length string))
                      (string-length string))

     These functions can be trivially composed with ‘string-take’ and
     ‘string-drop’ to produce ‘take-while’, ‘drop-while’, ‘span’, and
     ‘break’ procedures without loss of efficiency.
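
     For instance, a ‘take-while’ over strings might be sketched as
     follows (‘string-take-while’ is a hypothetical name, not a Kawa
     procedure):
          (define (string-take-while str pred)
            ;; string-skip finds the first character that fails PRED;
            ;; if every character satisfies PRED, keep the whole string.
            (string-take str (or (string-skip str pred)
                                 (string-length str))))
          (string-take-while "123abc" char-numeric?)  ⇒ "123"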

 -- Procedure: string-contains string_{1} string_{2} [start_{1} end_{1}
          start_{2} end_{2}]
 -- Procedure: string-contains-right string_{1} string_{2} [start_{1}
          end_{1} start_{2} end_{2}]
     Does the substring of STRING_{1} specified by START_{1} and END_{1}
     contain the sequence of characters given by the substring of
     STRING_{2} specified by START_{2} and END_{2}?

     Returns ‘#f’ if there is no match.  If START_{2} = END_{2},
     ‘string-contains’ returns START_{1} but ‘string-contains-right’
     returns END_{1}.  Otherwise returns the index in STRING_{1} for the
     first character of the first/last match; that index lies within the
     half-open interval [START_{1},END_{1}), and the match lies entirely
     within the [START_{1},END_{1}) range of STRING_{1}.

          (string-contains "eek -- what a geek." "ee" 12 18) ; Searches "a geek"
              ⇒ 15
     _Note:_ The names of these procedures do not end with a question
     mark.  This indicates a useful value is returned when there is a
     match.

13.3.7 Concatenation and replacing
----------------------------------

 -- Procedure: string-append STRING ...
     Returns a string whose characters form the concatenation of the
     given strings.

     _Compatibility:_ The result is an istring, except in compatibility
     mode, when it is an mstring.

 -- Procedure: string-concatenate string-list
     Concatenates the elements of STRING-LIST together into a single
     istring.

     _Rationale:_ Some implementations of Scheme limit the number of
     arguments that may be passed to an n-ary procedure, so the ‘(apply
     string-append STRING-LIST)’ idiom, which is otherwise equivalent to
     using this procedure, is not as portable.

 -- Procedure: string-concatenate-reverse string-list [final-string
          [end]]
     With no optional arguments, calling this procedure is equivalent to
     ‘(string-concatenate (reverse STRING-LIST))’.  If the optional
     argument FINAL-STRING is specified, it is effectively consed onto
     the beginning of STRING-LIST before performing the list-reverse and
     string-concatenate operations.

     If the optional argument END is given, only the characters up to
     but not including END in FINAL-STRING are added to the result, thus
     producing
          (string-concatenate
            (reverse (cons (substring final-string 0 end)
                           string-list)))
     For example:
          (string-concatenate-reverse '(" must be" "Hello, I") " going.XXXX" 7)
            ⇒ "Hello, I must be going."

     _Rationale:_ This procedure is useful when constructing procedures
     that accumulate character data into lists of string buffers, and
     wish to convert the accumulated data into a single string when
     done.  The optional END argument accommodates that use case when
     FINAL-STRING is a mutable string buffer that is only partially
     full, and is allowed (for uniformity) when FINAL-STRING is an
     immutable string.
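
     A sketch of that accumulation pattern (the procedure
     ‘number-lines’ is a hypothetical example, not part of Kawa): each
     piece is consed onto the front of a list, and the list is reversed
     and concatenated once at the end.

          (define (number-lines lines)
            (let loop ((ls lines) (n 1) (acc '()))
              (if (null? ls)
                  (string-concatenate-reverse acc)
                  (loop (cdr ls)
                        (+ n 1)
                        (cons (string-append (number->string n) ": "
                                             (car ls) "\n")
                              acc)))))

          (number-lines '("alpha" "beta"))  ; ⇒ "1: alpha\n2: beta\n"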

 -- Procedure: string-join string-list [delimiter [grammar]]
     This procedure is a simple unparser; it pastes strings together
     using the DELIMITER string, returning an istring.

     The STRING-LIST is a list of strings.  The DELIMITER is the string
     used to delimit elements; it defaults to a single space ‘" "’.

     The GRAMMAR argument is a symbol that determines how the DELIMITER
     is used, and defaults to ‘'infix’.  It is an error for GRAMMAR to
     be any symbol other than these four:
     ‘'infix’
          An infix or separator grammar: insert the delimiter between
          list elements.  An empty list will produce an empty string.
     ‘'strict-infix’
          Means the same as ‘'infix’ if the string-list is non-empty,
          but will signal an error if given an empty list.  (This avoids
          an ambiguity shown in the examples below.)
     ‘'suffix’
          Means a suffix or terminator grammar: insert the DELIMITER
          after every list element.
     ‘'prefix’
          Means a prefix grammar: insert the DELIMITER before every list
          element.

          (string-join '("foo" "bar" "baz"))
                   ⇒ "foo bar baz"
          (string-join '("foo" "bar" "baz") "")
                   ⇒ "foobarbaz"
          (string-join '("foo" "bar" "baz") ":")
                   ⇒ "foo:bar:baz"
          (string-join '("foo" "bar" "baz") ":" 'suffix)
                   ⇒ "foo:bar:baz:"

          ;; Infix grammar is ambiguous wrt empty list vs. empty string:
          (string-join '()   ":") ⇒ ""
          (string-join '("") ":") ⇒ ""

          ;; Suffix and prefix grammars are not:
          (string-join '()   ":" 'suffix) ⇒ ""
          (string-join '("") ":" 'suffix) ⇒ ":"

 -- Procedure: string-replace string_{1} string_{2} start_{1} end_{1}
          [start_{2} end_{2}]
     Returns
          (string-append (substring STRING_{1} 0 START_{1})
                         (substring STRING_{2} START_{2} END_{2})
                         (substring STRING_{1} END_{1} (string-length STRING_{1})))
     That is, the segment of characters in STRING_{1} from START_{1} to
     END_{1} is replaced by the segment of characters in STRING_{2} from
     START_{2} to END_{2}.  If START_{1}=END_{1}, this simply splices
     the characters drawn from STRING_{2} into STRING_{1} at that
     position.

     Examples:
          (string-replace "The TCL programmer endured daily ridicule."
                           "another miserable perl drone" 4 7 8 22)
              ⇒ "The miserable perl programmer endured daily ridicule."

          (string-replace "It's easy to code it up in Scheme." "lots of fun" 5 9)
              ⇒ "It's lots of fun to code it up in Scheme."

          (define (string-insert s i t) (string-replace s t i i))

          (string-insert "It's easy to code it up in Scheme." 5 "really ")
              ⇒ "It's really easy to code it up in Scheme."

          (define (string-set s i c) (string-replace s (string c) i (+ i 1)))

          (string-set "String-ref runs in O(n) time." 21 #\1)
              ⇒ "String-ref runs in O(1) time."

   Also see ‘string-append!’ and ‘string-replace!’ for destructive
changes to a mutable string.

13.3.8 Mapping and folding
--------------------------

 -- Procedure: string-fold kons knil string [start end]
 -- Procedure: string-fold-right kons knil string [start end]
     These are the fundamental iterators for strings.

     The ‘string-fold’ procedure maps the KONS procedure across the
     given STRING from left to right:
          (... (KONS STRING_{2} (KONS STRING_{1} (KONS STRING_{0} KNIL))))
     In other words, ‘string-fold’ obeys the (tail) recursion
            (string-fold KONS KNIL STRING START END)
          = (string-fold KONS (KONS STRING_{START} KNIL) STRING START+1 END)
     The ‘string-fold-right’ procedure maps KONS across the given string
     STRING from right to left:
          (KONS STRING_{0}
                (... (KONS STRING_{END-3}
                           (KONS STRING_{END-2}
                                 (KONS STRING_{END-1}
                                       KNIL)))))
     obeying the (tail) recursion
            (string-fold-right KONS KNIL STRING START END)
          = (string-fold-right KONS (KONS STRING_{END-1} KNIL) STRING START END-1)
     Examples:
          ;;; Convert a string to a list of chars.
          (string-fold-right cons '() string)

          ;;; Count the number of lower-case characters in a string.
          (string-fold (lambda (c count)
                          (if (char-lower-case? c)
                              (+ count 1)
                              count))
                        0
                        string)
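
          ;;; Folding with cons shows the visiting order (an illustration
          ;;; that follows from the expansions above).
          (string-fold cons '() "abc")        ; ⇒ (#\c #\b #\a)
          (string-fold-right cons '() "abc")  ; ⇒ (#\a #\b #\c)

          ;;; With START and END, only "bcd" is folded.
          (string-fold cons '() "abcde" 1 4)  ; ⇒ (#\d #\c #\b)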
     The string-fold-right combinator is sometimes called a
     "catamorphism."

 -- Procedure: string-for-each PROC STRING1 STRING2 ...
 -- Procedure: string-for-each PROC STRING1 [start [end]]
     The STRINGs must all have the same length.  PROC should accept as
     many arguments as there are STRINGs.

     The START-END variant is provided for compatibility with the
     SRFI-13 version.  (In that case START and END count Unicode
     scalar values (‘character’ values), not Java 16-bit ‘char’ values.)

     The ‘string-for-each’ procedure applies PROC element-wise to the
     characters of the STRINGs for its side effects, in order from the
     first characters to the last.  PROC is always called in the same
     dynamic environment as ‘string-for-each’ itself.

     Analogous to ‘for-each’.

          (let ((v '()))
            (string-for-each
              (lambda (c) (set! v (cons (char->integer c) v)))
              "abcde")
             v)
            ⇒ (101 100 99 98 97)

     _Performance note:_ The compiler generates efficient code for
     ‘string-for-each’.  If PROC is a lambda expression, it is inlined.

 -- Procedure: string-map PROC STRING1 STRING2 ...
     The ‘string-map’ procedure applies PROC element-wise to the
     elements of the strings and returns a string of the results, in
     order.  It is an error if PROC does not accept as many arguments as
     there are strings, or return other than a single character or a
     string.  If more than one string is given and not all strings have
     the same length, ‘string-map’ terminates when the shortest string
     runs out.  The dynamic order in which PROC is applied to the
     elements of the strings is unspecified.

          (string-map char-foldcase "AbdEgH")  ⇒ "abdegh"
          (string-map
            (lambda (c) (integer->char (+ 1 (char->integer c))))
            "HAL")
                  ⇒ "IBM"
          (string-map
            (lambda (c k)
              ((if (eqv? k #\u) char-upcase char-downcase) c))
            "studlycaps xxx"
            "ululululul")
                  ⇒ "StUdLyCaPs"

     Traditionally the result of PROC had to be a character, but Kawa
     (and SRFI-140) allows the result to be a string.

     _Performance note:_ The ‘string-map’ procedure has not been
     optimized (mainly because it is not very useful): The characters
     are boxed, and the PROC is not inlined even if it is a lambda
     expression.

 -- Procedure: string-map-index proc string [start end]

     Calls PROC on each valid index of the specified substring, converts
     the results of those calls into strings, and returns the
     concatenation of those strings.  It is an error for PROC to return
     anything other than a character or string.  The dynamic order in
     which PROC is called on the indexes is unspecified, as is the
     dynamic order in which the coercions are performed.  If any strings
     returned by PROC are mutated after they have been returned and
     before the call to ‘string-map-index’ has returned, then
     ‘string-map-index’ returns a string with unspecified contents; the
     ‘string-map-index’ procedure itself does not mutate those strings.
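
     A minimal sketch, assuming PROC receives each index as an integer
     usable with ‘string-ref’ and that the results are concatenated in
     index order:

          (define s "abc")
          (string-map-index
            (lambda (i)
              (string-append (number->string i)
                             (string (string-ref s i))))
            s)                              ; ⇒ "0a1b2c"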

 -- Procedure: string-for-each-index proc string [start end]

     Calls PROC on each valid index of the specified substring, in
     increasing order, discarding the results of those calls.  This is
     simply a safe and correct way to loop over a substring.

     Example:
          (let ((txt "abcde")
                (v '()))
            (string-for-each-index
              (lambda (cur) (set! v (cons (char->integer (string-ref txt cur)) v)))
              txt)
            v) ⇒ (101 100 99 98 97)

 -- Procedure: string-count string pred [start end]
     Returns a count of the number of characters in the specified
     substring of STRING that satisfy the predicate PRED.
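
     For example (note that STRING comes before PRED here):

          (string-count "Hello, World" char-upper-case?)           ; ⇒ 2
          ;; Only the substring "nana" (indexes 2 to 6) is examined.
          (string-count "banana" (lambda (c) (char=? c #\a)) 2 6)  ; ⇒ 2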

 -- Procedure: string-filter pred string [start end]
 -- Procedure: string-remove pred string [start end]
     Return an immutable string consisting of only selected characters,
     in order: ‘string-filter’ selects only the characters that satisfy
     PRED; ‘string-remove’ selects only the characters that do _not_
     satisfy PRED.
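
     For example:

          (string-filter char-alphabetic? "a1b2c3")  ; ⇒ "abc"
          (string-remove char-alphabetic? "a1b2c3")  ; ⇒ "123"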

13.3.9 Replication & splitting
------------------------------

 -- Procedure: string-repeat string-or-character len
     Create an istring by repeating the first argument LEN times.  If
     the first argument is a character, it is as if it were wrapped with
     the ‘string’ constructor.  We can define string-repeat in terms of
     the more general ‘xsubstring’ procedure:
          (define (string-repeat S N)
             (let ((T (if (char? S) (string S) S)))
               (xsubstring T 0 (* N (string-length T)))))
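
     For example:

          (string-repeat "ab" 3)  ; ⇒ "ababab"
          (string-repeat #\- 5)   ; ⇒ "-----"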

 -- Procedure: xsubstring string [from to [start end]]
     This is an extended substring procedure that implements replicated
     copying of a substring.  The STRING is a string; START and END are
     optional arguments that specify a substring of STRING, defaulting
     to 0 and the length of STRING.  This substring is conceptually
     replicated both up and down the index space, in both the positive
     and negative directions.  For example, if STRING is ‘"abcdefg"’,
     START is 3, and END is 6, then we have the conceptual
     bidirectionally-infinite string
            ...  d  e  f  d  e  f  d  e  f  d  e  f  d  e  f  d  e  f  d ...
                -9 -8 -7 -6 -5 -4 -3 -2 -1  0 +1 +2 +3 +4 +5 +6 +7 +8 +9
     ‘xsubstring’ returns the substring of the STRING beginning at index
     FROM, and ending at TO.  It is an error if FROM is greater than TO.

     If FROM and TO are missing they default to 0 and FROM+(END-START),
     respectively.  This variant is a generalization of using
     ‘substring’, but unlike ‘substring’ never shares substructures that
     would retain characters or sequences of characters that are
     substructures of its first argument or previously allocated
     objects.

     You can use ‘xsubstring’ to perform a variety of tasks:
        • To rotate a string left: ‘(xsubstring "abcdef" 2 8) ⇒
          "cdefab"’
        • To rotate a string right: ‘(xsubstring "abcdef" -2 4) ⇒
          "efabcd"’
        • To replicate a string: ‘(xsubstring "abc" 0 7) ⇒ "abcabca"’

     Note that
        • The FROM/TO arguments give a half-open range containing the
          characters from index FROM up to, but not including, index TO.
        • The FROM/TO indexes are not expressed in the index space of
          STRING.  They refer instead to the replicated index space of
          the substring defined by STRING, START, and END.

     It is an error if START=END, unless FROM=TO, which is allowed as a
     special case.
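
     As a small illustration of the START and END arguments, using the
     replicated substring ‘"def"’ from the diagram above:

          (xsubstring "abcdefg" 0 5 3 6)   ; ⇒ "defde"
          (xsubstring "abcdefg" -2 2 3 6)  ; ⇒ "efde"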

 -- Procedure: string-split string delimiter [grammar limit start end]
     Returns a list of strings representing the words contained in the
     substring of STRING from START (inclusive) to END (exclusive).  The
     DELIMITER is a string to be used as the word separator.  This will
     often be a single character, but multiple characters are allowed
     for use cases such as splitting on ‘"\r\n"’.  The returned list
     will have one more item than the number of non-overlapping
     occurrences of the DELIMITER in the string.  If DELIMITER is an
     empty string, then the returned list consists of strings, each of
     which contains a single character.

     The GRAMMAR is a symbol with the same meaning as in the
     ‘string-join’ procedure.  If it is ‘infix’, which is the default,
     processing is done as described above, except an empty string
     produces the empty list; if grammar is ‘strict-infix’, then an
     empty string signals an error.  The values ‘prefix’ and ‘suffix’
     cause a leading/trailing empty string in the result to be
     suppressed.

     If LIMIT is a non-negative exact integer, at most that many splits
     occur, and the remainder of STRING is returned as the final element
     of the list (so the result will have at most LIMIT+1 elements).  If
     LIMIT is not specified or is ‘#f’, then as many splits as possible
     are made.  It is an error if LIMIT is any other value.

     To split on a regular expression, you can use SRFI 115’s
     ‘regexp-split’ procedure.
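
     The following results are what the description above implies;
     they are an illustration rather than output captured from Kawa:

          (string-split "a,b,c" ",")             ; ⇒ ("a" "b" "c")
          (string-split "a,b,,c" ",")            ; ⇒ ("a" "b" "" "c")
          (string-split "/usr/bin" "/" 'prefix)  ; ⇒ ("usr" "bin")
          (string-split "a,b,c" "," 'infix 1)    ; ⇒ ("a" "b,c")
          (string-split "" ",")                  ; ⇒ ()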

13.3.10 String mutation
-----------------------

The following procedures create a mutable string, i.e.  one that you can
modify.

 -- Procedure: make-string [K [CHAR]]
     Return a newly allocated mstring of K characters, where K defaults
     to 0.  If CHAR is given, then all elements of the string are
     initialized to CHAR, otherwise the contents of the string are
     unspecified.

     The 1-argument version is deprecated as poor style, except when K
     is 0.

     _Rationale:_ In many languages the most common pattern for mutable
     strings is to allocate an empty string and incrementally append to
     it.  It seems natural to initialize the string with
     ‘(make-string)’, rather than ‘(make-string 0)’.

     To return an immutable string that repeats K times a character CHAR
     use ‘string-repeat’.

     This is as R7RS, except the result is variable-size and we allow
     leaving out K when it is zero.

 -- Procedure: string-copy STRING [START [END]]
     Returns a newly allocated mutable (mstring) copy of the part of the
     given STRING between START and END.
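
     For example, to obtain a mutable copy of part of a literal and then
     extend it:

          (define s (string-copy "hello world" 6))
          (string-append! s "!")
          s  ; ⇒ "world!"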

   The following procedures modify a mutable string.

 -- Procedure: string-set! string k char
     This procedure stores CHAR in element K of STRING.

          (define s1 (make-string 3 #\*))
          (define s2 "***")
          (string-set! s1 0 #\?) ⇒ _void_
          s1 ⇒ "?**"
          (string-set! s2 0 #\?) ⇒ _error_
          (string-set! (symbol->string 'immutable) 0 #\?) ⇒ _error_

     _Performance note:_ Calling ‘string-set!’ may take time
     proportional to the length of the string: First it must scan for
     the right position, like ‘string-ref’ does.  Then if the new
     character requires using a surrogate pair (and the old one doesn’t)
     then we have to make room in the string, possibly re-allocating a
     new ‘char’ array.  Alternatively, if the old character requires
     using a surrogate pair (and the new one doesn’t) then following
     characters need to be moved.

     The function ‘string-set!’ is deprecated: It is inefficient, and it
     very seldom does the correct thing.  Instead, you can construct a
     string with ‘string-append!’.

 -- Procedure: string-append! STRING VALUE ...
     The STRING must be a mutable string, such as one returned by
     ‘make-string’ or ‘string-copy’.  The ‘string-append!’ procedure
     extends STRING by appending each VALUE (in order) to the end of
     STRING.  Each VALUE should be a character or a string.

     _Performance note:_ The compiler converts a call with multiple
     VALUEs to multiple ‘string-append!’ calls.  If a VALUE is known to
     be a ‘character’, then no boxing (object-allocation) is needed.

     The following example shows how to efficiently process a string
     using ‘string-for-each’ and incrementally “build” a result string
     using ‘string-append!’.

          (define (translate-space-to-newline str::string)::string
            (let ((result (make-string 0)))
              (string-for-each
               (lambda (ch)
                 (string-append! result
                                 (if (char=? ch #\Space) #\Newline ch)))
               str)
              result))

 -- Procedure: string-copy! TO AT FROM [START [END]]
     Copies the characters of the string FROM that are between START and
     END into the string TO, starting at index AT.  The order in which
     characters are copied is unspecified, except that if the source and
     destination overlap, copying takes place as if the source is first
     copied into a temporary string and then into the destination.
     (This is achieved without allocating storage by making sure to copy
     in the correct direction in such circumstances.)

     This is equivalent to (and implemented as):
          (string-replace! to at (+ at (- end start)) from start end)

          (define a "12345")
          (define b (string-copy "abcde"))
          (string-copy! b 1 a 0 2)
          b  ⇒  "a12de"

 -- Procedure: string-replace! DST DST-START DST-END SRC [SRC-START
          [SRC-END]]
     Replaces the characters of string DST (between DST-START and
     DST-END) with the characters of SRC (between SRC-START and
     SRC-END).  The number of characters from SRC may be different than
     the number replaced in DST, so the string may grow or contract.
     The special case where DST-START is equal to DST-END corresponds to
     insertion; the case where SRC-START is equal to SRC-END corresponds
     to deletion.  The order in which characters are copied is
     unspecified, except that if the source and destination overlap,
     copying takes place as if the source is first copied into a
     temporary string and then into the destination.  (This is achieved
     without allocating storage by making sure to copy in the correct
     direction in such circumstances.)
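
     A short sketch of the three cases (replacement, insertion, and
     deletion) on a mutable copy:

          (define s (string-copy "Hello, World!"))
          (string-replace! s 7 12 "Kawa")  ; replace "World"
          s  ; ⇒ "Hello, Kawa!"
          (string-replace! s 0 0 ">> ")    ; DST-START = DST-END: insert
          s  ; ⇒ ">> Hello, Kawa!"
          (string-replace! s 0 3 "")       ; empty SRC: delete
          s  ; ⇒ "Hello, Kawa!"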

 -- Procedure: string-fill! STRING FILL [START [END]]
     The ‘string-fill!’ procedure stores FILL in the elements of STRING
     between START and END.  It is an error if FILL is not a character
     or is forbidden in strings.
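
     For example:

          (define s (make-string 5 #\-))
          (string-fill! s #\* 1 4)
          s  ; ⇒ "-***-"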

13.3.11 Strings as sequences
----------------------------

13.3.11.1 Indexing a string
...........................

Using function-call syntax with strings is convenient and efficient.
However, it has some “gotchas”.

   We will use the following example string:
     (! str1 "Smile \x1f603;!")
   or if you’re brave:
     (! str1 "Smile 😃!")

   This is ‘"Smile "’ followed by an emoticon (“smiling face with open
mouth”) followed by ‘"!"’.  The emoticon has scalar value ‘\x1f603’; it
is not in the 16-bit Basic Multilingual Plane, and so it must be
encoded by a surrogate pair (‘#\xd83d’ followed by ‘#\xde03’).

   The number of scalar values (‘character’s) is 8, while the number of
16-bit code units (‘char’s) is 9.  The ‘java.lang.CharSequence:length’
method counts ‘char’s.  Both the ‘length’ and the ‘string-length’
procedures count ‘character’s.  Thus:

     (length str1)          ⇒ 8
     (string-length str1)   ⇒ 8
     (str1:length)          ⇒ 9

   Counting ‘char’s is a constant-time operation (since it is stored in
the data structure).  Counting ‘character’s depends on the
representation used: In general it may take time proportional to the
length of the string, since it has to subtract one for each surrogate
pair; however the ISTRING type (‘gnu.lists.IString’ class) uses an extra
structure so it can count characters in constant time.

   Similarly we can index the string in 3 ways:

     (str1 1)              ⇒ #\m :: character
     (string-ref str1 1)   ⇒ #\m :: character
     (str1:charAt 1)       ⇒ #\m :: char

   Using function-call syntax when the “function” is a string and a
single integer argument is the same as using ‘string-ref’.

   Things become interesting when we reach the emoticon:

     (str1 6)              ⇒ #\😃 :: character
     (str1:charAt 6)       ⇒ #\xd83d :: char

   Both ‘string-ref’ and the function-call syntax return the real
character, while the ‘charAt’ method returns a partial character.

     (str1 7)              ⇒ #\! :: character
     (str1:charAt 7)       ⇒ #\xde03 :: char
     (str1 8)              ⇒ throws StringIndexOutOfBoundsException
     (str1:charAt 8)       ⇒ #\! :: char

13.3.11.2 Indexing with a sequence
..................................

You can index a string with a list of integer indexes, most commonly a
range:
     (STR [I ...])
   is basically the same as:
     (string (STR I) ...)

   Generally when working with strings it is best to work with
substrings rather than individual characters:
     (STR [START <: END])

   This is equivalent to invoking the ‘substring’ procedure:
     (substring STR START END)
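
   For example, assuming the range and sequence-literal syntax shown
above:
     (! str "Hello, World")
     (str [0 <: 5])   ; ⇒ "Hello"
     (str [4 1 0])    ; ⇒ "oeH"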

13.3.12 String Cursor API
-------------------------

Indexing into a string (using for example ‘string-ref’) is inefficient
because of the possible presence of surrogate pairs.  Hence, given an
index I, access normally requires linearly scanning the string until we
have seen I characters.

   The string-cursor API is defined in terms of abstract “cursor
values”, which point to a position in the string.  This avoids the
linear scan.

   Typical usage is:
     (let* ((str WHATEVER)
            (end (string-cursor-end str)))
       (do ((sc::string-cursor (string-cursor-start str)
                               (string-cursor-next str sc)))
         ((string-cursor>=? sc end))
         (let ((ch (string-cursor-ref str sc)))
           (DO-SOMETHING-WITH ch))))
   Alternatively, the following may be marginally faster:
     (let* ((str WHATEVER)
            (end (string-cursor-end str)))
       (do ((sc::string-cursor (string-cursor-start str)
                               (string-cursor-next-quick sc)))
         ((string-cursor>=? sc end))
         (let ((ch (string-cursor-ref str sc)))
           (if (not (char=? ch #\ignorable-char))
             (DO-SOMETHING-WITH ch)))))

   The API is non-standard, but is based on that in Chibi Scheme.
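
   As a concrete instance of the first pattern, the following sketch
counts the occurrences of a character (‘count-char’ is a hypothetical
name, not part of the API):
     (define (count-char str ch)
       (let ((end (string-cursor-end str)))
         (do ((sc::string-cursor (string-cursor-start str)
                                 (string-cursor-next str sc))
              ;; The step sees the old SC, which is still before END.
              (n 0 (if (char=? (string-cursor-ref str sc) ch)
                       (+ n 1)
                       n)))
           ((string-cursor>=? sc end) n))))

     (count-char "banana" #\a)  ; ⇒ 3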

 -- Type: string-cursor
     An abstract position (index) in a string.  Implemented as a
     primitive ‘int’ which counts the number of preceding code units
     (16-bit ‘char’ values).

 -- Procedure: string-cursor-start str
     Returns a cursor for the start of the string.  The result is always
     0, cast to a ‘string-cursor’.

 -- Procedure: string-cursor-end str
     Returns a cursor for the end of the string - one past the last
     valid character.  Implemented as ‘(as string-cursor (invoke STR
     'length))’.

 -- Procedure: string-cursor-ref str cursor
     Return the ‘character’ at the CURSOR.  If the CURSOR points to the
     second ‘char’ of a surrogate pair, returns ‘#\ignorable-char’.

 -- Procedure: string-cursor-next string cursor [count]
     Return the cursor position COUNT (default 1) character positions
     forwards beyond CURSOR.  For each COUNT this may add either 1 or 2
     (if pointing at a surrogate pair) to the CURSOR.

 -- Procedure: string-cursor-next-quick cursor
     Increment cursor by one raw ‘char’ position, even if CURSOR points
     to the start of a surrogate pair.  (In that case the next
     ‘string-cursor-ref’ will return ‘#\ignorable-char’.)  Same as ‘(+
     CURSOR 1)’ but with the ‘string-cursor’ type.

 -- Procedure: string-cursor-prev string cursor [count]
     Return the cursor position COUNT (default 1) character positions
     backwards before CURSOR.

 -- Procedure: substring-cursor string [start [end]]
     Create a substring of the section of STRING between the cursors
     START and END.

 -- Procedure: string-cursor<? cursor1 cursor2
 -- Procedure: string-cursor<=? cursor1 cursor2
 -- Procedure: string-cursor=? cursor1 cursor2
 -- Procedure: string-cursor>=? cursor1 cursor2
 -- Procedure: string-cursor>? cursor1 cursor2
     Is the position of CURSOR1 respectively before, before or the same
     as, the same as, after or the same as, or after that of CURSOR2?

     _Performance note:_ Implemented as the corresponding ‘int’
     comparison.

 -- Procedure: string-cursor-for-each proc string [start [end]]
     Apply the procedure PROC to each character position in STRING
     between the cursors START and END.


File: kawa.info,  Node: String literals,  Next: Unicode,  Prev: Strings,  Up: Characters and text

13.4 String literals
====================

Kawa supports two syntaxes of string literals: The traditional,
portable, double-quote-delimited literals like ‘"this"’; and extended
SRFI-109 quasi-literals like ‘&{this}’.

13.4.1 Simple string literals
-----------------------------

     STRING ::= ‘"’STRING-ELEMENT^{*}‘"’
     STRING-ELEMENT ::= any character other than ‘"’ or ‘\’
         | MNEMONIC-ESCAPE | ‘\"’ | ‘\\’
         | ‘\’INTRALINE-WHITESPACE^{*}LINE-ENDING INTRALINE-WHITESPACE^{*}
         | INLINE-HEX-ESCAPE
     MNEMONIC-ESCAPE ::= ‘\a’ | ‘\b’ | ‘\t’ | ‘\n’ | ‘\r’ | ... (see below)

   A string is written as a sequence of characters enclosed within
quotation marks (‘"’).  Within a string literal, various escape sequences
represent characters other than themselves.  Escape sequences always
start with a backslash (‘\’):
‘\a’
     Alarm (bell), ‘#\x0007’.
‘\b’
     Backspace, ‘#\x0008’.
‘\e’
     Escape, ‘#\x001B’.
‘\f’
     Form feed, ‘#\x000C’.
‘\n’
     Linefeed (newline), ‘#\x000A’.
‘\r’
     Return, ‘#\x000D’.
‘\t’
     Character tabulation, ‘#\x0009’.
‘\v’
     Vertical tab, ‘#\x000B’.
‘\C-’X
‘\^’X
     Returns the scalar value of X masked (anded) with ‘#x9F’.  An
     alternative way to write the Ascii control characters: For example
     ‘"\C-m"’ or ‘"\^m"’ is the same as ‘"\x000D;"’ (which is the same as
     ‘"\r"’).  As a special case ‘\^?’ is rubout (delete) (‘\x7f;’).
‘\x’ HEX-SCALAR-VALUE‘;’
‘\X’ HEX-SCALAR-VALUE‘;’
     A hex encoding that gives the scalar value of a character.
‘\’ OCT-DIGIT^{+}
     At most three octal digits that give the scalar value of a
     character.  (Historical, for C compatibility.)
‘\u’ HEX-DIGIT^{+}
     Exactly four hex digits that give the scalar value of a character.
     (Historical, for Java compatibility.)
‘\M-’X
     (Historical, for Emacs Lisp.)  Set the meta-bit (high-bit of single
     byte) of the following character X.
‘\|’
     Vertical line, ‘#\x007c’.  (Not useful for string literals, but
     useful for symbols.)
‘\"’
     Double quote, ‘#\x0022’.
‘\\’
     Backslash, ‘#\x005C’.
‘\’INTRALINE-WHITESPACE^{*}LINE-ENDING INTRALINE-WHITESPACE^{*}
     Nothing (ignored).  Allows you to split up a long string over
     multiple lines; ignoring initial whitespace on the continuation
     lines allows you to indent them.

   Except for a line ending, any character outside of an escape sequence
stands for itself in the string literal.  A line ending which is
preceded by ‘\’INTRALINE-WHITESPACE^{*} expands to nothing (along with
any trailing INTRALINE-WHITESPACE), and can be used to indent strings
for improved legibility.  Any other line ending has the same effect as
inserting a ‘\n’ character into the string.

   Examples:
     "The word \"recursion\" has many meanings."
     "Another example:\ntwo lines of text"
     "Here’s text \
     containing just one line"
     "\x03B1; is named GREEK SMALL LETTER ALPHA."

13.4.2 String templates
-----------------------

The following syntax is a “string template” (also called a string
quasi-literal or “here document
(http://en.wikipedia.org/wiki/Here_document)”):
     &{Hello &[name]!}
   Assuming the variable ‘name’ evaluates to ‘"John"’ then the example
evaluates to ‘"Hello John!"’.

   The Kawa reader converts the above example to:
     ($string$ "Hello " $<<$ name $>>$ "!")
   See the SRFI-109 (http://srfi.schemers.org/srfi-109/srfi-109.html)
specification for details.

     EXTENDED-STRING-LITERAL ::= ‘&{’ [INITIAL-IGNORED] STRING-LITERAL-PART^{*} ‘}’
     STRING-LITERAL-PART ::=  any character except ‘&’, ‘{’ or ‘}’
         | ‘{’ STRING-LITERAL-PART^{*} ‘}’
         | CHAR-REF
         | ENTITY-REF
         | SPECIAL-ESCAPE
         | ENCLOSED-PART

   You can use the plain ‘"STRING"’ syntax for longer multiline strings,
but ‘&{STRING}’ has various advantages.  The syntax is less error-prone
because the start-delimiter is different from the end-delimiter.  Also
note that nested braces are allowed: a right brace ‘}’ is only an
end-delimiter if it is unbalanced, so you would seldom need to escape
it:
     &{This has a {braced} section.}
       ⇒ "This has a {braced} section."

   The escape character used for special characters is ‘&’.  This is
compatible with XML syntax and *note XML literals::.

13.4.2.1 Special characters
...........................

     CHAR-REF ::=
         ‘&#’ DIGIT^{+} ‘;’
       | ‘&#x’ HEX-DIGIT^{+}  ‘;’
     ENTITY-REF ::=
         ‘&’ CHAR-OR-ENTITY-NAME ‘;’
     CHAR-OR-ENTITY-NAME ::= TAGNAME

   You can use the standard XML syntax for character references, using
either decimal or hexadecimal values.  The following string has two
instances of the Ascii escape character, as either decimal 27 or hex 1B:
     &{&#27;&#x1B;} ⇒ "\e\e"

   You can also use the pre-defined XML entity names:
     &{&amp; &lt; &gt; &quot; &apos;} ⇒ "& < > \" '"
   In addition, ‘&lbrace;’ and ‘&rbrace;’ can be used for left and right
curly brace, though you don’t need them for balanced braces:
     &{ &rbrace;_&lbrace; / {_} }  ⇒ " }_{ / {_} "

   You can use the standard XML entity names
(http://www.w3.org/2003/entities/2007/w3centities-f.ent).  For example:
     &{L&aelig;rdals&oslash;yri}
       ⇒ "Lærdalsøyri"

   You can also use the standard R7RS character names ‘null’, ‘alarm’,
‘backspace’, ‘tab’, ‘newline’, ‘return’, ‘escape’, ‘space’, and
‘delete’.  For example:
     &{&escape;&space;}

   The syntax ‘&NAME;’ is actually syntactic sugar (specifically reader
syntax) to the variable reference ‘$entity$:NAME’.  Hence you can also
define your own entity names:
     (define $entity$:crnl "\r\n")
     &{&crnl;} ⇒ "\r\n"

13.4.2.2 Multiline string literals
..................................

     INITIAL-IGNORED ::=
         INTRALINE-WHITESPACE^{*} LINE-ENDING INTRALINE-WHITESPACE^{*} &|
     SPECIAL-ESCAPE ::=
         INTRALINE-WHITESPACE^{*} &|
       | & NESTED-COMMENT
       | &- INTRALINE-WHITESPACE^{*} LINE-ENDING

   A line-ending directly in the text becomes a newline, as in a
simple string literal:
     (string-capitalize &{one two three
     uno dos tres
     }) ⇒ "One Two Three\nUno Dos Tres\n"
   However, you have extra control over layout.  If the string is in a
nested expression, it is confusing (and ugly) if the string cannot be
indented to match the surrounding context.  The indentation marker ‘&|’
is used to mark the end of insignificant initial whitespace.  The ‘&|’
characters and all the preceding whitespace are removed.  In addition,
the marker suppresses an initial newline.  Specifically, this applies
when the initial left-brace is followed by optional (invisible)
intraline-whitespace, then a newline, then optional intraline-whitespace
(the indentation), and finally the indentation marker ‘&|’; all of this
is removed from the output.  Otherwise the ‘&|’ only removes initial
intraline-whitespace on the same line (and itself).

     (write (string-capitalize &{
          &|one two three
          &|uno dos tres
     }) out)
         ⇒ prints "One Two Three\nUno Dos Tres\n"

   As a matter of style, all of the indentation markers should line up.
It is an error if there are any non-whitespace characters between the
previous newline and the indentation marker.  It is also an error to
write an indentation marker before the first newline in the literal.

   The line-continuation marker ‘&-’ is used to suppress a newline:
     &{abc&-
       def} ⇒ "abc  def"

   You can write a ‘#|...|#’-style comment following a ‘&’.  This could
be useful for annotation, or line numbers:
     &{&#|line 1|#one two
       &#|line 2|# three
       &#|line 3|#uno dos tres
     } ⇒ "one two\n three\nuno dos tres\n"

13.4.2.3 Embedded expressions
.............................

     ENCLOSED-PART ::=
         & ENCLOSED-MODIFIER^{*} [ EXPRESSION^{*} ]
       | & ENCLOSED-MODIFIER^{*} ( EXPRESSION^{+} )

   An embedded expression has the form ‘&[EXPRESSION]’.  It is
evaluated, the result is converted to a string (as by ‘display’), and
that string is inserted into the result string.  (If there are multiple
expressions, they are all evaluated and the corresponding strings
inserted in the result.)
     &{Hello &[(string-capitalize name)]!}

   You can leave out the square brackets when the expression is a
parenthesized expression:
     &{Hello &(string-capitalize name)!}
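   For instance, assuming ‘name’ is bound as below (the binding is an
illustrative assumption, not part of the original examples), both forms
should produce the same string:
     (define name "john")
     &{Hello &[(string-capitalize name)]!} ⇒ "Hello John!"
     &{Hello &(string-capitalize name)!}   ⇒ "Hello John!"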

13.4.2.4 Formatting
...................

     ENCLOSED-MODIFIER ::=
       ~ FORMAT-SPECIFIER-AFTER-TILDE

   Using *note ‘format’: Format. allows finer-grained control over the
output, but a problem is that the association between format specifiers
and data expressions is positional, which is hard to read and
error-prone.  A better solution places the specifier adjacent to the
data expression:
     &{The response was &~,2f(* 100.0 (/ responses total))%.}
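   As a sketch with hypothetical values for ‘responses’ and ‘total’
(these bindings are assumptions made for illustration), the literal
should produce the same string as the positional ‘format’ call:
     (define responses 3)
     (define total 4)
     &{The response was &~,2f(* 100.0 (/ responses total))%.}
       ⇒ "The response was 75.00%."
     (format #f "The response was ~,2f%." (* 100.0 (/ responses total)))
       ⇒ "The response was 75.00%."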

   The following escape forms are equivalent to the corresponding forms
without the ‘~’FMT-SPEC, except the expression(s) are formatted using
‘format’:
     ‘&~’FMT-SPEC‘[’EXPRESSION^{*}‘]’
   Again using parentheses like this:
     ‘&~’FMT-SPEC‘(’EXPRESSION^{+}‘)’
   is equivalent to:
     ‘&~’FMT-SPEC‘[(’EXPRESSION^{+}‘)]’

   The syntax of ‘format’ specifications is arcane, but it allows you to
do some pretty neat things in a compact space.  For example, to include
‘"_"’ between the elements of the array ‘arr’ you can use the ‘~{...~}’
format specifiers:
     (define arr [5 6 7])
     &{&~{&[arr]&~^_&~}} ⇒ "5_6_7"

   If no format is specified for an enclosed expression, then that is
equivalent to a ‘~a’ format specifier, so this is equivalent to:
     &{&~{&~a[arr]&~^_&~}} ⇒ "5_6_7"
   which is in turn equivalent to:
     (format #f "~{~a~^_~}" arr)

   The fine print that makes this work: If there are multiple
expressions in a ‘&[...]’ with no format specifier then there is an
implicit ‘~a’ for each expression.  On the other hand, if there is an
explicit format specifier, it is not repeated for each enclosed
expression: it appears exactly once in the effective format string,
whether there are zero, one, or many expressions.