1This is guile.info, produced by makeinfo version 6.7 from guile.texi.
2
3This manual documents Guile version 3.0.7.
4
5   Copyright (C) 1996-1997, 2000-2005, 2009-2021 Free Software
6Foundation, Inc.
7
8   Permission is granted to copy, distribute and/or modify this document
9under the terms of the GNU Free Documentation License, Version 1.3 or
10any later version published by the Free Software Foundation; with no
11Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts.  A
12copy of the license is included in the section entitled “GNU Free
13Documentation License.”
14INFO-DIR-SECTION The Algorithmic Language Scheme
15START-INFO-DIR-ENTRY
16* Guile Reference: (guile).     The Guile reference manual.
17END-INFO-DIR-ENTRY
18
19
20File: guile.info,  Node: Random Access,  Next: Line/Delimited,  Prev: Buffering,  Up: Input and Output
21
226.12.7 Random Access
23--------------------
24
25 -- Scheme Procedure: seek fd_port offset whence
26 -- C Function: scm_seek (fd_port, offset, whence)
27     Sets the current position of FD_PORT to the integer OFFSET.  For a
28     file port, OFFSET is expressed as a number of bytes; for other
29     types of ports, such as string ports, OFFSET is an abstract
30     representation of the position within the port’s data, not
31     necessarily expressed as a number of bytes.  OFFSET is interpreted
32     according to the value of WHENCE.
33
34     One of the following variables should be supplied for WHENCE:
35      -- Variable: SEEK_SET
36          Seek from the beginning of the file.
37      -- Variable: SEEK_CUR
38          Seek from the current position.
39      -- Variable: SEEK_END
40          Seek from the end of the file.
     If FD_PORT is a file descriptor, the underlying system call is
     ‘lseek’.  FD_PORT may also be a string port.
43
44     The value returned is the new position in FD_PORT.  This means that
45     the current position of a port can be obtained using:
46          (seek port 0 SEEK_CUR)
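
     As an illustrative sketch (the string and the names P and C are
     made up for this example), the following repositions an input
     string port and reads from the new position.  It assumes the
     string is pure ASCII, so that each character occupies exactly one
     position:

          (define p (open-input-string "hello world"))
          (seek p 6 SEEK_SET)          ; move just past the space
          (define c (read-char p))     ; should be #\w
          (seek p 0 SEEK_CUR)          ; query the current position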
47
48 -- Scheme Procedure: ftell fd_port
49 -- C Function: scm_ftell (fd_port)
50     Return an integer representing the current position of FD_PORT,
51     measured from the beginning.  Equivalent to:
52
53          (seek port 0 SEEK_CUR)
54
55 -- Scheme Procedure: truncate-file file [length]
56 -- C Function: scm_truncate_file (file, length)
57     Truncate FILE to LENGTH bytes.  FILE can be a filename string, a
58     port object, or an integer file descriptor.  The return value is
59     unspecified.
60
61     For a port or file descriptor LENGTH can be omitted, in which case
62     the file is truncated at the current position (per ‘ftell’ above).
63
64     On most systems a file can be extended by giving a length greater
65     than the current size, but this is not mandatory in the POSIX
66     standard.
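
     A minimal sketch of truncation by file name, assuming a writable
     scratch file at the hypothetical path ‘/tmp/guile-truncate-demo’
     (the names NAME and SIZE are likewise invented for the example):

          (define name "/tmp/guile-truncate-demo")  ; hypothetical path
          (call-with-output-file name
            (lambda (port) (display "hello world" port)))
          (truncate-file name 5)              ; keep the first 5 bytes
          (define size (stat:size (stat name)))  ; should now be 5
          (delete-file name)                  ; remove the scratch file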
67
68
69File: guile.info,  Node: Line/Delimited,  Next: Default Ports,  Prev: Random Access,  Up: Input and Output
70
716.12.8 Line Oriented and Delimited Text
72---------------------------------------
73
74The delimited-I/O module can be accessed with:
75
76     (use-modules (ice-9 rdelim))
77
78   It can be used to read or write lines of text, or read text delimited
79by a specified set of characters.
80
81 -- Scheme Procedure: read-line [port] [handle-delim]
82     Return a line of text from PORT if specified, otherwise from the
83     value returned by ‘(current-input-port)’.  Under Unix, a line of
84     text is terminated by the first end-of-line character or by
85     end-of-file.
86
87     If HANDLE-DELIM is specified, it should be one of the following
88     symbols:
89     ‘trim’
90          Discard the terminating delimiter.  This is the default, but
91          it will be impossible to tell whether the read terminated with
92          a delimiter or end-of-file.
93     ‘concat’
94          Append the terminating delimiter (if any) to the returned
95          string.
96     ‘peek’
97          Push the terminating delimiter (if any) back on to the port.
98     ‘split’
99          Return a pair containing the string read from the port and the
100          terminating delimiter or end-of-file object.
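
     For illustration, the modes can be compared on a small string
     port.  The helper FIRST-LINE and the input text are invented for
     this example:

          (use-modules (ice-9 rdelim))

          (define (first-line mode)
            ;; Read one line from a fresh port using MODE.
            (call-with-input-string "one\ntwo"
              (lambda (port) (read-line port mode))))

          (first-line 'trim)    ; "one"
          (first-line 'concat)  ; "one\n"
          (first-line 'peek)    ; "one", newline left on the port
          (first-line 'split)   ; ("one" . #\newline)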
101
102 -- Scheme Procedure: read-line! buf [port]
103     Read a line of text into the supplied string BUF and return the
104     number of characters added to BUF.  If BUF is filled, then ‘#f’ is
105     returned.  Read from PORT if specified, otherwise from the value
106     returned by ‘(current-input-port)’.
107
108 -- Scheme Procedure: read-delimited delims [port] [handle-delim]
109     Read text until one of the characters in the string DELIMS is found
110     or end-of-file is reached.  Read from PORT if supplied, otherwise
111     from the value returned by ‘(current-input-port)’.  HANDLE-DELIM
112     takes the same values as described for ‘read-line’.
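
     As a sketch, fields separated by either a comma or a semicolon can
     be read one at a time.  The input string and the variable names
     are invented for the example:

          (use-modules (ice-9 rdelim))

          (define p (open-input-string "alpha,beta;gamma"))
          (define a (read-delimited ",;" p))  ; "alpha"
          (define b (read-delimited ",;" p))  ; "beta"
          (define c (read-delimited ",;" p))  ; "gamma", ended by EOF
          (define d (read-delimited ",;" p))  ; the end-of-file object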
113
114 -- Scheme Procedure: read-delimited! delims buf [port] [handle-delim]
115          [start] [end]
116     Read text into the supplied string BUF.
117
118     If a delimiter was found, return the number of characters written,
119     except if HANDLE-DELIM is ‘split’, in which case the return value
120     is a pair, as noted above.
121
122     As a special case, if PORT was already at end-of-stream, the EOF
123     object is returned.  Also, if no characters were written because
124     the buffer was full, ‘#f’ is returned.
125
126     It’s something of a wacky interface, to be honest.
127
128 -- Scheme Procedure: %read-delimited! delims str gobble [port [start
129          [end]]]
130 -- C Function: scm_read_delimited_x (delims, str, gobble, port, start,
131          end)
132     Read characters from PORT into STR until one of the characters in
133     the DELIMS string is encountered.  If GOBBLE is true, discard the
134     delimiter character; otherwise, leave it in the input stream for
135     the next read.  If PORT is not specified, use the value of
     ‘(current-input-port)’.  If START or END is specified, store data
137     only into the substring of STR bounded by START and END (which
138     default to the beginning and end of the string, respectively).
139
140     Return a pair consisting of the delimiter that terminated the
141     string and the number of characters read.  If reading stopped at
142     the end of file, the delimiter returned is the EOF-OBJECT; if the
143     string was filled without encountering a delimiter, this value is
144     ‘#f’.
145
146 -- Scheme Procedure: %read-line [port]
147 -- C Function: scm_read_line (port)
148     Read a newline-terminated line from PORT, allocating storage as
149     necessary.  The newline terminator (if any) is removed from the
150     string, and a pair consisting of the line and its delimiter is
151     returned.  The delimiter may be either a newline or the EOF-OBJECT;
152     if ‘%read-line’ is called at the end of file, it returns the pair
153     ‘(#<eof> . #<eof>)’.
154
155 -- Scheme Procedure: write-line obj [port]
156 -- C Function: scm_write_line (obj, port)
157     Display OBJ and a newline character to PORT.  If PORT is not
158     specified, ‘(current-output-port)’ is used.  This procedure is
159     equivalent to:
160          (display obj [port])
161          (newline [port])
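
     For instance, collecting the output in a string shows that OBJ is
     printed as by ‘display’, followed by a newline.  The name TEXT is
     invented for this sketch:

          (use-modules (ice-9 rdelim))

          (define text
            (call-with-output-string
              (lambda (port)
                (write-line "first" port)
                (write-line 42 port))))
          ;; TEXT is now "first\n42\n"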
162
163
164File: guile.info,  Node: Default Ports,  Next: Port Types,  Prev: Line/Delimited,  Up: Input and Output
165
1666.12.9 Default Ports for Input, Output and Errors
167-------------------------------------------------
168
169 -- Scheme Procedure: current-input-port
170 -- C Function: scm_current_input_port ()
171     Return the current input port.  This is the default port used by
172     many input procedures.
173
174     Initially this is the “standard input” in Unix and C terminology.
175     When the standard input is a tty the port is unbuffered, otherwise
176     it’s fully buffered.
177
178     Unbuffered input is good if an application runs an interactive
179     subprocess, since any type-ahead input won’t go into Guile’s buffer
180     and be unavailable to the subprocess.
181
182     Note that Guile buffering is completely separate from the tty “line
183     discipline”.  In the usual cooked mode on a tty Guile only sees a
184     line of input once the user presses <Return>.
185
186 -- Scheme Procedure: current-output-port
187 -- C Function: scm_current_output_port ()
188     Return the current output port.  This is the default port used by
189     many output procedures.
190
191     Initially this is the “standard output” in Unix and C terminology.
192     When the standard output is a tty this port is unbuffered,
193     otherwise it’s fully buffered.
194
195     Unbuffered output to a tty is good for ensuring progress output or
196     a prompt is seen.  But an application which always prints whole
197     lines could change to line buffered, or an application with a lot
198     of output could go fully buffered and perhaps make explicit
199     ‘force-output’ calls (*note Buffering::) at selected points.
200
201 -- Scheme Procedure: current-error-port
202 -- C Function: scm_current_error_port ()
203     Return the port to which errors and warnings should be sent.
204
205     Initially this is the “standard error” in Unix and C terminology.
206     When the standard error is a tty this port is unbuffered, otherwise
207     it’s fully buffered.
208
209 -- Scheme Procedure: set-current-input-port port
210 -- Scheme Procedure: set-current-output-port port
211 -- Scheme Procedure: set-current-error-port port
212 -- C Function: scm_set_current_input_port (port)
213 -- C Function: scm_set_current_output_port (port)
214 -- C Function: scm_set_current_error_port (port)
215     Change the ports returned by ‘current-input-port’,
216     ‘current-output-port’ and ‘current-error-port’, respectively, so
217     that they use the supplied PORT for input or output.
218
219 -- Scheme Procedure: with-input-from-port port thunk
220 -- Scheme Procedure: with-output-to-port port thunk
221 -- Scheme Procedure: with-error-to-port port thunk
222     Call THUNK in a dynamic environment in which ‘current-input-port’,
223     ‘current-output-port’ or ‘current-error-port’ is rebound to the
224     given PORT.
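
     A short sketch, using a string port (named CAPTURE here purely for
     illustration) to collect whatever THUNK prints to the current
     output port:

          (define capture (open-output-string))
          (with-output-to-port capture
            (lambda () (display "captured")))
          (get-output-string capture)   ; "captured"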
225
226 -- C Function: void scm_dynwind_current_input_port (SCM port)
227 -- C Function: void scm_dynwind_current_output_port (SCM port)
228 -- C Function: void scm_dynwind_current_error_port (SCM port)
229     These functions must be used inside a pair of calls to
230     ‘scm_dynwind_begin’ and ‘scm_dynwind_end’ (*note Dynamic Wind::).
231     During the dynwind context, the indicated port is set to PORT.
232
233     More precisely, the current port is swapped with a ‘backup’ value
234     whenever the dynwind context is entered or left.  The backup value
235     is initialized with the PORT argument.
236
237
238File: guile.info,  Node: Port Types,  Next: Venerable Port Interfaces,  Prev: Default Ports,  Up: Input and Output
239
2406.12.10 Types of Port
241---------------------
242
243* Menu:
244
245* File Ports:: Ports on an operating system file.
246* Bytevector Ports:: Ports on a bytevector.
247* String Ports:: Ports on a Scheme string.
248* Custom Ports:: Ports whose implementation you control.
249* Soft Ports:: An older version of custom ports.
250* Void Ports:: Ports on nothing at all.
251
252
253File: guile.info,  Node: File Ports,  Next: Bytevector Ports,  Up: Port Types
254
2556.12.10.1 File Ports
256....................
257
258The following procedures are used to open file ports.  See also *note
259open: Ports and File Descriptors, for an interface to the Unix ‘open’
260system call.
261
262   All file access uses the “LFS” large file support functions when
263available, so files bigger than 2 Gbytes (2^31 bytes) can be read and
264written on a 32-bit system.
265
266   Most systems have limits on how many files can be open, so it’s
267strongly recommended that file ports be closed explicitly when no longer
268required (*note Ports::).
269
270 -- Scheme Procedure: open-file filename mode [#:guess-encoding=#f]
271          [#:encoding=#f]
272 -- C Function: scm_open_file_with_encoding (filename, mode,
273          guess_encoding, encoding)
274 -- C Function: scm_open_file (filename, mode)
275     Open the file whose name is FILENAME, and return a port
276     representing that file.  The attributes of the port are determined
277     by the MODE string.  The way in which this is interpreted is
278     similar to C stdio.  The first character must be one of the
279     following:
280
281     ‘r’
282          Open an existing file for input.
283     ‘w’
284          Open a file for output, creating it if it doesn’t already
285          exist or removing its contents if it does.
286     ‘a’
287          Open a file for output, creating it if it doesn’t already
288          exist.  All writes to the port will go to the end of the file.
          The "append mode" can be turned off while the port is in use
          (*note fcntl: Ports and File Descriptors.).
291
292     The following additional characters can be appended:
293
294     ‘+’
295          Open the port for both input and output.  E.g., ‘r+’: open an
296          existing file for both input and output.
297     ‘0’
298          Create an "unbuffered" port.  In this case input and output
299          operations are passed directly to the underlying port
300          implementation without additional buffering.  This is likely
301          to slow down I/O operations.  The buffering mode can be
302          changed while a port is in use (*note Buffering::).
303     ‘l’
304          Add line-buffering to the port.  The port output buffer will
305          be automatically flushed whenever a newline character is
306          written.
307     ‘b’
308          Use binary mode, ensuring that each byte in the file will be
309          read as one Scheme character.
310
311          To provide this property, the file will be opened with the
312          8-bit character encoding "ISO-8859-1", ignoring the default
313          port encoding.  *Note Ports::, for more information on port
314          encodings.
315
316          Note that while it is possible to read and write binary data
317          as characters or strings, it is usually better to treat bytes
318          as octets, and byte sequences as bytevectors.  *Note Binary
319          I/O::, for more.
320
321          This option had another historical meaning, for DOS
322          compatibility: in the default (textual) mode, DOS reads a
323          CR-LF sequence as one LF byte.  The ‘b’ flag prevents this
324          from happening, adding ‘O_BINARY’ to the underlying ‘open’
325          call.  Still, the flag is generally useful because of its port
326          encoding ramifications.
327
328     Unless binary mode is requested, the character encoding of the new
329     port is determined as follows: First, if GUESS-ENCODING is true,
330     the ‘file-encoding’ procedure is used to guess the encoding of the
331     file (*note Character Encoding of Source Files::).  If
332     GUESS-ENCODING is false or if ‘file-encoding’ fails, ENCODING is
333     used unless it is also false.  As a last resort, the default port
334     encoding is used.  *Note Ports::, for more information on port
335     encodings.  It is an error to pass a non-false GUESS-ENCODING or
336     ENCODING if binary mode is requested.
337
338     If a file cannot be opened with the access requested, ‘open-file’
339     throws an exception.
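
     The following sketch writes a file and reads it back with an
     explicit encoding.  The path ‘/tmp/guile-open-file-demo’ and the
     variable names are hypothetical; any writable location would do:

          (define name "/tmp/guile-open-file-demo")  ; hypothetical path

          ;; "w" creates the file, or empties it if it already exists.
          (define out (open-file name "w" #:encoding "UTF-8"))
          (display "hello\n" out)
          (close-port out)

          ;; "r" opens the existing file for input.
          (define in (open-file name "r" #:encoding "UTF-8"))
          (define enc (port-encoding in))   ; "UTF-8"
          (define ch (read-char in))        ; #\h
          (close-port in)
          (delete-file name)                ; remove the scratch file

     Closing each port explicitly, as here, follows the recommendation
     above about limits on the number of open files.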
340
341 -- Scheme Procedure: open-input-file filename [#:guess-encoding=#f]
342          [#:encoding=#f] [#:binary=#f]
343
344     Open FILENAME for input.  If BINARY is true, open the port in
345     binary mode, otherwise use text mode.  ENCODING and GUESS-ENCODING
346     determine the character encoding as described above for
347     ‘open-file’.  Equivalent to
348          (open-file FILENAME
349                     (if BINARY "rb" "r")
350                     #:guess-encoding GUESS-ENCODING
351                     #:encoding ENCODING)
352
353 -- Scheme Procedure: open-output-file filename [#:encoding=#f]
354          [#:binary=#f]
355
356     Open FILENAME for output.  If BINARY is true, open the port in
357     binary mode, otherwise use text mode.  ENCODING specifies the
358     character encoding as described above for ‘open-file’.  Equivalent
359     to
360          (open-file FILENAME
361                     (if BINARY "wb" "w")
362                     #:encoding ENCODING)
363
364 -- Scheme Procedure: call-with-input-file filename proc
365          [#:guess-encoding=#f] [#:encoding=#f] [#:binary=#f]
366 -- Scheme Procedure: call-with-output-file filename proc
367          [#:encoding=#f] [#:binary=#f]
368     Open FILENAME for input or output, and call ‘(PROC port)’ with the
369     resulting port.  Return the value returned by PROC.  FILENAME is
370     opened as per ‘open-input-file’ or ‘open-output-file’ respectively,
371     and an error is signaled if it cannot be opened.
372
373     When PROC returns, the port is closed.  If PROC does not return
374     (e.g. if it throws an error), then the port might not be closed
375     automatically, though it will be garbage collected in the usual way
376     if not otherwise referenced.
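
     A minimal round trip, assuming a writable scratch file at the
     hypothetical path ‘/tmp/guile-cwf-demo’ (NAME and LINE are names
     invented for the example):

          (use-modules (ice-9 rdelim))

          (define name "/tmp/guile-cwf-demo")   ; hypothetical path

          (call-with-output-file name
            (lambda (port) (write-line "saved line" port)))

          ;; ‘read-line’ is called with the port as its only argument.
          (define line (call-with-input-file name read-line))
          (delete-file name)
          ;; LINE is now "saved line"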
377
378 -- Scheme Procedure: with-input-from-file filename thunk
379          [#:guess-encoding=#f] [#:encoding=#f] [#:binary=#f]
380 -- Scheme Procedure: with-output-to-file filename thunk [#:encoding=#f]
381          [#:binary=#f]
382 -- Scheme Procedure: with-error-to-file filename thunk [#:encoding=#f]
383          [#:binary=#f]
     Open FILENAME and call ‘(THUNK)’ with the new port set up as the
     ‘current-input-port’, ‘current-output-port’, or
     ‘current-error-port’, respectively.  Return the value returned by
     THUNK.
387     FILENAME is opened as per ‘open-input-file’ or ‘open-output-file’
388     respectively, and an error is signaled if it cannot be opened.
389
390     When THUNK returns, the port is closed and the previous setting of
391     the respective current port is restored.
392
393     The current port setting is managed with ‘dynamic-wind’, so the
     previous value is restored no matter how THUNK exits (e.g. an
395     exception), and if THUNK is re-entered (via a captured
396     continuation) then it’s set again to the FILENAME port.
397
398     The port is closed when THUNK returns normally, but not when exited
399     via an exception or new continuation.  This ensures it’s still
400     ready for use if THUNK is re-entered by a captured continuation.
401     Of course the port is always garbage collected and closed in the
402     usual way when no longer referenced anywhere.
403
404 -- Scheme Procedure: port-mode port
405 -- C Function: scm_port_mode (port)
406     Return the port modes associated with the open port PORT.  These
407     will not necessarily be identical to the modes used when the port
408     was opened, since modes such as "append" which are used only during
409     port creation are not retained.
410
411 -- Scheme Procedure: port-filename port
412 -- C Function: scm_port_filename (port)
413     Return the filename associated with PORT, or ‘#f’ if no filename is
414     associated with the port.
415
416     PORT must be open; ‘port-filename’ cannot be used once the port is
417     closed.
418
419 -- Scheme Procedure: set-port-filename! port filename
420 -- C Function: scm_set_port_filename_x (port, filename)
421     Change the filename associated with PORT, using the current input
422     port if none is specified.  Note that this does not change the
423     port’s source of data, but only the value that is returned by
424     ‘port-filename’ and reported in diagnostic output.
425
426 -- Scheme Procedure: file-port? obj
427 -- C Function: scm_file_port_p (obj)
428     Determine whether OBJ is a port that is related to a file.
429
430
431File: guile.info,  Node: Bytevector Ports,  Next: String Ports,  Prev: File Ports,  Up: Port Types
432
4336.12.10.2 Bytevector Ports
434..........................
435
436 -- Scheme Procedure: open-bytevector-input-port bv [transcoder]
437 -- C Function: scm_open_bytevector_input_port (bv, transcoder)
438     Return an input port whose contents are drawn from bytevector BV
439     (*note Bytevectors::).
440
441     The TRANSCODER argument is currently not supported.
442
443 -- Scheme Procedure: open-bytevector-output-port [transcoder]
444 -- C Function: scm_open_bytevector_output_port (transcoder)
445     Return two values: a binary output port and a procedure.  The
446     latter should be called with zero arguments to obtain a bytevector
447     containing the data accumulated by the port, as illustrated below.
448
449          (call-with-values
450            (lambda ()
451              (open-bytevector-output-port))
452            (lambda (port get-bytevector)
453              (display "hello" port)
454              (get-bytevector)))
455
456          ⇒ #vu8(104 101 108 108 111)
457
458     The TRANSCODER argument is currently not supported.
459
460 -- Scheme Procedure: call-with-output-bytevector proc
461     Call the one-argument procedure PROC with a newly created
462     bytevector output port.  When the function returns, the bytevector
463     composed of the characters written into the port is returned.  PROC
464     should not close the port.
465
466 -- Scheme Procedure: call-with-input-bytevector bytevector proc
467     Call the one-argument procedure PROC with a newly created input
     port from which BYTEVECTOR’s contents may be read.  The values
     yielded by PROC are returned.
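
     As a sketch, the two procedures can be combined into a round trip.
     This assumes that they, along with ‘put-u8’, ‘put-bytevector’ and
     ‘get-u8’, are available from ‘(ice-9 binary-ports)’; the names
     BYTES and FIRST-BYTE are invented for the example:

          (use-modules (ice-9 binary-ports))

          (define bytes
            (call-with-output-bytevector
              (lambda (port)
                (put-u8 port 1)
                (put-bytevector port #vu8(2 3)))))
          ;; BYTES is now #vu8(1 2 3)

          (define first-byte
            (call-with-input-bytevector bytes get-u8))   ; 1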
470
471
472File: guile.info,  Node: String Ports,  Next: Custom Ports,  Prev: Bytevector Ports,  Up: Port Types
473
4746.12.10.3 String Ports
475......................
476
477 -- Scheme Procedure: call-with-output-string proc
478 -- C Function: scm_call_with_output_string (proc)
479     Calls the one-argument procedure PROC with a newly created output
480     port.  When the function returns, the string composed of the
481     characters written into the port is returned.  PROC should not
482     close the port.
483
484 -- Scheme Procedure: call-with-input-string string proc
485 -- C Function: scm_call_with_input_string (string, proc)
486     Calls the one-argument procedure PROC with a newly created input
     port from which STRING’s contents may be read.  The value yielded
     by PROC is returned.
489
490 -- Scheme Procedure: with-output-to-string thunk
491     Calls the zero-argument procedure THUNK with the current output
492     port set temporarily to a new string port.  It returns a string
493     composed of the characters written to the current output.
494
495 -- Scheme Procedure: with-input-from-string string thunk
496     Calls the zero-argument procedure THUNK with the current input port
497     set temporarily to a string port opened on the specified STRING.
498     The value yielded by THUNK is returned.
499
500 -- Scheme Procedure: open-input-string str
501 -- C Function: scm_open_input_string (str)
502     Take a string and return an input port that delivers characters
503     from the string.  The port can be closed by ‘close-input-port’,
504     though its storage will be reclaimed by the garbage collector if it
505     becomes inaccessible.
506
507 -- Scheme Procedure: open-output-string
508 -- C Function: scm_open_output_string ()
509     Return an output port that will accumulate characters for retrieval
510     by ‘get-output-string’.  The port can be closed by the procedure
511     ‘close-output-port’, though its storage will be reclaimed by the
512     garbage collector if it becomes inaccessible.
513
514 -- Scheme Procedure: get-output-string port
515 -- C Function: scm_get_output_string (port)
516     Given an output port created by ‘open-output-string’, return a
517     string consisting of the characters that have been output to the
518     port so far.
519
     ‘get-output-string’ must be used before closing PORT; once the
     port is closed, the string cannot be obtained.
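
     For example, the following sketch accumulates output in a string
     port and retrieves it before closing the port, then parses a datum
     from an input string.  The names OUT, TEXT and DATUM are invented
     for the illustration:

          (define out (open-output-string))
          (write 'symbol out)
          (display " and " out)
          (write "string" out)
          (define text (get-output-string out))
          (close-output-port out)          ; only after retrieving TEXT
          ;; TEXT is now "symbol and \"string\""

          (define datum (call-with-input-string "(1 2 3)" read))
          ;; DATUM is now the list (1 2 3)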
522
   With string ports, the port encoding is treated differently than for
other types of ports.  When string ports are created, they do not
inherit a character encoding from the current locale.  They are given a
default encoding that allows them to handle all valid string characters.
527Typically one should not modify a string port’s character encoding away
528from its default.  *Note Encoding::.
529
530
531File: guile.info,  Node: Custom Ports,  Next: Soft Ports,  Prev: String Ports,  Up: Port Types
532
5336.12.10.4 Custom Ports
534......................
535
536Custom ports allow the user to provide input and handle output via
537user-supplied procedures.  Guile currently only provides custom binary
538ports, not textual ports; for custom textual ports, *Note Soft Ports::.
539We should add the R6RS custom textual port interfaces though.
540Contributions are appreciated.
541
542 -- Scheme Procedure: make-custom-binary-input-port id read!
543          get-position set-position! close
544     Return a new custom binary input port(1) named ID (a string) whose
545     input is drained by invoking READ! and passing it a bytevector, an
546     index where bytes should be written, and the number of bytes to
547     read.  The ‘read!’ procedure must return an integer indicating the
548     number of bytes read, or ‘0’ to indicate the end-of-file.
549
550     Optionally, if GET-POSITION is not ‘#f’, it must be a thunk that
551     will be called when ‘port-position’ is invoked on the custom binary
552     port and should return an integer indicating the position within
553     the underlying data stream; if GET-POSITION was not supplied, the
554     returned port does not support ‘port-position’.
555
556     Likewise, if SET-POSITION! is not ‘#f’, it should be a one-argument
557     procedure.  When ‘set-port-position!’ is invoked on the custom
558     binary input port, SET-POSITION! is passed an integer indicating
     the position of the next byte to read.
560
561     Finally, if CLOSE is not ‘#f’, it must be a thunk.  It is invoked
562     when the custom binary input port is closed.
563
564     The returned port is fully buffered by default, but its buffering
565     mode can be changed using ‘setvbuf’ (*note Buffering::).
566
567     Using a custom binary input port, the ‘open-bytevector-input-port’
568     procedure (*note Bytevector Ports::) could be implemented as
569     follows:
570
571          (define (open-bytevector-input-port source)
572            (define position 0)
573            (define length (bytevector-length source))
574
575            (define (read! bv start count)
576              (let ((count (min count (- length position))))
577                (bytevector-copy! source position
578                                  bv start count)
579                (set! position (+ position count))
580                count))
581
582            (define (get-position) position)
583
584            (define (set-position! new-position)
585              (set! position new-position))
586
587            (make-custom-binary-input-port "the port" read!
588                                            get-position set-position!
589                                            #f))
590
591          (read (open-bytevector-input-port (string->utf8 "hello")))
592          ⇒ hello
593
594 -- Scheme Procedure: make-custom-binary-output-port id write!
595          get-position set-position! close
596     Return a new custom binary output port named ID (a string) whose
597     output is sunk by invoking WRITE! and passing it a bytevector, an
598     index where bytes should be read from this bytevector, and the
599     number of bytes to be “written”.  The ‘write!’ procedure must
600     return an integer indicating the number of bytes actually written;
601     when it is passed ‘0’ as the number of bytes to write, it should
602     behave as though an end-of-file was sent to the byte sink.
603
604     The other arguments are as for ‘make-custom-binary-input-port’.
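
     As a rough sketch, a custom binary output port can collect every
     byte it receives in a list.  The names COLLECTED and SINK are
     invented for this example, and the port is flushed explicitly
     because it is assumed to be buffered by default:

          (use-modules (ice-9 binary-ports))

          (define collected '())  ; bytes received, most recent first

          (define (write! bv start count)
            ;; Record COUNT bytes of BV starting at index START.
            (let loop ((i 0))
              (when (< i count)
                (set! collected
                      (cons (bytevector-u8-ref bv (+ start i))
                            collected))
                (loop (+ i 1))))
            count)                ; report that all bytes were consumed

          (define sink
            (make-custom-binary-output-port "sink" write! #f #f #f))

          (put-bytevector sink #vu8(1 2 3))
          (force-output sink)     ; push buffered bytes through WRITE!
          (reverse collected)     ; (1 2 3)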
605
606 -- Scheme Procedure: make-custom-binary-input/output-port id read!
607          write! get-position set-position! close
608     Return a new custom binary input/output port named ID (a string).
     The arguments are the same as for ‘make-custom-binary-input-port’
     and ‘make-custom-binary-output-port’.  If buffering is enabled on
     the port, as is the case by default, data will be buffered in both
     directions; *Note Buffering::.  If the SET-POSITION! function is
614     provided and not ‘#f’, then the port will also be marked as
615     random-access, causing the buffer to be flushed between reads and
616     writes.
617
618   ---------- Footnotes ----------
619
620   (1) This is similar in spirit to Guile’s “soft ports” (*note Soft
621Ports::).
622
623
624File: guile.info,  Node: Soft Ports,  Next: Void Ports,  Prev: Custom Ports,  Up: Port Types
625
6266.12.10.5 Soft Ports
627....................
628
629A “soft port” is a port based on a vector of procedures capable of
630accepting or delivering characters.  It allows emulation of I/O ports.
631
632 -- Scheme Procedure: make-soft-port pv modes
633     Return a port capable of receiving or delivering characters as
634     specified by the MODES string (*note open-file: File Ports.).  PV
635     must be a vector of length 5 or 6.  Its components are as follows:
636
637       0. procedure accepting one character for output
638       1. procedure accepting a string for output
639       2. thunk for flushing output
640       3. thunk for getting one character
641       4. thunk for closing port (not by garbage collection)
642       5. (if present and not ‘#f’) thunk for computing the number of
643          characters that can be read from the port without blocking.
644
645     For an output-only port only elements 0, 1, 2, and 4 need be
646     procedures.  For an input-only port only elements 3 and 4 need be
647     procedures.  Thunks 2 and 4 can instead be ‘#f’ if there is no
648     useful operation for them to perform.
649
650     If thunk 3 returns ‘#f’ or an ‘eof-object’ (*note eof-object?:
651     (r5rs)Input.) it indicates that the port has reached end-of-file.
652     For example:
653
654          (define stdout (current-output-port))
655          (define p (make-soft-port
656                     (vector
657                      (lambda (c) (write c stdout))
658                      (lambda (s) (display s stdout))
659                      (lambda () (display "." stdout))
660                      (lambda () (char-upcase (read-char)))
661                      (lambda () (display "@" stdout)))
662                     "rw"))
663
664          (write p p) ⇒ #<input-output: soft 8081e20>
665
666
667File: guile.info,  Node: Void Ports,  Prev: Soft Ports,  Up: Port Types
668
6696.12.10.6 Void Ports
670....................
671
672This kind of port causes any data to be discarded when written to, and
673always returns the end-of-file object when read from.
674
675 -- Scheme Procedure: %make-void-port mode
676 -- C Function: scm_sys_make_void_port (mode)
     Create and return a new void port.  A void port acts like
     ‘/dev/null’.  The MODE argument specifies the input/output modes
679     for this port: see the documentation for ‘open-file’ in *note File
680     Ports::.
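
     A brief sketch of both directions, with the port name VP and the
     mode string ‘"rw"’ chosen for illustration:

          (define vp (%make-void-port "rw"))
          (display "discarded" vp)           ; the output simply vanishes
          (define at-eof? (eof-object? (read-char vp)))   ; #t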
681
682
683File: guile.info,  Node: Venerable Port Interfaces,  Next: Using Ports from C,  Prev: Port Types,  Up: Input and Output
684
6856.12.11 Venerable Port Interfaces
686---------------------------------
687
Over the 25 years or so that Guile has been around, its port system has
evolved, adding many useful features.  At the same time, four major
Scheme standards have been released in those 25 years, each of which
also evolved the common Scheme understanding of what a port interface
should be.  Alas, it would be too much to ask for all of these
evolutionary branches to be consistent.  Some of Guile’s original
interfaces don’t mesh with the later Scheme standards, and yet Guile
can’t just drop old interfaces.  Sadly as well, the R6RS and R7RS
standards both start from a base of R5RS, but end up in different and
somewhat incompatible designs.
697
698   Guile’s approach is to pick a set of port primitives that make sense
699together.  We document that set of primitives, design our internal
700interfaces around them, and recommend them to users.  As the R6RS I/O
701system is the most capable standard that Scheme has yet produced in this
702domain, we mostly recommend that; ‘(ice-9 binary-ports)’ and ‘(ice-9
703textual-ports)’ are wholly modelled on ‘(rnrs io ports)’.  Guile does
704not wholly copy R6RS, however; *Note R6RS Incompatibilities::.
705
706   At the same time, we have many venerable port interfaces, lore handed
707down to us from our hacker ancestors.  Most of these interfaces even
708predate the expectation that Scheme should have modules, so they are
709present in the default environment.  In Guile we support them as well
710and we have no plans to remove them, but again we don’t recommend them
711for new users.
712
713 -- Scheme Procedure: char-ready? [port]
714     Return ‘#t’ if a character is ready on input PORT and return ‘#f’
715     otherwise.  If ‘char-ready?’ returns ‘#t’ then the next ‘read-char’
716     operation on PORT is guaranteed not to hang.  If PORT is a file
717     port at end of file then ‘char-ready?’ returns ‘#t’.
718
719     ‘char-ready?’ exists to make it possible for a program to accept
720     characters from interactive ports without getting stuck waiting for
721     input.  Any input editors associated with such ports must make sure
722     that characters whose existence has been asserted by ‘char-ready?’
723     cannot be rubbed out.  If ‘char-ready?’ were to return ‘#f’ at end
724     of file, a port at end of file would be indistinguishable from an
725     interactive port that has no ready characters.
726
727     Note that ‘char-ready?’ only works reliably for terminals and
728     sockets with one-byte encodings.  Under the hood it will return
729     ‘#t’ if the port has any input buffered, or if the file descriptor
730     that backs the port polls as readable, indicating that Guile can
731     fetch more bytes from the kernel.  However being able to fetch one
732     byte doesn’t mean that a full character is available; *Note
733     Encoding::.  Also, on many systems it’s possible for a file
734     descriptor to poll as readable, but then block when it comes time
735     to read bytes.  Note also that on Linux kernels, all file ports
736     backed by files always poll as readable.  For non-file ports, this
737     procedure always returns ‘#t’, except for soft ports, which have a
738     ‘char-ready?’ handler.  *Note Soft Ports::.
739
740     In short, this is a legacy procedure whose semantics are hard to
741     provide.  However it is a useful check to see if any input is
742     buffered.  *Note Non-Blocking I/O::.
743
744 -- Scheme Procedure: read-char [port]
745     The same as ‘get-char’, except that PORT defaults to the current
746     input port.  *Note Textual I/O::.
747
748 -- Scheme Procedure: peek-char [port]
749     The same as ‘lookahead-char’, except that PORT defaults to the
750     current input port.  *Note Textual I/O::.
751
752 -- Scheme Procedure: unread-char cobj [port]
753     The same as ‘unget-char’, except that PORT defaults to the current
754     input port, and the arguments are swapped.  *Note Textual I/O::.
755
756 -- Scheme Procedure: unread-string str port
757 -- C Function: scm_unread_string (str, port)
758     The same as ‘unget-string’, except that PORT defaults to the
759     current input port, and the arguments are swapped.  *Note Textual
760     I/O::.
761
762 -- Scheme Procedure: newline [port]
763     Send a newline to PORT.  If PORT is omitted, send to the current
764     output port.  Equivalent to ‘(put-char port #\newline)’.
765
766 -- Scheme Procedure: write-char chr [port]
     The same as ‘put-char’, except that PORT defaults to the current
     output port, and the arguments are swapped.  *Note Textual I/O::.
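
   The interfaces in this node can be exercised on string ports, which
makes their argument order easy to see.  The sketch below is only an
illustration; the names ‘input-trace’ and ‘output-text’ are made up for
the example.

```scheme
;; Input side: peek, read, push back, and read again.
(define input-trace
  (call-with-input-string "ab"
    (lambda (port)
      (let* ((ready? (char-ready? port))    ; a string port is not a file port
             (peeked (peek-char port))      ; looks at #\a without consuming it
             (c1     (read-char port))      ; consumes #\a
             (_      (unread-char c1 port)) ; character first, port second
             (c2     (read-char port))      ; #\a again
             (c3     (read-char port)))     ; #\b
        (list ready? peeked c1 c2 c3)))))

;; Output side: write-char and newline take the port as the last argument.
(define output-text
  (call-with-output-string
    (lambda (port)
      (write-char #\h port)
      (write-char #\i port)
      (newline port))))

(write input-trace)
(newline)
(write output-text)
(newline)
```

   Passing the port explicitly, as done here, avoids depending on the
current input and output ports.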
769
770
771File: guile.info,  Node: Using Ports from C,  Next: I/O Extensions,  Prev: Venerable Port Interfaces,  Up: Input and Output
772
7736.12.12 Using Ports from C
774--------------------------
775
Guile’s C interfaces provide some niceties for sending and receiving
bytes and characters in a way that works better with C.
778
779 -- C Function: size_t scm_c_read (SCM port, void *buffer, size_t size)
780     Read up to SIZE bytes from PORT and store them in BUFFER.  The
781     return value is the number of bytes actually read, which can be
782     less than SIZE if end-of-file has been reached.
783
784     Note that as this is a binary input procedure, this function does
785     not update ‘port-line’ and ‘port-column’ (*note Textual I/O::).
786
787 -- C Function: void scm_c_write (SCM port, const void *buffer, size_t
788          size)
789     Write SIZE bytes at BUFFER to PORT.
790
791     Note that as this is a binary output procedure, this function does
792     not update ‘port-line’ and ‘port-column’ (*note Textual I/O::).
793
794 -- C Function: size_t scm_c_read_bytes (SCM port, SCM bv, size_t start,
795          size_t count)
796 -- C Function: void scm_c_write_bytes (SCM port, SCM bv, size_t start,
797          size_t count)
     Like ‘scm_c_read’ and ‘scm_c_write’, but reading into or writing
     from the bytevector BV.  START indicates the byte index at which to
     start in the bytevector, and the read or write will continue for
     COUNT bytes.
802
803 -- C Function: void scm_unget_bytes (const unsigned char *buf, size_t
804          len, SCM port)
805 -- C Function: void scm_unget_byte (int c, SCM port)
806 -- C Function: void scm_ungetc (scm_t_wchar c, SCM port)
807     Like ‘unget-bytevector’, ‘unget-byte’, and ‘unget-char’,
808     respectively.  *Note Textual I/O::.
809
810 -- C Function: void scm_c_put_latin1_chars (SCM port, const scm_t_uint8
811          *buf, size_t len)
812 -- C Function: void scm_c_put_utf32_chars (SCM port, const scm_t_uint32
          *buf, size_t len)
814     Write a string to PORT.  In the first case, the ‘scm_t_uint8*’
815     buffer is a string in the latin-1 encoding.  In the second, the
816     ‘scm_t_uint32*’ buffer is a string in the UTF-32 encoding.  These
817     routines will update the port’s line and column.
818
819
820File: guile.info,  Node: I/O Extensions,  Next: Non-Blocking I/O,  Prev: Using Ports from C,  Up: Input and Output
821
8226.12.13 Implementing New Port Types in C
823----------------------------------------
824
825This section describes how to implement a new port type in C. Although
826ports support many operations, as a data structure they present an
opaque interface to the user.  As the port implementor, you have two
828pieces of information to work with: the port type, and the port’s
829“stream”.  The port type is an opaque pointer allocated when defining
830your port type.  It is your key into the port API, and it helps you
831identify which ports are actually yours.  The “stream” is a pointer you
832control, and which you set when you create a port.  Get a stream from a
833port using the ‘SCM_STREAM’ macro.  Note that your port methods are only
834ever called with ports of your type.
835
836   A port type is created by calling ‘scm_make_port_type’.  Once you
837have your port type, you can create ports with ‘scm_c_make_port’, or
838‘scm_c_make_port_with_encoding’.
839
840 -- Function: scm_t_port_type* scm_make_port_type (char *name, size_t
841          (*read) (SCM port, SCM dst, size_t start, size_t count),
842          size_t (*write) (SCM port, SCM src, size_t start, size_t
843          count))
844     Define a new port type.  The NAME, READ and WRITE parameters are
845     initial values for those port type fields, as described below.  The
846     other fields are initialized with default values and can be changed
847     later.
848
849 -- Function: SCM scm_c_make_port_with_encoding (scm_t_port_type *type,
850          unsigned long mode_bits, SCM encoding, SCM
851          conversion_strategy, scm_t_bits stream)
852 -- Function: SCM scm_c_make_port (scm_t_port_type *type, unsigned long
853          mode_bits, scm_t_bits stream)
854     Make a port with the given TYPE.  The STREAM indicates the private
855     data associated with the port, which your port implementation may
856     later retrieve with ‘SCM_STREAM’.  The mode bits should include one
857     or more of the flags ‘SCM_RDNG’ or ‘SCM_WRTNG’, indicating that the
858     port is an input and/or an output port, respectively.  The mode
859     bits may also include ‘SCM_BUF0’ or ‘SCM_BUFLINE’, indicating that
860     the port should be unbuffered or line-buffered, respectively.  The
861     default is that the port will be block-buffered.  *Note
862     Buffering::.
863
864     As you would imagine, ENCODING and CONVERSION_STRATEGY specify the
865     port’s initial textual encoding and conversion strategy.  Both are
866     symbols.  ‘scm_c_make_port’ is the same as
867     ‘scm_c_make_port_with_encoding’, except it uses the default port
868     encoding and conversion strategy.
869
   The port type has a number of associated procedures and properties
which collectively implement the port’s behavior.  Creating a new port
type mostly involves writing these procedures.
873
874‘name’
     A pointer to a NUL-terminated string: the name of the port type.
876     This property is initialized via the first argument to
877     ‘scm_make_port_type’.
878
879‘read’
880     A port’s ‘read’ implementation fills read buffers.  It should copy
881     bytes to the supplied bytevector ‘dst’, starting at offset ‘start’
882     and continuing for ‘count’ bytes, returning the number of bytes
883     read.
884
885‘write’
886     A port’s ‘write’ implementation flushes write buffers to the
887     mutable store.  It should write out bytes from the supplied
888     bytevector ‘src’, starting at offset ‘start’ and continuing for
889     ‘count’ bytes, and return the number of bytes that were written.
890
891‘read_wait_fd’
892‘write_wait_fd’
893     If a port’s ‘read’ or ‘write’ function returns ‘(size_t) -1’, that
894     indicates that reading or writing would block.  In that case to
895     preserve the illusion of a blocking read or write operation,
896     Guile’s C port run-time will ‘poll’ on the file descriptor returned
897     by either the port’s ‘read_wait_fd’ or ‘write_wait_fd’ function.
898     Set using
899
900      -- Function: void scm_set_port_read_wait_fd (scm_t_port_type
901               *type, int (*wait_fd) (SCM port))
902      -- Function: void scm_set_port_write_wait_fd (scm_t_port_type
903               *type, int (*wait_fd) (SCM port))
904
905     Only a port type which implements the ‘read_wait_fd’ or
906     ‘write_wait_fd’ port methods can usefully return ‘(size_t) -1’ from
907     a read or write function.  *Note Non-Blocking I/O::, for more on
908     non-blocking I/O in Guile.
909
910‘print’
911     Called when ‘write’ is called on the port, to print a port
912     description.  For example, for a file port it may produce something
913     like: ‘#<input: /etc/passwd 3>’.  Set using
914
915      -- Function: void scm_set_port_print (scm_t_port_type *type, int
916               (*print) (SCM port, SCM dest_port, scm_print_state
917               *pstate))
918          The first argument PORT is the port being printed, the second
919          argument DEST_PORT is where its description should go.
920
921‘close’
922     Called when the port is closed.  It should free any resources used
923     by the port.  Set using
924
925      -- Function: void scm_set_port_close (scm_t_port_type *type, void
926               (*close) (SCM port))
927
928     By default, ports that are garbage collected just go away without
929     closing.  If your port type needs to release some external resource
930     like a file descriptor, or needs to make sure that its internal
931     buffers are flushed even if the port is collected while it was
932     open, then mark the port type as needing a close on GC.
933
934      -- Function: void scm_set_port_needs_close_on_gc (scm_t_port_type
935               *type, int needs_close_p)
936
937‘seek’
     Set the current position of the port.  Guile will flush read and/or
     write buffers before seeking, as appropriate.  Set using
940
941      -- Function: void scm_set_port_seek (scm_t_port_type *type,
942               scm_t_off (*seek) (SCM port, scm_t_off offset, int
943               whence))
944
945‘truncate’
     Truncate the port data to the specified length.  Guile will flush
     buffers beforehand, as appropriate.  Set using
948
949      -- Function: void scm_set_port_truncate (scm_t_port_type *type,
950               void (*truncate) (SCM port, scm_t_off length))
951
952‘random_access_p’
953     Determine whether this port is a random-access port.
954
955     Seeking on a random-access port with buffered input, or switching
956     to writing after reading, will cause the buffered input to be
957     discarded and Guile will seek the port back the buffered number of
958     bytes.  Likewise seeking on a random-access port with buffered
959     output, or switching to reading after writing, will flush pending
960     bytes with a call to the ‘write’ procedure.  *Note Buffering::.
961
962     Indicate to Guile that your port needs this behavior by returning a
963     nonzero value from your ‘random_access_p’ function.  The default
964     implementation of this function returns nonzero if the port type
965     supplies a seek implementation.
966
967      -- Function: void scm_set_port_random_access_p (scm_t_port_type
               *type, int (*random_access_p) (SCM port))
969
970‘get_natural_buffer_sizes’
971     Guile will internally attach buffers to ports.  An input port
972     always has a read buffer and an output port always has a write
973     buffer.  *Note Buffering::.  A port buffer consists of a
974     bytevector, along with some cursors into that bytevector denoting
975     where to get and put data.
976
977     Port implementations generally don’t have to be concerned with
978     buffering: a port type’s ‘read’ or ‘write’ function will receive
979     the buffer’s bytevector as an argument, along with an offset and a
980     length into that bytevector, and should then either fill or empty
981     that bytevector.  However in some cases, port implementations may
982     be able to provide an appropriate default buffer size to Guile.
983
984      -- Function: void scm_set_port_get_natural_buffer_sizes
985               (scm_t_port_type *type, void (*get_natural_buffer_sizes)
986               (SCM, size_t *read_buf_size, size_t *write_buf_size))
987          Fill in READ_BUF_SIZE and WRITE_BUF_SIZE with an appropriate
988          buffer size for this port, if one is known.
989
990     File ports implement a ‘get_natural_buffer_sizes’ to let the
991     operating system inform Guile about the appropriate buffer sizes
992     for the particular file opened by the port.
993
994   Note that calls to all of these methods can proceed in parallel and
995concurrently and from any thread up until the point that the port is
996closed.  The call to ‘close’ will happen when no other method is
997running, and no method will be called after the ‘close’ method is
998called.  If your port implementation needs mutual exclusion to prevent
999concurrency, it is responsible for locking appropriately.
1000
1001
1002File: guile.info,  Node: Non-Blocking I/O,  Next: BOM Handling,  Prev: I/O Extensions,  Up: Input and Output
1003
10046.12.14 Non-Blocking I/O
1005------------------------
1006
1007Most ports in Guile are “blocking”: when you try to read a character
1008from a port, Guile will block on the read until a character is ready, or
1009end-of-stream is detected.  Likewise whenever Guile goes to write
1010(possibly buffered) data to an output port, Guile will block until all
1011the data is written.
1012
1013   Interacting with ports in blocking mode is very convenient: you can
1014write straightforward, sequential algorithms whose code flow reflects
1015the flow of data.  However, blocking I/O has two main limitations.
1016
1017   The first is that it’s easy to get into a situation where code is
1018waiting on data.  Time spent waiting on data when code could be doing
1019something else is wasteful and prevents your program from reaching its
1020peak throughput.  If you implement a web server that sequentially
1021handles requests from clients, it’s very easy for the server to end up
1022waiting on a client to finish its HTTP request, or waiting on it to
1023consume the response.  The end result is that you are able to serve
1024fewer requests per second than you’d like to serve.
1025
1026   The second limitation is related: a blocking parser over
1027user-controlled input is a denial-of-service vulnerability.  Indeed the
1028so-called “slow loris” attack of the early 2010s was just that: an
1029attack on common web servers that drip-fed HTTP requests, one character
1030at a time.  All it took was a handful of slow loris connections to
1031occupy an entire web server.
1032
1033   In Guile we would like to preserve the ability to write
1034straightforward blocking networking processes of all kinds, but under
1035the hood to allow those processes to suspend their requests if they
1036would block.
1037
1038   To do this, the first piece is to allow Guile ports to declare
1039themselves as being nonblocking.  This is currently supported only for
1040file ports, which also includes sockets, terminals, or any other port
1041that is backed by a file descriptor.  To do that, we use an arcane UNIX
1042incantation:
1043
1044     (let ((flags (fcntl socket F_GETFL)))
1045       (fcntl socket F_SETFL (logior O_NONBLOCK flags)))
1046
1047   Now the file descriptor is open in non-blocking mode.  If Guile tries
1048to read or write from this file and the read or write returns a result
1049indicating that more data can only be had by doing a blocking read or
1050write, Guile will block by polling on the socket’s ‘read-wait-fd’ or
1051‘write-wait-fd’, to preserve the illusion of a blocking read or write.
1052*Note I/O Extensions:: for more on those internal interfaces.
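
   To see the effect of the incantation, one can inspect the flags
before and after setting them.  The following sketch assumes a POSIX
system and uses a pipe in place of a socket; ‘pipe-ends’, ‘reader’ and
‘nonblocking?’ are names invented for the example.

```scheme
;; pipe returns a pair: (read-port . write-port).
(define pipe-ends (pipe))
(define reader (car pipe-ends))

;; Hypothetical helper: is O_NONBLOCK set on the port's file descriptor?
(define (nonblocking? port)
  (not (zero? (logand O_NONBLOCK (fcntl port F_GETFL)))))

(define before (nonblocking? reader))

;; The same incantation as above, applied to the read end of the pipe.
(let ((flags (fcntl reader F_GETFL)))
  (fcntl reader F_SETFL (logior O_NONBLOCK flags)))

(define after (nonblocking? reader))

(close-port reader)
(close-port (cdr pipe-ends))
```

   Using ‘logior’ with the existing flags, rather than setting
‘O_NONBLOCK’ alone, keeps whatever other status flags the descriptor
already had.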
1053
1054   So far we have just reproduced the status quo: the file descriptor is
1055non-blocking, but the operations on the port do block.  To go farther,
1056it would be nice if we could suspend the “thread” using delimited
1057continuations, and only resume the thread once the file descriptor is
1058readable or writable.  (*Note Prompts::).
1059
1060   But here we run into a difficulty.  The ports code is implemented in
1061C, which means that although we can suspend the computation to some
1062outer prompt, we can’t resume it because Guile can’t resume delimited
1063continuations that capture the C stack.
1064
1065   To overcome this difficulty we have created a compatible but entirely
1066parallel implementation of port operations.  To use this implementation,
1067do the following:
1068
1069     (use-modules (ice-9 suspendable-ports))
1070     (install-suspendable-ports!)
1071
1072   This will replace the core I/O primitives like ‘get-char’ and
1073‘put-bytevector’ with new versions that are exactly the same as the ones
in the standard library, but with two differences.  One is that when a
read or a write would block, the suspendable port operations call the
procedure stored in the ‘current-read-waiter’ or ‘current-write-waiter’
parameter, as appropriate.  *Note Parameters::.  The default read and
1078write waiters do the same thing that the C read and write waiters do,
1079which is to poll.  User code can parameterize the waiters, though,
1080enabling the computation to suspend and allow the program to process
1081other I/O operations.  Because the new suspendable ports implementation
1082is written in Scheme, that suspended computation can resume again later
1083when it is able to make progress.  Success!
1084
1085   The other main difference is that because the new ports
1086implementation is written in Scheme, it is slower than C, currently by a
1087factor of 3 or 4, though it depends on many factors.  For this reason we
1088have to keep the C implementations as the default ones.  One day when
1089Guile’s compiler is better, we can close this gap and have only one port
1090operation implementation again.
1091
1092   Note that Guile does not currently include an implementation of the
1093facility to suspend the current thread and schedule other threads in the
1094meantime.  Before adding such a thing, we want to make sure that we’re
1095providing the right primitives that can be used to build schedulers and
1096other user-space concurrency patterns, and that the patterns that we
1097settle on are the right patterns.  In the meantime, have a look at 8sync
1098(<https://gnu.org/software/8sync>) for a prototype of an asynchronous
1099I/O and concurrency facility.
1100
1101 -- Scheme Procedure: install-suspendable-ports!
1102     Replace the core ports implementation with suspendable ports, as
1103     described above.  This will mutate the values of the bindings like
1104     ‘get-char’, ‘put-u8’, and so on in place.
1105
1106 -- Scheme Procedure: uninstall-suspendable-ports!
1107     Restore the original core ports implementation, un-doing the effect
1108     of ‘install-suspendable-ports!’.
1109
1110 -- Scheme Parameter: current-read-waiter
1111 -- Scheme Parameter: current-write-waiter
1112     Parameters whose values are procedures of one argument, called when
1113     a suspendable port operation would block on a port while reading or
1114     writing, respectively.  The default values of these parameters do a
1115     blocking ‘poll’ on the port’s file descriptor.  The procedures are
1116     passed the port in question as their one argument.
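
   The following is a small sketch of parameterizing a waiter, under the
assumption of a POSIX system where ‘pipe’ and ‘O_NONBLOCK’ are
available.  A real program would suspend to a scheduler from inside the
waiter; here the hypothetical ‘feeding-read-waiter’ stands in for that
scheduler by producing the missing input itself.  All names other than
the documented procedures and parameters are illustrative.

```scheme
(use-modules (ice-9 suspendable-ports))

(install-suspendable-ports!)

;; A pipe whose read end is put into non-blocking mode.
(define waiter-pipe (pipe))
(define pipe-in (car waiter-pipe))
(define pipe-out (cdr waiter-pipe))
(let ((flags (fcntl pipe-in F_GETFL)))
  (fcntl pipe-in F_SETFL (logior O_NONBLOCK flags)))

(define wait-count 0)

;; Called with the port when a read would block.  Instead of polling,
;; record the call and make some input available, so that the read
;; can make progress when it is retried.
(define (feeding-read-waiter port)
  (set! wait-count (+ wait-count 1))
  (display "x" pipe-out)
  (force-output pipe-out))

;; The pipe is empty, so this read would block and the waiter runs.
(define received
  (parameterize ((current-read-waiter feeding-read-waiter))
    (read-char pipe-in)))

(uninstall-suspendable-ports!)
(close-port pipe-in)
(close-port pipe-out)
```

   The waiter returns normally once input is available, after which the
suspendable read retries.  A scheduler-based waiter would instead
capture the continuation and resume it when the descriptor becomes
readable.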
1117
1118
1119File: guile.info,  Node: BOM Handling,  Prev: Non-Blocking I/O,  Up: Input and Output
1120
11216.12.15 Handling of Unicode Byte Order Marks
1122--------------------------------------------
1123
1124This section documents the finer points of Guile’s handling of Unicode
1125byte order marks (BOMs).  A byte order mark (U+FEFF) is typically found
1126at the start of a UTF-16 or UTF-32 stream, to allow readers to reliably
1127determine the byte order.  Occasionally, a BOM is found at the start of
1128a UTF-8 stream, but this is much less common and not generally
1129recommended.
1130
1131   Guile attempts to handle BOMs automatically, and in accordance with
1132the recommendations of the Unicode Standard, when the port encoding is
1133set to ‘UTF-8’, ‘UTF-16’, or ‘UTF-32’.  In brief, Guile automatically
1134writes a BOM at the start of a UTF-16 or UTF-32 stream, and
1135automatically consumes one from the start of a UTF-8, UTF-16, or UTF-32
1136stream.
1137
1138   As specified in the Unicode Standard, a BOM is only handled specially
1139at the start of a stream, and only if the port encoding is set to
1140‘UTF-8’, ‘UTF-16’ or ‘UTF-32’.  If the port encoding is set to
1141‘UTF-16BE’, ‘UTF-16LE’, ‘UTF-32BE’, or ‘UTF-32LE’, then BOMs are _not_
1142handled specially, and none of the special handling described in this
1143section applies.
1144
1145   • To ensure that Guile will properly detect the byte order of a
1146     UTF-16 or UTF-32 stream, you must perform a textual read before any
1147     writes, seeks, or binary I/O. Guile will not attempt to read a BOM
1148     unless a read is explicitly requested at the start of the stream.
1149
1150   • If a textual write is performed before the first read, then an
1151     arbitrary byte order will be chosen.  Currently, big endian is the
1152     default on all platforms, but that may change in the future.  If
1153     you wish to explicitly control the byte order of an output stream,
1154     set the port encoding to ‘UTF-16BE’, ‘UTF-16LE’, ‘UTF-32BE’, or
1155     ‘UTF-32LE’, and explicitly write a BOM (‘#\xFEFF’) if desired.
1156
1157   • If ‘set-port-encoding!’ is called in the middle of a stream, Guile
1158     treats this as a new logical “start of stream” for purposes of BOM
1159     handling, and will forget about any BOMs that had previously been
1160     seen.  Therefore, it may choose a different byte order than had
1161     been used previously.  This is intended to support multiple logical
1162     text streams embedded within a larger binary stream.
1163
1164   • Binary I/O operations are not guaranteed to update Guile’s notion
1165     of whether the port is at the “start of the stream”, nor are they
1166     guaranteed to produce or consume BOMs.
1167
1168   • For ports that support seeking (e.g.  normal files), the input and
1169     output streams are considered linked: if the user reads first, then
1170     a BOM will be consumed (if appropriate), but later writes will
1171     _not_ produce a BOM. Similarly, if the user writes first, then
1172     later reads will _not_ consume a BOM.
1173
1174   • For ports that are not random access (e.g.  pipes, sockets, and
1175     terminals), the input and output streams are considered
1176     _independent_ for purposes of BOM handling: the first read will
1177     consume a BOM (if appropriate), and the first write will _also_
1178     produce a BOM (if appropriate).  However, the input and output
1179     streams will always use the same byte order.
1180
1181   • Seeks to the beginning of a file will set the “start of stream”
1182     flags.  Therefore, a subsequent textual read or write will consume
1183     or produce a BOM. However, unlike ‘set-port-encoding!’, if a byte
1184     order had already been chosen for the port, it will remain in
1185     effect after a seek, and cannot be changed by the presence of a
1186     BOM. Seeks anywhere other than the beginning of a file clear the
1187     “start of stream” flags.
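
   The difference between ‘UTF-16’ and an encoding with an explicit
byte order can be observed by writing to a bytevector port and looking
at the bytes produced.  This is a sketch rather than a specification;
‘encode-a’, ‘with-bom’ and ‘without-bom’ are names chosen for the
example, and the byte order chosen for ‘UTF-16’ is the arbitrary default
mentioned above.

```scheme
(use-modules (ice-9 binary-ports)
             (rnrs bytevectors))

;; Hypothetical helper: write the one-character string "a" to a fresh
;; bytevector port using ENCODING, and return the resulting bytes.
(define (encode-a encoding)
  (call-with-values open-bytevector-output-port
    (lambda (port get-bytes)
      (set-port-encoding! port encoding)
      (display "a" port)
      (force-output port)
      (get-bytes))))

;; With "UTF-16", the first textual write is preceded by a BOM.
(define with-bom (encode-a "UTF-16"))

;; With "UTF-16LE", no BOM is written; only the character itself.
(define without-bom (encode-a "UTF-16LE"))

(write with-bom)
(newline)
(write without-bom)
(newline)
```

   The first bytevector should be two bytes longer than the second,
with the extra two bytes being the encoded BOM.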
1188
1189
1190File: guile.info,  Node: Regular Expressions,  Next: LALR(1) Parsing,  Prev: Input and Output,  Up: API Reference
1191
11926.13 Regular Expressions
1193========================
1194
1195A “regular expression” (or “regexp”) is a pattern that describes a whole
1196class of strings.  A full description of regular expressions and their
1197syntax is beyond the scope of this manual.
1198
1199   If your system does not include a POSIX regular expression library,
1200and you have not linked Guile with a third-party regexp library such as
1201Rx, these functions will not be available.  You can tell whether your
1202Guile installation includes regular expression support by checking
1203whether ‘(provided? 'regex)’ returns true.
1204
1205   The following regexp and string matching features are provided by the
1206‘(ice-9 regex)’ module.  Before using the described functions, you
1207should load this module by executing ‘(use-modules (ice-9 regex))’.
1208
1209* Menu:
1210
1211* Regexp Functions::            Functions that create and match regexps.
1212* Match Structures::            Finding what was matched by a regexp.
1213* Backslash Escapes::           Removing the special meaning of regexp
1214                                meta-characters.
1215
1216
1217File: guile.info,  Node: Regexp Functions,  Next: Match Structures,  Up: Regular Expressions
1218
12196.13.1 Regexp Functions
1220-----------------------
1221
By default, Guile supports POSIX extended regular expressions.  That
means that the characters ‘(’, ‘)’, ‘+’ and ‘?’ are special, and must be
escaped if you wish to match the literal characters.  There is also no
support for “non-greedy” variants of ‘*’, ‘+’ or ‘?’.
1226
1227   This regular expression interface was modeled after that implemented
1228by SCSH, the Scheme Shell.  It is intended to be upwardly compatible
1229with SCSH regular expressions.
1230
1231   Zero bytes (‘#\nul’) cannot be used in regex patterns or input
1232strings, since the underlying C functions treat that as the end of
1233string.  If there’s a zero byte an error is thrown.
1234
1235   Internally, patterns and input strings are converted to the current
1236locale’s encoding, and then passed to the C library’s regular expression
1237routines (*note (libc)Regular Expressions::).  The returned match
1238structures always point to characters in the strings, not to individual
1239bytes, even in the case of multi-byte encodings.
1240
1241 -- Scheme Procedure: string-match pattern str [start]
1242     Compile the string PATTERN into a regular expression and compare it
1243     with STR.  The optional numeric argument START specifies the
1244     position of STR at which to begin matching.
1245
1246     ‘string-match’ returns a “match structure” which describes what, if
1247     anything, was matched by the regular expression.  *Note Match
1248     Structures::.  If STR does not match PATTERN at all, ‘string-match’
1249     returns ‘#f’.
1250
1251   Two examples of a match follow.  In the first example, the pattern
1252matches the four digits in the match string.  In the second, the pattern
1253matches nothing.
1254
1255     (string-match "[0-9][0-9][0-9][0-9]" "blah2002")
1256     ⇒ #("blah2002" (4 . 8))
1257
1258     (string-match "[A-Za-z]" "123456")
1259     ⇒ #f
1260
1261   Each time ‘string-match’ is called, it must compile its PATTERN
1262argument into a regular expression structure.  This operation is
1263expensive, which makes ‘string-match’ inefficient if the same regular
1264expression is used several times (for example, in a loop).  For better
1265performance, you can compile a regular expression in advance and then
1266match strings against the compiled regexp.
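
   As a sketch of that approach, the pattern from the example above can
be compiled once with ‘make-regexp’ and then applied to several strings
with ‘regexp-exec’, both of which are described below.  The names
‘year-rx’, ‘find-year’ and ‘years’ are illustrative.

```scheme
(use-modules (ice-9 regex))

;; Compile the pattern a single time, outside of any loop.
(define year-rx (make-regexp "[0-9][0-9][0-9][0-9]"))

;; Return the first run of four digits in STR, or #f if there is none.
(define (find-year str)
  (let ((m (regexp-exec year-rx str)))
    (and m (match:substring m))))

(define years
  (map find-year '("blah2002" "no digits" "x1999y")))

(write years)
(newline)
```

   Each call to ‘find-year’ reuses the same compiled regexp, so the
cost of compilation is paid only once.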
1267
1268 -- Scheme Procedure: make-regexp pat flag...
1269 -- C Function: scm_make_regexp (pat, flaglst)
1270     Compile the regular expression described by PAT, and return the
1271     compiled regexp structure.  If PAT does not describe a legal
1272     regular expression, ‘make-regexp’ throws a
1273     ‘regular-expression-syntax’ error.
1274
1275     The FLAG arguments change the behavior of the compiled regular
1276     expression.  The following values may be supplied:
1277
1278      -- Variable: regexp/icase
1279          Consider uppercase and lowercase letters to be the same when
1280          matching.
1281
1282      -- Variable: regexp/newline
1283          If a newline appears in the target string, then permit the ‘^’
1284          and ‘$’ operators to match immediately after or immediately
1285          before the newline, respectively.  Also, the ‘.’ and ‘[^...]’
1286          operators will never match a newline character.  The intent of
1287          this flag is to treat the target string as a buffer containing
1288          many lines of text, and the regular expression as a pattern
1289          that may match a single one of those lines.
1290
1291      -- Variable: regexp/basic
1292          Compile a basic (“obsolete”) regexp instead of the extended
1293          (“modern”) regexps that are the default.  Basic regexps do not
1294          consider ‘|’, ‘+’ or ‘?’ to be special characters, and require
1295          the ‘{...}’ and ‘(...)’ metacharacters to be backslash-escaped
1296          (*note Backslash Escapes::).  There are several other
1297          differences between basic and extended regular expressions,
1298          but these are the most significant.
1299
1300      -- Variable: regexp/extended
1301          Compile an extended regular expression rather than a basic
1302          regexp.  This is the default behavior; this flag will not
1303          usually be needed.  If a call to ‘make-regexp’ includes both
          ‘regexp/basic’ and ‘regexp/extended’ flags, the one which
1305          comes last will override the earlier one.
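
     The flags described above can be tried out with a short sketch
     like the following.  The variable names are illustrative only,
     and the match accessors are assumed to come from the ‘(ice-9
     regex)’ module.

```scheme
(use-modules (ice-9 regex))

;; regexp/icase: letter case is ignored while matching.
(define rx-icase (make-regexp "guile" regexp/icase))
(define icase-result (match:substring (regexp-exec rx-icase "GNU Guile")))

;; regexp/newline: ^ and $ match around embedded newlines,
;; and . never matches a newline.
(define rx-line (make-regexp "^b.*$" regexp/newline))
(define line-result (match:substring (regexp-exec rx-line "a\nbcd\ne")))

;; Without regexp/newline, ^ only matches at the start of the string,
;; so this search finds nothing.
(define no-flag-result (regexp-exec (make-regexp "^b") "a\nbcd\ne"))

;; Several flags are passed as separate arguments.
(define rx-both (make-regexp "^B.*$" regexp/icase regexp/newline))
(define both-result (match:substring (regexp-exec rx-both "a\nbcd\ne")))
```

     Here ICASE-RESULT is ‘"Guile"’, LINE-RESULT and BOTH-RESULT are
     ‘"bcd"’, and NO-FLAG-RESULT is ‘#f’.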
1306
1307 -- Scheme Procedure: regexp-exec rx str [start [flags]]
1308 -- C Function: scm_regexp_exec (rx, str, start, flags)
1309     Match the compiled regular expression RX against ‘str’.  If the
1310     optional integer START argument is provided, begin matching from
1311     that position in the string.  Return a match structure describing
1312     the results of the match, or ‘#f’ if no match could be found.
1313
     The FLAGS argument changes the matching behavior.  The following
     flag values may be supplied; use ‘logior’ (*note Bitwise
     Operations::) to combine them:
1317
1318      -- Variable: regexp/notbol
1319          Consider that the START offset into STR is not the beginning
1320          of a line and should not match operator ‘^’.
1321
1322          If RX was created with the ‘regexp/newline’ option above, ‘^’
1323          will still match after a newline in STR.
1324
1325      -- Variable: regexp/noteol
1326          Consider that the end of STR is not the end of a line and
1327          should not match operator ‘$’.
1328
1329          If RX was created with the ‘regexp/newline’ option above, ‘$’
1330          will still match before a newline in STR.
1331
1332     ;; Regexp to match uppercase letters
1333     (define r (make-regexp "[A-Z]*"))
1334
1335     ;; Regexp to match letters, ignoring case
1336     (define ri (make-regexp "[A-Z]*" regexp/icase))
1337
1338     ;; Search for bob using regexp r
1339     (match:substring (regexp-exec r "bob"))
     ⇒ ""                  ; empty match: zero uppercase letters
1341
1342     ;; Search for bob using regexp ri
1343     (match:substring (regexp-exec ri "Bob"))
1344     ⇒ "Bob"               ; matched case insensitive
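
     The effect of the FLAGS argument can be sketched as follows.  The
     patterns and variable names are made up for illustration.

```scheme
(use-modules (ice-9 regex))

(define rx-bol (make-regexp "^foo"))
(define rx-eol (make-regexp "bar$"))

;; By default the start and end of the string count as line boundaries.
(define plain-bol (regexp-exec rx-bol "foo bar"))   ; a match structure
(define plain-eol (regexp-exec rx-eol "foo bar"))   ; a match structure

;; regexp/notbol: the start offset is not a line beginning, so ^ fails.
(define notbol-result (regexp-exec rx-bol "foo bar" 0 regexp/notbol))

;; regexp/noteol: the end of the string is not a line end, so $ fails.
(define noteol-result (regexp-exec rx-eol "foo bar" 0 regexp/noteol))

;; The flags are integers and are combined with logior.
(define both-flags (logior regexp/notbol regexp/noteol))
(define combined-result
  (regexp-exec (make-regexp "^foo$") "foo" 0 both-flags))
```

     NOTBOL-RESULT, NOTEOL-RESULT and COMBINED-RESULT are all ‘#f’,
     while the two calls without flags return match structures.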
1345
1346 -- Scheme Procedure: regexp? obj
1347 -- C Function: scm_regexp_p (obj)
1348     Return ‘#t’ if OBJ is a compiled regular expression, or ‘#f’
1349     otherwise.
1350
1351
1352 -- Scheme Procedure: list-matches regexp str [flags]
1353     Return a list of match structures which are the non-overlapping
1354     matches of REGEXP in STR.  REGEXP can be either a pattern string or
1355     a compiled regexp.  The FLAGS argument is as per ‘regexp-exec’
1356     above.
1357
1358          (map match:substring (list-matches "[a-z]+" "abc 42 def 78"))
1359          ⇒ ("abc" "def")
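
     Because each element is a full match structure, positions can be
     extracted as well as text.  The following sketch (variable names
     are illustrative) also passes a compiled regexp instead of a
     pattern string.

```scheme
(use-modules (ice-9 regex))

;; Collect (start . end) pairs for every match.
(define spans
  (map (lambda (m) (cons (match:start m) (match:end m)))
       (list-matches "[a-z]+" "abc 42 def 78")))

;; A compiled regexp carries its own compile-time flags.
(define rx-any-case (make-regexp "[a-z]+" regexp/icase))
(define words
  (map match:substring (list-matches rx-any-case "Abc 42 DEF")))
```

     SPANS is ‘((0 . 3) (7 . 10))’ and WORDS is ‘("Abc" "DEF")’.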
1360
1361 -- Scheme Procedure: fold-matches regexp str init proc [flags]
1362     Apply PROC to the non-overlapping matches of REGEXP in STR, to
1363     build a result.  REGEXP can be either a pattern string or a
1364     compiled regexp.  The FLAGS argument is as per ‘regexp-exec’ above.
1365
1366     PROC is called as ‘(PROC match prev)’ where MATCH is a match
1367     structure and PREV is the previous return from PROC.  For the first
1368     call PREV is the given INIT parameter.  ‘fold-matches’ returns the
1369     final value from PROC.
1370
1371     For example to count matches,
1372
1373          (fold-matches "[a-z][0-9]" "abc x1 def y2" 0
1374                        (lambda (match count)
1375                          (1+ count)))
1376          ⇒ 2
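
     The accumulator need not be a counter.  As a further sketch, with
     illustrative variable names, the same fold can sum the numbers in
     a string or collect the matched text.

```scheme
(use-modules (ice-9 regex))

;; Sum every run of digits: 1 + 22 + 333.
(define total
  (fold-matches "[0-9]+" "a1 b22 c333" 0
                (lambda (match sum)
                  (+ sum (string->number (match:substring match))))))

;; Collect matched text; consing builds the list in reverse order.
(define collected
  (reverse
   (fold-matches "[0-9]+" "a1 b22 c333" '()
                 (lambda (match acc)
                   (cons (match:substring match) acc)))))
```

     TOTAL is ‘356’ and COLLECTED is ‘("1" "22" "333")’.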
1377
1378
1379   Regular expressions are commonly used to find patterns in one string
1380and replace them with the contents of another string.  The following
1381functions are convenient ways to do this.
1382
1383 -- Scheme Procedure: regexp-substitute port match item ...
1384     Write to PORT selected parts of the match structure MATCH.  Or if
1385     PORT is ‘#f’ then form a string from those parts and return that.
1386
1387     Each ITEM specifies a part to be written, and may be one of the
1388     following,
1389
1390        • A string.  String arguments are written out verbatim.
1391
1392        • An integer.  The submatch with that number is written
1393          (‘match:substring’).  Zero is the entire match.
1394
1395        • The symbol ‘pre’.  The portion of the matched string preceding
1396          the regexp match is written (‘match:prefix’).
1397
1398        • The symbol ‘post’.  The portion of the matched string
1399          following the regexp match is written (‘match:suffix’).
1400
1401     For example, changing a match and retaining the text before and
1402     after,
1403
1404          (regexp-substitute #f (string-match "[0-9]+" "number 25 is good")
1405                             'pre "37" 'post)
1406          ⇒ "number 37 is good"
1407
1408     Or matching a YYYYMMDD format date such as ‘20020828’ and
1409     re-ordering and hyphenating the fields.
1410
1411          (define date-regex
1412             "([0-9][0-9][0-9][0-9])([0-9][0-9])([0-9][0-9])")
1413          (define s "Date 20020429 12am.")
1414          (regexp-substitute #f (string-match date-regex s)
1415                             'pre 2 "-" 3 "-" 1 'post " (" 0 ")")
1416          ⇒ "Date 04-29-2002 12am. (20020429)"
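
     When PORT is a real port, the parts are written to it instead of
     being returned.  A minimal sketch, capturing the output with
     ‘call-with-output-string’ so the result can be inspected:

```scheme
(use-modules (ice-9 regex))

(define number-match (string-match "[0-9]+" "number 25 is good"))

;; Write the substituted text to a string port rather than passing #f.
(define written
  (call-with-output-string
    (lambda (port)
      (regexp-substitute port number-match 'pre "37" 'post))))

;; Passing #f builds the same text as a return value.
(define returned
  (regexp-substitute #f number-match 'pre "37" 'post))
```

     Both WRITTEN and RETURNED are ‘"number 37 is good"’.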
1417
1418 -- Scheme Procedure: regexp-substitute/global port regexp target
1419          item...
1420     Write to PORT selected parts of matches of REGEXP in TARGET.  If
1421     PORT is ‘#f’ then form a string from those parts and return that.
1422     REGEXP can be a string or a compiled regex.
1423
1424     This is similar to ‘regexp-substitute’, but allows global
1425     substitutions on TARGET.  Each ITEM behaves as per
1426     ‘regexp-substitute’, with the following differences,
1427
1428        • A function.  Called as ‘(ITEM match)’ with the match structure
1429          for the REGEXP match, it should return a string to be written
1430          to PORT.
1431
1432        • The symbol ‘post’.  This doesn’t output anything, but instead
1433          causes ‘regexp-substitute/global’ to recurse on the unmatched
1434          portion of TARGET.
1435
1436          This _must_ be supplied to perform a global search and replace
1437          on TARGET; without it ‘regexp-substitute/global’ returns after
1438          a single match and output.
1439
1440     For example, to collapse runs of tabs and spaces to a single hyphen
1441     each,
1442
1443          (regexp-substitute/global #f "[ \t]+"  "this   is   the text"
1444                                    'pre "-" 'post)
1445          ⇒ "this-is-the-text"
1446
1447     Or using a function to reverse the letters in each word,
1448
1449          (regexp-substitute/global #f "[a-z]+"  "to do and not-do"
1450            'pre (lambda (m) (string-reverse (match:substring m))) 'post)
1451          ⇒ "ot od dna ton-od"
1452
1453     Without the ‘post’ symbol, just one regexp match is made.  For
1454     example the following is the date example from ‘regexp-substitute’
1455     above, without the need for the separate ‘string-match’ call.
1456
1457          (define date-regex
1458             "([0-9][0-9][0-9][0-9])([0-9][0-9])([0-9][0-9])")
1459          (define s "Date 20020429 12am.")
1460          (regexp-substitute/global #f date-regex s
1461                                    'pre 2 "-" 3 "-" 1 'post " (" 0 ")")
1462
1463          ⇒ "Date 04-29-2002 12am. (20020429)"
1464
1465
1466File: guile.info,  Node: Match Structures,  Next: Backslash Escapes,  Prev: Regexp Functions,  Up: Regular Expressions
1467
6.13.2 Match Structures
-----------------------
1470
1471A “match structure” is the object returned by ‘string-match’ and
1472‘regexp-exec’.  It describes which portion of a string, if any, matched
1473the given regular expression.  Match structures include: a reference to
1474the string that was checked for matches; the starting and ending
1475positions of the regexp match; and, if the regexp included any
1476parenthesized subexpressions, the starting and ending positions of each
1477submatch.
1478
1479   In each of the regexp match functions described below, the ‘match’
1480argument must be a match structure returned by a previous call to
1481‘string-match’ or ‘regexp-exec’.  Most of these functions return some
1482information about the original target string that was matched against a
1483regular expression; we will call that string TARGET for easy reference.
1484
1485 -- Scheme Procedure: regexp-match? obj
1486     Return ‘#t’ if OBJ is a match structure returned by a previous call
1487     to ‘regexp-exec’, or ‘#f’ otherwise.
1488
1489 -- Scheme Procedure: match:substring match [n]
1490     Return the portion of TARGET matched by subexpression number N.
1491     Submatch 0 (the default) represents the entire regexp match.  If
1492     the regular expression as a whole matched, but the subexpression
1493     number N did not match, return ‘#f’.
1494
1495     (define s (string-match "[0-9][0-9][0-9][0-9]" "blah2002foo"))
1496     (match:substring s)
1497     ⇒ "2002"
1498
1499     ;; match starting at offset 6 in the string
1500     (match:substring
1501       (string-match "[0-9][0-9][0-9][0-9]" "blah987654" 6))
1502     ⇒ "7654"
1503
1504 -- Scheme Procedure: match:start match [n]
1505     Return the starting position of submatch number N.
1506
1507   In the following example, the result is 4, since the match starts at
1508character index 4:
1509
1510     (define s (string-match "[0-9][0-9][0-9][0-9]" "blah2002foo"))
1511     (match:start s)
1512     ⇒ 4
1513
1514 -- Scheme Procedure: match:end match [n]
1515     Return the ending position of submatch number N.
1516
1517   In the following example, the result is 8, since the match runs
1518between characters 4 and 8 (i.e. the “2002”).
1519
1520     (define s (string-match "[0-9][0-9][0-9][0-9]" "blah2002foo"))
1521     (match:end s)
1522     ⇒ 8
1523
1524 -- Scheme Procedure: match:prefix match
1525     Return the unmatched portion of TARGET preceding the regexp match.
1526
1527          (define s (string-match "[0-9][0-9][0-9][0-9]" "blah2002foo"))
1528          (match:prefix s)
1529          ⇒ "blah"
1530
1531 -- Scheme Procedure: match:suffix match
1532     Return the unmatched portion of TARGET following the regexp match.
1533
1534     (define s (string-match "[0-9][0-9][0-9][0-9]" "blah2002foo"))
1535     (match:suffix s)
1536     ⇒ "foo"
1537
1538 -- Scheme Procedure: match:count match
1539     Return the number of parenthesized subexpressions from MATCH.  Note
1540     that the entire regular expression match itself counts as a
1541     subexpression, and failed submatches are included in the count.
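
     The counting rule and the behavior of failed submatches can be
     seen in a small sketch.  The pattern is chosen for illustration:
     only one of its two alternatives can take part in a match.

```scheme
(use-modules (ice-9 regex))

(define m (string-match "(a+)|(b+)" "xxbbx"))

(define group-count (match:count m))      ; whole match plus two groups
(define whole (match:substring m))        ; submatch 0
(define first-group (match:substring m 1))  ; this group did not match
(define second-group (match:substring m 2))
(define second-span (cons (match:start m 2) (match:end m 2)))
```

     GROUP-COUNT is ‘3’, WHOLE and SECOND-GROUP are ‘"bb"’,
     FIRST-GROUP is ‘#f’, and SECOND-SPAN is ‘(2 . 4)’.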
1542
1543 -- Scheme Procedure: match:string match
1544     Return the original TARGET string.
1545
1546     (define s (string-match "[0-9][0-9][0-9][0-9]" "blah2002foo"))
1547     (match:string s)
1548     ⇒ "blah2002foo"
1549
1550
1551File: guile.info,  Node: Backslash Escapes,  Prev: Match Structures,  Up: Regular Expressions
1552
6.13.3 Backslash Escapes
------------------------
1555
1556Sometimes you will want a regexp to match characters like ‘*’ or ‘$’
1557exactly.  For example, to check whether a particular string represents a
1558menu entry from an Info node, it would be useful to match it against a
1559regexp like ‘^* [^:]*::’.  However, this won’t work; because the
1560asterisk is a metacharacter, it won’t match the ‘*’ at the beginning of
1561the string.  In this case, we want to make the first asterisk un-magic.
1562
1563   You can do this by preceding the metacharacter with a backslash
1564character ‘\’.  (This is also called “quoting” the metacharacter, and is
1565known as a “backslash escape”.)  When Guile sees a backslash in a
1566regular expression, it considers the following glyph to be an ordinary
1567character, no matter what special meaning it would ordinarily have.
1568Therefore, we can make the above example work by changing the regexp to
1569‘^\* [^:]*::’.  The ‘\*’ sequence tells the regular expression engine to
1570match only a single asterisk in the target string.
1571
1572   Since the backslash is itself a metacharacter, you may force a regexp
1573to match a backslash in the target string by preceding the backslash
1574with itself.  For example, to find variable references in a TeX program,
1575you might want to find occurrences of the string ‘\let\’ followed by any
1576number of alphabetic characters.  The regular expression
1577‘\\let\\[A-Za-z]*’ would do this: the double backslashes in the regexp
1578each match a single backslash in the target string.
1579
1580 -- Scheme Procedure: regexp-quote str
1581     Quote each special character found in STR with a backslash, and
1582     return the resulting string.
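
     A typical use is searching for text that comes from elsewhere and
     may contain metacharacters.  The sketch below does not rely on
     the exact form of the quoted string, only on the fact that the
     quoted pattern matches its input literally.

```scheme
(use-modules (ice-9 regex))

(define literal "a.b*c")
(define rx-literal (make-regexp (regexp-quote literal)))

;; The quoted pattern finds the text exactly as written.
(define found (match:substring (regexp-exec rx-literal "xa.b*cx")))

;; Unquoted, "." and "*" are operators, so "axbbc" would match;
;; quoted, it does not.
(define unquoted-hit (regexp-exec (make-regexp literal) "axbbc"))
(define quoted-miss (regexp-exec rx-literal "axbbc"))
```

     FOUND is ‘"a.b*c"’, UNQUOTED-HIT is a match structure, and
     QUOTED-MISS is ‘#f’.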
1583
1584   *Very important:* Using backslash escapes in Guile source code (as in
1585Emacs Lisp or C) can be tricky, because the backslash character has
1586special meaning for the Guile reader.  For example, if Guile encounters
1587the character sequence ‘\n’ in the middle of a string while processing
1588Scheme code, it replaces those characters with a newline character.
1589Similarly, the character sequence ‘\t’ is replaced by a horizontal tab.
1590Several of these “escape sequences” are processed by the Guile reader
1591before your code is executed.  Unrecognized escape sequences are
1592ignored: if the characters ‘\*’ appear in a string, they will be
1593translated to the single character ‘*’.
1594
1595   This translation is obviously undesirable for regular expressions,
1596since we want to be able to include backslashes in a string in order to
1597escape regexp metacharacters.  Therefore, to make sure that a backslash
1598is preserved in a string in your Guile program, you must use _two_
1599consecutive backslashes:
1600
1601     (define Info-menu-entry-pattern (make-regexp "^\\* [^:]*"))
1602
1603   The string in this example is preprocessed by the Guile reader before
1604any code is executed.  The resulting argument to ‘make-regexp’ is the
1605string ‘^\* [^:]*’, which is what we really want.
1606
1607   This also means that in order to write a regular expression that
1608matches a single backslash character, the regular expression string in
1609the source code must include _four_ backslashes.  Each consecutive pair
1610of backslashes gets translated by the Guile reader to a single
1611backslash, and the resulting double-backslash is interpreted by the
1612regexp engine as matching a single backslash character.  Hence:
1613
     (define tex-variable-pattern (make-regexp "\\\\let\\\\[A-Za-z]*"))
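
   The two levels of escaping can be checked directly.  In this sketch
(the sample strings are made up) the reader halves each pair of
backslashes before the regexp engine sees the pattern.

```scheme
(use-modules (ice-9 regex))

;; Two characters reach the regexp engine: a backslash and an asterisk.
(define escaped-star-length (string-length "\\*"))

;; The Info menu pattern, written with doubled backslashes in source.
(define menu-match
  (string-match "^\\* [^:]*::" "* Regexp Functions::"))
(define menu-text (match:substring menu-match))

;; Four backslashes in source -> two in the string -> one literal
;; backslash matched.  The target "a\\b" is the three characters a \ b.
(define backslash-match (string-match "\\\\" "a\\b"))
(define backslash-position (match:start backslash-match))
```

   ESCAPED-STAR-LENGTH is ‘2’, MENU-TEXT is ‘"* Regexp Functions::"’,
and BACKSLASH-POSITION is ‘1’.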
1615
1616   The reason for the unwieldiness of this syntax is historical.  Both
1617regular expression pattern matchers and Unix string processing systems
1618have traditionally used backslashes with the special meanings described
1619above.  The POSIX regular expression specification and ANSI C standard
1620both require these semantics.  Attempting to abandon either convention
1621would cause other kinds of compatibility problems, possibly more severe
1622ones.  Therefore, without extending the Scheme reader to support strings
1623with different quoting conventions (an ungainly and confusing extension
1624when implemented in other languages), we must adhere to this cumbersome
1625escape syntax.
1626
1627
1628File: guile.info,  Node: LALR(1) Parsing,  Next: PEG Parsing,  Prev: Regular Expressions,  Up: API Reference
1629
6.14 LALR(1) Parsing
====================
1632
1633The ‘(system base lalr)’ module provides the ‘lalr-scm’ LALR(1) parser
1634generator by Dominique Boucher (https://github.com/schemeway/lalr-scm/).
1635‘lalr-scm’ uses the same algorithm as GNU Bison (*note Introduction to
1636Bison: (bison)Introduction.).  Parsers are defined using the
1637‘lalr-parser’ macro.
1638
1639 -- Scheme Syntax: lalr-parser [OPTIONS] TOKENS RULES...
1640     Generate an LALR(1) syntax analyzer.  TOKENS is a list of symbols
1641     representing the terminal symbols of the grammar.  RULES are the
1642     grammar production rules.
1643
1644     Each rule has the form ‘(NON-TERMINAL (RHS ...) : ACTION ...)’,
1645     where NON-TERMINAL is the name of the rule, RHS are the right-hand
1646     sides, i.e., the production rule, and ACTION is a semantic action
1647     associated with the rule.
1648
1649     The generated parser is a two-argument procedure that takes a
1650     “tokenizer” and a “syntax error procedure”.  The tokenizer should
1651     be a thunk that returns lexical tokens as produced by
1652     ‘make-lexical-token’.  The syntax error procedure may be called
1653     with at least an error message (a string), and optionally the
1654     lexical token that caused the error.
1655
1656   Please refer to the ‘lalr-scm’ documentation for details.
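
   As a rough sketch of how these pieces fit together, the following
defines a parser for sums such as ‘1 + 2 + 3’.  The grammar, the token
names ‘NUM’ and ‘PLUS’, and the helpers ‘make-tokenizer’ and
‘report-syntax-error’ are all invented for this example.  It also
assumes the ‘lalr-scm’ conventions that ‘$1’, ‘$2’, ... name the values
of the right-hand side symbols and that the tokenizer signals the end
of input by returning the symbol ‘*eoi*’.

```scheme
(use-modules (system base lalr))

;; Grammar: expr -> expr PLUS NUM | NUM
(define sum-parser
  (lalr-parser
   (NUM PLUS)                          ; terminal symbols
   (expr (expr PLUS NUM) : (+ $1 $3)   ; add the running sum and NUM
         (NUM)           : $1)))

;; Hypothetical helper: a thunk that hands out TOKENS one at a time,
;; then returns '*eoi* forever.
(define (make-tokenizer tokens)
  (lambda ()
    (if (null? tokens)
        '*eoi*
        (let ((token (car tokens)))
          (set! tokens (cdr tokens))
          token))))

;; Hypothetical error procedure: receives a message and, optionally,
;; the offending token.
(define (report-syntax-error message . args)
  (error "parse error:" message args))

;; make-lexical-token takes a category, a source location (#f here)
;; and a semantic value.
(define sum-result
  (sum-parser
   (make-tokenizer
    (list (make-lexical-token 'NUM #f 1)
          (make-lexical-token 'PLUS #f '+)
          (make-lexical-token 'NUM #f 2)
          (make-lexical-token 'PLUS #f '+)
          (make-lexical-token 'NUM #f 3)))
   report-syntax-error))
```

   Under those assumptions SUM-RESULT is ‘6’.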
1657
1658
1659File: guile.info,  Node: PEG Parsing,  Next: Read/Load/Eval/Compile,  Prev: LALR(1) Parsing,  Up: API Reference
1660
6.15 PEG Parsing
================
1663
1664Parsing Expression Grammars (PEGs) are a way of specifying formal
1665languages for text processing.  They can be used either for matching
1666(like regular expressions) or for building recursive descent parsers
1667(like lex/yacc).  Guile uses a superset of PEG syntax that allows more
1668control over what information is preserved during parsing.
1669
1670   Wikipedia has a clear and concise introduction to PEGs if you want to
1671familiarize yourself with the syntax:
1672<http://en.wikipedia.org/wiki/Parsing_expression_grammar>.
1673
1674   The ‘(ice-9 peg)’ module works by compiling PEGs down to lambda
1675expressions.  These can either be stored in variables at compile-time by
1676the define macros (‘define-peg-pattern’ and
1677‘define-peg-string-patterns’) or calculated explicitly at runtime with
1678the compile functions (‘compile-peg-pattern’ and ‘peg-string-compile’).
1679
1680   They can then be used for either parsing (‘match-pattern’) or
1681searching (‘search-for-pattern’).  For convenience, ‘search-for-pattern’
1682also takes pattern literals in case you want to inline a simple search
1683(people often use regular expressions this way).
1684
1685   The rest of this documentation consists of a syntax reference, an API
1686reference, and a tutorial.
1687
1688* Menu:
1689
1690* PEG Syntax Reference::
1691* PEG API Reference::
1692* PEG Tutorial::
1693* PEG Internals::
1694
1695
1696File: guile.info,  Node: PEG Syntax Reference,  Next: PEG API Reference,  Up: PEG Parsing
1697
6.15.1 PEG Syntax Reference
---------------------------
1700
1701Normal PEG Syntax:
1702..................
1703
1704 -- PEG Pattern: sequence a b
1705     Parses A.  If this succeeds, continues to parse B from the end of
1706     the text parsed as A.  Succeeds if both A and B succeed.
1707
1708     ‘"a b"’
1709
1710     ‘(and a b)’
1711
1712 -- PEG Pattern: ordered choice a b
1713     Parses A.  If this fails, backtracks and parses B.  Succeeds if
1714     either A or B succeeds.
1715
1716     ‘"a/b"’
1717
1718     ‘(or a b)’
1719
1720 -- PEG Pattern: zero or more a
1721     Parses A as many times in a row as it can, starting each A at the
1722     end of the text parsed by the previous A.  Always succeeds.
1723
1724     ‘"a*"’
1725
1726     ‘(* a)’
1727
1728 -- PEG Pattern: one or more a
1729     Parses A as many times in a row as it can, starting each A at the
1730     end of the text parsed by the previous A.  Succeeds if at least one
1731     A was parsed.
1732
1733     ‘"a+"’
1734
1735     ‘(+ a)’
1736
1737 -- PEG Pattern: optional a
     Tries to parse A.  Always succeeds, whether or not A was parsed.
1739
1740     ‘"a?"’
1741
1742     ‘(? a)’
1743
1744 -- PEG Pattern: followed by a
1745     Makes sure it is possible to parse A, but does not actually parse
1746     it.  Succeeds if A would succeed.
1747
1748     ‘"&a"’
1749
1750     ‘(followed-by a)’
1751
1752 -- PEG Pattern: not followed by a
1753     Makes sure it is impossible to parse A, but does not actually parse
1754     it.  Succeeds if A would fail.
1755
1756     ‘"!a"’
1757
1758     ‘(not-followed-by a)’
1759
1760 -- PEG Pattern: string literal ``abc''
1761     Parses the string "ABC".  Succeeds if that parsing succeeds.
1762
1763     ‘"'abc'"’
1764
1765     ‘"abc"’
1766
1767 -- PEG Pattern: any character
1768     Parses any single character.  Succeeds unless there is no more text
1769     to be parsed.
1770
1771     ‘"."’
1772
1773     ‘peg-any’
1774
1775 -- PEG Pattern: character class a b
1776     Alternative syntax for “Ordered Choice A B” if A and B are
1777     characters.
1778
1779     ‘"[ab]"’
1780
1781     ‘(or "a" "b")’
1782
1783 -- PEG Pattern: range of characters a z
1784     Parses any character falling between A and Z.
1785
1786     ‘"[a-z]"’
1787
1788     ‘(range #\a #\z)’
1789
1790   Example:
1791
1792     "(a !b / c &d*) 'e'+"
1793
1794   Would be:
1795
1796     (and
1797      (or
1798       (and a (not-followed-by b))
1799       (and c (followed-by (* d))))
1800      (+ "e"))
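
   To see the two notations side by side, the following sketch defines
the same pattern in each form and runs both on one input.  The
nonterminal names ‘ab-sexp’, ‘ab-str’ and ‘a-not-b’ are made up for
this example.

```scheme
(use-modules (ice-9 peg))

;; S-expression form: one or more "a", then an optional "b".
(define-peg-pattern ab-sexp body (and (+ "a") (? "b")))

;; The same pattern in string form.
(define-peg-string-patterns "ab-str <- 'a'+ 'b'?")

(define sexp-text (peg:substring (match-pattern ab-sexp "aabc")))
(define str-text (peg:substring (match-pattern ab-str "aabc")))

;; Lookahead inspects the input without consuming it.
(define-peg-pattern a-not-b body (and "a" (not-followed-by "b")))
(define lookahead-ok (peg:substring (match-pattern a-not-b "ac")))
(define lookahead-fail (match-pattern a-not-b "ab"))
```

   SEXP-TEXT and STR-TEXT are both ‘"aab"’, LOOKAHEAD-OK is ‘"a"’ (the
‘c’ is checked but not consumed), and LOOKAHEAD-FAIL is ‘#f’.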
1801
1802Extended Syntax
1803...............
1804
1805There is some extra syntax for S-expressions.
1806
1807 -- PEG Pattern: ignore a
     Ignore the text matching A.
1809
1810 -- PEG Pattern: capture a
1811     Capture the text matching A.
1812
1813 -- PEG Pattern: peg a
1814     Embed the PEG pattern A using string syntax.
1815
1816   Example:
1817
1818     "!a / 'b'"
1819
1820   Is equivalent to
1821
1822     (or (peg "!a") "b")
1823
1824   and
1825
1826     (or (not-followed-by a) "b")
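
   As a small sketch of ‘ignore’, the pattern below matches a
‘key=value’ pair while marking the ‘=’ as text to leave out of the
parse tree.  The nonterminal name ‘key-value’ and the input are
invented for this example; the exact shape of the resulting tree is not
shown here and is best inspected interactively.

```scheme
(use-modules (ice-9 peg))

;; Lowercase letters, an ignored "=", then digits.
(define-peg-pattern key-value body
  (and (+ (range #\a #\z)) (ignore "=") (+ (range #\0 #\9))))

(define kv-match (match-pattern key-value "abc=123"))

;; The ignored text is still consumed, so the match spans the
;; whole input.
(define kv-span (list (peg:start kv-match) (peg:end kv-match)))

;; The tree holds what was kept; print it to inspect its shape.
(define kv-tree (peg:tree kv-match))
```

   KV-SPAN is ‘(0 7)’: ignoring text affects what is stored in the
tree, not how much input is parsed.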
1827
1828
1829File: guile.info,  Node: PEG API Reference,  Next: PEG Tutorial,  Prev: PEG Syntax Reference,  Up: PEG Parsing
1830
6.15.2 PEG API Reference
------------------------
1833
1834Define Macros
1835.............
1836
1837The most straightforward way to define a PEG is by using one of the
1838define macros (both of these macroexpand into ‘define’ expressions).
1839These macros bind parsing functions to variables.  These parsing
1840functions may be invoked by ‘match-pattern’ or ‘search-for-pattern’,
1841which return a PEG match record.  Raw data can be retrieved from this
1842record with the PEG match deconstructor functions.  More complicated
1843(and perhaps enlightening) examples can be found in the tutorial.
1844
1845 -- Scheme Macro: define-peg-string-patterns peg-string
1846     Defines all the nonterminals in the PEG PEG-STRING.  More
1847     precisely, ‘define-peg-string-patterns’ takes a superset of PEGs.
1848     A normal PEG has a ‘<-’ between the nonterminal and the pattern.
1849     ‘define-peg-string-patterns’ uses this symbol to determine what
1850     information it should propagate up the parse tree.  The normal ‘<-’
1851     propagates the matched text up the parse tree, ‘<--’ propagates the
1852     matched text up the parse tree tagged with the name of the
1853     nonterminal, and ‘<’ discards that matched text and propagates
1854     nothing up the parse tree.  Also, nonterminals may consist of any
1855     alphanumeric character or a “-” character (in normal PEGs
1856     nonterminals can only be alphabetic).
1857
     For example, if we define:
1859          (define-peg-string-patterns
1860            "as <- 'a'+
1861          bs <- 'b'+
1862          as-or-bs <- as/bs")
1863          (define-peg-string-patterns
1864            "as-tag <-- 'a'+
1865          bs-tag <-- 'b'+
1866          as-or-bs-tag <-- as-tag/bs-tag")
1867     Then:
1868          (match-pattern as-or-bs "aabbcc") ⇒
1869          #<peg start: 0 end: 2 string: aabbcc tree: aa>
1870          (match-pattern as-or-bs-tag "aabbcc") ⇒
1871          #<peg start: 0 end: 2 string: aabbcc tree: (as-or-bs-tag (as-tag aa))>
1872
1873     Note that in doing this, we have bound 6 variables at the toplevel
1874     (AS, BS, AS-OR-BS, AS-TAG, BS-TAG, and AS-OR-BS-TAG).
1875
1876 -- Scheme Macro: define-peg-pattern name capture-type peg-sexp
1877     Defines a single nonterminal NAME.  CAPTURE-TYPE determines how
1878     much information is passed up the parse tree.  PEG-SEXP is a PEG in
1879     S-expression form.
1880
1881     Possible values for capture-type:
1882
1883     ‘all’
1884          passes the matched text up the parse tree tagged with the name
1885          of the nonterminal.
1886     ‘body’
1887          passes the matched text up the parse tree.
1888     ‘none’
1889          passes nothing up the parse tree.
1890
     For example, if we define:
1892          (define-peg-pattern as body (+ "a"))
1893          (define-peg-pattern bs body (+ "b"))
1894          (define-peg-pattern as-or-bs body (or as bs))
1895          (define-peg-pattern as-tag all (+ "a"))
1896          (define-peg-pattern bs-tag all (+ "b"))
1897          (define-peg-pattern as-or-bs-tag all (or as-tag bs-tag))
1898     Then:
1899          (match-pattern as-or-bs "aabbcc") ⇒
1900          #<peg start: 0 end: 2 string: aabbcc tree: aa>
1901          (match-pattern as-or-bs-tag "aabbcc") ⇒
1902          #<peg start: 0 end: 2 string: aabbcc tree: (as-or-bs-tag (as-tag aa))>
1903
1904     Note that in doing this, we have bound 6 variables at the toplevel
1905     (AS, BS, AS-OR-BS, AS-TAG, BS-TAG, and AS-OR-BS-TAG).
1906
1907Compile Functions
1908.................
1909
1910It is sometimes useful to be able to compile anonymous PEG patterns at
1911runtime.  These functions let you do that using either syntax.
1912
1913 -- Scheme Procedure: peg-string-compile peg-string capture-type
1914     Compiles the PEG pattern in PEG-STRING propagating according to
1915     CAPTURE-TYPE (capture-type can be any of the values from
1916     ‘define-peg-pattern’).
1917
1918 -- Scheme Procedure: compile-peg-pattern peg-sexp capture-type
1919     Compiles the PEG pattern in PEG-SEXP propagating according to
1920     CAPTURE-TYPE (capture-type can be any of the values from
1921     ‘define-peg-pattern’).
1922
1923   The functions return syntax objects, which can be useful if you want
1924to use them in macros.  If all you want is to define a new nonterminal,
1925you can do the following:
1926
1927     (define exp '(+ "a"))
1928     (define as (compile (compile-peg-pattern exp 'body)))
1929
1930   You can use this nonterminal with all of the regular PEG functions:
1931
1932     (match-pattern as "aaaaa") ⇒
     #<peg start: 0 end: 5 string: aaaaa tree: aaaaa>
1934
1935Parsing & Matching Functions
1936............................
1937
1938For our purposes, “parsing” means parsing a string into a tree starting
1939from the first character, while “matching” means searching through the
1940string for a substring.  In practice, the only difference between the
1941two functions is that ‘match-pattern’ gives up if it can’t find a valid
1942substring starting at index 0 and ‘search-for-pattern’ keeps looking.
1943They are both equally capable of “parsing” and “matching” given those
1944constraints.
1945
1946 -- Scheme Procedure: match-pattern nonterm string
1947     Parses STRING using the PEG stored in NONTERM.  If no match was
1948     found, ‘match-pattern’ returns false.  If a match was found, a PEG
1949     match record is returned.
1950
1951     The ‘capture-type’ argument to ‘define-peg-pattern’ allows you to
1952     choose what information to hold on to while parsing.  The options
1953     are:
1954
1955     ‘all’
1956          tag the matched text with the nonterminal
1957     ‘body’
1958          just the matched text
1959     ‘none’
1960          nothing
1961
1962          (define-peg-pattern as all (+ "a"))
1963          (match-pattern as "aabbcc") ⇒
1964          #<peg start: 0 end: 2 string: aabbcc tree: (as aa)>
1965
1966          (define-peg-pattern as body (+ "a"))
1967          (match-pattern as "aabbcc") ⇒
1968          #<peg start: 0 end: 2 string: aabbcc tree: aa>
1969
1970          (define-peg-pattern as none (+ "a"))
1971          (match-pattern as "aabbcc") ⇒
1972          #<peg start: 0 end: 2 string: aabbcc tree: ()>
1973
1974          (define-peg-pattern bs body (+ "b"))
1975          (match-pattern bs "aabbcc") ⇒
1976          #f
1977
1978 -- Scheme Macro: search-for-pattern nonterm-or-peg string
1979     Searches through STRING looking for a matching subexpression.
1980     NONTERM-OR-PEG can either be a nonterminal or a literal PEG
1981     pattern.  When a literal PEG pattern is provided,
1982     ‘search-for-pattern’ works very similarly to the regular expression
1983     searches many hackers are used to.  If no match was found,
1984     ‘search-for-pattern’ returns false.  If a match was found, a PEG
1985     match record is returned.
1986
1987          (define-peg-pattern as body (+ "a"))
1988          (search-for-pattern as "aabbcc") ⇒
1989          #<peg start: 0 end: 2 string: aabbcc tree: aa>
1990          (search-for-pattern (+ "a") "aabbcc") ⇒
1991          #<peg start: 0 end: 2 string: aabbcc tree: aa>
1992          (search-for-pattern "'a'+" "aabbcc") ⇒
1993          #<peg start: 0 end: 2 string: aabbcc tree: aa>
1994
1995          (define-peg-pattern as all (+ "a"))
1996          (search-for-pattern as "aabbcc") ⇒
1997          #<peg start: 0 end: 2 string: aabbcc tree: (as aa)>
1998
1999          (define-peg-pattern bs body (+ "b"))
2000          (search-for-pattern bs "aabbcc") ⇒
2001          #<peg start: 2 end: 4 string: aabbcc tree: bb>
2002          (search-for-pattern (+ "b") "aabbcc") ⇒
2003          #<peg start: 2 end: 4 string: aabbcc tree: bb>
2004          (search-for-pattern "'b'+" "aabbcc") ⇒
2005          #<peg start: 2 end: 4 string: aabbcc tree: bb>
2006
2007          (define-peg-pattern zs body (+ "z"))
2008          (search-for-pattern zs "aabbcc") ⇒
2009          #f
2010          (search-for-pattern (+ "z") "aabbcc") ⇒
2011          #f
2012          (search-for-pattern "'z'+" "aabbcc") ⇒
2013          #f
2014
2015PEG Match Records
2016.................
2017
2018The ‘match-pattern’ and ‘search-for-pattern’ functions both return PEG
2019match records.  Actual information can be extracted from these with the
2020following functions.
2021
2022 -- Scheme Procedure: peg:string match-record
2023     Returns the original string that was parsed in the creation of
2024     ‘match-record’.
2025
2026 -- Scheme Procedure: peg:start match-record
2027     Returns the index of the first parsed character in the original
2028     string (from ‘peg:string’).  If this is the same as ‘peg:end’,
2029     nothing was parsed.
2030
2031 -- Scheme Procedure: peg:end match-record
2032     Returns one more than the index of the last parsed character in the
2033     original string (from ‘peg:string’).  If this is the same as
2034     ‘peg:start’, nothing was parsed.
2035
2036 -- Scheme Procedure: peg:substring match-record
2037     Returns the substring parsed by ‘match-record’.  This is equivalent
2038     to ‘(substring (peg:string match-record) (peg:start match-record)
2039     (peg:end match-record))’.
2040
2041 -- Scheme Procedure: peg:tree match-record
2042     Returns the tree parsed by ‘match-record’.
2043
2044 -- Scheme Procedure: peg-record? match-record
2045     Returns true if ‘match-record’ is a PEG match record, or false
2046     otherwise.
2047
2048   Example:
2049     (define-peg-pattern bs all (peg "'b'+"))
2050
2051     (search-for-pattern bs "aabbcc") ⇒
2052     #<peg start: 2 end: 4 string: aabbcc tree: (bs bb)>
2053
2054     (let ((pm (search-for-pattern bs "aabbcc")))
2055        `((string ,(peg:string pm))
2056          (start ,(peg:start pm))
2057          (end ,(peg:end pm))
2058          (substring ,(peg:substring pm))
2059          (tree ,(peg:tree pm))
2060          (record? ,(peg-record? pm)))) ⇒
2061     ((string "aabbcc")
2062      (start 2)
2063      (end 4)
2064      (substring "bb")
2065      (tree (bs "bb"))
2066      (record? #t))
2067
2068Miscellaneous
2069.............
2070
2071 -- Scheme Procedure: context-flatten tst lst
2072     Takes a predicate TST and a list LST.  Flattens LST until all
2073     elements are either atoms or satisfy TST.  If LST itself satisfies
2074     TST, ‘(list lst)’ is returned (this is a flat list whose only
2075     element satisfies TST).
2076
          (context-flatten
           (lambda (x) (and (number? (car x)) (= (car x) 1)))
           '(2 2 (1 1 (2 2)) (2 2 (1 1)))) ⇒
          (2 2 (1 1 (2 2)) 2 2 (1 1))
          (context-flatten
           (lambda (x) (and (number? (car x)) (= (car x) 1)))
           '(1 1 (1 1 (2 2)) (2 2 (1 1)))) ⇒
          ((1 1 (1 1 (2 2)) (2 2 (1 1))))
2081
2082     If you’re wondering why this is here, take a look at the tutorial.
2083
2084 -- Scheme Procedure: keyword-flatten terms lst
2085     A less general form of ‘context-flatten’.  Takes a list of terminal
2086     atoms ‘terms’ and flattens LST until all elements are either atoms,
2087     or lists which have an atom from ‘terms’ as their first element.
2088          (keyword-flatten '(a b) '(c a b (a c) (b c) (c (b a) (c a)))) ⇒
2089          (c a b (a c) (b c) c (b a) c a)
2090
2091     If you’re wondering why this is here, take a look at the tutorial.
2092
2093
2094File: guile.info,  Node: PEG Tutorial,  Next: PEG Internals,  Prev: PEG API Reference,  Up: PEG Parsing
2095
20966.15.3 PEG Tutorial
2097-------------------
2098
2099Parsing /etc/passwd
2100...................
2101
2102This example will show how to parse /etc/passwd using PEGs.
2103
2104   First we define an example /etc/passwd file:
2105
2106     (define *etc-passwd*
2107       "root:x:0:0:root:/root:/bin/bash
2108     daemon:x:1:1:daemon:/usr/sbin:/bin/sh
2109     bin:x:2:2:bin:/bin:/bin/sh
2110     sys:x:3:3:sys:/dev:/bin/sh
2111     nobody:x:65534:65534:nobody:/nonexistent:/bin/sh
2112     messagebus:x:103:107::/var/run/dbus:/bin/false
2113     ")
2114
2115   As a first pass at this, we might want to have all the entries in
2116/etc/passwd in a list.
2117
2118   Doing this with string-based PEG syntax would look like this:
2119     (define-peg-string-patterns
2120       "passwd <- entry* !.
2121     entry <-- (! NL .)* NL*
2122     NL < '\n'")
2123
2124   A ‘passwd’ file is 0 or more entries (‘entry*’) until the end of the
2125file (‘!.’ (‘.’ is any character, so ‘!.’ means “not anything”)).  We
2126want to capture the data in the nonterminal ‘passwd’, but not tag it
2127with the name, so we use ‘<-’.
2128
2129   An entry is a series of 0 or more characters that aren’t newlines
2130(‘(! NL .)*’) followed by 0 or more newlines (‘NL*’).  We want to tag
2131all the entries with ‘entry’, so we use ‘<--’.
2132
2133   A newline is just a literal newline (‘'\n'’).  We don’t want a bunch
2134of newlines cluttering up the output, so we use ‘<’ to throw away the
2135captured data.
2136
2137   Here is the same PEG defined using S-expressions:
2138     (define-peg-pattern passwd body (and (* entry) (not-followed-by peg-any)))
2139     (define-peg-pattern entry all (and (* (and (not-followed-by NL) peg-any))
2140     			       (* NL)))
2141     (define-peg-pattern NL none "\n")
2142
2143   Obviously this is much more verbose.  On the other hand, it’s more
2144explicit, and thus easier to build automatically.  However, there are
2145some tricks that make S-expressions easier to use in some cases.  One is
2146the ‘ignore’ keyword; the string syntax has no way to say “throw away
2147this text” except breaking it out into a separate nonterminal.  For
2148instance, to throw away the newlines we had to define ‘NL’.  In the
2149S-expression syntax, we could have simply written ‘(ignore "\n")’.
2150Also, for the cases where string syntax is really much cleaner, the
2151‘peg’ keyword can be used to embed string syntax in S-expression syntax.
2152For instance, we could have written:
2153
2154     (define-peg-pattern passwd body (peg "entry* !."))
2155
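   As a sketch of the ‘ignore’ keyword mentioned above, the grammar can
be written without a separate ‘NL’ nonterminal.  The names ‘entry2’ and
‘passwd2’ are hypothetical, chosen only to avoid clashing with the
definitions used elsewhere in this tutorial.

```scheme
(use-modules (ice-9 peg))

;; An entry is a run of non-newline characters followed by newlines.
;; The newlines are matched, but `ignore' keeps them out of the tree.
(define-peg-pattern entry2 all
  (and (* (and (not-followed-by "\n") peg-any))
       (* (ignore "\n"))))

;; Zero or more entries, then end of input.
(define-peg-pattern passwd2 body
  (and (* entry2) (not-followed-by peg-any)))

(define m (match-pattern passwd2 "a:x:0\nb:x:1\n"))

(write (peg:tree m))
(newline)
```
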
2156   However we define it, parsing ‘*etc-passwd*’ with the ‘passwd’
2157nonterminal yields the same results:
2158
2159     (peg:tree (match-pattern passwd *etc-passwd*)) ⇒
2160     ((entry "root:x:0:0:root:/root:/bin/bash")
2161      (entry "daemon:x:1:1:daemon:/usr/sbin:/bin/sh")
2162      (entry "bin:x:2:2:bin:/bin:/bin/sh")
2163      (entry "sys:x:3:3:sys:/dev:/bin/sh")
2164      (entry "nobody:x:65534:65534:nobody:/nonexistent:/bin/sh")
2165      (entry "messagebus:x:103:107::/var/run/dbus:/bin/false"))
2166
2167   However, here is something to be wary of:
2168
2169     (peg:tree (match-pattern passwd "one entry")) ⇒
2170     (entry "one entry")
2171
2172   By default, the parse trees generated by PEGs are compressed as much
2173as possible without losing information.  It may not look like this is
2174what you want at first, but uncompressed parse trees are an enormous
2175headache (there’s no easy way to predict how deep particular lists will
2176nest, there are empty lists littered everywhere, etc.  etc.).  One
2177side-effect of this, however, is that sometimes the compressor is too
2178aggressive.  No information is discarded when ‘((entry "one entry"))’ is
2179compressed to ‘(entry "one entry")’, but in this particular case it
2180probably isn’t what we want.
2181
2182   There are two functions for easily dealing with this:
2183‘keyword-flatten’ and ‘context-flatten’.  The ‘keyword-flatten’ function
2184takes a list of keywords and a list to flatten, then tries to coerce the
2185list such that the first element of all sublists is one of the keywords.
2186The ‘context-flatten’ function is similar, but instead of a list of
2187keywords it takes a predicate that should indicate whether a given
2188sublist is good enough (refer to the API reference for more details).
2189
2190   What we want here is ‘keyword-flatten’.
2191     (keyword-flatten '(entry) (peg:tree (match-pattern passwd *etc-passwd*))) ⇒
2192     ((entry "root:x:0:0:root:/root:/bin/bash")
2193      (entry "daemon:x:1:1:daemon:/usr/sbin:/bin/sh")
2194      (entry "bin:x:2:2:bin:/bin:/bin/sh")
2195      (entry "sys:x:3:3:sys:/dev:/bin/sh")
2196      (entry "nobody:x:65534:65534:nobody:/nonexistent:/bin/sh")
2197      (entry "messagebus:x:103:107::/var/run/dbus:/bin/false"))
2198     (keyword-flatten '(entry) (peg:tree (match-pattern passwd "one entry"))) ⇒
2199     ((entry "one entry"))
2200
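   For comparison, the same result can be sketched with
‘context-flatten’ by supplying a predicate that accepts lists headed by
the symbol ‘entry’.  The predicate name ‘entry?’ is an assumption made
for this illustration, and the trees are written out by hand rather than
produced by a parse.

```scheme
(use-modules (ice-9 peg))

;; Hypothetical predicate: true for lists whose first element is `entry'.
(define (entry? x)
  (and (pair? x) (eq? (car x) 'entry)))

;; The over-compressed single-entry tree gets wrapped in a list ...
(define single
  (context-flatten entry? '(entry "one entry")))

;; ... while a tree that is already a list of entries is left alone.
(define several
  (context-flatten entry? '((entry "a") (entry "b"))))

(write single)  (newline)   ; ((entry "one entry"))
(write several) (newline)   ; ((entry "a") (entry "b"))
```
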
2201   Of course, this is a somewhat contrived example.  In practice we
2202would probably just tag the ‘passwd’ nonterminal to remove the ambiguity
2203(using either the ‘all’ keyword for S-expressions or the ‘<--’ symbol
for strings):
2205
2206     (define-peg-pattern tag-passwd all (peg "entry* !."))
2207     (peg:tree (match-pattern tag-passwd *etc-passwd*)) ⇒
2208     (tag-passwd
2209       (entry "root:x:0:0:root:/root:/bin/bash")
2210       (entry "daemon:x:1:1:daemon:/usr/sbin:/bin/sh")
2211       (entry "bin:x:2:2:bin:/bin:/bin/sh")
2212       (entry "sys:x:3:3:sys:/dev:/bin/sh")
2213       (entry "nobody:x:65534:65534:nobody:/nonexistent:/bin/sh")
2214       (entry "messagebus:x:103:107::/var/run/dbus:/bin/false"))
     (peg:tree (match-pattern tag-passwd "one entry")) ⇒
2216     (tag-passwd
2217       (entry "one entry"))
2218
2219   If you’re ever uncertain about the potential results of parsing
2220something, remember the two absolute rules:
2221  1. No parsing information will ever be discarded.
2222  2. There will never be any lists with fewer than 2 elements.
2223
   For the purposes of (1), "parsing information" means things tagged
with the ‘all’ keyword or the ‘<--’ symbol.  Plain strings will be
concatenated.
2227
2228   Let’s extend this example a bit more and actually pull some useful
2229information out of the passwd file:
2230
2231     (define-peg-string-patterns
2232       "passwd <-- entry* !.
2233     entry <-- login C pass C uid C gid C nameORcomment C homedir C shell NL*
2234     login <-- text
2235     pass <-- text
2236     uid <-- [0-9]*
2237     gid <-- [0-9]*
2238     nameORcomment <-- text
2239     homedir <-- path
2240     shell <-- path
2241     path <-- (SLASH pathELEMENT)*
2242     pathELEMENT <-- (!NL !C  !'/' .)*
2243     text <- (!NL !C  .)*
2244     C < ':'
2245     NL < '\n'
2246     SLASH < '/'")
2247
2248   This produces rather pretty parse trees:
2249     (passwd
2250       (entry (login "root")
2251              (pass "x")
2252              (uid "0")
2253              (gid "0")
2254              (nameORcomment "root")
2255              (homedir (path (pathELEMENT "root")))
2256              (shell (path (pathELEMENT "bin") (pathELEMENT "bash"))))
2257       (entry (login "daemon")
2258              (pass "x")
2259              (uid "1")
2260              (gid "1")
2261              (nameORcomment "daemon")
2262              (homedir
2263                (path (pathELEMENT "usr") (pathELEMENT "sbin")))
2264              (shell (path (pathELEMENT "bin") (pathELEMENT "sh"))))
2265       (entry (login "bin")
2266              (pass "x")
2267              (uid "2")
2268              (gid "2")
2269              (nameORcomment "bin")
2270              (homedir (path (pathELEMENT "bin")))
2271              (shell (path (pathELEMENT "bin") (pathELEMENT "sh"))))
2272       (entry (login "sys")
2273              (pass "x")
2274              (uid "3")
2275              (gid "3")
2276              (nameORcomment "sys")
2277              (homedir (path (pathELEMENT "dev")))
2278              (shell (path (pathELEMENT "bin") (pathELEMENT "sh"))))
2279       (entry (login "nobody")
2280              (pass "x")
2281              (uid "65534")
2282              (gid "65534")
2283              (nameORcomment "nobody")
2284              (homedir (path (pathELEMENT "nonexistent")))
2285              (shell (path (pathELEMENT "bin") (pathELEMENT "sh"))))
2286       (entry (login "messagebus")
2287              (pass "x")
2288              (uid "103")
2289              (gid "107")
2290              nameORcomment
2291              (homedir
2292                (path (pathELEMENT "var")
2293                      (pathELEMENT "run")
2294                      (pathELEMENT "dbus")))
2295              (shell (path (pathELEMENT "bin") (pathELEMENT "false")))))
2296
2297   Notice that when there’s no entry in a field (e.g.  ‘nameORcomment’
2298for messagebus) the symbol is inserted.  This is the “don’t throw away
any information” rule: we successfully matched a ‘nameORcomment’ of 0
2300characters (since we used ‘*’ when defining it).  This is usually what
2301you want, because it allows you to e.g.  use ‘list-ref’ to pull out
2302elements (since they all have known offsets).
2303
2304   If you’d prefer not to have symbols for empty matches, you can
2305replace the ‘*’ with a ‘+’ and add a ‘?’ after the ‘nameORcomment’ in
2306‘entry’.  Then it will try to parse 1 or more characters, fail
2307(inserting nothing into the parse tree), but continue because it didn’t
2308have to match the nameORcomment to continue.
2309
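   The ‘+’ and ‘?’ variant described above can be sketched on a reduced
grammar.  The three-field record format and the nonterminal names
‘rec’, ‘comment’ and so on are hypothetical, introduced only to keep the
example short; they are not part of the passwd grammar above.

```scheme
(use-modules (ice-9 peg))

;; `text' now needs at least one character, and `comment' is optional
;; in `rec', so an empty field can simply be skipped.
(define-peg-string-patterns
  "rec <-- login C comment? C shell
login <-- text
comment <-- text
shell <-- text
text <- (!C .)+
C < ':'")

(define with-comment    (match-pattern rec "root:admin:bash"))
(define without-comment (match-pattern rec "bus::sh"))

(write (peg:tree with-comment))    (newline)
(write (peg:tree without-comment)) (newline)
```

   Both inputs are accepted; comparing the two printed trees shows how
the empty field is treated under this variant.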
2310Embedding Arithmetic Expressions
2311................................
2312
2313We can parse simple mathematical expressions with the following PEG:
2314
2315     (define-peg-string-patterns
2316       "expr <- sum
2317     sum <-- (product ('+' / '-') sum) / product
2318     product <-- (value ('*' / '/') product) / value
2319     value <-- number / '(' expr ')'
2320     number <-- [0-9]+")
2321
2322   Then:
2323     (peg:tree (match-pattern expr "1+1/2*3+(1+1)/2")) ⇒
2324     (sum (product (value (number "1")))
2325          "+"
2326          (sum (product
2327                 (value (number "1"))
2328                 "/"
2329                 (product
2330                   (value (number "2"))
2331                   "*"
2332                   (product (value (number "3")))))
2333               "+"
2334               (sum (product
2335                      (value "("
2336                             (sum (product (value (number "1")))
2337                                  "+"
2338                                  (sum (product (value (number "1")))))
2339                             ")")
2340                      "/"
2341                      (product (value (number "2")))))))
2342
2343   There is very little wasted effort in this PEG. The ‘number’
2344nonterminal has to be tagged because otherwise the numbers might run
2345together with the arithmetic expressions during the string concatenation
2346stage of parse-tree compression (the parser will see “1” followed by “/”
2347and decide to call it “1/”).  When in doubt, tag.
2348
2349   It is very easy to turn these parse trees into lisp expressions:
2350
2351     (define (parse-sum sum left . rest)
2352       (if (null? rest)
2353           (apply parse-product left)
2354           (list (string->symbol (car rest))
2355     	    (apply parse-product left)
2356     	    (apply parse-sum (cadr rest)))))
2357
2358     (define (parse-product product left . rest)
2359       (if (null? rest)
2360           (apply parse-value left)
2361           (list (string->symbol (car rest))
2362     	    (apply parse-value left)
2363     	    (apply parse-product (cadr rest)))))
2364
2365     (define (parse-value value first . rest)
2366       (if (null? rest)
2367           (string->number (cadr first))
2368           (apply parse-sum (car rest))))
2369
2370     (define parse-expr parse-sum)
2371
2372   (Notice all these functions look very similar; for a more complicated
2373PEG, it would be worth abstracting.)
2374
2375   Then:
2376     (apply parse-expr (peg:tree (match-pattern expr "1+1/2*3+(1+1)/2"))) ⇒
2377     (+ 1 (+ (/ 1 (* 2 3)) (/ (+ 1 1) 2)))
2378
2379   But wait!  The associativity is wrong!  Where it says ‘(/ 1 (* 2
23803))’, it should say ‘(* (/ 1 2) 3)’.
2381
2382   It’s tempting to try replacing e.g.  ‘"sum <-- (product ('+' / '-')
2383sum) / product"’ with ‘"sum <-- (sum ('+' / '-') product) / product"’,
2384but this is a Bad Idea.  PEGs don’t support left recursion.  To see why,
2385imagine what the parser will do here.  When it tries to parse ‘sum’, it
2386first has to try and parse ‘sum’.  But to do that, it first has to try
2387and parse ‘sum’.  This will continue until the stack gets blown off.
2388
2389   So how does one parse left-associative binary operators with PEGs?
2390Honestly, this is one of their major shortcomings.  There’s no
2391general-purpose way of doing this, but here the repetition operators are
2392a good choice:
2393
2394     (use-modules (srfi srfi-1))
2395
2396     (define-peg-string-patterns
2397       "expr <- sum
2398     sum <-- (product ('+' / '-'))* product
2399     product <-- (value ('*' / '/'))* value
2400     value <-- number / '(' expr ')'
2401     number <-- [0-9]+")
2402
2403     ;; take a deep breath...
2404     (define (make-left-parser next-func)
2405       (lambda (sum first . rest) ;; general form, comments below assume
2406         ;; that we're dealing with a sum expression
2407         (if (null? rest) ;; form (sum (product ...))
2408           (apply next-func first)
2409           (if (string? (cadr first));; form (sum ((product ...) "+") (product ...))
2410     	  (list (string->symbol (cadr first))
2411     		(apply next-func (car first))
2412     		(apply next-func (car rest)))
2413               ;; form (sum (((product ...) "+") ((product ...) "+")) (product ...))
2414     	  (car
2415     	   (reduce ;; walk through the list and build a left-associative tree
2416     	    (lambda (l r)
2417     	      (list (list (cadr r) (car r) (apply next-func (car l)))
2418     		    (string->symbol (cadr l))))
2419     	    'ignore
2420     	    (append ;; make a list of all the products
2421                  ;; the first one should be pre-parsed
2422     	     (list (list (apply next-func (caar first))
2423     			 (string->symbol (cadar first))))
2424     	     (cdr first)
2425                  ;; the last one has to be added in
2426     	     (list (append rest '("done"))))))))))
2427
2428     (define (parse-value value first . rest)
2429       (if (null? rest)
2430           (string->number (cadr first))
2431           (apply parse-sum (car rest))))
2432     (define parse-product (make-left-parser parse-value))
2433     (define parse-sum (make-left-parser parse-product))
2434     (define parse-expr parse-sum)
2435
2436   Then:
2437     (apply parse-expr (peg:tree (match-pattern expr "1+1/2*3+(1+1)/2"))) ⇒
2438     (+ (+ 1 (* (/ 1 2) 3)) (/ (+ 1 1) 2))
2439
2440   As you can see, this is much uglier (it could be made prettier by
2441using ‘context-flatten’, but the way it’s written above makes it clear
2442how we deal with the three ways the zero-or-more ‘*’ expression can
2443parse).  Fortunately, most of the time we can get away with only using
2444right-associativity.
2445
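   The two results are not merely different in shape; they denote
different numbers.  As a quick check, the sketch below evaluates both
expressions quoted above (the variable names are illustrative).

```scheme
;; Output of the right-recursive grammar.
(define right-assoc '(+ 1 (+ (/ 1 (* 2 3)) (/ (+ 1 1) 2))))

;; Output of the grammar built with repetition operators.
(define left-assoc '(+ (+ 1 (* (/ 1 2) 3)) (/ (+ 1 1) 2)))

(define env (interaction-environment))

(write (eval right-assoc env)) (newline)   ; 13/6
(write (eval left-assoc env))  (newline)   ; 7/2
```

   Only the second value, 7/2, agrees with the usual reading of
‘1+1/2*3+(1+1)/2’.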
2446Simplified Functions
2447....................
2448
2449For a more tantalizing example, consider the following grammar that
2450parses (highly) simplified C functions:
2451
2452     (define-peg-string-patterns
2453       "cfunc <-- cSP ctype cSP cname cSP cargs cLB cSP cbody cRB
2454     ctype <-- cidentifier
2455     cname <-- cidentifier
2456     cargs <-- cLP (! (cSP cRP) carg cSP (cCOMMA / cRP) cSP)* cSP
2457     carg <-- cSP ctype cSP cname
2458     cbody <-- cstatement *
     cidentifier <- [a-zA-Z][a-zA-Z0-9_]*
2460     cstatement <-- (!';'.)*cSC cSP
2461     cSC < ';'
2462     cCOMMA < ','
2463     cLP < '('
2464     cRP < ')'
2465     cLB < '{'
2466     cRB < '}'
2467     cSP < [ \t\n]*")
2468
2469   Then:
2470     (match-pattern cfunc "int square(int a) { return a*a;}") ⇒
2471     (32
2472      (cfunc (ctype "int")
2473             (cname "square")
2474             (cargs (carg (ctype "int") (cname "a")))
2475             (cbody (cstatement "return a*a"))))
2476
2477   And:
     (match-pattern cfunc "int mod(int a, int b) { int c = a/b;return a- b*c; }") ⇒
2479     (52
2480      (cfunc (ctype "int")
2481             (cname "mod")
2482             (cargs (carg (ctype "int") (cname "a"))
2483                    (carg (ctype "int") (cname "b")))
2484             (cbody (cstatement "int c = a/b")
2485                    (cstatement "return a- b*c"))))
2486
2487   By wrapping all the ‘carg’ nonterminals in a ‘cargs’ nonterminal, we
2488were able to remove any ambiguity in the parsing structure and avoid
2489having to call ‘context-flatten’ on the output of ‘match-pattern’.  We
2490used the same trick with the ‘cstatement’ nonterminals, wrapping them in
2491a ‘cbody’ nonterminal.
2492
2493   The whitespace nonterminal ‘cSP’ used here is a (very) useful
2494instantiation of a common pattern for matching syntactically irrelevant
2495information.  Since it’s tagged with ‘<’ and ends with ‘*’ it won’t
2496clutter up the parse trees (all the empty lists will be discarded during
2497the compression step) and it will never cause parsing to fail.
2498
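   The same whitespace pattern can be tried in isolation.  This is a
minimal sketch; the grammar and the names ‘wordlist’, ‘word’ and ‘wSP’
are hypothetical and only mirror the role of ‘cSP’ above.

```scheme
(use-modules (ice-9 peg))

;; `wSP' matches any amount of whitespace (possibly none) and is
;; tagged with `<', so it contributes nothing to the parse tree.
(define-peg-string-patterns
  "wordlist <-- (wSP word)* wSP
word <-- [a-z]+
wSP < [ \t\n]*")

(define m (match-pattern wordlist "  foo\tbar \n"))

(write (peg:end m))  (newline)   ; 11, the whole input was consumed
(write (peg:tree m)) (newline)
```
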
2499
2500File: guile.info,  Node: PEG Internals,  Prev: PEG Tutorial,  Up: PEG Parsing
2501
25026.15.4 PEG Internals
2503--------------------
2504
2505A PEG parser takes a string as input and attempts to parse it as a given
2506nonterminal.  The key idea of the PEG implementation is that every
2507nonterminal is just a function that takes a string as an argument and
attempts to parse that string as its nonterminal.  The functions always
start from the beginning, but a parse is considered successful even if
there is material left over at the end.
2511
2512   This makes it easy to model different PEG parsing operations.  For
2513instance, consider the PEG grammar ‘"ab"’, which could also be written
2514‘(and "a" "b")’.  It matches the string “ab”.  Here’s how that might be
2515implemented in the PEG style:
2516
2517     (define (match-and-a-b str)
2518       (match-a str)
2519       (match-b str))
2520
2521   As you can see, the use of functions provides an easy way to model
2522sequencing.  In a similar way, one could model ‘(or a b)’ with something
2523like the following:
2524
2525     (define (match-or-a-b str)
2526       (or (match-a str) (match-b str)))
2527
2528   Here the semantics of a PEG ‘or’ expression map naturally onto
2529Scheme’s ‘or’ operator.  This function will attempt to run ‘(match-a
2530str)’, and return its result if it succeeds.  Otherwise it will run
2531‘(match-b str)’.
2532
2533   Of course, the code above wouldn’t quite work.  We need some way for
2534the parsing functions to communicate.  The actual interface used is
2535below.
2536
2537Parsing Function Interface
2538..........................
2539
2540A parsing function takes three arguments - a string, the length of that
2541string, and the position in that string it should start parsing at.  In
2542effect, the parsing functions pass around substrings in pieces - the
2543first argument is a buffer of characters, and the second two give a
2544range within that buffer that the parsing function should look at.
2545
   Parsing functions return either ‘#f’, if they failed to match their
nonterminal, or a list.  The first element of that list must be an
integer representing the final position in the string they matched.  The
second element can be any other data the function wishes to return, or
‘'()’ if it doesn’t have any more data.
2551
2552   The one caveat is that if the extra data it returns is a list, any
2553adjacent strings in that list will be appended by ‘match-pattern’.  For
2554instance, if a parsing function returns ‘(13 ("a" "b" "c"))’,
2555‘match-pattern’ will take ‘(13 ("abc"))’ as its value.
2556
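   The effect of that appending step can be pictured with a small
helper.  The procedure ‘merge-adjacent-strings’ below is a hypothetical
illustration written for this paragraph; it is not the code that
‘match-pattern’ actually uses.

```scheme
;; Join runs of neighbouring strings in LST, leaving other elements
;; (and strings separated by them) untouched.
(define (merge-adjacent-strings lst)
  (cond ((null? lst) '())
        ((and (string? (car lst))
              (pair? (cdr lst))
              (string? (cadr lst)))
         ;; Glue the first two strings together and look again.
         (merge-adjacent-strings
          (cons (string-append (car lst) (cadr lst))
                (cddr lst))))
        (else
         (cons (car lst) (merge-adjacent-strings (cdr lst))))))

(write (merge-adjacent-strings '("a" "b" "c")))     ; ("abc")
(newline)
(write (merge-adjacent-strings '("a" x "b" "c")))   ; ("a" x "bc")
(newline)
```
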
2557   For example, here is a function to match “ab” using the actual
2558interface.
2559
2560     (define (match-a-b str len pos)
2561        (and (<= (+ pos 2) len)
2562             (string= str "ab" pos (+ pos 2))
2563             (list (+ pos 2) '()))) ; we return no extra information
2564
2565   The above function can be used to match a string by running
2566‘(match-pattern match-a-b "ab")’.
2567
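   With this interface in hand, the sequencing and choice ideas from the
start of this section can be sketched as ordinary higher-order
procedures.  The names ‘match-char’, ‘seq2’ and ‘choice2’ are
assumptions made for this illustration and are not part of the ‘(ice-9
peg)’ modules; for brevity, ‘seq2’ keeps only the data returned by its
second parser.

```scheme
;; A parsing function that matches the single character C.
(define (match-char c)
  (lambda (str len pos)
    (and (< pos len)
         (char=? (string-ref str pos) c)
         (list (+ pos 1) '()))))

;; Sequencing: run P, then run Q where P stopped.
(define (seq2 p q)
  (lambda (str len pos)
    (let ((r (p str len pos)))
      (and r (q str len (car r))))))

;; Ordered choice: try P, and fall back to Q at the same position.
(define (choice2 p q)
  (lambda (str len pos)
    (or (p str len pos) (q str len pos))))

(define match-ab     (seq2 (match-char #\a) (match-char #\b)))
(define match-a-or-b (choice2 (match-char #\a) (match-char #\b)))

(write (match-ab "abc" 3 0))     (newline)   ; (2 ())
(write (match-ab "ba" 2 0))      (newline)   ; #f
(write (match-a-or-b "b" 1 0))   (newline)   ; (1 ())
```
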
2568Code Generators and Extensible Syntax
2569.....................................
2570
2571PEG expressions, such as those in a ‘define-peg-pattern’ form, are
2572interpreted internally in two steps.
2573
2574   First, any string PEG is expanded into an s-expression PEG by the
2575code in the ‘(ice-9 peg string-peg)’ module.
2576
2577   Then, the s-expression PEG that results is compiled into a parsing
2578function by the ‘(ice-9 peg codegen)’ module.  In particular, the
2579function ‘compile-peg-pattern’ is called on the s-expression.  It then
2580decides what to do based on the form it is passed.
2581
2582   The PEG syntax can be expanded by providing ‘compile-peg-pattern’
2583more options for what to do with its forms.  The extended syntax will be
2584associated with a symbol, for instance ‘my-parsing-form’, and will be
2585called on all PEG expressions of the form
2586     (my-parsing-form ...)
2587
2588   The parsing function should take two arguments.  The first will be a
2589syntax object containing a list with all of the arguments to the form
2590(but not the form’s name), and the second will be the ‘capture-type’
2591argument that is passed to ‘define-peg-pattern’.
2592
2593   New functions can be registered by calling ‘(add-peg-compiler! symbol
2594function)’, where ‘symbol’ is the symbol that will indicate a form of
2595this type and ‘function’ is the code generating function described
2596above.  The function ‘add-peg-compiler!’ is exported from the ‘(ice-9
2597peg codegen)’ module.
2598
2599
2600File: guile.info,  Node: Read/Load/Eval/Compile,  Next: Memory Management,  Prev: PEG Parsing,  Up: API Reference
2601
26026.16 Reading and Evaluating Scheme Code
2603=======================================
2604
2605This chapter describes Guile functions that are concerned with reading,
2606loading, evaluating, and compiling Scheme code at run time.
2607
2608* Menu:
2609
2610* Scheme Syntax::               Standard and extended Scheme syntax.
2611* Scheme Read::                 Reading Scheme code.
2612* Annotated Scheme Read::       Reading Scheme code, for the compiler.
2613* Scheme Write::                Writing Scheme values to a port.
2614* Fly Evaluation::              Procedures for on the fly evaluation.
2615* Compilation::                 How to compile Scheme files and procedures.
2616* Loading::                     Loading Scheme code from file.
2617* Load Paths::                  Where Guile looks for code.
2618* Character Encoding of Source Files:: Loading non-ASCII Scheme code from file.
2619* Delayed Evaluation::          Postponing evaluation until it is needed.
2620* Local Evaluation::            Evaluation in a local lexical environment.
2621* Local Inclusion::             Compile-time inclusion of one file in another.
2622* Sandboxed Evaluation::        Evaluation with limited capabilities.
2623* REPL Servers::                Serving a REPL over a socket.
2624* Cooperative REPL Servers::    REPL server for single-threaded applications.
2625
2626
2627File: guile.info,  Node: Scheme Syntax,  Next: Scheme Read,  Up: Read/Load/Eval/Compile
2628
26296.16.1 Scheme Syntax: Standard and Guile Extensions
2630---------------------------------------------------
2631
2632* Menu:
2633
2634* Expression Syntax::
2635* Comments::
2636* Block Comments::
2637* Case Sensitivity::
2638* Keyword Syntax::
2639* Reader Extensions::
2640
2641
2642File: guile.info,  Node: Expression Syntax,  Next: Comments,  Up: Scheme Syntax
2643
26446.16.1.1 Expression Syntax
2645..........................
2646
2647An expression to be evaluated takes one of the following forms.
2648
2649SYMBOL
2650     A symbol is evaluated by dereferencing.  A binding of that symbol
2651     is sought and the value there used.  For example,
2652
2653          (define x 123)
2654          x ⇒ 123
2655
2656(PROC ARGS...)
2657     A parenthesised expression is a function call.  PROC and each
2658     argument are evaluated, then the function (which PROC evaluated to)
2659     is called with those arguments.
2660
2661     The order in which PROC and the arguments are evaluated is
2662     unspecified, so be careful when using expressions with side
2663     effects.
2664
2665          (max 1 2 3) ⇒ 3
2666
2667          (define (get-some-proc)  min)
2668          ((get-some-proc) 1 2 3) ⇒ 1
2669
2670     The same sort of parenthesised form is used for a macro invocation,
2671     but in that case the arguments are not evaluated.  See the
2672     descriptions of macros for more on this (*note Macros::, and *note
2673     Syntax Rules::).
2674
2675CONSTANT
2676     Number, string, character and boolean constants evaluate “to
2677     themselves”, so can appear as literals.
2678
2679          123     ⇒ 123
2680          99.9    ⇒ 99.9
2681          "hello" ⇒ "hello"
2682          #\z     ⇒ #\z
2683          #t      ⇒ #t
2684
2685     Note that an application must not attempt to modify literal
2686     strings, since they may be in read-only memory.
2687
2688(quote DATA)
'DATA
     Quoting is used to obtain a literal symbol (instead of a variable
     reference), a literal list (instead of a function call), or a
     literal vector.  ‘'’ is simply a shorthand for a ‘quote’ form.  For
2693     example,
2694
2695          'x                   ⇒ x
2696          '(1 2 3)             ⇒ (1 2 3)
2697          '#(1 (2 3) 4)        ⇒ #(1 (2 3) 4)
2698          (quote x)            ⇒ x
2699          (quote (1 2 3))      ⇒ (1 2 3)
2700          (quote #(1 (2 3) 4)) ⇒ #(1 (2 3) 4)
2701
2702     Note that an application must not attempt to modify literal lists
2703     or vectors obtained from a ‘quote’ form, since they may be in
2704     read-only memory.
2705
2706(quasiquote DATA)
`DATA
2708     Backquote quasi-quotation is like ‘quote’, but selected
2709     sub-expressions are evaluated.  This is a convenient way to
2710     construct a list or vector structure most of which is constant, but
2711     at certain points should have expressions substituted.
2712
2713     The same effect can always be had with suitable ‘list’, ‘cons’ or
2714     ‘vector’ calls, but quasi-quoting is often easier.
2715
2716     (unquote EXPR)
2717     ,EXPR
2718          Within the quasiquote DATA, ‘unquote’ or ‘,’ indicates an
2719          expression to be evaluated and inserted.  The comma syntax ‘,’
2720          is simply a shorthand for an ‘unquote’ form.  For example,
2721
2722               `(1 2 (* 9 9) 3 4)       ⇒ (1 2 (* 9 9) 3 4)
2723               `(1 2 ,(* 9 9) 3 4)      ⇒ (1 2 81 3 4)
2724               `(1 (unquote (+ 1 1)) 3) ⇒ (1 2 3)
2725               `#(1 ,(/ 12 2))          ⇒ #(1 6)
2726
2727     (unquote-splicing EXPR)
2728     ,@EXPR
2729          Within the quasiquote DATA, ‘unquote-splicing’ or ‘,@’
2730          indicates an expression to be evaluated and the elements of
2731          the returned list inserted.  EXPR must evaluate to a list.
2732          The “comma-at” syntax ‘,@’ is simply a shorthand for an
2733          ‘unquote-splicing’ form.
2734
2735               (define x '(2 3))
               `(1 ,x 4)                           ⇒ (1 (2 3) 4)
               `(1 ,@x 4)                          ⇒ (1 2 3 4)
               `(1 (unquote-splicing (map 1+ x)))  ⇒ (1 3 4)
               `#(9 ,@x 9)                         ⇒ #(9 2 3 9)
2740
2741          Notice ‘,@’ differs from plain ‘,’ in the way one level of
2742          nesting is stripped.  For ‘,@’ the elements of a returned list
2743          are inserted, whereas with ‘,’ it would be the list itself
2744          inserted.
2745
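   The remark that quasi-quotation can always be replaced by ‘list’,
‘cons’ or ‘vector’ calls can be made concrete.  The sketch below pairs
a few of the quasi-quoted forms above with hand-built equivalents; the
particular choice of ‘append’ and ‘list->vector’ is one possible
spelling among several.

```scheme
(define x '(2 3))

;; unquote: the evaluated value is placed in the list as one element.
(define same-unquote?
  (equal? `(1 2 ,(* 9 9) 3 4)
          (list 1 2 (* 9 9) 3 4)))

;; unquote-splicing: the elements of the list are spliced in.
(define same-splice?
  (equal? `(1 ,@x 4)
          (append (list 1) x (list 4))))

;; The same splice inside a vector.
(define same-vector?
  (equal? `#(9 ,@x 9)
          (list->vector (append (list 9) x (list 9)))))

(write (list same-unquote? same-splice? same-vector?))   ; (#t #t #t)
(newline)
```
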
2746
2747File: guile.info,  Node: Comments,  Next: Block Comments,  Prev: Expression Syntax,  Up: Scheme Syntax
2748
27496.16.1.2 Comments
2750.................
2751
2752Comments in Scheme source files are written by starting them with a
2753semicolon character (‘;’).  The comment then reaches up to the end of
the line.  Comments can begin at any column, and they may be inserted on
2755the same line as Scheme code.
2756
2757     ; Comment
2758     ;; Comment too
2759     (define x 1)        ; Comment after expression
2760     (let ((y 1))
2761       ;; Display something.
2762       (display y)
2763     ;;; Comment at left margin.
2764       (display (+ y 1)))
2765
2766   It is common to use a single semicolon for comments following
2767expressions on a line, to use two semicolons for comments which are
2768indented like code, and three semicolons for comments which start at
2769column 0, even if they are inside an indented code block.  This
2770convention is used when indenting code in Emacs’ Scheme mode.
2771
2772
2773File: guile.info,  Node: Block Comments,  Next: Case Sensitivity,  Prev: Comments,  Up: Scheme Syntax
2774
27756.16.1.3 Block Comments
2776.......................
2777
2778In addition to the standard line comments defined by R5RS, Guile has
2779another comment type for multiline comments, called “block comments”.
2780This type of comment begins with the character sequence ‘#!’ and ends
2781with the characters ‘!#’.
2782
2783   These comments are compatible with the block comments in the Scheme
2784Shell ‘scsh’ (*note The Scheme shell (scsh)::).  The characters ‘#!’
2785were chosen because they are the magic characters used in shell scripts
2786for indicating that the name of the program for executing the script
2787follows on the same line.
2788
2789   Thus a Guile script often starts like this.
2790
2791     #! /usr/local/bin/guile -s
2792     !#
2793
2794   More details on Guile scripting can be found in the scripting section
2795(*note Guile Scripting::).
2796
2797   Similarly, Guile (starting from version 2.0) supports nested block
2798comments as specified by R6RS and SRFI-30
2799(http://srfi.schemers.org/srfi-30/srfi-30.html):
2800
2801     (+ 1 #| this is a #| nested |# block comment |# 2)
2802     ⇒ 3
2803
2804   For backward compatibility, this syntax can be overridden with
2805‘read-hash-extend’ (*note ‘read-hash-extend’: Reader Extensions.).
2806
2807   There is one special case where the contents of a comment can
2808actually affect the interpretation of code.  When a character encoding
2809declaration, such as ‘coding: utf-8’ appears in one of the first few
2810lines of a source file, it indicates to Guile’s default reader that this
2811source code file is not ASCII. For details see *note Character Encoding
2812of Source Files::.
2813
2814
2815File: guile.info,  Node: Case Sensitivity,  Next: Keyword Syntax,  Prev: Block Comments,  Up: Scheme Syntax
2816
28176.16.1.4 Case Sensitivity
2818.........................
2819
2820Scheme as defined in R5RS is not case sensitive when reading symbols.
Guile, on the contrary, is case sensitive by default, so the identifiers
2822
2823     guile-whuzzy
2824     Guile-Whuzzy
2825
2826   are the same in R5RS Scheme, but are different in Guile.
2827
2828   It is possible to turn off case sensitivity in Guile by setting the
2829reader option ‘case-insensitive’.  For more information on reader
2830options, *Note Scheme Read::.
2831
2832     (read-enable 'case-insensitive)
2833
2834   It is also possible to disable (or enable) case sensitivity within a
2835single file by placing the reader directives ‘#!fold-case’ (or
2836‘#!no-fold-case’) within the file itself.
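
   As a minimal sketch of the difference, the following compares the two
spellings under the default settings and then reads a symbol from a
string that begins with ‘#!fold-case’.  The input string is made up for
illustration, and the result assumes the default read options.

```scheme
;; Under the default, case-sensitive reader these are distinct symbols.
(define same-by-default? (eq? 'guile-whuzzy 'Guile-Whuzzy))

;; The directive only affects the port it is read from (here, a string
;; port created just for this example).
(define folded
  (call-with-input-string "#!fold-case Guile-Whuzzy" read))

(display (list same-by-default? folded))
(newline)
```

   Because the directive is applied per port, the rest of the program
keeps reading symbols case sensitively.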
2837
2838
2839File: guile.info,  Node: Keyword Syntax,  Next: Reader Extensions,  Prev: Case Sensitivity,  Up: Scheme Syntax
2840
28416.16.1.5 Keyword Syntax
2842.......................
2843
2844
2845File: guile.info,  Node: Reader Extensions,  Prev: Keyword Syntax,  Up: Scheme Syntax
2846
28476.16.1.6 Reader Extensions
2848..........................
2849
2850 -- Scheme Procedure: read-hash-extend chr proc
2851 -- C Function: scm_read_hash_extend (chr, proc)
2852     Install the procedure PROC for reading expressions starting with
2853     the character sequence ‘#’ and CHR.  PROC will be called with two
2854     arguments: the character CHR and the port to read further data
2855     from.  The object returned will be the return value of ‘read’.
2856     Passing ‘#f’ for PROC will remove a previous setting.
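
     As a hedged illustration, the sketch below installs a handler for
     ‘#$’, reads a datum that uses it, and then removes the handler
     again.  The choice of ‘$’ and the ‘dollar’ tag are hypothetical,
     picked only for this example.

```scheme
;; Hypothetical extension: read "#$datum" as (dollar datum).
(read-hash-extend #\$
                  (lambda (chr port)
                    ;; CHR is #\$; the rest of the datum is still in PORT.
                    (list 'dollar (read port))))

(define datum (call-with-input-string "#$foo" read))
(write datum)
(newline)

;; Passing #f removes the handler installed above.
(read-hash-extend #\$ #f)
```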
2857
2858
2859File: guile.info,  Node: Scheme Read,  Next: Annotated Scheme Read,  Prev: Scheme Syntax,  Up: Read/Load/Eval/Compile
2860
28616.16.2 Reading Scheme Code
2862--------------------------
2863
2864 -- Scheme Procedure: read [port]
2865 -- C Function: scm_read (port)
2866     Read an s-expression from the input port PORT, or from the current
2867     input port if PORT is not specified.  Any whitespace before the
2868     next token is discarded.
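
     For instance, here is a small sketch that reads two data in turn
     from a string port and then checks for the end of input.  The input
     text is arbitrary.

```scheme
;; Each call to `read' skips leading whitespace and consumes one datum.
(define data
  (call-with-input-string "  (a b)   42 "
    (lambda (port)
      (let* ((d1 (read port))
             (d2 (read port))
             (d3 (read port)))          ; nothing left: end-of-file object
        (list d1 d2 (eof-object? d3))))))

(write data)
(newline)
```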
2869
2870   The behaviour of Guile’s Scheme reader can be modified by
2871manipulating its read options.
2872
2873 -- Scheme Procedure: read-options [setting]
2874     Display the current settings of the global read options.  If
2875     SETTING is omitted, only a short form of the current read options
2876     is printed.  Otherwise if SETTING is the symbol ‘help’, a complete
2877     options description is displayed.
2878
2879   The set of available options, and their default values, may be had by
2880invoking ‘read-options’ at the prompt.
2881
2882     scheme@(guile-user)> (read-options)
2883     (square-brackets keywords #f positions)
2884     scheme@(guile-user)> (read-options 'help)
2885     positions         yes   Record positions of source code expressions.
2886     case-insensitive  no    Convert symbols to lower case.
2887     keywords          #f    Style of keyword recognition: #f, 'prefix or 'postfix.
2888     r6rs-hex-escapes  no    Use R6RS variable-length character and string hex escapes.
2889     square-brackets   yes   Treat `[' and `]' as parentheses, for R6RS compatibility.
2890     hungry-eol-escapes no   In strings, consume leading whitespace after an
2891                             escaped end-of-line.
2892     curly-infix       no    Support SRFI-105 curly infix expressions.
2893     r7rs-symbols      no    Support R7RS |...| symbol notation.
2894
2895   Note that Guile also includes a preliminary mechanism for setting
2896read options on a per-port basis.  For instance, the ‘case-insensitive’
2897read option is set (or unset) on the port when the reader encounters the
2898‘#!fold-case’ or ‘#!no-fold-case’ reader directives.  Similarly, the
2899‘#!curly-infix’ reader directive sets the ‘curly-infix’ read option on
2900the port, and ‘#!curly-infix-and-bracket-lists’ sets ‘curly-infix’ and
2901unsets ‘square-brackets’ on the port (*note SRFI-105::).  There is
2902currently no other way to access or set the per-port read options.
2903
2904   The boolean options may be toggled with ‘read-enable’ and
2905‘read-disable’.  The non-boolean ‘keywords’ option must be set using
2906‘read-set!’.
2907
2908 -- Scheme Procedure: read-enable option-name
2909 -- Scheme Procedure: read-disable option-name
2910 -- Scheme Syntax: read-set! option-name value
2911     Modify the read options.  ‘read-enable’ should be used with boolean
2912     options and switches them on, ‘read-disable’ switches them off.
2913
2914     ‘read-set!’ can be used to set an option to a specific value.  Due
2915     to historical oddities, it is a macro that expects an unquoted
2916     option name.
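
     As an illustrative sketch, the non-boolean ‘keywords’ option can be
     switched to prefix style around a single ‘read’ and then restored.
     The input string is made up for the example.

```scheme
;; With prefix-style keywords, ":foo" reads as a keyword, not a symbol.
(read-set! keywords 'prefix)
(define kw (call-with-input-string ":foo" read))
(read-set! keywords #f)                 ; restore the default

(write (list (keyword? kw) (keyword->symbol kw)))
(newline)
```

     Since these read options are global, restoring the previous value
     promptly keeps the change from leaking into unrelated code.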
2917
2918   For example, to make ‘read’ fold all symbols to their lower case
2919(perhaps for compatibility with older Scheme code), you can enter:
2920
2921     (read-enable 'case-insensitive)
2922
2923   For more information on the effect of the ‘r6rs-hex-escapes’ and
2924‘hungry-eol-escapes’ options, see (*note String Syntax::).
2925
2926   For more information on the ‘r7rs-symbols’ option, see (*note Symbol
2927Read Syntax::).
2928
2929
2930File: guile.info,  Node: Annotated Scheme Read,  Next: Scheme Write,  Prev: Scheme Read,  Up: Read/Load/Eval/Compile
2931
29326.16.3 Reading Scheme Code, For the Compiler
2933--------------------------------------------
2934
2935When something goes wrong with a Scheme program, the user will want to
2936know how to fix it.  This starts with identifying where the error
occurred: we want to associate a source location with each component part
2938of source code, and propagate that source location information through
2939to the compiler or interpreter.
2940
2941   For that, Guile provides ‘read-syntax’.
2942
2943 -- Scheme Procedure: read-syntax [port]
2944     Read an s-expression from the input port PORT, or from the current
2945     input port if PORT is not specified.
2946
2947     If, after skipping white space and comments, no more bytes are
2948     available from PORT, return the end-of-file object.  *Note Binary
2949     I/O::.  Otherwise, return an annotated datum.  An annotated datum
2950     is a syntax object which associates a source location with a datum.
2951     For example:
2952
2953          (call-with-input-string "  foo" read-syntax)
2954          ; ⇒ #<syntax:unknown file:1:2 foo>
2955          (call-with-input-string "(foo)" read-syntax)
2956          ; ⇒
2957          ; #<syntax:unknown file:1:0
          ;   (#<syntax:unknown file:1:1 foo>)>
2959
2960     As the second example shows, all fields of pairs and vectors are
2961     also annotated, recursively.
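
     As a sketch of how such an object might be taken apart again,
     ‘syntax->datum’ strips the annotations and ‘syntax-source’ returns
     the recorded location.  Both procedures come from Guile’s macro
     support rather than from this section, and the exact shape of the
     source information is not shown here.

```scheme
(define stx (call-with-input-string "(foo 42)" read-syntax))

;; Strip all annotations, recursively, to get back a plain datum.
(define datum (syntax->datum stx))
(write datum)
(newline)

;; Source information attached to the outermost form.
(write (syntax-source stx))
(newline)
```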
2962
2963   Most users are familiar with syntax objects in the context of macros,
2964which use syntax objects to associate scope information with
2965identifiers.  *Note Macros::.  Here we use syntax objects to associate
2966source location information with any datum, but without attaching scope
2967information.  The Scheme compiler (‘compile’) and the interpreter
2968(‘eval’) can accept syntax objects directly as input, allowing them to
2969associate source information with resulting code.  *Note Compilation::,
2970and *Note Fly Evaluation::.
2971
2972   Note that there is a legacy interface for getting source locations
2973into the Scheme compiler or interpreter, which is to use a side table
2974that associates “source properties” with each subdatum returned by
2975‘read’, instead of wrapping the datums directly as in ‘read-syntax’.
2976This has the disadvantage of not being able to annotate all kinds of
2977datums.  *Note Source Properties::, for more information.
2978
2979
2980File: guile.info,  Node: Scheme Write,  Next: Fly Evaluation,  Prev: Annotated Scheme Read,  Up: Read/Load/Eval/Compile
2981
29826.16.4 Writing Scheme Values
2983----------------------------
2984
Any Scheme value may be written to a port.  Not all values may be read
2986back in (*note Scheme Read::), however.
2987
2988 -- Scheme Procedure: write obj [port]
2989     Send a representation of OBJ to PORT or to the current output port
2990     if not given.
2991
2992     The output is designed to be machine readable, and can be read back
2993     with ‘read’ (*note Scheme Read::).  Strings are printed in double
2994     quotes, with escapes if necessary, and characters are printed in
2995     ‘#\’ notation.
2996
2997 -- Scheme Procedure: display obj [port]
2998     Send a representation of OBJ to PORT or to the current output port
2999     if not given.
3000
     The output is designed for human readability; it differs from
3002     ‘write’ in that strings are printed without double quotes and
3003     escapes, and characters are printed as per ‘write-char’, not in
3004     ‘#\’ form.
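
     The difference can be seen by capturing the output of both
     procedures for the same objects.  This is only a small sketch; the
     helper names ‘written’ and ‘displayed’ are made up for it.

```scheme
;; Hypothetical helpers that capture printed output as a string.
(define (written obj)
  (with-output-to-string (lambda () (write obj))))
(define (displayed obj)
  (with-output-to-string (lambda () (display obj))))

(define results
  (list (written "say \"hi\"")     ; keeps the quotes and the escapes
        (displayed "say \"hi\"")   ; raw characters only
        (written #\a)              ; #\ notation
        (displayed #\a)))          ; just the character

(for-each (lambda (s) (display s) (newline)) results)
```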
3005
3006   As was the case with the Scheme reader, there are a few options that
3007affect the behavior of the Scheme printer.
3008
3009 -- Scheme Procedure: print-options [setting]
     Display the current settings of the print options.  If SETTING is
     omitted, only a short form of the current print options is printed.
3012     Otherwise if SETTING is the symbol ‘help’, a complete options
3013     description is displayed.
3014
3015   The set of available options, and their default values, may be had by
3016invoking ‘print-options’ at the prompt.
3017
3018     scheme@(guile-user)> (print-options)
3019     (quote-keywordish-symbols reader highlight-suffix "}" highlight-prefix "{")
3020     scheme@(guile-user)> (print-options 'help)
3021     highlight-prefix          {       The string to print before highlighted values.
3022     highlight-suffix          }       The string to print after highlighted values.
3023     quote-keywordish-symbols  reader  How to print symbols that have a colon
3024                                       as their first or last character. The
3025                                       value '#f' does not quote the colons;
3026                                       '#t' quotes them; 'reader' quotes them
3027                                       when the reader option 'keywords' is
3028                                       not '#f'.
3029     escape-newlines           yes     Render newlines as \n when printing
3030                                       using `write'.
3031     r7rs-symbols              no      Escape symbols using R7RS |...| symbol
3032                                       notation.
3033
   These options may be modified with the ‘print-set!’ syntax.
3035
3036 -- Scheme Syntax: print-set! option-name value
3037     Modify the print options.  Due to historical oddities, ‘print-set!’
3038     is a macro that expects an unquoted option name.
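
     As a sketch, the non-boolean ‘quote-keywordish-symbols’ option can
     be changed with ‘print-set!’ to control how a symbol with a leading
     colon is written.  The symbol ‘:foo’ and the helper ‘written’ are
     only example names.

```scheme
(define sym (string->symbol ":foo"))

;; Hypothetical helper that captures what `write' prints.
(define (written obj)
  (with-output-to-string (lambda () (write obj))))

(print-set! quote-keywordish-symbols #t)
(define quoted (written sym))           ; the colon is quoted
(print-set! quote-keywordish-symbols #f)
(define plain (written sym))            ; the colon is left alone
(print-set! quote-keywordish-symbols 'reader)   ; back to the default

(display (list quoted plain))
(newline)
```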
3039
3040
3041File: guile.info,  Node: Fly Evaluation,  Next: Compilation,  Prev: Scheme Write,  Up: Read/Load/Eval/Compile
3042
30436.16.5 Procedures for On the Fly Evaluation
3044-------------------------------------------
3045
3046Scheme has the lovely property that its expressions may be represented
3047as data.  The ‘eval’ procedure takes a Scheme datum and evaluates it as
3048code.
3049
3050 -- Scheme Procedure: eval exp module_or_state
3051 -- C Function: scm_eval (exp, module_or_state)
3052     Evaluate EXP, a list representing a Scheme expression, in the
3053     top-level environment specified by MODULE_OR_STATE.  While EXP is
3054     evaluated (using ‘primitive-eval’), MODULE_OR_STATE is made the
3055     current module.  The current module is reset to its previous value
     when ‘eval’ returns.  For example:

          (eval '(+ 1 2) (interaction-environment))
          ⇒ 3
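
     A slightly larger sketch: because expressions are ordinary data,
     they can be built at run time and then passed to ‘eval’.  The
     expressions used here are arbitrary.

```scheme
;; Build the list (* 6 7) at run time, then evaluate it.
(define expr (list '* 6 7))
(define result (eval expr (interaction-environment)))

;; `primitive-eval' evaluates in the current module instead of taking
;; an explicit environment argument.
(define result2 (primitive-eval '(+ 1 2)))

(display (list result result2))
(newline)
```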
3058
3059 -- Scheme Procedure: interaction-environment
3060 -- C Function: scm_interaction_environment ()
3061     Return a specifier for the environment that contains
     implementation-defined bindings, typically a superset of those
3063     listed in the report.  The intent is that this procedure will
3064     return the environment in which the implementation would evaluate
3065     expressions dynamically typed by the user.
3066
3067   *Note Environments::, for other environments.
3068
3069   One does not always receive code as Scheme data, of course, and this
3070is especially the case for Guile’s other language implementations (*note
3071Other Languages::).  For the case in which all you have is a string, we
3072have ‘eval-string’.  There is a legacy version of this procedure in the
3073default environment, but you really want the one from ‘(ice-9
3074eval-string)’, so load it up:
3075
3076     (use-modules (ice-9 eval-string))
3077
3078 -- Scheme Procedure: eval-string string [#:module=#f] [#:file=#f]
3079          [#:line=#f] [#:column=#f] [#:lang=(current-language)]
3080          [#:compile?=#f]
3081     Parse STRING according to the current language, normally Scheme.
3082     Evaluate or compile the expressions it contains, in order,
     returning the value of the last expression.
3084
3085     If the MODULE keyword argument is set, save a module excursion
3086     (*note Module System Reflection::) and set the current module to
3087     MODULE before evaluation.
3088
3089     The FILE, LINE, and COLUMN keyword arguments can be used to
3090     indicate that the source string begins at a particular source
3091     location.
3092
3093     Finally, LANG is a language, defaulting to the current language,
3094     and the expression is compiled if COMPILE? is true or there is no
3095     evaluator for the given language.
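
     A brief sketch of both modes follows.  It evaluates into a fresh
     module from ‘make-fresh-user-module’ so that the definition of ‘x’
     does not leak into the caller; that choice, and the strings being
     evaluated, are assumptions made for the example.

```scheme
(use-modules (ice-9 eval-string))

;; Several expressions are evaluated in order; the value of the last
;; one is returned.
(define a (eval-string "(define x 20) (+ x 1)"
                       #:module (make-fresh-user-module)))

;; Ask for compilation instead of interpretation.
(define b (eval-string "(* 6 7)" #:compile? #t))

(display (list a b))
(newline)
```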
3096
3097 -- C Function: scm_eval_string (string)
3098 -- C Function: scm_eval_string_in_module (string, module)
3099     These C bindings call ‘eval-string’ from ‘(ice-9 eval-string)’,
3100     evaluating within MODULE or the current module.
3101
3102 -- C Function: SCM scm_c_eval_string (const char *string)
     Like ‘scm_eval_string’, but taking a C string in locale encoding
     instead of an ‘SCM’.
3105
3106 -- Scheme Procedure: apply proc arg ... arglst
3107 -- C Function: scm_apply_0 (proc, arglst)
3108 -- C Function: scm_apply_1 (proc, arg1, arglst)
3109 -- C Function: scm_apply_2 (proc, arg1, arg2, arglst)
3110 -- C Function: scm_apply_3 (proc, arg1, arg2, arg3, arglst)
3111 -- C Function: scm_apply (proc, arg, rest)
3112     Call PROC with arguments ARG ... and the elements of the ARGLST
3113     list.
3114
3115     ‘scm_apply’ takes parameters corresponding to a Scheme level
3116     ‘(lambda (proc arg1 . rest) ...)’.  So ARG1 and all but the last
3117     element of the REST list make up ARG ..., and the last element of
3118     REST is the ARGLST list.  Or if REST is the empty list ‘SCM_EOL’
3119     then there’s no ARG ..., and (ARG1) is the ARGLST.
3120
3121     ARGLST is not modified, but the REST list passed to ‘scm_apply’ is
3122     modified.
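
     On the Scheme side, a few calls sketch how the leading arguments
     and the final list are combined.  The procedures and values are
     arbitrary examples.

```scheme
;; The last argument must be a list; earlier ones are passed as-is.
(define a (apply + 1 2 '(3 4)))         ; same as (+ 1 2 3 4)
(define b (apply max '(3 9 4)))         ; same as (max 3 9 4)
(define c (apply list 'x '()))          ; same as (list 'x)

(write (list a b c))
(newline)
```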
3123
3124 -- C Function: scm_call_0 (proc)
3125 -- C Function: scm_call_1 (proc, arg1)
3126 -- C Function: scm_call_2 (proc, arg1, arg2)
3127 -- C Function: scm_call_3 (proc, arg1, arg2, arg3)
3128 -- C Function: scm_call_4 (proc, arg1, arg2, arg3, arg4)
3129 -- C Function: scm_call_5 (proc, arg1, arg2, arg3, arg4, arg5)
3130 -- C Function: scm_call_6 (proc, arg1, arg2, arg3, arg4, arg5, arg6)
3131 -- C Function: scm_call_7 (proc, arg1, arg2, arg3, arg4, arg5, arg6,
3132          arg7)
3133 -- C Function: scm_call_8 (proc, arg1, arg2, arg3, arg4, arg5, arg6,
3134          arg7, arg8)
3135 -- C Function: scm_call_9 (proc, arg1, arg2, arg3, arg4, arg5, arg6,
3136          arg7, arg8, arg9)
3137     Call PROC with the given arguments.
3138
3139 -- C Function: scm_call (proc, ...)
3140     Call PROC with any number of arguments.  The argument list must be
3141     terminated by ‘SCM_UNDEFINED’.  For example:
3142
3143          scm_call (scm_c_public_ref ("guile", "+"),
3144                    scm_from_int (1),
3145                    scm_from_int (2),
3146                    SCM_UNDEFINED);
3147
3148 -- C Function: scm_call_n (proc, argv, nargs)
     Call PROC with the array of arguments ARGV, as a ‘SCM*’.  The
     number of arguments should be passed in NARGS, as a ‘size_t’.
3151
3152 -- Scheme Procedure: primitive-eval exp
3153 -- C Function: scm_primitive_eval (exp)
3154     Evaluate EXP in the top-level environment specified by the current
3155     module.
3156
3157
3158File: guile.info,  Node: Compilation,  Next: Loading,  Prev: Fly Evaluation,  Up: Read/Load/Eval/Compile
3159
31606.16.6 Compiling Scheme Code
3161----------------------------
3162
3163The ‘eval’ procedure directly interprets the S-expression representation
3164of Scheme.  An alternate strategy for evaluation is to determine ahead
3165of time what computations will be necessary to evaluate the expression,
3166and then use that recipe to produce the desired results.  This is known
3167as “compilation”.
3168
3169   While it is possible to compile simple Scheme expressions such as ‘(+
31702 2)’ or even ‘"Hello world!"’, compilation is most interesting in the
3171context of procedures.  Compiling a lambda expression produces a
3172compiled procedure, which is just like a normal procedure except
3173typically much faster, because it can bypass the generic interpreter.
3174
3175   Functions from system modules in a Guile installation are normally
3176compiled already, so they load and run quickly.
3177
3178   Note that well-written Scheme programs will not typically call the
3179procedures in this section, for the same reason that it is often bad
3180taste to use ‘eval’.  By default, Guile automatically compiles any files
3181it encounters that have not been compiled yet (*note ‘--auto-compile’:
3182Invoking Guile.).  The compiler can also be invoked explicitly from the
3183shell as ‘guild compile foo.scm’.
3184
3185   (Why are calls to ‘eval’ and ‘compile’ usually in bad taste?  Because
3186they are limited, in that they can only really make sense for top-level
3187expressions.  Also, most needs for “compile-time” computation are
3188fulfilled by macros and closures.  Of course one good counterexample is
3189the REPL itself, or any code that reads expressions from a port.)
3190
3191   Automatic compilation generally works transparently, without any need
for user intervention.  However, Guile does not yet do proper dependency
tracking, so if file ‘A.scm’ uses macros from ‘B.scm’ and ‘B.scm’
changes, ‘A.scm’ will not be automatically recompiled.  To forcibly
3195invalidate the auto-compilation cache, pass the ‘--fresh-auto-compile’
3196option to Guile, or set the ‘GUILE_AUTO_COMPILE’ environment variable to
3197‘fresh’ (instead of to ‘0’ or ‘1’).
3198
3199   For more information on the compiler itself, see *note Compiling to
3200the Virtual Machine::.  For information on the virtual machine, see
3201*note A Virtual Machine for Guile::.
3202
3203   The command-line interface to Guile’s compiler is the ‘guild compile’
3204command:
3205
3206 -- Command: guild compile [‘option’...] FILE...
3207     Compile FILE, a source file, and store bytecode in the compilation
3208     cache or in the file specified by the ‘-o’ option.  The following
3209     options are available:
3210
3211     ‘-L DIR’
3212     ‘--load-path=DIR’
3213          Add DIR to the front of the module load path.
3214
3215     ‘-o OFILE’
3216     ‘--output=OFILE’
3217          Write output bytecode to OFILE.  By convention, bytecode file
3218          names end in ‘.go’.  When ‘-o’ is omitted, the output file
3219          name is as for ‘compile-file’ (see below).
3220
3221     ‘-x EXTENSION’
3222          Recognize EXTENSION as a valid source file name extension.
3223
3224          For example, to compile R6RS code, you might want to pass ‘-x
3225          .sls’ so that files ending in ‘.sls’ can be found.
3226
3227     ‘-W WARNING’
3228     ‘--warn=WARNING’
3229          Enable specific warning passes; use ‘-Whelp’ for a list of
3230          available options.  The default is ‘-W1’, which enables a
3231          number of common warnings.  Pass ‘-W0’ to disable all
3232          warnings.
3233
3234     ‘-O OPT’
3235     ‘--optimize=OPT’
3236          Enable or disable specific compiler optimizations; use
3237          ‘-Ohelp’ for a list of available options.  The default is
3238          ‘-O2’, which enables most optimizations.  ‘-O0’ is recommended
3239          if compilation speed is more important than the speed of the
3240          compiled code.  Pass ‘-Ono-OPT’ to disable a specific compiler
3241          pass.  Any number of ‘-O’ options can be passed to the
3242          compiler, with later ones taking precedence.
3243
3244     ‘--r6rs’
3245     ‘--r7rs’
3246          Compile in an environment whose default bindings, reader
3247          options, and load paths are adapted for specific Scheme
3248          standards.  *Note R6RS Support::, and *Note R7RS Support::.
3249
3250     ‘-f LANG’
3251     ‘--from=LANG’
3252          Use LANG as the source language of FILE.  If this option is
3253          omitted, ‘scheme’ is assumed.
3254
3255     ‘-t LANG’
3256     ‘--to=LANG’
3257          Use LANG as the target language of FILE.  If this option is
3258          omitted, ‘rtl’ is assumed.
3259
3260     ‘-T TARGET’
3261     ‘--target=TARGET’
3262          Produce code for TARGET instead of %HOST-TYPE (*note
3263          %host-type: Build Config.).  Target must be a valid GNU
3264          triplet, such as ‘armv5tel-unknown-linux-gnueabi’ (*note
3265          (autoconf)Specifying Target Triplets::).
3266
3267     Each FILE is assumed to be UTF-8-encoded, unless it contains a
3268     coding declaration as recognized by ‘file-encoding’ (*note
3269     Character Encoding of Source Files::).
3270
3271   The compiler can also be invoked directly by Scheme code.  These
3272interfaces are in their own module:
3273
3274     (use-modules (system base compile))
3275
3276 -- Scheme Procedure: compile exp [#:env=#f] [#:from=(current-language)]
3277          [#:to=value] [#:opts='()]
3278          [#:optimization-level=(default-optimization-level)]
3279          [#:warning-level=(default-warning-level)]
3280     Compile the expression EXP in the environment ENV.  If EXP is a
3281     procedure, the result will be a compiled procedure; otherwise
3282     ‘compile’ is mostly equivalent to ‘eval’.
3283
3284     For a discussion of languages and compiler options, *Note Compiling
3285     to the Virtual Machine::.
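
     A minimal sketch, relying on the default keyword arguments; the
     expressions being compiled are arbitrary:

```scheme
(use-modules (system base compile))

;; Compiling a lambda expression yields a compiled procedure.
(define square (compile '(lambda (x) (* x x))))

;; Other expressions are compiled and then run, much like `eval'.
(define four (compile '(+ 2 2)))

(display (list (procedure? square) (square 7) four))
(newline)
```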
3286
3287 -- Scheme Procedure: compile-file file [#:output-file=#f]
3288          [#:from=(current-language)] [#:to='rtl]
3289          [#:env=(default-environment from)] [#:opts='()]
3290          [#:optimization-level=(default-optimization-level)]
3291          [#:warning-level=(default-warning-level)]
3292          [#:canonicalization='relative]
3293     Compile the file named FILE.
3294
     Output is written to OUTPUT-FILE.  If you do not supply an
3296     output file name, output is written to a file in the cache
3297     directory, as computed by ‘(compiled-file-name FILE)’.
3298
3299     FROM and TO specify the source and target languages.  *Note
3300     Compiling to the Virtual Machine::, for more information on these
3301     options, and on ENV and OPTS.
3302
3303     As with ‘guild compile’, FILE is assumed to be UTF-8-encoded unless
3304     it contains a coding declaration.
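
     The following sketch compiles a throwaway source file to an
     explicit output file and then loads the result with
     ‘load-compiled’.  The scratch file under ‘/tmp’ and its contents are
     assumptions made purely for the demonstration.

```scheme
(use-modules (system base compile))

;; Hypothetical scratch file under /tmp, created just for this sketch.
(define port (mkstemp! (string-copy "/tmp/guile-demo-XXXXXX")))
(define src (port-filename port))
(define go (string-append src ".go"))

;; The source file holds a single form that prints a greeting.
(write '(display "hello from compiled code") port)
(close-port port)

;; Compile to an explicit output file, then load the bytecode while
;; capturing whatever it prints.
(compile-file src #:output-file go)
(define output
  (with-output-to-string (lambda () (load-compiled go))))

;; Clean up the scratch files.
(delete-file src)
(delete-file go)

(display output)
(newline)
```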
3305
3306 -- Scheme Parameter: default-optimization-level
3307     The default optimization level, as an integer from 0 to 9.  The
3308     default is 2.
3309 -- Scheme Parameter: default-warning-level
3310     The default warning level, as an integer from 0 to 9.  The default
3311     is 1.
3312
3313   *Note Parameters::, for more on how to set parameters.
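
   As a hedged sketch, a parameter can be rebound for the duration of a
single compilation with ‘parameterize’.  The expression compiled here is
arbitrary.

```scheme
(use-modules (system base compile))

;; Temporarily lower the optimization level for one compilation.
(define result
  (parameterize ((default-optimization-level 0))
    (compile '(+ 2 2))))

;; Outside of `parameterize' the previous level is back in effect.
(display (list result (default-optimization-level)))
(newline)
```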
3314
3315 -- Scheme Procedure: compiled-file-name file
3316     Compute a cached location for a compiled version of a Scheme file
3317     named FILE.
3318
     This file will usually be below the ‘$HOME/.cache/guile/ccache’
     directory, depending on the value of the ‘XDG_CACHE_HOME’
     environment variable.  The intention is that ‘compiled-file-name’
     provides a fallback location for caching auto-compiled files.  If
     you want to place a compiled file in the ‘%load-compiled-path’, you
     should pass the OUTPUT-FILE option to ‘compile-file’, explicitly.
3325
3326 -- Scheme Variable: %auto-compilation-options
3327     This variable contains the options passed to the ‘compile-file’
3328     procedure when auto-compiling source files.  By default, it enables
3329     useful compilation warnings.  It can be customized from ‘~/.guile’.
3330
3331
3332File: guile.info,  Node: Loading,  Next: Load Paths,  Prev: Compilation,  Up: Read/Load/Eval/Compile
3333
33346.16.7 Loading Scheme Code from File
3335------------------------------------
3336
3337 -- Scheme Procedure: load filename [reader]
3338     Load FILENAME and evaluate its contents in the top-level
3339     environment.
3340
3341     READER if provided should be either ‘#f’, or a procedure with the
3342     signature ‘(lambda (port) ...)’ which reads the next expression
3343     from PORT.  If READER is ‘#f’ or absent, Guile’s built-in ‘read’
3344     procedure is used (*note Scheme Read::).
3345
3346     The READER argument takes effect by setting the value of the
3347     ‘current-reader’ fluid (see below) before loading the file, and
3348     restoring its previous value when loading is complete.  The Scheme
3349     code inside FILENAME can itself change the current reader procedure
     on the fly by setting the ‘current-reader’ fluid.
3351
3352     If the variable ‘%load-hook’ is defined, it should be bound to a
3353     procedure that will be called before any code is loaded.  See
3354     documentation for ‘%load-hook’ later in this section.
3355
3356 -- Scheme Procedure: load-compiled filename
3357     Load the compiled file named FILENAME.
3358
3359     Compiling a source file (*note Read/Load/Eval/Compile::) and then
3360     calling ‘load-compiled’ on the resulting file is equivalent to
3361     calling ‘load’ on the source file.
3362
3363 -- Scheme Procedure: primitive-load filename
3364 -- C Function: scm_primitive_load (filename)
3365     Load the file named FILENAME and evaluate its contents in the
3366     top-level environment.  FILENAME must either be a full pathname or
3367     be a pathname relative to the current directory.  If the variable
3368     ‘%load-hook’ is defined, it should be bound to a procedure that
3369     will be called before any code is loaded.  See the documentation
3370     for ‘%load-hook’ later in this section.
3371
3372 -- C Function: SCM scm_c_primitive_load (const char *filename)
     Like ‘scm_primitive_load’, but taking a C string instead of an
     ‘SCM’.
3374
3375 -- Variable: current-reader
3376     ‘current-reader’ holds the read procedure that is currently being
3377     used by the above loading procedures to read expressions (from the
3378     file that they are loading).  ‘current-reader’ is a fluid, so it
3379     has an independent value in each dynamic root and should be read
3380     and set using ‘fluid-ref’ and ‘fluid-set!’ (*note Fluids and
3381     Dynamic States::).
3382
3383     Changing ‘current-reader’ is typically useful to introduce local
3384     syntactic changes, such that code following the ‘fluid-set!’ call
3385     is read using the newly installed reader.  The ‘current-reader’
3386     change should take place at evaluation time when the code is
3387     evaluated, or at compilation time when the code is compiled:
3388
3389          (eval-when (compile eval)
3390            (fluid-set! current-reader my-own-reader))
3391
3392     The ‘eval-when’ form above ensures that the ‘current-reader’ change
3393     occurs at the right time.
3394
3395 -- Variable: %load-hook
3396     A procedure to be called ‘(%load-hook FILENAME)’ whenever a file is
3397     loaded, or ‘#f’ for no such call.  ‘%load-hook’ is used by all of
3398     the loading functions (‘load’ and ‘primitive-load’, and
3399     ‘load-from-path’ and ‘primitive-load-path’ documented in the next
3400     section).
3401
     For example, an application can set this to show what’s loaded:
3403
3404          (set! %load-hook (lambda (filename)
3405                             (format #t "Loading ~a ...\n" filename)))
3406          (load-from-path "foo.scm")
3407          ⊣ Loading /usr/local/share/guile/site/foo.scm ...
3408
3409 -- Scheme Procedure: current-load-port
3410 -- C Function: scm_current_load_port ()
3411     Return the current-load-port.  The load port is used internally by
3412     ‘primitive-load’.
3413
3414
3415File: guile.info,  Node: Load Paths,  Next: Character Encoding of Source Files,  Prev: Loading,  Up: Read/Load/Eval/Compile
3416
34176.16.8 Load Paths
3418-----------------
3419
The procedures in the previous section look for Scheme code at a
specific location in the file system.  Guile also has some procedures to
search the load path for code.
3423
3424 -- Variable: %load-path
3425     List of directories which should be searched for Scheme modules and
3426     libraries.  When Guile starts up, ‘%load-path’ is initialized to
3427     the default load path ‘(list (%library-dir) (%site-dir)
3428     (%global-site-dir) (%package-data-dir))’.  The ‘GUILE_LOAD_PATH’
3429     environment variable can be used to prepend or append additional
3430     directories (*note Environment Variables::).
3431
3432     *Note Build Config::, for more on ‘%site-dir’ and related
3433     procedures.
3434
3435 -- Scheme Procedure: load-from-path filename
3436     Similar to ‘load’, but searches for FILENAME in the load paths.
3437     Preferentially loads a compiled version of the file, if it is
3438     available and up-to-date.
3439
3440   A user can extend the load path by calling ‘add-to-load-path’.
3441
3442 -- Scheme Syntax: add-to-load-path dir
3443     Add DIR to the load path.
3444
3445   For example, a script might include this form to add the directory
3446that it is in to the load path:
3447
3448     (add-to-load-path (dirname (current-filename)))
3449
3450   It’s better to use ‘add-to-load-path’ than to modify ‘%load-path’
3451directly, because ‘add-to-load-path’ takes care of modifying the path
3452both at compile-time and at run-time.
3453
3454 -- Scheme Procedure: primitive-load-path filename
3455          [exception-on-not-found]
3456 -- C Function: scm_primitive_load_path (filename)
     Search ‘%load-path’ for the file named FILENAME and load it into
     the top-level environment, preferring a compiled version of the
     file if one is available and up-to-date.  By default, an error is
     signalled if FILENAME is a relative pathname and is not found in
     the list of search paths.
3462
3463     If FILENAME is a relative pathname and is not found in the list of
3464     search paths, one of three things may happen, depending on the
3465     optional second argument, EXCEPTION-ON-NOT-FOUND.  If it is ‘#f’,
3466     ‘#f’ will be returned.  If it is a procedure, it will be called
3467     with no arguments.  (This allows a distinction to be made between
3468     exceptions raised by loading a file, and exceptions related to the
3469     loader itself.)  Otherwise an error is signalled.
3470
3471     For compatibility with Guile 1.8 and earlier, the C function takes
3472     only one argument, which can be either a string (the file name) or
3473     an argument list.
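
     The two non-error behaviours can be sketched as follows.  The name
     ‘no/such/file’ is hypothetical and is assumed not to exist anywhere
     on the load path.

```scheme
;; With #f, a missing file simply yields #f.
(define a (primitive-load-path "no/such/file" #f))

;; With a procedure, the procedure is called with no arguments instead
;; of signalling an error.
(define b (primitive-load-path "no/such/file"
                               (lambda () 'not-found)))

(write (list a b))
(newline)
```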
3474
3475 -- Scheme Procedure: %search-load-path filename
3476 -- C Function: scm_sys_search_load_path (filename)
3477     Search ‘%load-path’ for the file named FILENAME, which must be
3478     readable by the current user.  If FILENAME is found in the list of
3479     paths to search or is an absolute pathname, return its full
3480     pathname.  Otherwise, return ‘#f’.  Filenames may have any of the
3481     optional extensions in the ‘%load-extensions’ list;
3482     ‘%search-load-path’ will try each extension automatically.
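
     A short sketch, assuming a standard installation in which the
     source for ‘(ice-9 popen)’ is on the load path; ‘no/such/file’ is a
     hypothetical name that should not be found:

```scheme
;; The extension is optional: each entry of %load-extensions is tried.
(define found (%search-load-path "ice-9/popen"))
(define missing (%search-load-path "no/such/file"))

(write (list found missing))
(newline)
```

     The first result depends on where Guile is installed, so no
     particular path is shown here.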
3483
3484 -- Variable: %load-extensions
3485     A list of default file extensions for files containing Scheme code.
3486     ‘%search-load-path’ tries each of these extensions when looking for
3487     a file to load.  By default, ‘%load-extensions’ is bound to the
3488     list ‘("" ".scm")’.
3489
   As mentioned above, when Guile searches the ‘%load-path’ for a source
file, it will also search the ‘%load-compiled-path’ for a corresponding
compiled file.  If the compiled file is as new as or newer than the
source file, it will be loaded instead of the source file, using
‘load-compiled’.
3495
3496 -- Variable: %load-compiled-path
3497     Like ‘%load-path’, but for compiled files.  By default, this path
3498     has two entries: one for compiled files from Guile itself, and one
3499     for site packages.  The ‘GUILE_LOAD_COMPILED_PATH’ environment
3500     variable can be used to prepend or append additional directories
3501     (*note Environment Variables::).
3502
3503   When ‘primitive-load-path’ searches the ‘%load-compiled-path’ for a
3504corresponding compiled file for a relative path it does so by appending
‘.go’ to the relative path.  For example, searching for ‘ice-9/popen’
could find ‘/usr/lib/guile/3.0/ccache/ice-9/popen.go’, and use it
instead of ‘/usr/share/guile/3.0/ice-9/popen.scm’.
3508
3509   If ‘primitive-load-path’ does not find a corresponding ‘.go’ file in
3510the ‘%load-compiled-path’, or the ‘.go’ file is out of date, it will
3511search for a corresponding auto-compiled file in the fallback path,
3512possibly creating one if one does not exist.
3513
3514   *Note Installing Site Packages::, for more on how to correctly
3515install site packages.  *Note Modules and the File System::, for more on
3516the relationship between load paths and modules.  *Note Compilation::,
3517for more on the fallback path and auto-compilation.
3518
3519   Finally, there are a couple of helper procedures for general path
3520manipulation.
3521
3522 -- Scheme Procedure: parse-path path [tail]
3523 -- C Function: scm_parse_path (path, tail)
3524     Parse PATH, which is expected to be a colon-separated string, into
3525     a list and return the resulting list with TAIL appended.  If PATH
3526     is ‘#f’, TAIL is returned.
3527
3528 -- Scheme Procedure: parse-path-with-ellipsis path base
3529 -- C Function: scm_parse_path_with_ellipsis (path, base)
3530     Parse PATH, which is expected to be a colon-separated string, into
3531     a list and return the resulting list with BASE (a list) spliced in
3532     place of the ‘...’ path component, if present, or else BASE is
3533     added to the end.  If PATH is ‘#f’, BASE is returned.
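
     A brief sketch of both procedures, assuming a POSIX system where
     the path separator is a colon; the directory names are arbitrary:

```scheme
(define a (parse-path "/usr/lib:/usr/local/lib"))
;; ⇒ ("/usr/lib" "/usr/local/lib")

(define b (parse-path "/a:/b" '("/c")))      ; TAIL is appended
;; ⇒ ("/a" "/b" "/c")

(define c (parse-path #f '("/c")))           ; PATH is #f: TAIL alone
;; ⇒ ("/c")

(define d (parse-path-with-ellipsis "/a:...:/b" '("/x" "/y")))
;; ⇒ ("/a" "/x" "/y" "/b")

(define e (parse-path-with-ellipsis "/a:/b" '("/x")))  ; no ellipsis
;; ⇒ ("/a" "/b" "/x")
```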
3534
3535 -- Scheme Procedure: search-path path filename [extensions
3536          [require-exts?]]
3537 -- C Function: scm_search_path (path, filename, rest)
3538     Search PATH for a directory containing a file named FILENAME.  The
3539     file must be readable, and not a directory.  If we find one, return
3540     its full filename; otherwise, return ‘#f’.  If FILENAME is
3541     absolute, return it unchanged.  If given, EXTENSIONS is a list of
3542     strings; for each directory in PATH, we search for FILENAME
3543     concatenated with each EXTENSION.  If REQUIRE-EXTS? is true,
3544     require that the returned file name have one of the given
3545     extensions; if REQUIRE-EXTS? is not given, it defaults to ‘#f’.
3546
3547     For compatibility with Guile 1.8 and earlier, the C function takes
3548     only three arguments.
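
     The effect of EXTENSIONS can be sketched with a throwaway file.
     The directory ‘/tmp’ and the file name are assumptions made for
     this example only.

```scheme
;; Create a small file so that there is something to find.
(define dir "/tmp")                                   ; assumed writable
(define base
  (string-append "search-path-demo-" (number->string (getpid))))
(define file (string-append dir "/" base ".scm"))
(call-with-output-file file
  (lambda (port) (display ";; empty demo file\n" port)))

;; With EXTENSIONS, BASE is tried with ".scm" appended and is found.
(define with-ext (search-path (list dir) base '(".scm")))

;; Without EXTENSIONS, only the bare name is tried; it does not exist.
(define without-ext (search-path (list dir) base))

(delete-file file)
```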
3549
3550
3551File: guile.info,  Node: Character Encoding of Source Files,  Next: Delayed Evaluation,  Prev: Load Paths,  Up: Read/Load/Eval/Compile
3552
35536.16.9 Character Encoding of Source Files
3554-----------------------------------------
3555
3556Scheme source code files are usually encoded in ASCII or UTF-8, but the
3557built-in reader can interpret other character encodings as well.  When
3558Guile loads Scheme source code, it uses the ‘file-encoding’ procedure
3559(described below) to try to guess the encoding of the file.  In the
3560absence of any hints, UTF-8 is assumed.  One way to provide a hint about
3561the encoding of a source file is to place a coding declaration in the
3562top 500 characters of the file.
3563
3564   A coding declaration has the form ‘coding: XXXXXX’, where ‘XXXXXX’ is
3565the name of a character encoding in which the source code file has been
encoded.  The coding declaration must appear in a Scheme comment.  It
can be either a semicolon-initiated comment or the first block ‘#!’
comment in the file.
3569
3570   The name of the character encoding in the coding declaration is
typically lower case and contains only letters, numbers, and hyphens,
3572as recognized by ‘set-port-encoding!’ (*note ‘set-port-encoding!’:
3573Ports.).  Common examples of character encoding names are ‘utf-8’ and
3574‘iso-8859-1’, as defined by IANA
3575(http://www.iana.org/assignments/character-sets).  Thus, the coding
3576declaration is mostly compatible with Emacs.
3577
3578   However, there are some differences in encoding names recognized by
3579Emacs and encoding names defined by IANA, the latter being essentially a
3580subset of the former.  For instance, ‘latin-1’ is a valid encoding name
3581for Emacs, but it’s not according to the IANA standard, which Guile
3582follows; instead, you should use ‘iso-8859-1’, which is both understood
by Emacs and dubbed by IANA.  (IANA writes it in uppercase, Emacs wants
it in lowercase, and Guile is case-insensitive.)
3585
3586   For source code, only a subset of all possible character encodings
3587can be interpreted by the built-in source code reader.  Only those
3588character encodings in which ASCII text appears unmodified can be used.
3589This includes ‘UTF-8’ and ‘ISO-8859-1’ through ‘ISO-8859-15’.  The
3590multi-byte character encodings ‘UTF-16’ and ‘UTF-32’ may not be used
3591because they are not compatible with ASCII.
3592
3593   There might be a scenario in which one would want to read non-ASCII
3594code from a port, such as with the function ‘read’, instead of with
3595‘load’.  If the port’s character encoding is the same as the encoding of
the code to be read by the port, no other special handling is
necessary.  The port will automatically do the character encoding
conversion.  The functions ‘setlocale’ and ‘set-port-encoding!’ are
used to set port encodings (*note Ports::).
3600
   To read code of unknown character encoding from a port, proceed in
three steps.  First, the character encoding of the
3603port should be set to ISO-8859-1 using ‘set-port-encoding!’.  Then, the
3604procedure ‘file-encoding’, described below, is used to scan for a coding
3605declaration when reading from the port.  As a side effect, it rewinds
3606the port after its scan is complete.  After that, the port’s character
3607encoding should be set to the encoding returned by ‘file-encoding’, if
3608any, again by using ‘set-port-encoding!’.  Then the code can be read as
3609normal.
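
   The steps above can be sketched as follows.  The temporary file and
its ‘/tmp/coding-demo-XXXXXX’ template are assumptions made only so that
the example has something to read.

```scheme
;; Write a demo file whose first comment carries a coding declaration.
(define demo-file
  (let* ((port (mkstemp! (string-copy "/tmp/coding-demo-XXXXXX")))
         (name (port-filename port)))
    (display ";; coding: iso-8859-1\n(+ 1 2)\n" port)
    (close-port port)
    name))

(define port (open-input-file demo-file))

;; Step 1: treat the input as ISO-8859-1 while scanning.
(set-port-encoding! port "ISO-8859-1")

;; Step 2: scan for a coding declaration; the port is rewound afterwards.
(define declared (file-encoding port))

;; Step 3: switch to the declared encoding, if one was found.
(when declared
  (set-port-encoding! port declared))

;; The code can now be read as normal.
(define form (read port))

(close-port port)
(delete-file demo-file)
```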
3610
3611   Alternatively, one can use the ‘#:guess-encoding’ keyword argument of
3612‘open-file’ and related procedures.  *Note File Ports::.
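
   A minimal sketch of the keyword-argument approach, again using an
assumed temporary file with a coding declaration:

```scheme
;; Write a demo file that declares its encoding in a comment.
(define demo-file
  (let* ((port (mkstemp! (string-copy "/tmp/guess-demo-XXXXXX")))
         (name (port-filename port)))
    (display ";; coding: iso-8859-1\n(+ 1 2)\n" port)
    (close-port port)
    name))

;; Ask open-file to look for the declaration and set the port encoding.
(define port (open-file demo-file "r" #:guess-encoding #t))
(define guessed (port-encoding port))

(close-port port)
(delete-file demo-file)
```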
3613
3614 -- Scheme Procedure: file-encoding port
3615 -- C Function: scm_file_encoding (port)
3616     Attempt to scan the first few hundred bytes from the PORT for hints
3617     about its character encoding.  Return a string containing the
3618     encoding name or ‘#f’ if the encoding cannot be determined.  The
3619     port is rewound.
3620
3621     Currently, the only supported method is to look for an Emacs-like
3622     character coding declaration (*note how Emacs recognizes file
3623     encoding: (emacs)Recognize Coding.).  The coding declaration is of
3624     the form ‘coding: XXXXX’ and must appear in a Scheme comment.
3625     Additional heuristics may be added in the future.
3626
3627
3628File: guile.info,  Node: Delayed Evaluation,  Next: Local Evaluation,  Prev: Character Encoding of Source Files,  Up: Read/Load/Eval/Compile
3629
36306.16.10 Delayed Evaluation
3631--------------------------
3632
3633Promises are a convenient way to defer a calculation until its result is
3634actually needed, and to run such a calculation only once.  Also *note
3635SRFI-45::.
3636
3637 -- syntax: delay expr
3638     Return a promise object which holds the given EXPR expression,
3639     ready to be evaluated by a later ‘force’.
3640
3641 -- Scheme Procedure: promise? obj
3642 -- C Function: scm_promise_p (obj)
3643     Return true if OBJ is a promise.
3644
3645 -- Scheme Procedure: force p
3646 -- C Function: scm_force (p)
3647     Return the value obtained from evaluating the EXPR in the given
3648     promise P.  If P has previously been forced then its EXPR is not
     evaluated again; instead, the value obtained at that time is simply
3650     returned.
3651
3652     During a ‘force’, an EXPR can call ‘force’ again on its own
3653     promise, resulting in a recursive evaluation of that EXPR.  The
3654     first evaluation to return gives the value for the promise.  Higher
3655     evaluations run to completion in the normal way, but their results
     are ignored; ‘force’ always returns the first value.
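
     The following sketch shows that the delayed expression runs only
     once; the counter ‘evaluations’ is introduced here purely for
     illustration.

```scheme
(define evaluations 0)

;; The body is not run when the promise is created.
(define p
  (delay (begin
           (set! evaluations (+ evaluations 1))
           (* 6 7))))

(define is-promise (promise? p))   ; ⇒ #t
(define first (force p))           ; ⇒ 42, runs the body
(define second (force p))          ; ⇒ 42, reuses the stored value
;; evaluations ⇒ 1
```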
3657
3658
3659File: guile.info,  Node: Local Evaluation,  Next: Local Inclusion,  Prev: Delayed Evaluation,  Up: Read/Load/Eval/Compile
3660
36616.16.11 Local Evaluation
3662------------------------
3663
3664Guile includes a facility to capture a lexical environment, and later
3665evaluate a new expression within that environment.  This code is
3666implemented in a module.
3667
3668     (use-modules (ice-9 local-eval))
3669
3670 -- syntax: the-environment
3671     Captures and returns a lexical environment for use with
3672     ‘local-eval’ or ‘local-compile’.
3673
3674 -- Scheme Procedure: local-eval exp env
3675 -- C Function: scm_local_eval (exp, env)
3676 -- Scheme Procedure: local-compile exp env [opts=()]
3677     Evaluate or compile the expression EXP in the lexical environment
3678     ENV.
3679
3680   Here is a simple example, illustrating that it is the variable that
3681gets captured, not just its value at one point in time.
3682
3683     (define e (let ((x 100)) (the-environment)))
3684     (define fetch-x (local-eval '(lambda () x) e))
3685     (fetch-x)
3686     ⇒ 100
3687     (local-eval '(set! x 42) e)
3688     (fetch-x)
3689     ⇒ 42
3690
3691   While EXP is evaluated within the lexical environment of
3692‘(the-environment)’, it has the dynamic environment of the call to
3693‘local-eval’.
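
   A short sketch of this distinction, using a parameter named ‘where’
that is made up for the example: the lexical variable ‘x’ comes from the
captured environment, while the parameter value comes from the call to
‘local-eval’.

```scheme
(use-modules (ice-9 local-eval))

(define where (make-parameter 'outer))

;; Capture the environment while the parameter is bound to `captured'.
(define env
  (parameterize ((where 'captured))
    (let ((x 1))
      (the-environment))))

;; Evaluate while the parameter is bound to `call-site'.
(define result
  (parameterize ((where 'call-site))
    (local-eval '(list x (where)) env)))
;; ⇒ (1 call-site)
```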
3694
3695   ‘local-eval’ and ‘local-compile’ can only evaluate expressions, not
3696definitions.
3697
3698     (local-eval '(define foo 42)
3699                 (let ((x 100)) (the-environment)))
3700     ⇒ syntax error: definition in expression context
3701
3702   Note that the current implementation of ‘(the-environment)’ only
3703captures “normal” lexical bindings, and pattern variables bound by
3704‘syntax-case’.  It does not currently capture local syntax transformers
3705bound by ‘let-syntax’, ‘letrec-syntax’ or non-top-level ‘define-syntax’
3706forms.  Any attempt to reference such captured syntactic keywords via
3707‘local-eval’ or ‘local-compile’ produces an error.
3708
3709
3710File: guile.info,  Node: Local Inclusion,  Next: Sandboxed Evaluation,  Prev: Local Evaluation,  Up: Read/Load/Eval/Compile
3711
37126.16.12 Local Inclusion
3713-----------------------
3714
3715This section has discussed various means of linking Scheme code
3716together: fundamentally, loading up files at run-time using ‘load’ and
3717‘load-compiled’.  Guile provides another option to compose parts of
3718programs together at expansion-time instead of at run-time.
3719
3720 -- Scheme Syntax: include file-name
3721     Open FILE-NAME, at expansion-time, and read the Scheme forms that
3722     it contains, splicing them into the location of the ‘include’,
3723     within a ‘begin’.
3724
3725     If FILE-NAME is a relative path, it is searched for relative to the
3726     path that contains the file that the ‘include’ form appears in.
3727
   For C programmers: if ‘load’ in Scheme is like ‘dlopen’ in C, then
‘include’ is like the C preprocessor’s ‘#include’.  When you use
‘include’, it is as if the contents of the included file were typed
in instead of the ‘include’ form.
3732
3733   Because the code is included at compile-time, it is available to the
3734macroexpander.  Syntax definitions in the included file are available to
3735later code in the form in which the ‘include’ appears, without the need
3736for ‘eval-when’.  (*Note Eval When::.)
3737
3738   For the same reason, compiling a form that uses ‘include’ results in
3739one compilation unit, composed of multiple files.  Loading the compiled
3740file is one ‘stat’ operation for the compilation unit, instead of ‘2*N’
in the case of ‘load’ (once for each loaded source file, and once for
each corresponding compiled file, in the best case).
3743
3744   Unlike ‘load’, ‘include’ also works within nested lexical contexts.
3745It so happens that the optimizer works best within a lexical context,
3746because all of the uses of bindings in a lexical context are visible, so
3747composing files by including them within a ‘(let () ...)’ can sometimes
3748lead to important speed improvements.
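
   As a rough sketch of inclusion within a ‘(let () ...)’, the example
below writes a helper file and then includes it.  The temporary file
name is an assumption, and ‘eval’ is used only so that the example can
create the file before the ‘include’ form is expanded; in a real program
one would write the ‘include’ form directly in the source.

```scheme
;; Write a helper file to include.
(define helper-file
  (let* ((port (mkstemp! (string-copy "/tmp/include-demo-XXXXXX")))
         (name (port-filename port)))
    (write '(define (square x) (* x x)) port)
    (close-port port)
    name))

;; `include' needs a literal file name at expansion time, so the form is
;; built as data and expanded by `eval' once the file exists.
(define result
  (eval `(let ()
           (include ,helper-file)   ; splices in the definition of square
           (square 7))
        (interaction-environment)))
;; ⇒ 49

(delete-file helper-file)
```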
3749
3750   On the other hand, ‘include’ does have all the disadvantages of early
3751binding: once the code with the ‘include’ is compiled, no change to the
3752included file is reflected in the future behavior of the including form.
3753
3754   Also, the particular form of ‘include’, which requires an absolute
3755path, or a path relative to the current directory at compile-time, is
3756not very amenable to compiling the source in one place, but then
3757installing the source to another place.  For this reason, Guile provides
3758another form, ‘include-from-path’, which looks for the source file to
3759include within a load path.
3760
3761 -- Scheme Syntax: include-from-path file-name
3762     Like ‘include’, but instead of expecting ‘file-name’ to be an
3763     absolute file name, it is expected to be a relative path to search
3764     in the ‘%load-path’.
3765
3766   ‘include-from-path’ is more useful when you want to install all of
3767the source files for a package (as you should!).  It makes it possible
3768to evaluate an installed file from source, instead of relying on the
3769‘.go’ file being up to date.
3770
3771
3772File: guile.info,  Node: Sandboxed Evaluation,  Next: REPL Servers,  Prev: Local Inclusion,  Up: Read/Load/Eval/Compile
3773
37746.16.13 Sandboxed Evaluation
3775----------------------------
3776
3777Sometimes you would like to evaluate code that comes from an untrusted
3778party.  The safest way to do this is to buy a new computer, evaluate the
3779code on that computer, then throw the machine away.  However if you are
3780unwilling to take this simple approach, Guile does include a limited
3781“sandbox” facility that can allow untrusted code to be evaluated with
3782some confidence.
3783
3784   To use the sandboxed evaluator, load its module:
3785
3786     (use-modules (ice-9 sandbox))
3787
3788   Guile’s sandboxing facility starts with the ability to restrict the
3789time and space used by a piece of code.
3790
3791 -- Scheme Procedure: call-with-time-limit limit thunk limit-reached
3792     Call THUNK, but cancel it if LIMIT seconds of wall-clock time have
3793     elapsed.  If the computation is cancelled, call LIMIT-REACHED in
3794     tail position.  THUNK must not disable interrupts or prevent an
3795     abort via a ‘dynamic-wind’ unwind handler.
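
     A minimal sketch; the limits of half a second and a tenth of a
     second are arbitrary choices.

```scheme
(use-modules (ice-9 sandbox))

;; A thunk that finishes in time: its value is returned.
(define quick
  (call-with-time-limit 0.5
    (lambda () (+ 1 2))
    (lambda () 'timed-out)))

;; A thunk that never finishes: it is cancelled once the limit has
;; elapsed, and LIMIT-REACHED supplies the result.
(define slow
  (call-with-time-limit 0.1
    (lambda () (let loop () (loop)))
    (lambda () 'timed-out)))
```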
3796
3797 -- Scheme Procedure: call-with-allocation-limit limit thunk
3798          limit-reached
3799     Call THUNK, but cancel it if LIMIT bytes have been allocated.  If
3800     the computation is cancelled, call LIMIT-REACHED in tail position.
3801     THUNK must not disable interrupts or prevent an abort via a
3802     ‘dynamic-wind’ unwind handler.
3803
3804     This limit applies to both stack and heap allocation.  The
3805     computation will not be aborted before LIMIT bytes have been
3806     allocated, but for the heap allocation limit, the check may be
3807     postponed until the next garbage collection.
3808
3809     Note that as a current shortcoming, the heap size limit applies to
3810     all threads; concurrent allocation by other unrelated threads
3811     counts towards the allocation limit.
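
     A sketch of a runaway allocation being cancelled.  The variable
     ‘sink’ and the procedure ‘alloc-loop’ are made up for the example;
     storing each vector in a top-level variable is only there to make
     sure that the allocation really takes place.

```scheme
(use-modules (ice-9 sandbox))

(define sink #f)

;; Allocate a fresh vector on every iteration, forever.
(define (alloc-loop)
  (set! sink (make-vector 100 #f))
  (alloc-loop))

(define result
  (call-with-allocation-limit #e1e6        ; about one million bytes
    alloc-loop
    (lambda () 'allocation-limit-reached)))
```

     Because the heap check runs after a garbage collection, the loop
     may allocate somewhat more than the limit before it is cancelled.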
3812
3813 -- Scheme Procedure: call-with-time-and-allocation-limits time-limit
3814          allocation-limit thunk
3815     Invoke THUNK in a dynamic extent in which its execution is limited
3816     to TIME-LIMIT seconds of wall-clock time, and its allocation to
3817     ALLOCATION-LIMIT bytes.  THUNK must not disable interrupts or
3818     prevent an abort via a ‘dynamic-wind’ unwind handler.
3819
3820     If successful, return all values produced by invoking THUNK.  Any
3821     uncaught exception thrown by the thunk will propagate out.  If the
3822     time or allocation limit is exceeded, an exception will be thrown
3823     to the ‘limit-exceeded’ key.
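
     A sketch of handling the ‘limit-exceeded’ exception.  The helper
     ‘run-limited’ and the particular limits are assumptions made for
     the example.

```scheme
(use-modules (ice-9 sandbox))

;; Hypothetical helper: run THUNK under both limits and turn the
;; `limit-exceeded' exception into a symbol.
(define (run-limited thunk)
  (catch 'limit-exceeded
    (lambda ()
      (call-with-time-and-allocation-limits 0.1 #e1e7 thunk))
    (lambda (key . args) 'limit-exceeded)))

(define ok (run-limited (lambda () (* 6 7))))
(define too-slow (run-limited (lambda () (let loop () (loop)))))
```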
3824
3825   The time limit and stack limit are both very precise, but the heap
3826limit only gets checked asynchronously, after a garbage collection.  In
3827particular, if the heap is already very large, the number of allocated
3828bytes between garbage collections will be large, and therefore the
3829precision of the check is reduced.
3830
3831   Additionally, due to the mechanism used by the allocation limit (the
3832‘after-gc-hook’), large single allocations like ‘(make-vector #e1e7)’
3833are only detected after the allocation completes, even if the allocation
3834itself causes garbage collection.  It’s possible therefore for user code
3835to not only exceed the allocation limit set, but also to exhaust all
3836available memory, causing out-of-memory conditions at any allocation
3837site.  Failure to allocate memory in Guile itself should be safe and
3838cause an exception to be thrown, but most systems are not designed to
3839handle ‘malloc’ failures.  An allocation failure may therefore exercise
3840unexpected code paths in your system, so it is a weakness of the sandbox
3841(and therefore an interesting point of attack).
3842
3843   The main sandbox interface is ‘eval-in-sandbox’.
3844
3845 -- Scheme Procedure: eval-in-sandbox exp [#:time-limit 0.1]
3846          [#:allocation-limit #e10e6] [#:bindings all-pure-bindings]
3847          [#:module (make-sandbox-module bindings)] [#:sever-module? #t]
3848     Evaluate the Scheme expression EXP within an isolated "sandbox".
3849     Limit its execution to TIME-LIMIT seconds of wall-clock time, and
3850     limit its allocation to ALLOCATION-LIMIT bytes.
3851
3852     The evaluation will occur in MODULE, which defaults to the result
3853     of calling ‘make-sandbox-module’ on BINDINGS, which itself defaults
3854     to ‘all-pure-bindings’.  This is the core of the sandbox: creating
3855     a scope for the expression that is “safe”.
3856
3857     A safe sandbox module has two characteristics.  Firstly, it will
3858     not allow the expression being evaluated to avoid being cancelled
3859     due to time or allocation limits.  This ensures that the expression
3860     terminates in a timely fashion.
3861
3862     Secondly, a safe sandbox module will prevent the evaluation from
3863     receiving information from previous evaluations, or from affecting
3864     future evaluations.  All combinations of binding sets exported by
3865     ‘(ice-9 sandbox)’ form safe sandbox modules.
3866
3867     The BINDINGS should be given as a list of import sets.  One import
3868     set is a list whose car names an interface, like ‘(ice-9 q)’, and
3869     whose cdr is a list of imports.  An import is either a bare symbol
3870     or a pair of ‘(OUT . IN)’, where OUT and IN are both symbols and
3871     denote the name under which a binding is exported from the module,
3872     and the name under which to make the binding available,
3873     respectively.  Note that BINDINGS is only used as an input to the
3874     default initializer for the MODULE argument; if you pass
3875     ‘#:module’, BINDINGS is unused.  If SEVER-MODULE? is true (the
3876     default), the module will be unlinked from the global module tree
     after the evaluation returns, to allow MODULE to be
     garbage-collected.
3878
3879     If successful, return all values produced by EXP.  Any uncaught
3880     exception thrown by the expression will propagate out.  If the time
3881     or allocation limit is exceeded, an exception will be thrown to the
3882     ‘limit-exceeded’ key.
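
     A short sketch of the three typical outcomes.  The file name in the
     last expression is arbitrary; the point is only that
     ‘open-input-file’ is not expected to be bound inside the default
     sandbox.

```scheme
(use-modules (ice-9 sandbox))

;; An ordinary pure computation returns its value.
(define sum (eval-in-sandbox '(+ 1 2)))

;; A computation that does not terminate is cancelled.
(define looping
  (catch 'limit-exceeded
    (lambda ()
      (eval-in-sandbox '(let loop () (loop)) #:time-limit 0.05))
    (lambda (key . args) 'limit-exceeded)))

;; File access is not among the default bindings, so this expression
;; raises an exception instead of opening the file.
(define escape-attempt
  (catch #t
    (lambda () (eval-in-sandbox '(open-input-file "/etc/passwd")))
    (lambda (key . args) 'rejected)))
```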
3883
3884   Constructing a safe sandbox module is tricky in general.  Guile
3885defines an easy way to construct safe modules from predefined sets of
3886bindings.  Before getting to that interface, here are some general notes
3887on safety.
3888
3889  1. The time and allocation limits rely on the ability to interrupt and
3890     cancel a computation.  For this reason, no binding included in a
3891     sandbox module should be able to indefinitely postpone interrupt
3892     handling, nor should a binding be able to prevent an abort.  In
3893     practice this second consideration means that ‘dynamic-wind’ should
3894     not be included in any binding set.
3895  2. The time and allocation limits apply only to the ‘eval-in-sandbox’
3896     call.  If the call returns a procedure which is later called, no
3897     limit is “automatically” in place.  Users of ‘eval-in-sandbox’ have
3898     to be very careful to reimpose limits when calling procedures that
3899     escape from sandboxes.
3900  3. Similarly, the dynamic environment of the ‘eval-in-sandbox’ call is
3901     not necessarily in place when any procedure that escapes from the
3902     sandbox is later called.
3903
3904     This detail prevents us from exposing ‘primitive-eval’ to the
3905     sandbox, for two reasons.  The first is that it’s possible for
3906     legacy code to forge references to any binding, if the
3907     ‘allow-legacy-syntax-objects?’ parameter is true.  The default for
3908     this parameter is true; *note Syntax Transformer Helpers:: for the
3909     details.  The parameter is bound to ‘#f’ for the duration of the
3910     ‘eval-in-sandbox’ call itself, but that will not be in place during
3911     calls to escaped procedures.
3912
3913     The second reason we don’t expose ‘primitive-eval’ is that
3914     ‘primitive-eval’ implicitly works in the current module, which for
3915     an escaped procedure will probably be different than the module
3916     that is current for the ‘eval-in-sandbox’ call itself.
3917
3918     The common denominator here is that if an interface exposed to the
3919     sandbox relies on dynamic environments, it is easy to mistakenly
3920     grant the sandboxed procedure additional capabilities in the form
3921     of bindings that it should not have access to.  For this reason,
3922     the default sets of predefined bindings do not depend on any
3923     dynamically scoped value.
3924  4. Mutation may allow a sandboxed evaluation to break some invariant
3925     in users of data supplied to it.  A lot of code culturally doesn’t
3926     expect mutation, but if you hand mutable data to a sandboxed
3927     evaluation and you also grant mutating capabilities to that
3928     evaluation, then the sandboxed code may indeed mutate that data.
     The default set of bindings for the sandbox does not include any
3930     mutating primitives.
3931
3932     Relatedly, ‘set!’ may allow a sandbox to mutate a primitive,
3933     invalidating many system-wide invariants.  Guile is currently quite
3934     permissive when it comes to imported bindings and mutability.
3935     Although ‘set!’ to a module-local or lexically bound variable would
3936     be fine, we don’t currently have an easy way to disallow ‘set!’ to
3937     an imported binding, so currently no binding set includes ‘set!’.
3938  5. Mutation may allow a sandboxed evaluation to keep state, or make a
3939     communication mechanism with other code.  On the one hand this
3940     sounds cool, but on the other hand maybe this is part of your
3941     threat model.  Again, the default set of bindings doesn’t include
3942     mutating primitives, preventing sandboxed evaluations from keeping
3943     state.
3944  6. The sandbox should probably not be able to open a network
3945     connection, or write to a file, or open a file from disk.  The
3946     default binding set includes no interaction with the operating
3947     system.
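
   Note 2 above can be illustrated with a sketch in which a procedure
escapes from the sandbox and the caller reimposes limits before calling
it.  The names ‘count-to’ and ‘call-with-limits’ and the chosen limits
are assumptions made for the example.

```scheme
(use-modules (ice-9 sandbox))

;; The sandbox returns a procedure; no limit applies to later calls.
(define count-to
  (eval-in-sandbox
   '(lambda (n)
      (let loop ((i 0))
        (if (< i n) (loop (+ i 1)) i)))))

;; Hypothetical helper that reimposes limits around a call.
(define (call-with-limits thunk)
  (catch 'limit-exceeded
    (lambda ()
      (call-with-time-and-allocation-limits 0.1 #e1e7 thunk))
    (lambda (key . args) 'limit-exceeded)))

(define small (call-with-limits (lambda () (count-to 1000))))
(define huge (call-with-limits (lambda () (count-to (expt 10 100)))))
```

   Here ‘small’ is 1000, while the second call cannot finish within the
time limit and yields ‘limit-exceeded’.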
3948
3949   If you, dear reader, find the above discussion interesting, you will
3950enjoy Jonathan Rees’ dissertation, “A Security Kernel Based on the
3951Lambda Calculus”.
3952
3953 -- Scheme Variable: all-pure-bindings
3954     All “pure” bindings that together form a safe subset of those
3955     bindings available by default to Guile user code.
3956
3957 -- Scheme Variable: all-pure-and-impure-bindings
3958     Like ‘all-pure-bindings’, but additionally including mutating
3959     primitives like ‘vector-set!’.  This set is still safe in the sense
3960     mentioned above, with the caveats about mutation.
3961
3962   The components of these composite sets are as follows:
3963 -- Scheme Variable: alist-bindings
3964 -- Scheme Variable: array-bindings
3965 -- Scheme Variable: bit-bindings
3966 -- Scheme Variable: bitvector-bindings
3967 -- Scheme Variable: char-bindings
3968 -- Scheme Variable: char-set-bindings
3969 -- Scheme Variable: clock-bindings
3970 -- Scheme Variable: core-bindings
3971 -- Scheme Variable: error-bindings
3972 -- Scheme Variable: fluid-bindings
3973 -- Scheme Variable: hash-bindings
3974 -- Scheme Variable: iteration-bindings
3975 -- Scheme Variable: keyword-bindings
3976 -- Scheme Variable: list-bindings
3977 -- Scheme Variable: macro-bindings
3978 -- Scheme Variable: nil-bindings
3979 -- Scheme Variable: number-bindings
3980 -- Scheme Variable: pair-bindings
3981 -- Scheme Variable: predicate-bindings
3982 -- Scheme Variable: procedure-bindings
3983 -- Scheme Variable: promise-bindings
3984 -- Scheme Variable: prompt-bindings
3985 -- Scheme Variable: regexp-bindings
3986 -- Scheme Variable: sort-bindings
3987 -- Scheme Variable: srfi-4-bindings
3988 -- Scheme Variable: string-bindings
3989 -- Scheme Variable: symbol-bindings
3990 -- Scheme Variable: unspecified-bindings
3991 -- Scheme Variable: variable-bindings
3992 -- Scheme Variable: vector-bindings
3993 -- Scheme Variable: version-bindings
3994     The components of ‘all-pure-bindings’.
3995
3996 -- Scheme Variable: mutating-alist-bindings
3997 -- Scheme Variable: mutating-array-bindings
3998 -- Scheme Variable: mutating-bitvector-bindings
3999 -- Scheme Variable: mutating-fluid-bindings
4000 -- Scheme Variable: mutating-hash-bindings
4001 -- Scheme Variable: mutating-list-bindings
4002 -- Scheme Variable: mutating-pair-bindings
4003 -- Scheme Variable: mutating-sort-bindings
4004 -- Scheme Variable: mutating-srfi-4-bindings
4005 -- Scheme Variable: mutating-string-bindings
4006 -- Scheme Variable: mutating-variable-bindings
4007 -- Scheme Variable: mutating-vector-bindings
4008     The additional components of ‘all-pure-and-impure-bindings’.
4009
4010   Finally, what do you do with a binding set?  What is a binding set
4011anyway?  ‘make-sandbox-module’ is here for you.
4012
4013 -- Scheme Procedure: make-sandbox-module bindings
4014     Return a fresh module that only contains BINDINGS.
4015
4016     The BINDINGS should be given as a list of import sets.  One import
4017     set is a list whose car names an interface, like ‘(ice-9 q)’, and
4018     whose cdr is a list of imports.  An import is either a bare symbol
4019     or a pair of ‘(OUT . IN)’, where OUT and IN are both symbols and
4020     denote the name under which a binding is exported from the module,
4021     and the name under which to make the binding available,
4022     respectively.
4023
4024   So you see that binding sets are just lists, and
4025‘all-pure-and-impure-bindings’ is really just the result of appending
4026all of the component binding sets.
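
   As a sketch, a hand-written binding set and a module built from
predefined components might look like this.  The names ‘tiny-bindings’
and ‘arithmetic-module’ are made up for the example.

```scheme
(use-modules (ice-9 sandbox))

;; One import set: from the (guile) module, expose `*' under its own
;; name and `+' under the new name `plus'.
(define tiny-bindings
  '(((guile) * (+ . plus))))

(define renamed
  (eval-in-sandbox '(plus 1 (* 2 3)) #:bindings tiny-bindings))

;; Binding sets are plain lists, so components can be appended.
(define arithmetic-module
  (make-sandbox-module (append core-bindings number-bindings)))

(define product
  (eval-in-sandbox '(* 6 7) #:module arithmetic-module))
```

   With the default ‘#:sever-module? #t’, the module passed as
‘#:module’ is unlinked after the evaluation, so this sketch uses it only
once.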
4027
4028
4029File: guile.info,  Node: REPL Servers,  Next: Cooperative REPL Servers,  Prev: Sandboxed Evaluation,  Up: Read/Load/Eval/Compile
4030
40316.16.14 REPL Servers
4032--------------------
4033
4034The procedures in this section are provided by
4035     (use-modules (system repl server))
4036
4037   When an application is written in Guile, it is often convenient to
4038allow the user to be able to interact with it by evaluating Scheme
4039expressions in a REPL.
4040
4041   The procedures of this module allow you to spawn a “REPL server”,
4042which permits interaction over a local or TCP connection.  Guile itself
4043uses them internally to implement the ‘--listen’ switch, *note
4044Command-line Options::.
4045
4046 -- Scheme Procedure: make-tcp-server-socket [#:host=#f] [#:addr]
4047          [#:port=37146]
4048     Return a stream socket bound to a given address ADDR and port
4049     number PORT.  If the HOST is given, and ADDR is not, then the HOST
4050     string is converted to an address.  If neither is given, we use the
4051     loopback address.
4052
4053 -- Scheme Procedure: make-unix-domain-server-socket
4054          [#:path="/tmp/guile-socket"]
4055     Return a UNIX domain socket, bound to a given PATH.
4056
4057 -- Scheme Procedure: run-server [server-socket]
4058 -- Scheme Procedure: spawn-server [server-socket]
4059     Create and run a REPL, making it available over the given
4060     SERVER-SOCKET.  If SERVER-SOCKET is not provided, it defaults to
4061     the socket created by calling ‘make-tcp-server-socket’ with no
4062     arguments.
4063
4064     ‘run-server’ runs the server in the current thread, whereas
4065     ‘spawn-server’ runs the server in a new thread.
4066
4067 -- Scheme Procedure: stop-server-and-clients!
4068     Closes the connection on all running server sockets.
4069
4070     Please note that in the current implementation, the REPL threads
4071     are cancelled without unwinding their stacks.  If any of them are
4072     holding mutexes or are within a critical section, the results are
4073     unspecified.
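
     A rough sketch of starting and stopping a server on a UNIX domain
     socket.  The socket path is a made-up name, and the short pause
     stands in for the application's own work.

```scheme
(use-modules (system repl server))

;; Hypothetical socket path, made unique with the process ID.
(define socket-path
  (string-append "/tmp/guile-repl-demo-" (number->string (getpid))))

(define sock (make-unix-domain-server-socket #:path socket-path))

;; Serve REPLs from a new thread; the current thread carries on.
(spawn-server sock)

;; ... the application would do its work here ...
(usleep 100000)

;; Shut down, then remove the socket file left behind by bind.
(stop-server-and-clients!)
(when (file-exists? socket-path)
  (delete-file socket-path))
```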
4074
4075
4076File: guile.info,  Node: Cooperative REPL Servers,  Prev: REPL Servers,  Up: Read/Load/Eval/Compile
4077
40786.16.15 Cooperative REPL Servers
4079--------------------------------
4080
4081The procedures in this section are provided by
4082     (use-modules (system repl coop-server))
4083
4084   Whereas ordinary REPL servers run in their own threads (*note REPL
4085Servers::), sometimes it is more convenient to provide REPLs that run at
4086specified times within an existing thread, for example in programs
4087utilizing an event loop or in single-threaded programs.  This allows for
4088safe access and mutation of a program’s data structures from the REPL,
4089without concern for thread synchronization.
4090
4091   Although the REPLs are run in the thread that calls
4092‘spawn-coop-repl-server’ and ‘poll-coop-repl-server’, dedicated threads
4093are spawned so that the calling thread is not blocked.  The spawned
threads read input for the REPLs and listen for new connections.
4095
4096   Cooperative REPL servers must be polled periodically to evaluate any
4097pending expressions by calling ‘poll-coop-repl-server’ with the object
4098returned from ‘spawn-coop-repl-server’.  The thread that calls
4099‘poll-coop-repl-server’ will be blocked for as long as the expression
takes to be evaluated, or for as long as the debugger is active.
4101
4102 -- Scheme Procedure: spawn-coop-repl-server [server-socket]
4103     Create and return a new cooperative REPL server object, and spawn a
4104     new thread to listen for connections on SERVER-SOCKET.  Proper
4105     functioning of the REPL server requires that
4106     ‘poll-coop-repl-server’ be called periodically on the returned
4107     server object.
4108
4109 -- Scheme Procedure: poll-coop-repl-server coop-server
4110     Poll the cooperative REPL server COOP-SERVER and apply a pending
4111     operation if there is one, such as evaluating an expression typed
4112     at the REPL prompt.  This procedure must be called from the same
4113     thread that called ‘spawn-coop-repl-server’.
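
   The polling requirement described above can be sketched as a small
event loop.  This is only an illustration under stated assumptions:
‘do-one-step!’ and ‘run-event-loop’ are hypothetical names standing in
for an application's own code, and the 10 ms sleep is an arbitrary
choice.

```scheme
(use-modules (system repl coop-server))

;; Hypothetical stand-in for the application's own per-iteration work.
(define (do-one-step!)
  #t)

;; Run ITERATIONS passes of a simple loop, polling SERVER once per pass.
;; Call this from the thread that called `spawn-coop-repl-server'.
(define (run-event-loop server iterations)
  (let loop ((i 0))
    (when (< i iterations)
      (poll-coop-repl-server server)   ; evaluate any pending REPL input
      (do-one-step!)
      (usleep 10000)                   ; sleep 10 ms between polls
      (loop (+ i 1)))))

;; Typical use (opens a listening socket, so it is left commented out):
;; (define server (spawn-coop-repl-server))
;; (run-event-loop server 1000)
```

   Because each poll runs in the loop's own thread, expressions typed at
the REPL can touch the program's data between iterations without extra
locking.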
4114
4115
4116File: guile.info,  Node: Memory Management,  Next: Modules,  Prev: Read/Load/Eval/Compile,  Up: API Reference
4117
41186.17 Memory Management and Garbage Collection
4119=============================================
4120
4121Guile uses a _garbage collector_ to manage most of its objects.  While
4122the garbage collector is designed to be mostly invisible, you sometimes
4123need to interact with it explicitly.
4124
4125   See *note Garbage Collection:: for a general discussion of how
4126garbage collection relates to using Guile from C.
4127
4128* Menu:
4129
4130* Garbage Collection Functions::
4131* Memory Blocks::
4132* Weak References::
4133* Guardians::
4134
4135
4136File: guile.info,  Node: Garbage Collection Functions,  Next: Memory Blocks,  Up: Memory Management
4137
41386.17.1 Functions related to Garbage Collection
4139----------------------------------------------
4140
4141 -- Scheme Procedure: gc
4142 -- C Function: scm_gc ()
4143     Finds all of the “live” ‘SCM’ objects and reclaims for further use
4144     those that are no longer accessible.  You normally don’t need to
4145     call this function explicitly.  Its functionality is invoked
4146     automatically as needed.
4147
4148 -- C Function: SCM scm_gc_protect_object (SCM OBJ)
4149     Protects OBJ from being freed by the garbage collector, when it
4150     otherwise might be.  When you are done with the object, call
4151     ‘scm_gc_unprotect_object’ on the object.  Calls to
4152     ‘scm_gc_protect_object’/‘scm_gc_unprotect_object’ can be nested,
4153     and the object remains protected until it has been unprotected as
4154     many times as it was protected.  It is an error to unprotect an
4155     object more times than it has been protected.  Returns the SCM
4156     object it was passed.
4157
4158     Note that storing OBJ in a C global variable has the same
4159     effect(1).
4160
4161 -- C Function: SCM scm_gc_unprotect_object (SCM OBJ)
4162
4163     Unprotects an object from the garbage collector which was protected
4164     by ‘scm_gc_protect_object’.  Returns the SCM object it was
4165     passed.
4166
4167 -- C Function: SCM scm_permanent_object (SCM OBJ)
4168
4169     Similar to ‘scm_gc_protect_object’ in that it causes the collector
4170     to always mark the object, except that it should not be nested
4171     (only call ‘scm_permanent_object’ on an object once), and it has no
4172     corresponding unpermanent function.  Once an object is declared
4173     permanent, it will never be freed.  Returns the SCM object it was
4174     passed.
4175
4176 -- C Macro: void scm_remember_upto_here_1 (SCM obj)
4177 -- C Macro: void scm_remember_upto_here_2 (SCM obj1, SCM obj2)
4178     Create a reference to the given object or objects, so they’re
4179     certain to be present on the stack or in a register and hence will
4180     not be freed by the garbage collector before this point.
4181
4182     Note that these macros can only be applied to ordinary C local
4183     variables (i.e., “automatics”).  Objects held in global or static
4184     variables or some malloced block or the like cannot be protected
4185     with this mechanism.
4186
4187 -- Scheme Procedure: gc-stats
4188 -- C Function: scm_gc_stats ()
4189     Return an association list of statistics about Guile’s current use
4190     of storage.
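
     As a rough sketch, individual statistics can be pulled out of the
     returned alist with ‘assq-ref’.  The helper ‘gc-stat’ is a
     hypothetical name, and the keys ‘gc-times’ and ‘heap-size’ are
     assumed from what current Guile releases report; the exact set of
     keys is not specified here.

```scheme
;; Look up KEY in the alist returned by `gc-stats'; #f if absent.
(define (gc-stat key)
  (assq-ref (gc-stats) key))

(gc)                                   ; force a collection first
(format #t "collections so far: ~a\n" (gc-stat 'gc-times))
(format #t "heap size in bytes: ~a\n" (gc-stat 'heap-size))
```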
4191
4192 -- Scheme Procedure: gc-live-object-stats
4193 -- C Function: scm_gc_live_object_stats ()
4194     Return an alist of statistics of the current live objects.
4195
4196 -- C Function: void scm_gc_mark (SCM X)
4197     Mark the object X, and recurse on any objects X refers to.  If X’s
4198     mark bit is already set, return immediately.  This function must
4199     only be called during the mark-phase of garbage collection,
4200     typically from a smob _mark_ function.
4201
4202   ---------- Footnotes ----------
4203
4204   (1) In Guile up to version 1.8, C global variables were not visited
4205by the garbage collector in the mark phase; hence,
4206‘scm_gc_protect_object’ was the only way in C to prevent a Scheme object
4207from being freed.
4208
4209
4210File: guile.info,  Node: Memory Blocks,  Next: Weak References,  Prev: Garbage Collection Functions,  Up: Memory Management
4211
42126.17.2 Memory Blocks
4213--------------------
4214
4215In C programs, dynamic management of memory blocks is normally done with
4216the functions ‘malloc’, ‘realloc’, and ‘free’.  Guile has additional
4217functions for dynamic memory allocation that are integrated into the
4218garbage collector and the error reporting system.
4219
4220   Memory blocks that are associated with Scheme objects (for example a
4221foreign object) should be allocated with ‘scm_gc_malloc’ or
4222‘scm_gc_malloc_pointerless’.  These two functions will either return a
4223valid pointer or signal an error.  Memory blocks allocated this way may
4224be released explicitly; however, this is not strictly needed, and we
4225recommend _not_ calling ‘scm_gc_free’.  All memory allocated with
4226‘scm_gc_malloc’ or ‘scm_gc_malloc_pointerless’ is automatically
4227reclaimed when the garbage collector no longer sees any live reference
4228to it(1).
4229
4230   When garbage collection occurs, Guile will visit the words in memory
4231allocated with ‘scm_gc_malloc’, looking for live pointers.  This means
4232that if ‘scm_gc_malloc’-allocated memory contains a pointer to some
4233other part of the memory, the garbage collector notices it and prevents
4234it from being reclaimed(2).  Conversely, memory allocated with
4235‘scm_gc_malloc_pointerless’ is assumed to be “pointer-less” and is not
4236scanned for pointers.
4237
4238   For memory that is not associated with a Scheme object, you can use
4239‘scm_malloc’ instead of ‘malloc’.  Like ‘scm_gc_malloc’, it will either
4240return a valid pointer or signal an error.  However, it will not assume
4241that the new memory block can be freed by a garbage collection.  The
4242memory must be explicitly freed with ‘free’.
4243
4244   There are also ‘scm_gc_realloc’ and ‘scm_realloc’, to be used in place
4245of ‘realloc’ when appropriate, and ‘scm_gc_calloc’ and ‘scm_calloc’, to
4246be used in place of ‘calloc’ when appropriate.
4247
4248   The function ‘scm_dynwind_free’ can be useful when memory should be
4249freed with libc’s ‘free’ when leaving a dynwind context (*note Dynamic
4250Wind::).
4251
4252 -- C Function: void * scm_malloc (size_t SIZE)
4253 -- C Function: void * scm_calloc (size_t SIZE)
4254     Allocate SIZE bytes of memory and return a pointer to it.  When
4255     SIZE is 0, return ‘NULL’.  When not enough memory is available,
4256     signal an error.  This function runs the GC to free up some memory
4257     when it deems it appropriate.
4258
4259     The memory is allocated by the libc ‘malloc’ function and can be
4260     freed with ‘free’.  There is no ‘scm_free’ function to go with
4261     ‘scm_malloc’ to make it easier to pass memory back and forth
4262     between different modules.
4263
4264     The function ‘scm_calloc’ is similar to ‘scm_malloc’, but
4265     initializes the block of memory to zero as well.
4266
4267     These functions will (indirectly) call
4268     ‘scm_gc_register_allocation’.
4269
4270 -- C Function: void * scm_realloc (void *MEM, size_t NEW_SIZE)
4271     Change the size of the memory block at MEM to NEW_SIZE and return
4272     its new location.  When NEW_SIZE is 0, this is the same as calling
4273     ‘free’ on MEM and ‘NULL’ is returned.  When MEM is ‘NULL’, this
4274     function behaves like ‘scm_malloc’ and allocates a new block of
4275     size NEW_SIZE.
4276
4277     When not enough memory is available, signal an error.  This
4278     function runs the GC to free up some memory when it deems it
4279     appropriate.
4280
4281     This function will call ‘scm_gc_register_allocation’.
4282
4283 -- C Function: void * scm_gc_malloc (size_t SIZE, const char *WHAT)
4284 -- C Function: void * scm_gc_malloc_pointerless (size_t SIZE, const
4285          char *WHAT)
4286 -- C Function: void * scm_gc_realloc (void *MEM, size_t OLD_SIZE,
4287          size_t NEW_SIZE, const char *WHAT)
4288 -- C Function: void * scm_gc_calloc (size_t SIZE, const char *WHAT)
4289     Allocate SIZE bytes of automatically-managed memory.  The memory is
4290     automatically freed when no longer referenced from any live memory
4291     block.
4292
4293     When garbage collection occurs, Guile will visit the words in
4294     memory allocated with ‘scm_gc_malloc’ or ‘scm_gc_calloc’, looking
4295     for pointers to other memory allocations that are managed by the
4296     GC. In contrast, memory allocated by ‘scm_gc_malloc_pointerless’ is
4297     not scanned for pointers.
4298
4299     The ‘scm_gc_realloc’ call preserves the “pointerlessness” of the
4300     memory area pointed to by MEM.  Note that you need to pass the old
4301     size of a reallocated memory block as well.  See below for a
4302     motivation.
4303
4304 -- C Function: void scm_gc_free (void *MEM, size_t SIZE, const char
4305          *WHAT)
4306     Explicitly free the memory block pointed to by MEM, which was
4307     previously allocated by one of the above ‘scm_gc’ functions.  This
4308     function is almost always unnecessary, except for codebases that
4309     still need to compile on Guile 1.8.
4310
4311     Note that you need to explicitly pass the SIZE parameter.  This is
4312     done since it should normally be easy to provide this parameter
4313     (for memory that is associated with GC controlled objects) and help
4314     keep the memory management overhead very low.  However, in Guile
4315     2.0 and later, SIZE is always ignored.
4316
4317 -- C Function: void scm_gc_register_allocation (size_t SIZE)
4318     Informs the garbage collector that SIZE bytes have been allocated,
4319     which the collector would otherwise not have known about.
4320
4321     In general, Scheme will decide to collect garbage only after some
4322     amount of memory has been allocated.  Calling this function will
4323     make the Scheme garbage collector know about more allocation, and
4324     thus run more often (as appropriate).
4325
4326     It is especially important to call this function when large
4327     unmanaged allocations, like images, may be freed by small Scheme
4328     allocations, like foreign objects.
4329
4330 -- C Function: void scm_dynwind_free (void *mem)
4331     Equivalent to ‘scm_dynwind_unwind_handler (free, MEM,
4332     SCM_F_WIND_EXPLICITLY)’.  That is, the memory block at MEM will be
4333     freed (using ‘free’ from the C library) when the current dynwind is
4334     left.
4335
4336 -- Scheme Procedure: malloc-stats
4337     Return an alist ((WHAT . N) ...) describing the number of malloced
4338     objects.  WHAT is the second argument to ‘scm_gc_malloc’, and N is the
4339     number of objects of that type currently allocated.
4340
4341     This function is only available if the ‘GUILE_DEBUG_MALLOC’
4342     preprocessor macro was defined when Guile was compiled.
4343
4344   ---------- Footnotes ----------
4345
4346   (1) In Guile up to version 1.8, memory allocated with ‘scm_gc_malloc’
4347_had_ to be freed with ‘scm_gc_free’.
4348
4349   (2) In Guile up to 1.8, memory allocated with ‘scm_gc_malloc’ was
4350_not_ visited by the collector in the mark phase.  Consequently, the GC
4351had to be told explicitly about pointers to live objects contained in
4352the memory block, e.g., via SMOB mark functions (*note
4353‘scm_set_smob_mark’: Smobs.)
4354
4355
4356File: guile.info,  Node: Weak References,  Next: Guardians,  Prev: Memory Blocks,  Up: Memory Management
4357
43586.17.3 Weak References
4359----------------------
4360
4361[FIXME: This chapter is based on Mikael Djurfeldt’s answer to a question
4362by Michael Livshin.  Any mistakes are not theirs, of course.]
4363
4364   Weak references let you attach bookkeeping information to data so
4365that the additional information automatically disappears when the
4366original data is no longer in use and gets garbage collected.  In a weak
4367key hash, the hash entry for that key disappears as soon as the key is
4368no longer referenced from anywhere else.  For weak value hashes, the
4369same happens as soon as the value is no longer in use.  Entries in a
4370doubly weak hash disappear when either the key or the value is no longer
4371used anywhere else.
4372
4373   Object properties offer the same kind of functionality as weak key
4374hashes in many situations (*note Object Properties::).
4375
4376   Here’s an example (a little bit strained perhaps, but one of the
4377examples is actually used in Guile):
4378
4379   Assume that you’re implementing a debugging system where you want to
4380associate information about filename and position of source code
4381expressions with the expressions themselves.
4382
4383   Hash tables can be used for that, but if you use ordinary hash tables
4384it will be impossible for the Scheme interpreter to "forget" old source
4385when, for example, a file is reloaded.
4386
4387   To implement the mapping from source code expressions to positional
4388information it is necessary to use weak-key tables since we don’t want
4389the expressions to be remembered just because they are in our table.
4390
4391   To implement a mapping from source file line numbers to source code
4392expressions you would use a weak-value table.
4393
4394   To implement a mapping from source code expressions to the procedures
4395they constitute, a doubly-weak table has to be used.
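
   The first two mappings above might be sketched as follows.  The
names ‘expr->position’, ‘line->expr’, and ‘record-source!’, and the file
name ‘"demo.scm"’, are hypothetical and chosen only for illustration.

```scheme
;; Expression -> position.  Weak keys: an entry vanishes once its
;; expression is no longer referenced elsewhere.
(define expr->position (make-weak-key-hash-table))

;; Line number -> expression.  Weak values: the table alone does not
;; keep an expression alive.
(define line->expr (make-weak-value-hash-table))

(define (record-source! expr file line)
  (hashq-set! expr->position expr (cons file line))
  (hashv-set! line->expr line expr))

;; EXPR stays reachable through this variable, so both entries persist.
(define expr (list '+ 1 2))
(record-source! expr "demo.scm" 12)

(hashq-ref expr->position expr)   ; the pair ("demo.scm" . 12)
(hashv-ref line->expr 12)         ; the same object as EXPR
```

   Once nothing else refers to an expression, the collector is free to
drop both entries; when that happens is up to the collector.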
4396
4397* Menu:
4398
4399* Weak hash tables::
4400* Weak vectors::
4401
4402
4403File: guile.info,  Node: Weak hash tables,  Next: Weak vectors,  Up: Weak References
4404
44056.17.3.1 Weak hash tables
4406.........................
4407
4408 -- Scheme Procedure: make-weak-key-hash-table [size]
4409 -- Scheme Procedure: make-weak-value-hash-table [size]
4410 -- Scheme Procedure: make-doubly-weak-hash-table [size]
4411 -- C Function: scm_make_weak_key_hash_table (size)
4412 -- C Function: scm_make_weak_value_hash_table (size)
4413 -- C Function: scm_make_doubly_weak_hash_table (size)
4414     Return a weak hash table with SIZE buckets.  As with any hash
4415     table, choosing a good size for the table requires some caution.
4416
4417     You can modify weak hash tables in exactly the same way you would
4418     modify regular hash tables, with the exception of the routines that
4419     act on handles.  Weak tables have a different implementation behind
4420     the scenes that doesn’t have handles.  *Note Hash Tables::, for
4421     more on ‘hashq-ref’ et al.
4422
4423   Note that in a weak-key hash table, the reference to the value is
4424strong.  This means that if the value references the key, even
4425indirectly, the key will never be collected, which can lead to a memory
4426leak.  The reverse is true for weak value tables.
4427
4428 -- Scheme Procedure: weak-key-hash-table? obj
4429 -- Scheme Procedure: weak-value-hash-table? obj
4430 -- Scheme Procedure: doubly-weak-hash-table? obj
4431 -- C Function: scm_weak_key_hash_table_p (obj)
4432 -- C Function: scm_weak_value_hash_table_p (obj)
4433 -- C Function: scm_doubly_weak_hash_table_p (obj)
4434     Return ‘#t’ if OBJ is a weak hash table of the corresponding kind.
4435     Note that a doubly weak hash table is neither a weak key nor a weak
4436     value hash table.
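
     A short illustration of the distinction, using the hypothetical
     variable names ‘wk’ and ‘dw’:

```scheme
(define wk (make-weak-key-hash-table))
(define dw (make-doubly-weak-hash-table))

(weak-key-hash-table? wk)      ; #t
(doubly-weak-hash-table? dw)   ; #t
(weak-key-hash-table? dw)      ; #f: doubly weak is its own kind
(weak-value-hash-table? dw)    ; #f
```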
4437
4438
4439File: guile.info,  Node: Weak vectors,  Prev: Weak hash tables,  Up: Weak References
4440
44416.17.3.2 Weak vectors
4442.....................
4443
4444 -- Scheme Procedure: make-weak-vector size [fill]
4445 -- C Function: scm_make_weak_vector (size, fill)
4446     Return a weak vector with SIZE elements.  If the optional argument
4447     FILL is given, all entries in the vector will be set to FILL.  The
4448     default value for FILL is the empty list.
4449
4450 -- Scheme Procedure: weak-vector elem ...
4451 -- Scheme Procedure: list->weak-vector l
4452 -- C Function: scm_weak_vector (l)
4453     Construct a weak vector from a list: ‘weak-vector’ uses the list of
4454     its arguments while ‘list->weak-vector’ uses its only argument L (a
4455     list) to construct a weak vector the same way ‘list->vector’ would.
4456
4457 -- Scheme Procedure: weak-vector? obj
4458 -- C Function: scm_weak_vector_p (obj)
4459     Return ‘#t’ if OBJ is a weak vector.
4460
4461 -- Scheme Procedure: weak-vector-ref wvect k
4462 -- C Function: scm_weak_vector_ref (wvect, k)
4463     Return the Kth element of the weak vector WVECT, or ‘#f’ if that
4464     element has been collected.
4465
4466 -- Scheme Procedure: weak-vector-set! wvect k elt
4467 -- C Function: scm_weak_vector_set_x (wvect, k, elt)
4468     Set the Kth element of the weak vector WVECT to ELT.
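
   A minimal sketch of these procedures in use.  It imports ‘(ice-9
weak-vector)’ on the assumption that recent Guile releases provide the
weak vector procedures through that module; the names ‘wv’ and ‘item’
are hypothetical.

```scheme
(use-modules (ice-9 weak-vector))

(define wv (make-weak-vector 2 #f))
(define item (list 'payload))       ; kept alive by this variable
(weak-vector-set! wv 0 item)

(weak-vector? wv)                   ; #t
(weak-vector? (vector 1 2))         ; #f
(eq? (weak-vector-ref wv 0) item)   ; #t while ITEM is referenced
(weak-vector-ref wv 1)              ; #f, the fill value given above
```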
4469
4470
4471File: guile.info,  Node: Guardians,  Prev: Weak References,  Up: Memory Management
4472
44736.17.4 Guardians
4474----------------
4475
4476Guardians provide a way to be notified about objects that would
4477otherwise be collected as garbage.  Guarding them prevents the objects
4478from being collected, so that cleanup actions, for example, can be
4479performed on them.
4480
4481   See R. Kent Dybvig, Carl Bruggeman, and David Eby (1993) "Guardians
4482in a Generation-Based Garbage Collector".  ACM SIGPLAN Conference on
4483Programming Language Design and Implementation, June 1993.
4484
4485 -- Scheme Procedure: make-guardian
4486 -- C Function: scm_make_guardian ()
4487     Create a new guardian.  A guardian protects a set of objects from
4488     garbage collection, allowing a program to apply cleanup or other
4489     actions.
4490
4491     ‘make-guardian’ returns a procedure representing the guardian.
4492     Calling the guardian procedure with an argument adds the argument
4493     to the guardian’s set of protected objects.  Calling the guardian
4494     procedure without an argument returns one of the protected objects
4495     which are ready for garbage collection, or ‘#f’ if no such object
4496     is available.  Objects which are returned in this way are removed
4497     from the guardian.
4498
4499     You can put a single object into a guardian more than once and you
4500     can put a single object into more than one guardian.  The object
4501     will then be returned multiple times by the guardian procedures.
4502
4503     An object is eligible to be returned from a guardian when it is no
4504     longer referenced from outside any guardian.
4505
4506     There is no guarantee about the order in which objects are returned
4507     from a guardian.  If you want to impose an order on finalization
4508     actions, for example, you can do that by keeping objects alive in
4509     some global data structure until they are no longer needed for
4510     finalizing other objects.
4511
4512     Being an element in a weak vector, a key in a hash table with weak
4513     keys, or a value in a hash table with weak values does not prevent
4514     an object from being returned by a guardian.  But as long as an
4515     object can be returned from a guardian it will not be removed from
4516     such a weak vector or hash table.  In other words, a weak link does
4517     not prevent an object from being considered collectable, but being
4518     inside a guardian prevents a weak link from being broken.
4519
4520     A key in a weak key hash table can be thought of as having a strong
4521     reference to its associated value as long as the key is accessible.
4522     Consequently, when the key is only accessible from within a
4523     guardian, the reference from the key to the value is also
4524     considered to be coming from within a guardian.  Thus, if there is
4525     no other reference to the value, it is eligible to be returned from
4526     a guardian.
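
   The calling convention can be sketched as below.  The helper
‘drain-guardian!’ and the variable names are hypothetical.  Whether the
registered object is actually handed back after the ‘gc’ call depends on
the collector, so the sketch makes no claim about the count it returns.

```scheme
(define guardian (make-guardian))

;; Call CLEANUP on every object GUARDIAN is ready to hand back and
;; return how many objects were processed.
(define (drain-guardian! guardian cleanup)
  (let loop ((count 0))
    (let ((obj (guardian)))
      (if obj
          (begin (cleanup obj) (loop (+ count 1)))
          count))))

;; Register an object, then drop the only outside reference to it.
(let ((resource (list 'temporary-buffer)))
  (guardian resource))

(gc)  ; a collection may (but need not) make the object available now
(drain-guardian! guardian
                 (lambda (obj) (format #t "cleaning up ~s\n" obj)))
```

   Calling such a drain procedure periodically, for example from an
event loop, is one way to run cleanup actions at a time the program
chooses.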
4527
4528
4529File: guile.info,  Node: Modules,  Next: Foreign Function Interface,  Prev: Memory Management,  Up: API Reference
4530
45316.18 Modules
4532============
4533
4534When programs become large, naming conflicts can occur when a function
4535or global variable defined in one file has the same name as a function
4536or global variable in another file.  Even just a _similarity_ between
4537function names can cause hard-to-find bugs, since a programmer might
4538type the wrong function name.
4539
4540   The approach used to tackle this problem is called _information
4541encapsulation_, which consists of packaging functional units into a
4542given name space that is clearly separated from other name spaces.
4543
4544   The language features that allow this are usually called _the module
4545system_ because programs are broken up into modules that are compiled
4546separately (or loaded separately in an interpreter).
4547
4548   Older languages, like C, have limited support for name space
4549manipulation and protection.  In C a variable or function is public by
4550default, and can be made local to a module with the ‘static’ keyword.
4551But you cannot reference public variables and functions from another
4552module with different names.
4553
4554   More advanced module systems have become a common feature in recently
4555designed languages: ML, Python, Perl, and Modula 3 all allow the
4556_renaming_ of objects from a foreign module, so they will not clutter
4557the global name space.
4558
4559   In addition, Guile offers variables as first-class objects.  They can
4560be used for interacting with the module system.
4561
4562* Menu:
4563
4564* General Information about Modules::  Guile module basics.
4565* Using Guile Modules::         How to use existing modules.
4566* Creating Guile Modules::      How to package your code into modules.
4567* Modules and the File System:: Installing modules in the file system.
4568* R6RS Version References::     Using version numbers with modules.
4569* R6RS Libraries::              The library and import forms.
4570* Variables::                   First-class variables.
4571* Module System Reflection::    First-class modules.
4572* Declarative Modules::         Allowing Guile to reason about modules.
4573* Accessing Modules from C::    How to work with modules with C code.
4574* provide and require::         The SLIB feature mechanism.
4575* Environments::                R5RS top-level environments.
4576
4577
4578File: guile.info,  Node: General Information about Modules,  Next: Using Guile Modules,  Up: Modules
4579
45806.18.1 General Information about Modules
4581----------------------------------------
4582
4583A Guile module can be thought of as a collection of named procedures,
4584variables and macros.  More precisely, it is a set of “bindings” of
4585symbols (names) to Scheme objects.
4586
4587   Within a module, all bindings are visible.  Certain bindings can be
4588declared “public”, in which case they are added to the module’s
4589so-called “export list”; this set of public bindings is called the
4590module’s “public interface” (*note Creating Guile Modules::).
4591
4592   A client module “uses” a providing module’s bindings by either
4593accessing the providing module’s public interface, or by building a
4594custom interface (and then accessing that).  In a custom interface, the
4595client module can “select” which bindings to access and can also
4596algorithmically “rename” bindings.  In contrast, when using the
4597providing module’s public interface, the entire export list is available
4598without renaming (*note Using Guile Modules::).
4599
4600   All Guile modules have a unique “module name”, for example ‘(ice-9
4601popen)’ or ‘(srfi srfi-11)’.  Module names are lists of one or more
4602symbols.
4603
4604   When Guile goes to use an interface from a module, for example
4605‘(ice-9 popen)’, Guile first looks to see if it has loaded ‘(ice-9
4606popen)’ for any reason.  If the module has not been loaded yet, Guile
4607searches a “load path” for a file that might define it, and loads that
4608file.
4609
4610   The following subsections go into more detail on using, creating,
4611installing, and otherwise manipulating modules and the module system.
4612
4613
4614File: guile.info,  Node: Using Guile Modules,  Next: Creating Guile Modules,  Prev: General Information about Modules,  Up: Modules
4615
46166.18.2 Using Guile Modules
4617--------------------------
4618
4619To use a Guile module is to access either its public interface or a
4620custom interface (*note General Information about Modules::).  Both
4621types of access are handled by the syntactic form ‘use-modules’, which
4622accepts one or more interface specifications and, upon evaluation,
4623arranges for those interfaces to be available to the current module.
4624This process may include locating and loading code for a given module if
4625that code has not yet been loaded, following ‘%load-path’ (*note Modules
4626and the File System::).
4627
4628   An “interface specification” has one of two forms.  The first
4629variation is simply to name the module, in which case its public
4630interface is the one accessed.  For example:
4631
4632     (use-modules (ice-9 popen))
4633
4634   Here, the interface specification is ‘(ice-9 popen)’, and the result
4635is that the current module now has access to ‘open-pipe’, ‘close-pipe’,
4636‘open-input-pipe’, and so on (*note Pipes::).
4637
4638   Note in the previous example that if the current module had already
4639defined ‘open-pipe’, that definition would be overwritten by the
4640definition in ‘(ice-9 popen)’.  For this reason (and others), there is a
4641second variation of interface specification that not only names a module
4642to be accessed, but also selects bindings from it and renames them to
4643suit the current module’s needs.  For example:
4644
4645     (use-modules ((ice-9 popen)
4646                   #:select ((open-pipe . pipe-open) close-pipe)
4647                   #:renamer (symbol-prefix-proc 'unixy:)))
4648
4649or more simply:
4650
4651     (use-modules ((ice-9 popen)
4652                   #:select ((open-pipe . pipe-open) close-pipe)
4653                   #:prefix unixy:))
4654
4655   Here, the interface specification is more complex than before, and
4656the result is that a custom interface with only two bindings is created
4657and subsequently accessed by the current module.  The mapping of old to
4658new names is as follows:
4659
4660     (ice-9 popen) sees:             current module sees:
4661     open-pipe                       unixy:pipe-open
4662     close-pipe                      unixy:close-pipe
4663
4664   This example also shows how to use the convenience procedure
4665‘symbol-prefix-proc’.
4666
4667   You can also directly refer to bindings in a module by using the ‘@’
4668syntax.  For example, instead of using the ‘use-modules’ statement from
4669above and writing ‘unixy:pipe-open’ to refer to ‘open-pipe’ from the
4670‘(ice-9 popen)’ module, you could also write ‘(@ (ice-9 popen) open-pipe)’.
4671Thus an alternative to the complete ‘use-modules’ statement would be
4672
4673     (define unixy:pipe-open (@ (ice-9 popen) open-pipe))
4674     (define unixy:close-pipe (@ (ice-9 popen) close-pipe))
4675
4676   There is also ‘@@’, which can be used like ‘@’, but does not check
4677whether the variable that is being accessed is actually exported.  Thus,
4678‘@@’ can be thought of as the impolite version of ‘@’ and should only be
4679used as a last resort or for debugging, for example.
4680
4681   Note that just as with a ‘use-modules’ statement, any module that has
4682not yet been loaded will be loaded when referenced by a ‘@’ or ‘@@’
4683form.
4684
4685   You can also use the ‘@’ and ‘@@’ syntaxes as the target of a ‘set!’
4686when the binding refers to a variable.
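
   The following sketch shows ‘@’, ‘@@’, and ‘set!’ together.  The
modules ‘(demo counter)’ and ‘(demo client)’ and their bindings are
hypothetical, defined inline in one file purely for demonstration; real
modules would normally live in separate files.

```scheme
;; A small module defined inline purely for demonstration.
(define-module (demo counter)
  #:export (hits))

(define hits 0)             ; exported
(define secret 'internal)   ; not exported

;; Switch to a second, unrelated module acting as the client.
(define-module (demo client))

(set! (@ (demo counter) hits) 5)   ; assign through the public interface
(@ (demo counter) hits)            ; 5
(@@ (demo counter) secret)         ; internal, despite not being exported
```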
4687
4688 -- Scheme Procedure: symbol-prefix-proc prefix-sym
4689     Return a procedure that prefixes its arg (a symbol) with
4690     PREFIX-SYM.
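
     For instance, with the hypothetical name ‘add-unixy-prefix’ for the
     returned procedure:

```scheme
(define add-unixy-prefix (symbol-prefix-proc 'unixy:))

(add-unixy-prefix 'open-pipe)    ; the symbol unixy:open-pipe
(add-unixy-prefix 'close-pipe)   ; the symbol unixy:close-pipe
```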
4691
4692 -- syntax: use-modules spec ...
4693     Resolve each interface specification SPEC into an interface and
4694     arrange for these to be accessible by the current module.  The
4695     return value is unspecified.
4696
4697     SPEC can be a list of symbols, in which case it names a module
4698     whose public interface is found and used.
4699
4700     SPEC can also be of the form:
4701
4702           (MODULE-NAME [#:select SELECTION]
4703                        [#:prefix PREFIX]
4704                        [#:renamer RENAMER])
4705
4706     in which case a custom interface is newly created and used.
4707     MODULE-NAME is a list of symbols, as above; SELECTION is a list of
4708     selection-specs; PREFIX is a symbol that is prepended to imported
4709     names; and RENAMER is a procedure that takes a symbol and returns
4710     its new name.  A selection-spec is either a symbol or a pair of
4711     symbols ‘(ORIG . SEEN)’, where ORIG is the name in the used module
4712     and SEEN is the name in the using module.  Note that SEEN is also
4713     modified by PREFIX and RENAMER.
4714
4715     The ‘#:select’, ‘#:prefix’, and ‘#:renamer’ clauses are optional.
4716     If all are omitted, the returned interface has no bindings.  If the
4717     ‘#:select’ clause is omitted, PREFIX and RENAMER operate on the
4718     used module’s public interface.
4719
4720     In addition to the above, SPEC can also include a ‘#:version’
4721     clause, of the form:
4722
4723           #:version VERSION-SPEC
4724
4725     where VERSION-SPEC is an R6RS-compatible version reference.  An
4726     error will be signaled in the case in which a module with the same
4727     name has already been loaded, if that module specifies a version
4728     and that version is not compatible with VERSION-SPEC.  *Note R6RS
4729     Version References::, for more on version references.
4730
4731     If the module name is not resolvable, ‘use-modules’ will signal an
4732     error.
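
     As a further sketch of how ‘#:select’ and ‘#:prefix’ combine, the
     following imports two bindings from ‘(srfi srfi-1)’.  The choice of
     module, the prefix ‘l:’, and the new name ‘dedupe’ are arbitrary
     and serve only as illustration.

```scheme
(use-modules ((srfi srfi-1)
              #:select (fold (delete-duplicates . dedupe))
              #:prefix l:))

(l:fold + 0 '(1 2 3))        ; 6
(l:dedupe '(a b a c b))      ; (a b c)
```

     Note how the renamed binding ‘dedupe’ also receives the prefix,
     since SEEN is modified by PREFIX.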
4733
4734 -- syntax: @ module-name binding-name
4735     Refer to the binding named BINDING-NAME in module MODULE-NAME.  The
4736     binding must have been exported by the module.
4737
4738 -- syntax: @@ module-name binding-name
4739     Refer to the binding named BINDING-NAME in module MODULE-NAME.  The
4740     binding need not have been exported by the module.  This syntax is
4741     only intended for debugging purposes or as a last resort.  *Note
4742     Declarative Modules::, for some limitations on the use of ‘@@’.
4743
4744
4745File: guile.info,  Node: Creating Guile Modules,  Next: Modules and the File System,  Prev: Using Guile Modules,  Up: Modules
4746
47476.18.3 Creating Guile Modules
4748-----------------------------
4749
4750When you want to create your own modules, you have to take the following
4751steps:
4752
4753   • Create a Scheme source file and add all variables and procedures
4754     you wish to export, or which are required by the exported
4755     procedures.
4756
4757   • Add a ‘define-module’ form at the beginning.
4758
4759   • Export all bindings which should be in the public interface, either
4760     by using ‘define-public’ or ‘export’ (both documented below).
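
   Put together, the steps above might look like this for a hypothetical
module ‘(demo shapes)’.  All names are invented for illustration, and
the sketch uses the ‘#:export’ option of ‘define-module’ in place of a
separate ‘export’ form.

```scheme
;; Contents of a hypothetical file demo/shapes.scm
(define-module (demo shapes)
  #:export (square-area))          ; exported via `#:export'

(define scale 1)                   ; private: not exported

(define (square-area side)
  (* scale side side))

(define-public (cube-volume side)  ; defined and exported in one step
  (* scale side side side))
```

   A client that evaluates ‘(use-modules (demo shapes))’ would then see
‘square-area’ and ‘cube-volume’, but not ‘scale’.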
4761
4762 -- syntax: define-module module-name option ...
4763     MODULE-NAME is a list of one or more symbols.
4764
4765          (define-module (ice-9 popen))
4766
4767     ‘define-module’ makes this module available to Guile programs under
4768     the given MODULE-NAME.
4769
4770     OPTION ... are keyword/value pairs which specify more about the
4771     defined module.  The recognized options and their meaning are shown
4772     in the following table.
4773
4774     ‘#:use-module INTERFACE-SPECIFICATION’
4775          Equivalent to a ‘(use-modules INTERFACE-SPECIFICATION)’ (*note
4776          Using Guile Modules::).
4777
4778     ‘#:autoload MODULE SYMBOL-LIST’
4779          Load MODULE when any of SYMBOL-LIST are accessed.  For
4780          example,
4781
4782               (define-module (my mod)
4783                 #:autoload (srfi srfi-1) (partition delete-duplicates))
4784               ...
4785               (when something
4786                 (set! foo (delete-duplicates ...)))
4787
4788          When a module is autoloaded, only the bindings in SYMBOL-LIST
4789          become available(1).
4790
4791          An autoload is a good way to put off loading a big module
4792          until it’s really needed, for instance for faster startup or
4793          if it will only be needed in certain circumstances.
4794
4795     ‘#:export LIST’
4796          Export all identifiers in LIST which must be a list of symbols
4797          or pairs of symbols.  This is equivalent to ‘(export LIST)’ in
4798          the module body.
4799
4800     ‘#:re-export LIST’
4801          Re-export all identifiers in LIST which must be a list of
4802          symbols or pairs of symbols.  The symbols in LIST must be
4803          imported by the current module from other modules.  This is
4804          equivalent to ‘re-export’ below.
4805
4806     ‘#:replace LIST’
4807          Export all identifiers in LIST (a list of symbols or pairs of
4808          symbols) and mark them as “replacing bindings”.  In the module
4809          user’s name space, this will have the effect of replacing any
4810          binding with the same name that is not also “replacing”.
4811          Normally a replacement results in an “override” warning
4812          message; ‘#:replace’ avoids that.
4813
4814          In general, a module that exports a binding for which the
4815          ‘(guile)’ module already has a definition should use
4816          ‘#:replace’ instead of ‘#:export’.  ‘#:replace’, in a sense,
4817          lets Guile know that the module _purposefully_ replaces a core
4818          binding.  It is important to note, however, that this binding
4819          replacement is confined to the name space of the module user.
4820          In other words, the value of the core binding in question
4821          remains unchanged for other modules.
4822
4823          Note that although it is often a good idea for the replaced
4824          binding to remain compatible with a binding in ‘(guile)’, to
4825          avoid surprising the user, sometimes the bindings will be
4826          incompatible.  For example, SRFI-19 exports its own version of
4827          ‘current-time’ (*note SRFI-19 Time::) which is not compatible
4828          with the core ‘current-time’ function (*note Time::).  Guile
4829          assumes that a user importing a module knows what she is
4830          doing, and uses ‘#:replace’ for this binding rather than
4831          ‘#:export’.
4832
4833          A ‘#:replace’ clause is equivalent to ‘(export! LIST)’ in the
4834          module body.
4835
4836          The ‘#:duplicates’ option (see below) provides fine-grained
4837          control over duplicate binding handling on the module-user side.
4838
4839     ‘#:re-export-and-replace LIST’
4840          Like ‘#:re-export’, but also marking the bindings as
4841          replacements in the sense of ‘#:replace’.
4842
4843     ‘#:version LIST’
4844          Specify a version for the module in the form of LIST, a list
4845          of zero or more exact, nonnegative integers.  The
4846          corresponding ‘#:version’ option in the ‘use-modules’ form
4847          allows callers to restrict the value of this option in various
4848          ways.
4849
4850     ‘#:duplicates LIST’
4851          Tell Guile to handle duplicate bindings for the bindings
4852          imported by the current module according to the policy defined
4853          by LIST, a list of symbols.  LIST must contain symbols
4854          representing a duplicate binding handling policy chosen among
4855          the following:
4856
4857          ‘check’
4858               Raise an error when a binding is imported from more than
4859               one place.
4860          ‘warn’
4861               Issue a warning when a binding is imported from more than
4862               one place and leave the responsibility of actually
4863               handling the duplication to the next duplicate binding
4864               handler.
4865          ‘replace’
4866               When a new binding is imported that has the same name as
4867               a previously imported binding, then do the following:
4868
4869                 1. If the old binding was said to be “replacing” (via
4870                    the ‘#:replace’ option above) and the new binding is
4871                    not replacing, then keep the old binding.
4872                 2. If the old binding was not said to be replacing and
4873                    the new binding is replacing, then replace the old
4874                    binding with the new one.
4875                 3. If neither the old nor the new binding is replacing,
4876                    then keep the old one.
4877
4878          ‘warn-override-core’
4879               Issue a warning when a core binding is being overwritten
4880               and actually override the core binding with the new one.
4881          ‘first’
4882               In case of duplicate bindings, the binding imported first
4883               is always the one which is kept.
4884          ‘last’
4885               In case of duplicate bindings, the binding imported last
4886               is always the one which is kept.
4887          ‘noop’
4888               In case of duplicate bindings, leave the responsibility
4889               to the next duplicate handler.
4890
4891          If LIST contains more than one symbol, then the duplicate
4892          binding handlers which appear first will be used first when
4893          resolving a duplicate binding situation.  As mentioned above,
4894          some resolution policies may explicitly leave the
4895          responsibility of handling the duplication to the next handler
4896          in LIST.
4897
4898          If GOOPS has been loaded before the ‘#:duplicates’ clause is
4899          processed, there are additional strategies available for
4900          dealing with generic functions.  *Note Merging Generics::, for
4901          more information.
4902
4903          The default duplicate binding resolution policy is given by
4904          the ‘default-duplicate-binding-handler’ procedure, and is
4905
4906               (replace warn-override-core warn last)
4907
4908     ‘#:pure’
4909          Create a “pure” module, that is a module which does not
4910          contain any of the standard procedure bindings except for the
4911          syntax forms.  This is useful if you want to create “safe”
4912          modules, that is modules which do not know anything about
4913          dangerous procedures.
4914
4915 -- syntax: export variable ...
4916     Add all VARIABLEs (which must be symbols or pairs of symbols) to
4917     the list of exported bindings of the current module.  If VARIABLE
4918     is a pair, its ‘car’ gives the name of the variable as seen by the
4919     current module and its ‘cdr’ specifies a name for the binding in
4920     the current module’s public interface.
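
     A brief sketch of the pair form, assuming a hypothetical module
     ‘(example renamed)’: the ‘car’ of the pair is the local name and
     the ‘cdr’ is the name seen by users of the module.

```scheme
;; Hypothetical module: `internal-add' is visible to clients as `add'.
(define-module (example renamed))

(define (internal-add a b) (+ a b))

(export (internal-add . add))

;; Inspect the public interface reflectively; a client would normally
;; just write (use-modules (example renamed)).
(define iface (resolve-interface '(example renamed)))

(display ((module-ref iface 'add) 1 2))           ; prints 3
(newline)
(display (module-variable iface 'internal-add))   ; prints #f
(newline)
```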
4921
4922 -- syntax: define-public ...
4923     Equivalent to ‘(begin (define foo ...) (export foo))’.
4924
4925 -- syntax: re-export variable ...
4926     Add all VARIABLEs (which must be symbols or pairs of symbols) to
4927     the list of re-exported bindings of the current module.  Pairs of
4928     symbols are handled as in ‘export’.  Re-exported bindings must be
4929     imported by the current module from some other module.
4930
4931 -- syntax: export! variable ...
4932     Like ‘export’, but marking the exported variables as replacing.
4933     Using a module with replacing bindings will cause any existing
4934     bindings to be replaced without issuing any warnings.  See the
4935     discussion of ‘#:replace’ above.
4936
4937   ---------- Footnotes ----------
4938
4939   (1) In Guile 2.2 and earlier, _all_ the module bindings would become
4940available; SYMBOL-LIST was just the list of bindings that will first
4941trigger the load.
4942
4943
4944File: guile.info,  Node: Modules and the File System,  Next: R6RS Version References,  Prev: Creating Guile Modules,  Up: Modules
4945
49466.18.4 Modules and the File System
4947----------------------------------
4948
4949Typical programs only use a small subset of modules installed on a Guile
4950system.  In order to keep startup time down, Guile only loads modules
4951when a program uses them, on demand.
4952
4953   When a program evaluates ‘(use-modules (ice-9 popen))’, and the
4954module is not loaded, Guile searches for a conventionally-named file
4955in the “load path”.
4956
4957   In this case, loading ‘(ice-9 popen)’ will eventually cause Guile to
4958run ‘(primitive-load-path "ice-9/popen")’.  ‘primitive-load-path’ will
4959search for a file ‘ice-9/popen’ in the ‘%load-path’ (*note Load
4960Paths::).  For each directory in ‘%load-path’, Guile will try to find
4961the file name, concatenated with the extensions from ‘%load-extensions’.
4962By default, this will cause Guile to ‘stat’ ‘ice-9/popen.scm’, and then
4963‘ice-9/popen’.  *Note Load Paths::, for more on ‘primitive-load-path’.
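
   The search described above can be observed from Scheme.  The
following sketch prints the variables involved and uses
‘%search-load-path’ to look up the same file name.  The output depends
on the installation, so none is shown; the name ‘popen-source’ is
illustrative only.

```scheme
;; Directories searched, in order.
(for-each (lambda (dir) (display dir) (newline)) %load-path)

;; Extensions tried in each directory.
(write %load-extensions)
(newline)

;; Full name of the source file found for "ice-9/popen" in %load-path,
;; or #f if there is none.
(define popen-source (%search-load-path "ice-9/popen"))
(write popen-source)
(newline)
```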
4964
4965   If a corresponding compiled ‘.go’ file is found in the
4966‘%load-compiled-path’ or in the fallback path, and is as fresh as the
4967source file, it will be loaded instead of the source file.  If no
4968compiled file is found, Guile may try to compile the source file and
4969cache away the resulting ‘.go’ file.  *Note Compilation::, for more on
4970compilation.
4971
4972   Once a suitable source or compiled file is found, the
4973file will be loaded.  If, after loading the file, the module under
4974consideration is still not defined, Guile will signal an error.
4975
4976   For more information on where and how to install Scheme modules,
4977*note Installing Site Packages::.
4978
4979
4980File: guile.info,  Node: R6RS Version References,  Next: R6RS Libraries,  Prev: Modules and the File System,  Up: Modules
4981
49826.18.5 R6RS Version References
4983------------------------------
4984
4985Guile’s module system includes support for locating modules based on a
4986declared version specifier of the same form as the one described in R6RS
4987(*note R6RS Library Form: (r6rs)Library form.).  By using the
4988‘#:version’ keyword in a ‘define-module’ form, a module may specify a
4989version as a list of zero or more exact, nonnegative integers.
4990
4991   This version can then be used to locate the module during the module
4992search process.  Client modules and callers of the ‘use-modules’
4993function may specify constraints on the versions of target modules by
4994providing a “version reference”, which has one of the following forms:
4995
4996      (SUB-VERSION-REFERENCE ...)
4997      (and VERSION-REFERENCE ...)
4998      (or VERSION-REFERENCE ...)
4999      (not VERSION-REFERENCE)
5000
5001   in which SUB-VERSION-REFERENCE is in turn one of:
5002
5003      SUB-VERSION
5004      (>= SUB-VERSION)
5005      (<= SUB-VERSION)
5006      (and SUB-VERSION-REFERENCE ...)
5007      (or SUB-VERSION-REFERENCE ...)
5008      (not SUB-VERSION-REFERENCE)
5009
5010   in which SUB-VERSION is an exact, nonnegative integer as above.  A
5011version reference matches a declared module version if each element of
5012the version reference matches a corresponding element of the module
5013version, according to the following rules:
5014
5015   • The ‘and’ sub-form matches a version or version element if every
5016     element in the tail of the sub-form matches the specified version
5017     or version element.
5018
5019   • The ‘or’ sub-form matches a version or version element if any
5020     element in the tail of the sub-form matches the specified version
5021     or version element.
5022
5023   • The ‘not’ sub-form matches a version or version element if the tail
5024     of the sub-form does not match the version or version element.
5025
5026   • The ‘>=’ sub-form matches a version element if the element is
5027     greater than or equal to the SUB-VERSION in the tail of the
5028     sub-form.
5029
5030   • The ‘<=’ sub-form matches a version element if the element is less
5031     than or equal to the SUB-VERSION in the tail of the sub-form.
5032
5033   • A SUB-VERSION matches a version element if one is EQV? to the
5034     other.
5035
5036   For example, a module declared as:
5037
5038      (define-module (mylib mymodule) #:version (1 2 0))
5039
5040   would be successfully loaded by any of the following ‘use-modules’
5041expressions:
5042
5043      (use-modules ((mylib mymodule) #:version (1 2 (>= 0))))
5044      (use-modules ((mylib mymodule) #:version (or (1 2 0) (1 2 1))))
5045      (use-modules ((mylib mymodule) #:version ((and (>= 1) (not 2)) 2 0)))
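
   The matching rules can be made concrete with a small matcher.  This
is a simplified sketch written for illustration, not Guile’s own
implementation: the names ‘sub-reference-matches?’ and
‘reference-matches?’ are hypothetical, bare integers are accepted as
sub-version references as in the examples above, and a reference is
assumed to be allowed to be shorter than the declared version.

```scheme
(use-modules (ice-9 match) (srfi srfi-1))

;; Does the sub-version reference REF match the version element ELT?
(define (sub-reference-matches? ref elt)
  (match ref
    ((? integer? n) (eqv? n elt))
    (('>= n) (>= elt n))
    (('<= n) (<= elt n))
    (('and refs ...)
     (every (lambda (r) (sub-reference-matches? r elt)) refs))
    (('or refs ...)
     (any (lambda (r) (sub-reference-matches? r elt)) refs))
    (('not r) (not (sub-reference-matches? r elt)))))

;; Does the version reference REF match VERSION, a list of integers?
(define (reference-matches? ref version)
  (match ref
    (('and refs ...)
     (every (lambda (r) (reference-matches? r version)) refs))
    (('or refs ...)
     (any (lambda (r) (reference-matches? r version)) refs))
    (('not r) (not (reference-matches? r version)))
    ((sub-refs ...)
     ;; Assumption: compare element by element, prefix-wise.
     (and (<= (length sub-refs) (length version))
          (every sub-reference-matches? sub-refs version)))))

(define declared '(1 2 0))
(reference-matches? '(1 2 (>= 0)) declared)                ; #t
(reference-matches? '(or (1 2 0) (1 2 1)) declared)        ; #t
(reference-matches? '((and (>= 1) (not 2)) 2 0) declared)  ; #t
(reference-matches? '(2) declared)                         ; #f
```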
5046
5047
5048File: guile.info,  Node: R6RS Libraries,  Next: Variables,  Prev: R6RS Version References,  Up: Modules
5049
50506.18.6 R6RS Libraries
5051---------------------
5052
5053In addition to the API described in the previous sections, you also have
5054the option to create modules using the portable ‘library’ form described
5055in R6RS (*note R6RS Library Form: (r6rs)Library form.), and to import
5056libraries created in this format by other programmers.  Guile’s R6RS
5057library implementation takes advantage of the flexibility built into the
5058module system by expanding the R6RS library form into a corresponding
5059Guile ‘define-module’ form that specifies equivalent import and export
5060requirements and includes the same body expressions.  The library
5061expression:
5062
5063       (library (mylib (1 2))
5064         (export mybinding)
5065         (import (otherlib (3))))
5066
5067   is equivalent to the module definition:
5068
5069       (define-module (mylib)
5070         #:version (1 2)
5071         #:use-module ((otherlib) #:version (3))
5072         #:export (mybinding))
5073
5074   Central to the mechanics of R6RS libraries is the concept of import
5075and export “levels”, which control the visibility of bindings at various
5076phases of a library’s lifecycle — macros necessary to expand forms in
5077the library’s body need to be available at expand time; variables used
5078in the body of a procedure exported by the library must be available at
5079runtime.  R6RS specifies the optional ‘for’ sub-form of an _import set_
5080specification (see below) as a mechanism by which a library author can
5081indicate that a particular library import should take place at a
5082particular phase with respect to the lifecycle of the importing library.
5083
5084   Guile’s library implementation uses a technique called “implicit
5085phasing” (first described by Abdulaziz Ghuloum and R. Kent Dybvig),
5086which allows the expander and compiler to automatically determine the
5087necessary visibility of a binding imported from another library.  As
5088such, the ‘for’ sub-form described below is ignored by Guile (but may be
5089required by Schemes in which phasing is explicit).
5090
5091 -- Scheme Syntax: library name (export export-spec ...) (import
5092          import-spec ...) body ...
5093     Defines a new library with the specified name, exports, and
5094     imports, and evaluates the specified body expressions in this
5095     library’s environment.
5096
5097     The library NAME is a non-empty list of identifiers, optionally
5098     ending with a version specification of the form described above
5099     (*note Creating Guile Modules::).
5100
5101     Each EXPORT-SPEC is either the name of a variable defined or
5102     imported by the library, or takes the form ‘(rename (internal-name
5103     external-name) ...)’, where the identifier INTERNAL-NAME names a
5104     variable defined or imported by the library and EXTERNAL-NAME is
5105     the name by which the variable is seen by importing libraries.
5106
5107     Each IMPORT-SPEC must be either an “import set” (see below) or be
5108     of the form ‘(for import-set import-level ...)’, where each
5109     IMPORT-LEVEL is one of:
5110
5111            run
5112            expand
5113            (meta LEVEL)
5114
5115     where LEVEL is an integer.  Note that since Guile does not require
5116     explicit phase specification, any IMPORT-SETs found inside of ‘for’
5117     sub-forms will be “unwrapped” during expansion and processed as if
5118     they had been specified directly.
5119
5120     Import sets in turn take one of the following forms:
5121
5122            LIBRARY-REFERENCE
5123            (library LIBRARY-REFERENCE)
5124            (only IMPORT-SET IDENTIFIER ...)
5125            (except IMPORT-SET IDENTIFIER ...)
5126            (prefix IMPORT-SET IDENTIFIER)
5127            (rename IMPORT-SET (INTERNAL-IDENTIFIER EXTERNAL-IDENTIFIER) ...)
5128
5129     where LIBRARY-REFERENCE is a non-empty list of identifiers ending
5130     with an optional version reference (*note R6RS Version
5131     References::), and the other sub-forms have the following
5132     semantics, defined recursively on nested IMPORT-SETs:
5133
5134        • The ‘library’ sub-form is used to specify libraries for import
5135          whose names begin with the identifier “library.”
5136
5137        • The ‘only’ sub-form imports only the specified IDENTIFIERs
5138          from the given IMPORT-SET.
5139
5140        • The ‘except’ sub-form imports all of the bindings exported by
5141          IMPORT-SET except for those that appear in the specified list
5142          of IDENTIFIERs.
5143
5144        • The ‘prefix’ sub-form imports all of the bindings exported by
5145          IMPORT-SET, first prefixing them with the specified
5146          IDENTIFIER.
5147
5148        • The ‘rename’ sub-form imports all of the identifiers exported
5149          by IMPORT-SET.  The binding for each INTERNAL-IDENTIFIER among
5150          these identifiers is made visible to the importing library as
5151          the corresponding EXTERNAL-IDENTIFIER; all other bindings are
5152          imported using the names provided by IMPORT-SET.
5153
5154     Note that because Guile translates R6RS libraries into module
5155     definitions, an import specification may be used to declare a
5156     dependency on a native Guile module — although doing so may make
5157     your libraries less portable to other Schemes.
5158
5159 -- Scheme Syntax: import import-spec ...
5160     Import into the current environment the libraries specified by the
5161     given import specifications, where each IMPORT-SPEC takes the same
5162     form as in the ‘library’ form described above.
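
     As a small sketch of nested import sets, the following imports a
     single binding from the ‘(rnrs lists)’ library under a prefix.
     The prefix ‘r6:’ is an arbitrary choice made for this example.

```scheme
;; Import only `fold-left' from (rnrs lists), renamed with the prefix `r6:'.
(import (prefix (only (rnrs lists) fold-left) r6:))

(display (r6:fold-left + 0 '(1 2 3)))   ; prints 6
(newline)
```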
5163
5164
5165File: guile.info,  Node: Variables,  Next: Module System Reflection,  Prev: R6RS Libraries,  Up: Modules
5166
51676.18.7 Variables
5168----------------
5169
5170Each module has its own hash table, sometimes known as an “obarray”,
5171that maps the names defined in that module to their corresponding
5172variable objects.
5173
5174   A variable is a box-like object that can hold any Scheme value.  It
5175is said to be “undefined” if its box holds a special Scheme value that
5176denotes undefined-ness (which is different from all other Scheme values,
5177including for example ‘#f’); otherwise the variable is “defined”.
5178
5179   On its own, a variable object is anonymous.  A variable is said to be
5180“bound” when it is associated with a name in some way, usually a symbol
5181in a module obarray.  When this happens, the name is said to be bound to
5182the variable, in that module.
5183
5184   (That’s the theory, anyway.  In practice, defined-ness and bound-ness
5185sometimes get confused, because Lisp and Scheme implementations have
5186often conflated — or deliberately drawn no distinction between — a name
5187that is unbound and a name that is bound to a variable whose value is
5188undefined.  We will try to be clear about the difference and explain any
5189confusion where it is unavoidable.)
5190
5191   Variables do not have a read syntax.  Most commonly they are created
5192and bound implicitly by ‘define’ expressions: a top-level ‘define’
5193expression of the form
5194
5195     (define NAME VALUE)
5196
5197creates a variable with initial value VALUE and binds it to the name
5198NAME in the current module.  But they can also be created dynamically by
5199calling one of the constructor procedures ‘make-variable’ and
5200‘make-undefined-variable’.
5201
5202 -- Scheme Procedure: make-undefined-variable
5203 -- C Function: scm_make_undefined_variable ()
5204     Return a variable that initially holds no value (“unbound”).
5205
5206 -- Scheme Procedure: make-variable init
5207 -- C Function: scm_make_variable (init)
5208     Return a variable initialized to value INIT.
5209
5210 -- Scheme Procedure: variable-bound? var
5211 -- C Function: scm_variable_bound_p (var)
5212     Return ‘#t’ if VAR is bound to a value, or ‘#f’ otherwise.  Throws
5213     an error if VAR is not a variable object.
5214
5215 -- Scheme Procedure: variable-ref var
5216 -- C Function: scm_variable_ref (var)
5217     Dereference VAR and return its value.  VAR must be a variable
5218     object; see ‘make-variable’ and ‘make-undefined-variable’.
5219
5220 -- Scheme Procedure: variable-set! var val
5221 -- C Function: scm_variable_set_x (var, val)
5222     Set the value of the variable VAR to VAL.  VAR must be a variable
5223     object, VAL can be any value.  Return an unspecified value.
5224
5225 -- Scheme Procedure: variable-unset! var
5226 -- C Function: scm_variable_unset_x (var)
5227     Unset the value of the variable VAR, leaving VAR unbound.
5228
5229 -- Scheme Procedure: variable? obj
5230 -- C Function: scm_variable_p (obj)
5231     Return ‘#t’ if OBJ is a variable object, else return ‘#f’.
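
   The procedures above can be exercised directly.  The following is a
minimal sketch; the names ‘v’ and ‘u’ are arbitrary.

```scheme
;; A variable created with an initial value.
(define v (make-variable 10))
(display (variable-ref v)) (newline)       ; prints 10
(variable-set! v 42)
(display (variable-ref v)) (newline)       ; prints 42

;; A variable created without a value.
(define u (make-undefined-variable))
(display (variable-bound? u)) (newline)    ; prints #f
(variable-set! u 'now-set)
(display (variable-bound? u)) (newline)    ; prints #t
(variable-unset! u)
(display (variable-bound? u)) (newline)    ; prints #f

;; Variables are objects distinct from the values they hold.
(display (variable? v)) (newline)          ; prints #t
(display (variable? 42)) (newline)         ; prints #f
```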
5232
5233
5234File: guile.info,  Node: Module System Reflection,  Next: Declarative Modules,  Prev: Variables,  Up: Modules
5235
52366.18.8 Module System Reflection
5237-------------------------------
5238
5239The previous sections have described a declarative view of the module
5240system.  You can also work with it programmatically by accessing and
5241modifying various parts of the Scheme objects that Guile uses to
5242implement the module system.
5243
5244   At any time, there is a “current module”.  This module is the one
5245where a top-level ‘define’ and similar syntax will add new bindings.
5246You can find other module objects with ‘resolve-module’, for example.
5247
5248   These module objects can be used as the second argument to ‘eval’.
5249
5250 -- Scheme Procedure: current-module
5251 -- C Function: scm_current_module ()
5252     Return the current module object.
5253
5254 -- Scheme Procedure: set-current-module module
5255 -- C Function: scm_set_current_module (module)
5256     Set the current module to MODULE and return the previous current
5257     module.
5258
5259 -- Scheme Procedure: save-module-excursion thunk
5260     Call THUNK within a ‘dynamic-wind’ such that the module that is
5261     current at invocation time is restored when THUNK’s dynamic extent
5262     is left (*note Dynamic Wind::).
5263
5264     More precisely, if THUNK escapes non-locally, the current module
5265     (at the time of escape) is saved, and the original current module
5266     (at the time THUNK’s dynamic extent was last entered) is restored.
5267     If THUNK’s dynamic extent is re-entered, then the current module is
5268     saved, and the previously saved inner module is set current again.
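
     A minimal sketch of the behavior described above, switching
     temporarily to the ‘(guile)’ module; the names ‘original’ and
     ‘inside’ are arbitrary.

```scheme
(define original (current-module))
(define guile-module (resolve-module '(guile)))

(define inside
  (save-module-excursion
   (lambda ()
     (set-current-module guile-module)
     ;; Within the thunk, the current module is the one just set.
     (current-module))))

(display (eq? inside guile-module)) (newline)         ; prints #t
;; On return, the module current at invocation time has been restored.
(display (eq? (current-module) original)) (newline)   ; prints #t
```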
5269
5270 -- Scheme Procedure: resolve-module name [autoload=#t] [version=#f]
5271          [#:ensure=#t]
5272 -- C Function: scm_resolve_module (name)
5273     Find the module named NAME and return it.  When it has not already
5274     been defined and AUTOLOAD is true, try to auto-load it.  When it
5275     can’t be found that way either, create an empty module if ENSURE is
5276     true, otherwise return ‘#f’.  If VERSION is true, ensure that the
5277     resulting module is compatible with the given version reference
5278     (*note R6RS Version References::).  The name is a list of symbols.
5279
5280 -- Scheme Procedure: resolve-interface name [#:select=#f] [#:hide='()]
5281          [#:prefix=#f] [#:renamer=#f] [#:version=#f]
5282     Find the module named NAME as with ‘resolve-module’ and return its
5283     interface.  The interface of a module is also a module object, but
5284     it contains only the exported bindings.
5285
5286 -- Scheme Procedure: module-uses module
5287     Return a list of the interfaces used by MODULE.
5288
5289 -- Scheme Procedure: module-use! module interface
5290     Add INTERFACE to the front of the use-list of MODULE.  Both
5291     arguments should be module objects, and INTERFACE should very
5292     likely be a module returned by ‘resolve-interface’.
5293
5294 -- Scheme Procedure: reload-module module
5295     Revisit the source file that corresponds to MODULE.  Raises an
5296     error if no source file is associated with the given module.
5297
5298   As mentioned in the previous section, modules contain a mapping
5299between identifiers (as symbols) and storage locations (as variables).
5300Guile defines a number of procedures to allow access to this mapping.
5301If you are programming in C, *note Accessing Modules from C::.
5302
5303 -- Scheme Procedure: module-variable module name
5304     Return the variable bound to NAME (a symbol) in MODULE, or ‘#f’ if
5305     NAME is unbound.
5306
5307 -- Scheme Procedure: module-add! module name var
5308     Define a new binding between NAME (a symbol) and VAR (a variable)
5309     in MODULE.
5310
5311 -- Scheme Procedure: module-ref module name
5312     Look up the value bound to NAME in MODULE.  Like ‘module-variable’,
5313     but also does a ‘variable-ref’ on the resulting variable, raising
5314     an error if NAME is unbound.
5315
5316 -- Scheme Procedure: module-define! module name value
5317     Locally bind NAME to VALUE in MODULE.  If NAME was already locally
5318     bound in MODULE, i.e., defined locally and not by an imported
5319     module, the value stored in the existing variable will be updated.
5320     Otherwise, a new variable will be added to the module, via
5321     ‘module-add!’.
5322
5323 -- Scheme Procedure: module-set! module name value
5324     Update the binding of NAME in MODULE to VALUE, raising an error if
5325     NAME is not already bound in MODULE.
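
   The following sketch exercises these procedures on a scratch module.
The module name ‘(example scratch)’ and the binding names are
hypothetical; passing ‘#:ensure #t’ with AUTOLOAD false asks
‘resolve-module’ to create an empty module rather than load a file.

```scheme
;; Hypothetical, initially empty module created on demand.
(define scratch (resolve-module '(example scratch) #f #f #:ensure #t))

(module-define! scratch 'answer 41)
(module-set! scratch 'answer 42)
(display (module-ref scratch 'answer)) (newline)         ; prints 42

;; The binding is an ordinary variable object.
(define var (module-variable scratch 'answer))
(display (variable-ref var)) (newline)                   ; prints 42
(display (module-variable scratch 'missing)) (newline)   ; prints #f

;; Bind the same variable under a second name.
(module-add! scratch 'alias var)
(variable-set! var 43)
(display (module-ref scratch 'alias)) (newline)          ; prints 43

;; Interfaces are modules too: look up a binding exported by SRFI-1.
(define lists-iface (resolve-interface '(srfi srfi-1)))
(display ((module-ref lists-iface 'delete-duplicates) '(1 2 1 3)))
(newline)                                                ; prints (1 2 3)
```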
5326
5327   There are many other reflective procedures available in the default
5328environment.  If you find yourself using one of them, please contact the
5329Guile developers so that we can commit to stability for that interface.
5330
5331
5332File: guile.info,  Node: Declarative Modules,  Next: Accessing Modules from C,  Prev: Module System Reflection,  Up: Modules
5333
53346.18.9 Declarative Modules
5335--------------------------
5336
5337The first-class access to modules and module variables described in the
5338previous subsection is very powerful and allows Guile users to build
5339many tools to dynamically learn things about their Guile systems.
5340However, as Scheme godparent Matthias Felleisen wrote in “On the
5341Expressive Power of Programming Languages”, a more expressive language
5342is necessarily harder to reason about.  There are transformations that
5343Guile’s compiler would like to make which can’t be done if every
5344top-level definition is subject to mutation at any time.
5345
5346   Consider this module:
5347
5348     (define-module (boxes)
5349       #:export (make-box box-ref box-set! box-swap!))
5350
5351     (define (make-box x) (list x))
5352     (define (box-ref box) (car box))
5353     (define (box-set! box x) (set-car! box x))
5354     (define (box-swap! box x)
5355       (let ((y (box-ref box)))
5356         (box-set! box x)
5357         y))
5358
5359   Ideally you’d like for the ‘box-ref’ in ‘box-swap!’ to be inlined to
5360‘car’.  Guile’s compiler can do this, but only if it knows that
5361‘box-ref’’s definition is what it appears to be in the text.  However,
5362in the general case a programmer could reach into the
5363‘(boxes)’ module at any time and change the value of ‘box-ref’.
5364
5365   To allow Guile to reason about the values of top-levels from a
5366module, a module can be marked as “declarative”.  This flag applies only
5367to the subset of top-level definitions that are themselves declarative:
5368those that are defined within the compilation unit, and not assigned
5369(‘set!’) or redefined within the compilation unit.
5370
5371   To explicitly mark a module as being declarative, pass the
5372‘#:declarative?’ keyword argument when declaring a module:
5373
5374     (define-module (boxes)
5375       #:export (make-box box-ref box-set! box-swap!)
5376       #:declarative? #t)
5377
5378   By default, modules are compiled declaratively if the
5379‘user-modules-declarative?’ parameter is true when the module is
5380compiled.
5381
5382 -- Scheme Parameter: user-modules-declarative?
5383     A boolean indicating whether definitions in modules created by
5384     ‘define-module’ or implicitly as part of a compilation unit without
5385     an explicit module can be treated as declarative.
5386
5387   Because it’s usually what you want, the default value of
5388‘user-modules-declarative?’ is ‘#t’.
5389
5390Should I Mark My Module As Declarative?
5391.......................................
5392
5393In the vast majority of use cases, declarative modules are what you
5394want.  However, there are exceptions.
5395
5396   Consider the ‘(boxes)’ module above.  Let’s say you want to be able
5397to go in and change the definition of ‘box-set!’ at run-time:
5398
5399     scheme@(guile-user)> (use-modules (boxes))
5400     scheme@(guile-user)> ,module boxes
5401     scheme@(boxes)> (define (box-set! x y) (set-car! x (pk y)))
5402
5403   However, because ‘(boxes)’ is a declarative module, the compiler may
5404have inlined the call to ‘box-set!’ within ‘box-swap!’, so you may be
5405surprised when you call ‘(box-swap! x y)’ and do not see the new
5406definition being used.  (Note, however, that Guile makes no guarantees
5407about which definitions its compiler will or will not inline.)
5408
5409   If you want to allow the definition of ‘box-set!’ to be changed and
5410to have all of its uses updated, then probably the best option is to
5411edit the module and reload the whole thing:
5412
5413     scheme@(guile-user)> ,reload (boxes)
5414
5415   The advantage of the reloading approach is that you maintain the
5416optimizations that declarative modules enable, while also being able to
5417live-update the code.  If the module keeps precious program state, those
5418definitions can be marked as ‘define-once’ to prevent reloads from
5419overwriting them.  *Note Top Level::, for more on ‘define-once’.
5420Incidentally, ‘define-once’ also prevents declarative-definition
5421optimizations, so if there’s a limited subset of redefinable bindings,
5422‘define-once’ could be an interesting tool to mark those definitions as
5423works-in-progress for interactive program development.
5424
5425   To users, whether a module is declarative or not is mostly
5426immaterial: besides normal use via ‘use-modules’, users can reference
5427and redefine public or private bindings programmatically or
5428interactively.  The only difference is that changing a declarative
5429definition may not change all of its uses.  If this use-case is
5430important to you, and if reloading whole modules is insufficient, then
5431you can mark all definitions in a module as non-declarative by adding
5432‘#:declarative? #f’ to the module definition.
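
   As a sketch of that last option, the following hypothetical variant
of the ‘(boxes)’ module opts out of declarative compilation and then
replaces ‘box-set!’ at run time.  The module name ‘(example
live-boxes)’ and the counter ‘set-count’ are invented for this example.

```scheme
;; Hypothetical variant of (boxes) with non-declarative definitions.
(define-module (example live-boxes)
  #:export (make-box box-ref box-set! box-swap!)
  #:declarative? #f)

(define (make-box x) (list x))
(define (box-ref box) (car box))
(define (box-set! box x) (set-car! box x))
(define (box-swap! box x)
  (let ((y (box-ref box)))
    (box-set! box x)
    y))

;; Replace `box-set!' with a version that counts its calls.
(define set-count 0)
(module-set! (resolve-module '(example live-boxes)) 'box-set!
             (lambda (box x)
               (set! set-count (+ set-count 1))
               (set-car! box x)))

(define b (make-box 1))
(display (box-swap! b 2)) (newline)   ; prints 1
(display (box-ref b)) (newline)       ; prints 2
;; `box-swap!' went through the new definition of `box-set!'.
(display set-count) (newline)         ; prints 1
```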
5433
5434   The default of whether modules are declarative or not can be
5435controlled via the ‘(user-modules-declarative?)’ parameter mentioned
5436above, but care should be taken to set this parameter when the modules
5437are compiled, e.g., via ‘(eval-when (expand) (user-modules-declarative?
5438#f))’.  *Note Eval When::.
5439
5440   Alternatively, you can prevent declarative-definition optimizations by
5441compiling at the ‘-O1’ optimization level instead of the default ‘-O2’,
5442or via explicitly passing ‘-Ono-letrectify’ to the ‘guild compile’
5443invocation.  *Note Compilation::, for more on compiler options.
5444
5445   One final note.  Currently, definitions from declarative modules can
5446only be inlined within the module they are defined in, and within a
5447compilation unit.  This may change in the future to allow Guile to
5448inline imported declarative definitions as well (cross-module inlining).
5449To Guile, whether a definition is inlinable or not is a property of the
5450definition, not its use.  We hope to improve compiler tooling in the
5451future to allow the user to identify definitions that are out of date
5452when a declarative binding is redefined.
5453
5454
5455File: guile.info,  Node: Accessing Modules from C,  Next: provide and require,  Prev: Declarative Modules,  Up: Modules
5456
54576.18.10 Accessing Modules from C
5458--------------------------------
5459
5460The last sections have described how modules are used in Scheme code,
5461which is the recommended way of creating and accessing modules.  You can
5462also work with modules from C, but it is more cumbersome.
5463
5464   The following procedures are available.
5465
5466 -- C Function: SCM scm_c_call_with_current_module (SCM MODULE, SCM
5467          (*FUNC)(void *), void *DATA)
5468     Call FUNC and make MODULE the current module during the call.  The
5469     argument DATA is passed to FUNC.  The return value of
5470     ‘scm_c_call_with_current_module’ is the return value of FUNC.
5471
5472 -- C Function: SCM scm_public_variable (SCM MODULE_NAME, SCM NAME)
5473 -- C Function: SCM scm_c_public_variable (const char *MODULE_NAME,
5474          const char *NAME)
     Find the variable bound to the symbol NAME in the public
5476     interface of the module named MODULE_NAME.
5477
5478     MODULE_NAME should be a list of symbols, when represented as a
5479     Scheme object, or a space-separated string, in the ‘const char *’
5480     case.  See ‘scm_c_define_module’ below, for more examples.
5481
5482     Signals an error if no module was found with the given name.  If
5483     NAME is not bound in the module, just returns ‘#f’.
5484
5485 -- C Function: SCM scm_private_variable (SCM MODULE_NAME, SCM NAME)
5486 -- C Function: SCM scm_c_private_variable (const char *MODULE_NAME,
5487          const char *NAME)
5488     Like ‘scm_public_variable’, but looks in the internals of the
5489     module named MODULE_NAME instead of the public interface.
5490     Logically, these procedures should only be called on modules you
5491     write.
5492
5493 -- C Function: SCM scm_public_lookup (SCM MODULE_NAME, SCM NAME)
5494 -- C Function: SCM scm_c_public_lookup (const char *MODULE_NAME, const
5495          char *NAME)
5496 -- C Function: SCM scm_private_lookup (SCM MODULE_NAME, SCM NAME)
5497 -- C Function: SCM scm_c_private_lookup (const char *MODULE_NAME, const
5498          char *NAME)
5499     Like ‘scm_public_variable’ or ‘scm_private_variable’, but if the
5500     NAME is not bound in the module, signals an error.  Returns a
5501     variable, always.
5502
5503          static SCM eval_string_var;
5504
5505          /* NOTE: It is important that the call to 'my_init'
5506             happens-before all calls to 'my_eval_string'. */
5507          void my_init (void)
5508          {
5509            eval_string_var = scm_c_public_lookup ("ice-9 eval-string",
5510                                                   "eval-string");
5511          }
5512
5513          SCM my_eval_string (SCM str)
5514          {
5515            return scm_call_1 (scm_variable_ref (eval_string_var), str);
5516          }
5517
5518 -- C Function: SCM scm_public_ref (SCM MODULE_NAME, SCM NAME)
5519 -- C Function: SCM scm_c_public_ref (const char *MODULE_NAME, const
5520          char *NAME)
5521 -- C Function: SCM scm_private_ref (SCM MODULE_NAME, SCM NAME)
5522 -- C Function: SCM scm_c_private_ref (const char *MODULE_NAME, const
5523          char *NAME)
5524     Like ‘scm_public_lookup’ or ‘scm_private_lookup’, but additionally
5525     dereferences the variable.  If the variable object is unbound,
5526     signals an error.  Returns the value bound to NAME in MODULE_NAME.
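
   For orientation, the lookups performed by these C functions can be
approximated from Scheme.  The sketch below is only a rough analogue,
not a description of how the C functions are implemented; it uses
‘resolve-interface’, ‘module-variable’ and ‘variable-ref’ on the same
‘(ice-9 eval-string)’ module as the C example:

```scheme
;; Rough analogue of scm_c_public_lookup: find the variable object in
;; the module's public interface.
(define eval-string-var
  (module-variable (resolve-interface '(ice-9 eval-string))
                   'eval-string))

;; Rough analogue of scm_c_public_ref: look up and dereference.
(define eval-string-proc
  (variable-ref eval-string-var))

;; The @ syntax is shorthand for a public reference.
(eq? eval-string-proc (@ (ice-9 eval-string) eval-string))   ; ⇒ #t

(eval-string-proc "(+ 1 2)")   ; ⇒ 3
```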
5527
5528   In addition, there are a number of other lookup-related procedures.
5529We suggest that you use the ‘scm_public_’ and ‘scm_private_’ family of
5530procedures instead, if possible.
5531
5532 -- C Function: SCM scm_c_lookup (const char *NAME)
5533     Return the variable bound to the symbol indicated by NAME in the
5534     current module.  If there is no such binding or the symbol is not
5535     bound to a variable, signal an error.
5536
5537 -- C Function: SCM scm_lookup (SCM NAME)
5538     Like ‘scm_c_lookup’, but the symbol is specified directly.
5539
5540 -- C Function: SCM scm_c_module_lookup (SCM MODULE, const char *NAME)
5541 -- C Function: SCM scm_module_lookup (SCM MODULE, SCM NAME)
5542     Like ‘scm_c_lookup’ and ‘scm_lookup’, but the specified module is
5543     used instead of the current one.
5544
5545 -- C Function: SCM scm_module_variable (SCM MODULE, SCM NAME)
5546     Like ‘scm_module_lookup’, but if the binding does not exist, just
5547     returns ‘#f’ instead of raising an error.
5548
5549   To define a value, use ‘scm_define’:
5550
5551 -- C Function: SCM scm_c_define (const char *NAME, SCM VAL)
5552     Bind the symbol indicated by NAME to a variable in the current
5553     module and set that variable to VAL.  When NAME is already bound to
5554     a variable, use that.  Else create a new variable.
5555
5556 -- C Function: SCM scm_define (SCM NAME, SCM VAL)
5557     Like ‘scm_c_define’, but the symbol is specified directly.
5558
5559 -- C Function: SCM scm_c_module_define (SCM MODULE, const char *NAME,
5560          SCM VAL)
5561 -- C Function: SCM scm_module_define (SCM MODULE, SCM NAME, SCM VAL)
5562     Like ‘scm_c_define’ and ‘scm_define’, but the specified module is
5563     used instead of the current one.
5564
5565   In some rare cases, you may need to access the variable that
5566‘scm_module_define’ would have accessed, without changing the binding of
5567the existing variable, if one is present.  In that case, use
5568‘scm_module_ensure_local_variable’:
5569
5570 -- C Function: SCM scm_module_ensure_local_variable (SCM MODULE, SCM
5571          SYM)
5572     Like ‘scm_module_define’, but if the SYM is already locally bound
5573     in that module, the variable’s existing binding is not reset.
5574     Returns a variable.
5575
5576 -- C Function: SCM scm_module_reverse_lookup (SCM MODULE, SCM VARIABLE)
5577     Find the symbol that is bound to VARIABLE in MODULE.  When no such
5578     binding is found, return ‘#f’.
5579
5580 -- C Function: SCM scm_c_define_module (const char *NAME, void
5581          (*INIT)(void *), void *DATA)
5582     Define a new module named NAME and make it current while INIT is
5583     called, passing it DATA.  Return the module.
5584
5585     The parameter NAME is a string with the symbols that make up the
5586     module name, separated by spaces.  For example, ‘"foo bar"’ names
5587     the module ‘(foo bar)’.
5588
     When there already exists a module named NAME, it is used
     unchanged; otherwise, an empty module is created.
5591
5592 -- C Function: SCM scm_c_resolve_module (const char *NAME)
     Find the module named NAME and return it.  When it has not already
5594     been defined, try to auto-load it.  When it can’t be found that way
5595     either, create an empty module.  The name is interpreted as for
5596     ‘scm_c_define_module’.
5597
5598 -- C Function: SCM scm_c_use_module (const char *NAME)
5599     Add the module named NAME to the uses list of the current module,
5600     as with ‘(use-modules NAME)’.  The name is interpreted as for
5601     ‘scm_c_define_module’.
5602
5603 -- C Function: void scm_c_export (const char *NAME, ...)
5604     Add the bindings designated by NAME, ...  to the public interface
5605     of the current module.  The list of names is terminated by ‘NULL’.
5606
5607
5608File: guile.info,  Node: provide and require,  Next: Environments,  Prev: Accessing Modules from C,  Up: Modules
5609
56106.18.11 provide and require
5611---------------------------
5612
5613Aubrey Jaffer, mostly to support his portable Scheme library SLIB,
5614implemented a provide/require mechanism for many Scheme implementations.
5615Library files in SLIB _provide_ a feature, and when user programs
5616_require_ that feature, the library file is loaded in.
5617
5618   For example, the file ‘random.scm’ in the SLIB package contains the
5619line
5620
5621     (provide 'random)
5622
5623   so to use its procedures, a user would type
5624
5625     (require 'random)
5626
5627   and they would magically become available, _but still have the same
5628names!_  So this method is nice, but not as good as a full-featured
5629module system.
5630
5631   When SLIB is used with Guile, provide and require can be used to
5632access its facilities.
5633
5634
5635File: guile.info,  Node: Environments,  Prev: provide and require,  Up: Modules
5636
56376.18.12 Environments
5638--------------------
5639
5640Scheme, as defined in R5RS, does _not_ have a full module system.
5641However it does define the concept of a top-level “environment”.  Such
5642an environment maps identifiers (symbols) to Scheme objects such as
5643procedures and lists: *note About Closure::.  In other words, it
5644implements a set of “bindings”.
5645
5646   Environments in R5RS can be passed as the second argument to ‘eval’
5647(*note Fly Evaluation::).  Three procedures are defined to return
5648environments: ‘scheme-report-environment’, ‘null-environment’ and
5649‘interaction-environment’ (*note Fly Evaluation::).
5650
5651   In addition, in Guile any module can be used as an R5RS environment,
5652i.e., passed as the second argument to ‘eval’.
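
   A short sketch of passing a module as the environment argument.  The
choice of the ‘(guile)’ module here is only for illustration:

```scheme
;; resolve-module returns a module object, which eval accepts as its
;; second argument.
(define env (resolve-module '(guile)))

(eval '(+ 1 2) env)                     ; ⇒ 3
(eval '(string-upcase "guile") env)     ; ⇒ "GUILE"

;; interaction-environment also returns a usable environment.
(eval '(* 6 7) (interaction-environment))   ; ⇒ 42
```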
5653
5654   Note: the following two procedures are available only when the
5655‘(ice-9 r5rs)’ module is loaded:
5656
5657     (use-modules (ice-9 r5rs))
5658
5659 -- Scheme Procedure: scheme-report-environment version
5660 -- Scheme Procedure: null-environment version
5661     VERSION must be the exact integer ‘5’, corresponding to revision 5
5662     of the Scheme report (the Revised^5 Report on Scheme).
5663     ‘scheme-report-environment’ returns a specifier for an environment
5664     that is empty except for all bindings defined in the report that
5665     are either required or both optional and supported by the
5666     implementation.  ‘null-environment’ returns a specifier for an
5667     environment that is empty except for the (syntactic) bindings for
5668     all syntactic keywords defined in the report that are either
5669     required or both optional and supported by the implementation.
5670
5671     Currently Guile does not support values of VERSION for other
5672     revisions of the report.
5673
5674     The effect of assigning (through the use of ‘eval’) a variable
5675     bound in a ‘scheme-report-environment’ (for example ‘car’) is
5676     unspecified.  Currently the environments specified by
5677     ‘scheme-report-environment’ are not immutable in Guile.
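
   As an illustrative sketch, the two environments can be passed to
‘eval’ as follows.  The expressions evaluated are arbitrary examples:

```scheme
(use-modules (ice-9 r5rs))

;; Procedures defined in the report, such as car, are available.
(eval '(car '(a b c)) (scheme-report-environment 5))   ; ⇒ a

;; The null environment holds only syntactic keywords, so this
;; expression sticks to `if' and `quote'.
(eval '(if #t 'yes 'no) (null-environment 5))
```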
5678
5679
5680File: guile.info,  Node: Foreign Function Interface,  Next: Foreign Objects,  Prev: Modules,  Up: API Reference
5681
56826.19 Foreign Function Interface
5683===============================
5684
5685Sometimes you need to use libraries written in C or Rust or some other
5686non-Scheme language.  More rarely, you might need to write some C to
5687extend Guile.  This section describes how to load these “foreign
5688libraries”, look up data and functions inside them, and so on.
5689
5690* Menu:
5691
5692* Foreign Libraries::              Dynamically linking to libraries.
5693* Foreign Extensions::             Extending Guile in C with loadable modules.
5694* Foreign Pointers::               Pointers to C data or functions.
5695* Foreign Types::                  Expressing C types in Scheme.
5696* Foreign Functions::              Simple calls to C procedures.
5697* Void Pointers and Byte Access::  Pointers into the ether.
5698* Foreign Structs::                Packing and unpacking structs.
5699* More Foreign Functions::         Advanced examples.
5700
5701
5702File: guile.info,  Node: Foreign Libraries,  Next: Foreign Extensions,  Up: Foreign Function Interface
5703
57046.19.1 Foreign Libraries
5705------------------------
5706
5707Just as Guile can load up Scheme libraries at run-time, Guile can also
5708load some system libraries written in C or other low-level languages.
We refer to these dynamically-loadable modules as “foreign
5710libraries”, to distinguish them from native libraries written in Scheme
5711or other languages implemented by Guile.
5712
5713   Foreign libraries usually come in two forms.  Some foreign libraries
5714are part of the operating system, such as the compression library
5715‘libz’.  These shared libraries are built in such a way that many
5716programs can use their functionality without duplicating their code.
5717When a program written in C is built, it can declare that it uses a
5718specific set of shared libraries.  When the program is run, the
5719operating system takes care of locating and loading the shared
5720libraries.
5721
5722   The operating system components that can dynamically load and link
5723shared libraries when a program is run are also available
5724programmatically during a program’s execution.  This is the interface
5725that’s most useful for Guile, and this is what we mean in Guile when we
5726refer to “dynamic linking”.  Dynamic linking at run-time is sometimes
5727called “dlopening”, to distinguish it from the dynamic linking that
5728happens at program start-up.
5729
5730   The other kind of foreign library is sometimes known as a module,
5731plug-in, bundle, or an extension.  These foreign libraries aren’t meant
5732to be linked to by C programs, but rather only to be dynamically loaded
5733at run-time – they extend some main program with functionality, but
5734don’t stand on their own.  Sometimes a Guile library will implement some
5735of its functionality in a loadable module.
5736
5737   In either case, the interface on the Guile side is the same.  You
5738load the interface using ‘load-foreign-library’.  The resulting foreign
5739library object implements a simple lookup interface whereby the user can
5740get addresses of data or code exported by the library.  There is no
5741facility to inspect foreign libraries; you have to know what’s in there
5742already before you look.
5743
5744   Routines for loading foreign libraries and accessing their contents
5745are implemented in the ‘(system foreign-library)’ module.
5746
5747     (use-modules (system foreign-library))
5748
5749 -- Scheme Procedure: load-foreign-library [library]
5750          [#:extensions=system-library-extensions]
5751          [#:search-ltdl-library-path?=#t] [#:search-path=search-path]
          [#:search-system-paths?=#t] [#:lazy?=#t] [#:global?=#f]
          [#:rename-on-cygwin?=#t]
     Find the shared library denoted by LIBRARY
5754     (a string or ‘#f’) and link it into the running Guile application.
5755     When everything works out, return a Scheme object suitable for
5756     representing the linked object file.  Otherwise an error is thrown.
5757
     If the LIBRARY argument is omitted, it defaults to ‘#f’.  If LIBRARY
5759     is false, the resulting foreign library gives access to all symbols
5760     available for dynamic linking in the main binary.
5761
5762     It is not necessary to include any extension such as ‘.so’ in
5763     LIBRARY.  For each system, Guile has a default set of extensions
5764     that it will try.  On GNU systems, the default extension set is
5765     just ‘.so’; on Windows, just ‘.dll’; and on Darwin (Mac OS), it is
5766     ‘.bundle’, ‘.so’, and ‘.dylib’.  Pass ‘#:extensions EXTENSIONS’ to
5767     override the default extensions list.  If LIBRARY contains one of
5768     the extensions, no extensions are tried, so it is possible to
5769     specify the extension if you know exactly what file to load.
5770
5771     Unless LIBRARY denotes an absolute file name or otherwise contains
5772     a directory separator (‘/’, and also ‘\’ on Windows), Guile will
     search for the library in the directories listed in SEARCH-PATH.
     The default search path has three components, which can all be
     overridden by colon-delimited (semicolon on Windows) environment
5776     variables:
5777
5778     ‘GUILE_EXTENSIONS_PATH’
5779          This is the main environment variable for users to add
5780          directories containing Guile extensions.  The default value
5781          has no entries.  This environment variable was added in Guile
5782          3.0.6.
5783     ‘LTDL_LIBRARY_PATH’
5784          Before Guile 3.0.6, Guile loaded foreign libraries using
5785          ‘libltdl’, the dynamic library loader provided by libtool.
5786          This loader used ‘LTDL_LIBRARY_PATH’, and for backwards
5787          compatibility we still support that path.
5788
5789          However, ‘libltdl’ would not only open ‘.so’ (or ‘.dll’ and so
5790          on) files, but also the ‘.la’ files created by libtool.  In
5791          installed libraries – libraries that are in the target
5792          directories of ‘make install’ – ‘.la’ files are never needed,
5793          to the extent that most GNU/Linux distributions remove them
5794          entirely.  It is sufficient to just load the ‘.so’ (or ‘.dll’
5795          and so on) files, which are always located in the same
5796          directory as the ‘.la’ files.
5797
5798          But for uninstalled dynamic libraries, like those in a build
5799          tree, the situation is a bit of a mess.  If you have a project
5800          that uses libtool to build libraries – which is the case for
5801          Guile, and for most projects using autotools – and you build
          ‘foo.so’ in directory ‘D’, libtool will put ‘foo.la’ in ‘D’,
5803          but ‘foo.so’ gets put into ‘D/.libs’.
5804
5805          Users were mostly oblivious to this situation, as ‘libltdl’
5806          had special logic to be able to read the ‘.la’ file to know
5807          where to find the ‘.so’, even from an uninstalled build tree,
5808          preventing the existence of ‘.libs’ from leaking out to the
5809          user.
5810
5811          We don’t use libltdl now, essentially for flexibility and
5812          error-reporting reasons.  But, to keep this old use-case
5813          working, if SEARCH-LTDL-LIBRARY-PATH? is true, we add each
5814          entry of ‘LTDL_LIBRARY_PATH’ to the default extensions load
          path, additionally adding the ‘.libs’ subdirectories for each
5816          entry, in case there are ‘.so’ files there instead of
5817          alongside the ‘.la’ files.
5818     ‘GUILE_SYSTEM_EXTENSIONS_PATH’
5819          The last path in Guile’s search path belongs to Guile itself,
5820          and defaults to the libdir and the extensiondir, in that
5821          order.  For example, if you install to ‘/opt/guile’, these
5822          would probably be ‘/opt/guile/lib’ and
          ‘/opt/guile/lib/guile/3.0/extensions’, respectively.  *Note
          Parallel Installations::, for more details on ‘extensiondir’.
5825
5826     Finally, if no library is found in the search path, and if LIBRARY
5827     is not absolute and does not include directory separators, and if
5828     SEARCH-SYSTEM-PATHS? is true, the operating system may have its own
5829     logic for where to locate LIBRARY.  For example, on GNU, there will
5830     be a default set of paths (often ‘/usr/lib’ and ‘/lib’, though it
5831     depends on the system), and the ‘LD_LIBRARY_PATH’ environment
5832     variable can add additional paths.  Other operating systems have
5833     other conventions.
5834
5835     Falling back to the operating system for search is usually not a
5836     great thing; it is a recipe for making programs that work on one
5837     machine but not on others.  Still, when wrapping system libraries,
5838     it can be the only way to get things working at all.
5839
5840     If LAZY? is true (the default), Guile will request the operating
5841     system to resolve symbols used by the loaded library as they are
5842     first used.  If GLOBAL? is true, symbols defined by the loaded
5843     library will be available when other modules need to resolve
5844     symbols; the default is ‘#f’, which keeps symbols local.
5845
5846     If RENAME-ON-CYGWIN? is true (the default) – on Cygwin hosts only –
5847     the search behavior is modified such that a filename that starts
5848     with “lib” will be searched for under the name “cyg”, as is
5849     customary for Cygwin.
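
   The following sketch shows two typical calls.  The library name
‘libexample’ and the directory ‘/opt/example/lib’ are hypothetical; the
second call is wrapped in a procedure because it throws an error when
the library cannot be found:

```scheme
(use-modules (system foreign-library))

;; With no library name (or #f), the result gives access to the
;; symbols available for dynamic linking in the main binary.
(define self (load-foreign-library #f))
(foreign-library? self)   ; ⇒ #t

;; Hypothetical: load "libexample" from one specific directory only,
;; without falling back to the operating system's own search logic.
(define (load-libexample)
  (load-foreign-library "libexample"
                        #:search-path '("/opt/example/lib")
                        #:search-system-paths? #f))
```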
5850
5851   The environment variables mentioned above are parsed when the
5852foreign-library module is first loaded and bound to parameters.  Null
5853path components, for example the three components of
5854‘GUILE_SYSTEM_EXTENSIONS_PATH="::"’, are ignored.
5855
5856 -- Scheme Parameter: guile-extensions-path
5857 -- Scheme Parameter: ltdl-library-path
5858 -- Scheme Parameter: guile-system-extensions-path
5859     Parameters whose initial values are taken from
5860     ‘GUILE_EXTENSIONS_PATH’, ‘LTDL_LIBRARY_PATH’, and
5861     ‘GUILE_SYSTEM_EXTENSIONS_PATH’, respectively.  *Note Parameters::.
5862     The current values of these parameters are used when building the
5863     search path when ‘load-foreign-library’ is called, unless the
5864     caller explicitly passes a ‘#:search-path’ argument.
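
   A minimal sketch of adjusting one of these parameters for a limited
extent, assuming its value is a list of directory names.  The directory
‘/opt/example/extensions’ is hypothetical:

```scheme
(use-modules (system foreign-library))

;; Inspect the current value.
(guile-extensions-path)

;; Temporarily prepend a directory.  The previous value is restored
;; when the body of parameterize exits.
(parameterize ((guile-extensions-path
                (cons "/opt/example/extensions"
                      (guile-extensions-path))))
  (car (guile-extensions-path)))   ; ⇒ "/opt/example/extensions"
```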
5865
5866 -- Scheme Procedure: foreign-library? obj
5867     Return ‘#t’ if OBJ is a foreign library, or ‘#f’ otherwise.
5868
5869
5870File: guile.info,  Node: Foreign Extensions,  Next: Foreign Pointers,  Prev: Foreign Libraries,  Up: Foreign Function Interface
5871
58726.19.2 Foreign Extensions
5873-------------------------
5874
5875One way to use shared libraries is to extend Guile.  Such loadable
5876modules generally define one distinguished initialization function that,
5877when called, will use the ‘libguile’ API to define procedures in the
5878current module.
5879
5880   Concretely, you might extend Guile with an implementation of the
5881Bessel function, ‘j0’:
5882
5883     #include <math.h>
5884     #include <libguile.h>
5885
5886     SCM
5887     j0_wrapper (SCM x)
5888     {
       return scm_from_double (j0 (scm_to_double (x)));
5890     }
5891
5892     void
5893     init_math_bessel (void)
5894     {
5895       scm_c_define_gsubr ("j0", 1, 0, 0, j0_wrapper);
5896     }
5897
5898   The C source file would then need to be compiled into a shared
5899library.  On GNU/Linux, the compiler invocation might look like this:
5900
5901     gcc -shared -o bessel.so -fPIC bessel.c
5902
5903   A good default place to put shared libraries that extend Guile is
5904into the extensions dir.  From the command line or a build script,
invoke ‘pkg-config --variable=extensiondir guile-3.0’ to print the
5906extensions dir.  *Note Parallel Installations::, for more details.
5907
5908   Guile can load up ‘bessel.so’ via ‘load-extension’.
5909
5910 -- Scheme Procedure: load-extension lib init
5911 -- C Function: scm_load_extension (lib, init)
5912     Load and initialize the extension designated by LIB and INIT.
5913
   The normal way for an extension to be used is to write a small Scheme
5915file that defines a module, and to load the extension into this module.
5916When the module is auto-loaded, the extension is loaded as well.  For
5917example:
5918
5919     (define-module (math bessel)
5920       #:export (j0))
5921
5922     (load-extension "bessel" "init_math_bessel")
5923
5924   This ‘load-extension’ invocation loads the ‘bessel’ library via
5925‘(load-foreign-library "bessel")’, then looks up the ‘init_math_bessel’
5926symbol in the library, treating it as a function of no arguments, and
5927calls that function.
5928
5929   If you decide to put your extension outside the default search path
5930for ‘load-foreign-library’, probably you should adapt the Scheme module
5931to specify its absolute path.  For example, if you use ‘automake’ to
5932build your extension and place it in ‘$(pkglibdir)’, you might define a
5933build-parameters module that gets created by the build system:
5934
5935     (define-module (math config)
5936       #:export (extensiondir))
5937     (define extensiondir "PKGLIBDIR")
5938
5939   This file would be ‘config.scm.in’.  You would define a ‘make’ rule
5940to substitute in the absolute installed file name:
5941
5942     config.scm: config.scm.in
             sed 's|PKGLIBDIR|$(pkglibdir)|' <$< >$@
5944
5945   Then your ‘(math bessel)’ would import ‘(math config)’, then
5946‘(load-extension (in-vicinity extensiondir "bessel")
5947"init_math_bessel")’.
5948
5949   An alternate approach would be to rebind the ‘guile-extensions-path’
5950parameter, or its corresponding environment variable, but note that
5951changing those parameters applies to other users of
5952‘load-foreign-library’ as well.
5953
5954   Note that the new primitives that the extension adds to Guile with
5955‘scm_c_define_gsubr’ (*note Primitive Procedures::) or with any of the
5956other mechanisms are placed into the module that is current when the
‘scm_c_define_gsubr’ is executed, so to be clear about what goes where
5958it’s best to include the ‘load-extension’ in a module, as above.
5959Alternately, the C code can use ‘scm_c_define_module’ to specify which
5960module is being created:
5961
5962     static void
5963     do_init (void *unused)
5964     {
5965       scm_c_define_gsubr ("j0", 1, 0, 0, j0_wrapper);
5966       scm_c_export ("j0", NULL);
5967     }
5968
5969     void
     init_math_bessel (void)
5971     {
5972       scm_c_define_module ("math bessel", do_init, NULL);
5973     }
5974
5975   And yet...  if what we want is just the ‘j0’ function, it seems like
5976a lot of ceremony to have to compile a Guile-specific wrapper library
complete with an initialization function and wrapper module to allow
5978Guile users to call it.  There is another way, but to get there, we have
5979to talk about function pointers and function types first.  *Note Foreign
5980Functions::, to skip to the good parts.
5981
5982
5983File: guile.info,  Node: Foreign Pointers,  Next: Foreign Types,  Prev: Foreign Extensions,  Up: Foreign Function Interface
5984
59856.19.3 Foreign Pointers
5986-----------------------
5987
5988Foreign libraries are essentially key-value mappings, where the keys are
5989names of definitions and the values are the addresses of those
5990definitions.  To look up the address of a definition, use
5991‘foreign-library-pointer’ from the ‘(system foreign-library)’ module.
5992
5993 -- Scheme Procedure: foreign-library-pointer lib name
5994     Return a “wrapped pointer” for the symbol NAME in the shared object
5995     referred to by LIB.  The returned pointer points to a C object.
5996
5997     As a convenience, if LIB is not a foreign library, it will be
5998     passed to ‘load-foreign-library’.
5999
6000   If we continue with the ‘bessel.so’ example from before, we can get
6001the address of the ‘init_math_bessel’ function via:
6002
6003     (use-modules (system foreign-library))
6004     (define init (foreign-library-pointer "bessel" "init_math_bessel"))
6005     init
6006     ⇒ #<pointer 0x7fb35b1b4688>
6007
6008   A value returned by ‘foreign-library-pointer’ is a Scheme wrapper for
6009a C pointer.  Pointers are a data type in Guile that is disjoint from
6010all other types.  The next section discusses ways to dereference
6011pointers, but before then we describe the usual type predicates and so
6012on.
6013
6014   Note that the rest of the interfaces in this section are part of the
6015‘(system foreign)’ library:
6016
6017     (use-modules (system foreign))
6018
6019 -- Scheme Procedure: pointer-address pointer
6020 -- C Function: scm_pointer_address (pointer)
6021     Return the numerical value of POINTER.
6022
6023          (pointer-address init)
6024          ⇒ 139984413364296 ; YMMV
6025
6026 -- Scheme Procedure: make-pointer address [finalizer]
6027     Return a foreign pointer object pointing to ADDRESS.  If FINALIZER
6028     is passed, it should be a pointer to a one-argument C function that
6029     will be called when the pointer object becomes unreachable.
6030
6031 -- Scheme Procedure: pointer? obj
6032     Return ‘#t’ if OBJ is a pointer object, or ‘#f’ otherwise.
6033
6034 -- Scheme Variable: %null-pointer
6035     A foreign pointer whose value is 0.
6036
6037 -- Scheme Procedure: null-pointer? pointer
6038     Return ‘#t’ if POINTER is the null pointer, ‘#f’ otherwise.
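
   The procedures above can be exercised without loading any library.
In this sketch the address ‘#x1000’ is arbitrary and is never
dereferenced:

```scheme
(use-modules (system foreign))

;; Wrap an arbitrary address in a pointer object.
(define p (make-pointer #x1000))

(pointer? p)                    ; ⇒ #t
(pointer? 42)                   ; ⇒ #f
(pointer-address p)             ; ⇒ 4096

(null-pointer? p)               ; ⇒ #f
(null-pointer? %null-pointer)   ; ⇒ #t
(pointer-address %null-pointer) ; ⇒ 0
```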
6039
6040   For the purpose of passing SCM values directly to foreign functions,
6041and allowing them to return SCM values, Guile also supports some unsafe
6042casting operators.
6043
6044 -- Scheme Procedure: scm->pointer scm
6045     Return a foreign pointer object with the ‘object-address’ of SCM.
6046
6047 -- Scheme Procedure: pointer->scm pointer
6048     Unsafely cast POINTER to a Scheme object.  Cross your fingers!
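
   A small round-trip sketch.  The cast back is only safe here because
the object is still referenced from Scheme, and therefore still alive,
when ‘pointer->scm’ is called:

```scheme
(use-modules (system foreign))

(define obj (list 1 2 3))
(define ptr (scm->pointer obj))

;; The pointer's address is the object's address.
(= (pointer-address ptr) (object-address obj))   ; ⇒ #t

;; Casting back yields the very same object.
(eq? obj (pointer->scm ptr))                     ; ⇒ #t
```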
6049
6050   Sometimes you want to give C extensions access to the dynamic FFI. At
6051that point, the names get confusing, because “pointer” can refer to a
6052‘SCM’ object that wraps a pointer, or to a ‘void*’ value.  We will try
6053to use “pointer object” to refer to Scheme objects, and “pointer value”
6054to refer to ‘void *’ values.
6055
6056 -- C Function: SCM scm_from_pointer (void *ptr, void (*finalizer)
6057          (void*))
6058     Create a pointer object from a pointer value.
6059
6060     If FINALIZER is non-null, Guile arranges to call it on the pointer
6061     value at some point after the pointer object becomes collectable.
6062
6063 -- C Function: void* scm_to_pointer (SCM obj)
6064     Unpack the pointer value from a pointer object.
6065
6066
6067File: guile.info,  Node: Foreign Types,  Next: Foreign Functions,  Prev: Foreign Pointers,  Up: Foreign Function Interface
6068
60696.19.4 Foreign Types
6070--------------------
6071
6072From Scheme’s perspective, foreign pointers are shards of chaos.  The
6073user can create a foreign pointer for any address, and do with it what
6074they will.  The only thing that lends a sense of order to the whole is a
6075shared hallucination that certain storage locations have certain types.
6076When making Scheme wrappers for foreign interfaces, we hide the madness
by explicitly representing the data types of parameters and fields.
6078
6079   These “foreign type values” may be constructed using the constants
6080and procedures from the ‘(system foreign)’ module, which may be loaded
6081like this:
6082
6083     (use-modules (system foreign))
6084
6085   ‘(system foreign)’ exports a number of values expressing the basic C
6086types.
6087
6088 -- Scheme Variable: int8
6089 -- Scheme Variable: uint8
6090 -- Scheme Variable: uint16
6091 -- Scheme Variable: int16
6092 -- Scheme Variable: uint32
6093 -- Scheme Variable: int32
6094 -- Scheme Variable: uint64
6095 -- Scheme Variable: int64
6096 -- Scheme Variable: float
6097 -- Scheme Variable: double
6098     These values represent the C numeric types of the specified sizes
6099     and signednesses.
6100
6101   In addition there are some convenience bindings for indicating types
6102of platform-dependent size.
6103
6104 -- Scheme Variable: int
6105 -- Scheme Variable: unsigned-int
6106 -- Scheme Variable: long
6107 -- Scheme Variable: unsigned-long
6108 -- Scheme Variable: short
6109 -- Scheme Variable: unsigned-short
6110 -- Scheme Variable: size_t
6111 -- Scheme Variable: ssize_t
6112 -- Scheme Variable: ptrdiff_t
6113 -- Scheme Variable: intptr_t
6114 -- Scheme Variable: uintptr_t
6115     Values exported by the ‘(system foreign)’ module, representing C
6116     numeric types.  For example, ‘long’ may be ‘equal?’ to ‘int64’ on a
6117     64-bit platform.
6118
6119 -- Scheme Variable: void
6120     The ‘void’ type.  It can be used as the first argument to
6121     ‘pointer->procedure’ to wrap a C function that returns nothing.
6122
6123   In addition, the symbol ‘*’ is used by convention to denote pointer
6124types.  Procedures detailed in the following sections, such as
6125‘pointer->procedure’, accept it as a type descriptor.
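
   As a rough illustration, the foreign type values can be handed to
‘sizeof’ (*note Foreign Structs::) to see what they denote.  This is
only a sketch: the fixed-size results are certain, but the sizes of the
platform-dependent types vary with the C ABI, so no result is shown for
them.

```scheme
(use-modules (system foreign))

;; Fixed-size types have the same size on every platform.
(sizeof int8)    ; ⇒ 1
(sizeof uint16)  ; ⇒ 2
(sizeof int32)   ; ⇒ 4
(sizeof double)  ; ⇒ 8

;; Platform-dependent types: the answer depends on the C ABI in use.
(sizeof long)
(sizeof size_t)

;; The symbol * stands for any pointer type.
(sizeof '*)
```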
6126
6127
6128File: guile.info,  Node: Foreign Functions,  Next: Void Pointers and Byte Access,  Prev: Foreign Types,  Up: Foreign Function Interface
6129
61306.19.5 Foreign Functions
6131------------------------
6132
6133The most natural thing to do with a dynamic library is to grovel around
6134in it for a function pointer: a “foreign function”.  Load the ‘(system
6135foreign)’ module to use these Scheme interfaces.
6136
6137     (use-modules (system foreign))
6138
6139 -- Scheme Procedure: pointer->procedure return_type func_ptr arg_types
6140          [#:return-errno?=#f]
6141 -- C Function: scm_pointer_to_procedure (return_type, func_ptr,
6142          arg_types)
6143 -- C Function: scm_pointer_to_procedure_with_errno (return_type,
6144          func_ptr, arg_types)
6145
6146     Make a foreign function.
6147
6148     Given the foreign void pointer FUNC_PTR, its argument and return
6149     types ARG_TYPES and RETURN_TYPE, return a procedure that will pass
6150     arguments to the foreign function and return appropriate values.
6151
6152     ARG_TYPES should be a list of foreign types.  RETURN_TYPE should
6153     be a foreign type.  *Note Foreign Types::, for more information on
6154     foreign types.
6155
6156     If RETURN-ERRNO? is true, or when calling
6157     ‘scm_pointer_to_procedure_with_errno’, the returned procedure will
6158     return two values, with ‘errno’ as the second value.
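
   A minimal sketch of ‘pointer->procedure’, assuming that the C
library’s ‘abs’ function is visible among the global symbols of the
running process, as is typically the case on GNU/Linux.  The name
‘c-abs’ is made up for this illustration.

```scheme
(use-modules (system foreign)
             (system foreign-library))

;; Wrap `int abs (int)'.  Passing #f as the library looks the symbol
;; up among the global symbols of the current process.
(define c-abs
  (pointer->procedure int
                      (foreign-library-pointer #f "abs")
                      (list int)))

(c-abs -42) ; ⇒ 42
```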
6159
6160   Finally, in ‘(system foreign-library)’ there is a convenient wrapper
6161function, joining together ‘foreign-library-pointer’ and
6162‘pointer->procedure’:
6163
6164 -- Scheme Procedure: foreign-library-function lib name
6165          [#:return-type=void] [#:arg-types='()] [#:return-errno?=#f]
6166     Load the address of NAME from LIB, and treat it as a function
6167     taking arguments ARG-TYPES and returning RETURN-TYPE, optionally
6168     also with errno.
6169
6170     An invocation of ‘foreign-library-function’ is entirely equivalent
6171     to:
6172          (pointer->procedure RETURN-TYPE
6173                              (foreign-library-pointer LIB NAME)
6174                              ARG-TYPES
6175                              #:return-errno? RETURN-ERRNO?).
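
   The ‘#:return-errno?’ option can be sketched as follows, assuming a
POSIX system where ‘close’ is a global symbol and fails with ‘EBADF’
when given an invalid file descriptor.  The name ‘c-close’ is
hypothetical.

```scheme
(use-modules (system foreign)
             (system foreign-library))

;; Wrap `int close (int fd)' and ask for errno as a second value.
(define c-close
  (foreign-library-function #f "close"
                            #:return-type int
                            #:arg-types (list int)
                            #:return-errno? #t))

;; Closing an invalid descriptor fails: the first value is -1 and the
;; second is the errno left behind by the call.
(define-values (ret err) (c-close -1))

(list ret (= err EBADF)) ; ⇒ (-1 #t)
```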
6176
6177   Pulling all this together, here is a better definition of ‘(math
6178bessel)’:
6179
6180     (define-module (math bessel)
6181       #:use-module (system foreign)
6182       #:use-module (system foreign-library)
6183       #:export (j0))
6184
6185     (define j0
6186       (foreign-library-function "libm" "j0"
6187                                 #:return-type double
6188                                 #:arg-types (list double)))
6189
6190   That’s it!  No C at all.
6191
6192   Before going on to more detailed examples, the next two sections
6193discuss how to deal with data that is more complex than, say, ‘int8’.
6194*Note More Foreign Functions::, to continue with foreign function
6195examples.
6196
6197
6198File: guile.info,  Node: Void Pointers and Byte Access,  Next: Foreign Structs,  Prev: Foreign Functions,  Up: Foreign Function Interface
6199
62006.19.6 Void Pointers and Byte Access
6201------------------------------------
6202
6203Wrapped pointers are untyped, so they are essentially equivalent to C
6204‘void’ pointers.  As in C, the memory region pointed to by a pointer can
6205be accessed at the byte level.  This is achieved using _bytevectors_
6206(*note Bytevectors::).  The ‘(rnrs bytevectors)’ module contains
6207procedures that can be used to convert byte sequences to Scheme objects
6208such as strings, floating point numbers, or integers.
6209
6210   Load the ‘(system foreign)’ module to use these Scheme interfaces.
6211
6212     (use-modules (system foreign))
6213
6214 -- Scheme Procedure: pointer->bytevector pointer len [offset
6215          [uvec_type]]
6216 -- C Function: scm_pointer_to_bytevector (pointer, len, offset,
6217          uvec_type)
6218     Return a bytevector aliasing the LEN bytes pointed to by POINTER.
6219
6220     The user may specify an alternate default interpretation for the
6221     memory by passing the UVEC_TYPE argument, to indicate that the
6222     memory is an array of elements of that type.  UVEC_TYPE should be
6223     something that ‘array-type’ would return, like ‘f32’ or ‘s16’.
6224
6225     When OFFSET is passed, it specifies the offset in bytes relative to
6226     POINTER of the memory region aliased by the returned bytevector.
6227
6228     Mutating the returned bytevector mutates the memory pointed to by
6229     POINTER, so buckle your seatbelts.
6230
6231 -- Scheme Procedure: bytevector->pointer bv [offset]
6232 -- C Function: scm_bytevector_to_pointer (bv, offset)
6233     Return a pointer aliasing the memory pointed to by BV or
6234     OFFSET bytes after BV when OFFSET is passed.
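
   The aliasing behaviour of these two primitives can be seen in a
small round trip.  This is a sketch only; the variable names are
invented for the example.

```scheme
(use-modules (system foreign)
             (rnrs bytevectors))

(define bv (u8-list->bytevector '(1 2 3 4)))
(define ptr (bytevector->pointer bv))

;; A 2-byte window onto the same memory, starting 1 byte after PTR.
(define window (pointer->bytevector ptr 2 1))

;; Writing through the window changes the original bytevector,
;; because both share the same storage.
(bytevector-u8-set! window 0 99)

(bytevector->u8-list bv) ; ⇒ (1 99 3 4)
```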
6235
6236   In addition to these primitives, convenience procedures are
6237available:
6238
6239 -- Scheme Procedure: dereference-pointer pointer
6240     Assuming POINTER points to a memory region that holds a pointer,
6241     return this pointer.
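
   As a sketch of ‘dereference-pointer’, the following stores the
address of one pointer in a pointer-sized bytevector and reads it back.
All names are made up for the example.

```scheme
(use-modules (system foreign)
             (rnrs bytevectors))

;; Some memory to point at.
(define target-bv (u8-list->bytevector '(1 2 3)))
(define target (bytevector->pointer target-bv))

;; A pointer-sized cell holding the address of TARGET.
(define cell (make-bytevector (sizeof '*) 0))
(bytevector-uint-set! cell 0 (pointer-address target)
                      (native-endianness) (sizeof '*))

;; Reading the cell through a pointer yields a pointer equal to TARGET.
(define result (dereference-pointer (bytevector->pointer cell)))

(equal? result target) ; ⇒ #t
```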
6242
6243 -- Scheme Procedure: string->pointer string [encoding]
6244     Return a foreign pointer to a nul-terminated copy of STRING in the
6245     given ENCODING, defaulting to the current locale encoding.  The C
6246     string is freed when the returned foreign pointer becomes
6247     unreachable.
6248
6249     This is the Scheme equivalent of ‘scm_to_stringn’.
6250
6251 -- Scheme Procedure: pointer->string pointer [length] [encoding]
6252     Return the string representing the C string pointed to by POINTER.
6253     If LENGTH is omitted or ‘-1’, the string is assumed to be
6254     nul-terminated.  Otherwise LENGTH is the number of bytes in memory
6255     pointed to by POINTER.  The C string is assumed to be in the given
6256     ENCODING, defaulting to the current locale encoding.
6257
6258     This is the Scheme equivalent of ‘scm_from_stringn’.
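
   A short round trip through both string procedures, given as a
sketch.  The encoding is passed explicitly so that the result does not
depend on the current locale.

```scheme
(use-modules (system foreign))

;; A nul-terminated UTF-8 copy of the Scheme string.
(define c-str (string->pointer "hello, world" "UTF-8"))

;; Read up to the terminating nul ...
(pointer->string c-str -1 "UTF-8") ; ⇒ "hello, world"

;; ... or only the first 5 bytes.
(pointer->string c-str 5 "UTF-8")  ; ⇒ "hello"
```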
6259
6260   Most object-oriented C libraries use pointers to specific data
6261structures to identify objects.  It is useful in such cases to reify the
6262different pointer types as disjoint Scheme types.  The
6263‘define-wrapped-pointer-type’ macro simplifies this.
6264
6265 -- Scheme Syntax: define-wrapped-pointer-type type-name pred wrap
6266          unwrap print
6267     Define helper procedures to wrap pointer objects into Scheme
6268     objects with a disjoint type.  Specifically, this macro defines:
6269
6270        • PRED, a predicate for the new Scheme type;
6271        • WRAP, a procedure that takes a pointer object and returns an
6272          object that satisfies PRED;
6273        • UNWRAP, which does the reverse.
6274
6275     WRAP preserves pointer identity: for two pointer objects P1 and P2
6276     that are ‘equal?’, ‘(eq? (WRAP P1) (WRAP P2)) ⇒ #t’.
6277
6278     Finally, PRINT should name a user-defined procedure to print such
6279     objects.  The procedure is passed the wrapped object and a port to
6280     write to.
6281
6282     For example, assume we are wrapping a C library that defines a
6283     type, ‘bottle_t’, and functions that can be passed ‘bottle_t *’
6284     pointers to manipulate them.  We could write:
6285
6286          (define-wrapped-pointer-type bottle
6287            bottle?
6288            wrap-bottle unwrap-bottle
6289            (lambda (b p)
6290              (format p "#<bottle of ~a ~x>"
6291                      (bottle-contents b)
6292                      (pointer-address (unwrap-bottle b)))))
6293
6294          (define grab-bottle
6295            ;; Wrapper for `bottle_t *grab_bottle (void)'.
6296            (let ((grab (foreign-library-function libbottle "grab_bottle"
6297                                                  #:return-type '*)))
6298              (lambda ()
6299                "Return a new bottle."
6300                (wrap-bottle (grab)))))
6301
6302          (define bottle-contents
6303            ;; Wrapper for `const char *bottle_contents (bottle_t *)'.
6304            (let ((contents (foreign-library-function libbottle "bottle_contents"
6305                                                      #:return-type '*
6306                                                      #:arg-types  '(*))))
6307              (lambda (b)
6308                "Return the contents of B."
6309                (pointer->string (contents (unwrap-bottle b))))))
6310
6311          (write (grab-bottle))
6312          ⇒ #<bottle of Château Haut-Brion 803d36>
6313
6314     In this example, ‘grab-bottle’ is guaranteed to return a genuine
6315     ‘bottle’ object satisfying ‘bottle?’.  Likewise, ‘bottle-contents’
6316     errors out when its argument is not a genuine ‘bottle’ object.
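
   The identity guarantee can be tried without any C library.  The
sketch below wraps pointers made from an arbitrary address that is
never dereferenced; the ‘handle’ type and its procedures are
hypothetical names.

```scheme
(use-modules (system foreign))

(define-wrapped-pointer-type handle
  handle?
  wrap-handle unwrap-handle
  (lambda (h port)
    ;; Print the wrapped address in hexadecimal.
    (display "#<handle " port)
    (display (number->string (pointer-address (unwrap-handle h)) 16) port)
    (display ">" port)))

;; Two distinct pointer objects for the same made-up address.
(define p1 (make-pointer #x1000))
(define p2 (make-pointer #x1000))

(define h1 (wrap-handle p1))
(define h2 (wrap-handle p2))

(equal? p1 p2)                 ; ⇒ #t
(eq? h1 h2)                    ; ⇒ #t, identity is preserved
(handle? h1)                   ; ⇒ #t
(handle? p1)                   ; ⇒ #f, a bare pointer is not a handle
(equal? (unwrap-handle h1) p1) ; ⇒ #t
```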
6317
6318   As another example, currently Guile has a variable, ‘scm_numptob’, as
6319part of its API.  It is declared as a C ‘long’.  So, to read its value,
6320we can do:
6321
6322     (use-modules (system foreign))
6323     (use-modules (rnrs bytevectors))
6324     (define numptob
6325       (foreign-library-pointer #f "scm_numptob"))
6326     numptob
6327     (bytevector-uint-ref (pointer->bytevector numptob (sizeof long))
6328                          0 (native-endianness)
6329                          (sizeof long))
6330     ⇒ 8
6331
6332   If we wanted to corrupt Guile’s internal state, we could set
6333‘scm_numptob’ to another value; but we shouldn’t, because that variable
6334is not meant to be set.  Indeed this point applies more widely: the C
6335API is a dangerous place to be.  Not only might setting a value crash
6336your program, simply accessing the data pointed to by a dangling pointer
6337or similar can prove equally disastrous.
6338
6339
6340File: guile.info,  Node: Foreign Structs,  Next: More Foreign Functions,  Prev: Void Pointers and Byte Access,  Up: Foreign Function Interface
6341
63426.19.7 Foreign Structs
6343----------------------
6344
6345Finally, one last note on foreign values before moving on to actually
6346calling foreign functions.  Sometimes you need to deal with C structs,
6347which requires interpreting each element of the struct according to
6348its type, offset, and alignment.  The ‘(system foreign)’ module has some
6349primitives to support this.
6350
6351     (use-modules (system foreign))
6352
6353 -- Scheme Procedure: sizeof type
6354 -- C Function: scm_sizeof (type)
6355     Return the size of TYPE, in bytes.
6356
6357     TYPE should be a valid C type, like ‘int’.  Alternately TYPE may be
6358     the symbol ‘*’, in which case the size of a pointer is returned.
6359     TYPE may also be a list of types, in which case the size of a
6360     ‘struct’ with ABI-conventional packing is returned.
6361
6362 -- Scheme Procedure: alignof type
6363 -- C Function: scm_alignof (type)
6364     Return the alignment of TYPE, in bytes.
6365
6366     TYPE should be a valid C type, like ‘int’.  Alternately TYPE may be
6367     the symbol ‘*’, in which case the alignment of a pointer is
6368     returned.  TYPE may also be a list of types, in which case the
6369     alignment of a ‘struct’ with ABI-conventional packing is returned.
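
   A brief sketch of ‘sizeof’ and ‘alignof’ applied to struct
descriptions.  Results that depend on the platform ABI are described in
comments rather than shown as values.

```scheme
(use-modules (system foreign))

;; Two bytes side by side need no padding.
(sizeof (list uint8 uint8)) ; ⇒ 2
(alignof uint8)             ; ⇒ 1

;; Mixing a byte with a 32-bit integer usually introduces padding, so
;; the struct is larger than the 5 bytes its fields occupy.  On common
;; ABIs the size is 8 and the alignment is 4.
(sizeof (list uint8 uint32))
(alignof (list uint8 uint32))

;; A struct is at least as strictly aligned as each of its members.
(>= (alignof (list uint8 uint32)) (alignof uint32)) ; ⇒ #t
```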
6370
6371   Guile also provides some convenience methods to pack and unpack
6372foreign pointers wrapping C structs.
6373
6374 -- Scheme Procedure: make-c-struct types vals
6375     Create a foreign pointer to a C struct containing VALS with types
6376     TYPES.
6377
6378     VALS and TYPES should be lists of the same length.
6379
6380 -- Scheme Procedure: parse-c-struct foreign types
6381     Parse a foreign pointer to a C struct, returning a list of values.
6382
6383     TYPES should be a list of C types.
6384
6385   For example, to create and parse the equivalent of a ‘struct {
6386int64_t a; uint8_t b; }’:
6387
6388     (parse-c-struct (make-c-struct (list int64 uint8)
6389                                    (list 300 43))
6390                     (list int64 uint8))
6391     ⇒ (300 43)
6392
6393   As yet, Guile only has convenience routines to support
6394conventionally-packed structs.  But given the ‘bytevector->pointer’ and
6395‘pointer->bytevector’ routines, one can create and parse tightly packed
6396structs and unions by hand.  See the code for ‘(system foreign)’ for
6397details.
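
   One way to handle a tightly packed struct by hand is sketched below.
It assumes a hypothetical 5-byte layout, a ‘uint8_t’ tag followed
immediately by a ‘uint32_t’ value with no padding; the procedures
‘make-packed’ and ‘parse-packed’ are invented for the example and are
not part of ‘(system foreign)’.

```scheme
(use-modules (system foreign)
             (rnrs bytevectors))

;; Hypothetical layout:
;;   struct __attribute__ ((packed)) { uint8_t tag; uint32_t value; };
(define packed-size 5)

(define (make-packed tag value)
  ;; Write the fields at their packed offsets, 0 and 1.
  (let ((bv (make-bytevector packed-size 0)))
    (bytevector-u8-set! bv 0 tag)
    (bytevector-u32-set! bv 1 value (native-endianness))
    (bytevector->pointer bv)))

(define (parse-packed ptr)
  ;; Read the fields back from the same offsets.
  (let ((bv (pointer->bytevector ptr packed-size)))
    (list (bytevector-u8-ref bv 0)
          (bytevector-u32-ref bv 1 (native-endianness)))))

(parse-packed (make-packed 7 #xdeadbeef)) ; ⇒ (7 3735928559)
```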
6398
6399
6400File: guile.info,  Node: More Foreign Functions,  Prev: Foreign Structs,  Up: Foreign Function Interface
6401
64026.19.8 More Foreign Functions
6403-----------------------------
6404
6405It is possible to pass pointers to foreign functions, and to return them
6406as well.  In that case the type of the argument or return value should
6407be the symbol ‘*’, indicating a pointer.  For example, the following
6408code makes ‘memcpy’ available to Scheme:
6409
6410     (use-modules (system foreign))
6411     (define memcpy
6412       (foreign-library-function #f "memcpy"
6413                                 #:return-type '*
6414                                 #:arg-types (list '* '* size_t)))
6415
6416   To invoke ‘memcpy’, one must pass it foreign pointers:
6417
6418     (use-modules (rnrs bytevectors))
6419
6420     (define src-bits
6421       (u8-list->bytevector '(0 1 2 3 4 5 6 7)))
6422     (define src
6423       (bytevector->pointer src-bits))
6424     (define dest
6425       (bytevector->pointer (make-bytevector 16 0)))
6426
6427     (memcpy dest src (bytevector-length src-bits))
6428
6429     (bytevector->u8-list (pointer->bytevector dest 16))
6430     ⇒ (0 1 2 3 4 5 6 7 0 0 0 0 0 0 0 0)
6431
6432   One may also pass structs by value, in which case the struct is
6433passed as a foreign pointer.  *Note Foreign Structs::, for more
6434information on how to express struct types and struct values.
6435
6436   “Out” arguments are passed as foreign pointers.  The memory pointed
6437to by the foreign pointer is mutated in place.
6438
6439     ;; struct timeval {
6440     ;;      time_t      tv_sec;     /* seconds */
6441     ;;      suseconds_t tv_usec;    /* microseconds */
6442     ;; };
6443     ;; assuming fields are of type "long"
6444
6445     (define gettimeofday
6446       (let ((f (foreign-library-function #f "gettimeofday"
6447                                          #:return-type int
6448                                          #:arg-types (list '* '*)))
6449             (tv-type (list long long)))
6450         (lambda ()
6451           (let* ((timeval (make-c-struct tv-type (list 0 0)))
6452                  (ret (f timeval %null-pointer)))
6453             (if (zero? ret)
6454                 (apply values (parse-c-struct timeval tv-type))
6455                 (error "gettimeofday returned an error" ret))))))
6456
6457     (gettimeofday)
6458     ⇒ 1270587589
6459     ⇒ 499553
6460
6461   As you can see, this interface to foreign functions is at a very low,
6462somewhat dangerous level(1).
6463
6464   The FFI can also work in the opposite direction: making Scheme
6465procedures callable from C.  This makes it possible to use Scheme
6466procedures as “callbacks” expected by C functions.
6467
6468 -- Scheme Procedure: procedure->pointer return-type proc arg-types
6469 -- C Function: scm_procedure_to_pointer (return_type, proc, arg_types)
6470     Return a pointer to a C function of type RETURN-TYPE taking
6471     arguments of types ARG-TYPES (a list) and behaving as a proxy to
6472     procedure PROC.  Thus PROC’s arity, supported argument types, and
6473     return type should match RETURN-TYPE and ARG-TYPES.
6474
6475   As an example, here’s how the C library’s ‘qsort’ array sorting
6476function can be made accessible to Scheme (*note ‘qsort’: (libc)Array
6477Sort Function.):
6478
6479     (define qsort!
6480       (let ((qsort (foreign-library-function
6481                     #f "qsort" #:arg-types (list '* size_t size_t '*))))
6482         (lambda (bv compare)
6483           ;; Sort bytevector BV in-place according to comparison
6484           ;; procedure COMPARE.
6485           (let ((ptr (procedure->pointer int
6486                                          (lambda (x y)
6487                                            ;; X and Y are pointers so,
6488                                            ;; for convenience, dereference
6489                                            ;; them before calling COMPARE.
6490                                            (compare (dereference-uint8* x)
6491                                                     (dereference-uint8* y)))
6492                                          (list '* '*))))
6493             (qsort (bytevector->pointer bv)
6494                    (bytevector-length bv) 1 ;; we're sorting bytes
6495                    ptr)))))
6496
6497     (define (dereference-uint8* ptr)
6498       ;; Helper function: dereference the byte pointed to by PTR.
6499       (let ((b (pointer->bytevector ptr 1)))
6500         (bytevector-u8-ref b 0)))
6501
6502     (define bv
6503       ;; An unsorted array of bytes.
6504       (u8-list->bytevector '(7 1 127 3 5 4 77 2 9 0)))
6505
6506     ;; Sort BV.
6507     (qsort! bv (lambda (x y) (- x y)))
6508
6509     ;; Let's see what the sorted array looks like:
6510     (bytevector->u8-list bv)
6511     ⇒ (0 1 2 3 4 5 7 9 77 127)
6512
6513   And voilà!
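
   A smaller round trip shows the two directions of the FFI meeting in
the middle.  This is a sketch, assuming ‘procedure->pointer’ is
available on the platform; ‘add-ptr’ and ‘add’ are made-up names.

```scheme
(use-modules (system foreign))

;; Turn a Scheme procedure into a C function pointer of type
;; `int (*) (int, int)' ...
(define add-ptr
  (procedure->pointer int
                      (lambda (a b) (+ a b))
                      (list int int)))

;; ... and wrap that pointer again so it can be called from Scheme.
(define add
  (pointer->procedure int add-ptr (list int int)))

(add 2 3) ; ⇒ 5
```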
6514
6515   Note that ‘procedure->pointer’ is not supported (and not defined) on
6516a few exotic architectures.  Thus, user code may need to check
6517‘(defined? 'procedure->pointer)’.  Nevertheless, it is available on many
6518architectures, including (as of libffi 3.0.9) x86, ia64, SPARC, PowerPC,
6519ARM, and MIPS, to name a few.
6520
6521   ---------- Footnotes ----------
6522
6523   (1) A contribution to Guile in the form of a high-level FFI would be
6524most welcome.
6525
6526
6527File: guile.info,  Node: Foreign Objects,  Next: Smobs,  Prev: Foreign Function Interface,  Up: API Reference
6528
65296.20 Foreign Objects
6530====================
6531
6532This chapter contains reference information related to defining and
6533working with foreign objects.  *Note Defining New Foreign Object
6534Types::, for a tutorial-like introduction to foreign objects.
6535
6536 -- C Type: scm_t_struct_finalize
6537     This function type returns ‘void’ and takes one ‘SCM’ argument.
6538
6539 -- C Function: SCM scm_make_foreign_object_type (SCM name, SCM slots,
6540          scm_t_struct_finalize finalizer)
6541     Create a fresh foreign object type.  NAME is a symbol naming the
6542     type.  SLOTS is a list of symbols, each one naming a field in the
6543     foreign object type.  FINALIZER indicates the finalizer, and may be
6544     ‘NULL’.
6545
6546   We recommend that finalizers be avoided if possible.  *Note Foreign
6547Object Memory Management::.  Finalizers must be async-safe and
6548thread-safe.  Again, *note Foreign Object Memory Management::.  If you
6549are embedding Guile in an application that is not thread-safe, and you
6550define foreign object types that need finalization, you might want to
6551disable automatic finalization, and arrange to call
6552‘scm_run_finalizers ()’ yourself.
6553
6554 -- C Function: int scm_set_automatic_finalization_enabled (int
6555          enabled_p)
6556     Enable or disable automatic finalization.  By default, Guile
6557     arranges to invoke object finalizers automatically, in a separate
6558     thread if possible.  Passing a zero value for ENABLED_P will
6559     disable automatic finalization for Guile as a whole.  If you
6560     disable automatic finalization, you will have to call
6561     ‘scm_run_finalizers ()’ periodically.
6562
6563     Unlike most other Guile functions, you can call
6564     ‘scm_set_automatic_finalization_enabled’ before Guile has been
6565     initialized.
6566
6567     Return the previous status of automatic finalization.
6568
6569 -- C Function: int scm_run_finalizers (void)
6570     Invoke any pending finalizers.  Returns the number of finalizers
6571     that were invoked.  This function should be called when automatic
6572     finalization is disabled, though it may be called if it is enabled
6573     as well.
6574
6575 -- C Function: void scm_assert_foreign_object_type (SCM type, SCM val)
6576     When VAL is a foreign object of the given TYPE, do nothing.
6577     Otherwise, signal an error.
6578
6579 -- C Function: SCM scm_make_foreign_object_0 (SCM type)
6580 -- C Function: SCM scm_make_foreign_object_1 (SCM type, void *val0)
6581 -- C Function: SCM scm_make_foreign_object_2 (SCM type, void *val0,
6582          void *val1)
6583 -- C Function: SCM scm_make_foreign_object_3 (SCM type, void *val0,
6584          void *val1, void *val2)
6585 -- C Function: SCM scm_make_foreign_object_n (SCM type, size_t n, void
6586          *vals[])
6587     Make a new foreign object of type TYPE and initialize
6588     the first N fields to the given values, as appropriate.
6589
6590     The number of fields for objects of a given type is fixed when the
6591     type is created.  It is an error to give more initializers than
6592     there are fields in the value.  It is perfectly fine to give fewer
6593     initializers than needed; this is convenient when some fields are
6594     of non-pointer types, and would be easier to initialize with the
6595     setters described below.
6596
6597 -- C Function: void* scm_foreign_object_ref (SCM obj, size_t n);
6598 -- C Function: scm_t_bits scm_foreign_object_unsigned_ref (SCM obj,
6599          size_t n);
6600 -- C Function: scm_t_signed_bits scm_foreign_object_signed_ref (SCM
6601          obj, size_t n);
6602     Return the value of the Nth field of the foreign object OBJ.  The
6603     backing store for the fields is as wide as a ‘scm_t_bits’ value,
6604     which is at least as wide as a pointer.  The different variants
6605     handle casting in a portable way.
6606
6607 -- C Function: void scm_foreign_object_set_x (SCM obj, size_t n, void
6608          *val);
6609 -- C Function: void scm_foreign_object_unsigned_set_x (SCM obj, size_t
6610          n, scm_t_bits val);
6611 -- C Function: void scm_foreign_object_signed_set_x (SCM obj, size_t n,
6612          scm_t_signed_bits val);
6613     Set the value of the Nth field of the foreign object OBJ to VAL,
6614     after portably converting to a ‘scm_t_bits’ value, if needed.
6615
6616   One can also access foreign objects from Scheme.  *Note Foreign
6617Objects and Scheme::, for some examples.
6618
6619     (use-modules (system foreign-object))
6620
6621 -- Scheme Procedure: make-foreign-object-type name slots
6622          [#:finalizer=#f]
6623     Make a new foreign object type.  See the above documentation for
6624     ‘scm_make_foreign_object_type’; these functions are exactly
6625     equivalent, except for the way in which the finalizer gets attached
6626     to instances (an internal detail).
6627
6628     The resulting value is a GOOPS class.  *Note GOOPS::, for more on
6629     classes in Guile.
6630
6631 -- Scheme Syntax: define-foreign-object-type name constructor (slot
6632          ...) [#:finalizer=#f]
6633     A convenience macro to define a type, using
6634     ‘make-foreign-object-type’, and bind it to NAME.  A constructor
6635     will be bound to CONSTRUCTOR, and getters will be bound to each of
6636     SLOT....
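
   A minimal sketch of the Scheme interface, using a hypothetical
‘<file-descriptor>’ type with a single slot.  The descriptor number 3 is
only stored, never used.

```scheme
(use-modules (system foreign-object))

;; Define the type, a constructor, and a getter named after the slot.
(define-foreign-object-type <file-descriptor>
  make-file-descriptor
  (fd))

(define obj (make-file-descriptor 3))

;; The getter reads the slot back.
(fd obj) ; ⇒ 3
```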
6637
6638
6639File: guile.info,  Node: Smobs,  Next: Scheduling,  Prev: Foreign Objects,  Up: API Reference
6640
66416.21 Smobs
6642==========
6643
6644A “smob” is a “small object”.  Before foreign objects were introduced in
6645Guile 2.0.12 (*note Foreign Objects::), smobs were the preferred way
6646for C code to define new kinds of Scheme objects.  With the exception of
6647the so-called “applicable SMOBs” discussed below, smobs are now a legacy
6648interface and are headed for eventual deprecation.  *Note Deprecation::.
6649New code should use the foreign object interface.
6650
6651   This section contains reference information related to defining and
6652working with smobs.  For a tutorial-like introduction to smobs, see
6653“Defining New Types (Smobs)” in previous versions of this manual.
6654
6655 -- Function: scm_t_bits scm_make_smob_type (const char *name, size_t
6656          size)
6657     This function adds a new smob type, named NAME, with instance size
6658     SIZE, to the system.  The return value is a tag that is used in
6659     creating instances of the type.
6660
6661     If SIZE is 0, the default _free_ function will do nothing.
6662
6663     If SIZE is not 0, the default _free_ function will deallocate the
6664     memory block pointed to by ‘SCM_SMOB_DATA’ with ‘scm_gc_free’.  The
6665     WHAT parameter in the call to ‘scm_gc_free’ will be NAME.
6666
6667     Default values are provided for the _mark_, _free_, _print_, and
6668     _equalp_ functions.  If you want to customize any of these
6669     functions, the call to ‘scm_make_smob_type’ should be immediately
6670     followed by calls to one or several of ‘scm_set_smob_mark’,
6671     ‘scm_set_smob_free’, ‘scm_set_smob_print’, and/or
6672     ‘scm_set_smob_equalp’.
6673
6674 -- C Function: void scm_set_smob_free (scm_t_bits tc, size_t (*free)
6675          (SCM obj))
6676     This function sets the smob freeing procedure (sometimes referred
6677     to as a “finalizer”) for the smob type specified by the tag TC.  TC
6678     is the tag returned by ‘scm_make_smob_type’.
6679
6680     The FREE procedure must deallocate all resources that are directly
6681     associated with the smob instance OBJ.  It must assume that all
6682     ‘SCM’ values that it references have already been freed and are
6683     thus invalid.
6684
6685     It must also not call any libguile function or macro except
6686     ‘scm_gc_free’, ‘SCM_SMOB_FLAGS’, ‘SCM_SMOB_DATA’,
6687     ‘SCM_SMOB_DATA_2’, and ‘SCM_SMOB_DATA_3’.
6688
6689     The FREE procedure must return 0.
6690
6691     Note that defining a freeing procedure is not necessary if the
6692     resources associated with OBJ consist only of memory allocated
6693     with ‘scm_gc_malloc’ or ‘scm_gc_malloc_pointerless’ because this
6694     memory is automatically reclaimed by the garbage collector when it
6695     is no longer needed (*note ‘scm_gc_malloc’: Memory Blocks.).
6696
6697   Smob free functions must be thread-safe.  *Note Foreign Object Memory
6698Management::, for a discussion on finalizers and concurrency.  If you
6699are embedding Guile in an application that is not thread-safe, and you
6700define smob types that need finalization, you might want to disable
6701automatic finalization, and arrange to call ‘scm_run_finalizers ()’
6702yourself.  *Note Foreign Objects::.
6703
6704 -- C Function: void scm_set_smob_mark (scm_t_bits tc, SCM (*mark) (SCM
6705          obj))
6706     This function sets the smob marking procedure for the smob type
6707     specified by the tag TC.  TC is the tag returned by
6708     ‘scm_make_smob_type’.
6709
6710     Defining a marking procedure is almost always the wrong thing to
6711     do.  It is much, much preferable to allocate smob data with the
6712     ‘scm_gc_malloc’ and ‘scm_gc_malloc_pointerless’ functions, and
6713     allow the GC to trace pointers automatically.
6714
6715     Any mark procedures you see currently almost surely date from the
6716     time of Guile 1.8, before the switch to the Boehm-Demers-Weiser
6717     collector.  Such smob implementations should be changed to just use
6718     ‘scm_gc_malloc’ and friends, and to lose their mark function.
6719
6720     If you decide to keep the mark function, note that it may be called
6721     on objects that are on the free list.  Please read and digest the
6722     comments from the BDW GC’s ‘gc/gc_mark.h’ header.
6723
6724     The MARK procedure must cause ‘scm_gc_mark’ to be called for every
6725     ‘SCM’ value that is directly referenced by the smob instance OBJ.
6726     One of these ‘SCM’ values can be returned from the procedure and
6727     Guile will call ‘scm_gc_mark’ for it.  This can be used to avoid
6728     deep recursions for smob instances that form a list.
6729
6730     It must not call any libguile function or macro except
6731     ‘scm_gc_mark’, ‘SCM_SMOB_FLAGS’, ‘SCM_SMOB_DATA’,
6732     ‘SCM_SMOB_DATA_2’, and ‘SCM_SMOB_DATA_3’.
6733
6734 -- C Function: void scm_set_smob_print (scm_t_bits tc, int (*print)
6735          (SCM obj, SCM port, scm_print_state* pstate))
6736     This function sets the smob printing procedure for the smob type
6737     specified by the tag TC.  TC is the tag returned by
6738     ‘scm_make_smob_type’.
6739
6740     The PRINT procedure should output a textual representation of the
6741     smob instance OBJ to PORT, using information in PSTATE.
6742
6743     The textual representation should be of the form ‘#<name ...>’.
6744     This ensures that ‘read’ will not interpret it as some other Scheme
6745     value.
6746
6747     It is often best to ignore PSTATE and just print to PORT with
6748     ‘scm_display’, ‘scm_write’, ‘scm_simple_format’, and ‘scm_puts’.
6749
6750 -- C Function: void scm_set_smob_equalp (scm_t_bits tc, SCM (*equalp)
6751          (SCM obj1, SCM obj2))
6752     This function sets the smob equality-testing predicate for the smob
6753     type specified by the tag TC.  TC is the tag returned by
6754     ‘scm_make_smob_type’.
6755
6756     The EQUALP procedure should return ‘SCM_BOOL_T’ when OBJ1 is
6757     ‘equal?’ to OBJ2.  Else it should return ‘SCM_BOOL_F’.  Both OBJ1
6758     and OBJ2 are instances of the smob type TC.
6759
6760 -- C Function: void scm_assert_smob_type (scm_t_bits tag, SCM val)
6761     When VAL is a smob of the type indicated by TAG, do nothing.  Else,
6762     signal an error.
6763
6764 -- C Macro: int SCM_SMOB_PREDICATE (scm_t_bits tag, SCM exp)
6765     Return true if EXP is a smob instance of the type indicated by TAG,
6766     or false otherwise.  The expression EXP can be evaluated more than
6767     once, so it shouldn’t contain any side effects.
6768
6769 -- C Function: SCM scm_new_smob (scm_t_bits tag, scm_t_bits data)
6770 -- C Function: SCM scm_new_double_smob (scm_t_bits tag, scm_t_bits
6771          data, scm_t_bits data2, scm_t_bits data3)
6772     Make a new smob of the type with tag TAG and smob data DATA, DATA2,
6773     and DATA3, as appropriate.
6774
6775     The TAG is what has been returned by ‘scm_make_smob_type’.  The
6776     initial values DATA, DATA2, and DATA3 are of type ‘scm_t_bits’;
6777     when you want to use them for ‘SCM’ values, these values need to be
6778     converted to a ‘scm_t_bits’ first by using ‘SCM_UNPACK’.
6779
6780     The flags of the smob instance start out as zero.
6781
6782 -- C Macro: scm_t_bits SCM_SMOB_FLAGS (SCM obj)
6783     Return the 16 extra bits of the smob OBJ.  No meaning is predefined
6784     for these bits; you can use them freely.
6785
6786 -- C Macro: scm_t_bits SCM_SET_SMOB_FLAGS (SCM obj, scm_t_bits flags)
6787     Set the 16 extra bits of the smob OBJ to FLAGS.  No meaning is
6788     predefined for these bits; you can use them freely.
6789
6790 -- C Macro: scm_t_bits SCM_SMOB_DATA (SCM obj)
6791 -- C Macro: scm_t_bits SCM_SMOB_DATA_2 (SCM obj)
6792 -- C Macro: scm_t_bits SCM_SMOB_DATA_3 (SCM obj)
6793     Return the first (second, third) immediate word of the smob OBJ as
6794     a ‘scm_t_bits’ value.  When the word contains a ‘SCM’ value, use
6795     ‘SCM_SMOB_OBJECT’ (etc.)  instead.
6796
6797 -- C Macro: void SCM_SET_SMOB_DATA (SCM obj, scm_t_bits val)
6798 -- C Macro: void SCM_SET_SMOB_DATA_2 (SCM obj, scm_t_bits val)
6799 -- C Macro: void SCM_SET_SMOB_DATA_3 (SCM obj, scm_t_bits val)
6800     Set the first (second, third) immediate word of the smob OBJ to
6801     VAL.  When the word should be set to a ‘SCM’ value, use
6802     ‘SCM_SET_SMOB_OBJECT’ (etc.)  instead.
6803
6804 -- C Macro: SCM SCM_SMOB_OBJECT (SCM obj)
6805 -- C Macro: SCM SCM_SMOB_OBJECT_2 (SCM obj)
6806 -- C Macro: SCM SCM_SMOB_OBJECT_3 (SCM obj)
6807     Return the first (second, third) immediate word of the smob OBJ as
6808     a ‘SCM’ value.  When the word contains a ‘scm_t_bits’ value, use
6809     ‘SCM_SMOB_DATA’ (etc.)  instead.
6810
6811 -- C Macro: void SCM_SET_SMOB_OBJECT (SCM obj, SCM val)
6812 -- C Macro: void SCM_SET_SMOB_OBJECT_2 (SCM obj, SCM val)
6813 -- C Macro: void SCM_SET_SMOB_OBJECT_3 (SCM obj, SCM val)
6814     Set the first (second, third) immediate word of the smob OBJ to
6815     VAL.  When the word should be set to a ‘scm_t_bits’ value, use
6816     ‘SCM_SET_SMOB_DATA’ (etc.)  instead.
6817
6818 -- C Macro: SCM * SCM_SMOB_OBJECT_LOC (SCM obj)
6819 -- C Macro: SCM * SCM_SMOB_OBJECT_2_LOC (SCM obj)
6820 -- C Macro: SCM * SCM_SMOB_OBJECT_3_LOC (SCM obj)
6821     Return a pointer to the first (second, third) immediate word of the
6822     smob OBJ.  Note that this is a pointer to ‘SCM’.  If you need to
6823     work with ‘scm_t_bits’ values, use ‘SCM_PACK’ and ‘SCM_UNPACK’, as
6824     appropriate.
6825
6826 -- Function: SCM scm_markcdr (SCM X)
6827     Mark the references in the smob X, assuming that X’s first data
6828     word contains an ordinary Scheme object, and X refers to no other
6829     objects.  This function simply returns X’s first data word.
6830
6831