This is guile.info, produced by makeinfo version 6.7 from guile.texi.

This manual documents Guile version 3.0.7.

   Copyright (C) 1996-1997, 2000-2005, 2009-2021 Free Software
Foundation, Inc.

   Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.3 or
any later version published by the Free Software Foundation; with no
Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts.  A
copy of the license is included in the section entitled “GNU Free
Documentation License.”
INFO-DIR-SECTION The Algorithmic Language Scheme
START-INFO-DIR-ENTRY
* Guile Reference: (guile).     The Guile reference manual.
END-INFO-DIR-ENTRY


File: guile.info,  Node: Random Access,  Next: Line/Delimited,  Prev: Buffering,  Up: Input and Output

6.12.7 Random Access
--------------------

 -- Scheme Procedure: seek fd_port offset whence
 -- C Function: scm_seek (fd_port, offset, whence)
     Sets the current position of FD_PORT to the integer OFFSET.  For a
     file port, OFFSET is expressed as a number of bytes; for other
     types of ports, such as string ports, OFFSET is an abstract
     representation of the position within the port’s data, not
     necessarily expressed as a number of bytes.  OFFSET is interpreted
     according to the value of WHENCE.

     One of the following variables should be supplied for WHENCE:
      -- Variable: SEEK_SET
          Seek from the beginning of the file.
      -- Variable: SEEK_CUR
          Seek from the current position.
      -- Variable: SEEK_END
          Seek from the end of the file.
     If FD_PORT is a file descriptor, the underlying system call is
     ‘lseek’.  FD_PORT may also be a string port.

     The value returned is the new position in FD_PORT.
     This means that the current position of a port can be obtained
     using:
          (seek port 0 SEEK_CUR)

 -- Scheme Procedure: ftell fd_port
 -- C Function: scm_ftell (fd_port)
     Return an integer representing the current position of FD_PORT,
     measured from the beginning.  Equivalent to:

          (seek port 0 SEEK_CUR)

 -- Scheme Procedure: truncate-file file [length]
 -- C Function: scm_truncate_file (file, length)
     Truncate FILE to LENGTH bytes.  FILE can be a filename string, a
     port object, or an integer file descriptor.  The return value is
     unspecified.

     For a port or file descriptor LENGTH can be omitted, in which case
     the file is truncated at the current position (per ‘ftell’ above).

     On most systems a file can be extended by giving a length greater
     than the current size, but this is not mandatory in the POSIX
     standard.


File: guile.info,  Node: Line/Delimited,  Next: Default Ports,  Prev: Random Access,  Up: Input and Output

6.12.8 Line Oriented and Delimited Text
---------------------------------------

The delimited-I/O module can be accessed with:

     (use-modules (ice-9 rdelim))

   It can be used to read or write lines of text, or read text delimited
by a specified set of characters.

 -- Scheme Procedure: read-line [port] [handle-delim]
     Return a line of text from PORT if specified, otherwise from the
     value returned by ‘(current-input-port)’.  Under Unix, a line of
     text is terminated by the first end-of-line character or by
     end-of-file.

     If HANDLE-DELIM is specified, it should be one of the following
     symbols:
     ‘trim’
          Discard the terminating delimiter.  This is the default, but
          it will be impossible to tell whether the read terminated with
          a delimiter or end-of-file.
     ‘concat’
          Append the terminating delimiter (if any) to the returned
          string.
     ‘peek’
          Push the terminating delimiter (if any) back on to the port.
     ‘split’
          Return a pair containing the string read from the port and the
          terminating delimiter or end-of-file object.

 -- Scheme Procedure: read-line! buf [port]
     Read a line of text into the supplied string BUF and return the
     number of characters added to BUF.  If BUF is filled, then ‘#f’ is
     returned.  Read from PORT if specified, otherwise from the value
     returned by ‘(current-input-port)’.

 -- Scheme Procedure: read-delimited delims [port] [handle-delim]
     Read text until one of the characters in the string DELIMS is found
     or end-of-file is reached.  Read from PORT if supplied, otherwise
     from the value returned by ‘(current-input-port)’.  HANDLE-DELIM
     takes the same values as described for ‘read-line’.

 -- Scheme Procedure: read-delimited! delims buf [port] [handle-delim]
          [start] [end]
     Read text into the supplied string BUF.

     If a delimiter was found, return the number of characters written,
     except if HANDLE-DELIM is ‘split’, in which case the return value
     is a pair, as noted above.

     As a special case, if PORT was already at end-of-stream, the EOF
     object is returned.  Also, if no characters were written because
     the buffer was full, ‘#f’ is returned.

     It’s something of a wacky interface, to be honest.

 -- Scheme Procedure: %read-delimited! delims str gobble [port [start
          [end]]]
 -- C Function: scm_read_delimited_x (delims, str, gobble, port, start,
          end)
     Read characters from PORT into STR until one of the characters in
     the DELIMS string is encountered.  If GOBBLE is true, discard the
     delimiter character; otherwise, leave it in the input stream for
     the next read.  If PORT is not specified, use the value of
     ‘(current-input-port)’.  If START or END are specified, store data
     only into the substring of STR bounded by START and END (which
     default to the beginning and end of the string, respectively).

     Return a pair consisting of the delimiter that terminated the
     string and the number of characters read.  If reading stopped at
     the end of file, the delimiter returned is the EOF-OBJECT; if the
     string was filled without encountering a delimiter, this value is
     ‘#f’.

 -- Scheme Procedure: %read-line [port]
 -- C Function: scm_read_line (port)
     Read a newline-terminated line from PORT, allocating storage as
     necessary.  The newline terminator (if any) is removed from the
     string, and a pair consisting of the line and its delimiter is
     returned.  The delimiter may be either a newline or the EOF-OBJECT;
     if ‘%read-line’ is called at the end of file, it returns the pair
     ‘(#<eof> . #<eof>)’.

 -- Scheme Procedure: write-line obj [port]
 -- C Function: scm_write_line (obj, port)
     Display OBJ and a newline character to PORT.  If PORT is not
     specified, ‘(current-output-port)’ is used.  This procedure is
     equivalent to:
          (display obj [port])
          (newline [port])


File: guile.info,  Node: Default Ports,  Next: Port Types,  Prev: Line/Delimited,  Up: Input and Output

6.12.9 Default Ports for Input, Output and Errors
-------------------------------------------------

 -- Scheme Procedure: current-input-port
 -- C Function: scm_current_input_port ()
     Return the current input port.  This is the default port used by
     many input procedures.

     Initially this is the “standard input” in Unix and C terminology.
     When the standard input is a tty the port is unbuffered, otherwise
     it’s fully buffered.

     Unbuffered input is good if an application runs an interactive
     subprocess, since any type-ahead input won’t go into Guile’s buffer
     and be unavailable to the subprocess.

     Note that Guile buffering is completely separate from the tty “line
     discipline”.  In the usual cooked mode on a tty Guile only sees a
     line of input once the user presses <Return>.
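
   As a rough illustration of how input procedures fall back on the
current input port, the sketch below rebinds it to a string port and
calls ‘read-line’ from ‘(ice-9 rdelim)’ with and without an explicit
PORT.  The name ‘lines’ and the sample text are only illustrative.

```scheme
(use-modules (ice-9 rdelim))

;; Rebind the current input port to a string port for the duration
;; of the thunk, then read from it three times.
(define lines
  (with-input-from-string "first\nsecond"
    (lambda ()
      (let* ((a (read-line))                             ; default port, 'trim
             (b (read-line (current-input-port) 'split)) ; (line . delimiter)
             (c (read-line)))                            ; nothing left to read
        (list a b c)))))

;; `lines' holds "first", then a pair of "second" and the end-of-file
;; object, then the end-of-file object itself.
```

   The second call uses ‘split’ so that the caller can tell the line
ended at end-of-file rather than at a newline.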

 -- Scheme Procedure: current-output-port
 -- C Function: scm_current_output_port ()
     Return the current output port.  This is the default port used by
     many output procedures.

     Initially this is the “standard output” in Unix and C terminology.
     When the standard output is a tty this port is unbuffered,
     otherwise it’s fully buffered.

     Unbuffered output to a tty is good for ensuring progress output or
     a prompt is seen.  But an application which always prints whole
     lines could change to line buffered, or an application with a lot
     of output could go fully buffered and perhaps make explicit
     ‘force-output’ calls (*note Buffering::) at selected points.

 -- Scheme Procedure: current-error-port
 -- C Function: scm_current_error_port ()
     Return the port to which errors and warnings should be sent.

     Initially this is the “standard error” in Unix and C terminology.
     When the standard error is a tty this port is unbuffered, otherwise
     it’s fully buffered.

 -- Scheme Procedure: set-current-input-port port
 -- Scheme Procedure: set-current-output-port port
 -- Scheme Procedure: set-current-error-port port
 -- C Function: scm_set_current_input_port (port)
 -- C Function: scm_set_current_output_port (port)
 -- C Function: scm_set_current_error_port (port)
     Change the ports returned by ‘current-input-port’,
     ‘current-output-port’ and ‘current-error-port’, respectively, so
     that they use the supplied PORT for input or output.

 -- Scheme Procedure: with-input-from-port port thunk
 -- Scheme Procedure: with-output-to-port port thunk
 -- Scheme Procedure: with-error-to-port port thunk
     Call THUNK in a dynamic environment in which ‘current-input-port’,
     ‘current-output-port’ or ‘current-error-port’ is rebound to the
     given PORT.
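
   A minimal sketch of the rebinding described above: output written
inside the thunk goes to a string port instead of the original current
output port.  The names ‘original-port’, ‘captured’ and ‘restored?’ are
made up for this example.

```scheme
;; Remember which port was current before rebinding.
(define original-port (current-output-port))

;; Collect everything the thunk prints into a string.
(define captured
  (call-with-output-string
    (lambda (string-port)
      (with-output-to-port string-port
        (lambda ()
          (display "hello")
          (newline))))))

;; Once the thunk has returned, the previous binding is back in place.
(define restored? (eq? original-port (current-output-port)))
```

   The same shape applies to ‘with-input-from-port’ and
‘with-error-to-port’.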

 -- C Function: void scm_dynwind_current_input_port (SCM port)
 -- C Function: void scm_dynwind_current_output_port (SCM port)
 -- C Function: void scm_dynwind_current_error_port (SCM port)
     These functions must be used inside a pair of calls to
     ‘scm_dynwind_begin’ and ‘scm_dynwind_end’ (*note Dynamic Wind::).
     During the dynwind context, the indicated port is set to PORT.

     More precisely, the current port is swapped with a ‘backup’ value
     whenever the dynwind context is entered or left.  The backup value
     is initialized with the PORT argument.


File: guile.info,  Node: Port Types,  Next: Venerable Port Interfaces,  Prev: Default Ports,  Up: Input and Output

6.12.10 Types of Port
---------------------

* Menu:

* File Ports::                  Ports on an operating system file.
* Bytevector Ports::            Ports on a bytevector.
* String Ports::                Ports on a Scheme string.
* Custom Ports::                Ports whose implementation you control.
* Soft Ports::                  An older version of custom ports.
* Void Ports::                  Ports on nothing at all.


File: guile.info,  Node: File Ports,  Next: Bytevector Ports,  Up: Port Types

6.12.10.1 File Ports
....................

The following procedures are used to open file ports.  See also *note
open: Ports and File Descriptors, for an interface to the Unix ‘open’
system call.

   All file access uses the “LFS” large file support functions when
available, so files bigger than 2 Gbytes (2^31 bytes) can be read and
written on a 32-bit system.

   Most systems have limits on how many files can be open, so it’s
strongly recommended that file ports be closed explicitly when no longer
required (*note Ports::).
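
   The recommendation to close file ports explicitly might look like
this in practice.  This is only a sketch: the scratch path
‘/tmp/guile-port-demo.txt’ is a hypothetical name chosen for the
example, and ‘open-output-file’ and ‘open-input-file’ are the
procedures documented in this node.

```scheme
(use-modules (ice-9 rdelim))

;; Hypothetical scratch file used only for this illustration.
(define demo-file "/tmp/guile-port-demo.txt")

;; Write one line, then release the descriptor straight away.
(define out (open-output-file demo-file))
(write-line "first line" out)
(close-port out)

;; Read the line back and close the input port as well.
(define in (open-input-file demo-file))
(define first-line (read-line in))
(close-port in)

;; Both ports now report themselves as closed.
(define both-closed? (and (port-closed? out) (port-closed? in)))

(delete-file demo-file)
```

   Closing each port as soon as it is no longer needed keeps the number
of open descriptors bounded instead of waiting for the garbage
collector.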

 -- Scheme Procedure: open-file filename mode [#:guess-encoding=#f]
          [#:encoding=#f]
 -- C Function: scm_open_file_with_encoding (filename, mode,
          guess_encoding, encoding)
 -- C Function: scm_open_file (filename, mode)
     Open the file whose name is FILENAME, and return a port
     representing that file.  The attributes of the port are determined
     by the MODE string.  The way in which this is interpreted is
     similar to C stdio.  The first character must be one of the
     following:

     ‘r’
          Open an existing file for input.
     ‘w’
          Open a file for output, creating it if it doesn’t already
          exist or removing its contents if it does.
     ‘a’
          Open a file for output, creating it if it doesn’t already
          exist.  All writes to the port will go to the end of the file.
          The "append mode" can be turned off while the port is in use
          (*note fcntl: Ports and File Descriptors.).

     The following additional characters can be appended:

     ‘+’
          Open the port for both input and output.  E.g., ‘r+’: open an
          existing file for both input and output.
     ‘0’
          Create an "unbuffered" port.  In this case input and output
          operations are passed directly to the underlying port
          implementation without additional buffering.  This is likely
          to slow down I/O operations.  The buffering mode can be
          changed while a port is in use (*note Buffering::).
     ‘l’
          Add line-buffering to the port.  The port output buffer will
          be automatically flushed whenever a newline character is
          written.
     ‘b’
          Use binary mode, ensuring that each byte in the file will be
          read as one Scheme character.

          To provide this property, the file will be opened with the
          8-bit character encoding "ISO-8859-1", ignoring the default
          port encoding.  *Note Ports::, for more information on port
          encodings.

          Note that while it is possible to read and write binary data
          as characters or strings, it is usually better to treat bytes
          as octets, and byte sequences as bytevectors.  *Note Binary
          I/O::, for more.

          This option had another historical meaning, for DOS
          compatibility: in the default (textual) mode, DOS reads a
          CR-LF sequence as one LF byte.  The ‘b’ flag prevents this
          from happening, adding ‘O_BINARY’ to the underlying ‘open’
          call.  Still, the flag is generally useful because of its port
          encoding ramifications.

     Unless binary mode is requested, the character encoding of the new
     port is determined as follows: First, if GUESS-ENCODING is true,
     the ‘file-encoding’ procedure is used to guess the encoding of the
     file (*note Character Encoding of Source Files::).  If
     GUESS-ENCODING is false or if ‘file-encoding’ fails, ENCODING is
     used unless it is also false.  As a last resort, the default port
     encoding is used.  *Note Ports::, for more information on port
     encodings.  It is an error to pass a non-false GUESS-ENCODING or
     ENCODING if binary mode is requested.

     If a file cannot be opened with the access requested, ‘open-file’
     throws an exception.

 -- Scheme Procedure: open-input-file filename [#:guess-encoding=#f]
          [#:encoding=#f] [#:binary=#f]

     Open FILENAME for input.  If BINARY is true, open the port in
     binary mode, otherwise use text mode.  ENCODING and GUESS-ENCODING
     determine the character encoding as described above for
     ‘open-file’.  Equivalent to
          (open-file FILENAME
                     (if BINARY "rb" "r")
                     #:guess-encoding GUESS-ENCODING
                     #:encoding ENCODING)

 -- Scheme Procedure: open-output-file filename [#:encoding=#f]
          [#:binary=#f]

     Open FILENAME for output.  If BINARY is true, open the port in
     binary mode, otherwise use text mode.  ENCODING specifies the
     character encoding as described above for ‘open-file’.
     Equivalent to
          (open-file FILENAME
                     (if BINARY "wb" "w")
                     #:encoding ENCODING)

 -- Scheme Procedure: call-with-input-file filename proc
          [#:guess-encoding=#f] [#:encoding=#f] [#:binary=#f]
 -- Scheme Procedure: call-with-output-file filename proc
          [#:encoding=#f] [#:binary=#f]
     Open FILENAME for input or output, and call ‘(PROC port)’ with the
     resulting port.  Return the value returned by PROC.  FILENAME is
     opened as per ‘open-input-file’ or ‘open-output-file’ respectively,
     and an error is signaled if it cannot be opened.

     When PROC returns, the port is closed.  If PROC does not return
     (e.g. if it throws an error), then the port might not be closed
     automatically, though it will be garbage collected in the usual way
     if not otherwise referenced.

 -- Scheme Procedure: with-input-from-file filename thunk
          [#:guess-encoding=#f] [#:encoding=#f] [#:binary=#f]
 -- Scheme Procedure: with-output-to-file filename thunk [#:encoding=#f]
          [#:binary=#f]
 -- Scheme Procedure: with-error-to-file filename thunk [#:encoding=#f]
          [#:binary=#f]
     Open FILENAME and call ‘(THUNK)’ with the new port set up as
     respectively the ‘current-input-port’, ‘current-output-port’, or
     ‘current-error-port’.  Return the value returned by THUNK.
     FILENAME is opened as per ‘open-input-file’ or ‘open-output-file’
     respectively, and an error is signaled if it cannot be opened.

     When THUNK returns, the port is closed and the previous setting of
     the respective current port is restored.

     The current port setting is managed with ‘dynamic-wind’, so the
     previous value is restored no matter how THUNK exits (e.g. an
     exception), and if THUNK is re-entered (via a captured
     continuation) then it’s set again to the FILENAME port.

     The port is closed when THUNK returns normally, but not when exited
     via an exception or new continuation.
     This ensures it’s still ready for use if THUNK is re-entered by a
     captured continuation.  Of course the port is always garbage
     collected and closed in the usual way when no longer referenced
     anywhere.

 -- Scheme Procedure: port-mode port
 -- C Function: scm_port_mode (port)
     Return the port modes associated with the open port PORT.  These
     will not necessarily be identical to the modes used when the port
     was opened, since modes such as "append" which are used only during
     port creation are not retained.

 -- Scheme Procedure: port-filename port
 -- C Function: scm_port_filename (port)
     Return the filename associated with PORT, or ‘#f’ if no filename is
     associated with the port.

     PORT must be open; ‘port-filename’ cannot be used once the port is
     closed.

 -- Scheme Procedure: set-port-filename! port filename
 -- C Function: scm_set_port_filename_x (port, filename)
     Change the filename associated with PORT, using the current input
     port if none is specified.  Note that this does not change the
     port’s source of data, but only the value that is returned by
     ‘port-filename’ and reported in diagnostic output.

 -- Scheme Procedure: file-port? obj
 -- C Function: scm_file_port_p (obj)
     Determine whether OBJ is a port that is related to a file.


File: guile.info,  Node: Bytevector Ports,  Next: String Ports,  Prev: File Ports,  Up: Port Types

6.12.10.2 Bytevector Ports
..........................

 -- Scheme Procedure: open-bytevector-input-port bv [transcoder]
 -- C Function: scm_open_bytevector_input_port (bv, transcoder)
     Return an input port whose contents are drawn from bytevector BV
     (*note Bytevectors::).

     The TRANSCODER argument is currently not supported.
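
   As a small sketch of reading from such a port, the example below
pulls bytes out of a four-byte bytevector with ‘get-u8’ and
‘get-bytevector-n’ from ‘(ice-9 binary-ports)’.  The variable names are
illustrative only.

```scheme
(use-modules (ice-9 binary-ports)
             (rnrs bytevectors))

;; An input port whose contents are the bytes 1 2 3 4.
(define in (open-bytevector-input-port (u8-list->bytevector '(1 2 3 4))))

;; Read a single byte, then ask for more bytes than remain.
(define first-byte (get-u8 in))
(define remaining (get-bytevector-n in 10))

;; Every byte has been consumed, so the next read yields end-of-file.
(define exhausted? (eof-object? (get-u8 in)))
```

   Asking for ten bytes returns only the three that are left, which is
how a short read shows up on a binary port.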

 -- Scheme Procedure: open-bytevector-output-port [transcoder]
 -- C Function: scm_open_bytevector_output_port (transcoder)
     Return two values: a binary output port and a procedure.  The
     latter should be called with zero arguments to obtain a bytevector
     containing the data accumulated by the port, as illustrated below.

          (call-with-values
            (lambda ()
              (open-bytevector-output-port))
            (lambda (port get-bytevector)
              (display "hello" port)
              (get-bytevector)))

          ⇒ #vu8(104 101 108 108 111)

     The TRANSCODER argument is currently not supported.

 -- Scheme Procedure: call-with-output-bytevector proc
     Call the one-argument procedure PROC with a newly created
     bytevector output port.  When the function returns, the bytevector
     composed of the bytes written into the port is returned.  PROC
     should not close the port.

 -- Scheme Procedure: call-with-input-bytevector bytevector proc
     Call the one-argument procedure PROC with a newly created input
     port from which BYTEVECTOR’s contents may be read.  The values
     yielded by PROC are returned.


File: guile.info,  Node: String Ports,  Next: Custom Ports,  Prev: Bytevector Ports,  Up: Port Types

6.12.10.3 String Ports
......................

 -- Scheme Procedure: call-with-output-string proc
 -- C Function: scm_call_with_output_string (proc)
     Calls the one-argument procedure PROC with a newly created output
     port.  When the function returns, the string composed of the
     characters written into the port is returned.  PROC should not
     close the port.

 -- Scheme Procedure: call-with-input-string string proc
 -- C Function: scm_call_with_input_string (string, proc)
     Calls the one-argument procedure PROC with a newly created input
     port from which STRING’s contents may be read.  The value yielded
     by PROC is returned.
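
   The two procedures above pair naturally; the following sketch writes
a datum into a string and then reads it back.  The names ‘text’ and
‘datum’ are chosen for this example only.

```scheme
;; Build a string by writing to a fresh output string port.
(define text
  (call-with-output-string
    (lambda (port)
      (write '(1 2 3) port)
      (display " tail" port))))

;; Parse the first datum back out of that string; `read' is called
;; with the newly created input port as its only argument.
(define datum (call-with-input-string text read))
```

   Passing ‘read’ directly works because it accepts the port as its
single argument, matching what ‘call-with-input-string’ supplies.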

 -- Scheme Procedure: with-output-to-string thunk
     Calls the zero-argument procedure THUNK with the current output
     port set temporarily to a new string port.  It returns a string
     composed of the characters written to the current output.

 -- Scheme Procedure: with-input-from-string string thunk
     Calls the zero-argument procedure THUNK with the current input port
     set temporarily to a string port opened on the specified STRING.
     The value yielded by THUNK is returned.

 -- Scheme Procedure: open-input-string str
 -- C Function: scm_open_input_string (str)
     Take a string and return an input port that delivers characters
     from the string.  The port can be closed by ‘close-input-port’,
     though its storage will be reclaimed by the garbage collector if it
     becomes inaccessible.

 -- Scheme Procedure: open-output-string
 -- C Function: scm_open_output_string ()
     Return an output port that will accumulate characters for retrieval
     by ‘get-output-string’.  The port can be closed by the procedure
     ‘close-output-port’, though its storage will be reclaimed by the
     garbage collector if it becomes inaccessible.

 -- Scheme Procedure: get-output-string port
 -- C Function: scm_get_output_string (port)
     Given an output port created by ‘open-output-string’, return a
     string consisting of the characters that have been output to the
     port so far.

     ‘get-output-string’ must be used before closing PORT; once closed,
     the string cannot be obtained.

   With string ports, the port encoding is treated differently than for
other types of ports.  When string ports are created, they do not
inherit a character encoding from the current locale.  They are given a
default encoding that allows them to handle all valid string characters.
Typically one should not modify a string port’s character encoding away
from its default.  *Note Encoding::.
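
   A short sketch tying the string port procedures of this node
together.  All variable names are illustrative; note that
‘get-output-string’ is called before the output port is closed, as
required above.

```scheme
;; Accumulate output and take snapshots of it along the way.
(define out (open-output-string))
(display "abc" out)
(define partial (get-output-string out))
(write 42 out)
(define full (get-output-string out))
(close-port out)

;; Deliver the characters of a string one at a time.
(define in (open-input-string "x y"))
(define chars (list (read-char in) (read-char in) (read-char in)))
(close-port in)

;; The same ideas, with the current ports rebound for a thunk.
(define shout
  (with-output-to-string
    (lambda ()
      (display "hi")
      (display 5))))

(define sym (with-input-from-string "foo bar" read))
```

   Each call to ‘get-output-string’ returns everything written so far,
so ‘full’ includes the characters already seen in ‘partial’.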


File: guile.info,  Node: Custom Ports,  Next: Soft Ports,  Prev: String Ports,  Up: Port Types

6.12.10.4 Custom Ports
......................

Custom ports allow the user to provide input and handle output via
user-supplied procedures.  Guile currently only provides custom binary
ports, not textual ports; for custom textual ports, see *note Soft
Ports::.  We should add the R6RS custom textual port interfaces though.
Contributions are appreciated.

 -- Scheme Procedure: make-custom-binary-input-port id read!
          get-position set-position! close
     Return a new custom binary input port(1) named ID (a string) whose
     input is drained by invoking READ! and passing it a bytevector, an
     index where bytes should be written, and the number of bytes to
     read.  The ‘read!’ procedure must return an integer indicating the
     number of bytes read, or ‘0’ to indicate the end-of-file.

     Optionally, if GET-POSITION is not ‘#f’, it must be a thunk that
     will be called when ‘port-position’ is invoked on the custom binary
     port and should return an integer indicating the position within
     the underlying data stream; if GET-POSITION was not supplied, the
     returned port does not support ‘port-position’.

     Likewise, if SET-POSITION! is not ‘#f’, it should be a one-argument
     procedure.  When ‘set-port-position!’ is invoked on the custom
     binary input port, SET-POSITION! is passed an integer indicating
     the position of the next byte to read.

     Finally, if CLOSE is not ‘#f’, it must be a thunk.  It is invoked
     when the custom binary input port is closed.

     The returned port is fully buffered by default, but its buffering
     mode can be changed using ‘setvbuf’ (*note Buffering::).

     Using a custom binary input port, the ‘open-bytevector-input-port’
     procedure (*note Bytevector Ports::) could be implemented as
     follows:

          (define (open-bytevector-input-port source)
            (define position 0)
            (define length (bytevector-length source))

            (define (read! bv start count)
              (let ((count (min count (- length position))))
                (bytevector-copy! source position
                                  bv start count)
                (set! position (+ position count))
                count))

            (define (get-position) position)

            (define (set-position! new-position)
              (set! position new-position))

            (make-custom-binary-input-port "the port" read!
                                           get-position set-position!
                                           #f))

          (read (open-bytevector-input-port (string->utf8 "hello")))
          ⇒ hello

 -- Scheme Procedure: make-custom-binary-output-port id write!
          get-position set-position! close
     Return a new custom binary output port named ID (a string) whose
     output is sunk by invoking WRITE! and passing it a bytevector, an
     index where bytes should be read from this bytevector, and the
     number of bytes to be “written”.  The ‘write!’ procedure must
     return an integer indicating the number of bytes actually written;
     when it is passed ‘0’ as the number of bytes to write, it should
     behave as though an end-of-file was sent to the byte sink.

     The other arguments are as for ‘make-custom-binary-input-port’.

 -- Scheme Procedure: make-custom-binary-input/output-port id read!
          write! get-position set-position! close
     Return a new custom binary input/output port named ID (a string).
     The arguments are the same as for ‘make-custom-binary-input-port’
     and ‘make-custom-binary-output-port’.  If buffering is enabled on
     the port, as is the case by default, data will be buffered in both
     directions; *Note Buffering::.  If the SET-POSITION!
     function is provided and not ‘#f’, then the port will also be
     marked as random-access, causing the buffer to be flushed between
     reads and writes.

   ---------- Footnotes ----------

   (1) This is similar in spirit to Guile’s “soft ports” (*note Soft
Ports::).


File: guile.info,  Node: Soft Ports,  Next: Void Ports,  Prev: Custom Ports,  Up: Port Types

6.12.10.5 Soft Ports
....................

A “soft port” is a port based on a vector of procedures capable of
accepting or delivering characters.  It allows emulation of I/O ports.

 -- Scheme Procedure: make-soft-port pv modes
     Return a port capable of receiving or delivering characters as
     specified by the MODES string (*note open-file: File Ports.).  PV
     must be a vector of length 5 or 6.  Its components are as follows:

       0. procedure accepting one character for output
       1. procedure accepting a string for output
       2. thunk for flushing output
       3. thunk for getting one character
       4. thunk for closing port (not by garbage collection)
       5. (if present and not ‘#f’) thunk for computing the number of
          characters that can be read from the port without blocking.

     For an output-only port only elements 0, 1, 2, and 4 need be
     procedures.  For an input-only port only elements 3 and 4 need be
     procedures.  Thunks 2 and 4 can instead be ‘#f’ if there is no
     useful operation for them to perform.

     If thunk 3 returns ‘#f’ or an ‘eof-object’ (*note eof-object?:
     (r5rs)Input.) it indicates that the port has reached end-of-file.
     For example:

          (define stdout (current-output-port))
          (define p (make-soft-port
                     (vector
                      (lambda (c) (write c stdout))
                      (lambda (s) (display s stdout))
                      (lambda () (display "."
                                          stdout))
                      (lambda () (char-upcase (read-char)))
                      (lambda () (display "@" stdout)))
                     "rw"))

          (write p p) ⇒ #<input-output: soft 8081e20>


File: guile.info,  Node: Void Ports,  Prev: Soft Ports,  Up: Port Types

6.12.10.6 Void Ports
....................

This kind of port causes any data to be discarded when written to, and
always returns the end-of-file object when read from.

 -- Scheme Procedure: %make-void-port mode
 -- C Function: scm_sys_make_void_port (mode)
     Create and return a new void port.  A void port acts like
     ‘/dev/null’.  The MODE argument specifies the input/output modes
     for this port: see the documentation for ‘open-file’ in *note File
     Ports::.


File: guile.info,  Node: Venerable Port Interfaces,  Next: Using Ports from C,  Prev: Port Types,  Up: Input and Output

6.12.11 Venerable Port Interfaces
---------------------------------

Over the 25 years or so that Guile has been around, its port system has
evolved, adding many useful features.  At the same time there have been
four major Scheme standards released in those 25 years, which also
evolve the common Scheme understanding of what a port interface should
be.  Alas, it would be too much to ask for all of these evolutionary
branches to be consistent.  Some of Guile’s original interfaces don’t
mesh with the later Scheme standards, and yet Guile can’t just drop old
interfaces.  Sadly as well, the R6RS and R7RS standards both part from a
base of R5RS, but end up in different and somewhat incompatible designs.

   Guile’s approach is to pick a set of port primitives that make sense
together.  We document that set of primitives, design our internal
interfaces around them, and recommend them to users.
As the R6RS I/O system is the most capable standard that Scheme has yet
produced in this domain, we mostly recommend that; ‘(ice-9
binary-ports)’ and ‘(ice-9 textual-ports)’ are wholly modelled on ‘(rnrs
io ports)’.  Guile does not wholly copy R6RS, however; *Note R6RS
Incompatibilities::.

   At the same time, we have many venerable port interfaces, lore handed
down to us from our hacker ancestors.  Most of these interfaces even
predate the expectation that Scheme should have modules, so they are
present in the default environment.  In Guile we support them as well
and we have no plans to remove them, but again we don’t recommend them
for new users.

 -- Scheme Procedure: char-ready? [port]
     Return ‘#t’ if a character is ready on input PORT and return ‘#f’
     otherwise.  If ‘char-ready?’ returns ‘#t’ then the next ‘read-char’
     operation on PORT is guaranteed not to hang.  If PORT is a file
     port at end of file then ‘char-ready?’ returns ‘#t’.

     ‘char-ready?’ exists to make it possible for a program to accept
     characters from interactive ports without getting stuck waiting for
     input.  Any input editors associated with such ports must make sure
     that characters whose existence has been asserted by ‘char-ready?’
     cannot be rubbed out.  If ‘char-ready?’ were to return ‘#f’ at end
     of file, a port at end of file would be indistinguishable from an
     interactive port that has no ready characters.

     Note that ‘char-ready?’ only works reliably for terminals and
     sockets with one-byte encodings.  Under the hood it will return
     ‘#t’ if the port has any input buffered, or if the file descriptor
     that backs the port polls as readable, indicating that Guile can
     fetch more bytes from the kernel.  However being able to fetch one
     byte doesn’t mean that a full character is available; *Note
     Encoding::.
Also, on many systems it’s possible for a file 734 descriptor to poll as readable, but then block when it comes time 735 to read bytes. Note also that on Linux kernels, all file ports 736 backed by files always poll as readable. For non-file ports, this 737 procedure always returns ‘#t’, except for soft ports, which have a 738 ‘char-ready?’ handler. *Note Soft Ports::. 739 740 In short, this is a legacy procedure whose semantics are hard to 741 provide. However it is a useful check to see if any input is 742 buffered. *Note Non-Blocking I/O::. 743 744 -- Scheme Procedure: read-char [port] 745 The same as ‘get-char’, except that PORT defaults to the current 746 input port. *Note Textual I/O::. 747 748 -- Scheme Procedure: peek-char [port] 749 The same as ‘lookahead-char’, except that PORT defaults to the 750 current input port. *Note Textual I/O::. 751 752 -- Scheme Procedure: unread-char cobj [port] 753 The same as ‘unget-char’, except that PORT defaults to the current 754 input port, and the arguments are swapped. *Note Textual I/O::. 755 756 -- Scheme Procedure: unread-string str port 757 -- C Function: scm_unread_string (str, port) 758 The same as ‘unget-string’, except that PORT defaults to the 759 current input port, and the arguments are swapped. *Note Textual 760 I/O::. 761 762 -- Scheme Procedure: newline [port] 763 Send a newline to PORT. If PORT is omitted, send to the current 764 output port. Equivalent to ‘(put-char port #\newline)’. 765 766 -- Scheme Procedure: write-char chr [port] 767 The same as ‘put-char’, except that PORT defaults to the current 768 output port, and the arguments are swapped. *Note Textual I/O::. 769 770 771File: guile.info, Node: Using Ports from C, Next: I/O Extensions, Prev: Venerable Port Interfaces, Up: Input and Output 772 7736.12.12 Using Ports from C 774-------------------------- 775 776Guile’s C interface provides some niceties for sending and receiving 777bytes and characters in a way that works better with C.
778 779 -- C Function: size_t scm_c_read (SCM port, void *buffer, size_t size) 780 Read up to SIZE bytes from PORT and store them in BUFFER. The 781 return value is the number of bytes actually read, which can be 782 less than SIZE if end-of-file has been reached. 783 784 Note that as this is a binary input procedure, this function does 785 not update ‘port-line’ and ‘port-column’ (*note Textual I/O::). 786 787 -- C Function: void scm_c_write (SCM port, const void *buffer, size_t 788 size) 789 Write SIZE bytes at BUFFER to PORT. 790 791 Note that as this is a binary output procedure, this function does 792 not update ‘port-line’ and ‘port-column’ (*note Textual I/O::). 793 794 -- C Function: size_t scm_c_read_bytes (SCM port, SCM bv, size_t start, 795 size_t count) 796 -- C Function: void scm_c_write_bytes (SCM port, SCM bv, size_t start, 797 size_t count) 798 Like ‘scm_c_read’ and ‘scm_c_write’, but reading into or writing 799 from the bytevector BV. START indicates the byte index at which to 800 start in the bytevector, and the read or write will continue for 801 COUNT bytes. 802 803 -- C Function: void scm_unget_bytes (const unsigned char *buf, size_t 804 len, SCM port) 805 -- C Function: void scm_unget_byte (int c, SCM port) 806 -- C Function: void scm_ungetc (scm_t_wchar c, SCM port) 807 Like ‘unget-bytevector’, ‘unget-byte’, and ‘unget-char’, 808 respectively. *Note Textual I/O::. 809 810 -- C Function: void scm_c_put_latin1_chars (SCM port, const scm_t_uint8 811 *buf, size_t len) 812 -- C Function: void scm_c_put_utf32_chars (SCM port, const scm_t_uint32 813 *buf, size_t len); 814 Write a string to PORT. In the first case, the ‘scm_t_uint8*’ 815 buffer is a string in the latin-1 encoding. In the second, the 816 ‘scm_t_uint32*’ buffer is a string in the UTF-32 encoding. These 817 routines will update the port’s line and column.
818 819 820File: guile.info, Node: I/O Extensions, Next: Non-Blocking I/O, Prev: Using Ports from C, Up: Input and Output 821 8226.12.13 Implementing New Port Types in C 823---------------------------------------- 824 825This section describes how to implement a new port type in C. Although 826ports support many operations, as a data structure they present an 827opaque interface to the user. As a port implementor, you have two 828pieces of information to work with: the port type, and the port’s 829“stream”. The port type is an opaque pointer allocated when defining 830your port type. It is your key into the port API, and it helps you 831identify which ports are actually yours. The “stream” is a pointer you 832control, and which you set when you create a port. Get a stream from a 833port using the ‘SCM_STREAM’ macro. Note that your port methods are only 834ever called with ports of your type. 835 836 A port type is created by calling ‘scm_make_port_type’. Once you 837have your port type, you can create ports with ‘scm_c_make_port’, or 838‘scm_c_make_port_with_encoding’. 839 840 -- Function: scm_t_port_type* scm_make_port_type (char *name, size_t 841 (*read) (SCM port, SCM dst, size_t start, size_t count), 842 size_t (*write) (SCM port, SCM src, size_t start, size_t 843 count)) 844 Define a new port type. The NAME, READ and WRITE parameters are 845 initial values for those port type fields, as described below. The 846 other fields are initialized with default values and can be changed 847 later. 848 849 -- Function: SCM scm_c_make_port_with_encoding (scm_t_port_type *type, 850 unsigned long mode_bits, SCM encoding, SCM 851 conversion_strategy, scm_t_bits stream) 852 -- Function: SCM scm_c_make_port (scm_t_port_type *type, unsigned long 853 mode_bits, scm_t_bits stream) 854 Make a port with the given TYPE. The STREAM indicates the private 855 data associated with the port, which your port implementation may 856 later retrieve with ‘SCM_STREAM’.
The mode bits should include one 857 or more of the flags ‘SCM_RDNG’ or ‘SCM_WRTNG’, indicating that the 858 port is an input and/or an output port, respectively. The mode 859 bits may also include ‘SCM_BUF0’ or ‘SCM_BUFLINE’, indicating that 860 the port should be unbuffered or line-buffered, respectively. The 861 default is that the port will be block-buffered. *Note 862 Buffering::. 863 864 As you would imagine, ENCODING and CONVERSION_STRATEGY specify the 865 port’s initial textual encoding and conversion strategy. Both are 866 symbols. ‘scm_c_make_port’ is the same as 867 ‘scm_c_make_port_with_encoding’, except it uses the default port 868 encoding and conversion strategy. 869 870 The port type has a number of associated procedures and properties 871which collectively implement the port’s behavior. Creating a new port 872type mostly involves writing these procedures. 873 874‘name’ 875 A pointer to a NUL terminated string: the name of the port type. 876 This property is initialized via the first argument to 877 ‘scm_make_port_type’. 878 879‘read’ 880 A port’s ‘read’ implementation fills read buffers. It should copy 881 bytes to the supplied bytevector ‘dst’, starting at offset ‘start’ 882 and continuing for ‘count’ bytes, returning the number of bytes 883 read. 884 885‘write’ 886 A port’s ‘write’ implementation flushes write buffers to the 887 mutable store. It should write out bytes from the supplied 888 bytevector ‘src’, starting at offset ‘start’ and continuing for 889 ‘count’ bytes, and return the number of bytes that were written. 890 891‘read_wait_fd’ 892‘write_wait_fd’ 893 If a port’s ‘read’ or ‘write’ function returns ‘(size_t) -1’, that 894 indicates that reading or writing would block. In that case to 895 preserve the illusion of a blocking read or write operation, 896 Guile’s C port run-time will ‘poll’ on the file descriptor returned 897 by either the port’s ‘read_wait_fd’ or ‘write_wait_fd’ function.
898 Set using 899 900 -- Function: void scm_set_port_read_wait_fd (scm_t_port_type 901 *type, int (*wait_fd) (SCM port)) 902 -- Function: void scm_set_port_write_wait_fd (scm_t_port_type 903 *type, int (*wait_fd) (SCM port)) 904 905 Only a port type which implements the ‘read_wait_fd’ or 906 ‘write_wait_fd’ port methods can usefully return ‘(size_t) -1’ from 907 a read or write function. *Note Non-Blocking I/O::, for more on 908 non-blocking I/O in Guile. 909 910‘print’ 911 Called when ‘write’ is called on the port, to print a port 912 description. For example, for a file port it may produce something 913 like: ‘#<input: /etc/passwd 3>’. Set using 914 915 -- Function: void scm_set_port_print (scm_t_port_type *type, int 916 (*print) (SCM port, SCM dest_port, scm_print_state 917 *pstate)) 918 The first argument PORT is the port being printed, the second 919 argument DEST_PORT is where its description should go. 920 921‘close’ 922 Called when the port is closed. It should free any resources used 923 by the port. Set using 924 925 -- Function: void scm_set_port_close (scm_t_port_type *type, void 926 (*close) (SCM port)) 927 928 By default, ports that are garbage collected just go away without 929 closing. If your port type needs to release some external resource 930 like a file descriptor, or needs to make sure that its internal 931 buffers are flushed even if the port is collected while it was 932 open, then mark the port type as needing a close on GC. 933 934 -- Function: void scm_set_port_needs_close_on_gc (scm_t_port_type 935 *type, int needs_close_p) 936 937‘seek’ 938 Set the current position of the port. Guile will flush read and/or 939 write buffers before seeking, as appropriate. 940 941 -- Function: void scm_set_port_seek (scm_t_port_type *type, 942 scm_t_off (*seek) (SCM port, scm_t_off offset, int 943 whence)) 944 945‘truncate’ 946 Truncate the port data to the specified length. Guile will flush 947 buffers beforehand, as appropriate.
Set using 948 949 -- Function: void scm_set_port_truncate (scm_t_port_type *type, 950 void (*truncate) (SCM port, scm_t_off length)) 951 952‘random_access_p’ 953 Determine whether this port is a random-access port. 954 955 Seeking on a random-access port with buffered input, or switching 956 to writing after reading, will cause the buffered input to be 957 discarded and Guile will seek the port back the buffered number of 958 bytes. Likewise seeking on a random-access port with buffered 959 output, or switching to reading after writing, will flush pending 960 bytes with a call to the ‘write’ procedure. *Note Buffering::. 961 962 Indicate to Guile that your port needs this behavior by returning a 963 nonzero value from your ‘random_access_p’ function. The default 964 implementation of this function returns nonzero if the port type 965 supplies a seek implementation. 966 967 -- Function: void scm_set_port_random_access_p (scm_t_port_type 968 *type, int (*random_access_p) (SCM port)); 969 970‘get_natural_buffer_sizes’ 971 Guile will internally attach buffers to ports. An input port 972 always has a read buffer and an output port always has a write 973 buffer. *Note Buffering::. A port buffer consists of a 974 bytevector, along with some cursors into that bytevector denoting 975 where to get and put data. 976 977 Port implementations generally don’t have to be concerned with 978 buffering: a port type’s ‘read’ or ‘write’ function will receive 979 the buffer’s bytevector as an argument, along with an offset and a 980 length into that bytevector, and should then either fill or empty 981 that bytevector. However in some cases, port implementations may 982 be able to provide an appropriate default buffer size to Guile. 
983 984 -- Function: void scm_set_port_get_natural_buffer_sizes 985 (scm_t_port_type *type, void (*get_natural_buffer_sizes) 986 (SCM, size_t *read_buf_size, size_t *write_buf_size)) 987 Fill in READ_BUF_SIZE and WRITE_BUF_SIZE with an appropriate 988 buffer size for this port, if one is known. 989 990 File ports implement a ‘get_natural_buffer_sizes’ to let the 991 operating system inform Guile about the appropriate buffer sizes 992 for the particular file opened by the port. 993 994 Note that calls to all of these methods can proceed in parallel and 995concurrently and from any thread up until the point that the port is 996closed. The call to ‘close’ will happen when no other method is 997running, and no method will be called after the ‘close’ method is 998called. If your port implementation needs mutual exclusion to prevent 999concurrency, it is responsible for locking appropriately. 1000 1001 1002File: guile.info, Node: Non-Blocking I/O, Next: BOM Handling, Prev: I/O Extensions, Up: Input and Output 1003 10046.12.14 Non-Blocking I/O 1005------------------------ 1006 1007Most ports in Guile are “blocking”: when you try to read a character 1008from a port, Guile will block on the read until a character is ready, or 1009end-of-stream is detected. Likewise whenever Guile goes to write 1010(possibly buffered) data to an output port, Guile will block until all 1011the data is written. 1012 1013 Interacting with ports in blocking mode is very convenient: you can 1014write straightforward, sequential algorithms whose code flow reflects 1015the flow of data. However, blocking I/O has two main limitations. 1016 1017 The first is that it’s easy to get into a situation where code is 1018waiting on data. Time spent waiting on data when code could be doing 1019something else is wasteful and prevents your program from reaching its 1020peak throughput. 
If you implement a web server that sequentially 1021handles requests from clients, it’s very easy for the server to end up 1022waiting on a client to finish its HTTP request, or waiting on it to 1023consume the response. The end result is that you are able to serve 1024fewer requests per second than you’d like to serve. 1025 1026 The second limitation is related: a blocking parser over 1027user-controlled input is a denial-of-service vulnerability. Indeed the 1028so-called “slow loris” attack of the early 2010s was just that: an 1029attack on common web servers that drip-fed HTTP requests, one character 1030at a time. All it took was a handful of slow loris connections to 1031occupy an entire web server. 1032 1033 In Guile we would like to preserve the ability to write 1034straightforward blocking networking processes of all kinds, but under 1035the hood to allow those processes to suspend their requests if they 1036would block. 1037 1038 To do this, the first piece is to allow Guile ports to declare 1039themselves as being nonblocking. This is currently supported only for 1040file ports, which also includes sockets, terminals, or any other port 1041that is backed by a file descriptor. To do that, we use an arcane UNIX 1042incantation: 1043 1044 (let ((flags (fcntl socket F_GETFL))) 1045 (fcntl socket F_SETFL (logior O_NONBLOCK flags))) 1046 1047 Now the file descriptor is open in non-blocking mode. If Guile tries 1048to read or write from this file and the read or write returns a result 1049indicating that more data can only be had by doing a blocking read or 1050write, Guile will block by polling on the socket’s ‘read-wait-fd’ or 1051‘write-wait-fd’, to preserve the illusion of a blocking read or write. 1052*Note I/O Extensions:: for more on those internal interfaces. 1053 1054 So far we have just reproduced the status quo: the file descriptor is 1055non-blocking, but the operations on the port do block. 
To go farther, 1056it would be nice if we could suspend the “thread” using delimited 1057continuations, and only resume the thread once the file descriptor is 1058readable or writable. (*Note Prompts::). 1059 1060 But here we run into a difficulty. The ports code is implemented in 1061C, which means that although we can suspend the computation to some 1062outer prompt, we can’t resume it because Guile can’t resume delimited 1063continuations that capture the C stack. 1064 1065 To overcome this difficulty we have created a compatible but entirely 1066parallel implementation of port operations. To use this implementation, 1067do the following: 1068 1069 (use-modules (ice-9 suspendable-ports)) 1070 (install-suspendable-ports!) 1071 1072 This will replace the core I/O primitives like ‘get-char’ and 1073‘put-bytevector’ with new versions that are exactly the same as the ones 1074in the standard library, but with two differences. One is that when a 1075read or a write would block, the suspendable port operations call out 1076to the value of the ‘current-read-waiter’ or ‘current-write-waiter’ 1077parameter, as appropriate. *Note Parameters::. The default read and 1078write waiters do the same thing that the C read and write waiters do, 1079which is to poll. User code can parameterize the waiters, though, 1080enabling the computation to suspend and allow the program to process 1081other I/O operations. Because the new suspendable ports implementation 1082is written in Scheme, that suspended computation can resume again later 1083when it is able to make progress. Success! 1084 1085 The other main difference is that because the new ports 1086implementation is written in Scheme, it is slower than C, currently by a 1087factor of 3 or 4, though it depends on many factors. For this reason we 1088have to keep the C implementations as the default ones. One day when 1089Guile’s compiler is better, we can close this gap and have only one port 1090operation implementation again.
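   As a minimal sketch of parameterizing a waiter, the following wraps
the default read waiter so that every would-block event is logged before
falling back to the default polling behavior.  The helper name
‘call-with-logged-read-waits’ is hypothetical and not part of Guile; a
real scheduler would more likely abort to a prompt from the waiter
instead of calling the default one.

```scheme
(use-modules (ice-9 suspendable-ports))

;; Switch the core port operations to the Scheme implementations so
;; that the waiter parameters are consulted.
(install-suspendable-ports!)

;; Hypothetical helper (not part of Guile): run THUNK with a read
;; waiter that logs each would-block event, then defers to whatever
;; waiter was in effect before (by default, one that polls).
(define (call-with-logged-read-waits thunk)
  (let ((default-waiter (current-read-waiter)))
    (parameterize ((current-read-waiter
                    (lambda (port)
                      (format (current-error-port)
                              "read would block on ~s~%" port)
                      (default-waiter port))))
      (thunk))))

;; Usage: a string port never blocks, so the waiter is not invoked
;; here; with a non-blocking socket it would be called whenever no
;; data is available yet.
(call-with-logged-read-waits
 (lambda ()
   (call-with-input-string "x" read-char)))
```

   Capturing the previous waiter before entering ‘parameterize’ keeps
the blocking semantics intact, so this wrapper only observes the wait
rather than changing how it is carried out.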
1091 1092 Note that Guile does not currently include an implementation of the 1093facility to suspend the current thread and schedule other threads in the 1094meantime. Before adding such a thing, we want to make sure that we’re 1095providing the right primitives that can be used to build schedulers and 1096other user-space concurrency patterns, and that the patterns that we 1097settle on are the right patterns. In the meantime, have a look at 8sync 1098(<https://gnu.org/software/8sync>) for a prototype of an asynchronous 1099I/O and concurrency facility. 1100 1101 -- Scheme Procedure: install-suspendable-ports! 1102 Replace the core ports implementation with suspendable ports, as 1103 described above. This will mutate the values of the bindings like 1104 ‘get-char’, ‘put-u8’, and so on in place. 1105 1106 -- Scheme Procedure: uninstall-suspendable-ports! 1107 Restore the original core ports implementation, un-doing the effect 1108 of ‘install-suspendable-ports!’. 1109 1110 -- Scheme Parameter: current-read-waiter 1111 -- Scheme Parameter: current-write-waiter 1112 Parameters whose values are procedures of one argument, called when 1113 a suspendable port operation would block on a port while reading or 1114 writing, respectively. The default values of these parameters do a 1115 blocking ‘poll’ on the port’s file descriptor. The procedures are 1116 passed the port in question as their one argument. 1117 1118 1119File: guile.info, Node: BOM Handling, Prev: Non-Blocking I/O, Up: Input and Output 1120 11216.12.15 Handling of Unicode Byte Order Marks 1122-------------------------------------------- 1123 1124This section documents the finer points of Guile’s handling of Unicode 1125byte order marks (BOMs). A byte order mark (U+FEFF) is typically found 1126at the start of a UTF-16 or UTF-32 stream, to allow readers to reliably 1127determine the byte order. 
Occasionally, a BOM is found at the start of 1128a UTF-8 stream, but this is much less common and not generally 1129recommended. 1130 1131 Guile attempts to handle BOMs automatically, and in accordance with 1132the recommendations of the Unicode Standard, when the port encoding is 1133set to ‘UTF-8’, ‘UTF-16’, or ‘UTF-32’. In brief, Guile automatically 1134writes a BOM at the start of a UTF-16 or UTF-32 stream, and 1135automatically consumes one from the start of a UTF-8, UTF-16, or UTF-32 1136stream. 1137 1138 As specified in the Unicode Standard, a BOM is only handled specially 1139at the start of a stream, and only if the port encoding is set to 1140‘UTF-8’, ‘UTF-16’ or ‘UTF-32’. If the port encoding is set to 1141‘UTF-16BE’, ‘UTF-16LE’, ‘UTF-32BE’, or ‘UTF-32LE’, then BOMs are _not_ 1142handled specially, and none of the special handling described in this 1143section applies. 1144 1145 • To ensure that Guile will properly detect the byte order of a 1146 UTF-16 or UTF-32 stream, you must perform a textual read before any 1147 writes, seeks, or binary I/O. Guile will not attempt to read a BOM 1148 unless a read is explicitly requested at the start of the stream. 1149 1150 • If a textual write is performed before the first read, then an 1151 arbitrary byte order will be chosen. Currently, big endian is the 1152 default on all platforms, but that may change in the future. If 1153 you wish to explicitly control the byte order of an output stream, 1154 set the port encoding to ‘UTF-16BE’, ‘UTF-16LE’, ‘UTF-32BE’, or 1155 ‘UTF-32LE’, and explicitly write a BOM (‘#\xFEFF’) if desired. 1156 1157 • If ‘set-port-encoding!’ is called in the middle of a stream, Guile 1158 treats this as a new logical “start of stream” for purposes of BOM 1159 handling, and will forget about any BOMs that had previously been 1160 seen. Therefore, it may choose a different byte order than had 1161 been used previously. 
This is intended to support multiple logical 1162 text streams embedded within a larger binary stream. 1163 1164 • Binary I/O operations are not guaranteed to update Guile’s notion 1165 of whether the port is at the “start of the stream”, nor are they 1166 guaranteed to produce or consume BOMs. 1167 1168 • For ports that support seeking (e.g. normal files), the input and 1169 output streams are considered linked: if the user reads first, then 1170 a BOM will be consumed (if appropriate), but later writes will 1171 _not_ produce a BOM. Similarly, if the user writes first, then 1172 later reads will _not_ consume a BOM. 1173 1174 • For ports that are not random access (e.g. pipes, sockets, and 1175 terminals), the input and output streams are considered 1176 _independent_ for purposes of BOM handling: the first read will 1177 consume a BOM (if appropriate), and the first write will _also_ 1178 produce a BOM (if appropriate). However, the input and output 1179 streams will always use the same byte order. 1180 1181 • Seeks to the beginning of a file will set the “start of stream” 1182 flags. Therefore, a subsequent textual read or write will consume 1183 or produce a BOM. However, unlike ‘set-port-encoding!’, if a byte 1184 order had already been chosen for the port, it will remain in 1185 effect after a seek, and cannot be changed by the presence of a 1186 BOM. Seeks anywhere other than the beginning of a file clear the 1187 “start of stream” flags. 1188 1189 1190File: guile.info, Node: Regular Expressions, Next: LALR(1) Parsing, Prev: Input and Output, Up: API Reference 1191 11926.13 Regular Expressions 1193======================== 1194 1195A “regular expression” (or “regexp”) is a pattern that describes a whole 1196class of strings. A full description of regular expressions and their 1197syntax is beyond the scope of this manual. 
1198 1199 If your system does not include a POSIX regular expression library, 1200and you have not linked Guile with a third-party regexp library such as 1201Rx, these functions will not be available. You can tell whether your 1202Guile installation includes regular expression support by checking 1203whether ‘(provided? 'regex)’ returns true. 1204 1205 The following regexp and string matching features are provided by the 1206‘(ice-9 regex)’ module. Before using the described functions, you 1207should load this module by executing ‘(use-modules (ice-9 regex))’. 1208 1209* Menu: 1210 1211* Regexp Functions:: Functions that create and match regexps. 1212* Match Structures:: Finding what was matched by a regexp. 1213* Backslash Escapes:: Removing the special meaning of regexp 1214 meta-characters. 1215 1216 1217File: guile.info, Node: Regexp Functions, Next: Match Structures, Up: Regular Expressions 1218 12196.13.1 Regexp Functions 1220----------------------- 1221 1222By default, Guile supports POSIX extended regular expressions. That 1223means that the characters ‘(’, ‘)’, ‘+’ and ‘?’ are special, and must be 1224escaped if you wish to match the literal characters, and that there is no 1225support for “non-greedy” variants of ‘*’, ‘+’ or ‘?’. 1226 1227 This regular expression interface was modeled after that implemented 1228by SCSH, the Scheme Shell. It is intended to be upwardly compatible 1229with SCSH regular expressions. 1230 1231 Zero bytes (‘#\nul’) cannot be used in regex patterns or input 1232strings, since the underlying C functions treat that as the end of 1233string. If there’s a zero byte an error is thrown. 1234 1235 Internally, patterns and input strings are converted to the current 1236locale’s encoding, and then passed to the C library’s regular expression 1237routines (*note (libc)Regular Expressions::). The returned match 1238structures always point to characters in the strings, not to individual 1239bytes, even in the case of multi-byte encodings.
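   The points above about special characters and greedy matching can be
illustrated with a short sketch.  It assumes a Guile built with regexp
support and uses ‘string-match’ and ‘match:substring’, which are
documented later in this chapter.

```scheme
(use-modules (ice-9 regex))

;; In POSIX extended syntax "+" is a repetition operator, so the
;; pattern "1+1" means "one or more 1s followed by a 1".  The target
;; contains no "11", so there is no match.
(string-match "1+1" "what is 1+1?")
;; ⇒ #f

;; Escaping the "+" matches it literally.  The backslash is doubled
;; because it must also be escaped inside a Scheme string literal.
(match:substring (string-match "1\\+1" "what is 1+1?"))
;; ⇒ "1+1"

;; Repetition is always greedy: ".*" extends to the last "b", and
;; there is no non-greedy variant to stop at the first one.
(match:substring (string-match "a.*b" "aXbYb"))
;; ⇒ "aXbYb"
```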
1240 1241 -- Scheme Procedure: string-match pattern str [start] 1242 Compile the string PATTERN into a regular expression and compare it 1243 with STR. The optional numeric argument START specifies the 1244 position of STR at which to begin matching. 1245 1246 ‘string-match’ returns a “match structure” which describes what, if 1247 anything, was matched by the regular expression. *Note Match 1248 Structures::. If STR does not match PATTERN at all, ‘string-match’ 1249 returns ‘#f’. 1250 1251 Two examples of a match follow. In the first example, the pattern 1252matches the four digits in the match string. In the second, the pattern 1253matches nothing. 1254 1255 (string-match "[0-9][0-9][0-9][0-9]" "blah2002") 1256 ⇒ #("blah2002" (4 . 8)) 1257 1258 (string-match "[A-Za-z]" "123456") 1259 ⇒ #f 1260 1261 Each time ‘string-match’ is called, it must compile its PATTERN 1262argument into a regular expression structure. This operation is 1263expensive, which makes ‘string-match’ inefficient if the same regular 1264expression is used several times (for example, in a loop). For better 1265performance, you can compile a regular expression in advance and then 1266match strings against the compiled regexp. 1267 1268 -- Scheme Procedure: make-regexp pat flag... 1269 -- C Function: scm_make_regexp (pat, flaglst) 1270 Compile the regular expression described by PAT, and return the 1271 compiled regexp structure. If PAT does not describe a legal 1272 regular expression, ‘make-regexp’ throws a 1273 ‘regular-expression-syntax’ error. 1274 1275 The FLAG arguments change the behavior of the compiled regular 1276 expression. The following values may be supplied: 1277 1278 -- Variable: regexp/icase 1279 Consider uppercase and lowercase letters to be the same when 1280 matching. 1281 1282 -- Variable: regexp/newline 1283 If a newline appears in the target string, then permit the ‘^’ 1284 and ‘$’ operators to match immediately after or immediately 1285 before the newline, respectively. 
Also, the ‘.’ and ‘[^...]’ 1286 operators will never match a newline character. The intent of 1287 this flag is to treat the target string as a buffer containing 1288 many lines of text, and the regular expression as a pattern 1289 that may match a single one of those lines. 1290 1291 -- Variable: regexp/basic 1292 Compile a basic (“obsolete”) regexp instead of the extended 1293 (“modern”) regexps that are the default. Basic regexps do not 1294 consider ‘|’, ‘+’ or ‘?’ to be special characters, and require 1295 the ‘{...}’ and ‘(...)’ metacharacters to be backslash-escaped 1296 (*note Backslash Escapes::). There are several other 1297 differences between basic and extended regular expressions, 1298 but these are the most significant. 1299 1300 -- Variable: regexp/extended 1301 Compile an extended regular expression rather than a basic 1302 regexp. This is the default behavior; this flag will not 1303 usually be needed. If a call to ‘make-regexp’ includes both 1304 ‘regexp/basic’ and ‘regexp/extended’ flags, the one which 1305 comes last will override the earlier one. 1306 1307 -- Scheme Procedure: regexp-exec rx str [start [flags]] 1308 -- C Function: scm_regexp_exec (rx, str, start, flags) 1309 Match the compiled regular expression RX against ‘str’. If the 1310 optional integer START argument is provided, begin matching from 1311 that position in the string. Return a match structure describing 1312 the results of the match, or ‘#f’ if no match could be found. 1313 1314 The FLAGS argument changes the matching behavior. The following 1315 flag values may be supplied, use ‘logior’ (*note Bitwise 1316 Operations::) to combine them, 1317 1318 -- Variable: regexp/notbol 1319 Consider that the START offset into STR is not the beginning 1320 of a line and should not match operator ‘^’. 1321 1322 If RX was created with the ‘regexp/newline’ option above, ‘^’ 1323 will still match after a newline in STR. 
1324 1325 -- Variable: regexp/noteol 1326 Consider that the end of STR is not the end of a line and 1327 should not match operator ‘$’. 1328 1329 If RX was created with the ‘regexp/newline’ option above, ‘$’ 1330 will still match before a newline in STR. 1331 1332 ;; Regexp to match uppercase letters 1333 (define r (make-regexp "[A-Z]*")) 1334 1335 ;; Regexp to match letters, ignoring case 1336 (define ri (make-regexp "[A-Z]*" regexp/icase)) 1337 1338 ;; Search for bob using regexp r 1339 (match:substring (regexp-exec r "bob")) 1340 ⇒ "" ; matches only the empty string 1341 1342 ;; Search for Bob using regexp ri 1343 (match:substring (regexp-exec ri "Bob")) 1344 ⇒ "Bob" ; matched case insensitive 1345 1346 -- Scheme Procedure: regexp? obj 1347 -- C Function: scm_regexp_p (obj) 1348 Return ‘#t’ if OBJ is a compiled regular expression, or ‘#f’ 1349 otherwise. 1350 1351 1352 -- Scheme Procedure: list-matches regexp str [flags] 1353 Return a list of match structures which are the non-overlapping 1354 matches of REGEXP in STR. REGEXP can be either a pattern string or 1355 a compiled regexp. The FLAGS argument is as per ‘regexp-exec’ 1356 above. 1357 1358 (map match:substring (list-matches "[a-z]+" "abc 42 def 78")) 1359 ⇒ ("abc" "def") 1360 1361 -- Scheme Procedure: fold-matches regexp str init proc [flags] 1362 Apply PROC to the non-overlapping matches of REGEXP in STR, to 1363 build a result. REGEXP can be either a pattern string or a 1364 compiled regexp. The FLAGS argument is as per ‘regexp-exec’ above. 1365 1366 PROC is called as ‘(PROC match prev)’ where MATCH is a match 1367 structure and PREV is the previous return from PROC. For the first 1368 call PREV is the given INIT parameter. ‘fold-matches’ returns the 1369 final value from PROC.
1370 1371 For example to count matches, 1372 1373 (fold-matches "[a-z][0-9]" "abc x1 def y2" 0 1374 (lambda (match count) 1375 (1+ count))) 1376 ⇒ 2 1377 1378 1379 Regular expressions are commonly used to find patterns in one string 1380and replace them with the contents of another string. The following 1381functions are convenient ways to do this. 1382 1383 -- Scheme Procedure: regexp-substitute port match item ... 1384 Write to PORT selected parts of the match structure MATCH. Or if 1385 PORT is ‘#f’ then form a string from those parts and return that. 1386 1387 Each ITEM specifies a part to be written, and may be one of the 1388 following, 1389 1390 • A string. String arguments are written out verbatim. 1391 1392 • An integer. The submatch with that number is written 1393 (‘match:substring’). Zero is the entire match. 1394 1395 • The symbol ‘pre’. The portion of the matched string preceding 1396 the regexp match is written (‘match:prefix’). 1397 1398 • The symbol ‘post’. The portion of the matched string 1399 following the regexp match is written (‘match:suffix’). 1400 1401 For example, changing a match and retaining the text before and 1402 after, 1403 1404 (regexp-substitute #f (string-match "[0-9]+" "number 25 is good") 1405 'pre "37" 'post) 1406 ⇒ "number 37 is good" 1407 1408 Or matching a YYYYMMDD format date such as ‘20020828’ and 1409 re-ordering and hyphenating the fields. 1410 1411 (define date-regex 1412 "([0-9][0-9][0-9][0-9])([0-9][0-9])([0-9][0-9])") 1413 (define s "Date 20020429 12am.") 1414 (regexp-substitute #f (string-match date-regex s) 1415 'pre 2 "-" 3 "-" 1 'post " (" 0 ")") 1416 ⇒ "Date 04-29-2002 12am. (20020429)" 1417 1418 -- Scheme Procedure: regexp-substitute/global port regexp target 1419 item... 1420 Write to PORT selected parts of matches of REGEXP in TARGET. If 1421 PORT is ‘#f’ then form a string from those parts and return that. 1422 REGEXP can be a string or a compiled regex. 
1423 1424 This is similar to ‘regexp-substitute’, but allows global 1425 substitutions on TARGET. Each ITEM behaves as per 1426 ‘regexp-substitute’, with the following differences, 1427 1428 • A function. Called as ‘(ITEM match)’ with the match structure 1429 for the REGEXP match, it should return a string to be written 1430 to PORT. 1431 1432 • The symbol ‘post’. This doesn’t output anything, but instead 1433 causes ‘regexp-substitute/global’ to recurse on the unmatched 1434 portion of TARGET. 1435 1436 This _must_ be supplied to perform a global search and replace 1437 on TARGET; without it ‘regexp-substitute/global’ returns after 1438 a single match and output. 1439 1440 For example, to collapse runs of tabs and spaces to a single hyphen 1441 each, 1442 1443 (regexp-substitute/global #f "[ \t]+" "this is the text" 1444 'pre "-" 'post) 1445 ⇒ "this-is-the-text" 1446 1447 Or using a function to reverse the letters in each word, 1448 1449 (regexp-substitute/global #f "[a-z]+" "to do and not-do" 1450 'pre (lambda (m) (string-reverse (match:substring m))) 'post) 1451 ⇒ "ot od dna ton-od" 1452 1453 Without the ‘post’ symbol, just one regexp match is made. For 1454 example the following is the date example from ‘regexp-substitute’ 1455 above, without the need for the separate ‘string-match’ call. 1456 1457 (define date-regex 1458 "([0-9][0-9][0-9][0-9])([0-9][0-9])([0-9][0-9])") 1459 (define s "Date 20020429 12am.") 1460 (regexp-substitute/global #f date-regex s 1461 'pre 2 "-" 3 "-" 1 'post " (" 0 ")") 1462 1463 ⇒ "Date 04-29-2002 12am. (20020429)" 1464 1465 1466File: guile.info, Node: Match Structures, Next: Backslash Escapes, Prev: Regexp Functions, Up: Regular Expressions 1467 14686.13.2 Match Structures 1469----------------------- 1470 1471A “match structure” is the object returned by ‘string-match’ and 1472‘regexp-exec’. It describes which portion of a string, if any, matched 1473the given regular expression. 
Match structures include: a reference to 1474the string that was checked for matches; the starting and ending 1475positions of the regexp match; and, if the regexp included any 1476parenthesized subexpressions, the starting and ending positions of each 1477submatch. 1478 1479 In each of the regexp match functions described below, the ‘match’ 1480argument must be a match structure returned by a previous call to 1481‘string-match’ or ‘regexp-exec’. Most of these functions return some 1482information about the original target string that was matched against a 1483regular expression; we will call that string TARGET for easy reference. 1484 1485 -- Scheme Procedure: regexp-match? obj 1486 Return ‘#t’ if OBJ is a match structure returned by a previous call 1487 to ‘regexp-exec’, or ‘#f’ otherwise. 1488 1489 -- Scheme Procedure: match:substring match [n] 1490 Return the portion of TARGET matched by subexpression number N. 1491 Submatch 0 (the default) represents the entire regexp match. If 1492 the regular expression as a whole matched, but the subexpression 1493 number N did not match, return ‘#f’. 1494 1495 (define s (string-match "[0-9][0-9][0-9][0-9]" "blah2002foo")) 1496 (match:substring s) 1497 ⇒ "2002" 1498 1499 ;; match starting at offset 6 in the string 1500 (match:substring 1501 (string-match "[0-9][0-9][0-9][0-9]" "blah987654" 6)) 1502 ⇒ "7654" 1503 1504 -- Scheme Procedure: match:start match [n] 1505 Return the starting position of submatch number N. 1506 1507 In the following example, the result is 4, since the match starts at 1508character index 4: 1509 1510 (define s (string-match "[0-9][0-9][0-9][0-9]" "blah2002foo")) 1511 (match:start s) 1512 ⇒ 4 1513 1514 -- Scheme Procedure: match:end match [n] 1515 Return the ending position of submatch number N. 1516 1517 In the following example, the result is 8, since the match runs 1518between characters 4 and 8 (i.e. the “2002”). 
1519 1520 (define s (string-match "[0-9][0-9][0-9][0-9]" "blah2002foo")) 1521 (match:end s) 1522 ⇒ 8 1523 1524 -- Scheme Procedure: match:prefix match 1525 Return the unmatched portion of TARGET preceding the regexp match. 1526 1527 (define s (string-match "[0-9][0-9][0-9][0-9]" "blah2002foo")) 1528 (match:prefix s) 1529 ⇒ "blah" 1530 1531 -- Scheme Procedure: match:suffix match 1532 Return the unmatched portion of TARGET following the regexp match. 1533 1534 (define s (string-match "[0-9][0-9][0-9][0-9]" "blah2002foo")) 1535 (match:suffix s) 1536 ⇒ "foo" 1537 1538 -- Scheme Procedure: match:count match 1539 Return the number of parenthesized subexpressions from MATCH. Note 1540 that the entire regular expression match itself counts as a 1541 subexpression, and failed submatches are included in the count. 1542 1543 -- Scheme Procedure: match:string match 1544 Return the original TARGET string. 1545 1546 (define s (string-match "[0-9][0-9][0-9][0-9]" "blah2002foo")) 1547 (match:string s) 1548 ⇒ "blah2002foo" 1549 1550 1551File: guile.info, Node: Backslash Escapes, Prev: Match Structures, Up: Regular Expressions 1552 15536.13.3 Backslash Escapes 1554------------------------ 1555 1556Sometimes you will want a regexp to match characters like ‘*’ or ‘$’ 1557exactly. For example, to check whether a particular string represents a 1558menu entry from an Info node, it would be useful to match it against a 1559regexp like ‘^* [^:]*::’. However, this won’t work; because the 1560asterisk is a metacharacter, it won’t match the ‘*’ at the beginning of 1561the string. In this case, we want to make the first asterisk un-magic. 1562 1563 You can do this by preceding the metacharacter with a backslash 1564character ‘\’. (This is also called “quoting” the metacharacter, and is 1565known as a “backslash escape”.) 
When Guile sees a backslash in a 1566regular expression, it considers the following glyph to be an ordinary 1567character, no matter what special meaning it would ordinarily have. 1568Therefore, we can make the above example work by changing the regexp to 1569‘^\* [^:]*::’. The ‘\*’ sequence tells the regular expression engine to 1570match only a single asterisk in the target string. 1571 1572 Since the backslash is itself a metacharacter, you may force a regexp 1573to match a backslash in the target string by preceding the backslash 1574with itself. For example, to find variable references in a TeX program, 1575you might want to find occurrences of the string ‘\let\’ followed by any 1576number of alphabetic characters. The regular expression 1577‘\\let\\[A-Za-z]*’ would do this: the double backslashes in the regexp 1578each match a single backslash in the target string. 1579 1580 -- Scheme Procedure: regexp-quote str 1581 Quote each special character found in STR with a backslash, and 1582 return the resulting string. 1583 1584 *Very important:* Using backslash escapes in Guile source code (as in 1585Emacs Lisp or C) can be tricky, because the backslash character has 1586special meaning for the Guile reader. For example, if Guile encounters 1587the character sequence ‘\n’ in the middle of a string while processing 1588Scheme code, it replaces those characters with a newline character. 1589Similarly, the character sequence ‘\t’ is replaced by a horizontal tab. 1590Several of these “escape sequences” are processed by the Guile reader 1591before your code is executed. Unrecognized escape sequences are 1592ignored: if the characters ‘\*’ appear in a string, they will be 1593translated to the single character ‘*’. 1594 1595 This translation is obviously undesirable for regular expressions, 1596since we want to be able to include backslashes in a string in order to 1597escape regexp metacharacters. 
Therefore, to make sure that a backslash
is preserved in a string in your Guile program, you must use _two_
consecutive backslashes:

     (define Info-menu-entry-pattern (make-regexp "^\\* [^:]*"))

   The string in this example is preprocessed by the Guile reader before
any code is executed. The resulting argument to ‘make-regexp’ is the
string ‘^\* [^:]*’, which is what we really want.

   This also means that in order to write a regular expression that
matches a single backslash character, the regular expression string in
the source code must include _four_ backslashes. Each consecutive pair
of backslashes gets translated by the Guile reader to a single
backslash, and the resulting double-backslash is interpreted by the
regexp engine as matching a single backslash character. Hence:

     (define tex-variable-pattern (make-regexp "\\\\let\\\\[A-Za-z]*"))

   The reason for the unwieldiness of this syntax is historical. Both
regular expression pattern matchers and Unix string processing systems
have traditionally used backslashes with the special meanings described
above. The POSIX regular expression specification and ANSI C standard
both require these semantics. Attempting to abandon either convention
would cause other kinds of compatibility problems, possibly more severe
ones. Therefore, without extending the Scheme reader to support strings
with different quoting conventions (an ungainly and confusing extension
when implemented in other languages), we must adhere to this cumbersome
escape syntax.


File: guile.info, Node: LALR(1) Parsing, Next: PEG Parsing, Prev: Regular Expressions, Up: API Reference

6.14 LALR(1) Parsing
====================

The ‘(system base lalr)’ module provides the ‘lalr-scm’ LALR(1) parser
generator by Dominique Boucher (https://github.com/schemeway/lalr-scm/).
1635‘lalr-scm’ uses the same algorithm as GNU Bison (*note Introduction to 1636Bison: (bison)Introduction.). Parsers are defined using the 1637‘lalr-parser’ macro. 1638 1639 -- Scheme Syntax: lalr-parser [OPTIONS] TOKENS RULES... 1640 Generate an LALR(1) syntax analyzer. TOKENS is a list of symbols 1641 representing the terminal symbols of the grammar. RULES are the 1642 grammar production rules. 1643 1644 Each rule has the form ‘(NON-TERMINAL (RHS ...) : ACTION ...)’, 1645 where NON-TERMINAL is the name of the rule, RHS are the right-hand 1646 sides, i.e., the production rule, and ACTION is a semantic action 1647 associated with the rule. 1648 1649 The generated parser is a two-argument procedure that takes a 1650 “tokenizer” and a “syntax error procedure”. The tokenizer should 1651 be a thunk that returns lexical tokens as produced by 1652 ‘make-lexical-token’. The syntax error procedure may be called 1653 with at least an error message (a string), and optionally the 1654 lexical token that caused the error. 1655 1656 Please refer to the ‘lalr-scm’ documentation for details. 1657 1658 1659File: guile.info, Node: PEG Parsing, Next: Read/Load/Eval/Compile, Prev: LALR(1) Parsing, Up: API Reference 1660 16616.15 PEG Parsing 1662================ 1663 1664Parsing Expression Grammars (PEGs) are a way of specifying formal 1665languages for text processing. They can be used either for matching 1666(like regular expressions) or for building recursive descent parsers 1667(like lex/yacc). Guile uses a superset of PEG syntax that allows more 1668control over what information is preserved during parsing. 1669 1670 Wikipedia has a clear and concise introduction to PEGs if you want to 1671familiarize yourself with the syntax: 1672<http://en.wikipedia.org/wiki/Parsing_expression_grammar>. 1673 1674 The ‘(ice-9 peg)’ module works by compiling PEGs down to lambda 1675expressions. 
These can either be stored in variables at compile-time by
the define macros (‘define-peg-pattern’ and
‘define-peg-string-patterns’) or calculated explicitly at runtime with
the compile functions (‘compile-peg-pattern’ and ‘peg-string-compile’).

   They can then be used for either parsing (‘match-pattern’) or
searching (‘search-for-pattern’). For convenience, ‘search-for-pattern’
also takes pattern literals in case you want to inline a simple search
(people often use regular expressions this way).

   The rest of this documentation consists of a syntax reference, an API
reference, a tutorial, and notes on the internals.

* Menu:

* PEG Syntax Reference::
* PEG API Reference::
* PEG Tutorial::
* PEG Internals::


File: guile.info, Node: PEG Syntax Reference, Next: PEG API Reference, Up: PEG Parsing

6.15.1 PEG Syntax Reference
---------------------------

Normal PEG Syntax:
..................

 -- PEG Pattern: sequence a b
     Parses A. If this succeeds, continues to parse B from the end of
     the text parsed as A. Succeeds if both A and B succeed.

     ‘"a b"’

     ‘(and a b)’

 -- PEG Pattern: ordered choice a b
     Parses A. If this fails, backtracks and parses B. Succeeds if
     either A or B succeeds.

     ‘"a/b"’

     ‘(or a b)’

 -- PEG Pattern: zero or more a
     Parses A as many times in a row as it can, starting each A at the
     end of the text parsed by the previous A. Always succeeds.

     ‘"a*"’

     ‘(* a)’

 -- PEG Pattern: one or more a
     Parses A as many times in a row as it can, starting each A at the
     end of the text parsed by the previous A. Succeeds if at least one
     A was parsed.

     ‘"a+"’

     ‘(+ a)’

 -- PEG Pattern: optional a
     Tries to parse A. Always succeeds, whether or not A could be
     parsed.

     ‘"a?"’

     ‘(? 
a)’ 1743 1744 -- PEG Pattern: followed by a 1745 Makes sure it is possible to parse A, but does not actually parse 1746 it. Succeeds if A would succeed. 1747 1748 ‘"&a"’ 1749 1750 ‘(followed-by a)’ 1751 1752 -- PEG Pattern: not followed by a 1753 Makes sure it is impossible to parse A, but does not actually parse 1754 it. Succeeds if A would fail. 1755 1756 ‘"!a"’ 1757 1758 ‘(not-followed-by a)’ 1759 1760 -- PEG Pattern: string literal ``abc'' 1761 Parses the string "ABC". Succeeds if that parsing succeeds. 1762 1763 ‘"'abc'"’ 1764 1765 ‘"abc"’ 1766 1767 -- PEG Pattern: any character 1768 Parses any single character. Succeeds unless there is no more text 1769 to be parsed. 1770 1771 ‘"."’ 1772 1773 ‘peg-any’ 1774 1775 -- PEG Pattern: character class a b 1776 Alternative syntax for “Ordered Choice A B” if A and B are 1777 characters. 1778 1779 ‘"[ab]"’ 1780 1781 ‘(or "a" "b")’ 1782 1783 -- PEG Pattern: range of characters a z 1784 Parses any character falling between A and Z. 1785 1786 ‘"[a-z]"’ 1787 1788 ‘(range #\a #\z)’ 1789 1790 Example: 1791 1792 "(a !b / c &d*) 'e'+" 1793 1794 Would be: 1795 1796 (and 1797 (or 1798 (and a (not-followed-by b)) 1799 (and c (followed-by (* d)))) 1800 (+ "e")) 1801 1802Extended Syntax 1803............... 1804 1805There is some extra syntax for S-expressions. 1806 1807 -- PEG Pattern: ignore a 1808 Ignore the text matching A 1809 1810 -- PEG Pattern: capture a 1811 Capture the text matching A. 1812 1813 -- PEG Pattern: peg a 1814 Embed the PEG pattern A using string syntax. 1815 1816 Example: 1817 1818 "!a / 'b'" 1819 1820 Is equivalent to 1821 1822 (or (peg "!a") "b") 1823 1824 and 1825 1826 (or (not-followed-by a) "b") 1827 1828 1829File: guile.info, Node: PEG API Reference, Next: PEG Tutorial, Prev: PEG Syntax Reference, Up: PEG Parsing 1830 18316.15.2 PEG API Reference 1832------------------------ 1833 1834Define Macros 1835............. 
1836 1837The most straightforward way to define a PEG is by using one of the 1838define macros (both of these macroexpand into ‘define’ expressions). 1839These macros bind parsing functions to variables. These parsing 1840functions may be invoked by ‘match-pattern’ or ‘search-for-pattern’, 1841which return a PEG match record. Raw data can be retrieved from this 1842record with the PEG match deconstructor functions. More complicated 1843(and perhaps enlightening) examples can be found in the tutorial. 1844 1845 -- Scheme Macro: define-peg-string-patterns peg-string 1846 Defines all the nonterminals in the PEG PEG-STRING. More 1847 precisely, ‘define-peg-string-patterns’ takes a superset of PEGs. 1848 A normal PEG has a ‘<-’ between the nonterminal and the pattern. 1849 ‘define-peg-string-patterns’ uses this symbol to determine what 1850 information it should propagate up the parse tree. The normal ‘<-’ 1851 propagates the matched text up the parse tree, ‘<--’ propagates the 1852 matched text up the parse tree tagged with the name of the 1853 nonterminal, and ‘<’ discards that matched text and propagates 1854 nothing up the parse tree. Also, nonterminals may consist of any 1855 alphanumeric character or a “-” character (in normal PEGs 1856 nonterminals can only be alphabetic). 1857 1858 For example, if we: 1859 (define-peg-string-patterns 1860 "as <- 'a'+ 1861 bs <- 'b'+ 1862 as-or-bs <- as/bs") 1863 (define-peg-string-patterns 1864 "as-tag <-- 'a'+ 1865 bs-tag <-- 'b'+ 1866 as-or-bs-tag <-- as-tag/bs-tag") 1867 Then: 1868 (match-pattern as-or-bs "aabbcc") ⇒ 1869 #<peg start: 0 end: 2 string: aabbcc tree: aa> 1870 (match-pattern as-or-bs-tag "aabbcc") ⇒ 1871 #<peg start: 0 end: 2 string: aabbcc tree: (as-or-bs-tag (as-tag aa))> 1872 1873 Note that in doing this, we have bound 6 variables at the toplevel 1874 (AS, BS, AS-OR-BS, AS-TAG, BS-TAG, and AS-OR-BS-TAG). 
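     The third arrow, ‘<’, is not exercised by the definitions above.
     A minimal sketch, assuming the made-up nonterminals ‘word-pair’,
     ‘word’ and ‘COMMA’ (none of which Guile provides): the comma is
     consumed while parsing, but because ‘COMMA’ is defined with ‘<’
     its text should not be propagated up the parse tree.

```scheme
(use-modules (ice-9 peg))

;; Hypothetical grammar, for illustration only.
;; word-pair and word are tagged (<--); COMMA discards its text (<).
(define-peg-string-patterns
  "word-pair <-- word COMMA word
word <-- [a-z]+
COMMA < ','")

(define m (match-pattern word-pair "ab,cd"))

(peg:substring m)   ; the whole matched text, comma included
(peg:tree m)        ; the parse tree; inspect it to see what was kept
```

     Note that ‘peg:substring’ reports the text that was consumed,
     while the arrows only control what ends up in ‘peg:tree’.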

 -- Scheme Macro: define-peg-pattern name capture-type peg-sexp
     Defines a single nonterminal NAME. CAPTURE-TYPE determines how
     much information is passed up the parse tree. PEG-SEXP is a PEG in
     S-expression form.

     Possible values for capture-type:

     ‘all’
          passes the matched text up the parse tree tagged with the name
          of the nonterminal.
     ‘body’
          passes the matched text up the parse tree.
     ‘none’
          passes nothing up the parse tree.

     For example, if we define:
          (define-peg-pattern as body (+ "a"))
          (define-peg-pattern bs body (+ "b"))
          (define-peg-pattern as-or-bs body (or as bs))
          (define-peg-pattern as-tag all (+ "a"))
          (define-peg-pattern bs-tag all (+ "b"))
          (define-peg-pattern as-or-bs-tag all (or as-tag bs-tag))
     Then:
          (match-pattern as-or-bs "aabbcc") ⇒
          #<peg start: 0 end: 2 string: aabbcc tree: aa>
          (match-pattern as-or-bs-tag "aabbcc") ⇒
          #<peg start: 0 end: 2 string: aabbcc tree: (as-or-bs-tag (as-tag aa))>

     Note that in doing this, we have bound 6 variables at the toplevel
     (AS, BS, AS-OR-BS, AS-TAG, BS-TAG, and AS-OR-BS-TAG).

Compile Functions
.................

It is sometimes useful to be able to compile anonymous PEG patterns at
runtime. These functions let you do that using either syntax.

 -- Scheme Procedure: peg-string-compile peg-string capture-type
     Compiles the PEG pattern in PEG-STRING propagating according to
     CAPTURE-TYPE (capture-type can be any of the values from
     ‘define-peg-pattern’).

 -- Scheme Procedure: compile-peg-pattern peg-sexp capture-type
     Compiles the PEG pattern in PEG-SEXP propagating according to
     CAPTURE-TYPE (capture-type can be any of the values from
     ‘define-peg-pattern’).

   The functions return syntax objects, which can be useful if you want
to use them in macros.
If all you want is to define a new nonterminal,
you can do the following:

     (define exp '(+ "a"))
     (define as (compile (compile-peg-pattern exp 'body)))

   You can use this nonterminal with all of the regular PEG functions:

     (match-pattern as "aaaaa") ⇒
     #<peg start: 0 end: 5 string: aaaaa tree: aaaaa>

Parsing & Matching Functions
............................

For our purposes, “parsing” means parsing a string into a tree starting
from the first character, while “matching” means searching through the
string for a substring. In practice, the only difference between the
two functions is that ‘match-pattern’ gives up if it can’t find a valid
substring starting at index 0 and ‘search-for-pattern’ keeps looking.
They are both equally capable of “parsing” and “matching” given those
constraints.

 -- Scheme Procedure: match-pattern nonterm string
     Parses STRING using the PEG stored in NONTERM. If no match was
     found, ‘match-pattern’ returns false. If a match was found, a PEG
     match record is returned.

     The ‘capture-type’ argument to ‘define-peg-pattern’ allows you to
     choose what information to hold on to while parsing.
The options 1953 are: 1954 1955 ‘all’ 1956 tag the matched text with the nonterminal 1957 ‘body’ 1958 just the matched text 1959 ‘none’ 1960 nothing 1961 1962 (define-peg-pattern as all (+ "a")) 1963 (match-pattern as "aabbcc") ⇒ 1964 #<peg start: 0 end: 2 string: aabbcc tree: (as aa)> 1965 1966 (define-peg-pattern as body (+ "a")) 1967 (match-pattern as "aabbcc") ⇒ 1968 #<peg start: 0 end: 2 string: aabbcc tree: aa> 1969 1970 (define-peg-pattern as none (+ "a")) 1971 (match-pattern as "aabbcc") ⇒ 1972 #<peg start: 0 end: 2 string: aabbcc tree: ()> 1973 1974 (define-peg-pattern bs body (+ "b")) 1975 (match-pattern bs "aabbcc") ⇒ 1976 #f 1977 1978 -- Scheme Macro: search-for-pattern nonterm-or-peg string 1979 Searches through STRING looking for a matching subexpression. 1980 NONTERM-OR-PEG can either be a nonterminal or a literal PEG 1981 pattern. When a literal PEG pattern is provided, 1982 ‘search-for-pattern’ works very similarly to the regular expression 1983 searches many hackers are used to. If no match was found, 1984 ‘search-for-pattern’ returns false. If a match was found, a PEG 1985 match record is returned. 
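     As a sketch of the regexp-like use with a literal pattern, the
     following looks for the first run of digits in a string and reads
     the result back with the match-record accessors documented under
     “PEG Match Records”. The variable name ‘digits’ is chosen here
     purely for illustration.

```scheme
(use-modules (ice-9 peg))

;; Search for the first run of decimal digits, giving the pattern
;; inline as an S-expression instead of naming a nonterminal.
(define digits
  (search-for-pattern (+ (range #\0 #\9)) "blah2002foo"))

(peg:start digits)      ; index where the run of digits begins
(peg:end digits)        ; one past the index of the last digit
(peg:substring digits)  ; the matched digits themselves
```

     For this input the record should report start 4 and end 8, so
     ‘peg:substring’ gives back the text "2002".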
1986 1987 (define-peg-pattern as body (+ "a")) 1988 (search-for-pattern as "aabbcc") ⇒ 1989 #<peg start: 0 end: 2 string: aabbcc tree: aa> 1990 (search-for-pattern (+ "a") "aabbcc") ⇒ 1991 #<peg start: 0 end: 2 string: aabbcc tree: aa> 1992 (search-for-pattern "'a'+" "aabbcc") ⇒ 1993 #<peg start: 0 end: 2 string: aabbcc tree: aa> 1994 1995 (define-peg-pattern as all (+ "a")) 1996 (search-for-pattern as "aabbcc") ⇒ 1997 #<peg start: 0 end: 2 string: aabbcc tree: (as aa)> 1998 1999 (define-peg-pattern bs body (+ "b")) 2000 (search-for-pattern bs "aabbcc") ⇒ 2001 #<peg start: 2 end: 4 string: aabbcc tree: bb> 2002 (search-for-pattern (+ "b") "aabbcc") ⇒ 2003 #<peg start: 2 end: 4 string: aabbcc tree: bb> 2004 (search-for-pattern "'b'+" "aabbcc") ⇒ 2005 #<peg start: 2 end: 4 string: aabbcc tree: bb> 2006 2007 (define-peg-pattern zs body (+ "z")) 2008 (search-for-pattern zs "aabbcc") ⇒ 2009 #f 2010 (search-for-pattern (+ "z") "aabbcc") ⇒ 2011 #f 2012 (search-for-pattern "'z'+" "aabbcc") ⇒ 2013 #f 2014 2015PEG Match Records 2016................. 2017 2018The ‘match-pattern’ and ‘search-for-pattern’ functions both return PEG 2019match records. Actual information can be extracted from these with the 2020following functions. 2021 2022 -- Scheme Procedure: peg:string match-record 2023 Returns the original string that was parsed in the creation of 2024 ‘match-record’. 2025 2026 -- Scheme Procedure: peg:start match-record 2027 Returns the index of the first parsed character in the original 2028 string (from ‘peg:string’). If this is the same as ‘peg:end’, 2029 nothing was parsed. 2030 2031 -- Scheme Procedure: peg:end match-record 2032 Returns one more than the index of the last parsed character in the 2033 original string (from ‘peg:string’). If this is the same as 2034 ‘peg:start’, nothing was parsed. 2035 2036 -- Scheme Procedure: peg:substring match-record 2037 Returns the substring parsed by ‘match-record’. 
This is equivalent 2038 to ‘(substring (peg:string match-record) (peg:start match-record) 2039 (peg:end match-record))’. 2040 2041 -- Scheme Procedure: peg:tree match-record 2042 Returns the tree parsed by ‘match-record’. 2043 2044 -- Scheme Procedure: peg-record? match-record 2045 Returns true if ‘match-record’ is a PEG match record, or false 2046 otherwise. 2047 2048 Example: 2049 (define-peg-pattern bs all (peg "'b'+")) 2050 2051 (search-for-pattern bs "aabbcc") ⇒ 2052 #<peg start: 2 end: 4 string: aabbcc tree: (bs bb)> 2053 2054 (let ((pm (search-for-pattern bs "aabbcc"))) 2055 `((string ,(peg:string pm)) 2056 (start ,(peg:start pm)) 2057 (end ,(peg:end pm)) 2058 (substring ,(peg:substring pm)) 2059 (tree ,(peg:tree pm)) 2060 (record? ,(peg-record? pm)))) ⇒ 2061 ((string "aabbcc") 2062 (start 2) 2063 (end 4) 2064 (substring "bb") 2065 (tree (bs "bb")) 2066 (record? #t)) 2067 2068Miscellaneous 2069............. 2070 2071 -- Scheme Procedure: context-flatten tst lst 2072 Takes a predicate TST and a list LST. Flattens LST until all 2073 elements are either atoms or satisfy TST. If LST itself satisfies 2074 TST, ‘(list lst)’ is returned (this is a flat list whose only 2075 element satisfies TST). 2076 2077 (context-flatten (lambda (x) (and (number? (car x)) (= (car x) 1))) '(2 2 (1 1 (2 2)) (2 2 (1 1)))) ⇒ 2078 (2 2 (1 1 (2 2)) 2 2 (1 1)) 2079 (context-flatten (lambda (x) (and (number? (car x)) (= (car x) 1))) '(1 1 (1 1 (2 2)) (2 2 (1 1)))) ⇒ 2080 ((1 1 (1 1 (2 2)) (2 2 (1 1)))) 2081 2082 If you’re wondering why this is here, take a look at the tutorial. 2083 2084 -- Scheme Procedure: keyword-flatten terms lst 2085 A less general form of ‘context-flatten’. Takes a list of terminal 2086 atoms ‘terms’ and flattens LST until all elements are either atoms, 2087 or lists which have an atom from ‘terms’ as their first element. 
2088 (keyword-flatten '(a b) '(c a b (a c) (b c) (c (b a) (c a)))) ⇒ 2089 (c a b (a c) (b c) c (b a) c a) 2090 2091 If you’re wondering why this is here, take a look at the tutorial. 2092 2093 2094File: guile.info, Node: PEG Tutorial, Next: PEG Internals, Prev: PEG API Reference, Up: PEG Parsing 2095 20966.15.3 PEG Tutorial 2097------------------- 2098 2099Parsing /etc/passwd 2100................... 2101 2102This example will show how to parse /etc/passwd using PEGs. 2103 2104 First we define an example /etc/passwd file: 2105 2106 (define *etc-passwd* 2107 "root:x:0:0:root:/root:/bin/bash 2108 daemon:x:1:1:daemon:/usr/sbin:/bin/sh 2109 bin:x:2:2:bin:/bin:/bin/sh 2110 sys:x:3:3:sys:/dev:/bin/sh 2111 nobody:x:65534:65534:nobody:/nonexistent:/bin/sh 2112 messagebus:x:103:107::/var/run/dbus:/bin/false 2113 ") 2114 2115 As a first pass at this, we might want to have all the entries in 2116/etc/passwd in a list. 2117 2118 Doing this with string-based PEG syntax would look like this: 2119 (define-peg-string-patterns 2120 "passwd <- entry* !. 2121 entry <-- (! NL .)* NL* 2122 NL < '\n'") 2123 2124 A ‘passwd’ file is 0 or more entries (‘entry*’) until the end of the 2125file (‘!.’ (‘.’ is any character, so ‘!.’ means “not anything”)). We 2126want to capture the data in the nonterminal ‘passwd’, but not tag it 2127with the name, so we use ‘<-’. 2128 2129 An entry is a series of 0 or more characters that aren’t newlines 2130(‘(! NL .)*’) followed by 0 or more newlines (‘NL*’). We want to tag 2131all the entries with ‘entry’, so we use ‘<--’. 2132 2133 A newline is just a literal newline (‘'\n'’). We don’t want a bunch 2134of newlines cluttering up the output, so we use ‘<’ to throw away the 2135captured data. 
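   To see these three rules at work on something small, here is a
sketch that parses a two-line string instead of the full ‘*etc-passwd*’
and lists the text of each entry. The input and the names ‘two-lines’
and ‘record’ are made up for this illustration.

```scheme
(use-modules (ice-9 peg))

;; The grammar from the text above, unchanged.
(define-peg-string-patterns
  "passwd <- entry* !.
entry <-- (! NL .)* NL*
NL < '\n'")

;; Hypothetical two-line input, standing in for *etc-passwd*.
(define two-lines "root:x:0:0\nbin:x:2:2\n")

(define record (match-pattern passwd two-lines))

;; entry* runs to the end of the input, so the whole string is consumed.
(peg:end record)

;; keyword-flatten (see the API reference) normalizes the tree to a
;; list of (entry "...") elements; cadr picks out each entry's text.
(map cadr (keyword-flatten '(entry) (peg:tree record)))
```

   The newlines are consumed by ‘NL’ but, because of the ‘<’ arrow, they
should not appear in the entry strings.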
2136 2137 Here is the same PEG defined using S-expressions: 2138 (define-peg-pattern passwd body (and (* entry) (not-followed-by peg-any))) 2139 (define-peg-pattern entry all (and (* (and (not-followed-by NL) peg-any)) 2140 (* NL))) 2141 (define-peg-pattern NL none "\n") 2142 2143 Obviously this is much more verbose. On the other hand, it’s more 2144explicit, and thus easier to build automatically. However, there are 2145some tricks that make S-expressions easier to use in some cases. One is 2146the ‘ignore’ keyword; the string syntax has no way to say “throw away 2147this text” except breaking it out into a separate nonterminal. For 2148instance, to throw away the newlines we had to define ‘NL’. In the 2149S-expression syntax, we could have simply written ‘(ignore "\n")’. 2150Also, for the cases where string syntax is really much cleaner, the 2151‘peg’ keyword can be used to embed string syntax in S-expression syntax. 2152For instance, we could have written: 2153 2154 (define-peg-pattern passwd body (peg "entry* !.")) 2155 2156 However we define it, parsing ‘*etc-passwd*’ with the ‘passwd’ 2157nonterminal yields the same results: 2158 2159 (peg:tree (match-pattern passwd *etc-passwd*)) ⇒ 2160 ((entry "root:x:0:0:root:/root:/bin/bash") 2161 (entry "daemon:x:1:1:daemon:/usr/sbin:/bin/sh") 2162 (entry "bin:x:2:2:bin:/bin:/bin/sh") 2163 (entry "sys:x:3:3:sys:/dev:/bin/sh") 2164 (entry "nobody:x:65534:65534:nobody:/nonexistent:/bin/sh") 2165 (entry "messagebus:x:103:107::/var/run/dbus:/bin/false")) 2166 2167 However, here is something to be wary of: 2168 2169 (peg:tree (match-pattern passwd "one entry")) ⇒ 2170 (entry "one entry") 2171 2172 By default, the parse trees generated by PEGs are compressed as much 2173as possible without losing information. 
It may not look like this is
what you want at first, but uncompressed parse trees are an enormous
headache (there’s no easy way to predict how deep particular lists will
nest, there are empty lists littered everywhere, etc.). One
side-effect of this, however, is that sometimes the compressor is too
aggressive. No information is discarded when ‘((entry "one entry"))’ is
compressed to ‘(entry "one entry")’, but in this particular case it
probably isn’t what we want.

   There are two functions for easily dealing with this:
‘keyword-flatten’ and ‘context-flatten’. The ‘keyword-flatten’ function
takes a list of keywords and a list to flatten, then tries to coerce the
list such that the first element of all sublists is one of the keywords.
The ‘context-flatten’ function is similar, but instead of a list of
keywords it takes a predicate that should indicate whether a given
sublist is good enough (refer to the API reference for more details).

   What we want here is ‘keyword-flatten’.
     (keyword-flatten '(entry) (peg:tree (match-pattern passwd *etc-passwd*))) ⇒
     ((entry "root:x:0:0:root:/root:/bin/bash")
      (entry "daemon:x:1:1:daemon:/usr/sbin:/bin/sh")
      (entry "bin:x:2:2:bin:/bin:/bin/sh")
      (entry "sys:x:3:3:sys:/dev:/bin/sh")
      (entry "nobody:x:65534:65534:nobody:/nonexistent:/bin/sh")
      (entry "messagebus:x:103:107::/var/run/dbus:/bin/false"))
     (keyword-flatten '(entry) (peg:tree (match-pattern passwd "one entry"))) ⇒
     ((entry "one entry"))

   Of course, this is a somewhat contrived example. In practice we
would probably just tag the ‘passwd’ nonterminal to remove the ambiguity
(using either the ‘all’ keyword for S-expressions or the ‘<--’ symbol
for strings).

     (define-peg-pattern tag-passwd all (peg "entry* !."))
     (peg:tree (match-pattern tag-passwd *etc-passwd*)) ⇒
     (tag-passwd
      (entry "root:x:0:0:root:/root:/bin/bash")
      (entry "daemon:x:1:1:daemon:/usr/sbin:/bin/sh")
      (entry "bin:x:2:2:bin:/bin:/bin/sh")
      (entry "sys:x:3:3:sys:/dev:/bin/sh")
      (entry "nobody:x:65534:65534:nobody:/nonexistent:/bin/sh")
      (entry "messagebus:x:103:107::/var/run/dbus:/bin/false"))
     (peg:tree (match-pattern tag-passwd "one entry")) ⇒
     (tag-passwd
      (entry "one entry"))

   If you’re ever uncertain about the potential results of parsing
something, remember the two absolute rules:
  1. No parsing information will ever be discarded.
  2. There will never be any lists with fewer than 2 elements.

   For the purposes of (1), "parsing information" means things tagged
with the ‘all’ keyword or the ‘<--’ symbol. Plain strings will be
concatenated.

   Let’s extend this example a bit more and actually pull some useful
information out of the passwd file:

     (define-peg-string-patterns
       "passwd <-- entry* !.
2233 entry <-- login C pass C uid C gid C nameORcomment C homedir C shell NL* 2234 login <-- text 2235 pass <-- text 2236 uid <-- [0-9]* 2237 gid <-- [0-9]* 2238 nameORcomment <-- text 2239 homedir <-- path 2240 shell <-- path 2241 path <-- (SLASH pathELEMENT)* 2242 pathELEMENT <-- (!NL !C !'/' .)* 2243 text <- (!NL !C .)* 2244 C < ':' 2245 NL < '\n' 2246 SLASH < '/'") 2247 2248 This produces rather pretty parse trees: 2249 (passwd 2250 (entry (login "root") 2251 (pass "x") 2252 (uid "0") 2253 (gid "0") 2254 (nameORcomment "root") 2255 (homedir (path (pathELEMENT "root"))) 2256 (shell (path (pathELEMENT "bin") (pathELEMENT "bash")))) 2257 (entry (login "daemon") 2258 (pass "x") 2259 (uid "1") 2260 (gid "1") 2261 (nameORcomment "daemon") 2262 (homedir 2263 (path (pathELEMENT "usr") (pathELEMENT "sbin"))) 2264 (shell (path (pathELEMENT "bin") (pathELEMENT "sh")))) 2265 (entry (login "bin") 2266 (pass "x") 2267 (uid "2") 2268 (gid "2") 2269 (nameORcomment "bin") 2270 (homedir (path (pathELEMENT "bin"))) 2271 (shell (path (pathELEMENT "bin") (pathELEMENT "sh")))) 2272 (entry (login "sys") 2273 (pass "x") 2274 (uid "3") 2275 (gid "3") 2276 (nameORcomment "sys") 2277 (homedir (path (pathELEMENT "dev"))) 2278 (shell (path (pathELEMENT "bin") (pathELEMENT "sh")))) 2279 (entry (login "nobody") 2280 (pass "x") 2281 (uid "65534") 2282 (gid "65534") 2283 (nameORcomment "nobody") 2284 (homedir (path (pathELEMENT "nonexistent"))) 2285 (shell (path (pathELEMENT "bin") (pathELEMENT "sh")))) 2286 (entry (login "messagebus") 2287 (pass "x") 2288 (uid "103") 2289 (gid "107") 2290 nameORcomment 2291 (homedir 2292 (path (pathELEMENT "var") 2293 (pathELEMENT "run") 2294 (pathELEMENT "dbus"))) 2295 (shell (path (pathELEMENT "bin") (pathELEMENT "false"))))) 2296 2297 Notice that when there’s no entry in a field (e.g. ‘nameORcomment’ 2298for messagebus) the symbol is inserted. 
This is the “don’t throw away any information” rule: we successfully
matched a ‘nameORcomment’ of 0 characters (since we used ‘*’ when
defining it).  This is usually what you want, because it allows you to
e.g. use ‘list-ref’ to pull out elements (since they all have known
offsets).

   If you’d prefer not to have symbols for empty matches, you can
replace the ‘*’ with a ‘+’ and add a ‘?’ after the ‘nameORcomment’ in
‘entry’.  Then it will try to parse 1 or more characters, fail
(inserting nothing into the parse tree), but continue because it didn’t
have to match the ‘nameORcomment’ to continue.

Embedding Arithmetic Expressions
................................

We can parse simple mathematical expressions with the following PEG:

     (define-peg-string-patterns
       "expr <- sum
     sum <-- (product ('+' / '-') sum) / product
     product <-- (value ('*' / '/') product) / value
     value <-- number / '(' expr ')'
     number <-- [0-9]+")

   Then:
     (peg:tree (match-pattern expr "1+1/2*3+(1+1)/2")) ⇒
     (sum (product (value (number "1")))
          "+"
          (sum (product
                 (value (number "1"))
                 "/"
                 (product
                   (value (number "2"))
                   "*"
                   (product (value (number "3")))))
               "+"
               (sum (product
                      (value "("
                             (sum (product (value (number "1")))
                                  "+"
                                  (sum (product (value (number "1")))))
                             ")")
                      "/"
                      (product (value (number "2")))))))

   There is very little wasted effort in this PEG.  The ‘number’
nonterminal has to be tagged because otherwise the numbers might run
together with the arithmetic expressions during the string concatenation
stage of parse-tree compression (the parser will see “1” followed by “/”
and decide to call it “1/”).  When in doubt, tag.

   It is very easy to turn these parse trees into lisp expressions:

     (define (parse-sum sum left . rest)
       (if (null?
            rest)
           (apply parse-product left)
           (list (string->symbol (car rest))
                 (apply parse-product left)
                 (apply parse-sum (cadr rest)))))

     (define (parse-product product left . rest)
       (if (null? rest)
           (apply parse-value left)
           (list (string->symbol (car rest))
                 (apply parse-value left)
                 (apply parse-product (cadr rest)))))

     (define (parse-value value first . rest)
       (if (null? rest)
           (string->number (cadr first))
           (apply parse-sum (car rest))))

     (define parse-expr parse-sum)

   (Notice all these functions look very similar; for a more complicated
PEG, it would be worth abstracting.)

   Then:
     (apply parse-expr (peg:tree (match-pattern expr "1+1/2*3+(1+1)/2"))) ⇒
     (+ 1 (+ (/ 1 (* 2 3)) (/ (+ 1 1) 2)))

   But wait!  The associativity is wrong!  Where it says ‘(/ 1 (* 2
3))’, it should say ‘(* (/ 1 2) 3)’.

   It’s tempting to try replacing e.g. ‘"sum <-- (product ('+' / '-')
sum) / product"’ with ‘"sum <-- (sum ('+' / '-') product) / product"’,
but this is a Bad Idea.  PEGs don’t support left recursion.  To see why,
imagine what the parser will do here.  When it tries to parse ‘sum’, it
first has to try and parse ‘sum’.  But to do that, it first has to try
and parse ‘sum’.  This will continue until the stack overflows.

   So how does one parse left-associative binary operators with PEGs?
Honestly, this is one of their major shortcomings.  There’s no
general-purpose way of doing this, but here the repetition operators are
a good choice:

     (use-modules (srfi srfi-1))

     (define-peg-string-patterns
       "expr <- sum
     sum <-- (product ('+' / '-'))* product
     product <-- (value ('*' / '/'))* value
     value <-- number / '(' expr ')'
     number <-- [0-9]+")

     ;; take a deep breath...
     (define (make-left-parser next-func)
       (lambda (sum first .
                rest) ;; general form, comments below assume
                      ;; that we're dealing with a sum expression
         (if (null? rest) ;; form (sum (product ...))
           (apply next-func first)
           (if (string? (cadr first));; form (sum ((product ...) "+") (product ...))
             (list (string->symbol (cadr first))
                   (apply next-func (car first))
                   (apply next-func (car rest)))
             ;; form (sum (((product ...) "+") ((product ...) "+")) (product ...))
             (car
               (reduce ;; walk through the list and build a left-associative tree
                 (lambda (l r)
                   (list (list (cadr r) (car r) (apply next-func (car l)))
                         (string->symbol (cadr l))))
                 'ignore
                 (append ;; make a list of all the products
                   ;; the first one should be pre-parsed
                   (list (list (apply next-func (caar first))
                               (string->symbol (cadar first))))
                   (cdr first)
                   ;; the last one has to be added in
                   (list (append rest '("done"))))))))))

     (define (parse-value value first . rest)
       (if (null? rest)
           (string->number (cadr first))
           (apply parse-sum (car rest))))
     (define parse-product (make-left-parser parse-value))
     (define parse-sum (make-left-parser parse-product))
     (define parse-expr parse-sum)

   Then:
     (apply parse-expr (peg:tree (match-pattern expr "1+1/2*3+(1+1)/2"))) ⇒
     (+ (+ 1 (* (/ 1 2) 3)) (/ (+ 1 1) 2))

   As you can see, this is much uglier (it could be made prettier by
using ‘context-flatten’, but the way it’s written above makes it clear
how we deal with the three ways the zero-or-more ‘*’ expression can
parse).  Fortunately, most of the time we can get away with only using
right-associativity.

Simplified Functions
....................

For a more tantalizing example, consider the following grammar that
parses (highly) simplified C functions:

     (define-peg-string-patterns
       "cfunc <-- cSP ctype cSP cname cSP cargs cLB cSP cbody cRB
     ctype <-- cidentifier
     cname <-- cidentifier
     cargs <-- cLP (! (cSP cRP) carg cSP (cCOMMA / cRP) cSP)* cSP
     carg <-- cSP ctype cSP cname
     cbody <-- cstatement *
     cidentifier <- [a-zA-Z][a-zA-Z0-9_]*
     cstatement <-- (!';'.)*cSC cSP
     cSC < ';'
     cCOMMA < ','
     cLP < '('
     cRP < ')'
     cLB < '{'
     cRB < '}'
     cSP < [ \t\n]*")

   Then:
     (match-pattern cfunc "int square(int a) { return a*a;}") ⇒
     (32
      (cfunc (ctype "int")
             (cname "square")
             (cargs (carg (ctype "int") (cname "a")))
             (cbody (cstatement "return a*a"))))

   And:
     (match-pattern cfunc "int mod(int a, int b) { int c = a/b;return a-b*c; }") ⇒
     (52
      (cfunc (ctype "int")
             (cname "mod")
             (cargs (carg (ctype "int") (cname "a"))
                    (carg (ctype "int") (cname "b")))
             (cbody (cstatement "int c = a/b")
                    (cstatement "return a-b*c"))))

   By wrapping all the ‘carg’ nonterminals in a ‘cargs’ nonterminal, we
were able to remove any ambiguity in the parsing structure and avoid
having to call ‘context-flatten’ on the output of ‘match-pattern’.  We
used the same trick with the ‘cstatement’ nonterminals, wrapping them in
a ‘cbody’ nonterminal.

   The whitespace nonterminal ‘cSP’ used here is a (very) useful
instantiation of a common pattern for matching syntactically irrelevant
information.  Since it’s tagged with ‘<’ and ends with ‘*’ it won’t
clutter up the parse trees (all the empty lists will be discarded during
the compression step) and it will never cause parsing to fail.
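
   The whitespace pattern can be tried on its own with a much smaller
grammar.  The following is only a sketch: the nonterminal names
‘wordList’, ‘word’ and ‘SP’ are made up for illustration and are not
part of Guile, and ‘SP’ simply plays the role that ‘cSP’ plays above.

```scheme
(use-modules (ice-9 peg))

;; Hypothetical grammar: a list of lowercase words separated by
;; arbitrary whitespace.  SP is tagged with `<', so whatever it
;; matches is dropped from the parse tree.
(define-peg-string-patterns
  "wordList <-- SP (word SP)*
word <-- [a-z]+
SP < [ \t\n]*")

;; Leading, inner and trailing whitespace all disappear.
(define word-tree
  (peg:tree (match-pattern wordList "  foo   bar ")))

;; word-tree ⇒ (wordList (word "foo") (word "bar"))
```

   Because ‘SP’ ends with ‘*’, it also matches the empty string, so
input without any whitespace at all parses the same way.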


File: guile.info,  Node: PEG Internals,  Prev: PEG Tutorial,  Up: PEG Parsing

6.15.4 PEG Internals
--------------------

A PEG parser takes a string as input and attempts to parse it as a given
nonterminal.  The key idea of the PEG implementation is that every
nonterminal is just a function that takes a string as an argument and
attempts to parse that string as its nonterminal.  The functions always
start from the beginning, but a parse is considered successful even if
there is material left over at the end.

   This makes it easy to model different PEG parsing operations.  For
instance, consider the PEG grammar ‘"ab"’, which could also be written
‘(and "a" "b")’.  It matches the string “ab”.  Here’s how that might be
implemented in the PEG style:

     (define (match-and-a-b str)
       (match-a str)
       (match-b str))

   As you can see, the use of functions provides an easy way to model
sequencing.  In a similar way, one could model ‘(or a b)’ with something
like the following:

     (define (match-or-a-b str)
       (or (match-a str) (match-b str)))

   Here the semantics of a PEG ‘or’ expression map naturally onto
Scheme’s ‘or’ operator.  This function will attempt to run ‘(match-a
str)’, and return its result if it succeeds.  Otherwise it will run
‘(match-b str)’.

   Of course, the code above wouldn’t quite work.  We need some way for
the parsing functions to communicate.  The actual interface used is
below.

Parsing Function Interface
..........................

A parsing function takes three arguments - a string, the length of that
string, and the position in that string it should start parsing at.  In
effect, the parsing functions pass around substrings in pieces - the
first argument is a buffer of characters, and the second two give a
range within that buffer that the parsing function should look at.
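
   As a rough sketch of the three-argument shape, here is a hand-written
function that matches a run of decimal digits.  The name ‘match-digits’
is hypothetical and the function is called directly rather than through
the PEG machinery.  It returns ‘#f’ on failure, or a list whose first
element is the end position, which is the convention this section goes
on to describe.

```scheme
;; Hypothetical parsing function: STR is the buffer, LEN its length,
;; POS the index at which to start looking.
(define (match-digits str len pos)
  (let loop ((end pos))
    (if (and (< end len) (char-numeric? (string-ref str end)))
        (loop (+ end 1))
        ;; Succeed only if at least one digit was consumed.
        (and (> end pos)
             (list end (substring str pos end))))))

(define date "2024-05")

(match-digits date (string-length date) 0)  ; ⇒ (4 "2024")
(match-digits date (string-length date) 4)  ; ⇒ #f, since "-" is no digit
(match-digits date (string-length date) 5)  ; ⇒ (7 "05")
```

   Passing the position explicitly means the same buffer can be reused
for every attempt, so no substrings are copied until a match succeeds.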

   Parsing functions return either ‘#f’, if they failed to match their
nonterminal, or a list whose first element must be an integer
representing the final position in the string they matched and whose cdr
can be any other data the function wishes to return, or ‘'()’ if it
doesn’t have any more data.

   The one caveat is that if the extra data it returns is a list, any
adjacent strings in that list will be appended by ‘match-pattern’.  For
instance, if a parsing function returns ‘(13 ("a" "b" "c"))’,
‘match-pattern’ will take ‘(13 ("abc"))’ as its value.

   For example, here is a function to match “ab” using the actual
interface.

     (define (match-a-b str len pos)
       (and (<= (+ pos 2) len)
            (string= str "ab" pos (+ pos 2))
            (list (+ pos 2) '()))) ; we return no extra information

   The above function can be used to match a string by running
‘(match-pattern match-a-b "ab")’.

Code Generators and Extensible Syntax
.....................................

PEG expressions, such as those in a ‘define-peg-pattern’ form, are
interpreted internally in two steps.

   First, any string PEG is expanded into an s-expression PEG by the
code in the ‘(ice-9 peg string-peg)’ module.

   Then, the s-expression PEG that results is compiled into a parsing
function by the ‘(ice-9 peg codegen)’ module.  In particular, the
function ‘compile-peg-pattern’ is called on the s-expression.  It then
decides what to do based on the form it is passed.

   The PEG syntax can be expanded by providing ‘compile-peg-pattern’
more options for what to do with its forms.  The extended syntax will be
associated with a symbol, for instance ‘my-parsing-form’, and will be
called on all PEG expressions of the form
     (my-parsing-form ...)

   The code-generating function for such a form should take two
arguments.
The first will be a syntax object containing a list with all of the
arguments to the form (but not the form’s name), and the second will be
the ‘capture-type’ argument that is passed to ‘define-peg-pattern’.

   New functions can be registered by calling ‘(add-peg-compiler! symbol
function)’, where ‘symbol’ is the symbol that will indicate a form of
this type and ‘function’ is the code generating function described
above.  The function ‘add-peg-compiler!’ is exported from the ‘(ice-9
peg codegen)’ module.


File: guile.info,  Node: Read/Load/Eval/Compile,  Next: Memory Management,  Prev: PEG Parsing,  Up: API Reference

6.16 Reading and Evaluating Scheme Code
=======================================

This chapter describes Guile functions that are concerned with reading,
loading, evaluating, and compiling Scheme code at run time.

* Menu:

* Scheme Syntax::               Standard and extended Scheme syntax.
* Scheme Read::                 Reading Scheme code.
* Annotated Scheme Read::       Reading Scheme code, for the compiler.
* Scheme Write::                Writing Scheme values to a port.
* Fly Evaluation::              Procedures for on the fly evaluation.
* Compilation::                 How to compile Scheme files and procedures.
* Loading::                     Loading Scheme code from file.
* Load Paths::                  Where Guile looks for code.
* Character Encoding of Source Files:: Loading non-ASCII Scheme code from file.
* Delayed Evaluation::          Postponing evaluation until it is needed.
* Local Evaluation::            Evaluation in a local lexical environment.
* Local Inclusion::             Compile-time inclusion of one file in another.
* Sandboxed Evaluation::        Evaluation with limited capabilities.
* REPL Servers::                Serving a REPL over a socket.
* Cooperative REPL Servers::    REPL server for single-threaded applications.


File: guile.info,  Node: Scheme Syntax,  Next: Scheme Read,  Up: Read/Load/Eval/Compile

6.16.1 Scheme Syntax: Standard and Guile Extensions
---------------------------------------------------

* Menu:

* Expression Syntax::
* Comments::
* Block Comments::
* Case Sensitivity::
* Keyword Syntax::
* Reader Extensions::


File: guile.info,  Node: Expression Syntax,  Next: Comments,  Up: Scheme Syntax

6.16.1.1 Expression Syntax
..........................

An expression to be evaluated takes one of the following forms.

SYMBOL
     A symbol is evaluated by dereferencing.  A binding of that symbol
     is sought and the value there used.  For example,

          (define x 123)
          x ⇒ 123

(PROC ARGS...)
     A parenthesised expression is a function call.  PROC and each
     argument are evaluated, then the function (which PROC evaluated to)
     is called with those arguments.

     The order in which PROC and the arguments are evaluated is
     unspecified, so be careful when using expressions with side
     effects.

          (max 1 2 3) ⇒ 3

          (define (get-some-proc) min)
          ((get-some-proc) 1 2 3) ⇒ 1

     The same sort of parenthesised form is used for a macro invocation,
     but in that case the arguments are not evaluated.  See the
     descriptions of macros for more on this (*note Macros::, and *note
     Syntax Rules::).

CONSTANT
     Number, string, character and boolean constants evaluate “to
     themselves”, so can appear as literals.

          123     ⇒ 123
          99.9    ⇒ 99.9
          "hello" ⇒ "hello"
          #\z     ⇒ #\z
          #t      ⇒ #t

     Note that an application must not attempt to modify literal
     strings, since they may be in read-only memory.

(quote DATA)
'DATA
     Quoting is used to obtain a literal symbol (instead of a variable
     reference), a literal list (instead of a function call), or a
     literal vector.
     The apostrophe ‘'’ is simply a shorthand for a ‘quote’ form.  For
     example,

          'x                   ⇒ x
          '(1 2 3)             ⇒ (1 2 3)
          '#(1 (2 3) 4)        ⇒ #(1 (2 3) 4)
          (quote x)            ⇒ x
          (quote (1 2 3))      ⇒ (1 2 3)
          (quote #(1 (2 3) 4)) ⇒ #(1 (2 3) 4)

     Note that an application must not attempt to modify literal lists
     or vectors obtained from a ‘quote’ form, since they may be in
     read-only memory.

(quasiquote DATA)
`DATA
     Backquote quasi-quotation is like ‘quote’, but selected
     sub-expressions are evaluated.  This is a convenient way to
     construct a list or vector structure most of which is constant, but
     at certain points should have expressions substituted.

     The same effect can always be had with suitable ‘list’, ‘cons’ or
     ‘vector’ calls, but quasi-quoting is often easier.

     (unquote EXPR)
     ,EXPR
          Within the quasiquote DATA, ‘unquote’ or ‘,’ indicates an
          expression to be evaluated and inserted.  The comma syntax ‘,’
          is simply a shorthand for an ‘unquote’ form.  For example,

               `(1 2 (* 9 9) 3 4)       ⇒ (1 2 (* 9 9) 3 4)
               `(1 2 ,(* 9 9) 3 4)      ⇒ (1 2 81 3 4)
               `(1 (unquote (+ 1 1)) 3) ⇒ (1 2 3)
               `#(1 ,(/ 12 2))          ⇒ #(1 6)

     (unquote-splicing EXPR)
     ,@EXPR
          Within the quasiquote DATA, ‘unquote-splicing’ or ‘,@’
          indicates an expression to be evaluated and the elements of
          the returned list inserted.  EXPR must evaluate to a list.
          The “comma-at” syntax ‘,@’ is simply a shorthand for an
          ‘unquote-splicing’ form.

               (define x '(2 3))
               `(1 ,x 4)                           ⇒ (1 (2 3) 4)
               `(1 ,@x 4)                          ⇒ (1 2 3 4)
               `(1 (unquote-splicing (map 1+ x)))  ⇒ (1 3 4)
               `#(9 ,@x 9)                         ⇒ #(9 2 3 9)

          Notice ‘,@’ differs from plain ‘,’ in the way one level of
          nesting is stripped.  For ‘,@’ the elements of a returned list
          are inserted, whereas with ‘,’ it would be the list itself
          inserted.
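
   To make the remark about ‘list’ and ‘cons’ concrete, the sketch below
builds the same structure twice, once with quasi-quotation and once with
explicit constructor calls.  The procedure names and the ‘point’ record
layout are invented for this illustration.

```scheme
;; Quasi-quoted version: `,' inserts a value, `,@' splices a list.
(define (make-point-record name coords)
  `(point (name ,name)
          (coords ,@coords)
          (dims ,(length coords))))

;; The same structure spelled out with `list' and `cons'.
(define (make-point-record/explicit name coords)
  (list 'point
        (list 'name name)
        (cons 'coords coords)
        (list 'dims (length coords))))

(make-point-record 'origin '(0 0 0))
;; ⇒ (point (name origin) (coords 0 0 0) (dims 3))
```

   Both procedures return ‘equal?’ results; the quasi-quoted one keeps
the shape of the output visible in the source.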


File: guile.info,  Node: Comments,  Next: Block Comments,  Prev: Expression Syntax,  Up: Scheme Syntax

6.16.1.2 Comments
.................

Comments in Scheme source files are written by starting them with a
semicolon character (‘;’).  The comment then extends to the end of
the line.  Comments can begin at any column, and they may be inserted on
the same line as Scheme code.

     ; Comment
     ;; Comment too
     (define x 1)        ; Comment after expression
     (let ((y 1))
       ;; Display something.
       (display y)
     ;;; Comment at left margin.
       (display (+ y 1)))

   It is common to use a single semicolon for comments following
expressions on a line, to use two semicolons for comments which are
indented like code, and three semicolons for comments which start at
column 0, even if they are inside an indented code block.  This
convention is used when indenting code in Emacs’ Scheme mode.


File: guile.info,  Node: Block Comments,  Next: Case Sensitivity,  Prev: Comments,  Up: Scheme Syntax

6.16.1.3 Block Comments
.......................

In addition to the standard line comments defined by R5RS, Guile has
another comment type for multiline comments, called “block comments”.
This type of comment begins with the character sequence ‘#!’ and ends
with the characters ‘!#’.

   These comments are compatible with the block comments in the Scheme
Shell ‘scsh’ (*note The Scheme shell (scsh)::).  The characters ‘#!’
were chosen because they are the magic characters used in shell scripts
for indicating that the name of the program for executing the script
follows on the same line.

   Thus a Guile script often starts like this.

     #! /usr/local/bin/guile -s
     !#

   More details on Guile scripting can be found in the scripting section
(*note Guile Scripting::).

   Similarly, Guile (starting from version 2.0) supports nested block
comments as specified by R6RS and SRFI-30
(http://srfi.schemers.org/srfi-30/srfi-30.html):

     (+ 1 #| this is a #| nested |# block comment |# 2)
     ⇒ 3

   For backward compatibility, this syntax can be overridden with
‘read-hash-extend’ (*note ‘read-hash-extend’: Reader Extensions.).

   There is one special case where the contents of a comment can
actually affect the interpretation of code.  When a character encoding
declaration, such as ‘coding: utf-8’, appears in one of the first few
lines of a source file, it indicates to Guile’s default reader that this
source code file is not ASCII.  For details see *note Character Encoding
of Source Files::.


File: guile.info,  Node: Case Sensitivity,  Next: Keyword Syntax,  Prev: Block Comments,  Up: Scheme Syntax

6.16.1.4 Case Sensitivity
.........................

Scheme as defined in R5RS is not case sensitive when reading symbols.
Guile, by contrast, is case sensitive by default, so the identifiers

     guile-whuzzy
     Guile-Whuzzy

   are the same in R5RS Scheme, but are different in Guile.

   It is possible to turn off case sensitivity in Guile by setting the
reader option ‘case-insensitive’.  For more information on reader
options, *Note Scheme Read::.

     (read-enable 'case-insensitive)

   It is also possible to disable (or enable) case sensitivity within a
single file by placing the reader directives ‘#!fold-case’ (or
‘#!no-fold-case’) within the file itself.


File: guile.info,  Node: Keyword Syntax,  Next: Reader Extensions,  Prev: Case Sensitivity,  Up: Scheme Syntax

6.16.1.5 Keyword Syntax
.......................


File: guile.info,  Node: Reader Extensions,  Prev: Keyword Syntax,  Up: Scheme Syntax

6.16.1.6 Reader Extensions
..........................

 -- Scheme Procedure: read-hash-extend chr proc
 -- C Function: scm_read_hash_extend (chr, proc)
     Install the procedure PROC for reading expressions starting with
     the character sequence ‘#’ and CHR.  PROC will be called with two
     arguments: the character CHR and the port to read further data
     from.  The object returned will be the return value of ‘read’.
     Passing ‘#f’ for PROC will remove a previous setting.


File: guile.info,  Node: Scheme Read,  Next: Annotated Scheme Read,  Prev: Scheme Syntax,  Up: Read/Load/Eval/Compile

6.16.2 Reading Scheme Code
--------------------------

 -- Scheme Procedure: read [port]
 -- C Function: scm_read (port)
     Read an s-expression from the input port PORT, or from the current
     input port if PORT is not specified.  Any whitespace before the
     next token is discarded.

   The behaviour of Guile’s Scheme reader can be modified by
manipulating its read options.

 -- Scheme Procedure: read-options [setting]
     Display the current settings of the global read options.  If
     SETTING is omitted, only a short form of the current read options
     is printed.  Otherwise if SETTING is the symbol ‘help’, a complete
     options description is displayed.

   The set of available options, and their default values, may be had by
invoking ‘read-options’ at the prompt.

     scheme@(guile-user)> (read-options)
     (square-brackets keywords #f positions)
     scheme@(guile-user)> (read-options 'help)
     positions           yes   Record positions of source code expressions.
     case-insensitive    no    Convert symbols to lower case.
     keywords            #f    Style of keyword recognition: #f, 'prefix or 'postfix.
     r6rs-hex-escapes    no    Use R6RS variable-length character and string hex escapes.
     square-brackets     yes   Treat `[' and `]' as parentheses, for R6RS compatibility.
     hungry-eol-escapes  no    In strings, consume leading whitespace after an
                               escaped end-of-line.
     curly-infix         no    Support SRFI-105 curly infix expressions.
     r7rs-symbols        no    Support R7RS |...| symbol notation.

   Note that Guile also includes a preliminary mechanism for setting
read options on a per-port basis.  For instance, the ‘case-insensitive’
read option is set (or unset) on the port when the reader encounters the
‘#!fold-case’ or ‘#!no-fold-case’ reader directives.  Similarly, the
‘#!curly-infix’ reader directive sets the ‘curly-infix’ read option on
the port, and ‘#!curly-infix-and-bracket-lists’ sets ‘curly-infix’ and
unsets ‘square-brackets’ on the port (*note SRFI-105::).  There is
currently no other way to access or set the per-port read options.

   The boolean options may be toggled with ‘read-enable’ and
‘read-disable’.  The non-boolean ‘keywords’ option must be set using
‘read-set!’.

 -- Scheme Procedure: read-enable option-name
 -- Scheme Procedure: read-disable option-name
 -- Scheme Syntax: read-set! option-name value
     Modify the read options.  ‘read-enable’ should be used with boolean
     options and switches them on, ‘read-disable’ switches them off.

     ‘read-set!’ can be used to set an option to a specific value.  Due
     to historical oddities, it is a macro that expects an unquoted
     option name.

   For example, to make ‘read’ fold all symbols to their lower case
(perhaps for compatibility with older Scheme code), you can enter:

     (read-enable 'case-insensitive)

   For more information on the effect of the ‘r6rs-hex-escapes’ and
‘hungry-eol-escapes’ options, see (*note String Syntax::).

   For more information on the ‘r7rs-symbols’ option, see (*note Symbol
Read Syntax::).
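
   The following sketch shows the global read options affecting what
‘read’ returns.  The helper ‘read-from-string’ and the variable names
are assumptions made for this example; each option is switched back to
its default right after use so that later code is read normally.

```scheme
;; Hypothetical helper: read one datum from STR using the current
;; global read options.
(define (read-from-string str)
  (call-with-input-string str read))

(define default-symbol (read-from-string "Guile-Whuzzy"))

(read-enable 'case-insensitive)
(define folded-symbol (read-from-string "Guile-Whuzzy"))
(read-disable 'case-insensitive)        ; restore the default

(read-set! keywords 'prefix)            ; unquoted option name
(define prefix-keyword (read-from-string ":size"))
(read-set! keywords #f)                 ; restore the default

;; default-symbol ⇒ Guile-Whuzzy
;; folded-symbol  ⇒ guile-whuzzy
;; prefix-keyword ⇒ #:size
```

   Since these options are global, code that changes them temporarily
may prefer to wrap the change in ‘dynamic-wind’ so the defaults are
restored even on a non-local exit.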


File: guile.info,  Node: Annotated Scheme Read,  Next: Scheme Write,  Prev: Scheme Read,  Up: Read/Load/Eval/Compile

6.16.3 Reading Scheme Code, For the Compiler
--------------------------------------------

When something goes wrong with a Scheme program, the user will want to
know how to fix it.  This starts with identifying where the error
occurred: we want to associate a source location with each component part
of source code, and propagate that source location information through
to the compiler or interpreter.

   For that, Guile provides ‘read-syntax’.

 -- Scheme Procedure: read-syntax [port]
     Read an s-expression from the input port PORT, or from the current
     input port if PORT is not specified.

     If, after skipping white space and comments, no more bytes are
     available from PORT, return the end-of-file object.  *Note Binary
     I/O::.  Otherwise, return an annotated datum.  An annotated datum
     is a syntax object which associates a source location with a datum.
     For example:

          (call-with-input-string "  foo" read-syntax)
          ; ⇒ #<syntax:unknown file:1:2 foo>
          (call-with-input-string "(foo)" read-syntax)
          ; ⇒
          ; #<syntax:unknown file:1:0
          ;   (#<syntax:unknown file:1:1 foo>)>

     As the second example shows, all fields of pairs and vectors are
     also annotated, recursively.

   Most users are familiar with syntax objects in the context of macros,
which use syntax objects to associate scope information with
identifiers.  *Note Macros::.  Here we use syntax objects to associate
source location information with any datum, but without attaching scope
information.  The Scheme compiler (‘compile’) and the interpreter
(‘eval’) can accept syntax objects directly as input, allowing them to
associate source information with resulting code.  *Note Compilation::,
and *Note Fly Evaluation::.
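
   As a small sketch of working with an annotated datum, the code below
reads a form with ‘read-syntax’, strips the annotations again with
‘syntax->datum’, and asks for the recorded location with
‘syntax-source’.  The variable names are illustrative only, and the
exact contents of the location value are not relied upon here.

```scheme
;; Read an annotated datum from a string port.
(define stx (call-with-input-string "(foo bar)" read-syntax))

;; Strip the annotations to get back the plain datum.
(define datum (syntax->datum stx))          ; ⇒ (foo bar)

;; Source properties recorded for the outer list.
(define location (syntax-source stx))

;; Input holding only white space and comments yields end-of-file.
(define at-eof
  (eof-object?
   (call-with-input-string "  ; only a comment" read-syntax)))
```

   Keeping the syntax object around, rather than the stripped datum,
preserves the location information for later error reporting.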

   Note that there is a legacy interface for getting source locations
into the Scheme compiler or interpreter, which is to use a side table
that associates “source properties” with each subdatum returned by
‘read’, instead of wrapping the datums directly as in ‘read-syntax’.
This has the disadvantage of not being able to annotate all kinds of
datums.  *Note Source Properties::, for more information.


File: guile.info,  Node: Scheme Write,  Next: Fly Evaluation,  Prev: Annotated Scheme Read,  Up: Read/Load/Eval/Compile

6.16.4 Writing Scheme Values
----------------------------

Any Scheme value may be written to a port.  Not all values may be read
back in (*note Scheme Read::), however.

 -- Scheme Procedure: write obj [port]
     Send a representation of OBJ to PORT or to the current output port
     if not given.

     The output is designed to be machine readable, and can be read back
     with ‘read’ (*note Scheme Read::).  Strings are printed in double
     quotes, with escapes if necessary, and characters are printed in
     ‘#\’ notation.

 -- Scheme Procedure: display obj [port]
     Send a representation of OBJ to PORT or to the current output port
     if not given.

     The output is designed for human readability; it differs from
     ‘write’ in that strings are printed without double quotes and
     escapes, and characters are printed as per ‘write-char’, not in
     ‘#\’ form.

   As was the case with the Scheme reader, there are a few options that
affect the behavior of the Scheme printer.

 -- Scheme Procedure: print-options [setting]
     Display the current settings of the print options.  If SETTING is
     omitted, only a short form of the current print options is printed.
     Otherwise if SETTING is the symbol ‘help’, a complete options
     description is displayed.
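
   To make the difference between ‘write’ and ‘display’ described above
concrete, the sketch below prints the same value with both into string
ports.  The helper ‘render’ and the sample value are assumptions made
for this example.

```scheme
;; Hypothetical helper: print OBJ with PRINTER (`write' or `display')
;; and return the output as a string.
(define (render printer obj)
  (call-with-output-string
    (lambda (port) (printer obj port))))

(define sample (list "two words" #\z 'sym 42))

(define written (render write sample))
;; written holds the text: ("two words" #\z sym 42)

(define displayed (render display sample))
;; displayed holds the text: (two words z sym 42)

;; Only the `write' form can be read back to an equal value.
(define round-trip (call-with-input-string written read))
```

   Reading ‘displayed’ back would instead yield a list of four symbols
and a number, which is why ‘write’ is the one to use for data meant to
be reloaded.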

   The set of available options, and their default values, may be had by
invoking ‘print-options’ at the prompt.

     scheme@(guile-user)> (print-options)
     (quote-keywordish-symbols reader highlight-suffix "}" highlight-prefix "{")
     scheme@(guile-user)> (print-options 'help)
     highlight-prefix          {       The string to print before highlighted values.
     highlight-suffix          }       The string to print after highlighted values.
     quote-keywordish-symbols  reader  How to print symbols that have a colon
                                       as their first or last character. The
                                       value '#f' does not quote the colons;
                                       '#t' quotes them; 'reader' quotes them
                                       when the reader option 'keywords' is
                                       not '#f'.
     escape-newlines           yes     Render newlines as \n when printing
                                       using `write'.
     r7rs-symbols              no      Escape symbols using R7RS |...| symbol
                                       notation.

   These options may be modified with the ‘print-set!’ syntax.

 -- Scheme Syntax: print-set! option-name value
     Modify the print options.  Due to historical oddities, ‘print-set!’
     is a macro that expects an unquoted option name.


File: guile.info,  Node: Fly Evaluation,  Next: Compilation,  Prev: Scheme Write,  Up: Read/Load/Eval/Compile

6.16.5 Procedures for On the Fly Evaluation
-------------------------------------------

Scheme has the lovely property that its expressions may be represented
as data.  The ‘eval’ procedure takes a Scheme datum and evaluates it as
code.

 -- Scheme Procedure: eval exp module_or_state
 -- C Function: scm_eval (exp, module_or_state)
     Evaluate EXP, a list representing a Scheme expression, in the
     top-level environment specified by MODULE_OR_STATE.  While EXP is
     evaluated (using ‘primitive-eval’), MODULE_OR_STATE is made the
     current module.  The current module is reset to its previous value
     when ‘eval’ returns.
     Example: ‘(eval '(+ 1 2) (interaction-environment))’ ⇒ 3

 -- Scheme Procedure: interaction-environment
 -- C Function: scm_interaction_environment ()
     Return a specifier for the environment that contains
     implementation-defined bindings, typically a superset of those
     listed in the report.  The intent is that this procedure will
     return the environment in which the implementation would evaluate
     expressions dynamically typed by the user.

   *Note Environments::, for other environments.

   One does not always receive code as Scheme data, of course, and this
is especially the case for Guile’s other language implementations (*note
Other Languages::).  For the case in which all you have is a string, we
have ‘eval-string’.  There is a legacy version of this procedure in the
default environment, but you really want the one from ‘(ice-9
eval-string)’, so load it up:

     (use-modules (ice-9 eval-string))

 -- Scheme Procedure: eval-string string [#:module=#f] [#:file=#f]
          [#:line=#f] [#:column=#f] [#:lang=(current-language)]
          [#:compile?=#f]
     Parse STRING according to the current language, normally Scheme.
     Evaluate or compile the expressions it contains, in order,
     returning the value of the last expression.

     If the MODULE keyword argument is set, save a module excursion
     (*note Module System Reflection::) and set the current module to
     MODULE before evaluation.

     The FILE, LINE, and COLUMN keyword arguments can be used to
     indicate that the source string begins at a particular source
     location.

     Finally, LANG is a language, defaulting to the current language,
     and the expression is compiled if COMPILE? is true or there is no
     evaluator for the given language.
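
   A short sketch of ‘eval’ and ‘eval-string’ side by side follows.  The
variable names and the use of ‘make-fresh-user-module’ to obtain a
scratch module are choices made for this example, not something this
section prescribes.

```scheme
(use-modules (ice-9 eval-string))

;; Evaluate a datum in the interaction environment.
(define sum (eval '(+ 1 2) (interaction-environment)))        ; ⇒ 3

;; Evaluate source text inside a scratch module, so the definition
;; of `base' lands in that module.
(define scratch (make-fresh-user-module))
(define answer
  (eval-string "(define base 40) (+ base 2)" #:module scratch)) ; ⇒ 42

;; Ask for compilation instead of interpretation.
(define product (eval-string "(* 6 7)" #:compile? #t))          ; ⇒ 42
```

   After this runs, ‘(module-ref scratch 'base)’ gives 40, which shows
that the ‘#:module’ argument governs where top-level definitions go.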

 -- C Function: scm_eval_string (string)
 -- C Function: scm_eval_string_in_module (string, module)
     These C bindings call ‘eval-string’ from ‘(ice-9 eval-string)’,
     evaluating within MODULE or the current module.

 -- C Function: SCM scm_c_eval_string (const char *string)
     Like ‘scm_eval_string’, but taking a C string in locale encoding
     instead of an ‘SCM’.

 -- Scheme Procedure: apply proc arg ... arglst
 -- C Function: scm_apply_0 (proc, arglst)
 -- C Function: scm_apply_1 (proc, arg1, arglst)
 -- C Function: scm_apply_2 (proc, arg1, arg2, arglst)
 -- C Function: scm_apply_3 (proc, arg1, arg2, arg3, arglst)
 -- C Function: scm_apply (proc, arg, rest)
     Call PROC with arguments ARG ... and the elements of the ARGLST
     list.

     ‘scm_apply’ takes parameters corresponding to a Scheme-level
     ‘(lambda (proc arg1 . rest) ...)’. So ARG1 and all but the last
     element of the REST list make up ARG ..., and the last element of
     REST is the ARGLST list. Or if REST is the empty list ‘SCM_EOL’
     then there’s no ARG ..., and (ARG1) is the ARGLST.

     ARGLST is not modified, but the REST list passed to ‘scm_apply’ is
     modified.

 -- C Function: scm_call_0 (proc)
 -- C Function: scm_call_1 (proc, arg1)
 -- C Function: scm_call_2 (proc, arg1, arg2)
 -- C Function: scm_call_3 (proc, arg1, arg2, arg3)
 -- C Function: scm_call_4 (proc, arg1, arg2, arg3, arg4)
 -- C Function: scm_call_5 (proc, arg1, arg2, arg3, arg4, arg5)
 -- C Function: scm_call_6 (proc, arg1, arg2, arg3, arg4, arg5, arg6)
 -- C Function: scm_call_7 (proc, arg1, arg2, arg3, arg4, arg5, arg6,
          arg7)
 -- C Function: scm_call_8 (proc, arg1, arg2, arg3, arg4, arg5, arg6,
          arg7, arg8)
 -- C Function: scm_call_9 (proc, arg1, arg2, arg3, arg4, arg5, arg6,
          arg7, arg8, arg9)
     Call PROC with the given arguments.

 -- C Function: scm_call (proc, ...)
     Call PROC with any number of arguments. The argument list must be
     terminated by ‘SCM_UNDEFINED’. For example:

          scm_call (scm_c_public_ref ("guile", "+"),
                    scm_from_int (1),
                    scm_from_int (2),
                    SCM_UNDEFINED);

 -- C Function: scm_call_n (proc, argv, nargs)
     Call PROC with the array of arguments ARGV, as a ‘SCM*’. The
     length of the arguments should be passed in NARGS, as a ‘size_t’.

 -- Scheme Procedure: primitive-eval exp
 -- C Function: scm_primitive_eval (exp)
     Evaluate EXP in the top-level environment specified by the current
     module.


File: guile.info, Node: Compilation, Next: Loading, Prev: Fly Evaluation, Up: Read/Load/Eval/Compile

6.16.6 Compiling Scheme Code
----------------------------

The ‘eval’ procedure directly interprets the S-expression representation
of Scheme. An alternate strategy for evaluation is to determine ahead
of time what computations will be necessary to evaluate the expression,
and then use that recipe to produce the desired results. This is known
as “compilation”.

   While it is possible to compile simple Scheme expressions such as
‘(+ 2 2)’ or even ‘"Hello world!"’, compilation is most interesting in
the context of procedures. Compiling a lambda expression produces a
compiled procedure, which is just like a normal procedure except
typically much faster, because it can bypass the generic interpreter.

   Functions from system modules in a Guile installation are normally
compiled already, so they load and run quickly.

   Note that well-written Scheme programs will not typically call the
procedures in this section, for the same reason that it is often bad
taste to use ‘eval’. By default, Guile automatically compiles any files
it encounters that have not been compiled yet (*note ‘--auto-compile’:
Invoking Guile.).
The compiler can also be invoked explicitly from the
shell as ‘guild compile foo.scm’.

   (Why are calls to ‘eval’ and ‘compile’ usually in bad taste? Because
they are limited, in that they can only really make sense for top-level
expressions. Also, most needs for “compile-time” computation are
fulfilled by macros and closures. Of course one good counterexample is
the REPL itself, or any code that reads expressions from a port.)

   Automatic compilation generally works transparently, without any need
for user intervention. However, Guile does not yet do proper dependency
tracking, so if file ‘A.scm’ uses macros from ‘B.scm’, and ‘B.scm’
changes, ‘A.scm’ will not be automatically recompiled. To forcibly
invalidate the auto-compilation cache, pass the ‘--fresh-auto-compile’
option to Guile, or set the ‘GUILE_AUTO_COMPILE’ environment variable to
‘fresh’ (instead of to ‘0’ or ‘1’).

   For more information on the compiler itself, see *note Compiling to
the Virtual Machine::. For information on the virtual machine, see
*note A Virtual Machine for Guile::.

   The command-line interface to Guile’s compiler is the ‘guild compile’
command:

 -- Command: guild compile [‘option’...] FILE...
     Compile FILE, a source file, and store bytecode in the compilation
     cache or in the file specified by the ‘-o’ option. The following
     options are available:

     ‘-L DIR’
     ‘--load-path=DIR’
          Add DIR to the front of the module load path.

     ‘-o OFILE’
     ‘--output=OFILE’
          Write output bytecode to OFILE. By convention, bytecode file
          names end in ‘.go’. When ‘-o’ is omitted, the output file
          name is as for ‘compile-file’ (see below).

     ‘-x EXTENSION’
          Recognize EXTENSION as a valid source file name extension.

          For example, to compile R6RS code, you might want to pass
          ‘-x .sls’ so that files ending in ‘.sls’ can be found.

     ‘-W WARNING’
     ‘--warn=WARNING’
          Enable specific warning passes; use ‘-Whelp’ for a list of
          available options. The default is ‘-W1’, which enables a
          number of common warnings. Pass ‘-W0’ to disable all
          warnings.

     ‘-O OPT’
     ‘--optimize=OPT’
          Enable or disable specific compiler optimizations; use
          ‘-Ohelp’ for a list of available options. The default is
          ‘-O2’, which enables most optimizations. ‘-O0’ is recommended
          if compilation speed is more important than the speed of the
          compiled code. Pass ‘-Ono-OPT’ to disable a specific compiler
          pass. Any number of ‘-O’ options can be passed to the
          compiler, with later ones taking precedence.

     ‘--r6rs’
     ‘--r7rs’
          Compile in an environment whose default bindings, reader
          options, and load paths are adapted for specific Scheme
          standards. *Note R6RS Support::, and *Note R7RS Support::.

     ‘-f LANG’
     ‘--from=LANG’
          Use LANG as the source language of FILE. If this option is
          omitted, ‘scheme’ is assumed.

     ‘-t LANG’
     ‘--to=LANG’
          Use LANG as the target language of FILE. If this option is
          omitted, ‘rtl’ is assumed.

     ‘-T TARGET’
     ‘--target=TARGET’
          Produce code for TARGET instead of %HOST-TYPE (*note
          %host-type: Build Config.). TARGET must be a valid GNU
          triplet, such as ‘armv5tel-unknown-linux-gnueabi’ (*note
          (autoconf)Specifying Target Triplets::).

     Each FILE is assumed to be UTF-8-encoded, unless it contains a
     coding declaration as recognized by ‘file-encoding’ (*note
     Character Encoding of Source Files::).

   The compiler can also be invoked directly by Scheme code.
These interfaces are in their own module:

     (use-modules (system base compile))

 -- Scheme Procedure: compile exp [#:env=#f] [#:from=(current-language)]
          [#:to=value] [#:opts='()]
          [#:optimization-level=(default-optimization-level)]
          [#:warning-level=(default-warning-level)]
     Compile the expression EXP in the environment ENV. If EXP is a
     procedure, the result will be a compiled procedure; otherwise
     ‘compile’ is mostly equivalent to ‘eval’.

     For a discussion of languages and compiler options, *Note Compiling
     to the Virtual Machine::.

 -- Scheme Procedure: compile-file file [#:output-file=#f]
          [#:from=(current-language)] [#:to='rtl]
          [#:env=(default-environment from)] [#:opts='()]
          [#:optimization-level=(default-optimization-level)]
          [#:warning-level=(default-warning-level)]
          [#:canonicalization='relative]
     Compile the file named FILE.

     Output will be written to OUTPUT-FILE. If you do not supply an
     output file name, output is written to a file in the cache
     directory, as computed by ‘(compiled-file-name FILE)’.

     FROM and TO specify the source and target languages. *Note
     Compiling to the Virtual Machine::, for more information on these
     options, and on ENV and OPTS.

     As with ‘guild compile’, FILE is assumed to be UTF-8-encoded unless
     it contains a coding declaration.

 -- Scheme Parameter: default-optimization-level
     The default optimization level, as an integer from 0 to 9. The
     default is 2.
 -- Scheme Parameter: default-warning-level
     The default warning level, as an integer from 0 to 9. The default
     is 1.

   *Note Parameters::, for more on how to set parameters.

 -- Scheme Procedure: compiled-file-name file
     Compute a cached location for a compiled version of a Scheme file
     named FILE.

     This file will usually be below the ‘$HOME/.cache/guile/ccache’
     directory, depending on the value of the ‘XDG_CACHE_HOME’
     environment variable. The intention is that ‘compiled-file-name’
     provides a fallback location for caching auto-compiled files. If
     you want to place a compiled file in the ‘%load-compiled-path’, you
     should pass the OUTPUT-FILE option to ‘compile-file’ explicitly.

 -- Scheme Variable: %auto-compilation-options
     This variable contains the options passed to the ‘compile-file’
     procedure when auto-compiling source files. By default, it enables
     useful compilation warnings. It can be customized from ‘~/.guile’.


File: guile.info, Node: Loading, Next: Load Paths, Prev: Compilation, Up: Read/Load/Eval/Compile

6.16.7 Loading Scheme Code from File
------------------------------------

 -- Scheme Procedure: load filename [reader]
     Load FILENAME and evaluate its contents in the top-level
     environment.

     READER if provided should be either ‘#f’, or a procedure with the
     signature ‘(lambda (port) ...)’ which reads the next expression
     from PORT. If READER is ‘#f’ or absent, Guile’s built-in ‘read’
     procedure is used (*note Scheme Read::).

     The READER argument takes effect by setting the value of the
     ‘current-reader’ fluid (see below) before loading the file, and
     restoring its previous value when loading is complete. The Scheme
     code inside FILENAME can itself change the current reader procedure
     on the fly by setting the ‘current-reader’ fluid.

     If the variable ‘%load-hook’ is defined, it should be bound to a
     procedure that will be called before any code is loaded. See
     documentation for ‘%load-hook’ later in this section.

 -- Scheme Procedure: load-compiled filename
     Load the compiled file named FILENAME.

     Compiling a source file (*note Read/Load/Eval/Compile::) and then
     calling ‘load-compiled’ on the resulting file is equivalent to
     calling ‘load’ on the source file.

 -- Scheme Procedure: primitive-load filename
 -- C Function: scm_primitive_load (filename)
     Load the file named FILENAME and evaluate its contents in the
     top-level environment. FILENAME must either be a full pathname or
     be a pathname relative to the current directory. If the variable
     ‘%load-hook’ is defined, it should be bound to a procedure that
     will be called before any code is loaded. See the documentation
     for ‘%load-hook’ later in this section.

 -- C Function: SCM scm_c_primitive_load (const char *filename)
     Like ‘scm_primitive_load’, but taking a C string instead of an
     ‘SCM’.

 -- Variable: current-reader
     ‘current-reader’ holds the read procedure that is currently being
     used by the above loading procedures to read expressions (from the
     file that they are loading). ‘current-reader’ is a fluid, so it
     has an independent value in each dynamic root and should be read
     and set using ‘fluid-ref’ and ‘fluid-set!’ (*note Fluids and
     Dynamic States::).

     Changing ‘current-reader’ is typically useful to introduce local
     syntactic changes, such that code following the ‘fluid-set!’ call
     is read using the newly installed reader. The ‘current-reader’
     change should take place at evaluation time when the code is
     evaluated, or at compilation time when the code is compiled:

          (eval-when (compile eval)
            (fluid-set! current-reader my-own-reader))

     The ‘eval-when’ form above ensures that the ‘current-reader’ change
     occurs at the right time.

 -- Variable: %load-hook
     A procedure to be called ‘(%load-hook FILENAME)’ whenever a file is
     loaded, or ‘#f’ for no such call.
     ‘%load-hook’ is used by all of
     the loading functions (‘load’ and ‘primitive-load’, and
     ‘load-from-path’ and ‘primitive-load-path’ documented in the next
     section).

     For example, an application can set this to show what’s loaded:

          (set! %load-hook (lambda (filename)
                             (format #t "Loading ~a ...\n" filename)))
          (load-from-path "foo.scm")
          ⊣ Loading /usr/local/share/guile/site/foo.scm ...

 -- Scheme Procedure: current-load-port
 -- C Function: scm_current_load_port ()
     Return the current-load-port. The load port is used internally by
     ‘primitive-load’.


File: guile.info, Node: Load Paths, Next: Character Encoding of Source Files, Prev: Loading, Up: Read/Load/Eval/Compile

6.16.8 Load Paths
-----------------

The procedures in the previous section look for Scheme code in the file
system at specific locations. Guile also has some procedures to search
the load path for code.

 -- Variable: %load-path
     List of directories which should be searched for Scheme modules and
     libraries. When Guile starts up, ‘%load-path’ is initialized to
     the default load path ‘(list (%library-dir) (%site-dir)
     (%global-site-dir) (%package-data-dir))’. The ‘GUILE_LOAD_PATH’
     environment variable can be used to prepend or append additional
     directories (*note Environment Variables::).

     *Note Build Config::, for more on ‘%site-dir’ and related
     procedures.

 -- Scheme Procedure: load-from-path filename
     Similar to ‘load’, but searches for FILENAME in the load paths.
     Preferentially loads a compiled version of the file, if it is
     available and up-to-date.

   A user can extend the load path by calling ‘add-to-load-path’.

 -- Scheme Syntax: add-to-load-path dir
     Add DIR to the load path.

   For example, a script might include this form to add the directory
that it is in to the load path:

     (add-to-load-path (dirname (current-filename)))

   It’s better to use ‘add-to-load-path’ than to modify ‘%load-path’
directly, because ‘add-to-load-path’ takes care of modifying the path
both at compile-time and at run-time.

 -- Scheme Procedure: primitive-load-path filename
          [exception-on-not-found]
 -- C Function: scm_primitive_load_path (filename)
     Search ‘%load-path’ for the file named FILENAME and load it into
     the top-level environment. If FILENAME is a relative pathname and
     is not found in the list of search paths, an error is signalled.
     Preferentially loads a compiled version of the file, if it is
     available and up-to-date.

     If FILENAME is a relative pathname and is not found in the list of
     search paths, one of three things may happen, depending on the
     optional second argument, EXCEPTION-ON-NOT-FOUND. If it is ‘#f’,
     ‘#f’ will be returned. If it is a procedure, it will be called
     with no arguments. (This allows a distinction to be made between
     exceptions raised by loading a file, and exceptions related to the
     loader itself.) Otherwise an error is signalled.

     For compatibility with Guile 1.8 and earlier, the C function takes
     only one argument, which can be either a string (the file name) or
     an argument list.

 -- Scheme Procedure: %search-load-path filename
 -- C Function: scm_sys_search_load_path (filename)
     Search ‘%load-path’ for the file named FILENAME, which must be
     readable by the current user. If FILENAME is found in the list of
     paths to search or is an absolute pathname, return its full
     pathname. Otherwise, return ‘#f’. Filenames may have any of the
     optional extensions in the ‘%load-extensions’ list;
     ‘%search-load-path’ will try each extension automatically.
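
   The not-found behaviour described above can be sketched as follows.
The name ‘no-such-dir/no-such-file’ is a hypothetical file assumed to be
absent from the load path, and whether ‘ice-9/popen’ is found depends on
the Guile sources being installed.

```scheme
;; %search-load-path returns a full file name, or #f when nothing
;; matches.  No extension is given; the entries of %load-extensions
;; are tried automatically.
(define found (%search-load-path "ice-9/popen"))
(define missing (%search-load-path "no-such-dir/no-such-file"))

;; Passing #f as EXCEPTION-ON-NOT-FOUND asks primitive-load-path to
;; return #f instead of signalling an error when the file is absent.
(define loaded (primitive-load-path "no-such-dir/no-such-file" #f))

(format #t "found: ~a\nmissing: ~a\nloaded: ~a\n" found missing loaded)
```

   Probing with ‘%search-load-path’ first is useful when a program wants
to report where a file would come from without actually loading it.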

 -- Variable: %load-extensions
     A list of default file extensions for files containing Scheme code.
     ‘%search-load-path’ tries each of these extensions when looking for
     a file to load. By default, ‘%load-extensions’ is bound to the
     list ‘("" ".scm")’.

   As mentioned above, when Guile searches the ‘%load-path’ for a source
file, it will also search the ‘%load-compiled-path’ for a corresponding
compiled file. If the compiled file is as new as or newer than the
source file, it will be loaded instead of the source file, using
‘load-compiled’.

 -- Variable: %load-compiled-path
     Like ‘%load-path’, but for compiled files. By default, this path
     has two entries: one for compiled files from Guile itself, and one
     for site packages. The ‘GUILE_LOAD_COMPILED_PATH’ environment
     variable can be used to prepend or append additional directories
     (*note Environment Variables::).

   When ‘primitive-load-path’ searches the ‘%load-compiled-path’ for a
corresponding compiled file for a relative path, it does so by appending
‘.go’ to the relative path. For example, searching for ‘ice-9/popen’
could find ‘/usr/lib/guile/3.0/ccache/ice-9/popen.go’, and use it
instead of ‘/usr/share/guile/3.0/ice-9/popen.scm’.

   If ‘primitive-load-path’ does not find a corresponding ‘.go’ file in
the ‘%load-compiled-path’, or the ‘.go’ file is out of date, it will
search for a corresponding auto-compiled file in the fallback path,
possibly creating one if one does not exist.

   *Note Installing Site Packages::, for more on how to correctly
install site packages. *Note Modules and the File System::, for more on
the relationship between load paths and modules. *Note Compilation::,
for more on the fallback path and auto-compilation.

   Finally, there are a couple of helper procedures for general path
manipulation.

 -- Scheme Procedure: parse-path path [tail]
 -- C Function: scm_parse_path (path, tail)
     Parse PATH, which is expected to be a colon-separated string, into
     a list and return the resulting list with TAIL appended. If PATH
     is ‘#f’, TAIL is returned.

 -- Scheme Procedure: parse-path-with-ellipsis path base
 -- C Function: scm_parse_path_with_ellipsis (path, base)
     Parse PATH, which is expected to be a colon-separated string, into
     a list and return the resulting list with BASE (a list) spliced in
     place of the ‘...’ path component, if present, or else BASE is
     added to the end. If PATH is ‘#f’, BASE is returned.

 -- Scheme Procedure: search-path path filename [extensions
          [require-exts?]]
 -- C Function: scm_search_path (path, filename, rest)
     Search PATH for a directory containing a file named FILENAME. The
     file must be readable, and not a directory. If we find one, return
     its full filename; otherwise, return ‘#f’. If FILENAME is
     absolute, return it unchanged. If given, EXTENSIONS is a list of
     strings; for each directory in PATH, we search for FILENAME
     concatenated with each EXTENSION. If REQUIRE-EXTS? is true,
     require that the returned file name have one of the given
     extensions; if REQUIRE-EXTS? is not given, it defaults to ‘#f’.

     For compatibility with Guile 1.8 and earlier, the C function takes
     only three arguments.


File: guile.info, Node: Character Encoding of Source Files, Next: Delayed Evaluation, Prev: Load Paths, Up: Read/Load/Eval/Compile

6.16.9 Character Encoding of Source Files
-----------------------------------------

Scheme source code files are usually encoded in ASCII or UTF-8, but the
built-in reader can interpret other character encodings as well.
When Guile loads Scheme source code, it uses the ‘file-encoding’ procedure
(described below) to try to guess the encoding of the file. In the
absence of any hints, UTF-8 is assumed. One way to provide a hint about
the encoding of a source file is to place a coding declaration in the
top 500 characters of the file.

   A coding declaration has the form ‘coding: XXXXXX’, where ‘XXXXXX’ is
the name of a character encoding in which the source code file has been
encoded. The coding declaration must appear in a Scheme comment. It
can either be a semicolon-initiated comment, or the first block ‘#!’
comment in the file.

   The name of the character encoding in the coding declaration is
typically lower case and contains only letters, numbers, and hyphens,
as recognized by ‘set-port-encoding!’ (*note ‘set-port-encoding!’:
Ports.). Common examples of character encoding names are ‘utf-8’ and
‘iso-8859-1’, as defined by IANA
(http://www.iana.org/assignments/character-sets). Thus, the coding
declaration is mostly compatible with Emacs.

   However, there are some differences in encoding names recognized by
Emacs and encoding names defined by IANA, the latter being essentially a
subset of the former. For instance, ‘latin-1’ is a valid encoding name
for Emacs, but it’s not according to the IANA standard, which Guile
follows; instead, you should use ‘iso-8859-1’, which is both understood
by Emacs and dubbed by IANA (IANA writes it uppercase, but Emacs wants
it lowercase, and Guile is case-insensitive).

   For source code, only a subset of all possible character encodings
can be interpreted by the built-in source code reader. Only those
character encodings in which ASCII text appears unmodified can be used.
This includes ‘UTF-8’ and ‘ISO-8859-1’ through ‘ISO-8859-15’.
The multi-byte character encodings ‘UTF-16’ and ‘UTF-32’ may not be used
because they are not compatible with ASCII.

   There might be a scenario in which one would want to read non-ASCII
code from a port, such as with the function ‘read’, instead of with
‘load’. If the port’s character encoding is the same as the encoding of
the code to be read by the port, no other special handling is
necessary. The port will automatically do the character encoding
conversion. The functions ‘setlocale’ and ‘set-port-encoding!’ are
used to set port encodings (*note Ports::).

   If a port is used to read code of unknown character encoding, it can
accomplish this in three steps. First, the character encoding of the
port should be set to ISO-8859-1 using ‘set-port-encoding!’. Then, the
procedure ‘file-encoding’, described below, is used to scan for a coding
declaration when reading from the port. As a side effect, it rewinds
the port after its scan is complete. After that, the port’s character
encoding should be set to the encoding returned by ‘file-encoding’, if
any, again by using ‘set-port-encoding!’. Then the code can be read as
normal.

   Alternatively, one can use the ‘#:guess-encoding’ keyword argument of
‘open-file’ and related procedures. *Note File Ports::.

 -- Scheme Procedure: file-encoding port
 -- C Function: scm_file_encoding (port)
     Attempt to scan the first few hundred bytes from the PORT for hints
     about its character encoding. Return a string containing the
     encoding name or ‘#f’ if the encoding cannot be determined. The
     port is rewound.

     Currently, the only supported method is to look for an Emacs-like
     character coding declaration (*note how Emacs recognizes file
     encoding: (emacs)Recognize Coding.). The coding declaration is of
     the form ‘coding: XXXXX’ and must appear in a Scheme comment.
     Additional heuristics may be added in the future.


File: guile.info, Node: Delayed Evaluation, Next: Local Evaluation, Prev: Character Encoding of Source Files, Up: Read/Load/Eval/Compile

6.16.10 Delayed Evaluation
--------------------------

Promises are a convenient way to defer a calculation until its result is
actually needed, and to run such a calculation only once. Also *note
SRFI-45::.

 -- syntax: delay expr
     Return a promise object which holds the given EXPR expression,
     ready to be evaluated by a later ‘force’.

 -- Scheme Procedure: promise? obj
 -- C Function: scm_promise_p (obj)
     Return true if OBJ is a promise.

 -- Scheme Procedure: force p
 -- C Function: scm_force (p)
     Return the value obtained from evaluating the EXPR in the given
     promise P. If P has previously been forced then its EXPR is not
     evaluated again; instead, the value obtained at that time is simply
     returned.

     During a ‘force’, an EXPR can call ‘force’ again on its own
     promise, resulting in a recursive evaluation of that EXPR. The
     first evaluation to return gives the value for the promise. Higher
     evaluations run to completion in the normal way, but their results
     are ignored; ‘force’ always returns the first value.


File: guile.info, Node: Local Evaluation, Next: Local Inclusion, Prev: Delayed Evaluation, Up: Read/Load/Eval/Compile

6.16.11 Local Evaluation
------------------------

Guile includes a facility to capture a lexical environment, and later
evaluate a new expression within that environment. This code is
implemented in a module.

     (use-modules (ice-9 local-eval))

 -- syntax: the-environment
     Captures and returns a lexical environment for use with
     ‘local-eval’ or ‘local-compile’.

 -- Scheme Procedure: local-eval exp env
 -- C Function: scm_local_eval (exp, env)
 -- Scheme Procedure: local-compile exp env [opts=()]
     Evaluate or compile the expression EXP in the lexical environment
     ENV.

   Here is a simple example, illustrating that it is the variable that
gets captured, not just its value at one point in time.

     (define e (let ((x 100)) (the-environment)))
     (define fetch-x (local-eval '(lambda () x) e))
     (fetch-x)
     ⇒ 100
     (local-eval '(set! x 42) e)
     (fetch-x)
     ⇒ 42

   While EXP is evaluated within the lexical environment of
‘(the-environment)’, it has the dynamic environment of the call to
‘local-eval’.

   ‘local-eval’ and ‘local-compile’ can only evaluate expressions, not
definitions.

     (local-eval '(define foo 42)
                 (let ((x 100)) (the-environment)))
     ⇒ syntax error: definition in expression context

   Note that the current implementation of ‘(the-environment)’ only
captures “normal” lexical bindings, and pattern variables bound by
‘syntax-case’. It does not currently capture local syntax transformers
bound by ‘let-syntax’, ‘letrec-syntax’ or non-top-level ‘define-syntax’
forms. Any attempt to reference such captured syntactic keywords via
‘local-eval’ or ‘local-compile’ produces an error.


File: guile.info, Node: Local Inclusion, Next: Sandboxed Evaluation, Prev: Local Evaluation, Up: Read/Load/Eval/Compile

6.16.12 Local Inclusion
-----------------------

This section has discussed various means of linking Scheme code
together: fundamentally, loading up files at run-time using ‘load’ and
‘load-compiled’. Guile provides another option to compose parts of
programs together at expansion-time instead of at run-time.

 -- Scheme Syntax: include file-name
     Open FILE-NAME, at expansion-time, and read the Scheme forms that
     it contains, splicing them into the location of the ‘include’,
     within a ‘begin’.

     If FILE-NAME is a relative path, it is searched for relative to the
     path that contains the file that the ‘include’ form appears in.

   If you are a C programmer, if ‘load’ in Scheme is like ‘dlopen’ in C,
consider ‘include’ to be like the C preprocessor’s ‘#include’. When you
use ‘include’, it is as if the contents of the included file were typed
in instead of the ‘include’ form.

   Because the code is included at compile-time, it is available to the
macroexpander. Syntax definitions in the included file are available to
later code in the form in which the ‘include’ appears, without the need
for ‘eval-when’. (*Note Eval When::.)

   For the same reason, compiling a form that uses ‘include’ results in
one compilation unit, composed of multiple files. Loading the compiled
file is one ‘stat’ operation for the compilation unit, instead of ‘2*N’
in the case of ‘load’ (once for each loaded source file, and once for
each corresponding compiled file, in the best case).

   Unlike ‘load’, ‘include’ also works within nested lexical contexts.
It so happens that the optimizer works best within a lexical context,
because all of the uses of bindings in a lexical context are visible, so
composing files by including them within a ‘(let () ...)’ can sometimes
lead to important speed improvements.

   On the other hand, ‘include’ does have all the disadvantages of early
binding: once the code with the ‘include’ is compiled, no change to the
included file is reflected in the future behavior of the including form.

   Also, the particular form of ‘include’, which requires an absolute
path, or a path relative to the current directory at compile-time, is
not very amenable to compiling the source in one place, but then
installing the source to another place. For this reason, Guile provides
another form, ‘include-from-path’, which looks for the source file to
include within a load path.

 -- Scheme Syntax: include-from-path file-name
     Like ‘include’, but instead of expecting ‘file-name’ to be an
     absolute file name, it is expected to be a relative path to search
     in the ‘%load-path’.

   ‘include-from-path’ is more useful when you want to install all of
the source files for a package (as you should!). It makes it possible
to evaluate an installed file from source, instead of relying on the
‘.go’ file being up to date.


File: guile.info, Node: Sandboxed Evaluation, Next: REPL Servers, Prev: Local Inclusion, Up: Read/Load/Eval/Compile

6.16.13 Sandboxed Evaluation
----------------------------

Sometimes you would like to evaluate code that comes from an untrusted
party. The safest way to do this is to buy a new computer, evaluate the
code on that computer, then throw the machine away. However if you are
unwilling to take this simple approach, Guile does include a limited
“sandbox” facility that can allow untrusted code to be evaluated with
some confidence.

   To use the sandboxed evaluator, load its module:

     (use-modules (ice-9 sandbox))

   Guile’s sandboxing facility starts with the ability to restrict the
time and space used by a piece of code.

 -- Scheme Procedure: call-with-time-limit limit thunk limit-reached
     Call THUNK, but cancel it if LIMIT seconds of wall-clock time have
     elapsed. If the computation is cancelled, call LIMIT-REACHED in
     tail position.
THUNK must not disable interrupts or prevent an 3795 abort via a ‘dynamic-wind’ unwind handler. 3796 3797 -- Scheme Procedure: call-with-allocation-limit limit thunk 3798 limit-reached 3799 Call THUNK, but cancel it if LIMIT bytes have been allocated. If 3800 the computation is cancelled, call LIMIT-REACHED in tail position. 3801 THUNK must not disable interrupts or prevent an abort via a 3802 ‘dynamic-wind’ unwind handler. 3803 3804 This limit applies to both stack and heap allocation. The 3805 computation will not be aborted before LIMIT bytes have been 3806 allocated, but for the heap allocation limit, the check may be 3807 postponed until the next garbage collection. 3808 3809 Note that as a current shortcoming, the heap size limit applies to 3810 all threads; concurrent allocation by other unrelated threads 3811 counts towards the allocation limit. 3812 3813 -- Scheme Procedure: call-with-time-and-allocation-limits time-limit 3814 allocation-limit thunk 3815 Invoke THUNK in a dynamic extent in which its execution is limited 3816 to TIME-LIMIT seconds of wall-clock time, and its allocation to 3817 ALLOCATION-LIMIT bytes. THUNK must not disable interrupts or 3818 prevent an abort via a ‘dynamic-wind’ unwind handler. 3819 3820 If successful, return all values produced by invoking THUNK. Any 3821 uncaught exception thrown by the thunk will propagate out. If the 3822 time or allocation limit is exceeded, an exception will be thrown 3823 to the ‘limit-exceeded’ key. 3824 3825 The time limit and stack limit are both very precise, but the heap 3826limit only gets checked asynchronously, after a garbage collection. In 3827particular, if the heap is already very large, the number of allocated 3828bytes between garbage collections will be large, and therefore the 3829precision of the check is reduced. 
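   As a rough illustration of the limit procedures described above, the
following sketch runs a quick computation and a busy loop under a time
limit. The variable names and the symbols returned by the handlers are
arbitrary choices made for the example.

```scheme
(use-modules (ice-9 sandbox))

;; A thunk that finishes within the limit returns its value normally.
(define quick
  (call-with-time-limit 1
    (lambda () (+ 1 2))
    (lambda () 'timed-out)))

;; A busy loop is cancelled after 0.1 seconds of wall-clock time, and
;; the LIMIT-REACHED thunk supplies the result instead.
(define slow
  (call-with-time-limit 0.1
    (lambda () (let loop () (loop)))
    (lambda () 'timed-out)))

;; The combined procedure takes no LIMIT-REACHED argument; it throws to
;; the `limit-exceeded' key, so the caller installs a handler.
(define guarded
  (catch 'limit-exceeded
    (lambda ()
      (call-with-time-and-allocation-limits 0.1 #e10e6
        (lambda () (let loop () (loop)))))
    (lambda (key . args) 'limit-exceeded)))
```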
3830 3831 Additionally, due to the mechanism used by the allocation limit (the 3832‘after-gc-hook’), large single allocations like ‘(make-vector #e1e7)’ 3833are only detected after the allocation completes, even if the allocation 3834itself causes garbage collection. It’s possible therefore for user code 3835to not only exceed the allocation limit set, but also to exhaust all 3836available memory, causing out-of-memory conditions at any allocation 3837site. Failure to allocate memory in Guile itself should be safe and 3838cause an exception to be thrown, but most systems are not designed to 3839handle ‘malloc’ failures. An allocation failure may therefore exercise 3840unexpected code paths in your system, so it is a weakness of the sandbox 3841(and therefore an interesting point of attack). 3842 3843 The main sandbox interface is ‘eval-in-sandbox’. 3844 3845 -- Scheme Procedure: eval-in-sandbox exp [#:time-limit 0.1] 3846 [#:allocation-limit #e10e6] [#:bindings all-pure-bindings] 3847 [#:module (make-sandbox-module bindings)] [#:sever-module? #t] 3848 Evaluate the Scheme expression EXP within an isolated "sandbox". 3849 Limit its execution to TIME-LIMIT seconds of wall-clock time, and 3850 limit its allocation to ALLOCATION-LIMIT bytes. 3851 3852 The evaluation will occur in MODULE, which defaults to the result 3853 of calling ‘make-sandbox-module’ on BINDINGS, which itself defaults 3854 to ‘all-pure-bindings’. This is the core of the sandbox: creating 3855 a scope for the expression that is “safe”. 3856 3857 A safe sandbox module has two characteristics. Firstly, it will 3858 not allow the expression being evaluated to avoid being cancelled 3859 due to time or allocation limits. This ensures that the expression 3860 terminates in a timely fashion. 3861 3862 Secondly, a safe sandbox module will prevent the evaluation from 3863 receiving information from previous evaluations, or from affecting 3864 future evaluations. 
     All combinations of binding sets exported by
     ‘(ice-9 sandbox)’ form safe sandbox modules.

     The BINDINGS should be given as a list of import sets. One import
     set is a list whose car names an interface, like ‘(ice-9 q)’, and
     whose cdr is a list of imports. An import is either a bare symbol
     or a pair of ‘(OUT . IN)’, where OUT and IN are both symbols and
     denote the name under which a binding is exported from the module,
     and the name under which to make the binding available,
     respectively. Note that BINDINGS is only used as an input to the
     default initializer for the MODULE argument; if you pass
     ‘#:module’, BINDINGS is unused. If SEVER-MODULE? is true (the
     default), the module will be unlinked from the global module tree
     after the evaluation returns, to allow MODULE to be
     garbage-collected.

     If successful, return all values produced by EXP. Any uncaught
     exception thrown by the expression will propagate out. If the time
     or allocation limit is exceeded, an exception will be thrown to the
     ‘limit-exceeded’ key.

   Constructing a safe sandbox module is tricky in general. Guile
defines an easy way to construct safe modules from predefined sets of
bindings. Before getting to that interface, here are some general notes
on safety.

  1. The time and allocation limits rely on the ability to interrupt and
     cancel a computation. For this reason, no binding included in a
     sandbox module should be able to indefinitely postpone interrupt
     handling, nor should a binding be able to prevent an abort. In
     practice this second consideration means that ‘dynamic-wind’ should
     not be included in any binding set.
  2. The time and allocation limits apply only to the ‘eval-in-sandbox’
     call. If the call returns a procedure which is later called, no
     limit is “automatically” in place.
     Users of ‘eval-in-sandbox’ have
     to be very careful to reimpose limits when calling procedures that
     escape from sandboxes.
  3. Similarly, the dynamic environment of the ‘eval-in-sandbox’ call is
     not necessarily in place when any procedure that escapes from the
     sandbox is later called.

     This detail prevents us from exposing ‘primitive-eval’ to the
     sandbox, for two reasons. The first is that it’s possible for
     legacy code to forge references to any binding, if the
     ‘allow-legacy-syntax-objects?’ parameter is true. The default for
     this parameter is true; *note Syntax Transformer Helpers:: for the
     details. The parameter is bound to ‘#f’ for the duration of the
     ‘eval-in-sandbox’ call itself, but that will not be in place during
     calls to escaped procedures.

     The second reason we don’t expose ‘primitive-eval’ is that
     ‘primitive-eval’ implicitly works in the current module, which for
     an escaped procedure will probably be different from the module
     that is current for the ‘eval-in-sandbox’ call itself.

     The common denominator here is that if an interface exposed to the
     sandbox relies on dynamic environments, it is easy to mistakenly
     grant the sandboxed procedure additional capabilities in the form
     of bindings that it should not have access to. For this reason,
     the default sets of predefined bindings do not depend on any
     dynamically scoped value.
  4. Mutation may allow a sandboxed evaluation to break some invariant
     in users of data supplied to it. A lot of code culturally doesn’t
     expect mutation, but if you hand mutable data to a sandboxed
     evaluation and you also grant mutating capabilities to that
     evaluation, then the sandboxed code may indeed mutate that data.
     The default set of bindings to the sandbox does not include any
     mutating primitives.
3931 3932 Relatedly, ‘set!’ may allow a sandbox to mutate a primitive, 3933 invalidating many system-wide invariants. Guile is currently quite 3934 permissive when it comes to imported bindings and mutability. 3935 Although ‘set!’ to a module-local or lexically bound variable would 3936 be fine, we don’t currently have an easy way to disallow ‘set!’ to 3937 an imported binding, so currently no binding set includes ‘set!’. 3938 5. Mutation may allow a sandboxed evaluation to keep state, or make a 3939 communication mechanism with other code. On the one hand this 3940 sounds cool, but on the other hand maybe this is part of your 3941 threat model. Again, the default set of bindings doesn’t include 3942 mutating primitives, preventing sandboxed evaluations from keeping 3943 state. 3944 6. The sandbox should probably not be able to open a network 3945 connection, or write to a file, or open a file from disk. The 3946 default binding set includes no interaction with the operating 3947 system. 3948 3949 If you, dear reader, find the above discussion interesting, you will 3950enjoy Jonathan Rees’ dissertation, “A Security Kernel Based on the 3951Lambda Calculus”. 3952 3953 -- Scheme Variable: all-pure-bindings 3954 All “pure” bindings that together form a safe subset of those 3955 bindings available by default to Guile user code. 3956 3957 -- Scheme Variable: all-pure-and-impure-bindings 3958 Like ‘all-pure-bindings’, but additionally including mutating 3959 primitives like ‘vector-set!’. This set is still safe in the sense 3960 mentioned above, with the caveats about mutation. 
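   The following sketch shows how ‘eval-in-sandbox’ might be used with
the two composite binding sets. The expressions being evaluated and the
variable names are made up for the example, and the handlers return
arbitrary symbols rather than inspecting the exception that was raised.

```scheme
(use-modules (ice-9 sandbox))

;; Pure arithmetic works with the default `all-pure-bindings'.
(define total
  (eval-in-sandbox
   '(let loop ((i 1) (acc 0))
      (if (> i 10)
          acc
          (loop (+ i 1) (+ acc i))))))

;; Operating-system access is not part of the default binding set, so
;; this evaluation fails; the error propagates out of the sandbox and
;; is caught here.
(define blocked
  (catch #t
    (lambda () (eval-in-sandbox '(open-input-file "/dev/null")))
    (lambda (key . args) 'rejected)))

;; A runaway computation is cancelled once the time limit is reached.
(define runaway
  (catch 'limit-exceeded
    (lambda ()
      (eval-in-sandbox '(let loop () (loop)) #:time-limit 0.05))
    (lambda (key . args) 'limit-exceeded)))

;; Mutating primitives such as `vector-set!' are only available when
;; the impure bindings are requested explicitly.
(define mutated
  (eval-in-sandbox
   '(let ((v (vector 1 2 3)))
      (vector-set! v 0 'changed)
      v)
   #:bindings all-pure-and-impure-bindings))
```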
3961 3962 The components of these composite sets are as follows: 3963 -- Scheme Variable: alist-bindings 3964 -- Scheme Variable: array-bindings 3965 -- Scheme Variable: bit-bindings 3966 -- Scheme Variable: bitvector-bindings 3967 -- Scheme Variable: char-bindings 3968 -- Scheme Variable: char-set-bindings 3969 -- Scheme Variable: clock-bindings 3970 -- Scheme Variable: core-bindings 3971 -- Scheme Variable: error-bindings 3972 -- Scheme Variable: fluid-bindings 3973 -- Scheme Variable: hash-bindings 3974 -- Scheme Variable: iteration-bindings 3975 -- Scheme Variable: keyword-bindings 3976 -- Scheme Variable: list-bindings 3977 -- Scheme Variable: macro-bindings 3978 -- Scheme Variable: nil-bindings 3979 -- Scheme Variable: number-bindings 3980 -- Scheme Variable: pair-bindings 3981 -- Scheme Variable: predicate-bindings 3982 -- Scheme Variable: procedure-bindings 3983 -- Scheme Variable: promise-bindings 3984 -- Scheme Variable: prompt-bindings 3985 -- Scheme Variable: regexp-bindings 3986 -- Scheme Variable: sort-bindings 3987 -- Scheme Variable: srfi-4-bindings 3988 -- Scheme Variable: string-bindings 3989 -- Scheme Variable: symbol-bindings 3990 -- Scheme Variable: unspecified-bindings 3991 -- Scheme Variable: variable-bindings 3992 -- Scheme Variable: vector-bindings 3993 -- Scheme Variable: version-bindings 3994 The components of ‘all-pure-bindings’. 
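   Since binding sets are plain lists of import sets, a narrower sandbox
can be assembled from individual components. The sketch below is one
possible arrangement: the names ‘arithmetic-only’, ‘tiny-module’ and
‘plus’ are invented for the example, and the hand-written import set
relies on the ‘(OUT . IN)’ form described for ‘eval-in-sandbox’.

```scheme
(use-modules (ice-9 sandbox))

;; Combine two component sets: core syntax plus numeric procedures.
(define arithmetic-only (append core-bindings number-bindings))

(define answer
  (eval-in-sandbox '(let ((x 2)) (* x 21))
                   #:bindings arithmetic-only))

;; `cons' is in neither of the two sets, so it is unbound in this sandbox.
(define no-pairs
  (catch #t
    (lambda ()
      (eval-in-sandbox '(cons 1 2) #:bindings arithmetic-only))
    (lambda (key . args) 'rejected)))

;; A hand-written import set: expose `+' from (guile) under the name
;; `plus', and nothing else.
(define tiny-module (make-sandbox-module '(((guile) (+ . plus)))))
(define three (eval '(plus 1 2) tiny-module))
```

   Note that a plain ‘eval’ in a sandbox module, as in the last form,
is not subject to any time or allocation limit.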
3995 3996 -- Scheme Variable: mutating-alist-bindings 3997 -- Scheme Variable: mutating-array-bindings 3998 -- Scheme Variable: mutating-bitvector-bindings 3999 -- Scheme Variable: mutating-fluid-bindings 4000 -- Scheme Variable: mutating-hash-bindings 4001 -- Scheme Variable: mutating-list-bindings 4002 -- Scheme Variable: mutating-pair-bindings 4003 -- Scheme Variable: mutating-sort-bindings 4004 -- Scheme Variable: mutating-srfi-4-bindings 4005 -- Scheme Variable: mutating-string-bindings 4006 -- Scheme Variable: mutating-variable-bindings 4007 -- Scheme Variable: mutating-vector-bindings 4008 The additional components of ‘all-pure-and-impure-bindings’. 4009 4010 Finally, what do you do with a binding set? What is a binding set 4011anyway? ‘make-sandbox-module’ is here for you. 4012 4013 -- Scheme Procedure: make-sandbox-module bindings 4014 Return a fresh module that only contains BINDINGS. 4015 4016 The BINDINGS should be given as a list of import sets. One import 4017 set is a list whose car names an interface, like ‘(ice-9 q)’, and 4018 whose cdr is a list of imports. An import is either a bare symbol 4019 or a pair of ‘(OUT . IN)’, where OUT and IN are both symbols and 4020 denote the name under which a binding is exported from the module, 4021 and the name under which to make the binding available, 4022 respectively. 4023 4024 So you see that binding sets are just lists, and 4025‘all-pure-and-impure-bindings’ is really just the result of appending 4026all of the component binding sets. 4027 4028 4029File: guile.info, Node: REPL Servers, Next: Cooperative REPL Servers, Prev: Sandboxed Evaluation, Up: Read/Load/Eval/Compile 4030 40316.16.14 REPL Servers 4032-------------------- 4033 4034The procedures in this section are provided by 4035 (use-modules (system repl server)) 4036 4037 When an application is written in Guile, it is often convenient to 4038allow the user to be able to interact with it by evaluating Scheme 4039expressions in a REPL. 
4040 4041 The procedures of this module allow you to spawn a “REPL server”, 4042which permits interaction over a local or TCP connection. Guile itself 4043uses them internally to implement the ‘--listen’ switch, *note 4044Command-line Options::. 4045 4046 -- Scheme Procedure: make-tcp-server-socket [#:host=#f] [#:addr] 4047 [#:port=37146] 4048 Return a stream socket bound to a given address ADDR and port 4049 number PORT. If the HOST is given, and ADDR is not, then the HOST 4050 string is converted to an address. If neither is given, we use the 4051 loopback address. 4052 4053 -- Scheme Procedure: make-unix-domain-server-socket 4054 [#:path="/tmp/guile-socket"] 4055 Return a UNIX domain socket, bound to a given PATH. 4056 4057 -- Scheme Procedure: run-server [server-socket] 4058 -- Scheme Procedure: spawn-server [server-socket] 4059 Create and run a REPL, making it available over the given 4060 SERVER-SOCKET. If SERVER-SOCKET is not provided, it defaults to 4061 the socket created by calling ‘make-tcp-server-socket’ with no 4062 arguments. 4063 4064 ‘run-server’ runs the server in the current thread, whereas 4065 ‘spawn-server’ runs the server in a new thread. 4066 4067 -- Scheme Procedure: stop-server-and-clients! 4068 Closes the connection on all running server sockets. 4069 4070 Please note that in the current implementation, the REPL threads 4071 are cancelled without unwinding their stacks. If any of them are 4072 holding mutexes or are within a critical section, the results are 4073 unspecified. 
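
   A minimal sketch of starting and stopping a REPL server from an
application might look like this. The socket path is an arbitrary choice
for the example, and the short pause merely stands in for the
application’s own work; connecting a client to the socket is left out.

```scheme
(use-modules (system repl server))

;; Hypothetical socket location, made unique with the process id.
(define socket-path
  (string-append "/tmp/guile-repl-demo-"
                 (number->string (getpid))
                 ".sock"))

;; Binding the UNIX domain socket creates the socket file.
(define server-socket
  (make-unix-domain-server-socket #:path socket-path))
(define socket-created? (file-exists? socket-path))

;; Serve REPL connections from a new thread, leaving this one free.
(define server-thread (spawn-server server-socket))

;; Stand-in for the application's real work.
(usleep 100000)

;; Shut the server down and remove the socket file again.
(stop-server-and-clients!)
(when (file-exists? socket-path)
  (delete-file socket-path))
```

   A UNIX domain socket is used here instead of the default TCP socket
so that the REPL is reachable only through the local file system.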

File: guile.info, Node: Cooperative REPL Servers, Prev: REPL Servers, Up: Read/Load/Eval/Compile

6.16.15 Cooperative REPL Servers
--------------------------------

The procedures in this section are provided by
     (use-modules (system repl coop-server))

   Whereas ordinary REPL servers run in their own threads (*note REPL
Servers::), sometimes it is more convenient to provide REPLs that run at
specified times within an existing thread, for example in programs
utilizing an event loop or in single-threaded programs. This allows for
safe access and mutation of a program’s data structures from the REPL,
without concern for thread synchronization.

   Although the REPLs are run in the thread that calls
‘spawn-coop-repl-server’ and ‘poll-coop-repl-server’, dedicated threads
are spawned so that the calling thread is not blocked. The spawned
threads read input for the REPLs and listen for new connections.

   Cooperative REPL servers must be polled periodically to evaluate any
pending expressions by calling ‘poll-coop-repl-server’ with the object
returned from ‘spawn-coop-repl-server’. The thread that calls
‘poll-coop-repl-server’ will be blocked for as long as the expression
takes to be evaluated, or for as long as the debugger stays active if it
is entered.

 -- Scheme Procedure: spawn-coop-repl-server [server-socket]
     Create and return a new cooperative REPL server object, and spawn a
     new thread to listen for connections on SERVER-SOCKET. Proper
     functioning of the REPL server requires that
     ‘poll-coop-repl-server’ be called periodically on the returned
     server object.

 -- Scheme Procedure: poll-coop-repl-server coop-server
     Poll the cooperative REPL server COOP-SERVER and apply a pending
     operation if there is one, such as evaluating an expression typed
     at the REPL prompt.
     This procedure must be called from the same
     thread that called ‘spawn-coop-repl-server’.


File: guile.info, Node: Memory Management, Next: Modules, Prev: Read/Load/Eval/Compile, Up: API Reference

6.17 Memory Management and Garbage Collection
=============================================

Guile uses a _garbage collector_ to manage most of its objects. While
the garbage collector is designed to be mostly invisible, you sometimes
need to interact with it explicitly.

   See *note Garbage Collection:: for a general discussion of how
garbage collection relates to using Guile from C.

* Menu:

* Garbage Collection Functions::
* Memory Blocks::
* Weak References::
* Guardians::


File: guile.info, Node: Garbage Collection Functions, Next: Memory Blocks, Up: Memory Management

6.17.1 Functions related to Garbage Collection
----------------------------------------------

 -- Scheme Procedure: gc
 -- C Function: scm_gc ()
     Finds all of the “live” ‘SCM’ objects and reclaims for further use
     those that are no longer accessible. You normally don’t need to
     call this function explicitly. Its functionality is invoked
     automatically as needed.

 -- C Function: SCM scm_gc_protect_object (SCM OBJ)
     Protects OBJ from being freed by the garbage collector, when it
     otherwise might be. When you are done with the object, call
     ‘scm_gc_unprotect_object’ on the object. Calls to
     ‘scm_gc_protect_object’/‘scm_gc_unprotect_object’ can be nested,
     and the object remains protected until it has been unprotected as
     many times as it was protected. It is an error to unprotect an
     object more times than it has been protected. Returns the SCM
     object it was passed.

     Note that storing OBJ in a C global variable has the same
     effect(1).

 -- C Function: SCM scm_gc_unprotect_object (SCM OBJ)

     Unprotects an object from the garbage collector which was protected
     by ‘scm_gc_protect_object’. Returns the SCM object it was
     passed.

 -- C Function: SCM scm_permanent_object (SCM OBJ)

     Similar to ‘scm_gc_protect_object’ in that it causes the collector
     to always mark the object, except that it should not be nested
     (only call ‘scm_permanent_object’ on an object once), and it has no
     corresponding unpermanent function. Once an object is declared
     permanent, it will never be freed. Returns the SCM object it was
     passed.

 -- C Macro: void scm_remember_upto_here_1 (SCM obj)
 -- C Macro: void scm_remember_upto_here_2 (SCM obj1, SCM obj2)
     Create a reference to the given object or objects, so they’re
     certain to be present on the stack or in a register and hence will
     not be freed by the garbage collector before this point.

     Note that these macros can only be applied to ordinary C local
     variables (i.e. “automatics”). Objects held in global or static
     variables or some malloced block or the like cannot be protected
     with this mechanism.

 -- Scheme Procedure: gc-stats
 -- C Function: scm_gc_stats ()
     Return an association list of statistics about Guile’s current use
     of storage.

 -- Scheme Procedure: gc-live-object-stats
 -- C Function: scm_gc_live_object_stats ()
     Return an alist of statistics of the current live objects.

 -- Function: void scm_gc_mark (SCM X)
     Mark the object X, and recurse on any objects X refers to. If X’s
     mark bit is already set, return immediately. This function must
     only be called during the mark-phase of garbage collection,
     typically from a smob _mark_ function.
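
   From Scheme, ‘gc’ and ‘gc-stats’ can be combined to get a rough view
of the collector’s state, as in the sketch below. The variable names are
invented for the example. The keys present in the association list are
not specified in this section, so the code prints whatever entries it
finds rather than assuming particular names.

```scheme
;; Allocate some short-lived garbage.  The result is stored in a
;; variable so that the allocation is not optimized away.
(define junk #f)
(let loop ((i 0))
  (when (< i 100000)
    (set! junk (make-vector 16 i))
    (loop (+ i 1))))
(set! junk #f)

;; Force a collection; this is rarely needed outside of experiments.
(gc)

;; Print every statistic as "key: value".
(define stats (gc-stats))
(for-each (lambda (entry)
            (display (car entry))
            (display ": ")
            (display (cdr entry))
            (newline))
          stats)
```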
4201 4202 ---------- Footnotes ---------- 4203 4204 (1) In Guile up to version 1.8, C global variables were not visited 4205by the garbage collector in the mark phase; hence, 4206‘scm_gc_protect_object’ was the only way in C to prevent a Scheme object 4207from being freed. 4208 4209 4210File: guile.info, Node: Memory Blocks, Next: Weak References, Prev: Garbage Collection Functions, Up: Memory Management 4211 42126.17.2 Memory Blocks 4213-------------------- 4214 4215In C programs, dynamic management of memory blocks is normally done with 4216the functions malloc, realloc, and free. Guile has additional functions 4217for dynamic memory allocation that are integrated into the garbage 4218collector and the error reporting system. 4219 4220 Memory blocks that are associated with Scheme objects (for example a 4221foreign object) should be allocated with ‘scm_gc_malloc’ or 4222‘scm_gc_malloc_pointerless’. These two functions will either return a 4223valid pointer or signal an error. Memory blocks allocated this way may 4224be released explicitly; however, this is not strictly needed, and we 4225recommend _not_ calling ‘scm_gc_free’. All memory allocated with 4226‘scm_gc_malloc’ or ‘scm_gc_malloc_pointerless’ is automatically 4227reclaimed when the garbage collector no longer sees any live reference 4228to it(1). 4229 4230 When garbage collection occurs, Guile will visit the words in memory 4231allocated with ‘scm_gc_malloc’, looking for live pointers. This means 4232that if ‘scm_gc_malloc’-allocated memory contains a pointer to some 4233other part of the memory, the garbage collector notices it and prevents 4234it from being reclaimed(2). Conversely, memory allocated with 4235‘scm_gc_malloc_pointerless’ is assumed to be “pointer-less” and is not 4236scanned for pointers. 4237 4238 For memory that is not associated with a Scheme object, you can use 4239‘scm_malloc’ instead of ‘malloc’. Like ‘scm_gc_malloc’, it will either 4240return a valid pointer or signal an error. 
However, it will not assume
that the new memory block can be freed by a garbage collection. The
memory must be explicitly freed with ‘free’.

   There are also ‘scm_gc_realloc’ and ‘scm_realloc’, to be used in
place of ‘realloc’ when appropriate, and ‘scm_gc_calloc’ and
‘scm_calloc’, to be used in place of ‘calloc’ when appropriate.

   The function ‘scm_dynwind_free’ can be useful when memory should be
freed with libc’s ‘free’ when leaving a dynwind context, *note Dynamic
Wind::.

 -- C Function: void * scm_malloc (size_t SIZE)
 -- C Function: void * scm_calloc (size_t SIZE)
     Allocate SIZE bytes of memory and return a pointer to it. When
     SIZE is 0, return ‘NULL’. When not enough memory is available,
     signal an error. This function runs the GC to free up some memory
     when it deems it appropriate.

     The memory is allocated by the libc ‘malloc’ function and can be
     freed with ‘free’. There is deliberately no ‘scm_free’ function to
     go with ‘scm_malloc’; this makes it easier to pass memory back and
     forth between different modules.

     The function ‘scm_calloc’ is similar to ‘scm_malloc’, but
     initializes the block of memory to zero as well.

     These functions will (indirectly) call
     ‘scm_gc_register_allocation’.

 -- C Function: void * scm_realloc (void *MEM, size_t NEW_SIZE)
     Change the size of the memory block at MEM to NEW_SIZE and return
     its new location. When NEW_SIZE is 0, this is the same as calling
     ‘free’ on MEM and ‘NULL’ is returned. When MEM is ‘NULL’, this
     function behaves like ‘scm_malloc’ and allocates a new block of
     size NEW_SIZE.

     When not enough memory is available, signal an error. This
     function runs the GC to free up some memory when it deems it
     appropriate.

     This function will call ‘scm_gc_register_allocation’.

 -- C Function: void * scm_gc_malloc (size_t SIZE, const char *WHAT)
 -- C Function: void * scm_gc_malloc_pointerless (size_t SIZE, const
          char *WHAT)
 -- C Function: void * scm_gc_realloc (void *MEM, size_t OLD_SIZE,
          size_t NEW_SIZE, const char *WHAT)
 -- C Function: void * scm_gc_calloc (size_t SIZE, const char *WHAT)
     Allocate SIZE bytes of automatically-managed memory. The memory is
     automatically freed when no longer referenced from any live memory
     block.

     When garbage collection occurs, Guile will visit the words in
     memory allocated with ‘scm_gc_malloc’ or ‘scm_gc_calloc’, looking
     for pointers to other memory allocations that are managed by the
     GC. In contrast, memory allocated by ‘scm_gc_malloc_pointerless’ is
     not scanned for pointers.

     The ‘scm_gc_realloc’ call preserves the “pointerlessness” of the
     memory area pointed to by MEM. Note that you need to pass the old
     size of a reallocated memory block as well. See ‘scm_gc_free’
     below for the motivation.

 -- C Function: void scm_gc_free (void *MEM, size_t SIZE, const char
          *WHAT)
     Explicitly free the memory block pointed to by MEM, which was
     previously allocated by one of the above ‘scm_gc’ functions. This
     function is almost always unnecessary, except for codebases that
     still need to compile on Guile 1.8.

     Note that you need to explicitly pass the SIZE parameter. This is
     done since it should normally be easy to provide this parameter
     (for memory that is associated with GC controlled objects) and it
     helps keep the memory management overhead very low. However, in
     Guile 2.x, SIZE is always ignored.

 -- C Function: void scm_gc_register_allocation (size_t SIZE)
     Informs the garbage collector that SIZE bytes have been allocated,
     which the collector would otherwise not have known about.
4320 4321 In general, Scheme will decide to collect garbage only after some 4322 amount of memory has been allocated. Calling this function will 4323 make the Scheme garbage collector know about more allocation, and 4324 thus run more often (as appropriate). 4325 4326 It is especially important to call this function when large 4327 unmanaged allocations, like images, may be freed by small Scheme 4328 allocations, like foreign objects. 4329 4330 -- C Function: void scm_dynwind_free (void *mem) 4331 Equivalent to ‘scm_dynwind_unwind_handler (free, MEM, 4332 SCM_F_WIND_EXPLICITLY)’. That is, the memory block at MEM will be 4333 freed (using ‘free’ from the C library) when the current dynwind is 4334 left. 4335 4336 -- Scheme Procedure: malloc-stats 4337 Return an alist ((WHAT . N) ...) describing number of malloced 4338 objects. WHAT is the second argument to ‘scm_gc_malloc’, N is the 4339 number of objects of that type currently allocated. 4340 4341 This function is only available if the ‘GUILE_DEBUG_MALLOC’ 4342 preprocessor macro was defined when Guile was compiled. 4343 4344 ---------- Footnotes ---------- 4345 4346 (1) In Guile up to version 1.8, memory allocated with ‘scm_gc_malloc’ 4347_had_ to be freed with ‘scm_gc_free’. 4348 4349 (2) In Guile up to 1.8, memory allocated with ‘scm_gc_malloc’ was 4350_not_ visited by the collector in the mark phase. Consequently, the GC 4351had to be told explicitly about pointers to live objects contained in 4352the memory block, e.g., via SMOB mark functions (*note 4353‘scm_set_smob_mark’: Smobs.) 4354 4355 4356File: guile.info, Node: Weak References, Next: Guardians, Prev: Memory Blocks, Up: Memory Management 4357 43586.17.3 Weak References 4359---------------------- 4360 4361[FIXME: This chapter is based on Mikael Djurfeldt’s answer to a question 4362by Michael Livshin. Any mistakes are not theirs, of course. 
]

   Weak references let you attach bookkeeping information to data so
that the additional information automatically disappears when the
original data is no longer in use and gets garbage collected. In a weak
key hash, the hash entry for that key disappears as soon as the key is
no longer referenced from anywhere else. For weak value hashes, the
same happens as soon as the value is no longer in use. Entries in a
doubly weak hash disappear when either the key or the value is no longer
used anywhere else.

   Object properties offer the same kind of functionality as weak key
hashes in many situations. (*note Object Properties::)

   Here’s an example (a little bit strained perhaps, but one of the
examples is actually used in Guile):

   Assume that you’re implementing a debugging system where you want to
associate information about filename and position of source code
expressions with the expressions themselves.

   Hash tables can be used for that, but if you use ordinary hash tables
it will be impossible for the Scheme interpreter to "forget" old source
when, for example, a file is reloaded.

   To implement the mapping from source code expressions to positional
information it is necessary to use weak-key tables since we don’t want
the expressions to be remembered just because they are in our table.

   To implement a mapping from source file line numbers to source code
expressions you would use a weak-value table.

   To implement a mapping from source code expressions to the procedures
they constitute, a doubly-weak table has to be used.

* Menu:

* Weak hash tables::
* Weak vectors::


File: guile.info, Node: Weak hash tables, Next: Weak vectors, Up: Weak References

6.17.3.1 Weak hash tables
.........................
4407 4408 -- Scheme Procedure: make-weak-key-hash-table [size] 4409 -- Scheme Procedure: make-weak-value-hash-table [size] 4410 -- Scheme Procedure: make-doubly-weak-hash-table [size] 4411 -- C Function: scm_make_weak_key_hash_table (size) 4412 -- C Function: scm_make_weak_value_hash_table (size) 4413 -- C Function: scm_make_doubly_weak_hash_table (size) 4414 Return a weak hash table with SIZE buckets. As with any hash 4415 table, choosing a good size for the table requires some caution. 4416 4417 You can modify weak hash tables in exactly the same way you would 4418 modify regular hash tables, with the exception of the routines that 4419 act on handles. Weak tables have a different implementation behind 4420 the scenes that doesn’t have handles. *note Hash Tables::, for 4421 more on ‘hashq-ref’ et al. 4422 4423 Note that in a weak-key hash table, the reference to the value is 4424strong. This means that if the value references the key, even 4425indirectly, the key will never be collected, which can lead to a memory 4426leak. The reverse is true for weak value tables. 4427 4428 -- Scheme Procedure: weak-key-hash-table? obj 4429 -- Scheme Procedure: weak-value-hash-table? obj 4430 -- Scheme Procedure: doubly-weak-hash-table? obj 4431 -- C Function: scm_weak_key_hash_table_p (obj) 4432 -- C Function: scm_weak_value_hash_table_p (obj) 4433 -- C Function: scm_doubly_weak_hash_table_p (obj) 4434 Return ‘#t’ if OBJ is the specified weak hash table. Note that a 4435 doubly weak hash table is neither a weak key nor a weak value hash 4436 table. 4437 4438 4439File: guile.info, Node: Weak vectors, Prev: Weak hash tables, Up: Weak References 4440 44416.17.3.2 Weak vectors 4442..................... 4443 4444 -- Scheme Procedure: make-weak-vector size [fill] 4445 -- C Function: scm_make_weak_vector (size, fill) 4446 Return a weak vector with SIZE elements. If the optional argument 4447 FILL is given, all entries in the vector will be set to FILL. 
The 4448 default value for FILL is the empty list. 4449 4450 -- Scheme Procedure: weak-vector elem ... 4451 -- Scheme Procedure: list->weak-vector l 4452 -- C Function: scm_weak_vector (l) 4453 Construct a weak vector from a list: ‘weak-vector’ uses the list of 4454 its arguments while ‘list->weak-vector’ uses its only argument L (a 4455 list) to construct a weak vector the same way ‘list->vector’ would. 4456 4457 -- Scheme Procedure: weak-vector? obj 4458 -- C Function: scm_weak_vector_p (obj) 4459 Return ‘#t’ if OBJ is a weak vector. 4460 4461 -- Scheme Procedure: weak-vector-ref wvect k 4462 -- C Function: scm_weak_vector_ref (wvect, k) 4463 Return the Kth element of the weak vector WVECT, or ‘#f’ if that 4464 element has been collected. 4465 4466 -- Scheme Procedure: weak-vector-set! wvect k elt 4467 -- C Function: scm_weak_vector_set_x (wvect, k, elt) 4468 Set the Kth element of the weak vector WVECT to ELT. 4469 4470 4471File: guile.info, Node: Guardians, Prev: Weak References, Up: Memory Management 4472 44736.17.4 Guardians 4474---------------- 4475 4476Guardians provide a way to be notified about objects that would 4477otherwise be collected as garbage. Guarding them prevents the objects 4478from being collected and cleanup actions can be performed on them, for 4479example. 4480 4481 See R. Kent Dybvig, Carl Bruggeman, and David Eby (1993) "Guardians 4482in a Generation-Based Garbage Collector". ACM SIGPLAN Conference on 4483Programming Language Design and Implementation, June 1993. 4484 4485 -- Scheme Procedure: make-guardian 4486 -- C Function: scm_make_guardian () 4487 Create a new guardian. A guardian protects a set of objects from 4488 garbage collection, allowing a program to apply cleanup or other 4489 actions. 4490 4491 ‘make-guardian’ returns a procedure representing the guardian. 4492 Calling the guardian procedure with an argument adds the argument 4493 to the guardian’s set of protected objects. 
Calling the guardian 4494 procedure without an argument returns one of the protected objects 4495 which are ready for garbage collection, or ‘#f’ if no such object 4496 is available. Objects which are returned in this way are removed 4497 from the guardian. 4498 4499 You can put a single object into a guardian more than once and you 4500 can put a single object into more than one guardian. The object 4501 will then be returned multiple times by the guardian procedures. 4502 4503 An object is eligible to be returned from a guardian when it is no 4504 longer referenced from outside any guardian. 4505 4506 There is no guarantee about the order in which objects are returned 4507 from a guardian. If you want to impose an order on finalization 4508 actions, for example, you can do that by keeping objects alive in 4509 some global data structure until they are no longer needed for 4510 finalizing other objects. 4511 4512 Being an element in a weak vector, a key in a hash table with weak 4513 keys, or a value in a hash table with weak values does not prevent 4514 an object from being returned by a guardian. But as long as an 4515 object can be returned from a guardian it will not be removed from 4516 such a weak vector or hash table. In other words, a weak link does 4517 not prevent an object from being considered collectable, but being 4518 inside a guardian prevents a weak link from being broken. 4519 4520 A key in a weak key hash table can be thought of as having a strong 4521 reference to its associated value as long as the key is accessible. 4522 Consequently, when the key is only accessible from within a 4523 guardian, the reference from the key to the value is also 4524 considered to be coming from within a guardian. Thus, if there is 4525 no other reference to the value, it is eligible to be returned from 4526 a guardian. 
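The guardian protocol described above can be sketched as follows. This is a minimal illustration, not a definitive pattern: `resource` and `drain-guardian` are hypothetical names introduced here, and because the timing of garbage collection is unspecified, the drain loop may find nothing to return on a given run.

```scheme
;; Create a guardian; the result is a procedure.
(define g (make-guardian))

;; `resource` is a hypothetical stand-in for an object needing cleanup.
(define resource (list 'buffer 1 2 3))

;; Calling the guardian with an argument registers that object.
(g resource)

;; The object is still referenced from outside the guardian, so
;; calling the guardian without arguments returns #f here.
(define before (g))

;; Drop the outside reference and request a collection.
(set! resource #f)
(gc)

;; Hypothetical helper: apply CLEANUP to every object the guardian
;; currently considers ready, and return how many were handled.
(define (drain-guardian guardian cleanup)
  (let loop ((obj (guardian)) (count 0))
    (if obj
        (begin
          (cleanup obj)
          (loop (guardian) (+ count 1)))
        count)))
```

A program would typically call something like `(drain-guardian g release!)` at convenient points, where `release!` is whatever cleanup procedure the application defines.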
4527 4528 4529File: guile.info, Node: Modules, Next: Foreign Function Interface, Prev: Memory Management, Up: API Reference 4530 45316.18 Modules 4532============ 4533 4534When programs become large, naming conflicts can occur when a function 4535or global variable defined in one file has the same name as a function 4536or global variable in another file. Even just a _similarity_ between 4537function names can cause hard-to-find bugs, since a programmer might 4538type the wrong function name. 4539 4540 The approach used to tackle this problem is called _information 4541encapsulation_, which consists of packaging functional units into a 4542given name space that is clearly separated from other name spaces. 4543 4544 The language features that allow this are usually called _the module 4545system_ because programs are broken up into modules that are compiled 4546separately (or loaded separately in an interpreter). 4547 4548 Older languages, like C, have limited support for name space 4549manipulation and protection. In C a variable or function is public by 4550default, and can be made local to a module with the ‘static’ keyword. 4551But you cannot reference public variables and functions from another 4552module with different names. 4553 4554 More advanced module systems have become a common feature in recently 4555designed languages: ML, Python, Perl, and Modula 3 all allow the 4556_renaming_ of objects from a foreign module, so they will not clutter 4557the global name space. 4558 4559 In addition, Guile offers variables as first-class objects. They can 4560be used for interacting with the module system. 4561 4562* Menu: 4563 4564* General Information about Modules:: Guile module basics. 4565* Using Guile Modules:: How to use existing modules. 4566* Creating Guile Modules:: How to package your code into modules. 4567* Modules and the File System:: Installing modules in the file system. 4568* R6RS Version References:: Using version numbers with modules. 
4569* R6RS Libraries:: The library and import forms. 4570* Variables:: First-class variables. 4571* Module System Reflection:: First-class modules. 4572* Declarative Modules:: Allowing Guile to reason about modules. 4573* Accessing Modules from C:: How to work with modules with C code. 4574* provide and require:: The SLIB feature mechanism. 4575* Environments:: R5RS top-level environments. 4576 4577 4578File: guile.info, Node: General Information about Modules, Next: Using Guile Modules, Up: Modules 4579 45806.18.1 General Information about Modules 4581---------------------------------------- 4582 4583A Guile module can be thought of as a collection of named procedures, 4584variables and macros. More precisely, it is a set of “bindings” of 4585symbols (names) to Scheme objects. 4586 4587 Within a module, all bindings are visible. Certain bindings can be 4588declared “public”, in which case they are added to the module’s 4589so-called “export list”; this set of public bindings is called the 4590module’s “public interface” (*note Creating Guile Modules::). 4591 4592 A client module “uses” a providing module’s bindings by either 4593accessing the providing module’s public interface, or by building a 4594custom interface (and then accessing that). In a custom interface, the 4595client module can “select” which bindings to access and can also 4596algorithmically “rename” bindings. In contrast, when using the 4597providing module’s public interface, the entire export list is available 4598without renaming (*note Using Guile Modules::). 4599 4600 All Guile modules have a unique “module name”, for example ‘(ice-9 4601popen)’ or ‘(srfi srfi-11)’. Module names are lists of one or more 4602symbols. 4603 4604 When Guile goes to use an interface from a module, for example 4605‘(ice-9 popen)’, Guile first looks to see if it has loaded ‘(ice-9 4606popen)’ for any reason. 
If the module has not been loaded yet, Guile 4607searches a “load path” for a file that might define it, and loads that 4608file. 4609 4610 The following subsections go into more detail on using, creating, 4611installing, and otherwise manipulating modules and the module system. 4612 4613 4614File: guile.info, Node: Using Guile Modules, Next: Creating Guile Modules, Prev: General Information about Modules, Up: Modules 4615 46166.18.2 Using Guile Modules 4617-------------------------- 4618 4619To use a Guile module is to access either its public interface or a 4620custom interface (*note General Information about Modules::). Both 4621types of access are handled by the syntactic form ‘use-modules’, which 4622accepts one or more interface specifications and, upon evaluation, 4623arranges for those interfaces to be available to the current module. 4624This process may include locating and loading code for a given module if 4625that code has not yet been loaded, following ‘%load-path’ (*note Modules 4626and the File System::). 4627 4628 An “interface specification” has one of two forms. The first 4629variation is simply to name the module, in which case its public 4630interface is the one accessed. For example: 4631 4632 (use-modules (ice-9 popen)) 4633 4634 Here, the interface specification is ‘(ice-9 popen)’, and the result 4635is that the current module now has access to ‘open-pipe’, ‘close-pipe’, 4636‘open-input-pipe’, and so on (*note Pipes::). 4637 4638 Note in the previous example that if the current module had already 4639defined ‘open-pipe’, that definition would be overwritten by the 4640definition in ‘(ice-9 popen)’. For this reason (and others), there is a 4641second variation of interface specification that not only names a module 4642to be accessed, but also selects bindings from it and renames them to 4643suit the current module’s needs. For example: 4644 4645 (use-modules ((ice-9 popen) 4646 #:select ((open-pipe . 
pipe-open) close-pipe) 4647 #:renamer (symbol-prefix-proc 'unixy:))) 4648 4649or more simply: 4650 4651 (use-modules ((ice-9 popen) 4652 #:select ((open-pipe . pipe-open) close-pipe) 4653 #:prefix unixy:)) 4654 4655 Here, the interface specification is more complex than before, and 4656the result is that a custom interface with only two bindings is created 4657and subsequently accessed by the current module. The mapping of old to 4658new names is as follows: 4659 4660 (ice-9 popen) sees: current module sees: 4661 open-pipe unixy:pipe-open 4662 close-pipe unixy:close-pipe 4663 4664 This example also shows how to use the convenience procedure 4665‘symbol-prefix-proc’. 4666 4667 You can also directly refer to bindings in a module by using the ‘@’ 4668syntax. For example, instead of using the ‘use-modules’ statement from 4669above and writing ‘unixy:pipe-open’ to refer to ‘open-pipe’ from 4670‘(ice-9 popen)’, you could also write ‘(@ (ice-9 popen) open-pipe)’. 4671Thus an alternative to the complete ‘use-modules’ statement would be 4672 4673 (define unixy:pipe-open (@ (ice-9 popen) open-pipe)) 4674 (define unixy:close-pipe (@ (ice-9 popen) close-pipe)) 4675 4676 There is also ‘@@’, which can be used like ‘@’, but does not check 4677whether the variable that is being accessed is actually exported. Thus, 4678‘@@’ can be thought of as the impolite version of ‘@’ and should only be 4679used as a last resort or for debugging, for example. 4680 4681 Note that just as with a ‘use-modules’ statement, any module that has 4682not yet been loaded will be loaded when referenced by a ‘@’ or ‘@@’ 4683form. 4684 4685 You can also use the ‘@’ and ‘@@’ syntaxes as the target of a ‘set!’ 4686when the binding refers to a variable. 4687 4688 -- Scheme Procedure: symbol-prefix-proc prefix-sym 4689 Return a procedure that prefixes its arg (a symbol) with 4690 PREFIX-SYM. 4691 4692 -- syntax: use-modules spec ...
4693 Resolve each interface specification SPEC into an interface and 4694 arrange for these to be accessible by the current module. The 4695 return value is unspecified. 4696 4697 SPEC can be a list of symbols, in which case it names a module 4698 whose public interface is found and used. 4699 4700 SPEC can also be of the form: 4701 4702 (MODULE-NAME [#:select SELECTION] 4703 [#:prefix PREFIX] 4704 [#:renamer RENAMER]) 4705 4706 in which case a custom interface is newly created and used. 4707 MODULE-NAME is a list of symbols, as above; SELECTION is a list of 4708 selection-specs; PREFIX is a symbol that is prepended to imported 4709 names; and RENAMER is a procedure that takes a symbol and returns 4710 its new name. A selection-spec is either a symbol or a pair of 4711 symbols ‘(ORIG . SEEN)’, where ORIG is the name in the used module 4712 and SEEN is the name in the using module. Note that SEEN is also 4713 modified by PREFIX and RENAMER. 4714 4715 The ‘#:select’, ‘#:prefix’, and ‘#:renamer’ clauses are optional. 4716 If all are omitted, the returned interface has no bindings. If the 4717 ‘#:select’ clause is omitted, PREFIX and RENAMER operate on the 4718 used module’s public interface. 4719 4720 In addition to the above, SPEC can also include a ‘#:version’ 4721 clause, of the form: 4722 4723 #:version VERSION-SPEC 4724 4725 where VERSION-SPEC is an R6RS-compatible version reference. An 4726 error will be signaled in the case in which a module with the same 4727 name has already been loaded, if that module specifies a version 4728 and that version is not compatible with VERSION-SPEC. *Note R6RS 4729 Version References::, for more on version references. 4730 4731 If the module name is not resolvable, ‘use-modules’ will signal an 4732 error. 4733 4734 -- syntax: @ module-name binding-name 4735 Refer to the binding named BINDING-NAME in module MODULE-NAME. The 4736 binding must have been exported by the module. 
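As a small check of how a custom interface relates to the ‘@’ syntax, the sketch below imports two bindings from ‘(ice-9 popen)’ under renamed, prefixed names and compares them with the same bindings reached through ‘@’. The names `same-open?` and `same-close?` are introduced here for illustration only.

```scheme
;; Build a custom interface: select two bindings, rename one,
;; and prefix both with `unixy:'.
(use-modules ((ice-9 popen)
              #:select ((open-pipe . pipe-open) close-pipe)
              #:prefix unixy:))

;; The renamed imports and the `@' references denote the same
;; procedures, so both comparisons yield #t.
(define same-open?
  (eq? unixy:pipe-open (@ (ice-9 popen) open-pipe)))

(define same-close?
  (eq? unixy:close-pipe (@ (ice-9 popen) close-pipe)))
```

Renaming changes only the name seen by the importing module; the underlying procedure object is shared.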
4737 4738 -- syntax: @@ module-name binding-name 4739 Refer to the binding named BINDING-NAME in module MODULE-NAME. The 4740 binding must not have been exported by the module. This syntax is 4741 only intended for debugging purposes or as a last resort. *Note 4742 Declarative Modules::, for some limitations on the use of ‘@@’. 4743 4744 4745File: guile.info, Node: Creating Guile Modules, Next: Modules and the File System, Prev: Using Guile Modules, Up: Modules 4746 47476.18.3 Creating Guile Modules 4748----------------------------- 4749 4750When you want to create your own modules, you have to take the following 4751steps: 4752 4753 • Create a Scheme source file and add all variables and procedures 4754 you wish to export, or which are required by the exported 4755 procedures. 4756 4757 • Add a ‘define-module’ form at the beginning. 4758 4759 • Export all bindings which should be in the public interface, either 4760 by using ‘define-public’ or ‘export’ (both documented below). 4761 4762 -- syntax: define-module module-name option ... 4763 MODULE-NAME is a list of one or more symbols. 4764 4765 (define-module (ice-9 popen)) 4766 4767 ‘define-module’ makes this module available to Guile programs under 4768 the given MODULE-NAME. 4769 4770 OPTION ... are keyword/value pairs which specify more about the 4771 defined module. The recognized options and their meaning are shown 4772 in the following table. 4773 4774 ‘#:use-module INTERFACE-SPECIFICATION’ 4775 Equivalent to a ‘(use-modules INTERFACE-SPECIFICATION)’ (*note 4776 Using Guile Modules::). 4777 4778 ‘#:autoload MODULE SYMBOL-LIST’ 4779 Load MODULE when any of SYMBOL-LIST are accessed. For 4780 example, 4781 4782 (define-module (my mod) 4783 #:autoload (srfi srfi-1) (partition delete-duplicates)) 4784 ... 4785 (when something 4786 (set! foo (delete-duplicates ...))) 4787 4788 When a module is autoloaded, only the bindings in SYMBOL-LIST 4789 become available(1). 
4790 4791 An autoload is a good way to put off loading a big module 4792 until it’s really needed, for instance for faster startup or 4793 if it will only be needed in certain circumstances. 4794 4795 ‘#:export LIST’ 4796 Export all identifiers in LIST which must be a list of symbols 4797 or pairs of symbols. This is equivalent to ‘(export LIST)’ in 4798 the module body. 4799 4800 ‘#:re-export LIST’ 4801 Re-export all identifiers in LIST which must be a list of 4802 symbols or pairs of symbols. The symbols in LIST must be 4803 imported by the current module from other modules. This is 4804 equivalent to ‘re-export’ below. 4805 4806 ‘#:replace LIST’ 4807 Export all identifiers in LIST (a list of symbols or pairs of 4808 symbols) and mark them as “replacing bindings”. In the module 4809 user’s name space, this will have the effect of replacing any 4810 binding with the same name that is not also “replacing”. 4811 Normally a replacement results in an “override” warning 4812 message, ‘#:replace’ avoids that. 4813 4814 In general, a module that exports a binding for which the 4815 ‘(guile)’ module already has a definition should use 4816 ‘#:replace’ instead of ‘#:export’. ‘#:replace’, in a sense, 4817 lets Guile know that the module _purposefully_ replaces a core 4818 binding. It is important to note, however, that this binding 4819 replacement is confined to the name space of the module user. 4820 In other words, the value of the core binding in question 4821 remains unchanged for other modules. 4822 4823 Note that although it is often a good idea for the replaced 4824 binding to remain compatible with a binding in ‘(guile)’, to 4825 avoid surprising the user, sometimes the bindings will be 4826 incompatible. For example, SRFI-19 exports its own version of 4827 ‘current-time’ (*note SRFI-19 Time::) which is not compatible 4828 with the core ‘current-time’ function (*note Time::). 
Guile 4829 assumes that a user importing a module knows what she is 4830 doing, and uses ‘#:replace’ for this binding rather than 4831 ‘#:export’. 4832 4833 A ‘#:replace’ clause is equivalent to ‘(export! LIST)’ in the 4834 module body. 4835 4836 The ‘#:duplicates’ option (see below) provides fine-grained control 4837 over duplicate binding handling on the module-user side. 4838 4839 ‘#:re-export-and-replace LIST’ 4840 Like ‘#:re-export’, but also marking the bindings as 4841 replacements in the sense of ‘#:replace’. 4842 4843 ‘#:version LIST’ 4844 Specify a version for the module in the form of LIST, a list 4845 of zero or more exact, nonnegative integers. The 4846 corresponding ‘#:version’ option in the ‘use-modules’ form 4847 allows callers to restrict the value of this option in various 4848 ways. 4849 4850 ‘#:duplicates LIST’ 4851 Tell Guile to handle duplicate bindings for the bindings 4852 imported by the current module according to the policy defined 4853 by LIST, a list of symbols. LIST must contain symbols 4854 representing a duplicate binding handling policy chosen among 4855 the following: 4856 4857 ‘check’ 4858 Raises an error when a binding is imported from more than 4859 one place. 4860 ‘warn’ 4861 Issue a warning when a binding is imported from more than 4862 one place and leave the responsibility of actually 4863 handling the duplication to the next duplicate binding 4864 handler. 4865 ‘replace’ 4866 When a new binding is imported that has the same name as 4867 a previously imported binding, then do the following: 4868 4869 1. If the old binding was said to be “replacing” (via 4870 the ‘#:replace’ option above) and the new binding is 4871 not replacing, then keep the old binding. 4872 2. If the old binding was not said to be replacing and 4873 the new binding is replacing, then replace the old 4874 binding with the new one. 4875 3. If neither the old nor the new binding is replacing, 4876 then keep the old one.
4877 4878 ‘warn-override-core’ 4879 Issue a warning when a core binding is being overwritten 4880 and actually override the core binding with the new one. 4881 ‘first’ 4882 In case of duplicate bindings, the firstly imported 4883 binding is always the one which is kept. 4884 ‘last’ 4885 In case of duplicate bindings, the lastly imported 4886 binding is always the one which is kept. 4887 ‘noop’ 4888 In case of duplicate bindings, leave the responsibility 4889 to the next duplicate handler. 4890 4891 If LIST contains more than one symbol, then the duplicate 4892 binding handlers which appear first will be used first when 4893 resolving a duplicate binding situation. As mentioned above, 4894 some resolution policies may explicitly leave the 4895 responsibility of handling the duplication to the next handler 4896 in LIST. 4897 4898 If GOOPS has been loaded before the ‘#:duplicates’ clause is 4899 processed, there are additional strategies available for 4900 dealing with generic functions. *Note Merging Generics::, for 4901 more information. 4902 4903 The default duplicate binding resolution policy is given by 4904 the ‘default-duplicate-binding-handler’ procedure, and is 4905 4906 (replace warn-override-core warn last) 4907 4908 ‘#:pure’ 4909 Create a “pure” module, that is a module which does not 4910 contain any of the standard procedure bindings except for the 4911 syntax forms. This is useful if you want to create “safe” 4912 modules, that is modules which do not know anything about 4913 dangerous procedures. 4914 4915 -- syntax: export variable ... 4916 Add all VARIABLEs (which must be symbols or pairs of symbols) to 4917 the list of exported bindings of the current module. If VARIABLE 4918 is a pair, its ‘car’ gives the name of the variable as seen by the 4919 current module and its ‘cdr’ specifies a name for the binding in 4920 the current module’s public interface. 4921 4922 -- syntax: define-public ... 4923 Equivalent to ‘(begin (define foo ...) 
(export foo))’. 4924 4925 -- syntax: re-export variable ... 4926 Add all VARIABLEs (which must be symbols or pairs of symbols) to 4927 the list of re-exported bindings of the current module. Pairs of 4928 symbols are handled as in ‘export’. Re-exported bindings must be 4929 imported by the current module from some other module. 4930 4931 -- syntax: export! variable ... 4932 Like ‘export’, but marking the exported variables as replacing. 4933 Using a module with replacing bindings will cause any existing 4934 bindings to be replaced without issuing any warnings. See the 4935 discussion of ‘#:replace’ above. 4936 4937 ---------- Footnotes ---------- 4938 4939 (1) In Guile 2.2 and earlier, _all_ the module bindings would become 4940available; SYMBOL-LIST was just the list of bindings that will first 4941trigger the load. 4942 4943 4944File: guile.info, Node: Modules and the File System, Next: R6RS Version References, Prev: Creating Guile Modules, Up: Modules 4945 49466.18.4 Modules and the File System 4947---------------------------------- 4948 4949Typical programs only use a small subset of modules installed on a Guile 4950system. In order to keep startup time down, Guile only loads modules 4951when a program uses them, on demand. 4952 4953 When a program evaluates ‘(use-modules (ice-9 popen))’, and the 4954module is not loaded, Guile searches for a conventionally-named file 4955in the “load path”. 4956 4957 In this case, loading ‘(ice-9 popen)’ will eventually cause Guile to 4958run ‘(primitive-load-path "ice-9/popen")’. ‘primitive-load-path’ will 4959search for a file ‘ice-9/popen’ in the ‘%load-path’ (*note Load 4960Paths::). For each directory in ‘%load-path’, Guile will try to find 4961the file name, concatenated with the extensions from ‘%load-extensions’. 4962By default, this will cause Guile to ‘stat’ ‘ice-9/popen.scm’, and then 4963‘ice-9/popen’. *Note Load Paths::, for more on ‘primitive-load-path’.
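The lookup described above can be inspected from Scheme. The following is a sketch under assumptions: `candidates` and `found` are names introduced here, and the result of the search depends on how Guile is installed, so it may be a file name or ‘#f’.

```scheme
;; The relative file names tried in each `%load-path' directory:
;; the module's path with each entry of `%load-extensions' appended.
(define candidates
  (map (lambda (ext) (string-append "ice-9/popen" ext))
       %load-extensions))

;; Search `%load-path' for the module's source file.  The result
;; is an absolute file name, or #f if no candidate exists.
(define found (%search-load-path "ice-9/popen"))
```

With the default ‘%load-extensions’, `candidates` holds the two names mentioned in the text, ‘ice-9/popen.scm’ and ‘ice-9/popen’, in that order.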
4964 4965 If a corresponding compiled ‘.go’ file is found in the 4966‘%load-compiled-path’ or in the fallback path, and is as fresh as the 4967source file, it will be loaded instead of the source file. If no 4968compiled file is found, Guile may try to compile the source file and 4969cache away the resulting ‘.go’ file. *Note Compilation::, for more on 4970compilation. 4971 4972 Once a suitable source or compiled file is found, the 4973file will be loaded. If, after loading the file, the module under 4974consideration is still not defined, Guile will signal an error. 4975 4976 For more information on where and how to install Scheme modules, 4977*Note Installing Site Packages::. 4978 4979 4980File: guile.info, Node: R6RS Version References, Next: R6RS Libraries, Prev: Modules and the File System, Up: Modules 4981 49826.18.5 R6RS Version References 4983------------------------------ 4984 4985Guile’s module system includes support for locating modules based on a 4986declared version specifier of the same form as the one described in R6RS 4987(*note R6RS Library Form: (r6rs)Library form.). By using the 4988‘#:version’ keyword in a ‘define-module’ form, a module may specify a 4989version as a list of zero or more exact, nonnegative integers. 4990 4991 This version can then be used to locate the module during the module 4992search process. Client modules and callers of the ‘use-modules’ 4993function may specify constraints on the versions of target modules by 4994providing a “version reference”, which has one of the following forms: 4995 4996 (SUB-VERSION-REFERENCE ...) 4997 (and VERSION-REFERENCE ...) 4998 (or VERSION-REFERENCE ...) 4999 (not VERSION-REFERENCE) 5000 5001 in which SUB-VERSION-REFERENCE is in turn one of: 5002 5003 (SUB-VERSION) 5004 (>= SUB-VERSION) 5005 (<= SUB-VERSION) 5006 (and SUB-VERSION-REFERENCE ...) 5007 (or SUB-VERSION-REFERENCE ...)
5008 (not SUB-VERSION-REFERENCE) 5009 5010 in which SUB-VERSION is an exact, nonnegative integer as above. A 5011version reference matches a declared module version if each element of 5012the version reference matches a corresponding element of the module 5013version, according to the following rules: 5014 5015 • The ‘and’ sub-form matches a version or version element if every 5016 element in the tail of the sub-form matches the specified version 5017 or version element. 5018 5019 • The ‘or’ sub-form matches a version or version element if any 5020 element in the tail of the sub-form matches the specified version 5021 or version element. 5022 5023 • The ‘not’ sub-form matches a version or version element if the tail 5024 of the sub-form does not match the version or version element. 5025 5026 • The ‘>=’ sub-form matches a version element if the element is 5027 greater than or equal to the SUB-VERSION in the tail of the 5028 sub-form. 5029 5030 • The ‘<=’ sub-form matches a version element if the version is less 5031 than or equal to the SUB-VERSION in the tail of the sub-form. 5032 5033 • A SUB-VERSION matches a version element if one is EQV? to the 5034 other. 
5035 5036 For example, a module declared as: 5037 5038 (define-module (mylib mymodule) #:version (1 2 0)) 5039 5040 would be successfully loaded by any of the following ‘use-modules’ 5041expressions: 5042 5043 (use-modules ((mylib mymodule) #:version (1 2 (>= 0)))) 5044 (use-modules ((mylib mymodule) #:version (or (1 2 0) (1 2 1)))) 5045 (use-modules ((mylib mymodule) #:version ((and (>= 1) (not 2)) 2 0))) 5046 5047 5048File: guile.info, Node: R6RS Libraries, Next: Variables, Prev: R6RS Version References, Up: Modules 5049 50506.18.6 R6RS Libraries 5051--------------------- 5052 5053In addition to the API described in the previous sections, you also have 5054the option to create modules using the portable ‘library’ form described 5055in R6RS (*note R6RS Library Form: (r6rs)Library form.), and to import 5056libraries created in this format by other programmers. Guile’s R6RS 5057library implementation takes advantage of the flexibility built into the 5058module system by expanding the R6RS library form into a corresponding 5059Guile ‘define-module’ form that specifies equivalent import and export 5060requirements and includes the same body expressions. The library 5061expression: 5062 5063 (library (mylib (1 2)) 5064 (export mybinding) 5065 (import (otherlib (3)))) 5066 5067 is equivalent to the module definition: 5068 5069 (define-module (mylib) 5070 #:version (1 2) 5071 #:use-module ((otherlib) #:version (3)) 5072 #:export (mybinding)) 5073 5074 Central to the mechanics of R6RS libraries is the concept of import 5075and export “levels”, which control the visibility of bindings at various 5076phases of a library’s lifecycle — macros necessary to expand forms in 5077the library’s body need to be available at expand time; variables used 5078in the body of a procedure exported by the library must be available at 5079runtime. 
R6RS specifies the optional ‘for’ sub-form of an _import set_ 5080specification (see below) as a mechanism by which a library author can 5081indicate that a particular library import should take place at a 5082particular phase with respect to the lifecycle of the importing library. 5083 5084 Guile’s library implementation uses a technique called “implicit 5085phasing” (first described by Abdulaziz Ghuloum and R. Kent Dybvig), 5086which allows the expander and compiler to automatically determine the 5087necessary visibility of a binding imported from another library. As 5088such, the ‘for’ sub-form described below is ignored by Guile (but may be 5089required by Schemes in which phasing is explicit). 5090 5091 -- Scheme Syntax: library name (export export-spec ...) (import 5092 import-spec ...) body ... 5093 Defines a new library with the specified name, exports, and 5094 imports, and evaluates the specified body expressions in this 5095 library’s environment. 5096 5097 The library NAME is a non-empty list of identifiers, optionally 5098 ending with a version specification of the form described above 5099 (*note Creating Guile Modules::). 5100 5101 Each EXPORT-SPEC is the name of a variable defined or imported by 5102 the library, or must take the form ‘(rename (internal-name 5103 external-name) ...)’, where the identifier INTERNAL-NAME names a 5104 variable defined or imported by the library and EXTERNAL-NAME is 5105 the name by which the variable is seen by importing libraries. 5106 5107 Each IMPORT-SPEC must be either an “import set” (see below) or must 5108 be of the form ‘(for import-set import-level ...)’, where each 5109 IMPORT-LEVEL is one of: 5110 5111 run 5112 expand 5113 (meta LEVEL) 5114 5115 where LEVEL is an integer. Note that since Guile does not require 5116 explicit phase specification, any IMPORT-SETs found inside of ‘for’ 5117 sub-forms will be “unwrapped” during expansion and processed as if 5118 they had been specified directly. 
5119 5120 Import sets in turn take one of the following forms: 5121 5122 LIBRARY-REFERENCE 5123 (library LIBRARY-REFERENCE) 5124 (only IMPORT-SET IDENTIFIER ...) 5125 (except IMPORT-SET IDENTIFIER ...) 5126 (prefix IMPORT-SET IDENTIFIER) 5127 (rename IMPORT-SET (INTERNAL-IDENTIFIER EXTERNAL-IDENTIFIER) ...) 5128 5129 where LIBRARY-REFERENCE is a non-empty list of identifiers ending 5130 with an optional version reference (*note R6RS Version 5131 References::), and the other sub-forms have the following 5132 semantics, defined recursively on nested IMPORT-SETs: 5133 5134 • The ‘library’ sub-form is used to specify libraries for import 5135 whose names begin with the identifier “library.” 5136 5137 • The ‘only’ sub-form imports only the specified IDENTIFIERs 5138 from the given IMPORT-SET. 5139 5140 • The ‘except’ sub-form imports all of the bindings exported by 5141 IMPORT-SET except for those that appear in the specified list 5142 of IDENTIFIERs. 5143 5144 • The ‘prefix’ sub-form imports all of the bindings exported by 5145 IMPORT-SET, first prefixing them with the specified 5146 IDENTIFIER. 5147 5148 • The ‘rename’ sub-form imports all of the identifiers exported 5149 by IMPORT-SET. The binding for each INTERNAL-IDENTIFIER among 5150 these identifiers is made visible to the importing library as 5151 the corresponding EXTERNAL-IDENTIFIER; all other bindings are 5152 imported using the names provided by IMPORT-SET. 5153 5154 Note that because Guile translates R6RS libraries into module 5155 definitions, an import specification may be used to declare a 5156 dependency on a native Guile module — although doing so may make 5157 your libraries less portable to other Schemes. 5158 5159 -- Scheme Syntax: import import-spec ... 5160 Import into the current environment the libraries specified by the 5161 given import specifications, where each IMPORT-SPEC takes the same 5162 form as in the ‘library’ form described above. 
5163 5164 5165File: guile.info, Node: Variables, Next: Module System Reflection, Prev: R6RS Libraries, Up: Modules 5166 51676.18.7 Variables 5168---------------- 5169 5170Each module has its own hash table, sometimes known as an “obarray”, 5171that maps the names defined in that module to their corresponding 5172variable objects. 5173 5174 A variable is a box-like object that can hold any Scheme value. It 5175is said to be “undefined” if its box holds a special Scheme value that 5176denotes undefined-ness (which is different from all other Scheme values, 5177including for example ‘#f’); otherwise the variable is “defined”. 5178 5179 On its own, a variable object is anonymous. A variable is said to be 5180“bound” when it is associated with a name in some way, usually a symbol 5181in a module obarray. When this happens, the name is said to be bound to 5182the variable, in that module. 5183 5184 (That’s the theory, anyway. In practice, defined-ness and bound-ness 5185sometimes get confused, because Lisp and Scheme implementations have 5186often conflated — or deliberately drawn no distinction between — a name 5187that is unbound and a name that is bound to a variable whose value is 5188undefined. We will try to be clear about the difference and explain any 5189confusion where it is unavoidable.) 5190 5191 Variables do not have a read syntax. Most commonly they are created 5192and bound implicitly by ‘define’ expressions: a top-level ‘define’ 5193expression of the form 5194 5195 (define NAME VALUE) 5196 5197creates a variable with initial value VALUE and binds it to the name 5198NAME in the current module. But they can also be created dynamically by 5199calling one of the constructor procedures ‘make-variable’ and 5200‘make-undefined-variable’. 5201 5202 -- Scheme Procedure: make-undefined-variable 5203 -- C Function: scm_make_undefined_variable () 5204 Return a variable that is initially unbound. 
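A short sketch of variable objects as boxes, using the constructor and accessor procedures documented in this section. The names `v` and `u` and the intermediate results are introduced here for illustration.

```scheme
;; A variable created with an initial value.
(define v (make-variable 42))
(define v-initial (variable-ref v))      ; 42

;; Replace the value held in the box.
(variable-set! v 'changed)
(define v-updated (variable-ref v))      ; changed

;; A variable created without a value holds nothing yet.
(define u (make-undefined-variable))
(define u-before (variable-bound? u))    ; #f

;; Giving it a value, then removing the value again.
(variable-set! u 1)
(define u-during (variable-bound? u))    ; #t
(variable-unset! u)
(define u-after (variable-bound? u))     ; #f
```

None of these variable objects is associated with a name in a module obarray; `v` and `u` are ordinary top-level names whose values happen to be variable objects.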

 -- Scheme Procedure: make-variable init
 -- C Function: scm_make_variable (init)
     Return a variable initialized to value INIT.

 -- Scheme Procedure: variable-bound? var
 -- C Function: scm_variable_bound_p (var)
     Return ‘#t’ if VAR is bound to a value, or ‘#f’ otherwise.  Throws
     an error if VAR is not a variable object.

 -- Scheme Procedure: variable-ref var
 -- C Function: scm_variable_ref (var)
     Dereference VAR and return its value.  VAR must be a variable
     object; see ‘make-variable’ and ‘make-undefined-variable’.

 -- Scheme Procedure: variable-set! var val
 -- C Function: scm_variable_set_x (var, val)
     Set the value of the variable VAR to VAL.  VAR must be a variable
     object, VAL can be any value.  Return an unspecified value.

 -- Scheme Procedure: variable-unset! var
 -- C Function: scm_variable_unset_x (var)
     Unset the value of the variable VAR, leaving VAR unbound.

 -- Scheme Procedure: variable? obj
 -- C Function: scm_variable_p (obj)
     Return ‘#t’ if OBJ is a variable object, else return ‘#f’.


File: guile.info, Node: Module System Reflection, Next: Declarative Modules, Prev: Variables, Up: Modules

6.18.8 Module System Reflection
-------------------------------

The previous sections have described a declarative view of the module
system.  You can also work with it programmatically by accessing and
modifying various parts of the Scheme objects that Guile uses to
implement the module system.

   At any time, there is a “current module”.  This module is the one
where a top-level ‘define’ and similar syntax will add new bindings.
You can find other module objects with ‘resolve-module’, for example.

   These module objects can be used as the second argument to ‘eval’.
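
   A small sketch of passing module objects to ‘eval’ follows.  The
variable name ‘core’ is an illustrative choice, and the name printed for
the current module depends on where the code is run.

```scheme
;; Look up the module object that holds Guile's core bindings.
(define core (resolve-module '(guile)))

;; Evaluate an expression with that module as the environment.
(display (eval '(+ 1 2) core))            ; prints 3
(newline)

;; The current module can be used the same way.  At a fresh REPL or in
;; a script it is typically (guile-user).
(display (module-name (current-module)))
(newline)
(display (eval '(string-upcase "abc") (current-module)))  ; prints ABC
(newline)
```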

 -- Scheme Procedure: current-module
 -- C Function: scm_current_module ()
     Return the current module object.

 -- Scheme Procedure: set-current-module module
 -- C Function: scm_set_current_module (module)
     Set the current module to MODULE and return the previous current
     module.

 -- Scheme Procedure: save-module-excursion thunk
     Call THUNK within a ‘dynamic-wind’ such that the module that is
     current at invocation time is restored when THUNK’s dynamic extent
     is left (*note Dynamic Wind::).

     More precisely, if THUNK escapes non-locally, the current module
     (at the time of escape) is saved, and the original current module
     (at the time THUNK’s dynamic extent was last entered) is restored.
     If THUNK’s dynamic extent is re-entered, then the current module is
     saved, and the previously saved inner module is set current again.

 -- Scheme Procedure: resolve-module name [autoload=#t] [version=#f]
          [#:ensure=#t]
 -- C Function: scm_resolve_module (name)
     Find the module named NAME and return it.  When it has not already
     been defined and AUTOLOAD is true, try to auto-load it.  When it
     can’t be found that way either, create an empty module if ENSURE is
     true, otherwise return ‘#f’.  If VERSION is true, ensure that the
     resulting module is compatible with the given version reference
     (*note R6RS Version References::).  The name is a list of symbols.

 -- Scheme Procedure: resolve-interface name [#:select=#f] [#:hide='()]
          [#:prefix=#f] [#:renamer=#f] [#:version=#f]
     Find the module named NAME as with ‘resolve-module’ and return its
     interface.  The interface of a module is also a module object, but
     it contains only the exported bindings.

 -- Scheme Procedure: module-uses module
     Return a list of the interfaces used by MODULE.

 -- Scheme Procedure: module-use! module interface
     Add INTERFACE to the front of the use-list of MODULE.  Both
     arguments should be module objects, and INTERFACE should very
     likely be a module returned by ‘resolve-interface’.

 -- Scheme Procedure: reload-module module
     Revisit the source file that corresponds to MODULE.  Raises an
     error if no source file is associated with the given module.

   As mentioned in the previous section, modules contain a mapping
between identifiers (as symbols) and storage locations (as variables).
Guile defines a number of procedures to allow access to this mapping.
If you are programming in C, *note Accessing Modules from C::.

 -- Scheme Procedure: module-variable module name
     Return the variable bound to NAME (a symbol) in MODULE, or ‘#f’ if
     NAME is unbound.

 -- Scheme Procedure: module-add! module name var
     Define a new binding between NAME (a symbol) and VAR (a variable)
     in MODULE.

 -- Scheme Procedure: module-ref module name
     Look up the value bound to NAME in MODULE.  Like ‘module-variable’,
     but also does a ‘variable-ref’ on the resulting variable, raising
     an error if NAME is unbound.

 -- Scheme Procedure: module-define! module name value
     Locally bind NAME to VALUE in MODULE.  If NAME was already locally
     bound in MODULE, i.e., defined locally and not by an imported
     module, the value stored in the existing variable will be updated.
     Otherwise, a new variable will be added to the module, via
     ‘module-add!’.

 -- Scheme Procedure: module-set! module name value
     Update the binding of NAME in MODULE to VALUE, raising an error if
     NAME is not already bound in MODULE.

   There are many other reflective procedures available in the default
environment.  If you find yourself using one of them, please contact the
Guile developers so that we can commit to stability for that interface.
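
   The binding-access procedures above can be sketched as follows.  The
module name ‘(demo scratch)’ and the symbols ‘answer’, ‘greeting’, and
‘missing’ are hypothetical; the sketch assumes no module of that name
already exists.

```scheme
;; Create an empty scratch module.  Passing #f for AUTOLOAD skips the
;; search for a source file; ENSURE defaults to true.
(define scratch (resolve-module '(demo scratch) #f))

(module-define! scratch 'answer 42)        ; adds a new local binding
(define var (module-variable scratch 'answer))

(display (variable? var))                  ; prints #t
(newline)
(display (module-ref scratch 'answer))     ; prints 42
(newline)

(module-set! scratch 'answer 43)           ; updates the existing variable
(display (variable-ref var))               ; prints 43
(newline)

;; `module-add!' installs an existing variable object under a name.
(module-add! scratch 'greeting (make-variable "hello"))
(display (module-ref scratch 'greeting))   ; prints hello
(newline)

;; An unbound name yields #f rather than an error.
(display (module-variable scratch 'missing))  ; prints #f
(newline)
```

   Holding on to the variable object, as ‘var’ does here, gives direct
access to the storage location, so later updates made through
‘module-set!’ are visible through it.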


File: guile.info, Node: Declarative Modules, Next: Accessing Modules from C, Prev: Module System Reflection, Up: Modules

6.18.9 Declarative Modules
--------------------------

The first-class access to modules and module variables described in the
previous subsection is very powerful and allows Guile users to build
many tools to dynamically learn things about their Guile systems.
However, as Scheme godparent Matthias Felleisen wrote in “On the
Expressive Power of Programming Languages”, a more expressive language
is necessarily harder to reason about.  There are transformations that
Guile’s compiler would like to make which can’t be done if every
top-level definition is subject to mutation at any time.

   Consider this module:

     (define-module (boxes)
       #:export (make-box box-ref box-set! box-swap!))

     (define (make-box x) (list x))
     (define (box-ref box) (car box))
     (define (box-set! box x) (set-car! box x))
     (define (box-swap! box x)
       (let ((y (box-ref box)))
         (box-set! box x)
         y))

   Ideally you’d like for the ‘box-ref’ in ‘box-swap!’ to be inlined to
‘car’.  Guile’s compiler can do this, but only if it knows that
‘box-ref’’s definition is what it appears to be in the text.  However,
in the general case a programmer could reach into the ‘(boxes)’ module
at any time and change the value of ‘box-ref’.

   To allow Guile to reason about the values of top-levels from a
module, a module can be marked as “declarative”.  This flag applies only
to the subset of top-level definitions that are themselves declarative:
those that are defined within the compilation unit, and not assigned
(‘set!’) or redefined within the compilation unit.
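
   As a rough illustration of which definitions qualify, consider the
following sketch of a hypothetical ‘(counters)’ module.  The module and
its bindings are invented for this example, and Guile makes no promise
about which declarative definitions it will actually inline.

```scheme
(define-module (counters)
  #:export (next-id! label))

;; Assigned with `set!' below, so this is not a declarative definition.
(define counter 0)

;; Defined once and never assigned or redefined in this compilation
;; unit, so this is a declarative definition.
(define (label n)
  (string-append "id-" (number->string n)))

;; Also declarative; the compiler may choose to inline `label' here.
(define (next-id!)
  (set! counter (+ counter 1))
  (label counter))

(display (next-id!))                       ; prints id-1
(newline)
```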

   To explicitly mark a module as being declarative, pass the
‘#:declarative?’ keyword argument when declaring a module:

     (define-module (boxes)
       #:export (make-box box-ref box-set! box-swap!)
       #:declarative? #t)

   By default, modules are compiled declaratively if the
‘user-modules-declarative?’ parameter is true when the module is
compiled.

 -- Scheme Parameter: user-modules-declarative?
     A boolean indicating whether definitions in modules created by
     ‘define-module’ or implicitly as part of a compilation unit without
     an explicit module can be treated as declarative.

   Because it’s usually what you want, the default value of
‘user-modules-declarative?’ is ‘#t’.

Should I Mark My Module As Declarative?
.......................................

In the vast majority of use cases, declarative modules are what you
want.  However, there are exceptions.

   Consider the ‘(boxes)’ module above.  Let’s say you want to be able
to go in and change the definition of ‘box-set!’ at run-time:

     scheme@(guile-user)> (use-modules (boxes))
     scheme@(guile-user)> ,module boxes
     scheme@(boxes)> (define (box-set! x y) (set-car! x (pk y)))

   However, considering that ‘(boxes)’ is a declarative module, it could
be that ‘box-swap!’ inlined the call to ‘box-set!’ – so it may be that
you are surprised if you call ‘(box-swap! x y)’ and you don’t see the
new definition being used.  (Note, however, that Guile has no guarantees
about what definitions its compiler will or will not inline.)

   If you want to allow the definition of ‘box-set!’ to be changed and
to have all of its uses updated, then probably the best option is to
edit the module and reload the whole thing:

     scheme@(guile-user)> ,reload (boxes)

   The advantage of the reloading approach is that you maintain the
optimizations that declarative modules enable, while also being able to
live-update the code.  If the module keeps precious program state, those
definitions can be marked as ‘define-once’ to prevent reloads from
overwriting them.  *Note Top Level::, for more on ‘define-once’.
Incidentally, ‘define-once’ also prevents declarative-definition
optimizations, so if there’s a limited subset of redefinable bindings,
‘define-once’ could be an interesting tool to mark those definitions as
works-in-progress for interactive program development.

   To users, whether a module is declarative or not is mostly
immaterial: besides normal use via ‘use-modules’, users can reference
and redefine public or private bindings programmatically or
interactively.  The only difference is that changing a declarative
definition may not change all of its uses.  If this use-case is
important to you, and if reloading whole modules is insufficient, then
you can mark all definitions in a module as non-declarative by adding
‘#:declarative? #f’ to the module definition.

   The default of whether modules are declarative or not can be
controlled via the ‘(user-modules-declarative?)’ parameter mentioned
above, but care should be taken to set this parameter when the modules
are compiled, e.g.  via ‘(eval-when (expand) (user-modules-declarative?
#f))’.  *Note Eval When::.

   Alternately you can prevent declarative-definition optimizations by
compiling at the ‘-O1’ optimization level instead of the default ‘-O2’,
or via explicitly passing ‘-Ono-letrectify’ to the ‘guild compile’
invocation.
*Note Compilation::, for more on compiler options.

   One final note.  Currently, definitions from declarative modules can
only be inlined within the module they are defined in, and within a
compilation unit.  This may change in the future to allow Guile to
inline imported declarative definitions as well (cross-module inlining).
To Guile, whether a definition is inlinable or not is a property of the
definition, not its use.  We hope to improve compiler tooling in the
future to allow the user to identify definitions that are out of date
when a declarative binding is redefined.


File: guile.info, Node: Accessing Modules from C, Next: provide and require, Prev: Declarative Modules, Up: Modules

6.18.10 Accessing Modules from C
--------------------------------

The last sections have described how modules are used in Scheme code,
which is the recommended way of creating and accessing modules.  You can
also work with modules from C, but it is more cumbersome.

   The following procedures are available.

 -- C Function: SCM scm_c_call_with_current_module (SCM MODULE, SCM
          (*FUNC)(void *), void *DATA)
     Call FUNC and make MODULE the current module during the call.  The
     argument DATA is passed to FUNC.  The return value of
     ‘scm_c_call_with_current_module’ is the return value of FUNC.

 -- C Function: SCM scm_public_variable (SCM MODULE_NAME, SCM NAME)
 -- C Function: SCM scm_c_public_variable (const char *MODULE_NAME,
          const char *NAME)
     Find the variable bound to the symbol NAME in the public
     interface of the module named MODULE_NAME.

     MODULE_NAME should be a list of symbols, when represented as a
     Scheme object, or a space-separated string, in the ‘const char *’
     case.  See ‘scm_c_define_module’ below, for more examples.

     Signals an error if no module was found with the given name.
     If NAME is not bound in the module, just returns ‘#f’.

 -- C Function: SCM scm_private_variable (SCM MODULE_NAME, SCM NAME)
 -- C Function: SCM scm_c_private_variable (const char *MODULE_NAME,
          const char *NAME)
     Like ‘scm_public_variable’, but looks in the internals of the
     module named MODULE_NAME instead of the public interface.
     Logically, these procedures should only be called on modules you
     write.

 -- C Function: SCM scm_public_lookup (SCM MODULE_NAME, SCM NAME)
 -- C Function: SCM scm_c_public_lookup (const char *MODULE_NAME, const
          char *NAME)
 -- C Function: SCM scm_private_lookup (SCM MODULE_NAME, SCM NAME)
 -- C Function: SCM scm_c_private_lookup (const char *MODULE_NAME, const
          char *NAME)
     Like ‘scm_public_variable’ or ‘scm_private_variable’, but if the
     NAME is not bound in the module, signals an error.  Returns a
     variable, always.

          static SCM eval_string_var;

          /* NOTE: It is important that the call to 'my_init'
             happens-before all calls to 'my_eval_string'.  */
          void my_init (void)
          {
            eval_string_var = scm_c_public_lookup ("ice-9 eval-string",
                                                   "eval-string");
          }

          SCM my_eval_string (SCM str)
          {
            return scm_call_1 (scm_variable_ref (eval_string_var), str);
          }

 -- C Function: SCM scm_public_ref (SCM MODULE_NAME, SCM NAME)
 -- C Function: SCM scm_c_public_ref (const char *MODULE_NAME, const
          char *NAME)
 -- C Function: SCM scm_private_ref (SCM MODULE_NAME, SCM NAME)
 -- C Function: SCM scm_c_private_ref (const char *MODULE_NAME, const
          char *NAME)
     Like ‘scm_public_lookup’ or ‘scm_private_lookup’, but additionally
     dereferences the variable.  If the variable object is unbound,
     signals an error.  Returns the value bound to NAME in MODULE_NAME.

   In addition, there are a number of other lookup-related procedures.
We suggest that you use the ‘scm_public_’ and ‘scm_private_’ family of
procedures instead, if possible.

 -- C Function: SCM scm_c_lookup (const char *NAME)
     Return the variable bound to the symbol indicated by NAME in the
     current module.  If there is no such binding or the symbol is not
     bound to a variable, signal an error.

 -- C Function: SCM scm_lookup (SCM NAME)
     Like ‘scm_c_lookup’, but the symbol is specified directly.

 -- C Function: SCM scm_c_module_lookup (SCM MODULE, const char *NAME)
 -- C Function: SCM scm_module_lookup (SCM MODULE, SCM NAME)
     Like ‘scm_c_lookup’ and ‘scm_lookup’, but the specified module is
     used instead of the current one.

 -- C Function: SCM scm_module_variable (SCM MODULE, SCM NAME)
     Like ‘scm_module_lookup’, but if the binding does not exist, just
     returns ‘#f’ instead of raising an error.

   To define a value, use ‘scm_define’:

 -- C Function: SCM scm_c_define (const char *NAME, SCM VAL)
     Bind the symbol indicated by NAME to a variable in the current
     module and set that variable to VAL.  When NAME is already bound to
     a variable, use that.  Else create a new variable.

 -- C Function: SCM scm_define (SCM NAME, SCM VAL)
     Like ‘scm_c_define’, but the symbol is specified directly.

 -- C Function: SCM scm_c_module_define (SCM MODULE, const char *NAME,
          SCM VAL)
 -- C Function: SCM scm_module_define (SCM MODULE, SCM NAME, SCM VAL)
     Like ‘scm_c_define’ and ‘scm_define’, but the specified module is
     used instead of the current one.

   In some rare cases, you may need to access the variable that
‘scm_module_define’ would have accessed, without changing the binding of
the existing variable, if one is present.
In that case, use
‘scm_module_ensure_local_variable’:

 -- C Function: SCM scm_module_ensure_local_variable (SCM MODULE, SCM
          SYM)
     Like ‘scm_module_define’, but if the SYM is already locally bound
     in that module, the variable’s existing binding is not reset.
     Returns a variable.

 -- C Function: SCM scm_module_reverse_lookup (SCM MODULE, SCM VARIABLE)
     Find the symbol that is bound to VARIABLE in MODULE.  When no such
     binding is found, return ‘#f’.

 -- C Function: SCM scm_c_define_module (const char *NAME, void
          (*INIT)(void *), void *DATA)
     Define a new module named NAME and make it current while INIT is
     called, passing it DATA.  Return the module.

     The parameter NAME is a string with the symbols that make up the
     module name, separated by spaces.  For example, ‘"foo bar"’ names
     the module ‘(foo bar)’.

     When there already exists a module named NAME, it is used
     unchanged, otherwise, an empty module is created.

 -- C Function: SCM scm_c_resolve_module (const char *NAME)
     Find the module named NAME and return it.  When it has not already
     been defined, try to auto-load it.  When it can’t be found that way
     either, create an empty module.  The name is interpreted as for
     ‘scm_c_define_module’.

 -- C Function: SCM scm_c_use_module (const char *NAME)
     Add the module named NAME to the uses list of the current module,
     as with ‘(use-modules NAME)’.  The name is interpreted as for
     ‘scm_c_define_module’.

 -- C Function: void scm_c_export (const char *NAME, ...)
     Add the bindings designated by NAME, ...  to the public interface
     of the current module.  The list of names is terminated by ‘NULL’.


File: guile.info, Node: provide and require, Next: Environments, Prev: Accessing Modules from C, Up: Modules

6.18.11 provide and require
---------------------------

Aubrey Jaffer, mostly to support his portable Scheme library SLIB,
implemented a provide/require mechanism for many Scheme implementations.
Library files in SLIB _provide_ a feature, and when user programs
_require_ that feature, the library file is loaded in.

   For example, the file ‘random.scm’ in the SLIB package contains the
line

     (provide 'random)

   so to use its procedures, a user would type

     (require 'random)

   and they would magically become available, _but still have the same
names!_  So this method is nice, but not as good as a full-featured
module system.

   When SLIB is used with Guile, provide and require can be used to
access its facilities.


File: guile.info, Node: Environments, Prev: provide and require, Up: Modules

6.18.12 Environments
--------------------

Scheme, as defined in R5RS, does _not_ have a full module system.
However it does define the concept of a top-level “environment”.  Such
an environment maps identifiers (symbols) to Scheme objects such as
procedures and lists: *note About Closure::.  In other words, it
implements a set of “bindings”.

   Environments in R5RS can be passed as the second argument to ‘eval’
(*note Fly Evaluation::).  Three procedures are defined to return
environments: ‘scheme-report-environment’, ‘null-environment’ and
‘interaction-environment’ (*note Fly Evaluation::).

   In addition, in Guile any module can be used as an R5RS environment,
i.e., passed as the second argument to ‘eval’.

   Note: the following two procedures are available only when the
‘(ice-9 r5rs)’ module is loaded:

     (use-modules (ice-9 r5rs))

 -- Scheme Procedure: scheme-report-environment version
 -- Scheme Procedure: null-environment version
     VERSION must be the exact integer ‘5’, corresponding to revision 5
     of the Scheme report (the Revised^5 Report on Scheme).
     ‘scheme-report-environment’ returns a specifier for an environment
     that is empty except for all bindings defined in the report that
     are either required or both optional and supported by the
     implementation.  ‘null-environment’ returns a specifier for an
     environment that is empty except for the (syntactic) bindings for
     all syntactic keywords defined in the report that are either
     required or both optional and supported by the implementation.

     Currently Guile does not support values of VERSION for other
     revisions of the report.

     The effect of assigning (through the use of ‘eval’) a variable
     bound in a ‘scheme-report-environment’ (for example ‘car’) is
     unspecified.  Currently the environments specified by
     ‘scheme-report-environment’ are not immutable in Guile.


File: guile.info, Node: Foreign Function Interface, Next: Foreign Objects, Prev: Modules, Up: API Reference

6.19 Foreign Function Interface
===============================

Sometimes you need to use libraries written in C or Rust or some other
non-Scheme language.  More rarely, you might need to write some C to
extend Guile.  This section describes how to load these “foreign
libraries”, look up data and functions inside them, and so on.

* Menu:

* Foreign Libraries::              Dynamically linking to libraries.
* Foreign Extensions::             Extending Guile in C with loadable modules.
* Foreign Pointers::               Pointers to C data or functions.
* Foreign Types::                  Expressing C types in Scheme.
* Foreign Functions::              Simple calls to C procedures.
* Void Pointers and Byte Access::  Pointers into the ether.
* Foreign Structs::                Packing and unpacking structs.
* More Foreign Functions::         Advanced examples.


File: guile.info, Node: Foreign Libraries, Next: Foreign Extensions, Up: Foreign Function Interface

6.19.1 Foreign Libraries
------------------------

Just as Guile can load up Scheme libraries at run-time, Guile can also
load some system libraries written in C or other low-level languages.
We refer to these dynamically-loadable modules as “foreign
libraries”, to distinguish them from native libraries written in Scheme
or other languages implemented by Guile.

   Foreign libraries usually come in two forms.  Some foreign libraries
are part of the operating system, such as the compression library
‘libz’.  These shared libraries are built in such a way that many
programs can use their functionality without duplicating their code.
When a program written in C is built, it can declare that it uses a
specific set of shared libraries.  When the program is run, the
operating system takes care of locating and loading the shared
libraries.

   The operating system components that can dynamically load and link
shared libraries when a program is run are also available
programmatically during a program’s execution.  This is the interface
that’s most useful for Guile, and this is what we mean in Guile when we
refer to “dynamic linking”.  Dynamic linking at run-time is sometimes
called “dlopening”, to distinguish it from the dynamic linking that
happens at program start-up.

   The other kind of foreign library is sometimes known as a module,
plug-in, bundle, or an extension.
These foreign libraries aren’t meant
to be linked to by C programs, but rather only to be dynamically loaded
at run-time – they extend some main program with functionality, but
don’t stand on their own.  Sometimes a Guile library will implement some
of its functionality in a loadable module.

   In either case, the interface on the Guile side is the same.  You
load the interface using ‘load-foreign-library’.  The resulting foreign
library object implements a simple lookup interface whereby the user can
get addresses of data or code exported by the library.  There is no
facility to inspect foreign libraries; you have to know what’s in there
already before you look.

   Routines for loading foreign libraries and accessing their contents
are implemented in the ‘(system foreign-library)’ module.

     (use-modules (system foreign-library))

 -- Scheme Procedure: load-foreign-library [library]
          [#:extensions=system-library-extensions]
          [#:search-ltdl-library-path?=#t] [#:search-path=search-path]
          [#:search-system-paths?=#t] [#:lazy?=#t] [#:global?=#f]
          [#:rename-on-cygwin?=#t]
     Find the shared library denoted by LIBRARY (a string or ‘#f’) and
     link it into the running Guile application.  When everything works
     out, return a Scheme object suitable for representing the linked
     object file.  Otherwise an error is thrown.

     If the LIBRARY argument is omitted, it defaults to ‘#f’.  If
     LIBRARY is false, the resulting foreign library gives access to all
     symbols available for dynamic linking in the main binary.

     It is not necessary to include any extension such as ‘.so’ in
     LIBRARY.  For each system, Guile has a default set of extensions
     that it will try.  On GNU systems, the default extension set is
     just ‘.so’; on Windows, just ‘.dll’; and on Darwin (Mac OS), it is
     ‘.bundle’, ‘.so’, and ‘.dylib’.
     Pass ‘#:extensions EXTENSIONS’ to
     override the default extensions list.  If LIBRARY contains one of
     the extensions, no extensions are tried, so it is possible to
     specify the extension if you know exactly what file to load.

     Unless LIBRARY denotes an absolute file name or otherwise contains
     a directory separator (‘/’, and also ‘\’ on Windows), Guile will
     search for the library in the directories listed in SEARCH-PATH.
     The default search path has three components, which can all be
     overridden by colon-delimited (semicolon on Windows) environment
     variables:

     ‘GUILE_EXTENSIONS_PATH’
          This is the main environment variable for users to add
          directories containing Guile extensions.  The default value
          has no entries.  This environment variable was added in Guile
          3.0.6.
     ‘LTDL_LIBRARY_PATH’
          Before Guile 3.0.6, Guile loaded foreign libraries using
          ‘libltdl’, the dynamic library loader provided by libtool.
          This loader used ‘LTDL_LIBRARY_PATH’, and for backwards
          compatibility we still support that path.

          However, ‘libltdl’ would not only open ‘.so’ (or ‘.dll’ and so
          on) files, but also the ‘.la’ files created by libtool.  In
          installed libraries – libraries that are in the target
          directories of ‘make install’ – ‘.la’ files are never needed,
          to the extent that most GNU/Linux distributions remove them
          entirely.  It is sufficient to just load the ‘.so’ (or ‘.dll’
          and so on) files, which are always located in the same
          directory as the ‘.la’ files.

          But for uninstalled dynamic libraries, like those in a build
          tree, the situation is a bit of a mess.  If you have a project
          that uses libtool to build libraries – which is the case for
          Guile, and for most projects using autotools – and you build
          ‘foo.so’ in directory ‘D’, libtool will put ‘foo.la’ in ‘D’,
          but ‘foo.so’ gets put into ‘D/.libs’.

          Users were mostly oblivious to this situation, as ‘libltdl’
          had special logic to be able to read the ‘.la’ file to know
          where to find the ‘.so’, even from an uninstalled build tree,
          preventing the existence of ‘.libs’ from leaking out to the
          user.

          We don’t use libltdl now, essentially for flexibility and
          error-reporting reasons.  But, to keep this old use-case
          working, if SEARCH-LTDL-LIBRARY-PATH? is true, we add each
          entry of ‘LTDL_LIBRARY_PATH’ to the default extensions load
          path, additionally adding the ‘.libs’ subdirectories for each
          entry, in case there are ‘.so’ files there instead of
          alongside the ‘.la’ files.
     ‘GUILE_SYSTEM_EXTENSIONS_PATH’
          The last path in Guile’s search path belongs to Guile itself,
          and defaults to the libdir and the extensiondir, in that
          order.  For example, if you install to ‘/opt/guile’, these
          would probably be ‘/opt/guile/lib’ and
          ‘/opt/guile/lib/guile/3.0/extensions’, respectively.  *Note
          Parallel Installations::, for more details on ‘extensiondir’.

     Finally, if no library is found in the search path, and if LIBRARY
     is not absolute and does not include directory separators, and if
     SEARCH-SYSTEM-PATHS? is true, the operating system may have its own
     logic for where to locate LIBRARY.  For example, on GNU, there will
     be a default set of paths (often ‘/usr/lib’ and ‘/lib’, though it
     depends on the system), and the ‘LD_LIBRARY_PATH’ environment
     variable can add additional paths.  Other operating systems have
     other conventions.

     Falling back to the operating system for search is usually not a
     great thing; it is a recipe for making programs that work on one
     machine but not on others.  Still, when wrapping system libraries,
     it can be the only way to get things working at all.

     If LAZY?
     is true (the default), Guile will request the operating
     system to resolve symbols used by the loaded library as they are
     first used.  If GLOBAL? is true, symbols defined by the loaded
     library will be available when other modules need to resolve
     symbols; the default is ‘#f’, which keeps symbols local.

     If RENAME-ON-CYGWIN? is true (the default) – on Cygwin hosts only –
     the search behavior is modified such that a filename that starts
     with “lib” will be searched for under the name “cyg”, as is
     customary for Cygwin.

   The environment variables mentioned above are parsed when the
foreign-library module is first loaded and bound to parameters.  Null
path components, for example the three components of
‘GUILE_SYSTEM_EXTENSIONS_PATH="::"’, are ignored.

 -- Scheme Parameter: guile-extensions-path
 -- Scheme Parameter: ltdl-library-path
 -- Scheme Parameter: guile-system-extensions-path
     Parameters whose initial values are taken from
     ‘GUILE_EXTENSIONS_PATH’, ‘LTDL_LIBRARY_PATH’, and
     ‘GUILE_SYSTEM_EXTENSIONS_PATH’, respectively.  *Note Parameters::.
     The current values of these parameters are used when building the
     search path when ‘load-foreign-library’ is called, unless the
     caller explicitly passes a ‘#:search-path’ argument.

 -- Scheme Procedure: foreign-library? obj
     Return ‘#t’ if OBJ is a foreign library, or ‘#f’ otherwise.


File: guile.info, Node: Foreign Extensions, Next: Foreign Pointers, Prev: Foreign Libraries, Up: Foreign Function Interface

6.19.2 Foreign Extensions
-------------------------

One way to use shared libraries is to extend Guile.  Such loadable
modules generally define one distinguished initialization function that,
when called, will use the ‘libguile’ API to define procedures in the
current module.

   Concretely, you might extend Guile with an implementation of the
Bessel function, ‘j0’:

     #include <math.h>
     #include <libguile.h>

     SCM
     j0_wrapper (SCM x)
     {
       return scm_from_double (j0 (scm_to_double (x)));
     }

     void
     init_math_bessel (void)
     {
       scm_c_define_gsubr ("j0", 1, 0, 0, j0_wrapper);
     }

   The C source file would then need to be compiled into a shared
library. On GNU/Linux, the compiler invocation might look like this:

     gcc -shared -o bessel.so -fPIC bessel.c

   A good default place to put shared libraries that extend Guile is
into the extensions dir. From the command line or a build script,
invoke ‘pkg-config --variable=extensiondir guile-3.0’ to print the
extensions dir. *Note Parallel Installations::, for more details.

   Guile can load up ‘bessel.so’ via ‘load-extension’.

 -- Scheme Procedure: load-extension lib init
 -- C Function: scm_load_extension (lib, init)
     Load and initialize the extension designated by LIB and INIT.

   The normal way for an extension to be used is to write a small Scheme
file that defines a module, and to load the extension into this module.
When the module is auto-loaded, the extension is loaded as well. For
example:

     (define-module (math bessel)
       #:export (j0))

     (load-extension "bessel" "init_math_bessel")

   This ‘load-extension’ invocation loads the ‘bessel’ library via
‘(load-foreign-library "bessel")’, then looks up the ‘init_math_bessel’
symbol in the library, treating it as a function of no arguments, and
calls that function.

   If you decide to put your extension outside the default search path
for ‘load-foreign-library’, probably you should adapt the Scheme module
to specify its absolute path.
For example, if you use ‘automake’ to
build your extension and place it in ‘$(pkglibdir)’, you might define a
build-parameters module that gets created by the build system:

     (define-module (math config)
       #:export (extensiondir))
     (define extensiondir "PKGLIBDIR")

   This file would be ‘config.scm.in’. You would define a ‘make’ rule
to substitute in the absolute installed file name:

     config.scm: config.scm.in
             sed 's|PKGLIBDIR|$(pkglibdir)|' <$< >$@

   Then your ‘(math bessel)’ would import ‘(math config)’, then
‘(load-extension (in-vicinity extensiondir "bessel")
"init_math_bessel")’.

   An alternate approach would be to rebind the ‘guile-extensions-path’
parameter, or its corresponding environment variable, but note that
changing those parameters applies to other users of
‘load-foreign-library’ as well.

   Note that the new primitives that the extension adds to Guile with
‘scm_c_define_gsubr’ (*note Primitive Procedures::) or with any of the
other mechanisms are placed into the module that is current when the
‘scm_c_define_gsubr’ is executed, so to be clear about what goes where
it’s best to include the ‘load-extension’ in a module, as above.
Alternately, the C code can use ‘scm_c_define_module’ to specify which
module is being created:

     static void
     do_init (void *unused)
     {
       scm_c_define_gsubr ("j0", 1, 0, 0, j0_wrapper);
       scm_c_export ("j0", NULL);
     }

     void
     init_math_bessel (void)
     {
       scm_c_define_module ("math bessel", do_init, NULL);
     }

   And yet... if what we want is just the ‘j0’ function, it seems like
a lot of ceremony to have to compile a Guile-specific wrapper library
complete with an initialization function and wrapper module to allow
Guile users to call it. There is another way, but to get there, we have
to talk about function pointers and function types first.
*Note Foreign
Functions::, to skip to the good parts.


File: guile.info, Node: Foreign Pointers, Next: Foreign Types, Prev: Foreign Extensions, Up: Foreign Function Interface

6.19.3 Foreign Pointers
-----------------------

Foreign libraries are essentially key-value mappings, where the keys are
names of definitions and the values are the addresses of those
definitions. To look up the address of a definition, use
‘foreign-library-pointer’ from the ‘(system foreign-library)’ module.

 -- Scheme Procedure: foreign-library-pointer lib name
     Return a “wrapped pointer” for the symbol NAME in the shared object
     referred to by LIB. The returned pointer points to a C object.

     As a convenience, if LIB is not a foreign library, it will be
     passed to ‘load-foreign-library’.

   If we continue with the ‘bessel.so’ example from before, we can get
the address of the ‘init_math_bessel’ function via:

     (use-modules (system foreign-library))
     (define init (foreign-library-pointer "bessel" "init_math_bessel"))
     init
     ⇒ #<pointer 0x7fb35b1b4688>

   A value returned by ‘foreign-library-pointer’ is a Scheme wrapper for
a C pointer. Pointers are a data type in Guile that is disjoint from
all other types. The next section discusses ways to dereference
pointers, but before then we describe the usual type predicates and so
on.

   Note that the rest of the interfaces in this section are part of the
‘(system foreign)’ library:

     (use-modules (system foreign))

 -- Scheme Procedure: pointer-address pointer
 -- C Function: scm_pointer_address (pointer)
     Return the numerical value of POINTER.

          (pointer-address init)
          ⇒ 139984413364296 ; YMMV

 -- Scheme Procedure: make-pointer address [finalizer]
     Return a foreign pointer object pointing to ADDRESS.
If FINALIZER
     is passed, it should be a pointer to a one-argument C function that
     will be called when the pointer object becomes unreachable.

 -- Scheme Procedure: pointer? obj
     Return ‘#t’ if OBJ is a pointer object, or ‘#f’ otherwise.

 -- Scheme Variable: %null-pointer
     A foreign pointer whose value is 0.

 -- Scheme Procedure: null-pointer? pointer
     Return ‘#t’ if POINTER is the null pointer, ‘#f’ otherwise.

   For the purpose of passing SCM values directly to foreign functions,
and allowing them to return SCM values, Guile also supports some unsafe
casting operators.

 -- Scheme Procedure: scm->pointer scm
     Return a foreign pointer object with the ‘object-address’ of SCM.

 -- Scheme Procedure: pointer->scm pointer
     Unsafely cast POINTER to a Scheme object. Cross your fingers!

   Sometimes you want to give C extensions access to the dynamic FFI. At
that point, the names get confusing, because “pointer” can refer to a
‘SCM’ object that wraps a pointer, or to a ‘void *’ value. We will try
to use “pointer object” to refer to Scheme objects, and “pointer value”
to refer to ‘void *’ values.

 -- C Function: SCM scm_from_pointer (void *ptr, void (*finalizer)
          (void*))
     Create a pointer object from a pointer value.

     If FINALIZER is non-null, Guile arranges to call it on the pointer
     value at some point after the pointer object becomes collectable.

 -- C Function: void* scm_to_pointer (SCM obj)
     Unpack the pointer value from a pointer object.


File: guile.info, Node: Foreign Types, Next: Foreign Functions, Prev: Foreign Pointers, Up: Foreign Function Interface

6.19.4 Foreign Types
--------------------

From Scheme’s perspective, foreign pointers are shards of chaos. The
user can create a foreign pointer for any address, and do with it what
they will.
The only thing that lends a sense of order to the whole is a
shared hallucination that certain storage locations have certain types.
When making Scheme wrappers for foreign interfaces, we hide the madness
by explicitly representing the data types of parameters and fields.

   These “foreign type values” may be constructed using the constants
and procedures from the ‘(system foreign)’ module, which may be loaded
like this:

     (use-modules (system foreign))

   ‘(system foreign)’ exports a number of values expressing the basic C
types.

 -- Scheme Variable: int8
 -- Scheme Variable: uint8
 -- Scheme Variable: uint16
 -- Scheme Variable: int16
 -- Scheme Variable: uint32
 -- Scheme Variable: int32
 -- Scheme Variable: uint64
 -- Scheme Variable: int64
 -- Scheme Variable: float
 -- Scheme Variable: double
     These values represent the C numeric types of the specified sizes
     and signednesses.

   In addition there are some convenience bindings for indicating types
of platform-dependent size.

 -- Scheme Variable: int
 -- Scheme Variable: unsigned-int
 -- Scheme Variable: long
 -- Scheme Variable: unsigned-long
 -- Scheme Variable: short
 -- Scheme Variable: unsigned-short
 -- Scheme Variable: size_t
 -- Scheme Variable: ssize_t
 -- Scheme Variable: ptrdiff_t
 -- Scheme Variable: intptr_t
 -- Scheme Variable: uintptr_t
     Values exported by the ‘(system foreign)’ module, representing C
     numeric types. For example, ‘long’ may be ‘equal?’ to ‘int64’ on a
     64-bit platform.

 -- Scheme Variable: void
     The ‘void’ type. It can be used as the first argument to
     ‘pointer->procedure’ to wrap a C function that returns nothing.

   In addition, the symbol ‘*’ is used by convention to denote pointer
types.
Procedures detailed in the following sections, such as
‘pointer->procedure’, accept it as a type descriptor.


File: guile.info, Node: Foreign Functions, Next: Void Pointers and Byte Access, Prev: Foreign Types, Up: Foreign Function Interface

6.19.5 Foreign Functions
------------------------

The most natural thing to do with a dynamic library is to grovel around
in it for a function pointer: a “foreign function”. Load the ‘(system
foreign)’ module to use these Scheme interfaces.

     (use-modules (system foreign))

 -- Scheme Procedure: pointer->procedure return_type func_ptr arg_types
          [#:return-errno?=#f]
 -- C Function: scm_pointer_to_procedure (return_type, func_ptr,
          arg_types)
 -- C Function: scm_pointer_to_procedure_with_errno (return_type,
          func_ptr, arg_types)

     Make a foreign function.

     Given the foreign void pointer FUNC_PTR, its argument and return
     types ARG_TYPES and RETURN_TYPE, return a procedure that will pass
     arguments to the foreign function and return appropriate values.

     ARG_TYPES should be a list of foreign types. RETURN_TYPE should
     be a foreign type. *Note Foreign Types::, for more information on
     foreign types.

     If RETURN-ERRNO? is true, or when calling
     ‘scm_pointer_to_procedure_with_errno’, the returned procedure will
     return two values, with ‘errno’ as the second value.

   Finally, in ‘(system foreign-library)’ there is a convenient wrapper
function, joining together ‘foreign-library-pointer’ and
‘pointer->procedure’:

 -- Scheme Procedure: foreign-library-function lib name
          [#:return-type=void] [#:arg-types='()] [#:return-errno?=#f]
     Load the address of NAME from LIB, and treat it as a function
     taking arguments ARG-TYPES and returning RETURN-TYPE, optionally
     also with errno.
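
     As a rough illustration of the keyword arguments, the sketch below
     wraps two C library functions. It assumes a POSIX system where
     ‘strlen’ and ‘close’ are visible in the global symbol namespace
     (hence ‘#f’ as LIB); the names ‘c-strlen’ and ‘c-close’ are made up
     for this example.

```scheme
(use-modules (system foreign)
             (system foreign-library))

;; Hypothetical wrapper name; strlen takes one pointer, returns size_t.
(define c-strlen
  (foreign-library-function #f "strlen"
                            #:return-type size_t
                            #:arg-types (list '*)))

;; string->pointer yields a nul-terminated copy that strlen can walk.
(define hello-length
  (c-strlen (string->pointer "hello")))

;; Hypothetical wrapper name; with #:return-errno? #t the procedure
;; returns two values: the C return value and errno.
(define c-close
  (foreign-library-function #f "close"
                            #:return-type int
                            #:arg-types (list int)
                            #:return-errno? #t))

;; Closing an invalid descriptor fails, so errno is worth inspecting.
(define close-result
  (call-with-values (lambda () (c-close -1))
    (lambda (ret err) (list ret err))))
```

     Here ‘hello-length’ should be 5, and ‘close-result’ should hold -1
     together with the ‘errno’ code for a bad file descriptor.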

     An invocation of ‘foreign-library-function’ is entirely equivalent
     to:
          (pointer->procedure RETURN-TYPE
                              (foreign-library-pointer LIB NAME)
                              ARG-TYPES
                              #:return-errno? RETURN-ERRNO?).

   Pulling all this together, here is a better definition of ‘(math
bessel)’:

     (define-module (math bessel)
       #:use-module (system foreign)
       #:use-module (system foreign-library)
       #:export (j0))

     (define j0
       (foreign-library-function "libm" "j0"
                                 #:return-type double
                                 #:arg-types (list double)))

   That’s it! No C at all.

   Before going on to more detailed examples, the next two sections
discuss how to deal with data that is more complex than, say, ‘int8’.
*Note More Foreign Functions::, to continue with foreign function
examples.


File: guile.info, Node: Void Pointers and Byte Access, Next: Foreign Structs, Prev: Foreign Functions, Up: Foreign Function Interface

6.19.6 Void Pointers and Byte Access
------------------------------------

Wrapped pointers are untyped, so they are essentially equivalent to C
‘void’ pointers. As in C, the memory region pointed to by a pointer can
be accessed at the byte level. This is achieved using _bytevectors_
(*note Bytevectors::). The ‘(rnrs bytevectors)’ module contains
procedures that can be used to convert byte sequences to Scheme objects
such as strings, floating point numbers, or integers.

   Load the ‘(system foreign)’ module to use these Scheme interfaces.

     (use-modules (system foreign))

 -- Scheme Procedure: pointer->bytevector pointer len [offset
          [uvec_type]]
 -- C Function: scm_pointer_to_bytevector (pointer, len, offset,
          uvec_type)
     Return a bytevector aliasing the LEN bytes pointed to by POINTER.

     The user may specify an alternate default interpretation for the
     memory by passing the UVEC_TYPE argument, to indicate that the
     memory is an array of elements of that type. UVEC_TYPE should be
     something that ‘array-type’ would return, like ‘f32’ or ‘s16’.

     When OFFSET is passed, it specifies the offset in bytes relative to
     POINTER of the memory region aliased by the returned bytevector.

     Mutating the returned bytevector mutates the memory pointed to by
     POINTER, so buckle your seatbelts.

 -- Scheme Procedure: bytevector->pointer bv [offset]
 -- C Function: scm_bytevector_to_pointer (bv, offset)
     Return a pointer aliasing the memory pointed to by BV or
     OFFSET bytes after BV when OFFSET is passed.

   In addition to these primitives, convenience procedures are
available:

 -- Scheme Procedure: dereference-pointer pointer
     Assuming POINTER points to a memory region that holds a pointer,
     return this pointer.

 -- Scheme Procedure: string->pointer string [encoding]
     Return a foreign pointer to a nul-terminated copy of STRING in the
     given ENCODING, defaulting to the current locale encoding. The C
     string is freed when the returned foreign pointer becomes
     unreachable.

     This is the Scheme equivalent of ‘scm_to_stringn’.

 -- Scheme Procedure: pointer->string pointer [length] [encoding]
     Return the string representing the C string pointed to by POINTER.
     If LENGTH is omitted or ‘-1’, the string is assumed to be
     nul-terminated. Otherwise LENGTH is the number of bytes in memory
     pointed to by POINTER. The C string is assumed to be in the given
     ENCODING, defaulting to the current locale encoding.

     This is the Scheme equivalent of ‘scm_from_stringn’.

   Most object-oriented C libraries use pointers to specific data
structures to identify objects.
It is useful in such cases to reify the
different pointer types as disjoint Scheme types. The
‘define-wrapped-pointer-type’ macro simplifies this.

 -- Scheme Syntax: define-wrapped-pointer-type type-name pred wrap
          unwrap print
     Define helper procedures to wrap pointer objects into Scheme
     objects with a disjoint type. Specifically, this macro defines:

        • PRED, a predicate for the new Scheme type;
        • WRAP, a procedure that takes a pointer object and returns an
          object that satisfies PRED;
        • UNWRAP, which does the reverse.

     WRAP preserves pointer identity: for two pointer objects P1 and P2
     that are ‘equal?’, ‘(eq? (WRAP P1) (WRAP P2)) ⇒ #t’.

     Finally, PRINT should name a user-defined procedure to print such
     objects. The procedure is passed the wrapped object and a port to
     write to.

     For example, assume we are wrapping a C library that defines a
     type, ‘bottle_t’, and functions that can be passed ‘bottle_t *’
     pointers to manipulate them. We could write:

          (define-wrapped-pointer-type bottle
            bottle?
            wrap-bottle unwrap-bottle
            (lambda (b p)
              (format p "#<bottle of ~a ~x>"
                      (bottle-contents b)
                      (pointer-address (unwrap-bottle b)))))

          (define grab-bottle
            ;; Wrapper for `bottle_t *grab (void)'.
            (let ((grab (foreign-library-function libbottle "grab_bottle"
                                                  #:return-type '*)))
              (lambda ()
                "Return a new bottle."
                (wrap-bottle (grab)))))

          (define bottle-contents
            ;; Wrapper for `const char *bottle_contents (bottle_t *)'.
            (let ((contents (foreign-library-function libbottle "bottle_contents"
                                                      #:return-type '*
                                                      #:arg-types '(*))))
              (lambda (b)
                "Return the contents of B."
                (pointer->string (contents (unwrap-bottle b))))))

          (write (grab-bottle))
          ⇒ #<bottle of Château Haut-Brion 803d36>

     In this example, ‘grab-bottle’ is guaranteed to return a genuine
     ‘bottle’ object satisfying ‘bottle?’. Likewise, ‘bottle-contents’
     errors out when its argument is not a genuine ‘bottle’ object.

   As another example, currently Guile has a variable, ‘scm_numptob’, as
part of its API. It is declared as a C ‘long’. So, to read its value,
we can do:

     (use-modules (system foreign))
     (use-modules (rnrs bytevectors))
     (define numptob
       (foreign-library-pointer #f "scm_numptob"))
     numptob
     (bytevector-uint-ref (pointer->bytevector numptob (sizeof long))
                          0 (native-endianness)
                          (sizeof long))
     ⇒ 8

   If we wanted to corrupt Guile’s internal state, we could set
‘scm_numptob’ to another value; but we shouldn’t, because that variable
is not meant to be set. Indeed this point applies more widely: the C
API is a dangerous place to be. Not only might setting a value crash
your program, simply accessing the data pointed to by a dangling pointer
or similar can prove equally disastrous.


File: guile.info, Node: Foreign Structs, Next: More Foreign Functions, Prev: Void Pointers and Byte Access, Up: Foreign Function Interface

6.19.7 Foreign Structs
----------------------

Finally, one last note on foreign values before moving on to actually
calling foreign functions. Sometimes you need to deal with C structs,
which requires interpreting each element of the struct according to
its type, offset, and alignment. The ‘(system foreign)’ module has some
primitives to support this.

     (use-modules (system foreign))

 -- Scheme Procedure: sizeof type
 -- C Function: scm_sizeof (type)
     Return the size of TYPE, in bytes.

     TYPE should be a valid C type, like ‘int’.
Alternately TYPE may be
     the symbol ‘*’, in which case the size of a pointer is returned.
     TYPE may also be a list of types, in which case the size of a
     ‘struct’ with ABI-conventional packing is returned.

 -- Scheme Procedure: alignof type
 -- C Function: scm_alignof (type)
     Return the alignment of TYPE, in bytes.

     TYPE should be a valid C type, like ‘int’. Alternately TYPE may be
     the symbol ‘*’, in which case the alignment of a pointer is
     returned. TYPE may also be a list of types, in which case the
     alignment of a ‘struct’ with ABI-conventional packing is returned.

   Guile also provides some convenience methods to pack and unpack
foreign pointers wrapping C structs.

 -- Scheme Procedure: make-c-struct types vals
     Create a foreign pointer to a C struct containing VALS with types
     TYPES.

     VALS and TYPES should be lists of the same length.

 -- Scheme Procedure: parse-c-struct foreign types
     Parse a foreign pointer to a C struct, returning a list of values.

     TYPES should be a list of C types.

   For example, to create and parse the equivalent of a ‘struct {
int64_t a; uint8_t b; }’:

     (parse-c-struct (make-c-struct (list int64 uint8)
                                    (list 300 43))
                     (list int64 uint8))
     ⇒ (300 43)

   As yet, Guile only has convenience routines to support
conventionally-packed structs. But given the ‘bytevector->pointer’ and
‘pointer->bytevector’ routines, one can create and parse tightly packed
structs and unions by hand. See the code for ‘(system foreign)’ for
details.


File: guile.info, Node: More Foreign Functions, Prev: Foreign Structs, Up: Foreign Function Interface

6.19.8 More Foreign Functions
-----------------------------

It is possible to pass pointers to foreign functions, and to return them
as well.
In that case the type of the argument or return value should
be the symbol ‘*’, indicating a pointer. For example, the following
code makes ‘memcpy’ available to Scheme:

     (use-modules (system foreign))
     (define memcpy
       (foreign-library-function #f "memcpy"
                                 #:return-type '*
                                 #:arg-types (list '* '* size_t)))

   To invoke ‘memcpy’, one must pass it foreign pointers:

     (use-modules (rnrs bytevectors))

     (define src-bits
       (u8-list->bytevector '(0 1 2 3 4 5 6 7)))
     (define src
       (bytevector->pointer src-bits))
     (define dest
       (bytevector->pointer (make-bytevector 16 0)))

     (memcpy dest src (bytevector-length src-bits))

     (bytevector->u8-list (pointer->bytevector dest 16))
     ⇒ (0 1 2 3 4 5 6 7 0 0 0 0 0 0 0 0)

   One may also pass structs as values, passing structs as foreign
pointers. *Note Foreign Structs::, for more information on how to
express struct types and struct values.

   “Out” arguments are passed as foreign pointers. The memory pointed
to by the foreign pointer is mutated in place.

     ;; struct timeval {
     ;;      time_t      tv_sec;     /* seconds */
     ;;      suseconds_t tv_usec;    /* microseconds */
     ;; };
     ;; assuming fields are of type "long"

     (define gettimeofday
       (let ((f (foreign-library-function #f "gettimeofday"
                                          #:return-type int
                                          #:arg-types (list '* '*)))
             (tv-type (list long long)))
         (lambda ()
           (let* ((timeval (make-c-struct tv-type (list 0 0)))
                  (ret (f timeval %null-pointer)))
             (if (zero? ret)
                 (apply values (parse-c-struct timeval tv-type))
                 (error "gettimeofday returned an error" ret))))))

     (gettimeofday)
     ⇒ 1270587589
     ⇒ 499553

   As you can see, this interface to foreign functions is at a very low,
somewhat dangerous level(1).

   The FFI can also work in the opposite direction: making Scheme
procedures callable from C.
This makes it possible to use Scheme
procedures as “callbacks” expected by C functions.

 -- Scheme Procedure: procedure->pointer return-type proc arg-types
 -- C Function: scm_procedure_to_pointer (return_type, proc, arg_types)
     Return a pointer to a C function of type RETURN-TYPE taking
     arguments of types ARG-TYPES (a list) and behaving as a proxy to
     procedure PROC. Thus PROC’s arity, supported argument types, and
     return type should match RETURN-TYPE and ARG-TYPES.

   As an example, here’s how the C library’s ‘qsort’ array sorting
function can be made accessible to Scheme (*note ‘qsort’: (libc)Array
Sort Function.):

     (define qsort!
       (let ((qsort (foreign-library-function
                     #f "qsort" #:arg-types (list '* size_t size_t '*))))
         (lambda (bv compare)
           ;; Sort bytevector BV in-place according to comparison
           ;; procedure COMPARE.
           (let ((ptr (procedure->pointer int
                                          (lambda (x y)
                                            ;; X and Y are pointers so,
                                            ;; for convenience, dereference
                                            ;; them before calling COMPARE.
                                            (compare (dereference-uint8* x)
                                                     (dereference-uint8* y)))
                                          (list '* '*))))
             (qsort (bytevector->pointer bv)
                    (bytevector-length bv) 1 ;; we're sorting bytes
                    ptr)))))

     (define (dereference-uint8* ptr)
       ;; Helper function: dereference the byte pointed to by PTR.
       (let ((b (pointer->bytevector ptr 1)))
         (bytevector-u8-ref b 0)))

     (define bv
       ;; An unsorted array of bytes.
       (u8-list->bytevector '(7 1 127 3 5 4 77 2 9 0)))

     ;; Sort BV.
     (qsort! bv (lambda (x y) (- x y)))

     ;; Let's see what the sorted array looks like:
     (bytevector->u8-list bv)
     ⇒ (0 1 2 3 4 5 7 9 77 127)

   And voilà!

   Note that ‘procedure->pointer’ is not supported (and not defined) on
a few exotic architectures. Thus, user code may need to check
‘(defined? 'procedure->pointer)’.
Nevertheless, it is available on many
architectures, including (as of libffi 3.0.9) x86, ia64, SPARC, PowerPC,
ARM, and MIPS, to name a few.

   ---------- Footnotes ----------

   (1) A contribution to Guile in the form of a high-level FFI would be
most welcome.


File: guile.info, Node: Foreign Objects, Next: Smobs, Prev: Foreign Function Interface, Up: API Reference

6.20 Foreign Objects
====================

This chapter contains reference information related to defining and
working with foreign objects. *Note Defining New Foreign Object
Types::, for a tutorial-like introduction to foreign objects.

 -- C Type: scm_t_struct_finalize
     This function type returns ‘void’ and takes one ‘SCM’ argument.

 -- C Function: SCM scm_make_foreign_object_type (SCM name, SCM slots,
          scm_t_struct_finalize finalizer)
     Create a fresh foreign object type. NAME is a symbol naming the
     type. SLOTS is a list of symbols, each one naming a field in the
     foreign object type. FINALIZER indicates the finalizer, and may be
     ‘NULL’.

   We recommend that finalizers be avoided if possible. *Note Foreign
Object Memory Management::. Finalizers must be async-safe and
thread-safe. Again, *note Foreign Object Memory Management::. If you
are embedding Guile in an application that is not thread-safe, and you
define foreign object types that need finalization, you might want to
disable automatic finalization, and arrange to call
‘scm_run_finalizers ()’ yourself.

 -- C Function: int scm_set_automatic_finalization_enabled (int
          enabled_p)
     Enable or disable automatic finalization. By default, Guile
     arranges to invoke object finalizers automatically, in a separate
     thread if possible. Passing a zero value for ENABLED_P will
     disable automatic finalization for Guile as a whole.
If you
     disable automatic finalization, you will have to call
     ‘scm_run_finalizers ()’ periodically.

     Unlike most other Guile functions, you can call
     ‘scm_set_automatic_finalization_enabled’ before Guile has been
     initialized.

     Return the previous status of automatic finalization.

 -- C Function: int scm_run_finalizers (void)
     Invoke any pending finalizers. Returns the number of finalizers
     that were invoked. This function should be called when automatic
     finalization is disabled, though it may be called if it is enabled
     as well.

 -- C Function: void scm_assert_foreign_object_type (SCM type, SCM val)
     When VAL is a foreign object of the given TYPE, do nothing.
     Otherwise, signal an error.

 -- C Function: SCM scm_make_foreign_object_0 (SCM type)
 -- C Function: SCM scm_make_foreign_object_1 (SCM type, void *val0)
 -- C Function: SCM scm_make_foreign_object_2 (SCM type, void *val0,
          void *val1)
 -- C Function: SCM scm_make_foreign_object_3 (SCM type, void *val0,
          void *val1, void *val2)
 -- C Function: SCM scm_make_foreign_object_n (SCM type, size_t n, void
          *vals[])
     Make a new foreign object of type TYPE and initialize
     the first N fields to the given values, as appropriate.

     The number of fields for objects of a given type is fixed when the
     type is created. It is an error to give more initializers than
     there are fields in the value. It is perfectly fine to give fewer
     initializers than needed; this is convenient when some fields are
     of non-pointer types, and would be easier to initialize with the
     setters described below.

 -- C Function: void* scm_foreign_object_ref (SCM obj, size_t n);
 -- C Function: scm_t_bits scm_foreign_object_unsigned_ref (SCM obj,
          size_t n);
 -- C Function: scm_t_signed_bits scm_foreign_object_signed_ref (SCM
          obj, size_t n);
     Return the value of the Nth field of the foreign object OBJ. The
     backing store for the fields is as wide as a ‘scm_t_bits’ value,
     which is at least as wide as a pointer. The different variants
     handle casting in a portable way.

 -- C Function: void scm_foreign_object_set_x (SCM obj, size_t n, void
          *val);
 -- C Function: void scm_foreign_object_unsigned_set_x (SCM obj, size_t
          n, scm_t_bits val);
 -- C Function: void scm_foreign_object_signed_set_x (SCM obj, size_t n,
          scm_t_signed_bits val);
     Set the value of the Nth field of the foreign object OBJ to VAL,
     after portably converting to a ‘scm_t_bits’ value, if needed.

   One can also access foreign objects from Scheme. *Note Foreign
Objects and Scheme::, for some examples.

     (use-modules (system foreign-object))

 -- Scheme Procedure: make-foreign-object-type name slots
          [#:finalizer=#f]
     Make a new foreign object type. See the above documentation for
     ‘scm_make_foreign_object_type’; these functions are exactly
     equivalent, except for the way in which the finalizer gets attached
     to instances (an internal detail).

     The resulting value is a GOOPS class. *Note GOOPS::, for more on
     classes in Guile.

 -- Scheme Syntax: define-foreign-object-type name constructor (slot
          ...) [#:finalizer=#f]
     A convenience macro to define a type, using
     ‘make-foreign-object-type’, and bind it to NAME. A constructor
     will be bound to CONSTRUCTOR, and getters will be bound to each of
     SLOT....
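
   As a small sketch of the Scheme-side macro described above, the
following defines a two-field type. The names ‘<point>’, ‘make-point’,
‘point-x’ and ‘point-y’ are invented for this illustration, and the
sketch assumes that the constructor takes one positional argument per
slot and that field values are small non-negative integers, which fit in
the word-sized backing store.

```scheme
(use-modules (system foreign-object))

;; Hypothetical type: <point> is bound to the new type, make-point to
;; its constructor, and point-x / point-y to the field getters.
(define-foreign-object-type <point>
  make-point
  (point-x point-y))

;; Build an instance and read its fields back through the getters.
(define p (make-point 3 4))
(define p-fields (list (point-x p) (point-y p)))
```

   Under those assumptions ‘p-fields’ holds the two values passed to the
constructor, in slot order.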


File: guile.info, Node: Smobs, Next: Scheduling, Prev: Foreign Objects, Up: API Reference

6.21 Smobs
==========

A “smob” is a “small object”. Before foreign objects were introduced in
Guile 2.0.12 (*note Foreign Objects::), smobs were the preferred way
for C code to define new kinds of Scheme objects. With the exception of
the so-called “applicable SMOBs” discussed below, smobs are now a legacy
interface and are headed for eventual deprecation. *Note Deprecation::.
New code should use the foreign object interface.

   This section contains reference information related to defining and
working with smobs. For a tutorial-like introduction to smobs, see
“Defining New Types (Smobs)” in previous versions of this manual.

 -- Function: scm_t_bits scm_make_smob_type (const char *name, size_t
          size)
     This function adds a new smob type, named NAME, with instance size
     SIZE, to the system. The return value is a tag that is used in
     creating instances of the type.

     If SIZE is 0, the default _free_ function will do nothing.

     If SIZE is not 0, the default _free_ function will deallocate the
     memory block pointed to by ‘SCM_SMOB_DATA’ with ‘scm_gc_free’. The
     WHAT parameter in the call to ‘scm_gc_free’ will be NAME.

     Default values are provided for the _mark_, _free_, _print_, and
     _equalp_ functions. If you want to customize any of these
     functions, the call to ‘scm_make_smob_type’ should be immediately
     followed by calls to one or several of ‘scm_set_smob_mark’,
     ‘scm_set_smob_free’, ‘scm_set_smob_print’, and/or
     ‘scm_set_smob_equalp’.

 -- C Function: void scm_set_smob_free (scm_t_bits tc, size_t (*free)
          (SCM obj))
     This function sets the smob freeing procedure (sometimes referred
     to as a “finalizer”) for the smob type specified by the tag TC. TC
     is the tag returned by ‘scm_make_smob_type’.

     The FREE procedure must deallocate all resources that are directly
     associated with the smob instance OBJ. It must assume that all
     ‘SCM’ values that it references have already been freed and are
     thus invalid.

     It must also not call any libguile function or macro except
     ‘scm_gc_free’, ‘SCM_SMOB_FLAGS’, ‘SCM_SMOB_DATA’,
     ‘SCM_SMOB_DATA_2’, and ‘SCM_SMOB_DATA_3’.

     The FREE procedure must return 0.

     Note that defining a freeing procedure is not necessary if the
     resources associated with OBJ consist only of memory allocated
     with ‘scm_gc_malloc’ or ‘scm_gc_malloc_pointerless’ because this
     memory is automatically reclaimed by the garbage collector when it
     is no longer needed (*note ‘scm_gc_malloc’: Memory Blocks.).

   Smob free functions must be thread-safe. *Note Foreign Object Memory
Management::, for a discussion on finalizers and concurrency. If you
are embedding Guile in an application that is not thread-safe, and you
define smob types that need finalization, you might want to disable
automatic finalization, and arrange to call ‘scm_manually_run_finalizers
()’ yourself. *Note Foreign Objects::.

 -- C Function: void scm_set_smob_mark (scm_t_bits tc, SCM (*mark) (SCM
          obj))
     This function sets the smob marking procedure for the smob type
     specified by the tag TC. TC is the tag returned by
     ‘scm_make_smob_type’.

     Defining a marking procedure is almost always the wrong thing to
     do. It is much, much preferable to allocate smob data with the
     ‘scm_gc_malloc’ and ‘scm_gc_malloc_pointerless’ functions, and
     allow the GC to trace pointers automatically.

     Any mark procedures you see currently almost surely date from the
     time of Guile 1.8, before the switch to the Boehm-Demers-Weiser
     collector.
     Such smob implementations should be changed to just use
     ‘scm_gc_malloc’ and friends, and to lose their mark function.

     If you decide to keep the mark function, note that it may be called
     on objects that are on the free list. Please read and digest the
     comments from the BDW GC’s ‘gc/gc_mark.h’ header.

     The MARK procedure must cause ‘scm_gc_mark’ to be called for every
     ‘SCM’ value that is directly referenced by the smob instance OBJ.
     One of these ‘SCM’ values can be returned from the procedure and
     Guile will call ‘scm_gc_mark’ for it. This can be used to avoid
     deep recursions for smob instances that form a list.

     It must not call any libguile function or macro except
     ‘scm_gc_mark’, ‘SCM_SMOB_FLAGS’, ‘SCM_SMOB_DATA’,
     ‘SCM_SMOB_DATA_2’, and ‘SCM_SMOB_DATA_3’.

 -- C Function: void scm_set_smob_print (scm_t_bits tc, int (*print)
          (SCM obj, SCM port, scm_print_state* pstate))
     This function sets the smob printing procedure for the smob type
     specified by the tag TC. TC is the tag returned by
     ‘scm_make_smob_type’.

     The PRINT procedure should output a textual representation of the
     smob instance OBJ to PORT, using information in PSTATE.

     The textual representation should be of the form ‘#<name ...>’.
     This ensures that ‘read’ will not interpret it as some other Scheme
     value.

     It is often best to ignore PSTATE and just print to PORT with
     ‘scm_display’, ‘scm_write’, ‘scm_simple_format’, and ‘scm_puts’.

 -- C Function: void scm_set_smob_equalp (scm_t_bits tc, SCM (*equalp)
          (SCM obj1, SCM obj2))
     This function sets the smob equality-testing predicate for the smob
     type specified by the tag TC. TC is the tag returned by
     ‘scm_make_smob_type’.

     The EQUALP procedure should return ‘SCM_BOOL_T’ when OBJ1 is
     ‘equal?’ to OBJ2. Else it should return ‘SCM_BOOL_F’.
     Both OBJ1 and OBJ2 are instances of the smob type TC.

 -- C Function: void scm_assert_smob_type (scm_t_bits tag, SCM val)
     When VAL is a smob of the type indicated by TAG, do nothing. Else,
     signal an error.

 -- C Macro: int SCM_SMOB_PREDICATE (scm_t_bits tag, SCM exp)
     Return true if EXP is a smob instance of the type indicated by TAG,
     or false otherwise. The expression EXP can be evaluated more than
     once, so it shouldn’t contain any side effects.

 -- C Function: SCM scm_new_smob (scm_t_bits tag, scm_t_bits data)
 -- C Function: SCM scm_new_double_smob (scm_t_bits tag, scm_t_bits
          data, scm_t_bits data2, scm_t_bits data3)
     Make a new smob of the type with tag TAG and smob data DATA, DATA2,
     and DATA3, as appropriate.

     The TAG is what has been returned by ‘scm_make_smob_type’. The
     initial values DATA, DATA2, and DATA3 are of type ‘scm_t_bits’;
     when you want to use them for ‘SCM’ values, these values need to be
     converted to a ‘scm_t_bits’ first by using ‘SCM_UNPACK’.

     The flags of the smob instance start out as zero.

 -- C Macro: scm_t_bits SCM_SMOB_FLAGS (SCM obj)
     Return the 16 extra bits of the smob OBJ. No meaning is predefined
     for these bits; you can use them freely.

 -- C Macro: scm_t_bits SCM_SET_SMOB_FLAGS (SCM obj, scm_t_bits flags)
     Set the 16 extra bits of the smob OBJ to FLAGS. No meaning is
     predefined for these bits; you can use them freely.

 -- C Macro: scm_t_bits SCM_SMOB_DATA (SCM obj)
 -- C Macro: scm_t_bits SCM_SMOB_DATA_2 (SCM obj)
 -- C Macro: scm_t_bits SCM_SMOB_DATA_3 (SCM obj)
     Return the first (second, third) immediate word of the smob OBJ as
     a ‘scm_t_bits’ value. When the word contains a ‘SCM’ value, use
     ‘SCM_SMOB_OBJECT’ (etc.) instead.

 -- C Macro: void SCM_SET_SMOB_DATA (SCM obj, scm_t_bits val)
 -- C Macro: void SCM_SET_SMOB_DATA_2 (SCM obj, scm_t_bits val)
 -- C Macro: void SCM_SET_SMOB_DATA_3 (SCM obj, scm_t_bits val)
     Set the first (second, third) immediate word of the smob OBJ to
     VAL. When the word should be set to a ‘SCM’ value, use
     ‘SCM_SET_SMOB_OBJECT’ (etc.) instead.

 -- C Macro: SCM SCM_SMOB_OBJECT (SCM obj)
 -- C Macro: SCM SCM_SMOB_OBJECT_2 (SCM obj)
 -- C Macro: SCM SCM_SMOB_OBJECT_3 (SCM obj)
     Return the first (second, third) immediate word of the smob OBJ as
     a ‘SCM’ value. When the word contains a ‘scm_t_bits’ value, use
     ‘SCM_SMOB_DATA’ (etc.) instead.

 -- C Macro: void SCM_SET_SMOB_OBJECT (SCM obj, SCM val)
 -- C Macro: void SCM_SET_SMOB_OBJECT_2 (SCM obj, SCM val)
 -- C Macro: void SCM_SET_SMOB_OBJECT_3 (SCM obj, SCM val)
     Set the first (second, third) immediate word of the smob OBJ to
     VAL. When the word should be set to a ‘scm_t_bits’ value, use
     ‘SCM_SET_SMOB_DATA’ (etc.) instead.

 -- C Macro: SCM * SCM_SMOB_OBJECT_LOC (SCM obj)
 -- C Macro: SCM * SCM_SMOB_OBJECT_2_LOC (SCM obj)
 -- C Macro: SCM * SCM_SMOB_OBJECT_3_LOC (SCM obj)
     Return a pointer to the first (second, third) immediate word of the
     smob OBJ. Note that this is a pointer to ‘SCM’. If you need to
     work with ‘scm_t_bits’ values, use ‘SCM_PACK’ and ‘SCM_UNPACK’, as
     appropriate.

 -- Function: SCM scm_markcdr (SCM X)
     Mark the references in the smob X, assuming that X’s first data
     word contains an ordinary Scheme object, and X refers to no other
     objects. This function simply returns X’s first data word.
