1This is guile.info, produced by makeinfo version 6.7 from guile.texi.
2
3This manual documents Guile version 3.0.7.
4
5   Copyright (C) 1996-1997, 2000-2005, 2009-2021 Free Software
6Foundation, Inc.
7
8   Permission is granted to copy, distribute and/or modify this document
9under the terms of the GNU Free Documentation License, Version 1.3 or
10any later version published by the Free Software Foundation; with no
11Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts.  A
12copy of the license is included in the section entitled “GNU Free
13Documentation License.”
14INFO-DIR-SECTION The Algorithmic Language Scheme
15START-INFO-DIR-ENTRY
16* Guile Reference: (guile).     The Guile reference manual.
17END-INFO-DIR-ENTRY
18
19
20File: guile.info,  Node: Copyright Notice,  Next: Class Definition,  Up: GOOPS
21
228.1 Copyright Notice
23====================
24
25The material in this chapter is partly derived from the STk Reference
26Manual written by Erick Gallesio, whose copyright notice is as follows.
27
28   Copyright © 1993-1999 Erick Gallesio - I3S-CNRS/ESSI <eg@unice.fr>
29Permission to use, copy, modify, distribute,and license this software
30and its documentation for any purpose is hereby granted, provided that
31existing copyright notices are retained in all copies and that this
32notice is included verbatim in any distributions.  No written agreement,
33license, or royalty fee is required for any of the authorized uses.
34This software is provided “AS IS” without express or implied warranty.
35
36   The material has been adapted for use in Guile, with the author’s
37permission.
38
39
40File: guile.info,  Node: Class Definition,  Next: Instance Creation,  Prev: Copyright Notice,  Up: GOOPS
41
428.2 Class Definition
43====================
44
45A new class is defined with the ‘define-class’ syntax:
46
47     (define-class CLASS (SUPERCLASS ...)
48        SLOT-DESCRIPTION ...
49        CLASS-OPTION ...)
50
51   CLASS is the class being defined.  The list of SUPERCLASSes specifies
52which existing classes, if any, to inherit slots and properties from.
53“Slots” hold per-instance(1) data, for instances of that class — like
54“fields” or “member variables” in other object oriented systems.  Each
55SLOT-DESCRIPTION gives the name of a slot and optionally some
56“properties” of this slot; for example its initial value, the name of a
57function which will access its value, and so on.  Class options, slot
58descriptions and inheritance are discussed more below.
59
60 -- syntax: define-class name (super ...) slot-definition ...
61          class-option ...
62     Define a class called NAME that inherits from SUPERs, with direct
63     slots defined by SLOT-DEFINITIONs and CLASS-OPTIONs.  The newly
64     created class is bound to the variable name NAME in the current
65     environment.
66
67     Each SLOT-DEFINITION is either a symbol that names the slot or a
68     list,
69
70          (SLOT-NAME-SYMBOL . SLOT-OPTIONS)
71
72     where SLOT-NAME-SYMBOL is a symbol and SLOT-OPTIONS is a list with
73     an even number of elements.  The even-numbered elements of
74     SLOT-OPTIONS (counting from zero) are slot option keywords; the
75     odd-numbered elements are the corresponding values for those
76     keywords.
77
78     Each CLASS-OPTION is an option keyword and corresponding value.
79
80   As an example, let us define a type for representing a complex number
81in terms of two real numbers.(2)  This can be done with the following
82class definition:
83
84     (define-class <my-complex> (<number>)
85        r i)
86
87   This binds the variable ‘<my-complex>’ to a new class whose instances
88will contain two slots.  These slots are called ‘r’ and ‘i’ and will
89hold the real and imaginary parts of a complex number.  Note that this
90class inherits from ‘<number>’, which is a predefined class.(3)
91
92   Slot options are described in the next section.  The possible class
93options are as follows.
94
95 -- class option: #:metaclass metaclass
96     The ‘#:metaclass’ class option specifies the metaclass of the class
97     being defined.  METACLASS must be a class that inherits from
98     ‘<class>’.  For the use of metaclasses, see *note Metaobjects and
99     the Metaobject Protocol:: and *note Metaclasses::.
100
101     If the ‘#:metaclass’ option is absent, GOOPS reuses or constructs a
102     metaclass for the new class by calling ‘ensure-metaclass’ (*note
103     ensure-metaclass: Class Definition Protocol.).
104
105 -- class option: #:name name
106     The ‘#:name’ class option specifies the new class’s name.  This
107     name is used to identify the class whenever related objects - the
108     class itself, its instances and its subclasses - are printed.
109
110     If the ‘#:name’ option is absent, GOOPS uses the first argument to
111     ‘define-class’ as the class name.
112
113   ---------- Footnotes ----------
114
115   (1) Usually — but see also the ‘#:allocation’ slot option.
116
117   (2) Of course Guile already provides complex numbers, and ‘<complex>’
118is in fact a predefined class in GOOPS; but the definition here is still
119useful as an example.
120
121   (3) ‘<number>’ is the direct superclass of the predefined class
122‘<complex>’; ‘<complex>’ is the superclass of ‘<real>’, and ‘<real>’ is
123the superclass of ‘<integer>’.
124
125
126File: guile.info,  Node: Instance Creation,  Next: Slot Options,  Prev: Class Definition,  Up: GOOPS
127
1288.3 Instance Creation and Slot Access
129=====================================
130
131An instance (or object) of a defined class can be created with ‘make’.
132‘make’ takes one mandatory parameter, which is the class of the instance
133to create, and a list of optional arguments that will be used to
134initialize the slots of the new instance.  For instance the following
135form
136
137     (define c (make <my-complex>))
138
139creates a new ‘<my-complex>’ object and binds it to the Scheme variable
140‘c’.
141
142 -- generic: make
143 -- method: make (class <class>) initarg ...
144     Create and return a new instance of class CLASS, initialized using
145     INITARG ....
146
147     In theory, INITARG ... can have any structure that is understood by
148     whatever methods get applied when the ‘initialize’ generic function
149     is applied to the newly allocated instance.
150
151     In practice, specialized ‘initialize’ methods would normally call
152     ‘(next-method)’, and so eventually the standard GOOPS ‘initialize’
153     methods are applied.  These methods expect INITARGS to be a list
154     with an even number of elements, where even-numbered elements
155     (counting from zero) are keywords and odd-numbered elements are the
156     corresponding values.
157
158     GOOPS processes initialization argument keywords automatically for
159     slots whose definition includes the ‘#:init-keyword’ option (*note
160     init-keyword: Slot Options.).  Other keyword value pairs can only
161     be processed by an ‘initialize’ method that is specialized for the
162     new instance’s class.  Any unprocessed keyword value pairs are
163     ignored.
164
165 -- generic: make-instance
166 -- method: make-instance (class <class>) initarg ...
167     ‘make-instance’ is an alias for ‘make’.
168
169   The slots of the new complex number can be accessed using ‘slot-ref’
170and ‘slot-set!’.  ‘slot-set!’ sets the value of an object slot and
171‘slot-ref’ retrieves it.
172
173     (slot-set! c 'r 10)
174     (slot-set! c 'i 3)
175     (slot-ref c 'r) ⇒ 10
176     (slot-ref c 'i) ⇒ 3
177
178   The ‘(oop goops describe)’ module provides a ‘describe’ function that
179is useful for seeing all the slots of an object; it prints the slots and
180their values to standard output.
181
182     (describe c)
183184     #<<my-complex> 401d8638> is an instance of class <my-complex>
185     Slots are:
186          r = 10
187          i = 3
188
189
190File: guile.info,  Node: Slot Options,  Next: Slot Description Example,  Prev: Instance Creation,  Up: GOOPS
191
1928.4 Slot Options
193================
194
195When specifying a slot (in a ‘(define-class ...)’ form), various options
196can be specified in addition to the slot’s name.  Each option is
197specified by a keyword.  The list of possible keywords is as follows.
198
199 -- slot option: #:init-value init-value
200 -- slot option: #:init-form init-form
201 -- slot option: #:init-thunk init-thunk
202 -- slot option: #:init-keyword init-keyword
203     These options provide various ways to specify how to initialize the
204     slot’s value at instance creation time.
205
206     INIT-VALUE specifies a fixed initial slot value (shared across all
207     new instances of the class).
208
209     INIT-THUNK specifies a thunk that will provide a default value for
210     the slot.  The thunk is called when a new instance is created and
211     should return the desired initial slot value.
212
213     INIT-FORM specifies a form that, when evaluated, will return an
214     initial value for the slot.  The form is evaluated each time that
215     an instance of the class is created, in the lexical environment of
216     the containing ‘define-class’ expression.
217
218     INIT-KEYWORD specifies a keyword that can be used to pass an
219     initial slot value to ‘make’ when creating a new instance.
220
221     Note that, since an ‘init-value’ value is shared across all
222     instances of a class, you should only use it when the initial value
223     is an immutable value, like a constant.  If you want to initialize
224     a slot with a fresh, independently mutable value, you should use
225     ‘init-thunk’ or ‘init-form’ instead.  Consider the following
226     example.
227
228          (define-class <chbouib> ()
229            (hashtab #:init-value (make-hash-table)))
230
231     Here only one hash table is created and all instances of
232     ‘<chbouib>’ have their ‘hashtab’ slot refer to it.  In order to
233     have each instance of ‘<chbouib>’ refer to a new hash table, you
234     should instead write:
235
236          (define-class <chbouib> ()
237            (hashtab #:init-thunk make-hash-table))
238
239     or:
240
241          (define-class <chbouib> ()
242            (hashtab #:init-form (make-hash-table)))
243
244     If more than one of these options is specified for the same slot,
245     the order of precedence, highest first is
246
247        • ‘#:init-keyword’, if INIT-KEYWORD is present in the options
248          passed to ‘make’
249
250        • ‘#:init-thunk’, ‘#:init-form’ or ‘#:init-value’.
251
252     If the slot definition contains more than one initialization option
253     of the same precedence, the later ones are ignored.  If a slot is
254     not initialized at all, its value is unbound.
255
256     In general, slots that are shared between more than one instance
257     are only initialized at new instance creation time if the slot
258     value is unbound at that time.  However, if the new instance
259     creation specifies a valid init keyword and value for a shared
260     slot, the slot is re-initialized regardless of its previous value.
261
262     Note, however, that the power of GOOPS’ metaobject protocol means
263     that everything written here may be customized or overridden for
264     particular classes!  The slot initializations described here are
265     performed by the least specialized method of the generic function
266     ‘initialize’, whose signature is
267
268          (define-method (initialize (object <object>) initargs) ...)
269
270     The initialization of instances of any given class can be
271     customized by defining a ‘initialize’ method that is specialized
272     for that class, and the author of the specialized method may decide
273     to call ‘next-method’ - which will result in a call to the next
274     less specialized ‘initialize’ method - at any point within the
275     specialized code, or maybe not at all.  In general, therefore, the
276     initialization mechanisms described here may be modified or
277     overridden by more specialized code, or may not be supported at all
278     for particular classes.
279
280 -- slot option: #:getter getter
281 -- slot option: #:setter setter
282 -- slot option: #:accessor accessor
283     Given an object OBJ with slots named ‘foo’ and ‘bar’, it is always
284     possible to read and write those slots by calling ‘slot-ref’ and
285     ‘slot-set!’ with the relevant slot name; for example:
286
287          (slot-ref OBJ 'foo)
288          (slot-set! OBJ 'bar 25)
289
290     The ‘#:getter’, ‘#:setter’ and ‘#:accessor’ options, if present,
291     tell GOOPS to create generic function and method definitions that
292     can be used to get and set the slot value more conveniently.
293     GETTER specifies a generic function to which GOOPS will add a
294     method for getting the slot value.  SETTER specifies a generic
295     function to which GOOPS will add a method for setting the slot
296     value.  ACCESSOR specifies an accessor to which GOOPS will add
297     methods for both getting and setting the slot value.
298
299     So if a class includes a slot definition like this:
300
301          (c #:getter get-count #:setter set-count #:accessor count)
302
303     GOOPS defines generic function methods such that the slot value can
304     be referenced using either the getter or the accessor -
305
306          (let ((current-count (get-count obj))) ...)
307          (let ((current-count (count obj))) ...)
308
309     - and set using either the setter or the accessor -
310
311          (set-count obj (+ 1 current-count))
312          (set! (count obj) (+ 1 current-count))
313
314     Note that
315
316        • with an accessor, the slot value is set using the generalized
317          ‘set!’ syntax
318
319        • in practice, it is unusual for a slot to use all three of
320          these options: read-only, write-only and read-write slots
321          would typically use only ‘#:getter’, ‘#:setter’ and
322          ‘#:accessor’ options respectively.
323
324     The binding of the specified names is done in the environment of
325     the ‘define-class’ expression.  If the names are already bound (in
326     that environment) to values that cannot be upgraded to generic
327     functions, those values are overwritten when the ‘define-class’
328     expression is evaluated.  For more detail, see *note
329     ensure-generic: Generic Function Internals.
330
331 -- slot option: #:allocation allocation
332     The ‘#:allocation’ option tells GOOPS how to allocate storage for
333     the slot.  Possible values for ALLOCATION are
334
335        • ‘#:instance’
336
337          Indicates that GOOPS should create separate storage for this
338          slot in each new instance of the containing class (and its
339          subclasses).  This is the default.
340
341        • ‘#:class’
342
343          Indicates that GOOPS should create storage for this slot that
344          is shared by all instances of the containing class (and its
345          subclasses).  In other words, a slot in class C with
346          allocation ‘#:class’ is shared by all INSTANCEs for which
347          ‘(is-a? INSTANCE C)’.  This permits defining a kind of global
348          variable which can be accessed only by (in)direct instances of
349          the class which defines the slot.
350
351        • ‘#:each-subclass’
352
353          Indicates that GOOPS should create storage for this slot that
354          is shared by all _direct_ instances of the containing class,
355          and that whenever a subclass of the containing class is
356          defined, GOOPS should create a new storage for the slot that
357          is shared by all _direct_ instances of the subclass.  In other
358          words, a slot with allocation ‘#:each-subclass’ is shared by
359          all instances with the same ‘class-of’.
360
361        • ‘#:virtual’
362
363          Indicates that GOOPS should not allocate storage for this
364          slot.  The slot definition must also include the ‘#:slot-ref’
365          and ‘#:slot-set!’ options to specify how to reference and set
366          the value for this slot.  See the example below.
367
368     Slot allocation options are processed when defining a new class by
369     the generic function ‘compute-get-n-set’, which is specialized by
370     the class’s metaclass.  Hence new types of slot allocation can be
371     implemented by defining a new metaclass and a method for
372     ‘compute-get-n-set’ that is specialized for the new metaclass.  For
373     an example of how to do this, see *note Customizing Class
374     Definition::.
375
376 -- slot option: #:slot-ref getter
377 -- slot option: #:slot-set! setter
378     The ‘#:slot-ref’ and ‘#:slot-set!’ options must be specified if the
379     slot allocation is ‘#:virtual’, and are ignored otherwise.
380
381     GETTER should be a closure taking a single INSTANCE parameter that
382     returns the current slot value.  SETTER should be a closure taking
383     two parameters - INSTANCE and NEW-VAL - that sets the slot value to
384     NEW-VAL.
385
386
387File: guile.info,  Node: Slot Description Example,  Next: Methods and Generic Functions,  Prev: Slot Options,  Up: GOOPS
388
3898.5 Illustrating Slot Description
390=================================
391
392To illustrate slot description, we can redefine the ‘<my-complex>’ class
393seen before.  A definition could be:
394
395     (define-class <my-complex> (<number>)
396        (r #:init-value 0 #:getter get-r #:setter set-r! #:init-keyword #:r)
397        (i #:init-value 0 #:getter get-i #:setter set-i! #:init-keyword #:i))
398
399With this definition, the ‘r’ and ‘i’ slots are set to 0 by default, and
400can be initialised to other values by calling ‘make’ with the ‘#:r’ and
401‘#:i’ keywords.  Also the generic functions ‘get-r’, ‘set-r!’, ‘get-i’
402and ‘set-i!’ are automatically defined to read and write the slots.
403
404     (define c1 (make <my-complex> #:r 1 #:i 2))
405     (get-r c1) ⇒ 1
406     (set-r! c1 12)
407     (get-r c1) ⇒ 12
408     (define c2 (make <my-complex> #:r 2))
409     (get-r c2) ⇒ 2
410     (get-i c2) ⇒ 0
411
412   Accessors can both read and write a slot.  So, another definition of
413the ‘<my-complex>’ class, using the ‘#:accessor’ option, could be:
414
415     (define-class <my-complex> (<number>)
416        (r #:init-value 0 #:accessor real-part #:init-keyword #:r)
417        (i #:init-value 0 #:accessor imag-part #:init-keyword #:i))
418
419With this definition, the ‘r’ slot can be read with:
420     (real-part c)
421and set with:
422     (set! (real-part c) new-value)
423
424   Suppose now that we want to manipulate complex numbers with both
425rectangular and polar coordinates.  One solution could be to have a
426definition of complex numbers which uses one particular representation
427and some conversion functions to pass from one representation to the
428other.  A better solution is to use virtual slots, like this:
429
430     (define-class <my-complex> (<number>)
431        ;; True slots use rectangular coordinates
432        (r #:init-value 0 #:accessor real-part #:init-keyword #:r)
433        (i #:init-value 0 #:accessor imag-part #:init-keyword #:i)
434        ;; Virtual slots access do the conversion
435        (m #:accessor magnitude #:init-keyword #:magn
436           #:allocation #:virtual
437           #:slot-ref (lambda (o)
438                       (let ((r (slot-ref o 'r)) (i (slot-ref o 'i)))
439                         (sqrt (+ (* r r) (* i i)))))
440           #:slot-set! (lambda (o m)
441                         (let ((a (slot-ref o 'a)))
442                           (slot-set! o 'r (* m (cos a)))
443                           (slot-set! o 'i (* m (sin a))))))
444        (a #:accessor angle #:init-keyword #:angle
445           #:allocation #:virtual
446           #:slot-ref (lambda (o)
447                       (atan (slot-ref o 'i) (slot-ref o 'r)))
448           #:slot-set! (lambda(o a)
449                        (let ((m (slot-ref o 'm)))
450                           (slot-set! o 'r (* m (cos a)))
451                           (slot-set! o 'i (* m (sin a)))))))
452
453
454   In this class definition, the magnitude ‘m’ and angle ‘a’ slots are
455virtual, and are calculated, when referenced, from the normal (i.e.
456‘#:allocation #:instance’) slots ‘r’ and ‘i’, by calling the function
457defined in the relevant ‘#:slot-ref’ option.  Correspondingly, writing
458‘m’ or ‘a’ leads to calling the function defined in the ‘#:slot-set!’
459option.  Thus the following expression
460
461     (slot-set! c 'a 3)
462
463permits to set the angle of the ‘c’ complex number.
464
465     (define c (make <my-complex> #:r 12 #:i 20))
466     (real-part c) ⇒ 12
467     (angle c) ⇒ 1.03037682652431
468     (slot-set! c 'i 10)
469     (set! (real-part c) 1)
470     (describe c)
471472     #<<my-complex> 401e9b58> is an instance of class <my-complex>
473     Slots are:
474          r = 1
475          i = 10
476          m = 10.0498756211209
477          a = 1.47112767430373
478
479   Since initialization keywords have been defined for the four slots,
480we can now define the standard Scheme primitives ‘make-rectangular’ and
481‘make-polar’.
482
483     (define make-rectangular
484        (lambda (x y) (make <my-complex> #:r x #:i y)))
485
486     (define make-polar
487        (lambda (x y) (make <my-complex> #:magn x #:angle y)))
488
489
490File: guile.info,  Node: Methods and Generic Functions,  Next: Inheritance,  Prev: Slot Description Example,  Up: GOOPS
491
4928.6 Methods and Generic Functions
493=================================
494
495A GOOPS method is like a Scheme procedure except that it is specialized
496for a particular set of argument classes, and will only be used when the
497actual arguments in a call match the classes in the method definition.
498
499     (define-method (+ (x <string>) (y <string>))
500       (string-append x y))
501
502     (+ "abc" "de") ⇒ "abcde"
503
504   A method is not formally associated with any single class (as it is
505in many other object oriented languages), because a method can be
506specialized for a combination of several classes.  If you’ve studied
507object orientation in non-Lispy languages, you may remember discussions
508such as whether a method to stretch a graphical image around a surface
509should be a method of the image class, with a surface as a parameter, or
510a method of the surface class, with an image as a parameter.  In GOOPS
511you’d just write
512
513     (define-method (stretch (im <image>) (sf <surface>))
514       ...)
515
516and the question of which class the method is more associated with does
517not need answering.
518
519   There can simultaneously be several methods with the same name but
520different sets of specializing argument classes; for example:
521
522     (define-method (+ (x <string>) (y <string)) ...)
523     (define-method (+ (x <matrix>) (y <matrix>)) ...)
524     (define-method (+ (f <fish>) (b <bicycle>)) ...)
525     (define-method (+ (a <foo>) (b <bar>) (c <baz>)) ...)
526
527A generic function is a container for the set of such methods that a
528program intends to use.
529
530   If you look at a program’s source code, and see ‘(+ x y)’ somewhere
531in it, conceptually what is happening is that the program at that point
532calls a generic function (in this case, the generic function bound to
533the identifier ‘+’).  When that happens, Guile works out which of the
534generic function’s methods is the most appropriate for the arguments
535that the function is being called with; then it evaluates the method’s
536code with the arguments as formal parameters.  This happens every time
537that a generic function call is evaluated — it isn’t assumed that a
538given source code call will end up invoking the same method every time.
539
540   Defining an identifier as a generic function is done with the
541‘define-generic’ macro.  Definition of a new method is done with the
542‘define-method’ macro.  Note that ‘define-method’ automatically does a
543‘define-generic’ if the identifier concerned is not already a generic
544function, so often an explicit ‘define-generic’ call is not needed.
545
546 -- syntax: define-generic symbol
547     Create a generic function with name SYMBOL and bind it to the
548     variable SYMBOL.  If SYMBOL was previously bound to a Scheme
549     procedure (or procedure-with-setter), the old procedure (and
550     setter) is incorporated into the new generic function as its
551     default procedure (and setter).  Any other previous value,
552     including an existing generic function, is discarded and replaced
553     by a new, empty generic function.
554
555 -- syntax: define-method (generic parameter ...) body ...
556     Define a method for the generic function or accessor GENERIC with
557     parameters PARAMETERs and body BODY ....
558
559     GENERIC is a generic function.  If GENERIC is a variable which is
560     not yet bound to a generic function object, the expansion of
561     ‘define-method’ will include a call to ‘define-generic’.  If
562     GENERIC is ‘(setter GENERIC-WITH-SETTER)’, where
563     GENERIC-WITH-SETTER is a variable which is not yet bound to a
564     generic-with-setter object, the expansion will include a call to
565     ‘define-accessor’.
566
567     Each PARAMETER must be either a symbol or a two-element list
568     ‘(SYMBOL CLASS)’.  The symbols refer to variables in the body forms
569     that will be bound to the parameters supplied by the caller when
570     calling this method.  The CLASSes, if present, specify the possible
571     combinations of parameters to which this method can be applied.
572
573     BODY ... are the bodies of the method definition.
574
575   ‘define-method’ expressions look a little like Scheme procedure
576definitions of the form
577
578     (define (name formals ...) . body)
579
580   The important difference is that each formal parameter, apart from
581the possible “rest” argument, can be qualified by a class name: ‘FORMAL’
582becomes ‘(FORMAL CLASS)’.  The meaning of this qualification is that the
583method being defined will only be applicable in a particular generic
584function invocation if the corresponding argument is an instance of
585‘CLASS’ (or one of its subclasses).  If more than one of the formal
586parameters is qualified in this way, then the method will only be
587applicable if each of the corresponding arguments is an instance of its
588respective qualifying class.
589
590   Note that unqualified formal parameters act as though they are
591qualified by the class ‘<top>’, which GOOPS uses to mean the superclass
592of all valid Scheme types, including both primitive types and GOOPS
593classes.
594
595   For example, if a generic function method is defined with PARAMETERs
596‘(s1 <square>)’ and ‘(n <number>)’, that method is only applicable to
597invocations of its generic function that have two parameters where the
598first parameter is an instance of the ‘<square>’ class and the second
599parameter is a number.
600
601* Menu:
602
603* Accessors::
604* Extending Primitives::
605* Merging Generics::
606* Next-method::
607* Generic Function and Method Examples::
608* Handling Invocation Errors::
609
610
611File: guile.info,  Node: Accessors,  Next: Extending Primitives,  Up: Methods and Generic Functions
612
6138.6.1 Accessors
614---------------
615
616An accessor is a generic function that can also be used with the
617generalized ‘set!’ syntax (*note Procedures with Setters::).  Guile will
618handle a call like
619
620     (set! (accessor args...) value)
621
622by calling the most specialized method of ‘accessor’ that matches the
623classes of ‘args’ and ‘value’.  ‘define-accessor’ is used to bind an
624identifier to an accessor.
625
626 -- syntax: define-accessor symbol
627     Create an accessor with name SYMBOL and bind it to the variable
628     SYMBOL.  If SYMBOL was previously bound to a Scheme procedure (or
629     procedure-with-setter), the old procedure (and setter) is
630     incorporated into the new accessor as its default procedure (and
631     setter).  Any other previous value, including an existing generic
632     function or accessor, is discarded and replaced by a new, empty
633     accessor.
634
635
636File: guile.info,  Node: Extending Primitives,  Next: Merging Generics,  Prev: Accessors,  Up: Methods and Generic Functions
637
6388.6.2 Extending Primitives
639--------------------------
640
641Many of Guile’s primitive procedures can be extended by giving them a
642generic function definition that operates in conjunction with their
643normal C-coded implementation.  When a primitive is extended in this
644way, it behaves like a generic function with the C-coded implementation
645as its default method.
646
647   This extension happens automatically if a method is defined (by a
648‘define-method’ call) for a variable whose current value is a primitive.
649But it can also be forced by calling ‘enable-primitive-generic!’.
650
651 -- primitive procedure: enable-primitive-generic! primitive
652     Force the creation of a generic function definition for PRIMITIVE.
653
654   Once the generic function definition for a primitive has been
655created, it can be retrieved using ‘primitive-generic-generic’.
656
657 -- primitive procedure: primitive-generic-generic primitive
658     Return the generic function definition of PRIMITIVE.
659
660     ‘primitive-generic-generic’ raises an error if PRIMITIVE is not a
661     primitive with generic capability.
662
663
664File: guile.info,  Node: Merging Generics,  Next: Next-method,  Prev: Extending Primitives,  Up: Methods and Generic Functions
665
6668.6.3 Merging Generics
667----------------------
668
669GOOPS generic functions and accessors often have short, generic names.
670For example, if a vector package provides an accessor for the X
671coordinate of a vector, that accessor may just be called ‘x’.  It
672doesn’t need to be called, for example, ‘vector:x’, because GOOPS will
673work out, when it sees code like ‘(x OBJ)’, that the vector-specific
674method of ‘x’ should be called if OBJ is a vector.
675
676   That raises the question, though, of what happens when different
677packages define a generic function with the same name.  Suppose we work
678with a graphical package which needs to use two independent vector
679packages for 2D and 3D vectors respectively.  If both packages export
680‘x’, what does the code using those packages end up with?
681
682   *note duplicate binding handlers: Creating Guile Modules. explains
683how this is resolved for conflicting bindings in general.  For generics,
684there is a special duplicates handler, ‘merge-generics’, which tells the
685module system to merge generic functions with the same name.  Here is an
686example:
687
688     (define-module (math 2D-vectors)
689       #:use-module (oop goops)
690       #:export (x y ...))
691
692     (define-module (math 3D-vectors)
693       #:use-module (oop goops)
694       #:export (x y z ...))
695
696     (define-module (my-module)
697       #:use-module (oop goops)
698       #:use-module (math 2D-vectors)
699       #:use-module (math 3D-vectors)
700       #:duplicates (merge-generics))
701
702   The generic function ‘x’ in ‘(my-module)’ will now incorporate all of
703the methods of ‘x’ from both imported modules.
704
705   To be precise, there will now be three distinct generic functions
706named ‘x’: ‘x’ in ‘(math 2D-vectors)’, ‘x’ in ‘(math 3D-vectors)’, and
707‘x’ in ‘(my-module)’; and these functions share their methods in an
708interesting and dynamic way.
709
710   To explain, let’s call the imported generic functions (in ‘(math
7112D-vectors)’ and ‘(math 3D-vectors)’) the “ancestors”, and the merged
712generic function (in ‘(my-module)’), the “descendant”.  The general rule
713is that for any generic function G, the applicable methods are selected
714from the union of the methods of G’s descendant functions, the methods
715of G itself and the methods of G’s ancestor functions.
716
717   Thus ancestor functions effectively share methods with their
718descendants, and vice versa.  In the example above, ‘x’ in ‘(math
7192D-vectors)’ will share the methods of ‘x’ in ‘(my-module)’ and vice
720versa.(1)  Sharing is dynamic, so adding another new method to a
721descendant implies adding it to that descendant’s ancestors too.
722
723   ---------- Footnotes ----------
724
725   (1) But note that ‘x’ in ‘(math 2D-vectors)’ doesn’t share methods
726with ‘x’ in ‘(math 3D-vectors)’, so modularity is still preserved.
727
728
729File: guile.info,  Node: Next-method,  Next: Generic Function and Method Examples,  Prev: Merging Generics,  Up: Methods and Generic Functions
730
7318.6.4 Next-method
732-----------------
733
734When you call a generic function, with a particular set of arguments,
735GOOPS builds a list of all the methods that are applicable to those
736arguments and orders them by how closely the method definitions match
737the actual argument types.  It then calls the method at the top of this
738list.  If the selected method’s code wants to call on to the next method
739in this list, it can do so by using ‘next-method’.
740
741     (define-method (Test (a <integer>)) (cons 'integer (next-method)))
742     (define-method (Test (a <number>))  (cons 'number  (next-method)))
743     (define-method (Test a)             (list 'top))
744
745   With these definitions,
746
747     (Test 1)   ⇒ (integer number top)
748     (Test 1.0) ⇒ (number top)
749     (Test #t)  ⇒ (top)
750
751   ‘next-method’ is always called as just ‘(next-method)’.  The
752arguments for the next method call are always implicit, and always the
753same as for the original method call.
754
755   If you want to call on to a method with the same name but with a
756different set of arguments (as you might with overloaded methods in C++,
757for example), you do not use ‘next-method’, but instead simply write the
758new call as usual:
759
760     (define-method (Test (a <number>) min max)
761       (if (and (>= a min) (<= a max))
762           (display "Number is in range\n"))
763       (Test a))
764
765     (Test 2 1 10)
766767     Number is in range
768769     (integer number top)
770
771   (You should be careful in this case that the ‘Test’ calls do not lead
772to an infinite recursion, but this consideration is just the same as in
773Scheme code in general.)
774
775
776File: guile.info,  Node: Generic Function and Method Examples,  Next: Handling Invocation Errors,  Prev: Next-method,  Up: Methods and Generic Functions
777
7788.6.5 Generic Function and Method Examples
779------------------------------------------
780
781Consider the following definitions:
782
783     (define-generic G)
784     (define-method (G (a <integer>) b) 'integer)
785     (define-method (G (a <real>) b) 'real)
786     (define-method (G a b) 'top)
787
788   The ‘define-generic’ call defines G as a generic function.  The three
789next lines define methods for G.  Each method uses a sequence of
790“parameter specializers” that specify when the given method is
791applicable.  A specializer permits to indicate the class a parameter
792must belong to (directly or indirectly) to be applicable.  If no
793specializer is given, the system defaults it to ‘<top>’.  Thus, the
794first method definition is equivalent to
795
796     (define-method (G (a <integer>) (b <top>)) 'integer)
797
798   Now, let’s look at some possible calls to the generic function G:
799
800     (G 2 3)    ⇒ integer
801     (G 2 #t)   ⇒ integer
802     (G 1.2 'a) ⇒ real
803     (G #t #f)  ⇒ top
804     (G 1 2 3)  ⇒ error (since no method exists for 3 parameters)
805
806   The methods above use only one specializer per parameter list.  But
807in general, any or all of a method’s parameters may be specialized.
808Suppose we define now:
809
810     (define-method (G (a <integer>) (b <number>))  'integer-number)
811     (define-method (G (a <integer>) (b <real>))    'integer-real)
812     (define-method (G (a <integer>) (b <integer>)) 'integer-integer)
813     (define-method (G a (b <number>))              'top-number)
814
815With these definitions:
816
817     (G 1 2)   ⇒ integer-integer
818     (G 1 1.0) ⇒ integer-real
819     (G 1 #t)  ⇒ integer
820     (G 'a 1)  ⇒ top-number
821
822   As a further example we shall continue to define operations on the
823‘<my-complex>’ class.  Suppose that we want to use it to implement
824complex numbers completely.  For instance a definition for the addition
825of two complex numbers could be
826
827     (define-method (new-+ (a <my-complex>) (b <my-complex>))
828       (make-rectangular (+ (real-part a) (real-part b))
829                         (+ (imag-part a) (imag-part b))))
830
831   To be sure that the ‘+’ used in the method ‘new-+’ is the standard
832addition we can do:
833
834     (define-generic new-+)
835
836     (let ((+ +))
837       (define-method (new-+ (a <my-complex>) (b <my-complex>))
838         (make-rectangular (+ (real-part a) (real-part b))
839                           (+ (imag-part a) (imag-part b)))))
840
841   The ‘define-generic’ ensures here that ‘new-+’ will be defined in the
842global environment.  Once this is done, we can add methods to the
843generic function ‘new-+’ which make a closure on the ‘+’ symbol.  A
844complete writing of the ‘new-+’ methods is shown in *note Figure 8.1:
845fig:newplus.
846
847     (define-generic new-+)
848
849     (let ((+ +))
850
851       (define-method (new-+ (a <real>) (b <real>)) (+ a b))
852
853       (define-method (new-+ (a <real>) (b <my-complex>))
854         (make-rectangular (+ a (real-part b)) (imag-part b)))
855
856       (define-method (new-+ (a <my-complex>) (b <real>))
857         (make-rectangular (+ (real-part a) b) (imag-part a)))
858
859       (define-method (new-+ (a <my-complex>) (b <my-complex>))
860         (make-rectangular (+ (real-part a) (real-part b))
861                           (+ (imag-part a) (imag-part b))))
862
863       (define-method (new-+ (a <number>))  a)
864
865       (define-method (new-+) 0)
866
867       (define-method (new-+ . args)
868         (new-+ (car args)
869           (apply new-+ (cdr args)))))
870
871     (set! + new-+)
872
873Figure 8.1: Extending ‘+’ to handle complex numbers
874
875   We take advantage here of the fact that generic function are not
876obliged to have a fixed number of parameters.  The four first methods
877implement dyadic addition.  The fifth method says that the addition of a
878single element is this element itself.  The sixth method says that using
879the addition with no parameter always return 0 (as is also true for the
880primitive ‘+’).  The last method takes an arbitrary number of
881parameters(1).  This method acts as a kind of ‘reduce’: it calls the
882dyadic addition on the _car_ of the list and on the result of applying
883it on its rest.  To finish, the ‘set!’ permits to redefine the ‘+’
884symbol to our extended addition.
885
886   To conclude our implementation (integration?)  of complex numbers, we
887could redefine standard Scheme predicates in the following manner:
888
889     (define-method (complex? c <my-complex>) #t)
890     (define-method (complex? c)           #f)
891
892     (define-method (number? n <number>) #t)
893     (define-method (number? n)          #f)
894     ...
895
896   Standard primitives in which complex numbers are involved could also
897be redefined in the same manner.
898
899   ---------- Footnotes ----------
900
901   (1) The parameter list for a ‘define-method’ follows the conventions
902used for Scheme procedures.  In particular it can use the dot notation
903or a symbol to denote an arbitrary number of parameters
904
905
906File: guile.info,  Node: Handling Invocation Errors,  Prev: Generic Function and Method Examples,  Up: Methods and Generic Functions
907
9088.6.6 Handling Invocation Errors
909--------------------------------
910
911If a generic function is invoked with a combination of parameters for
912which there is no applicable method, GOOPS raises an error.
913
914 -- generic: no-method
915 -- method: no-method (gf <generic>) args
916     When an application invokes a generic function, and no methods at
917     all have been defined for that generic function, GOOPS calls the
918     ‘no-method’ generic function.  The default method calls
919     ‘goops-error’ with an appropriate message.
920
921 -- generic: no-applicable-method
922 -- method: no-applicable-method (gf <generic>) args
923     When an application applies a generic function to a set of
924     arguments, and no methods have been defined for those argument
925     types, GOOPS calls the ‘no-applicable-method’ generic function.
926     The default method calls ‘goops-error’ with an appropriate message.
927
928 -- generic: no-next-method
929 -- method: no-next-method (gf <generic>) args
930     When a generic function method calls ‘(next-method)’ to invoke the
931     next less specialized method for that generic function, and no less
932     specialized methods have been defined for the current generic
933     function arguments, GOOPS calls the ‘no-next-method’ generic
934     function.  The default method calls ‘goops-error’ with an
935     appropriate message.
936
937
938File: guile.info,  Node: Inheritance,  Next: Introspection,  Prev: Methods and Generic Functions,  Up: GOOPS
939
9408.7 Inheritance
941===============
942
943Here are some class definitions to help illustrate inheritance:
944
945     (define-class A () a)
946     (define-class B () b)
947     (define-class C () c)
948     (define-class D (A B) d a)
949     (define-class E (A C) e c)
950     (define-class F (D E) f)
951
952   ‘A’, ‘B’, ‘C’ have a null list of superclasses.  In this case, the
953system will replace the null list by a list which only contains
954‘<object>’, the root of all the classes defined by ‘define-class’.  ‘D’,
955‘E’, ‘F’ use multiple inheritance: each class inherits from two
956previously defined classes.  Those class definitions define a hierarchy
957which is shown in *note Figure 8.2: fig:hier.  In this figure, the class
958‘<top>’ is also shown; this class is the superclass of all Scheme
959objects.  In particular, ‘<top>’ is the superclass of all standard
960Scheme types.
961
962          <top>
963          / \\\_____________________
964         /   \\___________          \
965        /     \           \          \
966    <object>  <pair>  <procedure>  <number>
967    /  |  \                           |
968   /   |   \                          |
969  A    B    C                      <complex>
970  |\__/__   |                         |
971   \ /   \ /                          |
972    D     E                         <real>
973     \   /                            |
974       F                              |
975                                   <integer>
976
977Figure 8.2: A class hierarchy.
978
979   When a class has superclasses, its set of slots is calculated by
980taking the union of its own slots and those of all its superclasses.
981Thus each instance of D will have three slots, ‘a’, ‘b’ and ‘d’).  The
982slots of a class can be discovered using the ‘class-slots’ primitive.
983For instance,
984
985     (class-slots A) ⇒ ((a))
986     (class-slots E) ⇒ ((a) (e) (c))
987     (class-slots F) ⇒ ((e) (c) (b) (d) (a) (f))
988
989The ordering of the returned slots is not significant.
990
991* Menu:
992
993* Class Precedence List::
994* Sorting Methods::
995
996
997File: guile.info,  Node: Class Precedence List,  Next: Sorting Methods,  Up: Inheritance
998
9998.7.1 Class Precedence List
1000---------------------------
1001
1002What happens when a class inherits from two or more superclasses that
1003have a slot with the same name but incompatible definitions — for
1004example, different init values or slot allocations?  We need a rule for
1005deciding which slot definition the derived class ends up with, and this
1006rule is provided by the class’s “Class Precedence List”.(1)
1007
1008   Another problem arises when invoking a generic function, and there is
1009more than one method that could apply to the call arguments.  Here we
1010need a way of ordering the applicable methods, so that Guile knows which
1011method to use first, which to use next if that method calls
1012‘next-method’, and so on.  One of the ingredients for this ordering is
1013determining, for each given call argument, which of the specializing
1014classes, from each applicable method’s definition, is the most specific
1015for that argument; and here again the class precedence list helps.
1016
1017   If inheritance was restricted such that each class could only have
1018one superclass — which is known as “single” inheritance — class ordering
1019would be easy.  The rule would be simply that a subclass is considered
1020more specific than its superclass.
1021
1022   With multiple inheritance, ordering is less obvious, and we have to
1023impose an arbitrary rule to determine precedence.  Suppose we have
1024
1025     (define-class X ()
1026        (x #:init-value 1))
1027
1028     (define-class Y ()
1029        (x #:init-value 2))
1030
1031     (define-class Z (X Y)
1032        (...))
1033
1034Clearly the ‘Z’ class is more specific than ‘X’ or ‘Y’, for instances of
1035‘Z’.  But which is more specific out of ‘X’ and ‘Y’ — and hence, for the
1036definitions above, which ‘#:init-value’ will take effect when creating
1037an instance of ‘Z’?  The rule in GOOPS is that the superclasses listed
1038earlier are more specific than those listed later.  Hence ‘X’ is more
1039specific than ‘Y’, and the ‘#:init-value’ for slot ‘x’ in instances of
1040‘Z’ will be 1.
1041
1042   Hence there is a linear ordering for a class and all its
1043superclasses, from most specific to least specific, and this ordering is
1044called the Class Precedence List of the class.
1045
1046   In fact the rules above are not quite enough to always determine a
1047unique order, but they give an idea of how things work.  For example,
1048for the ‘F’ class shown in *note Figure 8.2: fig:hier, the class
1049precedence list is
1050
1051     (f d e a c b <object> <top>)
1052
1053In cases where there is any ambiguity (like this one), it is a bad idea
1054for programmers to rely on exactly what the order is.  If the order for
1055some superclasses is important, it can be expressed directly in the
1056class definition.
1057
1058   The precedence list of a class can be obtained by calling
1059‘class-precedence-list’.  This function returns a ordered list whose
1060first element is the most specific class.  For instance:
1061
1062     (class-precedence-list B) ⇒ (#<<class> B 401b97c8>
1063                                          #<<class> <object> 401e4a10>
1064                                          #<<class> <top> 4026a9d8>)
1065
1066Or for a more immediately readable result:
1067
1068     (map class-name (class-precedence-list B)) ⇒ (B <object> <top>)
1069
1070   ---------- Footnotes ----------
1071
1072   (1) This section is an adaptation of material from Jeff Dalton’s
1073(J.Dalton@ed.ac.uk) ‘Brief introduction to CLOS’
1074
1075
1076File: guile.info,  Node: Sorting Methods,  Prev: Class Precedence List,  Up: Inheritance
1077
10788.7.2 Sorting Methods
1079---------------------
1080
1081Now, with the idea of the class precedence list, we can state precisely
1082how the possible methods are sorted when more than one of the methods of
1083a generic function are applicable to the call arguments.
1084
1085   The rules are that
1086   • the applicable methods are sorted in order of specificity, and the
1087     most specific method is used first, then the next if that method
1088     calls ‘next-method’, and so on
1089
1090   • a method M1 is more specific than another method M2 if the first
1091     specializing class that differs, between the definitions of M1 and
1092     M2, is more specific, in M1’s definition, for the corresponding
1093     actual call argument, than the specializing class in M2’s
1094     definition
1095
1096   • a class C1 is more specific than another class C2, for an object of
1097     actual class C, if C1 comes before C2 in C’s class precedence list.
1098
1099
1100File: guile.info,  Node: Introspection,  Next: GOOPS Error Handling,  Prev: Inheritance,  Up: GOOPS
1101
11028.8 Introspection
1103=================
1104
1105“Introspection”, or “reflection”, means being able to obtain information
1106dynamically about GOOPS objects.  It is perhaps best illustrated by
1107considering an object oriented language that does not provide any
1108introspection, namely C++.
1109
1110   Nothing in C++ allows a running program to obtain answers to the
1111following types of question:
1112
1113   • What are the data members of this object or class?
1114
1115   • What classes does this class inherit from?
1116
1117   • Is this method call virtual or non-virtual?
1118
1119   • If I invoke ‘Employee::adjustHoliday()’, what class contains the
1120     ‘adjustHoliday()’ method that will be applied?
1121
1122   In C++, answers to such questions can only be determined by looking
1123at the source code, if you have access to it.  GOOPS, on the other hand,
1124includes procedures that allow answers to these questions — or their
1125GOOPS equivalents — to be obtained dynamically, at run time.
1126
1127* Menu:
1128
1129* Classes::
1130* Instances::
1131* Slots::
1132* Generic Functions::
1133* Accessing Slots::
1134
1135
1136File: guile.info,  Node: Classes,  Next: Instances,  Up: Introspection
1137
11388.8.1 Classes
1139-------------
1140
1141A GOOPS class is itself an instance of the ‘<class>’ class, or of a
1142subclass of ‘<class>’.  The definition of the ‘<class>’ class has slots
1143that are used to describe the properties of a class, including the
1144following.
1145
1146 -- primitive procedure: class-name class
1147     Return the name of class CLASS.  This is the value of CLASS’s
1148     ‘name’ slot.
1149
1150 -- primitive procedure: class-direct-supers class
1151     Return a list containing the direct superclasses of CLASS.  This is
1152     the value of CLASS’s ‘direct-supers’ slot.
1153
1154 -- primitive procedure: class-direct-slots class
1155     Return a list containing the slot definitions of the direct slots
1156     of CLASS.  This is the value of CLASS’s ‘direct-slots’ slot.
1157
1158 -- primitive procedure: class-direct-subclasses class
1159     Return a list containing the direct subclasses of CLASS.  This is
1160     the value of CLASS’s ‘direct-subclasses’ slot.
1161
1162 -- primitive procedure: class-direct-methods class
1163     Return a list of all the generic function methods that use CLASS as
1164     a formal parameter specializer.  This is the value of CLASS’s
1165     ‘direct-methods’ slot.
1166
1167 -- primitive procedure: class-precedence-list class
1168     Return the class precedence list for class CLASS (*note Class
1169     Precedence List::).  This is the value of CLASS’s ‘cpl’ slot.
1170
1171 -- primitive procedure: class-slots class
1172     Return a list containing the slot definitions for all CLASS’s
1173     slots, including any slots that are inherited from superclasses.
1174     This is the value of CLASS’s ‘slots’ slot.
1175
1176 -- procedure: class-subclasses class
1177     Return a list of all subclasses of CLASS.
1178
1179 -- procedure: class-methods class
1180     Return a list of all methods that use CLASS or a subclass of CLASS
1181     as one of its formal parameter specializers.
1182
1183
1184File: guile.info,  Node: Instances,  Next: Slots,  Prev: Classes,  Up: Introspection
1185
11868.8.2 Instances
1187---------------
1188
1189 -- primitive procedure: class-of value
1190     Return the GOOPS class of any Scheme VALUE.
1191
1192 -- primitive procedure: instance? object
1193     Return ‘#t’ if OBJECT is any GOOPS instance, otherwise ‘#f’.
1194
1195 -- procedure: is-a? object class
1196     Return ‘#t’ if OBJECT is an instance of CLASS or one of its
1197     subclasses.
1198
1199   You can use the ‘is-a?’ predicate to ask whether any given value
1200belongs to a given class, or ‘class-of’ to discover the class of a given
1201value.  Note that when GOOPS is loaded (by code using the ‘(oop goops)’
1202module) built-in classes like ‘<string>’, ‘<list>’ and ‘<number>’ are
1203automatically set up, corresponding to all Guile Scheme types.
1204
1205     (is-a? 2.3 <number>) ⇒ #t
1206     (is-a? 2.3 <real>) ⇒ #t
1207     (is-a? 2.3 <string>) ⇒ #f
1208     (is-a? '("a" "b") <string>) ⇒ #f
1209     (is-a? '("a" "b") <list>) ⇒ #t
1210     (is-a? (car '("a" "b")) <string>) ⇒ #t
1211     (is-a? <string> <class>) ⇒ #t
1212     (is-a? <class> <string>) ⇒ #f
1213
1214     (class-of 2.3) ⇒ #<<class> <real> 908c708>
1215     (class-of #(1 2 3)) ⇒ #<<class> <vector> 908cd20>
1216     (class-of <string>) ⇒ #<<class> <class> 8bd3e10>
1217     (class-of <class>) ⇒ #<<class> <class> 8bd3e10>
1218
1219
1220File: guile.info,  Node: Slots,  Next: Generic Functions,  Prev: Instances,  Up: Introspection
1221
12228.8.3 Slots
1223-----------
1224
1225 -- procedure: class-slot-definition class slot-name
1226     Return the slot definition for the slot named SLOT-NAME in class
1227     CLASS.  SLOT-NAME should be a symbol.
1228
1229 -- procedure: slot-definition-name slot-def
1230     Extract and return the slot name from SLOT-DEF.
1231
1232 -- procedure: slot-definition-options slot-def
1233     Extract and return the slot options from SLOT-DEF.
1234
1235 -- procedure: slot-definition-allocation slot-def
1236     Extract and return the slot allocation option from SLOT-DEF.  This
1237     is the value of the ‘#:allocation’ keyword (*note allocation: Slot
1238     Options.), or ‘#:instance’ if the ‘#:allocation’ keyword is absent.
1239
1240 -- procedure: slot-definition-getter slot-def
1241     Extract and return the slot getter option from SLOT-DEF.  This is
1242     the value of the ‘#:getter’ keyword (*note getter: Slot Options.),
1243     or ‘#f’ if the ‘#:getter’ keyword is absent.
1244
1245 -- procedure: slot-definition-setter slot-def
1246     Extract and return the slot setter option from SLOT-DEF.  This is
1247     the value of the ‘#:setter’ keyword (*note setter: Slot Options.),
1248     or ‘#f’ if the ‘#:setter’ keyword is absent.
1249
1250 -- procedure: slot-definition-accessor slot-def
1251     Extract and return the slot accessor option from SLOT-DEF.  This is
1252     the value of the ‘#:accessor’ keyword (*note accessor: Slot
1253     Options.), or ‘#f’ if the ‘#:accessor’ keyword is absent.
1254
1255 -- procedure: slot-definition-init-value slot-def
1256     Extract and return the slot init-value option from SLOT-DEF.  This
1257     is the value of the ‘#:init-value’ keyword (*note init-value: Slot
1258     Options.), or the unbound value if the ‘#:init-value’ keyword is
1259     absent.
1260
1261 -- procedure: slot-definition-init-form slot-def
1262     Extract and return the slot init-form option from SLOT-DEF.  This
1263     is the value of the ‘#:init-form’ keyword (*note init-form: Slot
1264     Options.), or the unbound value if the ‘#:init-form’ keyword is
1265     absent.
1266
1267 -- procedure: slot-definition-init-thunk slot-def
1268     Extract and return the slot init-thunk option from SLOT-DEF.  This
1269     is the value of the ‘#:init-thunk’ keyword (*note init-thunk: Slot
1270     Options.), or ‘#f’ if the ‘#:init-thunk’ keyword is absent.
1271
1272 -- procedure: slot-definition-init-keyword slot-def
1273     Extract and return the slot init-keyword option from SLOT-DEF.
1274     This is the value of the ‘#:init-keyword’ keyword (*note
1275     init-keyword: Slot Options.), or ‘#f’ if the ‘#:init-keyword’
1276     keyword is absent.
1277
1278 -- procedure: slot-init-function class slot-name
1279     Return the initialization function for the slot named SLOT-NAME in
1280     class CLASS.  SLOT-NAME should be a symbol.
1281
1282     The returned initialization function incorporates the effects of
1283     the standard ‘#:init-thunk’, ‘#:init-form’ and ‘#:init-value’ slot
1284     options.  These initializations can be overridden by the
1285     ‘#:init-keyword’ slot option or by a specialized ‘initialize’
1286     method, so, in general, the function returned by
1287     ‘slot-init-function’ may be irrelevant.  For a fuller discussion,
1288     see *note init-value: Slot Options.
1289
1290
1291File: guile.info,  Node: Generic Functions,  Next: Accessing Slots,  Prev: Slots,  Up: Introspection
1292
12938.8.4 Generic Functions
1294-----------------------
1295
1296A generic function is an instance of the ‘<generic>’ class, or of a
1297subclass of ‘<generic>’.  The definition of the ‘<generic>’ class has
1298slots that are used to describe the properties of a generic function.
1299
1300 -- primitive procedure: generic-function-name gf
1301     Return the name of generic function GF.
1302
1303 -- primitive procedure: generic-function-methods gf
1304     Return a list of the methods of generic function GF.  This is the
1305     value of GF’s ‘methods’ slot.
1306
1307   Similarly, a method is an instance of the ‘<method>’ class, or of a
1308subclass of ‘<method>’; and the definition of the ‘<method>’ class has
1309slots that are used to describe the properties of a method.
1310
1311 -- primitive procedure: method-generic-function method
1312     Return the generic function that METHOD belongs to.  This is the
1313     value of METHOD’s ‘generic-function’ slot.
1314
1315 -- primitive procedure: method-specializers method
1316     Return a list of METHOD’s formal parameter specializers .  This is
1317     the value of METHOD’s ‘specializers’ slot.
1318
1319 -- primitive procedure: method-procedure method
1320     Return the procedure that implements METHOD.  This is the value of
1321     METHOD’s ‘procedure’ slot.
1322
1323 -- generic: method-source
1324 -- method: method-source (m <method>)
1325     Return an expression that prints to show the definition of method
1326     M.
1327
1328          (define-generic cube)
1329
1330          (define-method (cube (n <number>))
1331            (* n n n))
1332
1333          (map method-source (generic-function-methods cube))
13341335          ((method ((n <number>)) (* n n n)))
1336
1337
1338File: guile.info,  Node: Accessing Slots,  Prev: Generic Functions,  Up: Introspection
1339
13408.8.5 Accessing Slots
1341---------------------
1342
1343Any slot, regardless of its allocation, can be queried, referenced and
1344set using the following four primitive procedures.
1345
1346 -- primitive procedure: slot-exists? obj slot-name
1347     Return ‘#t’ if OBJ has a slot with name SLOT-NAME, otherwise ‘#f’.
1348
1349 -- primitive procedure: slot-bound? obj slot-name
1350     Return ‘#t’ if the slot named SLOT-NAME in OBJ has a value,
1351     otherwise ‘#f’.
1352
1353     ‘slot-bound?’ calls the generic function ‘slot-missing’ if OBJ does
1354     not have a slot called SLOT-NAME (*note slot-missing: Accessing
1355     Slots.).
1356
1357 -- primitive procedure: slot-ref obj slot-name
1358     Return the value of the slot named SLOT-NAME in OBJ.
1359
1360     ‘slot-ref’ calls the generic function ‘slot-missing’ if OBJ does
1361     not have a slot called SLOT-NAME (*note slot-missing: Accessing
1362     Slots.).
1363
1364     ‘slot-ref’ calls the generic function ‘slot-unbound’ if the named
1365     slot in OBJ does not have a value (*note slot-unbound: Accessing
1366     Slots.).
1367
1368 -- primitive procedure: slot-set! obj slot-name value
1369     Set the value of the slot named SLOT-NAME in OBJ to VALUE.
1370
1371     ‘slot-set!’ calls the generic function ‘slot-missing’ if OBJ does
1372     not have a slot called SLOT-NAME (*note slot-missing: Accessing
1373     Slots.).
1374
1375   GOOPS stores information about slots in classes.  Internally, all of
1376these procedures work by looking up the slot definition for the slot
1377named SLOT-NAME in the class ‘(class-of OBJ)’, and then using the slot
1378definition’s “getter” and “setter” closures to get and set the slot
1379value.
1380
1381   The next four procedures differ from the previous ones in that they
1382take the class as an explicit argument, rather than assuming ‘(class-of
1383OBJ)’.  Therefore they allow you to apply the “getter” and “setter”
1384closures of a slot definition in one class to an instance of a different
1385class.
1386
1387 -- primitive procedure: slot-exists-using-class? class obj slot-name
1388     Return ‘#t’ if CLASS has a slot definition for a slot with name
1389     SLOT-NAME, otherwise ‘#f’.
1390
1391 -- primitive procedure: slot-bound-using-class? class obj slot-name
1392     Return ‘#t’ if applying ‘slot-ref-using-class’ to the same
1393     arguments would call the generic function ‘slot-unbound’, otherwise
1394     ‘#f’.
1395
1396     ‘slot-bound-using-class?’ calls the generic function ‘slot-missing’
1397     if CLASS does not have a slot definition for a slot called
1398     SLOT-NAME (*note slot-missing: Accessing Slots.).
1399
1400 -- primitive procedure: slot-ref-using-class class obj slot-name
1401     Apply the “getter” closure for the slot named SLOT-NAME in CLASS to
1402     OBJ, and return its result.
1403
1404     ‘slot-ref-using-class’ calls the generic function ‘slot-missing’ if
1405     CLASS does not have a slot definition for a slot called SLOT-NAME
1406     (*note slot-missing: Accessing Slots.).
1407
1408     ‘slot-ref-using-class’ calls the generic function ‘slot-unbound’ if
1409     the application of the “getter” closure to OBJ returns an unbound
1410     value (*note slot-unbound: Accessing Slots.).
1411
1412 -- primitive procedure: slot-set-using-class! class obj slot-name value
1413     Apply the “setter” closure for the slot named SLOT-NAME in CLASS to
1414     OBJ and VALUE.
1415
1416     ‘slot-set-using-class!’ calls the generic function ‘slot-missing’
1417     if CLASS does not have a slot definition for a slot called
1418     SLOT-NAME (*note slot-missing: Accessing Slots.).
1419
1420   Slots whose allocation is per-class rather than per-instance can be
1421referenced and set without needing to specify any particular instance.
1422
1423 -- procedure: class-slot-ref class slot-name
1424     Return the value of the slot named SLOT-NAME in class CLASS.  The
1425     named slot must have ‘#:class’ or ‘#:each-subclass’ allocation
1426     (*note allocation: Slot Options.).
1427
1428     If there is no such slot with ‘#:class’ or ‘#:each-subclass’
1429     allocation, ‘class-slot-ref’ calls the ‘slot-missing’ generic
1430     function with arguments CLASS and SLOT-NAME.  Otherwise, if the
1431     slot value is unbound, ‘class-slot-ref’ calls the ‘slot-unbound’
1432     generic function, with the same arguments.
1433
1434 -- procedure: class-slot-set! class slot-name value
1435     Set the value of the slot named SLOT-NAME in class CLASS to VALUE.
1436     The named slot must have ‘#:class’ or ‘#:each-subclass’ allocation
1437     (*note allocation: Slot Options.).
1438
1439     If there is no such slot with ‘#:class’ or ‘#:each-subclass’
1440     allocation, ‘class-slot-ref’ calls the ‘slot-missing’ generic
1441     function with arguments CLASS and SLOT-NAME.
1442
1443   When a ‘slot-ref’ or ‘slot-set!’ call specifies a non-existent slot
1444name, or tries to reference a slot whose value is unbound, GOOPS calls
1445one of the following generic functions.
1446
1447 -- generic: slot-missing
1448 -- method: slot-missing (class <class>) slot-name
1449 -- method: slot-missing (class <class>) (object <object>) slot-name
1450 -- method: slot-missing (class <class>) (object <object>) slot-name
1451          value
1452     When an application attempts to reference or set a class or
1453     instance slot by name, and the slot name is invalid for the
1454     specified CLASS or OBJECT, GOOPS calls the ‘slot-missing’ generic
1455     function.
1456
1457     The default methods all call ‘goops-error’ with an appropriate
1458     message.
1459
1460 -- generic: slot-unbound
1461 -- method: slot-unbound (object <object>)
1462 -- method: slot-unbound (class <class>) slot-name
1463 -- method: slot-unbound (class <class>) (object <object>) slot-name
1464     When an application attempts to reference a class or instance slot,
1465     and the slot’s value is unbound, GOOPS calls the ‘slot-unbound’
1466     generic function.
1467
1468     The default methods all call ‘goops-error’ with an appropriate
1469     message.
1470
1471
1472File: guile.info,  Node: GOOPS Error Handling,  Next: GOOPS Object Miscellany,  Prev: Introspection,  Up: GOOPS
1473
14748.9 Error Handling
1475==================
1476
1477The procedure ‘goops-error’ is called to raise an appropriate error by
1478the default methods of the following generic functions:
1479
1480   • ‘slot-missing’ (*note slot-missing: Accessing Slots.)
1481
1482   • ‘slot-unbound’ (*note slot-unbound: Accessing Slots.)
1483
1484   • ‘no-method’ (*note no-method: Handling Invocation Errors.)
1485
1486   • ‘no-applicable-method’ (*note no-applicable-method: Handling
1487     Invocation Errors.)
1488
1489   • ‘no-next-method’ (*note no-next-method: Handling Invocation
1490     Errors.)
1491
1492   If you customize these functions for particular classes or
1493metaclasses, you may still want to use ‘goops-error’ to signal any error
1494conditions that you detect.
1495
1496 -- procedure: goops-error format-string arg ...
1497     Raise an error with key ‘goops-error’ and error message constructed
1498     from FORMAT-STRING and ARG ....  Error message formatting is as
1499     done by ‘scm-error’.
1500
1501
1502File: guile.info,  Node: GOOPS Object Miscellany,  Next: The Metaobject Protocol,  Prev: GOOPS Error Handling,  Up: GOOPS
1503
15048.10 GOOPS Object Miscellany
1505============================
1506
1507Here we cover some points about GOOPS objects that aren’t substantial
1508enough to merit sections on their own.
1509
1510Object Equality
1511---------------
1512
1513When GOOPS is loaded, ‘eqv?’, ‘equal?’ and ‘=’ become generic functions,
1514and you can define methods for them, specialized for your own classes,
1515so as to control what the various kinds of equality mean for your
1516classes.
1517
1518   For example, the ‘assoc’ procedure, for looking up an entry in an
1519alist, is specified as using ‘equal?’ to determine when the car of an
1520entry in the alist is the same as the key parameter that ‘assoc’ is
1521called with.  Hence, if you had defined a new class, and wanted to use
1522instances of that class as the keys in an alist, you could define a
1523method for ‘equal?’, for your class, to control ‘assoc’’s lookup
1524precisely.
1525
1526Cloning Objects
1527---------------
1528
1529 -- generic: shallow-clone
1530 -- method: shallow-clone (self <object>)
1531     Return a “shallow” clone of SELF.  The default method makes a
1532     shallow clone by allocating a new instance and copying slot values
1533     from self to the new instance.  Each slot value is copied either as
1534     an immediate value or by reference.
1535
1536 -- generic: deep-clone
1537 -- method: deep-clone (self <object>)
1538     Return a “deep” clone of SELF.  The default method makes a deep
1539     clone by allocating a new instance and copying or cloning slot
1540     values from self to the new instance.  If a slot value is an
1541     instance (satisfies ‘instance?’), it is cloned by calling
1542     ‘deep-clone’ on that value.  Other slot values are copied either as
1543     immediate values or by reference.
1544
1545Write and Display
1546-----------------
1547
1548 -- primitive generic: write object port
1549 -- primitive generic: display object port
1550     When GOOPS is loaded, ‘write’ and ‘display’ become generic
1551     functions with special methods for printing
1552
1553        • objects - instances of the class ‘<object>’
1554
1555        • foreign objects - instances of the class ‘<foreign-object>’
1556
1557        • classes - instances of the class ‘<class>’
1558
1559        • generic functions - instances of the class ‘<generic>’
1560
1561        • methods - instances of the class ‘<method>’.
1562
1563     ‘write’ and ‘display’ print non-GOOPS values in the same way as the
1564     Guile primitive ‘write’ and ‘display’ functions.
1565
1566   In addition to the cases mentioned, you can of course define ‘write’
1567and ‘display’ methods for your own classes, to customize how instances
1568of those classes are printed.
1569
1570
1571File: guile.info,  Node: The Metaobject Protocol,  Next: Redefining a Class,  Prev: GOOPS Object Miscellany,  Up: GOOPS
1572
15738.11 The Metaobject Protocol
1574============================
1575
1576At this point, we’ve said about as much as can be said about GOOPS
1577without having to confront the idea of the metaobject protocol.  There
1578are a couple more topics that could be discussed in isolation first —
1579class redefinition, and changing the class of existing instances — but
1580in practice developers using them will be advanced enough to want to
1581understand the metaobject protocol too, and will probably be using the
1582protocol to customize exactly what happens during these events.
1583
1584   So let’s plunge in.  GOOPS is based on a “metaobject protocol” (aka
1585“MOP”) derived from the ones used in CLOS (the Common Lisp Object
1586System), tiny-clos (a small Scheme implementation of a subset of CLOS
1587functionality) and STKlos.
1588
1589   The MOP underlies many possible GOOPS customizations — such as
1590defining an ‘initialize’ method to customize the initialization of
1591instances of an application-defined class — and an understanding of the
1592MOP makes it much easier to explain such customizations in a precise
1593way.  And at a deeper level, understanding the MOP is a key part of
1594understanding GOOPS, and of taking full advantage of GOOPS’ power, by
1595customizing the behaviour of GOOPS itself.
1596
1597* Menu:
1598
1599* Metaobjects and the Metaobject Protocol::
1600* Metaclasses::
1601* MOP Specification::
1602* Instance Creation Protocol::
1603* Class Definition Protocol::
1604* Customizing Class Definition::
1605* Method Definition::
1606* Method Definition Internals::
1607* Generic Function Internals::
1608* Generic Function Invocation::
1609
1610
1611File: guile.info,  Node: Metaobjects and the Metaobject Protocol,  Next: Metaclasses,  Up: The Metaobject Protocol
1612
16138.11.1 Metaobjects and the Metaobject Protocol
1614----------------------------------------------
1615
1616The building blocks of GOOPS are classes, slot definitions, instances,
1617generic functions and methods.  A class is a grouping of inheritance
1618relations and slot definitions.  An instance is an object with slots
1619that are allocated following the rules implied by its class’s
1620superclasses and slot definitions.  A generic function is a collection
1621of methods and rules for determining which of those methods to apply
1622when the generic function is invoked.  A method is a procedure and a set
1623of specializers that specify the type of arguments to which the
1624procedure is applicable.
1625
1626   Of these entities, GOOPS represents classes, generic functions and
1627methods as “metaobjects”.  In other words, the values in a GOOPS program
1628that describe classes, generic functions and methods, are themselves
1629instances (or “objects”) of special GOOPS classes that encapsulate the
1630behaviour, respectively, of classes, generic functions, and methods.
1631
1632   (The other two entities are slot definitions and instances.  Slot
1633definitions are not strictly instances, but every slot definition is
1634associated with a GOOPS class that specifies the behaviour of the slot
1635as regards accessibility and protection from garbage collection.
1636Instances are of course objects in the usual sense, and there is no
1637benefit from thinking of them as metaobjects.)
1638
1639   The “metaobject protocol” (or “MOP”) is the specification of the
1640generic functions which determine the behaviour of these metaobjects and
1641the circumstances in which these generic functions are invoked.
1642
1643   For a concrete example of what this means, consider how GOOPS
1644calculates the set of slots for a class that is being defined using
1645‘define-class’.  The desired set of slots is the union of the new
1646class’s direct slots and the slots of all its superclasses.  But
1647‘define-class’ itself does not perform this calculation.  Instead, there
1648is a method of the ‘initialize’ generic function that is specialized for
1649instances of type ‘<class>’, and it is this method that performs the
1650slot calculation.
1651
1652   ‘initialize’ is a generic function which GOOPS calls whenever a new
1653instance is created, immediately after allocating memory for a new
1654instance, in order to initialize the new instance’s slots.  The sequence
1655of steps is as follows.
1656
1657   • ‘define-class’ uses ‘make’ to make a new instance of the ‘<class>’
1658     class, passing as initialization arguments the superclasses, slot
1659     definitions and class options that were specified in the
1660     ‘define-class’ form.
1661
1662   • ‘make’ allocates memory for the new instance, and invokes the
1663     ‘initialize’ generic function to initialize the new instance’s
1664     slots.
1665
1666   • The ‘initialize’ generic function applies the method that is
1667     specialized for instances of type ‘<class>’, and this method
1668     performs the slot calculation.
1669
1670   In other words, rather than being hardcoded in ‘define-class’, the
1671default behaviour of class definition is encapsulated by generic
1672function methods that are specialized for the class ‘<class>’.
1673
1674   It is possible to create a new class that inherits from ‘<class>’,
1675which is called a “metaclass”, and to write a new ‘initialize’ method
1676that is specialized for instances of the new metaclass.  Then, if the
1677‘define-class’ form includes a ‘#:metaclass’ class option whose value is
1678the new metaclass, the class that is defined by the ‘define-class’ form
1679will be an instance of the new metaclass rather than of the default
1680‘<class>’, and will be defined in accordance with the new ‘initialize’
1681method.  Thus the default slot calculation, as well as any other aspect
1682of the new class’s relationship with its superclasses, can be modified
1683or overridden.
1684
1685   In a similar way, the behaviour of generic functions can be modified
1686or overridden by creating a new class that inherits from the standard
1687generic function class ‘<generic>’, writing appropriate methods that are
1688specialized to the new class, and creating new generic functions that
1689are instances of the new class.
1690
1691   The same is true for method metaobjects.  And the same basic
1692mechanism allows the application class author to write an ‘initialize’
1693method that is specialized to their application class, to initialize
1694instances of that class.
1695
1696   Such is the power of the MOP. Note that ‘initialize’ is just one of a
1697large number of generic functions that can be customized to modify the
1698behaviour of application objects and classes and of GOOPS itself.  Each
1699following section covers a particular area of GOOPS functionality, and
1700describes the generic functions that are relevant for customization of
1701that area.
1702
1703
1704File: guile.info,  Node: Metaclasses,  Next: MOP Specification,  Prev: Metaobjects and the Metaobject Protocol,  Up: The Metaobject Protocol
1705
17068.11.2 Metaclasses
1707------------------
1708
1709A “metaclass” is the class of an object which represents a GOOPS class.
1710Put more succinctly, a metaclass is a class’s class.
1711
1712   Most GOOPS classes have the metaclass ‘<class>’ and, by default, any
1713new class that is created using ‘define-class’ has the metaclass
1714‘<class>’.
1715
1716   But what does this really mean?  To find out, let’s look in more
1717detail at what happens when a new class is created using ‘define-class’:
1718
1719     (define-class <my-class> (<object>) . slots)
1720
1721Guile expands this to something like:
1722
1723     (define <my-class> (class (<object>) . slots))
1724
1725which in turn expands to:
1726
1727     (define <my-class>
1728       (make <class> #:dsupers (list <object>) #:slots slots))
1729
1730   As this expansion makes clear, the resulting value of ‘<my-class>’ is
1731an instance of the class ‘<class>’ with slot values specifying the
1732superclasses and slot definitions for the class ‘<my-class>’.
1733(‘#:dsupers’ and ‘#:slots’ are initialization keywords for the ‘dsupers’
1734and ‘dslots’ slots of the ‘<class>’ class.)
1735
1736   Now suppose that you want to define a new class with a metaclass
1737other than the default ‘<class>’.  This is done by writing:
1738
1739     (define-class <my-class2> (<object>)
1740        slot ...
1741        #:metaclass <my-metaclass>)
1742
1743and Guile expands _this_ to something like:
1744
1745     (define <my-class2>
1746       (make <my-metaclass> #:dsupers (list <object>) #:slots slots))
1747
1748   In this case, the value of ‘<my-class2>’ is an instance of the more
1749specialized class ‘<my-metaclass>’.  Note that ‘<my-metaclass>’ itself
1750must previously have been defined as a subclass of ‘<class>’.  For a
1751full discussion of when and how it is useful to define new metaclasses,
1752see *note MOP Specification::.
1753
1754   Now let’s make an instance of ‘<my-class2>’:
1755
1756     (define my-object (make <my-class2> ...))
1757
1758   All of the following statements are correct expressions of the
1759relationships between ‘my-object’, ‘<my-class2>’, ‘<my-metaclass>’ and
1760‘<class>’.
1761
1762   • ‘my-object’ is an instance of the class ‘<my-class2>’.
1763
1764   • ‘<my-class2>’ is an instance of the class ‘<my-metaclass>’.
1765
1766   • ‘<my-metaclass>’ is an instance of the class ‘<class>’.
1767
1768   • The class of ‘my-object’ is ‘<my-class2>’.
1769
1770   • The class of ‘<my-class2>’ is ‘<my-metaclass>’.
1771
1772   • The class of ‘<my-metaclass>’ is ‘<class>’.
1773
1774
1775File: guile.info,  Node: MOP Specification,  Next: Instance Creation Protocol,  Prev: Metaclasses,  Up: The Metaobject Protocol
1776
17778.11.3 MOP Specification
1778------------------------
1779
1780The aim of the MOP specification in this chapter is to specify all the
1781customizable generic function invocations that can be made by the
1782standard GOOPS syntax, procedures and methods, and to explain the
1783protocol for customizing such invocations.
1784
1785   A generic function invocation is customizable if the types of the
1786arguments to which it is applied are not completely determined by the
1787lexical context in which the invocation appears.  For example, the
1788‘(initialize INSTANCE INITARGS)’ invocation in the default
1789‘make-instance’ method is customizable, because the type of the
1790‘INSTANCE’ argument is determined by the class that was passed to
1791‘make-instance’.
1792
1793   (Whereas — to give a counter-example — the ‘(make <generic> #:name
1794',name)’ invocation in ‘define-generic’ is not customizable, because all
1795of its arguments have lexically determined types.)
1796
1797   When using this rule to decide whether a given generic function
1798invocation is customizable, we ignore arguments that are expected to be
1799handled in method definitions as a single “rest” list argument.
1800
1801   For each customizable generic function invocation, the “invocation
1802protocol” is explained by specifying
1803
1804   • what, conceptually, the applied method is intended to do
1805
1806   • what assumptions, if any, the caller makes about the applied
1807     method’s side effects
1808
1809   • what the caller expects to get as the applied method’s return
1810     value.
1811
1812
1813File: guile.info,  Node: Instance Creation Protocol,  Next: Class Definition Protocol,  Prev: MOP Specification,  Up: The Metaobject Protocol
1814
18158.11.4 Instance Creation Protocol
1816---------------------------------
1817
1818‘make <class> . INITARGS’ (method)
1819
1820   • ‘allocate-instance CLASS INITARGS’ (generic)
1821
1822     The applied ‘allocate-instance’ method should allocate storage for
1823     a new instance of class CLASS and return the uninitialized
1824     instance.
1825
1826   • ‘initialize INSTANCE INITARGS’ (generic)
1827
1828     INSTANCE is the uninitialized instance returned by
1829     ‘allocate-instance’.  The applied method should initialize the new
1830     instance in whatever sense is appropriate for its class.  The
1831     method’s return value is ignored.
1832
1833   ‘make’ itself is a generic function.  Hence the ‘make’ invocation
1834itself can be customized in the case where the new instance’s metaclass
1835is more specialized than the default ‘<class>’, by defining a ‘make’
1836method that is specialized to that metaclass.
1837
1838   Normally, however, the method for classes with metaclass ‘<class>’
1839will be applied.  This method calls two generic functions:
1840
1841   • (allocate-instance CLASS .  INITARGS)
1842
1843   • (initialize INSTANCE .  INITARGS)
1844
1845   ‘allocate-instance’ allocates storage for and returns the new
1846instance, uninitialized.  You might customize ‘allocate-instance’, for
1847example, if you wanted to provide a GOOPS wrapper around some other
1848object programming system.
1849
1850   To do this, you would create a specialized metaclass, which would act
1851as the metaclass for all classes and instances from the other system.
1852Then define an ‘allocate-instance’ method, specialized to that
1853metaclass, which calls a Guile primitive C function (or FFI code), which
1854in turn allocates the new instance using the interface of the other
1855object system.
1856
1857   In this case, for a complete system, you would also need to customize
1858a number of other generic functions like ‘make’ and ‘initialize’, so
1859that GOOPS knows how to make classes from the other system, access
1860instance slots, and so on.
1861
1862   ‘initialize’ initializes the instance that is returned by
1863‘allocate-instance’.  The standard GOOPS methods perform initializations
1864appropriate to the instance class.
1865
1866   • At the least specialized level, the method for instances of type
1867     ‘<object>’ performs internal GOOPS instance initialization, and
1868     initializes the instance’s slots according to the slot definitions
1869     and any slot initialization keywords that appear in INITARGS.
1870
1871   • The method for instances of type ‘<class>’ calls ‘(next-method)’,
1872     then performs the class initializations described in *note Class
1873     Definition Protocol::.
1874
1875   • and so on for generic functions, methods, operator classes ...
1876
1877   Similarly, you can customize the initialization of instances of any
1878application-defined class by defining an ‘initialize’ method specialized
1879to that class.
1880
1881   Imagine a class whose instances’ slots need to be initialized at
1882instance creation time by querying a database.  Although it might be
1883possible to achieve this a combination of ‘#:init-thunk’ keywords and
1884closures in the slot definitions, it may be neater to write an
1885‘initialize’ method for the class that queries the database once and
1886initializes all the dependent slot values according to the results.
1887
1888
1889File: guile.info,  Node: Class Definition Protocol,  Next: Customizing Class Definition,  Prev: Instance Creation Protocol,  Up: The Metaobject Protocol
1890
18918.11.5 Class Definition Protocol
1892--------------------------------
1893
1894Here is a summary diagram of the syntax, procedures and generic
1895functions that may be involved in class definition.
1896
1897‘define-class’ (syntax)
1898
1899   • ‘class’ (syntax)
1900
1901        • ‘make-class’ (procedure)
1902
1903             • ‘ensure-metaclass’ (procedure)
1904
1905             • ‘make METACLASS ...’ (generic)
1906
1907                  • ‘allocate-instance’ (generic)
1908
1909                  • ‘initialize’ (generic)
1910
1911                       • ‘compute-cpl’ (generic)
1912
1913                            • ‘compute-std-cpl’ (procedure)
1914
1915                       • ‘compute-slots’ (generic)
1916
1917                       • ‘compute-get-n-set’ (generic)
1918
1919                       • ‘compute-getter-method’ (generic)
1920
1921                       • ‘compute-setter-method’ (generic)
1922
1923   • ‘class-redefinition’ (generic)
1924
1925        • ‘remove-class-accessors’ (generic)
1926
1927        • ‘update-direct-method!’ (generic)
1928
1929        • ‘update-direct-subclass!’ (generic)
1930
1931   Wherever a step above is marked as “generic”, it can be customized,
1932and the detail shown below it is only “correct” insofar as it describes
1933what the default method of that generic function does.  For example, if
1934you write an ‘initialize’ method, for some metaclass, that does not call
1935‘next-method’ and does not call ‘compute-cpl’, then ‘compute-cpl’ will
1936not be called when a class is defined with that metaclass.
1937
1938   A ‘(define-class ...)’ form (*note Class Definition::) expands to an
1939expression which
1940
1941   • checks that it is being evaluated only at top level
1942
1943   • defines any accessors that are implied by the SLOT-DEFINITIONs
1944
1945   • uses ‘class’ to create the new class
1946
1947   • checks for a previous class definition for NAME and, if found,
1948     handles the redefinition by invoking ‘class-redefinition’ (*note
1949     Redefining a Class::).
1950
1951 -- syntax: class name (super ...) slot-definition ... class-option ...
1952     Return a newly created class that inherits from SUPERs, with direct
1953     slots defined by SLOT-DEFINITIONs and CLASS-OPTIONs.  For the
1954     format of SLOT-DEFINITIONs and CLASS-OPTIONs, see *note
1955     define-class: Class Definition.
1956
1957‘class’ expands to an expression which
1958
1959   • processes the class and slot definition options to check that they
1960     are well-formed, to convert the ‘#:init-form’ option to an
1961     ‘#:init-thunk’ option, to supply a default environment parameter
1962     (the current top-level environment) and to evaluate all the bits
1963     that need to be evaluated
1964
1965   • calls ‘make-class’ to create the class with the processed and
1966     evaluated parameters.
1967
1968 -- procedure: make-class supers slots class-option ...
1969     Return a newly created class that inherits from SUPERS, with direct
1970     slots defined by SLOTS and CLASS-OPTIONs.  For the format of SLOTS
1971     and CLASS-OPTIONs, see *note define-class: Class Definition, except
1972     note that for ‘make-class’, SLOTS is a separate list of slot
1973     definitions.
1974
1975‘make-class’
1976
1977   • adds ‘<object>’ to the SUPERS list if SUPERS is empty or if none of
1978     the classes in SUPERS have ‘<object>’ in their class precedence
1979     list
1980
1981   • defaults the ‘#:environment’, ‘#:name’ and ‘#:metaclass’ options,
1982     if they are not specified by OPTIONS, to the current top-level
1983     environment, the unbound value, and ‘(ensure-metaclass SUPERS)’
1984     respectively
1985
1986   • checks for duplicate classes in SUPERS and duplicate slot names in
1987     SLOTS, and signals an error if there are any duplicates
1988
1989   • calls ‘make’, passing the metaclass as the first parameter and all
1990     other parameters as option keywords with values.
1991
1992 -- procedure: ensure-metaclass supers env
1993     Return a metaclass suitable for a class that inherits from the list
1994     of classes in SUPERS.  The returned metaclass is the union by
1995     inheritance of the metaclasses of the classes in SUPERS.
1996
1997     In the simplest case, where all the SUPERS are straightforward
1998     classes with metaclass ‘<class>’, the returned metaclass is just
1999     ‘<class>’.
2000
2001     For a more complex example, suppose that SUPERS contained one class
2002     with metaclass ‘<operator-class>’ and one with metaclass
2003     ‘<foreign-object-class>’.  Then the returned metaclass would be a
2004     class that inherits from both ‘<operator-class>’ and
2005     ‘<foreign-object-class>’.
2006
2007     If SUPERS is the empty list, ‘ensure-metaclass’ returns the default
2008     GOOPS metaclass ‘<class>’.
2009
2010     GOOPS keeps a list of the metaclasses created by
2011     ‘ensure-metaclass’, so that each required type of metaclass only
2012     has to be created once.
2013
2014     The ‘env’ parameter is ignored.
2015
2016 -- generic: make metaclass initarg ...
2017     METACLASS is the metaclass of the class being defined, either taken
2018     from the ‘#:metaclass’ class option or computed by
2019     ‘ensure-metaclass’.  The applied method must create and return the
2020     fully initialized class metaobject for the new class definition.
2021
2022   The ‘(make METACLASS INITARG ...)’ invocation is a particular case of
2023the instance creation protocol covered in the previous section.  It will
2024create an class metaobject with metaclass METACLASS.  By default, this
2025metaobject will be initialized by the ‘initialize’ method that is
2026specialized for instances of type ‘<class>’.
2027
2028   The ‘initialize’ method for classes (signature ‘(initialize <class>
2029initargs)’) calls the following generic functions.
2030
2031   • ‘compute-cpl CLASS’ (generic)
2032
2033     The applied method should compute and return the class precedence
2034     list for CLASS as a list of class metaobjects.  When ‘compute-cpl’
2035     is called, the following CLASS metaobject slots have all been
2036     initialized: ‘name’, ‘direct-supers’, ‘direct-slots’,
2037     ‘direct-subclasses’ (empty), ‘direct-methods’.  The value returned
2038     by ‘compute-cpl’ will be stored in the ‘cpl’ slot.
2039
2040   • ‘compute-slots CLASS’ (generic)
2041
2042     The applied method should compute and return the slots (union of
2043     direct and inherited) for CLASS as a list of slot definitions.
2044     When ‘compute-slots’ is called, all the CLASS metaobject slots
2045     mentioned for ‘compute-cpl’ have been initialized, plus the
2046     following: ‘cpl’, ‘redefined’ (‘#f’), ‘environment’.  The value
2047     returned by ‘compute-slots’ will be stored in the ‘slots’ slot.
2048
2049   • ‘compute-get-n-set CLASS SLOT-DEF’ (generic)
2050
2051     ‘initialize’ calls ‘compute-get-n-set’ for each slot computed by
2052     ‘compute-slots’.  The applied method should compute and return a
2053     pair of closures that, respectively, get and set the value of the
2054     specified slot.  The get closure should have arity 1 and expect a
2055     single argument that is the instance whose slot value is to be
2056     retrieved.  The set closure should have arity 2 and expect two
2057     arguments, where the first argument is the instance whose slot
2058     value is to be set and the second argument is the new value for
2059     that slot.  The closures should be returned in a two element list:
2060     ‘(list GET SET)’.
2061
2062     The closures returned by ‘compute-get-n-set’ are stored as part of
2063     the value of the CLASS metaobject’s ‘getters-n-setters’ slot.
2064     Specifically, the value of this slot is a list with the same number
2065     of elements as there are slots in the class, and each element looks
2066     either like
2067
2068          (SLOT-NAME-SYMBOL INIT-FUNCTION . INDEX)
2069
2070     or like
2071
2072          (SLOT-NAME-SYMBOL INIT-FUNCTION GET SET)
2073
2074     Where the get and set closures are replaced by INDEX, the slot is
2075     an instance slot and INDEX is the slot’s index in the underlying
2076     structure: GOOPS knows how to get and set the value of such slots
2077     and so does not need specially constructed get and set closures.
2078     Otherwise, GET and SET are the closures returned by
2079     ‘compute-get-n-set’.
2080
2081     The structure of the ‘getters-n-setters’ slot value is important
2082     when understanding the next customizable generic functions that
2083     ‘initialize’ calls...
2084
2085   • ‘compute-getter-method CLASS GNS’ (generic)
2086
2087     ‘initialize’ calls ‘compute-getter-method’ for each of the class’s
2088     slots (as determined by ‘compute-slots’) that includes a ‘#:getter’
2089     or ‘#:accessor’ slot option.  GNS is the element of the CLASS
2090     metaobject’s ‘getters-n-setters’ slot that specifies how the slot
2091     in question is referenced and set, as described above under
2092     ‘compute-get-n-set’.  The applied method should create and return a
2093     method that is specialized for instances of type CLASS and uses the
2094     get closure to retrieve the slot’s value.  ‘initialize’ uses
2095     ‘add-method!’ to add the returned method to the generic function
2096     named by the slot definition’s ‘#:getter’ or ‘#:accessor’ option.
2097
2098   • ‘compute-setter-method CLASS GNS’ (generic)
2099
2100     ‘compute-setter-method’ is invoked with the same arguments as
2101     ‘compute-getter-method’, for each of the class’s slots that
2102     includes a ‘#:setter’ or ‘#:accessor’ slot option.  The applied
2103     method should create and return a method that is specialized for
2104     instances of type CLASS and uses the set closure to set the slot’s
2105     value.  ‘initialize’ then uses ‘add-method!’ to add the returned
2106     method to the generic function named by the slot definition’s
2107     ‘#:setter’ or ‘#:accessor’ option.
2108
2109
2110File: guile.info,  Node: Customizing Class Definition,  Next: Method Definition,  Prev: Class Definition Protocol,  Up: The Metaobject Protocol
2111
21128.11.6 Customizing Class Definition
2113-----------------------------------
2114
2115If the metaclass of the new class is something more specialized than the
2116default ‘<class>’, then the type of CLASS in the calls above is more
2117specialized than ‘<class>’, and hence it becomes possible to define
2118generic function methods, specialized for the new class’s metaclass,
2119that can modify or override the default behaviour of ‘initialize’,
2120‘compute-cpl’ or ‘compute-get-n-set’.
2121
2122   ‘compute-cpl’ computes the class precedence list (“CPL”) for the new
2123class (*note Class Precedence List::), and returns it as a list of class
2124objects.  The CPL is important because it defines a superclass ordering
2125that is used, when a generic function is invoked upon an instance of the
2126class, to decide which of the available generic function methods is the
2127most specific.  Hence ‘compute-cpl’ could be customized in order to
2128modify the CPL ordering algorithm for all classes with a special
2129metaclass.
2130
2131   The default CPL algorithm is encapsulated by the ‘compute-std-cpl’
2132procedure, which is called by the default ‘compute-cpl’ method.
2133
2134 -- procedure: compute-std-cpl class
2135     Compute and return the class precedence list for CLASS according to
2136     the algorithm described in *note Class Precedence List::.
2137
2138   ‘compute-slots’ computes and returns a list of all slot definitions
2139for the new class.  By default, this list includes the direct slot
2140definitions from the ‘define-class’ form, plus the slot definitions that
2141are inherited from the new class’s superclasses.  The default
2142‘compute-slots’ method uses the CPL computed by ‘compute-cpl’ to
2143calculate this union of slot definitions, with the rule that slots
2144inherited from superclasses are shadowed by direct slots with the same
2145name.  One possible reason for customizing ‘compute-slots’ would be to
2146implement an alternative resolution strategy for slot name conflicts.
2147
2148   ‘compute-get-n-set’ computes the low-level closures that will be used
2149to get and set the value of a particular slot, and returns them in a
2150list with two elements.
2151
2152   The closures returned depend on how storage for that slot is
2153allocated.  The standard ‘compute-get-n-set’ method, specialized for
2154classes of type ‘<class>’, handles the standard GOOPS values for the
2155‘#:allocation’ slot option (*note allocation: Slot Options.).  By
2156defining a new ‘compute-get-n-set’ method for a more specialized
2157metaclass, it is possible to support new types of slot allocation.
2158
2159   Suppose you wanted to create a large number of instances of some
2160class with a slot that should be shared between some but not all
2161instances of that class - say every 10 instances should share the same
2162slot storage.  The following example shows how to implement and use a
2163new type of slot allocation to do this.
2164
2165     (define-class <batched-allocation-metaclass> (<class>))
2166
2167     (let ((batch-allocation-count 0)
2168           (batch-get-n-set #f))
2169       (define-method (compute-get-n-set
2170                          (class <batched-allocation-metaclass>) s)
2171         (case (slot-definition-allocation s)
2172           ((#:batched)
2173            ;; If we've already used the same slot storage for 10 instances,
2174            ;; reset variables.
2175            (if (= batch-allocation-count 10)
2176                (begin
2177                  (set! batch-allocation-count 0)
2178                  (set! batch-get-n-set #f)))
2179            ;; If we don't have a current pair of get and set closures,
2180            ;; create one.  make-closure-variable returns a pair of closures
2181            ;; around a single Scheme variable - see goops.scm for details.
2182            (or batch-get-n-set
2183                (set! batch-get-n-set (make-closure-variable)))
2184            ;; Increment the batch allocation count.
2185            (set! batch-allocation-count (+ batch-allocation-count 1))
2186            batch-get-n-set)
2187
2188           ;; Call next-method to handle standard allocation types.
2189           (else (next-method)))))
2190
2191     (define-class <class-using-batched-slot> ()
2192       ...
2193       (c #:allocation #:batched)
2194       ...
2195       #:metaclass <batched-allocation-metaclass>)
2196
2197   The usage of ‘compute-getter-method’ and ‘compute-setter-method’ is
2198described in *note Class Definition Protocol::.
2199
2200   ‘compute-cpl’ and ‘compute-get-n-set’ are called by the standard
2201‘initialize’ method for classes whose metaclass is ‘<class>’.  But
2202‘initialize’ itself can also be modified, by defining an ‘initialize’
2203method specialized to the new class’s metaclass.  Such a method could
2204complete override the standard behaviour, by not calling ‘(next-method)’
2205at all, but more typically it would perform additional class
2206initialization steps before and/or after calling ‘(next-method)’ for the
2207standard behaviour.
2208
2209
2210File: guile.info,  Node: Method Definition,  Next: Method Definition Internals,  Prev: Customizing Class Definition,  Up: The Metaobject Protocol
2211
22128.11.7 Method Definition
2213------------------------
2214
2215‘define-method’ (syntax)
2216
2217   • ‘add-method! TARGET METHOD’ (generic)
2218
2219‘define-method’ invokes the ‘add-method!’ generic function to handle
2220adding the new method to a variety of possible targets.  GOOPS includes
2221methods to handle TARGET as
2222
2223   • a generic function (the most common case)
2224
2225   • a procedure
2226
2227   • a primitive generic (*note Extending Primitives::)
2228
2229   By defining further methods for ‘add-method!’, you can theoretically
2230handle adding methods to further types of target.
2231
2232
2233File: guile.info,  Node: Method Definition Internals,  Next: Generic Function Internals,  Prev: Method Definition,  Up: The Metaobject Protocol
2234
22358.11.8 Method Definition Internals
2236----------------------------------
2237
2238‘define-method’:
2239
2240   • checks the form of the first parameter, and applies the following
2241     steps to the accessor’s setter if it has the ‘(setter ...)’ form
2242
2243   • interpolates a call to ‘define-generic’ or ‘define-accessor’ if a
2244     generic function is not already defined with the supplied name
2245
2246   • calls ‘method’ with the PARAMETERs and BODY, to make a new method
2247     instance
2248
2249   • calls ‘add-method!’ to add this method to the relevant generic
2250     function.
2251
2252 -- syntax: method (parameter ...) body ...
2253     Make a method whose specializers are defined by the classes in
2254     PARAMETERs and whose procedure definition is constructed from the
2255     PARAMETER symbols and BODY forms.
2256
2257     The PARAMETER and BODY parameters should be as for ‘define-method’
2258     (*note define-method: Methods and Generic Functions.).
2259
2260‘method’:
2261
2262   • extracts formals and specializing classes from the PARAMETERs,
2263     defaulting the class for unspecialized parameters to ‘<top>’
2264
2265   • creates a closure using the formals and the BODY forms
2266
2267   • calls ‘make’ with metaclass ‘<method>’ and the specializers and
2268     closure using the ‘#:specializers’ and ‘#:procedure’ keywords.
2269
2270 -- procedure: make-method specializers procedure
2271     Make a method using SPECIALIZERS and PROCEDURE.
2272
2273     SPECIALIZERS should be a list of classes that specifies the
2274     parameter combinations to which this method will be applicable.
2275
2276     PROCEDURE should be the closure that will applied to the generic
2277     function parameters when this method is invoked.
2278
2279‘make-method’ is a simple wrapper around ‘make’ with metaclass
2280‘<method>’.
2281
2282 -- generic: add-method! target method
2283     Generic function for adding method METHOD to TARGET.
2284
2285 -- method: add-method! (generic <generic>) (method <method>)
2286     Add method METHOD to the generic function GENERIC.
2287
2288 -- method: add-method! (proc <procedure>) (method <method>)
2289     If PROC is a procedure with generic capability (*note
2290     generic-capability?: Extending Primitives.), upgrade it to a
2291     primitive generic and add METHOD to its generic function
2292     definition.
2293
2294 -- method: add-method! (pg <primitive-generic>) (method <method>)
2295     Add method METHOD to the generic function definition of PG.
2296
2297     Implementation: ‘(add-method! (primitive-generic-generic pg)
2298     method)’.
2299
2300 -- method: add-method! (whatever <top>) (method <method>)
2301     Raise an error indicating that WHATEVER is not a valid generic
2302     function.
2303
2304
2305File: guile.info,  Node: Generic Function Internals,  Next: Generic Function Invocation,  Prev: Method Definition Internals,  Up: The Metaobject Protocol
2306
23078.11.9 Generic Function Internals
2308---------------------------------
2309
2310‘define-generic’ calls ‘ensure-generic’ to upgrade a pre-existing
2311procedure value, or ‘make’ with metaclass ‘<generic>’ to create a new
2312generic function.
2313
2314   ‘define-accessor’ calls ‘ensure-accessor’ to upgrade a pre-existing
2315procedure value, or ‘make-accessor’ to create a new accessor.
2316
2317 -- procedure: ensure-generic old-definition [name]
2318     Return a generic function with name NAME, if possible by using or
2319     upgrading OLD-DEFINITION.  If unspecified, NAME defaults to ‘#f’.
2320
2321     If OLD-DEFINITION is already a generic function, it is returned
2322     unchanged.
2323
2324     If OLD-DEFINITION is a Scheme procedure or procedure-with-setter,
2325     ‘ensure-generic’ returns a new generic function that uses
2326     OLD-DEFINITION for its default procedure and setter.
2327
2328     Otherwise ‘ensure-generic’ returns a new generic function with no
2329     defaults and no methods.
2330
2331 -- procedure: make-generic [name]
2332     Return a new generic function with name ‘(car NAME)’.  If
2333     unspecified, NAME defaults to ‘#f’.
2334
2335   ‘ensure-generic’ calls ‘make’ with metaclasses ‘<generic>’ and
2336‘<generic-with-setter>’, depending on the previous value of the variable
2337that it is trying to upgrade.
2338
2339   ‘make-generic’ is a simple wrapper for ‘make’ with metaclass
2340‘<generic>’.
2341
2342 -- procedure: ensure-accessor proc [name]
2343     Return an accessor with name NAME, if possible by using or
2344     upgrading PROC.  If unspecified, NAME defaults to ‘#f’.
2345
2346     If PROC is already an accessor, it is returned unchanged.
2347
2348     If PROC is a Scheme procedure, procedure-with-setter or generic
2349     function, ‘ensure-accessor’ returns an accessor that reuses the
2350     reusable elements of PROC.
2351
2352     Otherwise ‘ensure-accessor’ returns a new accessor with no defaults
2353     and no methods.
2354
2355 -- procedure: make-accessor [name]
2356     Return a new accessor with name ‘(car NAME)’.  If unspecified, NAME
2357     defaults to ‘#f’.
2358
2359   ‘ensure-accessor’ calls ‘make’ with metaclass
2360‘<generic-with-setter>’, as well as calls to ‘ensure-generic’,
2361‘make-accessor’ and (tail recursively) ‘ensure-accessor’.
2362
2363   ‘make-accessor’ calls ‘make’ twice, first with metaclass ‘<generic>’
2364to create a generic function for the setter, then with metaclass
2365‘<generic-with-setter>’ to create the accessor, passing the setter
2366generic function as the value of the ‘#:setter’ keyword.
2367
2368
2369File: guile.info,  Node: Generic Function Invocation,  Prev: Generic Function Internals,  Up: The Metaobject Protocol
2370
23718.11.10 Generic Function Invocation
2372-----------------------------------
2373
2374There is a detailed and customizable protocol involved in the process of
2375invoking a generic function — i.e., in the process of deciding which of
2376the generic function’s methods are applicable to the current arguments,
2377and which one of those to apply.  Here is a summary diagram of the
2378generic functions involved.
2379
2380‘apply-generic’ (generic)
2381
2382   • ‘no-method’ (generic)
2383
2384   • ‘compute-applicable-methods’ (generic)
2385
2386   • ‘sort-applicable-methods’ (generic)
2387
2388        • ‘method-more-specific?’ (generic)
2389
2390   • ‘apply-methods’ (generic)
2391
2392        • ‘apply-method’ (generic)
2393
2394        • ‘no-next-method’ (generic)
2395
2396   • ‘no-applicable-method’
2397
2398   We do not yet have full documentation for these.  Please refer to the
2399code (‘oop/goops.scm’) for details.
2400
2401
2402File: guile.info,  Node: Redefining a Class,  Next: Changing the Class of an Instance,  Prev: The Metaobject Protocol,  Up: GOOPS
2403
24048.12 Redefining a Class
2405=======================
2406
2407Suppose that a class ‘<my-class>’ is defined using ‘define-class’ (*note
2408define-class: Class Definition.), with slots that have accessor
2409functions, and that an application has created several instances of
2410‘<my-class>’ using ‘make’ (*note make: Instance Creation.).  What then
2411happens if ‘<my-class>’ is redefined by calling ‘define-class’ again?
2412
2413* Menu:
2414
2415* Redefinable Classes::
2416* Default Class Redefinition Behaviour::
2417* Customizing Class Redefinition::
2418
2419
2420File: guile.info,  Node: Redefinable Classes,  Next: Default Class Redefinition Behaviour,  Up: Redefining a Class
2421
24228.12.1 Redefinable Classes
2423--------------------------
2424
2425The ability for a class to be redefined is a choice for a class author
2426to make.  By default, classes in GOOPS are _not_ redefinable.  A
2427redefinable class is an instance of ‘<redefinable-class>’; that is to
2428say, a class with ‘<redefinable-class>’ as its metaclass.  Accordingly,
2429to define a redefinable class, add ‘#:metaclass <redefinable-class>’ to
2430its class definition:
2431
2432     (define-class <foo> ()
2433       #:metaclass <redefinable-class>)
2434
2435   Note that any subclass of ‘<foo>’ is also redefinable, without the
2436need to explicitly pass the ‘#:metaclass’ argument, so you only need to
2437specify ‘#:metaclass’ for the roots of your application’s class
2438hierarchy.
2439
2440     (define-class <bar> (<foo>))
2441     (class-of <bar>) ⇒ <redefinable-class>
2442
2443   Note that prior to Guile 3.0, all GOOPS classes were redefinable in
2444theory.  In practice, attempting to, for example, redefine ‘<class>’
2445itself would almost certainly not do what you want.  Still, redefinition
2446is an interesting capability when building long-lived resilient systems,
2447so GOOPS does offer this facility.
2448
2449
2450File: guile.info,  Node: Default Class Redefinition Behaviour,  Next: Customizing Class Redefinition,  Prev: Redefinable Classes,  Up: Redefining a Class
2451
24528.12.2 Default Class Redefinition Behaviour
2453-------------------------------------------
2454
2455When a class is defined using ‘define-class’ and the class name was
2456previously defined, by default the new binding just replaces the old
2457binding.  This is the normal behavior for ‘define’.  However if both the
2458old and new bindings are redefinable classes (instances of
2459‘<redefinable-class>’), then the class will be updated in place, and its
2460instances lazily migrated over.
2461
2462   The way that the class is updated and the way that the instances
2463migrate over are of course part of the meta-object protocol.  However
2464the default behavior usually suffices, and it goes as follows.
2465
2466   • All existing direct instances of ‘<my-class>’ are converted to be
2467     instances of the new class.  This is achieved by preserving the
2468     values of slots that exist in both the old and new definitions, and
2469     initializing the values of new slots in the usual way (*note make:
2470     Instance Creation.).
2471
2472   • All existing subclasses of ‘<my-class>’ are redefined, as though
2473     the ‘define-class’ expressions that defined them were re-evaluated
2474     following the redefinition of ‘<my-class>’, and the class
2475     redefinition process described here is applied recursively to the
2476     redefined subclasses.
2477
2478   • Once all of its instances and subclasses have been updated, the
2479     class metaobject previously bound to the variable ‘<my-class>’ is
2480     no longer needed and so can be allowed to be garbage collected.
2481
2482   To keep things tidy, GOOPS also needs to do a little housekeeping on
2483methods that are associated with the redefined class.
2484
2485   • Slot accessor methods for slots in the old definition should be
2486     removed from their generic functions.  They will be replaced by
2487     accessor methods for the slots of the new class definition.
2488
2489   • Any generic function method that uses the old ‘<my-class>’
2490     metaobject as one of its formal parameter specializers must be
2491     updated to refer to the new ‘<my-class>’ metaobject.  (Whenever a
2492     new generic function method is defined, ‘define-method’ adds the
2493     method to a list stored in the class metaobject for each class used
2494     as a formal parameter specializer, so it is easy to identify all
2495     the methods that must be updated when a class is redefined.)
2496
2497   If this class redefinition strategy strikes you as rather
2498counter-intuitive, bear in mind that it is derived from similar
2499behaviour in other object systems such as CLOS, and that experience in
2500those systems has shown it to be very useful in practice.
2501
2502   Also bear in mind that, like most of GOOPS’ default behaviour, it can
2503be customized...
2504
2505
2506File: guile.info,  Node: Customizing Class Redefinition,  Prev: Default Class Redefinition Behaviour,  Up: Redefining a Class
2507
25088.12.3 Customizing Class Redefinition
2509-------------------------------------
2510
2511When ‘define-class’ notices that a class is being redefined, it
2512constructs the new class metaobject as usual, then invokes the
2513‘class-redefinition’ generic function with the old and new classes as
2514arguments.  Therefore, if the old or new classes have metaclasses other
2515than the default ‘<redefinable-class>’, class redefinition behaviour can
2516be customized by defining a ‘class-redefinition’ method that is
2517specialized for the relevant metaclasses.
2518
2519 -- generic: class-redefinition
2520     Handle the class redefinition from OLD to NEW, and return the new
2521     class metaobject that should be bound to the variable specified by
2522     ‘define-class’’s first argument.
2523
2524 -- method: class-redefinition (old <top>) (new <class>)
2525     Not all classes are redefinable, and not all previous bindings are
2526     classes.  *Note Redefinable Classes::.  This default method just
2527     returns NEW.
2528
2529 -- method: class-redefinition (old <redefinable-class>) (new
2530          <redefinable-class>)
2531     This method implements GOOPS’ default class redefinition behaviour,
2532     as described in *note Default Class Redefinition Behaviour::.
2533     Returns the metaobject for the new class definition.
2534
2535   The ‘class-redefinition’ method for classes with metaclass
2536‘<redefinable-class>’ calls the following generic functions, which could
2537of course be individually customized.
2538
2539 -- generic: remove-class-accessors! old
2540     The default ‘remove-class-accessors!’ method removes the accessor
2541     methods of the old class from all classes which they specialize.
2542
2543 -- generic: update-direct-method! method old new
2544     The default ‘update-direct-method!’ method substitutes the new
2545     class for the old in all methods specialized to the old class.
2546
2547 -- generic: update-direct-subclass! subclass old new
2548     The default ‘update-direct-subclass!’ method invokes
2549     ‘class-redefinition’ recursively to handle the redefinition of
2550     subclasses.
2551
2552   An alternative class redefinition strategy could be to leave all
2553existing instances as instances of the old class, but accepting that the
2554old class is now “nameless”, since its name has been taken over by the
2555new definition.  In this strategy, any existing subclasses could also be
2556left as they are, on the understanding that they inherit from a nameless
2557superclass.
2558
2559   This strategy is easily implemented in GOOPS, by defining a new
2560metaclass, that will be used as the metaclass for all classes to which
2561the strategy should apply, and then defining a ‘class-redefinition’
2562method that is specialized for this metaclass:
2563
2564     (define-class <can-be-nameless> (<redefinable-class>))
2565
2566     (define-method (class-redefinition (old <can-be-nameless>)
2567                                        (new <class>))
2568       new)
2569
2570   When customization can be as easy as this, aren’t you glad that GOOPS
2571implements the far more difficult strategy as its default!
2572
2573
2574File: guile.info,  Node: Changing the Class of an Instance,  Prev: Redefining a Class,  Up: GOOPS
2575
25768.13 Changing the Class of an Instance
2577======================================
2578
2579When a redefinable class is redefined, any existing instance of the
2580redefined class will be modified for the new class definition before the
2581next time that any of the instance’s slots is referenced or set.  GOOPS
2582modifies each instance by calling the generic function ‘change-class’.
2583
2584   More generally, you can change the class of an existing instance at
2585any time by invoking the generic function ‘change-class’ with two
2586arguments: the instance and the new class.
2587
2588   The default method for ‘change-class’ decides how to implement the
2589change of class by looking at the slot definitions for the instance’s
2590existing class and for the new class.  If the new class has slots with
2591the same name as slots in the existing class, the values for those slots
2592are preserved.  Slots that are present only in the existing class are
2593discarded.  Slots that are present only in the new class are initialized
2594using the corresponding slot definition’s init function (*note
2595slot-init-function: Classes.).
2596
2597 -- generic: change-class instance new-class
2598
2599 -- method: change-class (obj <object>) (new <redefinable-class>)
2600     Modify instance OBJ to make it an instance of class NEW.  OBJ
2601     itself must already be an instance of a redefinable class.
2602
2603     The value of each of OBJ’s slots is preserved only if a similarly
2604     named slot exists in NEW; any other slot values are discarded.
2605
2606     The slots in NEW that do not correspond to any of OBJ’s
2607     pre-existing slots are initialized according to NEW’s slot
2608     definitions’ init functions.
2609
2610   The default ‘change-class’ method also invokes another generic
2611function, ‘update-instance-for-different-class’, as the last thing that
2612it does before returning.  The applied
2613‘update-instance-for-different-class’ method can make any further
2614adjustments to NEW-INSTANCE that are required to complete or modify the
2615change of class.  The return value from the applied method is ignored.
2616
2617 -- generic: update-instance-for-different-class old-instance
2618          new-instance
2619     A generic function that can be customized to put finishing touches
2620     to an instance whose class has just been changed.  The default
2621     ‘update-instance-for-different-class’ method does nothing.
2622
2623   Customized change of class behaviour can be implemented by defining
2624‘change-class’ methods that are specialized either by the class of the
2625instances to be modified or by the metaclass of the new class.
2626
2627
2628File: guile.info,  Node: Guile Implementation,  Next: GNU Free Documentation License,  Prev: GOOPS,  Up: Top
2629
26309 Guile Implementation
2631**********************
2632
2633At some point, after one has been programming in Scheme for some time,
2634another level of Scheme comes into view: its implementation.  Knowledge
2635of how Scheme can be implemented turns out to be necessary to become an
2636expert hacker.  As Peter Norvig notes in his retrospective on PAIP(1),
2637“The expert Lisp programmer eventually develops a good ‘efficiency
2638model’.”
2639
2640   By this Norvig means that over time, the Lisp hacker eventually
2641develops an understanding of how much her code “costs” in terms of space
2642and time.
2643
2644   This chapter describes Guile as an implementation of Scheme: its
2645history, how it represents and evaluates its data, and its compiler.
2646This knowledge can help you to make that step from being one who is
2647merely familiar with Scheme to being a real hacker.
2648
2649* Menu:
2650
2651* History::                          A brief history of Guile.
2652* Data Representation::              How Guile represents Scheme data.
2653* A Virtual Machine for Guile::      How compiled procedures work.
2654* Compiling to the Virtual Machine:: Not as hard as you might think.
2655
2656   ---------- Footnotes ----------
2657
2658   (1) PAIP is the common abbreviation for ‘Paradigms of Artificial
2659Intelligence Programming’, an old but still useful text on Lisp.
2660Norvig’s retrospective sums up the lessons of PAIP, and can be found at
2661<http://norvig.com/Lisp-retro.html>.
2662
2663
2664File: guile.info,  Node: History,  Next: Data Representation,  Up: Guile Implementation
2665
26669.1 A Brief History of Guile
2667============================
2668
2669Guile is an artifact of historical processes, both as code and as a
2670community of hackers.  It is sometimes useful to know this history when
2671hacking the source code, to know about past decisions and future
2672directions.
2673
2674   Of course, the real history of Guile is written by the hackers
2675hacking and not the writers writing, so we round up the section with a
2676note on current status and future directions.
2677
2678* Menu:
2679
2680* The Emacs Thesis::
2681* Early Days::
2682* A Scheme of Many Maintainers::
2683* A Timeline of Selected Guile Releases::
2684* Status::
2685
2686
2687File: guile.info,  Node: The Emacs Thesis,  Next: Early Days,  Up: History
2688
26899.1.1 The Emacs Thesis
2690----------------------
2691
2692The story of Guile is the story of bringing the development experience
2693of Emacs to the mass of programs on a GNU system.
2694
2695   Emacs, when it was first created in its GNU form in 1984, was a new
2696take on the problem of “how to make a program”.  The Emacs thesis is
2697that it is delightful to create composite programs based on an
2698orthogonal kernel written in a low-level language together with a
2699powerful, high-level extension language.
2700
2701   Extension languages foster extensible programs, programs which adapt
2702readily to different users and to changing times.  Proof of this can be
2703seen in Emacs’ current and continued existence, spanning more than a
2704quarter-century.
2705
2706   Besides providing for modification of a program by others, extension
2707languages are good for _intension_ as well.  Programs built in “the
2708Emacs way” are pleasurable and easy for their authors to flesh out with
2709the features that they need.
2710
2711   After the Emacs experience was appreciated more widely, a number of
2712hackers started to consider how to spread this experience to the rest of
2713the GNU system.  It was clear that the easiest way to Emacsify a program
2714would be to embed a shared language implementation into it.
2715
2716
2717File: guile.info,  Node: Early Days,  Next: A Scheme of Many Maintainers,  Prev: The Emacs Thesis,  Up: History
2718
27199.1.2 Early Days
2720----------------
2721
2722Tom Lord was the first to fully concentrate his efforts on an embeddable
2723language runtime, which he named “GEL”, the GNU Extension Language.
2724
2725   GEL was the product of converting SCM, Aubrey Jaffer’s implementation
2726of Scheme, into something more appropriate to embedding as a library.
2727(SCM was itself based on an implementation by George Carrette, SIOD.)
2728
2729   Lord managed to convince Richard Stallman to dub GEL the official
2730extension language for the GNU project.  It was a natural fit, given
2731that Scheme was a cleaner, more modern Lisp than Emacs Lisp.  Part of
2732the argument was that eventually when GEL became more capable, it could
2733gain the ability to execute other languages, especially Emacs Lisp.
2734
2735   Due to a naming conflict with another programming language, Jim
2736Blandy suggested a new name for GEL: “Guile”.  Besides being a recursive
2737acronym, “Guile” craftily follows the naming of its ancestors,
2738“Planner”, “Conniver”, and “Schemer”.  (The latter was truncated to
2739“Scheme” due to a 6-character file name limit on an old operating
2740system.)  Finally, “Guile” suggests “guy-ell”, or “Guy L. Steele”, who,
2741together with Gerald Sussman, originally discovered Scheme.
2742
2743   Around the same time that Guile (then GEL) was readying itself for
2744public release, another extension language was gaining in popularity,
2745Tcl.  Many developers found advantages in Tcl because of its shell-like
2746syntax and its well-developed graphical widgets library, Tk.  Also, at
2747the time there was a large marketing push promoting Tcl as a “universal
2748extension language”.
2749
2750   Richard Stallman, as the primary author of GNU Emacs, had a
2751particular vision of what extension languages should be, and Tcl did not
2752seem to him to be as capable as Emacs Lisp.  He posted a criticism to
2753the comp.lang.tcl newsgroup, sparking one of the internet’s legendary
2754flamewars.  As part of these discussions, retrospectively dubbed the
2755“Tcl Wars”, he announced the Free Software Foundation’s intent to
2756promote Guile as the extension language for the GNU project.
2757
2758   It is a common misconception that Guile was created as a reaction to
2759Tcl.  While it is true that the public announcement of Guile happened at
2760the same time as the “Tcl wars”, Guile was created out of a condition
2761that existed outside the polemic.  Indeed, the need for a powerful
2762language to bridge the gap between extension of existing applications
2763and a more fully dynamic programming environment is still with us today.
2764
2765
2766File: guile.info,  Node: A Scheme of Many Maintainers,  Next: A Timeline of Selected Guile Releases,  Prev: Early Days,  Up: History
2767
27689.1.3 A Scheme of Many Maintainers
2769----------------------------------
2770
2771Surveying the field, it seems that Scheme implementations correspond
2772with their maintainers on an N-to-1 relationship.  That is to say, that
2773those people that implement Schemes might do so on a number of
2774occasions, but that the lifetime of a given Scheme is tied to the
2775maintainership of one individual.
2776
2777   Guile is atypical in this regard.
2778
2779   Tom Lord maintained Guile for its first year and a half or so,
2780corresponding to the end of 1994 through the middle of 1996.  The
2781releases made in this time constitute an arc from SCM as a standalone
2782program to Guile as a reusable, embeddable library, but passing through
2783a explosion of features: embedded Tcl and Tk, a toolchain for compiling
2784and disassembling Java, addition of a C-like syntax, creation of a
2785module system, and a start at a rich POSIX interface.
2786
2787   Only some of those features remain in Guile.  There were ongoing
2788tensions between providing a small, embeddable language, and one which
2789had all of the features (e.g. a graphical toolkit) that a modern Emacs
2790might need.  In the end, as Guile gained in uptake, the development team
2791decided to focus on depth, documentation and orthogonality rather than
2792on breadth.  This has been the focus of Guile ever since, although there
2793is a wide range of third-party libraries for Guile.
2794
2795   Jim Blandy presided over that period of stabilization, in the three
2796years until the end of 1999, when he too moved on to other projects.
2797Since then, Guile has had a group maintainership.  The first group was
2798Maciej Stachowiak, Mikael Djurfeldt, and Marius Vollmer, with Vollmer
2799staying on the longest.  By late 2007, Marius had mostly moved on to
2800other things, so Neil Jerram and Ludovic Courtès stepped up to take on
2801the primary maintenance responsibility.  Neil and Ludovic were joined by
2802Andy Wingo in late 2009, allowing Neil to step away, and Mark Weaver
2803joined shortly thereafter.  After spending more than 5 years in the
2804role, Mark stepped down as well, leaving Ludovic and Andy as the current
2805co-maintainers of Guile as of January 2020.
2806
2807   Of course, a large part of the actual work on Guile has come from
2808other contributors too numerous to mention, but without whom the world
2809would be a poorer place.
2810
2811
2812File: guile.info,  Node: A Timeline of Selected Guile Releases,  Next: Status,  Prev: A Scheme of Many Maintainers,  Up: History
2813
28149.1.4 A Timeline of Selected Guile Releases
2815-------------------------------------------
2816
2817guile-i — 4 February 1995
2818     SCM, turned into a library.
2819
2820guile-ii — 6 April 1995
2821     A low-level module system was added.  Tcl/Tk support was added,
2822     allowing extension of Scheme by Tcl or vice versa.  POSIX support
2823     was improved, and there was an experimental stab at Java
2824     integration.
2825
2826guile-iii — 18 August 1995
2827     The C-like syntax, ctax, was improved, but mostly this release
2828     featured a start at the task of breaking Guile into pieces.
2829
28301.0 — 5 January 1997
2831     ‘#f’ was distinguished from ‘'()’.  User-level, cooperative
2832     multi-threading was added.  Source-level debugging became more
2833     useful, and programmer’s and user’s manuals were begun.  The module
2834     system gained a high-level interface, which is still used today in
2835     more or less the same form.
2836
28371.1 — 16 May 1997
28381.2 — 24 June 1997
2839     Support for Tcl/Tk and ctax were split off as separate packages,
2840     and have remained there since.  Guile became more compatible with
2841     SCSH, and more useful as a UNIX scripting language.  Libguile could
2842     now be built as a shared library, and third-party extensions
2843     written in C became loadable via dynamic linking.
2844
28451.3.0 — 19 October 1998
2846     Command-line editing became much more pleasant through the use of
2847     the readline library.  The initial support for internationalization
2848     via multi-byte strings was removed; 10 years were to pass before
2849     proper internationalization would land again.  Initial Emacs Lisp
2850     support landed, ports gained better support for file descriptors,
2851     and fluids were added.
2852
28531.3.2 — 20 August 1999
28541.3.4 — 25 September 1999
28551.4 — 21 June 2000
2856     A long list of lispy features were added: hooks, Common Lisp’s
2857     ‘format’, optional and keyword procedure arguments, ‘getopt-long’,
2858     sorting, random numbers, and many other fixes and enhancements.
2859     Guile also gained an interactive debugger, interactive help, and
2860     better backtraces.
2861
28621.6 — 6 September 2002
2863     Guile gained support for the R5RS standard, and added a number of
2864     SRFI modules.  The module system was expanded with programmatic
2865     support for identifier selection and renaming.  The GOOPS object
2866     system was merged into Guile core.
2867
28681.8 — 20 February 2006
2869     Guile’s arbitrary-precision arithmetic switched to use the GMP
2870     library, and added support for exact rationals.  Guile’s embedded
2871     user-space threading was removed in favor of POSIX pre-emptive
2872     threads, providing true multiprocessing.  Gettext support was
2873     added, and Guile’s C API was cleaned up and orthogonalized in a
2874     massive way.
2875
28762.0 — 16 February 2010
2877     A virtual machine was added to Guile, along with the associated
2878     compiler and toolchain.  Support for internationalization was
2879     finally reimplemented, in terms of unicode, locales, and
2880     libunistring.  Running Guile instances became controllable and
2881     debuggable from within Emacs, via Geiser.  Guile caught up to
2882     features found in a number of other Schemes: SRFI-18 threads,
2883     module-hygienic macros, a profiler, tracer, and debugger, SSAX XML
2884     integration, bytevectors, a dynamic FFI, delimited continuations,
2885     module versions, and partial support for R6RS.
2886
28872.2 — 15 March 2017
2888     The virtual machine and introduced in 2.0 was completely rewritten,
2889     along with much of the compiler and toolchain.  This speeds up many
2890     Guile programs as well as reducing startup time and memory usage.
2891     Guile’s POSIX multithreading was improved, stacks became
2892     dynamically expandable, the ports facility gained support for
2893     non-blocking I/O.
2894
28953.0 – January 2020
2896     Guile gained support for native code generation via a simple
2897     just-in-time (JIT) compiler, further improving the speed of its
2898     virtual machine.  The compiler itself gained a number of new
2899     optimizations: inlining of top-level bindings, better closure
2900     optimization, and better unboxing of integer and floating-point
2901     values.  R7RS support was added, and R6RS support improved.  The
2902     exception facility (throw and catch) was rewritten in terms of
2903     SRFI-34 exception handlers.
2904
2905
2906File: guile.info,  Node: Status,  Prev: A Timeline of Selected Guile Releases,  Up: History
2907
29089.1.5 Status, or: Your Help Needed
2909----------------------------------
2910
2911Guile has achieved much of what it set out to achieve, but there is much
2912remaining to do.
2913
2914   There is still the old problem of bringing existing applications into
2915a more Emacs-like experience.  Guile has had some successes in this
2916respect, but still most applications in the GNU system are without Guile
2917integration.
2918
2919   Getting Guile to those applications takes an investment, the
2920“hacktivation energy” needed to wire Guile into a program that only pays
2921off once it is good enough to enable new kinds of behavior.  This would
2922be a great way for new hackers to contribute: take an application that
2923you use and that you know well, think of something that it can’t yet do,
2924and figure out a way to integrate Guile and implement that task in
2925Guile.
2926
2927   With time, perhaps this exposure can reverse itself, whereby programs
2928can run under Guile instead of vice versa, eventually resulting in the
2929Emacsification of the entire GNU system.  Indeed, this is the reason for
2930the naming of the many Guile modules that live in the ‘ice-9’ namespace,
2931a nod to the fictional substance in Kurt Vonnegut’s novel, Cat’s Cradle,
2932capable of acting as a seed crystal to crystallize the mass of software.
2933
2934   Implicit to this whole discussion is the idea that dynamic languages
2935are somehow better than languages like C. While languages like C have
2936their place, Guile’s take on this question is that yes, Scheme is more
2937expressive than C, and more fun to write.  This realization carries an
2938imperative with it to write as much code in Scheme as possible rather
2939than in other languages.
2940
2941   These days it is possible to write extensible applications almost
2942entirely from high-level languages, through byte-code and native
2943compilation, speed gains in the underlying hardware, and foreign call
2944interfaces in the high-level language.  Smalltalk systems are like this,
2945as are Common Lisp-based systems.  While there already are a number of
2946pure-Guile applications out there, in the past users have still needed
2947to drop down to C for some tasks: interfacing to system libraries that
2948don’t have prebuilt Guile interfaces, and for some tasks requiring high
2949performance.  With the arrival of native code generation via a JIT
2950compiler in Guile 3.0, most of these older applications can now be
2951updated to move more C code to Scheme.
2952
2953   Still, even with an all-Guile application, sometimes you want to
2954provide an opportunity for users to extend your program from a language
2955with a syntax that is closer to C, or to Python.  Another interesting
2956idea to consider is compiling e.g. Python to Guile.  It’s not that
2957far-fetched of an idea: see for example IronPython or JRuby.
2958
2959   Also, there’s Emacs itself.  Guile’s Emacs Lisp support has reached
2960an excellent level of correctness, robustness, and speed.  However there
2961is still work to do to finish its integration into Emacs itself.  This
2962will give lots of exciting things to Emacs: native threads, a real
2963object system, more sophisticated types, cleaner syntax, and access to
2964all of the Guile extensions.
2965
2966   Finally, so much of the world’s computation is performed in web
2967browsers that it makes sense to ask ourselves what the
2968Guile-on-the-web-client story is.  With the advent of WebAssembly, there
2969may finally be a reasonable compilation target that’s present on almost
2970all user-exposed devices.  Especially with the upcoming proposals to
2971allow for tail calls, delimited continuations, and GC-managed objects,
2972Scheme might once again have a place in the web browser.  Get to it!
2973
2974
2975File: guile.info,  Node: Data Representation,  Next: A Virtual Machine for Guile,  Prev: History,  Up: Guile Implementation
2976
29779.2 Data Representation
2978=======================
2979
2980Scheme is a latently-typed language; this means that the system cannot,
2981in general, determine the type of a given expression at compile time.
2982Types only become apparent at run time.  Variables do not have fixed
2983types; a variable may hold a pair at one point, an integer at the next,
2984and a thousand-element vector later.  Instead, values, not variables,
2985have fixed types.
2986
2987   In order to implement standard Scheme functions like ‘pair?’ and
2988‘string?’ and provide garbage collection, the representation of every
2989value must contain enough information to accurately determine its type
2990at run time.  Often, Scheme systems also use this information to
2991determine whether a program has attempted to apply an operation to an
2992inappropriately typed value (such as taking the ‘car’ of a string).
2993
2994   Because variables, pairs, and vectors may hold values of any type,
2995Scheme implementations use a uniform representation for values — a
2996single type large enough to hold either a complete value or a pointer to
2997a complete value, along with the necessary typing information.
2998
2999   The following sections will present a simple typing system, and then
3000make some refinements to correct its major weaknesses.  We then conclude
3001with a discussion of specific choices that Guile has made regarding
3002garbage collection and data representation.
3003
3004* Menu:
3005
3006* A Simple Representation::
3007* Faster Integers::
3008* Cheaper Pairs::
3009* Conservative GC::
3010* The SCM Type in Guile::
3011
3012
3013File: guile.info,  Node: A Simple Representation,  Next: Faster Integers,  Up: Data Representation
3014
30159.2.1 A Simple Representation
3016-----------------------------
3017
3018The simplest way to represent Scheme values in C would be to represent
3019each value as a pointer to a structure containing a type indicator,
3020followed by a union carrying the real value.  Assuming that ‘SCM’ is the
3021name of our universal type, we can write:
3022
3023     enum type { integer, pair, string, vector, ... };
3024
3025     typedef struct value *SCM;
3026
3027     struct value {
3028       enum type type;
3029       union {
3030         int integer;
3031         struct { SCM car, cdr; } pair;
3032         struct { int length; char *elts; } string;
3033         struct { int length; SCM  *elts; } vector;
3034         ...
3035       } value;
3036     };
3037   with the ellipses replaced with code for the remaining Scheme types.
3038
3039   This representation is sufficient to implement all of Scheme’s
3040semantics.  If X is an ‘SCM’ value:
3041   • To test if X is an integer, we can write ‘X->type == integer’.
3042   • To find its value, we can write ‘X->value.integer’.
3043   • To test if X is a vector, we can write ‘X->type == vector’.
3044   • If we know X is a vector, we can write ‘X->value.vector.elts[0]’ to
3045     refer to its first element.
3046   • If we know X is a pair, we can write ‘X->value.pair.car’ to extract
3047     its car.
3048
3049
3050File: guile.info,  Node: Faster Integers,  Next: Cheaper Pairs,  Prev: A Simple Representation,  Up: Data Representation
3051
30529.2.2 Faster Integers
3053---------------------
3054
3055Unfortunately, the above representation has a serious disadvantage.  In
3056order to return an integer, an expression must allocate a ‘struct
3057value’, initialize it to represent that integer, and return a pointer to
3058it.  Furthermore, fetching an integer’s value requires a memory
3059reference, which is much slower than a register reference on most
3060processors.  Since integers are extremely common, this representation is
3061too costly, in both time and space.  Integers should be very cheap to
3062create and manipulate.
3063
3064   One possible solution comes from the observation that, on many
3065architectures, heap-allocated data (i.e., what you get when you call
3066‘malloc’) must be aligned on an eight-byte boundary.  (Whether or not
3067the machine actually requires it, we can write our own allocator for
3068‘struct value’ objects that assures this is true.)  In this case, the
3069lower three bits of the structure’s address are known to be zero.
3070
3071   This gives us the room we need to provide an improved representation
3072for integers.  We make the following rules:
3073   • If the lower three bits of an ‘SCM’ value are zero, then the SCM
3074     value is a pointer to a ‘struct value’, and everything proceeds as
3075     before.
3076   • Otherwise, the ‘SCM’ value represents an integer, whose value
3077     appears in its upper bits.
3078
3079   Here is C code implementing this convention:
3080     enum type { pair, string, vector, ... };
3081
3082     typedef struct value *SCM;
3083
3084     struct value {
3085       enum type type;
3086       union {
3087         struct { SCM car, cdr; } pair;
3088         struct { int length; char *elts; } string;
3089         struct { int length; SCM  *elts; } vector;
3090         ...
3091       } value;
3092     };
3093
3094     #define POINTER_P(x) (((int) (x) & 7) == 0)
3095     #define INTEGER_P(x) (! POINTER_P (x))
3096
3097     #define GET_INTEGER(x)  ((int) (x) >> 3)
3098     #define MAKE_INTEGER(x) ((SCM) (((x) << 3) | 1))
3099
3100   Notice that ‘integer’ no longer appears as an element of ‘enum type’,
3101and the union has lost its ‘integer’ member.  Instead, we use the
3102‘POINTER_P’ and ‘INTEGER_P’ macros to make a coarse classification of
3103values into integers and non-integers, and do further type testing as
3104before.
3105
3106   Here’s how we would answer the questions posed above (again, assume X
3107is an ‘SCM’ value):
3108   • To test if X is an integer, we can write ‘INTEGER_P (X)’.
3109   • To find its value, we can write ‘GET_INTEGER (X)’.
3110   • To test if X is a vector, we can write:
3111            POINTER_P (X) && X->type == vector
3112     Given the new representation, we must make sure X is truly a
3113     pointer before we dereference it to determine its complete type.
3114   • If we know X is a vector, we can write ‘X->value.vector.elts[0]’ to
3115     refer to its first element, as before.
3116   • If we know X is a pair, we can write ‘X->value.pair.car’ to extract
3117     its car, just as before.
3118
3119   This representation allows us to operate more efficiently on integers
3120than the first.  For example, if X and Y are known to be integers, we
3121can compute their sum as follows:
3122     MAKE_INTEGER (GET_INTEGER (X) + GET_INTEGER (Y))
3123   Now, integer math requires no allocation or memory references.  Most
3124real Scheme systems actually implement addition and other operations
3125using an even more efficient algorithm, but this essay isn’t about
3126bit-twiddling.  (Hint: how do you decide when to overflow to a bignum?
3127How would you do it in assembly?)
3128
3129
3130File: guile.info,  Node: Cheaper Pairs,  Next: Conservative GC,  Prev: Faster Integers,  Up: Data Representation
3131
31329.2.3 Cheaper Pairs
3133-------------------
3134
3135However, there is yet another issue to confront.  Most Scheme heaps
3136contain more pairs than any other type of object; Jonathan Rees said at
3137one point that pairs occupy 45% of the heap in his Scheme
3138implementation, Scheme 48.  However, our representation above spends
3139three ‘SCM’-sized words per pair — one for the type, and two for the CAR
3140and CDR.  Is there any way to represent pairs using only two words?
3141
3142   Let us refine the convention we established earlier.  Let us assert
3143that:
3144   • If the bottom three bits of an ‘SCM’ value are ‘#b000’, then it is
3145     a pointer, as before.
3146   • If the bottom three bits are ‘#b001’, then the upper bits are an
3147     integer.  This is a bit more restrictive than before.
3148   • If the bottom two bits are ‘#b010’, then the value, with the bottom
3149     three bits masked out, is the address of a pair.
3150
3151   Here is the new C code:
3152     enum type { string, vector, ... };
3153
3154     typedef struct value *SCM;
3155
3156     struct value {
3157       enum type type;
3158       union {
3159         struct { int length; char *elts; } string;
3160         struct { int length; SCM  *elts; } vector;
3161         ...
3162       } value;
3163     };
3164
3165     struct pair {
3166       SCM car, cdr;
3167     };
3168
3169     #define POINTER_P(x) (((int) (x) & 7) == 0)
3170
3171     #define INTEGER_P(x)  (((int) (x) & 7) == 1)
3172     #define GET_INTEGER(x)  ((int) (x) >> 3)
3173     #define MAKE_INTEGER(x) ((SCM) (((x) << 3) | 1))
3174
3175     #define PAIR_P(x) (((int) (x) & 7) == 2)
3176     #define GET_PAIR(x) ((struct pair *) ((int) (x) & ~7))
3177
3178   Notice that ‘enum type’ and ‘struct value’ now only contain
3179provisions for vectors and strings; both integers and pairs have become
3180special cases.  The code above also assumes that an ‘int’ is large
3181enough to hold a pointer, which isn’t generally true.
3182
3183   Our list of examples is now as follows:
3184   • To test if X is an integer, we can write ‘INTEGER_P (X)’; this is
3185     as before.
3186   • To find its value, we can write ‘GET_INTEGER (X)’, as before.
3187   • To test if X is a vector, we can write:
3188            POINTER_P (X) && X->type == vector
3189     We must still make sure that X is a pointer to a ‘struct value’
3190     before dereferencing it to find its type.
3191   • If we know X is a vector, we can write ‘X->value.vector.elts[0]’ to
3192     refer to its first element, as before.
3193   • We can write ‘PAIR_P (X)’ to determine if X is a pair, and then
3194     write ‘GET_PAIR (X)->car’ to refer to its car.
3195
3196   This change in representation reduces our heap size by 15%.  It also
3197makes it cheaper to decide if a value is a pair, because no memory
3198references are necessary; it suffices to check the bottom two bits of
3199the ‘SCM’ value.  This may be significant when traversing lists, a
3200common activity in a Scheme system.
3201
3202   Again, most real Scheme systems use a slightly different
3203implementation; for example, if GET_PAIR subtracts off the low bits of
3204‘x’, instead of masking them off, the optimizer will often be able to
3205combine that subtraction with the addition of the offset of the
3206structure member we are referencing, making a modified pointer as fast
3207to use as an unmodified pointer.
3208
3209
3210File: guile.info,  Node: Conservative GC,  Next: The SCM Type in Guile,  Prev: Cheaper Pairs,  Up: Data Representation
3211
32129.2.4 Conservative Garbage Collection
3213-------------------------------------
3214
3215Aside from the latent typing, the major source of constraints on a
3216Scheme implementation’s data representation is the garbage collector.
3217The collector must be able to traverse every live object in the heap, to
3218determine which objects are not live, and thus collectable.
3219
3220   There are many ways to implement this.  Guile’s garbage collection is
3221built on a library, the Boehm-Demers-Weiser conservative garbage
3222collector (BDW-GC). The BDW-GC “just works”, for the most part.  But
3223since it is interesting to know how these things work, we include here a
3224high-level description of what the BDW-GC does.
3225
3226   Garbage collection has two logical phases: a “mark” phase, in which
3227the set of live objects is enumerated, and a “sweep” phase, in which
3228objects not traversed in the mark phase are collected.  Correct
3229functioning of the collector depends on being able to traverse the
3230entire set of live objects.
3231
3232   In the mark phase, the collector scans the system’s global variables
3233and the local variables on the stack to determine which objects are
3234immediately accessible by the C code.  It then scans those objects to
3235find the objects they point to, and so on.  The collector logically sets
3236a “mark bit” on each object it finds, so each object is traversed only
3237once.
3238
3239   When the collector can find no unmarked objects pointed to by marked
3240objects, it assumes that any objects that are still unmarked will never
3241be used by the program (since there is no path of dereferences from any
3242global or local variable that reaches them) and deallocates them.
3243
3244   In the above paragraphs, we did not specify how the garbage collector
3245finds the global and local variables; as usual, there are many different
3246approaches.  Frequently, the programmer must maintain a list of pointers
3247to all global variables that refer to the heap, and another list
3248(adjusted upon entry to and exit from each function) of local variables,
3249for the collector’s benefit.
3250
3251   The list of global variables is usually not too difficult to
3252maintain, since global variables are relatively rare.  However, an
3253explicitly maintained list of local variables (in the author’s personal
3254experience) is a nightmare to maintain.  Thus, the BDW-GC uses a
3255technique called “conservative garbage collection”, to make the local
3256variable list unnecessary.
3257
3258   The trick to conservative collection is to treat the C stack as an
3259ordinary range of memory, and assume that _every_ word on the C stack is
3260a pointer into the heap.  Thus, the collector marks all objects whose
3261addresses appear anywhere in the C stack, without knowing for sure how
3262that word is meant to be interpreted.
3263
3264   In addition to the stack, the BDW-GC will also scan static data
3265sections.  This means that global variables are also scanned when
3266looking for live Scheme objects.
3267
3268   Obviously, such a system will occasionally retain objects that are
3269actually garbage, and should be freed.  In practice, this is not a
3270problem, as the set of conservatively-scanned locations is fixed; the
3271Scheme stack is maintained apart from the C stack, and is scanned
3272precisely (as opposed to conservatively).  The GC-managed heap is also
3273partitioned into parts that can contain pointers (such as vectors) and
3274parts that can’t (such as bytevectors), limiting the potential for
3275confusing a raw integer with a pointer to a live object.
3276
3277   Interested readers should see the BDW-GC web page at
3278<http://www.hboehm.info/gc/>, for more information on conservative GC in
3279general and the BDW-GC implementation in particular.
3280
3281
3282File: guile.info,  Node: The SCM Type in Guile,  Prev: Conservative GC,  Up: Data Representation
3283
32849.2.5 The SCM Type in Guile
3285---------------------------
3286
3287Guile classifies Scheme objects into two kinds: those that fit entirely
3288within an ‘SCM’, and those that require heap storage.
3289
3290   The former class are called “immediates”.  The class of immediates
3291includes small integers, characters, boolean values, the empty list, the
3292mysterious end-of-file object, and some others.
3293
3294   The remaining types are called, not surprisingly, “non-immediates”.
3295They include pairs, procedures, strings, vectors, and all other data
3296types in Guile.  For non-immediates, the ‘SCM’ word contains a pointer
3297to data on the heap, with further information about the object in
3298question is stored in that data.
3299
3300   This section describes how the ‘SCM’ type is actually represented and
3301used at the C level.  Interested readers should see ‘libguile/scm.h’ for
3302an exposition of how Guile stores type information.
3303
3304   In fact, there are two basic C data types to represent objects in
3305Guile: ‘SCM’ and ‘scm_t_bits’.
3306
3307* Menu:
3308
3309* Relationship Between SCM and scm_t_bits::
3310* Immediate Objects::
3311* Non-Immediate Objects::
3312* Allocating Heap Objects::
3313* Heap Object Type Information::
3314* Accessing Heap Object Fields::
3315
3316
3317File: guile.info,  Node: Relationship Between SCM and scm_t_bits,  Next: Immediate Objects,  Up: The SCM Type in Guile
3318
33199.2.5.1 Relationship Between ‘SCM’ and ‘scm_t_bits’
3320...................................................
3321
3322A variable of type ‘SCM’ is guaranteed to hold a valid Scheme object.  A
3323variable of type ‘scm_t_bits’, on the other hand, may hold a
3324representation of a ‘SCM’ value as a C integral type, but may also hold
3325any C value, even if it does not correspond to a valid Scheme object.
3326
3327   For a variable X of type ‘SCM’, the Scheme object’s type information
3328is stored in a form that is not directly usable.  To be able to work on
3329the type encoding of the scheme value, the ‘SCM’ variable has to be
3330transformed into the corresponding representation as a ‘scm_t_bits’
3331variable Y by using the ‘SCM_UNPACK’ macro.  Once this has been done,
3332the type of the scheme object X can be derived from the content of the
3333bits of the ‘scm_t_bits’ value Y, in the way illustrated by the example
3334earlier in this chapter (*note Cheaper Pairs::).  Conversely, a valid
3335bit encoding of a Scheme value as a ‘scm_t_bits’ variable can be
3336transformed into the corresponding ‘SCM’ value using the ‘SCM_PACK’
3337macro.
3338
3339
3340File: guile.info,  Node: Immediate Objects,  Next: Non-Immediate Objects,  Prev: Relationship Between SCM and scm_t_bits,  Up: The SCM Type in Guile
3341
33429.2.5.2 Immediate Objects
3343.........................
3344
3345A Scheme object may either be an immediate, i.e. carrying all necessary
3346information by itself, or it may contain a reference to a “heap object”
3347which is, as the name implies, data on the heap.  Although in general it
3348should be irrelevant for user code whether an object is an immediate or
3349not, within Guile’s own code the distinction is sometimes of importance.
3350Thus, the following low level macro is provided:
3351
3352 -- Macro: int SCM_IMP (SCM X)
3353     A Scheme object is an immediate if it fulfills the ‘SCM_IMP’
3354     predicate, otherwise it holds an encoded reference to a heap
3355     object.  The result of the predicate is delivered as a C style
3356     boolean value.  User code and code that extends Guile should
3357     normally not be required to use this macro.
3358
3359Summary:
3360   • Given a Scheme object X of unknown type, check first with ‘SCM_IMP
3361     (X)’ if it is an immediate object.
3362   • If so, all of the type and value information can be determined from
3363     the ‘scm_t_bits’ value that is delivered by ‘SCM_UNPACK (X)’.
3364
3365   There are a number of special values in Scheme, most of them
3366documented elsewhere in this manual.  It’s not quite the right place to
3367put them, but for now, here’s a list of the C names given to some of
3368these values:
3369
3370 -- Macro: SCM SCM_EOL
3371     The Scheme empty list object, or “End Of List” object, usually
3372     written in Scheme as ‘'()’.
3373
3374 -- Macro: SCM SCM_EOF_VAL
3375     The Scheme end-of-file value.  It has no standard written
3376     representation, for obvious reasons.
3377
3378 -- Macro: SCM SCM_UNSPECIFIED
3379     The value returned by some (but not all) expressions that the
3380     Scheme standard says return an “unspecified” value.
3381
3382     This is sort of a weirdly literal way to take things, but the
3383     standard read-eval-print loop prints nothing when the expression
3384     returns this value, so it’s not a bad idea to return this when you
3385     can’t think of anything else helpful.
3386
3387 -- Macro: SCM SCM_UNDEFINED
3388     The “undefined” value.  Its most important property is that is not
3389     equal to any valid Scheme value.  This is put to various internal
3390     uses by C code interacting with Guile.
3391
3392     For example, when you write a C function that is callable from
3393     Scheme and which takes optional arguments, the interpreter passes
3394     ‘SCM_UNDEFINED’ for any arguments you did not receive.
3395
3396     We also use this to mark unbound variables.
3397
3398 -- Macro: int SCM_UNBNDP (SCM X)
3399     Return true if X is ‘SCM_UNDEFINED’.  Note that this is not a check
3400     to see if X is ‘SCM_UNBOUND’.  History will not be kind to us.
3401
3402
3403File: guile.info,  Node: Non-Immediate Objects,  Next: Allocating Heap Objects,  Prev: Immediate Objects,  Up: The SCM Type in Guile
3404
34059.2.5.3 Non-Immediate Objects
3406.............................
3407
3408A Scheme object of type ‘SCM’ that does not fulfill the ‘SCM_IMP’
3409predicate holds an encoded reference to a heap object.  This reference
3410can be decoded to a C pointer to a heap object using the
3411‘SCM_UNPACK_POINTER’ macro.  The encoding of a pointer to a heap object
3412into a ‘SCM’ value is done using the ‘SCM_PACK_POINTER’ macro.
3413
3414   Before Guile 2.0, Guile had a custom garbage collector that allocated
3415heap objects in units of 2-word “cells”.  With the move to the BDW-GC
3416collector in Guile 2.0, Guile can allocate heap objects of any size, and
3417the concept of a cell is now obsolete.  Still, we mention it here as the
3418name still appears in various low-level interfaces.
3419
3420 -- Macro: scm_t_bits * SCM_UNPACK_POINTER (SCM X)
3421 -- Macro: scm_t_cell * SCM2PTR (SCM X)
3422     Extract and return the heap object pointer from a non-immediate
3423     ‘SCM’ object X.  The name ‘SCM2PTR’ is deprecated but still common.
3424
3425 -- Macro: SCM_PACK_POINTER (scm_t_bits * X)
3426 -- Macro: SCM PTR2SCM (scm_t_cell * X)
3427     Return a ‘SCM’ value that encodes a reference to the heap object
3428     pointer X.  The name ‘PTR2SCM’ is deprecated but still common.
3429
3430   Note that it is also possible to transform a non-immediate ‘SCM’
3431value by using ‘SCM_UNPACK’ into a ‘scm_t_bits’ variable.  However, the
3432result of ‘SCM_UNPACK’ may not be used as a pointer to a heap object:
3433only ‘SCM_UNPACK_POINTER’ is guaranteed to transform a ‘SCM’ object into
3434a valid pointer to a heap object.  Also, it is not allowed to apply
3435‘SCM_PACK_POINTER’ to anything that is not a valid pointer to a heap
3436object.
3437
3438Summary:
3439   • Only use ‘SCM_UNPACK_POINTER’ on ‘SCM’ values for which ‘SCM_IMP’
3440     is false!
3441   • Don’t use ‘(scm_t_cell *) SCM_UNPACK (X)’!  Use ‘SCM_UNPACK_POINTER
3442     (X)’ instead!
3443   • Don’t use ‘SCM_PACK_POINTER’ for anything but a heap object
3444     pointer!
3445
3446
3447File: guile.info,  Node: Allocating Heap Objects,  Next: Heap Object Type Information,  Prev: Non-Immediate Objects,  Up: The SCM Type in Guile
3448
34499.2.5.4 Allocating Heap Objects
3450...............................
3451
3452Heap objects are heap-allocated data pointed to by non-immediate ‘SCM’
3453value.  The first word of the heap object should contain a type code.
3454The object may be any number of words in length, and is generally
3455scanned by the garbage collector for additional unless the object was
3456allocated using a “pointerless” allocation function.
3457
3458   You should generally not need these functions, unless you are
3459implementing a new data type, and thoroughly understand the code in
3460‘<libguile/scm.h>’.
3461
3462   If you just want to allocate pairs, use ‘scm_cons’.
3463
3464 -- Function: SCM scm_words (scm_t_bits word_0, uint32_t n_words)
3465     Allocate a new heap object containing N_WORDS, and initialize the
3466     first slot to WORD_0, and return a non-immediate ‘SCM’ value
3467     encoding a pointer to the object.  Typically WORD_0 will contain
3468     the type tag.
3469
3470   There are also deprecated but common variants of ‘scm_words’ that use
3471the term “cell” to indicate 2-word objects.
3472
3473 -- Function: SCM scm_cell (scm_t_bits word_0, scm_t_bits word_1)
3474     Allocate a new 2-word heap object, initialize the two slots with
3475     WORD_0 and WORD_1, and return it.  Just like calling ‘scm_words
3476     (WORD_0, 2)’, then initializing the second slot to WORD_1.
3477
3478     Note that WORD_0 and WORD_1 are of type ‘scm_t_bits’.  If you want
3479     to pass a ‘SCM’ object, you need to use ‘SCM_UNPACK’.
3480
3481 -- Function: SCM scm_double_cell (scm_t_bits word_0, scm_t_bits word_1,
3482          scm_t_bits word_2, scm_t_bits word_3)
3483     Like ‘scm_cell’, but allocates a 4-word heap object.
3484
3485
3486File: guile.info,  Node: Heap Object Type Information,  Next: Accessing Heap Object Fields,  Prev: Allocating Heap Objects,  Up: The SCM Type in Guile
3487
34889.2.5.5 Heap Object Type Information
3489....................................
3490
3491Heap objects contain a type tag and are followed by a number of
3492word-sized slots.  The interpretation of the object contents depends on
3493the type of the object.
3494
3495 -- Macro: scm_t_bits SCM_CELL_TYPE (SCM X)
3496     Extract the first word of the heap object pointed to by X.  This
3497     value holds the information about the cell type.
3498
3499 -- Macro: void SCM_SET_CELL_TYPE (SCM X, scm_t_bits T)
3500     For a non-immediate Scheme object X, write the value T into the
3501     first word of the heap object referenced by X.  The value T must
3502     hold a valid cell type.
3503
3504
3505File: guile.info,  Node: Accessing Heap Object Fields,  Prev: Heap Object Type Information,  Up: The SCM Type in Guile
3506
35079.2.5.6 Accessing Heap Object Fields
3508....................................
3509
3510For a non-immediate Scheme object X, the object type can be determined
3511by using the ‘SCM_CELL_TYPE’ macro described in the previous section.
3512For each different type of heap object it is known which fields hold
3513tagged Scheme objects and which fields hold untagged raw data.  To
3514access the different fields appropriately, the following macros are
3515provided.
3516
3517 -- Macro: scm_t_bits SCM_CELL_WORD (SCM X, unsigned int N)
3518 -- Macro: scm_t_bits SCM_CELL_WORD_0 (X)
3519 -- Macro: scm_t_bits SCM_CELL_WORD_1 (X)
3520 -- Macro: scm_t_bits SCM_CELL_WORD_2 (X)
3521 -- Macro: scm_t_bits SCM_CELL_WORD_3 (X)
3522     Deliver the field N of the heap object referenced by the
3523     non-immediate Scheme object X as raw untagged data.  Only use this
3524     macro for fields containing untagged data; don’t use it for fields
3525     containing tagged ‘SCM’ objects.
3526
3527 -- Macro: SCM SCM_CELL_OBJECT (SCM X, unsigned int N)
3528 -- Macro: SCM SCM_CELL_OBJECT_0 (SCM X)
3529 -- Macro: SCM SCM_CELL_OBJECT_1 (SCM X)
3530 -- Macro: SCM SCM_CELL_OBJECT_2 (SCM X)
3531 -- Macro: SCM SCM_CELL_OBJECT_3 (SCM X)
3532     Deliver the field N of the heap object referenced by the
3533     non-immediate Scheme object X as a Scheme object.  Only use this
3534     macro for fields containing tagged ‘SCM’ objects; don’t use it for
3535     fields containing untagged data.
3536
3537 -- Macro: void SCM_SET_CELL_WORD (SCM X, unsigned int N, scm_t_bits W)
3538 -- Macro: void SCM_SET_CELL_WORD_0 (X, W)
3539 -- Macro: void SCM_SET_CELL_WORD_1 (X, W)
3540 -- Macro: void SCM_SET_CELL_WORD_2 (X, W)
3541 -- Macro: void SCM_SET_CELL_WORD_3 (X, W)
3542     Write the raw value W into field number N of the heap object
3543     referenced by the non-immediate Scheme value X.  Values that are
3544     written into heap objects as raw values should only be read later
3545     using the ‘SCM_CELL_WORD’ macros.
3546
3547 -- Macro: void SCM_SET_CELL_OBJECT (SCM X, unsigned int N, SCM O)
3548 -- Macro: void SCM_SET_CELL_OBJECT_0 (SCM X, SCM O)
3549 -- Macro: void SCM_SET_CELL_OBJECT_1 (SCM X, SCM O)
3550 -- Macro: void SCM_SET_CELL_OBJECT_2 (SCM X, SCM O)
3551 -- Macro: void SCM_SET_CELL_OBJECT_3 (SCM X, SCM O)
3552     Write the Scheme object O into field number N of the heap object
3553     referenced by the non-immediate Scheme value X.  Values that are
3554     written into heap objects as objects should only be read using the
3555     ‘SCM_CELL_OBJECT’ macros.
3556
3557Summary:
3558   • For a non-immediate Scheme object X of unknown type, get the type
3559     information by using ‘SCM_CELL_TYPE (X)’.
3560   • As soon as the type information is available, only use the
3561     appropriate access methods to read and write data to the different
3562     heap object fields.
3563   • Note that field 0 stores the cell type information.  Generally
3564     speaking, other data associated with a heap object is stored
3565     starting from field 1.
3566
3567
3568File: guile.info,  Node: A Virtual Machine for Guile,  Next: Compiling to the Virtual Machine,  Prev: Data Representation,  Up: Guile Implementation
3569
35709.3 A Virtual Machine for Guile
3571===============================
3572
3573Enough about data—how does Guile run code?
3574
3575   Code is a grammatical production of a language.  Sometimes these
3576languages are implemented using interpreters: programs that run
3577along-side the program being interpreted, dynamically translating the
3578high-level code to low-level code.  Sometimes these languages are
3579implemented using compilers: programs that translate high-level programs
3580to equivalent low-level code, and pass on that low-level code to some
3581other language implementation.  Each of these languages can be thought
3582to be virtual machines: they offer programs an abstract machine on which
3583to run.
3584
3585   Guile implements a number of interpreters and compilers on different
3586language levels.  For example, there is an interpreter for the Scheme
3587language that is itself implemented as a Scheme program compiled to a
3588bytecode for a low-level virtual machine shipped with Guile.  That
3589virtual machine is implemented by both an interpreter—a C program that
3590interprets the bytecodes—and a compiler—a C program that dynamically
3591translates bytecode programs to native machine code(1).
3592
3593   This section describes the language implemented by Guile’s bytecode
3594virtual machine, as well as some examples of translations of Scheme
3595programs to Guile’s VM.
3596
3597* Menu:
3598
3599* Why a VM?::
3600* VM Concepts::
3601* Stack Layout::
3602* Variables and the VM::
3603* VM Programs::
3604* Object File Format::
3605* Instruction Set::
3606* Just-In-Time Native Code::
3607
3608   ---------- Footnotes ----------
3609
3610   (1) Even the lowest-level machine code can be thought to be
3611interpreted by the CPU, and indeed is often implemented by compiling
3612machine instructions to “micro-operations”.
3613
3614
3615File: guile.info,  Node: Why a VM?,  Next: VM Concepts,  Up: A Virtual Machine for Guile
3616
36179.3.1 Why a VM?
3618---------------
3619
3620For a long time, Guile only had a Scheme interpreter, implemented in C.
3621Guile’s interpreter operated directly on the S-expression representation
3622of Scheme source code.
3623
3624   But while the interpreter was highly optimized and hand-tuned, it
3625still performed many needless computations during the course of
3626evaluating a Scheme expression.  For example, application of a function
3627to arguments needlessly consed up the arguments in a list.  Evaluation
3628of an expression like ‘(f x y)’ always had to figure out whether F was a
3629procedure, or a special form like ‘if’, or something else.  The
3630interpreter represented the lexical environment as a heap data
3631structure, so every evaluation caused allocation, which was of course
3632slow.  Et cetera.
3633
3634   The solution to the slow-interpreter problem was to compile the
3635higher-level language, Scheme, into a lower-level language for which all
3636of the checks and dispatching have already been done—the code is instead
3637stripped to the bare minimum needed to “do the job”.
3638
3639   The question becomes then, what low-level language to choose?  There
3640are many options.  We could compile to native code directly, but that
3641poses portability problems for Guile, as it is a highly cross-platform
3642project.
3643
3644   So we want the performance gains that compilation provides, but we
3645also want to maintain the portability benefits of a single code path.
3646The obvious solution is to compile to a virtual machine that is present
3647on all Guile installations.
3648
3649   The easiest (and most fun) way to depend on a virtual machine is to
3650implement the virtual machine within Guile itself.  Guile contains a
3651bytecode interpreter (written in C) and a Scheme to bytecode compiler
3652(written in Scheme).  This way the virtual machine provides what Scheme
3653needs (tail calls, multiple values, ‘call/cc’) and can provide optimized
3654inline instructions for Guile as well (GC-managed allocations, type
3655checks, etc.).
3656
3657   Guile also includes a just-in-time (JIT) compiler to translate
3658bytecode to native code.  Because Guile embeds a portable code
3659generation library (<https://gitlab.com/wingo/lightening>), we keep the
3660benefits of portability while also benefitting from fast native code.
3661To avoid too much time spent in the JIT compiler itself, Guile is tuned
3662to only emit machine code for bytecode that is called often.
3663
3664   The rest of this section describes that VM that Guile implements, and
3665the compiled procedures that run on it.
3666
3667   Before moving on, though, we should note that though we spoke of the
3668interpreter in the past tense, Guile still has an interpreter.  The
3669difference is that before, it was Guile’s main Scheme implementation,
3670and so was implemented in highly optimized C; now, it is actually
3671implemented in Scheme, and compiled down to VM bytecode, just like any
3672other program.  (There is still a C interpreter around, used to
3673bootstrap the compiler, but it is not normally used at runtime.)
3674
3675   The upside of implementing the interpreter in Scheme is that we
3676preserve tail calls and multiple-value handling between interpreted and
3677compiled code, and with advent of the JIT compiler in Guile 3.0 we reach
3678the speed of the old hand-tuned C implementation; it’s the best of both
3679worlds.
3680
3681   Also note that this decision to implement a bytecode compiler does
3682not preclude ahead-of-time native compilation.  More possibilities are
3683discussed in *note Extending the Compiler::.
3684
3685
3686File: guile.info,  Node: VM Concepts,  Next: Stack Layout,  Prev: Why a VM?,  Up: A Virtual Machine for Guile
3687
36889.3.2 VM Concepts
3689-----------------
3690
3691The bytecode in a Scheme procedure is interpreted by a virtual machine
3692(VM). Each thread has its own instantiation of the VM. The virtual
3693machine executes the sequence of instructions in a procedure.
3694
3695   Each VM instruction starts by indicating which operation it is, and
3696then follows by encoding its source and destination operands.  Each
3697procedure declares that it has some number of local variables, including
3698the function arguments.  These local variables form the available
3699operands of the procedure, and are accessed by index.
3700
3701   The local variables for a procedure are stored on a stack.  Calling a
3702procedure typically enlarges the stack, and returning from a procedure
3703shrinks it.  Stack memory is exclusive to the virtual machine that owns
3704it.
3705
3706   In addition to their stacks, virtual machines also have access to the
3707global memory (modules, global bindings, etc) that is shared among other
3708parts of Guile, including other VMs.
3709
3710   The registers that a VM has are as follows:
3711
3712   • ip - Instruction pointer
3713   • sp - Stack pointer
3714   • fp - Frame pointer
3715
3716   In other architectures, the instruction pointer is sometimes called
3717the “program counter” (pc).  This set of registers is pretty typical for
3718virtual machines; their exact meanings in the context of Guile’s VM are
3719described in the next section.
3720
3721
3722File: guile.info,  Node: Stack Layout,  Next: Variables and the VM,  Prev: VM Concepts,  Up: A Virtual Machine for Guile
3723
37249.3.3 Stack Layout
3725------------------
3726
3727The stack of Guile’s virtual machine is composed of “frames”.  Each
3728frame corresponds to the application of one compiled procedure, and
3729contains storage space for arguments, local variables, and some
3730bookkeeping information (such as what to do after the frame is
3731finished).
3732
3733   While the compiler is free to do whatever it wants to, as long as the
3734semantics of a computation are preserved, in practice every time you
3735call a function, a new frame is created.  (The notable exception of
3736course is the tail call case, *note Tail Calls::.)
3737
3738   The structure of the top stack frame is as follows:
3739
3740        | ...previous frame locals...  |
3741        +==============================+ <- fp + 3
3742        | Dynamic link                 |
3743        +------------------------------+
3744        | Virtual return address (vRA) |
3745        +------------------------------+
3746        | Machine return address (mRA) |
3747        +==============================+ <- fp
3748        | Local 0                      |
3749        +------------------------------+
3750        | Local 1                      |
3751        +------------------------------+
3752        | ...                          |
3753        +------------------------------+
3754        | Local N-1                    |
3755        \------------------------------/ <- sp
3756
3757   In the above drawing, the stack grows downward.  At the beginning of
3758a function call, the procedure being applied is in local 0, followed by
3759the arguments from local 1.  After the procedure checks that it is being
3760passed a compatible set of arguments, the procedure allocates some
3761additional space in the frame to hold variables local to the function.
3762
3763   Note that once a value in a local variable slot is no longer needed,
3764Guile is free to re-use that slot.  This applies to the slots that were
3765initially used for the callee and arguments, too.  For this reason,
3766backtraces in Guile aren’t always able to show all of the arguments: it
3767could be that the slot corresponding to that argument was re-used by
3768some other variable.
3769
3770   The “virtual return address” is the ‘ip’ that was in effect before
3771this program was applied.  When we return from this activation frame, we
3772will jump back to this ‘ip’.  Likewise, the “dynamic link” is the offset
3773of the ‘fp’ that was in effect before this program was applied, relative
3774to the current ‘fp’.
3775
3776   There are two return addresses: the virtual return address (vRA), and
3777the machine return address (mRA). The vRA is always present and
3778indicates a bytecode address.  The mRA is only present when a call is
3779made from a function with machine code (e.g.  a function that has been
3780JIT-compiled).
3781
3782   To prepare for a non-tail application, Guile’s VM will emit code that
3783shuffles the function to apply and its arguments into appropriate stack
3784slots, with three free slots below them.  The call then initializes
3785those free slots to hold the machine return address (or NULL), the
3786virtual return address, and the offset to the previous frame pointer
3787(‘fp’).  It then gets the ‘ip’ for the function being called and adjusts
3788‘fp’ to point to the new call frame.
3789
3790   In this way, the dynamic link links the current frame to the previous
3791frame.  Computing a stack trace involves traversing these frames.
3792
3793   Each stack local in Guile is 64 bits wide, even on 32-bit
3794architectures.  This allows Guile to preserve its uniform treatment of
3795stack locals while allowing for unboxed arithmetic on 64-bit integers
3796and floating-point numbers.  *Note Instruction Set::, for more on
3797unboxed arithmetic.
3798
3799   As an implementation detail, we actually store the dynamic link as an
3800offset and not an absolute value because the stack can move at runtime
3801as it expands or during partial continuation calls.  If it were an
3802absolute value, we would have to walk the frames, relocating frame
3803pointers.
3804
3805
3806File: guile.info,  Node: Variables and the VM,  Next: VM Programs,  Prev: Stack Layout,  Up: A Virtual Machine for Guile
3807
38089.3.4 Variables and the VM
3809--------------------------
3810
3811Consider the following Scheme code as an example:
3812
3813       (define (foo a)
3814         (lambda (b) (vector foo a b)))
3815
3816   Within the lambda expression, ‘foo’ is a top-level variable, ‘a’ is a
3817lexically captured variable, and ‘b’ is a local variable.
3818
3819   Another way to refer to ‘a’ and ‘b’ is to say that ‘a’ is a “free”
3820variable, since it is not defined within the lambda, and ‘b’ is a
3821“bound” variable.  These are the terms used in the “lambda calculus”, a
3822mathematical notation for describing functions.  The lambda calculus is
3823useful because it is a language in which to reason precisely about
3824functions and variables.  It is especially good at describing scope
3825relations, and it is for that reason that we mention it here.
3826
3827   Guile allocates all variables on the stack.  When a lexically
3828enclosed procedure with free variables—a “closure”—is created, it copies
3829those variables into its free variable vector.  References to free
3830variables are then redirected through the free variable vector.
3831
3832   If a variable is ever ‘set!’, however, it will need to be
3833heap-allocated instead of stack-allocated, so that different closures
3834that capture the same variable can see the same value.  Also, this
3835allows continuations to capture a reference to the variable, instead of
3836to its value at one point in time.  For these reasons, ‘set!’ variables
3837are allocated in “boxes”—actually, in variable cells.  *Note
3838Variables::, for more information.  References to ‘set!’ variables are
3839indirected through the boxes.
3840
3841   Thus perhaps counterintuitively, what would seem “closer to the
3842metal”, viz ‘set!’, actually forces an extra memory allocation and
3843indirection.  Sometimes Guile’s optimizer can remove this allocation,
3844but not always.
3845
3846   Going back to our example, ‘b’ may be allocated on the stack, as it
3847is never mutated.
3848
3849   ‘a’ may also be allocated on the stack, as it too is never mutated.
3850Within the enclosed lambda, its value will be copied into (and
3851referenced from) the free variables vector.
3852
3853   ‘foo’ is a top-level variable, because ‘foo’ is not lexically bound
3854in this example.
3855
3856
3857File: guile.info,  Node: VM Programs,  Next: Object File Format,  Prev: Variables and the VM,  Up: A Virtual Machine for Guile
3858
38599.3.5 Compiled Procedures are VM Programs
3860-----------------------------------------
3861
3862By default, when you enter in expressions at Guile’s REPL, they are
3863first compiled to bytecode.  Then that bytecode is executed to produce a
3864value.  If the expression evaluates to a procedure, the result of this
3865process is a compiled procedure.
3866
3867   A compiled procedure is a compound object consisting of its bytecode
3868and a reference to any captured lexical variables.  In addition, when a
3869procedure is compiled, it has associated metadata written to side
3870tables, for instance a line number mapping, or its docstring.  You can
3871pick apart these pieces with the accessors in ‘(system vm program)’.
3872*Note Compiled Procedures::, for a full API reference.
3873
3874   A procedure may reference data that was statically allocated when the
3875procedure was compiled.  For example, a pair of immediate objects (*note
3876Immediate Objects::) can be allocated directly in the memory segment
3877that contains the compiled bytecode, and accessed directly by the
3878bytecode.
3879
3880   Another use for statically allocated data is to serve as a cache for
3881a bytecode.  Top-level variable lookups are handled in this way; the
3882first time a top-level binding is referenced, the resolved variable will
3883be stored in a cache.  Thereafter all access to the variable goes
3884through the cache cell.  The variable’s value may change in the future,
3885but the variable itself will not.
3886
3887   We can see how these concepts tie together by disassembling the ‘foo’
3888function we defined earlier to see what is going on:
3889
3890     scheme@(guile-user)> (define (foo a) (lambda (b) (vector foo a b)))
3891     scheme@(guile-user)> ,x foo
3892     Disassembly of #<procedure foo (a)> at #xf1da30:
3893
3894        0    (instrument-entry 164)                                at (unknown file):5:0
3895        2    (assert-nargs-ee/locals 2 1)    ;; 3 slots (1 arg)
3896        3    (allocate-words/immediate 2 3)                        at (unknown file):5:16
3897        4    (load-u64 0 0 65605)
3898        7    (word-set!/immediate 2 0 0)
3899        8    (load-label 0 7)                ;; anonymous procedure at #xf1da6c
3900       10    (word-set!/immediate 2 1 0)
3901       11    (scm-set!/immediate 2 2 1)
3902       12    (reset-frame 1)                 ;; 1 slot
3903       13    (handle-interrupts)
3904       14    (return-values)
3905
3906     ----------------------------------------
3907     Disassembly of anonymous procedure at #xf1da6c:
3908
3909        0    (instrument-entry 183)                                at (unknown file):5:16
3910        2    (assert-nargs-ee/locals 2 3)    ;; 5 slots (1 arg)
3911        3    (static-ref 2 152)              ;; #<variable 112e530 value: #<procedure foo (a)>>
3912        5    (immediate-tag=? 2 7 0)         ;; heap-object?
3913        7    (je 19)                         ;; -> L2
3914        8    (static-ref 2 119)              ;; #<directory (guile-user) ca9750>
3915       10    (static-ref 1 127)              ;; foo
3916       12    (call-scm<-scm-scm 2 2 1 40)
3917       14    (immediate-tag=? 2 7 0)         ;; heap-object?
3918       16    (jne 8)                         ;; -> L1
3919       17    (scm-ref/immediate 0 2 1)
3920       18    (immediate-tag=? 0 4095 2308)   ;; undefined?
3921       20    (je 4)                          ;; -> L1
3922       21    (static-set! 2 134)             ;; #<variable 112e530 value: #<procedure foo (a)>>
3923       23    (j 3)                           ;; -> L2
3924     L1:
3925       24    (throw/value 1 151)             ;; #(unbound-variable #f "Unbound variable: ~S")
3926     L2:
3927       26    (scm-ref/immediate 2 2 1)
3928       27    (allocate-words/immediate 1 4)                        at (unknown file):5:28
3929       28    (load-u64 0 0 781)
3930       31    (word-set!/immediate 1 0 0)
3931       32    (scm-set!/immediate 1 1 2)
3932       33    (scm-ref/immediate 4 4 2)
3933       34    (scm-set!/immediate 1 2 4)
3934       35    (scm-set!/immediate 1 3 3)
3935       36    (mov 4 1)
3936       37    (reset-frame 1)                 ;; 1 slot
3937       38    (handle-interrupts)
3938       39    (return-values)
3939
3940   The first thing to notice is that the bytecode is at a fairly low
3941level.  When a program is compiled from Scheme to bytecode, it is
3942expressed in terms of more primitive operations.  As such, there can be
3943more instructions than you might expect.
3944
3945   The first chunk of instructions is the outer ‘foo’ procedure.  It is
3946followed by the code for the contained closure.  The code can look
3947daunting at first glance, but with practice it quickly becomes
3948comprehensible, and indeed being able to read bytecode is an important
3949step to understanding the low-level performance of Guile programs.
3950
3951   The ‘foo’ function begins with a prelude.  The ‘instrument-entry’
3952bytecode increments a counter associated with the function.  If the
3953counter reaches a certain threshold, Guile will emit machine code
3954(“JIT-compile”) for ‘foo’.  Emitting machine code is fairly cheap but it
3955does take time, so it’s not something you want to do for every function.
3956Using a per-function counter and a global threshold allows Guile to
3957spend time JIT-compiling only the “hot” functions.
3958
3959   Next in the prelude is an argument-checking instruction, which checks
3960that it was called with only 1 argument (plus the callee function itself
3961makes 2) and then reserves stack space for an additional 1 local.
3962
3963   Then from ‘ip’ 3 to 11, we allocate a new closure by allocating a
3964three-word object, initializing its first word to store a type tag,
3965setting its second word to its code pointer, and finally at ‘ip’ 11,
3966storing local value 1 (the ‘a’ argument) into the third word (the first
3967free variable).
3968
3969   Before returning, ‘foo’ “resets the frame” to hold only one local
3970(the return value), runs any pending interrupts (*note Asyncs::) and
3971then returns.
3972
3973   Note that local variables in Guile’s virtual machine are usually
3974addressed relative to the stack pointer, which leads to a pleasantly
3975efficient ‘sp[N]’ access.  However it can make the disassembly hard to
3976read, because the ‘sp’ can change during the function, and because
3977incoming arguments are relative to the ‘fp’, not the ‘sp’.
3978
3979   To know what ‘fp’-relative slot corresponds to an ‘sp’-relative
3980reference, scan up in the disassembly until you get to a “N slots”
3981annotation; in our case, 3, indicating that the frame has space for 3
3982slots.  Thus a zero-indexed ‘sp’-relative slot of 2 corresponds to the
3983‘fp’-relative slot of 0, which initially held the value of the closure
3984being called.  This means that Guile doesn’t need the value of the
3985closure to compute its result, and so slot 0 was free for re-use, in
3986this case for the result of making a new closure.
3987
3988   A closure is code with data.  As you can see, making the closure
3989involved making an object (‘ip’ 3), putting a code pointer in it (‘ip’ 8
3990and 10), and putting in the closure’s free variable (‘ip’ 11).
3991
3992   The second stanza disassembles the code for the closure.  After the
3993prelude, all of the code between ‘ip’ 5 and 24 is related to loading the
3994toplevel variable ‘foo’ into slot 1.  This lookup happens only once, and
3995is associated with a cache; after the first run, the value in the cache
3996will be a bound variable, and the code will jump from ‘ip’ 7 to 26.  On
3997the first run, Guile gets the module associated with the function, calls
3998out to a run-time routine to look up the variable, and checks that the
3999variable is bound before initializing the cache.  Either way, ‘ip’ 26
4000dereferences the variable into local 2.
4001
4002   What follows is the allocation and initialization of the vector
4003return value.  ‘Ip’ 27 does the allocation, and the following two
4004instructions initialize the type-and-length tag for the object’s first
4005word.  ‘Ip’ 32 sets word 1 of the object (the first vector slot) to the
4006value of ‘foo’; ‘ip’ 33 fetches the closure variable for ‘a’, then in
4007‘ip’ 34 stores it in the second vector slot; and finally, in ‘ip’ 35,
4008local ‘b’ is stored to the third vector slot.  This is followed by the
4009return sequence.
4010
4011
4012File: guile.info,  Node: Object File Format,  Next: Instruction Set,  Prev: VM Programs,  Up: A Virtual Machine for Guile
4013
40149.3.6 Object File Format
4015------------------------
4016
4017To compile a file to disk, we need a format in which to write the
4018compiled code to disk, and later load it into Guile.  A good “object
4019file format” has a number of characteristics:
4020
4021   • Above all else, it should be very cheap to load a compiled file.
4022   • It should be possible to statically allocate constants in the file.
4023     For example, a bytevector literal in source code can be emitted
4024     directly into the object file.
4025   • The compiled file should enable maximum code and data sharing
4026     between different processes.
4027   • The compiled file should contain debugging information, such as
4028     line numbers, but that information should be separated from the
4029     code itself.  It should be possible to strip debugging information
4030     if space is tight.
4031
4032   These characteristics are not specific to Scheme.  Indeed, mainstream
4033languages like C and C++ have solved this issue many times in the past.
4034Guile builds on their work by adopting ELF, the object file format of
4035GNU and other Unix-like systems, as its object file format.  Although
4036Guile uses ELF on all platforms, we do not use platform support for ELF.
4037Guile implements its own linker and loader.  The advantage of using ELF
4038is not sharing code, but sharing ideas.  ELF is simply a well-designed
4039object file format.
4040
4041   An ELF file has two meta-tables describing its contents.  The first
4042meta-table is for the loader, and is called the “program table” or
4043sometimes the “segment table”.  The program table divides the file into
4044big chunks that should be treated differently by the loader.  Mostly the
4045difference between these “segments” is their permissions.
4046
4047   Typically all segments of an ELF file are marked as read-only, except
4048that part that represents modifiable static data or static data that
4049needs load-time initialization.  Loading an ELF file is as simple as
4050mmapping the thing into memory with read-only permissions, then using
4051the segment table to mark a small sub-region of the file as writable.
4052This writable section is typically added to the root set of the garbage
4053collector as well.
4054
4055   One ELF segment is marked as “dynamic”, meaning that it has data of
4056interest to the loader.  Guile uses this segment to record the Guile
4057version corresponding to this file.  There is also an entry in the
4058dynamic segment that points to the address of an initialization thunk
4059that is run to perform any needed link-time initialization.  (This is
4060like dynamic relocations for normal ELF shared objects, except that we
4061compile the relocations as a procedure instead of having the loader
4062interpret a table of relocations.)  Finally, the dynamic segment marks
4063the location of the “entry thunk” of the object file.  This thunk is
4064returned to the caller of ‘load-thunk-from-memory’ or
4065‘load-thunk-from-file’.  When called, it will execute the “body” of the
4066compiled expression.
4067
4068   The other meta-table in an ELF file is the “section table”.  Whereas
4069the program table divides an ELF file into big chunks for the loader,
4070the section table specifies small sections for use by introspective
4071tools like debuggers or the like.  One segment (program table entry)
4072typically contains many sections.  There may be sections outside of any
4073segment, as well.
4074
4075   Typical sections in a Guile ‘.go’ file include:
4076
4077‘.rtl-text’
4078     Bytecode.
4079‘.data’
4080     Data that needs initialization, or which may be modified at
4081     runtime.
4082‘.rodata’
4083     Statically allocated data that needs no run-time initialization,
4084     and which therefore can be shared between processes.
4085‘.dynamic’
4086     The dynamic section, discussed above.
4087‘.symtab’
4088‘.strtab’
4089     A table mapping addresses in the ‘.rtl-text’ to procedure names.
4090     ‘.strtab’ is used by ‘.symtab’.
4091.guile.procprops4092.guile.arities4093.guile.arities.strtab4094.guile.docstrs4095.guile.docstrs.strtab4096     Side tables of procedure properties, arities, and docstrings.
4097.guile.docstrs.strtab4098     Side table of frame maps, describing the set of live slots for ever
4099     return point in the program text, and whether those slots are
4100     pointers are not.  Used by the garbage collector.
4101‘.debug_info’
4102‘.debug_abbrev’
4103‘.debug_str’
4104‘.debug_loc’
4105‘.debug_line’
4106     Debugging information, in DWARF format.  See the DWARF
4107     specification, for more information.
4108‘.shstrtab’
4109     Section name string table.
4110
4111   For more information, see the elf(5) man page.  See the DWARF
4112specification (http://dwarfstd.org/) for more on the DWARF debugging
4113format.  Or if you are an adventurous explorer, try running ‘readelf’ or
4114‘objdump’ on compiled ‘.go’ files.  It’s good times!
4115
4116
4117File: guile.info,  Node: Instruction Set,  Next: Just-In-Time Native Code,  Prev: Object File Format,  Up: A Virtual Machine for Guile
4118
41199.3.7 Instruction Set
4120---------------------
4121
4122There are currently about 150 instructions in Guile’s virtual machine.
4123These instructions represent atomic units of a program’s execution.
4124Ideally, they perform one task without conditional branches, then
4125dispatch to the next instruction in the stream.
4126
4127   Instructions themselves are composed of 1 or more 32-bit units.  The
4128low 8 bits of the first word indicate the opcode, and the rest of
4129instruction describe the operands.  There are a number of different ways
4130operands can be encoded.
4131
4132‘sN’
4133     An unsigned N-bit integer, indicating the ‘sp’-relative index of a
4134     local variable.
4135‘fN’
4136     An unsigned N-bit integer, indicating the ‘fp’-relative index of a
4137     local variable.  Used when a continuation accepts a variable number
4138     of values, to shuffle received values into known locations in the
4139     frame.
4140‘cN’
4141     An unsigned N-bit integer, indicating a constant value.
4142‘l24’
4143     An offset from the current ‘ip’, in 32-bit units, as a signed
4144     24-bit value.  Indicates a bytecode address, for a relative jump.
4145‘zi16’
4146‘i16’
4147‘i32’
4148     An immediate Scheme value (*note Immediate Objects::), encoded
4149     directly in 16 or 32 bits.  ‘zi16’ is sign-extended; the others are
4150     zero-extended.
4151‘a32’
4152‘b32’
4153     An immediate Scheme value, encoded as a pair of 32-bit words.
4154     ‘a32’ and ‘b32’ values always go together on the same opcode, and
4155     indicate the high and low bits, respectively.  Normally only used
4156     on 64-bit systems.
4157‘n32’
4158     A statically allocated non-immediate.  The address of the
4159     non-immediate is encoded as a signed 32-bit integer, and indicates
4160     a relative offset in 32-bit units.  Think of it as ‘SCM x = ip +
4161     offset’.
4162‘r32’
4163     Indirect scheme value, like ‘n32’ but indirected.  Think of it as
4164     ‘SCM *x = ip + offset’.
4165‘l32’
4166‘lo32’
4167     An ip-relative address, as a signed 32-bit integer.  Could indicate
4168     a bytecode address, as in ‘make-closure’, or a non-immediate
4169     address, as with ‘static-patch!’.
4170
4171     ‘l32’ and ‘lo32’ are the same from the perspective of the virtual
4172     machine.  The difference is that an assembler might want to allow
4173     an ‘lo32’ address to be specified as a label and then some number
4174     of words offset from that label, for example when patching a field
4175     of a statically allocated object.
4176‘v32:x8-l24’
4177     Almost all VM instructions have a fixed size.  The ‘jtable’
4178     instruction used to perform optimized ‘case’ branches is an
4179     exception, which uses a ‘v32’ trailing word to indicate the number
4180     of additional words in the instruction, which themselves are
4181     encoded as ‘x8-l24’ values.
4182‘b1’
4183     A boolean value: 1 for true, otherwise 0.
4184‘xN’
4185     An ignored sequence of N bits.
4186
4187   An instruction is specified by giving its name, then describing its
4188operands.  The operands are packed by 32-bit words, with earlier
4189operands occupying the lower bits.
4190
4191   For example, consider the following instruction specification:
4192
4193 -- Instruction: call f24:PROC x8:_ c24:NLOCALS
4194
4195   The first word in the instruction will start with the 8-bit value
4196corresponding to the CALL opcode in the low bits, followed by PROC as a
419724-bit value.  The second word starts with 8 dead bits, followed by the
4198index as a 24-bit immediate value.
4199
4200   For instructions with operands that encode references to the stack,
4201the interpretation of those stack values is up to the instruction
4202itself.  Most instructions expect their operands to be tagged SCM values
4203(‘scm’ representation), but some instructions expect unboxed integers
4204(‘u64’ and ‘s64’ representations) or floating-point numbers (‘f64’
4205representation).  It is assumed that the bits for a ‘u64’ value are the
4206same as those for an ‘s64’ value, and that ‘s64’ values are stored in
4207two’s complement.
4208
4209   Instructions have static types: they must receive their operands in
4210the format they expect.  It’s up to the compiler to ensure this is the
4211case.
4212
4213   Unless otherwise mentioned, all operands and results are in the ‘scm’
4214representation.
4215
4216* Menu:
4217
4218* Call and Return Instructions::
4219* Function Prologue Instructions::
4220* Shuffling Instructions::
4221* Trampoline Instructions::
4222* Non-Local Control Flow Instructions::
4223* Instrumentation Instructions::
4224* Intrinsic Call Instructions::
4225* Constant Instructions::
4226* Memory Access Instructions::
4227* Atomic Memory Access Instructions::
4228* Tagging and Untagging Instructions::
4229* Integer Arithmetic Instructions::
4230* Floating-Point Arithmetic Instructions::
4231* Comparison Instructions::
4232* Branch Instructions::
4233* Raw Memory Access Instructions::
4234
4235
4236File: guile.info,  Node: Call and Return Instructions,  Next: Function Prologue Instructions,  Up: Instruction Set
4237
42389.3.7.1 Call and Return Instructions
4239....................................
4240
4241As described earlier (*note Stack Layout::), Guile’s calling convention
4242is that arguments are passed and values returned on the stack.
4243
4244   For calls, both in tail position and in non-tail position, we require
4245that the procedure and the arguments already be shuffled into place
4246before the call instruction.  “Into place” for a tail call means that
4247the procedure should be in slot 0, relative to the ‘fp’, and the
4248arguments should follow.  For a non-tail call, if the procedure is in
4249‘fp’-relative slot N, the arguments should follow from slot N+1, and
4250there should be three free slots between N-1 and N-3 in which to save
4251the mRA, vRA, and ‘fp’.
4252
4253   Returning values is similar.  Multiple-value returns should have
4254values already shuffled down to start from ‘fp’-relative slot 0 before
4255emitting ‘return-values’.
4256
4257   In both calls and returns, the ‘sp’ is used to indicate to the callee
4258or caller the number of arguments or return values, respectively.  After
4259receiving return values, it is the caller’s responsibility to “restore
4260the frame” by resetting the ‘sp’ to its former value.
4261
4262 -- Instruction: call f24:PROC x8:_ c24:NLOCALS
4263     Call a procedure.  PROC is the local corresponding to a procedure.
4264     The three values below PROC will be overwritten by the saved call
4265     frame data.  The new frame will have space for NLOCALS locals: one
4266     for the procedure, and the rest for the arguments which should
4267     already have been pushed on.
4268
4269     When the call returns, execution proceeds with the next
4270     instruction.  There may be any number of values on the return
4271     stack; the precise number can be had by subtracting the address of
4272     PROC-1 from the post-call ‘sp’.
4273
4274 -- Instruction: call-label f24:PROC x8:_ c24:NLOCALS l32:LABEL
4275     Call a procedure in the same compilation unit.
4276
4277     This instruction is just like ‘call’, except that instead of
4278     dereferencing PROC to find the call target, the call target is
4279     known to be at LABEL, a signed 32-bit offset in 32-bit units from
4280     the current ‘ip’.  Since PROC is not dereferenced, it may be some
4281     other representation of the closure.
4282
4283 -- Instruction: tail-call x24:_
4284     Tail-call a procedure.  Requires that the procedure and all of the
4285     arguments have already been shuffled into position, and that the
4286     frame has already been reset to the number of arguments to the
4287     call.
4288
4289 -- Instruction: tail-call-label x24:_ l32:LABEL
4290     Tail-call a known procedure.  As ‘call’ is to ‘call-label’,
4291     ‘tail-call’ is to ‘tail-call-label’.
4292
4293 -- Instruction: return-values x24:_
4294     Return a number of values from a call frame.  The return values
4295     should have already been shuffled down to a contiguous array
4296     starting at slot 0, and the frame already reset.
4297
4298 -- Instruction: receive f12:DST f12:PROC x8:_ c24:NLOCALS
4299     Receive a single return value from a call whose procedure was in
4300     PROC, asserting that the call actually returned at least one value.
4301     Afterwards, resets the frame to NLOCALS locals.
4302
4303 -- Instruction: receive-values f24:PROC b1:ALLOW-EXTRA? x7:_
4304          c24:NVALUES
4305     Receive a return of multiple values from a call whose procedure was
4306     in PROC.  If fewer than NVALUES values were returned, signal an
4307     error.  Unless ALLOW-EXTRA? is true, require that the number of
4308     return values equals NVALUES exactly.  After ‘receive-values’ has
4309     run, the values can be copied down via ‘mov’, or used in place.
4310
4311
4312File: guile.info,  Node: Function Prologue Instructions,  Next: Shuffling Instructions,  Prev: Call and Return Instructions,  Up: Instruction Set
4313
43149.3.7.2 Function Prologue Instructions
4315......................................
4316
4317A function call in Guile is very cheap: the VM simply hands control to
4318the procedure.  The procedure itself is responsible for asserting that
4319it has been passed an appropriate number of arguments.  This strategy
4320allows arbitrarily complex argument parsing idioms to be developed,
4321without harming the common case.
4322
4323   For example, only calls to keyword-argument procedures “pay” for the
4324cost of parsing keyword arguments.  (At the time of this writing,
4325calling procedures with keyword arguments is typically two to four times
4326as costly as calling procedures with a fixed set of arguments.)
4327
4328 -- Instruction: assert-nargs-ee c24:EXPECTED
4329 -- Instruction: assert-nargs-ge c24:EXPECTED
4330 -- Instruction: assert-nargs-le c24:EXPECTED
4331     If the number of actual arguments is not ‘==’, ‘>=’, or ‘<=’
4332     EXPECTED, respectively, signal an error.
4333
4334     The number of arguments is determined by subtracting the stack
4335     pointer from the frame pointer (‘fp - sp’).  *Note Stack Layout::,
4336     for more details on stack frames.  Note that EXPECTED includes the
4337     procedure itself.
4338
4339 -- Instruction: arguments<=? c24:EXPECTED
4340     Set the ‘LESS_THAN’, ‘EQUAL’, or ‘NONE’ comparison result values if
4341     the number of arguments is respectively less than, equal to, or
4342     greater than EXPECTED.
4343
4344 -- Instruction: positional-arguments<=? c24:NREQ x8:_ c24:EXPECTED
4345     Set the ‘LESS_THAN’, ‘EQUAL’, or ‘NONE’ comparison result values if
4346     the number of positional arguments is respectively less than, equal
4347     to, or greater than EXPECTED.  The first NREQ arguments are
4348     positional arguments, as are the subsequent arguments that are not
4349     keywords.
4350
4351   The ‘arguments<=?’ and ‘positional-arguments<=?’ instructions are
4352used to implement multiple arities, as in ‘case-lambda’.  *Note
4353Case-lambda::, for more information.  *Note Branch Instructions::, for
4354more on comparison results.
4355
4356 -- Instruction: bind-kwargs c24:NREQ c8:FLAGS c24:NREQ-AND-OPT x8:_
4357          c24:NTOTAL n32:KW-OFFSET
4358     FLAGS is a bitfield, whose lowest bit is ALLOW-OTHER-KEYS, second
4359     bit is HAS-REST, and whose following six bits are unused.
4360
4361     Find the last positional argument, and shuffle all the rest above
4362     NTOTAL.  Initialize the intervening locals to ‘SCM_UNDEFINED’.
4363     Then load the constant at KW-OFFSET words from the current IP, and
4364     use it and the ALLOW-OTHER-KEYS flag to bind keyword arguments.  If
4365     HAS-REST, collect all shuffled arguments into a list, and store it
4366     in NREQ-AND-OPT.  Finally, clear the arguments that we shuffled up.
4367
4368     The parsing is driven by a keyword arguments association list,
4369     looked up using KW-OFFSET.  The alist is a list of pairs of the
4370     form ‘(KW . INDEX)’, mapping keyword arguments to their local slot
4371     indices.  Unless ‘allow-other-keys’ is set, the parser will signal
4372     an error if an unknown key is found.
4373
4374     A macro-mega-instruction.
4375
4376 -- Instruction: bind-optionals f24:NLOCALS
4377     Expand the current frame to have at least NLOCALS locals, filling
4378     in any fresh values with ‘SCM_UNDEFINED’.  If the frame has more
4379     than NLOCALS locals, it is left as it is.
4380
4381 -- Instruction: bind-rest f24:DST
4382     Collect any arguments at or above DST into a list, and store that
4383     list at DST.
4384
4385 -- Instruction: alloc-frame c24:NLOCALS
4386     Ensure that there is space on the stack for NLOCALS local
4387     variables.  The value of any new local is undefined.
4388
4389 -- Instruction: reset-frame c24:NLOCALS
4390     Like ‘alloc-frame’, but doesn’t check that the stack is big enough,
4391     and doesn’t initialize values to ‘SCM_UNDEFINED’.  Used to reset
4392     the frame size to something less than the size that was previously
4393     set via alloc-frame.
4394
4395 -- Instruction: assert-nargs-ee/locals c12:EXPECTED c12:NLOCALS
4396     Equivalent to a sequence of ‘assert-nargs-ee’ and ‘allocate-frame’.
4397     The number of locals reserved is EXPECTED + NLOCALS.
4398
4399
4400File: guile.info,  Node: Shuffling Instructions,  Next: Trampoline Instructions,  Prev: Function Prologue Instructions,  Up: Instruction Set
4401
44029.3.7.3 Shuffling Instructions
4403..............................
4404
4405These instructions are used to move around values on the stack.
4406
4407 -- Instruction: mov s12:DST s12:SRC
4408 -- Instruction: long-mov s24:DST x8:_ s24:SRC
4409     Copy a value from one local slot to another.
4410
4411     As discussed previously, procedure arguments and local variables
4412     are allocated to local slots.  Guile’s compiler tries to avoid
4413     shuffling variables around to different slots, which often makes
4414     ‘mov’ instructions redundant.  However there are some cases in
4415     which shuffling is necessary, and in those cases, ‘mov’ is the
4416     thing to use.
4417
4418 -- Instruction: long-fmov f24:DST x8:_ f24:SRC
4419     Copy a value from one local slot to another, but addressing slots
4420     relative to the ‘fp’ instead of the ‘sp’.  This is used when
4421     shuffling values into place after multiple-value returns.
4422
4423 -- Instruction: push s24:SRC
4424     Bump the stack pointer by one word, and fill it with the value from
4425     slot SRC.  The offset to SRC is calculated before the stack pointer
4426     is adjusted.
4427
4428   The ‘push’ instruction is used when another instruction is unable to
4429address an operand because the operand is encoded with fewer than 24
4430bits.  In that case, Guile’s assembler will transparently emit code that
4431temporarily pushes any needed operands onto the stack, emits the
4432original instruction to address those now-near variables, then shuffles
4433the result (if any) back into place.
4434
4435 -- Instruction: pop s24:DST
4436     Pop the stack pointer, storing the value that was there in slot
4437     DST.  The offset to DST is calculated after the stack pointer is
4438     adjusted.
4439
4440 -- Instruction: drop c24:COUNT
4441     Pop the stack pointer by COUNT words, discarding any values that
4442     were stored there.
4443
4444 -- Instruction: shuffle-down f12:FROM f12:TO
4445     Shuffle down values from FROM to TO, reducing the frame size by
4446     FROM-TO slots.  Part of the internal implementation of
4447     ‘call-with-values’, ‘values’, and ‘apply’.
4448
4449 -- Instruction: expand-apply-argument x24:_
4450     Take the last local in a frame and expand it out onto the stack, as
4451     for the last argument to ‘apply’.
4452
4453
4454File: guile.info,  Node: Trampoline Instructions,  Next: Non-Local Control Flow Instructions,  Prev: Shuffling Instructions,  Up: Instruction Set
4455
44569.3.7.4 Trampoline Instructions
4457...............................
4458
4459Though most applicable objects in Guile are procedures implemented in
4460bytecode, not all are.  There are primitives, continuations, and other
4461procedure-like objects that have their own calling convention.  Instead
4462of adding special cases to the ‘call’ instruction, Guile wraps these
4463other applicable objects in VM trampoline procedures, then provides
4464special support for these objects in bytecode.
4465
4466   Trampoline procedures are typically generated by Guile at runtime,
4467for example in response to a call to ‘scm_c_make_gsubr’.  As such, a
4468compiler probably shouldn’t emit code with these instructions.  However,
4469it’s still interesting to know how these things work, so we document
4470these trampoline instructions here.
4471
4472 -- Instruction: subr-call c24:IDX
4473     Call a subr, passing all locals in this frame as arguments, and
4474     storing the results on the stack, ready to be returned.
4475
4476 -- Instruction: foreign-call c12:CIF-IDX c12:PTR-IDX
4477     Call a foreign function.  Fetch the CIF and foreign pointer from
4478     CIF-IDX and PTR-IDX closure slots of the callee.  Arguments are
4479     taken from the stack, and results placed on the stack, ready to be
4480     returned.
4481
4482 -- Instruction: builtin-ref s12:DST c12:IDX
4483     Load a builtin stub by index into DST.
4484
4485
4486File: guile.info,  Node: Non-Local Control Flow Instructions,  Next: Instrumentation Instructions,  Prev: Trampoline Instructions,  Up: Instruction Set
4487
44889.3.7.5 Non-Local Control Flow Instructions
4489...........................................
4490
4491 -- Instruction: capture-continuation s24:DST
4492     Capture the current continuation, and write it to DST.  Part of the
4493     implementation of ‘call/cc’.
4494
4495 -- Instruction: continuation-call c24:CONTREGS
4496     Return to a continuation, nonlocally.  The arguments to the
4497     continuation are taken from the stack.  CONTREGS is a free variable
4498     containing the reified continuation.
4499
4500 -- Instruction: abort x24:_
4501     Abort to a prompt handler.  The tag is expected in slot 1, and the
4502     rest of the values in the frame are returned to the prompt handler.
4503     This corresponds to a tail application of ‘abort-to-prompt’.
4504
4505     If no prompt can be found in the dynamic environment with the given
4506     tag, an error is signalled.  Otherwise all arguments are passed to
4507     the prompt’s handler, along with the captured continuation, if
4508     necessary.
4509
4510     If the prompt’s handler can be proven to not reference the captured
4511     continuation, no continuation is allocated.  This decision happens
4512     dynamically, at run-time; the general case is that the continuation
4513     may be captured, and thus resumed.  A reinstated continuation will
4514     have its arguments pushed on the stack from slot 0, as if from a
4515     multiple-value return, and control resumes in the caller.  Thus to
4516     the calling function, a call to ‘abort-to-prompt’ looks like any
4517     other function call.
4518
4519 -- Instruction: compose-continuation c24:CONT
4520     Compose a partial continuation with the current continuation.  The
4521     arguments to the continuation are taken from the stack.  CONT is a
4522     free variable containing the reified continuation.
4523
4524 -- Instruction: prompt s24:TAG b1:ESCAPE-ONLY? x7:_ f24:PROC-SLOT x8:_
4525          l24:HANDLER-OFFSET
4526     Push a new prompt on the dynamic stack, with a tag from TAG and a
4527     handler at HANDLER-OFFSET words from the current IP.
4528
4529     If an abort is made to this prompt, control will jump to the
4530     handler.  The handler will expect a multiple-value return as if
4531     from a call with the procedure at PROC-SLOT, with the reified
4532     partial continuation as the first argument, followed by the values
4533     returned to the handler.  If control returns to the handler, the
4534     prompt is already popped off by the abort mechanism.  (Guile’s
4535     ‘prompt’ implements Felleisen’s “–F–” operator.)
4536
4537     If ESCAPE-ONLY? is nonzero, the prompt will be marked as
4538     escape-only, which allows an abort to this prompt to avoid reifying
4539     the continuation.
4540
4541     *Note Prompts::, for more information on prompts.
4542
4543 -- Instruction: throw s12:KEY s12:ARGS
4544     Raise an error by throwing to KEY and ARGS.  ARGS should be a list.
4545
4546 -- Instruction: throw/value s24:VALUE n32:KEY-SUBR-AND-MESSAGE
4547 -- Instruction: throw/value+data s24:VALUE n32:KEY-SUBR-AND-MESSAGE
4548     Raise an error, indicating VAL as the bad value.
4549     KEY-SUBR-AND-MESSAGE should be a vector, where the first element is
4550     the symbol to which to throw, the second is the procedure in which
4551     to signal the error (a string) or ‘#f’, and the third is a format
4552     string for the message, with one template.  These instructions do
4553     not fall through.
4554
4555     Both of these instructions throw to a key with four arguments: the
4556     procedure that indicates the error (or ‘#f’, the format string, a
4557     list with VALUE, and either ‘#f’ or the list with VALUE as the last
4558     argument respectively.
4559
4560
4561File: guile.info,  Node: Instrumentation Instructions,  Next: Intrinsic Call Instructions,  Prev: Non-Local Control Flow Instructions,  Up: Instruction Set
4562
45639.3.7.6 Instrumentation Instructions
4564....................................
4565
4566 -- Instruction: instrument-entry x24__ n32:DATA
4567 -- Instruction: instrument-loop x24__ n32:DATA
4568     Increase execution counter for this function and potentially tier
4569     up to the next JIT level.  DATA is an offset to a structure
4570     recording execution counts and the next-level JIT code
4571     corresponding to this function.  The increment values are currently
4572     30 for ‘instrument-entry’ and 2 for ‘instrument-loop’.
4573
4574     ‘instrument-entry’ will also run the apply hook, if VM hooks are
4575     enabled.
4576
4577 -- Instruction: handle-interrupts x24:_
4578     Handle pending asynchronous interrupts (asyncs).  *Note Asyncs::.
4579     The compiler inserts ‘handle-interrupts’ instructions before any
4580     call, return, or loop back-edge.
4581
4582 -- Instruction: return-from-interrupt x24:_
4583     A special instruction to return from a call and also pop off the
4584     stack frame from the call.  Used when returning from asynchronous
4585     interrupts.
4586
4587
4588File: guile.info,  Node: Intrinsic Call Instructions,  Next: Constant Instructions,  Prev: Instrumentation Instructions,  Up: Instruction Set
4589
45909.3.7.7 Intrinsic Call Instructions
4591...................................
4592
4593Guile’s instruction set is low-level.  This is good because the separate
4594components of, say, a ‘vector-ref’ operation might be able to be
4595optimized out, leaving only the operations that need to be performed at
4596run-time.
4597
4598   However some macro-operations may need to perform large amounts of
4599computation at run-time to handle all the edge cases, and whose
4600micro-operation components aren’t amenable to optimization.
4601Residualizing code for the entire macro-operation would lead to code
4602bloat with no benefit.
4603
4604   In this kind of a case, Guile’s VM calls out to “intrinsics”:
4605run-time routines written in the host language (currently C, possibly
4606more in the future if Guile gains more run-time targets like
4607WebAssembly).  There is one instruction for each instrinsic prototype;
4608the intrinsic is specified by index in the instruction.
4609
4610 -- Instruction: call-thread x24:_ c32:IDX
4611     Call the ‘void’-returning instrinsic with index IDX, passing the
4612     current ‘scm_thread*’ as the argument.
4613
4614 -- Instruction: call-thread-scm s24:A c32:IDX
4615     Call the ‘void’-returning instrinsic with index IDX, passing the
4616     current ‘scm_thread*’ and the ‘scm’ local A as arguments.
4617
4618 -- Instruction: call-thread-scm-scm s12:A s12:B c32:IDX
4619     Call the ‘void’-returning instrinsic with index IDX, passing the
4620     current ‘scm_thread*’ and the ‘scm’ locals A and B as arguments.
4621
4622 -- Instruction: call-scm-sz-u32 s12:A s12:B c32:IDX
4623     Call the ‘void’-returning instrinsic with index IDX, passing the
4624     locals A, B, and C as arguments.  A is a ‘scm’ value, while B and C
4625     are raw ‘u64’ values which fit into ‘size_t’ and ‘uint32_t’ types,
4626     respectively.
4627
4628 -- Instruction: call-scm<-u64 s24:DST c32:IDX
4629     Call the ‘SCM’-returning instrinsic with index IDX, passing the
4630     current ‘scm_thread*’ as the argument.  Place the result in DST.
4631
4632 -- Instruction: call-scm<-u64 s12:DST s12:A c32:IDX
4633     Call the ‘SCM’-returning instrinsic with index IDX, passing ‘u64’
4634     local A as the argument.  Place the result in DST.
4635
4636 -- Instruction: call-scm<-s64 s12:DST s12:A c32:IDX
4637     Call the ‘SCM’-returning instrinsic with index IDX, passing ‘s64’
4638     local A as the argument.  Place the result in DST.
4639
4640 -- Instruction: call-scm<-scm s12:DST s12:A c32:IDX
4641     Call the ‘SCM’-returning instrinsic with index IDX, passing ‘scm’
4642     local A as the argument.  Place the result in DST.
4643
4644 -- Instruction: call-u64<-scm s12:DST s12:A c32:IDX
4645     Call the ‘uint64_t’-returning instrinsic with index IDX, passing
4646     ‘scm’ local A as the argument.  Place the ‘u64’ result in DST.
4647
4648 -- Instruction: call-s64<-scm s12:DST s12:A c32:IDX
4649     Call the ‘int64_t’-returning instrinsic with index IDX, passing
4650     ‘scm’ local A as the argument.  Place the ‘s64’ result in DST.
4651
4652 -- Instruction: call-f64<-scm s12:DST s12:A c32:IDX
4653     Call the ‘double’-returning instrinsic with index IDX, passing
4654     ‘scm’ local A as the argument.  Place the ‘f64’ result in DST.
4655
4656 -- Instruction: call-scm<-scm-scm s8:DST s8:A s8:B c32:IDX
4657     Call the ‘SCM’-returning instrinsic with index IDX, passing ‘scm’
4658     locals A and B as arguments.  Place the ‘scm’ result in DST.
4659
4660 -- Instruction: call-scm<-scm-uimm s8:DST s8:A c8:B c32:IDX
4661     Call the ‘SCM’-returning instrinsic with index IDX, passing ‘scm’
4662     local A and ‘uint8_t’ immediate B as arguments.  Place the ‘scm’
4663     result in DST.
4664
4665 -- Instruction: call-scm<-thread-scm s12:DST s12:A c32:IDX
4666     Call the ‘SCM’-returning instrinsic with index IDX, passing the
4667     current ‘scm_thread*’ and ‘scm’ local A as arguments.  Place the
4668     ‘scm’ result in DST.
4669
4670 -- Instruction: call-scm<-scm-u64 s8:DST s8:A s8:B c32:IDX
4671     Call the ‘SCM’-returning instrinsic with index IDX, passing ‘scm’
4672     local A and ‘u64’ local B as arguments.  Place the ‘scm’ result in
4673     DST.
4674
4675 -- Instruction: call-scm-scm s12:A s12:B c32:IDX
4676     Call the ‘void’-returning instrinsic with index IDX, passing ‘scm’
4677     locals A and B as arguments.
4678
4679 -- Instruction: call-scm-scm-scm s8:A s8:B s8:C c32:IDX
4680     Call the ‘void’-returning instrinsic with index IDX, passing ‘scm’
4681     locals A, B, and C as arguments.
4682
4683 -- Instruction: call-scm-uimm-scm s8:A c8:B s8:C c32:IDX
4684     Call the ‘void’-returning instrinsic with index IDX, passing ‘scm’
4685     local A, ‘uint8_t’ immediate B, and ‘scm’ local C as arguments.
4686
4687   There are corresponding macro-instructions for specific intrinsics.
4688These are equivalent to ‘call-INSTRINSIC-KIND’ instructions with the
4689appropriate intrinsic IDX arguments.
4690
4691 -- Macro Instruction: add dst a b
4692 -- Macro Instruction: add/immediate dst a b/imm
4693     Add ‘SCM’ values A and B and place the result in DST.
4694 -- Macro Instruction: sub dst a b
4695 -- Macro Instruction: sub/immediate dst a b/imm
4696     Subtract ‘SCM’ value B from A and place the result in DST.
4697 -- Macro Instruction: mul dst a b
4698     Multiply ‘SCM’ values A and B and place the result in DST.
4699 -- Macro Instruction: div dst a b
4700     Divide ‘SCM’ value A by B and place the result in DST.
4701 -- Macro Instruction: quo dst a b
4702     Compute the quotient of ‘SCM’ values A and B and place the result
4703     in DST.
4704 -- Macro Instruction: rem dst a b
4705     Compute the remainder of ‘SCM’ values A and B and place the result
4706     in DST.
4707 -- Macro Instruction: mod dst a b
4708     Compute the modulo of ‘SCM’ value A by B and place the result in
4709     DST.
4710 -- Macro Instruction: logand dst a b
4711     Compute the bitwise ‘and’ of ‘SCM’ values A and B and place the
4712     result in DST.
4713 -- Macro Instruction: logior dst a b
4714     Compute the bitwise inclusive ‘or’ of ‘SCM’ values A and B and
4715     place the result in DST.
4716 -- Macro Instruction: logxor dst a b
4717     Compute the bitwise exclusive ‘or’ of ‘SCM’ values A and B and
4718     place the result in DST.
4719 -- Macro Instruction: logsub dst a b
4720     Compute the bitwise ‘and’ of ‘SCM’ value A and the bitwise ‘not’ of
4721     B and place the result in DST.
4722 -- Macro Instruction: lsh dst a b
4723 -- Macro Instruction: lsh/immediate a b/imm
4724     Shift ‘SCM’ value A left by ‘u64’ value B bits and place the result
4725     in DST.
4726 -- Macro Instruction: rsh dst a b
4727 -- Macro Instruction: rsh/immediate dst a b/imm
4728     Shifts ‘SCM’ value A right by ‘u64’ value B bits and place the
4729     result in DST.
4730 -- Macro Instruction: scm->f64 dst src
4731     Convert SRC to an unboxed ‘f64’ and place the result in DST, or
4732     raises an error if SRC is not a real number.
4733 -- Macro Instruction: scm->u64 dst src
4734     Convert SRC to an unboxed ‘u64’ and place the result in DST, or
4735     raises an error if SRC is not an integer within range.
4736 -- Macro Instruction: scm->u64/truncate dst src
4737     Convert SRC to an unboxed ‘u64’ and place the result in DST,
4738     truncating to the low 64 bits, or raises an error if SRC is not an
4739     integer.
4740 -- Macro Instruction: scm->s64 dst src
4741     Convert SRC to an unboxed ‘s64’ and place the result in DST, or
4742     raises an error if SRC is not an integer within range.
4743 -- Macro Instruction: u64->scm dst src
4744     Convert U64 value SRC to a Scheme integer in DST.
4745 -- Macro Instruction: s64->scm scm<-s64
4746     Convert S64 value SRC to a Scheme integer in DST.
4747 -- Macro Instruction: string-set! str idx ch
4748     Sets the character IDX (a ‘u64’) of string STR to CH (a ‘u64’ that
4749     is a valid character value).
4750 -- Macro Instruction: string->number dst src
4751     Call ‘string->number’ on SRC and place the result in DST.
4752 -- Macro Instruction: string->symbol dst src
4753     Call ‘string->symbol’ on SRC and place the result in DST.
4754 -- Macro Instruction: symbol->keyword dst src
4755     Call ‘symbol->keyword’ on SRC and place the result in DST.
4756 -- Macro Instruction: class-of dst src
4757     Set DST to the GOOPS class of ‘src’.
4758 -- Macro Instruction: wind winder unwinder
4759     Push wind and unwind procedures onto the dynamic stack.  Note that
4760     neither are actually called; the compiler should emit calls to
4761     WINDER and UNWINDER for the normal dynamic-wind control flow.  Also
4762     note that the compiler should have inserted checks that WINDER and
4763     UNWINDER are thunks, if it could not prove that to be the case.
4764     *Note Dynamic Wind::.
4765 -- Macro Instruction: unwind
4766     Exit from the dynamic extent of an expression, popping the top
4767     entry off of the dynamic stack.
4768 -- Macro Instruction: push-fluid fluid value
4769     Dynamically bind VALUE to FLUID by creating a with-fluids object,
4770     pushing that object on the dynamic stack.  *Note Fluids and Dynamic
4771     States::.
4772 -- Macro Instruction: pop-fluid
4773     Leave the dynamic extent of a ‘with-fluid*’ expression, restoring
4774     the fluid to its previous value.  ‘push-fluid’ should always be
4775     balanced with ‘pop-fluid’.
4776 -- Macro Instruction: fluid-ref dst fluid
4777     Place the value associated with the fluid FLUID in DST.
4778 -- Macro Instruction: fluid-set! fluid value
4779     Set the value of the fluid FLUID to VALUE.
4780 -- Macro Instruction: push-dynamic-state state
4781     Save the current set of fluid bindings on the dynamic stack and
4782     instate the bindings from STATE instead.  *Note Fluids and Dynamic
4783     States::.
4784 -- Macro Instruction: pop-dynamic-state
4785     Restore a saved set of fluid bindings from the dynamic stack.
4786     ‘push-dynamic-state’ should always be balanced with
4787     ‘pop-dynamic-state’.
4788 -- Macro Instruction: resolve-module dst name public?
4789     Look up the module named NAME, resolve its public interface if the
4790     immediate operand PUBLIC? is true, then place the result in DST.
4791 -- Macro Instruction: lookup dst mod sym
4792     Look up SYM in module MOD, placing the resulting variable (or ‘#f’
4793     if not found) in DST.
4794 -- Macro Instruction: define! dst mod sym
4795     Look up SYM in module MOD, placing the resulting variable in DST,
4796     creating the variable if needed.
4797 -- Macro Instruction: current-module dst
4798     Set DST to the current module.
4799 -- Macro Instruction: $car dst src
4800 -- Macro Instruction: $cdr dst src
4801 -- Macro Instruction: $set-car! x val
4802 -- Macro Instruction: $set-cdr! x val
4803 -- Macro Instruction: $variable-ref dst src
4804 -- Macro Instruction: $variable-set! x val
4805 -- Macro Instruction: $vector-length dst x
4806 -- Macro Instruction: $vector-ref dst x idx
4807 -- Macro Instruction: $vector-ref/immediate dst x idx/imm
4808 -- Macro Instruction: $vector-set! x idx v
4809 -- Macro Instruction: $vector-set!/immediate x idx/imm v
4810 -- Macro Instruction: $allocate-struct dst vtable nwords
4811 -- Macro Instruction: $struct-vtable dst src
4812 -- Macro Instruction: $struct-ref dst src idx
4813 -- Macro Instruction: $struct-ref/immediate dst src idx/imm
4814 -- Macro Instruction: $struct-set! x idx v
4815 -- Macro Instruction: $struct-set!/immediate x idx/imm v
4816     Intrinsics for use by the baseline compiler.  The usual strategy
4817     for CPS compilation is to expose the component parts of e.g.
4818     ‘vector-ref’ so that the compiler can learn from them and eliminate
4819     needless bits.  However in the non-optimizing baseline compiler,
4820     that’s just overhead, so we have some intrinsics that encapsulate
4821     all the usual type checks.
4822
4823
4824File: guile.info,  Node: Constant Instructions,  Next: Memory Access Instructions,  Prev: Intrinsic Call Instructions,  Up: Instruction Set
4825
48269.3.7.8 Constant Instructions
4827.............................
4828
4829The following instructions load literal data into a program.  There are
4830two kinds.
4831
4832   The first set of instructions loads immediate values.  These
4833instructions encode the immediate directly into the instruction stream.
4834
4835 -- Instruction: make-immediate s8:DST zi16:LOW-BITS
4836     Make an immediate whose low bits are LOW-BITS, sign-extended.
4837
4838 -- Instruction: make-short-immediate s8:DST i16:LOW-BITS
4839     Make an immediate whose low bits are LOW-BITS, and whose top bits
4840     are 0.
4841
4842 -- Instruction: make-long-immediate s24:DST i32:LOW-BITS
4843     Make an immediate whose low bits are LOW-BITS, and whose top bits
4844     are 0.
4845
4846 -- Instruction: make-long-long-immediate s24:DST a32:HIGH-BITS
4847          b32:LOW-BITS
4848     Make an immediate with HIGH-BITS and LOW-BITS.
4849
4850   Non-immediate constant literals are referenced either directly or
4851indirectly.  For example, Guile knows at compile-time what the layout of
4852a string will be like, and arranges to embed that object directly in the
4853compiled image.  A reference to a string will use ‘make-non-immediate’
4854to treat a pointer into the compilation unit as a ‘scm’ value directly.
4855
4856 -- Instruction: make-non-immediate s24:DST n32:OFFSET
4857     Load a pointer to statically allocated memory into DST.  The
4858     object’s memory will be found OFFSET 32-bit words away from the
4859     current instruction pointer.  Whether the object is mutable or
4860     immutable depends on where it was allocated by the compiler, and
4861     loaded by the loader.
4862
4863   Sometimes you need to load up a code pointer into a register; for
4864this, use ‘load-label’.
4865
4866 -- Instruction: load-label s24:DST l32:OFFSET
4867     Load a label OFFSET words away from the current ‘ip’ and write it
4868     to DST.  OFFSET is a signed 32-bit integer.
4869
4870   Finally, Guile supports a number of unboxed data types, with their
4871associate constant loaders.
4872
4873 -- Instruction: load-f64 s24:DST au32:HIGH-BITS au32:LOW-BITS
4874     Load a double-precision floating-point value formed by joining
4875     HIGH-BITS and LOW-BITS, and write it to DST.
4876
4877 -- Instruction: load-u64 s24:DST au32:HIGH-BITS au32:LOW-BITS
4878     Load an unsigned 64-bit integer formed by joining HIGH-BITS and
4879     LOW-BITS, and write it to DST.
4880
4881 -- Instruction: load-s64 s24:DST au32:HIGH-BITS au32:LOW-BITS
4882     Load a signed 64-bit integer formed by joining HIGH-BITS and
4883     LOW-BITS, and write it to DST.
4884
4885   Some objects must be unique across the whole system.  This is the
4886case for symbols and keywords.  For these objects, Guile arranges to
4887initialize them when the compilation unit is loaded, storing them into a
4888slot in the image.  References go indirectly through that slot.
4889‘static-ref’ is used in this case.
4890
4891 -- Instruction: static-ref s24:DST r32:OFFSET
4892     Load a SCM value into DST.  The SCM value will be fetched from
4893     memory, OFFSET 32-bit words away from the current instruction
4894     pointer.  OFFSET is a signed value.
4895
4896   Fields of non-immediates may need to be fixed up at load time,
4897because we do not know in advance at what address they will be loaded.
4898This is the case, for example, for a pair containing a non-immediate in
4899one of its fields.  ‘static-set!’ and ‘static-patch!’ are used in these
4900situations.
4901
4902 -- Instruction: static-set! s24:SRC lo32:OFFSET
4903     Store a SCM value into memory, OFFSET 32-bit words away from the
4904     current instruction pointer.  OFFSET is a signed value.
4905
4906 -- Instruction: static-patch! x24:_ lo32:DST-OFFSET l32:SRC-OFFSET
4907     Patch a pointer at DST-OFFSET to point to SRC-OFFSET.  Both offsets
4908     are signed 32-bit values, indicating a memory address as a number
4909     of 32-bit words away from the current instruction pointer.
4910
4911
4912File: guile.info,  Node: Memory Access Instructions,  Next: Atomic Memory Access Instructions,  Prev: Constant Instructions,  Up: Instruction Set
4913
49149.3.7.9 Memory Access Instructions
4915..................................
4916
4917In these instructions, the ‘/immediate’ variants represent their indexes
4918or counts as immediates; otherwise these values are unboxed u64 locals.
4919
4920 -- Instruction: allocate-words s12:DST s12:COUNT
4921 -- Instruction: allocate-words/immediate s12:DST c12:COUNT
4922     Allocate a fresh GC-traced object consisting of COUNT words and
4923     store it into DST.
4924
4925 -- Instruction: scm-ref s8:DST s8:OBJ s8:IDX
4926 -- Instruction: scm-ref/immediate s8:DST s8:OBJ c8:IDX
4927     Load the ‘SCM’ object at word offset IDX from local OBJ, and store
4928     it to DST.
4929
4930 -- Instruction: scm-set! s8:DST s8:IDX s8:OBJ
4931 -- Instruction: scm-set!/immediate s8:DST c8:IDX s8:OBJ
4932     Store the ‘scm’ local VAL into object OBJ at word offset IDX.
4933
4934 -- Instruction: scm-ref/tag s8:DST s8:OBJ c8:TAG
4935     Load the first word of OBJ, subtract the immediate TAG, and store
4936     the resulting ‘SCM’ to DST.
4937
4938 -- Instruction: scm-set!/tag s8:OBJ c8:TAG s8:VAL
4939     Set the first word of OBJ to the unpacked bits of the ‘scm’ value
4940     VAL plus the immediate value TAG.
4941
4942 -- Instruction: word-ref s8:DST s8:OBJ s8:IDX
4943 -- Instruction: word-ref/immediate s8:DST s8:OBJ c8:IDX
4944     Load the word at offset IDX from local OBJ, and store it to the
4945     ‘u64’ local DST.
4946
4947 -- Instruction: word-set! s8:DST s8:IDX s8:OBJ
4948 -- Instruction: word-set!/immediate s8:DST c8:IDX s8:OBJ
4949     Store the ‘u64’ local VAL into object OBJ at word offset IDX.
4950
4951 -- Instruction: pointer-ref/immediate s8:DST s8:OBJ c8:IDX
4952     Load the pointer at offset IDX from local OBJ, and store it to the
4953     unboxed pointer local DST.
4954
4955 -- Instruction: pointer-set!/immediate s8:DST c8:IDX s8:OBJ
4956     Store the unboxed pointer local VAL into object OBJ at word offset
4957     IDX.
4958
4959 -- Instruction: tail-pointer-ref/immediate s8:DST s8:OBJ c8:IDX
4960     Compute the address of word offset IDX from local OBJ, and store it
4961     to DST.
4962
4963
4964File: guile.info,  Node: Atomic Memory Access Instructions,  Next: Tagging and Untagging Instructions,  Prev: Memory Access Instructions,  Up: Instruction Set
4965
49669.3.7.10 Atomic Memory Access Instructions
4967..........................................
4968
4969 -- Instruction: current-thread s24:DST
4970     Write the current thread into DST.
4971
4972 -- Instruction: atomic-scm-ref/immediate s8:DST s8:OBJ c8:IDX
4973     Atomically load the ‘SCM’ object at word offset IDX from local OBJ,
4974     using the sequential consistency memory model.  Store the result to
4975     DST.
4976
4977 -- Instruction: atomic-scm-set!/immediate s8:OBJ c8:IDX s8:VAL
4978     Atomically set the ‘SCM’ object at word offset IDX from local OBJ
4979     to VAL, using the sequential consistency memory model.
4980
4981 -- Instruction: atomic-scm-swap!/immediate s24:DST x8:_ s24:OBJ c8:IDX
4982          s24:VAL
4983     Atomically swap the ‘SCM’ value stored in object OBJ at word offset
4984     IDX with VAL, using the sequentially consistent memory model.
4985     Store the previous value to DST.
4986
4987 -- Instruction: atomic-scm-compare-and-swap!/immediate s24:DST x8:_
4988          s24:OBJ c8:IDX s24:EXPECTED x8:_ s24:DESIRED
4989     Atomically swap the ‘SCM’ value stored in object OBJ at word offset
4990     IDX with DESIRED, if and only if the value that was there was
4991     EXPECTED, using the sequentially consistent memory model.  Store
4992     the value that was previously at IDX from OBJ in DST.
4993
4994
4995File: guile.info,  Node: Tagging and Untagging Instructions,  Next: Integer Arithmetic Instructions,  Prev: Atomic Memory Access Instructions,  Up: Instruction Set
4996
49979.3.7.11 Tagging and Untagging Instructions
4998...........................................
4999
5000 -- Instruction: tag-char s12:DST s12:SRC
5001     Make a ‘SCM’ character whose integer value is the ‘u64’ in SRC, and
5002     store it in DST.
5003
5004 -- Instruction: untag-char s12:DST s12:SRC
5005     Extract the integer value from the ‘SCM’ character SRC, and store
5006     the resulting ‘u64’ in DST.
5007
5008 -- Instruction: tag-fixnum s12:DST s12:SRC
5009     Make a ‘SCM’ integer whose value is the ‘s64’ in SRC, and store it
5010     in DST.
5011
5012 -- Instruction: untag-fixnum s12:DST s12:SRC
5013     Extract the integer value from the ‘SCM’ integer SRC, and store the
5014     resulting ‘s64’ in DST.
5015
5016
5017File: guile.info,  Node: Integer Arithmetic Instructions,  Next: Floating-Point Arithmetic Instructions,  Prev: Tagging and Untagging Instructions,  Up: Instruction Set
5018
50199.3.7.12 Integer Arithmetic Instructions
5020........................................
5021
5022 -- Instruction: uadd s8:DST s8:A s8:B
5023 -- Instruction: uadd/immediate s8:DST s8:A c8:B
5024     Add the ‘u64’ values A and B, and store the ‘u64’ result to DST.
5025     Overflow will wrap.
5026
5027 -- Instruction: usub s8:DST s8:A s8:B
5028 -- Instruction: usub/immediate s8:DST s8:A c8:B
5029     Subtract the ‘u64’ value B from A, and store the ‘u64’ result to
5030     DST.  Underflow will wrap.
5031
5032 -- Instruction: umul s8:DST s8:A s8:B
5033 -- Instruction: umul/immediate s8:DST s8:A c8:B
5034     Multiply the ‘u64’ values A and B, and store the ‘u64’ result to
5035     DST.  Overflow will wrap.
5036
5037 -- Instruction: ulogand s8:DST s8:A s8:B
5038     Place the bitwise ‘and’ of the ‘u64’ values A and B into the ‘u64’
5039     local DST.
5040
5041 -- Instruction: ulogior s8:DST s8:A s8:B
5042     Place the bitwise inclusive ‘or’ of the ‘u64’ values A and B into
5043     the ‘u64’ local DST.
5044
5045 -- Instruction: ulogxor s8:DST s8:A s8:B
5046     Place the bitwise exclusive ‘or’ of the ‘u64’ values A and B into
5047     the ‘u64’ local DST.
5048
5049 -- Instruction: ulogsub s8:DST s8:A s8:B
5050     Place the bitwise ‘and’ of the ‘u64’ values A and the bitwise ‘not’
5051     of B into the ‘u64’ local DST.
5052
5053 -- Instruction: ulsh s8:DST s8:A s8:B
5054 -- Instruction: ulsh/immediate s8:DST s8:A c8:B
5055     Shift the unboxed unsigned 64-bit integer in A left by B bits, also
5056     an unboxed unsigned 64-bit integer.  Truncate to 64 bits and write
5057     to DST as an unboxed value.  Only the lower 6 bits of B are used.
5058
5059 -- Instruction: ursh s8:DST s8:A s8:B
5060 -- Instruction: ursh/immediate s8:DST s8:A c8:B
5061     Shift the unboxed unsigned 64-bit integer in A right by B bits,
5062     also an unboxed unsigned 64-bit integer.  Truncate to 64 bits and
5063     write to DST as an unboxed value.  Only the lower 6 bits of B are
5064     used.
5065
5066 -- Instruction: srsh s8:DST s8:A s8:B
5067 -- Instruction: srsh/immediate s8:DST s8:A c8:B
5068     Shift the unboxed signed 64-bit integer in A right by B bits, also
5069     an unboxed signed 64-bit integer.  Truncate to 64 bits and write to
5070     DST as an unboxed value.  Only the lower 6 bits of B are used.
5071
5072
5073File: guile.info,  Node: Floating-Point Arithmetic Instructions,  Next: Comparison Instructions,  Prev: Integer Arithmetic Instructions,  Up: Instruction Set
5074
50759.3.7.13 Floating-Point Arithmetic Instructions
5076...............................................
5077
5078 -- Instruction: fadd s8:DST s8:A s8:B
5079     Add the ‘f64’ values A and B, and store the ‘f64’ result to DST.
5080
5081 -- Instruction: fsub s8:DST s8:A s8:B
5082     Subtract the ‘f64’ value B from A, and store the ‘f64’ result to
5083     DST.
5084
5085 -- Instruction: fmul s8:DST s8:A s8:B
5086     Multiply the ‘f64’ values A and B, and store the ‘f64’ result to
5087     DST.
5088
5089 -- Instruction: fdiv s8:DST s8:A s8:B
5090     Divide the ‘f64’ values A by B, and store the ‘f64’ result to DST.
5091
5092
5093File: guile.info,  Node: Comparison Instructions,  Next: Branch Instructions,  Prev: Floating-Point Arithmetic Instructions,  Up: Instruction Set
5094
50959.3.7.14 Comparison Instructions
5096................................
5097
5098 -- Instruction: u64=? s12:A s12:B
5099     Set the comparison result to EQUAL if the ‘u64’ values A and B are
5100     the same, or ‘NONE’ otherwise.
5101
5102 -- Instruction: u64<? s12:A s12:B
5103     Set the comparison result to ‘LESS_THAN’ if the ‘u64’ value A is
5104     less than the ‘u64’ value B are the same, or ‘NONE’ otherwise.
5105
5106 -- Instruction: s64<? s12:A s12:B
5107     Set the comparison result to ‘LESS_THAN’ if the ‘s64’ value A is
5108     less than the ‘s64’ value B are the same, or ‘NONE’ otherwise.
5109
5110 -- Instruction: s64-imm=? s12:A z12:B
5111     Set the comparison result to EQUAL if the ‘s64’ value A is equal to
5112     the immediate ‘s64’ value B, or ‘NONE’ otherwise.
5113
5114 -- Instruction: u64-imm<? s12:A c12:B
5115     Set the comparison result to ‘LESS_THAN’ if the ‘u64’ value A is
5116     less than the immediate ‘u64’ value B, or ‘NONE’ otherwise.
5117
5118 -- Instruction: imm-u64<? s12:A s12:B
5119     Set the comparison result to ‘LESS_THAN’ if the ‘u64’ immediate B
5120     is less than the ‘u64’ value A, or ‘NONE’ otherwise.
5121
5122 -- Instruction: s64-imm<? s12:A z12:B
5123     Set the comparison result to ‘LESS_THAN’ if the ‘s64’ value A is
5124     less than the immediate ‘s64’ value B, or ‘NONE’ otherwise.
5125
5126 -- Instruction: imm-s64<? s12:A z12:B
5127     Set the comparison result to ‘LESS_THAN’ if the ‘s64’ immediate B
5128     is less than the ‘s64’ value A, or ‘NONE’ otherwise.
5129
5130 -- Instruction: f64=? s12:A s12:B
5131     Set the comparison result to EQUAL if the f64 value A is equal to
5132     the f64 value B, or ‘NONE’ otherwise.
5133
5134 -- Instruction: f64<? s12:A s12:B
5135     Set the comparison result to ‘LESS_THAN’ if the f64 value A is less
5136     than the f64 value B, ‘NONE’ if A is greater than or equal to B, or
5137     ‘INVALID’ otherwise.
5138
5139 -- Instruction: =? s12:A s12:B
5140     Set the comparison result to EQUAL if the SCM values A and B are
5141     numerically equal, in the sense of the Scheme ‘=’ operator.  Set to
5142     ‘NONE’ otherwise.
5143
5144 -- Instruction: heap-numbers-equal? s12:A s12:B
5145     Set the comparison result to EQUAL if the SCM values A and B are
5146     numerically equal, in the sense of Scheme ‘=’.  Set to ‘NONE’
5147     otherwise.  It is known that both A and B are heap numbers.
5148
5149 -- Instruction: <? s12:A s12:B
5150     Set the comparison result to ‘LESS_THAN’ if the SCM value A is less
5151     than the SCM value B, ‘NONE’ if A is greater than or equal to B, or
5152     ‘INVALID’ otherwise.
5153
5154 -- Instruction: immediate-tag=? s24:OBJ c16:MASK c16:TAG
5155     Set the comparison result to EQUAL if the result of a bitwise ‘and’
5156     between the bits of ‘scm’ value A and the immediate MASK is TAG, or
5157     ‘NONE’ otherwise.
5158
5159 -- Instruction: heap-tag=? s24:OBJ c16:MASK c16:TAG
5160     Set the comparison result to EQUAL if the result of a bitwise ‘and’
5161     between the first word of ‘scm’ value A and the immediate MASK is
5162     TAG, or ‘NONE’ otherwise.
5163
5164 -- Instruction: eq? s12:A s12:B
5165     Set the comparison result to EQUAL if the SCM values A and B are
5166     ‘eq?’, or ‘NONE’ otherwise.
5167
5168 -- Instruction: eq-immediate? s8:A zi16:B
5169     Set the comparison result to EQUAL if the SCM value A is equal to
5170     the immediate SCM value B (sign-extended), or ‘NONE’ otherwise.
5171
5172   There are a set of macro-instructions for ‘immediate-tag=?’ and
5173‘heap-tag=?’ as well that abstract away the precise type tag values.
5174*Note The SCM Type in Guile::.
5175
5176 -- Macro Instruction: fixnum? x
5177 -- Macro Instruction: heap-object? x
5178 -- Macro Instruction: char? x
5179 -- Macro Instruction: eq-false? x
5180 -- Macro Instruction: eq-nil? x
5181 -- Macro Instruction: eq-null? x
5182 -- Macro Instruction: eq-true? x
5183 -- Macro Instruction: unspecified? x
5184 -- Macro Instruction: undefined? x
5185 -- Macro Instruction: eof-object? x
5186 -- Macro Instruction: null? x
5187 -- Macro Instruction: false? x
5188 -- Macro Instruction: nil? x
5189     Emit a ‘immediate-tag=?’ instruction that will set the comparison
5190     result to ‘EQUAL’ if X would pass the corresponding predicate (e.g.
5191     ‘null?’), or ‘NONE’ otherwise.
5192
5193 -- Macro Instruction: pair? x
5194 -- Macro Instruction: struct? x
5195 -- Macro Instruction: symbol? x
5196 -- Macro Instruction: variable? x
5197 -- Macro Instruction: vector? x
5198 -- Macro Instruction: immutable-vector? x
5199 -- Macro Instruction: mutable-vector? x
5200 -- Macro Instruction: weak-vector? x
5201 -- Macro Instruction: string? x
5202 -- Macro Instruction: heap-number? x
5203 -- Macro Instruction: hash-table? x
5204 -- Macro Instruction: pointer? x
5205 -- Macro Instruction: fluid? x
5206 -- Macro Instruction: stringbuf? x
5207 -- Macro Instruction: dynamic-state? x
5208 -- Macro Instruction: frame? x
5209 -- Macro Instruction: keyword? x
5210 -- Macro Instruction: atomic-box? x
5211 -- Macro Instruction: syntax? x
5212 -- Macro Instruction: program? x
5213 -- Macro Instruction: vm-continuation? x
5214 -- Macro Instruction: bytevector? x
5215 -- Macro Instruction: weak-set? x
5216 -- Macro Instruction: weak-table? x
5217 -- Macro Instruction: array? x
5218 -- Macro Instruction: bitvector? x
5219 -- Macro Instruction: smob? x
5220 -- Macro Instruction: port? x
5221 -- Macro Instruction: bignum? x
5222 -- Macro Instruction: flonum? x
5223 -- Macro Instruction: compnum? x
5224 -- Macro Instruction: fracnum? x
5225     Emit a ‘heap-tag=?’ instruction that will set the comparison result
5226     to ‘EQUAL’ if X would pass the corresponding predicate (e.g.
5227     ‘null?’), or ‘NONE’ otherwise.
5228
5229
5230File: guile.info,  Node: Branch Instructions,  Next: Raw Memory Access Instructions,  Prev: Comparison Instructions,  Up: Instruction Set
5231
52329.3.7.15 Branch Instructions
5233............................
5234
5235All offsets to branch instructions are 24-bit signed numbers, which
5236count 32-bit units.  This gives Guile effectively a 26-bit address range
5237for relative jumps.
5238
5239 -- Instruction: j l24:OFFSET
5240     Add OFFSET to the current instruction pointer.
5241
5242 -- Instruction: jl l24:OFFSET
5243     If the last comparison result is ‘LESS_THAN’, add OFFSET, a signed
5244     24-bit number, to the current instruction pointer.
5245
5246 -- Instruction: je l24:OFFSET
5247     If the last comparison result is ‘EQUAL’, add OFFSET, a signed
5248     24-bit number, to the current instruction pointer.
5249
5250 -- Instruction: jnl l24:OFFSET
5251     If the last comparison result is not ‘LESS_THAN’, add OFFSET, a
5252     signed 24-bit number, to the current instruction pointer.
5253
5254 -- Instruction: jne l24:OFFSET
5255     If the last comparison result is not ‘EQUAL’, add OFFSET, a signed
5256     24-bit number, to the current instruction pointer.
5257
5258 -- Instruction: jge l24:OFFSET
5259     If the last comparison result is ‘NONE’, add OFFSET, a signed
5260     24-bit number, to the current instruction pointer.
5261
5262     This is intended for use after a ‘<?’ comparison, and is different
5263     from ‘jnl’ in the way it handles not-a-number (NaN) values: ‘<?’
5264     sets ‘INVALID’ instead of ‘NONE’ if either value is a NaN. For
5265     exact numbers, ‘jge’ is the same as ‘jnl’.
5266
5267 -- Instruction: jnge l24:OFFSET
5268     If the last comparison result is not ‘NONE’, add OFFSET, a signed
5269     24-bit number, to the current instruction pointer.
5270
5271     This is intended for use after a ‘<?’ comparison, and is different
5272     from ‘jl’ in the way it handles not-a-number (NaN) values: ‘<?’
5273     sets ‘INVALID’ instead of ‘NONE’ if either value is a NaN. For
5274     exact numbers, ‘jnge’ is the same as ‘jl’.
5275
5276 -- Instruction: jtable s24:IDX v32:LENGTH [x8:_ l24:OFFSET]...
5277     Branch to an entry in a table, as in C’s ‘switch’ statement.  IDX
5278     is a ‘u64’ local indicating which entry to branch to.  The
5279     immediate LEN indicates the number of entries in the table, and
5280     should be greater than or equal to 1.  The last entry in the table
5281     is the "catch-all" entry.  The OFFSET...  values are signed 24-bit
5282     immediates (‘l24’ encoding), indicating a memory address as a
5283     number of 32-bit words away from the current instruction pointer.
5284
5285
5286File: guile.info,  Node: Raw Memory Access Instructions,  Prev: Branch Instructions,  Up: Instruction Set
5287
52889.3.7.16 Raw Memory Access Instructions
5289.......................................
5290
5291Bytevector operations correspond closely to what the current hardware
5292can do, so it makes sense to inline them to VM instructions, providing a
5293clear path for eventual native compilation.  Without this, Scheme
5294programs would need other primitives for accessing raw bytes – but these
5295primitives are as good as any.
5296
5297 -- Instruction: u8-ref s8:DST s8:PTR s8:IDX
5298 -- Instruction: s8-ref s8:DST s8:PTR s8:IDX
5299 -- Instruction: u16-ref s8:DST s8:PTR s8:IDX
5300 -- Instruction: s16-ref s8:DST s8:PTR s8:IDX
5301 -- Instruction: u32-ref s8:DST s8:PTR s8:IDX
5302 -- Instruction: s32-ref s8:DST s8:PTR s8:IDX
5303 -- Instruction: u64-ref s8:DST s8:PTR s8:IDX
5304 -- Instruction: s64-ref s8:DST s8:PTR s8:IDX
5305 -- Instruction: f32-ref s8:DST s8:PTR s8:IDX
5306 -- Instruction: f64-ref s8:DST s8:PTR s8:IDX
5307
5308     Fetch the item at byte offset IDX from the raw pointer local PTR,
5309     and store it in DST.  All accesses use native endianness.
5310
5311     The IDX value should be an unboxed unsigned 64-bit integer.
5312
5313     The results are all written to the stack as unboxed values, either
5314     as signed 64-bit integers, unsigned 64-bit integers, or IEEE double
5315     floating point numbers.
5316
5317 -- Instruction: u8-set! s8:PTR s8:IDX s8:VAL
5318 -- Instruction: s8-set! s8:PTR s8:IDX s8:VAL
5319 -- Instruction: u16-set! s8:PTR s8:IDX s8:VAL
5320 -- Instruction: s16-set! s8:PTR s8:IDX s8:VAL
5321 -- Instruction: u32-set! s8:PTR s8:IDX s8:VAL
5322 -- Instruction: s32-set! s8:PTR s8:IDX s8:VAL
5323 -- Instruction: u64-set! s8:PTR s8:IDX s8:VAL
5324 -- Instruction: s64-set! s8:PTR s8:IDX s8:VAL
5325 -- Instruction: f32-set! s8:PTR s8:IDX s8:VAL
5326 -- Instruction: f64-set! s8:PTR s8:IDX s8:VAL
5327
5328     Store VAL into memory pointed to by raw pointer local PTR, at byte
5329     offset IDX.  Multibyte values are written using native endianness.
5330
5331     The IDX value should be an unboxed unsigned 64-bit integer.
5332
5333     The VAL values are all unboxed, either as signed 64-bit integers,
5334     unsigned 64-bit integers, or IEEE double floating point numbers.
5335
5336
5337File: guile.info,  Node: Just-In-Time Native Code,  Prev: Instruction Set,  Up: A Virtual Machine for Guile
5338
53399.3.8 Just-In-Time Native Code
5340------------------------------
5341
5342The final piece of Guile’s virtual machine is a just-in-time (JIT)
5343compiler from bytecode instructions to native code.  It is faster to run
5344a function when its bytecode instructions are compiled to native code,
5345compared to having the VM interpret the instructions.
5346
5347   The JIT compiler runs automatically, triggered by counters associated
5348with each function.  The counter increments when functions are called
5349and during each loop iteration.  Once a function’s counter passes a
5350certain value, the function gets JIT-compiled.  *Note Instrumentation
5351Instructions::, for full details.
5352
5353   Guile’s JIT compiler is what is known as a “template JIT”. This kind
5354of JIT is very simple: for each instruction in a function, the JIT
5355compiler will emit a generic sequence of machine code corresponding to
5356the instruction kind, specializing that generic template to reference
5357the specific operands of the instruction being compiled.
5358
5359   The strength of a template JIT is principally that it is very fast at
5360emitting code.  It doesn’t need to do any time-consuming analysis on the
5361bytecode that it is compiling to do its job.
5362
5363   A template JIT is also very predictable: the native code emitted by a
5364template JIT has the same performance characteristics of the
5365corresponding bytecode, only that it runs faster.  In theory you could
5366even generate the template-JIT machine code ahead of time, as it doesn’t
5367depend on any value seen at run-time.
5368
5369   This predictability makes it possible to reason about the performance
5370of a system in terms of bytecode, knowing that the conclusions apply to
5371native code emitted by a template JIT.
5372
5373   Because the machine code corresponding to an instruction always
5374performs the same tasks that the interpreter would do for that
5375instruction, bytecode and a template JIT also allows Guile programmers
5376to debug their programs in terms of the bytecode model.  When a Guile
5377programmer sets a breakpoint, Guile will disable the JIT for the thread
5378being debugged, falling back to the interpreter (which has the
5379corresponding code to run the hooks).  *Note VM Hooks::.
5380
5381   To emit native code, Guile uses a forked version of GNU Lightning.
5382This "Lightening" effort, spun out as a separate project, aims to build
5383on the back-end support from GNU Lightning, but adapting the API and
5384behavior of the library to match Guile’s needs.  This code is included
5385in the Guile source distribution.  For more information, see
5386<https://gitlab.com/wingo/lightening>.  As of mid-2019, Lightening
5387supports code generation for the x86-64, ia32, ARMv7, and AArch64
5388architectures.
5389
5390   The weaknesses of a template JIT are two-fold.  Firstly, as a simple
5391back-end that has to run fast, a template JIT doesn’t have time to do
5392analysis that could help it generate better code, notably global
5393register allocation and instruction selection.
5394
5395   However this is a minor weakness compared to the inability to perform
5396significant, speculative program transformations.  For example, Guile
5397could see that in an expression ‘(f x)’, that in practice F always
5398refers to the same function.  An advanced JIT compiler would
5399speculatively inline F into the call-site, along with a dynamic check to
5400make sure that the assertion still held.  But as a template JIT doesn’t
5401pay attention to values only known at run-time, it can’t make this
5402transformation.
5403
5404   This limitation is mitigated in part by Guile’s robust ahead-of-time
5405compiler which can already perform significant optimizations when it can
5406prove they will always be valid, and its low-level bytecode which is
5407able to represent the effect of those optimizations (e.g.  elided
5408type-checks).  *Note Compiling to the Virtual Machine::, for more on
5409Guile’s compiler.
5410
5411   An ahead-of-time Scheme-to-bytecode strategy, complemented by a
5412template JIT, also particularly suits the somewhat static nature of
5413Scheme.  Scheme programmers often write code in a way that makes the
5414identity of free variable references lexically apparent.  For example,
5415the ‘(f x)’ expression could appear within a ‘(let ((f (lambda (x) (1+
5416x)))) ...)’ expression, or we could see that ‘f’ was imported from a
5417particular module where we know its binding.  Ahead-of-time compilation
5418techniques can work well for a language like Scheme where there is
5419little polymorphism and much first-order programming.  They do not work
5420so well for a language like JavaScript, which is highly mutable at
5421run-time and difficult to analyze due to method calls (which are
5422effectively higher-order calls).
5423
5424   All that said, a template JIT works well for Guile at this point.
5425It’s only a few thousand lines of maintainable code, it speeds up Scheme
5426programs, and it keeps the bulk of the Guile Scheme implementation
5427written in Scheme itself.  The next step is probably to add
5428ahead-of-time native code emission to the back-end of the compiler
5429written in Scheme, to take advantage of the opportunity to do global
5430register allocation and instruction selection.  Once this is working, it
5431can allow Guile to experiment with speculative optimizations in Scheme
5432as well.  *Note Extending the Compiler::, for more on future directions.
5433
5434   Finally, note that there are a few environment variables that can be
5435tweaked to make JIT compilation happen sooner, later, or never.  *Note
5436Environment Variables::, for more.
5437
5438
5439File: guile.info,  Node: Compiling to the Virtual Machine,  Prev: A Virtual Machine for Guile,  Up: Guile Implementation
5440
54419.4 Compiling to the Virtual Machine
5442====================================
5443
5444Compilers!  The word itself inspires excitement and awe, even among
5445experienced practitioners.  But a compiler is just a program: an
5446eminently hackable thing.  This section aims to describe Guile’s
5447compiler in such a way that interested Scheme hackers can feel
5448comfortable reading and extending it.
5449
5450   *Note Read/Load/Eval/Compile::, if you’re lost and you just wanted to
5451know how to compile your ‘.scm’ file.
5452
5453* Menu:
5454
5455* Compiler Tower::
5456* The Scheme Compiler::
5457* Tree-IL::
5458* Continuation-Passing Style::
5459* Bytecode::
5460* Writing New High-Level Languages::
5461* Extending the Compiler::
5462
5463
5464File: guile.info,  Node: Compiler Tower,  Next: The Scheme Compiler,  Up: Compiling to the Virtual Machine
5465
54669.4.1 Compiler Tower
5467--------------------
5468
5469Guile’s compiler is quite simple – its _compilers_, to put it more
5470accurately.  Guile defines a tower of languages, starting at Scheme and
5471progressively simplifying down to languages that resemble the VM
5472instruction set (*note Instruction Set::).
5473
5474   Each language knows how to compile to the next, so each step is
5475simple and understandable.  Furthermore, this set of languages is not
5476hardcoded into Guile, so it is possible for the user to add new
5477high-level languages, new passes, or even different compilation targets.
5478
5479   Languages are registered in the module, ‘(system base language)’:
5480
5481     (use-modules (system base language))
5482
5483   They are registered with the ‘define-language’ form.
5484
5485 -- Scheme Syntax: define-language [#:name] [#:title] [#:reader]
5486          [#:printer] [#:parser=#f] [#:compilers='()]
5487          [#:decompilers='()] [#:evaluator=#f] [#:joiner=#f]
5488          [#:for-humans?=#t]
5489          [#:make-default-environment=make-fresh-user-module]
5490          [#:lowerer=#f] [#:analyzer=#f] [#:compiler-chooser=#f]
5491     Define a language.
5492
5493     This syntax defines a ‘<language>’ object, bound to NAME in the
5494     current environment.  In addition, the language will be added to
5495     the global language set.  For example, this is the language
5496     definition for Scheme:
5497
5498          (define-language scheme
5499            #:title	"Scheme"
5500            #:reader      (lambda (port env) ...)
5501            #:compilers   `((tree-il . ,compile-tree-il))
5502            #:decompilers `((tree-il . ,decompile-tree-il))
5503            #:evaluator	(lambda (x module) (primitive-eval x))
5504            #:printer	write
5505            #:make-default-environment (lambda () ...))
5506
5507   The interesting thing about having languages defined this way is that
5508they present a uniform interface to the read-eval-print loop.  This
5509allows the user to change the current language of the REPL:
5510
5511     scheme@(guile-user)> ,language tree-il
5512     Happy hacking with Tree Intermediate Language!  To switch back, type `,L scheme'.
5513     tree-il@(guile-user)> ,L scheme
5514     Happy hacking with Scheme!  To switch back, type `,L tree-il'.
5515     scheme@(guile-user)>
5516
5517   Languages can be looked up by name, as they were above.
5518
5519 -- Scheme Procedure: lookup-language name
5520     Looks up a language named NAME, autoloading it if necessary.
5521
5522     Languages are autoloaded by looking for a variable named NAME in a
5523     module named ‘(language NAME spec)’.
5524
5525     The language object will be returned, or ‘#f’ if there does not
5526     exist a language with that name.
5527
5528   When Guile goes to compile Scheme to bytecode, it will ask the Scheme
5529language to choose a compiler from Scheme to the next language on the
5530path from Scheme to bytecode.  Performing this computation recursively
5531builds transformations from a flexible chain of compilers.  The next
5532link will be obtained by invoking the language’s compiler chooser, or if
5533not present, from the language’s compilers field.
5534
5535   A language can specify an analyzer, which is run before a term of
5536that language is lowered and compiled.  This is where compiler warnings
5537are issued.
5538
5539   If a language specifies a lowerer, that procedure is called on
5540expressions before compilation.  This is where optimizations and
5541canonicalizations go.
5542
5543   Finally a language’s compiler translates a lowered term from one
5544language to the next one in the chain.
5545
5546   There is a notion of a “current language”, which is maintained in the
5547‘current-language’ parameter, defined in the core ‘(guile)’ module.
5548This language is normally Scheme, and may be rebound by the user.  The
5549run-time compilation interfaces (*note Read/Load/Eval/Compile::) also
5550allow you to choose other source and target languages.
5551
5552   The normal tower of languages when compiling Scheme goes like this:
5553
5554   • Scheme
5555   • Tree Intermediate Language (Tree-IL)
5556   • Continuation-Passing Style (CPS)
5557   • Bytecode
5558
5559   As discussed before (*note Object File Format::), bytecode is in ELF
5560format, ready to be serialized to disk.  But when compiling Scheme at
5561run time, you want a Scheme value: for example, a compiled procedure.
5562For this reason, so as not to break the abstraction, Guile defines a
5563fake language at the bottom of the tower:
5564
5565   • Value
5566
5567   Compiling to ‘value’ loads the bytecode into a procedure, turning
5568cold bytes into warm code.
5569
5570   Perhaps this strangeness can be explained by example: ‘compile-file’
5571defaults to compiling to bytecode, because it produces object code that
5572has to live in the barren world outside the Guile runtime; but ‘compile’
5573defaults to compiling to ‘value’, as its product re-enters the Guile
5574world.
5575
5576   Indeed, the process of compilation can circulate through these
5577different worlds indefinitely, as shown by the following quine:
5578
5579     ((lambda (x) ((compile x) x)) '(lambda (x) ((compile x) x)))
5580
5581
5582File: guile.info,  Node: The Scheme Compiler,  Next: Tree-IL,  Prev: Compiler Tower,  Up: Compiling to the Virtual Machine
5583
55849.4.2 The Scheme Compiler
5585-------------------------
5586
5587The job of the Scheme compiler is to expand all macros and all of Scheme
5588to its most primitive expressions.  The definition of “primitive
5589expression” is given by the inventory of constructs provided by Tree-IL,
5590the target language of the Scheme compiler: procedure calls,
5591conditionals, lexical references, and so on.  This is described more
5592fully in the next section.
5593
5594   The tricky and amusing thing about the Scheme-to-Tree-IL compiler is
5595that it is completely implemented by the macro expander.  Since the
5596macro expander has to run over all of the source code already in order
5597to expand macros, it might as well do the analysis at the same time,
5598producing Tree-IL expressions directly.
5599
5600   Because this compiler is actually the macro expander, it is
5601extensible.  Any macro which the user writes becomes part of the
5602compiler.
5603
5604   The Scheme-to-Tree-IL expander may be invoked using the generic
5605‘compile’ procedure:
5606
5607     (compile '(+ 1 2) #:from 'scheme #:to 'tree-il)
56085609     #<tree-il (call (toplevel +) (const 1) (const 2))>
5610
5611   ‘(compile FOO #:from 'scheme #:to 'tree-il)’ is entirely equivalent
5612to calling the macro expander as ‘(macroexpand FOO 'c '(compile load
5613eval))’.  *Note Macro Expansion::.  ‘compile-tree-il’, the procedure
5614dispatched by ‘compile’ to ‘'tree-il’, is a small wrapper around
5615‘macroexpand’, to make it conform to the general form of compiler
5616procedures in Guile’s language tower.
5617
5618   Compiler procedures take three arguments: an expression, an
5619environment, and a keyword list of options.  They return three values:
5620the compiled expression, the corresponding environment for the target
5621language, and a “continuation environment”.  The compiled expression and
5622environment will serve as input to the next language’s compiler.  The
5623“continuation environment” can be used to compile another expression
5624from the same source language within the same module.
5625
5626   For example, you might compile the expression, ‘(define-module
5627(foo))’.  This will result in a Tree-IL expression and environment.  But
5628if you compiled a second expression, you would want to take into account
5629the compile-time effect of compiling the previous expression, which puts
5630the user in the ‘(foo)’ module.  That is the purpose of the
5631“continuation environment”; you would pass it as the environment when
5632compiling the subsequent expression.
5633
5634   For Scheme, an environment is a module.  By default, the ‘compile’
5635and ‘compile-file’ procedures compile in a fresh module, such that
5636bindings and macros introduced by the expression being compiled are
5637isolated:
5638
5639     (eq? (current-module) (compile '(current-module)))
5640     ⇒ #f
5641
5642     (compile '(define hello 'world))
5643     (defined? 'hello)
5644     ⇒ #f
5645
5646     (define / *)
5647     (eq? (compile '/) /)
5648     ⇒ #f
5649
5650   Similarly, changes to the ‘current-reader’ fluid (*note
5651‘current-reader’: Loading.) are isolated:
5652
5653     (compile '(fluid-set! current-reader (lambda args 'fail)))
5654     (fluid-ref current-reader)
5655     ⇒ #f
5656
5657   Nevertheless, having the compiler and “compilee” share the same name
5658space can be achieved by explicitly passing ‘(current-module)’ as the
5659compilation environment:
5660
5661     (define hello 'world)
5662     (compile 'hello #:env (current-module))
5663     ⇒ world
5664
5665
5666File: guile.info,  Node: Tree-IL,  Next: Continuation-Passing Style,  Prev: The Scheme Compiler,  Up: Compiling to the Virtual Machine
5667
56689.4.3 Tree-IL
5669-------------
5670
5671Tree Intermediate Language (Tree-IL) is a structured intermediate
5672language that is close in expressive power to Scheme.  It is an
5673expanded, pre-analyzed Scheme.
5674
5675   Tree-IL is “structured” in the sense that its representation is based
5676on records, not S-expressions.  This gives a rigidity to the language
5677that ensures that compiling to a lower-level language only requires a
5678limited set of transformations.  For example, the Tree-IL type ‘<const>’
5679is a record type with two fields, ‘src’ and ‘exp’.  Instances of this
5680type are created via ‘make-const’.  Fields of this type are accessed via
5681the ‘const-src’ and ‘const-exp’ procedures.  There is also a predicate,
5682‘const?’.  *Note Records::, for more information on records.
5683
5684   All Tree-IL types have a ‘src’ slot, which holds source location
5685information for the expression.  This information, if present, will be
5686residualized into the compiled object code, allowing backtraces to show
5687source information.  The format of ‘src’ is the same as that returned by
5688Guile’s ‘source-properties’ function.  *Note Source Properties::, for
5689more information.
5690
5691   Although Tree-IL objects are represented internally using records,
5692there is also an equivalent S-expression external representation for
5693each kind of Tree-IL. For example, the S-expression representation of
5694‘#<const src: #f exp: 3>’ expression would be:
5695
5696     (const 3)
5697
5698   Users may program with this format directly at the REPL:
5699
5700     scheme@(guile-user)> ,language tree-il
5701     Happy hacking with Tree Intermediate Language!  To switch back, type `,L scheme'.
5702     tree-il@(guile-user)> (call (primitive +) (const 32) (const 10))
5703     ⇒ 42
5704
5705   The ‘src’ fields are left out of the external representation.
5706
5707   One may create Tree-IL objects from their external representations
5708via calling ‘parse-tree-il’, the reader for Tree-IL. If any source
5709information is attached to the input S-expression, it will be propagated
5710to the resulting Tree-IL expressions.  This is probably the easiest way
5711to compile to Tree-IL: just make the appropriate external
5712representations in S-expression format, and let ‘parse-tree-il’ take
5713care of the rest.
5714
5715 -- Scheme Variable: <void> src
5716 -- External Representation: (void)
5717     An empty expression.  In practice, equivalent to Scheme’s ‘(if #f
5718     #f)’.
5719
5720 -- Scheme Variable: <const> src exp
5721 -- External Representation: (const EXP)
5722     A constant.
5723
5724 -- Scheme Variable: <primitive-ref> src name
5725 -- External Representation: (primitive NAME)
5726     A reference to a “primitive”.  A primitive is a procedure that,
5727     when compiled, may be open-coded.  For example, ‘cons’ is usually
5728     recognized as a primitive, so that it compiles down to a single
5729     instruction.
5730
5731     Compilation of Tree-IL usually begins with a pass that resolves
5732     some ‘<module-ref>’ and ‘<toplevel-ref>’ expressions to
5733     ‘<primitive-ref>’ expressions.  The actual compilation pass has
5734     special cases for calls to certain primitives, like ‘apply’ or
5735     ‘cons’.
5736
5737 -- Scheme Variable: <lexical-ref> src name gensym
5738 -- External Representation: (lexical NAME GENSYM)
5739     A reference to a lexically-bound variable.  The NAME is the
5740     original name of the variable in the source program.  GENSYM is a
5741     unique identifier for this variable.
5742
5743 -- Scheme Variable: <lexical-set> src name gensym exp
5744 -- External Representation: (set! (lexical NAME GENSYM) EXP)
5745     Sets a lexically-bound variable.
5746
5747 -- Scheme Variable: <module-ref> src mod name public?
5748 -- External Representation: (@ MOD NAME)
5749 -- External Representation: (@@ MOD NAME)
5750     A reference to a variable in a specific module.  MOD should be the
5751     name of the module, e.g. ‘(guile-user)’.
5752
5753     If PUBLIC? is true, the variable named NAME will be looked up in
5754     MOD’s public interface, and serialized with ‘@’; otherwise it will
5755     be looked up among the module’s private bindings, and is serialized
5756     with ‘@@’.
5757
5758 -- Scheme Variable: <module-set> src mod name public? exp
5759 -- External Representation: (set! (@ MOD NAME) EXP)
5760 -- External Representation: (set! (@@ MOD NAME) EXP)
5761     Sets a variable in a specific module.
5762
5763 -- Scheme Variable: <toplevel-ref> src name
5764 -- External Representation: (toplevel NAME)
5765     References a variable from the current procedure’s module.
5766
5767 -- Scheme Variable: <toplevel-set> src name exp
5768 -- External Representation: (set! (toplevel NAME) EXP)
5769     Sets a variable in the current procedure’s module.
5770
5771 -- Scheme Variable: <toplevel-define> src name exp
5772 -- External Representation: (define NAME EXP)
5773     Defines a new top-level variable in the current procedure’s module.
5774
5775 -- Scheme Variable: <conditional> src test then else
5776 -- External Representation: (if TEST THEN ELSE)
5777     A conditional.  Note that ELSE is not optional.
5778
5779 -- Scheme Variable: <call> src proc args
5780 -- External Representation: (call PROC . ARGS)
5781     A procedure call.
5782
5783 -- Scheme Variable: <primcall> src name args
5784 -- External Representation: (primcall NAME . ARGS)
5785     A call to a primitive.  Equivalent to ‘(call (primitive NAME) .
5786     ARGS)’.  This construct is often more convenient to generate and
5787     analyze than ‘<call>’.
5788
5789     As part of the compilation process, instances of ‘(call (primitive
5790     NAME) . ARGS)’ are transformed into primcalls.
5791
5792 -- Scheme Variable: <seq> src head tail
5793 -- External Representation: (seq HEAD TAIL)
5794     A sequence.  The semantics is that HEAD is evaluated first, and any
5795     resulting values are ignored.  Then TAIL is evaluated, in tail
5796     position.
5797
5798 -- Scheme Variable: <lambda> src meta body
5799 -- External Representation: (lambda META BODY)
5800     A closure.  META is an association list of properties for the
5801     procedure.  BODY is a single Tree-IL expression of type
5802     ‘<lambda-case>’.  As the ‘<lambda-case>’ clause can chain to an
5803     alternate clause, this makes Tree-IL’s ‘<lambda>’ have the
5804     expressiveness of Scheme’s ‘case-lambda’.
5805
5806 -- Scheme Variable: <lambda-case> req opt rest kw inits gensyms body
5807          alternate
5808 -- External Representation: (lambda-case ((REQ OPT REST KW INITS
5809          GENSYMS) BODY) [ALTERNATE])
5810     One clause of a ‘case-lambda’.  A ‘lambda’ expression in Scheme is
5811     treated as a ‘case-lambda’ with one clause.
5812
5813     REQ is a list of the procedure’s required arguments, as symbols.
5814     OPT is a list of the optional arguments, or ‘#f’ if there are no
5815     optional arguments.  REST is the name of the rest argument, or
5816     ‘#f’.
5817
5818     KW is a list of the form, ‘(ALLOW-OTHER-KEYS? (KEYWORD NAME VAR)
5819     ...)’, where KEYWORD is the keyword corresponding to the argument
5820     named NAME, and whose corresponding gensym is VAR, or ‘#f’ if there
5821     are no keyword arguments.  INITS are tree-il expressions
5822     corresponding to all of the optional and keyword arguments,
5823     evaluated to bind variables whose value is not supplied by the
5824     procedure caller.  Each INIT expression is evaluated in the lexical
5825     context of previously bound variables, from left to right.
5826
5827     GENSYMS is a list of gensyms corresponding to all arguments: first
5828     all of the required arguments, then the optional arguments if any,
5829     then the rest argument if any, then all of the keyword arguments.
5830
5831     BODY is the body of the clause.  If the procedure is called with an
5832     appropriate number of arguments, BODY is evaluated in tail
5833     position.  Otherwise, if there is an ALTERNATE, it should be a
5834     ‘<lambda-case>’ expression, representing the next clause to try.
5835     If there is no ALTERNATE, a wrong-number-of-arguments error is
5836     signaled.
5837
5838 -- Scheme Variable: <let> src names gensyms vals exp
5839 -- External Representation: (let NAMES GENSYMS VALS EXP)
5840     Lexical binding, like Scheme’s ‘let’.  NAMES are the original
5841     binding names, GENSYMS are gensyms corresponding to the NAMES, and
5842     VALS are Tree-IL expressions for the values.  EXP is a single
5843     Tree-IL expression.
5844
5845 -- Scheme Variable: <letrec> in-order? src names gensyms vals exp
5846 -- External Representation: (letrec NAMES GENSYMS VALS EXP)
5847 -- External Representation: (letrec* NAMES GENSYMS VALS EXP)
5848     A version of ‘<let>’ that creates recursive bindings, like Scheme’s
5849     ‘letrec’, or ‘letrec*’ if IN-ORDER? is true.
5850
5851 -- Scheme Variable: <prompt> escape-only? tag body handler
5852 -- External Representation: (prompt ESCAPE-ONLY? TAG BODY HANDLER)
5853     A dynamic prompt.  Instates a prompt named TAG, an expression,
5854     during the dynamic extent of the execution of BODY, also an
5855     expression.  If an abort occurs to this prompt, control will be
5856     passed to HANDLER, also an expression, which should be a procedure.
5857     The first argument to the handler procedure will be the captured
5858     continuation, followed by all of the values passed to the abort.
5859     If ESCAPE-ONLY? is true, the handler should be a ‘<lambda>’ with a
5860     single ‘<lambda-case>’ body expression with no optional or keyword
5861     arguments, and no alternate, and whose first argument is
5862     unreferenced.  *Note Prompts::, for more information.
5863
5864 -- Scheme Variable: <abort> tag args tail
5865 -- External Representation: (abort TAG ARGS TAIL)
5866     An abort to the nearest prompt with the name TAG, an expression.
5867     ARGS should be a list of expressions to pass to the prompt’s
5868     handler, and TAIL should be an expression that will evaluate to a
5869     list of additional arguments.  An abort will save the partial
5870     continuation, which may later be reinstated, resulting in the
5871     ‘<abort>’ expression evaluating to some number of values.
5872
5873   There are two Tree-IL constructs that are not normally produced by
5874higher-level compilers, but instead are generated during the
5875source-to-source optimization and analysis passes that the Tree-IL
5876compiler does.  Users should not generate these expressions directly,
5877unless they feel very clever, as the default analysis pass will generate
5878them as necessary.
5879
5880 -- Scheme Variable: <let-values> src names gensyms exp body
5881 -- External Representation: (let-values NAMES GENSYMS EXP BODY)
5882     Like Scheme’s ‘receive’ – binds the values returned by evaluating
5883     ‘exp’ to the ‘lambda’-like bindings described by GENSYMS.  That is
5884     to say, GENSYMS may be an improper list.
5885
5886     ‘<let-values>’ is an optimization of a ‘<call>’ to the primitive,
5887     ‘call-with-values’.
5888
5889 -- Scheme Variable: <fix> src names gensyms vals body
5890 -- External Representation: (fix NAMES GENSYMS VALS BODY)
5891     Like ‘<letrec>’, but only for VALS that are unset ‘lambda’
5892     expressions.
5893
5894     ‘fix’ is an optimization of ‘letrec’ (and ‘let’).
5895
5896   Tree-IL is a convenient compilation target from source languages.  It
5897can be convenient as a medium for optimization, though CPS is usually
5898better.  The strength of Tree-IL is that it does not fix order of
5899evaluation, so it makes some code motion a bit easier.
5900
5901   Optimization passes performed on Tree-IL currently include:
5902
5903   • Open-coding (turning toplevel-refs into primitive-refs, and calls
5904     to primitives to primcalls)
5905   • Partial evaluation (comprising inlining, copy propagation, and
5906     constant folding)
5907
5908
5909File: guile.info,  Node: Continuation-Passing Style,  Next: Bytecode,  Prev: Tree-IL,  Up: Compiling to the Virtual Machine
5910
59119.4.4 Continuation-Passing Style
5912--------------------------------
5913
5914Continuation-passing style (CPS) is Guile’s principal intermediate
5915language, bridging the gap between languages for people and languages
5916for machines.  CPS gives a name to every part of a program: every
5917control point, and every intermediate value.  This makes it an excellent
5918medium for reasoning about programs, which is the principal job of a
5919compiler.
5920
5921* Menu:
5922
5923* An Introduction to CPS::
5924* CPS in Guile::
5925* Building CPS::
5926* CPS Soup::
5927* Compiling CPS::
5928
5929
5930File: guile.info,  Node: An Introduction to CPS,  Next: CPS in Guile,  Up: Continuation-Passing Style
5931
59329.4.4.1 An Introduction to CPS
5933..............................
5934
5935Consider the following Scheme expression:
5936
5937     (begin
5938       (display "The sum of 32 and 10 is: ")
5939       (display 42)
5940       (newline))
5941
5942   Let us identify all of the sub-expressions in this expression,
5943annotating them with unique labels:
5944
5945     (begin
5946       (display "The sum of 32 and 10 is: ")
5947       |k1      k2
5948       k0
5949       (display 42)
5950       |k4      k5
5951       k3
5952       (newline))
5953       |k7
5954       k6
5955
5956   Each of these labels identifies a point in a program.  One label may
5957be the continuation of another label.  For example, the continuation of
5958‘k7’ is ‘k6’.  This is because after evaluating the value of ‘newline’,
5959performed by the expression labelled ‘k7’, we continue to apply it in
5960‘k6’.
5961
5962   Which expression has ‘k0’ as its continuation?  It is either the
5963expression labelled ‘k1’ or the expression labelled ‘k2’.  Scheme does
5964not have a fixed order of evaluation of arguments, though it does
5965guarantee that they are evaluated in some order.  Unlike general Scheme,
5966continuation-passing style makes evaluation order explicit.  In Guile,
5967this choice is made by the higher-level language compilers.
5968
5969   Let us assume a left-to-right evaluation order.  In that case the
5970continuation of ‘k1’ is ‘k2’, and the continuation of ‘k2’ is ‘k0’.
5971
5972   With this example established, we are ready to give an example of CPS
5973in Scheme:
5974
5975     (lambda (ktail)
5976       (let ((k1 (lambda ()
5977                   (let ((k2 (lambda (proc)
5978                               (let ((k0 (lambda (arg0)
5979                                           (proc k4 arg0))))
5980                                 (k0 "The sum of 32 and 10 is: ")))))
5981                     (k2 display))))
5982             (k4 (lambda _
5983                   (let ((k5 (lambda (proc)
5984                               (let ((k3 (lambda (arg0)
5985                                           (proc k7 arg0))))
5986                                 (k3 42)))))
5987                     (k5 display))))
5988             (k7 (lambda _
5989                   (let ((k6 (lambda (proc)
5990                               (proc ktail))))
5991                     (k6 newline)))))
5992         (k1))
5993
5994   Holy code explosion, Batman!  What’s with all the lambdas?  Indeed,
5995CPS is by nature much more verbose than “direct-style” intermediate
5996languages like Tree-IL. At the same time, CPS is simpler than full
5997Scheme, because it makes things more explicit.
5998
5999   In the original program, the expression labelled ‘k0’ is in effect
6000context.  Any values it returns are ignored.  In Scheme, this fact is
6001implicit.  In CPS, we can see it explicitly by noting that its
6002continuation, ‘k4’, takes any number of values and ignores them.
6003Compare this to ‘k2’, which takes a single value; in this way we can say
6004that ‘k1’ is in a “value” context.  Likewise ‘k6’ is in tail context
6005with respect to the expression as a whole, because its continuation is
6006the tail continuation, ‘ktail’.  CPS makes these details manifest, and
6007gives them names.
6008
6009
6010File: guile.info,  Node: CPS in Guile,  Next: Building CPS,  Prev: An Introduction to CPS,  Up: Continuation-Passing Style
6011
60129.4.4.2 CPS in Guile
6013....................
6014
6015Guile’s CPS language is composed of “continuations”.  A continuation is
6016a labelled program point.  If you are used to traditional compilers,
6017think of a continuation as a trivial basic block.  A program is a “soup”
6018of continuations, represented as a map from labels to continuations.
6019
6020   Like basic blocks, each continuation belongs to only one function.
6021Some continuations are special, like the continuation corresponding to a
6022function’s entry point, or the continuation that represents the tail of
6023a function.  Others contain a “term”.  A term contains an “expression”,
6024which evaluates to zero or more values.  The term also records the
6025continuation to which it will pass its values.  Some terms, like
6026conditional branches, may continue to one of a number of continuations.
6027
6028   Continuation labels are small integers.  This makes it easy to sort
6029them and to group them into sets.  Whenever a term refers to a
6030continuation, it does so by name, simply recording the label of the
6031continuation.  Continuation labels are unique among the set of labels in
6032a program.
6033
6034   Variables are also named by small integers.  Variable names are
6035unique among the set of variables in a program.
6036
6037   For example, a simple continuation that receives two values and adds
6038them together can be matched like this, using the ‘match’ form from
6039‘(ice-9 match)’:
6040
6041     (match cont
6042       (($ $kargs (x-name y-name) (x-var y-var)
6043           ($ $continue k src ($ $primcall '+ #f (x-var y-var))))
6044        (format #t "Add ~a and ~a and pass the result to label ~a"
6045                x-var y-var k)))
6046
6047   Here we see the most common kind of continuation, ‘$kargs’, which
6048binds some number of values to variables and then evaluates a term.
6049
6050 -- CPS Continuation: $kargs names vars term
6051     Bind the incoming values to the variables VARS, with original names
6052     NAMES, and then evaluate TERM.
6053
6054   The NAMES of a ‘$kargs’ are just for debugging, and will end up
6055residualized in the object file for use by the debugger.
6056
6057   The TERM in a ‘$kargs’ is always a ‘$continue’, which evaluates an
6058expression and continues to a continuation.
6059
6060 -- CPS Term: $continue k src exp
6061     Evaluate the expression EXP and pass the resulting values (if any)
6062     to the continuation labelled K.  The source information associated
6063     with the expression may be found in SRC, which is either an alist
6064     as in ‘source-properties’ or is ‘#f’ if there is no associated
6065     source.
6066
6067   There are a number of expression kinds.  Above you see an example of
6068‘$primcall’.
6069
6070 -- CPS Expression: $primcall name param args
6071     Perform the primitive operation identified by ‘name’, a well-known
6072     symbol, passing it the arguments ARGS, and pass all resulting
6073     values to the continuation.
6074
6075     PARAM is a constant parameter whose interpretation is up to the
6076     primcall in question.  Usually it’s ‘#f’ but for a primcall that
6077     might need some compile-time constant information – such as
6078add/immediate’, which adds a constant number to a value – the
6079     parameter holds this information.
6080
6081     The set of available primitives includes many primitives known to
6082     Tree-IL and then some more; see the source code for details.  Note
6083     that some Tree-IL primcalls need to be converted to a sequence of
6084     lower-level CPS primcalls.  Again, see ‘(language tree-il
6085     compile-cps)’ for full details.
6086
6087   The variables that are used by ‘$primcall’, or indeed by any
6088expression, must be defined before the expression is evaluated.  An
6089equivalent way of saying this is that predecessor ‘$kargs’
6090continuation(s) that bind the variables(s) used by the expression must
6091“dominate” the continuation that uses the expression: definitions
6092dominate uses.  This condition is trivially satisfied in our example
6093above, but in general to determine the set of variables that are in
6094“scope” for a given term, you need to do a flow analysis to see what
6095continuations dominate a term.  The variables that are in scope are
6096those variables defined by the continuations that dominate a term.
6097
6098   Here is an inventory of the kinds of expressions in Guile’s CPS
6099language, besides ‘$primcall’ which has already been described.  Recall
6100that all expressions are wrapped in a ‘$continue’ term which specifies
6101their continuation.
6102
6103 -- CPS Expression: $const val
6104     Continue with the constant value VAL.
6105
6106 -- CPS Expression: $prim name
6107     Continue with the procedure that implements the primitive operation
6108     named by NAME.
6109
6110 -- CPS Expression: $call proc args
6111     Call PROC with the arguments ARGS, and pass all values to the
6112     continuation.  PROC and the elements of the ARGS list should all be
6113     variable names.  The continuation identified by the term’s K should
6114     be a ‘$kreceive’ or a ‘$ktail’ instance.
6115
6116 -- CPS Expression: $values args
6117     Pass the values named by the list ARGS to the continuation.
6118
6119 -- CPS Expression: $prompt escape? tag handler
6120
6121   There are two sub-languages of CPS, “higher-order CPS” and
6122“first-order CPS”. The difference is that in higher-order CPS, there are
6123‘$fun’ and ‘$rec’ expressions that bind functions or mutually-recursive
6124functions in the implicit scope of their use sites.  Guile transforms
6125higher-order CPS into first-order CPS by “closure conversion”, which
6126chooses representations for all closures and which arranges to access
6127free variables through the implicit closure parameter that is passed to
6128every function call.
6129
6130 -- CPS Expression: $fun body
6131     Continue with a procedure.  BODY names the entry point of the
6132     function, which should be a ‘$kfun’.  This expression kind is only
6133     valid in higher-order CPS, which is the CPS language before closure
6134     conversion.
6135
6136 -- CPS Expression: $rec names vars funs
6137     Continue with a set of mutually recursive procedures denoted by
6138     NAMES, VARS, and FUNS.  NAMES is a list of symbols, VARS is a list
6139     of variable names (unique integers), and FUNS is a list of ‘$fun’
6140     values.  Note that the ‘$kargs’ continuation should also define
6141     NAMES/VARS bindings.
6142
6143   The contification pass will attempt to transform the functions
6144declared in a ‘$rec’ into local continuations.  Any remaining ‘$fun’
6145instances are later removed by the closure conversion pass.  If the
6146function has no free variables, it gets allocated as a constant.
6147
6148 -- CPS Expression: $const-fun label
6149     A constant which is a function whose entry point is LABEL.  As a
6150     constant, instances of ‘$const-fun’ with the same LABEL will not
6151     allocate; the space for the function is allocated as part of the
6152     compilation unit.
6153
6154     In practice, ‘$const-fun’ expressions are reified by CPS-conversion
6155     for functions whose call sites are not all visible within the
6156     compilation unit and which have no free variables.  This expression
6157     kind is part of first-order CPS.
6158
6159   Otherwise, if the closure has free variables, it will be allocated at
6160its definition site via an ‘allocate-words’ primcall and its free
6161variables initialized there.  The code pointer in the closure is
6162initialized from a ‘$code’ expression.
6163
6164 -- CPS Expression: $code label
6165     Continue with the value of LABEL, which should denote some ‘$kfun’
6166     continuation in the program.  Used when initializing the code
6167     pointer of closure objects.
6168
6169   However, If the closure can be proven to never escape its scope then
6170other lighter-weight representations can be chosen.  Additionally, if
6171all call sites are known, closure conversion will hard-wire the calls by
6172lowering ‘$call’ to ‘$callk’.
6173
6174 -- CPS Expression: $callk label proc args
6175     Like ‘$call’, but for the case where the call target is known to be
6176     in the same compilation unit.  LABEL should denote some ‘$kfun’
6177     continuation in the program.  In this case the PROC is simply an
6178     additional argument, since it is not used to determine the call
6179     target at run-time.
6180
6181   To summarize: a ‘$continue’ is a CPS term that continues to a single
6182label.  But there are other kinds of CPS terms that can continue to a
6183different number of labels: ‘$branch’, ‘$switch’, ‘$throw’, and
6184‘$prompt’.
6185
6186 -- CPS Term: $branch kf kt src op param args
6187     Evaluate the branching primcall OP, with arguments ARGS and
6188     constant parameter PARAM, and continue to KT with zero values if
6189     the test is true.  Otherwise continue to KF.
6190
6191     The ‘$branch’ term is like a ‘$continue’ term with a ‘$primcall’
6192     expression, except that instead of binding a value and continuing
6193     to a single label, the result of the test is not bound but instead
6194     used to choose the continuation label.
6195
6196     The set of operations (corresponding to OP values) that are valid
6197     in a $BRANCH is limited.  In the general case, bind the result of a
6198     test expression to a variable, and then make a ‘$branch’ on a
6199     ‘true?’ op referencing that variable.  The optimizer should inline
6200     the branch if possible.
6201
6202 -- CPS Term: $switch kf kt* src arg
6203     Continue to a label in the list K* according to the index argument
6204     ARG, or to the default continuation KF if ARG is greater than or
6205     equal to the length K*.  The index variable ARG is an unboxed,
6206     unsigned 64-bit value.
6207
6208     The ‘$switch’ term is like C’s ‘switch’ statement.  The compiler to
6209     CPS can generate a ‘$switch’ term directly, if the source language
6210     has such a concept, or it can rely on the CPS optimizer to turn
6211     appropriate chains of ‘$branch’ statements to ‘$switch’ instances,
6212     which is what the Scheme compiler does.
6213
6214 -- CPS Term: $throw src op param args
6215     Throw a non-resumable exception.  Throw terms do not continue at
6216     all.  The usual value of OP is ‘throw’, with two arguments KEY and
6217     ARGS.  There are also some specific primcalls that compile to the
6218     VM ‘throw/value’ and ‘throw/value+data’ instructions; see the code
6219     for full details.
6220
6221     The advantage of having ‘$throw’ as a term is that, because it does
6222     not continue, this allows the optimizer to gather more information
6223     from type predicates.  For example, if the predicate is ‘char?’ and
6224     the KF continues to a throw, the set of labels dominated by KT is
6225     larger than if the throw notationally continued to some label that
6226     would never be reached by the throw.
6227
6228 -- CPS Term: $prompt k kh src escape? tag
6229     Push a prompt on the stack identified by the variable name TAG,
6230     which may be escape-only if ESCAPE? is true, and continue to KH
6231     with zero values.  If the body aborts to this prompt, control will
6232     proceed at the continuation labelled KH, which should be a
6233     ‘$kreceive’ continuation.  Prompts are later popped by ‘pop-prompt’
6234     primcalls.
6235
6236   At this point we have described terms, expressions, and the most
6237common kind of continuation, ‘$kargs’.  ‘$kargs’ is used when the
6238predecessors of the continuation can be instructed to pass the values
6239where the continuation wants them.  For example, if a ‘$kargs’
6240continuation K binds a variable V, and the compiler decides to allocate
6241V to slot 6, all predecessors of K should put the value for V in slot 6
6242before jumping to K.  One situation in which this isn’t possible is
6243receiving values from function calls.  Guile has a calling convention
6244for functions which currently places return values on the stack.  A
6245continuation of a call must check that the number of values returned
6246from a function matches the expected number of values, and then must
6247shuffle or collect those values to named variables.  ‘$kreceive’ denotes
6248this kind of continuation.
6249
6250 -- CPS Continuation: $kreceive arity k
6251     Receive values on the stack.  Parse them according to ARITY, and
6252     then proceed with the parsed values to the ‘$kargs’ continuation
6253     labelled K.  As a limitation specific to ‘$kreceive’, ARITY may
6254     only contain required and rest arguments.
6255
6256   ‘$arity’ is a helper data structure used by ‘$kreceive’ and also by
6257‘$kclause’, described below.
6258
6259 -- CPS Data: $arity req opt rest kw allow-other-keys?
6260     A data type declaring an arity.  REQ and OPT are lists of source
6261     names of required and optional arguments, respectively.  REST is
6262     either the source name of the rest variable, or ‘#f’ if this arity
6263     does not accept additional values.  KW is a list of the form
6264     ‘((KEYWORD NAME VAR) ...)’, describing the keyword arguments.
6265     ALLOW-OTHER-KEYS? is true if other keyword arguments are allowed
6266     and false otherwise.
6267
6268     Note that all of these names with the exception of the VARs in the
6269     KW list are source names, not unique variable names.
6270
6271   Additionally, there are three specific kinds of continuations that
6272are only used in function entries.
6273
6274 -- CPS Continuation: $kfun src meta self tail clause
6275     Declare a function entry.  SRC is the source information for the
6276     procedure declaration, and META is the metadata alist as described
6277     above in Tree-IL’s ‘<lambda>’.  SELF is a variable bound to the
6278     procedure being called, and which may be used for self-references.
6279     TAIL is the label of the ‘$ktail’ for this function, corresponding
6280     to the function’s tail continuation.  CLAUSE is the label of the
6281     first ‘$kclause’ for the first ‘case-lambda’ clause in the
6282     function, or otherwise ‘#f’.
6283
6284 -- CPS Continuation: $ktail
6285     A tail continuation.
6286
6287 -- CPS Continuation: $kclause arity cont alternate
6288     A clause of a function with a given arity.  Applications of a
6289     function with a compatible set of actual arguments will continue to
6290     the continuation labelled CONT, a ‘$kargs’ instance representing
6291     the clause body.  If the arguments are incompatible, control
6292     proceeds to ALTERNATE, which is a ‘$kclause’ for the next clause,
6293     or ‘#f’ if there is no next clause.
6294
6295
6296File: guile.info,  Node: Building CPS,  Next: CPS Soup,  Prev: CPS in Guile,  Up: Continuation-Passing Style
6297
62989.4.4.3 Building CPS
6299....................
6300
6301Unlike Tree-IL, the CPS language is built to be constructed and
6302deconstructed with abstract macros instead of via procedural
6303constructors or accessors, or instead of S-expression matching.
6304
6305   Deconstruction and matching is handled adequately by the ‘match’ form
6306from ‘(ice-9 match)’.  *Note Pattern Matching::.  Construction is
6307handled by a set of mutually builder macros: ‘build-term’, ‘build-cont’,
6308and ‘build-exp’.
6309
6310   In the following interface definitions, consider ‘term’ and ‘exp’ to
6311be built by ‘build-term’ or ‘build-exp’, respectively.  Consider any
6312other name to be evaluated as a Scheme expression.  Many of these forms
6313recognize ‘unquote’ in some contexts, to splice in a previously-built
6314value; see the specifications below for full details.
6315
6316 -- Scheme Syntax: build-term ,val
6317 -- Scheme Syntax: build-term ($continue k src exp)
6318 -- Scheme Syntax: build-exp ,val
6319 -- Scheme Syntax: build-exp ($const val)
6320 -- Scheme Syntax: build-exp ($prim name)
6321 -- Scheme Syntax: build-exp ($fun kentry)
6322 -- Scheme Syntax: build-exp ($const-fun kentry)
6323 -- Scheme Syntax: build-exp ($code kentry)
6324 -- Scheme Syntax: build-exp ($rec names syms funs)
6325 -- Scheme Syntax: build-exp ($call proc (arg ...))
6326 -- Scheme Syntax: build-exp ($call proc args)
6327 -- Scheme Syntax: build-exp ($callk k proc (arg ...))
6328 -- Scheme Syntax: build-exp ($callk k proc args)
6329 -- Scheme Syntax: build-exp ($primcall name param (arg ...))
6330 -- Scheme Syntax: build-exp ($primcall name param args)
6331 -- Scheme Syntax: build-exp ($values (arg ...))
6332 -- Scheme Syntax: build-exp ($values args)
6333 -- Scheme Syntax: build-exp ($prompt escape? tag handler)
6334 -- Scheme Syntax: build-term ($branch kf kt src op param (arg ...))
6335 -- Scheme Syntax: build-term ($branch kf kt src op param args)
6336 -- Scheme Syntax: build-term ($switch kf kt* src arg)
6337 -- Scheme Syntax: build-term ($throw src op param (arg ...))
6338 -- Scheme Syntax: build-term ($throw src op param args)
6339 -- Scheme Syntax: build-term ($prompt k kh src escape? tag)
6340 -- Scheme Syntax: build-cont ,val
6341 -- Scheme Syntax: build-cont ($kargs (name ...) (sym ...) term)
6342 -- Scheme Syntax: build-cont ($kargs names syms term)
6343 -- Scheme Syntax: build-cont ($kreceive req rest kargs)
6344 -- Scheme Syntax: build-cont ($kfun src meta self ktail kclause)
6345 -- Scheme Syntax: build-cont ($kclause ,arity kbody kalt)
6346 -- Scheme Syntax: build-cont ($kclause (req opt rest kw aok?) kbody)
6347     Construct a CPS term, expression, or continuation.
6348
6349   There are a few more miscellaneous interfaces as well.
6350
6351 -- Scheme Procedure: make-arity req opt rest kw allow-other-keywords?
6352     A procedural constructor for ‘$arity’ objects.
6353
6354 -- Scheme Syntax: rewrite-term val (pat term) ...
6355 -- Scheme Syntax: rewrite-exp val (pat exp) ...
6356 -- Scheme Syntax: rewrite-cont val (pat cont) ...
6357     Match VAL against the series of patterns PAT..., using ‘match’.
6358     The body of the matching clause should be a template in the syntax
6359     of ‘build-term’, ‘build-exp’, or ‘build-cont’, respectively.
6360
6361
6362File: guile.info,  Node: CPS Soup,  Next: Compiling CPS,  Prev: Building CPS,  Up: Continuation-Passing Style
6363
63649.4.4.4 CPS Soup
6365................
6366
6367We describe programs in Guile’s CPS language as being a kind of “soup”
6368because all continuations in the program are mixed into the same “pot”,
6369so to speak, without explicit markers as to what function or scope a
6370continuation is in.  A program in CPS is a map from continuation labels
6371to continuation values.  As discussed in the introduction, a
6372continuation label is an integer.  No label may be negative.
6373
6374   As a matter of convention, label 0 should map to the ‘$kfun’
6375continuation of the entry to the program, which should be a function of
6376no arguments.  The body of a function consists of the labelled
6377continuations that are reachable from the function entry.  A program can
6378refer to other functions, either via ‘$fun’ and ‘$rec’ in higher-order
6379CPS, or via ‘$const-fun’, ‘$callk’, and allocated closures in
6380first-order CPS. The program logically contains all continuations of all
6381functions reachable from the entry function.  A compiler pass may leave
6382unreachable continuations in a program; subsequent compiler passes
6383should ensure that their transformations and analyses only take
6384reachable continuations into account.  It’s OK though if transformation
6385runs over all continuations if including the unreachable continuations
6386has no effect on the transformations on the live continuations.
6387
6388   The “soup” itself is implemented as an “intmap”, a functional
6389array-mapped trie specialized for integer keys.  Intmaps associate
6390integers with values of any kind.  Currently intmaps are a private data
6391structure only used by the CPS phase of the compiler.  To work with
6392intmaps, load the ‘(language cps intmap)’ module:
6393
6394     (use-modules (language cps intmap))
6395
6396   Intmaps are functional data structures, so there is no constructor as
6397such: one can simply start with the empty intmap and add entries to it.
6398
6399     (intmap? empty-intmap) ⇒ #t
6400     (define x (intmap-add empty-intmap 42 "hi"))
6401     (intmap? x) ⇒ #t
6402     (intmap-ref x 42) ⇒ "hi"
6403     (intmap-ref x 43) ⇒ error: 43 not present
6404     (intmap-ref x 43 (lambda (k) "yo!")) ⇒ "yo"
6405     (intmap-add x 42 "hej") ⇒ error: 42 already present
6406
6407   ‘intmap-ref’ and ‘intmap-add’ are the core of the intmap interface.
6408There is also ‘intmap-replace’, which replaces the value associated with
6409a given key, requiring that the key was present already, and
6410‘intmap-remove’, which removes a key from an intmap.
6411
6412   Intmaps have a tree-like structure that is well-suited to set
6413operations such as union and intersection, so there are also the binary
6414‘intmap-union’ and ‘intmap-intersect’ procedures.  If the result is
6415equivalent to either argument, that argument is returned as-is; in that
6416way, one can detect whether the set operation produced a new result
6417simply by checking with ‘eq?’.  This makes intmaps useful when computing
6418fixed points.
6419
6420   If a key is present in both intmaps and the associated values are not
6421the same in the sense of ‘eq?’, the resulting value is determined by a
6422“meet” procedure, which is the optional last argument to ‘intmap-union’,
6423‘intmap-intersect’, and also to ‘intmap-add’, ‘intmap-replace’, and
6424similar functions.  The meet procedure will be called with the two
6425values and should return the intersected or unioned value in some
6426domain-specific way.  If no meet procedure is given, the default meet
6427procedure will raise an error.
6428
6429   To traverse over the set of values in an intmap, there are the
6430‘intmap-next’ and ‘intmap-prev’ procedures.  For example, if intmap X
6431has one entry mapping 42 to some value, we would have:
6432
6433     (intmap-next x) ⇒ 42
6434     (intmap-next x 0) ⇒ 42
6435     (intmap-next x 42) ⇒ 42
6436     (intmap-next x 43) ⇒ #f
6437     (intmap-prev x) ⇒ 42
6438     (intmap-prev x 42) ⇒ 42
6439     (intmap-prev x 41) ⇒ #f
6440
6441   There is also the ‘intmap-fold’ procedure, which folds over keys and
6442values in the intmap from lowest to highest value, and
6443‘intmap-fold-right’ which does so in the opposite direction.  These
6444procedures may take up to 3 seed values.  The number of values that the
6445fold procedure returns is the number of seed values.
6446
6447     (define q (intmap-add (intmap-add empty-intmap 1 2) 3 4))
6448     (intmap-fold acons q '()) ⇒ ((3 . 4) (1 . 2))
6449     (intmap-fold-right acons q '()) ⇒ ((1 . 2) (3 . 4))
6450
6451   When an entry in an intmap is updated (removed, added, or changed), a
6452new intmap is created that shares structure with the original intmap.
6453This operation ensures that the result of existing computations is not
6454affected by future computations: no mutation is ever visible to user
6455code.  This is a great property in a compiler data structure, as it lets
6456us hold a copy of a program before a transformation and use it while we
6457build a post-transformation program.  Updating an intmap is O(log N) in
6458the size of the intmap.
6459
6460   However, the O(log N) allocation costs are sometimes too much,
6461especially in cases when we know that we can just update the intmap in
6462place.  As an example, say we have an intmap mapping the integers 1 to
6463100 to the integers 42 to 141.  Let’s say that we want to transform this
6464map by adding 1 to each value.  There is already an efficient
6465‘intmap-map’ procedure in the ‘(language cps utils’) module, but if we
6466didn’t know about that we might do:
6467
6468     (define (intmap-increment map)
6469       (let lp ((k 0) (map map))
6470         (let ((k (intmap-next map k)))
6471           (if k
6472               (let ((v (intmap-ref map k)))
6473                 (lp (1+ k) (intmap-replace map k (1+ v))))
6474               map))))
6475
6476   Observe that the intermediate values created by ‘intmap-replace’ are
6477completely invisible to the program – only the last result of
6478‘intmap-replace’ value is needed.  The rest might as well share state
6479with the last one, and we could update in place.  Guile allows this kind
6480of interface via “transient intmaps”, inspired by Clojure’s transient
6481interface (<http://clojure.org/transients>).
6482
6483   The in-place ‘intmap-add!’ and ‘intmap-replace!’ procedures return
6484transient intmaps.  If one of these in-place procedures is called on a
6485normal persistent intmap, a new transient intmap is created.  This is an
6486O(1) operation.  In all other respects the interface is like their
6487persistent counterparts, ‘intmap-add’ and ‘intmap-replace’.  If an
6488in-place procedure is called on a transient intmap, the intmap is
6489mutated in-place and the same value is returned.
6490
6491   If a persistent operation like ‘intmap-add’ is called on a transient
6492intmap, the transient’s mutable substructure is then marked as
6493persistent, and ‘intmap-add’ then runs on a new persistent intmap
6494sharing structure but not state with the original transient.  Mutating a
6495transient will cause enough copying to ensure that it can make its
6496change, but if part of its substructure is already “owned” by it, no
6497more copying is needed.
6498
6499   We can use transients to make ‘intmap-increment’ more efficient.  The
6500two changed elements have been marked *like this*.
6501
6502     (define (intmap-increment map)
6503       (let lp ((k 0) (map map))
6504         (let ((k (intmap-next map k)))
6505           (if k
6506               (let ((v (intmap-ref map k)))
6507                 (lp (1+ k) (*intmap-replace!* map k (1+ v))))
6508               (*persistent-intmap* map)))))
6509
6510   Be sure to tag the result as persistent using the ‘persistent-intmap’
6511procedure to prevent the mutability from leaking to other parts of the
6512program.  For added paranoia, you could call ‘persistent-intmap’ on the
6513incoming map, to ensure that if it were already transient, that the
6514mutations in the body of ‘intmap-increment’ wouldn’t affect the incoming
6515value.
6516
6517   In summary, programs in CPS are intmaps whose values are
6518continuations.  See the source code of ‘(language cps utils)’ for a
6519number of useful facilities for working with CPS values.
6520
6521
6522File: guile.info,  Node: Compiling CPS,  Prev: CPS Soup,  Up: Continuation-Passing Style
6523
65249.4.4.5 Compiling CPS
6525.....................
6526
6527Compiling CPS in Guile has three phases: conversion, optimization, and
6528code generation.
6529
6530   CPS conversion is the process of taking a higher-level language and
6531compiling it to CPS. Source languages can do this directly, or they can
6532convert to Tree-IL (which is probably easier) and let Tree-IL convert to
6533CPS later.  Going through Tree-IL has the advantage of running Tree-IL
6534optimization passes, like partial evaluation.  Also, the compiler from
6535Tree-IL to CPS handles assignment conversion, in which assigned local
6536variables (in Tree-IL, locals that are ‘<lexical-set>’) are converted to
6537being boxed values on the heap.  *Note Variables and the VM::.
6538
6539   After CPS conversion, Guile runs some optimization passes over the
6540CPS. Most optimization in Guile is done on the CPS language.  The one
6541major exception is partial evaluation, which for historic reasons is
6542done on Tree-IL.
6543
6544   The major optimization performed on CPS is contification, in which
6545functions that are always called with the same continuation are
6546incorporated directly into a function’s body.  This opens up space for
6547more optimizations, and turns procedure calls into ‘goto’.  It can also
6548make loops out of recursive function nests.  Guile also does dead code
6549elimination, common subexpression elimination, loop peeling and
6550invariant code motion, and range and type inference.
6551
6552   The rest of the optimization passes are really cleanups and
6553canonicalizations.  CPS spans the gap between high-level languages and
6554low-level bytecodes, which allows much of the compilation process to be
6555expressed as source-to-source transformations.  Such is the case for
6556closure conversion, in which references to variables that are free in a
6557function are converted to closure references, and in which functions are
6558converted to closures.  There are a few more passes to ensure that the
6559only primcalls left in the term are those that have a corresponding
6560instruction in the virtual machine, and that their continuations expect
6561the right number of values.
6562
6563   Finally, the backend of the CPS compiler emits bytecode for each
6564function, one by one.  To do so, it determines the set of live variables
6565at all points in the function.  Using this liveness information, it
6566allocates stack slots to each variable, such that a variable can live in
6567one slot for the duration of its lifetime, without shuffling.  (Of
6568course, variables with disjoint lifetimes can share a slot.)  Finally
6569the backend emits code, typically just one VM instruction, for each
6570continuation in the function.
6571
6572
6573File: guile.info,  Node: Bytecode,  Next: Writing New High-Level Languages,  Prev: Continuation-Passing Style,  Up: Compiling to the Virtual Machine
6574
65759.4.5 Bytecode
6576--------------
6577
6578As mentioned before, Guile compiles all code to bytecode, and that
6579bytecode is contained in ELF images.  *Note Object File Format::, for
6580more on Guile’s use of ELF.
6581
6582   To produce a bytecode image, Guile provides an assembler and a
6583linker.
6584
6585   The assembler, defined in the ‘(system vm assembler)’ module, has a
6586relatively straightforward imperative interface.  It provides a
6587‘make-assembler’ function to instantiate an assembler and a set of
6588‘emit-INST’ procedures to emit instructions of each kind.
6589
6590   The ‘emit-INST’ procedures are actually generated at compile-time
6591from a machine-readable description of the VM. With a few exceptions for
6592certain operand types, each operand of an emit procedure corresponds to
6593an operand of the corresponding instruction.
6594
6595   Consider ‘allocate-words’, from *note Memory Access Instructions::.
6596It is documented as:
6597
6598 -- Instruction: allocate-words s12:DST s12:NWORDS
6599
6600   Therefore the emit procedure has the form:
6601
6602 -- Scheme Procedure: emit-allocate-words asm dst nwords
6603
6604   All emit procedure take the assembler as their first argument, and
6605return no useful values.
6606
6607   The argument types depend on the operand types.  *Note Instruction
6608Set::.  Most are integers within a restricted range, though labels are
6609generally expressed as opaque symbols.  Besides the emitters that
6610correspond to instructions, there are a few additional helpers defined
6611in the assembler module.
6612
6613 -- Scheme Procedure: emit-label asm label
6614     Define a label at the current program point.
6615
6616 -- Scheme Procedure: emit-source asm source
6617     Associate SOURCE with the current program point.
6618
6619 -- Scheme Procedure: emit-cache-ref asm dst key
6620 -- Scheme Procedure: emit-cache-set! asm key val
6621     Macro-instructions to implement compilation-unit caches.  A single
6622     cache cell corresponding to KEY will be allocated for the
6623     compilation unit.
6624
6625 -- Scheme Procedure: emit-load-constant asm dst constant
6626     Load the Scheme datum CONSTANT into DST.
6627
6628 -- Scheme Procedure: emit-begin-program asm label properties
6629 -- Scheme Procedure: emit-end-program asm
6630     Delimit the bounds of a procedure, with the given LABEL and the
6631     metadata PROPERTIES.
6632
6633 -- Scheme Procedure: emit-load-static-procedure asm dst label
6634     Load a procedure with the given LABEL into local DST.  This
6635     macro-instruction should only be used with procedures without free
6636     variables – procedures that are not closures.
6637
6638 -- Scheme Procedure: emit-begin-standard-arity asm req nlocals
6639          alternate
6640 -- Scheme Procedure: emit-begin-opt-arity asm req opt rest nlocals
6641          alternate
6642 -- Scheme Procedure: emit-begin-kw-arity asm req opt rest kw-indices
6643          allow-other-keys? nlocals alternate
6644 -- Scheme Procedure: emit-end-arity asm
6645     Delimit a clause of a procedure.
6646
6647   The linker is a complicated beast.  Hackers interested in how it
6648works would do well do read Ian Lance Taylor’s series of articles on
6649linkers.  Searching the internet should find them easily.  From the
6650user’s perspective, there is only one knob to control: whether the
6651resulting image will be written out to a file or not.  If the user
6652passes ‘#:to-file? #t’ as part of the compiler options (*note The Scheme
6653Compiler::), the linker will align the resulting segments on page
6654boundaries, and otherwise not.
6655
6656 -- Scheme Procedure: link-assembly asm #:page-aligned?=#t
6657     Link an ELF image, and return the bytevector.  If PAGE-ALIGNED? is
6658     true, Guile will align the segments with different permissions on
6659     page-sized boundaries, in order to maximize code sharing between
6660     different processes.  Otherwise, padding is minimized, to minimize
6661     address space consumption.
6662
6663   To write an image to disk, just use ‘put-bytevector’ from ‘(ice-9
6664binary-ports)’.
6665
6666   Compiling object code to the fake language, ‘value’, is performed via
6667loading objcode into a program, then executing that thunk with respect
6668to the compilation environment.  Normally the environment propagates
6669through the compiler transparently, but users may specify the
6670compilation environment manually as well, as a module.  Procedures to
6671load images can be found in the ‘(system vm loader)’ module:
6672
6673     (use-modules (system vm loader))
6674
6675 -- Scheme Variable: load-thunk-from-file file
6676 -- C Function: scm_load_thunk_from_file (file)
6677     Load object code from a file named FILE.  The file will be mapped
6678     into memory via ‘mmap’, so this is a very fast operation.
6679
6680 -- Scheme Variable: load-thunk-from-memory bv
6681 -- C Function: scm_load_thunk_from_memory (bv)
6682     Load object code from a bytevector.  The data will be copied out of
6683     the bytevector in order to ensure proper alignment of embedded
6684     Scheme values.
6685
6686   Additionally there are procedures to find the ELF image for a given
6687pointer, or to list all mapped ELF images:
6688
6689 -- Scheme Variable: find-mapped-elf-image ptr
6690     Given the integer value PTR, find and return the ELF image that
6691     contains that pointer, as a bytevector.  If no image is found,
6692     return ‘#f’.  This routine is mostly used by debuggers and other
6693     introspective tools.
6694
6695 -- Scheme Variable: all-mapped-elf-images
6696     Return all mapped ELF images, as a list of bytevectors.
6697
6698