1---
2title: "Immediate Binding Values"
3author: Luke Tierney
4output: html_document
5---
6
7## Background
8
9For scalar numerical code it can help to allow variable bindings to
10hold scalar integer, logical, and double values as immediate values
11rather than as allocated scalar vectors, or _boxed_ values. This
12eliminates the overhead of checking whether they might be shared or
13have attributes. It also makes inlining scalar computations for basic
14arithmetic operations and element access in the byte code engine more
15effective. The combined benefit can be as high as 20% for some
16examples, including the convolution example from the extensions
17manual. Having immediate bindings also allows some brittle
18optimizations for updating scalar variable bindings and loop indices
19to be removed.
20
(This note reflects changes committed to R_devel in r77327.)
22
23## Interface
24
25Binding cells have a marker that is returned by `BNDCELL_TAG`. The
26marker, or tag, is zero for standard bindings, and one of `REALSXP`,
27`INTSXP`, or `LGLSXP` for immediate bindings.
28
29`BINDING_VALUE`, used only in `eval.c` and `envir.c`, always returns
30an allocated object as the value of a binding. For immediate bindings
31it first converts to a standard binding by allocating and installing a
32scalar vector of the appropriate type. This allows most code to be
33unaware of the existence of typed bindings.  The allocation is done by
34`R_expand_binding_value`.
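
The boxing step described above can be illustrated with a small standalone model. This is only a sketch: every name beginning with `toy_` is hypothetical and does not appear in the R sources, `malloc` stands in for R's allocator, and garbage collection is ignored entirely.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins; none of these names come from the R sources. */
enum toy_tag { TOY_BOXED = 0, TOY_INT, TOY_LGL, TOY_REAL };

/* A heap-allocated scalar, playing the role of a length-one vector. */
typedef struct toy_box {
    enum toy_tag type;
    union { int i; double d; } v;
} toy_box;

/* A binding cell: the tag says which member of `u` is live. */
typedef struct toy_cell {
    enum toy_tag tag;   /* TOY_BOXED (zero) marks a standard binding */
    union { toy_box *boxed; int ival; double dval; } u;
} toy_cell;

/* Convert an immediate binding into a standard one, in place. */
void toy_expand_binding_value(toy_cell *cell)
{
    if (cell->tag == TOY_BOXED) return;     /* already standard */
    toy_box *box = malloc(sizeof *box);
    if (box == NULL) abort();
    box->type = cell->tag;
    if (cell->tag == TOY_REAL) box->v.d = cell->u.dval;
    else box->v.i = cell->u.ival;           /* integer and logical share an int */
    cell->u.boxed = box;                    /* install the allocated value */
    cell->tag = TOY_BOXED;                  /* the cell is now a standard binding */
}

/* Always return an allocated object, boxing first if needed. */
toy_box *toy_binding_value(toy_cell *cell)
{
    if (cell->tag != TOY_BOXED)
        toy_expand_binding_value(cell);
    return cell->u.boxed;
}
```

A caller that only ever goes through `toy_binding_value` never sees the tag, which is the property that lets most code ignore immediate bindings. A second call on the same cell returns the same allocated object because the cell has already been converted.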
35
36Code that wants to take advantage of typed bindings can read and set
37their values with
38
39- `INTSXP`: `BNDCELL_IVAL(cell)`, `SET_BNDCELL_IVAL(cell, val)`
40- `LGLSXP`: `BNDCELL_LVAL(cell)`, `SET_BNDCELL_LVAL(cell, val)`
- `REALSXP`: `BNDCELL_DVAL(cell)`, `SET_BNDCELL_DVAL(cell, val)`
42
43These do not check or set the type tag. To create and initialize a new
44immediate binding in a cell use
45
46- `INTSXP`: `NEW_BNDCELL_IVAL(cell, val)`
47- `LGLSXP`: `NEW_BNDCELL_LVAL(cell, val)`
- `REALSXP`: `NEW_BNDCELL_DVAL(cell, val)`
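
The division of labor between the two families of macros might look roughly like the following. This is a sketch under stated assumptions, not the definitions in the R sources: the `DEMO_` names and the `demo_cell` layout are hypothetical, and the initializers are assumed to set the tag before storing the value.

```c
#include <assert.h>

/* Hypothetical tags; zero marks a standard binding. */
enum demo_tag { DEMO_STANDARD = 0, DEMO_INT, DEMO_LGL, DEMO_REAL };

/* Hypothetical cell layout with the value slot as a union. */
typedef struct demo_cell {
    enum demo_tag tag;
    union { void *sxpval; int ival; int lval; double dval; } car;
} demo_cell;

/* Raw accessors: they neither check nor update the tag. */
#define DEMO_IVAL(cell)           ((cell)->car.ival)
#define DEMO_DVAL(cell)           ((cell)->car.dval)
#define DEMO_SET_IVAL(cell, val)  ((cell)->car.ival = (val))
#define DEMO_SET_DVAL(cell, val)  ((cell)->car.dval = (val))

/* Initializers: set the tag, then store the value. */
#define DEMO_NEW_IVAL(cell, val) \
    do { (cell)->tag = DEMO_INT; DEMO_SET_IVAL(cell, val); } while (0)
#define DEMO_NEW_DVAL(cell, val) \
    do { (cell)->tag = DEMO_REAL; DEMO_SET_DVAL(cell, val); } while (0)

/* Caller pattern: use the fast path only when the tag matches.
   Returns 1 if the increment was done in place, 0 otherwise. */
int demo_increment_if_int(demo_cell *cell)
{
    if (cell->tag != DEMO_INT)
        return 0;                 /* caller falls back to generic code */
    DEMO_SET_IVAL(cell, DEMO_IVAL(cell) + 1);
    return 1;
}
```

Because the raw accessors trust the caller, the tag check belongs at the call site, as in `demo_increment_if_int`. Reading the wrong union member after a type change would silently return garbage, which is why a separate initializer that sets the tag is useful.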
49
50The generic `CAR` accessor has been modified to signal an error if it
51encounters a cell with an immediate `CAR` value. This ensures
52immediate values are only used in the context of bindings. This makes
53it easier to avoid inadvertent boxing and may help with a transition
54to a different environment and binding representation.
55
56The setters, such as `SETCAR`, clear an immediate binding marker
57without signaling an error.
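
The asymmetry between the generic reader and the generic setters can be modeled in a few lines. The sketch below is illustrative only: the `guard_` names are hypothetical, and an error counter stands in for signaling an R error.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cell: tag 0 is standard, nonzero is immediate. */
typedef struct guard_cell {
    int tag;
    union { void *ptr; double dval; } car;
} guard_cell;

/* Number of errors raised; real code would signal an R error instead. */
int guard_error_count = 0;

/* Generic reader: refuses to treat an immediate value as a pointer. */
void *guard_car(guard_cell *cell)
{
    if (cell->tag != 0) {
        guard_error_count++;      /* stand-in for signaling an error */
        return NULL;
    }
    return cell->car.ptr;
}

/* Generic writer: storing a pointer makes the cell standard again. */
void guard_setcar(guard_cell *cell, void *value)
{
    cell->tag = 0;                /* clear the immediate marker silently */
    cell->car.ptr = value;
}
```

Failing loudly on reads catches code that would otherwise interpret the bits of a `double` as a pointer, while clearing the tag on writes keeps the cell consistent because the old immediate value is being overwritten anyway.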
58
59
60## Notes
61
62  - For now, the `sxpinfo.extra` field is used to hold the binding
63    tag.
64
65  - Two implementations are provided for representing the immediate
    values. One replaces the `SEXP` `CAR` field by a union; the other
67    allocates a boxed value. The union representation is conceptually
68    more natural and a little more efficient. But it would require a
69    change in memory layout on 32-bit platforms since the union
70    requires 8 bytes for the `double` value while a pointer only
71    requires 4 bytes. On 64-bit hardware the union approach should not
72    change the memory layout.
73
74    For now, the union approach is used on 64-bit platforms and the
75    boxed approach on 32-bit ones. It would be best to use the union
76    approach unconditionally, but this would require changing the
77    binary version and rebuilding all packages with compiled code.
78    This should probably be done before release.
79
80  - The approach taken for now is to just allow immediate values in
81    the `CAR` of binding cells. An alternative would be to allow
82    immediate values in all `CONS` cells, or even more widely, such as
    in vector elements. Allowing immediate values in all `CONS` cells
    would have been a little simpler. But it would have made it harder
85    to detect unintended boxing, and might also have made it harder to
86    transition to an alternate environment or binding representation
87    should we wish to do that.
88
89    If immediate values were to be supported more widely it would
90    probably be necessary to suspend the GC when boxing values in
91    `R_expand_binding_value`.
92
93  - Serialization handles environment frames with standard pairlist
    code, so the code now checks for an immediate binding and boxes
95    the value if necessary. An alternative would be to update the
96    serialization format to support immediate bindings. But given how
97    challenging it is to change the format it seemed best just to box.
98
99  - Only unlocked standard environment bindings that can be cached can
100    be turned into immediate bindings. Symbol bindings for the base
    environment are not cached, and bindings for user databases are
    locked when returned by `findVarLoc` or `findVarLocInFrame`, so
103    neither of these can become immediate bindings.
104
105  - `BINDING_VALUE` is defined slightly differently in `eval.c` and
106    `envir.c`. It would be good to unify these eventually.
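
The size argument in the note on the two representations can be checked on a given platform with a few lines of C. This is a minimal sketch: `car_union` and `union_changes_layout` are hypothetical names, and the union only approximates the members the real field would need.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical union that would replace a plain pointer field. */
union car_union { void *sxpval; int ival; double dval; };

/* Nonzero when the union is larger than the pointer it would replace,
   that is, when switching representations would change the cell layout. */
int union_changes_layout(void)
{
    return sizeof(union car_union) > sizeof(void *);
}
```

Where pointers and `double` values are both 8 bytes the function is expected to return 0, and where pointers are 4 bytes it is expected to return 1, which matches the reasoning for choosing the representation by platform.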
107
108<!--
109Local Variables:
110mode: poly-markdown+R
111mode: flyspell
112End:
113-->
114