---
title: "Immediate Binding Values"
author: Luke Tierney
output: html_document
---

## Background

For scalar numerical code it can help to allow variable bindings to
hold scalar integer, logical, and double values as immediate values
rather than as allocated scalar vectors, or _boxed_ values. This
eliminates the overhead of checking whether they might be shared or
have attributes. It also makes inlining scalar computations for basic
arithmetic operations and element access in the byte code engine more
effective. The combined benefit can be as high as 20% for some
examples, including the convolution example from the extensions
manual. Having immediate bindings also allows some brittle
optimizations for updating scalar variable bindings and loop indices
to be removed.

(This note reflects changes committed to R_devel in r77327.)

## Interface

Binding cells have a marker that is returned by `BNDCELL_TAG`. The
marker, or tag, is zero for standard bindings, and one of `REALSXP`,
`INTSXP`, or `LGLSXP` for immediate bindings.

`BINDING_VALUE`, used only in `eval.c` and `envir.c`, always returns
an allocated object as the value of a binding. For immediate bindings
it first converts to a standard binding by allocating and installing a
scalar vector of the appropriate type. This allows most code to be
unaware of the existence of typed bindings. The allocation is done by
`R_expand_binding_value`.

Code that wants to take advantage of typed bindings can read and set
their values with

- `INTSXP`: `BNDCELL_IVAL(cell)`, `SET_BNDCELL_IVAL(cell, val)`
- `LGLSXP`: `BNDCELL_LVAL(cell)`, `SET_BNDCELL_LVAL(cell, val)`
- `REALSXP`: `BNDCELL_DVAL(cell)`, `SET_BNDCELL_DVAL(cell, val)`

These do not check or set the type tag.
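
As a rough illustration of the tag-plus-payload idea described above, the following self-contained C sketch models a binding cell outside of R. Every `toy_*` and `TOY_*` name is hypothetical and invented for this example; it is not R's implementation, and the real macros live in R's internal headers. It shows unchecked raw accessors and boxing on demand in the spirit of `BINDING_VALUE`.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical tags: 0 marks a standard (boxed) binding, as in the note. */
enum toy_tag { TOY_BOXED = 0, TOY_LGL, TOY_INT, TOY_REAL };

/* Toy stand-in for an allocated scalar vector. */
typedef struct toy_box {
    enum toy_tag type;
    union { int i; double d; } data;
} toy_box;

/* Toy binding cell: a tag plus either a box pointer or an immediate value. */
typedef struct toy_cell {
    enum toy_tag tag;
    union { toy_box *boxed; int ival; int lval; double dval; } car;
} toy_cell;

/* Raw accessors: like the typed accessors in the note, they neither
   check nor set the tag. */
#define TOY_IVAL(c) ((c)->car.ival)
#define TOY_LVAL(c) ((c)->car.lval)
#define TOY_DVAL(c) ((c)->car.dval)
#define TOY_SET_IVAL(c, v) ((c)->car.ival = (v))
#define TOY_SET_LVAL(c, v) ((c)->car.lval = (v))
#define TOY_SET_DVAL(c, v) ((c)->car.dval = (v))

/* Convert an immediate binding into a standard one by allocating a box. */
static void toy_expand_binding_value(toy_cell *c)
{
    assert(c != NULL);
    if (c->tag == TOY_BOXED)
        return;                      /* already a standard binding */
    toy_box *b = malloc(sizeof *b);
    if (b == NULL)
        abort();
    b->type = c->tag;
    if (c->tag == TOY_REAL)
        b->data.d = c->car.dval;
    else if (c->tag == TOY_INT)
        b->data.i = c->car.ival;
    else
        b->data.i = c->car.lval;
    c->car.boxed = b;                /* install the box ... */
    c->tag = TOY_BOXED;              /* ... and clear the immediate tag */
}

/* Always hand back a boxed value, so callers need not know about tags. */
static toy_box *toy_binding_value(toy_cell *c)
{
    if (c->tag != TOY_BOXED)
        toy_expand_binding_value(c);
    return c->car.boxed;
}
```

In this toy model, a cell tagged `TOY_REAL` whose payload was set with `TOY_SET_DVAL(&cell, 2.5)` is boxed by the first call to `toy_binding_value(&cell)`, which resets the tag to `TOY_BOXED`; later calls return the same box without allocating again.
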
To create and initialize a new
immediate binding in a cell use

- `INTSXP`: `NEW_BNDCELL_IVAL(cell, val)`
- `LGLSXP`: `NEW_BNDCELL_LVAL(cell, val)`
- `REALSXP`: `NEW_BNDCELL_DVAL(cell, val)`

The generic `CAR` accessor has been modified to signal an error if it
encounters a cell with an immediate `CAR` value. This ensures
immediate values are only used in the context of bindings. This makes
it easier to avoid inadvertent boxing and may help with a transition
to a different environment and binding representation.

The setters, such as `SETCAR`, clear an immediate binding marker
without signaling an error.


## Notes

 - For now, the `sxpinfo.extra` field is used to hold the binding
   tag.

 - Two implementations are provided for representing the immediate
   values. One replaces the `SEXP` `CAR` field by a union; the other
   allocates a boxed value. The union representation is conceptually
   more natural and a little more efficient. But it would require a
   change in memory layout on 32-bit platforms since the union
   requires 8 bytes for the `double` value while a pointer only
   requires 4 bytes. On 64-bit hardware the union approach should not
   change the memory layout.

   For now, the union approach is used on 64-bit platforms and the
   boxed approach on 32-bit ones. It would be best to use the union
   approach unconditionally, but this would require changing the
   binary version and rebuilding all packages with compiled code.
   This should probably be done before release.

 - The approach taken for now is to just allow immediate values in
   the `CAR` of binding cells. An alternative would be to allow
   immediate values in all `CONS` cells, or even more widely, such as
   in vector elements. Allowing immediate values in all `CONS` cells
   would have been a little simpler.
   But it would have made it harder
   to detect unintended boxing, and might also have made it harder to
   transition to an alternate environment or binding representation
   should we wish to do that.

   If immediate values were to be supported more widely it would
   probably be necessary to suspend the GC when boxing values in
   `R_expand_binding_value`.

 - Serialization handles environment frames with standard pairlist
   code, so the code now checks for an immediate binding and boxes
   the value if necessary. An alternative would be to update the
   serialization format to support immediate bindings. But given how
   challenging it is to change the format it seemed best just to box.

 - Only unlocked standard environment bindings that can be cached can
   be turned into immediate bindings. Symbol bindings for the base
   environment are not cached, and bindings for user databases are
   locked when returned by `findVarLoc` or `findVarLocInFrame`, so
   neither of these can become immediate bindings.

 - `BINDING_VALUE` is defined slightly differently in `eval.c` and
   `envir.c`. It would be good to unify these eventually.

<!--
Local Variables:
mode: poly-markdown+R
mode: flyspell
End:
-->
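
The layout concern raised in the Notes about the union representation can be checked with a small standalone C sketch. The names `toy_node`, `toy_union_car`, and `toy_union_changes_layout` are hypothetical and are not taken from R's headers; the sketch only compares the size of a pointer-sized `CAR` slot with a union that can also hold a `double`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node type; only pointers to it are used here. */
typedef struct toy_node toy_node;

/* A CAR slot that can hold a pointer or an immediate scalar value. */
union toy_union_car {
    toy_node *sxpval;
    double dval;
    int ival;
};

/* Nonzero when replacing a plain pointer slot by the union would
   enlarge the slot, and so change the node's memory layout. */
static int toy_union_changes_layout(void)
{
    return sizeof(union toy_union_car) != sizeof(toy_node *);
}
```

On a platform with 8-byte pointers and 8-byte doubles the function should return 0, while with 4-byte pointers the union grows to hold the `double` and the function should return 1, which matches the 32-bit versus 64-bit distinction drawn above.
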