1# Region inference (NLL) 2 3<!-- toc --> 4 5The MIR-based region checking code is located in [the `rustc_mir::borrow_check` 6module][nll]. 7 8[nll]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_borrowck/index.html 9 10The MIR-based region analysis consists of two major functions: 11 12- [`replace_regions_in_mir`], invoked first, has two jobs: 13 - First, it finds the set of regions that appear within the 14 signature of the function (e.g., `'a` in `fn foo<'a>(&'a u32) { 15 ... }`). These are called the "universal" or "free" regions – in 16 particular, they are the regions that [appear free][fvb] in the 17 function body. 18 - Second, it replaces all the regions from the function body with 19 fresh inference variables. This is because (presently) those 20 regions are the results of lexical region inference and hence are 21 not of much interest. The intention is that – eventually – they 22 will be "erased regions" (i.e., no information at all), since we 23 won't be doing lexical region inference at all. 24- [`compute_regions`], invoked second: this is given as argument the 25 results of move analysis. It has the job of computing values for all 26 the inference variables that `replace_regions_in_mir` introduced. 27 - To do that, it first runs the [MIR type checker]. This is 28 basically a normal type-checker but specialized to MIR, which is 29 much simpler than full Rust, of course. Running the MIR type 30 checker will however create various [constraints][cp] between region 31 variables, indicating their potential values and relationships to 32 one another. 33 - After this, we perform [constraint propagation][cp] by creating a 34 [`RegionInferenceContext`] and invoking its [`solve`] 35 method. 36 - The [NLL RFC] also includes fairly thorough (and hopefully readable) 37 coverage. 38 39[cp]: ./region_inference/constraint_propagation.md 40[fvb]: ../appendix/background.md#free-vs-bound 41[`replace_regions_in_mir`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_borrowck/nll/fn.replace_regions_in_mir.html 42[`compute_regions`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_borrowck/nll/fn.compute_regions.html 43[`RegionInferenceContext`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_borrowck/region_infer/struct.RegionInferenceContext.html 44[`solve`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_borrowck/region_infer/struct.RegionInferenceContext.html#method.solve 45[NLL RFC]: https://rust-lang.github.io/rfcs/2094-nll.html 46[MIR type checker]: ./type_check.md 47 48## Universal regions 49 50The [`UniversalRegions`] type represents a collection of _universal_ regions 51corresponding to some MIR `DefId`. It is constructed in 52[`replace_regions_in_mir`] when we replace all regions with fresh inference 53variables. [`UniversalRegions`] contains indices for all the free regions in 54the given MIR along with any relationships that are _known_ to hold between 55them (e.g. implied bounds, where clauses, etc.). 56 57For example, given the MIR for the following function: 58 59```rust 60fn foo<'a>(x: &'a u32) { 61 // ... 62} 63``` 64 65we would create a universal region for `'a` and one for `'static`. There may 66also be some complications for handling closures, but we will ignore those for 67the moment. 68 69TODO: write about _how_ these regions are computed. 70 71[`UniversalRegions`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_borrowck/universal_regions/struct.UniversalRegions.html 72 73<a name="region-variables"></a> 74 75## Region variables 76 77The value of a region can be thought of as a **set**. This set contains all 78points in the MIR where the region is valid along with any regions that are 79outlived by this region (e.g. if `'a: 'b`, then `end('b)` is in the set for 80`'a`); we call the domain of this set a `RegionElement`. In the code, the value 81for all regions is maintained in [the 82`rustc_mir::borrow_check::nll::region_infer` module][ri]. For each region we 83maintain a set storing what elements are present in its value (to make this 84efficient, we give each kind of element an index, the `RegionElementIndex`, and 85use sparse bitsets). 86 87[ri]: https://github.com/rust-lang/rust/tree/master/compiler/rustc_mir/src/borrow_check/region_infer/ 88 89The kinds of region elements are as follows: 90 91- Each **[`location`]** in the MIR control-flow graph: a location is just 92 the pair of a basic block and an index. This identifies the point 93 **on entry** to the statement with that index (or the terminator, if 94 the index is equal to `statements.len()`). 95- There is an element `end('a)` for each universal region `'a`, 96 corresponding to some portion of the caller's (or caller's caller, 97 etc) control-flow graph. 98- Similarly, there is an element denoted `end('static)` corresponding 99 to the remainder of program execution after this function returns. 100- There is an element `!1` for each placeholder region `!1`. This 101 corresponds (intuitively) to some unknown set of other elements – 102 for details on placeholders, see the section 103 [placeholders and universes](region_inference/placeholders_and_universes.md). 104 105## Constraints 106 107Before we can infer the value of regions, we need to collect 108constraints on the regions. The full set of constraints is described 109in [the section on constraint propagation][cp], but the two most 110common sorts of constraints are: 111 1121. Outlives constraints. These are constraints that one region outlives another 113 (e.g. `'a: 'b`). Outlives constraints are generated by the [MIR type 114 checker]. 1152. Liveness constraints. Each region needs to be live at points where it can be 116 used. These constraints are collected by [`generate_constraints`]. 117 118[`generate_constraints`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_borrowck/constraint_generation/fn.generate_constraints.html 119 120## Inference Overview 121 122So how do we compute the contents of a region? This process is called _region 123inference_. The high-level idea is pretty simple, but there are some details we 124need to take care of. 125 126Here is the high-level idea: we start off each region with the MIR locations we 127know must be in it from the liveness constraints. From there, we use all of the 128outlives constraints computed from the type checker to _propagate_ the 129constraints: for each region `'a`, if `'a: 'b`, then we add all elements of 130`'b` to `'a`, including `end('b)`. This all happens in 131[`propagate_constraints`]. 132 133Then, we will check for errors. We first check that type tests are satisfied by 134calling [`check_type_tests`]. This checks constraints like `T: 'a`. Second, we 135check that universal regions are not "too big". This is done by calling 136[`check_universal_regions`]. This checks that for each region `'a` if `'a` 137contains the element `end('b)`, then we must already know that `'a: 'b` holds 138(e.g. from a where clause). If we don't already know this, that is an error... 139well, almost. There is some special handling for closures that we will discuss 140later. 141 142### Example 143 144Consider the following example: 145 146```rust,ignore 147fn foo<'a, 'b>(x: &'a usize) -> &'b usize { 148 x 149} 150``` 151 152Clearly, this should not compile because we don't know if `'a` outlives `'b` 153(if it doesn't then the return value could be a dangling reference). 154 155Let's back up a bit. We need to introduce some free inference variables (as is 156done in [`replace_regions_in_mir`]). This example doesn't use the exact regions 157produced, but it (hopefully) is enough to get the idea across. 158 159```rust,ignore 160fn foo<'a, 'b>(x: &'a /* '#1 */ usize) -> &'b /* '#3 */ usize { 161 x // '#2, location L1 162} 163``` 164 165Some notation: `'#1`, `'#3`, and `'#2` represent the universal regions for the 166argument, return value, and the expression `x`, respectively. Additionally, I 167will call the location of the expression `x` `L1`. 168 169So now we can use the liveness constraints to get the following starting points: 170 171Region | Contents 172--------|---------- 173'#1 | 174'#2 | `L1` 175'#3 | `L1` 176 177Now we use the outlives constraints to expand each region. Specifically, we 178know that `'#2: '#3` ... 179 180Region | Contents 181--------|---------- 182'#1 | `L1` 183'#2 | `L1, end('#3) // add contents of '#3 and end('#3)` 184'#3 | `L1` 185 186... and `'#1: '#2`, so ... 187 188Region | Contents 189--------|---------- 190'#1 | `L1, end('#2), end('#3) // add contents of '#2 and end('#2)` 191'#2 | `L1, end('#3)` 192'#3 | `L1` 193 194Now, we need to check that no regions were too big (we don't have any type 195tests to check in this case). Notice that `'#1` now contains `end('#3)`, but 196we have no `where` clause or implied bound to say that `'a: 'b`... that's an 197error! 198 199### Some details 200 201The [`RegionInferenceContext`] type contains all of the information needed to 202do inference, including the universal regions from [`replace_regions_in_mir`] and 203the constraints computed for each region. It is constructed just after we 204compute the liveness constraints. 205 206Here are some of the fields of the struct: 207 208- [`constraints`]: contains all the outlives constraints. 209- [`liveness_constraints`]: contains all the liveness constraints. 210- [`universal_regions`]: contains the `UniversalRegions` returned by 211 [`replace_regions_in_mir`]. 212- [`universal_region_relations`]: contains relations known to be true about 213 universal regions. For example, if we have a where clause that `'a: 'b`, that 214 relation is assumed to be true while borrow checking the implementation (it 215 is checked at the caller), so `universal_region_relations` would contain `'a: 216 'b`. 217- [`type_tests`]: contains some constraints on types that we must check after 218 inference (e.g. `T: 'a`). 219- [`closure_bounds_mapping`]: used for propagating region constraints from 220 closures back out to the creator of the closure. 221 222[`constraints`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_borrowck/region_infer/struct.RegionInferenceContext.html#structfield.constraints 223[`liveness_constraints`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_borrowck/region_infer/struct.RegionInferenceContext.html#structfield.liveness_constraints 224[`location`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/mir/struct.Location.html 225[`universal_regions`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_borrowck/region_infer/struct.RegionInferenceContext.html#structfield.universal_regions 226[`universal_region_relations`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_borrowck/region_infer/struct.RegionInferenceContext.html#structfield.universal_region_relations 227[`type_tests`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_borrowck/region_infer/struct.RegionInferenceContext.html#structfield.type_tests 228[`closure_bounds_mapping`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_borrowck/region_infer/struct.RegionInferenceContext.html#structfield.closure_bounds_mapping 229 230TODO: should we discuss any of the others fields? What about the SCCs? 231 232Ok, now that we have constructed a `RegionInferenceContext`, we can do 233inference. This is done by calling the [`solve`] method on the context. This 234is where we call [`propagate_constraints`] and then check the resulting type 235tests and universal regions, as discussed above. 236 237[`propagate_constraints`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_borrowck/region_infer/struct.RegionInferenceContext.html#method.propagate_constraints 238[`check_type_tests`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_borrowck/region_infer/struct.RegionInferenceContext.html#method.check_type_tests 239[`check_universal_regions`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_borrowck/region_infer/struct.RegionInferenceContext.html#method.check_universal_regions 240