1# Region inference (NLL)
2
3<!-- toc -->
4
5The MIR-based region checking code is located in [the `rustc_mir::borrow_check`
6module][nll].
7
8[nll]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_borrowck/index.html
9
10The MIR-based region analysis consists of two major functions:
11
12- [`replace_regions_in_mir`], invoked first, has two jobs:
13  - First, it finds the set of regions that appear within the
14    signature of the function (e.g., `'a` in `fn foo<'a>(&'a u32) {
15    ... }`). These are called the "universal" or "free" regions – in
16    particular, they are the regions that [appear free][fvb] in the
17    function body.
18  - Second, it replaces all the regions from the function body with
19    fresh inference variables. This is because (presently) those
20    regions are the results of lexical region inference and hence are
21    not of much interest. The intention is that – eventually – they
22    will be "erased regions" (i.e., no information at all), since we
23    won't be doing lexical region inference at all.
24- [`compute_regions`], invoked second: this is given as argument the
25  results of move analysis. It has the job of computing values for all
26  the inference variables that `replace_regions_in_mir` introduced.
27  - To do that, it first runs the [MIR type checker]. This is
28    basically a normal type-checker but specialized to MIR, which is
29    much simpler than full Rust, of course. Running the MIR type
30    checker will however create various [constraints][cp] between region
31    variables, indicating their potential values and relationships to
32    one another.
33  - After this, we perform [constraint propagation][cp] by creating a
34    [`RegionInferenceContext`] and invoking its [`solve`]
35    method.
36  - The [NLL RFC] also includes fairly thorough (and hopefully readable)
37    coverage.
38
39[cp]: ./region_inference/constraint_propagation.md
40[fvb]: ../appendix/background.md#free-vs-bound
41[`replace_regions_in_mir`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_borrowck/nll/fn.replace_regions_in_mir.html
42[`compute_regions`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_borrowck/nll/fn.compute_regions.html
43[`RegionInferenceContext`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_borrowck/region_infer/struct.RegionInferenceContext.html
44[`solve`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_borrowck/region_infer/struct.RegionInferenceContext.html#method.solve
45[NLL RFC]: https://rust-lang.github.io/rfcs/2094-nll.html
46[MIR type checker]: ./type_check.md
47
48## Universal regions
49
50The [`UniversalRegions`] type represents a collection of _universal_ regions
51corresponding to some MIR `DefId`. It is constructed in
52[`replace_regions_in_mir`] when we replace all regions with fresh inference
53variables. [`UniversalRegions`] contains indices for all the free regions in
54the given MIR along with any relationships that are _known_ to hold between
55them (e.g. implied bounds, where clauses, etc.).
56
57For example, given the MIR for the following function:
58
59```rust
60fn foo<'a>(x: &'a u32) {
61    // ...
62}
63```
64
65we would create a universal region for `'a` and one for `'static`. There may
66also be some complications for handling closures, but we will ignore those for
67the moment.
68
69TODO: write about _how_ these regions are computed.
70
71[`UniversalRegions`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_borrowck/universal_regions/struct.UniversalRegions.html
72
73<a name="region-variables"></a>
74
75## Region variables
76
77The value of a region can be thought of as a **set**. This set contains all
78points in the MIR where the region is valid along with any regions that are
79outlived by this region (e.g. if `'a: 'b`, then `end('b)` is in the set for
80`'a`); we call the domain of this set a `RegionElement`. In the code, the value
81for all regions is maintained in [the
82`rustc_mir::borrow_check::nll::region_infer` module][ri]. For each region we
83maintain a set storing what elements are present in its value (to make this
84efficient, we give each kind of element an index, the `RegionElementIndex`, and
85use sparse bitsets).
86
87[ri]: https://github.com/rust-lang/rust/tree/master/compiler/rustc_mir/src/borrow_check/region_infer/
88
89The kinds of region elements are as follows:
90
91- Each **[`location`]** in the MIR control-flow graph: a location is just
92  the pair of a basic block and an index. This identifies the point
93  **on entry** to the statement with that index (or the terminator, if
94  the index is equal to `statements.len()`).
95- There is an element `end('a)` for each universal region `'a`,
96  corresponding to some portion of the caller's (or caller's caller,
97  etc) control-flow graph.
98- Similarly, there is an element denoted `end('static)` corresponding
99  to the remainder of program execution after this function returns.
100- There is an element `!1` for each placeholder region `!1`. This
101  corresponds (intuitively) to some unknown set of other elements –
102  for details on placeholders, see the section
103  [placeholders and universes](region_inference/placeholders_and_universes.md).
104
105## Constraints
106
107Before we can infer the value of regions, we need to collect
108constraints on the regions. The full set of constraints is described
109in [the section on constraint propagation][cp], but the two most
110common sorts of constraints are:
111
1121. Outlives constraints. These are constraints that one region outlives another
113   (e.g. `'a: 'b`). Outlives constraints are generated by the [MIR type
114   checker].
1152. Liveness constraints. Each region needs to be live at points where it can be
116   used. These constraints are collected by [`generate_constraints`].
117
118[`generate_constraints`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_borrowck/constraint_generation/fn.generate_constraints.html
119
120## Inference Overview
121
122So how do we compute the contents of a region? This process is called _region
123inference_. The high-level idea is pretty simple, but there are some details we
124need to take care of.
125
126Here is the high-level idea: we start off each region with the MIR locations we
127know must be in it from the liveness constraints. From there, we use all of the
128outlives constraints computed from the type checker to _propagate_ the
129constraints: for each region `'a`, if `'a: 'b`, then we add all elements of
130`'b` to `'a`, including `end('b)`. This all happens in
131[`propagate_constraints`].
132
133Then, we will check for errors. We first check that type tests are satisfied by
134calling [`check_type_tests`]. This checks constraints like `T: 'a`. Second, we
135check that universal regions are not "too big". This is done by calling
136[`check_universal_regions`]. This checks that for each region `'a` if `'a`
137contains the element `end('b)`, then we must already know that `'a: 'b` holds
138(e.g. from a where clause). If we don't already know this, that is an error...
139well, almost. There is some special handling for closures that we will discuss
140later.
141
142### Example
143
144Consider the following example:
145
146```rust,ignore
147fn foo<'a, 'b>(x: &'a usize) -> &'b usize {
148    x
149}
150```
151
152Clearly, this should not compile because we don't know if `'a` outlives `'b`
153(if it doesn't then the return value could be a dangling reference).
154
155Let's back up a bit. We need to introduce some free inference variables (as is
156done in [`replace_regions_in_mir`]). This example doesn't use the exact regions
157produced, but it (hopefully) is enough to get the idea across.
158
159```rust,ignore
160fn foo<'a, 'b>(x: &'a /* '#1 */ usize) -> &'b /* '#3 */ usize {
161    x // '#2, location L1
162}
163```
164
165Some notation: `'#1`, `'#3`, and `'#2` represent the universal regions for the
166argument, return value, and the expression `x`, respectively. Additionally, I
167will call the location of the expression `x` `L1`.
168
169So now we can use the liveness constraints to get the following starting points:
170
171Region  | Contents
172--------|----------
173'#1     |
174'#2     | `L1`
175'#3     | `L1`
176
177Now we use the outlives constraints to expand each region. Specifically, we
178know that `'#2: '#3` ...
179
180Region  | Contents
181--------|----------
182'#1     | `L1`
183'#2     | `L1, end('#3) // add contents of '#3 and end('#3)`
184'#3     | `L1`
185
186... and `'#1: '#2`, so ...
187
188Region  | Contents
189--------|----------
190'#1     | `L1, end('#2), end('#3) // add contents of '#2 and end('#2)`
191'#2     | `L1, end('#3)`
192'#3     | `L1`
193
194Now, we need to check that no regions were too big (we don't have any type
195tests to check in this case). Notice that `'#1` now contains `end('#3)`, but
196we have no `where` clause or implied bound to say that `'a: 'b`... that's an
197error!
198
199### Some details
200
201The [`RegionInferenceContext`] type contains all of the information needed to
202do inference, including the universal regions from [`replace_regions_in_mir`] and
203the constraints computed for each region. It is constructed just after we
204compute the liveness constraints.
205
206Here are some of the fields of the struct:
207
208- [`constraints`]: contains all the outlives constraints.
209- [`liveness_constraints`]: contains all the liveness constraints.
210- [`universal_regions`]: contains the `UniversalRegions` returned by
211  [`replace_regions_in_mir`].
212- [`universal_region_relations`]: contains relations known to be true about
213  universal regions. For example, if we have a where clause that `'a: 'b`, that
214  relation is assumed to be true while borrow checking the implementation (it
215  is checked at the caller), so `universal_region_relations` would contain `'a:
216  'b`.
217- [`type_tests`]: contains some constraints on types that we must check after
218  inference (e.g. `T: 'a`).
219- [`closure_bounds_mapping`]: used for propagating region constraints from
220  closures back out to the creator of the closure.
221
222[`constraints`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_borrowck/region_infer/struct.RegionInferenceContext.html#structfield.constraints
223[`liveness_constraints`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_borrowck/region_infer/struct.RegionInferenceContext.html#structfield.liveness_constraints
224[`location`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/mir/struct.Location.html
225[`universal_regions`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_borrowck/region_infer/struct.RegionInferenceContext.html#structfield.universal_regions
226[`universal_region_relations`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_borrowck/region_infer/struct.RegionInferenceContext.html#structfield.universal_region_relations
227[`type_tests`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_borrowck/region_infer/struct.RegionInferenceContext.html#structfield.type_tests
228[`closure_bounds_mapping`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_borrowck/region_infer/struct.RegionInferenceContext.html#structfield.closure_bounds_mapping
229
230TODO: should we discuss any of the others fields? What about the SCCs?
231
232Ok, now that we have constructed a `RegionInferenceContext`, we can do
233inference. This is done by calling the [`solve`] method on the context. This
234is where we call [`propagate_constraints`] and then check the resulting type
235tests and universal regions, as discussed above.
236
237[`propagate_constraints`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_borrowck/region_infer/struct.RegionInferenceContext.html#method.propagate_constraints
238[`check_type_tests`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_borrowck/region_infer/struct.RegionInferenceContext.html#method.check_type_tests
239[`check_universal_regions`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_borrowck/region_infer/struct.RegionInferenceContext.html#method.check_universal_regions
240