Skip to content

Commit 8c5f1a6

Browse files
authored
Merge pull request rust-lang#83 from RalfJung/stacked-borrows
Stacked Borrows: add code for the two core operations
2 parents 08dfba1 + 4769f56 commit 8c5f1a6

File tree

1 file changed

+202
-24
lines changed

1 file changed

+202
-24
lines changed

wip/stacked-borrows.md

Lines changed: 202 additions & 24 deletions
Original file line numberDiff line numberDiff line change
@@ -2,11 +2,15 @@
22

33
**Note:** This document is not normative nor endorsed by the UCG WG. It is maintained by @RalfJung to reflect what is currently implemented in [Miri].
44

5-
For more background on the how and why of Stacked Borrows, see this [blog post on the first implementation][stacked-borrows-1] and this [follow-up][stacked-borrows-2].
5+
This is not a guide!
6+
It is more of a reference.
7+
For a more guide-like introduction of Stacked Borrows, see this [blog post on the first implementation][stacked-borrows-1] and this [follow-up][stacked-borrows-2].
8+
Also see some [notes from a discussion at the 2019 Rust All-Hands][all-hands].
69

710
[Miri]: https://github.com/solson/miri/
811
[stacked-borrows-1]: https://www.ralfj.de/blog/2018/11/16/stacked-borrows-implementation.html
912
[stacked-borrows-2]: https://www.ralfj.de/blog/2018/12/26/stacked-borrows-barriers.html
13+
[all-hands]: https://paper.dropbox.com/doc/Topic-Stacked-borrows--AXAkoFfUGViWL_PaSryqKK~hAg-2q57v4UM7cIkxCq9PQc22
1014

1115
## Extra state
1216

@@ -15,31 +19,54 @@ Every pointer value has a *tag* (in addition to the location in memory that the
1519
Moreover, there is a per-call-frame `CallId` as well as some global tracking state.
1620

1721
```rust
22+
/// Timestamps: a monotonically increasing counter.
1823
// `nat` is the type of mathematical natural numbers, meaning we don't want to think about overflow.
1924
// NOTE: Miri just uses `u64` which, realistically, will not overflow because we only ever increment by 1.
2025
type Timestamp = nat;
2126

22-
// Extra per-pointer state (the "tag")
27+
/// Extra per-pointer state (the "tag"), which must match the stack of the locations that
28+
/// are accessed with this pointer.
2329
pub enum Borrow {
30+
/// A unique (mutable) reference.
2431
Uniq(Timestamp),
25-
Shr(Option<Timestamp>),
32+
/// An aliasing reference. This is also used by raw pointers, which do not track details
33+
/// of how or when they were created, hence the timestamp is optional.
34+
/// Shr(Some(_)) does NOT mean that the destination of this reference is frozen;
35+
/// that depends on the type! Only those parts outside of an `UnsafeCell` are actually
36+
/// frozen.
37+
Alias(Option<Timestamp>),
2638
}
2739

40+
/// An item on the per-location stack, controlling which pointers may access this location.
2841
pub enum BorStackItem {
42+
/// Indicates the unique reference that may mutate this location.
2943
Uniq(Timestamp),
30-
Shr,
44+
/// Indicates that the location has been mutably shared. Used for raw pointers as
45+
/// well as for unfrozen shared references.
46+
Raw,
47+
/// A barrier, tracking the function it belongs to. As long as that function executes,
48+
/// the barrier may not be popped off the stack.
3149
FnBarrier(CallId),
3250
}
33-
// Extra per-location state
51+
/// Per-location stack of borrow items, with the most recently created item on top.
52+
/// Making use of an item towards the bottom of the stack pops the top items,
53+
/// meaning the corresponding references may not be used any more.
54+
/// The stack may have exactly one "frozen" item on top, which is stored separately.
55+
/// The stack can also contain function barriers, which may not be popped while the
56+
/// function they belong to still runs.
3457
pub struct Stack {
3558
borrows: Vec<BorStackItem>, // used as a stack; never empty
3659
frozen_since: Option<Timestamp>, // virtual frozen "item" on top of the stack
3760
}
3861

39-
// Extra per-call-frame state
62+
/// Extra per-call-frame state: an ID uniquely identifying a stack frame.
63+
/// (The order of `CallId` does not matter, they are only ever compared for equality.)
4064
type CallId = nat;
4165

42-
// Extra global state
66+
/// Extra global state: the next `Timestamp`, as well as the next `CallId`.
67+
/// Both are just monotonically increasing counters, ensuring they are unique
68+
/// (for `CallId`) and properly ordered (for `Timestamp`).
69+
/// Also see the code for `increment_clock` and `new_call` below.
4370
pub struct Tracking {
4471
clock: Timestamp,
4572
next_call: CallId,
@@ -53,7 +80,7 @@ The stack of a location, and the tag of a pointer stored at some location, do no
5380
## Retag statement
5481

5582
Stacked Borrows introduces a new operation (a new MIR statement) on the Rust abstract machine.
56-
*Retagging* operates on a place, and it also carries a flag indicating the kind of retag that is being performed:
83+
*Retagging* operates on a place (Rust's equivalent of a C lvalue; see [the glossary](../reference/src/glossary.md)), and it also carries a flag indicating the kind of retag that is being performed:
5784

5885
```rust
5986
pub enum RetagKind {
@@ -83,7 +110,7 @@ pub enum RetagKind {
83110

84111
* A `Default` retag happens on the return value of every function that gets called (i.e., this is the first statement in the basic block that the call will return to).
85112

86-
* The automatically generated drop shims perform a `Raw` retag of their argument because they use it as a raw pointer.
113+
* The automatically generated drop shims (generated as the body of `ptr::real_drop_in_place`) perform a `Raw` retag of their argument because they use it as a raw pointer.
87114

88115
## Operational semantics
89116

@@ -120,14 +147,15 @@ impl Tracking {
120147

121148
This method will never return the same value twice.
122149
We say that a `CallId` is *active* if the call stack contains a stack frame with that ID.
150+
In the following, we pretend there exists a function `barrier_is_active(id)` that can check this.
123151

124152
**Note**: Miri uses a slightly more complex system with a `HashSet<CallId>` tracking the set of active `CallId`; that is just an optimization to avoid having to scan the call stack all the time.
125153

126154
### Allocating memory
127155

128156
When allocating memory, we have to initialize the `Stack` associated with the new locations, and we have to choose a `Borrow` (a tag) for the initial pointer to this memory.
129157

130-
For most memory, the stack of each freshly allocated memory location is `Stack { borrows: vec![Shr], frozen_since: None }`, and the initial pointer to that memory has tag `Shr(None)`.
158+
For most memory, the stack of each freshly allocated memory location is `Stack { borrows: vec![Raw], frozen_since: None }`, and the initial pointer to that memory has tag `Alias(None)`.
131159

132160
The only exception is stack memory.
133161
Stack memory is handled by an environment (which is part of the information carried in a stack frame of the Rust abstract machine) that maps each local variable to a place.
@@ -144,59 +172,209 @@ On every memory access, we perform the following extra check for every location
144172
(If this is a read access and we come here, the location is already unfrozen.)
145173
3. Pop the stack until the top item matches the tag of the pointer.
146174
- A `Uniq` item matches a `Uniq` tag with the same ID.
147-
- A `Shr` item matches any `Shr` tag (with or without timestamp).
148-
- When we are reading, a `Shr` item matches a `Uniq` tag.
175+
- A `Raw` item matches any `Alias` tag (with or without timestamp).
176+
- When we are reading, a `Raw` item matches a `Uniq` tag.
149177
- If we pop a `FnBarrier(c)` where `c` is active, we have undefined behavior.
150178

151179
If we pop the entire stack without finding a match, then we have undefined behavior.
152180

181+
The per-location part in code:
182+
183+
```rust
184+
pub enum AccessKind {
185+
Read,
186+
Write,
187+
Dealloc,
188+
}
189+
190+
impl Stack {
191+
fn access(
192+
&mut self,
193+
bor: Borrow,
194+
kind: AccessKind,
195+
) -> Result<(), String> {
196+
// Check if we can match the frozen "item".
197+
if self.frozen_since.is_some() {
198+
if kind == AccessKind::Read {
199+
// When we are frozen, we just accept all reads. No harm in this.
200+
// The deref already checked that `Uniq` items are in the stack, and that
201+
// the location is frozen if it should be.
202+
return Ok(());
203+
}
204+
}
205+
// Unfreeze on writes.
206+
self.frozen_since = None;
207+
// Pop the stack until we have something matching.
208+
while let Some(&itm) = self.borrows.last() {
209+
match (itm, bor) {
210+
(BorStackItem::FnBarrier(call), _) if barrier_is_active(call) => {
211+
return err!(
212+
"Stopping looking for borrow being accessed ({:?}) because of barrier ({})",
213+
bor, call
214+
)
215+
}
216+
(BorStackItem::Uniq(itm_t), Borrow::Uniq(bor_t)) if itm_t == bor_t => {
217+
// Found matching unique item. Continue after the match.
218+
}
219+
(BorStackItem::Raw, _) if kind == AccessKind::Read => {
220+
// When reading, everything can use a shared item!
221+
// We do not want to do this when writing: Writing to an `&mut`
222+
// should reaffirm its exclusivity (i.e., make sure it is
223+
// on top of the stack). Continue after the match.
224+
}
225+
(BorStackItem::Raw, Borrow::Alias(_)) => {
226+
// Found matching shared item. Continue after the match.
227+
}
228+
_ => {
229+
// Pop this, go on.
230+
self.borrows.pop().unwrap();
231+
continue
232+
}
233+
}
234+
// If we got here, we found a matching item. Congratulations!
235+
if kind == AccessKind::Dealloc { /* to be discussed later */ }
236+
return Ok(())
237+
}
238+
// If we got here, we did not find our item.
239+
err!(
240+
"Borrow being accessed ({:?}) does not exist on the stack",
241+
bor
242+
)
243+
}
244+
}
245+
```
246+
153247
### Dereferencing a pointer
154248

155249
Every time a pointer gets dereferenced (evaluating the `Deref` place projection), we determine the extent of memory that this pointer covers using `size_of_val` and then we perform the following check on every location covered by the reference:
156250

157251
1. The location must exist, i.e., the pointer must actually be dereferencable for this entire memory range it covers.
158252
2. If this is a raw pointer, stop here. Raw accesses are checked as little as possible.
159-
3. If this is a unique reference and the tag is `Shr(Some(_))`, that's an error.
253+
3. If this is a unique reference and the tag is `Alias(Some(_))`, that's an error.
160254
4. If the tag is `Uniq`, make sure there is a matching `Uniq` item with the same ID on the stack.
161-
5. If the tag is `Shr(None)`, make sure that either the location is frozen or else there is a `Shr` item on the stack.
162-
6. If the tag is `Shr(Some(t))`, then the check depends on whether the location is inside an `UnsafeCell` or not, according to the type of the reference.
255+
5. If the tag is `Alias(None)`, make sure that either the location is frozen or else there is a `Raw` item on the stack.
256+
6. If the tag is `Alias(Some(t))`, then the check depends on whether the location is inside an `UnsafeCell` or not, according to the type of the reference.
163257
- Locations outside `UnsafeCell` must have `frozen_since` set to `t` or an older timestamp.
164-
- `UnsafeCell` locations must either be frozen or else have a `Shr` item in their stack (same check as if the tag had no timestamp).
258+
- `UnsafeCell` locations must either be frozen or else have a `Raw` item in their stack (same check as if the tag had no timestamp).
165259

166260
Whenever we are checking whether an item is in the stack, we ignore barriers.
167261
Failing any of these checks means we have undefined behavior.
168262

263+
The per-location part (starting at step 3 above) in code:
264+
265+
```rust
266+
pub enum RefKind {
267+
/// &mut
268+
Unique,
269+
/// & without interior mutability
270+
Frozen,
271+
/// * (raw pointer) or & to `UnsafeCell`
272+
Raw,
273+
}
274+
275+
impl Stack {
276+
fn deref(
277+
&self,
278+
bor: Borrow,
279+
kind: RefKind,
280+
) -> Result<Option<usize>, String> {
281+
// Exclude unique ref with frozen tag.
282+
if let (RefKind::Unique, Borrow::Alias(Some(_))) = (kind, bor) {
283+
return err!("Encountered mutable reference with frozen tag ({:?})", bor);
284+
}
285+
// Checks related to freezing
286+
match bor {
287+
Borrow::Alias(Some(bor_t)) if kind == RefKind::Frozen => {
288+
// We need the location to be frozen.
289+
let frozen = self.frozen_since.map_or(false, |itm_t| itm_t <= bor_t);
290+
return if frozen { Ok(None) } else {
291+
err!("Location is not frozen long enough")
292+
}
293+
}
294+
Borrow::Alias(_) if self.frozen_since.is_some() => {
295+
return Ok(None) // Shared deref to frozen location, looking good
296+
}
297+
_ => {} // Not sufficient, go on looking.
298+
}
299+
// If we got here, we have to look for our item in the stack.
300+
for (idx, &itm) in self.borrows.iter().enumerate().rev() {
301+
match (itm, bor) {
302+
(BorStackItem::Uniq(itm_t), Borrow::Uniq(bor_t)) if itm_t == bor_t => {
303+
// Found matching unique item.
304+
return Ok(Some(idx))
305+
}
306+
(BorStackItem::Raw, Borrow::Alias(_)) => {
307+
// Found matching raw item.
308+
return Ok(Some(idx))
309+
}
310+
// Go on looking. We ignore barriers! When an `&mut` and an `&` alias,
311+
// dereferencing the `&` is still possible (to reborrow), but doing
312+
// an access is not.
313+
_ => {}
314+
}
315+
}
316+
// If we got here, we did not find our item. We have to error to satisfy U3.
317+
err!("Borrow being dereferenced ({:?}) does not exist on the stack", bor)
318+
}
319+
}
320+
```
321+
169322
### Reborrowing
170323

171324
We define the notion of "reborrowing", which will be used below to define the semantics of `Retag`.
172325
Reborrowing takes a (typed) place, whether to push a barrier, and the new borrow that this place is to be reborrowed for.
173-
we determine the extent of memory that this place covers using `size_of_val` and then we perform the following actions on every location covered by the place:
326+
We determine the extent of memory that this place covers using `size_of_val` and then we perform the following actions on every location covered by the place:
174327

175328
1. Perform the checks that would also happen on a dereference.
176329
Remember the position of the item matching the tag in the stack.
177330
2. Redundancy check, only happens if we will not push a barrier: if the new tag passes the checks performed on a dereference, and if the item that makes this check succeed is *above* the one we remembered in step 1 (where the "frozen" state is considered above every item in the stack), then stop.
178331
We are done for this location.
179-
This can only happen for shared references (i.e., when the borrow is `Shr(_)`).
180-
3. Perform the actions that would also happen when an actual access happens through this reference (for shared references with borrow `Shr(_)` this is a read access, for mutable references with borrow `Uniq(_)` it is a write access).
181-
Now the location cannot be frozen any more: if the new borrow is `Uniq(_)`, we just unfroze; if it is `Shr(_)` and the location was already frozen, then the redundancy check (step 3) would have kicked in.
332+
This can only happen for shared references (i.e., when the borrow is `Alias(_)`).
333+
3. Perform the actions that would also happen when an actual access happens through this reference (for shared references with borrow `Alias(_)` this is a read access, for mutable references with borrow `Uniq(_)` it is a write access).
334+
Now the location cannot be frozen any more: if the new borrow is `Uniq(_)`, we just unfroze; if it is `Alias(_)` and the location was already frozen, then the redundancy check (step 3) would have kicked in.
182335
4. If we want to push a barrier, push `FnBarrier(c)` to the location stack where `c` is the `CallId` if the current function call (i.e., of the topmost frame in the call stack).
183-
5. Check if the new tag is `Shr(Some(t))` and the location is inside an `UnsafeCell`.
336+
5. Check if the new tag is `Alias(Some(t))` and the location is inside an `UnsafeCell`.
184337
- If both conditions are satisfied, freeze the location with timestamp `t`. If it is already frozen, do nothing.
185-
- Otherwise, push a new item onto the stack: `Shr` if the tag is a `Shr(_)`, `Uniq(id)` if the tag is `Uniq(id)`.
338+
- Otherwise, push a new item onto the stack: `Raw` if the tag is a `Alias(_)`, `Uniq(id)` if the tag is `Uniq(id)`.
186339

187340
### Retagging
188341

189342
When executing `Retag(kind, place)`, we recursively visit all fields of this place, descending into compound types (`struct`, `enum`, arrays and so on) but not below any pointers.
190343
For each reference (`&[mut] _`) and box (`Box<_>`) we encounter, and if `kind == Raw` also for each raw pointer (`*[const,mut] _`), we perform the following steps:
191344

192-
1. We compute a fresh tag: `Uniq(_)` for mutable references, `Box`, `Shr(Some(_))` for shared references, and `Shr(None)` for raw pointers.
345+
1. We compute a fresh tag: `Uniq(_)` for mutable references, `Box`, `Alias(Some(_))` for shared references, and `Alias(None)` for raw pointers.
193346
2. We determine if we will want to push a barrier.
194347
This is the case only if all of the following conditions are satisfied: `kind == FnBarrier`, the type of this pointer is a reference (not a box), and if this is a shared reference then we are not inside an `UnsafeCell`.
195348
3. We perform reborrowing with the new tag and indicating whether we ant a barrier pushed or not.
196-
4. If `kind == TwoPhase`, we perform *another* reborrow with the tag being `Shr(Some(t))` for some fresh timestamp `t`, and not pushing new barriers.
349+
4. If `kind == TwoPhase`, we perform *another* reborrow with the tag being `Alias(Some(t))` for some fresh timestamp `t`, and not pushing new barriers.
197350

198351
### Deallocating memory
199352

200353
Memory deallocation first acts like a write access through the pointer used for deallocation.
201354
After that is done, we additionally check all `FnBarrier(c)` occurring in any stack on any of the deallocated locations.
202355
If any of the `c` is still active, we have undefined behavior.
356+
357+
In code, do the following just before returning `Ok()` from `access`:
358+
```rust
359+
if kind == AccessKind::Dealloc {
360+
for &itm in self.borrows.iter().rev() {
361+
match itm {
362+
BorStackItem::FnBarrier(call) if barrier_tracking.is_active(call) => {
363+
return err!(
364+
"Deallocating with active barrier ({})", call
365+
)
366+
}
367+
_ => {},
368+
}
369+
}
370+
}
371+
```
372+
373+
## Adjustments to libstd
374+
375+
libstd needed some patches to comply with this model. These provide a good opportunity to review if we are okay with the requirements that Stacked Borrows places onto unsafe code.
376+
377+
* [`VecDeque` creating overlapping mutable references](https://github.com/rust-lang/rust/pull/56161)
378+
* [Futures turning a shared reference into a mutable one](https://github.com/rust-lang/rust/pull/56319)
379+
* [`str` turning a shared reference into a mutable one](https://github.com/rust-lang/rust/pull/58200)
380+
* [`BTreeMap` creating mutable references that overlap with shared references](https://github.com/rust-lang/rust/pull/58431)

0 commit comments

Comments
 (0)