Skip to content

Commit 7eabb5a

Browse files
RalfJungavadacatavra
authored andcommitted
add glossary.md and define some invariants (#56)
* add glossary.md and define some invariants * add a slogan * live is not precise yet * Typo Co-Authored-By: RalfJung <[email protected]> * explain life a bit * immediate pointing
1 parent b8aaf89 commit 7eabb5a

File tree

3 files changed

+60
-14
lines changed

3 files changed

+60
-14
lines changed

reference/src/SUMMARY.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -20,3 +20,4 @@
2020
- [Vectors](./representation/vectors.md)
2121
- [Optimizations](./optimizations.md)
2222
- [Optimizing immutable memory](./optimizations/immutable_memory.md)
23+
- [Glossary](./glossary.md)

reference/src/glossary.md

Lines changed: 58 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,58 @@
1+
## Glossary
2+
3+
#### Interior mutability
4+
5+
*Interior Mutability* means mutating memory where there also exists a live shared reference immediately (i.e., non-transitively) pointing to the same memory.
6+
This propagates recursively through references, but not through raw pointers.
7+
So, for example, if data immediately pointed to by a `&T` or `& &mut T` is mutated, that's interior mutability.
8+
If data immediately pointed to by a `*const T` or `&*const T` is mutated, that's *not* interior mutability.
9+
10+
"live" here means a value that will be "used again" later.
11+
This is not yet precisely defined, this will be fixed as part of developing a precise aliasing model.
12+
13+
Interior mutability is only allowed inside [`UnsafeCell`](https://doc.rust-lang.org/core/cell/struct.UnsafeCell.html).
14+
15+
#### Validity and safety invariant
16+
17+
The *validity invariant* is an invariant that all data must uphold any time it is accessed or copied in a typed manner.
18+
This invariant is known to the compiler and exploited by optimizations such as improved enum layout or eliding in-bounds checks.
19+
20+
In terms of MIR statements, "accessed or copied" means whenever an assignment statement is executed.
21+
That statement has a type (LHS and RHS must have the same type), and the data being assigned must be valid at that type.
22+
Moreover, arguments passed to a function must be valid at the type given in the callee signature, and the return value of a function must be valid at the type given in the caller signature.
23+
OPEN QUESTION: Are there more cases where data must be valid?
24+
25+
In terms of code, some data computed by `TERM` is valid at type `T` if and only if the following program does not have UB:
26+
```rust
27+
fn main() { unsafe {
28+
let t: T = std::mem::transmute(TERM);
29+
} }
30+
```
31+
32+
The *safety* invariant is an invariant that safe code may assume all data to uphold.
33+
This invariant is used to justify which operations safe code can perform.
34+
The safety invariant can be temporarily violated by unsafe code, but must always be upheld when interfacing with unknown safe code.
35+
It is not relevant when arguing whether some *program* has UB, but it is relevant when arguing whether some code safely encapsulates its unsafety -- IOW, it is relevant when arguing whether some *library* can be used by safe code to *cause* UB.
36+
37+
In terms of code, some data computed by `TERM` (possibly constructed from some `arguments` that can be *assumed* to satisfy the safety invariant) is valid at type `T` if and only if the following library function can be safely exposed to arbitrary (safe) code as part of the public library interface:
38+
```rust
39+
pub fn make_something(arguments: U) -> T { unsafe {
40+
std::mem::transmute(TERM)
41+
} }
42+
```
43+
44+
One example of valid-but-unsafe data is a `&str` or `String` that's not well-formed UTF-8: the compiler will not run its own optimizations that would cause any trouble here, so unsafe code may temporarily violate the invariant that strings are `UTF-8`.
45+
However, functions on `&str`/`String` may assume the string to be `UTF-8`, meaning they may cause UB if the string is *not* `UTF-8`.
46+
This means that unsafe code violating the UTF-8 invariant must not perform string operations (it may operate on the data as a byte slice though), or else it risks UB.
47+
Moreover, such unsafe code must not return a non-UTF-8 string to the "outside" of its safe abstraction boundary, because that would mean safe code could cause UB by doing `bad_function().chars().count()`.
48+
49+
To summarize: *Data must always be valid, but it only must be safe in safe code.*
50+
For some more information, see [this blog post](https://www.ralfj.de/blog/2018/08/22/two-kinds-of-invariants.html).
51+
52+
### TODO
53+
54+
* *niche*
55+
* *layout*
56+
* *tag*
57+
* *rvalue*
58+
* *lvalue*

reference/src/introduction.md

Lines changed: 1 addition & 14 deletions
Original file line numberDiff line numberDiff line change
@@ -1,16 +1,3 @@
11
# Introduction
22

3-
4-
## Terms
5-
6-
* *interior mutability* — mutating memory where there also exists a live shared reference non-transitively pointing to the same memory.
7-
8-
(TODO: definitions)
9-
10-
* *niche*
11-
* *layout*
12-
* *tag*
13-
* *rvalue*
14-
* *lvalue*
15-
16-
## Unsafe abstraction
3+
What this document is about, and how it is structured.

0 commit comments

Comments
 (0)