Commit ab03d32

Auto merge of #2059 - carbotaniuman:master, r=RalfJung
Initial work on Miri permissive-exposed-provenance

Miri portions of the changes for a permissive ptr-to-int model for Miri. This is more restrictive than what we currently have, so it will probably need a flag once I figure out how to hook that up.

> This implements a form of permissive exposed-address provenance, wherein the only way to expose the address is with a cast to usize (ideally expose_addr). This is more restrictive than C in that stuff like reading the representation bytes (via unions, type-punning, transmute) does not expose the address, only expose_addr. This is less restrictive than C in that a pointer cast from an integer has union provenance of all exposed pointers, not any udi stuff.

There are a few TODOs here, namely related to `fn memory_read` and friends. We pass it the maybe/unreified provenance before `ptr_get_alloc` reifies it into a concrete one, so it doesn't have the `AllocId` (or the SB tag, but that's getting ahead of ourselves). One way this could be fixed is changing `ptr_get_alloc` (and `ptr_try_get_alloc_id` on the rustc side) to return a pointer with the tag fixed up. We could also take in different arguments, but I'm not sure what works best.

The other TODOs here are how permissive this model could be. This currently does not enforce that a ptr-to-int cast happens before the corresponding int-to-ptr cast (colloquial meaning of "happens before", not the atomic meaning). Example:

```
let ptr = 0x2000 as *const i32;
let a: i32 = 5;
let a_ptr = &a as *const i32;
// value is 0x2000;
a_ptr as usize;
println!("{}", unsafe { *ptr }); // this is valid
```

We also allow the resulting pointer to dereference different non-contiguous allocations (the "not any udi stuff" mentioned above), which I'm not sure is allowed by LLVM.

This is the Miri side of rust-lang/rust#95826.
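For contrast with the example in the commit message, here is a minimal sketch of the ordering the model is built around: the pointer-to-int cast happens first and the int-to-pointer cast afterwards. The function name `roundtrip_via_usize` is made up for illustration, and the plain `as` casts stand in for `expose_addr` and `ptr::from_exposed_addr`, which the README change in this commit describes as the semantics of such casts.

```rust
/// Round-trips a reference through an integer using plain `as` casts.
/// (Hypothetical helper, not part of Miri or of this commit.)
fn roundtrip_via_usize(value: &i32) -> i32 {
    let ptr = value as *const i32;

    // Pointer-to-int cast: under the model described above, this is the
    // step that exposes the address of the allocation.
    let addr = ptr as usize;

    // Int-to-pointer cast: the resulting pointer may be used to access
    // allocations that have been exposed.
    let back = addr as *const i32;

    // SAFETY: `back` has the same address as `ptr`, the referent is still
    // live, and its address was exposed by the cast above.
    unsafe { *back }
}
```

Calling `roundtrip_via_usize(&a)` with `a = 5` yields `5`. According to the README text added in this commit, checking such code with the new flag would currently require passing both `-Zmiri-permissive-provenance` and `-Zmiri-disable-stacked-borrows`.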
2 parents 72e11d3 + f8f2255 commit ab03d32

11 files changed, +286 -48 lines changed

README.md

+11
@@ -318,6 +318,17 @@ to Miri failing to detect cases of undefined behavior in a program.
   application instead of raising an error within the context of Miri (and halting
   execution). Note that code might not expect these operations to ever panic, so
   this flag can lead to strange (mis)behavior.
+* `-Zmiri-permissive-provenance` is **experimental**. This will make Miri do a
+  best-effort attempt to implement the semantics of
+  [`expose_addr`](https://doc.rust-lang.org/nightly/std/primitive.pointer.html#method.expose_addr)
+  and
+  [`ptr::from_exposed_addr`](https://doc.rust-lang.org/nightly/std/ptr/fn.from_exposed_addr.html)
+  for pointer-to-int and int-to-pointer casts, respectively. This will
+  necessarily miss some bugs as those semantics are not efficiently
+  implementable in a sanitizer, but it will only miss bugs that concerns
+  memory/pointers which is subject to these operations. Also note that this flag
+  is currently incompatible with Stacked Borrows, so you will have to also pass
+  `-Zmiri-disable-stacked-borrows` to use this.
 * `-Zmiri-seed=<hex>` configures the seed of the RNG that Miri uses to resolve
   non-determinism. This RNG is used to pick base addresses for allocations.
   When isolation is enabled (the default), this is also used to emulate system

src/bin/miri.rs

+6-2
@@ -30,7 +30,7 @@ use rustc_middle::{
 };
 use rustc_session::{config::ErrorOutputType, search_paths::PathKind, CtfeBacktrace};

-use miri::BacktraceStyle;
+use miri::{BacktraceStyle, ProvenanceMode};

 struct MiriCompilerCalls {
     miri_config: miri::MiriConfig,
@@ -384,10 +384,14 @@ fn main() {
                     miri_config.tag_raw = true;
                 }
                 "-Zmiri-strict-provenance" => {
-                    miri_config.strict_provenance = true;
+                    miri_config.provenance_mode = ProvenanceMode::Strict;
                     miri_config.tag_raw = true;
                     miri_config.check_number_validity = true;
                 }
+                "-Zmiri-permissive-provenance" => {
+                    miri_config.provenance_mode = ProvenanceMode::Permissive;
+                    miri_config.tag_raw = true;
+                }
                 "-Zmiri-mute-stdout-stderr" => {
                     miri_config.mute_stdout_stderr = true;
                 }
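The flag handling in this hunk can be modelled in isolation. The following is a simplified sketch, not Miri's actual argument parser: `Config`, `apply_flag`, and `parse_flags` are hypothetical names, and the `false` defaults for `tag_raw` and `check_number_validity` are an assumption (only the `Legacy` default for the provenance mode appears in this commit).

```rust
#[derive(Copy, Clone, Debug, PartialEq, Eq)]
enum ProvenanceMode {
    Permissive,
    Strict,
    Legacy,
}

/// Hypothetical, trimmed-down stand-in for `MiriConfig`.
#[derive(Debug, PartialEq, Eq)]
struct Config {
    provenance_mode: ProvenanceMode,
    tag_raw: bool,
    check_number_validity: bool,
}

impl Default for Config {
    fn default() -> Self {
        Config {
            // `Legacy` is the default provenance mode in this commit.
            provenance_mode: ProvenanceMode::Legacy,
            // Assumed defaults, not shown in the diff.
            tag_raw: false,
            check_number_validity: false,
        }
    }
}

/// Applies one flag; returns `false` if the flag is not a provenance flag.
fn apply_flag(config: &mut Config, arg: &str) -> bool {
    match arg {
        "-Zmiri-strict-provenance" => {
            config.provenance_mode = ProvenanceMode::Strict;
            config.tag_raw = true;
            config.check_number_validity = true;
            true
        }
        "-Zmiri-permissive-provenance" => {
            config.provenance_mode = ProvenanceMode::Permissive;
            config.tag_raw = true;
            true
        }
        _ => false,
    }
}

/// Applies all flags in order, ignoring the ones this sketch does not model.
fn parse_flags(args: &[&str]) -> Config {
    let mut config = Config::default();
    for arg in args {
        apply_flag(&mut config, arg);
    }
    config
}
```

Both flags imply `tag_raw`, but only the strict flag also turns on `check_number_validity`. In this sketch, when both flags are given the later one decides the mode, while the side effects of the earlier one remain set.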

src/eval.rs

+3-4
@@ -113,9 +113,8 @@ pub struct MiriConfig {
     pub panic_on_unsupported: bool,
     /// Which style to use for printing backtraces.
     pub backtrace_style: BacktraceStyle,
-    /// Whether to enforce "strict provenance" rules. Enabling this means int2ptr casts return
-    /// pointers with an invalid provenance, i.e., not valid for any memory access.
-    pub strict_provenance: bool,
+    /// Which provenance to use for int2ptr casts
+    pub provenance_mode: ProvenanceMode,
     /// Whether to ignore any output by the program. This is helpful when debugging miri
     /// as its messages don't get intermingled with the program messages.
     pub mute_stdout_stderr: bool,
@@ -144,7 +143,7 @@ impl Default for MiriConfig {
             measureme_out: None,
             panic_on_unsupported: false,
             backtrace_style: BacktraceStyle::Short,
-            strict_provenance: false,
+            provenance_mode: ProvenanceMode::Legacy,
             mute_stdout_stderr: false,
         }
     }

src/helpers.rs

+2-2
@@ -786,8 +786,8 @@ pub trait EvalContextExt<'mir, 'tcx: 'mir>: crate::MiriEvalContextExt<'mir, 'tcx
     fn mark_immutable(&mut self, mplace: &MemPlace<Tag>) {
         let this = self.eval_context_mut();
         // This got just allocated, so there definitely is a pointer here.
-        this.alloc_mark_immutable(mplace.ptr.into_pointer_or_addr().unwrap().provenance.alloc_id)
-            .unwrap();
+        let provenance = mplace.ptr.into_pointer_or_addr().unwrap().provenance;
+        this.alloc_mark_immutable(provenance.get_alloc_id().unwrap()).unwrap();
     }

     fn item_link_name(&self, def_id: DefId) -> Symbol {
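The extra `unwrap()` on `get_alloc_id()` suggests that a tag no longer always carries an allocation id. The definition of `Tag` lives in `src/machine.rs`, which is not part of this excerpt, so the following is only a guess at its shape based on the `Tag::Concrete(ConcreteTag { .. })` and `Tag::Wildcard` uses in `src/intptrcast.rs`. The `AllocId` alias, the omitted Stacked Borrows field, and the `Option` return type are assumptions.

```rust
/// Hypothetical stand-in for rustc's `AllocId`.
type AllocId = u32;

/// Assumed shape of a concrete tag; the Stacked Borrows field is omitted.
#[derive(Copy, Clone, Debug, PartialEq, Eq)]
struct ConcreteTag {
    alloc_id: AllocId,
}

/// Assumed shape of the provenance tag after this commit.
#[derive(Copy, Clone, Debug, PartialEq, Eq)]
enum Tag {
    /// Provenance tied to one specific allocation.
    Concrete(ConcreteTag),
    /// Provenance of a pointer produced by an int-to-pointer cast in
    /// permissive mode; the allocation is not known up front.
    Wildcard,
}

impl Tag {
    /// Returns the allocation id if the tag is concrete.
    fn get_alloc_id(self) -> Option<AllocId> {
        match self {
            Tag::Concrete(concrete) => Some(concrete.alloc_id),
            Tag::Wildcard => None,
        }
    }
}
```

Under this reading, `mark_immutable` can unwrap safely because a freshly allocated place always has a concrete tag, which is what the comment in the hunk asserts.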

src/intptrcast.rs

+103-20
@@ -4,11 +4,25 @@ use std::collections::hash_map::Entry;
 use log::trace;
 use rand::Rng;

-use rustc_data_structures::fx::FxHashMap;
+use rustc_data_structures::fx::{FxHashMap, FxHashSet};
 use rustc_target::abi::{HasDataLayout, Size};

 use crate::*;

+#[derive(Copy, Clone, Debug, PartialEq, Eq)]
+pub enum ProvenanceMode {
+    /// Int2ptr casts return pointers with "wildcard" provenance
+    /// that basically matches that of all exposed pointers
+    /// (and SB tags, if enabled).
+    Permissive,
+    /// Int2ptr casts return pointers with an invalid provenance,
+    /// i.e., not valid for any memory access.
+    Strict,
+    /// Int2ptr casts determine the allocation they point to at cast time.
+    /// All allocations are considered exposed.
+    Legacy,
+}
+
 pub type GlobalState = RefCell<GlobalStateInner>;

 #[derive(Clone, Debug)]
@@ -21,35 +35,37 @@ pub struct GlobalStateInner {
     /// they do not have an `AllocExtra`.
     /// This is the inverse of `int_to_ptr_map`.
     base_addr: FxHashMap<AllocId, u64>,
+    /// Whether an allocation has been exposed or not. This cannot be put
+    /// into `AllocExtra` for the same reason as `base_addr`.
+    exposed: FxHashSet<AllocId>,
     /// This is used as a memory address when a new pointer is casted to an integer. It
     /// is always larger than any address that was previously made part of a block.
     next_base_addr: u64,
-    /// Whether to enforce "strict provenance" rules. Enabling this means int2ptr casts return
-    /// pointers with an invalid provenance, i.e., not valid for any memory access.
-    strict_provenance: bool,
+    /// The provenance to use for int2ptr casts
+    provenance_mode: ProvenanceMode,
 }

 impl GlobalStateInner {
     pub fn new(config: &MiriConfig) -> Self {
         GlobalStateInner {
             int_to_ptr_map: Vec::default(),
             base_addr: FxHashMap::default(),
+            exposed: FxHashSet::default(),
             next_base_addr: STACK_ADDR,
-            strict_provenance: config.strict_provenance,
+            provenance_mode: config.provenance_mode,
         }
     }
 }

 impl<'mir, 'tcx> GlobalStateInner {
-    pub fn ptr_from_addr(addr: u64, ecx: &MiriEvalContext<'mir, 'tcx>) -> Pointer<Option<Tag>> {
-        trace!("Casting 0x{:x} to a pointer", addr);
+    // Returns the exposed `AllocId` that corresponds to the specified addr,
+    // or `None` if the addr is out of bounds
+    fn alloc_id_from_addr(ecx: &MiriEvalContext<'mir, 'tcx>, addr: u64) -> Option<AllocId> {
         let global_state = ecx.machine.intptrcast.borrow();
-
-        if global_state.strict_provenance {
-            return Pointer::new(None, Size::from_bytes(addr));
-        }
+        assert!(global_state.provenance_mode != ProvenanceMode::Strict);

         let pos = global_state.int_to_ptr_map.binary_search_by_key(&addr, |(addr, _)| *addr);
+
         let alloc_id = match pos {
             Ok(pos) => Some(global_state.int_to_ptr_map[pos].1),
             Err(0) => None,
@@ -60,6 +76,7 @@ impl<'mir, 'tcx> GlobalStateInner {
                 // This never overflows because `addr >= glb`
                 let offset = addr - glb;
                 // If the offset exceeds the size of the allocation, don't use this `alloc_id`.
+
                 if offset
                     <= ecx
                         .get_alloc_size_and_align(alloc_id, AllocCheck::MaybeDead)
@@ -72,12 +89,65 @@ impl<'mir, 'tcx> GlobalStateInner {
                     None
                 }
             }
-        };
-        // Pointers created from integers are untagged.
-        Pointer::new(
-            alloc_id.map(|alloc_id| Tag { alloc_id, sb: SbTag::Untagged }),
-            Size::from_bytes(addr),
-        )
+        }?;
+
+        // In legacy mode, we consider all allocations exposed.
+        if global_state.provenance_mode == ProvenanceMode::Legacy
+            || global_state.exposed.contains(&alloc_id)
+        {
+            Some(alloc_id)
+        } else {
+            None
+        }
+    }
+
+    pub fn expose_addr(ecx: &MiriEvalContext<'mir, 'tcx>, alloc_id: AllocId) {
+        trace!("Exposing allocation id {:?}", alloc_id);
+
+        let mut global_state = ecx.machine.intptrcast.borrow_mut();
+        if global_state.provenance_mode == ProvenanceMode::Permissive {
+            global_state.exposed.insert(alloc_id);
+        }
+    }
+
+    pub fn ptr_from_addr_transmute(
+        ecx: &MiriEvalContext<'mir, 'tcx>,
+        addr: u64,
+    ) -> Pointer<Option<Tag>> {
+        trace!("Transmuting 0x{:x} to a pointer", addr);
+
+        let global_state = ecx.machine.intptrcast.borrow();
+
+        // In legacy mode, we have to support int2ptr transmutes,
+        // so just pretend they do the same thing as a cast.
+        if global_state.provenance_mode == ProvenanceMode::Legacy {
+            Self::ptr_from_addr_cast(ecx, addr)
+        } else {
+            Pointer::new(None, Size::from_bytes(addr))
+        }
+    }
+
+    pub fn ptr_from_addr_cast(
+        ecx: &MiriEvalContext<'mir, 'tcx>,
+        addr: u64,
+    ) -> Pointer<Option<Tag>> {
+        trace!("Casting 0x{:x} to a pointer", addr);
+
+        let global_state = ecx.machine.intptrcast.borrow();
+
+        if global_state.provenance_mode == ProvenanceMode::Strict {
+            Pointer::new(None, Size::from_bytes(addr))
+        } else if global_state.provenance_mode == ProvenanceMode::Legacy {
+            let alloc_id = Self::alloc_id_from_addr(ecx, addr);
+
+            Pointer::new(
+                alloc_id
+                    .map(|alloc_id| Tag::Concrete(ConcreteTag { alloc_id, sb: SbTag::Untagged })),
+                Size::from_bytes(addr),
+            )
+        } else {
+            Pointer::new(Some(Tag::Wildcard), Size::from_bytes(addr))
+        }
     }

     fn alloc_base_addr(ecx: &MiriEvalContext<'mir, 'tcx>, alloc_id: AllocId) -> u64 {
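The difference between `ptr_from_addr_cast` and `ptr_from_addr_transmute` comes down to which provenance the new pointer receives in each mode. The sketch below restates that decision as two pure functions. The `Provenance` enum, the function names, and the `u32` allocation id are illustrative stand-ins, and `lookup` plays the role of the result of `alloc_id_from_addr`.

```rust
#[derive(Copy, Clone, Debug, PartialEq, Eq)]
enum ProvenanceMode {
    Permissive,
    Strict,
    Legacy,
}

/// Hypothetical summary of the provenance attached to the new pointer.
#[derive(Copy, Clone, Debug, PartialEq, Eq)]
enum Provenance {
    /// No provenance: the pointer is not valid for any memory access.
    Invalid,
    /// Tied to one allocation, resolved at cast time.
    Concrete(u32),
    /// Matches any exposed allocation; resolved later, at access time.
    Wildcard,
}

/// Models `ptr_from_addr_cast` (an `as` cast from integer to pointer).
fn cast_provenance(mode: ProvenanceMode, lookup: Option<u32>) -> Provenance {
    match mode {
        ProvenanceMode::Strict => Provenance::Invalid,
        // Legacy resolves the allocation immediately, if the address hits one.
        ProvenanceMode::Legacy => lookup.map_or(Provenance::Invalid, Provenance::Concrete),
        ProvenanceMode::Permissive => Provenance::Wildcard,
    }
}

/// Models `ptr_from_addr_transmute` (reinterpreting integer bytes as a pointer).
fn transmute_provenance(mode: ProvenanceMode, lookup: Option<u32>) -> Provenance {
    match mode {
        // Legacy treats a transmute exactly like a cast.
        ProvenanceMode::Legacy => cast_provenance(mode, lookup),
        // Strict and permissive both refuse to invent provenance here.
        ProvenanceMode::Strict | ProvenanceMode::Permissive => Provenance::Invalid,
    }
}
```

The notable case is permissive mode: a cast produces a wildcard pointer regardless of the address, while a transmute produces a pointer without provenance, matching the commit message's statement that only a cast participates in the exposed-address scheme.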
@@ -136,14 +206,27 @@ impl<'mir, 'tcx> GlobalStateInner {
         dl.overflowing_offset(base_addr, offset.bytes()).0
     }

-    pub fn abs_ptr_to_rel(ecx: &MiriEvalContext<'mir, 'tcx>, ptr: Pointer<Tag>) -> Size {
+    pub fn abs_ptr_to_rel(
+        ecx: &MiriEvalContext<'mir, 'tcx>,
+        ptr: Pointer<Tag>,
+    ) -> Option<(AllocId, Size)> {
         let (tag, addr) = ptr.into_parts(); // addr is absolute (Tag provenance)
-        let base_addr = GlobalStateInner::alloc_base_addr(ecx, tag.alloc_id);
+
+        let alloc_id = if let Tag::Concrete(concrete) = tag {
+            concrete.alloc_id
+        } else {
+            GlobalStateInner::alloc_id_from_addr(ecx, addr.bytes())?
+        };
+
+        let base_addr = GlobalStateInner::alloc_base_addr(ecx, alloc_id);

         // Wrapping "addr - base_addr"
         let dl = ecx.data_layout();
         let neg_base_addr = (base_addr as i64).wrapping_neg();
-        Size::from_bytes(dl.overflowing_signed_offset(addr.bytes(), neg_base_addr).0)
+        Some((
+            alloc_id,
+            Size::from_bytes(dl.overflowing_signed_offset(addr.bytes(), neg_base_addr).0),
+        ))
     }

     /// Shifts `addr` to make it aligned with `align` by rounding `addr` to the smallest multiple
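To see how `alloc_id_from_addr` and the new `abs_ptr_to_rel` fit together, here is a self-contained model of the lookup. It is a sketch under several assumptions: `State`, its `sizes` map, the `legacy` flag, and the `u32` allocation id are invented for illustration; the real code queries the interpreter for allocation sizes and assigns base addresses lazily; and the offset uses 64-bit wrapping subtraction rather than the target's pointer width.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical stand-in for rustc's `AllocId`.
type AllocId = u32;

/// Simplified tag: the Stacked Borrows part is omitted.
#[derive(Copy, Clone, Debug, PartialEq, Eq)]
enum Tag {
    Concrete(AllocId),
    Wildcard,
}

struct State {
    /// `(base address, allocation)` pairs, sorted by base address.
    int_to_ptr_map: Vec<(u64, AllocId)>,
    /// Allocation sizes in bytes (an assumption of this sketch).
    sizes: HashMap<AllocId, u64>,
    /// Allocations whose address has been exposed by a ptr-to-int cast.
    exposed: HashSet<AllocId>,
    /// `true` models legacy mode, where every allocation counts as exposed.
    legacy: bool,
}

impl State {
    /// Finds the exposed allocation containing `addr`, if any.
    fn alloc_id_from_addr(&self, addr: u64) -> Option<AllocId> {
        let pos = self.int_to_ptr_map.binary_search_by_key(&addr, |(base, _)| *base);
        let alloc_id = match pos {
            // `addr` is exactly the base address of an allocation.
            Ok(pos) => Some(self.int_to_ptr_map[pos].1),
            // `addr` lies below every known allocation.
            Err(0) => None,
            Err(pos) => {
                // Greatest base address that is <= `addr`.
                let (glb, alloc_id) = self.int_to_ptr_map[pos - 1];
                let offset = addr - glb;
                // One-past-the-end is still attributed to the allocation.
                if offset <= self.sizes[&alloc_id] { Some(alloc_id) } else { None }
            }
        }?;
        if self.legacy || self.exposed.contains(&alloc_id) { Some(alloc_id) } else { None }
    }

    /// Base address lookup; returns `None` for unknown allocations.
    fn base_addr(&self, alloc_id: AllocId) -> Option<u64> {
        self.int_to_ptr_map.iter().find(|(_, id)| *id == alloc_id).map(|(base, _)| *base)
    }

    /// Turns an absolute address into `(allocation, offset)`.
    fn abs_ptr_to_rel(&self, tag: Tag, addr: u64) -> Option<(AllocId, u64)> {
        let alloc_id = match tag {
            Tag::Concrete(alloc_id) => alloc_id,
            // Wildcard pointers are resolved by address at this point.
            Tag::Wildcard => self.alloc_id_from_addr(addr)?,
        };
        let base = self.base_addr(alloc_id)?;
        Some((alloc_id, addr.wrapping_sub(base)))
    }
}
```

With allocations at `0x1000` (16 bytes) and `0x2000` (4 bytes), a wildcard pointer to `0x2002` resolves to `None` until allocation 2 is inserted into `exposed`, after which it resolves to `(2, 2)`. A concrete tag bypasses the exposure check entirely, which mirrors the `if let Tag::Concrete(..)` branch in the hunk.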

src/lib.rs

+3-2
@@ -78,9 +78,10 @@ pub use crate::eval::{
     create_ecx, eval_entry, AlignmentCheck, BacktraceStyle, IsolatedOp, MiriConfig, RejectOpWith,
 };
 pub use crate::helpers::{CurrentSpan, EvalContextExt as HelpersEvalContextExt};
+pub use crate::intptrcast::ProvenanceMode;
 pub use crate::machine::{
-    AllocExtra, Evaluator, FrameData, MiriEvalContext, MiriEvalContextExt, MiriMemoryKind, Tag,
-    NUM_CPUS, PAGE_SIZE, STACK_ADDR, STACK_SIZE,
+    AllocExtra, ConcreteTag, Evaluator, FrameData, MiriEvalContext, MiriEvalContextExt,
+    MiriMemoryKind, Tag, NUM_CPUS, PAGE_SIZE, STACK_ADDR, STACK_SIZE,
 };
 pub use crate::mono_hash_map::MonoHashMap;
 pub use crate::operator::EvalContextExt as OperatorEvalContextExt;
