Description
For brevity, this issue will refer to core::cell::Ref
as CoreRef
and core::cell::RefMut
as CoreRefMut
.
Overview
Ref<B, T>
's into_ref
, into_mut
, into_slice
, and into_mut_slice
methods are unsound when B
is CoreRef<[u8]>
or CoreRefMut<[u8]>
.
Progress
- High priority
- Affected published versions identified
- Fix implemented in 0.7
- Fix backported to each affected version Git branch
- 0.6: Fix soundness hole in Ref::into_ref and into_mut (#721) #723
- 0.5: Fix soundness hole in Ref::into_ref and into_mut (#721) #724
- 0.4: Fix soundness hole in Ref::into_ref and into_mut (#721) #725
- 0.3: Fix soundness hole in Ref::into_ref and into_mut (#721) #726
- 0.2: Fix soundness hole in Ref::into_ref and into_mut (#721) #727
- Fix published to each affected version train
- GitHub security advisory published
- Rustsec security advisory published
- GitHub Releases published for each new version with description referencing both advisories
- Yank log updated
- Affected versions yanked
- Medium priority
- Fix backported to 0.8.0
- UI test added to ensure that other implementations of
ByteSlice
do not permit this unsoundness
Proof of concept
The following (100% safe) code exhibits undefined behavior according to Miri (playground link):
use core::cell::{Ref, RefCell};
fn main() {
let refcell = RefCell::new([0u8, 1, 2, 3]);
let core_ref = refcell.borrow();
let core_ref = Ref::map(core_ref, |bytes| &bytes[..]);
// `zc_ref` now stores `core_ref` internally.
let zc_ref = zerocopy::Ref::<_, u32>::new(core_ref).unwrap();
// This causes `core_ref` to get dropped and synthesizes a Rust
// reference to the memory `core_ref` was pointing at.
let rust_ref = zc_ref.into_ref();
// UB!!! This mutates `rust_ref`'s referent while it's alive.
*refcell.borrow_mut() = [0, 0, 0, 0];
println!("{}", rust_ref);
}
Miri output:
error: Undefined Behavior: trying to retag from <1984> for SharedReadOnly permission at alloc903[0x8], but that tag does not exist in the borrow stack for this location
--> /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/fmt/mod.rs:2304:1
|
2304 | fmt_refs! { Debug, Display, Octal, Binary, LowerHex, UpperHex, LowerExp, UpperExp }
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
| |
| trying to retag from <1984> for SharedReadOnly permission at alloc903[0x8], but that tag does not exist in the borrow stack for this location
| this error occurs as part of retag at alloc903[0x8..0xc]
|
= help: this indicates a potential bug in the program: it performed an invalid operation, but the Stacked Borrows rules it violated are still experimental
= help: see https://github.com/rust-lang/unsafe-code-guidelines/blob/master/wip/stacked-borrows.md for further information
help: <1984> was created by a SharedReadOnly retag at offsets [0x8..0xc]
--> src/main.rs:13:20
|
13 | let rust_ref = zc_ref.into_ref();
| ^^^^^^^^^^^^^^^^^
help: <1984> was later invalidated at offsets [0x8..0xc] by a Unique retag
--> src/main.rs:16:5
|
16 | *refcell.borrow_mut() = [0, 0, 0, 0];
| ^^^^^^^^^^^^^^^^^^^^^
= note: BACKTRACE (of the first span):
= note: inside `<&u32 as std::fmt::Display>::fmt` at /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/fmt/mod.rs:2294:71: 2294:78
= note: inside `core::fmt::rt::Argument::<'_>::fmt` at /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/fmt/rt.rs:142:9: 142:40
= note: inside `std::fmt::write` at /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/fmt/mod.rs:1120:17: 1120:40
= note: inside `<std::io::StdoutLock<'_> as std::io::Write>::write_fmt` at /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/io/mod.rs:1810:15: 1810:43
= note: inside `<&std::io::Stdout as std::io::Write>::write_fmt` at /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/io/stdio.rs:727:9: 727:36
= note: inside `<std::io::Stdout as std::io::Write>::write_fmt` at /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/io/stdio.rs:701:9: 701:33
= note: inside `std::io::stdio::print_to::<std::io::Stdout>` at /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/io/stdio.rs:1020:21: 1020:47
= note: inside `std::io::_print` at /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/io/stdio.rs:1097:5: 1097:37
note: inside `main`
--> src/main.rs:18:5
|
18 | println!("{}", rust_ref);
| ^^^^^^^^^^^^^^^^^^^^^^^^
= note: this error originates in the macro `fmt_refs` which comes from the expansion of the macro `println` (in Nightly builds, run with -Z macro-backtrace for more info)
The following playground examples demonstrate the same UB using the other types and methods:
Explanation
This explanation is in terms of CoreRef
and the Ref::into_ref
method, but the same explanation holds of CoreRefMut
and Ref::into_mut
, Ref::into_slice
, and Ref::into_mut_slice
.
The relevant bits of the code are:
Lines 4787 to 4805 in a8572da
Lines 4924 to 4947 in a8572da
Lines 5332 to 5339 in a8572da
Quick explanation
We'll start with a concise explanation which will paper over the unsoundness, and hopefully help to demonstrate how a bug like this could have made its way in.
Ref::deref_helper
is unsafe, and allows the caller to provide an arbitrary lifetime, 'a
, which is not necessarily the same as the lifetime of its &self
parameter. It returns a reference, &'a T
, of that arbitrary lifetime. In exchange, it requires the caller to guarantee that "the lifetime 'a
is valid for this reference."
Ref::into_ref
is only available when the buffer type, B
, is valid for 'a
(ie, B: 'a
). Thus, it can soundly call deref_helper
internally - the B: 'a
bound ensures that the returned reference won't outlive the buffer.
Bug
The bug comes from the fact that B: 'a
is an insufficient guarantee. In particular, it guarantees that the buffer type satisfies 'a
, but what that means differs by type. When I authored this code, I was probably implicitly thinking in terms of &'a [u8]
, where the 'a
means that the underlying memory is considered immutably borrowed for the lifetime 'a
- it can neither be mutated nor dropped during 'a
.
Unfortunately, CoreRef<[u8]>: 'a
only requires that the underlying RefCell
lives for at least 'a
. [1] The CoreRef
itself may be dropped sooner than this and still satisfy CoreRef<[u8]>: 'a
. Of course, if we think specifically in terms of RefCell
, it makes sense that this has to be the behavior - RefCell
's whole purpose is to allow mutation which is tracked dynamically at runtime rather than statically at compile time. By dropping the CoreRef
, we're decrementing the RefCell
's refcount and allowing the underlying memory to be mutated again.
Another perspective on why this is problematic is to trace the function call stack and figure out why this code compiles at all. We start with Ref::into_ref
, which takes self
by value. As a consequence, it drops self
at the end of the method scope. We might reasonably ask how this is able to compile given that the returned reference outlives that scope.
As we already saw, Ref::deref_helper
takes a &self
of any lifetime and returns a &'a T
of the caller-chosen lifetime 'a
. Internally, it accomplishes this by calling <B as ByteSlice>::as_ptr
. as_ptr
is defined like so:
Lines 5332 to 5339 in a8572da
This involves an implicit call to Deref::deref
, so we can desugar like so:
fn as_ptr(&self) -> *const u8 {
let bytes: &[u8] = self.deref();
<[u8]>::as_ptr(bytes)
}
Note that Deref<Target = [u8]>::deref
has the signature fn deref<'a>(&'a self) -> &'a [u8]
, and so the returned bytes
has the same lifetime as the self
argument to as_ptr
. Bubbling up the call stack, that means that it has the same lifetime as self
in Ref::into_ref
, which is only valid until the end of into_ref
's method scope.
[1] In particular, the RefCell<T>
method borrow
has the signature fn borrow<'a>(&'a self) -> CoreRef<'a, T>
- the 'a
serves to ensure that the returned CoreRef
doesn't outlive the underlying RefCell
.
History and affected versions
This commit from December, 2019 is as far back as our Git history goes, when zerocopy's code was migrated from elsewhere in Fuchsia's codebase (this was before zerocopy had been moved to GitHub, when it still lived in Fuchia's source tree).
From a manually search of docs.rs, it seems that the earliest affected version is 0.2.2. Ref::into_ref
exists here in 0.2.2, but does not exist in 0.2.1.
Impact
It is difficult to estimate the usage of these methods in the wild, but in my personal experience, I've never seen anyone use the Ref<B, T>
type with B = CoreRef<[u8]>
or B = CoreRefMut<[u8]>
. An affected user would need to use this setup and also call the into_ref
, into_mut
, into_slice
, or into_mut_slice
methods.
Mitigation plan
Thankfully, our Git history goes back far enough that we have source code for every affected minor version (ie, 0.2.X and onwards). Looking at the history up until the release of 0.3.0, we can see that the most recent commit we have from 0.2.X is 19ac2a7, and this is confirmed by the Cargo.toml.crates-io
file. Thus, we should be able to backport a fix to every affected minor version. Depending on how the discussion in #679 concludes, we may also decide to yank some or all affected versions after publishing a fix to each affected minor version.
Fix
TL;DR: We will add a check that causes code which exercises this UB to fail compilation after monomorphization.
What fix we implement on existing versions is significantly more constrained. No matter what we do, any fix must technically be a breaking change, as code which could previously compile and exhibit UB at runtime should either no longer compile or should panic at runtime.
We considered a few options, and evaluated them across a few axes:
- Existing code may use these methods in a context in which
Ref<B, T>
'sB
parameter is generic, but in practiceB
is never set toCoreRef
orCoreRefMut
. This code should continue to compile. - Existing code may use
CoreRef
orCoreRefMut
in combination withRef
and not make use ofinto_ref
,into_mut
,into_slice
, orinto_mut_slice
. This code should continue to compile. - Existing code may use
CoreRef
orCoreRefMut
in combination withRef
and make use ofinto_ref
,into_mut
,into_slice
, orinto_mut_slice
, which may exhibit UB at runtime. This code should no longer compile.
We considered these options:
- We could gate
into_ref
,into_mut
,into_slice
, orinto_mut_slice
on an extra trait bound on theB
type parameter - importantly, a trait bound not satisfied byCoreRef
orCoreRefMut
. - Since
into_ref
,into_mut
,into_slice
, orinto_mut_slice
currently requireB: ByteSlice
, we could remove the impl ofByteSlice
forCoreRef
andCoreRefMut
. - We could add a runtime check so that if
into_ref
,into_mut
,into_slice
, orinto_mut_slice
are called whenB
is set toCoreRef
orCoreRefMut
, they panic before exhibiting UB. - We could add a post-monomorphization check so that if
into_ref
,into_mut
,into_slice
, orinto_mut_slice
are called whenB
is set toCoreRef
orCoreRefMut
, compilation is failed after monomorphization has already happened.
Only the last option satisfies all our criteria, and so we will implement it. For completeness, here is each option evaluated against each criterion:
Option | into_xxx compile when B is generic? |
CoreRef /CoreRefMut compile when not using into_xxx ? |
Code which would otherwise be UB fails at compile time? |
---|---|---|---|
Extra trait bound for B |
❌ | ✅ | ✅ |
Remove ByteSlice impl from CoreRef /CoreRefMut |
✅ | ❌ | ✅ |
Panic at runtime | ✅ | ✅ | ❌ |
Fail compilation post-monomorphization | ✅ | ✅ | ✅ |