RFC: Never allow reads from uninitialized memory in safe Rust #837

Closed
wants to merge 7 commits into from
81 changes: 81 additions & 0 deletions text/0000-uninit-memory-policy.md
@@ -0,0 +1,81 @@
- Feature Name: (fill me in with a unique ident, my_awesome_feature)
- Start Date: 2015-02-13
- RFC PR: (leave this empty)
- Rust Issue: (leave this empty)

# Summary

Set an explicit policy that uninitialized memory can never be exposed
in safe Rust, even when it would not lead to undefined behavior.

Review comment (Contributor): I noticed the one below first, but this also should say "can never".


# Motivation

Exactly what is guaranteed by safe Rust code is not entirely
clear. There are some clear baseline guarantees: data-race freedom,
memory safety, type safety. But what about cases like reading from an
allocated but uninitialized slice of scalars? These cases can be made
memory-safe and type-safe, but they carry security risks.

In particular, it may be possible to exploit a bug in safe Rust code
that causes that code to reveal the contents of memory.

Consider the `std::io::Read` trait:

```rust
pub trait Read {
    fn read(&mut self, buf: &mut [u8]) -> Result<usize>;

    fn read_to_end(&mut self, buf: &mut Vec<u8>) -> Result<()> { ... }
}
```

The `read_to_end` convenience function will extend the given vector's capacity,
then pass the resulting (allocated but uninitialized) memory to the
underlying `read` method.

While the `read` method may be implemented in pure safe code, it is
nonetheless given read access to uninitialized memory. The
implementation of `read_to_end` guarantees that no UB will arise as a
result. Nevertheless, an incorrect implementation of `read` -- for
example, one that returns an incorrect number of bytes read -- could
result in that memory being exposed (and then potentially sent over
the wire).
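
A minimal sketch of that failure mode, written entirely in safe code. `LyingReader` and `forward` are hypothetical names invented for this illustration, and because safe code cannot actually obtain uninitialized memory, a buffer holding stale bytes stands in for it here:

```rust
use std::io::{self, Read};

/// Hypothetical reader with the bug described above: it reports that it
/// filled the whole buffer but never writes a single byte into it.
struct LyingReader;

impl Read for LyingReader {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        Ok(buf.len()) // bug: claims bytes that were never written
    }
}

/// Hypothetical caller that trusts the returned count and forwards that
/// many bytes. `scratch` is an assumption of this sketch: it plays the
/// role of allocated-but-uninitialized memory.
fn forward<R: Read>(reader: &mut R, scratch: &mut [u8]) -> io::Result<Vec<u8>> {
    let n = reader.read(scratch)?;
    // Whatever was already in `scratch` is copied out as if it were data.
    Ok(scratch[..n].to_vec())
}
```

If `scratch` previously held sensitive bytes, `forward` returns them unchanged, even though neither function uses `unsafe`.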

# Detailed design

While we do not have a formal spec/contract for unsafe code, this RFC
will serve to set an explicit policy that:

**Uninitialized memory can never be exposed in safe Rust, even when it
would not lead to undefined behavior**.

Review comment (Contributor): I assume that you mean "can never be exposed". 😀


Like other aspects of the definition of "safe Rust", this is part of
the contract that *all* unsafe code must abide by, whether part of
`std` or external libraries or applications.

In practical terms, this will require methods like `read_to_end` to
internally zero out (or otherwise ensure initialization of) memory
before they pass it to unknown safe code like the `read` method.
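
One way such a method could satisfy the policy is sketched below. This is not the `std` implementation: `read_to_end_zeroed` is a hypothetical free function, the chunk size is arbitrary, and the return type is an assumption of the sketch rather than the signature shown earlier.

```rust
use std::io::{self, Read};

/// Hypothetical helper: reads until EOF, zero-filling each chunk before
/// handing it to the (unknown, safe) `read` implementation.
/// Returns the number of bytes appended to `buf`.
fn read_to_end_zeroed<R: Read>(reader: &mut R, buf: &mut Vec<u8>) -> io::Result<usize> {
    const CHUNK: usize = 4096; // arbitrary size chosen for this sketch
    let start = buf.len();
    loop {
        let filled = buf.len();
        // Grow the vector with zeros so `read` only ever sees initialized bytes.
        buf.resize(filled + CHUNK, 0);
        match reader.read(&mut buf[filled..]) {
            Ok(0) => {
                // EOF: drop the unused zeroed tail.
                buf.truncate(filled);
                return Ok(filled - start);
            }
            // Clamp the count so an over-reporting reader exposes only zeros.
            Ok(n) => buf.truncate(filled + n.min(CHUNK)),
            // Retry after an interrupted read, discarding the unused tail.
            Err(ref e) if e.kind() == io::ErrorKind::Interrupted => buf.truncate(filled),
            Err(e) => {
                buf.truncate(filled);
                return Err(e);
            }
        }
    }
}
```

With this shape, a `read` implementation that reports too many bytes can at worst cause zeros to be treated as data. The cost is the extra pass that writes those zeros.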

# Drawbacks

In some cases, this policy may incur a performance overhead due to
having to initialize memory that will just be overwritten
later. However, these situations would be better served by improved
implementation techniques and/or introducing something like a `&out`
pointer expressing this idiom.

In addition, `unsafe` variants of APIs can in most cases be
provided for maximal performance.
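
As a rough illustration of what such an `unsafe` variant might look like, the sketch below uses `std::mem::MaybeUninit` and `Vec::spare_capacity_mut`, which were added to `std` well after this RFC was written. `extend_uninit` is a hypothetical name and the design is only one possibility:

```rust
use std::mem::MaybeUninit;

/// Hypothetical unsafe variant: lets `fill` write into the vector's
/// uninitialized spare capacity without zeroing it first.
///
/// # Safety
/// `fill` must initialize the first `n` bytes of the slice it receives,
/// where `n` is the value it returns.
unsafe fn extend_uninit<F>(buf: &mut Vec<u8>, additional: usize, fill: F) -> usize
where
    F: FnOnce(&mut [MaybeUninit<u8>]) -> usize,
{
    buf.reserve(additional);
    // The spare capacity is typed as `MaybeUninit<u8>`, so it cannot be
    // read as `u8` by safe code.
    let spare = &mut buf.spare_capacity_mut()[..additional];
    let n = fill(spare);
    assert!(n <= additional, "fill reported more bytes than it was given");
    let new_len = buf.len() + n;
    // Sound only if the caller upheld the safety contract above.
    unsafe { buf.set_len(new_len) };
    n
}
```

The responsibility for reporting an honest count moves to whoever writes the `unsafe` call, which matches the policy's intent that such trust never rests on unknown safe code.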

# Alternatives

The main alternative is to limit the definition of safety in Rust to, e.g., having defined
behavior (which generally entails memory and type safety and data-race
freedom). While this is a good baseline, it seems worthwhile to aspire
to greater guarantees where they come at relatively low cost.

# Unresolved questions

Are there APIs in `std` besides the convenience functions in IO that
this policy would affect?