Skip to content

UnsafeAlias for intrusive collections #272

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
RustyYato opened this issue Feb 8, 2021 · 18 comments
Closed

UnsafeAlias for intrusive collections #272

RustyYato opened this issue Feb 8, 2021 · 18 comments
Labels
A-aliasing-model Topic: Related to the aliasing model (e.g. Stacked/Tree Borrows) A-provenance Topic: Related to when which values have which provenance (but not which alias restrictions follow)

Comments

@RustyYato
Copy link

RustyYato commented Feb 8, 2021

I'm currently working with @ckaran on intrusive collections (RustyYato/generic-field-projection#54), and I've come across a snag. Given the following implementation of an intrusive singly-linked-list (different from the PR, but still relevant)

(edit) a more detailed explanation here: #272 (comment)

struct Link {
    next: Cell<*mut Link>,
}

struct Node {
    link: Link,
    value: T
}

It would be impossible to provide a safe API under stacked borrows without allocations. The problem: Link is intrinsically aliased, First by the Node, which owns it, and second by its parent link. This means that getting a &mut Node will invalidate the parent.next, making this linked list unusable. (Last time I checked, MIRI doesn't track raw pointers well enough to catch this, but I believe this is a problem). Currently, the only fix is to heap allocate Node, and then never expose &mut Node, but this is sub-optimal.

Unless I'm missing something crucial, I think UnsafeAlias would resolve this issue. UnsafeAlias as described by @RalfJung here: #148 (comment). Then we could wrap Link in UnsafeAlias somewhere down the line, and this would prevent Link.next from being invalidated.

UnsafeAlias's aliasing condition is: &mut UnsafeAlias<T> is allowed to alias with &T (and possibly &mut T, but I don't think that's necessary, you could use UnsafeAlias<UnsafeCell<T>> for that purpose even if only &T was exposed).

@Diggsey
Copy link

Diggsey commented Feb 8, 2021

What would the new Node and Link types look like with the addition of an UnsafeAlias type?

@RustyYato
Copy link
Author

RustyYato commented Feb 8, 2021

Node could remain as is. I think that Link would change to the following.

struct Link {
    next: UnsafeAlias<Cell<*mut Link>>,
}

Another option is to change Node, but not Link

struct Node {
    link: UnsafeAlias<Link>,
    value: T,
}

I don't think that there is a significant difference between these two ways (in terms of safety), the latter seems like a worse API for Link (but that's out of scope).

@comex
Copy link

comex commented Feb 9, 2021

👍

The pileup of wrapper types is kind of unfortunate from a verbosity perspective. I suppose UnsafeAlias can safely impl Deref, so accessing an UnsafeAlias<Cell<T>> would look no different than if it was just Cell<T>. The type declaration and initialization would still be more verbose, though.

@RustyYato
Copy link
Author

Yes, that's unfortunate. Even worse, the only way to avoid that is to add a more comprehensive pointer system that can handle all these combinations. i.e. a generic pointer type that can encode that it's exclusive/shared, maybe read, maybe write and more at the type-level. But that would basically bring LLVM pointers to the surface language. I don't think that's desirable either. So, as unfortunate as this is, I can't see a way around it ☹️.

@ckaran
Copy link

ckaran commented Feb 9, 2021

@RustyYato I'm actually of exactly the opposite mind, that we need more ways of describing the types of pointers (and of memory in general). Memory mapped IO, exclusive/shared, read-only, write-only, both, video-memory, etc., etc., etc., There are SO MANY things that look like memory but aren't just plain old bits that you can twiddle... and all of them can have an impact on your code. It'd be nice if we had some way of informing the compiler about all of them, so that when we get pointers from particular address ranges the pointer already had some type information coming along with it...

@RustyYato
Copy link
Author

@ckaran I too would like more pointer types, the reason why I said that was because, in previous RFCs that I've made which introduced new pointer types, the core team was initially against adding more pointer types. The reason being that Rust already has a whole host of pointer types and adding new ones would make the language harder to learn.

To stay consistent with UnsafeCell, the only way to solve the problem with non-allocating intrusive collections is to add a UnsafeAlias. I think this should be fine because UnsafeAlias is somewhat niche, only really showing up in intrusive collections and self-referential types. I'm going to write up an RFC for this within the next week if there isn't any opposition here.

An aside

That said, this argument falls flat for me, because not adding pointer types is what lead to UnsafeCell, and now will lead to UnsafeAlias. These types fundamentally add exceptions to a rule, which I find harder to learn than yet another pointer type. To emphasize this, UnsafeCell has been the source of a number of bugs that could have been avoided if we had a specific shared -read-write pointer, but we don't. Fundamentally Rust doesn't have a way to express mutability, which leads to things like UnsafeCell, overloading what &_ means, and in turn making Rust harder to learn.

@RalfJung
Copy link
Member

I haven't yet had time to read the entire thread, but I have some questions about the original question:

Currently, the only fix is to heap allocate Node, and then never expose &mut Node, but this is sub-optimal.

There's hardly any difference between stack and heap allocations, so I do not understand why you say that doing heap allocations would make a difference. Could you say more, ideally with some concrete examples?

Generally, concrete examples for code that currently fails and how you think it should work would really help me understand what you want to do.

(Last time I checked, MIRI doesn't track raw pointers well enough to catch this, but I believe this is a problem)

With -Zmiri-track-raw-pointers, it does track them as precisely as it gets.

@RalfJung RalfJung added A-provenance Topic: Related to when which values have which provenance (but not which alias restrictions follow) A-aliasing-model Topic: Related to the aliasing model (e.g. Stacked/Tree Borrows) labels Feb 10, 2021
@Diggsey
Copy link

Diggsey commented Feb 10, 2021

There's hardly any difference between stack and heap allocations, so I do not understand why you say that doing heap allocations would make a difference. Could you say more, ideally with some concrete examples?

I think it's not about whether it's heap-allocated, but whether the "link" struct is stored inline by the node, or whether the node simply stores a pointer to the link data. If the latter, then you need to heap allocate the link data for practical reasons.

@Diggsey
Copy link

Diggsey commented Feb 10, 2021

For me, I'm unsure that UnsafeAlias is sufficient to expose a safe interface: as soon as user-code is able to access &mut Node, then they can do things like mem::replace your node, including all of its link data.

And if users cannot access &mut Node, there's no need for UnsafeAlias?

@RustyYato
Copy link
Author

RustyYato commented Feb 10, 2021

@RalfJung

There's hardly any difference between stack and heap allocations, so I do not understand why you say that doing heap allocations would make a difference. Could you say more, ideally with some concrete examples?

Generally, concrete examples for code that currently fails and how you think it should work would really help me understand what you want to do.

The only difference is that with a heap allocation, you can prevent &mut Pointer<T> from implying &mut T, i.e. &mut Rc<T> doesn't imply &mut T. This in turn means that &mut Pin<Rc<Node>> doesn't invalidate any of the fields of Node. This leads to the safe interface of:

struct Link {
    next: Cell<*const Link>,
    pin: PhantomPinned,
}

struct Node {
    link: Link,
    value: T,
}

impl Node {
    pub fn new(value: T) -> Pin<Rc<Self>> {
        Rc::pin(Self { ... })
    }
}

Where all the methods of Node use self: Pin<&Self>.

@Diggsey

For me, I'm unsure that UnsafeAlias is sufficient to expose a safe interface: as soon as user-code is able to access &mut Node, then they can do things like mem::replace your node, including all of its link data.

And if users cannot access &mut Node, there's no need for UnsafeAlias?

Pin can be used here to provide a safe interface. With stack pins, you can get a Pin<&mut T> which has all the aliasing problems, but none of the mem::replace problems.

Thinking about this some more, I think that Node will have to be marked UnsafeAlias, not just Link. Because I need to provide a way to convert a *const Link to *const Node, which means that Node could be shared at any given point in time. (Imagine iterating over the linked list, and inspecting each of the values).

So in reality users would interact with UnsafeAlias<Node> instead of Node. (or a wrapper around this).

@comex
Copy link

comex commented Feb 10, 2021

Thinking about this some more, I think that Node will have to be marked UnsafeAlias, not just Link. Because I need to provide a way to convert a *const Link to *const Node, which means that Node could be shared at any given point in time. (Imagine iterating over the linked list, and inspecting each of the values).

So in reality users would interact with UnsafeAlias<Node> instead of Node. (or a wrapper around this).

If you can't ever guarantee that Node is uniquely accessed, what's the point of having any form of &mut Node? What semantic guarantee does it provide?

@RustyYato
Copy link
Author

It's not useful. However, it is useful to have these nodes on the stack. This means that users of Node could create a &mut Node. If this happens, then I believe it will invalidate the next pointer in Link (due to stacked borrows). So it's a case that must be handled in some way, even though it isn't useful.

@digama0
Copy link

digama0 commented Feb 11, 2021

Could you show some code to make this issue clearer? I've been reading everything thus far but I still don't understand what exactly is the aim here, what is getting invalidated when you perform some operation. Do you have a small reproducer, ideally something that does what you want to do and fails miri? I'm not sure what the code in the OP is supposed to do since it's not even a linked list (the data is only in the head?), but maybe I'm just getting hung up on the wrong thing.

@RustyYato
Copy link
Author

RustyYato commented Feb 11, 2021

Yes, I think I should have explained intrusive collections in more detail at the beginning. Better late than never!

The big idea of intrusive collections is to store the structure of the collection separately from the data in the collection. Usually intrusive collections are node-based, like lists or trees, so the data node holds on to the node of the list or tree instead of the list node/tree node holding onto the data.

[ data: T, link : [next]  ] [ data: T, link: [next]  ] [ data: T, link: [next]  ] [ data: T, link: [next]  ]
                      └───────────────────────┘  └───────────────────────┘  └───────────────────────┘

This means that you could embed multiple collections over the exact same data.

[ data: T, list_for : [next], list_back: [next]  ] [ data: T, list_for : [next], list_back: [next]  ] [ data: T, list_for : [next], list_back: [next]  ]
                          └──────────────────┼────────────────────────────┘  └───────────────┼──┼────────────────────────────┘                  │
                                             └───────────────────────────────────────────────┘  └───────────────────────────────────────────────┘

For this to be useful, we need some operation to go from a link (link, list_for, list_back) to a node ([ data: T, link : [next] ]). This is possible for all Sized types, (and potentially for unsized types, but that's out of scope), and is where generic-field-projection comes in. It allows you to make these sorts of inverse field access, to get the parent of a field. This means that in theory, every node is always shared because something could be iterating over the entire list at any given point in time.

You can look at this file for an implementation of a singly linked list, and a doubly linked list. https://github.com/RustyYato/generic-field-projection/blob/linked-list-DO-NOT-MERGE/temp/src/lib.rs
The node type is defined at Line 165, and the relevant tests start at line 252

Looking at the code again, it looks like MIRI does catch a violation in the higher-level code, once the invalidated raw pointers are turned into references.

The following snippet works fine:

#[derive(Field)]
pub struct Foo {
    x:    Box<i32>,
    y:    Box<i32>,
    link: DoubleLink,
}


#[test]
fn foo() {
    let mut foo = Foo::new();
    let mut bar = Foo::new();
    let mut yam = Foo::new();
    
    // snip
    
    unsafe {
        foo.link_next(&yam);
        foo.link_next(&bar);
                
        assert_eq!(foo.prev().map(Foo::get), None);
        assert_eq!(foo.next().map(Foo::get), Some((30, 40)));
        assert_eq!(bar.prev().map(Foo::get), Some((10, 20)));
        assert_eq!(bar.next().map(Foo::get), None);
        assert_eq!(yam.prev().map(Foo::get), None);
        assert_eq!(yam.next().map(Foo::get), None);
    }

    // snip
}

However if we apply this diff, MIRI fails at assert_eq!(bar.prev().map(Foo::get), Some((10, 20)));

    unsafe {
        foo.link_next(&yam);
        foo.link_next(&bar);

+        let _ = &mut foo;

You can check out this exact code at the provided file, and apply the diff to see the MIRI violation for yourself.

complete MIRI error
running 1 test
error: Undefined Behavior: trying to reborrow for SharedReadOnly at alloc73309, but parent tag <untagged> does not have an appropriate item in the borrow stack
   --> temp\src\lib.rs:239:24
    |
239 |             .map(|foo| &*foo.as_ptr())
    |                        ^^^^^^^^^^^^^^ trying to reborrow for SharedReadOnly at alloc73309, but parent tag <untagged> does not have an appropriate item in the borrow stack
    |
    = help: this indicates a potential bug in the program: it performed an invalid operation, but the rules it violated are still experimental
    = help: see https://github.com/rust-lang/unsafe-code-guidelines/blob/master/wip/stacked-borrows.md for further information

    = note: inside closure at temp\src\lib.rs:239:24
    = note: inside `std::option::Option::<std::ptr::NonNull<Foo>>::map::<&Foo, [closure@temp\src\lib.rs:239:18: 239:38]>` at <HOME>\.rustup\toolchains\nightly-x86_64-pc-windows-msvc\lib\rustlib\src\rust\library\core\src\option.rs:487:29
note: inside `Foo::prev` at temp\src\lib.rs:237:9
   --> temp\src\lib.rs:237:9
    |
237 | /         self.link
238 | |             .prev_from(Foo::fields().link)
239 | |             .map(|foo| &*foo.as_ptr())
    | |______________________________________^
note: inside `foo` at temp\src\lib.rs:274:20
   --> temp\src\lib.rs:274:20
    |
274 |         assert_eq!(bar.prev().map(Foo::get), Some((10, 20)));
    |                    ^^^^^^^^^^
note: inside closure at temp\src\lib.rs:252:1
   --> temp\src\lib.rs:252:1
    |
252 | / fn foo() {
253 | |     let mut foo = Foo::new();
254 | |     let mut bar = Foo::new();
255 | |     let mut yam = Foo::new();
...   |
306 | |     }
307 | | }
    | |_^
    = note: inside `<[closure@temp\src\lib.rs:252:1: 307:2] as std::ops::FnOnce<()>>::call_once - shim` at <HOME>\.rustup\toolchains\nightly-x86_64-pc-windows-msvc\lib\rustlib\src\rust\library\core\src\ops\function.rs:227:5
    = note: inside `<fn() as std::ops::FnOnce<()>>::call_once - shim(fn())` at <HOME>\.rustup\toolchains\nightly-x86_64-pc-windows-msvc\lib\rustlib\src\rust\library\core\src\ops\function.rs:227:5
    = note: inside `test::__rust_begin_short_backtrace::<fn()>` at <HOME>\.rustup\toolchains\nightly-x86_64-pc-windows-msvc\lib\rustlib\src\rust\library\test\src\lib.rs:556:5
    = note: inside closure at <HOME>\.rustup\toolchains\nightly-x86_64-pc-windows-msvc\lib\rustlib\src\rust\library\test\src\lib.rs:547:30
    = note: inside `<[closure@test::run_test::{closure#2}] as std::ops::FnOnce<()>>::call_once - shim(vtable)` at <HOME>\.rustup\toolchains\nightly-x86_64-pc-windows-msvc\lib\rustlib\src\rust\library\core\src\ops\function.rs:227:5
    = note: inside `<std::boxed::Box<dyn std::ops::FnOnce() + std::marker::Send> as std::ops::FnOnce<()>>::call_once` at <HOME>\.rustup\toolchains\nightly-x86_64-pc-windows-msvc\lib\rustlib\src\rust\library\alloc\src\boxed.rs:1487:9
    = note: inside `<std::panic::AssertUnwindSafe<std::boxed::Box<dyn std::ops::FnOnce() + std::marker::Send>> as std::ops::FnOnce<()>>::call_once` at <HOME>\.rustup\toolchains\nightly-x86_64-pc-windows-msvc\lib\rustlib\src\rust\library\std\src\panic.rs:322:9
    = note: inside `std::panicking::r#try::do_call::<std::panic::AssertUnwindSafe<std::boxed::Box<dyn std::ops::FnOnce() + std::marker::Send>>, ()>` at <HOME>\.rustup\toolchains\nightly-x86_64-pc-windows-msvc\lib\rustlib\src\rust\library\std\src\panicking.rs:379:40
    = note: inside `std::panicking::r#try::<(), std::panic::AssertUnwindSafe<std::boxed::Box<dyn std::ops::FnOnce() + std::marker::Send>>>` at <HOME>\.rustup\toolchains\nightly-x86_64-pc-windows-msvc\lib\rustlib\src\rust\library\std\src\panicking.rs:343:19
    = note: inside `std::panic::catch_unwind::<std::panic::AssertUnwindSafe<std::boxed::Box<dyn std::ops::FnOnce() + std::marker::Send>>, ()>` at <HOME>\.rustup\toolchains\nightly-x86_64-pc-windows-msvc\lib\rustlib\src\rust\library\std\src\panic.rs:396:14
    = note: inside `test::run_test_in_process` at <HOME>\.rustup\toolchains\nightly-x86_64-pc-windows-msvc\lib\rustlib\src\rust\library\test\src\lib.rs:578:18
    = note: inside closure at <HOME>\.rustup\toolchains\nightly-x86_64-pc-windows-msvc\lib\rustlib\src\rust\library\test\src\lib.rs:486:39
    = note: inside `test::run_test::run_test_inner` at <HOME>\.rustup\toolchains\nightly-x86_64-pc-windows-msvc\lib\rustlib\src\rust\library\test\src\lib.rs:511:13
    = note: inside `test::run_test` at <HOME>\.rustup\toolchains\nightly-x86_64-pc-windows-msvc\lib\rustlib\src\rust\library\test\src\lib.rs:544:28
    = note: inside `test::run_tests::<[closure@test::run_tests_console::{closure#2}]>` at <HOME>\.rustup\toolchains\nightly-x86_64-pc-windows-msvc\lib\rustlib\src\rust\library\test\src\lib.rs:302:13
    = note: inside `test::run_tests_console` at <HOME>\.rustup\toolchains\nightly-x86_64-pc-windows-msvc\lib\rustlib\src\rust\library\test\src\console.rs:289:5
    = note: inside `test::test_main` at <HOME>\.rustup\toolchains\nightly-x86_64-pc-windows-msvc\lib\rustlib\src\rust\library\test\src\lib.rs:123:15
    = note: inside `test::test_main_static` at <HOME>\.rustup\toolchains\nightly-x86_64-pc-windows-msvc\lib\rustlib\src\rust\library\test\src\lib.rs:142:5
    = note: inside `main`
    = note: inside `<fn() as std::ops::FnOnce<()>>::call_once - shim(fn())` at <HOME>\.rustup\toolchains\nightly-x86_64-pc-windows-msvc\lib\rustlib\src\rust\library\core\src\ops\function.rs:227:5
    = note: inside `std::sys_common::backtrace::__rust_begin_short_backtrace::<fn(), ()>` at <HOME>\.rustup\toolchains\nightly-x86_64-pc-windows-msvc\lib\rustlib\src\rust\library\std\src\sys_common\backtrace.rs:125:18
    = note: inside closure at <HOME>\.rustup\toolchains\nightly-x86_64-pc-windows-msvc\lib\rustlib\src\rust\library\std\src\rt.rs:66:18
    = note: inside `std::ops::function::impls::<impl std::ops::FnOnce<()> for &dyn std::ops::Fn() -> i32 + std::marker::Sync + std::panic::RefUnwindSafe>::call_once` at <HOME>\.rustup\toolchains\nightly-x86_64-pc-windows-msvc\lib\rustlib\src\rust\library\core\src\ops\function.rs:259:13
    = note: inside `std::panicking::r#try::do_call::<&dyn std::ops::Fn() -> i32 + std::marker::Sync + std::panic::RefUnwindSafe, i32>` at <HOME>\.rustup\toolchains\nightly-x86_64-pc-windows-msvc\lib\rustlib\src\rust\library\std\src\panicking.rs:379:40
    = note: inside `std::panicking::r#try::<i32, &dyn std::ops::Fn() -> i32 + std::marker::Sync + std::panic::RefUnwindSafe>` at <HOME>\.rustup\toolchains\nightly-x86_64-pc-windows-msvc\lib\rustlib\src\rust\library\std\src\panicking.rs:343:19
    = note: inside `std::panic::catch_unwind::<&dyn std::ops::Fn() -> i32 + std::marker::Sync + std::panic::RefUnwindSafe, i32>` at <HOME>\.rustup\toolchains\nightly-x86_64-pc-windows-msvc\lib\rustlib\src\rust\library\std\src\panic.rs:396:14
    = note: inside `std::rt::lang_start_internal` at <HOME>\.rustup\toolchains\nightly-x86_64-pc-windows-msvc\lib\rustlib\src\rust\library\std\src\rt.rs:51:25
    = note: inside `std::rt::lang_start::<()>` at <HOME>\.rustup\toolchains\nightly-x86_64-pc-windows-msvc\lib\rustlib\src\rust\library\std\src\rt.rs:65:5
    = note: this error originates in an attribute macro (in Nightly builds, run with -Z macro-backtrace for more info)

This shows that just having a &mut Node is illegal for intrusive collections wrt stacked borrows.

@digama0
Copy link

digama0 commented Feb 11, 2021

I think that the error is actually on this line:

    let mut foo = Foo::new();
    let mut bar = Foo::new();
    let mut yam = Foo::new();

and more specifically, that you continue to use these variables in the tests even after using functions like foo.insert_next(&yam) that takes effectively a borrow of yam that is invalidated by field access on yam. Variables directly on the stack created by let x = ... are owned by the stack frame, so they already implicitly have mutable access via these paths and other access paths are unsound. The let _ = &mut foo; is just making something happen that actually triggers the UB by reasserting this unique access, but it was unsound already before that.

As long as you are directly manipulating these stack variables, I think you will have this issue. But another API would be to guard these variables by immediately taking a shared borrow out on them, and the guard mediates all access to the intrusive collection. That way, mutable access to the guard means nothing additional (or rather, it means what you want it to mean), and the presence of the guard ensures that the client can't write &mut foo without dropping the guard, after which point the whole list is considered destroyed. You might need a scoped variable pattern to ensure the guard can't be recreated, or perhaps the API can be such that recreating the guard is okay but resets the list or otherwise restores a valid state.

@comex
Copy link

comex commented Feb 11, 2021

As long as you are directly manipulating these stack variables, I think you will have this issue. But another API would be to guard these variables by immediately taking a shared borrow out on them, and the guard mediates all access to the intrusive collection. That way, mutable access to the guard means nothing additional (or rather, it means what you want it to mean), and the presence of the guard ensures that the client can't write &mut foo without dropping the guard, after which point the whole list is considered destroyed. You might need a scoped variable pattern to ensure the guard can't be recreated, or perhaps the API can be such that recreating the guard is okay but resets the list or otherwise restores a valid state.

Which is similar to how pinning works.

I still think UnsafeAlias is useful, but only in situations where parts of a struct can be accessed exclusively while other parts can't. Though admittedly, if you want a safe API that isn't subject to mem::replace shenanigans, then with few exceptions, such a struct must have mutable references to it guarded by a wrapper type like Pin. And in that case, the wrapper type, while behaving like &mut T, could internally store a *mut T, avoiding the need for UnsafeAlias.

@RustyYato
Copy link
Author

I still think UnsafeAlias is useful, but only in situations where parts of a struct can be accessed exclusively while other parts can't.

Yes, that's also what I think. I incorrectly thought that this applied to intrusive collections, but this discussion has proven that intrusive collections can't be partially exclusive. So I don't think that UnsafeAlias applied in this case. UnsafeAlias seems to be very specific to self-referential types, and almost nothing else (but that should be its own issue)!

Given that this is fundamentally unworkable, I think I will close this issue. I'll follow @digama0's advice on creating a guard for Node that prevents access to &mut Node (directly or indirectly) to safely abstract Node.

@RalfJung
Copy link
Member

UnsafeAlias seems to be very specific to self-referential types, and almost nothing else (but that should be its own issue)!

And that issue already exists: #148

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-aliasing-model Topic: Related to the aliasing model (e.g. Stacked/Tree Borrows) A-provenance Topic: Related to when which values have which provenance (but not which alias restrictions follow)
Projects
None yet
Development

No branches or pull requests

6 participants