Guarantee slice representation #3775
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,205 @@ | ||
- Feature Name: guaranteed_slice_repr | ||
- Start Date: 2025-02-18 | ||
- RFC PR: [rust-lang/rfcs#0000](https://github.com/rust-lang/rfcs/pull/0000) | ||
- Rust Issue: [rust-lang/rust#0000](https://github.com/rust-lang/rust/issues/0000) | ||
|
||
# Summary | ||
[summary]: #summary | ||
|
||
This RFC guarantees the in-memory representation of slice and str references. | ||
Specifically, `&[T]` and `&mut [T]` are guaranteed to have the same layout as: | ||
|
||
```rust | ||
#[repr(C)] | ||
For the calling convention slices are actually passed as two separate arguments rather than as a single struct argument. Depending on the calling convention the latter can cause it to be passed on the stack, which would likely be a perf regression. Edit: The reference level explanation says that this guarantee is just for the memory layout.

It’s clarified later in the guide level explanation that the RFC does not intend to say anything about calling conventions, only memory layout:

Turns out this splitting behavior only applies to the Rust ABI: https://rust.godbolt.org/z/vfEWj5jja (and arm64: https://rust.godbolt.org/z/GoGPex1GM) That makes me a lot less worried about guaranteeing it for the C calling convention. In I'm now slightly in favor of making this guarantee. (Me actually verifying this claim was prompted by https://internals.rust-lang.org/t/what-prevents-rust-from-making-slices-ffi-safe/23049/7)

Might it not be better to guarantee that the splitting does occur? Not only does it improve performance, but it also leads to function signatures that are more typical for C, where

I think the RFC is right to only propose anything for memory layout and nothing for calling conventions. There's much more room for regret when committing to a particular calling convention for slices than when committing to a memory layout. And there's also much less upside: if you're passing a slice by value over FFI, it's a local decision to take it apart and pass the pointer and length however you want. In contrast, if the memory layout of a slice somewhere in memory isn't what you need (i.e., not possible to work with from outside Rust), you may have to change a

@Jules-Bertholet I think we would rather simply not promise anything right now about the calling convention of these types, certainly not that it gets split across arguments, because such
|
||
struct Slice<T> { | ||
data: *const T, | ||
len: usize, | ||
Why this specific order? Why not length before pointer? Are there any plausible reasons to prefer one over the other, e.g., based on target architecture?

Note that our current layout algorithm is best able to exploit any niche in Due to various quirks of our existing APIs, the

@workingjubile Why would

Oh, clever! Thanks for describing that-- I'll mention it in the RFC.

@cramertj It's really important that the length have a niche in Sadly

Oh, also, we want to ensure to leave space for size-and-alignment based niches in references too. I guess for a slice there's always the "zero size" case, so that simplifies to just alignment niches, but that still means

Theoretically, we could have been even stricter, additionally requiring that ptr + length not overflow the address space.

I have a draft PR where I tried implementing the niche in the length field, it didn't yield much perf improvements. Maybe I should try separating the metadata and the niche. And I haven't attempted reordering the fields yet either.

I think it's better to use
struct Slice<T> {
    data: NonNull<T>,
    len: usize,
}

"same layout as" here means field order, alignment and size, not validity. Validity requirements are mentioned further down.

The

Yes, but it's just a matter of extending

I assume you mean

As already discussed in other comment threads, there are good reasons to not guarantee how bare |
||
} | ||
``` | ||
|
||
The layout of `&str` is the same as that of `&[u8]`, and the layout of | ||
cramertj marked this conversation as resolved.
|
||
`&mut str` is the same as that of `&mut [u8]`. | ||
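
As a rough illustration of what this summary claims, the sketch below compares the size and alignment of slice references against a local mirror of the `Slice<T>` struct. The function name `same_size_and_align` is hypothetical, and the checks only observe what the current compiler happens to do; they say nothing about field order or validity.

```rust
use std::mem::{align_of, size_of};

// Local mirror of the struct shown in the summary (not a std type).
#[allow(dead_code)]
#[repr(C)]
struct Slice<T> {
    data: *const T,
    len: usize,
}

/// Hypothetical helper: true if `&[T]` and `&mut [T]` have the same size and
/// alignment as the `#[repr(C)]` mirror struct on this compiler and target.
fn same_size_and_align<T>() -> bool {
    size_of::<&[T]>() == size_of::<Slice<T>>()
        && align_of::<&[T]>() == align_of::<Slice<T>>()
        && size_of::<&mut [T]>() == size_of::<Slice<T>>()
        && align_of::<&mut [T]>() == align_of::<Slice<T>>()
}

fn main() {
    assert!(same_size_and_align::<u8>());
    assert!(same_size_and_align::<u64>());
    // Zero-sized element types still use the pointer + length pair.
    assert!(same_size_and_align::<()>());
    // `&str` is the same size as `&[u8]`.
    assert_eq!(size_of::<&str>(), size_of::<&[u8]>());
}
```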
|
||
# Motivation | ||
[motivation]: #motivation | ||
|
||
This RFC allows non-Rust (e.g. C or C++) code to read from or write to existing | ||
slices and to declare slice fields or locals. | ||
|
||
For example, guaranteeing the representation of slice references allows | ||
non-Rust code to read from the `data` or `len` fields of `string` in the type | ||
below without intermediate FFI calls into Rust: | ||
|
||
```rust | ||
#[repr(C)] | ||
struct HasString { | ||
string: &'static str, | ||
} | ||
``` | ||
|
||
Note: prior to this RFC, the type above is not even properly `repr(C)`, since the | ||
size and alignment of slice references are not guaranteed. However, the Rust compiler | ||
accepts the `repr(C)` declaration above without warning. | ||
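
The following sketch imitates, from within Rust, what foreign code would do with `HasString`: it reinterprets the struct as a pointer/length pair. `RawStr` and `read_raw` are hypothetical names standing in for a declaration on the C or C++ side, and the cast relies on exactly the field order, size, and alignment that this RFC proposes to guarantee.

```rust
use std::ptr;

#[repr(C)]
struct HasString {
    string: &'static str,
}

// Hypothetical mirror of what a C header might declare for the `string` field.
#[repr(C)]
#[derive(Clone, Copy)]
struct RawStr {
    data: *const u8,
    len: usize,
}

/// Reads the `string` field the way foreign code would: as raw parts.
fn read_raw(value: &HasString) -> RawStr {
    // SAFETY: assumes `&str` is laid out as `{ data, len }`, which is the
    // layout proposed by this RFC (and what current compilers produce).
    unsafe { ptr::read(value as *const HasString as *const RawStr) }
}

fn main() {
    let value = HasString { string: "hello" };
    let raw = read_raw(&value);
    assert_eq!(raw.len, 5);
    assert_eq!(raw.data, value.string.as_ptr());
}
```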
|
||
# Guide-level explanation | ||
You appear to have deleted the reference-level explanation section from the template. I would encourage you to split this back into two parts: the guide level informal explanation that talks about how it allows you to read things from C, but then also have a reference-level explanation of exactly what it's guaranteeing, without ever using the word "layout", because "layout" means too many different things to different people. (Some people think it means just

I'm absolutely in favour of doing this RFC, see this old Zulip thread, so long as it's clearly scoped to the parts that really are uncontroversial. Notably, I don't think that any description that include

So I want to see a precise, positively-specified list of exactly what we're RFCing. A quick stab at it:

I don't know if we stop there, but if that's all we're committing to I think it's uncontroversial -- and I bet it's what a whole bunch of unsafe code in the ecosystem already assumes anyway, de facto, since it's been true since at least 1.1.0. I don't know what, if anything, we want to promise or require about the validity invariant, especially since that's already defined to be different between

Limiting it to platforms where the size and alignment of usize and a pointer are the same seems unnecessarily restrictive. Is there a difficulty with differing size and alignment I'm not aware of?

Yes, IIRC we still haven't decided if Last discussion on it (that I recall at the moment):

Sure. But even if it is decided that `#[repr(C)] struct Slice<T> { data: NonNull<T>, len: usize, }` ? Another consideration is how the value would be passed by value as a function argument over FFI. I believe at least on x86_64, the

passing by value through function arguments/return is specifically excluded by this RFC because that's much more complex and we may want to change it. In practice Rust passes a slice by value as separate pointer and length arguments, rather than as a struct. I've heard that you can't express what it does for return in C code on x86-64 (or x86? icr).

@tmccombs AFAIK every platform we currently support meets those requirements, so they're there more as a way to make writing out what's guaranteed easy, and because anywhere that doesn't meet those requirements there's a meaningful conversation to be had about what the layout should be. For example, suppose there was a platform with (size: 16, align:8) pointers and (size: 8, align:8) slice-metadata-type. Should

That's intentional, yes. Does this need to define an ABI for it? I don't know. Maybe we can accomplish most of the goals by requiring that it be a field in a struct that's passed by pointer, or just passing it as

Basically, I think there's simple and non-controversial set of things that we can approve easily that'll be useful, even if it's not everything that everyone might one day want. Let's land those parts first, then a later RFC that wants to, say, spend the time doing the ABI details survey to figure out what's practical can do that, but we can avoid worrying about it for now. |
||
[guide-level-explanation]: #guide-level-explanation | ||
|
||
Slice references are represented with a pointer and length pair. Their in-memory | ||
layout is the same as a `#[repr(C)]` struct like the following: | ||
Comment on lines +47 to +48

Can we use more concrete language than "in-memory layout"? Or link to a definition if that's a spec term. E.g.
|
||
|
||
```rust | ||
#[repr(C)] | ||
struct Slice<T> { | ||
data: *const T, | ||
len: usize, | ||
Comment on lines +47 to +54

If we wished to extend Rust to allow references to unsized types that encompass unsized types... for example, I feel this "multiple metadata" possibility should be considered and either explicitly reserved as a future possibility or explicitly dismissed. I feel accepting this RFC as-is could be interpreted as foreclosing it, but there might forever be grumbling about how "it doesn't say...!" Thus if we don't want to do that, we should be clear, and if we're still open to that possibility, we should also be clear.

I considered this, but IMO it seems difficult to imagine how I suppose we could make it work if all of the elements were the same type, e.g. coercing Personally, I'd be happy to restrict this RFC to specify that this only applies to

we could make the RFC compatible with
&[[U]]
->
{ data: *const [U], len: usize }
->
{ data: { data: *const U, len: usize }, len: usize }
so I don't see how this definition of

You could either require homogeneity like you suggest, or you could have the metadata be its own pointer to a separate region of memory, sharing the lifetime of the data pointer. (I wrote a very experimental crate that does something like the latter: https://github.com/Jules-Bertholet/unsized-vec) But since current Rust doesn’t support this, and nothing in this RFC would interfere with someday adopting either option, it’s not a concern IMO.

another valid extension is to just insert more fields for metadata between the pointer and length or change the length to be a struct-- these match how the pointer metadata APIs work more closely.

changing the metadata (length) from

if we still want to have zero-cost unsizing the type
// unsizing from &[i32; 5] to &[i32] is free
let y: &[i32] = &[1, 2, 3, 4, 5];
// unsizing a `&[[f64; 3]; 2]` to `&[[f64]]` should also not involve any allocation.
let x: &[[f64]] = &[[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]];

I don't really think so, that's just wrapping the metadata type in a struct whenever you're not using it as part of a pointer type. This is explicitly similar to for

I think both homogeneous We could consider unsizing options for more general types too like

I think I lost the thread a bit here in terms of how this should affect the layout. AFAICT, there's no reasonable layout we can specify at this point for

Oh, yes, entirely. |
||
} | ||
``` | ||
Comment on lines +50 to +56

While the RFC does describe that the non-null requirements are upheld, because

This RFC should probably at least mention the guaranteed niche transformations exist and their implications, just to make clear it is not contradicting or overruling them. A weaker version was described in RFC 3391 but we strengthened that decision to generalize to similarly-shaped enums, not just Result or Option, post-hoc. I documented them in this test, I don't know where to look it up in the reference: https://github.com/rust-lang/rust/blob/ed49386d3aa3a445a9889707fd405df01723eced/tests/ui/rfcs/rfc-3391-result-ffi-guarantees.rs

Another niche that isn't currently exploited but may be in the future is alignment, e.g. the pointer in

Yes, perhaps more generally we should be clear that "the representation is stable" still allows for transformations to happen "around it", particularly when it comes to ADT tag layout. This means that |
||
|
||
The precise ABI of slice references is not guaranteed, so `&[T]` may not be | ||
passed by-value or returned by-value from an `extern "C" fn`. | ||
Comment on lines +58 to +59

One thing I don't think is well-described here, or at least I feel that based on some other comments it might be at risk of being glossed over when some people engage with this RFC, is that calling convention (AKA parameter and return passing) and in-memory layout are both very different things. Often, "ABI" is used to describe both, because they are both technically part of the literal "application binary interface". Due to arguments and returns sometimes passing through the stack, and almost always when the number of them is large enough, in-memory layout becomes very relevant for almost every calling convention. But they're not the same. In particular, every calling convention is a beautiful and unique snowflake. Thus, we could define the calling convention for BUT I would advise caution in promising compatibility in a very specific way like "just for

I'm happy to amend the RFC to clarify this point! The sentence you highlighted seems like it covers this to me, but perhaps I need to make it bolder/brighter/a headline. Do you have other suggestions for how I could make this more straightforward?

Simply be more boring and avoid vague terms like ABI, and specify exactly what you specify: "This guarantees the in-memory layout only, it does not concern how it is passed or returned through functions". |
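
Because by-value passing is left unspecified, one workaround is to take the slice apart at the FFI boundary and pass the pointer and length as separate scalar arguments. The sketch below shows that pattern under assumed names (`sum_u32` and `sum_via_ffi` are hypothetical); it is only one way to structure such an interface.

```rust
use std::slice;

/// Hypothetical FFI entry point: receives a slice as two scalar arguments
/// instead of a by-value `&[u32]`, whose calling convention is unspecified.
///
/// # Safety
/// `data` and `len` must satisfy the requirements of `slice::from_raw_parts`.
unsafe extern "C" fn sum_u32(data: *const u32, len: usize) -> u32 {
    // SAFETY: forwarded from this function's own contract.
    let values = unsafe { slice::from_raw_parts(data, len) };
    values.iter().copied().fold(0u32, u32::wrapping_add)
}

/// Safe wrapper: splits the slice into its parts at the boundary.
fn sum_via_ffi(values: &[u32]) -> u32 {
    // SAFETY: the parts come from a live `&[u32]`, so they are valid.
    unsafe { sum_u32(values.as_ptr(), values.len()) }
}

fn main() {
    assert_eq!(sum_via_ffi(&[1, 2, 3]), 6);
    assert_eq!(sum_via_ffi(&[]), 0);
}
```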
||
|
||
The validity requirements for the in-memory representation of slice references | ||
are the same as [those documented on `std::slice::from_raw_parts`](https://doc.rust-lang.org/std/slice/fn.from_raw_parts.html) for shared slice references, and | ||
[those documented on `std::slice::from_raw_parts_mut`](https://doc.rust-lang.org/std/slice/fn.from_raw_parts_mut.html) | ||
for mutable slice references. | ||
|
||
Namely: | ||
|
||
* `data` must be non-null, valid for reads (for shared references) or writes | ||
(for mutable references) for `len * mem::size_of::<T>()` many bytes, | ||
and it must be properly aligned. This means in particular: | ||
|
||
* The entire memory range of this slice must be contained within a single allocated object! | ||
Slices can never span across multiple allocated objects. | ||
* `data` must be non-null and aligned even for zero-length slices or slices of ZSTs. One | ||
reason for this is that enum layout optimizations may rely on references | ||
(including slices of any length) being aligned and non-null to distinguish | ||
them from other data. You can obtain a pointer that is usable as `data` | ||
for zero-length slices using [`NonNull::dangling()`]. | ||
|
||
* `data` must point to `len` consecutive properly initialized values of type `T`. | ||
|
||
* The total size `len * mem::size_of::<T>()` of the slice must be no larger than `isize::MAX`, | ||
and adding that size to `data` must not "wrap around" the address space. | ||
See the safety documentation of [`pointer::offset`]. | ||
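
Some of the requirements above can be tested from the raw parts alone; others (that the memory lies in a single allocation and holds initialized values) cannot. The sketch below checks only the testable subset and builds an empty slice from `NonNull::dangling()`. The helper names `parts_look_valid` and `empty_slice` are hypothetical, and passing the check does not by itself make a pointer/length pair valid.

```rust
use std::mem::{align_of, size_of};
use std::ptr::NonNull;
use std::slice;

/// Hypothetical helper: checks non-null, alignment, the `isize::MAX` size
/// limit, and address-space wrap-around. It cannot check allocation bounds
/// or initialization.
fn parts_look_valid<T>(data: *const T, len: usize) -> bool {
    let addr = data as usize;
    let non_null = !data.is_null();
    let aligned = addr % align_of::<T>() == 0;
    let size_ok = match len.checked_mul(size_of::<T>()) {
        // The total size must fit in `isize` and must not wrap the address space.
        Some(bytes) => bytes <= isize::MAX as usize && addr.checked_add(bytes).is_some(),
        None => false,
    };
    non_null && aligned && size_ok
}

/// Builds a zero-length slice from a dangling (but non-null, aligned) pointer.
fn empty_slice<T: 'static>() -> &'static [T] {
    let data = NonNull::<T>::dangling().as_ptr();
    // SAFETY: a zero-length slice only needs a non-null, aligned pointer.
    unsafe { slice::from_raw_parts(data, 0) }
}

fn main() {
    let values = [1u32, 2, 3];
    assert!(parts_look_valid(values.as_ptr(), values.len()));
    assert!(!parts_look_valid(std::ptr::null::<u32>(), 0));
    assert!(empty_slice::<u64>().is_empty());
}
```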
|
||
## `str` | ||
|
||
The layout of `&str` is the same as that of `&[u8]`, and the layout of | ||
`&mut str` is the same as that of `&mut [u8]`. More generally, `str` behaves like | ||
`#[repr(transparent)] struct str([u8]);`. Safe Rust functions may assume that | ||
`str` holds valid UTF-8, but [it is not immediate undefined behavior to store | ||
non-UTF-8 data in `str`](https://doc.rust-lang.org/std/primitive.str.html#invariant). | ||
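
A small sketch of the relationship between `&str` and `&[u8]`, using only safe standard-library calls: viewing a string as bytes leaves the data pointer and length unchanged. The helper name `same_parts` is hypothetical.

```rust
use std::mem::size_of;

/// Hypothetical helper: true if a `&str` and its `&[u8]` view share the same
/// data pointer and the same length (the length is counted in bytes).
fn same_parts(s: &str) -> bool {
    let bytes: &[u8] = s.as_bytes();
    s.as_ptr() == bytes.as_ptr() && s.len() == bytes.len()
}

fn main() {
    assert!(same_parts("hello"));
    // Multi-byte characters: the length is still the byte count.
    assert!(same_parts("héllo"));
    assert_eq!(size_of::<&str>(), size_of::<&[u8]>());
    assert_eq!(size_of::<&mut str>(), size_of::<&mut [u8]>());
}
```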
|
||
## Pointers | ||
|
||
Raw pointers to slices such as `*const [T]` or `*mut str` use the same layout | ||
as slice references, but do not necessarily point to anything. | ||
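
To illustrate that a raw slice pointer carries the same two parts without any validity requirement, the sketch below builds a `*const [u8]` from a null data pointer and reads the parts back without dereferencing. The helper name is hypothetical, and `len()` on raw slice pointers assumes a reasonably recent stable toolchain.

```rust
use std::mem::size_of;
use std::ptr;

/// Hypothetical helper: packs a data pointer and length into a raw slice
/// pointer, then unpacks them again. Nothing is dereferenced.
fn raw_parts_round_trip(data: *const u8, len: usize) -> (*const u8, usize) {
    let raw: *const [u8] = ptr::slice_from_raw_parts(data, len);
    (raw as *const u8, raw.len())
}

fn main() {
    // A null data pointer is fine for a raw pointer, unlike for a reference.
    let (data, len) = raw_parts_round_trip(ptr::null(), 3);
    assert!(data.is_null());
    assert_eq!(len, 3);
    assert_eq!(size_of::<*const [u8]>(), size_of::<&[u8]>());
}
```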
|
||
# Drawbacks | ||
[drawbacks]: #drawbacks | ||
|
||
## Zero-sized types | ||
|
||
One could imagine representing `&[T]` as only `len` for zero-sized `T`. | ||
IIUC Rust currently does not support this optimization and there are no plans for it. Either way, as the first step it may be worth guaranteeing layout for slices iff |
||
This proposal would preclude that choice in favor of a standard representation | ||
for slices regardless of the underlying type. | ||
|
||
Alternatively, we could choose to guarantee that the data pointer is present if | ||
and only if `size_of::<T>() != 0`. This has the possibility of breaking existing | ||
code which smuggles pointers through the `data` value in `from_raw_parts` / | ||
`into_raw_parts`. | ||
|
||
## Uninhabited types | ||
|
||
Similarly, we could be *extra* tricky and make `&[!]` or other `&[Uninhabited]` | ||
types into a ZST since the slice can only ever be length zero. | ||
|
||
If we want to maintain the pointer field, we could also make `&[!]` *just* a | ||
pointer since we know the length can only be zero. | ||
|
||
Either option may offer modest performance benefits for highly generic code | ||
which happens to create empty slices of uninhabited types, but this is unlikely | ||
to be worth the cost of maintaining a special case. | ||
Comment on lines +121 to +123

On the other hand, we could opt not to guarantee this since no one seems to have much of a use case for this right now anyway. |
||
|
||
## Compatibility with C++ `std::span` | ||
|
||
The largest drawback of this layout and set of validity requirements is that it | ||
may preclude `&[T]` from being representationally equivalent to C++'s | ||
I keep getting tripped up reading this section by thinking it says we're committing to something that diverges from C++. I would phrase it as something like
|
||
`std::span<T, std::dynamic_extent>`. | ||
|
||
* `std::span` does not currently guarantee its layout. In practice, pointer + length | ||
is the common representation. This is even observable using `is_layout_compatible` | ||
[on MSVC](https://godbolt.org/z/Y8ardrshY), though not | ||
[on GCC](https://godbolt.org/z/s4v4xehnG) nor | ||
[on Clang](https://godbolt.org/z/qsd1K5oGq). Future changes to guarantee a | ||
different layout in the C++ standard (unlikely due to MSVC ABI stability | ||
requirements) could preclude matching the layout with `&[T]`. | ||
|
||
* Unlike Rust, `std::span` allows the `data` pointer to be `nullptr`. One | ||
possible workaround for this would be to guarantee that `Option<&[T]>` uses | ||
`data: std::ptr::null(), len: 0` to represent the `None` case, making | ||
`std::span<T>` equivalent to `Option<&[T]>` for non-zero-sized types. | ||
Maybe it would be worth a perf run to see if there's any measurable impact of doing this.

This would be a very weird special case, actually. I'd worry about it far more in terms of all the places that would have to deal with it for correctness (in our codegen and in unsafe code) more than I'd worry about any perf implications. Notably,

I think we can guarantee that reading an all-zero chunk of memory produces a

One possible downside is that even guaranteeing

I’m not sure if the compiler internals are equipped to do that, but in theory it’s possible to use the not-null niche for |
||
|
||
Note that this is not currently the case. The compiler currently represents | ||
`None::<&[u8]>` as `data: std::ptr::null(), len: uninit` (though this is | ||
not guaranteed). | ||
|
||
* Rust uses a dangling pointer in the representation of zero-length slices. | ||
It's unclear whether C++ guarantees that a dangling pointer will remain | ||
unchanged when passed through `std::span`. However, it does support | ||
dangling pointers during regular construction via the use of | ||
[`std::to_address`](https://en.cppreference.com/w/cpp/container/span/span) | ||
in the iterator constructors. | ||
|
||
Note that C++ also does not support zero-sized types, so there is no naive way | ||
to represent types like `std::span<SomeZeroSizedRustType>`. | ||
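
The `Option<&[T]>` point in the list above can be observed, though not relied upon, with a short sketch. It checks that the option adds no space and reads only the first pointer-sized word of `None::<&[u8]>`. Both helper names are hypothetical, and the second helper assumes the data pointer is stored first, as in the `Slice<T>` mirror; the results describe current compiler behavior only.

```rust
use std::mem::size_of;
use std::ptr;

/// Hypothetical helper: true if wrapping `&[T]` in `Option` costs no extra
/// space on this compiler.
fn option_slice_is_free<T>() -> bool {
    size_of::<Option<&[T]>>() == size_of::<&[T]>()
}

/// Hypothetical helper: reads the first pointer-sized word of `None::<&[u8]>`.
/// Assumption: the data pointer is the first field of the representation.
fn none_data_word() -> *const u8 {
    let none: Option<&[u8]> = None;
    // SAFETY: only the first word is read; the length word, which may be
    // uninitialized, is never touched.
    unsafe { ptr::read(&none as *const Option<&[u8]> as *const *const u8) }
}

fn main() {
    assert!(option_slice_is_free::<u8>());
    // Observed, not guaranteed: `None` is encoded with a null data pointer.
    assert!(none_data_word().is_null());
}
```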
|
||
## Flexibility | ||
|
||
Additionally, guaranteeing layout of Rust-native types limits the compiler's and | ||
standard library's ability to change and take advantage of new optimization | ||
opportunities. | ||
|
||
# Rationale and alternatives | ||
crABI provides a way to pass slices to/from C without locking in our representation #3470, that is probably worth mentioning somewhere.

In a related way, maybe we could do something like guarantee the layout only in

It could be good to mention slice DST references. Their layout is, in practice, the same as the layout of a slice reference, and so this RFC could easily stabilize their layout as well (albeit with @RalfJung may have opinions on why this is more subtle than I realize 😛 If it's actually straightforward, though, then it'd be worth considering including in this RFC. cc @jswrenn

I'm not aware of any opsem subtleties in this area -- all concerns are of the form "do we really want to make this a hard guarantee we cannot change again later". |
||
[rationale-and-alternatives]: #rationale-and-alternatives | ||
|
||
* We could avoid committing to a particular representation for slices. | ||
|
||
* We could try to guarantee layout compatibility with a particular target's | ||
`std::span` representation, though without standardization this may be | ||
impossible. Multiple different C++ stdlib implementations may be used on | ||
the same platform and could potentially have different span representations. | ||
In practice, current span representations also use ptr+len pairs. | ||
|
||
* We could avoid storing a data pointer for zero-sized types. This would result | ||
in a more compact representation but would mean that the representation of | ||
`&[T]` is dependent on the type of `T`. Additionally, this would break | ||
existing code which depends on storing data in the pointer of ZST slices. | ||
|
||
This would break popular crates such as [bitvec](https://docs.rs/crate/bitvec/1.0.1/source/doc/ptr/BitSpan.md) | ||
(55 million downloads) and would result in strange behavior such as | ||
`std::ptr::slice_from_raw_parts(ptr, len).as_ptr()` returning a different | ||
pointer from the one that was passed in. | ||
|
||
Types like `*const ()` / `&()` are widely used to pass around pointers today. | ||
We cannot make them zero-sized, and it would be surprising to make a | ||
different choice for `&[()]`. | ||
Comment on lines +185 to +187

Maybe we should have warned against this and made everyone use |
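
The pointer-smuggling pattern mentioned in this section can be sketched with a raw pointer to a slice of zero-sized elements: the address and length both survive a round trip even though the elements occupy no memory. The helper name `smuggle` and the address `0x1000` are arbitrary illustrations; nothing is dereferenced.

```rust
use std::mem::size_of;
use std::ptr;

/// Hypothetical helper: stores an arbitrary address and length in a
/// `*const [()]` and reads them back. No memory is accessed.
fn smuggle(addr: usize, len: usize) -> (usize, usize) {
    let raw: *const [()] = ptr::slice_from_raw_parts(addr as *const (), len);
    (raw as *const () as usize, raw.len())
}

fn main() {
    assert_eq!(smuggle(0x1000, 7), (0x1000, 7));
    // On current targets, a slice reference to a ZST is still two words wide.
    assert_eq!(size_of::<&[()]>(), 2 * size_of::<usize>());
}
```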
||
|
||
|
||
# Prior art | ||
[prior-art]: #prior-art | ||
|
||
The layout in this RFC is already documented in | ||
[the Unsafe Code Guidelines Reference](https://rust-lang.github.io/unsafe-code-guidelines/layout/pointers.html). | ||
|
||
# Future possibilities | ||
[future-possibilities]: #future-possibilities | ||
|
||
* Consider defining a separate Rust type which is repr-equivalent to the platform's | ||
native `std::span<T, std::dynamic_extent>` to allow for easier | ||
interoperability with C++ APIs. Unfortunately, the C++ standard does not | ||
guarantee the layout of `std::span` (though the representation may be known | ||
and fixed on a particular implementation, e.g. libc++/libstdc++/MSVC). | ||
Zero-sized types would also not be supported with a naive implementation of | ||
such a type. |