| 1 | +- Feature Name: extern_types |
| 2 | +- Start Date: 2017-01-18 |
| 3 | +- RFC PR: https://github.com/rust-lang/rfcs/pull/1861 |
| 4 | +- Rust Issue: https://github.com/rust-lang/rust/issues/43467 |
| 5 | + |
| 6 | +# Summary |
| 7 | +[summary]: #summary |
| 8 | + |
| 9 | +Add an `extern type` syntax for declaring types which are opaque to Rust's type |
| 10 | +system. |
| 11 | + |
| 12 | +# Motivation |
| 13 | +[motivation]: #motivation |
| 14 | + |
| 15 | +When interacting with external libraries we often need to be able to handle pointers to data that we don't know the size or layout of. |
| 16 | + |
| 17 | +In C it's possible to declare a type but not define it. |
| 18 | +These incomplete types can only be used behind pointers; a compilation error results if the user tries to use them in a way that would require the compiler to know their layout. |
| 19 | + |
| 20 | +In Rust, we don't have this feature. Instead, a couple of problematic hacks are used in its place. |
| 21 | + |
| 22 | +One is to define the type as an uninhabited type, e.g.: |
| 23 | + |
| 24 | +```rust |
| 25 | +enum MyFfiType {} |
| 26 | +``` |
| 27 | + |
| 28 | +Another is to define the type with a private field and no way to construct it: |
| 29 | + |
| 30 | +```rust |
| 31 | +struct MyFfiType { |
| 32 | + _priv: (), |
| 33 | +} |
| 34 | +``` |
| 35 | + |
| 36 | +The point of both these constructions is to prevent the user from being able to create or deal directly with instances of the type. |
| 37 | +Neither of these types accurately reflects the reality of the situation. |
| 38 | +The first definition is logically problematic as it defines a type which can never exist. |
| 39 | +This means that references to the type can also, logically, never exist, and raw pointers to the type are guaranteed to be |
| 40 | +invalid. |
| 41 | +The second definition says that the type is a zero-sized type (ZST), that we can store it on the stack, and that we can call `ptr::read`, `mem::size_of`, etc. on it. |
| 42 | +None of this is, of course, valid. |
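
To make the objection concrete, the following sketch asks the compiler for the layout it assigns to each workaround. The type names `UninhabitedHandle` and `PrivateFieldHandle` and the two helper functions are hypothetical stand-ins introduced only for this illustration. On stable Rust both types are reported as zero-sized, which is information a truly opaque foreign type should not expose.

```rust
use std::mem::{align_of, size_of};

// Hypothetical stand-in for the uninhabited-enum workaround.
pub enum UninhabitedHandle {}

// Hypothetical stand-in for the private-field workaround.
pub struct PrivateFieldHandle {
    _priv: (),
}

/// Size and alignment that Rust assigns to the private-field workaround.
pub fn private_field_layout() -> (usize, usize) {
    (size_of::<PrivateFieldHandle>(), align_of::<PrivateFieldHandle>())
}

/// Size that Rust assigns to the uninhabited-enum workaround.
pub fn uninhabited_size() -> usize {
    size_of::<UninhabitedHandle>()
}
```

Neither answer says anything about the real foreign object, yet both calls compile without complaint.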
| 43 | + |
| 44 | +The controversy over how to represent foreign types extends even to the standard library; see the discussion in the [libc_types RFC PR](https://github.com/rust-lang/rfcs/pull/1783). |
| 45 | + |
| 46 | +This RFC instead proposes a way to directly express that a type exists but is unknown to Rust. |
| 47 | + |
| 48 | +Finally, the 2017 roadmap lists [integration with other languages](https://github.com/rust-lang/rfcs/blob/master/text/1774-roadmap-2017.md#integration-with-other-languages) as a priority. |
| 49 | +Just like unions, this is an unsafe feature necessary for dealing with legacy code in a correct and understandable manner. |
| 50 | + |
| 51 | +# Detailed design |
| 52 | +[design]: #detailed-design |
| 53 | + |
| 54 | +Add a new kind of type declaration, an extern type: |
| 55 | + |
| 56 | +```rust |
| 57 | +extern { |
| 58 | + type Foo; |
| 59 | +} |
| 60 | +``` |
| 61 | + |
| 62 | +These types are FFI-safe. They are also dynamically sized types (DSTs), meaning that they do not implement `Sized`. Being DSTs, they cannot be kept on the stack, can only be accessed through pointers and references, and cannot be moved from. |
| 63 | + |
| 64 | +In Rust, pointers to DSTs carry metadata about the object being pointed to. |
| 65 | +For strings and slices this is the length of the buffer; for trait objects it is the object's vtable. |
| 66 | +For extern types the metadata is simply `()`. |
| 67 | +This means that a pointer to an extern type has the same size as a `usize` (i.e. it is not a "fat pointer"). |
| 68 | +It also means that if we store an extern type at the end of a container (such as a struct or tuple), pointers to that container will also be thin, the same size as ordinary raw pointers (despite the container as a whole being unsized). |
| 69 | +This is useful for supporting a pattern found in some C APIs, where structs are passed around with arbitrary data appended to them, e.g.: |
| 70 | + |
| 71 | +```rust |
| 72 | +extern { |
| 73 | + type OpaqueTail; |
| 74 | +} |
| 75 | + |
| 76 | +#[repr(C)] |
| 77 | +struct FfiStruct { |
| 78 | + data: u8, |
| 79 | + more_data: u32, |
| 80 | + tail: OpaqueTail, |
| 81 | +} |
| 82 | +``` |
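
As a rough illustration of the pointer-metadata point, the sketch below measures raw pointer sizes on stable Rust for DSTs that exist today. The helper `pointer_words` and the struct `SliceTail` are hypothetical names introduced only for this example, and the two-word results reflect how current compilers lay out fat pointers rather than a documented guarantee. Under this proposal, a pointer to a struct ending in an extern type would instead occupy a single word.

```rust
use std::mem::size_of;

/// Number of `usize`-sized words occupied by a raw pointer to `T`.
pub fn pointer_words<T: ?Sized>() -> usize {
    size_of::<*const T>() / size_of::<usize>()
}

/// Hypothetical struct with a slice tail: unsized today, so pointers to it
/// carry the tail length as metadata.
#[repr(C)]
pub struct SliceTail {
    pub data: u8,
    pub more_data: u32,
    pub tail: [u8],
}
```

Calling `pointer_words::<u8>()` reports one word, while `pointer_words::<[u8]>()`, `pointer_words::<str>()`, `pointer_words::<dyn std::fmt::Debug>()`, and `pointer_words::<SliceTail>()` each report two.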
| 83 | + |
| 84 | +Because extern types are DSTs, `size_of` and `align_of` do not work on them, but we must also ensure that `size_of_val` and `align_of_val` do not work either, since there is not necessarily any way to obtain the size of an extern type at run time. |
| 85 | +For an initial implementation, those functions can just panic, but before this is stabilized there should be some trait bound or similar on them that prevents their use statically. |
| 86 | +The exact mechanism is more the domain of the custom DST RFC, [RFC 1524](https://github.com/rust-lang/rfcs/pull/1524), and so figuring that mechanism out will be delegated to it. |
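
For contrast, existing DSTs can answer `size_of_val` and `align_of_val` because the pointer metadata carries what is needed. The sketch below shows this for a slice and a trait object on stable Rust; the function names `slice_layout` and `debug_object_size` are hypothetical. An extern type, whose metadata is `()`, has nothing comparable to consult.

```rust
use std::fmt::Debug;
use std::mem::{align_of_val, size_of_val};

/// Size and alignment of a slice, computed at run time from the length
/// stored in the fat pointer.
pub fn slice_layout(values: &[u16]) -> (usize, usize) {
    (size_of_val(values), align_of_val(values))
}

/// Size of a trait object, read at run time through its vtable.
pub fn debug_object_size(value: &dyn Debug) -> usize {
    size_of_val(value)
}
```

For example, `slice_layout(&[1u16, 2, 3])` reports six bytes, and `debug_object_size(&7u64)` reports eight.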
| 87 | + |
| 88 | +C's "pointer `void`" (not `()`, but the `void` used in `void*` and similar) is currently defined in two official places: [`std::os::raw::c_void`](https://doc.rust-lang.org/stable/std/os/raw/enum.c_void.html) and [`libc::c_void`](https://doc.rust-lang.org/libc/x86_64-unknown-linux-gnu/libc/enum.c_void.html). |
| 89 | +Unifying these is out of scope for this RFC, but this feature should be used in their definition instead of the current tricks. |
| 90 | +Strictly speaking, this is a breaking change, but the `std` docs explicitly say that `void` shouldn't be used without indirection. |
| 91 | +In the worst case, `libc` can make a breaking change. |
| 92 | + |
| 93 | +# How We Teach This |
| 94 | +[how-we-teach-this]: #how-we-teach-this |
| 95 | + |
| 96 | +Really, the question is "how do we teach *without* this". |
| 97 | +As described above, the current tricks for doing this are wrong. |
| 98 | +Furthermore, they touch upon many advanced corners of the language: zero-sized and uninhabited types are phenomena that few programmers coming from mainstream languages have encountered. |
| 99 | +From reading around other RFCs, issues, and internal threads, one gets a sense of two issues: |
| 100 | +First, even among the group of Rust programmers enthusiastic enough to participate in these fora, the semantics of foreign types are not widely understood. |
| 101 | +Second, there is annoyance that none of the current tricks can become standard, since each is flawed in a different way. |
| 102 | + |
| 103 | +By contrast, `extern type` does exactly what one wants, with an obvious and guessable syntax, without forcing the user to immediately understand all the nuance about why *these* semantics are indeed the right ones. |
| 104 | +As users see various attempts fail (moves, stack variables), they can discover these semantics incrementally. |
| 105 | +The benefits are such that this would soon displace the current hacks, making code in the wild more readable through consistent use of a pattern. |
| 106 | + |
| 107 | +This should be taught in the foreign function interface chapter of the Rust book, replacing its current advice to use uninhabited enums (ack!). |
| 108 | + |
| 109 | +# Drawbacks |
| 110 | +[drawbacks]: #drawbacks |
| 111 | + |
| 112 | +Very slight addition of complexity to the language. |
| 113 | + |
| 114 | +The syntax has the potential to be confused with introducing a type alias, rather than a new nominal type. |
| 115 | +The use of `extern` here is also a bit of a misnomer as the name of the type does not refer to anything external to Rust. |
| 116 | + |
| 117 | +# Alternatives |
| 118 | +[alternatives]: #alternatives |
| 119 | + |
| 120 | +Not do this. |
| 121 | + |
| 122 | +Alternatively, rather than provide a way to create opaque types, we could just offer one distinguished type (`std::mem::OpaqueData` or something like that). |
| 123 | +Then, to create new opaque types, users just declare a struct with a member of type `OpaqueData`. |
| 124 | +This has the advantage of introducing no new syntax, and issues like FFI-compatibility would fall out of existing rules. |
| 125 | + |
| 126 | +Another alternative is to drop the `extern` and allow a declaration to be written `type A;`. |
| 127 | +This removes the (arguably disingenuous) use of the `extern` keyword although it makes the syntax look even more like a type alias. |
| 128 | + |
| 129 | +# Unresolved questions |
| 130 | +[unresolved]: #unresolved-questions |
| 131 | + |
| 132 | +- Should we allow generic lifetime and type parameters on extern types? |
| 133 | + If so, how do they affect the type in terms of variance? |
| 134 | + |
| 135 | +- [In std's source](https://github.com/rust-lang/rust/blob/164619a8cfe6d376d25bd3a6a9a5f2856c8de64d/src/libstd/os/raw.rs#L59-L64), it is mentioned that LLVM expects `i8*` for C's `void*`. |
| 136 | + We'd need to continue to hack this for the two `c_void`s in std and libc. |
| 137 | + But perhaps this should be done across-the-board for all extern types? |
| 138 | + Somebody should check what Clang does. |