Commit 2b4d09c

committed
Modified into a two-stage transition plan involving DST
As suggested by Aaron Turon: #592 (comment)
1 parent 96e7abf commit 2b4d09c

1 file changed

Lines changed: 37 additions & 30 deletions

text/0000-c-str-deref.md

Lines changed: 37 additions & 30 deletions
Original file line number | Diff line number | Diff line change
@@ -23,8 +23,8 @@ fn main() {
2323

2424
The type `std::ffi::CString` is used to prepare string data for passing
2525
as null-terminated strings to FFI functions. This type dereferences to a
26-
DST, `[libc::c_char]`. The slice type, however, is a poor choice for
27-
representing borrowed C string data, since:
26+
DST, `[libc::c_char]`. The slice type as it is, however, is a poor choice
27+
for representing borrowed C string data, since:
2828

2929
1. A slice does not express the C string invariant at compile time.
3030
Safe interfaces wrapping FFI functions cannot take slice references as is
@@ -49,33 +49,53 @@ it makes sense that `CString` gets its own borrowed counterpart.
4949

5050
# Detailed design
5151

52-
## CStr, an Irrelevantly Sized Type
52+
This proposal introduces `CStr`, a type to designate a null-terminated
53+
string. This type does not implement `Sized`, `Copy`, or `Clone`.
54+
References to `CStr` are only safely obtained by dereferencing `CString`
55+
and a few other helper methods, described below. A `CStr` value should provide
56+
no size information, as there is intent to turn `CStr` into an
57+
[unsized type](https://github.com/rust-lang/rfcs/issues/813),
58+
pending resolution on that proposal.
5359

54-
This proposal introduces `CStr`, a token type to designate a null-terminated
55-
string. This type does not implement `Copy` or `Clone` and is only used in
56-
borrowed references. `CStr` is sized, but its size and layout are of no
57-
consequence to its users. It's only safely obtained by dereferencing
58-
`CString` and a few other helper methods, described below.
60+
## Stage 1: CStr, a DST with a weight problem
61+
62+
As current Rust does not have unsized types that are not DSTs, at this stage
63+
`CStr` is defined as a newtype over a character slice:
5964

6065
```rust
6166
#[repr(C)]
6267
pub struct CStr {
63-
head: libc::c_char,
64-
marker: std::marker::NoCopy
68+
chars: [libc::c_char]
6569
}
6670

6771
impl CStr {
6872
pub fn as_ptr(&self) -> *const libc::c_char {
69-
&self.head as *const libc::c_char
73+
self.chars.as_ptr()
7074
}
7175
}
76+
```
77+
78+
`CString` is changed to dereference to `CStr`:
7279

80+
```rust
7381
impl Deref for CString {
7482
type Target = CStr;
7583
fn deref(&self) -> &CStr { ... }
7684
}
7785
```
7886

87+
In implementation, the `CStr` value needs a length for the internal slice.
88+
This RFC provides no guarantees that the length will be equal to the length
89+
of the string, or be any particular value suitable for safe use.
90+
91+
## Stage 2: unsized CStr
92+
93+
If unsized types are enabled later one way or another, the definition
94+
of `CStr` would change to an unsized type with statically sized contents.
95+
The authors of this RFC believe this would constitute no breakage to code
96+
using `CStr` safely. With a view towards this future change, it's recommended
97+
to avoid any unsafe code depending on the internal representation of `CStr`.
98+
7999
## Returning C strings
80100

81101
In cases when an FFI function returns a pointer to a non-owned C string,
@@ -102,7 +122,7 @@ impl CStr {
102122
```
103123

104124
An odd consequence is that it is valid, if wasteful, to call `to_bytes` on
105-
`CString` via auto-dereferencing.
125+
a `CString` via auto-dereferencing.
106126

107127
## Remove c_str_to_bytes
108128

@@ -113,8 +133,10 @@ in favor of composition of the functions described above:
113133

114134
## Proof of concept
115135

116-
The described changes are implemented in crate
117-
[c_string](https://github.com/mzabaluev/rust-c-str).
136+
The described interface changes are implemented in crate
137+
[c_string](https://github.com/mzabaluev/rust-c-str), with the difference
138+
that the `CStr` token type has a bogus static size, as a compromise to
139+
offer better performance in current Rust.
118140

119141
# Drawbacks
120142

@@ -125,25 +147,14 @@ expose the slice in type annotations, parameter signatures and so on,
125147
the change should not be breaking since `CStr` also provides
126148
this method.
127149

128-
Making the deref target practically unsized throws away the length information
150+
Making the deref target unsized throws away the length information
129151
intrinsic to `CString` and makes it less useful as a container for bytes.
130152
This is countered by the fact that there are general purpose byte containers
131153
in the core libraries, whereas `CString` addresses the specific need to
132154
convey string data from Rust to C-style APIs.
133155

134-
While it's not possible outside of unsafe code to unintentionally copy out
135-
or modify the nominal value of `CStr` under an immutable reference, some
136-
unforeseen trouble or confusion can arise due to the structure having a
137-
bogus size. A separate [RFC](https://github.com/rust-lang/rfcs/issues/813),
138-
if accepted, will solve this by opting out of `Sized`.
139-
140156
# Alternatives
141157

142-
`CStr` could be made a newtype on DST `[libc::c_char]`, allowing no-cost
143-
slices. It's not clear if this is useful, and the need to calculate length
144-
up front might prevent some optimized uses possible with the 'thin'
145-
reference.
146-
147158
If the proposed enhancements or other equivalent facilities are not adopted,
148159
users of Rust can turn to third-party libraries for better convenience
149160
and safety when working with C strings. This may result in proliferation of
@@ -152,8 +163,4 @@ is established.
152163

153164
# Unresolved questions
154165

155-
`CStr` can be made a
156-
[truly unsized type](https://github.com/rust-lang/rfcs/issues/813),
157-
pending on that proposal's approval.
158-
159166
Need a `Cow`?
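
Reviewer's sketch of the Stage 1 design in this diff: a newtype over a character slice, plus a `Deref` impl on the owning type. It is a minimal illustration under stated assumptions, not the RFC's or the `c_string` crate's implementation. `SketchCStr`, `SketchCString`, and `SketchCString::new` are hypothetical names chosen to avoid clashing with `std::ffi`. `std::os::raw::c_char` stands in for `libc::c_char` so the example needs no external crate. `#[repr(transparent)]` is used instead of the diff's `#[repr(C)]` so that the pointer cast in `deref` rests on a documented layout guarantee.

```rust
use std::ops::Deref;
use std::os::raw::c_char;

// Hypothetical stand-in for the Stage 1 `CStr`: a newtype over a slice.
// Being a DST, it is not `Sized` and can only be used behind a reference.
#[repr(transparent)]
pub struct SketchCStr {
    chars: [c_char],
}

impl SketchCStr {
    // Pointer to the first character, suitable for passing to C.
    pub fn as_ptr(&self) -> *const c_char {
        self.chars.as_ptr()
    }
}

// Hypothetical owning type; `buf` always ends with a single NUL.
pub struct SketchCString {
    buf: Vec<c_char>,
}

impl SketchCString {
    // Rejects input with interior NUL bytes, then appends the terminator.
    pub fn new(bytes: &[u8]) -> Option<SketchCString> {
        if bytes.contains(&0) {
            return None;
        }
        let mut buf: Vec<c_char> = bytes.iter().map(|&b| b as c_char).collect();
        buf.push(0);
        Some(SketchCString { buf })
    }
}

impl Deref for SketchCString {
    type Target = SketchCStr;

    fn deref(&self) -> &SketchCStr {
        let slice: &[c_char] = &self.buf;
        // SAFETY: `SketchCStr` is `repr(transparent)` over `[c_char]`, so the
        // two fat pointers have the same layout and metadata.
        unsafe { &*(slice as *const [c_char] as *const SketchCStr) }
    }
}

fn main() {
    let s = SketchCString::new(b"abc").expect("no interior NUL");
    // `as_ptr` is reached through auto-dereferencing to `SketchCStr`.
    let p = s.as_ptr();
    // Read back the first character and the terminator through the pointer.
    unsafe {
        assert_eq!(*p as u8, b'a');
        assert_eq!(*p.add(3), 0);
    }
}
```

The reference `&SketchCStr` here is a fat pointer that carries a slice length, which is the "weight problem" the Stage 1 heading refers to. Code written against this shape should not rely on that length, in line with the RFC text.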
