Internal symbols array is publicly exposed and unsound #75
The implementation itself (`InternalEncoding`) is already hidden, but the field in the definition of `Encoding` is not. Fixes #75
Thanks for the bug report! I didn't know it was possible to have Note that eventually, when |
@ia0 it may also be worth renaming the field something suitably scary, because it is relatively easy to not notice that I'd also recommend against using doc(hidden) fields when there are unsafe invariants; if possible it would be better to expose a |
Yes, I think the |
Yeah I think a |
Are there additional follow-ups needed/expected here? This issue is closed, but it seems to me (see https://chromium-review.googlesource.com/c/chromium/src/+/6187726/comment/53544bd0_bf28981a/) that there may be some lingering soundness concerns here. Maybe I don't fully understand how the current code prevents violating UTF-8 encoding through the public API? FWIW I like the suggestion above to restrict all the symbols to ASCII (only allowing construction of |
Yes, using However, if you believe this is an issue, I could see a temporary workaround. I could add an
I agree the situation is not ideal and depends on what "public API" means. Is it what the library documents? Or is it whatever you can achieve with safe code? I believe it is the former. But I also agree that if you can make both match, it's better (not only to avoid genuine mistakes, but also to simplify tracking of malicious usage). That's why I don't consider the library unsound. If you access the "private" field (private because it is undocumented and has an explicit type name), then you have library UB by breaking the library contract. Making sure you can't have library UB with safe code is tracked in #72 as described above.
This is not as simple because not all bytes are ASCII in the internal representation. The optional field |
This does involve tweaking the field of the `Encoding`, but that's a public and not `doc(hidden)` field. At the very least it should be `doc(hidden)`.

(at a meta level, this code would probably benefit from some kind of internal `#[repr(transparent)] pub Ascii(u8);` type so that it is very clear that the invariants are being upheld)
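
For illustration, the `Ascii` newtype suggested above might look roughly like the sketch below. This is not the crate's actual code: the method names `new` and `get` and the helper `symbols_to_string` are hypothetical, and the sketch assumes every symbol is ASCII, which the thread notes is not true of the whole internal representation.

```rust
/// Hypothetical newtype: a byte that is guaranteed to be ASCII (< 0x80).
/// The inner field is private, so safe code outside this module can only
/// obtain a value through `Ascii::new`, which checks the invariant.
#[repr(transparent)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Ascii(u8);

impl Ascii {
    /// Returns `None` for bytes outside the ASCII range.
    pub const fn new(byte: u8) -> Option<Ascii> {
        if byte.is_ascii() {
            Some(Ascii(byte))
        } else {
            None
        }
    }

    /// Returns the underlying byte.
    pub const fn get(self) -> u8 {
        self.0
    }
}

/// Hypothetical helper: builds a `String` from ASCII symbols.
/// No `unsafe` is needed here; code that did use an unchecked UTF-8
/// conversion could rely on the type-level guarantee instead of on
/// a documentation-only contract.
pub fn symbols_to_string(symbols: &[Ascii]) -> String {
    symbols.iter().map(|s| char::from(s.get())).collect()
}
```

With a type like this, the invariant lives in the type rather than in a hidden field, so a public field of `Ascii` values could not be filled with non-ASCII bytes from safe code.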