Tracking Issue for ByteStr/ByteString #134915

Open
3 tasks
joshtriplett opened this issue Dec 30, 2024 · 1 comment
Labels
C-tracking-issue (Category: An issue tracking the progress of sth. like the implementation of an RFC)
T-libs-api (Relevant to the library API team, which will review and decide on the PR/issue.)

Comments

@joshtriplett
Member

joshtriplett commented Dec 30, 2024

Feature gate: #![feature(bstr)]

This is a tracking issue for the ByteStr/ByteString types, which represent human-readable strings that are usually, but not always, UTF-8. Unlike &str/String, these types permit non-UTF-8 contents, making them suitable for user input, non-native filenames (as Path only supports native filenames), and other applications that need to round-trip whatever data the user provides.

This was approved in ACP rust-lang/libs-team#502.

Public API

// In core::bstr
#[repr(transparent)]
pub struct ByteStr(pub [u8]);

impl ByteStr {
    pub fn new<B: ?Sized + AsRef<[u8]>>(bytes: &B) -> &Self { ... }
}

impl Debug for ByteStr { ... }
impl Display for ByteStr { ... }
impl Deref for ByteStr { type Target = [u8]; ... }
impl DerefMut for ByteStr { ... }
// Other trait impls from bstr, including From impls

// In alloc::bstr
#[repr(transparent)]
pub struct ByteString(pub Vec<u8>);

impl Debug for ByteString { ... }
impl Display for ByteString { ... }
impl Deref for ByteString { type Target = Vec<u8>; ... }
impl DerefMut for ByteString { ... }
// Other trait impls from bstr, including From impls
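
For readers without a nightly toolchain, the shape of the borrowed type can be approximated on stable Rust. The following is only a sketch: `MyByteStr` is a hypothetical stand-in, not the real `core::bstr::ByteStr`, and its `Display` impl assumes the lossy (replacement character) behavior rather than reproducing the actual implementation.

```rust
use std::fmt;
use std::ops::Deref;

// Hypothetical stand-in for the proposed `ByteStr`; NOT the real type.
#[repr(transparent)]
pub struct MyByteStr(pub [u8]);

impl MyByteStr {
    // Same signature as the `new` listed above: accepts `&str`, `&[u8]`,
    // `&[u8; N]`, and anything else that is `AsRef<[u8]>`.
    pub fn new<B: ?Sized + AsRef<[u8]>>(bytes: &B) -> &Self {
        let slice: &[u8] = bytes.as_ref();
        // SAFETY: `MyByteStr` is `#[repr(transparent)]` over `[u8]`, so a
        // pointer to one is a valid pointer to the other.
        unsafe { &*(slice as *const [u8] as *const MyByteStr) }
    }
}

impl Deref for MyByteStr {
    type Target = [u8];
    fn deref(&self) -> &[u8] {
        &self.0
    }
}

// Assumption: invalid UTF-8 is shown as U+FFFD, via `String::from_utf8_lossy`.
impl fmt::Display for MyByteStr {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str(&String::from_utf8_lossy(&self.0))
    }
}

fn main() {
    // Valid UTF-8 ("café ") followed by one byte that is not valid UTF-8.
    let s = MyByteStr::new(b"caf\xC3\xA9 \xFF");
    println!("{} bytes, displayed as: {s}", s.len());
}
```

Because the wrapper derefs to `[u8]`, slice methods such as `len` are available directly, and the original bytes are never altered by displaying them.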

Steps / History

Unresolved Questions

  • Should we call this BStr/BString, or ByteStr/ByteString? The former will be more familiar to users of the bstr crate in the ecosystem. The latter is more explicit, and avoids potential naming conflicts (making it easier to, for instance, add it to the prelude).
  • Should the Display impl use the Unicode replacement character, or do escaping like the Debug impl?
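
The two `Display` options in the second question can be compared on stable Rust. This is a rough illustration, not the proposed implementation: `render_lossy` and `render_escaped` are hypothetical helper names, and `escape_ascii` only approximates the `Debug` style because it escapes every non-ASCII byte, including bytes that form valid UTF-8.

```rust
/// Replacement-character style: each invalid sequence becomes U+FFFD.
fn render_lossy(bytes: &[u8]) -> String {
    String::from_utf8_lossy(bytes).into_owned()
}

/// Escaping style (approximation): non-ASCII bytes become `\xNN` escapes.
fn render_escaped(bytes: &[u8]) -> String {
    bytes.escape_ascii().to_string()
}

fn main() {
    let input = b"foo\xFFbar";
    println!("lossy:   {}", render_lossy(input));
    println!("escaped: {}", render_escaped(input));
}
```

One practical difference: the lossy rendering maps distinct invalid inputs (for example `\xFF` and `\xFE`) to the same output, while the escaped rendering keeps them distinguishable at the cost of being less readable.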

Footnotes

  1. https://std-dev-guide.rust-lang.org/feature-lifecycle/stabilization.html

@joshtriplett joshtriplett added C-tracking-issue Category: An issue tracking the progress of sth. like the implementation of an RFC T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. labels Dec 30, 2024
@joshtriplett joshtriplett changed the title Tracking Issue for BStr/BString Tracking Issue for ByteStr/ByteString Dec 30, 2024
@joshtriplett
Member Author

In the course of implementing this, I'm addressing BurntSushi/bstr#190: both the ByteStr and ByteString types will implement Index and IndexMut.
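
As a rough picture of what direct `Index`/`IndexMut` impls on the owned type could look like, here is a stable-Rust sketch. `MyByteString` is a hypothetical stand-in, and the choice of index types and `Output` types is an assumption; this thread does not state what the real impls return.

```rust
use std::ops::{Index, IndexMut, Range};

// Hypothetical stand-in for the proposed `ByteString`; NOT the real type.
pub struct MyByteString(pub Vec<u8>);

// Single-byte access: `s[i]`.
impl Index<usize> for MyByteString {
    type Output = u8;
    fn index(&self, i: usize) -> &u8 {
        &self.0[i]
    }
}

// Single-byte mutation: `s[i] = b'x'`.
impl IndexMut<usize> for MyByteString {
    fn index_mut(&mut self, i: usize) -> &mut u8 {
        &mut self.0[i]
    }
}

// Range access: `&s[a..b]`. Returning `[u8]` here is an assumption.
impl Index<Range<usize>> for MyByteString {
    type Output = [u8];
    fn index(&self, r: Range<usize>) -> &[u8] {
        &self.0[r]
    }
}

fn main() {
    let mut s = MyByteString(b"hello".to_vec());
    s[0] = b'J';
    assert_eq!(&s[0..5], &b"Jello"[..]);
}
```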

joshtriplett added a commit to joshtriplett/rust that referenced this issue Jan 3, 2025
Approved ACP: rust-lang/libs-team#502
Tracking issue: rust-lang#134915

These types represent human-readable strings that are conventionally,
but not always, UTF-8. The `Debug` impl prints non-UTF-8 bytes using
escape sequences, and the `Display` impl uses the Unicode replacement
character.

This is a minimal implementation of these types and associated trait
impls. It does not add any helper methods to other types such as `[u8]`
or `Vec<u8>`.

I've omitted a few implementations of `AsRef`, `AsMut`, and `Borrow`,
when those would be the second implementation for a type (counting the
`T` impl), to avoid potential inference failures. We can attempt to add
more impls later in standalone commits, and run them through crater.

In addition to the `bstr` feature, I've added a `bstr_internals` feature
for APIs provided by `core` for use by `alloc` but not currently
intended for stabilization.

This API and its implementation are based *heavily* on the `bstr` crate
by Andrew Gallant (@BurntSushi).
joshtriplett added a commit to joshtriplett/rust that referenced this issue Jan 4, 2025