RFC: Introduce raw_size() and make size equal to length. #69
Description
First of all, thanks a lot for providing this project! It makes it so much easier to work with UTF-8 data.
I'm aware that this might be out of scope for this project, so I figured I'd just ask what you all think. When porting my code from `std::string` to `tiny_utf8::string`, I ran into various problems caused by the mismatch between `size()` and `length()`.
E.g., my code uses templates with `std::size(...)` to work on arbitrary data types. That doesn't work on a `tiny_utf8::string`, though, since `size()` is the raw byte size, but `operator[]` expects a codepoint index. It would be nice if tiny_utf8 were consistent with other STL containers.
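To make the mismatch concrete, here is a minimal sketch. `Utf8Like` is a hypothetical stand-in I wrote for illustration, not tiny_utf8's actual class; it only mimics the behavior described above (`size()` in bytes, `length()` in codepoints, `operator[]` indexed by codepoint). `count_indexable` is likewise a made-up example of generic code.

```cpp
#include <cstddef>
#include <iterator>   // std::size
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in, NOT tiny_utf8's implementation.
// size() counts bytes, length() counts codepoints,
// operator[] takes a codepoint index (bounds-checked here for the demo).
class Utf8Like {
public:
    Utf8Like(std::u32string codepoints, std::size_t byte_count)
        : cps_(std::move(codepoints)), bytes_(byte_count) {}

    std::size_t size() const { return bytes_; }
    std::size_t length() const { return cps_.size(); }
    char32_t operator[](std::size_t i) const { return cps_.at(i); }

private:
    std::u32string cps_;
    std::size_t bytes_;
};

// Generic code that assumes [0, std::size(c)) is the valid index range
// of operator[], which holds for the standard containers.
template <typename Container>
std::size_t count_indexable(const Container& c) {
    std::size_t n = 0;
    for (std::size_t i = 0; i < std::size(c); ++i) {
        (void)c[i];  // touch the element
        ++n;
    }
    return n;
}
```

With a `std::vector<int>` this works as expected. With `Utf8Like(U"h\u00e9", 3)` (two codepoints, three UTF-8 bytes) the loop runs one index too far, and the bounds-checked `operator[]` of this sketch throws `std::out_of_range`.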
Various other functions also made use of both `size()` and `length()`. Yes, this can also be fixed on my side, but (1) it's difficult to get this done correctly in a large code base, and (2) the library then no longer works as a "quick drop-in replacement" as advertised in the README.
So, what I'm considering is adding a new `raw_size()` (similar to `raw_at`, raw iterators, ...) that returns the byte size, and changing the default behavior of `size()` to match `length()`. This is obviously not a backwards-compatible change, but (1) there have also been other non-backwards-compatible changes, and (2) there could still be a define parameter to switch between both behaviors.
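Roughly, the idea could look like the sketch below. Everything in it is an assumption for illustration: the class `Utf8LikeProposed` is a stand-in rather than tiny_utf8's code, and the macro name `UTF8LIKE_SIZE_IS_BYTES` is made up; the real switch, if any, would be up to the maintainers.

```cpp
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical opt-out switch: define it before this code to keep
// the old byte-based size().
// #define UTF8LIKE_SIZE_IS_BYTES

// Hypothetical stand-in, NOT tiny_utf8's implementation.
class Utf8LikeProposed {
public:
    Utf8LikeProposed(std::u32string codepoints, std::size_t byte_count)
        : cps_(std::move(codepoints)), bytes_(byte_count) {}

    // New accessor: always the byte size, in line with the other raw_* members.
    std::size_t raw_size() const { return bytes_; }

    // Number of codepoints, unchanged.
    std::size_t length() const { return cps_.size(); }

#ifdef UTF8LIKE_SIZE_IS_BYTES
    // Old behavior: size() is the byte size.
    std::size_t size() const { return raw_size(); }
#else
    // Proposed default: size() matches length(), as in std::string.
    std::size_t size() const { return length(); }
#endif

private:
    std::u32string cps_;
    std::size_t bytes_;
};
```

With the macro undefined, `size()` and `length()` agree, so generic code built on `std::size(...)` and `operator[]` stays in range, while `raw_size()` still exposes the byte count.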
What do you think? If it's out of scope, I'll come up with a different solution. :)