
@cto.af/wtf8

Encode and decode WTF-8 with a similar API to TextEncoder and TextDecoder.

The goal is to be able to parse and generate bytestreams that can store any JavaScript string, including ones that have unpaired surrogates.
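As a quick illustration of why this matters, the sketch below uses only the built-in TextEncoder and TextDecoder (not this package) to show that standard UTF-8 encoding cannot round-trip a string containing an unpaired surrogate. The variable names are illustrative.

```javascript
// A lone high surrogate: a legal JavaScript string, but not well-formed UTF-16.
const lone = '\ud800';

// The built-in TextEncoder replaces the unpaired surrogate with U+FFFD,
// whose UTF-8 encoding is the three bytes EF BF BD.
const utf8Bytes = new TextEncoder().encode(lone);
const roundTripped = new TextDecoder().decode(utf8Bytes);

console.log(Array.from(utf8Bytes, b => b.toString(16))); // [ 'ef', 'bf', 'bd' ]
console.log(roundTripped === lone); // false: the original string is lost
console.log(roundTripped === '\ufffd'); // true
```

The original code unit cannot be recovered from those bytes, which is the gap WTF-8 is meant to fill.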

Installation

npm install @cto.af/wtf8

API

Full API documentation is available.

Example:

import {Wtf8Decoder, Wtf8Encoder} from '@cto.af/wtf8';

const bytes = new Wtf8Encoder().encode('\ud800');
const string = new Wtf8Decoder().decode(bytes); // '\ud800'
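To make the encoded bytes concrete, here is a minimal hand-rolled sketch of the encoding side of WTF-8, assuming well-paired surrogates are combined into one code point and unpaired surrogates are written as ordinary three-byte sequences. The helper names `wtf8EncodeSketch` and `hex` are hypothetical; this is not the package's implementation, which may differ in every detail.

```javascript
// Hypothetical helper for illustration only; not part of @cto.af/wtf8.
function wtf8EncodeSketch(str) {
  const out = [];
  // String iteration yields whole code points: a valid surrogate pair arrives
  // as one supplementary code point, an unpaired surrogate arrives on its own.
  for (const ch of str) {
    const cp = ch.codePointAt(0);
    if (cp < 0x80) {
      out.push(cp);
    } else if (cp < 0x800) {
      out.push(0xc0 | (cp >> 6), 0x80 | (cp & 0x3f));
    } else if (cp < 0x10000) {
      // This range includes U+D800..U+DFFF, which strict UTF-8 would reject.
      out.push(0xe0 | (cp >> 12), 0x80 | ((cp >> 6) & 0x3f), 0x80 | (cp & 0x3f));
    } else {
      out.push(
        0xf0 | (cp >> 18),
        0x80 | ((cp >> 12) & 0x3f),
        0x80 | ((cp >> 6) & 0x3f),
        0x80 | (cp & 0x3f)
      );
    }
  }
  return Uint8Array.from(out);
}

// Hypothetical formatting helper: space-separated lowercase hex.
const hex = arr => Array.from(arr, b => b.toString(16).padStart(2, '0')).join(' ');

console.log(hex(wtf8EncodeSketch('\ud800'))); // ed a0 80
console.log(hex(wtf8EncodeSketch('\ud83d\ude00'))); // f0 9f 98 80 (same as UTF-8)
```

For strings without unpaired surrogates the output matches ordinary UTF-8; only the lone-surrogate case differs.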

Web streams (in the style of the WHATWG Streams Standard) are also provided: Wtf8EncoderStream and Wtf8DecoderStream.

Notes

This implementation uses a few of the tricks from the paper Validating UTF-8 In Less Than One Instruction Per Byte, but not all of them. Moving data in and out of WASM in order to use SIMD might be slightly faster. However, the decoder does more than validate: it actually decodes, and it generates replacement characters when fatal is false. Staying in JS therefore seems good enough for the moment.
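For readers unfamiliar with the fatal option, the sketch below shows how it behaves on the built-in TextDecoder, which this package's API is modeled on. Whether Wtf8Decoder accepts exactly the same options object is an assumption here; check the API documentation before relying on it.

```javascript
// 'a', a byte that can never appear in UTF-8, then 'b'.
const invalid = Uint8Array.of(0x61, 0xff, 0x62);

// fatal: false (the default): the bad byte becomes U+FFFD and decoding continues.
const lenient = new TextDecoder('utf-8', {fatal: false}).decode(invalid);
console.log(lenient === 'a\ufffdb'); // true

// fatal: true: the same input makes decode() throw instead of repairing it.
let threw = false;
try {
  new TextDecoder('utf-8', {fatal: true}).decode(invalid);
} catch (err) {
  threw = true;
}
console.log(threw); // true
```

Producing the replaced output in the non-fatal case is what makes this a decoding job rather than a pure validation pass.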

