Releases: ashvardanian/StringTape
Release v2.4.1
Release v2.4.0
v2.3: Complete Standard Rust Traits 🦀
This release transforms StringTape into a fully-featured collection library with comprehensive standard trait implementations. All tape and view types now implement PartialEq, Eq, PartialOrd, Ord, and Hash, enabling their use in HashMap, HashSet, and sorted collections with lexicographic comparison semantics. The new DoubleEndedIterator implementations unlock reverse iteration, rfind(), and rposition() operations across all iterator types. Additional utility methods include contains() for membership testing, shrink_to_fit() for memory optimization, and first(), last(), pop() for Vec-like convenience. These additions maintain the library's zero-allocation guarantees and full no_std compatibility.
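To illustrate what these trait implementations make possible, here is a minimal sketch built on a toy type rather than on the crate itself. `ToyTape`, its methods, and `distinct_count` are hypothetical names invented for this example, and the real StringTape types may implement the traits differently. The sketch only assumes the semantics described above: equality, ordering, and hashing are defined item by item, and the iterator is double-ended.

```rust
use std::cmp::Ordering;
use std::collections::HashSet;
use std::hash::{Hash, Hasher};

/// Toy stand-in for a tape (hypothetical, not the StringTape types):
/// one contiguous buffer plus cumulative offsets, starting at 0.
#[derive(Debug, Clone)]
pub struct ToyTape {
    data: String,
    offsets: Vec<usize>,
}

impl ToyTape {
    pub fn new() -> Self {
        ToyTape { data: String::new(), offsets: vec![0] }
    }

    pub fn push(&mut self, s: &str) {
        self.data.push_str(s);
        self.offsets.push(self.data.len());
    }

    pub fn len(&self) -> usize {
        self.offsets.len() - 1
    }

    /// Adjacent offset pairs delimit items; `windows` is double-ended,
    /// so reverse iteration needs no extra bookkeeping.
    pub fn iter(&self) -> impl DoubleEndedIterator<Item = &str> + '_ {
        self.offsets.windows(2).map(move |w| &self.data[w[0]..w[1]])
    }
}

// Equality and ordering compare item sequences lexicographically.
impl PartialEq for ToyTape {
    fn eq(&self, other: &Self) -> bool {
        self.iter().eq(other.iter())
    }
}
impl Eq for ToyTape {}

impl Ord for ToyTape {
    fn cmp(&self, other: &Self) -> Ordering {
        self.iter().cmp(other.iter())
    }
}
impl PartialOrd for ToyTape {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

// Hash item by item so that equal tapes always hash equally.
impl Hash for ToyTape {
    fn hash<H: Hasher>(&self, state: &mut H) {
        self.len().hash(state);
        for item in self.iter() {
            item.hash(state);
        }
    }
}

/// Counts distinct tapes by placing references to them in a HashSet.
pub fn distinct_count(tapes: &[ToyTape]) -> usize {
    tapes.iter().collect::<HashSet<&ToyTape>>().len()
}
```

Comparing item by item, rather than comparing the raw buffers, keeps a tape holding "apple" and "pie" distinct from one holding the single string "applepie", even though both store the same bytes.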
The memory layout documentation has been expanded to clearly distinguish between the Tape and Cows architectures, with concrete performance trade-offs. Tape classes use Apache Arrow's cumulative offset layout for cache-friendly sequential access and zero-copy interop. Cows classes employ packed (offset, length) entries with #[repr(C, packed(1))] to minimize memory footprint for large datasets, achieving a 10x memory reduction compared to Vec<String>. The documentation now includes guidance on choosing between these approaches based on workload characteristics, particularly for high-throughput applications where unaligned access patterns may favor Tape classes despite their higher memory usage.
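The difference between the two layouts can be sketched with plain standard-library code. The names `PackedEntry`, `AlignedEntry`, `tape_get`, and `cows_get` are hypothetical, and the field widths (a 32-bit offset and a 16-bit length) are an assumption chosen for illustration. The crate's own entry types may use other widths.

```rust
use std::mem::size_of;

/// Cows-style entry: an explicit (offset, length) pair with no padding.
/// Field widths are an assumption for this sketch.
#[repr(C, packed(1))]
#[derive(Clone, Copy)]
pub struct PackedEntry {
    pub offset: u32,
    pub length: u16,
}

/// The same fields without packing, shown only for size comparison.
#[repr(C)]
#[derive(Clone, Copy)]
pub struct AlignedEntry {
    pub offset: u32,
    pub length: u16,
}

/// Tape-style lookup: item i spans offsets[i]..offsets[i + 1],
/// so N items need N + 1 cumulative offsets and must be stored in order.
pub fn tape_get<'a>(data: &'a [u8], offsets: &[u32], i: usize) -> &'a [u8] {
    &data[offsets[i] as usize..offsets[i + 1] as usize]
}

/// Cows-style lookup: each entry can point anywhere in the parent buffer.
pub fn cows_get<'a>(data: &'a [u8], entries: &[PackedEntry], i: usize) -> &'a [u8] {
    // Copy the entry out first: references to packed fields are not allowed,
    // but reading the fields by value is fine.
    let entry = entries[i];
    let start = entry.offset as usize;
    let len = entry.length as usize;
    &data[start..start + len]
}

/// Returns (packed size, aligned size) in bytes for the two entry types.
pub fn entry_sizes() -> (usize, usize) {
    (size_of::<PackedEntry>(), size_of::<AlignedEntry>())
}
```

With these widths the packed entry takes 6 bytes instead of 8, at the cost of potentially unaligned loads when entries sit back to back in an array.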
Release v2.2.3
Release v2.2.2
Release v2.2.1
Release v2.2.0
Release v2.1.1
v2.1: Packing Cows 🐮
The StringTape crate now also provides "Cows" classes for large arrays of references pointing into parent Copy-On-Write objects, allowing for the handling of larger datasets. Why? In addition to storing 3 pointers for every String instance, you often need another 2-4 pointers per entry in your heap. Assuming most English words are under 8 bytes long, storing an extra 6 pointers on a 64-bit machine results in a 7x memory usage amplification. So choose your data structures wisely!
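The 7x estimate follows from simple arithmetic, which can be sketched as below. The function names are hypothetical, and the 8-byte payload and 6 extra pointers are the assumptions stated in the paragraph above, not measurements.

```rust
use std::mem::size_of;

/// Bytes consumed per word: the payload plus pointer-sized bookkeeping.
pub fn bytes_per_word(payload_bytes: usize, extra_pointers: usize, pointer_bytes: usize) -> usize {
    payload_bytes + extra_pointers * pointer_bytes
}

/// Memory amplification relative to the raw payload.
pub fn amplification(payload_bytes: usize, extra_pointers: usize, pointer_bytes: usize) -> usize {
    bytes_per_word(payload_bytes, extra_pointers, pointer_bytes) / payload_bytes
}

/// A `String` is three pointer-sized words: pointer, capacity, and length.
pub fn string_header_words() -> usize {
    size_of::<String>() / size_of::<usize>()
}
```

For an 8-byte word with 6 extra pointers of 8 bytes each, `bytes_per_word(8, 6, 8)` gives 56 bytes, and `amplification(8, 6, 8)` gives 7.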
let doc = fs::read_to_string("enwik9.txt")?; // 1.0 GB
let words = doc.split_whitespace(); // ~ 160 M words
let buffers = words.clone().map(str::as_bytes); // clone: each iterator below is consumed once
let _ = Vec::<String>::from_iter(words.clone().map(String::from)); // + 7.1 GB copied ❌
let _ = CharsTapeAuto::from_iter(words.clone()); // + 1.3 GB copied ✅
let _ = Vec::<&[u8]>::from_iter(buffers.clone()); // + 1.9 GB copy-less ⚠️
let _ = BytesCowsAuto::from_iter_and_data( // + 0.7 GB copy-less ✅
    buffers,
    Cow::Borrowed(doc.as_bytes()),
);
Minor
- Add: Slice classes (3666059)