Releases: ashvardanian/StringTape

Release v2.4.1

@github-actions github-actions released this 21 Nov 00:44

Release: v2.4.1 [skip ci]

Patch

  • Fix: Ingesting a single 4 GB+ string (9216161)

Release v2.4.0

@github-actions github-actions released this 12 Oct 21:29

Release: v2.4.0 [skip ci]

Minor

  • Add: get_unchecked methods for faster iteration (71307bd)
  • Add: Clone-able bidirectional iterators (d35a887)
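To illustrate what these two additions buy you, here is a minimal std-only sketch. `ToyTape` and `ToyIter` are hypothetical stand-ins written for this note, not StringTape's actual types or signatures: the unchecked accessor skips bounds checks when the caller already knows the index is valid, and a `Clone`-able double-ended iterator lets you save a position and walk from either end.

```rust
/// Hypothetical toy tape (NOT StringTape's real type): one byte buffer
/// plus cumulative offsets, with `len() + 1` offset entries.
struct ToyTape {
    data: Vec<u8>,
    offsets: Vec<usize>,
}

impl ToyTape {
    fn from_strs(items: &[&str]) -> Self {
        let mut data = Vec::new();
        let mut offsets = vec![0usize];
        for item in items {
            data.extend_from_slice(item.as_bytes());
            offsets.push(data.len());
        }
        ToyTape { data, offsets }
    }

    fn len(&self) -> usize {
        self.offsets.len() - 1
    }

    /// Returns item `i` without bounds checks.
    ///
    /// # Safety
    /// `i` must be less than `self.len()`.
    unsafe fn get_unchecked(&self, i: usize) -> &[u8] {
        unsafe {
            let start = *self.offsets.get_unchecked(i);
            let end = *self.offsets.get_unchecked(i + 1);
            self.data.get_unchecked(start..end)
        }
    }

    fn iter(&self) -> ToyIter<'_> {
        ToyIter { tape: self, front: 0, back: self.len() }
    }
}

/// Cloning copies only a reference and two indices.
#[derive(Clone)]
struct ToyIter<'a> {
    tape: &'a ToyTape,
    front: usize,
    back: usize,
}

impl<'a> Iterator for ToyIter<'a> {
    type Item = &'a [u8];

    fn next(&mut self) -> Option<Self::Item> {
        if self.front == self.back {
            return None;
        }
        // SAFETY: front < back <= len, so the index is in range.
        let item = unsafe { self.tape.get_unchecked(self.front) };
        self.front += 1;
        Some(item)
    }
}

impl<'a> DoubleEndedIterator for ToyIter<'a> {
    fn next_back(&mut self) -> Option<Self::Item> {
        if self.front == self.back {
            return None;
        }
        self.back -= 1;
        // SAFETY: front <= back < len after the decrement.
        Some(unsafe { self.tape.get_unchecked(self.back) })
    }
}

fn main() {
    let tape = ToyTape::from_strs(&["a", "bc", "def"]);
    let mut it = tape.iter();
    it.next(); // consume "a"
    let saved = it.clone(); // remember this position
    println!("from the back: {:?}", it.next_back());
    println!("replayed in reverse: {:?}", saved.rev().collect::<Vec<_>>());
}
```

The iterator itself relies on the unchecked accessor, since its `front`/`back` invariant already guarantees valid indices.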

v2.3: Complete Standard Rust Traits 🦀

@github-actions github-actions released this 07 Oct 15:11

This release transforms StringTape into a fully-featured collection library with comprehensive standard trait implementations. All tape and view types now implement PartialEq, Eq, PartialOrd, Ord, and Hash, enabling their use in HashMap, HashSet, and sorted collections with lexicographic comparison semantics. The new DoubleEndedIterator implementations unlock reverse iteration, rfind(), and rposition() operations across all iterator types. Additional utility methods include contains() for membership testing, shrink_to_fit() for memory optimization, and first(), last(), pop() for Vec-like convenience. These additions maintain the library's zero-allocation guarantees and full no_std compatibility.
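As a rough illustration of what these trait implementations enable, the sketch below uses a hypothetical `ToyView` wrapper around a byte slice (an assumption for this example, not a StringTape type). Deriving the standard comparison and hashing traits is enough to use it as a `HashSet` or `BTreeSet` key, and any double-ended iterator over such views supports `rfind()` and `rposition()`.

```rust
use std::collections::{BTreeSet, HashSet};

/// Hypothetical view type (NOT part of StringTape): a borrowed byte slice
/// that derives the standard comparison and hashing traits.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Hash)]
struct ToyView<'a>(&'a [u8]);

fn views<'a>(words: &[&'a str]) -> Vec<ToyView<'a>> {
    words.iter().map(|&w| ToyView(w.as_bytes())).collect()
}

fn main() {
    let items = views(&["pear", "apple", "fig", "apple"]);

    // Eq + Hash: deduplicate through a HashSet.
    let unique: HashSet<ToyView> = items.iter().copied().collect();
    println!("{} unique entries", unique.len());

    // Ord: a BTreeSet keeps entries in lexicographic byte order.
    let sorted: BTreeSet<ToyView> = items.iter().copied().collect();
    println!("smallest: {:?}", sorted.first());

    // DoubleEndedIterator: search from the back.
    let last_apple = items.iter().rposition(|v| v.0 == b"apple");
    let last_short = items.iter().rfind(|v| v.0.len() == 3);
    println!("{:?} {:?}", last_apple, last_short);
}
```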

The memory layout documentation has been expanded to clearly distinguish between Tape and Cows architectures, with concrete performance trade-offs. Tape classes use Apache Arrow's cumulative offset layout for cache-friendly sequential access and zero-copy interop, while Cows classes employ packed (offset, length) entries with #[repr(C, packed(1))] to minimize memory footprint for large datasets, achieving a 10x memory reduction compared to Vec<String>. The documentation now includes guidance on choosing between these approaches based on workload characteristics, particularly for high-throughput applications, where the unaligned accesses of packed entries may favor Tape classes despite their higher memory usage.
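The two layouts can be sketched with std-only Rust. Everything below is a simplified assumption made for illustration: `OffsetTape`, `PackedEntry`, `PaddedEntry`, and the `u32`/`u16` field widths are hypothetical and do not reflect StringTape's actual definitions. The point is only to show cumulative offsets on one side and padding-free (offset, length) entries on the other.

```rust
use std::mem::size_of;

/// Arrow-style layout: one contiguous buffer plus cumulative offsets.
/// Item `i` spans `offsets[i]..offsets[i + 1]`, so N items need N + 1 offsets.
struct OffsetTape {
    data: Vec<u8>,
    offsets: Vec<u32>,
}

impl OffsetTape {
    fn from_strs(items: &[&str]) -> Self {
        let mut data = Vec::new();
        let mut offsets = vec![0u32];
        for item in items {
            data.extend_from_slice(item.as_bytes());
            offsets.push(data.len() as u32);
        }
        OffsetTape { data, offsets }
    }

    fn get(&self, i: usize) -> Option<&[u8]> {
        let start = *self.offsets.get(i)? as usize;
        let end = *self.offsets.get(i + 1)? as usize;
        Some(&self.data[start..end])
    }
}

/// Packed (offset, length) entry pointing into a parent buffer.
/// `packed(1)` removes padding, so fields may sit at unaligned addresses.
#[repr(C, packed(1))]
#[derive(Clone, Copy)]
struct PackedEntry {
    offset: u32,
    length: u16,
}

/// The same fields without packing, for a size comparison.
#[repr(C)]
#[derive(Clone, Copy)]
struct PaddedEntry {
    offset: u32,
    length: u16,
}

fn packed_get(parent: &[u8], entry: PackedEntry) -> &[u8] {
    // Copy the fields out by value: references to packed fields are not allowed.
    let start = entry.offset as usize;
    let len = entry.length as usize;
    &parent[start..start + len]
}

fn main() {
    let tape = OffsetTape::from_strs(&["hello", "", "world"]);
    println!("tape[2] = {:?}", tape.get(2));

    let parent = b"hello world";
    let entry = PackedEntry { offset: 6, length: 5 };
    println!("entry -> {:?}", packed_get(parent, entry));

    println!(
        "packed entry: {} bytes, padded entry: {} bytes",
        size_of::<PackedEntry>(),
        size_of::<PaddedEntry>()
    );
}
```

Unlike the tape, packed entries need not be stored in buffer order, but each field read may be unaligned, which is the trade-off described above.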

Release v2.2.3

@github-actions github-actions released this 06 Oct 23:13

Release: v2.2.3 [skip ci]

Patch

  • Improve: IntoIter for tapes (1259ad5)

Release v2.2.2

@github-actions github-actions released this 06 Oct 21:59

Release: v2.2.2 [skip ci]

Patch

  • Improve: Expose parent() buffers (a29de96)

Release v2.2.1

@github-actions github-actions released this 06 Oct 21:42

Release: v2.2.1 [skip ci]

Patch

  • Improve: BytesCowsAuto.iter() (4403f6d)

Release v2.2.0

@github-actions github-actions released this 06 Oct 21:26

Release: v2.2.0 [skip ci]

Minor

  • Add: Casting between chars/bytes CoWs (00fc721)

Release v2.1.1

@github-actions github-actions released this 06 Oct 15:43

Release: v2.1.1 [skip ci]

Patch

v2.1: Packing Cows 🐮

@github-actions github-actions released this 06 Oct 15:06

The StringTape crate now also provides "Cows" classes for large arrays of references pointing into parent Copy-On-Write objects, allowing for the handling of larger datasets. Why? In addition to the 3 pointer-sized words stored for every String instance (pointer, capacity, length), you often need another 2-4 pointers per entry in your heap. Assuming most English words are under 8 bytes long, storing an extra 6 pointers (48 bytes) on a 64-bit machine results in a 7x memory usage amplification. So choose your data structures wisely!

use std::{borrow::Cow, fs};

let doc = fs::read_to_string("enwik9.txt")?;    // 1.0 GB
let words = doc.split_whitespace();             // ~ 160 M words
let buffers = words.clone().map(str::as_bytes); // clone: `words` is reused below

let _ = Vec::<String>::from_iter(words.clone().map(String::from)); // + 7.1 GB copied ❌
let _ = CharsTapeAuto::from_iter(words.clone());                   // + 1.3 GB copied ✅
let _ = Vec::<&[u8]>::from_iter(buffers.clone());                  // + 1.9 GB copy-less ⚠️
let _ = BytesCowsAuto::from_iter_and_data(                         // + 0.7 GB copy-less ✅
    buffers,
    Cow::Borrowed(doc.as_bytes()),
);
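The 7x estimate can be reproduced with back-of-the-envelope arithmetic. The helper names below are hypothetical, and the choice of 3 heap words per entry is an assumption taken from the midpoint of the "2-4 pointers" range quoted above; real allocator overhead varies.

```rust
use std::mem::size_of;

/// Number of pointer-sized words in a `String` handle
/// (pointer, capacity, length in current std implementations).
fn string_words() -> usize {
    size_of::<String>() / size_of::<usize>()
}

/// Total footprint divided by payload size, for a payload of `payload_bytes`
/// carrying `overhead_words` words of `word_bytes` bytes each.
fn amplification(payload_bytes: usize, overhead_words: usize, word_bytes: usize) -> f64 {
    let total = payload_bytes + overhead_words * word_bytes;
    total as f64 / payload_bytes as f64
}

fn main() {
    // Assumption: 3 extra heap words per entry, the midpoint of "2-4 pointers".
    let heap_words = 3;
    let overhead_words = string_words() + heap_words;

    // An 8-byte word with 6 words of 8-byte overhead: (8 + 48) / 8.
    let factor = amplification(8, overhead_words, 8);
    println!("{} overhead words -> {:.1}x amplification", overhead_words, factor);
}
```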

Minor

Patch

  • Improve: Funnier naming - Cows FTW 🐮 (78d500e)
  • Improve: Allocators for reorderable classes (76c906d)
  • Docs: Cleaner explanations (86b4ed2)

Release v2.0.3

@github-actions github-actions released this 11 Sep 10:59

Release: v2.0.3 [skip ci]

Patch