Release v1.1.0

@github-actions github-actions released this 29 Dec 07:05
7456f88

We are excited to announce the release of RecIS v1.1.0. This version marks a significant milestone with the introduction of Model Bank 1.0, native ROCm support, and substantial performance optimizations for large-scale embedding tables.

🌟 Key Highlights

| Category | Description |
|---|---|
| 🏆 Framework | Model Bank 1.0 officially arrives; new Negative Sampler and RTP Exporter support. |
| ⚡ Performance | Introduction of auto-resizing hash tables and fused AdamW TF CUDA operations. |
| 🌐 Compatibility | Expanded hardware support for AMD ROCm; fixed kernel launches on non-NVIDIA devices. |
| 🛡️ Robustness | Improved multi-node synchronization and robust handling of empty-tensor edge cases. |

📝 Detailed Changelog

Bug Fixes

  • checkpoint: fix mos version format, update to use openlm api (854bbb3)
  • checkpoint: refine torch_rank_weights_embs_table_multi_shard.json format (d5e7a5c)
  • checkpoint: work around save bug, deal with xpfs model path (ae99728)
  • embedding: fix empty kernel launch in non-nvidia device (2e310d0)
  • embedding: fix insert when size == 1 (7702c9e)
  • framework: add an option for algo_config for export (0ad4c3f)
  • framework: fix bugs of invalid index, grad accumulation; add clear child feat (1e7acf9)
  • framework: fix eval in trainer (676a053)
  • framework: fix fg && exporter bugs (3964ce2)
  • framework: fix load extra info not in ckpt (a64cd00)
  • framework: fix loss backward (7d9a41b)
  • framework: fix some bugs in model bank (be196db)
  • framework: fix window io failover (cde3049)
  • framework: reset io state when starting another epoch (f918f24)
  • io: fix batch_convert row_splits when the dataset reads empty data (44661ab)
  • io: fix None data on window switch (e788b4d)
  • io: fix odps import bug (7c13f09)
  • io: use openstorage get_table_size directly (d5c0952)
  • ops: fix bug in fast atomic operations (fea8d47)
  • ops: fix dense_to_ragged op when check_invalid=False (#14) (300a77b)
  • ops: fix edge cases for empty tensors and improve CUDA kernel handling (794be12)
  • ops: fix emb segment reduce mean op (3f82b9c)
  • ops: handle empty tensor inputs in ragged ops (a39fc2a)
  • optimizer: step add 1 should be in-place (cdb3632)
  • serialize: fix multi-node file sync bug (822af49)
  • serialize: fix tensor loading bug (e25eee4)
  • serialize: fix bug when loading by oname (e5ca3d7)
  • serialize: fix bug when tensor num < parallel num (a02aded)
  • tools: fix torch_fx_tool string format (1d426f8)
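
Several of the fixes above concern empty inputs to ragged and segment-reduce ops. The snippet below is a minimal pure-Python sketch of that edge case, not RecIS code: `segment_mean` is a hypothetical helper, and returning `0.0` for an empty row is an assumption made for illustration rather than the documented behavior of the RecIS op.

```python
from typing import List, Sequence


def segment_mean(values: Sequence[float], row_splits: Sequence[int]) -> List[float]:
    """Mean of each ragged row, where row i is values[row_splits[i]:row_splits[i + 1]].

    Hypothetical helper for illustration only; not part of the RecIS API.
    """
    if len(row_splits) == 0:
        raise ValueError("row_splits must contain at least one offset")
    if row_splits[0] != 0 or row_splits[-1] != len(values):
        raise ValueError("row_splits must start at 0 and end at len(values)")

    means: List[float] = []
    for start, end in zip(row_splits[:-1], row_splits[1:]):
        if end < start:
            raise ValueError("row_splits must be non-decreasing")
        count = end - start
        # Assumption: an empty row yields 0.0 instead of dividing by zero.
        means.append(sum(values[start:end]) / count if count else 0.0)
    return means
```

With `row_splits == [0]` the input has zero rows and the result is an empty list; with `row_splits == [0, 0]` it has one empty row. Both cases need to pass through without launching work on zero elements or dividing by zero.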

Features

  • checkpoint: add label for ckpt (5436b5b)
  • checkpoint: load dense optimizer by named_parameters (a07dbaf)
  • docs: add model bank docs (ff0d23e)
  • embedding: add monitor for ids/embs (2f268eb)
  • embedding: expose methods to retrieve child ids and embs from the coalesced hashtable; fix clear method of hashtable (b5de207)
  • framework,checkpoint: change checkpointmanager to save/load hooks (eb3b441)
  • framework: [internal] add negative sampler (8c21517)
  • framework: add exporter for rtp (b8af849)
  • framework: add skip option in model bank (00828ce)
  • framework: add some utility to RaggedTensor (78eca0a)
  • framework: add window_iter for window pipline (87886a0)
  • framework: collect eval result for hooks and fix after_data bug (81d3723)
  • framework: enable amp by options (db5bbe7)
  • framework: impl-independent monitor (24a1631)
  • framework: model bank 1.0 (488672b)
  • framework: support filter hashtable for saver, update hook for window, fix metric (01eb2ae)
  • io: add adaptor filter by scene (c3e6738)
  • io: add new dedup option for neg sampler (61b2cb7)
  • io: add standard fg for input features (2deedff)
  • ops: add fused AdamW TF CUDA operation (05dba24)
  • ops: add parse_sample_id ops (78674cd)
  • packaging: support ROCm (7a626d3)
  • serialize: update load metric interface (66b085d)
  • update column-io to support ROCm device (7907158)

Performance Improvements

  • embedding: use auto-resizing hash table (2f53f53)
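
To illustrate the general idea behind an auto-resizing hash table for embedding ids, here is a minimal CPU-side sketch. It is not the RecIS implementation: the class name `AutoResizingIdTable`, the linear-probing scheme, the doubling policy, and the 0.75 load factor are all assumptions chosen for brevity.

```python
from typing import List, Optional


class AutoResizingIdTable:
    """Open-addressing table that maps integer ids to dense embedding row indices.

    Hypothetical sketch; names and policies are illustrative assumptions.
    """

    def __init__(self, capacity: int = 4, max_load: float = 0.75) -> None:
        if capacity < 1 or not 0.0 < max_load < 1.0:
            raise ValueError("capacity must be >= 1 and 0 < max_load < 1")
        self._keys: List[Optional[int]] = [None] * capacity
        self._rows: List[int] = [0] * capacity
        self._size = 0
        self._max_load = max_load

    def __len__(self) -> int:
        return self._size

    @property
    def capacity(self) -> int:
        return len(self._keys)

    def _probe(self, key: int) -> int:
        # Linear probing: walk forward until we find the key or an empty slot.
        idx = hash(key) % len(self._keys)
        while self._keys[idx] is not None and self._keys[idx] != key:
            idx = (idx + 1) % len(self._keys)
        return idx

    def _grow(self) -> None:
        # Double the capacity and re-insert existing entries, keeping their rows.
        entries = [(k, r) for k, r in zip(self._keys, self._rows) if k is not None]
        new_capacity = 2 * len(self._keys)
        self._keys = [None] * new_capacity
        self._rows = [0] * new_capacity
        for key, row in entries:
            idx = self._probe(key)
            self._keys[idx] = key
            self._rows[idx] = row

    def lookup_or_insert(self, key: int) -> int:
        """Return the row index for key, assigning the next free row if unseen."""
        idx = self._probe(key)
        if self._keys[idx] is None:
            if self._size + 1 > self._max_load * len(self._keys):
                self._grow()
                idx = self._probe(key)
            self._keys[idx] = key
            self._rows[idx] = self._size
            self._size += 1
        return self._rows[idx]
```

Because row indices are assigned once and preserved across resizes, the table can start small and grow with the number of distinct ids seen, instead of requiring the final capacity to be configured up front.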