merge main into dev testing branch #195
Merged
avecplezir merged 73 commits into testing from May 8, 2025
…omplexData-MILA/AIF-Gen into 112-dataset-generation-with-openai-api
…with-openai-api Dataset Generation Finalization and Debug
* Transmute * CLI Typos
filter cli feature implementation and testing
…edding-diversity Text embedding diversity metric
* Add spellcheck hooks * Fix spelling
* Add rtd config * CLI is public
* Consolidate benchmark dependencies * Pin accelerate==0.34.2 and datasets>=3.2.0 * Pin deepspeed==0.16.3
* Fix similarity * Add dataset idx to future * Add basic test
* Add cppo starter files
  Based on ppo_continual implementation and trlx's cppo implementation
* Add CPPO loss logic to benchmarks
  Incorporate trlx CPPO loss into the CPPO trainer, using the detect track function and existing PPO iterative structure
* Update naming convention in README
* Update function docstring to pass linting
* Lint CPPO trainer file with ruff
* Implement feedback from PR review
* Implement minor updates from CPPO testing
* Fix sweep config
* Run formatter
* Fix ref_policy variable deletion
* Specialize detect_track based on ablation type
* Detect track to standalone function
* Update docstrings
* Fix cppo loss
* Fix cloning of old logprobs and rewards
* Add knowledge retention regularization coefficient
* Update benchmark sync
* Push old logprobs/rewards out to trainer
* Revert value loss without alpha
* Use ppo approx kl logs
* Remove redundant gc
* CPPO first successful run
* cppo unnecessary list indexing on mask
* mask duplicate variable debug in CPPO
* Dead code remove
---------
Co-authored-by: Jacob-Chmura <jacobpaul.chmura@gmail.com>
Co-authored-by: Shahrad Mohammadzadeh <shahrad_m@icloud.com>
* Deprecate transmute * Deprecate filter
* Clean up utils * Spelling
* Cleanup mappers * Fix docs ref * Fix merge * WIP
* WIP * Fix tests * update continual dataset * Pipe instead of union * Annotate
* Separate validation module * Update docs * Deprecate token entropy * Update embedding diversity * Update llm judge
* Preference swap clean * Split is a transform * Separate transform module * Revert accidental file push * Fix tests
* Preference swap clean * Split is a transform * Separate transform module * Revert accidental file push * Fix tests * WIp * Update docs
* Update service * Share retry logic * Consolidate preference axes sample gen * Update docs * Consolidate judge prompt
* upload logo * Upload log * wip * wip * upd * wip * WIP * WIP * Upd * Upd * upd * upd * Upd * wip * upd * wip * WIp
* wip * wip * wip * wip * wip * wip * WIP * wip * Wip * wip * wip * wip * Remove old img ref
* WIP: update readme * wip * wip * Check install * wip * wip * wip * wip * wip * WIP * wip * qip * wip * wip * wip * wip
* Bump torch==2.6.0 * Relax torch upper bound * No need to pin huggingface_hub in benchmarks group
* Generate sample completions during log call * Clean config
cppo readme update
No description provided.