Add CABS Merge Method #568
base: main
Conversation
All contributors have signed the CLA ✍️ ✅

I have read the CLA Document and I hereby sign the CLA
Thanks for the PR! I'd love to have your method in mergekit. Two things:

This is an amazing merging method; however, it needs better support for gradients. It is technically compatible if you supply as many values to the array as there are blocks, but it would be nice for interpolated values to round to the closest whole number so they don't throw an error.
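A minimal sketch of the rounding the commenter suggests, assuming gradient interpolation between per-block settings produces fractional values (the function name and clamping range are hypothetical, not part of the PR):

```python
def resolve_block_int(value: float, upper: int) -> int:
    # n:m sparsity parameters must be integers, but gradient
    # interpolation between per-block values can yield fractions
    # (e.g. 7.5). Round to the nearest whole number and clamp to a
    # valid range instead of raising an error.
    return min(upper, max(0, round(value)))
```

This keeps gradient configs usable without forcing users to supply one exact integer per block.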
```python
return tensor.clone(), torch.ones_like(tensor, dtype=torch.bool)
if n_val < 0 or n_val > m_val:
    logging.error(f"Tensor {original_shape}: n_val ({n_val}) invalid.")
    return tensor.clone(), torch.ones_like(tensor, dtype=torch.bool)
```
Bug: Validation Failure Causes Incorrect Masking
When validation fails (m_val <= 0 or invalid n_val), the function returns torch.ones_like(tensor, dtype=torch.bool) as the mask. This means ALL parameters are marked as retained/claimed. In the CABS conflict-aware algorithm, this causes the cumulative_param_mask to be filled with True values, preventing subsequent models from claiming any parameters and breaking the conflict-aware merging logic. The function should either return torch.zeros_like (no parameters retained) or raise an exception on validation failure.
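A sketch of the suggested fix, returning an all-False mask on validation failure (function name and the simplified n:m pruning around it are hypothetical, not the PR's exact implementation):

```python
import torch

def prune_n_m(tensor: torch.Tensor, n_val: int, m_val: int):
    # On invalid n:m settings, return an all-False mask: this model
    # claims no parameters, leaving cumulative_param_mask untouched
    # for later models, instead of an all-True mask that would block
    # every subsequent model from claiming anything.
    if m_val <= 0 or n_val < 0 or n_val > m_val:
        return torch.zeros_like(tensor), torch.zeros_like(tensor, dtype=torch.bool)
    flat = tensor.flatten()
    # Group into chunks of m (any remainder is simply pruned here).
    groups = flat[: flat.numel() - flat.numel() % m_val].view(-1, m_val)
    # Keep the n largest-magnitude weights in each group of m.
    topk = groups.abs().topk(n_val, dim=1).indices
    mask = torch.zeros_like(groups, dtype=torch.bool).scatter_(1, topk, True)
    full_mask = torch.zeros_like(flat, dtype=torch.bool)
    full_mask[: mask.numel()] = mask.flatten()
    full_mask = full_mask.view_as(tensor)
    return tensor * full_mask, full_mask
```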
```yaml
weight: 0.4
n_val: 8   # Per-model n
m_val: 32  # Per-model m
# n_val and m_val not set for zephyr_beta, will use global defaults
```
The comment on line 13 is incorrect: it states that n_val and m_val are not set for zephyr-7b-beta, but these parameters are defined on lines 11-12, directly above the comment. The comment should either be removed or corrected to accurately reflect the configuration.
```diff
- # n_val and m_val not set for zephyr_beta, will use global defaults
+ # n_val and m_val are set for zephyr_beta above
```
Implements CABS (Conflict-Aware and Balanced Sparsification) model merging technique from "CABS: Conflict-Aware and Balanced Sparsification for Enhancing Model Merging".
CABS aims to improve merged model quality by mitigating parameter interference through sequential conflict-aware pruning and applying n:m structural pruning to task vector components.
Key Features:
- Conflict-aware pruning: models are processed sequentially according to pruning_order, masking out parameters already claimed by prior models in the sequence before subsequent pruning. This minimizes destructive overlap.
- Balanced n:m structural pruning (retaining the n largest-magnitude weights out of every m consecutive weights) is applied to the conflict-masked task vector components.
- n and m values are configurable globally (default_n_val, default_m_val) and per-model (n_val, m_val).
- Pruned task vectors are scaled by weight (lambda) and added to the base model.
- New file cabs.py implementing CABSMerge and CABSTask.
- Example configuration in examples/cabs.yml.
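The sequential conflict-aware step described above can be sketched as follows (function and parameter names are hypothetical illustrations, not the PR's actual API; prune_fn stands in for the n:m pruning step):

```python
import torch

def conflict_aware_merge(base, task_vectors, weights, prune_fn):
    # Models are processed in pruning_order; each model may only
    # retain parameters not already claimed by earlier models.
    merged = base.clone()
    cumulative_param_mask = torch.zeros_like(base, dtype=torch.bool)
    for tv, w in zip(task_vectors, weights):
        available = tv * (~cumulative_param_mask)  # zero out claimed params
        pruned, retained = prune_fn(available)     # e.g. balanced n:m pruning
        merged += w * pruned                       # scale by weight (lambda)
        cumulative_param_mask |= retained          # claim retained params
    return merged
```

The cumulative mask is what makes the order matter: earlier models in pruning_order get first pick of the parameter positions they keep.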