
Conversation

@zongzhenyang

Implements CABS (Conflict-Aware and Balanced Sparsification) model merging technique from "CABS: Conflict-Aware and Balanced Sparsification for Enhancing Model Merging".

CABS aims to improve merged model quality by mitigating parameter interference through sequential conflict-aware pruning and applying n:m structural pruning to task vector components.

Key Features:

  • Sequential Conflict-Aware Pruning: Processes task vectors in a user-defined pruning_order, masking out parameters already claimed by prior models in the sequence before subsequent pruning. This minimizes destructive overlap.
  • N:M Structural Pruning:
    • Applies n:m pruning (retaining n largest magnitude weights out of every m consecutive weights) to the conflict-masked task vector components.
    • n and m values are configurable globally (default_n_val, default_m_val) and per-model (n_val, m_val).
  • Weighted Aggregation: Pruned task vectors are scaled by a weight (lambda) and added to the base model.
  • Added cabs.py implementing CABSMerge and CABSTask.
  • Added CABS example configuration in examples/cabs.yml.
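For illustration, the pipeline described above (conflict masking, then n:m pruning, then weighted aggregation) could be sketched roughly like this. Names and signatures are illustrative only, not mergekit's actual API:

```python
import torch

def nm_prune(tensor: torch.Tensor, n: int, m: int):
    """Keep the n largest-magnitude values in every group of m
    consecutive weights; return the pruned tensor and a bool mask."""
    flat = tensor.flatten()
    pad = (-flat.numel()) % m  # pad so length is a multiple of m
    groups = torch.cat([flat, flat.new_zeros(pad)]).view(-1, m)
    # indices of the n largest |w| in each group of m
    topk = groups.abs().topk(n, dim=1).indices
    mask = torch.zeros_like(groups, dtype=torch.bool).scatter_(1, topk, True)
    mask = mask.flatten()[: flat.numel()].view_as(tensor)
    return tensor * mask, mask

def cabs_merge(base, task_vectors, weights, n=1, m=4):
    """Sequential conflict-aware n:m sparsification and weighted sum.
    task_vectors must already be ordered by the desired pruning_order."""
    merged = base.clone()
    claimed = torch.zeros_like(base, dtype=torch.bool)  # cumulative mask
    for tv, lam in zip(task_vectors, weights):
        tv = tv.masked_fill(claimed, 0.0)   # zero out already-claimed params
        pruned, mask = nm_prune(tv, n, m)
        merged += lam * pruned              # weighted aggregation
        claimed |= mask                     # record what this model claimed
    return merged
```

Because each model's mask is accumulated into `claimed` before the next model is pruned, later models in the order can only claim parameters the earlier ones left free, which is the conflict-avoidance step.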

@github-actions

github-actions bot commented May 9, 2025

All contributors have signed the CLA ✍️ ✅
Posted by the CLA Assistant Lite bot.

@zongzhenyang
Author

I have read the CLA Document and I hereby sign the CLA

@cg123
Collaborator

cg123 commented May 10, 2025

Thanks for the PR! I'd love to have your method in mergekit.

Two things:

  • Could you please run the pre-commit hook to format the code and push the changes?
  • Would you like to add your method to the table in the README?

@CasualAutopsy

This is an amazing merging method; however, it needs better support for gradients. It is technically compatible if you supply as many values to the array as there are blocks, but it would be nice to have it round interpolated values to the closest whole number so it doesn't throw an error.
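A minimal sketch of the rounding being asked for, assuming per-block `n` values are linearly interpolated between the gradient's anchor points (the function name and signature are hypothetical, not part of this PR):

```python
def interpolated_n_vals(anchors, num_blocks):
    """Linearly interpolate per-block n values between anchor points,
    rounding each result to the nearest whole number so that n:m
    pruning always receives integers instead of fractional values."""
    if num_blocks == 1:
        return [round(anchors[0])]
    vals = []
    for i in range(num_blocks):
        # position of this block along the anchor axis
        pos = i / (num_blocks - 1) * (len(anchors) - 1)
        lo = int(pos)
        hi = min(lo + 1, len(anchors) - 1)
        frac = pos - lo
        vals.append(round(anchors[lo] * (1 - frac) + anchors[hi] * frac))
    return vals
```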


@cursor cursor bot left a comment


This PR is being reviewed by Cursor Bugbot


        return tensor.clone(), torch.ones_like(tensor, dtype=torch.bool)
    if n_val < 0 or n_val > m_val:
        logging.error(f"Tensor {original_shape}: n_val ({n_val}) invalid.")
        return tensor.clone(), torch.ones_like(tensor, dtype=torch.bool)

Bug: Validation Failure Causes Incorrect Masking

When validation fails (m_val <= 0 or invalid n_val), the function returns torch.ones_like(tensor, dtype=torch.bool) as the mask. This means ALL parameters are marked as retained/claimed. In the CABS conflict-aware algorithm, this causes the cumulative_param_mask to be filled with True values, preventing subsequent models from claiming any parameters and breaking the conflict-aware merging logic. The function should either return torch.zeros_like (no parameters retained) or raise an exception on validation failure.
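A sketch of the fix the bot suggests, following the variable names in the quoted snippet; this is not the merged patch, just one way the early-return could look:

```python
import logging
import torch

def nm_prune_with_validation(tensor, n_val, m_val):
    """On invalid n/m values, keep the tensor untouched but report an
    all-False mask, so nothing is marked as retained/claimed and the
    cumulative_param_mask downstream is left unaffected."""
    if m_val <= 0 or n_val < 0 or n_val > m_val:
        logging.error(
            f"Tensor {tuple(tensor.shape)}: invalid n:m pair ({n_val}:{m_val})."
        )
        # zeros, not ones: no parameters retained on failure
        return tensor.clone(), torch.zeros_like(tensor, dtype=torch.bool)
    ...  # normal n:m pruning path
```

Raising a `ValueError` here instead would surface misconfiguration immediately; returning an all-False mask merely keeps the merge running without silently blocking later models.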


weight: 0.4
n_val: 8 # Per-model n
m_val: 32 # Per-model m
# n_val and m_val not set for zephyr_beta, will use global defaults

The comment on line 13 is incorrect - it states that n_val and m_val are not set for zephyr-7b-beta, but these parameters are actually defined in lines 11-12 directly above the comment. This comment should either be removed or corrected to accurately reflect the configuration.

Suggested change
# n_val and m_val not set for zephyr_beta, will use global defaults
# n_val and m_val are set for zephyr_beta above

Spotted by Graphite Agent

