
Support FLUX OneTrainer LoRA formats (incl. DoRA) #7590

Merged
21 commits merged into main on Jan 28, 2025

Conversation

RyanJDick (Collaborator) commented on Jan 24, 2025

Summary

This PR adds support for the FLUX LoRA model format produced by OneTrainer.

Specifically, this PR adds:

  • Support for DoRA patches
  • Support for patch models that modify the FLUX T5 encoder
  • Probing / loading support for OneTrainer models (a rough key-detection sketch follows this list)
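
As a rough illustration of the probing step, a OneTrainer DoRA checkpoint can be recognized from its state-dict keys. The snippet below is a simplified sketch, not the probe code added in this PR; the key suffixes (`lora_down.weight`, `lora_up.weight`, `dora_scale`) are assumptions about OneTrainer-style checkpoints.

```python
from typing import Dict

import torch


def looks_like_onetrainer_dora(state_dict: Dict[str, torch.Tensor]) -> bool:
    """Heuristic check for a OneTrainer-style DoRA state dict (illustrative only)."""
    # Classic LoRA checkpoints ship paired down/up projection weights...
    has_lora_pairs = any(k.endswith("lora_down.weight") for k in state_dict) and any(
        k.endswith("lora_up.weight") for k in state_dict
    )
    # ...while DoRA checkpoints additionally carry a per-module magnitude vector.
    has_dora_scale = any("dora_scale" in k for k in state_dict)
    return has_lora_pairs and has_dora_scale
```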

Known limitations

  • DoRA patches cannot currently be applied to base weights that are quantized with bitsandbytes. The DoRA algorithm needs access to the original model weight in order to compute the patch diff, and the bitsandbytes quantization layers make that difficult. DoRA patches can be applied to non-quantized and GGUF-quantized layers without issue. (A short sketch of the DoRA patch math follows this list.)
  • This PR causes a slight speed regression for one specific inference combination: a quantized base model plus a LoRA with diffusers keys (i.e. one that uses the MergedLayerPatch). Now that more LoRA formats use the MergedLayerPatch, maintaining the previous optimization for this case was becoming too much work. Speed regresses from ~1.7 it/s to ~1.4 it/s.
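
For reference, this is roughly why the base weight is needed: DoRA re-normalizes the combined weight (base + LoRA delta) and rescales it by a learned magnitude vector, so the patch cannot be computed from the LoRA matrices alone. The sketch below is a simplified illustration, not the MergedLayerPatch / sidecar code from this PR; the tensor names and the norm axis follow common DoRA implementations and are assumptions.

```python
import torch


def apply_dora_patch(
    w_base: torch.Tensor,      # [out_features, in_features]; must be a plain (dequantized) tensor
    lora_up: torch.Tensor,     # [out_features, rank]
    lora_down: torch.Tensor,   # [rank, in_features]
    dora_scale: torch.Tensor,  # [out_features, 1]; learned magnitude vector
    alpha_scale: float = 1.0,
) -> torch.Tensor:
    """Compute a DoRA-patched weight (illustrative sketch, not InvokeAI's implementation)."""
    # Standard LoRA delta.
    delta = alpha_scale * (lora_up @ lora_down)
    # DoRA normalizes the *combined* weight, so the original w_base is required here.
    # With a bitsandbytes-quantized layer, w_base is not directly available as a plain
    # tensor, which is the source of the limitation described above.
    combined = w_base + delta
    norm = combined.norm(dim=1, keepdim=True)  # per-output-row norm (common convention)
    return dora_scale * combined / norm
```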

Future Notes

  • We may want to consider dropping support for bitsandbytes quantization. It is very difficult to maintain compatibility across features like partial loading and LoRA patching.
  • At a future time, we should refactor the LoRA parsing logic to be more generalized rather than handling each format independently.
  • There are some redundant device casts and dequantizations in autocast_linear_forward_sidecar_patches(...) (and its sub-calls). Optimizing this is left for future work.

Related Issues / Discussions

QA Instructions

OneTrainer test models:

The following tests were repeated with each of the OneTrainer test models:

  • Test with non-quantized base model
  • Test with GGUF-quantized base model
  • Test with BnB-quantized base model
  • Test with non-quantized base model that is partially-loaded onto the GPU

Other regression tests:

  • Test some SD1 LoRAs
  • Test some SDXL LoRAs
  • Test a variety of existing FLUX LoRA formats
  • Test a FLUX Control LoRA on all base model quantization formats.

Merge Plan

No special instructions.

Checklist

  • The PR has a short but descriptive title, suitable for a changelog
  • Tests added / updated (if applicable)
  • Documentation added / updated (if applicable)
  • Updated What's New copy (if doing a release after this PR)

@github-actions bot added the python, invocations, backend, frontend, and python-tests labels on Jan 24, 2025
@RyanJDick force-pushed the ryan/flux-dora-one-trainer-concatenated branch from 14c5f31 to 5085a8c on January 24, 2025 at 19:50
@RyanJDick marked this pull request as ready for review on January 24, 2025 at 20:24
@RyanJDick force-pushed the ryan/flux-dora-one-trainer-concatenated branch from 07d83b7 to 229834a on January 28, 2025 at 15:11
@RyanJDick merged commit debcbd6 into main on Jan 28, 2025 (15 checks passed)
@RyanJDick deleted the ryan/flux-dora-one-trainer-concatenated branch on January 28, 2025 at 17:50