fix: Enable LoRAs #308

Merged
p-e-w merged 1 commit into p-e-w:master from anrp:anrp/loradetails on May 2, 2026

Conversation

@anrp
Contributor

@anrp anrp commented Apr 22, 2026

No description provided.

@anrp anrp changed the title from "Enable LoRAs and write README.md for LoRA adapter as well" to "feat: Enable LoRAs and write README.md for LoRA adapter as well" on Apr 22, 2026
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces the ability to save or upload LoRA adapters independently and refactors model card generation into a centralized get_model_card function. However, several critical issues were identified: a git merge conflict marker was accidentally left in src/heretic/utils.py, the get_model_card function references a non-existent use_ara attribute which will cause a runtime error, and the function's return type annotation is incorrectly marked as None instead of ModelCard | None.

3 comment threads on src/heretic/utils.py (all outdated)
@anrp anrp force-pushed the anrp/loradetails branch 5 times, most recently from 9270fd7 to 5419a20, on April 22, 2026 13:28
@anrp
Contributor Author

anrp commented Apr 22, 2026

Previously the locally saved LoRA would have the default autogenerated README
README.md

Now the local file looks like
README.md

which has the same info as the hf_hub uploaded version (just ran & put in https://huggingface.co/anrp/hereticuploadlora )

@anrp
Contributor Author

anrp commented Apr 22, 2026

Resolves #152

@anrp
Contributor Author

anrp commented Apr 22, 2026

/gemini review

Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request refactors model card generation into a new utility module and introduces an option to save LoRA adapters without merging. The changes include moving README generation logic to src/heretic/model_card_utils.py and updating the saving and uploading workflows to automatically generate and store model cards. Review feedback highlights a potential crash when loading remote model cards and suggests improving the readability of the parameter table generation logic.

2 comment threads on src/heretic/model_card_utils.py (both outdated)
@p-e-w
Owner

p-e-w commented Apr 23, 2026

Thanks, but please split out the model card changes into a separate PR. We're very close to the 1.3 release now, and #303 is making further changes that affect the model card. This part is too risky to merge so late in the release cycle.

@anrp anrp force-pushed the anrp/loradetails branch from 5419a20 to 4865f9f on April 28, 2026 12:01
@anrp
Contributor Author

anrp commented Apr 28, 2026

Moved readme bit to #314, this is just a pure revert now

@anrp anrp changed the title from "feat: Enable LoRAs and write README.md for LoRA adapter as well" to "fix: Enable LoRAs" on Apr 28, 2026
@p-e-w p-e-w merged commit da92f74 into p-e-w:master May 2, 2026
5 checks passed
@p-e-w
Owner

p-e-w commented May 2, 2026

Thanks! I was finally able to merge this after GitHub had issues with pull requests for several days.

@p-e-w
Owner

p-e-w commented May 2, 2026

I just tested this: https://huggingface.co/p-e-w/gemma-4-E2B-it-heretic-LoRA

The safetensors file is empty 😠 😠 😠

Looks like the bug isn't fixed after all.

@anrp
Contributor Author

anrp commented May 2, 2026

It is possible the adapter is actually empty (save it to disk and see?)

~$ xxd < !$
xxd < Downloads/adapter_model.safetensors
00000000: 2000 0000 0000 0000 7b22 5f5f 6d65 7461   .......{"__meta
00000010: 6461 7461 5f5f 223a 7b22 666f 726d 6174  data__":{"format
00000020: 223a 2270 7422 7d7d                      ":"pt"}}

that looks like valid-but-empty?
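
The dump above can be checked against the safetensors layout: an 8-byte little-endian header length, followed by a JSON header whose keys are tensor names plus an optional `__metadata__` entry. The following is a minimal sketch, not part of Heretic; the function name `list_safetensors_tensors` is hypothetical, and the bytes are copied from the `xxd` output.

```python
import json
import struct


def list_safetensors_tensors(data: bytes) -> list[str]:
    """Return the tensor names declared in a safetensors header."""
    # First 8 bytes: unsigned little-endian 64-bit length of the JSON header.
    (header_len,) = struct.unpack("<Q", data[:8])
    header = json.loads(data[8 : 8 + header_len].decode("utf-8"))
    # Every key except the optional "__metadata__" entry names a tensor.
    return [name for name in header if name != "__metadata__"]


# The 40 bytes from the xxd dump, one string per dump line.
dump = bytes.fromhex(
    "20000000000000007b225f5f6d657461"
    "646174615f5f223a7b22666f726d6174"
    "223a227074227d7d"
)

print(list_safetensors_tensors(dump))  # prints []
```

The header length is 0x20 (32) bytes and the header holds only `__metadata__`, so the file is well-formed but declares no tensors.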

@p-e-w
Owner

p-e-w commented May 2, 2026

Interestingly, it did work with Qwen3.5: https://huggingface.co/p-e-w/Qwen3.5-4B-heretic-LoRA

@p-e-w
Owner

p-e-w commented May 2, 2026

> It is possible the adapter is actually empty

No, that isn't possible, because otherwise the model wouldn't be abliterated. All abliteration happens through the adapter. If the model is modified at all (and it is, which you can see from the refusal reduction) then the adapter cannot be empty.

@p-e-w
Owner

p-e-w commented May 2, 2026

There is probably a bug in the way PEFT identifies modules, which would explain why it works for one model but not the other, even though the Heretic configuration is identical.

@anrp
Contributor Author

anrp commented May 2, 2026

I think it might be something about your system? I just ran this

CUDA_VISIBLE_DEVICES=0 uv run heretic --model google/gemma-4-E2B-it --n-trials 10 --n-startup-trials 4

and uploaded (directly) to https://huggingface.co/anrp/gemma-4-E2B-it-heretic-tryupload/tree/main and the adapter is reasonably-sized. ???

OTOH I am using an out-of-date uv that doesn't understand the exclude-newer directive, so it may be that a package needs to be updated.

$ uv pip list
warning: Failed to parse `pyproject.toml` during settings discovery:
  TOML parse error at line 77, column 17
     |
  77 | exclude-newer = "7 days"
     |                 ^^^^^^^^
  failed to parse year in date "7 days": failed to parse "7 da" as year (a four digit integer): invalid digit, expected 0-9 but got  

Package                  Version      Editable project location
------------------------ ------------ -------------------------
absl-py                  2.4.0
accelerate               1.13.0
aiohappyeyeballs         2.6.1
aiohttp                  3.13.5
aiosignal                1.4.0
alembic                  1.18.4
annotated-doc            0.0.4
annotated-types          0.7.0
anyio                    4.13.0
attrs                    26.1.0
av                       16.1.0
bitsandbytes             0.49.2
certifi                  2026.4.22
chardet                  5.2.0
charset-normalizer       3.4.7
click                    8.3.3
colorama                 0.4.6
colorlog                 6.10.1
cuda-bindings            13.2.0
cuda-pathfinder          1.5.4
cuda-toolkit             13.0.2
dataproperty             1.1.0
datasets                 4.8.5
decord2                  3.0.0
dill                     0.4.1
einops                   0.8.1
evaluate                 0.4.6
filelock                 3.29.0
flash-attn               2.8.3
frozenlist               1.8.0
fsspec                   2026.2.0
greenlet                 3.5.0
h11                      0.16.0
heretic-llm              1.2.0        /home/anrp/ai/heretic
hf-transfer              0.1.9
hf-xet                   1.4.3
httpcore                 1.0.9
httpx                    0.28.1
huggingface-hub          1.13.0
idna                     3.13
immutabledict            4.3.1
jinja2                   3.1.6
joblib                   1.5.3
jsonlines                4.0.0
kernels                  0.13.0
langdetect               1.0.9
lm-eval                  0.4.11
lxml                     6.1.0
mako                     1.3.12
markdown-it-py           4.0.0
markupsafe               3.0.3
mbstrdecoder             1.1.4
mdurl                    0.1.2
molmo-utils              0.0.1
more-itertools           11.0.2
mpmath                   1.3.0
multidict                6.7.1
multiprocess             0.70.19
networkx                 3.6.1
nltk                     3.9.4
numpy                    2.4.4
nvidia-cublas            13.1.0.3
nvidia-cublas-cu12       12.8.4.1
nvidia-cuda-cupti        13.0.85
nvidia-cuda-cupti-cu12   12.8.90
nvidia-cuda-nvrtc        13.0.88
nvidia-cuda-nvrtc-cu12   12.8.93
nvidia-cuda-runtime      13.0.96
nvidia-cuda-runtime-cu12 12.8.90
nvidia-cudnn-cu12        9.19.0.56
nvidia-cudnn-cu13        9.19.0.56
nvidia-cufft             12.0.0.61
nvidia-cufft-cu12        11.3.3.83
nvidia-cufile            1.15.1.6
nvidia-cufile-cu12       1.13.1.3
nvidia-curand            10.4.0.35
nvidia-curand-cu12       10.3.9.90
nvidia-cusolver          12.0.4.66
nvidia-cusolver-cu12     11.7.3.90
nvidia-cusparse          12.6.3.3
nvidia-cusparse-cu12     12.5.8.93
nvidia-cusparselt-cu12   0.7.1
nvidia-cusparselt-cu13   0.8.0
nvidia-nccl-cu12         2.28.9
nvidia-nccl-cu13         2.28.9
nvidia-nvjitlink         13.0.88
nvidia-nvjitlink-cu12    12.8.93
nvidia-nvshmem-cu12      3.4.5
nvidia-nvshmem-cu13      3.4.5
nvidia-nvtx              13.0.85
nvidia-nvtx-cu12         12.8.90
optuna                   4.8.0
packaging                26.2
pandas                   3.0.2
pathvalidate             3.3.1
peft                     0.19.1
pillow                   12.2.0
portalocker              3.2.0
prompt-toolkit           3.0.52
propcache                0.4.1
psutil                   7.2.2
py-cpuinfo               9.0.0
pyarrow                  24.0.0
pydantic                 2.13.3
pydantic-core            2.46.3
pydantic-settings        2.14.0
pygments                 2.20.0
pytablewriter            1.2.1
python-dateutil          2.9.0.post0
python-dotenv            1.2.2
pytz                     2026.1.post1
pyyaml                   6.0.3
questionary              2.1.1
regex                    2026.4.4
requests                 2.33.1
rich                     14.3.4
rouge-score              0.1.2
ruff                     0.15.12
sacrebleu                2.6.0
safetensors              0.7.0
scikit-learn             1.8.0
scipy                    1.17.1
setuptools               81.0.0
shellingham              1.5.4
six                      1.17.0
sqlalchemy               2.0.49
sqlitedict               2.1.0
sympy                    1.14.0
tabledata                1.3.4
tabulate                 0.10.0
tcolorpy                 0.1.7
threadpoolctl            3.6.0
timm                     1.0.24
tokenizers               0.22.2
tomli-w                  1.2.0
tomlkit                  0.14.0
torch                    2.11.0+cu128
torchvision              0.26.0+cu128
tqdm                     4.67.3
transformers             5.7.0
triton                   3.6.0
ty                       0.0.34
typepy                   1.3.4
typer                    0.25.1
typer-slim               0.21.1
typing-extensions        4.15.0
typing-inspection        0.4.2
tzdata                   2025.2
urllib3                  2.6.3
wcwidth                  0.6.0
word2number              1.1
xxhash                   3.7.0
yarl                     1.23.0
zstandard                0.25.0

p-e-w added a commit that referenced this pull request May 3, 2026
@p-e-w
Owner

p-e-w commented May 3, 2026

I spun up a fresh server for this test, installed dependencies with pip, and did the complete run with all parameters set to their defaults, with GPU acceleration.

It may be that GPU operations are the problem, or that there is some hardware-specific behavior, but given that it clearly still doesn't work reliably, I see no option other than to revert this PR.

@anrp
Contributor Author

anrp commented May 3, 2026

Makes sense to revert it while figuring it out. I'd like to reproduce what's happening to you though. Could you write out the steps that you ran to get that failure mode (specifically: not uv, using pip, which host OS, ...)? I always use uv to run this, which (basically) prevents me from messing with packages, so it's very weird.

(FYI: I care about LoRAs being outputted so much because it's +few% overhead to have both the gold standard base model & the abliterated model serving at the same time with vllm, so very easy to compare with no switching overhead)

@p-e-w
Owner

p-e-w commented May 4, 2026

> could you just write out the steps that you ran to get that failure mode

Thanks to our new reproducibility system, I can indeed tell you exactly what I did:

https://huggingface.co/p-e-w/gemma-4-E2B-it-heretic/blob/main/reproduce/README.md

(This was exported from the same installation.)

😄

@anrp
Contributor Author

anrp commented May 5, 2026

I did, and... it works. https://huggingface.co/anrp/gemma-4-E2B-it-heretic-trylora2/tree/main Don't know what to say anymore?

@p-e-w
Owner

p-e-w commented May 5, 2026

I think I may have figured out what's going on. Try the following:

  1. Run Heretic.
  2. Choose a trial.
  3. Export the full model (not just the LoRA).
  4. Now export the LoRA.

I suspect that merging the adapter with the base model empties it somehow.

@anrp
Contributor Author

anrp commented May 5, 2026

There's the clue. #321 fixes that flow (lora-after-save-merged gave an empty adapter even going to disk.)

anrp added a commit to anrp/heretic that referenced this pull request May 7, 2026
p-e-w pushed a commit that referenced this pull request May 9, 2026
* fix: Reset model after saving merged model

The adapter is lost and writes 0-byte adapters if you save an adapter after saving the merged model.

* Revert "Revert "Revert "fix: disable LoRA export for now" (#308)" (#319)"

This reverts commit 216c089.

* Add comment as to why resetting model is needed
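
The order dependence described in this thread (exporting the LoRA after exporting the merged model yields an empty adapter) and the "reset model after saving merged model" fix can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToyAdapterModel` and `export_both` are hypothetical names, the merge is assumed to be destructive, and how Heretic actually resets its model is not shown in the thread.

```python
import copy


class ToyAdapterModel:
    """Hypothetical stand-in for a base model wrapped with a LoRA adapter."""

    def __init__(self, base_weights: dict, adapter_deltas: dict):
        self.base = dict(base_weights)
        self.adapter = dict(adapter_deltas)

    def save_merged(self) -> dict:
        """Fold the adapter into the base weights (assumed to be destructive)."""
        for name, delta in self.adapter.items():
            self.base[name] += delta
        self.adapter.clear()  # after merging, no adapter weights remain
        return dict(self.base)

    def save_adapter(self) -> dict:
        """Export whatever adapter weights the model currently holds."""
        return dict(self.adapter)


def export_both(model: ToyAdapterModel, reset_after_merge: bool):
    """Export the merged model first, then the adapter."""
    snapshot = copy.deepcopy(model)  # state before the destructive merge
    merged = model.save_merged()
    if reset_after_merge:
        model = snapshot  # assumed fix: restore the pre-merge state
    return merged, model.save_adapter()


merged, adapter = export_both(ToyAdapterModel({"w": 1.0}, {"w": 0.5}), False)
print(adapter)  # prints {}

merged, adapter = export_both(ToyAdapterModel({"w": 1.0}, {"w": 0.5}), True)
print(adapter)  # prints {'w': 0.5}
```

In both runs the merged weights are the same; only the adapter export differs, which matches the symptom of a working abliterated model alongside an empty adapter file.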