[CI] Enable model revisions in GHA test #523
Conversation
This PR is a continuation of PR #499, rebased onto …
# replace '/' characters in HF_MODEL with '--' for GHA cache keys and
# in model file names in local HF hub cache
A comment showing what this looks like would be nice, e.g. what the value was before and what it looks like after the replacement.
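For illustration, here is a minimal sketch of the substitution being discussed, using the `cross-encoder/stsb-roberta-large` model id that appears later in this thread; the variable names are illustrative and not the PR's actual code:

```python
# Illustrative sketch (not the PR's actual code) of the '/' -> '--' substitution.
hf_model = "cross-encoder/stsb-roberta-large"

# '/' is awkward in GHA cache keys and file names, so it is replaced:
cache_key = hf_model.replace("/", "--")
print(cache_key)  # cross-encoder--stsb-roberta-large

# This matches how the local HF hub cache names model directories, e.g.
# ~/.cache/huggingface/hub/models--cross-encoder--stsb-roberta-large
```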
@@ -0,0 +1,63 @@
#!/usr/bin/env python3
Why is this tool required if you can use `huggingface-cli download`? Is it to make sure that only the necessary and sufficient set of files is downloaded?
We are using it in the GitHub Actions workflow to download models with revisions (unless they are already cached in a GHA cache blob).
I recall being told, and seeing it mentioned in comments, that the HuggingFace CLI is not reliable during GitHub Action runs, though I have never put that to the test myself.
Worth a shot? I also seem to remember that the HF CLI was downloading something weird at one point, but I don't see that lately. Maybe something got fixed?
> I recall being told, and seeing it mentioned in comments, that the HuggingFace CLI is not reliable during GitHub Action runs, though I have never put that to the test myself.
>
> Worth a shot? I also seem to remember that the HF CLI was downloading something weird at one point, but I don't see that lately. Maybe something got fixed?
I tried using the `hf download` action and it failed in 3 of 10 test jobs:
https://github.com/ckadner/vllm-spyre/actions/runs/18510700226/job/52750674898?pr=20
Traceback (most recent call last):
  File "/home/runner/work/vllm-spyre/vllm-spyre/.venv/bin/huggingface-cli", line 10, in <module>
    sys.exit(main())
             ^^^^^^
  File "/home/runner/work/vllm-spyre/vllm-spyre/.venv/lib/python3.12/site-packages/huggingface_hub/commands/huggingface_cli.py", line 61, in main
Oh, `System.IO.IOException: No space left on device`.
[EDIT]
I wonder if the HF CLI download (temporarily) creates/keeps 2 copies of the files while the download is ongoing, which would exceed the available disk space on the GHA runner?
Yup. Just reverted. All downloads went through fine: https://github.com/ckadner/vllm-spyre/actions/runs/18510991794/job/52751608834?pr=20
hmmm
One more time with `hf download --max-workers 2` ...
Downloading 'pytorch_model.bin' to '/home/runner/work/vllm-spyre/vllm-spyre/.cache/huggingface/hub/models--cross-encoder--stsb-roberta-large/blobs/03023f7dcd714c15ff27d534432a80d3bff78c9b50778a44b10585ef5fa7fd25.incomplete'
/home/runner/work/vllm-spyre/vllm-spyre/.venv/lib/python3.12/site-packages/huggingface_hub/file_download.py:801: UserWarning: Not enough free disk space to download the file. The expected file size is: 1421.62 MB. The target location /home/runner/work/vllm-spyre/vllm-spyre/.cache/huggingface/hub/models--cross-encoder--stsb-roberta-large/blobs only has 1414.68 MB free disk space.
We could probably figure out how the space gets used, which files to exclude, and/or how to make more space on the GHA runner.
But we already have the download script, and the code in it has been running fine for several months. So I vote for keeping the existing custom download code, albeit in a separate script now.
Yes, repositories often have extra files, such as model weights in other formats. I think the script only downloads the required files, which would explain why it doesn't run out of disk space. I would prefer less code to maintain, but since we have these space restrictions, I guess for now it's better to keep the script.
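For context, downloading only the required files at a pinned revision can be done with `huggingface_hub.snapshot_download` and `allow_patterns`. The sketch below is illustrative only; the model id, revision, and patterns are placeholders, not the PR's actual download script:

```python
# Minimal sketch (not the PR's download script): fetch only the files a test
# needs, at a pinned revision, into the local HF hub cache.
from huggingface_hub import snapshot_download

MODEL_ID = "org/model-name"  # placeholder model id
REVISION = "main"            # placeholder; a branch, tag, or commit SHA can be pinned here

snapshot_download(
    repo_id=MODEL_ID,
    revision=REVISION,
    # Only pull the listed file types, so redundant weight formats
    # don't exhaust the small disk on the GHA runner.
    allow_patterns=["*.safetensors", "*.json", "*.txt", "tokenizer*"],
)
```

Filtering like this keeps the extra weight formats off the runner disk, which matches the behavior attributed to the custom script above.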
LGTM
Description
This PR solves 2 problems:

- … (`spyre_util.py`)
- …

Bonus: Use a dedicated GHA cache blob per model (and revision) that can be used locally.

Options to solve the multi-model cache blob problem:

- …

Revisions used now (consistently), from `tests/spyre_util.py`:
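The snippet below is a hypothetical illustration of what pinning a model revision for the tests could look like; the `ModelInfo` name, model ids, and revisions are placeholders, not the actual contents of `tests/spyre_util.py`:

```python
# Hypothetical sketch only -- not the actual contents of tests/spyre_util.py.
# It illustrates pinning each test model to a specific revision so downloads
# and GHA cache keys stay reproducible.
from typing import NamedTuple

class ModelInfo(NamedTuple):  # placeholder type name
    name: str
    revision: str  # branch, tag, or commit SHA

# Placeholder entries for illustration.
TEST_MODELS = [
    ModelInfo(name="org/model-a", revision="main"),
    ModelInfo(name="org/model-b", revision="1234abcd"),
]
```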
Related Issues
#522
#499