
Fix AOPerModuleConfig name changes #18869


Open · wants to merge 3 commits into main

Conversation

@jerryzh168 (Contributor) commented May 29, 2025

Summary:
also fixed float8 and int4 tests

Test Plan:
python test/quantization/test_torchao.py

Reviewers:

Subscribers:

Tasks:

Tags:


👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, covering a small and essential subset of CI tests to quickly catch errors. You can run additional CI tests on top of those by going to your fastcheck build in the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either: Add ready label to the PR or enable auto-merge.

🚀

@@ -12,7 +12,7 @@

 @pytest.mark.skipif(not TORCHAO_AVAILABLE, reason="torchao is not available")
 def test_pre_quantized_model(vllm_runner):
-    with vllm_runner("drisspg/float8_dynamic_act_float8_weight-opt-125m",
+    with vllm_runner("drisspg/fp8-opt-125m",
Collaborator:

Can we have a unified place to store all the models? :-)

Contributor Author:

Any recommendations? Other quantization methods are also pulling models from scattered places:

models_4bit_to_test = [
("facebook/opt-125m", "quantize opt model inflight"),
("mistralai/Mistral-7B-Instruct-v0.3",
"quantize inflight model with both HF and Mistral format weights")
]
models_4bit_to_embedding_test = [
("intfloat/e5-mistral-7b-instruct", "quantize embedding model inflight"),
]
models_pre_qaunt_4bit_to_test = [
('PrunaAI/Einstein-v6.1-Llama3-8B-bnb-4bit-smashed',
'read pre-quantized 4-bit FP4 model'),
('poedator/opt-125m-bnb-4bit', 'read pre-quantized 4-bit NF4 opt model'),
]
models_pre_quant_8bit_to_test = [
('meta-llama/Llama-Guard-3-8B-INT8',
'read pre-quantized llama 8-bit model'),
("yec019/fbopt-350m-8bit", "read pre-quantized 8-bit opt model"),
]

Collaborator:

How about having a central repo, like torchao/fp8-opt-125m?

Contributor Author:

Just for test models? I feel that might be a bit of an overkill.

We do release official torchao models under pytorch, e.g.: https://huggingface.co/collections/pytorch/torchao-quantized-phi-4-mini-instruct-681566f123acc6fed345cb1a
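One way to realize the reviewer's "unified place" suggestion, without standing up a new Hugging Face org, would be a small in-repo registry module that the quantization tests share. This is only a sketch; the module name, keys, and `model_id` helper are hypothetical, and the repo IDs are taken from the lists quoted above.

```python
# Hypothetical shared registry so individual quantization tests stop
# hard-coding scattered Hugging Face repo IDs.
TEST_MODELS = {
    "torchao-fp8": "drisspg/fp8-opt-125m",
    "bnb-4bit-nf4": "poedator/opt-125m-bnb-4bit",
    "bnb-8bit": "yec019/fbopt-350m-8bit",
}


def model_id(key: str) -> str:
    """Look up a test model repo by short key; raises KeyError if unknown."""
    return TEST_MODELS[key]
```

A test would then call, e.g., `vllm_runner(model_id("torchao-fp8"), ...)`, and renaming a model on the Hub becomes a one-line change.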

@jerryzh168 jerryzh168 force-pushed the fix-ao-per-module-config-name branch from 959f5f2 to f62f7b3 Compare May 29, 2025 02:55
@jerryzh168 (Contributor Author) commented:

Can you help merge this, @mgoin?

The broken checks do not look relevant.

@mgoin mgoin added quantization ready ONLY add when PR is ready to merge/full CI is needed labels May 30, 2025
Signed-off-by: Jerry Zhang <[email protected]>
Signed-off-by: Jerry Zhang <[email protected]>
@jerryzh168 jerryzh168 force-pushed the fix-ao-per-module-config-name branch from 4fab075 to 0d2f4cb Compare May 30, 2025 03:47
@mergify mergify bot added the ci/build label May 30, 2025
@DarkLight1337 (Member) commented:

Can you merge from main to fix the CI failure?

# to enable proper caching this needs standalone compile
# os.environ["VLLM_TEST_STANDALONE_COMPILE"] = "1"
# logger.info("Using TorchAO: Setting VLLM_TEST_STANDALONE_COMPILE=1")
os.environ["VLLM_DISABLE_COMPILE_CACHE"] = "1"
Collaborator:

We can check the torch version, something like: if is_torch_equal_or_newer("2.8.0")
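The suggested version gate could look like the following sketch. Note that `is_torch_equal_or_newer` is reimplemented locally here for illustration (vLLM ships its own helper), and the `configure_compile_cache` function and its env-var choices are assumptions that simply mirror the commented-out lines in the diff above.

```python
def is_torch_equal_or_newer(current: str, target: str) -> bool:
    """Compare dotted version strings numerically.

    Local build suffixes (e.g. "+cu121") and any pre-release segments
    beyond the first three numeric components are ignored.
    """
    def parts(v: str):
        return tuple(int(p) for p in v.split("+")[0].split(".")[:3] if p.isdigit())
    return parts(current) >= parts(target)


def configure_compile_cache(torch_version: str) -> dict:
    """Return the env vars to set for the given torch version (hypothetical)."""
    if is_torch_equal_or_newer(torch_version, "2.8.0"):
        # Newer torch: standalone compile enables proper caching.
        return {"VLLM_TEST_STANDALONE_COMPILE": "1"}
    # Older torch: fall back to disabling the compile cache.
    return {"VLLM_DISABLE_COMPILE_CACHE": "1"}
```

The caller would apply the returned mapping to `os.environ`, so the workaround disappears automatically once the torch requirement reaches 2.8.0.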

@houseroad (Collaborator) left a comment:

We should avoid such BC-breaking changes in TorchAO :-)

@houseroad (Collaborator) commented:

Besides rebasing to main, could you also address the inline comment?

@jerryzh168 (Contributor Author) commented:

We should avoid such BC-breaking changes in TorchAO :-)

Yeah, we'll make sure not to break BC and fix the call sites first next time.

@jerryzh168 jerryzh168 force-pushed the fix-ao-per-module-config-name branch 3 times, most recently from 3440f4c to 3bded6f Compare May 30, 2025 22:02
@houseroad (Collaborator) commented:

There are some failures, could you take a look?

Signed-off-by: Jerry Zhang <[email protected]>
@jerryzh168 jerryzh168 force-pushed the fix-ao-per-module-config-name branch from ef6813d to 53d0a63 Compare June 3, 2025 05:00
@DarkLight1337 (Member) commented:

Try merging from the main branch and see if the CI failures are resolved.

mergify bot commented Jun 3, 2025

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @jerryzh168.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify mergify bot added the needs-rebase label Jun 3, 2025
Labels: ci/build, needs-rebase, quantization, ready