Integrate Automated QDQ placement tool - Part 3 #703

willg-nv · 2025-12-17T06:56:58Z

What does this PR do?

Type of change: new feature

Overview: This PR integrates the automated QDQ placement tool into ModelOpt. It is part 3 of 4 and contains the following changes:

Implements QDQAutotuner and the Autotuner CLI.
Implements Benchmark to measure the end-to-end (E2E) time of QDQ models.
Adds unit tests for QDQAutotuner and Config.

Part 1: #701
Part 2: #702
Part 3: #703
Part 4: #704

Usage

python -m modelopt.onnx.quantization.autotune --model model.onnx
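
For scripted runs, the same command can be assembled from Python. This is a minimal sketch, not part of the PR: build_autotune_command is a hypothetical helper, and the only flags it uses are --model (from the usage line above) and --output (from the cli.py diff reviewed in this PR, which the reviewer suggests renaming, so it may change).

```python
import sys
from pathlib import Path


def build_autotune_command(model_path, output_dir=None):
    """Build the argv list for the autotuner CLI (hypothetical helper).

    Only flags that appear in this PR are used: --model and --output.
    """
    cmd = [
        sys.executable,
        "-m",
        "modelopt.onnx.quantization.autotune",
        "--model",
        str(Path(model_path)),
    ]
    # --output is optional; when omitted, the CLI falls back to its own default.
    if output_dir is not None:
        cmd += ["--output", str(output_dir)]
    return cmd
```

The resulting list can be passed to subprocess.run(cmd, check=True) so that a non-zero exit code raises an exception.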

Testing

Implemented unit tests for QDQAutotuner and Config classes.

Before your PR is "Ready for review"

Make sure you read and follow Contributor guidelines and your commits are signed.
Is this change backward compatible?: Yes
Did you write any new necessary tests?: Yes
Did you add or update any necessary documentation?: No, documentation will be added in Part 4.
Did you update Changelog?: No, the changelog will be updated in Part 4.

Additional Information

copy-pr-bot · 2025-12-17T06:57:01Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

willg-nv · 2025-12-22T03:24:50Z

@vishalpandya1990 could you help me review this PR? thanks!

vishalpandya1990 · 2025-12-22T06:34:13Z

@vishalpandya1990 could you help me review this PR? thanks!

Sorry for the delay. Added Ajinkya for review.

modelopt/onnx/quantization/autotune/__init__.py

gcunhase · 2026-01-08T20:45:18Z

modelopt/onnx/quantization/autotune/cli.py

+        "--output",
+        "-o",
+        type=str,
+        default=DEFAULT_OUTPUT_DIR,


Can we update this behavior to match the ONNX quantization and Autocast workflow?

In those workflows, if output_path is not given, the resulting model is saved next to the input model with an extra suffix in its name. For example, the quantized model.onnx is saved as model.quant.onnx, and the converted model is saved as model.fp16.onnx.

See more details in:

Model-Optimizer/modelopt/onnx/quantization/quantize.py

Line 391 in 6f18490

if not output_path:

Suggestion: rename this as output_path to match other ONNX workflows.
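
The fallback described in this comment can be sketched as follows. This is an illustration under stated assumptions, not the code in quantize.py: default_output_path and resolve_output_path are hypothetical names, and the suffix the autotuner would use is not specified in this thread, so it is left as a parameter.

```python
from pathlib import Path


def default_output_path(model_path, suffix="quant"):
    """Derive an output path next to the input model (hypothetical helper).

    The suffix is inserted before the file extension, so model.onnx
    becomes model.quant.onnx.
    """
    p = Path(model_path)
    return str(p.with_name(f"{p.stem}.{suffix}{p.suffix}"))


def resolve_output_path(model_path, output_path=None, suffix="quant"):
    """Use the caller's output_path when given, else derive one from the input."""
    if not output_path:
        return default_output_path(model_path, suffix)
    return output_path
```

With this shape, the CLI argument could default to None instead of a fixed directory, and the derived path would only be computed when the user does not pass one.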

FYI: I plan to create another PR that integrates the autotuner as a sub-command of modelopt.onnx.quantization. Users could then either 1) run the autotuner directly or 2) autotune starting from a PTQ model. After that, I think cli.py could be removed.

modelopt/onnx/quantization/autotune/cli.py

gcunhase · 2026-01-08T20:54:25Z

tests/unit/onnx/quantization/autotune/test_autotuner.py

+from modelopt.onnx.quantization.autotune.common import PatternCache, RegionType
+
+
+def create_simple_conv_model():


Can we move this to tests/_test_utils/onnx/lib_test_models.py? Alternatively, could NonSimplifiedModel or build_resnet_block be used here instead?

@ajrasane WDYT?

Signed-off-by: Will Guo <[email protected]>

willg-nv requested a review from a team as a code owner December 17, 2025 06:56

willg-nv requested a review from vishalpandya1990 December 17, 2025 06:56

This was referenced Dec 17, 2025

Integrate Automated QDQ placement tool - Part 4 #704

Open

Integrate Automated QDQ placement tool - Part 2 #702

Open

Integrate Automated QDQ placement tool - Part 1 #701

Open

vishalpandya1990 requested a review from ajrasane December 22, 2025 06:31

willg-nv force-pushed the dev-willg-integrate-auto-qdq-placement-part3 branch from 3454bba to 4b9d789 Compare December 31, 2025 02:09

gcunhase reviewed Jan 8, 2026

modelopt/onnx/quantization/autotune/__init__.py (outdated)

gcunhase reviewed Jan 8, 2026

modelopt/onnx/quantization/autotune/cli.py (outdated)

gcunhase reviewed Jan 8, 2026

modelopt/onnx/quantization/autotune/cli.py (outdated)

gcunhase reviewed Jan 8, 2026


willg-nv force-pushed the dev-willg-integrate-auto-qdq-placement-part3 branch from 4b9d789 to 20ae533 Compare January 12, 2026 02:32

Integrate Automated QDQ placement tool - part 3

99d3c0d

Signed-off-by: Will Guo <[email protected]>

willg-nv force-pushed the dev-willg-integrate-auto-qdq-placement-part3 branch from 20ae533 to 99d3c0d Compare January 12, 2026 03:11
