Issues: NVIDIA/TensorRT-Model-Optimizer

Which instructions should I follow to quantize my model from bf16 to NVFP4?
#164 · opened Mar 28, 2025 by ghostplant
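
A minimal sketch of the usual answer for this kind of question: ModelOpt's post-training quantization flow via `mtq.quantize()` with a calibration loop. The config name `NVFP4_DEFAULT_CFG`, the checkpoint name, and the calibration prompts are assumptions for illustration, not taken from the issue thread.

```python
# Hypothetical NVFP4 PTQ sketch; checkpoint name and prompts are placeholders.
import torch
import modelopt.torch.quantization as mtq
from transformers import AutoModelForCausalLM, AutoTokenizer

ckpt = "meta-llama/Llama-2-7b-hf"  # any HF causal-LM checkpoint (placeholder)
model = AutoModelForCausalLM.from_pretrained(
    ckpt, torch_dtype=torch.bfloat16, device_map="cuda"
)
tokenizer = AutoTokenizer.from_pretrained(ckpt)

def forward_loop(model):
    # Run a few representative prompts through the model so the quantizer
    # can collect activation statistics (calibration).
    prompts = ["Hello, world!", "Quantization reduces memory footprint."]
    with torch.no_grad():
        for p in prompts:
            inputs = tokenizer(p, return_tensors="pt").to(model.device)
            model(**inputs)

# Quantize the bf16 model to NVFP4 with post-training quantization.
# NVFP4_DEFAULT_CFG is assumed to be available in recent ModelOpt releases.
model = mtq.quantize(model, mtq.NVFP4_DEFAULT_CFG, forward_loop)
```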

Large model offloaded with Hugging Face Accelerate cannot export its weights using unified export. [bug]
#157 · opened Mar 18, 2025 by michaelfeil
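
For context, a sketch of the unified export path this issue refers to, assuming `export_hf_checkpoint` from `modelopt.torch.export` is the entry point; the reported failure is that it breaks when model weights are offloaded (not resident on GPU) via Accelerate.

```python
# Hedged sketch of unified export; `model` is a quantized model as in the
# sketch above. Offloaded (CPU/disk) weights are what this issue reports
# as failing to export.
from modelopt.torch.export import export_hf_checkpoint

export_hf_checkpoint(model, export_dir="./quantized_ckpt")  # path is a placeholder
```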

INT4 quantization output ONNX does not load [bug]
#156 · opened Mar 13, 2025 by thejaswi01

PyTorch quantization fails to quantize scaled dot product
#149 · opened Mar 7, 2025 by YixuanSeanZhou

sm_100 not defined for option 'gpu-name' when running calibration on DeepSeek
#144 · opened Mar 4, 2025 by imenselmi

Restore functionality: lm_head option to disable quantization
#138 · opened Feb 20, 2025 by michaelfeil
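
The functionality being requested is conventionally expressed in ModelOpt by mapping a module-name wildcard to a disabled quantizer in the config; a sketch is below. The pattern string `"*lm_head*"` assumes the model names its output head `lm_head`, and the base config is the assumed `NVFP4_DEFAULT_CFG` from the earlier sketch.

```python
# Sketch: keep lm_head unquantized by disabling its quantizers in the config.
import copy
import modelopt.torch.quantization as mtq

cfg = copy.deepcopy(mtq.NVFP4_DEFAULT_CFG)
cfg["quant_cfg"]["*lm_head*"] = {"enable": False}  # leave the head in bf16

# `model` and `forward_loop` as in the NVFP4 sketch above.
model = mtq.quantize(model, cfg, forward_loop)
```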

More modes for ModelOpt quantization than halving the batch size
#133 · opened Feb 18, 2025 by michaelfeil