
[RFC] Feedback collection about TensorRT-LLM 1.0 Release Planning and API Compatibility Commitment
#3148 · juney-nvidia opened on Mar 29, 2025
2
[RFC] Topics you want to discuss with the TensorRT-LLM team in the upcoming meet-ups
#3124 · juney-nvidia opened on Mar 27, 2025
10


Errors running sanity check test following installation guide

#5495 · stmcginnis opened on Jun 25, 2025

Compatibility with HF LLaMA Codebase

feature request

Generic Runtime

#5492 · Saeedmatt3r opened on Jun 25, 2025

[Perf] Conditionally enable SWAP AB for speculative decoding

Community want to contribute

#5403 · zoheth opened on Jun 23, 2025

Invalid request_id in MTPSpecMetadata

Speculative Decoding

#5386 · k-l-lambda opened on Jun 20, 2025

Mixtral-8x7B-Instruct awq-w4a8 output shows duplicated Chinese text

#5379 · wanzhenchn opened on Jun 20, 2025

Poor performance after FP8 Quantization for Llama 3.1 on PyTorch backend

#5370 · geaned opened on Jun 19, 2025

How to run Qwen3 using triton-server + trtllm

#5310 · ezioliao opened on Jun 18, 2025

Abnormal Performance Scaling of W4AFP8 vs FP8 on H20-141G with Deepseek-R1 Models

#5127 · Nekofish-L opened on Jun 11, 2025

Guided decoding parameters for tensorrt_llm backend must be present even if not needed

#5099 · InCogNiTo124 opened on Jun 10, 2025

Disaggregated service MPI communication failed, please help check, thanks

Disaggregated Serving

#5012 · w066650 opened on Jun 9, 2025

How is the performance of the model with PyTorch as the backend?

#4745 · oppolll opened on May 29, 2025

No module named 'tensorrt_llm.bindings'

#4458 · Shegun93 opened on May 19, 2025