Disaggregated Prefill & Decode serving optimizations #3963

Open

Enhancement

Labels

Investigating, Performance, roadmap, triaged

Disaggregated Prefill & Decode serving

[Done] MPI/UCX backend integration
[Ongoing] NIXL Integration
Performance tuning
Best practice guide

Metadata

Assignees

No one assigned

Projects

TensorRT-LLM Roadmap

Status

April 2025 - June 2025
