Feat: Router observability (Current QPS, router-side queueing delay, etc) #78

sitloboi2012 · 2025-02-07T04:43:21Z

This issue is dedicated to discussing the feature:

(P1) Router observability (Current QPS, router-side queueing delay, number of pending / prefilling / decoding requests, average prefill / decoding length, etc)

sitloboi2012 · 2025-02-08T17:21:10Z

@gaocegege @ApostaC have you thought of a specific layout for the dashboard yet, or should we just dump the metrics in first and work out the layout later? I'm currently splitting the metrics into 3 main groups:

Core vLLM Metrics: Available vLLM Instances, Request Latency Distribution, Request TTFT Distribution
Operational Metrics: Number of Running Requests, GPU KV Usage Percentage, Number of Pending Requests, GPU KV Cache Hit Rate
Router Observability Metrics: Current QPS, Router-side Queueing Delay, Average Prefill Length, Number of Prefilling Requests, Number of Decoding Requests, Average Decoding Length
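
For the router-side group, here is a rough sketch of how "Current QPS", "Router-side Queueing Delay" and the pending count could be tracked over a sliding window. This is only an illustration under my own assumptions: `RouterStats`, `on_arrival`, `on_dispatch` and the 10-second window are hypothetical names and values, not something that exists in the router today.

```python
import time
from collections import deque


class RouterStats:
    """Sliding-window tracker for router-side metrics (hypothetical helper)."""

    def __init__(self, window_seconds=10.0, clock=time.monotonic):
        self.window = window_seconds
        self.clock = clock          # injectable so the tracker can be tested
        self.arrivals = deque()     # arrival timestamps inside the window
        self.delays = deque()       # (dispatch timestamp, queueing delay) pairs
        self.pending = {}           # request_id -> arrival timestamp

    def _evict(self, now):
        # Drop samples that have fallen out of the sliding window.
        cutoff = now - self.window
        while self.arrivals and self.arrivals[0] < cutoff:
            self.arrivals.popleft()
        while self.delays and self.delays[0][0] < cutoff:
            self.delays.popleft()

    def on_arrival(self, request_id):
        # Called when the router receives a request.
        now = self.clock()
        self.arrivals.append(now)
        self.pending[request_id] = now

    def on_dispatch(self, request_id):
        # Called when the router forwards the request to a serving engine.
        now = self.clock()
        arrived = self.pending.pop(request_id)
        self.delays.append((now, now - arrived))

    def qps(self):
        # Requests received per second, averaged over the window.
        self._evict(self.clock())
        return len(self.arrivals) / self.window

    def avg_queueing_delay(self):
        # Mean time (seconds) between arrival and dispatch within the window.
        self._evict(self.clock())
        if not self.delays:
            return 0.0
        return sum(delay for _, delay in self.delays) / len(self.delays)

    def num_pending(self):
        # Requests received by the router but not yet dispatched.
        return len(self.pending)
```

The clock is passed in so the numbers can be checked with a fake clock; in the router it would default to `time.monotonic`. Prefill and decoding lengths would need data from the serving engine, so they are left out of this sketch.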

YuhanLiu11 · 2025-02-09T05:46:35Z

@sitloboi2012 This looks good!

Just for your reference, below is our earlier design for this:

Overview of the system: Number of currently healthy vLLM pods, Number of requests that are processed or queuing, Average latency.
QoS information: Timeseries of average QPS, Average TTFT, Average ITL
Serving engine load: Timeseries of GPU KV cache usage, Number of running requests, Number of queuing requests, Number of swapped requests
Current resource usage: Timeseries of GPU, CPU, Memory and Disk usage
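
As a rough illustration, the four groups above could be turned into dashboard rows like this. The sketch assumes a Grafana-style dashboard JSON (rows and panels placed on a 24-column grid via `gridPos`); the dashboard tool, the dashboard title, the `build_dashboard` helper and the panel sizes are assumptions for illustration, and the `targets` are left empty because the metric names are not settled yet.

```python
import json

# Row title -> panel titles, copied from the design above.
LAYOUT = {
    "Overview of the system": [
        "Number of currently healthy vLLM pods",
        "Number of requests that are processed or queuing",
        "Average latency",
    ],
    "QoS information": ["Average QPS", "Average TTFT", "Average ITL"],
    "Serving engine load": [
        "GPU KV cache usage",
        "Number of running requests",
        "Number of queuing requests",
        "Number of swapped requests",
    ],
    "Current resource usage": ["GPU usage", "CPU usage", "Memory usage", "Disk usage"],
}


def build_dashboard(layout, panel_width=6, panel_height=8, columns=4):
    """Lay out one row per group, with up to `columns` panels side by side."""
    panels = []
    y = 0
    next_id = 1
    for row_title, panel_titles in layout.items():
        # A full-width row header for the group.
        panels.append({"id": next_id, "type": "row", "title": row_title,
                       "gridPos": {"h": 1, "w": 24, "x": 0, "y": y}})
        next_id += 1
        y += 1
        for i, title in enumerate(panel_titles):
            panels.append({
                "id": next_id,
                "type": "timeseries",
                "title": title,
                "gridPos": {"h": panel_height, "w": panel_width,
                            "x": (i % columns) * panel_width,
                            "y": y + (i // columns) * panel_height},
                "targets": [],  # queries omitted: metric names are not decided
            })
            next_id += 1
        # Advance past however many lines of panels this group needed.
        lines_used = -(-len(panel_titles) // columns)
        y += lines_used * panel_height
    return {"title": "vLLM Router Observability", "panels": panels}


if __name__ == "__main__":
    print(json.dumps(build_dashboard(LAYOUT), indent=2))
```

Keeping the grouping in one dictionary means the layout can be rearranged later without touching the code that emits the panels.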

(cc @ApostaC )

sitloboi2012 · 2025-02-09T05:52:49Z

Nice, I think yours @YuhanLiu11 makes more sense. Mine was mostly guesswork based on usage, plus some suggestions from ChatGPT 😆
I will update it based on these references. Thanks for your input, appreciate it 👍

sitloboi2012 mentioned this issue Feb 7, 2025

[Roadmap] vLLM production stack roadmap for 2025 Q1 #26

Open

15 tasks

gaocegege added the feature request New feature or request label Feb 7, 2025

gaocegege changed the title ~~Feat: Router observability~~ Feat: Router observability (Current QPS, router-side queueing delay, etc) Feb 7, 2025

ApostaC assigned sitloboi2012 Feb 7, 2025

sitloboi2012 mentioned this issue Feb 8, 2025

Feat: Router observability (Current QPS, router-side queueing delay, etc) - WIP #90

Closed

sitloboi2012 mentioned this issue Feb 12, 2025

Feat: Router observability (Current QPS, router-side queueing delay, etc) Part 1 #119

Merged
