FR [v0.4.0?]: P/D Disaggregation related metrics #386

@JeffLuoo

What would you like to be added:

Provide a high-level metric:

  • The number of requests that are routed to P/D under the configured threshold.
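
To make the request concrete, here is a rough sketch of what such a counter could look like. This is not the plugin's actual code: the class name `PDRoutingMetrics`, the `pd` / `decode_only` labels, and the assumption that the threshold is compared against prompt length in tokens are all hypothetical, chosen only for illustration.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class PDRoutingMetrics:
    """Counts routing decisions, split by whether P/D was used (hypothetical helper)."""

    # Assumption: the threshold is a minimum prompt length in tokens.
    threshold: int
    decisions: Counter = field(default_factory=Counter)

    def record(self, prompt_tokens: int) -> str:
        """Classify one request against the threshold and count the decision."""
        decision = "pd" if prompt_tokens >= self.threshold else "decode_only"
        self.decisions[decision] += 1
        return decision

    def pd_fraction(self) -> float:
        """Share of recorded requests that were routed to P/D."""
        total = sum(self.decisions.values())
        return self.decisions["pd"] / total if total else 0.0


metrics = PDRoutingMetrics(threshold=100)
for prompt_tokens in (20, 150, 400, 80):
    metrics.record(prompt_tokens)

print(dict(metrics.decisions), metrics.pd_fraction())
```

Exposing both the raw counts and the fraction would let an operator see how a change to the threshold shifts the share of traffic that goes through P/D.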

Why is this needed:

  • Allow users to tune the threshold in the plugin configuration: given the hardware resources currently allocated to prefill and decode workers, they can decide how many requests they are willing to route through P/D.

Need more metrics?

Please feel free to drop a comment if you think other metrics would be useful.

Metadata

Labels: needs-triage
Status: Backlog