Use Model/LoRA name in KV cache matching #381

@elevran

Description

When matching prefixes based on KV events, the EPP should include the model (or LoRA) name when looking up matching blocks.
The kv-cache-manager therefore needs to know which model/LoRA each block ID belongs to (is this provided by vLLM in the KV event?).

cc: @nilig
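
To make the request concrete, here is a minimal sketch of an index keyed by the model/LoRA name together with the block hash, so that identical token prefixes cached under different models never match each other. Everything in it is hypothetical and for illustration only: the `KVBlockIndex` class, its method names, the pod names, and the assumption that each KV event carries a model name are not taken from the kv-cache-manager or vLLM code.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Optional, Set, Tuple

# Hypothetical key: (model or LoRA name, block hash).
BlockKey = Tuple[str, int]


class KVBlockIndex:
    """Illustrative index from (model name, block hash) to the pods caching that block."""

    def __init__(self) -> None:
        self._pods_by_block: Dict[BlockKey, Set[str]] = defaultdict(set)

    def on_blocks_stored(self, pod: str, model: str, block_hashes: Iterable[int]) -> None:
        # Assumes the KV event tells us which model/LoRA produced the blocks.
        for block_hash in block_hashes:
            self._pods_by_block[(model, block_hash)].add(pod)

    def on_blocks_removed(self, pod: str, model: str, block_hashes: Iterable[int]) -> None:
        for block_hash in block_hashes:
            key = (model, block_hash)
            pods = self._pods_by_block.get(key)
            if pods is None:
                continue
            pods.discard(pod)
            if not pods:
                del self._pods_by_block[key]

    def longest_prefix(self, model: str, block_hashes: List[int]) -> Dict[str, int]:
        """Return, per pod, how many leading blocks of the request are cached for `model`."""
        scores: Dict[str, int] = {}
        candidates: Optional[Set[str]] = None
        for depth, block_hash in enumerate(block_hashes, start=1):
            holders = self._pods_by_block.get((model, block_hash), set())
            # A pod only counts while it holds every block up to this depth.
            candidates = set(holders) if candidates is None else candidates & holders
            if not candidates:
                break
            for pod in candidates:
                scores[pod] = depth
        return scores
```

With this shape, a lookup for one model ignores blocks stored under another model or LoRA adapter even when the block hashes are identical, which is the behavior this issue asks for.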

Labels: needs-triage

Status: Ready