[Docs] Add feature description of dist kv cache in README (#705)
Signed-off-by: Dwyane Shi <[email protected]>
DwyaneShi authored Feb 19, 2025
1 parent b939afd commit dcda623
Showing 1 changed file with 1 addition and 0 deletions.
README.md: 1 addition & 0 deletions
@@ -11,6 +11,7 @@ The initial release includes the following key features:
 - **Distributed Inference**: Scalable architecture to handle large workloads across multiple nodes.
 - **LLM App-Tailored Autoscaler**: Dynamically scale inference resources based on real-time demand.
 - **Unified AI Runtime**: A versatile sidecar enabling metric standardization, model downloading, and management.
+- **Distributed KV Cache**: Enables high-capacity, cross-engine KV reuse.
 - **GPU Hardware Failure Detection (TBD)**: Proactive detection of GPU hardware issues.
 - **Benchmark Tool (TBD)**: A tool for measuring inference performance and resource efficiency.

