[Bug] difference of kv-cache-prefixing between vLLM and sglang #1669
Closed
chenchunhui97
started this conversation in
General
Replies: 1 comment
Checklist
Describe the bug
No bug. I am just wondering about the difference in KV-cache prefixing between the vLLM implementation and the SGLang implementation.
vLLM uses a hash to store and verify cached tokens:
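For readers without the original snippet, here is a minimal sketch of hash-based prefix caching as I understand the general idea: the prompt is cut into fixed-size blocks, and each block is keyed by a hash of its tokens chained with the previous block's hash. This is not vLLM's actual code. `BLOCK_SIZE`, `block_hashes`, and `HashPrefixCache` are hypothetical names, and the block size of 4 is chosen only to keep the example small.

```python
import hashlib
from typing import Dict, List

BLOCK_SIZE = 4  # hypothetical; small so the example is easy to follow


def block_hashes(tokens: List[int], block_size: int = BLOCK_SIZE) -> List[bytes]:
    """Hash each full block together with the hash of the block before it."""
    hashes: List[bytes] = []
    parent = b""
    full_len = len(tokens) - len(tokens) % block_size  # ignore the partial tail block
    for start in range(0, full_len, block_size):
        block = tuple(tokens[start:start + block_size])
        digest = hashlib.sha256(parent + repr(block).encode()).digest()
        hashes.append(digest)
        parent = digest  # chaining makes the hash depend on the whole prefix
    return hashes


class HashPrefixCache:
    """Toy cache: maps a chained block hash to a (hypothetical) KV block id."""

    def __init__(self) -> None:
        self.blocks: Dict[bytes, int] = {}

    def insert(self, tokens: List[int]) -> None:
        for h in block_hashes(tokens):
            self.blocks.setdefault(h, len(self.blocks))

    def match(self, tokens: List[int]) -> int:
        """Return how many leading tokens are covered by cached blocks."""
        hits = 0
        for h in block_hashes(tokens):
            if h not in self.blocks:
                break
            hits += 1
        return hits * BLOCK_SIZE


cache = HashPrefixCache()
cache.insert(list(range(1, 11)))  # 10 tokens -> 2 full blocks cached
print(cache.match([1, 2, 3, 4, 5, 6, 7, 8, 9, 99]))  # reuse stops at a block boundary
```

In this sketch a lookup is one dictionary probe per block, and reuse is only possible in whole blocks.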

SGLang uses RadixAttention, so what is the difference? I found that SGLang is faster than vLLM. Why is SGLang's RadixAttention faster than vLLM's KV-cache prefixing?
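To make the question concrete, here is a minimal sketch of prefix matching with a radix tree over token IDs, which is the data structure RadixAttention is named after. It is not SGLang's implementation: `RadixNode`, `RadixPrefixCache`, and `common_len` are hypothetical names, and things a real system needs (references to KV memory, eviction, scheduling) are left out.

```python
from typing import Dict, List, Sequence, Tuple


def common_len(a: Sequence[int], b: Sequence[int]) -> int:
    """Length of the shared prefix of two token sequences."""
    n = 0
    while n < len(a) and n < len(b) and a[n] == b[n]:
        n += 1
    return n


class RadixNode:
    def __init__(self) -> None:
        # first token of an edge -> (all tokens on that edge, child node)
        self.children: Dict[int, Tuple[Tuple[int, ...], "RadixNode"]] = {}


class RadixPrefixCache:
    """Toy radix tree keyed by token IDs; edges hold runs of tokens."""

    def __init__(self) -> None:
        self.root = RadixNode()

    def insert(self, tokens: List[int]) -> None:
        node, i = self.root, 0
        while i < len(tokens):
            entry = node.children.get(tokens[i])
            if entry is None:
                # No edge starts with this token: add the remainder as one edge.
                node.children[tokens[i]] = (tuple(tokens[i:]), RadixNode())
                return
            edge, child = entry
            k = common_len(edge, tokens[i:])
            if k < len(edge):
                # Sequences diverge inside the edge: split it at position k.
                mid = RadixNode()
                mid.children[edge[k]] = (edge[k:], child)
                node.children[tokens[i]] = (edge[:k], mid)
                child = mid
            i += k
            node = child

    def match(self, tokens: List[int]) -> int:
        """Return the number of leading tokens already present in the tree."""
        node, matched = self.root, 0
        while matched < len(tokens):
            entry = node.children.get(tokens[matched])
            if entry is None:
                break
            edge, child = entry
            k = common_len(edge, tokens[matched:])
            matched += k
            if k < len(edge):
                break  # stopped partway along an edge
            node = child
        return matched


cache = RadixPrefixCache()
cache.insert(list(range(1, 11)))
print(cache.match([1, 2, 3, 4, 5, 6, 7, 8, 9, 99]))  # match can end at any token
```

In this sketch a match can end at any token rather than at a block boundary, and shared prefixes are stored once as tree edges. Whether that granularity accounts for the speed difference I observed is exactly what I am unsure about; scheduling and eviction policy may matter as much as the lookup structure.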
Reproduction
not available
Environment
//