Pull requests: sgl-project/sglang
Hierarchical Caching for SGLang
enhancement
New feature or request
#2693
opened Jan 1, 2025 by
xiezhq-hermann
3 tasks
Online serving benchmarks [multiturn chat, shared prefix] to multi-tier KV caching
#2665
opened Dec 30, 2024 by
PanJason
5 tasks
CUDA-graph-compatible releasing and resuming KV cache and model weight memory
#2630
opened Dec 28, 2024 by
fzyzcjy
3 tasks done
[Feature] support TransformerEngine to enable communication overlap
#2627
opened Dec 28, 2024 by
Zhuohao-Li
•
Draft
2 of 6 tasks
[Feature, Hardware] Enable DeepseekV3 on AMD GPUs
amd
bug
Something isn't working
high priority
#2601
opened Dec 26, 2024 by
BruceXcluding
•
Draft
9 tasks
Refactor Scheduler to improve code organization
#2593
opened Dec 26, 2024 by
libratiger
3 tasks done
[Feature] Streaming API for Function Calling
#2576
opened Dec 25, 2024 by
HaoyuWang4188
•
Draft
3 tasks
[Docs] add quantization docs
dependencies
Pull requests that update a dependency file
#2572
opened Dec 25, 2024 by
JamesSand
3 tasks done
Refactor SchedulePolicy to improve code organization
#2571
opened Dec 25, 2024 by
libratiger
3 tasks done
feat: support 2 kernels for mixed chunked prefill
#2546
opened Dec 22, 2024 by
chosen-ox
2 tasks
Enable Nvidia's ModelOpt fp8 quantized models
high priority
quant
LLM Quantization
#2535
opened Dec 21, 2024 by
Edwardf0t1
1 of 3 tasks
Add generator-style run_batch function
await-response
#2513
opened Dec 18, 2024 by
xingyaoww
adapt custom allreduce for tensorrt llm
high priority
#2511
opened Dec 18, 2024 by
yizhang2077
3 tasks
improve performance by removing use_tensor_core dependency
await-response
#2496
opened Dec 17, 2024 by
bjmsong
3 tasks
[Experimental] Add a gRPC server for completion request
high priority
#2478
opened Dec 13, 2024 by
MrAta
2 of 3 tasks
Add InfiniteBench for long context benchmarking
high priority
#2421
opened Dec 9, 2024 by
iankur
2 of 3 tasks
[Feature] Add sampler custom logits processor
#2396
opened Dec 8, 2024 by
hongpeng-guo
2 of 3 tasks