Pull requests: alibaba/rtp-llm

Pull requests list

feat: decode entrance support mtp
#245 opened Oct 16, 2025 by SJTUGavinLiu
feat: virtual memory allocation
#242 opened Oct 15, 2025 by ZhangZhiPku
feat: optimize apply rope with cache
#241 opened Oct 15, 2025 by Bruce-Lee-LY
feat: add remote update weight api.
#240 opened Oct 15, 2025 by ZhangZhiPku
feat: support grpc client config
#237 opened Oct 15, 2025 by wanglining97
Feature/memory block cache
#233 opened Oct 15, 2025 by zhangchicc
feat: support reuse within queries
#224 opened Oct 13, 2025 by siluzhou
fix - fix mtp fake query and add check nan
#214 opened Oct 10, 2025 by zerozw
feature - adapt deepseek in model py
#207 opened Oct 10, 2025 by Nancheng-11
feat: add fastsafetensors loader
#205 opened Oct 10, 2025 by lixin010
feat: optimizations for dense models on ROCM/AMD
#204 opened Oct 10, 2025 by DorianZi
feature - support return top_k of logits
#189 opened Oct 4, 2025 by jianglan89
Add Qwen3 support for Arm
#145 opened Sep 2, 2025 by xiangze-arm
Add Deepseek V3 support for Arm
#144 opened Sep 2, 2025 by xiangze-arm
Flash attention implementation for Arm Device
#142 opened Aug 19, 2025 by xiangze-arm
add opt_125M
#103 opened Aug 15, 2024 by Nanuion