Pull requests: waybarrios/vllm-mlx

Pull requests list

Wire MLLM assistant drafters for Gemma 4 MTP
#507 opened May 6, 2026 by Thump604 Collaborator
Expose MLLM MTP draft counters
#504 opened May 6, 2026 by Thump604 Collaborator
fix: Qwen tool streaming recovery
#497 opened May 4, 2026 by kylejeske
Fix dangling think before tool calls in templates
#494 opened May 2, 2026 by Thump604 Collaborator
docs: add PR merge readiness checklist
#492 opened May 2, 2026 by Thump604 Collaborator
docs: clarify MLLM MTP guard semantics
#491 opened May 2, 2026 by Thump604 Collaborator
fix: sanitize remote media safety errors
#490 opened May 2, 2026 by Thump604 Collaborator
Expose effective MLLM MTP draft stats
#473 opened Apr 29, 2026 by Thump604 Collaborator
Fix sampling defaults and short prefix-cache reuse
#424 opened Apr 24, 2026 by Thump604 Collaborator
feat(mllm): extract audio track from video inputs
#352 opened Apr 15, 2026 by miguel-flowstate Contributor
3 of 4 tasks
fix: graceful fallback when model has no chat_template (MedGemma)
#271 opened Apr 9, 2026 by jackneil Contributor
2 of 3 tasks
feat: add --compile flag for mx.compile model optimization
#270 opened Apr 9, 2026 by jackneil Contributor
3 tasks done
Add TurboQuant KV cache compression for prefix cache (4.6x)
#233 opened Mar 29, 2026 by arozanov
9 tasks done