Issues: vllm-project/aibrix
Issues list
Does aibrix support load balancing against managed model endpoints?
area/gateway
triage/needs-information
Indicates an issue needs more information in order to work on it.
#784
opened Mar 3, 2025 by
Colstuwjx
Failed to run benchmark scripts against the endpoint
area/gateway
kind/bug
Something isn't working
priority/critical-urgent
Highest priority. Must be actively worked on as someone's top priority right now.
#783
opened Mar 3, 2025 by
Jeffwan
Add probe usage practice for super large models, including multi-node case
area/performance
kind/documentation
Improvements or additions to documentation
kind/enhancement
New feature or request
priority/critical-urgent
Highest priority. Must be actively worked on as someone's top priority right now.
#782
opened Mar 3, 2025 by
Jeffwan
Gateway returns an unhelpful response when the pod is running but the container is not ready
area/gateway
kind/bug
Something isn't working
kind/documentation
Improvements or additions to documentation
We still see some errors that are not explainable if httpRoute is missing
#778
opened Mar 1, 2025 by
Jeffwan
RayClusterReplicaSet didn't populate annotations to headers and workers
area/distributed
kind/bug
Something isn't working
making future benchmarks use this utils properly as well.
area/benchmark
#768
opened Feb 28, 2025 by
gangmuk
Managing common functionalities of benchmark in a separate utils dir
area/benchmark
kind/cleanup
Categorizes issue or PR as related to cleaning up code, process, or technical debt.
#767
opened Feb 28, 2025 by
gangmuk
Loading base model/lora adapters from s3 fails and throws HF error
area/lora
kind/bug
Something isn't working
#766
opened Feb 28, 2025 by
robert-moyai
Mounting s3 hosted model files using s3fs is causing startup issues
area/lora
kind/bug
Something isn't working
triage/accepted
Indicates an issue or PR is ready to be actively worked on.
#765
opened Feb 28, 2025 by
robert-moyai
Why doesn't it support Helm deployment?
area/installation
kind/feature
Categorizes issue or PR as related to a new feature.
priority/important-soon
Must be staffed and worked on either currently, or very soon, ideally in time for the next release.
#762
opened Feb 28, 2025 by
ying2025
Stateful information sync for ext-proc Instances
area/gateway
area/stability
kind/enhancement
New feature or request
priority/important-soon
Must be staffed and worked on either currently, or very soon, ideally in time for the next release.
#761
opened Feb 27, 2025 by
Jeffwan
Support high availability of gateway server for production users
area/gateway
area/stability
kind/feature
Categorizes issue or PR as related to a new feature.
#760
opened Feb 27, 2025 by
Jeffwan
Support multi-node & autoscaling & routing together for models like Deepseek-R1
area/autoscaling
area/distributed
priority/critical-urgent
Highest priority. Must be actively worked on as someone's top priority right now.
#758
opened Feb 27, 2025 by
Jeffwan
Add Deepseek R1 multi-node example
area/distributed
kind/documentation
Improvements or additions to documentation
priority/important-longterm
Important over the long term, but may not be staffed and/or may need multiple releases to complete.
#754
opened Feb 26, 2025 by
Jeffwan
Cleanup the branches
area/github
priority/important-longterm
Important over the long term, but may not be staffed and/or may need multiple releases to complete.
#747
opened Feb 26, 2025 by
kerthcet
CI test is intermittently skipped in PR
area/cicd
area/github
triage/accepted
Indicates an issue or PR is ready to be actively worked on.
#739
opened Feb 25, 2025 by
gangmuk