Download model from Object Storage #69
Thanks for helping us organize the feature requests! Will work on this soon!
Looking forward to this feature too. I am currently using RunAI Model Streamer to load models directly from S3.
Hi from the RunAI team! Happy to confirm that you can load models directly from object storage in the production stack by adding the necessary flags and credentials to your configuration file. Using the following configuration file, we deployed vLLM with the RunAI Model Streamer reading the model from S3.
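The configuration file attached to the comment above is not preserved in this thread. As a rough sketch of what such a file can look like (all bucket, model, and secret names below are placeholder assumptions, and the field layout follows the general shape of a vLLM production-stack Helm values file rather than the exact file from the comment):

```yaml
# Hypothetical sketch only -- check the production-stack chart's documentation
# for the exact schema. Bucket, model, and secret names are placeholders.
servingEngineSpec:
  modelSpec:
    - name: "llama3"
      modelURL: "s3://example-bucket/llama-3-8b/"        # weights in object storage
      vllmConfig:
        extraArgs: ["--load-format", "runai_streamer"]   # stream weights instead of downloading first
      env:
        - name: AWS_ACCESS_KEY_ID
          valueFrom:
            secretKeyRef: {name: s3-credentials, key: access-key}
        - name: AWS_SECRET_ACCESS_KEY
          valueFrom:
            secretKeyRef: {name: s3-credentials, key: secret-key}
```

The key piece is `--load-format runai_streamer`, which is the vLLM flag that switches weight loading to the streamer; the credentials are supplied as standard AWS environment variables.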
The RunAI streamer is an open-source project integrated into vLLM. The streamer provides direct, fast streaming of model weights from Safetensors files (from either a file system or object storage), saturating the storage bandwidth with parallel reads. Benchmarks can be found here.
@noa-neria Thank you. I have tested this method, but the RunAI Streamer has some limitations when loading from S3. For example, it can't load models that need …
@xqe2011 We appreciate your feedback! … You have also mentioned the performance issue when the model is distributed across several devices and nodes.
@noa-neria Well, I know why... We are using the fs backend of MinIO, so the …
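For self-hosted, S3-compatible stores like the MinIO deployment mentioned above, the streamer is pointed at a custom endpoint through environment variables rather than the default AWS one. A minimal sketch, where the endpoint URL, credentials, and bucket path are all placeholder assumptions:

```shell
# Placeholder endpoint and credentials for a self-hosted MinIO deployment
export AWS_ENDPOINT_URL="http://minio.example.internal:9000"
export AWS_ACCESS_KEY_ID="minio-access-key"
export AWS_SECRET_ACCESS_KEY="minio-secret-key"

# Stream the model straight from the bucket instead of downloading it first
vllm serve s3://models/llama-3-8b --load-format runai_streamer
```

Since MinIO speaks the S3 API, the same `s3://` URL scheme and credential variables apply; only the endpoint override differs from the plain-AWS setup.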
That’s awesome! It not only supports object storage but also speeds things up. Maybe we should add some info about it in the tutorial.
Need a solution to download a model from AWS storage, Azure Blob Storage, or MinIO.
Creating this issue as requested here - #67 (comment)