Run small models with vLLM CPU mode for local development testing #417
Labels: good first issue, help wanted, kind/documentation
🚀 Feature Description and Motivation
We already have a mocked app for most feature integration tests, but it is still inconvenient in some cases. We should check whether a small model such as opt-125m can run on CPU-only vLLM for local development testing.
Use Case
No response
Proposed Solution
No response
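A rough sketch of what this could look like. The build steps follow vLLM's CPU-backend install path (`VLLM_TARGET_DEVICE=cpu`), but exact flags and requirement-file names vary by release, so treat this as an assumption to verify rather than a working recipe:

```shell
# Sketch only: build vLLM's CPU backend from source and smoke-test a tiny model.
# VLLM_TARGET_DEVICE=cpu selects the CPU build path; details may differ per release.
git clone https://github.com/vllm-project/vllm.git
cd vllm
pip install -r requirements-cpu.txt          # CPU-specific deps (filename assumed)
VLLM_TARGET_DEVICE=cpu pip install -e . --no-build-isolation

# Quick smoke test with a ~125M-parameter model that fits easily in RAM.
python - <<'EOF'
from vllm import LLM, SamplingParams

llm = LLM(model="facebook/opt-125m")  # small model, CPU-friendly
out = llm.generate(["Hello"], SamplingParams(max_tokens=8))
print(out[0].outputs[0].text)
EOF
```

If this works, the integration tests could swap the mocked app for a real (if tiny) model in CI jobs that have no GPU.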