
Run small models with vLLM CPU mode for local development testing #417

Open · Tracked by #698
Jeffwan opened this issue Nov 20, 2024 · 1 comment

Jeffwan commented Nov 20, 2024

🚀 Feature Description and Motivation

We already have a mocked app for most feature integration testing; however, this is still not convenient in some cases. We should check whether it's possible to use small models like opt-125m with CPU-only vLLM for testing.
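For reference, a minimal sketch of what such a CPU-only test could look like, assuming a vLLM installation built with the CPU backend (e.g. `VLLM_TARGET_DEVICE=cpu`) and using facebook/opt-125m purely as an example small model:

```python
# Hypothetical local smoke test: load a tiny model with vLLM on CPU and
# generate a few tokens. Assumes a vLLM build with the CPU backend enabled.
from vllm import LLM, SamplingParams

# opt-125m is small enough (a few hundred MB of weights) to load on a laptop CPU.
llm = LLM(model="facebook/opt-125m", dtype="float32")

params = SamplingParams(temperature=0.0, max_tokens=16)
outputs = llm.generate(["Hello, my name is"], params)

for out in outputs:
    print(out.outputs[0].text)
```

If this works, the same small model could also back vLLM's OpenAI-compatible server entrypoint for end-to-end tests instead of the mocked app.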

Use Case

No response

Proposed Solution

No response

Jeffwan added the kind/documentation, good first issue, and help wanted labels on Nov 20, 2024
Jeffwan added this to the v0.3.0 milestone on Nov 20, 2024
zhangjyr commented Dec 2, 2024

In the simulator (#430, #456), I used llama2-7b for CPU testing. Will this satisfy your requirement, or do we also want to support opt-125m?
