Run small models with vLLM CPU mode for local development testing #417
Labels: good first issue, help wanted, kind/documentation
🚀 Feature Description and Motivation
We already have a mocked app for most feature integration tests, but it is still inconvenient in some cases. We should check whether a small model such as opt-125m can run on CPU-only vLLM for local development testing.
Use Case
No response
Proposed Solution
No response
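A rough sketch of what this could look like. The build steps follow vLLM's CPU-backend install path (`VLLM_TARGET_DEVICE=cpu`), but exact flags and requirement-file names vary by release, so treat this as an assumption to verify rather than a working recipe:

```shell
# Sketch only: build vLLM's CPU backend from source and smoke-test a tiny model.
# VLLM_TARGET_DEVICE=cpu selects the CPU build path; details may differ per release.
git clone https://github.com/vllm-project/vllm.git
cd vllm
pip install -r requirements-cpu.txt          # CPU-specific deps (filename assumed)
VLLM_TARGET_DEVICE=cpu pip install -e . --no-build-isolation

# Quick smoke test with a ~125M-parameter model that fits easily in RAM.
python - <<'EOF'
from vllm import LLM, SamplingParams

llm = LLM(model="facebook/opt-125m")  # small model, CPU-friendly
out = llm.generate(["Hello"], SamplingParams(max_tokens=8))
print(out[0].outputs[0].text)
EOF
```

If this works, the integration tests could swap the mocked app for a real (if tiny) model in CI jobs that have no GPU.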