joerunde
Collaborator

No description provided.

yannicks1 and others added 25 commits March 17, 2025 11:39
* adding tests for vLLM V1

Signed-off-by: Yannick Schnider <[email protected]>

* convert list to tuple (V1 returns list, V0 returned tuple)

Signed-off-by: Yannick Schnider <[email protected]>

* added test timeout and forked execution

Signed-off-by: Yannick Schnider <[email protected]>

* Fixing Dockerfile.spyre to install from upstream PR 14242 spyre-workarounds


Signed-off-by: Yannick Schnider <[email protected]>

* removing failing tests and add successful ones

Signed-off-by: Yannick Schnider <[email protected]>

---------

Signed-off-by: Yannick Schnider <[email protected]>
Signed-off-by: Yannick Schnider <[email protected]>
Co-authored-by: Dhruval Patel <[email protected]>
* ♻️ Use real logger

Signed-off-by: Joe Runde <[email protected]>

* 🔥 remove [SpyreWorker] prefixes

Signed-off-by: Joe Runde <[email protected]>

---------

Signed-off-by: Joe Runde <[email protected]>
* 📝 Document kv-cache related config methods

Signed-off-by: Joe Runde <[email protected]>

* ✨ override the things for v1 to disable paged attn

Signed-off-by: Joe Runde <[email protected]>

* 🐛 remove v0 --max-num-seqs override

Signed-off-by: Joe Runde <[email protected]>

* 🎨 typo

Signed-off-by: Joe Runde <[email protected]>

---------

Signed-off-by: Joe Runde <[email protected]>
* 🐛 fix batch handling in V1 runner

Signed-off-by: Joe Runde <[email protected]>

* ⚗️ try v1 test only

Signed-off-by: Joe Runde <[email protected]>

* ⚗️ add a bit more prompt

Signed-off-by: Joe Runde <[email protected]>

* ⚗️ unclear why CI won't count to 0

Signed-off-by: Joe Runde <[email protected]>

* ♻️ rename map_output_indices

Signed-off-by: Joe Runde <[email protected]>

---------

Signed-off-by: Joe Runde <[email protected]>
* execute_model with a warm up mode

Signed-off-by: Rafael Vasquez <[email protected]>

* Remove funcs and prints

Signed-off-by: Rafael Vasquez <[email protected]>

* Fix

Signed-off-by: Rafael Vasquez <[email protected]>

* Lints

Signed-off-by: Rafael Vasquez <[email protected]>

* Another refactor to leave execute_model untouched

Signed-off-by: Rafael Vasquez <[email protected]>

* this one too

Signed-off-by: Rafael Vasquez <[email protected]>

* Small lint

Signed-off-by: Rafael Vasquez <[email protected]>

* Removes _raw_model_forward

Signed-off-by: Rafael Vasquez <[email protected]>

* Attempt a revamp

Signed-off-by: Rafael Vasquez <[email protected]>

* Update prints to logger

Signed-off-by: Rafael Vasquez <[email protected]>

* Lints, fixes prints

Signed-off-by: Rafael Vasquez <[email protected]>

* Move function back

Signed-off-by: Rafael Vasquez <[email protected]>

* Fix logging typos

Signed-off-by: Rafael Vasquez <[email protected]>

* 🐛 fix batch handling in V1 runner (#33)

* 🐛 fix batch handling in V1 runner

Signed-off-by: Joe Runde <[email protected]>

* ⚗️ try v1 test only

Signed-off-by: Joe Runde <[email protected]>

* ⚗️ add a bit more prompt

Signed-off-by: Joe Runde <[email protected]>

* ⚗️ unclear why CI won't count to 0

Signed-off-by: Joe Runde <[email protected]>

* ♻️ rename map_output_indices

Signed-off-by: Joe Runde <[email protected]>

---------

Signed-off-by: Joe Runde <[email protected]>

* Refactor forward_pass, update comments/logs for clarity

Signed-off-by: Rafael Vasquez <[email protected]>

---------

Signed-off-by: Rafael Vasquez <[email protected]>
Signed-off-by: Joe Runde <[email protected]>
Co-authored-by: Joe Runde <[email protected]>
* ✨ Support rejecting requests

Signed-off-by: Joe Runde <[email protected]>

* ⬆️ Bump to 0.8.0

Signed-off-by: Joe Runde <[email protected]>

* 🐛 fix build

Signed-off-by: Joe Runde <[email protected]>

* 🐛 fixup scheduler and backwards compatibility

Signed-off-by: Joe Runde <[email protected]>

* 📝 add note about backwards compatibility

Signed-off-by: Joe Runde <[email protected]>

* 👷 stop matching on the roberta-v1 model name for v1 tests

Signed-off-by: Joe Runde <[email protected]>

* 🔥 more fixes for 0.8.0, remove backwards compatibility

Signed-off-by: Joe Runde <[email protected]>

---------

Signed-off-by: Joe Runde <[email protected]>
* point to repo on vllm-project

Signed-off-by: Yannick Schnider <[email protected]>

* ⬆️ Bump to 0.8.0

Signed-off-by: Yannick Schnider <[email protected]>

---------

Signed-off-by: Yannick Schnider <[email protected]>
Co-authored-by: Joe Runde <[email protected]>
* fix: track sampling params correctly in the worker

Signed-off-by: Wallas Santos <[email protected]>

* fix: CI errors

Signed-off-by: Wallas Santos <[email protected]>

* fix: test_spyre_input_batch wrong import
fix: linting

Signed-off-by: Wallas Santos <[email protected]>

* style: clean up code

Signed-off-by: Wallas Santos <[email protected]>

* style: removed lora from input batch

Signed-off-by: Wallas Santos <[email protected]>

* style: linting

Signed-off-by: Wallas Santos <[email protected]>

* style: removed more stuff

Signed-off-by: Wallas Santos <[email protected]>

* fix: tests

Signed-off-by: Wallas Santos <[email protected]>

* refact: cleanup spyre_input_batch

Signed-off-by: Wallas Santos <[email protected]>

* wip

Signed-off-by: Wallas Santos <[email protected]>

* refactoring of model runner

Signed-off-by: Wallas Santos <[email protected]>

* fix allowed token ids

Signed-off-by: Wallas Santos <[email protected]>

* fix: clear of input batch

Signed-off-by: Wallas Santos <[email protected]>

* Fixes on spyre_input_batch
Added some documentation
Restored test

Signed-off-by: Wallas Santos <[email protected]>

* updated code for v0.8.0

Signed-off-by: Wallas Santos <[email protected]>

* fix linting

Signed-off-by: Wallas Santos <[email protected]>

* updated test-spyre.yml action

Signed-off-by: Wallas Santos <[email protected]>

---------

Signed-off-by: Wallas Santos <[email protected]>
Signed-off-by: Joe Runde <[email protected]>
* Store warmup shapes in config

Signed-off-by: Thomas Parnell <[email protected]>

* fix formatting

Signed-off-by: Yannick Schnider <[email protected]>

* apply fix to V1 classes

Signed-off-by: Yannick Schnider <[email protected]>

* ✨ always get warmup shapes from env

Signed-off-by: Joe Runde <[email protected]>

* 🐛 fixup minor merge issues

Signed-off-by: Joe Runde <[email protected]>

* 🔥 remove forking requirement

Signed-off-by: Joe Runde <[email protected]>

---------

Signed-off-by: Thomas Parnell <[email protected]>
Signed-off-by: Yannick Schnider <[email protected]>
Signed-off-by: Joe Runde <[email protected]>
Co-authored-by: Yannick Schnider <[email protected]>
Co-authored-by: Joe Runde <[email protected]>
* Fix imports for schedule repackaging

Signed-off-by: Wallas Santos <[email protected]>

* fix: missing reference for typing

Signed-off-by: Wallas Santos <[email protected]>

* improvements on typing

Signed-off-by: Wallas Santos <[email protected]>

---------

Signed-off-by: Wallas Santos <[email protected]>
* Add spyre-TP online test and server util

Signed-off-by: Rafael Vasquez <[email protected]>

* Lint imports

Signed-off-by: Rafael Vasquez <[email protected]>

* Remove TP arg and call from test instead

Signed-off-by: Rafael Vasquez <[email protected]>

---------

Signed-off-by: Rafael Vasquez <[email protected]>
* ⚗️ Add tests for online server

Signed-off-by: Joe Runde <[email protected]>

* 🧪 add negative test cases for online serving

Signed-off-by: Joe Runde <[email protected]>

* ♻️ use new test util

Signed-off-by: Joe Runde <[email protected]>

* 🔥 remove finish reason from test

Signed-off-by: Joe Runde <[email protected]>

---------

Signed-off-by: Joe Runde <[email protected]>
* Enable v1 in test_spyre_max_prompt_length

Signed-off-by: Gabriel Marinho <[email protected]>

* fix yapf error

Signed-off-by: Gabriel Marinho <[email protected]>

---------

Signed-off-by: Gabriel Marinho <[email protected]>
Co-authored-by: Joe Runde <[email protected]>
* fixing import for gptq functionalities

Signed-off-by: Yannick Schnider <[email protected]>

* fix formatting

Signed-off-by: Yannick Schnider <[email protected]>

* warning gptq not working on CPU

Signed-off-by: Yannick Schnider <[email protected]>

* removing unused aiu-fms package

Signed-off-by: Yannick Schnider <[email protected]>

* format fixing

Signed-off-by: Yannick Schnider <[email protected]>

* adding fms-mo as dependency

Signed-off-by: Yannick Schnider <[email protected]>

* don't specify version

Signed-off-by: Yannick Schnider <[email protected]>

* removing fms-mo in requirements

Signed-off-by: Yannick Schnider <[email protected]>

---------

Signed-off-by: Yannick Schnider <[email protected]>
Signed-off-by: Joe Runde <[email protected]>
Signed-off-by: Joe Runde <[email protected]>
Signed-off-by: Joe Runde <[email protected]>

👋 Hi! Thank you for contributing to vLLM support on Spyre.
Just a reminder: Make sure that your code passes all the linting checks, otherwise your PR won't be able to be merged. To do so, first install the linting requirements, then run format.sh and commit the changes:

pip install -r requirements-lint.txt
bash format.sh

Now you are good to go 🚀

@joerunde
Collaborator Author

wellp, that didn't work lol

@joerunde joerunde closed this Mar 28, 2025

6 participants