[do not merge] checking reminder comment changes #65
Closed
* adding tests for vLLM V1
* convert list to tuple (V1 returns list, V0 returned tuple)
* added test timeout and forked execution
* Fixing Dockerfile.spyre to install from upstream PR 14242 spyre-workarounds
* removing failing tests and add successful ones

Signed-off-by: Yannick Schnider <[email protected]>
Co-authored-by: Dhruval Patel <[email protected]>
* ♻️ Use real logger
* 🔥 remove [SpyreWorker] prefixes

Signed-off-by: Joe Runde <[email protected]>
Signed-off-by: Joe Runde <[email protected]>
* 📝 Document kv-cache related config methods
* ✨ override the things for v1 to disable paged attn
* 🐛 remove v0 --max-num-seqs override
* 🎨 typo

Signed-off-by: Joe Runde <[email protected]>
* 🐛 fix batch handling in V1 runner
* ⚗️ try v1 test only
* ⚗️ add a bit more prompt
* ⚗️ unclear why CI won't count to 0
* ♻️ rename map_output_indices

Signed-off-by: Joe Runde <[email protected]>
* execute_model with a warm up mode
* Remove funcs and prints
* Fix
* Lints
* Another refactor to leave execute_model untouched
* this one too
* Small lint
* Removes _raw_model_forward
* Attempt a revamp
* Update prints to logger
* Lints, fixes prints
* Move function back
* Fix logging typos
* 🐛 fix batch handling in V1 runner (#33)
  * 🐛 fix batch handling in V1 runner
  * ⚗️ try v1 test only
  * ⚗️ add a bit more prompt
  * ⚗️ unclear why CI won't count to 0
  * ♻️ rename map_output_indices
  Signed-off-by: Joe Runde <[email protected]>
* Refactor forward_pass, update comments/logs for clarity

Signed-off-by: Rafael Vasquez <[email protected]>
Signed-off-by: Joe Runde <[email protected]>
Co-authored-by: Joe Runde <[email protected]>
* ✨ Support rejecting requests
* ⬆️ Bump to 0.8.0
* 🐛 fix build
* 🐛 fixup scheduler and backwards compatibility
* 📝 add note about backwards compatibility
* :construction-worker: stop matching on the roberta-v1 model name for v1 tests
* 🔥 more fixes for 0.8.0, remove backwards compatibility

Signed-off-by: Joe Runde <[email protected]>
* point to repo on vllm-project
* ⬆️ Bump to 0.8.0

Signed-off-by: Yannick Schnider <[email protected]>
Co-authored-by: Joe Runde <[email protected]>
* fix: track sampling params correctly in the worker
* fix: CI errors
* fix: test_spyre_input_batch wrong import; fix: linting
* style: clean up code
* style: removed lora from input batch
* style: linting
* style: removed more stuff
* fix: tests
* refactor: cleanup spyre_input_batch
* wip
* refactoring of model runner
* fix allowed token ids
* fix: clear of input batch
* Fixes on spyre_input_batch; added some documentation; restored test
* updated code for v0.8.0
* fix linting
* updated test-spyre.yml action

Signed-off-by: Wallas Santos <[email protected]>
* Store warmup shapes in config
  Signed-off-by: Thomas Parnell <[email protected]>
* fix formatting
  Signed-off-by: Yannick Schnider <[email protected]>
* apply fix to V1 classes
  Signed-off-by: Yannick Schnider <[email protected]>
* ✨ always get warmup shapes from env
  Signed-off-by: Joe Runde <[email protected]>
* 🐛 fixup minor merge issues
  Signed-off-by: Joe Runde <[email protected]>
* 🔥 remove forking requirement
  Signed-off-by: Joe Runde <[email protected]>

Signed-off-by: Thomas Parnell <[email protected]>
Signed-off-by: Yannick Schnider <[email protected]>
Signed-off-by: Joe Runde <[email protected]>
Co-authored-by: Yannick Schnider <[email protected]>
Co-authored-by: Joe Runde <[email protected]>
* Fix imports for schedule repackaging
* fix: missing reference for typing
* improvements on typing

Signed-off-by: Wallas Santos <[email protected]>
* Add spyre-TP online test and server util
* Lint imports
* Remove TP arg and call from test instead

Signed-off-by: Rafael Vasquez <[email protected]>
* ⚗️ Add tests for online server
* 🧪 add negative test cases for online serving
* ♻️ use new test util
* 🔥 remove finish reason from test

Signed-off-by: Joe Runde <[email protected]>
* Enable v1 in test_spyre_max_prompt_length
* fix yapf error

Signed-off-by: Gabriel Marinho <[email protected]>
Co-authored-by: Joe Runde <[email protected]>
* fixing import for gptq functionalities
* fix formatting
* warning gptq not working on CPU
* removing unused aiu-fms package
* format fixing
* adding fms-mo as dependency
* dont specify version
* removing fms-mo in requirements

Signed-off-by: Yannick Schnider <[email protected]>
👋 Hi! Thank you for contributing to vLLM support on Spyre.
Now you are good to go 🚀
wellp, that didn't work lol
No description provided.