@yannicks1
Collaborator

🐛 fix warmup

Currently, the number of blocks (and, in particular, the KV cache tensor shape) has to be consistent across warmup and inference. PR #362 introduced an inconsistency, which led to recompilation for a certain model/TP combination.
This PR fixes the above and refactors the warmup to make it more readable.
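
To illustrate why the block count has to match, here is a toy sketch in plain Python. It does not use torch or the Spyre compiler; it only assumes that compilation is specialized on the static KV cache shape. `ShapeKeyedCompiler`, `kv_cache_shape`, `count_compilations`, and all dimension values are hypothetical names and numbers chosen for illustration, not taken from the vLLM Spyre code.

```python
from typing import Callable


class ShapeKeyedCompiler:
    """Toy stand-in for a compiler that specializes on static tensor shapes."""

    def __init__(self) -> None:
        self._cache: dict[tuple[int, ...], Callable[[], str]] = {}
        self.compile_count = 0

    def run(self, shape: tuple[int, ...]) -> str:
        # A shape that has not been seen before triggers a simulated compilation.
        if shape not in self._cache:
            self.compile_count += 1
            self._cache[shape] = lambda s=shape: f"graph{s}"
        return self._cache[shape]()


def kv_cache_shape(num_blocks: int, block_size: int = 64,
                   num_kv_heads: int = 8, head_dim: int = 128) -> tuple[int, ...]:
    # Hypothetical layout; the real KV cache layout may differ.
    return (num_blocks, block_size, num_kv_heads, head_dim)


def count_compilations(warmup_blocks: int, inference_blocks: int) -> int:
    compiler = ShapeKeyedCompiler()
    compiler.run(kv_cache_shape(warmup_blocks))     # warmup pass
    compiler.run(kv_cache_shape(inference_blocks))  # first real request
    return compiler.compile_count


# Same block count in warmup and inference: the warmup graph is reused.
consistent = count_compilations(512, 512)
# Different block counts: the first real request compiles again.
inconsistent = count_compilations(4, 512)
print(consistent, inconsistent)
```

In this model, the second compilation in the mismatched case corresponds to the recompilation described above: the graph built during warmup is keyed on a shape that inference never uses.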

Signed-off-by: Yannick Schnider <[email protected]>
@github-actions

👋 Hi! Thank you for contributing to vLLM support on Spyre.
Just a reminder: Make sure that your code passes all the linting checks, otherwise your PR cannot be merged. To do so, first install the linting requirements, then run format.sh and commit the changes. This can be done with uv directly:

uv sync --frozen --group lint --active --inexact

Or this can be done with pip:

uv pip compile --group lint > requirements-lint.txt
pip install -r requirements-lint.txt
bash format.sh

Now you are good to go 🚀

@yannicks1
Collaborator Author

bot:test
export TORCH_SENDNN_CACHE_ENABLE=0
export VLLM_SPYRE_TEST_MODEL_LIST='ibm-granite/granite-3.3-8b-instruct'

# for layer in self.past_key_value_states:
#     for tensor in layer:
#         torch._dynamo.mark_dynamic(tensor, 0)
def _get_num_blocks_available(self) -> int:
Collaborator

It's not private anymore.

Suggested change
def _get_num_blocks_available(self) -> int:
def get_num_blocks_available(self) -> int:

Signed-off-by: Yannick Schnider <[email protected]>
Collaborator

@wallashss wallashss left a comment

LGTM!

@yannicks1 yannicks1 enabled auto-merge (squash) August 15, 2025 14:35
@github-actions github-actions bot added the ready label Aug 15, 2025
@yannicks1 yannicks1 merged commit 6a1e753 into main Aug 15, 2025
24 checks passed
@yannicks1 yannicks1 deleted the ysc-fix-recomp-issue branch August 15, 2025 14:38