
Conversation

@yannicks1 (Collaborator) commented Oct 10, 2025

Set new_tokens to the maximum possible value given the Spyre constraints in platform.get_max_output_tokens() for continuous batching: max_model_len - padded_prompt_len.
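For illustration, a minimal sketch of that computation, assuming prompts are padded up to a 64-token block boundary (the granularity suggested by the test below); the helper name and constants are hypothetical, not the actual vllm-spyre code:

import math

BLOCK_SIZE = 64  # assumed Spyre prompt padding granularity

def max_output_tokens(prompt_len: int, max_model_len: int) -> int:
    # The prompt is padded up to the next block boundary, so the budget
    # left for newly generated tokens is measured from the padded length.
    padded_prompt_len = math.ceil(prompt_len / BLOCK_SIZE) * BLOCK_SIZE
    return max_model_len - padded_prompt_len

# Example: with a 2048-token context and a 70-token prompt (padded to 128),
# max_output_tokens(70, 2048) returns 1920.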


👋 Hi! Thank you for contributing to vLLM support on Spyre.
Just a reminder: Make sure that your code passes all the linting checks, otherwise your PR won't be able to be merged. To do so, first install the linting requirements, then run format.sh and commit the changes. This can be done with uv directly:

uv sync --frozen --group lint --active --inexact

Or this can be done with pip:

uv pip compile --group lint > requirements-lint.txt
pip install -r requirements-lint.txt
bash format.sh

Now you are good to go 🚀

@joerunde (Collaborator) left a comment


Nice!

Can you confirm that this allows you to send a /v1/chat/completions request without setting max_tokens?
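For reference, a sketch of such a request using the OpenAI Python client with max_tokens omitted; the server URL and model name are assumptions taken from the test setup described below:

from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

# max_tokens is intentionally omitted; with this change the server should
# fill it in from platform.get_max_output_tokens().
resp = client.chat.completions.create(
    model="ibm-ai-platform/micro-g3.3-8b-instruct-1b",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(resp.choices[0].message.content)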

@tjohnson31415 (Collaborator) left a comment


I ran a few tests with this change. I used chat_template to have full control over the input tokens with the chat endpoint:

curl http://localhost:8000/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d'{
        "messages":[{"role":"user","content":""}],
        "model":"ibm-ai-platform/micro-g3.3-8b-instruct-1b",
        "chat_template": "{{ \"A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A \" }}"
    }'

Before this change, the prompt had to be 63 or 64 tokens to be accepted (it is interesting that 63 also worked 🤔)

After this change, that restriction no longer exists and I can freely send requests without worrying about the page boundaries 🚀

A potential source of confusion: this silently overrides the max_tokens value set on the request. If the request sets min_tokens and max_tokens to the same value and that value is higher than would fit, max_tokens gets overridden and the request fails with a "min_tokens must be less than or equal to max_tokens" error.
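A rough sketch of that interaction, assuming the same 64-token padding; the function and variable names here are hypothetical, not the actual vLLM validation code:

def clamp_and_validate(min_tokens: int, max_tokens: int,
                       prompt_len: int, max_model_len: int) -> int:
    padded_prompt_len = -(-prompt_len // 64) * 64   # pad prompt to a 64-token block
    fit = max_model_len - padded_prompt_len         # output tokens that actually fit
    max_tokens = min(max_tokens, fit)               # the silent override
    if min_tokens > max_tokens:
        # the error seen when the request asked for min_tokens == max_tokens
        # but that value no longer fits in the remaining context
        raise ValueError("min_tokens must be less than or equal to max_tokens")
    return max_tokens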

@joerunde (Collaborator) commented:

Nice!
Gonna go ahead and merge, because I am of course trying to slide in a little release on a Friday afternoon

@joerunde merged commit 00ec338 into main on Oct 10, 2025 (20 checks passed).
@joerunde deleted the ysc-fix-max-tokens-chat branch on October 10, 2025 at 23:13.