
[fix] Make add_special_tokens to false for completions API #4583


Open · Pernekhan wants to merge 1 commit into main from add-special-token-to-false

Conversation

@Pernekhan Pernekhan commented May 22, 2025

PR title

[fix] Change the default of add_special_tokens to false, since the completions API shouldn't be expected to modify the prompt.

Description

Change the default of add_special_tokens. The previous default caused some confusion, especially when running speculative decoding with models trained with a single BOS token: the extra BOS token corrupts the rest of the output.
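A minimal sketch of the failure mode, using a toy whitespace tokenizer (hypothetical, not the actual trtllm-serve tokenizer): when the client's prompt already contains BOS and the server tokenizes with add_special_tokens=True, the token sequence starts with two BOS tokens.

```python
BOS = "<s>"

def tokenize(text, add_special_tokens=True):
    """Toy whitespace tokenizer that mimics server-side BOS insertion."""
    tokens = text.split()
    if add_special_tokens:
        tokens = [BOS] + tokens  # server prepends BOS unconditionally
    return tokens

# A completions-API prompt is usually pre-formatted and already carries BOS:
prompt = "<s> Once upon a time"

print(tokenize(prompt, add_special_tokens=True))
# ['<s>', '<s>', 'Once', 'upon', 'a', 'time']  -- duplicated BOS
print(tokenize(prompt, add_special_tokens=False))
# ['<s>', 'Once', 'upon', 'a', 'time']         -- as the client intended
```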

@Pernekhan Pernekhan force-pushed the add-special-token-to-false branch from 74edca2 to 48307b2 Compare May 22, 2025 17:58
@jhaotingc
Collaborator

/bot run --disable-fail-fast

@tensorrt-cicd
Collaborator

PR_Github #6338 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #6338 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #4628 completed with status: 'FAILURE'

@poweiw added the labels "Community want to contribute" (PRs initiated from Community), "OpenAI API" (trtllm-serve's OpenAI-compatible API: endpoint behavior, req/resp formats, feature parity), and "triaged" (Issue has been triaged by maintainers), and removed the label "Community want to contribute" on Jun 5, 2025
@poweiw poweiw requested a review from LinPoly June 5, 2025 20:26
@LinPoly
Collaborator

LinPoly commented Jun 11, 2025

> The previous default caused some confusion, especially when running speculative decoding with models trained with a single BOS token: the extra BOS token corrupts the rest of the output.

So you have prepended a BOS token to the prompt before sending it to the server, correct?

@LinPoly LinPoly changed the title Make add_special_tokens to false for completions API [fix] Make add_special_tokens to false for completions API Jun 11, 2025
@Pernekhan
Author

> The previous default caused some confusion, especially when running speculative decoding with models trained with a single BOS token: the extra BOS token corrupts the rest of the output.

> So you have prepended a BOS token to the prompt before sending it to the server, correct?

This is how it's usually done: for the completions API, the prompt is already formatted, and most of the time it already includes the BOS token. That's why I'm proposing that the default not prepend anything.
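As a sketch of the typical client-side flow (toy template; real clients use the model's own chat/instruction template), the BOS token is baked into the prompt string before the request is ever sent:

```python
BOS = "<s>"

def build_completion_prompt(user_text: str) -> str:
    """Hypothetical client-side template; real deployments use the
    model's own template, which already emits BOS."""
    return f"{BOS}[INST] {user_text} [/INST]"

prompt = build_completion_prompt("Write a haiku about GPUs.")
# The server should tokenize this string as-is (add_special_tokens=False);
# prepending another BOS would duplicate the one the template emitted.
print(prompt)
```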

@LinPoly
Collaborator

LinPoly commented Jun 16, 2025

Thanks for the explanation, I think this is reasonable to do. Could you please also change this line to prompt_tokens = 5 to pass the CI?
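The CI expectation change can be illustrated with toy numbers (hypothetical token ids; the assumption is that the test prompt tokenizes to 5 ids without special tokens, so the old default reported one extra for the prepended BOS):

```python
BOS_ID = 1
prompt_ids = [101, 102, 103, 104, 105]  # 5 hypothetical token ids

def prompt_token_count(ids, add_special_tokens):
    """Count prompt tokens as the server would report them."""
    return len(([BOS_ID] + ids) if add_special_tokens else ids)

# Old default (add_special_tokens=True): one extra token for BOS.
print(prompt_token_count(prompt_ids, add_special_tokens=True))   # 6
# New default (add_special_tokens=False): prompt_tokens = 5.
print(prompt_token_count(prompt_ids, add_special_tokens=False))  # 5
```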

5 participants