feat: add stop string matching #969
Conversation
Summary of Changes
Hello @blueswhen, I'm Gemini Code Assist[^1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request adds a significant new feature to the text generation process: the ability to stop generation based on specific string patterns, not just token IDs. It involves modifications across several core components to store, detect, and handle these stop strings, ultimately providing more granular control over the output and ensuring that the final generated text is clean of the stop sequences themselves.
Highlights
- Stop String Matching Implementation: Introduced a new mechanism to detect and handle user-defined stop strings within the generated text, allowing for more precise control over text generation termination.
- Enhanced Sampling Parameters: The `SamplingParams` object has been extended to store and manage the string representations of stop sequences, alongside their token ID counterparts, enabling string-based matching.
- Dynamic Tokenizer Integration: The inference backend now dynamically loads the tokenizer when stop string matching is enabled, allowing it to decode generated token IDs into strings for comparison against the specified stop strings.
- Configurable Feature: The entire stop string matching functionality can be enabled or disabled via the `ENABLE_STOP_STRING_MATCH` environment variable, providing flexibility for deployment and testing.
- Post-Generation Stop String Trimming: After generation, if a stop string caused the termination, the matched stop string is automatically removed from the end of the final generated text to ensure clean output.
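Taken together, the highlights describe a decode-then-match-then-trim flow. A minimal sketch of that idea (the helper below is a hypothetical illustration, not the PR's actual implementation; the real code first decodes a tail window of token IDs with the tokenizer before matching):

```python
def check_and_trim_stop_string(tail_str, stop_sequences_str):
    """Return tail_str with a matched stop string trimmed off, or None if no match.

    Simplified sketch: in the PR, tail_str would come from decoding the most
    recently generated token IDs, and stop_sequences_str from SamplingParams.
    """
    for stop_str in stop_sequences_str:
        if tail_str.endswith(stop_str):
            # Trim the stop string so the final output is clean.
            return tail_str[: -len(stop_str)]
    return None
```

For example, `check_and_trim_stop_string("Hello User:", ["User:"])` returns `"Hello "`, while an unmatched tail returns `None` and generation continues.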
Using Gemini Code Assist
The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.
Invoking Gemini
You can request assistance from Gemini at any point in your pull request by creating an issue comment (i.e. a comment on the pull request page) using either `/gemini <command>` or `@gemini-code-assist <command>`. Below is a summary of the supported commands.
| Feature | Command | Description |
|---|---|---|
| Code Review | `/gemini review` | Performs a code review for the current pull request in its current state. |
| Pull Request Summary | `/gemini summary` | Provides a summary of the current pull request in its current state. |
| Comment | `@gemini-code-assist` | Responds in comments when explicitly tagged, both in issue comments and review comments. |
| Help | `/gemini help` | Displays a list of available commands. |
Customization
To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a `.gemini/` folder in the base of the repository. Detailed instructions can be found here.
Limitations & Feedback
Gemini Code Assist is currently in preview and may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments to provide feedback.
You can also get AI-powered code generation, chat, and code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.
Footnotes

[^1]: Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.
Code Review
This pull request introduces stop string matching. I've identified a high-severity issue in how stop sequences are processed, a high-severity issue in the matching logic that might cause premature termination, and a medium-severity suggestion to improve correctness. Overall, this is a solid contribution and with these fixes, it will be even better.
```diff
 if isinstance(stop_sequences, str):
     stop_sequences = [stop_sequences]
 for group_idx in range(self.size):
-    self.groups[group_idx].initialize(groups[group_idx])
+    self.groups[group_idx].initialize(groups[group_idx], stop_sequences[group_idx])
```
There's a potential issue in this loop. If `stop_sequences` contains a list of token IDs (e.g., `[1, 2, 3]`), it will be used as the `sequence_str` argument to `StopSequence.initialize`, which expects a string. This can lead to incorrect behavior or a crash. Ensure that only strings are passed as `sequence_str`.
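One way to enforce this (a sketch under the reviewer's assumption that mixed string/token-ID inputs are possible; the helper name is hypothetical, not the PR's code):

```python
def normalize_stop_sequences(stop_sequences):
    """Normalize user-supplied stop sequences to a list of strings.

    Entries that are lists of token IDs (e.g. [1, 2, 3]) are filtered out
    here, since string matching applies only to str entries; token-ID
    sequences would be handled by the existing token-based path.
    """
    if isinstance(stop_sequences, str):
        stop_sequences = [stop_sequences]
    return [s for s in stop_sequences if isinstance(s, str)]
```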
```python
]
tail_str = tokenizer.decode(tail_token_ids, skip_special_tokens=False)
for stop_str in self.stop_sequences_str:
    if stop_str in tail_str:
```
Using `in` to check for the stop string can lead to premature termination. If the stop string is "User:" and the model generates "Assistant: Hello! User: Hi!", the generation would stop prematurely. Use `tail_str.endswith(stop_str)` to match only at the end.
```diff
-if stop_str in tail_str:
+if tail_str.endswith(stop_str):
```
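The difference is easy to demonstrate in isolation (a standalone illustration of the reviewer's point, not code from the PR):

```python
tail_str = "Assistant: Hello! User: Hi!"
stop_str = "User:"

# `in` matches anywhere in the tail, so generation would stop here
# even though the model has already continued past the stop string.
assert stop_str in tail_str

# `endswith` only matches when the stop string actually terminates the output.
assert not tail_str.endswith(stop_str)
assert "Assistant: Hello! User:".endswith(stop_str)
```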
lightllm/server/api_openai.py
Outdated
```diff
@@ -518,9 +522,22 @@ async def _collect_generation_results(generator, request: CompletionRequest, pro
     finish_reason = finish_status.get_finish_reason()
     prompt_tokens = metadata["prompt_tokens"]

+    # Handle stop sequence trimming
+    final_text = "".join(final_output)
+    if finish_reason == "stop" and hasattr(sampling_params, "stop_sequences") and sampling_params.stop_sequences:
```
The condition `hasattr(sampling_params, "stop_sequences") and sampling_params.stop_sequences` is always true. It's better to explicitly check whether any stop sequences are defined by looking at the `size` attribute.
```diff
-if finish_reason == "stop" and hasattr(sampling_params, "stop_sequences") and sampling_params.stop_sequences:
+if finish_reason == "stop" and sampling_params.stop_sequences.size > 0:
```
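The pitfall here: `hasattr` is true whenever the attribute exists, and a plain container object without `__bool__` or `__len__` is truthy even when empty, so the original guard never filters anything. A minimal illustration (the `StopSequenceGroups` class below is a hypothetical stand-in, not the PR's actual class):

```python
class StopSequenceGroups:
    """Hypothetical stand-in for the container stored on SamplingParams."""

    def __init__(self, size=0):
        self.size = size


empty = StopSequenceGroups(size=0)

# A plain object is truthy by default, so the original
# `hasattr(...) and sampling_params.stop_sequences` guard passes even when empty.
assert hasattr(empty, "size") and bool(empty)

# Checking the size attribute correctly distinguishes the empty case.
assert not empty.size > 0
assert StopSequenceGroups(size=2).size > 0
```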
Force-pushed from b93b511 to b9f3add.
Force-pushed from b9f3add to 50a1663.
No description provided.