Open
Labels: feature request (New feature or request)
Description
Hi TRT-LLM team,
I think prompt lookup decoding could be beneficial for summarization, context QA, and multi-turn chat, where the output tends to reuse spans of the input.
The technique is similar to speculative decoding, but instead of using a draft model, it relies on string matching against the prompt to generate candidate token sequences.
GitHub: Prompt-lookup-decoding
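
To make the request concrete, here is a minimal sketch of the candidate-generation step as I understand it. This is not TRT-LLM code and not necessarily how the linked repo implements it; the function name `find_candidate_tokens` and the parameters `max_ngram_size` and `num_pred_tokens` are hypothetical, chosen only for illustration. It takes the last few tokens of the sequence, looks for an earlier occurrence of the same n-gram, and proposes the tokens that followed that occurrence as draft tokens.

```python
from typing import List


def find_candidate_tokens(
    input_ids: List[int],
    max_ngram_size: int = 3,   # hypothetical parameter: longest suffix to match
    num_pred_tokens: int = 10, # hypothetical parameter: max draft tokens to propose
) -> List[int]:
    """Propose draft tokens by matching the sequence suffix against earlier tokens."""
    n = len(input_ids)
    # Try the longest n-gram first; longer matches are more likely to be useful.
    for ngram_size in range(min(max_ngram_size, n - 1), 0, -1):
        suffix = input_ids[n - ngram_size:]
        # Scan earlier windows, most recent first. The last start index is chosen
        # so the window never coincides with the suffix itself and always has
        # at least one token after it.
        for start in range(n - ngram_size - 1, -1, -1):
            if input_ids[start:start + ngram_size] == suffix:
                cont_start = start + ngram_size
                return input_ids[cont_start:cont_start + num_pred_tokens]
    # No earlier occurrence found: fall back to normal decoding for this step.
    return []


# The suffix [2, 3] also appears at positions 1-2, so the tokens after it are proposed.
ids = [1, 2, 3, 4, 5, 6, 2, 3]
draft = find_candidate_tokens(ids, max_ngram_size=3, num_pred_tokens=3)
```

The proposed tokens would then be verified by the target model in a single forward pass, the same way draft-model tokens are verified in speculative decoding. Preferring the most recent match is a choice made for this sketch; other tie-breaking rules are possible.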
Columpio