[Feature Request] Prompt lookup decoding #311

@matichon-vultureprime

Description

Hi TRT-LLM team,

I think prompt lookup decoding could be beneficial for summarization, context-based QA, and multi-turn chat.

This technique is similar to speculative decoding, but instead of using a draft model, it uses string matching against the prompt to generate candidate token sequences.
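To make the idea concrete, here is a minimal sketch of the candidate-generation step in plain Python. It is my own illustration, not the code from the linked repository and not a TRT-LLM API: the function name `find_candidate_tokens` and the parameters `max_ngram_size` and `num_pred_tokens` are hypothetical, and I am assuming the matching is done on token IDs by looking for an earlier occurrence of the most recent n-gram. The returned tokens are only draft candidates; as in speculative decoding, the target model would still need to verify them.

```python
from typing import List, Sequence


def find_candidate_tokens(
    token_ids: Sequence[int],
    max_ngram_size: int = 3,   # hypothetical knob: longest suffix to match
    num_pred_tokens: int = 10,  # hypothetical knob: max draft tokens to propose
) -> List[int]:
    """Propose draft tokens by matching the trailing n-gram earlier in the sequence.

    token_ids is the prompt plus everything generated so far.
    Returns an empty list when no earlier match exists.
    """
    n = len(token_ids)
    # Prefer longer n-grams first, since they are a more specific match.
    for ngram_size in range(min(max_ngram_size, n - 1), 0, -1):
        suffix = list(token_ids[n - ngram_size:])
        # Scan earlier start positions, most recent first, skipping the suffix itself.
        for start in range(n - ngram_size - 1, -1, -1):
            if list(token_ids[start:start + ngram_size]) == suffix:
                follow = start + ngram_size
                # Copy the tokens that followed the earlier occurrence.
                return list(token_ids[follow:follow + num_pred_tokens])
    return []


if __name__ == "__main__":
    # The trailing bigram (2, 3) appeared earlier, followed by 4, 5, 6.
    history = [1, 2, 3, 4, 5, 6, 7, 2, 3]
    print(find_candidate_tokens(history, max_ngram_size=3, num_pred_tokens=3))
```

Picking the most recent match is just one possible choice here; taking the first match, or the one with the longest continuation, would work as well.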

GitHub: Prompt-lookup-decoding
