Labels: enhancement, good first issue
Description
Summary
Implement and finalize src/speculators/models/independent.py to support the initial speculative decoding algorithms that require an independent (separate) draft model as the speculator.
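For orientation, the draft-then-verify loop from the second referenced paper can be sketched in plain Python. This is a toy illustration under stated assumptions, not code from this repository: `target` and `draft` are hypothetical callables that return a probability vector for a given context, and the target is called once per position for readability even though a real implementation would score all positions in one batched pass.

```python
import random
from typing import Callable, List, Sequence

Dist = List[float]  # probability vector over a toy vocabulary


def sample(dist: Sequence[float], rng: random.Random) -> int:
    """Draw one token id from a probability vector."""
    return rng.choices(range(len(dist)), weights=dist, k=1)[0]


def speculative_step(
    target: Callable[[List[int]], Dist],  # hypothetical target-model callable
    draft: Callable[[List[int]], Dist],   # hypothetical draft-model callable
    prefix: List[int],
    gamma: int,
    rng: random.Random,
) -> List[int]:
    """Propose `gamma` draft tokens, then verify them against the target."""
    # 1. The independent draft model proposes gamma tokens autoregressively.
    proposed: List[int] = []
    q_dists: List[Dist] = []
    ctx = list(prefix)
    for _ in range(gamma):
        q = draft(ctx)
        tok = sample(q, rng)
        proposed.append(tok)
        q_dists.append(q)
        ctx.append(tok)

    # 2. Verify each proposed token against the target distribution.
    accepted: List[int] = []
    for tok, q in zip(proposed, q_dists):
        p = target(prefix + accepted)
        # Accept with probability min(1, p(x) / q(x)).
        if rng.random() < min(1.0, p[tok] / q[tok]):
            accepted.append(tok)
            continue
        # On rejection, resample from the normalized residual max(0, p - q) and stop.
        residual = [max(0.0, pi - qi) for pi, qi in zip(p, q)]
        total = sum(residual)
        fallback = [r / total for r in residual] if total > 0 else list(p)
        accepted.append(sample(fallback, rng))
        return accepted

    # 3. Every draft token was accepted: take one extra token from the target.
    accepted.append(sample(target(prefix + accepted), rng))
    return accepted
```

Each call therefore returns between 1 and `gamma + 1` new tokens, which is where the speedup over one-token-at-a-time decoding comes from.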
References
- Speculative Decoding: Exploiting Speculative Execution for Accelerating Seq2seq Generation
- Fast Inference from Transformers via Speculative Decoding
Acceptance Criteria
Classes and Test Cases
- Implement IndependentSpeculatorConfig and IndependentSpeculator following the example in src/speculators/models/eagle.py.
- Ensure compatibility with SpeculatorModelConfig.from_pretrained and SpeculatorModel.from_pretrained.
- Implement full test cases following the examples in tests/unit/models/test_eagle_config.py and tests/unit/models/test_eagle_model.py.
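One way to read the from_pretrained compatibility requirement is that loading through the base class should hand back the independent subclass. The sketch below shows that dispatch pattern using only the standard library. Everything in it is hypothetical: `BaseConfigSketch`, `IndependentConfigSketch`, and the `model_kind` key are illustrative stand-ins, and the actual registration mechanism should be taken from eagle.py rather than from this example.

```python
import json
import tempfile
from pathlib import Path
from typing import ClassVar, Dict, Type


class BaseConfigSketch:
    """Hypothetical stand-in for SpeculatorModelConfig; not the library class."""

    registry: ClassVar[Dict[str, Type["BaseConfigSketch"]]] = {}
    model_kind: ClassVar[str] = "base"  # assumed discriminator key

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Each subclass registers itself under its own kind.
        BaseConfigSketch.registry[cls.model_kind] = cls

    def __init__(self, **fields):
        self.fields = fields

    def save_pretrained(self, directory: str) -> None:
        payload = {"model_kind": self.model_kind, **self.fields}
        Path(directory, "config.json").write_text(json.dumps(payload))

    @classmethod
    def from_pretrained(cls, directory: str) -> "BaseConfigSketch":
        payload = json.loads(Path(directory, "config.json").read_text())
        # Dispatch on the stored kind so the base entry point returns the subclass.
        target = cls.registry[payload.pop("model_kind")]
        return target(**payload)


class IndependentConfigSketch(BaseConfigSketch):
    model_kind = "independent"
```

A unit test for this behavior would save an independent config, reload it through the base class, and assert on the resulting type and fields.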
IndependentSpeculatorConfig
- Include all relevant hyperparameters expected to change or be configured to construct a working Speculator model as defined in the referenced papers.
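As a starting point for discussion, the hyperparameters might look something like the dataclass below. The field names, defaults, and validation rules are all assumptions made for illustration; they are not the library's schema, and the final set should come from the referenced papers and the conventions in eagle.py.

```python
from dataclasses import asdict, dataclass


@dataclass
class IndependentConfigFields:
    """Hypothetical field set; names and defaults are illustrative only."""

    draft_model: str  # identifier or path of the separate draft model
    verifier_model: str  # identifier or path of the target model being accelerated
    speculative_tokens: int = 5  # draft tokens proposed per verification step (arbitrary default)
    proposal_method: str = "sampling"  # "greedy" or "sampling"
    temperature: float = 1.0  # sampling temperature (arbitrary default)

    def __post_init__(self) -> None:
        # Fail early on values that cannot produce a working speculator.
        if self.speculative_tokens < 1:
            raise ValueError("speculative_tokens must be at least 1")
        if self.proposal_method not in ("greedy", "sampling"):
            raise ValueError(f"unknown proposal_method: {self.proposal_method!r}")
        if self.temperature <= 0:
            raise ValueError("temperature must be positive")
```

Validating in `__post_init__` keeps bad values from being serialized and only surfacing later at load time.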
IndependentSpeculator
- Correctly create the required architecture from a given IndependentSpeculatorConfig.
- Enable loading and saving of weights.
- Integrate seamlessly with the existing system.
TokenProposal Functionality
- Implement any missing TokenProposal methods or functionality as defined in the papers, or expand current implementations as needed.
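For the deterministic side of token proposals, a relaxed verification check in the spirit of the first referenced paper could look like the sketch below: a drafted token passes if it is among the target's top-`beta` candidates and its log-probability is within `tau` of the top-1 token. The function names and signatures are hypothetical and do not reflect the existing TokenProposal interface, and the exact criterion should be checked against the paper before it is relied on.

```python
import math
from typing import Sequence


def relaxed_accept(target_probs: Sequence[float], token: int, beta: int, tau: float) -> bool:
    """Accept `token` if it is in the target's top-`beta` and within `tau` log-prob of top-1."""
    ranked = sorted(range(len(target_probs)), key=lambda i: target_probs[i], reverse=True)
    if token not in ranked[:beta] or target_probs[token] <= 0.0:
        return False
    gap = math.log(target_probs[ranked[0]]) - math.log(target_probs[token])
    return gap <= tau


def accepted_prefix_length(
    per_position_probs: Sequence[Sequence[float]],
    drafted: Sequence[int],
    beta: int = 1,
    tau: float = 0.0,
) -> int:
    """Count how many leading drafted tokens pass verification."""
    count = 0
    for probs, tok in zip(per_position_probs, drafted):
        if not relaxed_accept(probs, tok, beta, tau):
            break  # the first rejection invalidates everything after it
        count += 1
    return count
```

With the defaults `beta=1` and `tau=0.0` this reduces to exact greedy matching against the target's argmax.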
Out of Scope (Future Targets)
- Implement a functioning forward pass for SpeculatorModel compatible with training flows.
- Implement a functioning generate pass for SpeculatorModel compatible with generation flows.
- Create an Algorithm factory to handle preconfigured hyperparameters for the desired supported algorithms.
ruipeterpan