
Merging the DeepSeek-V3 tokenizer into the distilled models for speculative decoding in llama.cpp? #513

Open
BarfingLemurs opened this issue Feb 12, 2025 · 0 comments

Hi @cg123, would there be a way to swap the tokenizers of the distilled DeepSeek models for the original DeepSeek-V3 tokenizer, to make them somewhat compatible with speculative decoding in llama.cpp?

Another case: could DeepSeek-VL2, which shares DeepSeek-V3's vocabulary, be converted to a text-only model for speculative use?

Until llama.cpp supports MTP (multi-token prediction) inference, this would be very useful for that model!
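For context, a tokenizer swap like the one asked about would at minimum require remapping the draft model's input (and output) embedding rows from its old vocabulary onto the target vocabulary, keeping learned rows for shared tokens. The following is only a minimal sketch of that remapping step on toy vocabularies, not mergekit's or llama.cpp's actual procedure; the function name and vocab dicts are hypothetical, and real tokenizers with different BPE merges would still tokenize text differently, so this alone gives only approximate compatibility.

```python
import numpy as np

def remap_embeddings(old_vocab, new_vocab, old_emb):
    """Build an embedding matrix indexed by new_vocab from old_emb.

    Tokens present in both vocabularies keep their learned row;
    tokens only in new_vocab fall back to the mean embedding.
    (Hypothetical helper for illustration only.)
    """
    mean_row = old_emb.mean(axis=0)
    new_emb = np.tile(mean_row, (len(new_vocab), 1))
    for tok, new_id in new_vocab.items():
        old_id = old_vocab.get(tok)
        if old_id is not None:
            new_emb[new_id] = old_emb[old_id]
    return new_emb

# Toy example: a draft model's vocab vs. a target (DeepSeek-V3-like) vocab.
old_vocab = {"hello": 0, "world": 1, "!": 2}
new_vocab = {"world": 0, "hello": 1, "?": 2, "!": 3}
old_emb = np.arange(12, dtype=np.float32).reshape(3, 4)

new_emb = remap_embeddings(old_vocab, new_vocab, old_emb)
print(new_emb.shape)  # (4, 4): one row per target-vocab token
```

The same remapping would have to be applied to the `lm_head` projection as well, so the draft model's output logits line up token-for-token with the target model's vocabulary.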
