
Merging the DeepSeek-V3 tokenizer into the distilled models for speculative decoding in llama.cpp? #513

Open
BarfingLemurs opened this issue Feb 12, 2025 · 0 comments

Hi @cg123, would there be a way to swap the tokenizers of the distilled DeepSeek models for the original DeepSeek-V3 tokenizer, to make them somewhat compatible with speculative decoding in llama.cpp?

Another case: could DeepSeek-VL2, which shares DeepSeek-V3's vocabulary, be converted to a text-only model for speculative use?

Until llama.cpp supports MTP (multi-token prediction) inference, this would be very useful for that model!
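For context, a tokenizer swap like the one asked about would at minimum require remapping the draft model's input (and output) embedding rows from its old vocabulary onto the target vocabulary, keeping learned rows for shared tokens. The following is only a minimal sketch of that remapping step on toy vocabularies, not mergekit's or llama.cpp's actual procedure; the function name and vocab dicts are hypothetical, and real tokenizers with different BPE merges would still tokenize text differently, so this alone gives only approximate compatibility.

```python
import numpy as np

def remap_embeddings(old_vocab, new_vocab, old_emb):
    """Build an embedding matrix indexed by new_vocab from old_emb.

    Tokens present in both vocabularies keep their learned row;
    tokens only in new_vocab fall back to the mean embedding.
    (Hypothetical helper for illustration only.)
    """
    mean_row = old_emb.mean(axis=0)
    new_emb = np.tile(mean_row, (len(new_vocab), 1))
    for tok, new_id in new_vocab.items():
        old_id = old_vocab.get(tok)
        if old_id is not None:
            new_emb[new_id] = old_emb[old_id]
    return new_emb

# Toy example: a draft model's vocab vs. a target (DeepSeek-V3-like) vocab.
old_vocab = {"hello": 0, "world": 1, "!": 2}
new_vocab = {"world": 0, "hello": 1, "?": 2, "!": 3}
old_emb = np.arange(12, dtype=np.float32).reshape(3, 4)

new_emb = remap_embeddings(old_vocab, new_vocab, old_emb)
print(new_emb.shape)  # (4, 4): one row per target-vocab token
```

The same remapping would have to be applied to the `lm_head` projection as well, so the draft model's output logits line up token-for-token with the target model's vocabulary.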
