
Could the model trained in A100 use FlashMLA in H100 for inference? #40

Open
hdjsjyl opened this issue Feb 25, 2025 · 3 comments

@hdjsjyl

hdjsjyl commented Feb 25, 2025

Hi Author,
Thanks for the release. I'd like to ask the question in the title: I trained my model on A100 GPUs, and now I hope to use FlashMLA to speed up inference on an H100. Is that possible? If so, do I need to make any changes? Any suggestions would be appreciated, thanks!

@foreverlms

A model's structure has nothing to do with the hardware architecture or platform; training and inference can run on different GPUs. So yes, you can, as long as the model's attention architecture is compatible with MLA.

@hdjsjyl

hdjsjyl commented Feb 25, 2025

Hi @foreverlms , really appreciate your reply.

That's exactly what I want to know.
If I define a transformer-based model during training and want to use MLA during inference, what should the transformer look like? A simple code snippet would be enough to illustrate, if possible.

Thank you so much!

@foreverlms

> Hi @foreverlms , really appreciate your reply.
>
> That's exactly what I want to know. If I define a transformer-based model during training and want to use MLA during inference, what should the transformer look like? A simple code snippet would be enough to illustrate, if possible.
>
> Thank you so much!

You can refer to modeling_deepseek.py on the Hugging Face hub for the DeepSeek models. That script uses an `if` to check whether it is running in training or inference mode:

https://huggingface.co/deepseek-ai/DeepSeek-V2-Lite/blob/main/modeling_deepseek.py#L574
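To make the pattern concrete, here is a minimal NumPy sketch of that training/inference branch, not the actual modeling_deepseek.py code. The class name `MLAAttentionSketch`, the `kv_lora_rank` size, and the identity query projection are all illustrative assumptions. The idea it shows: training recomputes K/V from scratch each step, while inference appends to a compressed latent KV cache, which is the kind of cache a FlashMLA-style kernel consumes on Hopper GPUs.

```python
import numpy as np

class MLAAttentionSketch:
    """Hypothetical single-head MLA-style attention with a training/inference branch."""

    def __init__(self, d_model=64, kv_lora_rank=16, rng=None):
        rng = rng or np.random.default_rng(0)
        self.d_model = d_model
        # Down-projection compresses hidden states into a small latent;
        # up-projections restore K and V from that latent.
        self.w_down = rng.standard_normal((d_model, kv_lora_rank)) / np.sqrt(d_model)
        self.w_up_k = rng.standard_normal((kv_lora_rank, d_model)) / np.sqrt(kv_lora_rank)
        self.w_up_v = rng.standard_normal((kv_lora_rank, d_model)) / np.sqrt(kv_lora_rank)
        self.training = True
        self.latent_cache = None  # only populated at inference time

    def forward(self, x):
        latent = x @ self.w_down  # (seq, kv_lora_rank)
        if self.training:
            # Training branch: no cache, recompute from the full sequence.
            kv_latent = latent
        else:
            # Inference branch: append new tokens to the compressed latent
            # cache instead of caching full-width K/V tensors.
            if self.latent_cache is None:
                self.latent_cache = latent
            else:
                self.latent_cache = np.concatenate([self.latent_cache, latent], axis=0)
            kv_latent = self.latent_cache
        k = kv_latent @ self.w_up_k
        v = kv_latent @ self.w_up_v
        q = x  # identity query projection, for brevity
        scores = q @ k.T / np.sqrt(self.d_model)
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)
        return weights @ v
```

The point of the split is that the trained weights (`w_down`, `w_up_k`, `w_up_v`) are identical in both branches, so a model trained on A100s can take the inference path on an H100 unchanged; only the runtime caching/kernel differs.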
