Conversation

@LukaDarsalia LukaDarsalia commented May 19, 2025

Checklist

  • Read CONTRIBUTING.md, and accept the CLA by including the provided snippet. We will not accept a PR without this.
  • Run pre-commit hook.
  • If you changed Rust code, run cargo check, cargo clippy, cargo test.

PR Description

Fixes #271 (Quantized model fails to load on Windows/Linux)

Summary

This PR fixes a crash when loading the quantized PyTorch Moshi models on Windows/Linux, caused by meta tensors being passed into int8_vectorwise_quant.

The crash occurs because replace_linear_with_qlinear() was called before the real weights were loaded, so meta tensors (which carry no data) were handed to bitsandbytes for quantization, which it does not support.
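For context, here is a minimal reproduction of the underlying failure mode in plain PyTorch (not the Moshi code itself): a meta tensor has a shape and dtype but no storage, so any operation that needs concrete values fails.

```python
import torch

# Meta tensors carry only shape/dtype metadata and no storage, so kernels
# that need real weight values (such as int8 quantization) cannot run on them.
linear = torch.nn.Linear(4, 8, device="meta")
print(linear.weight.is_meta)  # True: the weight has a shape but no data

# Reading an actual value out of a meta tensor fails, which is the kind of
# error the quantization pass was hitting.
try:
    linear.weight.float().sum().item()
    materialized = True
except Exception:
    materialized = False
print("materialized:", materialized)  # False
```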

Fix Details

  • In QLinear.__init__, we now check whether the weight is on the meta device.
  • If it is, we create dummy CB and SCB tensors with the correct shapes and set a flag self.is_meta = True.
  • Quantization is deferred until the real weights are loaded.
  • The forward pass checks the meta status and raises a clear error if the layer is used before real weights are available.
  • Once weights are loaded, _check_meta_status turns off meta mode and ensures the scale tensors are float32.
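The steps above can be sketched roughly as follows. Names such as QLinear, CB, SCB, and _check_meta_status come from this PR description, but the per-row absmax quantizer and the dequantized matmul are illustrative stand-ins, not the actual Moshi/bitsandbytes code:

```python
import torch
from torch import nn


class QLinear(nn.Module):
    """Sketch of the deferred-quantization scheme described above."""

    def __init__(self, linear: nn.Linear):
        super().__init__()
        weight = linear.weight
        if weight.is_meta:
            # Real weights are not loaded yet: allocate placeholder CB/SCB
            # with the correct shapes and defer quantization.
            self.CB = torch.zeros(weight.shape, dtype=torch.int8, device="meta")
            self.SCB = torch.zeros(weight.shape[0], dtype=torch.float32, device="meta")
            self.weight = weight  # kept so load_state_dict can fill it in later
            self.is_meta = True
        else:
            self.CB, self.SCB = self._quantize(weight)
            self.is_meta = False

    @staticmethod
    def _quantize(weight: torch.Tensor):
        # Stand-in for int8_vectorwise_quant: per-row absmax scaling to int8.
        scb = weight.abs().amax(dim=1).float().clamp(min=1e-8)
        cb = torch.round(weight / scb[:, None] * 127).to(torch.int8)
        return cb, scb

    def _check_meta_status(self):
        # Once real weights have been loaded, quantize them and leave meta mode.
        if self.is_meta and not self.weight.is_meta:
            self.CB, self.SCB = self._quantize(self.weight)
            self.SCB = self.SCB.float()  # scale tensors must be float32
            self.is_meta = False

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        self._check_meta_status()
        if self.is_meta:
            raise RuntimeError("QLinear used before real weights were loaded")
        # Dequantize for clarity; a real path would use an int8 matmul kernel.
        return x @ (self.CB.float() * self.SCB[:, None] / 127).T
```

Calling the layer while it still holds meta weights raises a clear RuntimeError instead of crashing inside the quantization kernel, matching the behavior described above.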

CLA

I, LukaDarsalia, confirm that I have read and understood the terms of the CLA of Kyutai-labs, as outlined in the repository's CONTRIBUTING.md, and I agree to be bound by these terms.

