Make it easier to unset use_flex_decoding for Moondream 3 on non-CUDA devices

I'm trying to use moondream3-preview, but because it uses flex decoding it only runs on CUDA devices. I'm on a MacBook Pro, which only supports MPS.

There seems to be a flag in the code called `use_flex_decoding`, but setting it from my own code doesn't seem to have any effect, e.g.:

```
moondream.use_flex_decoding = False
```

This still fails when the model attempts to call the CUDA-only `create_block_mask`.

If instead I actually patch `self.use_flex_decoding = True` to `False` in `MoondreamModel` in the model code itself, Moondream 3 Preview seems to work fine on my Mac.

I wonder if `use_flex_decoding` could simply be a parameter on the module constructor so that the bad value doesn't have a chance to take effect anywhere before I can change it.
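To make the request concrete, here is a rough sketch of what I have in mind. It is not the actual Moondream code: `MoondreamModelSketch`, `default_use_flex_decoding`, the `device_type` argument, and `_build_block_mask` are all hypothetical names, and I'm assuming that whatever depends on the flag gets set up during construction.

```python
from typing import Optional


def default_use_flex_decoding(device_type: str) -> bool:
    # Assumption: flex decoding is only usable on CUDA, as the failure above suggests.
    return device_type == "cuda"


class MoondreamModelSketch:
    """Hypothetical stand-in for MoondreamModel, not the real class."""

    def __init__(
        self,
        device_type: str = "cpu",
        use_flex_decoding: Optional[bool] = None,
    ) -> None:
        # None means "pick a default from the device"; an explicit value always wins.
        if use_flex_decoding is None:
            use_flex_decoding = default_use_flex_decoding(device_type)
        self.use_flex_decoding = use_flex_decoding

        # Anything that depends on the flag is created only after it is resolved,
        # so a caller's False never reaches the CUDA-only path.
        self.block_mask = self._build_block_mask() if self.use_flex_decoding else None

    def _build_block_mask(self) -> str:
        # Placeholder for the CUDA-only create_block_mask call.
        return "block-mask"


# On a Mac, either rely on the device-based default or pass the flag explicitly.
model = MoondreamModelSketch(device_type="mps")
explicit = MoondreamModelSketch(device_type="cuda", use_flex_decoding=False)
```

With something like this, the flag is decided before any dependent state exists, so there is nothing to undo afterwards.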
