14 changes: 4 additions & 10 deletions vllm_spyre/model_executor/model_loader/spyre.py
@@ -1,6 +1,5 @@
"""Utilities for selecting and loading Spyre models."""
import os
import sys
from typing import Optional

import torch
@@ -121,19 +120,14 @@ def load_weights(self, model_config: ModelConfig, max_prompt_length: int,
model_config.dtype, self.dtype)

if model_config.quantization == "gptq":

# note, we have to find a better way to package this
# shouldn't it be part of FMS?
sys.path.append("/home/senuser/aiu-fms")

if envs_spyre.VLLM_SPYRE_DYNAMO_BACKEND == "sendnn_decoder":
from aiu_as_addon import aiu_adapter, aiu_linear # noqa: F401
from fms_mo.aiu_addons.gptq import ( # noqa: F401
Collaborator

Is the above sys.path.append("/home/senuser/aiu-fms") still required now that the fms_mo package exists? It's not clear from the comment what we're actually importing from that path.

Collaborator @joerunde, Mar 25, 2025

Also, we should ideally be tracking fms_mo as a dependency, but I'm assuming that it's not published to PyPI yet, right? Ah, JK, I see it is released here, though it looks like there isn't much activity on getting new releases pushed: https://pypi.org/project/fms-model-optimizer/#history

Collaborator

I have not tried gptq, but yes, I agree we may not need aiu-fms anymore, as everything there should be in either fms_mo or fms.

Member

Agree we need to add fms_mo as a dep.

Collaborator Author

Good point. I could not find any other dependencies on aiu-fms, so I removed it.
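With the hard-coded path gone, the import of fms_mo will fail with a bare ModuleNotFoundError wherever the package is not installed. One possible way to make that failure actionable until fms_mo is tracked as a dependency is a small guard before the import. This is only a sketch: `require_module` is a hypothetical helper, not something that exists in vllm_spyre or fms_mo, and the pip name is taken from the PyPI link above.

```python
import importlib.util


def require_module(module_name: str, pip_name: str) -> None:
    """Hypothetical helper: fail early with an install hint.

    `module_name` must be a top-level module name (no dots).
    """
    if importlib.util.find_spec(module_name) is None:
        raise ImportError(
            f"Module '{module_name}' is required for GPTQ support but is "
            f"not installed. Try: pip install {pip_name}")


# Assumed usage before the GPTQ imports in load_weights():
# require_module("fms_mo", "fms-model-optimizer")
```

Declaring the package in the project's dependencies would make this guard unnecessary; it only improves the error message in the meantime.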

gptq_aiu_adapter, gptq_aiu_linear)
linear_type = "gptq_aiu"
logger.info("Loaded `aiu_as_addon` functionalities")
logger.info("Loaded `aiu_addons` functionalities")
else:
from cpu_addon import cpu_linear # noqa: F401
linear_type = "gptq_cpu"
Member

Should we perhaps just raise an exception here (since we don't expect it to work)?

Collaborator Author

I inserted a logger warning instead. This way it does not raise an exception outright, in case this starts working in the future due to changes in the fms repos.

Member

If it doesn't work now, I would still prefer to raise an exception rather than log a warning if it is an unsupported configuration. An exception is much better feedback to the user.
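For illustration, the suggestion above could look roughly like the following. It is a sketch under assumptions: `select_gptq_linear_type` is a hypothetical helper that takes the backend name as a plain string, whereas the real code reads it from envs_spyre.VLLM_SPYRE_DYNAMO_BACKEND inside load_weights().

```python
def select_gptq_linear_type(backend: str) -> str:
    """Hypothetical helper: pick the GPTQ linear type for a backend.

    Raises instead of warning when the backend is not expected to work.
    """
    if backend == "sendnn_decoder":
        return "gptq_aiu"
    # Unsupported configuration: fail loudly so the user sees it immediately.
    raise ValueError(
        f"GPTQ quantization is not supported with backend {backend!r}; "
        "only 'sendnn_decoder' is expected to work.")
```

If CPU support lands later in the fms repos, the `raise` can be replaced with the CPU branch again; until then the user gets a clear error at load time rather than a warning followed by a later failure.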

logger.info("Loaded `cpu_addon` functionalities")
logger.warning("GPTQ is not expected to work on CPU.")

quant_cfg = model_config._parse_quant_hf_config()
