AttributeError: 'Qwen2Model' object has no attribute 'lm_head' #500

Open

jrruethe opened this issue Feb 5, 2025 · 1 comment

jrruethe commented Feb 5, 2025

I get this error when trying to extract LoRAs from Qwen models. It doesn't happen for all Qwen models; it seems to mostly be the 0.5/1.5/3B variants, and I cannot figure out what causes it or why the 7/14/32B models succeed without issue. This particular example is from "mobiuslabsgmbh_DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1", while others like "huihui-ai_DeepSeek-R1-Distill-Qwen-7B-abliterated" work just fine. I am running the latest version of mergekit, and as far as I can tell I have the latest transformers/torch installed.

torch==2.5.1
transformers==4.48.1
File "/media/user/backup/llm/./mergekit/mergekit/scripts/extract_lora.py", line 672, in main
  ) = validate_and_combine_details(
      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/media/user/backup/llm/./mergekit/mergekit/scripts/extract_lora.py", line 184, in validate_and_combine_details
  finetuned_model_details, finetuned_vocab_size = get_model_details(
                                                  ^^^^^^^^^^^^^^^^^^
File "/media/user/backup/llm/./mergekit/mergekit/scripts/extract_lora.py", line 116, in get_model_details
  pretrained_model = AutoModelForCausalLM.from_pretrained(
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/media/user/backup/llm/mergekit/.venv/lib/python3.12/site-packages/transformers/models/auto/auto_factory.py", line 564, in from_pretrained
  return model_class.from_pretrained(
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/media/user/backup/llm/mergekit/.venv/lib/python3.12/site-packages/transformers/modeling_utils.py", line 4224, in from_pretrained
  ) = cls._load_pretrained_model(
      ^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/media/user/backup/llm/mergekit/.venv/lib/python3.12/site-packages/transformers/modeling_utils.py", line 4794, in _load_pretrained_model
  new_error_msgs, offload_index, state_dict_index = _load_state_dict_into_meta_model(
                                                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/media/user/backup/llm/mergekit/.venv/lib/python3.12/site-packages/transformers/modeling_utils.py", line 873, in _load_state_dict_into_meta_model
  set_module_tensor_to_device(model, param_name, param_device, **set_module_kwargs)
File "/media/user/backup/llm/mergekit/.venv/lib/python3.12/site-packages/accelerate/utils/modeling.py", line 248, in set_module_tensor_to_device
  new_module = getattr(module, split)
              ^^^^^^^^^^^^^^^^^^^^^^
File "/media/user/backup/llm/mergekit/.venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1931, in __getattr__
  raise AttributeError(

AttributeError: 'Qwen2Model' object has no attribute 'lm_head'

Any clues as to what the problem is would be really helpful. Thanks!
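
For reference, here is the failing load reduced to a minimal sketch. The exact arguments used inside extract_lora.py may differ; the device_map="auto" is my assumption, based on the accelerate frames in the traceback, and the slash-separated repo ID is my guess at the Hub equivalent of the local directory name above:

import torch
from transformers import AutoModelForCausalLM

# Minimal sketch of the call that fails inside get_model_details().
# Assumptions: the Hub repo ID below corresponds to the local snapshot
# named above, and loading goes through accelerate's meta-device path
# (device_map="auto"), which the set_module_tensor_to_device frame suggests.
model = AutoModelForCausalLM.from_pretrained(
    "mobiuslabsgmbh/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
# Expected to raise the AttributeError above if this reproduces the bug.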

jrruethe commented

This error also occurs on some Llama models:

You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Traceback (most recent call last):
  File "/media/backup/llm/./mergekit/mergekit/scripts/extract_lora.py", line 704, in <module>
    main()
  File "/media/user/backup/llm/mergekit/.venv/lib/python3.12/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/media/user/backup/llm/mergekit/.venv/lib/python3.12/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "/media/user/backup/llm/mergekit/.venv/lib/python3.12/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/media/user/backup/llm/mergekit/.venv/lib/python3.12/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/media/backup/llm/./mergekit/mergekit/scripts/extract_lora.py", line 672, in main
    ) = validate_and_combine_details(
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/media/backup/llm/./mergekit/mergekit/scripts/extract_lora.py", line 184, in validate_and_combine_details
    finetuned_model_details, finetuned_vocab_size = get_model_details(
                                                    ^^^^^^^^^^^^^^^^^^
  File "/media/backup/llm/./mergekit/mergekit/scripts/extract_lora.py", line 116, in get_model_details
    pretrained_model = AutoModelForCausalLM.from_pretrained(
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/media/user/backup/llm/mergekit/.venv/lib/python3.12/site-packages/transformers/models/auto/auto_factory.py", line 564, in from_pretrained
    return model_class.from_pretrained(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/media/user/backup/llm/mergekit/.venv/lib/python3.12/site-packages/transformers/modeling_utils.py", line 4224, in from_pretrained
    ) = cls._load_pretrained_model(
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/media/user/backup/llm/mergekit/.venv/lib/python3.12/site-packages/transformers/modeling_utils.py", line 4794, in _load_pretrained_model
    new_error_msgs, offload_index, state_dict_index = _load_state_dict_into_meta_model(
                                                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/media/user/backup/llm/mergekit/.venv/lib/python3.12/site-packages/transformers/modeling_utils.py", line 873, in _load_state_dict_into_meta_model
    set_module_tensor_to_device(model, param_name, param_device, **set_module_kwargs)
  File "/media/user/backup/llm/mergekit/.venv/lib/python3.12/site-packages/accelerate/utils/modeling.py", line 248, in set_module_tensor_to_device
    new_module = getattr(module, split)
                 ^^^^^^^^^^^^^^^^^^^^^^
  File "/media/user/backup/llm/mergekit/.venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1931, in __getattr__
    raise AttributeError(
AttributeError: 'LlamaModel' object has no attribute 'lm_head'

In this case, it was a Llama 3.2 1B finetune: huihui-ai_Llama-3.2-1B-Instruct-abliterated
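
One pattern that might be worth checking (just a hunch on my part, not a confirmed root cause): whether the failing checkpoints are exactly the ones that tie lm_head to the input embeddings, since smaller Qwen2 and Llama 3.2 variants are typically configured with tied embeddings while the 7B+ models are not. A quick check, where the repo IDs are my guess at the Hub equivalents of the local directory names above:

from transformers import AutoConfig

# Speculative diagnostic: print tie_word_embeddings for the models mentioned
# in this issue. Repo IDs are assumed Hub equivalents of the local paths.
for repo in [
    "mobiuslabsgmbh/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.1",  # fails for me
    "huihui-ai/DeepSeek-R1-Distill-Qwen-7B-abliterated",    # works for me
    "huihui-ai/Llama-3.2-1B-Instruct-abliterated",          # fails for me
]:
    cfg = AutoConfig.from_pretrained(repo)
    print(repo, "tie_word_embeddings =", getattr(cfg, "tie_word_embeddings", None))

If the failures line up with tie_word_embeddings = True, that would at least explain why only the smaller variants are affected.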
