RuntimeError: shape '[-1, 32]' is invalid for input of size 1 #26

Open

chunhualiao opened this issue Aug 26, 2023 · 0 comments
The modeling_llama.py code path is quite fragile here. I am debugging a recurring pattern of error: input_ids arrives with one additional dimension prepended, and many other places in the code, unaware of this, then read the wrong dimension information.
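To make the shape assumption concrete, here is a minimal sketch (the tensor values are illustrative, not taken from the actual run): downstream code expects a 2-D `(batch, seq_len)` `input_ids`, so an extra leading dimension makes `shape[:2]` misread the sequence length, and squeezing that dimension before calling generate is one way to restore the expected layout.

```python
import torch

# generate-time code assumes input_ids has shape (batch, seq_len).
input_ids = torch.arange(32).reshape(1, 1, 32)  # extra leading dim, as in this report

# With a 3-D tensor, shape[:2] is (1, 1): the sequence length is misread
# as 1, which is exactly why position_ids later comes out as tensor([[0]]).
misread = input_ids.shape[:2]
assert misread == torch.Size([1, 1])

# Workaround sketch: normalize to 2-D before handing the tensor to generate().
if input_ids.dim() == 3 and input_ids.shape[0] == 1:
    input_ids = input_ids.squeeze(0)
assert input_ids.shape == torch.Size([1, 32])
```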

Traceback (most recent call last):
  File "/global/cfs/cdirs/m2956/workspace-cfs/openmp-qa/reinforcement_learning.py", line 140, in <module>
    response_tensor = ppo_trainer.generate(query_tensor, pad_token_id=tokenizer.eos_token_id, max_new_tokens=20)
  File "/global/homes/l/liaoch/.local/unknown/pytorch1.13.1/lib/python3.9/site-packages/trl/trainer/ppo_trainer.py", line 454, in generate
    response = self.accelerator.unwrap_model(self.model).generate(
  File "/global/homes/l/liaoch/.local/unknown/pytorch1.13.1/lib/python3.9/site-packages/trl/models/modeling_value_head.py", line 198, in generate
    return self.pretrained_model.generate(*args, **kwargs)
  File "/global/common/software/nersc/pm-2022q4/sw/pytorch/1.13.1/lib/python3.9/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/global/homes/l/liaoch/.local/unknown/pytorch1.13.1/lib/python3.9/site-packages/transformers/generation/utils.py", line 1538, in generate
    return self.greedy_search(
  File "/global/homes/l/liaoch/.local/unknown/pytorch1.13.1/lib/python3.9/site-packages/transformers/generation/utils.py", line 2362, in greedy_search
    outputs = self(
  File "/global/common/software/nersc/pm-2022q4/sw/pytorch/1.13.1/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/global/homes/l/liaoch/.local/unknown/pytorch1.13.1/lib/python3.9/site-packages/transformers/models/llama/modeling_llama.py", line 806, in forward
    outputs = self.model(
  File "/global/common/software/nersc/pm-2022q4/sw/pytorch/1.13.1/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/global/homes/l/liaoch/.local/unknown/pytorch1.13.1/lib/python3.9/site-packages/transformers/models/llama/modeling_llama.py", line 643, in forward
    position_ids = position_ids.view(-1, seq_length).long()
RuntimeError: shape '[-1, 32]' is invalid for input of size 1

The size of position_ids should match the last dimension of input_ids (input_ids is 1x1x32, so seq_length is 32).
But position_ids has shape 1x1 instead.

(Pdb) p position_ids

tensor([[0]], device='cuda:0')
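For context, the position_ids derivation roughly follows this pattern (a paraphrase of the transformers logic, not the verbatim source), which shows why a wrongly shaped attention mask propagates straight into position_ids:

```python
import torch

# A correctly shaped 2-D mask for a (1, 32) input:
attention_mask = torch.ones(1, 32, dtype=torch.long)

# Paraphrased logic: positions are the zero-based cumulative count
# of attended tokens.
position_ids = attention_mask.long().cumsum(-1) - 1
position_ids.masked_fill_(attention_mask == 0, 1)
assert position_ids.shape == (1, 32)  # matches seq_length; view(-1, 32) succeeds

# The buggy path instead builds the mask from inputs.shape[:2] of a
# (1, 1, 32) tensor, yielding a (1, 1) mask and position_ids == [[0]].
bad_mask = torch.ones(torch.zeros(1, 1, 32).shape[:2], dtype=torch.long)
assert bad_mask.shape == (1, 1)
```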

prepare_inputs_for_generation() sets position_ids based on the shape of attention_mask, which in turn is set by _prepare_attention_mask_for_generation() in .. pytorch1.13.1/lib/python3.9/site-packages/transformers/generation/utils.py:

return torch.ones(inputs.shape[:2], dtype=torch.long, device=inputs.device)

I changed it to inputs.shape[1:3] instead, and the code can proceed.
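That change amounts to the following (a local workaround only: it compensates for the extra leading dimension rather than removing it at the source):

```python
import torch

inputs = torch.zeros(1, 1, 32)  # the 3-D input_ids from this report

# Original line in _prepare_attention_mask_for_generation:
mask_before = torch.ones(inputs.shape[:2], dtype=torch.long)   # shape (1, 1)
# Changed to:
mask_after = torch.ones(inputs.shape[1:3], dtype=torch.long)   # shape (1, 32)

assert mask_before.shape == (1, 1)
assert mask_after.shape == (1, 32)
```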

But it then hits another, similar error later:

  File ".local/unknown/pytorch1.13.1/lib/python3.9/site-packages/transformers/models/llama/modeling_llama.py", line 287, in forward
    bsz, q_len, _ = hidden_states.size()
ValueError: too many values to unpack (expected 3)
Uncaught exception. Entering post mortem debugging
Running 'cont' or 'step' will restart the program
> ...local/unknown/pytorch1.13.1/lib/python3.9/site-packages/transformers/models/llama/modeling_llama.py(287)forward()
-> bsz, q_len, _ = hidden_states.size()
(Pdb) p hidden_states.size()
torch.Size([1, 1, 32, 4096])
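The 4-D hidden_states confirms the spurious leading dimension survived into the model; the three-way unpack then fails for the same underlying reason:

```python
import torch

hidden_states = torch.zeros(1, 1, 32, 4096)  # the shape seen in pdb

# The attention forward pass expects 3-D (bsz, q_len, hidden_dim):
try:
    bsz, q_len, _ = hidden_states.size()
except ValueError as e:
    err = str(e)
assert "too many values to unpack" in err

# Dropping the spurious leading dim restores the expected 3-D shape.
bsz, q_len, _ = hidden_states.squeeze(0).size()
assert (bsz, q_len) == (1, 32)
```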