The modeling_llama.py code is quite buggy. I am debugging a recurring error pattern: input_ids has one additional dimension added in front, and many other places in the code are unaware of this and read the wrong dimension information.
Traceback (most recent call last):
File "/global/cfs/cdirs/m2956/workspace-cfs/openmp-qa/reinforcement_learning.py", line 140, in <module>
response_tensor = ppo_trainer.generate(query_tensor, pad_token_id=tokenizer.eos_token_id, max_new_tokens=20)
File "/global/homes/l/liaoch/.local/unknown/pytorch1.13.1/lib/python3.9/site-packages/trl/trainer/ppo_trainer.py", line 454, in generate
response = self.accelerator.unwrap_model(self.model).generate(
File "/global/homes/l/liaoch/.local/unknown/pytorch1.13.1/lib/python3.9/site-packages/trl/models/modeling_value_head.py", line 198, in generate
return self.pretrained_model.generate(*args, **kwargs)
File "/global/common/software/nersc/pm-2022q4/sw/pytorch/1.13.1/lib/python3.9/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
return func(*args, **kwargs)
File "/global/homes/l/liaoch/.local/unknown/pytorch1.13.1/lib/python3.9/site-packages/transformers/generation/utils.py", line 1538, in generate
return self.greedy_search(
File "/global/homes/l/liaoch/.local/unknown/pytorch1.13.1/lib/python3.9/site-packages/transformers/generation/utils.py", line 2362, in greedy_search
outputs = self(
File "/global/common/software/nersc/pm-2022q4/sw/pytorch/1.13.1/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/global/homes/l/liaoch/.local/unknown/pytorch1.13.1/lib/python3.9/site-packages/transformers/models/llama/modeling_llama.py", line 806, in forward
outputs = self.model(
File "/global/common/software/nersc/pm-2022q4/sw/pytorch/1.13.1/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/global/homes/l/liaoch/.local/unknown/pytorch1.13.1/lib/python3.9/site-packages/transformers/models/llama/modeling_llama.py", line 643, in forward
position_ids = position_ids.view(-1, seq_length).long()
RuntimeError: shape '[-1, 32]' is invalid for input of size 1
The size of position_ids should match the last dimension of input_ids (which has shape 1x1x32), but position_ids instead has shape 1x1.
(Pdb) p position_ids
tensor([[0]], device='cuda:0')
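A minimal sketch of how the shapes go wrong, using plain tuples in place of tensors (assuming, as described below, that the attention mask is built from the first two dimensions of input_ids, so a 3-D input_ids yields a 1x1 mask and hence a 1x1 position_ids):

```python
# modeling_llama.py expects a 2-D input_ids: (batch_size, seq_length).
input_ids_shape = (1, 1, 32)  # extra leading dimension added in front

# The generation utils build the attention mask from the first two dims,
# so with a 3-D input_ids the mask (and position_ids derived from it)
# collapses to (1, 1) instead of (1, 32).
mask_shape = input_ids_shape[:2]        # (1, 1)  -- wrong
fixed_mask_shape = input_ids_shape[1:3] # (1, 32) -- the workaround below

# LlamaModel.forward later takes seq_length = 32 and calls
# position_ids.view(-1, 32) on a tensor holding a single element,
# producing: RuntimeError: shape '[-1, 32]' is invalid for input of size 1
print(mask_shape, fixed_mask_shape)
```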
prepare_inputs_for_generation() sets position_ids based on the shape of attention_mask, which in turn is set by _prepare_attention_mask_for_generation() in .. pytorch1.13.1/lib/python3.9/site-packages/transformers/generation/utils.py
I changed it to use inputs.shape[1:3] instead, and the code can proceed.
But it then hits another similar error later:
File ".local/unknown/pytorch1.13.1/lib/python3.9/site-packages/transformers/models/llama/modeling_llama.py", line 287, in forward
bsz, q_len, _ = hidden_states.size()
ValueError: not enough values to unpack (expected 3, got 4)
Uncaught exception. Entering post mortem debugging
Running 'cont' or 'step' will restart the program
> ...local/unknown/pytorch1.13.1/lib/python3.9/site-packages/transformers/models/llama/modeling_llama.py(287)forward()
-> bsz, q_len, _ = hidden_states.size()
(Pdb) p hidden_states.size()
torch.Size([1, 1, 32, 4096])
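The same pattern again: the attention layer expects a 3-D hidden_states of shape (bsz, q_len, hidden_size), but the extra leading dimension makes it 4-D. A sketch with plain tuples (squeezing the extra dimension on the caller side, e.g. query_tensor = query_tensor.squeeze(0) before ppo_trainer.generate, is an assumed workaround, not something the library documents):

```python
# The extra leading dimension survives into hidden_states, so the 3-way
# unpack in LlamaAttention.forward fails on a 4-tuple of sizes.
hidden_states_shape = (1, 1, 32, 4096)
try:
    bsz, q_len, _ = hidden_states_shape
except ValueError as e:
    print(e)  # unpacking 4 values into 3 names raises ValueError

# With the spurious dimension squeezed out, the unpack works as intended:
bsz, q_len, hidden_size = hidden_states_shape[1:]
print(bsz, q_len, hidden_size)
```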