
Dimensional Error on Forward Pass #8

@SarrocaGSergi

Description


After adjusting the code to make it run, I find a dimensional error on the forward pass:

Creating model instance...
/usr/local/lib/python3.10/dist-packages/torch/nn/modules/lazy.py:180: UserWarning: Lazy modules are a new feature under heavy development so changes to the API or functionality can happen at any moment.
  warnings.warn('Lazy modules are a new feature under heavy development '
VoViT pre-trained weights loaded
Lead Voice enhancer pre-trained weights loaded
Done
Forwarding speaker1...
/usr/local/lib/python3.10/dist-packages/torchaudio/functional/functional.py:109: UserWarning: `return_complex` argument is now deprecated and is not effective.`torchaudio.functional.spectrogram(power=None)` always returns a tensor with complex dtype. Please remove the argument in the function call.
  warnings.warn(
/content/VoViT/vovit/core/models/production_model.py:102: UserWarning: Casting complex values to real discards the imaginary part (Triggered internally at ../aten/src/ATen/native/Copy.cpp:276.)
  return s.to(dtype)
---------------------------------------------------------------------------
EinopsError                               Traceback (most recent call last)
/usr/local/lib/python3.10/dist-packages/einops/einops.py in reduce(tensor, pattern, reduction, **axes_lengths)
    411         recipe = _prepare_transformation_recipe(pattern, reduction, axes_lengths=hashable_axes_lengths)
--> 412         return _apply_recipe(recipe, tensor, reduction_type=reduction)
    413     except EinopsError as e:

15 frames
/usr/local/lib/python3.10/dist-packages/einops/einops.py in _apply_recipe(recipe, tensor, reduction_type)
    234     init_shapes, reduced_axes, axes_reordering, added_axes, final_shapes = \
--> 235         _reconstruct_from_shape(recipe, backend.shape(tensor))
    236     tensor = backend.reshape(tensor, init_shapes)

/usr/local/lib/python3.10/dist-packages/einops/einops.py in _reconstruct_from_shape_uncached(self, shape)
    164         if len(shape) != len(self.input_composite_axes):
--> 165             raise EinopsError('Expected {} dimensions, got {}'.format(len(self.input_composite_axes), len(shape)))
    166 

EinopsError: Expected 4 dimensions, got 3

During handling of the above exception, another exception occurred:

EinopsError                               Traceback (most recent call last)
<ipython-input-7-faaa648e3dcd> in <cell line: 28>()
     28 with torch.no_grad():
     29     print('Forwarding speaker1...')
---> 30     pred_s1 = model.forward_unlimited(mixture, speaker1_face)
     31     print('Forwarding speaker2...')
     32     pred_s2 = model.forward_unlimited(mixture, speaker2_face)

/content/VoViT/vovit/__init__.py in forward_unlimited(self, mixture, visuals)
     78         visuals = visuals[:n_chunks * fps * 2].view(n_chunks, fps * 2, 3, 68)
     79         mixture = mixture[:n_chunks * length].view(n_chunks, -1)
---> 80         pred = self.forward(mixture, visuals)
     81         pred_unraveled = {}
     82         for k, v in pred.items():

/content/VoViT/vovit/__init__.py in forward(self, mixture, visuals, extract_landmarks)
     56         mixture /= mixture.abs().max()
     57 
---> 58         return self.vovit(mixture, ld)
     59 
     60     def forward_unlimited(self, mixture, visuals):

/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py in _call_impl(self, *args, **kwargs)
   1499                 or _global_backward_pre_hooks or _global_backward_hooks
   1500                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1501             return forward_call(*args, **kwargs)
   1502         # Do not call functions when jit is used
   1503         full_backward_hooks, non_full_backward_hooks = [], []

/content/VoViT/vovit/core/models/production_model.py in forward(self, mixture, landmarks)
    378         """
    379         inputs = {'src': mixture, 'landmarks': landmarks}
--> 380         return self.avse(inputs)

/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py in _call_impl(self, *args, **kwargs)
   1499                 or _global_backward_pre_hooks or _global_backward_hooks
   1500                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1501             return forward_call(*args, **kwargs)
   1502         # Do not call functions when jit is used
   1503         full_backward_hooks, non_full_backward_hooks = [], []

/content/VoViT/vovit/core/models/production_model.py in forward(self, *args, **kwargs)
    325 
    326     def forward(self, *args, **kwargs):
--> 327         return self.inference(*args, **kwargs)
    328 
    329     def inference(self, inputs: dict, n_iter=1):

/content/VoViT/vovit/core/models/production_model.py in inference(self, inputs, n_iter)
    329     def inference(self, inputs: dict, n_iter=1):
    330         with torch.no_grad():
--> 331             output = self.forward_avse(inputs, compute_istft=False)
    332             estimated_sp = output['estimated_sp']
    333             for i in range(n_iter):

/content/VoViT/vovit/core/models/production_model.py in forward_avse(self, inputs, compute_istft)
    321     def forward_avse(self, inputs, compute_istft: bool):
    322         self.av_se.eval()
--> 323         output = self.av_se(inputs, compute_wav=compute_istft)
    324         return output
    325 

/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py in _call_impl(self, *args, **kwargs)
   1499                 or _global_backward_pre_hooks or _global_backward_hooks
   1500                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1501             return forward_call(*args, **kwargs)
   1502         # Do not call functions when jit is used
   1503         full_backward_hooks, non_full_backward_hooks = [], []

/content/VoViT/vovit/core/models/production_model.py in forward(self, inputs, compute_wav)
    223         # ==========================================
    224 
--> 225         audio_feats = self.audio_processor.preprocess_audio(inputs['src'])
    226 
    227         """

/content/VoViT/vovit/core/models/production_model.py in preprocess_audio(self, n_sources, *src)
    135             # Contiguous required to address memory problems in certain gpus
    136             sp_mix = sp_mix_raw[:, ::2, ...].contiguous()  # BxFxTx2
--> 137         x = rearrange(sp_mix, 'b f t c -> b c f t')
    138         output = {'mixture': x, 'sp_mix_raw': sp_mix_raw}
    139 

/usr/local/lib/python3.10/dist-packages/einops/einops.py in rearrange(tensor, pattern, **axes_lengths)
    481             raise TypeError("Rearrange can't be applied to an empty list")
    482         tensor = get_backend(tensor[0]).stack_on_zeroth_dimension(tensor)
--> 483     return reduce(cast(Tensor, tensor), pattern, reduction='rearrange', **axes_lengths)
    484 
    485 

/usr/local/lib/python3.10/dist-packages/einops/einops.py in reduce(tensor, pattern, reduction, **axes_lengths)
    418             message += '\n Input is list. '
    419         message += 'Additional info: {}.'.format(axes_lengths)
--> 420         raise EinopsError(message + '\n {}'.format(e))
    421 
    422 

EinopsError:  Error while processing rearrange-reduction pattern "b f t c -> b c f t".
 Input tensor shape: torch.Size([4, 256, 128]). Additional info: {}.
 Expected 4 dimensions, got 3
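My guess at a minimal repro (assuming the `return_complex` warning above is the cause): newer torchaudio always returns a complex spectrogram of shape `(B, F, T)`, while the `rearrange` pattern `'b f t c -> b c f t'` expects the old real layout `(B, F, T, 2)`. `torch.view_as_real` restores the trailing real/imag axis:

```python
import torch

# Complex spectrogram as modern torchaudio returns it: (B, F, T), 3 dims.
sp_complex = torch.randn(4, 256, 128, dtype=torch.complex64)
assert sp_complex.shape == (4, 256, 128)  # rearrange expects 4 dims -> EinopsError

# view_as_real appends a trailing axis of size 2 (real, imag) -> (B, F, T, 2)
sp_real = torch.view_as_real(sp_complex)
assert sp_real.shape == (4, 256, 128, 2)

# The same permutation as 'b f t c -> b c f t' then succeeds:
x = sp_real.permute(0, 3, 1, 2)
assert x.shape == (4, 2, 256, 128)
```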

This error occurs both in the Colab notebook and when I clone the repo locally. The main changes I made were reformatting the requirements to use newer versions of the PyTorch packages and CUDA, and fixing the bugs caused by `np.int`.
What do you suggest?
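For reference, the `np.int` fixes I mean are of this form (`np.int` was deprecated in NumPy 1.20 and removed in 1.24; plain `int` or an explicit fixed-width dtype is the drop-in replacement):

```python
import numpy as np

# dtype=np.int no longer exists; use a plain int or an explicit width
arr = np.arange(5, dtype=np.int64)   # was: dtype=np.int
assert arr.dtype == np.int64

# np.int(x) calls become int(x)
idx = int(arr[2])                    # was: np.int(arr[2])
assert idx == 2
```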
