
[transcribe] Expected parameter logits errors #440

Open · bonsai-byte opened this issue Mar 11, 2025 · 2 comments
bonsai-byte commented Mar 11, 2025

I get this error when running on Tesla T4s, but not on an RTX 4080.

Expected parameter logits (Tensor of shape (1, 51865)) of distribution Categorical(logits: torch.Size([1, 51865])) to satisfy the constraint IndependentConstraint(Real(), 1), but found invalid values:
tensor([[nan, nan, nan,  ..., nan, nan, nan]], device='cuda:0')
Traceback (most recent call last):
  File "/app/src/unchartedlabs/backend/stable_model/stable_ts_model.py", line 117, in transcribe
    result = self._model.transcribe(
             ^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/stable_whisper/whisper_word_level/original_whisper.py", line 504, in transcribe_stable
    result: DecodingResult = decode_with_fallback(mel_segment, ts_token_mask=ts_token_mask)
                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/stable_whisper/whisper_word_level/original_whisper.py", line 360, in decode_with_fallback
    decode_result, audio_features = decode_stable(model,
                                    ^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/stable_whisper/decode.py", line 108, in decode_stable
    result = task.run(mel)
             ^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/whisper/decoding.py", line 737, in run
    tokens, sum_logprobs, no_speech_probs = self._main_loop(audio_features, tokens)
                                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/stable_whisper/decode.py", line 58, in _main_loop
    tokens, completed = self.decoder.update(tokens, logits, sum_logprobs)
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/whisper/decoding.py", line 283, in update
    next_tokens = Categorical(logits=logits / self.temperature).sample()
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/torch/distributions/categorical.py", line 70, in __init__
    super().__init__(batch_shape, validate_args=validate_args)
  File "/usr/local/lib/python3.11/dist-packages/torch/distributions/distribution.py", line 68, in __init__
    raise ValueError(
ValueError: Expected parameter logits (Tensor of shape (1, 51865)) of distribution Categorical(logits: torch.Size([1, 51865])) to satisfy the constraint IndependentConstraint(Real(), 1), but found invalid values:
tensor([[nan, nan, nan,  ..., nan, nan, nan]], device='cuda:0')

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/app/src/unchartedlabs/backend/gpu_postprocess_subscriber.py", line 359, in process_transcribe_song
    response = impl.transcribe(request, nonce=nonce)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/src/unchartedlabs/backend/lyrics_transcribe/impl.py", line 62, in transcribe
    transcription_result = self._model.transcribe(
                           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/src/unchartedlabs/backend/stable_model/stable_ts_model.py", line 125, in transcribe
    raise RuntimeError(f"Transcription failed: {str(e)}") from e
RuntimeError: Transcription failed: Expected parameter logits (Tensor of shape (1, 51865)) of distribution Categorical(logits: torch.Size([1, 51865])) to satisfy the constraint IndependentConstraint(Real(), 1), but found invalid values:
tensor([[nan, nan, nan,  ..., nan, nan, nan]], device='cuda:0')
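The ValueError at the bottom of the trace comes from torch.distributions.Categorical, which validates its logits by default and rejects non-finite values; the fp16 decode on the T4 handed it a tensor full of NaNs. A minimal sketch of the failure mode (vocabulary size shrunk for brevity):

```python
import torch
from torch.distributions import Categorical

# NaN logits fail Categorical's argument validation, just as in the trace above.
nan_logits = torch.full((1, 8), float("nan"))
try:
    Categorical(logits=nan_logits).sample()
except ValueError as err:
    print("rejected NaN logits:", "invalid values" in str(err))

# Finite logits sample normally.
token = Categorical(logits=torch.zeros(1, 8)).sample()
print(token.shape)  # torch.Size([1])
```

So the Categorical sampler is only where the problem surfaces; the NaNs are produced earlier, by the half-precision forward pass on the T4.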

Tesla T4

+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.216.01             Driver Version: 535.216.01   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  Tesla T4                       Off | 00000000:00:04.0 Off |                    0 |
| N/A   79C    P0              32W /  70W |   4820MiB / 15360MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

RTX 4080

Mon Mar 10 22:36:16 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 560.35.03              Driver Version: 561.00         CUDA Version: 12.6     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 4080 ...    On  |   00000000:01:00.0 Off |                  N/A |
| N/A   52C    P8              4W /   90W |    1222MiB /  12282MiB |      3%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+
faster-whisper==1.1.0
openai-whisper==20240930
stable-ts==2.18.0
bonsai-byte (Author) commented:
Setting fp16=False seems to resolve things...
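For context, fp16 is one of Whisper's decoding options, so the workaround is simply passing fp16=False to the transcribe call, at the cost of slower fp32 decoding. A sketch of one way to apply it only where needed — the choose_fp16 helper and its flaky_fp16 list are hypothetical, based solely on the two GPUs reported in this thread:

```python
def choose_fp16(device_name: str) -> bool:
    # Hypothetical policy: fall back to fp32 on GPUs where fp16 decoding
    # produced NaN logits (only the Tesla T4 is reported in this thread).
    flaky_fp16 = ("Tesla T4",)
    return not any(name in device_name for name in flaky_fp16)

# Usage (assuming a loaded stable-ts model and a CUDA device):
#   import torch, stable_whisper
#   model = stable_whisper.load_model("base")
#   name = torch.cuda.get_device_name(0)
#   result = model.transcribe("audio.mp3", fp16=choose_fp16(name))
print(choose_fp16("Tesla T4"))                 # False
print(choose_fp16("NVIDIA GeForce RTX 4080"))  # True
```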

jianfch (Owner) commented Mar 13, 2025

Can you reproduce this issue on Colab with a T4, and share the audio source and transcription settings needed to replicate it?
