Description
Fine-tuning the SenseVoiceSmall model fails with the following error:
[2026-01-04 08:07:36,039][root][INFO] - Build optim
[2026-01-04 08:07:36,042][root][INFO] - Build scheduler
[2026-01-04 08:07:36,042][root][INFO] - Build dataloader
[2026-01-04 08:07:36,042][root][INFO] - Build dataloader
[2026-01-04 08:07:36,043][root][INFO] - total_num of samplers: 7, /dev_data/sensevoice_training/training_1230/train_example.jsonl
[2026-01-04 08:07:36,043][root][INFO] - total_num of samplers: 1, /dev_data/sensevoice_training/training_1230/val_example.jsonl
No checkpoint found at './outputs/model.pt', does not resume status!
[2026-01-04 08:07:36,043][root][INFO] - Train epoch: 0, rank: 0
[2026-01-04 08:07:36,048][root][INFO] - rank: 0, dataloader start from step: 0, batch_num: 8, after: 8
[2026-01-04 08:07:36,184][root][INFO] - rank: 0, dataloader start from step: 0, batch_num: 8, after: 8
[2026-01-04 08:07:36,186][root][ERROR] - ERROR: data is empty!
Error executing job with overrides: ['++model=iic/SenseVoiceSmall', '++train_data_set_list=/dev_data/sensevoice_training/training_1230/train_example.jsonl', '++valid_data_set_list=/dev_data/sensevoice_training/training_1230/val_example.jsonl', '++dataset_conf.data_split_num=1', '++dataset_conf.batch_sampler=BatchSampler', '++dataset_conf.batch_size=60', '++dataset_conf.sort_size=1024', '++dataset_conf.batch_type=token', '++dataset_conf.num_workers=4', '++train_conf.max_epoch=50', '++train_conf.log_interval=1', '++train_conf.resume=true', '++train_conf.validate_interval=1000', '++train_conf.save_checkpoint_interval=1000', '++train_conf.keep_nbest_models=20', '++train_conf.avg_nbest_model=10', '++train_conf.use_deepspeed=false', '++train_conf.deepspeed_config=/dev_data/sensevoice_training/training_1230/../../ds_stage1.json', '++optim_conf.lr=0.0002', '++output_dir=./outputs']
Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/funasr/bin/train_ds.py", line 244, in <module>
main_hydra()
File "/usr/local/lib/python3.10/dist-packages/hydra/main.py", line 94, in decorated_main
_run_hydra(
File "/usr/local/lib/python3.10/dist-packages/hydra/_internal/utils.py", line 394, in _run_hydra
_run_app(
File "/usr/local/lib/python3.10/dist-packages/hydra/_internal/utils.py", line 457, in _run_app
run_and_report(
File "/usr/local/lib/python3.10/dist-packages/hydra/_internal/utils.py", line 223, in run_and_report
raise ex
File "/usr/local/lib/python3.10/dist-packages/hydra/_internal/utils.py", line 220, in run_and_report
return func()
File "/usr/local/lib/python3.10/dist-packages/hydra/_internal/utils.py", line 458, in <lambda>
lambda: hydra.run(
File "/usr/local/lib/python3.10/dist-packages/hydra/_internal/hydra.py", line 132, in run
_ = ret.return_value
File "/usr/local/lib/python3.10/dist-packages/hydra/core/utils.py", line 260, in return_value
raise self._return_value
File "/usr/local/lib/python3.10/dist-packages/hydra/core/utils.py", line 186, in run_job
ret.return_value = task_function(task_cfg)
File "/usr/local/lib/python3.10/dist-packages/funasr/bin/train_ds.py", line 56, in main_hydra
main(**kwargs)
File "/usr/local/lib/python3.10/dist-packages/funasr/bin/train_ds.py", line 177, in main
trainer.train_epoch(
File "/usr/local/lib/python3.10/dist-packages/funasr/train_utils/trainer_ds.py", line 603, in train_epoch
self.forward_step(model, batch, loss_dict=loss_dict)
File "/usr/local/lib/python3.10/dist-packages/funasr/train_utils/trainer_ds.py", line 670, in forward_step
retval = model(**batch)
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1541, in _call_impl
return forward_call(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/funasr/models/sense_voice/model.py", line 697, in forward
encoder_out, encoder_out_lens = self.encode(speech, speech_lengths, text)
File "/usr/local/lib/python3.10/dist-packages/funasr/models/sense_voice/model.py", line 759, in encode
[[self.textnorm_int_dict[int(style)]] for style in text[:, 3]]
IndexError: index 3 is out of bounds for dimension 1 with size 1
E0104 08:07:40.488000 140673480103040 torch/distributed/elastic/multiprocessing/api.py:826] failed (exitcode: 1) local_rank: 0 (pid: 2542283) of binary: /usr/bin/python3
funasr:1.2.9
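The final traceback frame reads `text[:, 3]`, so each row of the `text` batch needs at least four columns, and the error says the batch has only one. A minimal pure-Python analogue of that failure, using nested lists instead of tensors, is sketched below. The `column` helper and the token ids are hypothetical and are not part of FunASR.

```python
def column(batch, idx):
    """Return column `idx` of a 2-D batch given as a list of rows."""
    # The narrowest row decides whether the column exists for every sample.
    width = min((len(row) for row in batch), default=0)
    if idx >= width:
        raise IndexError(
            f"index {idx} is out of bounds for dimension 1 with size {width}"
        )
    return [row[idx] for row in batch]


# Hypothetical token ids: rows wide enough for column 3 to exist.
ok_batch = [[11, 12, 13, 14, 101, 102], [21, 22, 23, 24, 201, 202]]
print(column(ok_batch, 3))  # [14, 24]

# Rows with a single entry, matching "dimension 1 with size 1" in the log.
bad_batch = [[-1], [-1]]
try:
    column(bad_batch, 3)
except IndexError as exc:
    print(exc)  # index 3 is out of bounds for dimension 1 with size 1
```

If this analogy holds, the question becomes why the batch reaching `encode` carries a one-column `text`, which may be related to the earlier `data is empty!` message.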
The jsonl data was generated with the sensevoice2jsonl command and includes the language, event, and emotion fields:
{"key": "seg_000004", "source": "seg_000004.wav", "source_len": 179, "target": "甚至出现交易", "target_len": 6, "with_or_wo_itn": "<|woitn|>", "text_language": "<|zh|>", "emo_target": "<|NEUTRAL|>", "event_target": "<|Speech|>"}
The training command comes from: https://github.com/modelscope/FunASR/blob/main/examples/industrial_data_pretraining/sense_voice/finetune.sh
I changed the script to train on a single GPU, and also changed train_tool and batch_size.
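Questions:
1. Why is target_len wrong in the jsonl generated by sensevoice2jsonl?
2. total_num of samplers: 7. This comes from the check in the index_ds.py source; the same jsonl gives different results in different environments.
3. ERROR: data is empty! Does this message mean the audio files do not exist? ls -l shows that the files are present at that path.
4. IndexError: index 3 is out of bounds for dimension 1 with size 1. I tried the following:
Attempt 1: Using the data provided by funasr (from git), I generated the jsonl from the txt files with the sensevoice2jsonl command; training fails with the same error.
Attempt 2: I added the following to the command in the training script: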
++dataset_type="index_ds"
++dataset="SenseVoiceCTCDataset" or dataset="SenseVoiceDataset"; the same error occurs.
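To narrow down questions 1 and 3, each jsonl line can be checked independently of the trainer. The sketch below is a hypothetical helper, not a FunASR tool. It assumes the expected keys are exactly those in the sample line above, that relative `source` paths resolve against a directory you pass in, and that `target_len` may be compared with the character count of `target`. That last assumption may not match how sensevoice2jsonl counts tokens, so a mismatch is only a hint.

```python
import json
import os

# Keys copied from the sample jsonl line in this report; this is an
# assumption, not the authoritative schema expected by FunASR.
EXPECTED_KEYS = {
    "key", "source", "source_len", "target", "target_len",
    "with_or_wo_itn", "text_language", "emo_target", "event_target",
}


def check_jsonl(path, audio_root="."):
    """Return a list of (line_number, message) findings for one jsonl file."""
    findings = []
    with open(path, encoding="utf-8") as f:
        for lineno, line in enumerate(f, start=1):
            line = line.strip()
            if not line:
                continue
            try:
                item = json.loads(line)
            except json.JSONDecodeError as exc:
                findings.append((lineno, f"invalid JSON: {exc}"))
                continue
            if not isinstance(item, dict):
                findings.append((lineno, "line is not a JSON object"))
                continue
            missing = sorted(EXPECTED_KEYS - item.keys())
            if missing:
                findings.append((lineno, f"missing keys: {missing}"))
            source = item.get("source")
            if isinstance(source, str):
                # An absolute `source` is used as-is by os.path.join.
                wav = os.path.join(audio_root, source)
                if not os.path.isfile(wav):
                    findings.append((lineno, f"audio not found: {wav}"))
            target = item.get("target")
            if isinstance(target, str) and "target_len" in item:
                # Character count is only a rough reference (see lead-in).
                if item["target_len"] != len(target):
                    findings.append((
                        lineno,
                        f"target_len={item['target_len']} "
                        f"but len(target)={len(target)}",
                    ))
    return findings
```

Calling `check_jsonl("train_example.jsonl", audio_root="<directory containing the wav files>")` returns an empty list when every line parses, has all the keys, and points at an existing file. For the sample line above, `len("甚至出现交易")` is 6, which equals its `target_len`.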
Environment:
OS (e.g., Linux): Linux (Ubuntu)
FunASR Version (e.g., 1.0.0): 1.2.9
ModelScope Version (e.g., 1.11.0): 1.33.0
PyTorch Version (e.g., 2.0.0):
How you installed funasr (pip, source): pip3 install funasr
Python version: 3.12.7
GPU (e.g., V100M32) L40
CUDA/cuDNN version (e.g., cuda11.7):
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2021 NVIDIA Corporation
Built on Thu_Nov_18_09:45:30_PST_2021
Cuda compilation tools, release 11.5, V11.5.119
Build cuda_11.5.r11.5/compiler.30672275_0
Docker version (e.g., funasr-runtime-sdk-cpu-0.4.1): None
Any other relevant information: None