
AISHELL reproduction results do not match the README results #4

@brightLLer

Hi everyone, we reproduced the whisper large-v3 + Qwen2-7B experiment on AISHELL-1, but the model output shows obvious repetition (the last few characters of an utterance are repeated many times) as well as stray punctuation and special symbols. Raising the LLM's repetition_penalty at inference time reduces the repetition somewhat, but even after removing all punctuation the character error rate is still above 11%, far from the 5.55% reported in README.md. The whisper feature extraction in the code uses 80 mel bins, so we added an n_mels=128 argument to support large-v3. Our training command is below (a sketch of the feature-extraction change and of how we score CER follows the command):

torchrun --standalone --nnodes=1 --nproc_per_node=8 train.py \
        --llm_model_name_or_path Qwen2-7B-Instruct \
        --whisper_model_name_or_path whisper/large-v3.pt \
        --data_path aishell/train/train.jsonl \
        --eval_data_path aishell/dev/eval.jsonl \
        --bf16 True \
        --output_dir Qwen-7B-Instruct-whisper-large-v3-aishell \
        --num_train_epochs 10 \
        --per_device_train_batch_size 16 \
        --per_device_eval_batch_size 8 \
        --gradient_accumulation_steps 8 \
        --evaluation_strategy "no" \
        --save_strategy "steps" \
        --save_steps 100 \
        --save_total_limit 10 \
        --learning_rate 3e-4 \
        --weight_decay 0.01 \
        --adam_beta2 0.95 \
        --warmup_ratio 0.01 \
        --lr_scheduler_type "cosine" \
        --logging_steps 1 \
        --report_to "none" \
        --model_max_length 512 \
        --n_mels 128 \
        --gradient_checkpointing \
        --dataloader_num_workers 4 \
        --dataloader_prefetch_factor 10 \
        --deepspeed ds_config_zero3.json
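
For completeness, here is roughly what the two pieces mentioned above look like. The audio file name and the scoring code are our own illustrations, not code from this repo.

The 128-mel change, assuming the openai-whisper package (recent releases expose an n_mels argument on log_mel_spectrogram; large-v3 expects 128 bins instead of the default 80):

```python
import whisper

# Placeholder AISHELL utterance name; any 16 kHz wav works here.
audio = whisper.load_audio("BAC009S0002W0122.wav")
# large-v3 expects 128 mel bins; the library default is 80.
mel = whisper.log_mel_spectrogram(audio, n_mels=128)  # shape: (128, n_frames)
```

And the scoring we describe, applied after raising repetition_penalty at inference (presumably the HuggingFace generate() argument of the same name): strip punctuation and symbols from reference and hypothesis, then compute corpus CER as total edit distance over total reference characters. This is our own scorer, not the repo's eval script:

```python
import unicodedata

def strip_punct(text: str) -> str:
    """Drop punctuation, symbols, and whitespace before character-level scoring."""
    return "".join(
        ch for ch in text
        if not unicodedata.category(ch).startswith(("P", "S")) and not ch.isspace()
    )

def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two character sequences (rolling row)."""
    prev = list(range(len(hyp) + 1))
    for i, rc in enumerate(ref, 1):
        cur = [i] + [0] * len(hyp)
        for j, hc in enumerate(hyp, 1):
            cur[j] = min(
                prev[j] + 1,               # deletion
                cur[j - 1] + 1,            # insertion
                prev[j - 1] + (rc != hc),  # substitution
            )
        prev = cur
    return prev[len(hyp)]

def corpus_cer(refs, hyps) -> float:
    """CER = total edit distance / total reference characters, after stripping punctuation."""
    dist = sum(edit_distance(strip_punct(r), strip_punct(h)) for r, h in zip(refs, hyps))
    chars = sum(len(strip_punct(r)) for r in refs)
    return dist / max(chars, 1)

if __name__ == "__main__":
    # Toy example showing the tail-repetition pattern we see in the outputs.
    refs = ["今天天气不错。"]
    hyps = ["今天天气不错错错错"]
    print(f"CER: {corpus_cer(refs, hyps):.2%}")  # 3 insertions / 6 ref chars = 50.00%
```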
