Merge japanese-to-english multilingual branch #1860

Merged: 36 commits, Feb 3, 2025
Commits
5062f12
add streaming support to reazonresearch
Aug 1, 2024
5a0c247
update README for streaming
Aug 1, 2024
62eb090
update streaming decoding file
Aug 1, 2024
6317405
update streaming decode
Aug 1, 2024
2d2daf6
remove streaming/greedy_search results folder
Aug 1, 2024
e052481
update for streaming
Aug 1, 2024
529d92f
Add docker images for torch 2.4 (#1704)
csukuangfj Jul 25, 2024
8189d11
remove prints
Aug 1, 2024
916e84d
resolve PR issue
Aug 1, 2024
707a956
Add multi_ja_en
Sep 14, 2024
2e355a8
Update README.md
baileyeet Sep 14, 2024
b6af607
Update README.md
baileyeet Dec 24, 2024
7aedda0
Update RESULTS.md
baileyeet Dec 24, 2024
7b1445b
Update RESULTS.md
baileyeet Dec 25, 2024
4a55a10
Update RESULTS.md
baileyeet Dec 25, 2024
1bc7f07
Delete egs/multi_ja_en/ASR/zipformer/streaming/greedy_search directory
baileyeet Dec 25, 2024
68e1c3c
formatting
baileyeet Nov 25, 2024
a2bb272
formatting
baileyeet Nov 25, 2024
4604be8
add onnx decode
baileyeet Dec 25, 2024
f421001
remove unnecessary folders
baileyeet Dec 25, 2024
564b632
fix repeated definition of tokenize_by_ja_char
baileyeet Jan 7, 2025
5c142d4
Merge branch 'master' into einichi
baileyeet Jan 7, 2025
8a3790c
clean up files
baileyeet Jan 8, 2025
9d6211e
remove test
baileyeet Jan 8, 2025
84c91db
edit prepare.sh
baileyeet Jan 14, 2025
1244de9
update python ver
baileyeet Jan 14, 2025
aa74f6c
update python ver
baileyeet Jan 14, 2025
b574e68
udpate symlink
baileyeet Jan 14, 2025
9ab3021
Reformatted streaming_decode.py with flake8
baileyeet Jan 14, 2025
3eec244
Update RESULTS.md
baileyeet Jan 20, 2025
efc0536
Merge branch 'k2-fsa:master' into einichi
baileyeet Jan 27, 2025
50c3270
Update generate_build_matrix.py
JinZr Jan 28, 2025
b8ce806
Update build-docker-image.yml
JinZr Jan 28, 2025
5cf7e42
Update zipformer.py
JinZr Jan 28, 2025
1cb4594
Update zipformer.py
JinZr Jan 28, 2025
b9efbf8
Update utils.py
JinZr Jan 28, 2025
formatting
baileyeet committed Dec 25, 2024
commit a2bb2724e112a73f5d8c3ff36cec317c7c72c0c8
1 change: 1 addition & 0 deletions egs/multi_ja_en/ASR/local/prepare_lang.py
1 change: 1 addition & 0 deletions egs/multi_ja_en/ASR/local/prepare_lang_bbpe.py
@@ -34,6 +34,7 @@
"""

import argparse
import re
from pathlib import Path
from typing import Dict, List, Tuple

3 changes: 3 additions & 0 deletions egs/multi_ja_en/ASR/local/train_bbpe_model.py
@@ -54,6 +54,9 @@ def tokenize_by_ja_char(line: str) -> str:
    """
    pattern = re.compile(r"([\u3040-\u309F\u30A0-\u30FF\u4E00-\u9FFF])")
    chars = pattern.split(line.strip())
    return " ".join(
        [w.strip().upper() if not pattern.match(w) else w for w in chars if w.strip()]
    )


def get_args():
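
The `tokenize_by_ja_char` hunk above can be tried in isolation. The following is a minimal, self-contained sketch rather than the PR's exact file: the function signature comes from the hunk header, while hoisting the regex into a module-level constant named `JA_CHAR_PATTERN` is an assumption made here for readability. It splits a line on every hiragana, katakana, or CJK ideograph, keeps those characters as single tokens, and upper-cases whatever lies between them.

```python
import re

# Assumed name (not in the PR): the same character class as in the hunk,
# covering Hiragana (U+3040-309F), Katakana (U+30A0-30FF) and
# CJK Unified Ideographs (U+4E00-9FFF). The capturing group makes
# re.split() keep each matched character in the result list.
JA_CHAR_PATTERN = re.compile(r"([\u3040-\u309F\u30A0-\u30FF\u4E00-\u9FFF])")


def tokenize_by_ja_char(line: str) -> str:
    """Separate Japanese characters with spaces and upper-case the rest."""
    chunks = JA_CHAR_PATTERN.split(line.strip())
    return " ".join(
        # Japanese characters pass through unchanged; other chunks
        # (e.g. English words) are trimmed and upper-cased.
        chunk if JA_CHAR_PATTERN.match(chunk) else chunk.strip().upper()
        for chunk in chunks
        if chunk.strip()  # drop empty and whitespace-only chunks
    )


if __name__ == "__main__":
    print(tokenize_by_ja_char("今日は good day です"))
```

Because the pattern uses a capturing group, `split` returns the Japanese characters themselves along with the text between them, which is why a single pass is enough to produce the space-separated output.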
4 changes: 2 additions & 2 deletions egs/multi_ja_en/ASR/prepare.sh
@@ -73,7 +73,7 @@ if [ $stage -le 3 ] && [ $stop_stage -ge 3 ]; then
ln -svf $(realpath ../../../../reazonspeech/ASR/data/manifests/feats_test) .
cd ../..
else
- log "Abort! Please run ./prepare.sh --stage 2 --stop-stage 2"
+ log "Abort! Please run ../../reazonspeech/ASR/prepare.sh --stage 0 --stop-stage 2"
exit 1
fi
fi
@@ -184,4 +184,4 @@ if [ $stage -le 4 ] && [ $stop_stage -ge 4 ]; then
done
fi

- log "prepare_einishi.sh: PREPARATION DONE"
+ log "prepare.sh: PREPARATION DONE"