This repository was archived by the owner on May 27, 2022. It is now read-only.

Commit b399634

Initial commit
0 parents  commit b399634

File tree

407 files changed: 81,845 additions and 0 deletions

.github/CODEOWNERS

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
Team SIGNALS @kakaobrain

.github/ISSUE_TEMPLATE/bug.md

Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
---
name: Report a bug
about: Bug report
labels: 'bug'
---

## How to reproduce

-

## Environment

-

.github/ISSUE_TEMPLATE/feature.md

Lines changed: 16 additions & 0 deletions
@@ -0,0 +1,16 @@
---
name: Request a feature
about: Feature request
labels: 'feature'
---

## Describe a requested feature

-

## Expected behavior

```python
>>> a = Foo()
>>> a.predict()
```

.github/ISSUE_TEMPLATE/install.md

Lines changed: 9 additions & 0 deletions
@@ -0,0 +1,9 @@
---
name: Install issue
about: Issue about installation
labels: 'install'
---

## Environment

-

.github/PULL_REQUEST_TEMPLATE.md

Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
## Title
-

## Description
-

## Linked Issues
- resolved #00

.gitignore

Lines changed: 23 additions & 0 deletions
@@ -0,0 +1,23 @@
# project
__pycache__
.pytest_cache
external_lib
core.*
.idea
.empty
.coverage

# docs
*.bat

# deploy
build*
dist*
*.egg*

# test
*.flac
*.wav
*.pt
*.tmp
tmp*

CONTRIBUTING.md

Lines changed: 25 additions & 0 deletions
@@ -0,0 +1,25 @@
# Contributing to Pororo

## Style check guide

- `pororo` relies on `black` and `isort` to format its source code consistently. After you make changes, format them with:

```bash
$ make style
```

- `pororo` also relies on `yapf` to maintain neat code structure. To apply `yapf`, follow [installation guide](https://github.com/google/yapf#installation) and utilize it with:

```
PYTHONPATH=DIR python DIR/yapf pororo --style '{based_on_style: google, indent_width: 4}' --recursive -i
```

<br>

## Quality check guide

- `pororo` uses `flake8` to check for coding mistakes. You can run the checks with:

```bash
$ make quality
```
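
The `yapf` invocation in the style guide can also be assembled from a script. The following is a minimal sketch and is not part of this commit: `build_yapf_command` and `yapf_dir` are hypothetical names, with `yapf_dir` standing in for the `DIR` placeholder in the guide. The sketch only builds the command and environment; it does not run `yapf`.

```python
import os
import sys

# Style string copied from the CONTRIBUTING guide.
YAPF_STYLE = "{based_on_style: google, indent_width: 4}"


def build_yapf_command(yapf_dir, target="pororo"):
    """Return (argv, env) mirroring the yapf command line in the guide.

    `yapf_dir` is a hypothetical path to a local yapf checkout; it plays
    the role of the DIR placeholder in the guide.
    """
    argv = [
        sys.executable,
        os.path.join(yapf_dir, "yapf"),
        target,
        "--style",
        YAPF_STYLE,
        "--recursive",
        "-i",
    ]
    # PYTHONPATH=DIR from the guide, layered over the current environment.
    env = dict(os.environ, PYTHONPATH=yapf_dir)
    return argv, env


if __name__ == "__main__":
    argv, env = build_yapf_command("/path/to/yapf")
    # To format in place, pass both to subprocess.run(argv, env=env, check=True).
    print(" ".join(argv))
```

Keeping the arguments as a list avoids shell quoting problems with the braces and spaces in the style string.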

Dockerfile

Lines changed: 69 additions & 0 deletions
@@ -0,0 +1,69 @@
FROM pytorch/pytorch:1.6.0-cuda10.1-cudnn7-devel

WORKDIR /app

COPY . .

RUN apt-get update && \
    apt-get install -y apt-utils \
    wget \
    git \
    gcc \
    build-essential \
    cmake \
    libpq-dev \
    libsndfile-dev \
    libboost-system-dev \
    libboost-thread-dev \
    libboost-program-options-dev \
    libboost-test-dev \
    libeigen3-dev \
    zlib1g-dev \
    libbz2-dev \
    liblzma-dev \
    libsndfile1-dev \
    libopenblas-dev \
    libfftw3-dev \
    libgflags-dev \
    libgoogle-glog-dev \
    libgl1-mesa-glx \
    libomp-dev

# 1. install pororo
RUN pip install pororo

# 2. install brainspeech
RUN pip install soundfile \
    torchaudio==0.6.0 \
    pydub

RUN conda install -y -c conda-forge librosa

# 3. install etc modules
RUN pip install librosa \
    kollocate \
    koparadigm \
    g2pk \
    fugashi \
    ipadic \
    romkan \
    g2pM \
    jieba \
    opencv-python \
    scikit-image \
    python-mecab-ko

WORKDIR /app/external_lib

RUN git clone https://github.com/kpu/kenlm.git
WORKDIR /app/external_lib/kenlm/build
RUN cmake .. -DCMAKE_BUILD_TYPE=Release -DCMAKE_POSITION_INDEPENDENT_CODE=ON
RUN make -j 16
ENV KENLM_ROOT_DIR="/app/external_lib/kenlm/"

WORKDIR /app/external_lib
RUN git clone -b v0.2 https://github.com/facebookresearch/wav2letter.git
WORKDIR /app/external_lib/wav2letter/bindings/python
RUN pip install -e .

WORKDIR /app
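
The Dockerfile builds KenLM under `/app/external_lib/kenlm/build` and exports `KENLM_ROOT_DIR` before installing the wav2letter Python bindings. A check along the following lines could confirm, inside a container, that the variable points at a directory containing a `build` folder. This is a sketch under assumptions: `find_kenlm_build` is a hypothetical helper, and how the bindings consume the variable is not shown in this commit.

```python
import os
from pathlib import Path


def find_kenlm_build(env=None):
    """Return the KenLM build directory named by KENLM_ROOT_DIR, or None.

    `env` defaults to the process environment; a plain dict can be passed
    instead for testing.
    """
    env = os.environ if env is None else env
    root = env.get("KENLM_ROOT_DIR")
    if not root:
        return None
    build_dir = Path(root) / "build"
    return build_dir if build_dir.is_dir() else None


if __name__ == "__main__":
    build = find_kenlm_build()
    print(build if build else "KENLM_ROOT_DIR is unset or has no build directory")
```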

INSTALL.ko.md

Lines changed: 131 additions & 0 deletions
@@ -0,0 +1,131 @@
# Installation Guide

This document describes the libraries required to install Pororo and how to install them.

<br>

## Common modules

- The following libraries must be installed to use Pororo, regardless of task.
- They are installed automatically when Pororo is installed with `pip install`, so no further action is needed.

```python
requirements = [
    "torch==1.6.0",
    "torchvision==0.7.0",
    "pillow>=4.1.1",
    "fairseq==0.10.2",
    "transformers>=4.0.0",
    "sentence_transformers==0.4.1.2",
    "nltk==3.5",
    "word2word",
    "wget",
    "joblib",
    "lxml",
    "g2p_en",
    "whoosh",
    "marisa-trie",
    "kss",
]
```
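
To see how an existing environment compares with these pins, the requirement strings can be split and looked up with the standard library's `importlib.metadata`. The following is a minimal sketch, not code from the repository: `split_requirement` and `installed_version` are hypothetical helpers, only the `==` and `>=` operators used in the list are handled, and `REQUIREMENTS` copies just a few entries for brevity.

```python
import re
from importlib import metadata

# A few entries copied from the requirements list in the guide.
REQUIREMENTS = ["torch==1.6.0", "torchvision==0.7.0", "pillow>=4.1.1", "nltk==3.5", "kss"]

# name, optional operator (== or >=), optional version
_SPEC = re.compile(r"^([A-Za-z0-9_.\-]+)\s*(==|>=)?\s*(.*)$")


def split_requirement(spec):
    """Split 'name==1.0' into (name, operator, version).

    Bare names such as 'kss' yield empty operator and version strings.
    """
    name, op, version = _SPEC.match(spec).groups()
    return name, op or "", version


def installed_version(name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    for spec in REQUIREMENTS:
        name, op, wanted = split_requirement(spec)
        found = installed_version(name)
        print(f"{name}: wants {op}{wanted or 'any'}, installed {found}")
```

The script only reports versions; comparing them against the pins is left to `pip` itself.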

<br>

## Korean

- Some Korean tasks may require additional libraries.

- `python-mecab-ko` is required for several tasks, including **Korean Tokenization, PoS Tagging, and Dependency Parsing**.

```console
pip install python-mecab-ko
```

- `kollocate` is required for the **Korean Collocation** task.

```console
pip install kollocate
```

- `koparadigm` is required for the **Korean Morphological Inflection** task.

```console
pip install koparadigm
```

- `g2pk` is required for the **Korean Grapheme-to-Phoneme** task.

```console
pip install g2pk
```

<br>

## Japanese

- Some Japanese tasks may require additional libraries.

- `fugashi` and `ipadic` are required for tokenization with the **Japanese RoBERTa** model and for **Japanese PoS Tagging**.

```console
pip install fugashi ipadic
```

- `romkan` is required for the **Japanese Grapheme-to-Phoneme** task.

```console
pip install romkan
```

<br>

## Chinese

- Some Chinese tasks may require additional libraries.

- `g2pM` is required for the **Chinese Grapheme-to-Phoneme** task.

```console
pip install g2pM
```

- `jieba` is required for the **Chinese PoS Tagging** task.

```console
pip install jieba
```
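
Because these per-language libraries are optional, it can help to check which ones are missing before running a task. The sketch below groups the pip package names from the sections above and reports the ones not installed. It is an illustration, not part of the guide: the language keys `ko`, `ja` and `zh` and the helper names are assumptions, and the lookup uses pip distribution names through `importlib.metadata` rather than import names.

```python
from importlib import metadata

# pip distribution names taken from the guide, grouped by language.
# The language keys are hypothetical labels chosen for this sketch.
OPTIONAL_PACKAGES = {
    "ko": ["python-mecab-ko", "kollocate", "koparadigm", "g2pk"],
    "ja": ["fugashi", "ipadic", "romkan"],
    "zh": ["g2pM", "jieba"],
}


def _is_installed(name):
    """Return True if a distribution with this pip name is installed."""
    try:
        metadata.version(name)
        return True
    except metadata.PackageNotFoundError:
        return False


def missing_packages(lang, is_installed=_is_installed):
    """Return the optional packages for `lang` that are not installed."""
    return [name for name in OPTIONAL_PACKAGES.get(lang, []) if not is_installed(name)]


if __name__ == "__main__":
    for lang in OPTIONAL_PACKAGES:
        missing = missing_packages(lang)
        if missing:
            print(f"{lang}: pip install " + " ".join(missing))
        else:
            print(f"{lang}: all optional packages found")
```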

<br>

## Others

### Linux-supported tasks

- Automatic Speech Recognition
- Speech Translation
- Optical Character Recognition
- Image Captioning

<br>

### Automatic Speech Recognition

- The speech recognition module requires [wav2letter](https://github.com/facebookresearch/wav2letter). You can install `wav2letter` by running `asr-install.sh` from the repository.

```console
bash asr-install.sh
```

<br>

### Optical Character Recognition

- The OCR module requires the following libraries.

```console
apt-get install -y libgl1-mesa-glx
```

```console
pip install opencv-python scikit-image
```
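
The guide lists four tasks under its Linux heading. A caller could guard those tasks with a platform check such as the one sketched here. This reads the list as "tasks that need Linux", which is an interpretation rather than something the guide states outright, and `is_task_available` is a hypothetical helper, not a Pororo API.

```python
import platform

# Tasks listed under the Linux heading in the guide.
LINUX_TASKS = {
    "Automatic Speech Recognition",
    "Speech Translation",
    "Optical Character Recognition",
    "Image Captioning",
}


def is_task_available(task, system=None):
    """Return False only for a Linux-listed task on a non-Linux system.

    `system` defaults to platform.system(); pass a string to override it.
    """
    system = platform.system() if system is None else system
    return task not in LINUX_TASKS or system == "Linux"


if __name__ == "__main__":
    for task in sorted(LINUX_TASKS):
        status = "available" if is_task_available(task) else "needs Linux"
        print(f"{task}: {status}")
```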
