
Speechjoey master #12


Open
wants to merge 182 commits into base: speechjoey

Conversation

B-Czarnetzki

Well, the problem with this merge is that speechjoey is a good bit behind the up-to-date joeynmt, since I developed it early last year, mainly before the multi-GPU changes. I am not sure how easy a merge would be, or whether it is even possible with the current structure, since I am not familiar with the multi-GPU implementation. Can we force the branch to be the current speechjoey master and merge with the up-to-date joeynmt down the line? Or @may-, do you think a merge could easily be done?

…ev, test) using a new AudioDataset class (MonoAudioDataset is also available), logging audio info
@may-

may- commented Jan 11, 2021

@B-Czarnetzki Ah ok, could you tell me which commit id you want to go back to? I'll roll back to that point.

@may-

may- commented Jan 11, 2021

@B-Czarnetzki, I rolled back to the commit point of Sariya's master branch. Now I see no conflicts in your pull request. I'll go through your extension and give you feedback this week.

@B-Czarnetzki
Author

@may- Ok, thanks. I would have said roll back to 48baf21, but I think it should be the same.

@may-

may- commented Jan 15, 2021

@B-Czarnetzki thank you again for your enormous effort. I'm still reading your code, but I'd like to share my comments so far, so that you can start fixing them.

  • speechjoey/data.py

    • [fix] Where did you import librosa? ([suggestion] I personally think it would be better to replace it with torchaudio; see the second sketch after this list.)
    • [fix] An instance in self.examples doesn't have an "audio" field, does it? Which object does def getaudio() refer to, and where will this function be called?
    • [fix] You shouldn't construct src_vocab and src_field in the load_audio_data() function.
    • [question] Why do you need the "src" (audio_dummy) and "conv" (conv_dummy) fields?
    • [suggestion] It would be better to get rid of the sklearn dependency.
    • [suggestion] For a big audio dataset, loading all features at once could easily trigger a memory error. How about looking up the file (np.load()) every time the iterator constructs a batch? (See the first sketch after this list.)
    • [suggestion] How about moving the scaling part to scripts/audio_preprocessing? That is, joeynmt would expect the saved numpy data to already be scaled. (Joeynmt intentionally excludes tokenization for text, so we could respect that philosophy of minimalism in audio processing, too; see the second sketch after this list.)
  • speechjoey/model.py

    • [fix] What is self.src_emb? You don't use Embeddings for audio input, do you? You shouldn't construct src_embed at all in def build_model().
  • speechjoey/encoders.py

    • [fix] Typo in line 225: NotImmplementedError.
    • [suggestion] How about separating the Conv layers from the RNN and making the Conv part configurable as well? Currently the conv layers are hard-coded.
  • speechjoey/decoders.py

    • Could you please add comments on why you need the self.rnn2 layer in ConditionalRecurrentDecoder? ([suggestion] You could integrate it into the RecurrentDecoder; ConditionalRecurrentDecoder is almost a duplicate of it, as far as I understand.)
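
To illustrate the lazy-loading suggestion above: a minimal sketch, assuming a plain torch.utils.data.Dataset rather than the torchtext-based classes in this PR (the class and field names are illustrative, not part of the speechjoey code). Each feature matrix is read from its .npy file only when the iterator asks for it, so the whole corpus never has to fit into memory at once.

```python
import numpy as np
from torch.utils.data import Dataset


class LazyAudioDataset(Dataset):
    """Keeps only the .npy paths in memory and loads features on demand."""

    def __init__(self, feature_paths, transcripts):
        # feature_paths: list of paths to pre-extracted .npy feature files
        # transcripts:   list of target sentences aligned with the features
        assert len(feature_paths) == len(transcripts)
        self.feature_paths = feature_paths
        self.transcripts = transcripts

    def __len__(self):
        return len(self.feature_paths)

    def __getitem__(self, idx):
        # np.load() is called only when a batch is constructed,
        # instead of loading the full feature set up front.
        features = np.load(self.feature_paths[idx])  # shape: (frames, n_mels)
        return features, self.transcripts[idx]
```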

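And a minimal sketch of the preprocessing-script idea, assuming torchaudio replaces librosa and that the saved numpy data is already scaled before training, mirroring how joeynmt leaves text tokenization to an external step (the function name, paths, and parameters below are illustrative, not part of the PR):

```python
# Extract log-mel features once, normalize them, and store them as .npy,
# so the data loading code only has to call np.load().
import numpy as np
import torch
import torchaudio


def extract_features(wav_path: str, out_path: str, n_mels: int = 80) -> None:
    waveform, sample_rate = torchaudio.load(wav_path)          # (channels, time)
    mel = torchaudio.transforms.MelSpectrogram(
        sample_rate=sample_rate, n_mels=n_mels)(waveform[0])   # (n_mels, frames)
    log_mel = torch.log(mel + 1e-10).transpose(0, 1)           # (frames, n_mels)
    # per-utterance mean/variance normalization, done here once
    # instead of inside the data loading code
    scaled = (log_mel - log_mel.mean(dim=0)) / (log_mel.std(dim=0) + 1e-10)
    np.save(out_path, scaled.numpy())


if __name__ == "__main__":
    extract_features("data/sample.wav", "data/sample.npy")
```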