==============================================================================
		Language Localization on Audio Files and Live Audio
==============================================================================
Nabeel Danish

A spectrogram-based approach to identifying the languages most commonly spoken in Pakistan:
Urdu, English, Arabic, Pashto, and Sindhi. The model is constructed as a CNN with LSTM layers
and achieves an accuracy of 95%.

------------------------------------------
		Usage
------------------------------------------

Import the file languageLocalization.py to use the function described below.

------------------------------------------
		Functions
------------------------------------------
def languageLocalize(inputFile, extension, chunk_file):

	Parameters:

	inputFile -- path to the audio file
	extension -- extension of the audio file
	chunk_file -- path to the folder where the model stores preprocessing data

	Return Value:

	pred -- Python array of strings containing the language predicted for each one-second interval
		pred[i] is the language detected between the (i - 1)th and (i)th second
		Example:
			pred[3] = 'english' means English was detected between 2 and 3 sec of the audio
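
The call and its return value can be sketched as follows. This is a minimal sketch, not
part of the repository: run_localization and summarize_runs are hypothetical helpers, the
sample pred list is made up for illustration, and the expected format of the extension
argument is not specified here, so it is passed through unchanged.

```python
from itertools import groupby


def run_localization(input_file, extension, chunk_dir):
    """Hypothetical wrapper: call languageLocalize with the documented arguments."""
    # Imported lazily so summarize_runs can be used without the model's dependencies.
    from languageLocalization import languageLocalize
    return languageLocalize(input_file, extension, chunk_dir)


def summarize_runs(pred):
    """Collapse per-index labels into (first_index, last_index, language) runs."""
    runs = []
    index = 0
    for language, group in groupby(pred):
        length = len(list(group))
        runs.append((index, index + length - 1, language))
        index += length
    return runs


# Made-up predictions standing in for the return value of languageLocalize.
pred = ['urdu', 'urdu', 'english', 'english', 'english', 'pashto']
for first, last, language in summarize_runs(pred):
    print(f"pred[{first}..{last}]: {language}")
```

Under the convention stated above, a run covering pred[first] to pred[last] would span
roughly from second (first - 1) to second (last) of the audio.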

------------------------------------------
	Live Audio Detection
------------------------------------------
The notebook contains the script to run live detection on audio.

------------------------------------------
		Dependencies
------------------------------------------
TensorFlow
Keras
NumPy
SciPy
pydub
librosa