ilparser-api

An API for Parsing Indic Languages

Note: This API runs the full dependency parser, which takes up to 2.6 seconds on average for a 20-word sentence.

Installing Dependencies (Ubuntu):

$ sudo -E apt-get install autoconf cpanminus gcc git libgdbm-dev libglib2.0-dev make python-numpy \
  python-pydot python-urllib3 python-pip

Installing Dependencies (Fedora):

$ sudo -E dnf install autoconf git gcc gdbm-devel glib2-devel make numpy pydot perl-App-cpanminus \
  python-urllib3 python-pip

Installing Perl Dependencies:

$ cpanm Data::Dumper Dir::Self IPC::Run List::Util Config::IniFiles Mojolicious::Lite

Installing CRF++:

$ curl -L https://github.com/ltrc/ilparser-api/releases/download/0.1/CRF.-0.58.tar.gz | tar -xz
$ cd CRF++-0.58
$ ./configure && sudo make install
$ echo "/usr/local/lib" | sudo tee /etc/ld.so.conf.d/crfpp.conf
$ sudo ldconfig

Setting up the repo:

$ git clone https://github.com/ltrc/ilparser-api.git
$ cd ilparser-api
$ ./setup.sh

To run the API server:

$ perl api.pl prefork

To run from the pre-built Docker container, type:

$ docker pull ltrc/ilparser-api:v1 
$ docker run --name ilparser-api -dit ltrc/ilparser-api:v1
# To find out the IP address of the container:
$ docker inspect --format '{{ .NetworkSettings.IPAddress }}' ilparser-api 
# Replace 172.17.0.2 below with the IP obtained from the above command.
$ curl -s 172.17.0.2/parse --data lang=hin --data data=" देश के टूरिजम में राजस्थान एक अहम जगह रखता है।"

Example use of the API:

$ curl -s localhost:3000/parse --data lang=hin \
    --data data="माना जाता है कि अमृतमंथन के बाद अमृत की कुछ बूँदें यहाँ गिरी थीं , इसलिए इसे ब्रह्मकुंड कहा जाता है ." | \
    jq '.["dependencyparse-11"]' | \
    sed -e 's/\\t/\t/g' -e 's/\\n/\n/g'  -e 's/\\"/\"/g' -e 's/^"//' -e 's/"$//'
1	माना	मान	v	VM	case-|vib-या_जा+ता_है|psd-|chunkId-VGF|pers-2|num-sg|tam-yA|sem-|cp-|gen-m	0	root	_	_
2	जाता	जा	v	VAUX	case-|vib-ता|psd-|chunkId-VGF|pers-any|num-sg|tam-wA|sem-|cp-|gen-m	1	lwg__vaux	_	_
3	है	है	v	VAUX	case-|vib-है|psd-|chunkId-VGF|pers-2|num-sg|tam-hE|sem-|cp-|gen-any	1	lwg__vaux	_	_
4	कि	कि	avy	CC	case-|vib-|psd-|chunkId-CCP|pers-|num-|tam-|sem-|cp-|gen-	1	k2	_	_
5	अमृतमंथन	अमृतमंथन	n	NNP	case-o|vib-०_का_बाद|psd-|chunkId-NP|pers-3|num-sg|tam-0|sem-|cp-|gen-m	14	k7t	_	_
6	के	का	psp	PSP	case-o|vib-का|psd-|chunkId-NP|pers-|num-sg|tam-kA|sem-|cp-|gen-m	5	lwg__psp	_	_
7	बाद	बाद	adv	NST	case-|vib-|psd-|chunkId-NP|pers-|num-|tam-|sem-|cp-|gen-	5	lwg__psp	_	_
8	अमृत	अमृत	n	NNP	case-o|vib-०_का|psd-|chunkId-NP2|pers-3|num-sg|tam-0|sem-|cp-|gen-m	11	r6	_	_
9	की	का	psp	PSP	case-d|vib-का|psd-|chunkId-NP2|pers-|num-sg|tam-kA|sem-|cp-|gen-f	8	lwg__psp	_	_
10	कुछ	कुछ	adj	QF	case-any|vib-|psd-|chunkId-NP3|pers-|num-any|tam-|sem-|cp-|gen-any	11	nmod__adj	_	_
11	बूँदें	बूँद	n	NN	case-d|vib-0|psd-|chunkId-NP3|pers-3|num-pl|tam-0|sem-|cp-|gen-f	14	k1	_	_
12	यहाँ	यहाँ	adv	PRP	case-|vib-|psd-|chunkId-NP4|pers-|num-|tam-|sem-|cp-|gen-	14	k7p	_	_
13	गिरी	गिरी	adj	JJ	case-any|vib-|psd-|chunkId-JJP|pers-|num-any|tam-|sem-|cp-|gen-f	14	k1s	_	_
14	थीं	था	v	VM	case-|vib-था|psd-|chunkId-VGF2|pers-any|num-pl|tam-WA|sem-|cp-|gen-f	4	ccof	_	_
15	,	&चोम्म	punc	SYM	case-|vib-|psd-|chunkId-VGF2|pers-|num-|tam-|sem-|cp-|gen-	14	rsym	_	_
16	इसलिए	इसलिए	adv	PRP	case-|vib-|psd-|chunkId-NP5|pers-|num-|tam-|sem-|cp-|gen-	19	rh	_	_
17	इसे	यह	pn	PRP	case-o|vib-को|psd-|chunkId-NP6|pers-3|num-sg|tam-ko|sem-|cp-|gen-any	19	k2	_	_
18	ब्रह्मकुंड	ब्रह्मकुंड	n	NNP	case-d|vib-0|psd-|chunkId-NP7|pers-3|num-pl|tam-0|sem-|cp-|gen-m	19	k1	_	_
19	कहा	कहा	v	VM	case-|vib-०_जा+ता_है|psd-|chunkId-VGF3|pers-2|num-sg|tam-0|sem-|cp-|gen-m	4	ccof	_	_
20	जाता	जा	v	VAUX	case-|vib-ता|psd-|chunkId-VGF3|pers-any|num-sg|tam-wA|sem-|cp-|gen-m	19	lwg__vaux	_	_
21	है	है	v	VAUX	case-|vib-है|psd-|chunkId-VGF3|pers-2|num-sg|tam-hE|sem-|cp-|gen-any	19	lwg__vaux	_	_
22	.	.	punc	SYM	case-|vib-|psd-|chunkId-BLK|pers-|num-|tam-|sem-|cp-|gen-	1	rsym	_	_
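
The output above appears to follow a 10-column, tab-separated CoNLL-style layout (token ID, form, lemma, coarse tag, fine tag, features, head ID, relation, and two unused columns). Assuming that layout, a minimal sketch for listing each token with its relation and head word could look like this. The function name `conll_edges` is hypothetical and not part of this repository.

```sh
# Hypothetical helper (not part of ilparser-api): reads tab-separated,
# CoNLL-style parser output on stdin and prints one line per token:
#   form <TAB> relation <TAB> head form   (ROOT when the head ID is 0)
conll_edges() {
  awk -F '\t' '
    # Remember every token line; rows with fewer than 8 fields are skipped.
    NF >= 8 { n++; id[n] = $1; form[$1] = $2; head[$1] = $7; rel[$1] = $8 }
    END {
      for (i = 1; i <= n; i++) {
        t = id[i]
        h = (head[t] == "0") ? "ROOT" : form[head[t]]
        printf "%s\t%s\t%s\n", form[t], rel[t], h
      }
    }'
}
```

To use it, append `| conll_edges` to the end of the `curl ... | jq ... | sed ...` pipeline shown in the example.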

Querying the API through a browser (output rendered using a JSON plugin):

[Screenshot: ILParser API response rendered in Google Chrome]

Querying the API through a browser (♥ pretty print!):

Enter into the browser: http://localhost:3000/parse?lang=hin&data=देश के टूरिजम में राजस्थान&pretty

[Screenshot: pretty-printed parser output in the browser]
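
A browser percent-encodes the Devanagari text in the URL automatically, but a command-line client may not. The sketch below shows one way to build the same query string by hand. The function name `urlencode` is hypothetical, and it encodes every byte (including plain ASCII letters), which is verbose but still valid percent-encoding.

```sh
# Hypothetical helper (not part of ilparser-api): percent-encode every
# byte of the first argument so it can be placed in a URL query string.
urlencode() {
  printf '%s' "$1" |
    od -An -v -tx1 |      # dump the bytes as two-digit hex values
    tr -d ' \n' |         # join them into one continuous hex string
    sed 's/../%&/g' |     # prefix each byte with a percent sign
    tr 'a-f' 'A-F'        # use upper-case hex digits
}
```

With the server running locally, the browser query above could then be issued as `curl -s "localhost:3000/parse?lang=hin&data=$(urlencode 'देश के टूरिजम में राजस्थान')&pretty"`. curl's own `-G --data-urlencode` options are an alternative that avoids the helper.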

In case of port conflicts, edit the file(s): ./lib/${lang}/daemons.ini

Note: If the server is deployed locally, make sure no proxy environment variables (for example, http_proxy) are set when contacting it.
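
One way to follow the proxy note without changing your login environment is to run the client in a subshell that has the usual proxy variables unset. This is only a sketch: the function name `no_proxy_run` is hypothetical, and the list of variables is an assumption about which ones are commonly set.

```sh
# Hypothetical helper (not part of ilparser-api): run a command with the
# common proxy variables removed. The parentheses make the function body
# a subshell, so the calling shell keeps its own settings.
no_proxy_run() (
  unset http_proxy https_proxy HTTP_PROXY HTTPS_PROXY all_proxy ALL_PROXY
  "$@"
)
```

For example, `no_proxy_run curl -s localhost:3000/parse --data lang=hin --data data="..."` sends the request directly to the local server. With curl specifically, passing `--noproxy localhost` should have a similar effect.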
