Editing Integrating Stanford NLP (from old wiki) #3971
Closed
chenlica started this conversation in archived-wiki
From the page https://github.com/apache/texera/wiki/Integrating-Stanford-NLP/ (may be dangling)
====
Author(s): Feng Hong, Yang Jiao
##Synopsis
The Stanford NLP package is a powerful Java library for natural language processing. The goal is to integrate some of its features as an operator that lets users extract named entities or parts of speech.
##Status
As of 6/13/2016: FINISHED
##Modules
##Related Issues
#33
##Stanford NLP package
Stanford NLP is a set of natural language analysis tools written in Java. It takes raw human-language text as input and outputs the base forms of words, their parts of speech, and whether they are names of companies, people, locations, etc. The package includes a POS tagger, a syntactic parser, and a named entity recognizer. Its analyses provide the foundational building blocks for higher-level and domain-specific text-understanding applications.
The purpose of this project is to implement Stanford NLP as an extractor in Texera. Users specify an NLP constant that selects what to extract: either a Named Entity class (Number, Location, Person, Organization, Money, Percent, Date, Time) or one of 4 Part of Speech types (Adjective, Adverb, Noun, Verb).
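To make the list of selectable constants concrete, here is a minimal Java sketch. It is not Texera's actual code: the names `NlpEntityKind` and `PennTagMapper` are hypothetical. The sketch assumes the four Part of Speech types are derived from the fine-grained Penn Treebank tags that the Stanford English POS tagger emits (for example `NNS`, `VBD`), grouped by tag prefix.

```java
import java.util.Optional;

// Hypothetical enum for illustration; the constants mirror the names listed above.
enum NlpEntityKind {
    // Named entity classes
    NUMBER, LOCATION, PERSON, ORGANIZATION, MONEY, PERCENT, DATE, TIME,
    // Part-of-speech types
    ADJECTIVE, ADVERB, NOUN, VERB;

    // True for the four part-of-speech types, false for named entity classes.
    boolean isPartOfSpeech() {
        return this == ADJECTIVE || this == ADVERB || this == NOUN || this == VERB;
    }
}

// Hypothetical helper: folds fine-grained Penn Treebank tags into the four coarse types.
final class PennTagMapper {
    private PennTagMapper() {}

    // Returns the coarse type for a Penn Treebank tag, or empty if the tag
    // does not belong to any of the four selectable types.
    static Optional<NlpEntityKind> fromPennTag(String tag) {
        if (tag == null) {
            return Optional.empty();
        }
        if (tag.startsWith("JJ")) return Optional.of(NlpEntityKind.ADJECTIVE); // JJ, JJR, JJS
        if (tag.startsWith("RB")) return Optional.of(NlpEntityKind.ADVERB);    // RB, RBR, RBS
        if (tag.startsWith("NN")) return Optional.of(NlpEntityKind.NOUN);      // NN, NNS, NNP, NNPS
        if (tag.startsWith("VB")) return Optional.of(NlpEntityKind.VERB);      // VB, VBD, VBG, VBN, VBP, VBZ
        return Optional.empty(); // e.g. DT, IN, MD fall outside the four types
    }

    public static void main(String[] args) {
        for (String tag : new String[] {"NNS", "VBD", "JJR", "RB", "DT"}) {
            System.out.println(tag + " -> " + fromPennTag(tag));
        }
    }
}
```

Grouping by prefix keeps the mapping short, but it is a design assumption: tags such as `WRB` (wh-adverb) or `MD` (modal) are deliberately left unmapped here and could be handled differently.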
##Presentation Slides
4/11/2016 Presentation: Project Overview
4/18/2016 Presentation: Stanford NLP introduction
4/25/2016 Presentation: [Status Report](https://docs.google.com/presentation/d/1ek18Zr0OqQ0RONj8D7W2aSGs9sz1etnf9bEnWTEA2ag/edit?usp=sharing)
##Performance Test
Machine setting: MacBook Pro (Late 2015), Intel Core i5, SSD hard drive, 8 GB memory.
On average: 34 documents/sec for Named Entity Recognition and 480 documents/sec for Part of Speech tagging
Data set: 1M Medline records, about 1.5 GB
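As a rough reading of these figures, the sketch below estimates how long one full pass over the data set would take. It assumes the reported averages hold steadily across all 1M records and that each record counts as one document. The class and method names are hypothetical.

```java
import java.util.Locale;

// Hypothetical helper for back-of-the-envelope throughput estimates.
final class ThroughputEstimate {
    private ThroughputEstimate() {}

    // Seconds needed to process `documents` at a steady rate of `docsPerSecond`.
    static double secondsFor(long documents, double docsPerSecond) {
        if (docsPerSecond <= 0) {
            throw new IllegalArgumentException("docsPerSecond must be positive");
        }
        return documents / docsPerSecond;
    }

    public static void main(String[] args) {
        long records = 1_000_000L; // data set size reported above
        double nerHours = secondsFor(records, 34) / 3600.0;  // Named Entity Recognition
        double posMinutes = secondsFor(records, 480) / 60.0; // Part of Speech tagging
        System.out.printf(Locale.ROOT, "NER: %.1f hours%n", nerHours);
        System.out.printf(Locale.ROOT, "POS: %.1f minutes%n", posMinutes);
    }
}
```

Under those assumptions, a full pass takes roughly 8.2 hours for Named Entity Recognition and about 35 minutes for Part of Speech tagging, so the named entity extractor is about 14 times slower per document.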
##TODOs
##Stanford NLP package license
GNU General Public License