Description
Hi All,
Thanks for the great software. I would like to ask you the following please.
When training specific relations to be extracted from custom Entity types, using the Relation Extractor, I noted that the current possible entities are "hard-coded" in some parts, e.g.:
- https://github.com/stanfordnlp/CoreNLP/blob/master/src/edu/stanford/nlp/ie/machinereading/domains/roth/RothEntityExtractor.java#L16
- https://github.com/stanfordnlp/CoreNLP/blob/master/src/edu/stanford/nlp/ie/machinereading/domains/roth/RothCONLL04Reader.java#L64
By modifying these two places, one can successfully reuse the Relation Extractor with custom entities when needed, but this requires recompiling the code and some initial troubleshooting to work out what has to change.
Would you be interested in a pull request that refactors these hard-coded values into something configurable from the properties file? For example, the properties file could specify an "entitiesPath" option pointing to a tab-separated file whose columns hold the normalised and non-normalised values of these entities.
If this option is not provided, the current hard-coded entities would be used as defaults, so the existing behaviour is preserved.
This would make Relation Extractor workflows with custom entities possible without recompiling the code.
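To make the proposal more concrete, here is a minimal sketch of how such a loader could look. It is only an illustration under some assumptions: the class name `EntityTypeConfig` and its `load` method are hypothetical and not part of CoreNLP, the column order (raw label first, normalised label second) is a guess, and the default map holds placeholder values rather than the exact entities currently hard-coded in the Roth classes.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Properties;

/** Hypothetical helper, not an existing CoreNLP class. */
public class EntityTypeConfig {

  // Placeholder defaults for illustration only; a real patch would reuse
  // the values that are hard-coded in the Roth classes today.
  static final Map<String, String> DEFAULT_ENTITIES = defaults();

  private static Map<String, String> defaults() {
    Map<String, String> m = new LinkedHashMap<>();
    m.put("Loc", "LOCATION");
    m.put("Org", "ORGANIZATION");
    return m;
  }

  /**
   * Returns a map from raw entity label to normalised label.
   * Reads the file named by the proposed "entitiesPath" property, or
   * falls back to the defaults when the property is absent or blank.
   */
  static Map<String, String> load(Properties props) throws IOException {
    String path = props.getProperty("entitiesPath");
    if (path == null || path.trim().isEmpty()) {
      return DEFAULT_ENTITIES;
    }
    Map<String, String> entities = new LinkedHashMap<>();
    try (BufferedReader reader =
             Files.newBufferedReader(Paths.get(path), StandardCharsets.UTF_8)) {
      String line;
      while ((line = reader.readLine()) != null) {
        if (line.trim().isEmpty()) {
          continue; // skip blank lines
        }
        // Assumed layout: <raw label> TAB <normalised label>
        String[] cols = line.split("\t");
        if (cols.length != 2) {
          throw new IOException("Expected 2 tab-separated columns, got: " + line);
        }
        entities.put(cols[0].trim(), cols[1].trim());
      }
    }
    return entities;
  }
}
```

With something like this, the two hard-coded spots could look up labels in the returned map, and existing setups without "entitiesPath" would keep working unchanged.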
Please advise.
Again, thanks!