Training a model using multiple corpora #6944
thiippal
started this conversation in
Help: Best practices
-
@svlandeg, @adrianeboyd, anyone? :-) Should I just train UD + NER separately, combine them into a single pipeline and serialize to disk?
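For readers landing here, a minimal sketch of what "train separately, then combine" could look like in spaCy v3. It assumes two pipelines have already been trained (one on TDT, one on the Turku NER corpus) and uses `add_pipe(..., source=...)` to copy the NER component across. The helper names `combine_pipelines` and `build_and_save` and all paths are hypothetical, and this is not confirmed as the recommended approach in this thread.

```python
import spacy


def combine_pipelines(ud_nlp, ner_nlp, ner_name="ner"):
    """Copy the NER component from ner_nlp into ud_nlp and return ud_nlp.

    Both arguments are loaded spaCy pipelines (Language objects).
    """
    if ner_name in ud_nlp.pipe_names:
        raise ValueError(f"Pipeline already has a component named {ner_name!r}")
    # source= reuses the component (with its weights) from the other pipeline
    # instead of creating a fresh, untrained one.
    ud_nlp.add_pipe(ner_name, source=ner_nlp)
    return ud_nlp


def build_and_save(ud_path, ner_path, out_path):
    """Load two separately trained pipelines, merge them, write one to disk.

    The paths are placeholders for wherever `spacy train` wrote its output.
    """
    ud_nlp = spacy.load(ud_path)    # tagger, parser, morphologizer (TDT)
    ner_nlp = spacy.load(ner_path)  # ner (Turku NER corpus)
    nlp = combine_pipelines(ud_nlp, ner_nlp)
    nlp.to_disk(out_path)
    return nlp
```

One caveat with Transformer-based pipelines: if the NER component was trained as a listener on a shared transformer, it will not automatically work with the other pipeline's transformer. spaCy provides `nlp.replace_listeners` to give a component its own copy of the embedding layer before sourcing it; whether that is needed depends on how the two configs were set up.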
-
Hey!
I'm looking into training a single Transformer-based model for Finnish that would perform tagging, parsing, morphological analysis and named entity recognition.
For tagging, parsing and morphological analysis, I would naturally use the UD-compliant Turku Dependency Treebank (TDT).
For NER, I would like to use the Turku NER corpus, which is provided in a CoNLL-like format.
Can I use a spaCy config file to train different components of a model using different corpora? If so, is there an example of how to do this?