This repository contains files related to the ongoing morphosyntactic annotation of the Divina Commedia ('Divine Comedy'), the major work by Dante Alighieri, who lived in Italy between the 13th and the 14th century. The annotation follows the Universal Dependencies (UD) formalism and will result in an upcoming UD treebank. It is performed by Claudia Corbetta (@ClaudiaCorbe), with some help from Flavio Massimiliano Cecchini (@Stormur) and Giovanni Moretti, at the CIRCSE research center of the Università Cattolica del Sacro Cuore in Milan, under the supervision of Prof. Marco Passarotti.
In particular, here you can find:
- general statistics about the treebank
- general statistics about the data splits used for a first evaluation of POS tagger and parser performance
- the data splits used for this evaluation (NB: currently only the split by canti)
These data and their annotation are provisional until the official version is published as part of the UD project. Please note that they may not yet pass the official UD validation tests.
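
As a rough illustration of how general statistics like those listed above could be recomputed, here is a minimal sketch in Python. It assumes the data are in the standard CoNLL-U format used by UD (ten tab-separated columns, `#` comment lines, blank lines between sentences). The function name `conllu_stats` and the toy sample are hypothetical and are not taken from this repository; this is not necessarily how the statistics provided here were produced.

```python
from collections import Counter


def conllu_stats(text):
    """Count sentences, syntactic words and UPOS tags in CoNLL-U text.

    Hypothetical helper for illustration only.
    """
    sentences = 0
    words = 0
    upos = Counter()
    in_sentence = False
    for line in text.splitlines():
        if not line.strip():
            # A blank line closes the current sentence.
            in_sentence = False
            continue
        if line.startswith("#"):
            # Sentence-level comment (e.g. "# text = ...").
            continue
        cols = line.split("\t")
        if len(cols) != 10:
            # Not a well-formed CoNLL-U token line; skip it.
            continue
        token_id = cols[0]
        if "-" in token_id or "." in token_id:
            # Skip multiword-token ranges (e.g. 1-2) and empty nodes (e.g. 1.1).
            continue
        if not in_sentence:
            sentences += 1
            in_sentence = True
        words += 1
        upos[cols[3]] += 1  # the fourth column holds the UPOS tag
    return {"sentences": sentences, "words": words, "upos": upos}


# Toy sample built for this sketch; the annotation is illustrative only
# and does not come from the treebank.
SAMPLE = "\n".join([
    "# text = Nel mezzo",
    "\t".join(["1-2", "Nel", "_", "_", "_", "_", "_", "_", "_", "_"]),
    "\t".join(["1", "In", "in", "ADP", "_", "_", "3", "case", "_", "_"]),
    "\t".join(["2", "il", "il", "DET", "_", "_", "3", "det", "_", "_"]),
    "\t".join(["3", "mezzo", "mezzo", "NOUN", "_", "_", "0", "root", "_", "_"]),
    "",
])

if __name__ == "__main__":
    print(conllu_stats(SAMPLE))
```

To apply it to real data, read one of the split files as UTF-8 text and pass its contents to `conllu_stats`. Counting only lines with plain integer IDs means that multiword tokens are counted by their syntactic words rather than by their surface forms.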