This is the research data repository of the project Early Chinese Periodicals Online (ECPO), especially the part dedicated to generating ECPO full text.
The repo contains the ground truth data we produced manually from Jing bao 晶報:
- April 1939 (10 issues, each 4 folds)
- April 1930 (10 issues, each 2 folds)
- April 1920 (10 issues, each 2 folds)
The user frontend of the ECPO online database can be accessed at https://uni-heidelberg.de/ecpo.
We also published a number of related repositories:
- ECPO annotator - the ECPO image annotator tool
- ECPO segment - dhSegment-based tools to segment ECPO data
- ECPO full text - the OCR classification model
Instructions to editors:
- For detailed information, see the ECPO Wiki.
- For more details about encoding special characters, see the Quick reference.
- To enter special characters, use e.g. BabelMap or Babel Map Online.
This project is licensed under the terms of the Creative Commons Attribution 4.0 International license (CC BY 4.0).