HIPE-OCRepair-2026 is an ICDAR 2026 Competition focused on LLM-assisted OCR post-correction of historical documents, with a particular emphasis on historical newspapers.
With renewed interest driven by large language models (LLMs), OCR post-correction has (re)gained momentum, resulting in a growing number of models and experimental approaches. However, these efforts often rely on heterogeneous legacy datasets that come with important limitations, making systematic evaluation and meaningful comparison across approaches difficult.
A central question motivating this competition is:
To what extent can modern large language models address the OCR debt accumulated in large-scale digitized historical collections?
The competition addresses this by providing HIPE-OCRepair-Bench, a unified multilingual benchmark for OCR post-correction, comprising curated datasets, an evaluation protocol, baseline systems, and an open leaderboard.
All information about the task, datasets, evaluation protocol, and submission instructions is available in the Participation Guidelines.
| Resource | Link |
|---|---|
| Competition website | https://hipe-eval.github.io/HIPE-OCRepair-2026/ |
| Participation Guidelines | README-Participation-Guidelines.md |
| Scorer | https://github.com/hipe-eval/HIPE-OCRepair-scorer |
| Evaluation repository (after competition) | https://github.com/hipe-eval/HIPE-OCRepair-2026-eval |
| Leaderboard (to come) | https://huggingface.co/spaces/hipe-ocrepair-2026-eval |
| Registration & contact | see competition website |
Data is available:
- 02.03.2026: release v0.9
The HIPE-OCRepair-2026 organising team expresses its sincere appreciation to the ICDAR 2026 Competition Committee for the overall coordination and support.
HIPE-OCRepair-2026 is part of the HIPE-eval series of shared tasks on historical document and information processing and evaluation.
HIPE-eval editions are organised within the framework of the Impresso - Media Monitoring of the Past project, funded by the Swiss National Science Foundation under grant No. CRSII5_213585 and by the Luxembourg National Research Fund under grant No. 17498891.