ENEIDE: A High Quality Silver Standard Dataset for Named Entity Recognition and Linking in Historical Italian
arXiv cs.CL / 4/1/2026
Key Points
- ENEIDE is a newly introduced silver-standard dataset for Named Entity Recognition and Linking (NERL) tailored to historical Italian, spanning two scholarly domains and centuries.
- The corpus includes 2,111 documents and 8,000+ entity annotations that cover multiple entity types (people, locations, organizations, literary works) mapped to Wikidata IDs with support for NIL entities.
- Annotations were produced via semi-automatic extraction from manually curated digital editions (Digital Zibaldone and Aldo Moro Digitale), with quality control and enhancement steps.
- The dataset is released with training/development/test splits and is described as the first publicly available multi-domain NERL dataset for historical Italian, enabling diachronic and cross-domain evaluation.
- Baseline experiments with state-of-the-art models indicate that the dataset is challenging and reveal a performance gap between zero-shot and fine-tuned approaches, leaving clear room for further research.
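
To make the annotation scheme in the points above concrete, here is a minimal Python sketch of how an entity mention linked to a Wikidata ID (or marked NIL) might be represented, and how strict-match linking scores could be computed. The `EntityMention` class, its field names, the label strings, and the `linking_prf` helper are all hypothetical and are not taken from the ENEIDE release; the actual file format and the evaluation protocol used in the paper may differ.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass(frozen=True)
class EntityMention:
    """One annotated span (hypothetical schema, not the ENEIDE format)."""
    start: int                  # character offset where the mention begins
    end: int                    # character offset where the mention ends
    label: str                  # assumed labels, e.g. "PER", "LOC", "ORG", "WORK"
    wikidata_id: Optional[str]  # a QID string, or None for a NIL entity


def linking_prf(
    gold: Iterable[EntityMention], pred: Iterable[EntityMention]
) -> Tuple[float, float, float]:
    """Strict-match precision, recall and F1.

    A prediction counts as correct only if span, label and Wikidata ID
    (including NIL, encoded as None) all match a gold mention exactly.
    """
    gold_set, pred_set = set(gold), set(pred)
    true_positives = len(gold_set & pred_set)
    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Placeholder QIDs for illustration only; they are not real annotations.
gold = [EntityMention(0, 8, "PER", "Q1"), EntityMention(20, 24, "LOC", None)]
pred = [EntityMention(0, 8, "PER", "Q1"), EntityMention(20, 24, "LOC", "Q2")]
print(linking_prf(gold, pred))
```

In this toy case the second prediction links a gold NIL mention to an identifier, so it counts as an error under strict matching. Treating NIL as an explicit value rather than dropping it keeps such over-linking visible in the scores.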