Approaches to Analysing Historical Newspapers Using LLMs
arXiv cs.CL / 3/27/2026
Key Points
- The paper presents a mixed-method computational analysis of two Slovene historical newspapers (Slovenec and Slovenski narod) from the sPeriodika corpus, linking topic modeling with LLM-driven aspect-level sentiment analysis and qualitative discourse interpretation.
- Using BERTopic, the study identifies shared themes and clear ideological differences between the newspapers, aligning patterns with their conservative-Catholic versus liberal-progressive orientations.
- The authors evaluate four instruction-following LLMs for sentiment classification on OCR-degraded historical Slovene, concluding that the Slovene-adapted GaMS3-12B-Instruct model is most suitable for large-scale use while noting uneven performance across sentiment classes.
- At dataset scale, the selected model uncovers variation in how collective identities are portrayed, with mentions ranging from largely neutral contexts to evaluative or conflict-related ones. The study further maps relationships among named entities (identified via NER) and analyzes them with both network analysis and critical discourse analysis.
- Overall, the work argues that combining scalable LLM-based methods with critical interpretive frameworks can strengthen digital humanities research on noisy historical media data.
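
As a rough illustration of the entity-relationship step mentioned above, the sketch below builds a co-occurrence network from per-article entity lists and ranks entities by weighted degree. It is a minimal sketch, not the authors' pipeline: the toy `articles` data, the function names, and the choice of article-level co-occurrence as the edge definition are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy input: named entities found in each article.
# These lists are invented for illustration and are not from the paper.
articles = [
    ["Slovenci", "Nemci", "Ljubljana"],
    ["Slovenci", "Ljubljana"],
    ["Nemci", "Dunaj"],
    ["Slovenci", "Nemci"],
]

def cooccurrence_edges(docs):
    """Count how many documents mention each unordered pair of entities."""
    edges = Counter()
    for entities in docs:
        # Deduplicate within a document and sort so each pair has one key.
        for a, b in combinations(sorted(set(entities)), 2):
            edges[(a, b)] += 1
    return edges

def weighted_degree(edges):
    """Sum the edge weights attached to each entity."""
    degree = Counter()
    for (a, b), weight in edges.items():
        degree[a] += weight
        degree[b] += weight
    return degree

edges = cooccurrence_edges(articles)
degree = weighted_degree(edges)

# Print entities from most to least connected.
for entity, score in degree.most_common():
    print(entity, score)
```

A weighted edge list like this can be handed to a graph library for visualization, while the entities with the highest scores are natural starting points for the close, qualitative reading that critical discourse analysis requires.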