Temporal Text Classification with Large Language Models
arXiv cs.CL / 3/13/2026
Key Points
- The paper systematically evaluates temporal text classification (TTC), the task of assigning a text to the period in which it was written, using leading proprietary LLMs (Claude 3.5, GPT-4o, Gemini 1.5) and open-source LLMs (LLaMA 3.2, Gemma 2, Mistral, Nemotron 4) on three historical corpora (two English, one Portuguese) under three settings: zero-shot prompting, few-shot prompting, and fine-tuning.
- Proprietary models show strong TTC performance, particularly with few-shot prompting.
- Open-source models improve with fine-tuning but still do not match the performance of proprietary LLMs.
- The study highlights implications for prompt design, model selection, and future research in dating historical texts.
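
To make the difference between the zero-shot and few-shot settings concrete, here is a minimal sketch of how such prompts might be assembled. It is not the paper's prompt: the function name `build_ttc_prompt`, the instruction wording, and the period labels are all hypothetical, chosen only for illustration.

```python
from typing import Optional, Sequence, Tuple


def build_ttc_prompt(
    text: str,
    periods: Sequence[str],
    examples: Optional[Sequence[Tuple[str, str]]] = None,
) -> str:
    """Assemble a dating prompt (hypothetical wording, not from the paper).

    With no examples the result is a zero-shot prompt; with labeled
    (text, period) pairs it becomes a few-shot prompt.
    """
    lines = [
        "Classify the text into the time period in which it was most likely written.",
        "Answer with exactly one of: " + ", ".join(periods) + ".",
        "",
    ]
    # Few-shot setting: prepend labeled demonstrations before the query.
    for sample_text, sample_period in examples or []:
        lines += [f"Text: {sample_text}", f"Period: {sample_period}", ""]
    # The query itself, leaving the label for the model to fill in.
    lines += [f"Text: {text}", "Period:"]
    return "\n".join(lines)


# Illustrative period labels only; the corpora's real label sets are not given here.
periods = ["1800-1849", "1850-1899"]
zero_shot = build_ttc_prompt("Some undated passage.", periods)
few_shot = build_ttc_prompt(
    "Some undated passage.",
    periods,
    examples=[("An earlier passage.", "1800-1849"), ("A later passage.", "1850-1899")],
)
```

The only difference between the two settings in this sketch is the list of labeled demonstrations placed before the query; fine-tuning, by contrast, updates model weights and is not shown.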




