LLM Analysis of 150+ years of German Parliamentary Debates on Migration Reveals Shift from Post-War Solidarity to Anti-Solidarity in the Last Decade

arXiv cs.CL / 4/6/2026


Key Points

  • This study presents and evaluates a framework for using LLMs to automatically annotate subtypes of solidarity and anti-solidarity in migration-related speeches in the German parliament.
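  • Across multiple LLMs, the authors analyze the effects of model size, prompting, fine-tuning, historical versus contemporary data, and error patterns. GPT-5 and gpt-oss-120B reach agreement close to human level, but their errors are systematic and bias downstream inference.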
  • To reduce this bias, the authors propose combining soft-label LLM outputs with Design-based Supervised Learning (DSL), which limits distortion in long-term trend estimates.
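  • Substantively, the postwar period shows relatively high solidarity, particularly in group-based and compassionate forms, while anti-solidarity has risen sharply since 2015, framed through exclusion, undeservingness, and resource burden.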
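  • The authors conclude that LLM-based large-scale text analysis for the social sciences is feasible, but that rigorous validation and statistical correction are indispensable.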

Abstract

Migration has been a core topic in German political debate, from the postwar displacement of millions of expellees to labor migration and recent refugee movements. Studying political speech across such wide-ranging phenomena in depth has traditionally required extensive manual annotation, limiting analysis to small subsets of the data. Large language models (LLMs) offer a potential way to overcome this constraint. Using a theory-driven annotation scheme, we examine how well LLMs annotate subtypes of solidarity and anti-solidarity in German parliamentary debates and whether the resulting labels support valid downstream inference. We first provide a comprehensive evaluation of multiple LLMs, analyzing the effects of model size, prompting strategies, fine-tuning, historical versus contemporary data, and systematic error patterns. We find that the strongest models, especially GPT-5 and gpt-oss-120B, achieve human-level agreement on this task, although their errors remain systematic and bias downstream results. To address this issue, we combine soft-label model outputs with Design-based Supervised Learning (DSL) to reduce bias in long-term trend estimates. Beyond the methodological evaluation, we interpret the resulting annotations from a social-scientific perspective to trace trends in solidarity and anti-solidarity toward migrants in postwar and contemporary Germany. Our approach shows relatively high levels of solidarity in the postwar period, especially in group-based and compassionate forms, and a marked rise in anti-solidarity since 2015, framed through exclusion, undeservingness, and resource burden. We argue that LLMs can support large-scale social-scientific text analysis, but only when their outputs are rigorously validated and statistically corrected.
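
The bias-correction idea in the abstract can be illustrated with a minimal sketch. It assumes the simplest DSL setting, estimating a mean (for example, the share of anti-solidarity speeches in one period) from model soft labels plus a random subsample of expert labels drawn with known probability. The function names, the simulated annotator, and all numbers are hypothetical and are not taken from the paper, whose actual pipeline may differ (for instance by targeting regression coefficients rather than a mean).

```python
import random


def dsl_pseudo_outcomes(soft_preds, gold_labels, sampled, sampling_prob):
    """Design-based pseudo-outcomes for mean estimation (illustrative only).

    pseudo_i = pred_i - (R_i / pi) * (pred_i - gold_i)

    R_i marks whether document i received an expert label, and pi is the
    known probability of being selected for expert labelling. gold_labels
    may hold None for documents that were not sampled.
    """
    pseudo = []
    for pred, gold, r in zip(soft_preds, gold_labels, sampled):
        # Only sampled documents contribute a correction term.
        correction = (pred - gold) / sampling_prob if r else 0.0
        pseudo.append(pred - correction)
    return pseudo


def mean(xs):
    return sum(xs) / len(xs)


def simulate(n=20000, true_rate=0.30, sampling_prob=0.1, seed=0):
    """Simulate a systematically biased annotator (hypothetical numbers)."""
    rng = random.Random(seed)
    gold = [1.0 if rng.random() < true_rate else 0.0 for _ in range(n)]
    # Assumed error pattern: true positives are under-scored.
    preds = [0.6 if g == 1.0 else 0.05 for g in gold]
    sampled = [rng.random() < sampling_prob for _ in range(n)]
    pseudo = dsl_pseudo_outcomes(preds, gold, sampled, sampling_prob)
    return mean(gold), mean(preds), mean(pseudo)


if __name__ == "__main__":
    true_mean, naive, corrected = simulate()
    print(f"true share:      {true_mean:.3f}")
    print(f"naive LLM mean:  {naive:.3f}")
    print(f"DSL-style mean:  {corrected:.3f}")
```

In this toy setup the naive average of model scores inherits the annotator's systematic error, while the pseudo-outcome average is corrected by the expert-labelled subsample at the cost of extra variance. The correction relies on the sampling probability being known by design, which is why the expert-labelled documents must be drawn at random.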