Language Ideologies in a Multilingual Society: An LLM-based Analysis of Luxembourgish News Comments

arXiv cs.CL / 5/1/2026

📰 NewsSignals & Early TrendsIdeas & Deep AnalysisModels & Research

共有:

Key Points

The paper argues that detecting “language ideologies” in multilingual societies helps explain how identities and social belonging are constructed through discourse.
It proposes using large language models (LLMs) to replicate human-coded ideological categories from Luxembourgish news comments, testing multiple prompt conditions.
The researchers manually annotate a Luxembourgish user-comment corpus with predefined ideological labels and evaluate how closely LLM outputs match human annotations.
Because Luxembourgish is a low-resource language with limited representation in LLM training data, the study also tests whether machine-translating comments into high-resource languages improves ideology-detection performance.
Results indicate LLMs are not yet fully optimized for multi-class ideological annotation, but they can still serve as practical tools for identifying ideological content in text.

Abstract

Detecting language ideologies is a valuable yet complex task for understanding how identities are constructed through discourse. In Luxembourg's multicultural and multilingual society, language ideologies reflect more than simple preferences: they carry deep cultural and social meanings, shaping identities and social belonging. Following recent developments in applying Natural Language Processing tools to linguistics and social science, this paper explores the potential of large language models to assist in the detection of language ideologies. We manually annotate a corpus of user comments in Luxembourgish with predefined ideological categories and then evaluate the performance of large language models under varying prompt conditions to assess their ability to replicate these human annotations. Since Luxembourgish is a small language and poorly represented in the LLMs' training data, we also investigate whether machine-translating the data to high-resource languages increases performance on the ideology detection task. Our findings suggest that, while LLMs are not yet fully optimized for a multi-class ideological annotation task, they are practical tools to identify language ideological content.

Why Autonomous Coding Agents Keep Failing — And What Actually Works

Dev.to

Text-to-image is easy. Chaining LLMs to generate, critique, and iterate on images autonomously is a routing nightmare. AgentSwarms now supports Image generation playground and creative media workflows!

Reddit r/artificial

Why Enterprise AI Pilots Fail

Dev.to

Announcing the NVIDIA Nemotron 3 Super Build Contest

Dev.to

75% of Sites Blocking AI Bots Still Get Cited. Here Is Why Blocking Does Not Work.

Dev.to

Language Ideologies in a Multilingual Society: An LLM-based Analysis of Luxembourgish News Comments

Key Points

Abstract

Related Articles

Why Autonomous Coding Agents Keep Failing — And What Actually Works

Text-to-image is easy. Chaining LLMs to generate, critique, and iterate on images autonomously is a routing nightmare. AgentSwarms now supports Image generation playground and creative media workflows!

Why Enterprise AI Pilots Fail

Announcing the NVIDIA Nemotron 3 Super Build Contest

75% of Sites Blocking AI Bots Still Get Cited. Here Is Why Blocking Does Not Work.

関連おすすめサービス

Notta搭載AI議事録イヤホン ZENCHORD1

AI搭載ボイスレコーダー Plaud

画像高画質化AIツール Aiarty Image Enhancer