Polarization by Default: Auditing Recommendation Bias in LLM-Based Content Curation

arXiv cs.CL / 4/20/2026


Key Points

  • The study audits how large language models (LLMs) introduce and structure bias when curating and ranking real social-media posts from Twitter/X, Bluesky, and Reddit.
  • Across 540,000 simulated top-10 selections using six prompting strategies and three major providers (OpenAI, Anthropic, Google), the authors find that some biases are robust while others are highly sensitive to prompt design.
  • Results show amplified polarization in all configurations, a strong prompt-dependent inversion in toxicity handling between engagement-focused and information-focused prompts, and predominantly negative sentiment bias.
  • Provider-level comparisons reveal distinct trade-offs: GPT-4o Mini is most consistent across prompts, Claude and Gemini adapt more strongly for toxicity handling, and Gemini most strongly prefers negative sentiment.
  • On Twitter/X, political-leaning bias is the clearest demographic signal: left-leaning authors are systematically over-represented in selections even when right-leaning authors make up the largest share of the candidate pool, and this persists across prompts.
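The over-representation claim in the last point can be made concrete with a simple ratio: a group's share among selected posts divided by its share in the candidate pool. This is a generic illustrative metric, not necessarily the exact statistic the paper reports, and all counts in the example are made up.

```python
# Illustrative over-representation metric for a demographic group:
# ratio of the group's share among selected posts to its share in the
# candidate pool. Values > 1 indicate over-representation.
# The counts below are hypothetical, not the paper's real figures.

def representation_ratio(selected_count: int, selected_total: int,
                         pool_count: int, pool_total: int) -> float:
    """Share of group in selections divided by share in the pool."""
    return (selected_count / selected_total) / (pool_count / pool_total)

# Hypothetical example: left-leaning authors write 30 of 100 pool posts
# but fill 6 of a top-10 selection -> ratio 2.0 (over-represented).
print(representation_ratio(6, 10, 30, 100))  # 2.0
```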

Abstract

Large Language Models (LLMs) are increasingly deployed to curate and rank human-created content, yet the nature and structure of their biases in these tasks remain poorly understood: it is unclear which biases are robust across providers and platforms and which can be mitigated through prompt design. We present a controlled simulation study mapping content-selection biases across three major LLM providers (OpenAI, Anthropic, Google) on real social-media datasets from Twitter/X, Bluesky, and Reddit, using six prompting strategies (*general*, *popular*, *engaging*, *informative*, *controversial*, *neutral*). Through 540,000 simulated top-10 selections from pools of 100 posts across 54 experimental conditions, we find that biases differ substantially in how structural and how prompt-sensitive they are. Polarization is amplified across all configurations, toxicity handling shows a strong inversion between engagement- and information-focused prompts, and sentiment biases are predominantly negative. Provider comparisons reveal distinct trade-offs: GPT-4o Mini shows the most consistent behavior across prompts; Claude and Gemini exhibit high adaptivity in toxicity handling; Gemini shows the strongest negative sentiment preference. On Twitter/X, where author demographics can be inferred from profile bios, political-leaning bias is the clearest demographic signal: left-leaning authors are systematically over-represented even though right-leaning authors form the plurality of the candidate pool, and this pattern largely persists across prompts.
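The experimental grid described above is easy to sanity-check: 3 providers × 3 platforms × 6 prompting strategies gives the 54 conditions, and spreading 540,000 selections evenly over them implies 10,000 runs per condition. The sketch below is a hypothetical reconstruction of that grid; the names are taken from the abstract, but the even split per condition is an inference, not a stated detail.

```python
from itertools import product

# Hypothetical reconstruction of the study's experimental grid.
PROVIDERS = ["OpenAI", "Anthropic", "Google"]
PLATFORMS = ["Twitter/X", "Bluesky", "Reddit"]
PROMPTS = ["general", "popular", "engaging",
           "informative", "controversial", "neutral"]

POOL_SIZE = 100   # candidate posts per selection
TOP_K = 10        # posts the model is asked to pick

conditions = list(product(PROVIDERS, PLATFORMS, PROMPTS))
assert len(conditions) == 54  # 3 providers x 3 platforms x 6 prompts

# 540,000 total selections over 54 conditions, assuming an even split:
runs_per_condition = 540_000 // len(conditions)
print(runs_per_condition)  # 10000
```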