
LLMs Can Infer Political Alignment from Online Conversations

arXiv cs.CL / 3/13/2026

💬 Opinion · Signals & Early Trends · Ideas & Deep Analysis · Models & Research

Key Points

  • LLMs can reliably infer hidden political alignment from online discussions, outperforming traditional machine learning models on Debate.org and Reddit.
  • Prediction accuracy improves when multiple text-level inferences are aggregated into a single user-level prediction, and when the discussion domain is more politics-adjacent.
  • LLMs leverage words that are highly predictive of political alignment without being explicitly political.
  • Growing online data exposure and rapid AI progress amplify the misuse potential of this capability, underscoring a fundamental privacy risk and the need for safeguards.

Abstract

Because of the correlational structure among our traits, such as identities, cultures, and political attitudes, seemingly innocuous preferences, such as following a band or using particular slang, can reveal private traits. This possibility, especially when combined with massive public social data and advanced computational methods, poses a fundamental privacy risk. As our increasing data exposure online and the rapid advancement of AI amplify the potential for misuse, it is critical to understand the capacity of large language models (LLMs) to exploit this risk. Here, using online discussions on Debate.org and Reddit, we show that LLMs can reliably infer hidden political alignment, significantly outperforming traditional machine learning models. Prediction accuracy further improves as we aggregate multiple text-level inferences into a user-level prediction, and as we use more politics-adjacent domains. We demonstrate that LLMs leverage words that can be highly predictive of political alignment while not being explicitly political. Our findings underscore the capacity and risks of LLMs in exploiting socio-cultural correlates.
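
The aggregation step mentioned in the key points and the abstract, turning several per-text inferences into one user-level call, is easy to picture with a short sketch. The paper does not specify the aggregation rule here, so the majority vote below, and every name in it, is an assumption rather than the authors' method.

```python
from collections import Counter

def aggregate_user_prediction(text_labels: list[str]) -> str:
    """Combine per-text alignment labels into one user-level label.

    Hypothetical helper: a simple majority vote over the labels an LLM
    assigned to each of a user's texts. The paper only says text-level
    inferences are aggregated to the user level; the voting rule itself
    is an assumption for illustration.
    """
    if not text_labels:
        raise ValueError("need at least one text-level prediction")
    (label, _count), = Counter(text_labels).most_common(1)
    return label

# Example: five posts by one hypothetical user, each classified separately.
posts = ["liberal", "conservative", "liberal", "liberal", "conservative"]
print(aggregate_user_prediction(posts))  # -> "liberal"
```

Any aggregation rule that pools independent per-text signals, weighted voting or averaged class probabilities would work just as well here, should produce the same qualitative effect the abstract reports: user-level accuracy above text-level accuracy.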
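
Similarly, one standard way to surface words that are "highly predictive of political alignment while not being explicitly political" is a per-word log-odds score between two groups' texts. The paper's actual lexical analysis may differ, so treat this snippet, including its toy corpus, purely as an illustration of the idea.

```python
import math
from collections import Counter

def log_odds_by_word(texts_a, texts_b, smoothing=1.0):
    """Score each word by smoothed log-odds of appearing in group A vs. B.

    Illustrative only: large positive scores mark words that skew toward
    group A, large negative scores toward group B, regardless of whether
    the word itself is overtly political.
    """
    counts_a = Counter(w for t in texts_a for w in t.lower().split())
    counts_b = Counter(w for t in texts_b for w in t.lower().split())
    total_a = sum(counts_a.values())
    total_b = sum(counts_b.values())
    vocab = set(counts_a) | set(counts_b)
    scores = {}
    for w in vocab:
        p_a = (counts_a[w] + smoothing) / (total_a + smoothing * len(vocab))
        p_b = (counts_b[w] + smoothing) / (total_b + smoothing * len(vocab))
        scores[w] = math.log(p_a / p_b)
    return scores

# Toy, made-up posts; the real corpora would be Debate.org and Reddit.
group_a = ["love my truck and the church potluck", "truck season again"]
group_b = ["biking to the farmers market", "new oat milk latte spot"]
top = sorted(log_odds_by_word(group_a, group_b).items(), key=lambda kv: -kv[1])
print(top[:3])  # words most indicative of group A, none explicitly political
```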