POLAR: A Per-User Association Test in Embedding Space
arXiv cs.CL / 3/18/2026
Key Points
- POLAR (Per-user On-axis Lexical Association Report) is a per-user lexical association test that operates in the embedding space of a lightly adapted masked language model to reveal author-level variation.
- Each author is represented by a private deterministic token. POLAR projects the corresponding token vectors onto curated lexical axes and reports standardized effects with permutation p-values and Benjamini–Hochberg control.
- On a balanced bot–human Twitter benchmark, POLAR cleanly separates LLM-driven bots from organic accounts; on an extremist forum, it quantifies strong alignment with slur lexicons and shows rightward drift over time.
- The method extends modularly to new attribute sets and provides concise per-author diagnostics for computational social science. All code is publicly available.
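
The statistical pipeline named in the key points (on-axis projection, standardized effect, permutation p-value, Benjamini–Hochberg control) can be sketched roughly as follows. This summary does not give the paper's formulas, so the sketch substitutes a WEAT-style effect size (the gap in mean cosine similarity between two pole word sets, divided by the pooled standard deviation) and a label-shuffling permutation test. The function names (`association_effect`, `permutation_p_value`, `benjamini_hochberg`) are hypothetical, and random vectors stand in for real author and word embeddings.

```python
import numpy as np


def cosine_to_rows(vec, rows):
    """Cosine similarity between one vector and every row of a matrix."""
    return (rows @ vec) / (np.linalg.norm(rows, axis=1) * np.linalg.norm(vec))


def association_effect(author_vec, pole_a, pole_b):
    """Assumed WEAT-style effect: mean similarity gap between the two poles,
    divided by the pooled sample standard deviation of all similarities."""
    sim_a = cosine_to_rows(author_vec, pole_a)
    sim_b = cosine_to_rows(author_vec, pole_b)
    pooled = np.concatenate([sim_a, sim_b])
    return float((sim_a.mean() - sim_b.mean()) / pooled.std(ddof=1))


def permutation_p_value(author_vec, pole_a, pole_b, n_perm=1000, seed=0):
    """Two-sided permutation p-value obtained by shuffling pole membership."""
    rng = np.random.default_rng(seed)
    observed = association_effect(author_vec, pole_a, pole_b)
    words = np.vstack([pole_a, pole_b])
    n_a = len(pole_a)
    hits = 0
    for _ in range(n_perm):
        idx = rng.permutation(len(words))
        permuted = association_effect(author_vec, words[idx[:n_a]], words[idx[n_a:]])
        hits += int(abs(permuted) >= abs(observed))
    # Add-one correction keeps the estimate strictly above zero.
    return observed, (hits + 1) / (n_perm + 1)


def benjamini_hochberg(p_values, alpha=0.05):
    """Boolean rejection mask from the Benjamini-Hochberg step-up procedure."""
    p = np.asarray(p_values, dtype=float)
    m = len(p)
    order = np.argsort(p)
    thresholds = alpha * np.arange(1, m + 1) / m
    passed = p[order] <= thresholds
    reject = np.zeros(m, dtype=bool)
    if passed.any():
        k = int(np.max(np.nonzero(passed)[0]))  # largest rank meeting its threshold
        reject[order[: k + 1]] = True
    return reject


# Synthetic demo: an "author" vector built to sit close to pole A.
rng = np.random.default_rng(42)
author = rng.normal(size=32)
pole_a = author + 0.1 * rng.normal(size=(8, 32))
pole_b = -author + 0.1 * rng.normal(size=(8, 32))

effect, p = permutation_p_value(author, pole_a, pole_b)
rejected = benjamini_hochberg([p, 0.2, 0.6])  # the other p-values are placeholders
print(effect, p, rejected)
```

In a per-author report, one such effect and p-value would be computed for each (author, axis) pair, and the Benjamini–Hochberg step would then be applied across the whole family of tests. How the paper defines that family is not stated in this summary.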




