POLAR: A Per-User Association Test in Embedding Space

arXiv cs.CL / 3/18/2026



Key Points

POLAR (Per-user On-axis Lexical Association Report) introduces a per-user lexical association test that operates in the embedding space of a lightly adapted masked language model to reveal author-level variation.
Authors are represented by private deterministic tokens, and POLAR projects these vectors onto curated lexical axes, reporting standardized effects with permutation p-values and Benjamini–Hochberg control.
On a balanced bot–human Twitter benchmark, POLAR cleanly separates LLM-driven bots from organic accounts. On an extremist forum, it quantifies strong alignment with slur lexicons and reveals rightward drift over time.
The method is modular to new attribute sets and provides concise per-author diagnostics for computational social science, with all code publicly available.

Abstract

Most intrinsic association probes operate at the word, sentence, or corpus level, obscuring author-level variation. We present POLAR (Per-user On-axis Lexical Association Report), a per-user lexical association test that runs in the embedding space of a lightly adapted masked language model. Authors are represented by private deterministic tokens; POLAR projects these vectors onto curated lexical axes and reports standardized effects with permutation p-values and Benjamini–Hochberg control. On a balanced bot–human Twitter benchmark, POLAR cleanly separates LLM-driven bots from organic accounts; on an extremist forum, it quantifies strong alignment with slur lexicons and reveals rightward drift over time. The method is modular to new attribute sets and provides concise, per-author diagnostics for computational social science. All code is publicly available at https://github.com/pedroaugtb/POLAR-A-Per-User-Association-Test-in-Embedding-Space.
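The abstract does not spell out the exact statistic, so the following is only a minimal sketch of the general recipe it names: score an author vector against the two poles of a lexical axis, standardize the difference, attach a permutation p-value, and apply Benjamini–Hochberg control across authors. The WEAT-style effect size, the function names (`axis_effect`, `permutation_p`, `benjamini_hochberg`), and the synthetic vectors are all assumptions made for illustration; they are not taken from the POLAR repository.

```python
import numpy as np

def unit(v):
    """Scale a vector to unit length so dot products are cosine similarities."""
    return v / np.linalg.norm(v)

def axis_effect(author_vec, pos_vecs, neg_vecs):
    """Assumed WEAT-style effect: difference in mean cosine similarity to the
    two poles of an axis, divided by the pooled standard deviation."""
    a = unit(author_vec)
    pos = np.array([unit(w) @ a for w in pos_vecs])
    neg = np.array([unit(w) @ a for w in neg_vecs])
    pooled = np.concatenate([pos, neg])
    effect = (pos.mean() - neg.mean()) / pooled.std(ddof=1)
    return effect, pos, neg

def permutation_p(pos, neg, n_perm=2000, seed=0):
    """Two-sided permutation p-value: shuffle pole labels over the similarities."""
    rng = np.random.default_rng(seed)
    pooled = np.concatenate([pos, neg])
    observed = abs(pos.mean() - neg.mean())
    hits = 0
    for _ in range(n_perm):
        perm = rng.permutation(pooled)
        stat = abs(perm[:len(pos)].mean() - perm[len(pos):].mean())
        hits += int(stat >= observed)
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)

def benjamini_hochberg(pvals, alpha=0.05):
    """Boolean mask of hypotheses rejected at false discovery rate alpha."""
    p = np.asarray(pvals, dtype=float)
    m = len(p)
    order = np.argsort(p)
    thresholds = alpha * np.arange(1, m + 1) / m
    passed = p[order] <= thresholds
    reject = np.zeros(m, dtype=bool)
    if passed.any():
        k = np.max(np.nonzero(passed)[0])  # largest rank meeting its threshold
        reject[order[:k + 1]] = True
    return reject

# Synthetic demo (hypothetical data): one axis, three authors in 16 dimensions.
rng = np.random.default_rng(42)
direction = np.zeros(16)
direction[0] = 1.0
pos_words = direction + 0.1 * rng.normal(size=(8, 16))   # one pole of the axis
neg_words = -direction + 0.1 * rng.normal(size=(8, 16))  # the opposite pole
authors = {
    "aligned": direction,
    "opposed": -direction,
    "neutral": rng.normal(size=16),
}

effects, pvals = {}, []
for name, vec in authors.items():
    d, pos, neg = axis_effect(vec, pos_words, neg_words)
    effects[name] = d
    pvals.append(permutation_p(pos, neg))

rejected = benjamini_hochberg(pvals)
for (name, d), p, r in zip(effects.items(), pvals, rejected):
    print(f"{name}: effect={d:+.2f}, p={p:.4f}, significant={r}")
```

In this sketch the sign of the effect indicates which pole the author leans toward, and the Benjamini–Hochberg step is applied across authors for a single axis; a real analysis over many axes would correct across every author-axis pair tested.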


