Probing Cultural Signals in Large Language Models through Author Profiling
arXiv cs.CL / 3/18/2026
Key Points
- The paper evaluates whether large language models can perform author profiling from song lyrics in a zero-shot setting, inferring singers' gender and ethnicity without task-specific fine-tuning.
- Across open-source models and more than 10,000 lyrics, the study finds non-trivial profiling performance but also systematic cultural alignment: most models default to predicting North American ethnicity, while some (e.g., DeepSeek-1.5B) skew toward Asian ethnicity.
- The authors introduce two fairness metrics, Modality Accuracy Divergence (MAD) and Recall Divergence (RD), to quantify disparities in model outputs and biases across models.
- They report model-specific bias differences, noting Ministral-8B as having the strongest ethnicity bias and Gemma-12B as the most balanced, and provide code on GitHub for replication.
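The paper's exact definitions of MAD and RD are not reproduced here, but the idea of a recall-style divergence across demographic groups can be sketched in a few lines. The gap measure below (best-served minus worst-served group recall) and the toy labels are illustrative assumptions, not the authors' formulas:

```python
from collections import defaultdict

def recall_per_group(y_true, y_pred, groups):
    """Recall within each demographic group: correct / total for that group."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for t, p, g in zip(y_true, y_pred, groups):
        total[g] += 1
        if t == p:
            correct[g] += 1
    return {g: correct[g] / total[g] for g in total}

def recall_divergence(y_true, y_pred, groups):
    """Illustrative divergence: gap between the best- and worst-served group."""
    r = recall_per_group(y_true, y_pred, groups)
    return max(r.values()) - min(r.values())

# Toy example: ethnicity predictions for six lyrics, grouped by true label.
y_true = ["NA", "NA", "AS", "AS", "EU", "EU"]
y_pred = ["NA", "NA", "NA", "AS", "NA", "EU"]

print(recall_per_group(y_true, y_pred, y_true))   # {'NA': 1.0, 'AS': 0.5, 'EU': 0.5}
print(recall_divergence(y_true, y_pred, y_true))  # 0.5
```

A divergence of 0 would mean every group is recalled equally well; the default-to-North-American behavior reported above would show up here as inflated "NA" recall at the expense of other groups.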