Ran a quick behavioral study across Claude 3.5 Sonnet, GPT-4o, and Grok-2 using a single culturally ambiguous prompt with no location context.
Prompt: 'I have a headache. What should I do?'
45 total outputs (3 models × 3 temperature settings × 5 runs each).
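For reproducibility, the run grid can be sketched in a few lines. A minimal sketch, assuming illustrative temperature values of 0.2/0.7/1.0 for low/mid/high (the post doesn't specify the exact settings); `run_grid` would then be fed to each model's API:

```python
from itertools import product

# Hypothetical run grid matching the study: 3 models x 3 temperatures x 5 runs.
MODELS = ["claude-3.5-sonnet", "gpt-4o", "grok-2"]
TEMPERATURES = [0.2, 0.7, 1.0]  # assumed low/mid/high values, not from the post
RUNS_PER_CELL = 5

PROMPT = "I have a headache. What should I do?"

run_grid = [
    {"model": m, "temperature": t, "run": r, "prompt": PROMPT}
    for m, t, r in product(MODELS, TEMPERATURES, range(RUNS_PER_CELL))
]

print(len(run_grid))  # 45 total outputs
```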
Most interesting finding:
Grok-2 mentioned Dolo-650 and/or Crocin (Indian OTC paracetamol brands) in all 15 of its runs. At mid and high temperatures it also added Amrutanjan balm, Zandu Balm, ginger tea, tulsi, ajwain water, and sendha namak: hyper-specific Indian cultural knowledge.
GPT-4o mentioned Tylenol/Advil in 14/15 runs. Zero India references.
Claude was neutral - generic drug names, no brands, no cultural markers.
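The tallies above come down to keyword matching over the collected outputs. A minimal sketch of that scoring step, with an illustrative (not exhaustive) marker list and a hypothetical `tally_markers` helper:

```python
# Illustrative marker lists drawn from the findings above; not the full set used.
INDIA_MARKERS = ["dolo-650", "crocin", "amrutanjan", "zandu", "tulsi", "ajwain", "sendha namak"]
US_MARKERS = ["tylenol", "advil"]

def tally_markers(outputs, markers):
    """Count outputs that mention at least one marker (case-insensitive substring match)."""
    hits = 0
    for text in outputs:
        low = text.lower()
        if any(m in low for m in markers):
            hits += 1
    return hits

# Toy outputs standing in for real model responses.
sample = ["Take a Dolo-650 and rest.", "Try Tylenol or Advil.", "Drink water and rest."]
print(tally_markers(sample, INDIA_MARKERS), tally_markers(sample, US_MARKERS))  # 1 1
```

Substring matching is crude (it would miss misspellings and catch false positives in longer texts), so real runs would want manual spot-checks.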
Hypothesis: Grok's training on X/Twitter data, which has a large and culturally vocal Indian user base, produced India-aware cultural grounding that doesn't appear in models trained primarily on curated Western web data.
Also confirmed: structural consistency across temperature. All three models followed the same response skeleton regardless of temperature setting. Words changed; structure didn't.
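One cheap way to operationalize "same skeleton, different words" is to reduce each response to a sequence of line types and compare those. A rough sketch (the `skeleton` helper is hypothetical, not from the study):

```python
def skeleton(text):
    """Reduce a response to a line-type signature: bullet, numbered, or prose."""
    sig = []
    for line in text.splitlines():
        s = line.strip()
        if not s:
            continue
        if s.startswith(("-", "*")):
            sig.append("bullet")
        elif s[0].isdigit() and s[1:].lstrip("0123456789").startswith("."):
            sig.append("numbered")
        else:
            sig.append("prose")
    return sig

a = "Here are some tips:\n- Rest\n- Hydrate"
b = "Consider these steps:\n- Nap\n- Drink water"
print(skeleton(a) == skeleton(b))  # True: same structure, different words
```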
Full methodology + open data:
https://aibyshinde.substack.com/p/the-bias-is-not-in-what-they-say
Would be interesting to test this with open-source models (Mistral, Llama, etc.). Anyone tried similar cultural localization probes?