Locating Demographic Bias at the Attention-Head Level in CLIP's Vision Encoder
arXiv cs.CV · March 13, 2026
Key Points
- Introduces a mechanistic fairness audit that locates demographic bias at the level of individual attention heads in vision transformers, using projected residual-stream decomposition, zero-shot Concept Activation Vectors, and bias-augmented TextSpan analysis.
- Applies the pipeline to the CLIP ViT-L-14 encoder across 42 profession classes in the FACET benchmark, auditing gender and age bias.
- For gender, identifies four terminal-layer heads whose ablation reduces global bias with a small accuracy gain, and a layer-matched random control confirms the effect is head-specific.
- Finds that a single final-layer head accounts for most of the bias reduction in the most stereotyped classes, shifting predictions toward correct occupations.
- For age, the same approach yields weaker, more diffuse effects, suggesting that age bias is distributed across many heads rather than concentrated like gender bias in this model, and that head-level localisation may vary by attribute.
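The core logic of the audit described above can be illustrated with a toy sketch: write a transformer layer's residual-stream update as a sum of per-head contributions, project each onto a concept direction (a Concept Activation Vector), and ablate heads to see how much projected bias each one carries. All names and shapes below are illustrative assumptions for exposition, not the paper's code; real CLIP head outputs and zero-shot CAVs would replace the random vectors.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for the projected residual-stream decomposition:
# the final-layer update is a sum of per-head contributions.
n_heads, d = 4, 16
head_outputs = rng.normal(size=(n_heads, d))  # one d-dim vector per head

# Hypothetical demographic concept direction (unit-norm CAV).
cav = rng.normal(size=d)
cav /= np.linalg.norm(cav)

def bias_score(heads_kept):
    """Projection of the summed kept-head contributions onto the CAV."""
    pooled = head_outputs[heads_kept].sum(axis=0)
    return float(pooled @ cav)

all_heads = list(range(n_heads))
baseline = bias_score(all_heads)

# Zero-ablate each head in turn; by linearity, the drop in the CAV
# projection is exactly that head's contribution to the bias direction.
for h in all_heads:
    kept = [i for i in all_heads if i != h]
    delta = baseline - bias_score(kept)
    print(f"head {h}: contributes {delta:+.3f} to CAV projection")
```

Because the decomposition is linear, the per-head deltas sum exactly to the baseline projection, which is what makes attributing bias to individual heads (and comparing against layer-matched random controls) well-posed.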