FairNVT: Improving Fairness via Noise Injection in Vision Transformers

arXiv cs.CV / 4/21/2026


Key Points

  • FairNVT is proposed as a lightweight debiasing framework for pretrained transformer-based encoders that targets both representation-level and prediction-level fairness without sacrificing task accuracy.
  • The paper argues representation and prediction fairness are closely linked, and that suppressing sensitive information in learned embeddings can directly lead to fairer downstream predictions.
  • FairNVT uses lightweight adapters to learn task-relevant and sensitive embeddings, injects calibrated Gaussian noise into the sensitive embedding, and then fuses it with the task representation.
  • It further employs orthogonality constraints and fairness regularization to reduce sensitive-attribute leakage and improve fairness metrics such as demographic parity and equalized odds.
  • Experiments across three vision-and-language datasets show that FairNVT lowers sensitive-attribute attacker accuracy while maintaining strong task performance, and it is compatible with many pretrained transformer encoders.
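The mechanism sketched in the bullets above can be illustrated in a few lines. The following is a minimal numpy sketch under loud assumptions: the adapters are modeled as plain linear maps, fusion as simple addition, and the orthogonality constraint as a squared-inner-product penalty — the paper's actual adapter architecture, fusion operator, and noise calibration are not specified in this summary.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions; stand-ins for a frozen encoder's feature size
# and the lightweight adapters' embedding size.
d_enc, d_emb = 16, 8
W_task = rng.standard_normal((d_enc, d_emb)) * 0.1   # "task adapter" (assumed linear)
W_sens = rng.standard_normal((d_enc, d_emb)) * 0.1   # "sensitive adapter" (assumed linear)

def forward(h, sigma=0.5):
    """Split a frozen-encoder feature h into task and sensitive embeddings,
    perturb the sensitive one with Gaussian noise of scale sigma, then fuse.
    Additive fusion is an assumption made for this sketch."""
    z_task = h @ W_task
    z_sens = h @ W_sens
    z_sens_noisy = z_sens + rng.normal(0.0, sigma, size=z_sens.shape)
    z_fused = z_task + z_sens_noisy
    return z_task, z_sens, z_fused

def orthogonality_penalty(z_task, z_sens):
    """Penalize per-example inner products so the task and sensitive
    embeddings are pushed toward orthogonal subspaces."""
    inner = np.sum(z_task * z_sens, axis=1)   # one dot product per example
    return float(np.mean(inner ** 2))

h = rng.standard_normal((4, d_enc))           # a batch of encoder features
z_t, z_s, z_f = forward(h)
loss_orth = orthogonality_penalty(z_t, z_s)
```

In training, `loss_orth` would be added (with a weight) to the task loss and a fairness regularizer; at inference, the noisy sensitive branch keeps an attacker from recovering the sensitive attribute from the fused representation.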

Abstract

This paper presents FairNVT, a lightweight debiasing framework for pretrained transformer-based encoders that improves both representation-level and prediction-level fairness while preserving task accuracy. Unlike many existing debiasing approaches that address these notions separately, we argue they are inherently connected: suppressing sensitive information at the representation level can facilitate fairer predictions. Our approach learns task-relevant and sensitive embeddings via lightweight adapters, applies calibrated Gaussian noise to the sensitive embedding, and fuses it with the task representation. Together with orthogonality constraints and fairness regularization, these components jointly reduce sensitive-attribute leakage in the learned embeddings and encourage fairer downstream predictions. The framework is compatible with a wide range of pretrained transformer encoders. Across three datasets spanning vision and language, FairNVT reduces sensitive-attribute attacker accuracy, improves demographic-parity and equalized-odds metrics, and maintains high task performance.
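The two fairness metrics the abstract reports, demographic parity and equalized odds, have standard definitions that are easy to state in code. A short sketch for the binary-attribute, binary-label case (the group encoding and example arrays below are illustrative, not from the paper):

```python
import numpy as np

def demographic_parity_gap(y_pred, group):
    """|P(yhat=1 | g=0) - P(yhat=1 | g=1)|: gap in positive-prediction
    rates between the two groups; smaller is fairer."""
    rates = [y_pred[group == g].mean() for g in (0, 1)]
    return abs(rates[0] - rates[1])

def equalized_odds_gap(y_pred, y_true, group):
    """Worst-case gap across groups in true-positive rate (y=1 slice)
    and false-positive rate (y=0 slice); smaller is fairer."""
    gaps = []
    for y in (0, 1):
        mask = y_true == y
        rates = [y_pred[mask & (group == g)].mean() for g in (0, 1)]
        gaps.append(abs(rates[0] - rates[1]))
    return max(gaps)

# Toy example: group 0 receives positives far more often than group 1.
y_pred = np.array([1, 0, 1, 1, 0, 0, 1, 0])
y_true = np.array([1, 0, 1, 0, 1, 0, 1, 0])
group  = np.array([0, 0, 0, 0, 1, 1, 1, 1])

dp = demographic_parity_gap(y_pred, group)          # 0.5
eo = equalized_odds_gap(y_pred, y_true, group)      # 0.5
```

A debiasing method like the one described would aim to drive both gaps toward zero while keeping overall accuracy, and the attacker-accuracy experiments measure the complementary representation-level goal: how poorly a probe can recover `group` from the embeddings.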