AI Navigate

Revisiting Label Inference Attacks in Vertical Federated Learning: Why They Are Vulnerable and How to Defend

arXiv cs.LG / 3/20/2026


Key Points

  • The paper revisits label inference attacks (LIAs) in vertical federated learning and argues that the common claim that well-trained bottom models effectively represent labels is misleading; using mutual information, it presents the first observation of a "model compensation" phenomenon in VFL.
  • It theoretically proves that in VFL the mutual information between layer outputs and labels increases with layer depth, indicating that bottom models primarily extract features while the top model handles the mapping from features to labels.
  • The authors introduce task reassignment to break the distribution alignment between features and labels, showing that disrupting this alignment significantly reduces LIA success.
  • They propose a zero-overhead defense based on layer adjustment: shifting cut layers forward to increase the proportion of top-model layers not only improves resistance to LIAs but also strengthens other existing defenses.
  • Extensive experiments across five datasets and five representative model architectures validate the defense and highlight practical implications for secure VFL deployment.
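
The claim that I(layer output; label) grows with depth can be illustrated with a toy estimate. The sketch below is not the paper's method: it simulates binary labels and binarized "layer activations" whose label-noise rate drops at deeper layers (the noise rates and the `mutual_info` helper are illustrative assumptions, not values from the paper) and computes the plug-in discrete mutual information, which increases from shallow to deep.

```python
import numpy as np

def mutual_info(z, y):
    """Plug-in discrete mutual information I(Z; Y) in nats, for binary z, y."""
    joint = np.zeros((2, 2))
    np.add.at(joint, (z, y), 1)          # joint counts
    joint /= joint.sum()                 # joint distribution p(z, y)
    pz = joint.sum(axis=1, keepdims=True)
    py = joint.sum(axis=0, keepdims=True)
    mask = joint > 0
    return float((joint[mask] * np.log(joint[mask] / (pz @ py)[mask])).sum())

rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=20000)       # private labels of the active party

# Deeper "layers" agree with the label more often, mimicking the paper's
# observation that mutual information with the label grows with depth
# (the noise rates below are illustrative, not from the paper).
mis = []
for noise in [0.45, 0.30, 0.15, 0.05]:   # shallow -> deep
    flip = rng.random(y.size) < noise
    z = np.where(flip, 1 - y, y)         # binarized layer "activation"
    mis.append(mutual_info(z, y))

assert all(a < b for a, b in zip(mis, mis[1:]))  # MI increases with depth
```

Under this toy model, the shallow layer's output is almost independent of the label while the deepest layer's output nearly determines it, which is the intuition behind the paper's "model compensation" phenomenon: the label mapping is concentrated in the top model.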

Abstract

Vertical federated learning (VFL) allows an active party with a top model and multiple passive parties with bottom models to collaborate. In this scenario, passive parties possessing only features may attempt to infer the active party's private labels, making label inference attacks (LIAs) a significant threat. Previous LIA studies have claimed that well-trained bottom models can effectively represent labels. However, we demonstrate that this view is misleading and exposes the vulnerability of existing LIAs. By leveraging mutual information, we present the first observation of the "model compensation" phenomenon in VFL. We theoretically prove that, in VFL, the mutual information between layer outputs and labels increases with layer depth, indicating that bottom models primarily extract feature information while the top model handles label mapping. Building on this insight, we introduce task reassignment to show that the success of existing LIAs actually stems from the distribution alignment between features and labels. When this alignment is disrupted, the performance of LIAs declines sharply or even fails entirely. Furthermore, the implications of this insight for defenses are also investigated. We propose a zero-overhead defense technique based on layer adjustment. Extensive experiments across five datasets and five representative model architectures indicate that shifting cut layers forward to increase the proportion of top-model layers in the entire model not only improves resistance to LIAs but also enhances other defenses.
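
The layer-adjustment defense is an architectural knob rather than an added mechanism, which is why it carries zero overhead. A minimal sketch of the idea, assuming a simple stack of named layers (the layer names and `split_at` helper are hypothetical, for illustration only): the cut layer decides which layers stay in the passive party's bottom model and which belong to the active party's top model, and shifting the cut toward the input enlarges the top model's share.

```python
# Hypothetical five-layer network split between the two VFL parties.
layers = ["fc1", "fc2", "fc3", "fc4", "fc5"]

def split_at(cut):
    """Split the stack after `cut` layers: (bottom model, top model)."""
    return layers[:cut], layers[cut:]

# A typical split keeps most layers with the passive party ...
default_bottom, default_top = split_at(3)
# ... while the defense shifts the cut layer forward (toward the input),
# so the label-relevant mapping lives in the active party's top model.
defended_bottom, defended_top = split_at(1)

assert len(defended_top) > len(default_top)
```

Because the bottom model then carries fewer layers, its outputs hold less label information (per the paper's mutual-information result), which is what degrades label inference attacks without any extra computation or communication.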