Variational Feature Compression for Model-Specific Representations

arXiv cs.CV / 4/9/2026


Key Points

  • The paper addresses “input repurposing” in shared/cloud inference, where unauthorized models can reuse released representations for unintended tasks despite access controls.
  • It proposes a feature compression/encoding framework using a variational latent bottleneck trained with a task-driven cross-entropy objective plus KL regularization, while avoiding pixel-level reconstruction loss.
  • A dynamic binary masking strategy selects or suppresses latent dimensions based on per-dimension KL divergence and gradient-based saliency with respect to a frozen target model, preserving utility for the designated task while discarding dimensions uninformative for it.
  • In experiments on CIFAR-100, the resulting representations maintain high accuracy for the intended classifier while driving unintended classifiers’ accuracy to below 2%, reported as a suppression ratio exceeding 45x.
  • The authors note the method requires a white-box training setup to compute saliency (gradient access), while inference only needs a forward pass through the frozen target model, and they call for further robustness evaluation against adaptive adversaries.
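The training objective summarized above — task cross-entropy plus KL regularization on a variational bottleneck, with no pixel-level reconstruction term — can be sketched as follows. This is a toy illustration, not the paper's implementation: the linear encoder, frozen linear classifier, latent size, and β weight are all placeholder assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(logits):
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def vib_loss(x, y, W_enc, W_cls, beta=1e-3):
    """Cross-entropy + KL objective for a variational latent bottleneck.

    A toy linear encoder maps x to (mu, logvar); z is sampled via the
    reparameterization trick; a frozen linear head W_cls classifies z.
    Note there is no reconstruction term anywhere in the loss.
    """
    h = x @ W_enc                              # (batch, 2 * latent_dim)
    d = h.shape[1] // 2
    mu, logvar = h[:, :d], h[:, d:]
    z = mu + np.exp(0.5 * logvar) * rng.standard_normal(mu.shape)
    # Per-dimension KL( N(mu, sigma^2) || N(0, 1) ), averaged over the batch.
    kl_per_dim = 0.5 * (mu**2 + np.exp(logvar) - logvar - 1.0).mean(axis=0)
    probs = softmax(z @ W_cls)
    ce = -np.log(probs[np.arange(len(y)), y] + 1e-12).mean()
    return ce + beta * kl_per_dim.sum(), kl_per_dim

# Toy shapes: 8 inputs of dim 16, latent dim 4, 3 classes.
x = rng.standard_normal((8, 16))
y = rng.integers(0, 3, size=8)
W_enc = rng.standard_normal((16, 8)) * 0.1   # encodes both mu and logvar
W_cls = rng.standard_normal((4, 3)) * 0.1    # stands in for the frozen target model
loss, kl_per_dim = vib_loss(x, y, W_enc, W_cls)
```

The per-dimension KL values computed here are also the natural input to the masking step: dimensions with near-zero KL carry almost no information about the input and are candidates for suppression.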

Abstract

As deep learning inference is increasingly deployed in shared and cloud-based settings, a growing concern is input repurposing, in which data submitted for one task is reused by unauthorized models for another. Existing privacy defenses largely focus on restricting data access, but provide limited control over what downstream uses a released representation can still support. We propose a feature extraction framework that suppresses cross-model transfer while preserving accuracy for a designated classifier. The framework employs a variational latent bottleneck, trained with a task-driven cross-entropy objective and KL regularization, but without any pixel-level reconstruction loss, to encode inputs into a compact latent space. A dynamic binary mask, computed from per-dimension KL divergence and gradient-based saliency with respect to the frozen target model, suppresses latent dimensions that are uninformative for the intended task. Because saliency computation requires gradient access, the encoder is trained in a white-box setting, whereas inference requires only a forward pass through the frozen target model. On CIFAR-100, the processed representations retain strong utility for the designated classifier while reducing the accuracy of all unintended classifiers to below 2%, yielding a suppression ratio exceeding 45 times relative to unintended models. Preliminary experiments on CIFAR-10, Tiny ImageNet, and Pascal VOC provide exploratory evidence that the approach extends across task settings, although further evaluation is needed to assess robustness against adaptive adversaries.
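The dynamic masking step described in the abstract scores each latent dimension by per-dimension KL and gradient-based saliency with respect to the frozen target model, then suppresses low-scoring dimensions. A minimal sketch follows; the additive score combination and the hard top-k selection rule are illustrative assumptions — the abstract states only that both signals are used.

```python
import numpy as np

def dynamic_mask(kl_per_dim, saliency_per_dim, keep_k):
    """Binary mask over latent dimensions.

    Combines per-dimension KL (how much information each dimension
    carries) with gradient-based saliency w.r.t. the frozen target
    model, keeps the top-k scoring dimensions, and zeroes the rest.
    Additive scoring and top-k selection are assumptions made here
    for illustration.
    """
    score = kl_per_dim + saliency_per_dim
    keep = np.argsort(score)[::-1][:keep_k]
    mask = np.zeros_like(score)
    mask[keep] = 1.0
    return mask

# Hypothetical per-dimension scores for a 5-dim latent.
kl = np.array([0.01, 1.2, 0.8, 0.02, 0.5])
sal = np.array([0.10, 0.9, 0.2, 0.05, 0.6])
mask = dynamic_mask(kl, sal, keep_k=3)
# Dimensions scoring low on both signals are suppressed to zero.
z = np.array([0.3, -1.1, 0.7, 2.0, -0.4])
z_masked = z * mask
```

In this toy example, dimensions 0 and 3 score lowest on both signals and are masked out, so the released representation `z_masked` retains only the dimensions most useful to the intended classifier.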