Adaptive Transform Coding for Semantic Compression

arXiv cs.CV, April 30, 2026


Key Points

  • The paper addresses semantic (feature/embedding) compression for vision tasks, shifting from reconstructing images for humans to transmitting compact machine-oriented representations for downstream inference.
  • It proposes an adaptive transform-coding approach for semantic-feature compression based on the conditional rate-distortion function of a Gaussian mixture model.
  • The method selects mode-dependent transforms and quantizers according to the inferred source component, improving efficiency for heterogeneous feature distributions.
  • Experiments on features from common vision backbones and foundation models indicate the approach achieves results better than, or competitive with, state-of-the-art neural compression methods while remaining flexible and interpretable.

Abstract

Visual data compression is shifting from human-centered reconstruction to machine-oriented representation coding. In this setting, an image is often mapped to a compact semantic embedding, which is then compressed and transmitted for downstream inference. We propose an adaptive transform-coding method for semantic-feature compression motivated by the conditional rate-distortion function of a Gaussian mixture model. The scheme uses mode-dependent transforms and quantizers selected according to the inferred source component, enabling more efficient coding of heterogeneous feature distributions. Evaluations on features from widely used vision backbones and foundation models show that the proposed method outperforms or is competitive with state-of-the-art neural compression methods while preserving flexibility and interpretability.
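The core mechanism described above, selecting a transform and quantizer per inferred mixture component, can be illustrated with a minimal numpy sketch. This is not the paper's implementation: the mixture parameters are synthetic, mode inference is a plain Gaussian log-likelihood comparison, the per-mode transform is the Karhunen-Loève transform (eigenbasis of each component's covariance), and the quantizer is uniform with a single assumed step size.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 2-component Gaussian mixture over 8-dim semantic features.
dim, num_modes = 8, 2
means = rng.normal(size=(num_modes, dim))
covs = []
for _ in range(num_modes):
    a = rng.normal(size=(dim, dim))
    covs.append(a @ a.T + dim * np.eye(dim))  # random SPD covariance

# Mode-dependent transforms: the KLT (eigenbasis of each component's
# covariance) decorrelates features drawn from that component.
transforms = [np.linalg.eigh(c)[1].T for c in covs]

def infer_mode(x):
    """Pick the mixture component with the highest (unnormalized) log-likelihood."""
    scores = []
    for k in range(num_modes):
        d = x - means[k]
        _, logdet = np.linalg.slogdet(covs[k])
        scores.append(-0.5 * (d @ np.linalg.solve(covs[k], d) + logdet))
    return int(np.argmax(scores))

def encode(x, step=0.5):
    """Apply the inferred mode's KLT, then quantize coefficients uniformly."""
    k = infer_mode(x)
    coeffs = transforms[k] @ (x - means[k])
    return k, np.round(coeffs / step).astype(int)  # transmit (mode id, indices)

def decode(k, indices, step=0.5):
    """Dequantize and invert the mode-dependent transform."""
    return transforms[k].T @ (indices * step) + means[k]

# Round-trip a sample drawn from component 1.
x = rng.multivariate_normal(means[1], covs[1])
k, idx = encode(x)
x_hat = decode(k, idx)
# Orthogonal transform preserves the L2 norm of the quantization error,
# so the reconstruction error is bounded by sqrt(dim) * step / 2.
err = np.linalg.norm(x - x_hat)
```

The point of the mode switch is that a single global transform cannot decorrelate a heterogeneous mixture: each component gets a basis matched to its own covariance, which is what makes heterogeneous feature distributions cheaper to code.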