When W4A4 Breaks Camouflaged Object Detection: Token-Group Dual-Constraint Activation Quantization

arXiv cs.CV / 4/21/2026


Key Points

  • The paper studies post-training W4A4 (4-bit weights/4-bit activations) quantization for camouflaged object detection (COD) using Transformer-based models, showing a sharp “quantization cliff” that makes aggressive low-bit inference unusually difficult for COD.
  • It identifies the cause as token-local activation bottlenecks where heavy-tailed background tokens dominate a shared activation range, increasing quantization step size and causing weak but meaningful boundary cues to be mapped into the zero bin.
  • To fix this, the authors propose COD-TDQ, a COD-aware Token-group Dual-constraint activation Quantization method using Direct-Sum Token-Group (DSTG) token-group scaling and Dual-Constraint Range Projection (DCRP) to bound both the step-to-dispersion ratio and the zero-bin mass.
  • Experiments on four COD benchmarks with two baseline models (CFRN and ESCNet) show COD-TDQ improves the Sα score by more than 0.12 over the state-of-the-art quantization approach without retraining, and the code will be released.
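The paper's implementation is not yet released, but the two coupled steps described above can be sketched roughly as follows. Everything here is an assumption: the group size, the constraint thresholds, the clip-shrinking search, and all function names are illustrative, not the authors' DSTG/DCRP algorithm.

```python
import numpy as np

def quantize_group(x, clip, n_bits=4):
    """Symmetric uniform quantization of a token group to a given clip range."""
    qmax = 2 ** (n_bits - 1) - 1           # 7 for signed 4-bit
    step = clip / qmax
    q = np.clip(np.round(x / step), -qmax - 1, qmax)
    return q * step, step

def cod_tdq_sketch(acts, group_size=4, n_bits=4,
                   max_step_to_disp=0.5, max_zero_mass=0.3):
    """Hypothetical sketch of the mechanism: per-token-group scales
    (DSTG-like) plus a dual-constraint clip search (DCRP-like) that
    bounds both the step-to-dispersion ratio and the zero-bin mass.
    Thresholds and search strategy are illustrative assumptions."""
    out = np.empty_like(acts)
    for start in range(0, acts.shape[0], group_size):
        g = acts[start:start + group_size]   # one token group, own scale
        disp = g.std() + 1e-8                # dispersion proxy for the group
        clip = np.abs(g).max()               # start from the full range
        for _ in range(20):                  # shrink clip until both constraints hold
            deq, step = quantize_group(g, clip, n_bits)
            zero_mass = np.mean(deq == 0)    # fraction mapped to the zero bin
            if step / disp <= max_step_to_disp and zero_mass <= max_zero_mass:
                break
            clip *= 0.9                      # smaller clip -> smaller step
        out[start:start + group_size], _ = quantize_group(g, clip, n_bits)
    return out
```

Shrinking the clip trades clipping error on the largest values for a finer step, which is exactly the lever both constraints pull on: a smaller step lowers the step-to-dispersion ratio and empties the zero bin.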

Abstract

Camouflaged object detection (COD) segments objects that intentionally blend with the background, so predictions depend on subtle texture and boundary cues. COD is often needed under tight on-device memory and latency budgets, making low-bit inference highly desirable. However, COD is unusually hard to quantize aggressively. We study post-training W4A4 quantization of Transformer-based COD and find a task-specific cliff: heavy-tailed background tokens dominate a shared activation range, inflating the step size and pushing weak-but-structured boundary cues into the zero bin. This exposes a token-local bottleneck: remove cross-token range domination and bound the zero-bin mass under 4-bit activations. To address this, we introduce COD-TDQ, a COD-aware Token-group Dual-constraint activation Quantization method. COD-TDQ addresses this token-local bottleneck with two coupled steps: Direct-Sum Token-Group (DSTG) assigns token-group scales to suppress cross-token range domination, and Dual-Constraint Range Projection (DCRP) projects each token-group clip range to keep the step-to-dispersion ratio and the zero-bin mass bounded. Across four COD benchmarks and two baseline models (CFRN and ESCNet), COD-TDQ consistently achieves an Sα score more than 0.12 higher than that of the state-of-the-art quantization method without retraining. The code will be released.
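To make the "quantization cliff" concrete, here is a minimal numeric illustration (not from the paper) of the failure mode the abstract describes: one heavy-tailed background token sets the shared 4-bit scale, and the weak boundary cues of another token all round to the zero bin, whereas a per-token scale preserves them. The token values are invented for illustration.

```python
import numpy as np

def quantize(x, scale, n_bits=4):
    """Symmetric uniform quantizer: scale, round to the nearest step, clip."""
    qmax = 2 ** (n_bits - 1) - 1             # 7 for signed 4-bit
    q = np.clip(np.round(x / scale), -qmax - 1, qmax)
    return q * scale

# Two illustrative "tokens": a heavy-tailed background token and a weak cue.
background = np.array([12.0, -9.0, 10.5, -11.0])
boundary   = np.array([0.4, -0.3, 0.5, -0.2])

# Shared (per-tensor) scale: the background token dominates the range,
# so the step is ~1.7 and every boundary value rounds to the zero bin.
shared_scale = max(np.abs(background).max(), np.abs(boundary).max()) / 7
shared_out = quantize(boundary, shared_scale)   # all zeros

# Per-token scale: the boundary token gets its own, ~24x finer step,
# and the weak cues survive quantization.
token_scale = np.abs(boundary).max() / 7
token_out = quantize(boundary, token_scale)     # nonzero, close to the input
```

This is the intuition behind assigning token-group scales: no single heavy-tailed token is allowed to inflate the step size for everyone else.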