Modality-Specific Hierarchical Enhancement for RGB-D Camouflaged Object Detection
arXiv cs.CV / 4/6/2026
Key Points
- RGB-D camouflaged object detection is difficult because targets closely resemble their backgrounds, and existing methods often fuse RGB and depth features directly, without sufficient modality-specific enhancement beforehand.
- The paper introduces MHENet, which adds a Texture Hierarchical Enhancement Module (THEM) to boost subtle high-frequency texture cues and a Geometry Hierarchical Enhancement Module (GHEM) to strengthen geometric structure via learnable gradient extraction.
- MHENet uses an Adaptive Dynamic Fusion Module (ADFM) that fuses the enhanced texture and geometry representations using spatially varying weights to improve cross-modal fusion quality.
- Experiments on four benchmarks show MHENet outperforms 16 state-of-the-art methods both qualitatively and quantitatively, and the code is released on GitHub.
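The ADFM's spatially varying fusion can be illustrated with a small sketch. The snippet below is a hypothetical, simplified illustration only (the paper's module learns its weights; here a per-pixel softmax over channel-mean activations stands in for the learned weighting), showing how a weight map can decide, at every spatial location, how much each modality contributes. The function name `adaptive_fusion` and the energy heuristic are assumptions, not the authors' implementation.

```python
import numpy as np

def adaptive_fusion(texture_feat: np.ndarray, geometry_feat: np.ndarray):
    """Fuse two (C, H, W) feature maps with spatially varying weights.

    Illustrative stand-in for ADFM-style fusion: a per-pixel weight map
    (derived here from channel-mean activations via a softmax over the
    two modalities) controls each modality's contribution per location.
    """
    # Per-pixel "energy" of each modality: mean over channels -> (H, W)
    t_energy = texture_feat.mean(axis=0)
    g_energy = geometry_feat.mean(axis=0)

    # Softmax over the two modalities at each pixel; weights sum to 1
    stacked = np.stack([t_energy, g_energy])      # (2, H, W)
    exp = np.exp(stacked - stacked.max(axis=0))   # stabilized softmax
    weights = exp / exp.sum(axis=0)               # (2, H, W)

    # Weighted combination, broadcasting the weight maps over channels
    fused = weights[0][None] * texture_feat + weights[1][None] * geometry_feat
    return fused, weights

# Toy usage: the more active modality dominates at every pixel
texture = np.full((2, 3, 3), 2.0)   # strong texture response
geometry = np.full((2, 3, 3), 0.0)  # weak geometry response
fused, w = adaptive_fusion(texture, geometry)
```

Because the weights are computed per pixel rather than globally, the fusion can lean on texture cues in finely patterned regions and on geometric cues where depth gradients are informative, which is the motivation the paper gives for spatially varying fusion.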