AI Navigate

Decoding Matters: Efficient Mamba-Based Decoder with Distribution-Aware Deep Supervision for Medical Image Segmentation

arXiv cs.CV / 3/16/2026

News · Ideas & Deep Analysis · Models & Research

Key Points

  • The paper introduces Deco-Mamba, a decoder-centric architecture for generalized 2D medical image segmentation built on a U-Net-like structure with a Transformer-CNN-Mamba design.
  • It integrates novel decoder components such as a Co-Attention Gate (CAG), Vision State Space Module (VSSM), and a deformable convolutional refinement block to enhance multi-scale contextual representation.
  • A windowed distribution-aware KL-divergence loss is proposed for deep supervision across multiple decoding stages.
  • Extensive experiments on diverse medical imaging benchmarks report state-of-the-art performance with strong generalization while maintaining moderate model complexity.
  • The authors indicate that the source code will be released upon acceptance.

Abstract

Deep learning has achieved remarkable success in medical image segmentation, often reaching expert-level accuracy in delineating tumors and tissues. However, most existing approaches remain task-specific, showing strong performance on individual datasets but limited generalization across diverse imaging modalities. Moreover, many methods focus primarily on the encoder, relying on large pretrained backbones that increase computational complexity. In this paper, we propose a decoder-centric approach for generalized 2D medical image segmentation. The proposed Deco-Mamba follows a U-Net-like structure with a Transformer-CNN-Mamba design. The encoder combines a CNN block and Transformer backbone for efficient feature extraction, while the decoder integrates our novel Co-Attention Gate (CAG), Vision State Space Module (VSSM), and deformable convolutional refinement block to enhance multi-scale contextual representation. Additionally, a windowed distribution-aware KL-divergence loss is introduced for deep supervision across multiple decoding stages. Extensive experiments on diverse medical image segmentation benchmarks yield state-of-the-art performance and strong generalization capability while maintaining moderate model complexity. The source code will be released upon acceptance.
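
The abstract names a windowed distribution-aware KL-divergence loss for deep supervision but does not spell out its formulation. The sketch below is one plausible reading, not the authors' implementation. It assumes that each decoding stage yields per-class probability maps, that the ground truth has already been resized to that stage's resolution, and that "windowed" means comparing class distributions averaged over non-overlapping square windows. The function names, the `window` and `weights` parameters, and the direction KL(target || prediction) are all hypothetical choices made for illustration.

```python
import math


def window_distributions(class_maps, window):
    """Average each class map inside non-overlapping square windows.

    class_maps: list of C two-dimensional lists (H x W) holding per-pixel
    class probabilities or one-hot labels.
    Returns one length-C distribution per window.
    """
    height, width = len(class_maps[0]), len(class_maps[0][0])
    dists = []
    for top in range(0, height, window):
        for left in range(0, width, window):
            rows = range(top, min(top + window, height))
            cols = range(left, min(left + window, width))
            area = len(rows) * len(cols)
            dists.append([
                sum(cmap[r][c] for r in rows for c in cols) / area
                for cmap in class_maps
            ])
    return dists


def windowed_kl(pred_maps, target_maps, window, eps=1e-8):
    """Mean KL(target || prediction) over window-level class distributions.

    Assumption: this direction and the eps smoothing are illustrative only.
    """
    pred = window_distributions(pred_maps, window)
    target = window_distributions(target_maps, window)
    total = 0.0
    for p, q in zip(pred, target):
        # Terms with zero target mass contribute nothing to the divergence.
        total += sum(
            qk * math.log((qk + eps) / (pk + eps))
            for pk, qk in zip(p, q)
            if qk > 0
        )
    return total / len(pred)


def deep_supervision_loss(stage_preds, stage_targets, window, weights=None):
    """Weighted sum of the windowed KL term over all decoding stages."""
    if weights is None:
        weights = [1.0] * len(stage_preds)  # assumption: equal stage weights
    return sum(
        w * windowed_kl(pred, target, window)
        for w, pred, target in zip(weights, stage_preds, stage_targets)
    )


# Toy two-class example on a 2x2 image with a single 2x2 window.
target = [[[1, 1], [0, 0]], [[0, 0], [1, 1]]]
pred = [[[0.9, 0.9], [0.9, 0.9]], [[0.1, 0.1], [0.1, 0.1]]]
loss = deep_supervision_loss([pred], [target], window=2)
```

In this toy case the target window distribution is (0.5, 0.5) and the predicted one is (0.9, 0.1), so the loss is positive; it falls to zero when the two window-level distributions match. A practical version would operate on batched tensors in a deep learning framework, which this pure-Python sketch deliberately avoids.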