Geometrical Cross-Attention and Nonvoid Voxelization for Efficient 3D Medical Image Segmentation

arXiv cs.CV / 4/8/2026

💬 OpinionSignals & Early TrendsIdeas & Deep AnalysisModels & Research

共有:

Key Points

The paper introduces GCNV-Net, a 3D medical image segmentation framework designed to improve both accuracy and computational efficiency across different organs and imaging modalities.
It combines a Tri-directional Dynamic Nonvoid Voxel Transformer that partitions voxels along three anatomical planes with a Geometrical Cross-Attention module that fuses multi-scale features using explicit geometric positional information.
A Nonvoid Voxelization strategy reduces computation by processing only informative voxel regions, cutting FLOPs by 56.13% and inference latency by 68.49% versus conventional voxelization.
Across multiple benchmarks (BraTS2021, ACDC, MSD Prostate, MSD Pancreas, AMOS2022), the method reports state-of-the-art results, outperforming prior best methods by 0.65% Dice, 0.63% IoU, 1% NSD, and about 14.5% HD95.
The authors argue the approach provides a strong accuracy–efficiency balance and robustness suitable for potential clinical deployment.

Abstract

Accurate segmentation of 3D medical scans is crucial for clinical diagnostics and treatment planning, yet existing methods often fail to achieve both high accuracy and computational efficiency across diverse anatomies and imaging modalities. To address these challenges, we propose GCNV-Net, a novel 3D medical segmentation framework that integrates a Tri-directional Dynamic Nonvoid Voxel Transformer (3DNVT), a Geometrical Cross-Attention module (GCA), and Nonvoid Voxelization. The 3DNVT dynamically partitions relevant voxels along the three orthogonal anatomical planes, namely the transverse, sagittal, and coronal planes, enabling effective modeling of complex 3D spatial dependencies. The GCA mechanism explicitly incorporates geometric positional information during multi-scale feature fusion, significantly enhancing fine-grained anatomical segmentation accuracy. Meanwhile, Nonvoid Voxelization processes only informative regions, greatly reducing redundant computation without compromising segmentation quality, and achieves a 56.13% reduction in FLOPs and a 68.49% reduction in inference latency compared to conventional voxelization. We evaluate GCNV-Net on multiple widely used benchmarks: BraTS2021, ACDC, MSD Prostate, MSD Pancreas, and AMOS2022. Our method achieves state-of-the-art segmentation performance across all datasets, outperforming the best existing methods by 0.65% on Dice, 0.63% on IoU, 1% on NSD, and relatively 14.5% on HD95. All results demonstrate that GCNV-Net effectively balances accuracy and efficiency, and its robustness across diverse organs, disease conditions, and imaging modalities highlights strong potential for clinical deployment.

Black Hat Asia

AI Business

The enforcement gap: why finding issues was never the problem

Dev.to

How I Built AI-Powered Auto-Redaction Into a Desktop Screenshot Tool

Dev.to

Agentic AI vs Traditional Automation: Why They Require Different Approaches in Modern Enterprises

Dev.to

Agentic AI vs Traditional Automation: Why Modern Enterprises Must Treat Them Differently

Dev.to

Geometrical Cross-Attention and Nonvoid Voxelization for Efficient 3D Medical Image Segmentation

Key Points

Abstract

Related Articles

Black Hat Asia

The enforcement gap: why finding issues was never the problem

How I Built AI-Powered Auto-Redaction Into a Desktop Screenshot Tool

Agentic AI vs Traditional Automation: Why They Require Different Approaches in Modern Enterprises

Agentic AI vs Traditional Automation: Why Modern Enterprises Must Treat Them Differently

関連おすすめサービス

Notta搭載AI議事録イヤホン ZENCHORD1

AI搭載ボイスレコーダー Plaud

画像高画質化AIツール Aiarty Image Enhancer