Breaking the Resource Wall: Geometry-Guided Sequence Modeling for Efficient Semantic Segmentation

arXiv cs.CV / 4/28/2026



Key Points

The paper introduces DGM-Net (Directional Geometric Mamba Network), a geometry-guided semantic segmentation model designed to improve accuracy without scaling up backbone size or computation budgets.
It proposes Directional Geometric Mamba (G-Mamba), a linear-complexity O(N) sequence/context modeling operator intended as an efficient alternative to modules like ASPP and PPM.
To strengthen structural awareness in state space model (SSM)-based processing, the authors develop a DGM-Module that derives centripetal flow fields and topological skeletons to guide scanning and better preserve object boundaries.
The reported results are 80.8% mIoU within 28k iterations (the abstract does not name the benchmark for this figure), 82.3% mIoU on the Cityscapes test set, and 45.24% mIoU on ADE20K. The model also reportedly remains stable on constrained hardware (e.g., batch size 2 on 8GB VRAM).
Overall, the work argues that integrating geometric guidance into SSM-based architectures can yield resource-efficient, high-quality semantic segmentation results.
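
To make the "linear-complexity O(N)" claim concrete, here is a minimal sketch of a generic state-space recurrence run over a flattened image. It is not the paper's G-Mamba operator: the parameters `a`, `b`, and `c` are fixed scalars chosen for illustration (Mamba-style models make them input-dependent), the row-major flattening is a placeholder for the geometry-guided scan order the paper describes, and the names `linear_ssm_scan` and `flatten_row_major` are hypothetical.

```python
from typing import List, Sequence


def linear_ssm_scan(
    x: Sequence[float], a: float = 0.9, b: float = 1.0, c: float = 1.0
) -> List[float]:
    """Scalar linear state-space recurrence (illustrative, fixed parameters).

    h_t = a * h_{t-1} + b * x_t
    y_t = c * h_t

    Each token is visited once and updates a constant-size state,
    so the cost grows linearly with the sequence length N.
    """
    h = 0.0
    outputs = []
    for x_t in x:
        h = a * h + b * x_t  # carry context forward through the hidden state
        outputs.append(c * h)
    return outputs


def flatten_row_major(image: Sequence[Sequence[float]]) -> List[float]:
    """Turn a 2-D grid into a 1-D token sequence, row by row.

    Assumption: a plain raster order stands in for whatever
    geometry-guided ordering the paper actually uses.
    """
    return [value for row in image for value in row]


image = [[1.0, 0.0], [0.0, 2.0]]
tokens = flatten_row_major(image)
outputs = linear_ssm_scan(tokens, a=0.5)
print(outputs)
```

Because the state `h` summarizes everything seen so far, every output position receives context from all earlier tokens without the pairwise comparisons that make attention quadratic. The order in which pixels are visited therefore decides which context reaches which pixel, which is why a scan order that follows object structure could matter for boundaries.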

Abstract

High-performance semantic segmentation has achieved significant progress in recent years, often driven by increasingly large backbones and higher computational budgets. While effective, such approaches introduce substantial computational overhead and limit accessibility under constrained hardware settings. In this paper, we propose DGM-Net (Directional Geometric Mamba Network), an efficient architecture that improves modeling capability through structural design rather than increasing model capacity. We introduce Directional Geometric Mamba (G-Mamba), a linear-complexity O(N) operator, as an alternative to conventional context modeling modules such as ASPP and PPM. To further enhance structural awareness in state space model (SSM)-based modeling, we design the DGM-Module, which extracts centripetal flow fields and topological skeletons to guide the scanning process and improve boundary preservation. Without relying on large-scale pretraining or heavy backbone scaling, DGM-Net achieves 80.8% mIoU within 28k iterations, 82.3% mIoU on the Cityscapes test set, and 45.24% mIoU on ADE20K. In addition, the model maintains stable performance under constrained hardware settings (e.g., batch size of 2 on 8GB VRAM), highlighting its efficiency and practicality. These results demonstrate that incorporating geometric guidance into SSM-based architectures provides an effective and resource-efficient direction for semantic segmentation.
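
The results above are all reported as mIoU (mean intersection over union). As a reference for how that metric is typically computed, here is a minimal sketch over flattened label maps. It is not the paper's evaluation code: the function name `mean_iou` is hypothetical, the ignore label of 255 is an assumed convention, and benchmark toolkits usually accumulate intersections and unions over the whole dataset before averaging rather than scoring a single image.

```python
from typing import Sequence


def mean_iou(
    pred: Sequence[int],
    target: Sequence[int],
    num_classes: int,
    ignore_index: int = 255,
) -> float:
    """Mean intersection-over-union across classes that appear at least once.

    pred and target are flattened per-pixel class ids of equal length.
    Pixels whose target equals ignore_index are skipped (assumed convention).
    """
    intersection = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue
        if p == t:
            intersection[t] += 1
            union[t] += 1
        else:
            union[t] += 1  # missed pixel of the true class
            if 0 <= p < num_classes:
                union[p] += 1  # false positive for the predicted class
    # Classes absent from both prediction and target are left out of the mean.
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious) if ious else float("nan")


pred = [0, 0, 1, 1]
target = [0, 1, 1, 255]  # last pixel is ignored
score = mean_iou(pred, target, num_classes=2)
print(score)
```

In this toy case each class has one correctly labeled pixel out of a union of two, so both per-class IoUs are 0.5 and so is their mean.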
