High-Precision Dichotomous Image Segmentation via Depth Integrity-Prior and Fine-Grained Patch Strategy

arXiv cs.CV · April 29, 2026


Key Points

  • The paper addresses high-precision dichotomous image segmentation (DIS) by highlighting a trade-off: non-diffusion methods are fast but suffer from weak semantics and unstable spatial priors, while diffusion-based methods are accurate but computationally expensive.
  • It introduces a “depth integrity-prior,” observing that complete objects tend to form low-variance, smoothly connected regions with sharp boundaries in depth maps, whereas backgrounds show chaotic high-variance patterns due to disconnected surfaces.
  • Because DIS datasets typically lack depth maps, the authors generate pseudo-depth with a monocular depth estimation model, quickly capturing the semantic and depth-aware spatial differences between foreground objects and the background.
  • The proposed Prior-guided Depth Fusion Network (PDFNet) fuses RGB with pseudo-depth features, adds a depth integrity-prior loss for depth-consistent segmentation, and uses a fine-grained enhancement module with adaptive patch selection to improve boundary sharpness.
  • Experiments report state-of-the-art performance (Fmax 0.915 on both DIS-VD and DIS-TE) while using less than half the parameters of diffusion-based methods; the code is publicly available.
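The depth integrity-prior can be illustrated numerically: a complete object forming a smooth, connected surface has near-zero local depth variance, while disconnected background surfaces at varying depths produce high variance. Below is a minimal NumPy sketch on a synthetic pseudo-depth map (not the paper's implementation; the window size and all names are illustrative):

```python
import numpy as np

def local_depth_variance(depth, k=3):
    """Local variance of a depth map over k x k windows (valid region).

    Low variance suggests a smooth, connected surface (object interior);
    high variance suggests disconnected background surfaces.
    """
    # Sliding-window view over all k x k patches (NumPy >= 1.20).
    windows = np.lib.stride_tricks.sliding_window_view(depth, (k, k))
    return windows.var(axis=(-2, -1))

rng = np.random.default_rng(0)
depth = rng.uniform(2.0, 10.0, size=(32, 32))  # chaotic background depths
depth[8:24, 8:24] = 5.0                        # flat object at a constant depth

var_map = local_depth_variance(depth)
fg = var_map[10:20, 10:20].mean()  # windows fully inside the object
bg = var_map[:5, :5].mean()        # windows in a background corner
print(fg < bg)  # the object interior is the low-variance region
```

This is only a toy check of the observation; the paper exploits the prior through learned feature fusion and a dedicated loss rather than an explicit variance filter.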

Abstract

High-precision dichotomous image segmentation (DIS) is the task of extracting fine-grained objects from high-resolution images. Existing methods trade efficiency for accuracy: non-diffusion methods are fast but suffer from weak semantics and unstable spatial priors, causing false detections; diffusion-based methods offer high accuracy via strong generative priors but are computationally expensive. In depth maps, a complete object appears as a low-variance region with a smooth interior and sharp boundaries, whereas the background exhibits a chaotic, high-variance pattern due to disconnected surfaces at varying depths. We refer to this as the depth integrity-prior. Inspired by this, and noting that DIS datasets currently lack depth maps, we leverage pseudo-depth from monocular depth estimation models to obtain essential semantic understanding, rapidly revealing spatial differences between target objects and the background. To exploit this prior, we propose the Prior-guided Depth Fusion Network (PDFNet), which fuses RGB and pseudo-depth features for depth-aware structure perception. We further introduce a novel depth integrity-prior loss to enforce depth consistency in segmentation and a fine-grained enhancement module with adaptive patch selection to sharpen boundaries. Notably, PDFNet with DAM-v2 achieves state-of-the-art results (Fmax 0.915 on DIS-VD and 0.915 on DIS-TE) using less than half the parameters of diffusion-based methods. Our code is available at https://tennine2077.github.io/PDFNet.github.io/.
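The abstract does not spell out the depth integrity-prior loss, so the following is only a plausible sketch of the idea: weight the pseudo-depth map by the soft predicted mask and penalize the mask-weighted depth variance, so a mask that leaks onto background surfaces at other depths incurs a higher loss. The function name and exact formulation are assumptions, not the paper's definition:

```python
import numpy as np

def depth_integrity_loss(pred, depth, eps=1e-6):
    """Hypothetical depth integrity-prior loss: soft-mask-weighted depth variance.

    pred  : soft foreground probabilities in [0, 1], shape (H, W)
    depth : pseudo-depth map, shape (H, W)

    A mask covering one connected object at a coherent depth yields low
    weighted variance; a mask spilling onto surfaces at other depths raises it.
    """
    w = pred / (pred.sum() + eps)         # normalize the mask into weights
    mu = (w * depth).sum()                # mask-weighted mean depth
    return (w * (depth - mu) ** 2).sum()  # mask-weighted depth variance

rng = np.random.default_rng(1)
depth = rng.uniform(2.0, 10.0, size=(32, 32))
depth[8:24, 8:24] = 5.0                 # object at a constant depth

tight = np.zeros((32, 32))
tight[8:24, 8:24] = 1.0                 # mask confined to the object
leaky = np.ones((32, 32))               # mask leaking over the whole image

loss_tight = depth_integrity_loss(tight, depth)
loss_leaky = depth_integrity_loss(leaky, depth)
print(loss_tight < loss_leaky)  # depth-consistent masks are cheaper
```

In training, such a term would be added to the usual segmentation losses to steer predictions toward depth-consistent regions.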
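The fine-grained enhancement module's adaptive patch selection is likewise described only at a high level. One common realization of this idea is to rank non-overlapping patches of the soft mask by boundary uncertainty (probabilities near 0.5) and refine only the top-k. A hypothetical NumPy sketch, with the scoring rule and all names assumed rather than taken from the paper:

```python
import numpy as np

def select_uncertain_patches(prob, patch=8, topk=4):
    """Hypothetical adaptive patch selection: score non-overlapping patches of
    a soft mask by uncertainty and return the top-k patch origins, i.e. the
    regions a fine-grained module would refine for sharper boundaries.
    """
    h, w = prob.shape
    scores = {}
    for i in range(0, h, patch):
        for j in range(0, w, patch):
            p = prob[i:i + patch, j:j + patch]
            # Uncertainty peaks where p is near 0.5 (object boundaries).
            scores[(i, j)] = (0.25 - (p - 0.5) ** 2).mean()
    return sorted(scores, key=scores.get, reverse=True)[:topk]

prob = np.zeros((32, 32))
prob[8:24, 8:24] = 1.0       # confident object interior
prob[7:9, 8:24] = 0.5        # uncertain strip along the top boundary

coords = select_uncertain_patches(prob)
print(coords)  # every selected patch touches the uncertain boundary strip
```

Restricting expensive high-resolution refinement to a few uncertain patches is what keeps such a scheme cheap compared with refining the full image.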