Toward Faithful Segmentation Attribution via Benchmarking and Dual-Evidence Fusion

arXiv cs.CV / 3/25/2026

Key Points

The paper argues that segmentation attribution methods are often evaluated only by visual plausibility, which can mask failures in causal faithfulness and off-target attribution.
It introduces a reproducible benchmark for semantic segmentation attribution that evaluates intervention-based faithfulness, off-target leakage, perturbation robustness, and runtime across Pascal VOC and SBD using three pretrained backbones.
The proposed Dual-Evidence Attribution (DEA) method applies agreement-weighted fusion that combines gradient evidence with region-level intervention signals to improve stability when gradients are unreliable.
Across all completed runs, DEA improves deletion-based faithfulness over gradient-only baselines and preserves strong robustness, at the cost of additional compute from its intervention passes.
The benchmark reveals a faithfulness–stability tradeoff among attribution families that would be invisible under purely visual evaluation, enabling more principled method selection; code is provided on GitHub.
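
To make two of the evaluation axes above concrete, here is a minimal NumPy sketch of an off-target leakage score and a deletion-style faithfulness curve. It is an illustration under stated assumptions, not the benchmark's actual protocol: the function names, the `score_fn` callable (any function mapping an image to a scalar target-class score), the chunked deletion schedule, and the constant `fill` value are all hypothetical choices.

```python
import numpy as np


def off_target_leakage(attribution, target_mask):
    """Fraction of absolute attribution mass that falls outside the target region."""
    mass = np.abs(np.asarray(attribution, dtype=float))
    inside = np.asarray(target_mask, dtype=bool)
    total = mass.sum()
    if total == 0:
        return 0.0
    return float(mass[~inside].sum() / total)


def deletion_curve(score_fn, image, attribution, steps=10, fill=0.0):
    """Target score after deleting the highest-attributed pixels in `steps` chunks.

    `score_fn` is a hypothetical stand-in for "run the segmentation model and
    return the score of the target class/region". A faster drop suggests the
    attribution ranks causally important pixels first.
    """
    h, w = attribution.shape
    # Flat pixel indices, most important first.
    order = np.argsort(-np.asarray(attribution, dtype=float), axis=None)
    perturbed = np.array(image, dtype=float)   # fresh C-contiguous copy
    flat = perturbed.reshape(h * w, -1)        # view: writes reach `perturbed`
    scores = [score_fn(perturbed)]
    for chunk in np.array_split(order, steps):
        flat[chunk] = fill                     # delete this chunk of pixels
        scores.append(score_fn(perturbed))
    return np.array(scores, dtype=float)


def deletion_auc(scores):
    """Trapezoidal area under the deletion curve on a unit-length x-axis."""
    scores = np.asarray(scores, dtype=float)
    return float(((scores[:-1] + scores[1:]) / 2.0).mean())
```

Under this sketch, a lower deletion area and a lower leakage fraction are both read as better. A real evaluation would replace `score_fn` with a forward pass through one of the pretrained backbones and would likely use a more careful deletion baseline than a constant fill.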

Abstract

Attribution maps for semantic segmentation are almost always judged by visual plausibility. Yet looking convincing does not guarantee that the highlighted pixels actually drive the model's prediction, nor that attribution credit stays within the target region. These questions require a dedicated evaluation protocol. We introduce a reproducible benchmark that tests intervention-based faithfulness, off-target leakage, perturbation robustness, and runtime on Pascal VOC and SBD across three pretrained backbones. To further demonstrate the benchmark, we propose Dual-Evidence Attribution (DEA), a lightweight correction that fuses gradient evidence with region-level intervention signals through agreement-weighted fusion. DEA increases emphasis where both sources agree and retains causal support when gradient responses are unstable. Across all completed runs, DEA consistently improves deletion-based faithfulness over gradient-only baselines and preserves strong robustness, at the cost of additional compute from intervention passes. The benchmark exposes a faithfulness-stability tradeoff among attribution families that is entirely hidden under visual evaluation, providing a foundation for principled method selection in segmentation explainability. Code is available at https://github.com/anmspro/DEA.
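
The abstract describes agreement-weighted fusion only in words, so the following NumPy sketch shows one possible way such a fusion could behave. It is not the paper's formula: the agreement term `1 - |g - r|`, the combination rule `r * (1 + agreement * g)`, the occlusion-based region scoring, and every function name here are assumptions chosen to reproduce the two stated properties (more emphasis where both sources agree, causal support kept where gradients are unreliable).

```python
import numpy as np


def unit_scale(x):
    """Scale a non-negative map so its maximum is 1 (all-zero maps stay zero)."""
    x = np.asarray(x, dtype=float)
    peak = x.max()
    return x / peak if peak > 0 else np.zeros_like(x)


def region_intervention_map(score_fn, image, region_labels, fill=0.0):
    """Per-pixel evidence from occluding one region at a time.

    `score_fn` is a hypothetical callable returning the target score for an
    image; `region_labels` is an integer H x W map (grid cells, superpixels, ...).
    Each pixel receives the score drop caused by occluding its region.
    """
    base = score_fn(image)
    evidence = np.zeros(region_labels.shape, dtype=float)
    for region in np.unique(region_labels):
        member = region_labels == region
        occluded = np.array(image, dtype=float)
        occluded[member] = fill
        evidence[member] = max(base - score_fn(occluded), 0.0)
    return evidence


def fuse_dual_evidence(gradient_map, intervention_map):
    """Illustrative agreement-weighted fusion of the two evidence maps."""
    g = unit_scale(np.abs(gradient_map))
    r = unit_scale(intervention_map)
    agreement = 1.0 - np.abs(g - r)      # 1 where the maps match, 0 where they differ most
    fused = r * (1.0 + agreement * g)    # boost where both agree, keep r otherwise
    return unit_scale(fused)
```

In this sketch, a pixel with a strong gradient response but no intervention support is suppressed to zero, while a pixel inside a causally important region keeps its intervention evidence even if its gradient is zero. The one extra forward pass per region in `region_intervention_map` is the kind of intervention cost that the abstract attributes DEA's additional compute to.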
