PR-MaGIC: Prompt Refinement Via Mask Decoder Gradient Flow For In-Context Segmentation

arXiv cs.CV / 4/15/2026

📰 NewsSignals & Early TrendsIdeas & Deep AnalysisModels & Research

共有:

Key Points

The paper introduces PR-MaGIC, a training-free test-time method to refine prompts for in-context image segmentation built around SAM-like visual foundation models.
It addresses a key weakness of existing in-context approaches—sub-optimal prompts caused by visual inconsistencies between support and query images.
PR-MaGIC leverages gradient flow from SAM’s mask decoder to improve prompt quality and therefore segmentation outputs, and it can plug into existing in-context segmentation frameworks.
The authors provide theoretical justification and a practical stabilization mechanism (a simple top-1 selection strategy) to ensure robust performance across different samples.
Experiments on multiple benchmarks show consistent segmentation quality improvements without additional training or architectural changes.

Abstract

Visual Foundation Models (VFMs) such as the Segment Anything Model (SAM) have significantly advanced broad use of image segmentation. However, SAM and its variants necessitate substantial manual effort for prompt generation and additional training for specific applications. Recent approaches address these limitations by integrating SAM into in-context (one/few shot) segmentation, enabling auto-prompting through semantic alignment between query and support images. Despite these efforts, they still generate sub-optimal prompts that degrade segmentation quality due to visual inconsistencies between support and query images. To tackle this limitation, we introduce PR-MaGIC (Prompt Refinement via Mask Decoder Gradient Flow for In-Context Segmentation), a training-free test-time framework that refines prompts via gradient flow derived from SAM's mask decoder. PR-MaGIC seamlessly integrates into in-context segmentation frameworks, being theoretically grounded yet practically stabilized through a simple top-1 selection strategy that ensures robust performance across samples. Extensive evaluations demonstrate that PR-MaGIC consistently improves segmentation quality across various benchmarks, effectively mitigating inadequate prompts without requiring additional training or architectural modifications.

Black Hat Asia

AI Business

Are gamers being used as free labeling labor? The rise of "Simulators" that look like AI training grounds [D]

Reddit r/MachineLearning

I built a trading intelligence MCP server in 2 days — here's how

Dev.to

Big Tech firms are accelerating AI investments and integration, while regulators and companies focus on safety and responsible adoption.

Dev.to

Qwen3.5-35B running well on RTX4060 Ti 16GB at 60 tok/s

Reddit r/LocalLLaMA

PR-MaGIC: Prompt Refinement Via Mask Decoder Gradient Flow For In-Context Segmentation

Key Points

Abstract

Related Articles

Black Hat Asia

Are gamers being used as free labeling labor? The rise of "Simulators" that look like AI training grounds [D]

I built a trading intelligence MCP server in 2 days — here's how

Big Tech firms are accelerating AI investments and integration, while regulators and companies focus on safety and responsible adoption.

Qwen3.5-35B running well on RTX4060 Ti 16GB at 60 tok/s

関連おすすめサービス

Notta搭載AI議事録イヤホン ZENCHORD1

AI搭載ボイスレコーダー Plaud

画像高画質化AIツール Aiarty Image Enhancer