MedVeriSeg: Teaching MLLM-Based Medical Segmentation Models to Verify Query Validity Without Extra Training

arXiv cs.CV / 4/14/2026

💬 OpinionSignals & Early TrendsIdeas & Deep AnalysisModels & Research

共有:

Key Points

The paper introduces MedVeriSeg, a training-free framework for LISA-like MLLM-based medical segmentation models to reject “false queries” that reference non-existent targets, addressing hallucinated mask generation.
It leverages an observed difference in the distribution patterns of similarity maps between the [SEG] token feature and MLLM image features for true versus false queries.
MedVeriSeg adds a Similarity Response Quality Scoring Module that scores the similarity map using three criteria—strength, compactness, and purity—to produce an initial existence prediction.
For final verification, it incorporates qualitative visual evidence by using GPT-4o to jointly evaluate the similarity heatmap and the module’s scoring outputs.
Experiments on a small benchmark derived from SA-Med2D-20M indicate MedVeriSeg can effectively reject false-query requests while keeping recognition of true queries reliable.

Abstract

Despite recent advances in MLLM-based medical image segmentation, existing LISA-like methods cannot reliably reject false queries and often produce hallucinated segmentation masks for absent targets. This limitation reduces practical reliability in both medical education and clinical use. In this work, we propose MedVeriSeg, a training-free verification framework that equips LISA-like medical segmentation models with the ability to identify and reject false queries which contain non-existent targets. Our key observation is that the similarity map between the [SEG] token feature and MLLM image features exhibits markedly different distribution patterns for true and false queries. Based on this, we introduce a Similarity Response Quality Scoring Module that characterizes the similarity map from three aspects: strength, compactness, and purity, producing an initial target-existence prediction. We further incorporate qualitative visual evidence by using GPT-4o to jointly assess the similarity heatmap and the results of Similarity Response Quality Scoring Module for final verification. Experiments on a small-scale benchmark constructed from SA-Med2D-20M show that MedVeriSeg effectively rejects false-query segmentation requests while maintaining reliable recognition of true queries.

Reported ban on ‘sex robots’ by online platform fuels debate on AI boundaries and content moderation

Reddit r/artificial

FastAPI With LangChain and MongoDB

Dev.to

Best AI Game Creator in 2026

Dev.to

Smart AI Recruiter Assistant with OpenClaw

Dev.to

🌱 Green Habit Tracker

Dev.to

MedVeriSeg: Teaching MLLM-Based Medical Segmentation Models to Verify Query Validity Without Extra Training

Key Points

Abstract

Related Articles

Reported ban on ‘sex robots’ by online platform fuels debate on AI boundaries and content moderation

FastAPI With LangChain and MongoDB

Best AI Game Creator in 2026

Smart AI Recruiter Assistant with OpenClaw

🌱 Green Habit Tracker

関連おすすめサービス

Notta搭載AI議事録イヤホン ZENCHORD1

AI搭載ボイスレコーダー Plaud

画像高画質化AIツール Aiarty Image Enhancer