When Minor Edits Matter: LLM-Driven Prompt Attack for Medical VLM Robustness in Ultrasound
arXiv cs.CV / 3/24/2026
📰 News
Key Points
- The paper argues that vision-language models in ultrasound can be vulnerable to “prompt attacks” because even minor changes to natural-language instructions (typos, shorthand, ambiguity) can significantly alter outputs.
- It introduces a scalable adversarial evaluation framework that uses an LLM to generate clinically plausible, human-like prompt variants through minimal edits and “humanized” rewrites.
- The authors evaluate multiple state-of-the-art Med-VLMs on ultrasound multiple-choice question-answering benchmarks to measure vulnerability, including how the attacker model's capability affects attack success rates.
- They analyze how attack success correlates with model confidence and report consistent failure patterns across models, indicating realistic robustness gaps for safe clinical deployment.
- The authors plan to release the code publicly after the review process, enabling further testing and mitigation work.
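The paper's code is not yet public, but the "minimal edit" idea can be illustrated with a small sketch. The snippet below is a hypothetical illustration (not the authors' framework): it produces prompt variants via character-level typos and clinical shorthand substitutions, the two perturbation styles the summary describes. The `SHORTHAND` table and both helper functions are assumptions for demonstration only.

```python
import random

# Hypothetical shorthand table; real clinical abbreviations vary by setting.
SHORTHAND = {
    "ultrasound": "US",
    "patient": "pt",
    "diagnosis": "dx",
    "history": "hx",
}

def typo_variant(prompt: str, rng: random.Random) -> str:
    """Minimal edit: swap two adjacent characters inside one random word."""
    words = prompt.split()
    candidates = [i for i, w in enumerate(words) if len(w) >= 4]
    if not candidates:
        return prompt  # nothing long enough to perturb
    i = rng.choice(candidates)
    w = words[i]
    j = rng.randrange(len(w) - 1)
    words[i] = w[:j] + w[j + 1] + w[j] + w[j + 2:]
    return " ".join(words)

def shorthand_variant(prompt: str) -> str:
    """Humanized rewrite: replace full clinical terms with shorthand."""
    out = prompt
    for term, abbr in SHORTHAND.items():
        out = out.replace(term, abbr)
    return out

if __name__ == "__main__":
    rng = random.Random(0)
    base = "Based on the ultrasound image, what is the most likely diagnosis?"
    print(typo_variant(base, rng))
    print(shorthand_variant(base))
```

In the paper's setup an LLM generates such variants so they stay clinically plausible; a robustness evaluation would then compare the Med-VLM's answers on the original and perturbed prompts.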