SciFigDetect: A Benchmark for AI-Generated Scientific Figure Detection

arXiv cs.CV / 4/10/2026


Key Points

  • SciFigDetect is introduced as the first benchmark specifically for detecting AI-generated scientific figures, addressing how this domain differs from open-domain image forensics due to structure, dense text, and scholarly semantics.
  • The dataset is built using an agent-based pipeline that retrieves licensed papers, performs multimodal understanding of text and figures, synthesizes candidate figures via multiple sources, and applies a review-driven refinement loop.
  • It includes multiple figure categories and aligned real–synthetic pairs, enabling evaluation across zero-shot transfer, cross-generator generalization, and degraded-image scenarios.
  • Benchmark results indicate current detectors fail dramatically in zero-shot transfer, overfit strongly to specific generators, and are fragile under common post-processing corruptions.
  • The authors provide the dataset publicly to support research into more robust and generalizable scientific-figure forensics and research-integrity tooling.
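The review-driven refinement loop described above can be sketched in a few lines. This is an illustrative reconstruction, not the paper's implementation: `refine_figure`, `generate`, and `review` are hypothetical names, and the real pipeline relies on multimodal models and actual image generators rather than the stubs shown here.

```python
def refine_figure(prompt, generate, review, max_rounds=3):
    """Generate a candidate figure, then iteratively revise the prompt
    until a reviewer accepts the result or rounds run out.
    (Hypothetical sketch of a review-driven refinement loop.)"""
    for _ in range(max_rounds):
        candidate = generate(prompt)
        verdict = review(candidate)  # e.g. {"accept": bool, "feedback": str}
        if verdict["accept"]:
            return candidate
        # Fold reviewer feedback back into the prompt for the next round.
        prompt = prompt + " | revise: " + verdict["feedback"]
    return None  # rejected after all rounds


# Toy demonstration with stub generator/reviewer:
def toy_generate(p):
    return {"prompt": p}

def toy_review(c):
    # Accept only once a revision has been applied.
    ok = "revise" in c["prompt"]
    return {"accept": ok, "feedback": "add axis labels"}

result = refine_figure("bar chart of accuracy", toy_generate, toy_review)
```

The key design point is that rejection does not discard a candidate outright; reviewer feedback is recycled into the next synthesis attempt, which is what makes the pipeline "review-driven" rather than a one-shot filter.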

Abstract

Modern multimodal generators can now produce scientific figures at near-publishable quality, creating a new challenge for visual forensics and research integrity. Unlike conventional AI-generated natural images, scientific figures are structured, text-dense, and tightly aligned with scholarly semantics, making them a distinct and difficult detection target. However, existing AI-generated image detection benchmarks and methods are almost entirely developed for open-domain imagery, leaving this setting largely unexplored. We present the first benchmark for AI-generated scientific figure detection. To construct it, we develop an agent-based data pipeline that retrieves licensed source papers, performs multimodal understanding of paper text and figures, builds structured prompts, synthesizes candidate figures, and filters them through a review-driven refinement loop. The resulting benchmark covers multiple figure categories, multiple generation sources, and aligned real–synthetic pairs. We benchmark representative detectors under zero-shot, cross-generator, and degraded-image settings. Results show that current methods fail dramatically in zero-shot transfer, exhibit strong generator-specific overfitting, and remain fragile under common post-processing corruptions. These findings reveal a substantial gap between existing AIGI detection capabilities and the emerging distribution of high-quality scientific figures. We hope this benchmark can serve as a foundation for future research on robust and generalizable scientific-figure forensics. The dataset is available at https://github.com/Joyce-yoyo/SciFigDetect.
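The degraded-image setting can be pictured as a simple robustness check: score a detector on clean inputs, then on the same inputs after a corruption, and compare. The sketch below is illustrative only; `toy_detector`, `add_noise`, and the 4-pixel "images" are stand-ins I introduce for clarity, not the paper's detectors, corruptions, or data.

```python
import random

def accuracy(detector, images, labels):
    """Fraction of images the detector labels correctly."""
    preds = [detector(img) for img in images]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)

def add_noise(img, sigma=0.3, seed=0):
    """One example corruption: additive Gaussian noise (a stand-in for
    real post-processing such as JPEG compression or rescaling)."""
    rng = random.Random(seed)
    return [x + rng.gauss(0, sigma) for x in img]

def toy_detector(img):
    """Toy stand-in for a real detector: thresholds the mean pixel value,
    returning 1 for 'synthetic' and 0 for 'real'."""
    return int(sum(img) / len(img) > 0.5)

# Two toy 4-pixel "images": one labeled synthetic (1), one real (0).
images = [[0.9] * 4, [0.1] * 4]
labels = [1, 0]

clean_acc = accuracy(toy_detector, images, labels)
noisy = [add_noise(img, sigma=1.0, seed=7) for img in images]
noisy_acc = accuracy(toy_detector, noisy, labels)
```

The benchmark's finding is that the gap between `clean_acc` and its degraded counterpart is large for current detectors, i.e. their decisions hinge on low-level statistics that common post-processing destroys.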