Evian: Towards Explainable Visual Instruction-tuning Data Auditing
arXiv cs.CV / 4/23/2026
📰 News · Ideas & Deep Analysis · Models & Research
Key Points
- The paper argues that LVLM performance hinges on high-quality training data, and that existing filtering methods are too coarse to catch subtle semantic issues such as logical fallacies and factual errors.
- It introduces a 300K-sample benchmark created by systematically injecting diverse, subtle defects to better stress-test visual-instruction data auditing.
- The authors propose a “Decomposition-then-Evaluation” approach that breaks model outputs into visual descriptions, subjective inferences, and factual claims for more fine-grained diagnosis.
- They implement this as EVIAN, an automated auditing framework that evaluates image-text consistency, logical coherence, and factual accuracy, and show that training on smaller, higher-quality datasets curated by EVIAN can outperform training on much larger uncurated ones.
- Experiments indicate that auditing benefits from decomposing work into verifiable subtasks, and that logical coherence is the most critical dimension for judging data quality.
Related Articles

Big Tech firms are accelerating AI investments and integration, while regulators and companies focus on safety and responsible adoption.
Dev.to

Trajectory Forecasts in Unknown Environments Conditioned on Grid-Based Plans
Dev.to

Why use an AI gateway at all?
Dev.to

OpenAI Just Named It Workspace Agents. We Open-Sourced Our Lark Version Six Months Ago
Dev.to

GPT Image 2 Subject-Lock Editing: A Practical Guide to input_fidelity
Dev.to