AICA-Bench: Holistically Examining the Capabilities of VLMs in Affective Image Content Analysis

arXiv cs.CV / 4/8/2026

Key Points

  • The paper introduces AICA-Bench to evaluate Vision-Language Models (VLMs) on holistic Affective Image Content Analysis across three tasks: Emotion Understanding, Emotion Reasoning, and Emotion-Guided Content Generation.
  • Experiments across 23 VLMs reveal two key weaknesses: poor calibration of emotion intensity and shallow, superficial open-ended emotional descriptions.
  • To mitigate these issues, the authors propose Grounded Affective Tree (GAT) Prompting, a training-free approach that uses visual scaffolding and hierarchical reasoning.
  • Results indicate that GAT reduces emotion intensity errors and improves the depth of open-ended descriptions and generated content, providing a strong baseline for future affective multimodal research.

Abstract

Vision-Language Models (VLMs) have demonstrated strong capabilities in perception, yet holistic Affective Image Content Analysis (AICA), which integrates perception, reasoning, and generation into a unified framework, remains underexplored. To address this gap, we introduce AICA-Bench, a comprehensive benchmark with three core tasks: Emotion Understanding (EU), Emotion Reasoning (ER), and Emotion-Guided Content Generation (EGCG). We evaluate 23 VLMs and identify two major limitations: weak intensity calibration and shallow open-ended descriptions. To address these issues, we propose Grounded Affective Tree (GAT) Prompting, a training-free framework that combines visual scaffolding with hierarchical reasoning. Experiments show that GAT reduces intensity errors and improves descriptive depth, providing a strong baseline for future research on affective multimodal understanding and generation.
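As a rough illustration of how a training-free, two-stage prompt in the spirit of GAT might be structured, the sketch below first elicits grounded visual evidence (the "scaffolding" step) and then reasons hierarchically from coarse emotion category to fine-grained emotion and calibrated intensity. The `query_vlm` helper, the prompt wording, and the 1-10 intensity scale are all assumptions made for illustration; the paper's actual prompts and scaffolding details may differ.

```python
# Illustrative sketch of a GAT-style two-stage prompting pipeline.
# NOTE: an assumption-based reconstruction, not the authors' released code.
# `query_vlm` is a hypothetical placeholder for any chat-capable VLM API.

from typing import Dict, List


def query_vlm(messages: List[Dict[str, str]], image_path: str) -> str:
    """Placeholder: send `messages` plus the image to a VLM and return its reply."""
    raise NotImplementedError("Wire this to your VLM client of choice.")


def gat_prompt(image_path: str) -> str:
    # Stage 1 (visual scaffolding): make the model list concrete,
    # image-grounded cues before committing to any emotion label.
    evidence = query_vlm(
        [{"role": "user",
          "content": ("List the visible cues in this image that could carry "
                      "emotional meaning: facial expressions, body language, "
                      "objects, colors, and scene context. "
                      "Do not name an emotion yet.")}],
        image_path,
    )

    # Stage 2 (hierarchical reasoning): reason coarse-to-fine,
    # category -> specific emotion -> calibrated intensity,
    # conditioned only on the evidence produced in stage 1.
    answer = query_vlm(
        [{"role": "user",
          "content": (f"Grounded evidence:\n{evidence}\n\n"
                      "Step 1: choose the coarse emotion category "
                      "(positive / negative / neutral).\n"
                      "Step 2: pick the most specific fitting emotion.\n"
                      "Step 3: rate its intensity on a 1-10 scale, "
                      "justifying the rating with the evidence above.")}],
        image_path,
    )
    return answer
```

Separating evidence gathering from labeling is what gives a scheme like this its calibration benefit in principle: the intensity rating in stage 2 must be justified against the cues enumerated in stage 1 rather than produced in a single free-form pass.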