Cracking the Code of Juxtaposition: Can AI Models Understand the Humorous Contradictions

arXiv cs.CL / 4/16/2026

💬 OpinionIdeas & Deep AnalysisModels & Research

共有:

Key Points

The paper examines how large multimodal (vision-language) models handle human humor that relies on juxtaposition and contradictory, nonlinear narrative cues.
It introduces the YesBut benchmark using two-panel comics designed to create humorous contradictions, with tasks spanning from literal interpretation to deeper narrative reasoning.
Experiments across multiple state-of-the-art commercial and open-source large vision-language models find that current systems still trail human performance on these humor/juxtaposition tasks.
The study provides diagnostic insights into specific limitations in AI’s ability to model narrative interplay in creative human expressions and suggests avenues for improving such reasoning.

Abstract

Recent advancements in large multimodal language models have demonstrated remarkable proficiency across a wide range of tasks. Yet, these models still struggle with understanding the nuances of human humor through juxtaposition, particularly when it involves nonlinear narratives that underpin many jokes and humor cues. This paper investigates this challenge by focusing on comics with contradictory narratives, where each comic consists of two panels that create a humorous contradiction. We introduce the YesBut benchmark, which comprises tasks of varying difficulty aimed at assessing AI's capabilities in recognizing and interpreting these comics, ranging from literal content comprehension to deep narrative reasoning. Through extensive experimentation and analysis of recent commercial or open-sourced large (vision) language models, we assess their capability to comprehend the complex interplay of the narrative humor inherent in these comics. Our results show that even state-of-the-art models still lag behind human performance on this task. Our findings offer insights into the current limitations and potential improvements for AI in understanding human creative expressions.

"The AI Agent's Guide to Sustainable Income: From Zero to Profitability"

Dev.to

"The Hidden Economics of AI Agents: Survival Strategies in Competitive Markets"

Dev.to

Big Tech firms are accelerating AI investments and integration, while regulators and companies focus on safety and responsible adoption.

Dev.to

"The Hidden Costs of AI Agent Deployment: A CFO's Guide to True ROI in Enterpris

Dev.to

"The Real Cost of AI Compute: Why Token Efficiency Separates Viable Agents from

Dev.to

Cracking the Code of Juxtaposition: Can AI Models Understand the Humorous Contradictions

Key Points

Abstract

Related Articles

"The AI Agent's Guide to Sustainable Income: From Zero to Profitability"

"The Hidden Economics of AI Agents: Survival Strategies in Competitive Markets"

Big Tech firms are accelerating AI investments and integration, while regulators and companies focus on safety and responsible adoption.

"The Hidden Costs of AI Agent Deployment: A CFO's Guide to True ROI in Enterpris

"The Real Cost of AI Compute: Why Token Efficiency Separates Viable Agents from

関連おすすめサービス

Notta搭載AI議事録イヤホン ZENCHORD1

AI搭載ボイスレコーダー Plaud

画像高画質化AIツール Aiarty Image Enhancer