Words at Play: Benchmarking Audio Pun Understanding in Large Audio-Language Models
arXiv cs.CL / 3/20/2026
Key Points
- APUN-Bench is introduced as the first benchmark specifically designed to assess large audio-language models (LALMs) on understanding spoken puns.
- The benchmark includes 4,434 audio samples annotated for pun recognition, pun location, and pun meaning inference.
- The paper evaluates 10 state-of-the-art LALMs and finds substantial gaps in recognizing, localizing, and interpreting audio puns.
- It identifies challenges such as positional biases in pun location and errors in meaning inference, offering actionable guidance for advancing humor-aware audio intelligence.
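To make the three annotation tasks concrete, here is a minimal sketch of how a benchmark sample and a recognition-accuracy score might be structured. All field and function names are hypothetical illustrations; the paper's actual data schema and evaluation code are not described in this summary.

```python
from dataclasses import dataclass

# Hypothetical schema for an APUN-Bench-style sample covering the
# three annotated tasks: recognition, location, and meaning inference.
@dataclass
class PunSample:
    audio_path: str        # path to the spoken audio clip
    contains_pun: bool     # task 1: pun recognition (yes/no)
    pun_word_index: int    # task 2: pun location (word position; -1 if no pun)
    pun_meaning: str       # task 3: the intended double meaning ("" if no pun)

def recognition_accuracy(preds: list[bool], samples: list[PunSample]) -> float:
    """Fraction of samples where the model's yes/no pun judgment matches the label."""
    correct = sum(p == s.contains_pun for p, s in zip(preds, samples))
    return correct / len(samples)

# Made-up samples and predictions, purely for illustration.
samples = [
    PunSample("clip_001.wav", True, 3, "bass: fish vs. low sound"),
    PunSample("clip_002.wav", False, -1, ""),
]
preds = [True, True]  # model says "pun" for both
print(recognition_accuracy(preds, samples))  # → 0.5
```

A location metric would compare predicted and gold `pun_word_index` values, where positional biases like those the paper reports would show up as systematic over-prediction of certain word positions.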