Cards Against LLMs: Benchmarking Humor Alignment in Large Language Models
arXiv cs.AI / 4/13/2026
Key Points
- The paper introduces “Cards Against LLMs,” a benchmark that evaluates humor alignment by having five frontier language models play Cards Against Humanity-style rounds against human preferences.
- Across nearly 9,900 rounds, each model selects the “funniest” of 10 candidate cards; all outperform the random baseline, but their alignment with human judgments remains modest.
- A key finding is that model-to-model agreement is much higher than model-to-human agreement, suggesting that what looks like shared taste may not match human preference well.
- The study argues that systematic position bias and content-based preferences partially explain the misalignment, raising the question of whether the models’ humor judgments reflect genuine preference or artifacts of the inference and alignment process.
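The gap between model-model and model-human agreement can be illustrated with a small simulation (not the paper's code; all names and probabilities here are hypothetical). If several models share a systematic position bias, say favoring the first candidate, their picks will agree with each other far more often than with a human judge, even when none of them tracks human preference better than chance:

```python
import random
from itertools import combinations

def agreement(a, b):
    """Fraction of rounds where two judges picked the same candidate index."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

random.seed(0)
n_rounds, n_cards = 9900, 10

# Hypothetical human judge: picks uniformly among the 10 candidates.
human = [random.randrange(n_cards) for _ in range(n_rounds)]

# Hypothetical models sharing a position bias: each picks candidate 0
# 40% of the time, otherwise chooses uniformly at random.
def biased_model():
    return [0 if random.random() < 0.4 else random.randrange(n_cards)
            for _ in range(n_rounds)]

models = {name: biased_model() for name in ("model_a", "model_b", "model_c")}

for name, picks in models.items():
    print(f"{name} vs human: {agreement(picks, human):.3f}")
for a, b in combinations(models, 2):
    print(f"{a} vs {b}: {agreement(models[a], models[b]):.3f}")
print(f"random baseline: {1 / n_cards:.3f}")
```

In this setup, model-human agreement stays near the 1/10 baseline while model-model agreement is roughly twice as high, which is the kind of "shared taste" the paper cautions may not reflect human preference at all.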