SNEAK: Evaluating Strategic Communication and Information Leakage in Large Language Models
arXiv cs.CL / 4/1/2026
Key Points
- The paper introduces SNEAK, a new benchmark designed to evaluate strategic communication in LLMs where agents must share information with allies while minimizing leakage to adversaries.
- SNEAK tests selective information sharing: given a category and a set of candidate words, the model must generate a message that signals its knowledge of the secret word without revealing the secret too clearly.
- It uses two simulated agents—an ally (who knows the secret) to assess communication utility and a chameleon (who lacks the secret) to assess adversarial leakage—producing complementary utility and leakage metrics.
- The authors analyze informativeness–secrecy trade-offs in modern language models and conclude that strategic communication under asymmetric information is still difficult for current systems.
- Human participants substantially outperform the evaluated models, reaching up to four times higher scores, highlighting a gap between model behavior and effective secret-aware communication.
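The two-agent protocol described above can be sketched as a minimal evaluation loop. Everything here is an illustrative assumption, not the paper's implementation: the function names (`run_sneak_round`, `evaluate`), the binary per-round scoring, and the toy agents are invented for clarity. Utility is scored by an ally who knows the secret; leakage by a chameleon who only sees the candidates.

```python
from typing import Callable, List, Tuple

# Type aliases for the three roles (hypothetical interfaces, not from the paper):
# speaker(secret, candidates) -> message
# ally(message, secret, candidates) -> bool   (does the message show knowledge of the secret?)
# chameleon(message, candidates) -> str       (adversary's guess of the secret)
Speaker = Callable[[str, List[str]], str]
Ally = Callable[[str, str, List[str]], bool]
Chameleon = Callable[[str, List[str]], str]

def run_sneak_round(secret: str, candidates: List[str],
                    speaker: Speaker, ally: Ally,
                    chameleon: Chameleon) -> Tuple[float, float]:
    """One round: speaker emits a message; ally scores utility, chameleon scores leakage."""
    message = speaker(secret, candidates)
    utility = 1.0 if ally(message, secret, candidates) else 0.0
    leakage = 1.0 if chameleon(message, candidates) == secret else 0.0
    return utility, leakage

def evaluate(rounds: List[Tuple[str, List[str]]],
             speaker: Speaker, ally: Ally,
             chameleon: Chameleon) -> Tuple[float, float]:
    """Average utility and leakage over many (secret, candidates) rounds."""
    total_u = total_l = 0.0
    for secret, candidates in rounds:
        u, l = run_sneak_round(secret, candidates, speaker, ally, chameleon)
        total_u += u
        total_l += l
    n = len(rounds)
    return total_u / n, total_l / n

# Toy deterministic agents, purely for demonstration. This speaker is
# maximally leaky: it reveals the secret's first letter outright.
def toy_speaker(secret: str, candidates: List[str]) -> str:
    return f"It starts with '{secret[0]}'."

def toy_ally(message: str, secret: str, candidates: List[str]) -> bool:
    # Ally knows the secret and checks the message is consistent with it.
    return secret[0] in message

def toy_chameleon(message: str, candidates: List[str]) -> str:
    # Chameleon picks the first candidate consistent with the leaked letter.
    for c in candidates:
        if f"'{c[0]}'" in message:
            return c
    return candidates[0]
```

With these toy agents the speaker scores perfect utility but also perfect leakage, which is exactly the trade-off the benchmark is designed to expose: a good strategic communicator must keep utility high while driving the chameleon's guess rate toward chance.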