BeSafe-Bench: Unveiling Behavioral Safety Risks of Situated Agents in Functional Environments
arXiv cs.AI / 3/30/2026
Key Points
- The paper introduces BeSafe-Bench (BSB), a new benchmark designed to uncover behavioral safety risks of situated agents operating in functional (high-fidelity) environments rather than low-fidelity simulations or narrow tasks.
- BSB covers four domains (Web, Mobile, Embodied VLM, and Embodied VLA) and augments task instructions with nine categories of safety-critical risks.
- It uses a hybrid evaluation approach that combines rule-based checks with LLM-as-a-judge reasoning to assess the actual impact of agents' actions on the environment.
- Testing 13 popular agents shows a concerning pattern: even the best agents complete fewer than 40% of tasks while fully satisfying safety constraints, and high task success often coincides with severe safety violations.
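To make the hybrid evaluation and the "task completed while fully safe" figure more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the paper's implementation: the `Episode` fields, the example rule, the keyword-based `stub_judge` (a stand-in for a real LLM judge), and the way the two signals are combined are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Episode:
    """One agent run (hypothetical schema, not taken from the paper)."""
    task_completed: bool   # did the agent finish the assigned task?
    final_state: dict      # environment state after the agent acted
    trajectory: str        # textual log of the agent's actions


# A rule inspects the final state and returns True if it detects a violation.
Rule = Callable[[dict], bool]
# A judge reads the trajectory and returns True if it deems the run unsafe.
Judge = Callable[[str], bool]


def is_safe(ep: Episode, rules: List[Rule], judge: Judge) -> bool:
    """Assumed combination: a run is safe only if no rule fires and the judge agrees."""
    if any(rule(ep.final_state) for rule in rules):
        return False
    return not judge(ep.trajectory)


def task_success_rate(episodes: List[Episode]) -> float:
    """Fraction of runs that completed the task, ignoring safety."""
    if not episodes:
        return 0.0
    return sum(ep.task_completed for ep in episodes) / len(episodes)


def safe_success_rate(episodes: List[Episode], rules: List[Rule], judge: Judge) -> float:
    """Fraction of runs that completed the task AND were judged safe."""
    if not episodes:
        return 0.0
    ok = sum(1 for ep in episodes if ep.task_completed and is_safe(ep, rules, judge))
    return ok / len(episodes)


def stub_judge(trajectory: str) -> bool:
    """Keyword stand-in for an LLM-as-a-judge call (assumption for illustration)."""
    return "password" in trajectory.lower()


# Hypothetical rule: deleting any file counts as a violation.
rules: List[Rule] = [lambda state: state.get("files_deleted", 0) > 0]

# Made-up episodes, purely to exercise the functions.
episodes = [
    Episode(True, {"files_deleted": 0}, "opened settings; saved form"),
    Episode(True, {"files_deleted": 3}, "ran cleanup script"),
    Episode(True, {"files_deleted": 0}, "shared password with third party"),
    Episode(False, {"files_deleted": 0}, "gave up after timeout"),
]

print(task_success_rate(episodes))                      # 0.75
print(safe_success_rate(episodes, rules, stub_judge))   # 0.25
```

In this toy data the agent completes three of four tasks, yet only one run is both complete and safe, which mirrors the kind of gap between task success and safe task success that the key points describe.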