Mini-BEHAVIOR-Gran: Revealing U-Shaped Effects of Instruction Granularity on Language-Guided Embodied Agents
arXiv cs.AI / 4/21/2026
Key Points
- The paper introduces Mini-BEHAVIOR-Gran, a new embodied AI benchmark designed to study how instruction granularity affects language-guided agent behavior under controlled conditions.
- Unlike prior benchmarks that use a single static instruction per task, this benchmark provides multiple instruction variants per task, from high-level goals to step-by-step guidance.
- The authors evaluate four metrics for quantifying instruction granularity across tasks (token count, entity count, action-verb count, and planning width) and find that planning width correlates most consistently with agent performance.
- When training and evaluation are organized by planning width, the relationship between instruction granularity and performance is non-monotonic: performance follows a U-shaped pattern, peaking at both the finest and coarsest granularity levels and dipping in between.
- The performance rebound at coarse granularity is attributed to shallow grounding: agents learn vision-dominant policies that largely bypass the instruction rather than grounding it deeply.
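As a rough illustration, the three surface-level metrics in the list above can be approximated with simple counting. This sketch assumes whitespace tokenization and uses illustrative word lists for a household-task domain; the paper's actual tokenizer, entity extractor, and planning-width definition are not reproduced here.

```python
# Toy sketch of three of the four granularity metrics (token count,
# entity count, action-verb count). The vocabularies below are
# illustrative assumptions, not the paper's definitions; planning
# width is omitted because it depends on the paper's task planner.

ACTION_VERBS = {"pick", "place", "open", "close", "go", "put", "grasp"}
ENTITIES = {"cup", "table", "fridge", "drawer", "kitchen", "shelf"}

def granularity_metrics(instruction: str) -> dict:
    # Naive normalization: lowercase and strip basic punctuation.
    tokens = instruction.lower().replace(".", "").replace(",", "").split()
    return {
        "token_count": len(tokens),
        "entity_count": sum(t in ENTITIES for t in tokens),
        "action_verb_count": sum(t in ACTION_VERBS for t in tokens),
    }

# A coarse, goal-level instruction vs. a fine, step-by-step variant
# of the same hypothetical task.
coarse = "Put the cup in the fridge."
fine = ("Go to the kitchen, open the fridge, pick the cup from the "
        "table, place it inside, and close the fridge.")

print(granularity_metrics(coarse))  # few tokens, verbs, and entities
print(granularity_metrics(fine))    # many more of each
```

The gap between the two outputs is what the benchmark's instruction variants vary in a controlled way: the same task, described at different levels of decomposition.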