Long-Horizon Plan Execution in Large Tool Spaces through Entropy-Guided Branching

arXiv cs.AI / 4/15/2026

Key Points

  • The paper introduces SLATE (Synthetic Large-scale API Toolkit for E-commerce), a context-aware benchmark for evaluating tool-augmented LLM agents under large tool libraries and long-horizon multi-step tasks.
  • It argues that existing evaluations and static metrics miss important behaviors, showing that agents often fail to self-correct and search inefficiently across valid execution trajectories.
  • Based on these findings, the authors propose Entropy-Guided Branching (EGB), an uncertainty-aware search algorithm that expands decision branches where predictive uncertainty (entropy) is high.
  • Experiments on SLATE indicate EGB improves both task success rates and computational efficiency by optimizing the exploration–exploitation trade-off in tool-rich environments.
  • Overall, the work aims to provide evaluation and algorithmic infrastructure for building more reliable, scalable LLM agents that can plan and execute with extensive external APIs.
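
To make the entropy idea concrete, here is a minimal sketch of how predictive entropy could control branching at a single decision step. It is an illustration under stated assumptions, not the authors' implementation: the tool names, the `threshold` and `max_branches` parameters, and the rule "expand several candidates when entropy exceeds a threshold, otherwise follow the top choice" are all hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a discrete probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def choose_branches(tool_probs, threshold=0.5, max_branches=3):
    """Pick which tools to expand at one decision step.

    tool_probs maps hypothetical tool names to the model's probability
    of calling that tool next. threshold and max_branches are
    illustrative parameters, not values taken from the paper.
    """
    ranked = sorted(tool_probs, key=tool_probs.get, reverse=True)
    h = entropy(tool_probs.values())
    # Low entropy: the model is confident, so follow only the top tool.
    # High entropy: the model is unsure, so keep several candidates alive.
    width = max_branches if h > threshold else 1
    return ranked[:width]


# Hypothetical next-tool distributions for two decision steps.
confident = {"search_products": 0.96, "get_order": 0.02, "refund": 0.02}
uncertain = {"search_products": 0.40, "get_order": 0.35, "refund": 0.25}

print(choose_branches(confident))
print(choose_branches(uncertain))
```

With these inputs the confident step stays on a single path while the uncertain step fans out to three candidates, so extra model calls are spent only where the agent is unsure. That is the exploration-exploitation trade-off the summary describes.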

Abstract

Large Language Models (LLMs) have significantly advanced tool-augmented agents, enabling autonomous reasoning via API interactions. However, executing multi-step tasks within massive tool libraries remains challenging due to two critical bottlenecks: (1) the absence of rigorous, plan-level evaluation frameworks and (2) the computational demand of exploring vast decision spaces stemming from large toolsets and long-horizon planning. To bridge these gaps, we first introduce SLATE (Synthetic Large-scale API Toolkit for E-commerce), a large-scale context-aware benchmark designed for the automated assessment of tool-integrated agents. Unlike static metrics, SLATE accommodates diverse yet functionally valid execution trajectories, revealing that current agents struggle with self-correction and search efficiency. Motivated by these findings, we next propose Entropy-Guided Branching (EGB), an uncertainty-aware search algorithm that dynamically expands decision branches where predictive entropy is high. EGB optimizes the exploration-exploitation trade-off, significantly enhancing both task success rates and computational efficiency. Extensive experiments on SLATE demonstrate that our dual contribution provides a robust foundation for developing reliable and scalable LLM agents in tool-rich environments.
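
The abstract does not say how SLATE accepts "diverse yet functionally valid execution trajectories". One common approach, shown here only as an assumption, is to score an agent by the environment state it reaches rather than by an exact sequence of calls. The toy cart environment and the tool names `add_to_cart` and `apply_coupon` below are hypothetical and are not taken from the benchmark.

```python
def run(trajectory):
    """Apply a list of (tool, argument) calls to a toy cart state."""
    state = {"cart": [], "coupon": None}
    for tool, arg in trajectory:
        if tool == "add_to_cart":
            state["cart"].append(arg)
        elif tool == "apply_coupon":
            state["coupon"] = arg
        else:
            raise ValueError(f"unknown tool: {tool}")
    return state


def reaches_goal(trajectory, goal):
    """State-based check: the order of calls does not matter, only the result."""
    final = run(trajectory)
    same_cart = sorted(final["cart"]) == sorted(goal["cart"])
    return same_cart and final["coupon"] == goal["coupon"]


goal = {"cart": ["mug", "tea"], "coupon": "SAVE10"}

# Two different call orders that end in the same state.
plan_a = [("add_to_cart", "mug"), ("add_to_cart", "tea"), ("apply_coupon", "SAVE10")]
plan_b = [("apply_coupon", "SAVE10"), ("add_to_cart", "tea"), ("add_to_cart", "mug")]

print(plan_a == plan_b)             # exact-sequence match would reject one of them
print(reaches_goal(plan_a, goal))   # state-based check accepts both
print(reaches_goal(plan_b, goal))
```

A static, exact-match metric would count `plan_b` as a failure even though it reaches the requested outcome. That is the kind of behavior the abstract says static metrics miss.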
