LLM-Foraging: Large Language Models for Decentralized Swarm Robot Foraging

arXiv cs.RO / 5/5/2026


Key Points

  • The paper introduces LLM-Foraging, a decentralized swarm robot foraging controller that combines a CPFA state machine with an LLM-based tactical decision-maker at three key decision points.
  • Each robot runs its own LLM client and queries it using only locally observable state, while the existing sensing and motion stack executes the LLM-selected action.
  • Unlike traditional CPFA approaches that require offline parameter optimization via genetic algorithms or reinforcement learning, the proposed method is training-free at deployment and does not require re-optimization when conditions change.
  • Experiments in Gazebo with TurtleBot3 robots across 36 different configurations show that LLM-Foraging gathers more resources and maintains more consistent performance than a GA-tuned CPFA baseline, especially across varying team sizes, arena sizes, and resource distributions.
  • The results suggest the LLM acts as a general decision policy that transfers across configurations where GA-tuned parameters fail to generalize.

Abstract

Swarm foraging algorithms, such as the central-place foraging algorithm (CPFA), typically rely on offline parameter optimization using genetic algorithms (GA) or reinforcement learning, yielding policies tightly coupled to a specific combination of team size, arena size, and resource distribution. When deployment conditions change, performance degrades, and retraining is computationally expensive. We propose LLM-Foraging, a decentralized swarm controller that augments the CPFA state machine with a large language model (LLM) tactical decision-maker at three structured decision points, namely post-deposit, central-zone arrival, and search starvation. Each robot runs its own LLM client and queries it using only locally observable state, while the existing CPFA motion and sensing stack executes the selected action. Because the LLM serves as a general decision policy rather than parameters fitted to a single configuration, the controller is training-free at deployment and transfers across configurations without re-optimization. We evaluate LLM-Foraging in Gazebo with TurtleBot3 robots across 36 configurations spanning team sizes of 4 to 10 robots, arena sizes from 6x6 to 10x10 meters, and three resource distributions (clustered, powerlaw, random). LLM-Foraging collects more resources than the GA-tuned CPFA baseline across the evaluated configurations and is more consistent, a property that the GA's single-configuration tuning does not transfer.
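The architecture the abstract describes — a per-robot LLM client queried with only locally observable state at three structured decision points, with the CPFA stack executing the chosen action — can be sketched as follows. This is a minimal illustration, not the paper's implementation: the decision-point names come from the paper, but the action options, the state fields, and the `StubLLMClient` are all hypothetical stand-ins (a real robot would call an actual LLM API).

```python
import json

# The three structured decision points named in the paper; the action
# options listed for each are illustrative assumptions, not the paper's.
DECISION_POINTS = {
    "post_deposit": ["return_to_last_site", "search_new_area", "follow_pheromone"],
    "central_zone_arrival": ["deposit_and_search", "wait_at_center"],
    "search_starvation": ["widen_search", "return_home", "relocate"],
}

def build_prompt(decision_point, local_state):
    """Serialize only locally observable state into the LLM query."""
    return json.dumps({
        "decision_point": decision_point,
        "options": DECISION_POINTS[decision_point],
        "local_state": local_state,
    })

class StubLLMClient:
    """Hypothetical stand-in for a per-robot LLM client."""
    def choose(self, prompt):
        # A real client would send the prompt to a language model;
        # this stub deterministically picks the first listed option.
        return json.loads(prompt)["options"][0]

def tactical_decision(llm, decision_point, local_state):
    """Query the LLM at a structured decision point. The existing CPFA
    motion and sensing stack (not shown) would execute the returned action."""
    action = llm.choose(build_prompt(decision_point, local_state))
    if action not in DECISION_POINTS[decision_point]:
        # Guard against invalid model output by falling back to a default.
        action = DECISION_POINTS[decision_point][0]
    return action
```

Because the LLM is consulted only at these discrete decision points rather than in the low-level control loop, the expensive query happens a few times per foraging cycle while the CPFA state machine handles continuous motion.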