FitText: Evolving Agent Tool Ecologies via Memetic Retrieval

arXiv cs.AI / 5/5/2026


Key Points

  • FitText addresses a “semantic gap” in agent tool use by updating retrieval dynamically during execution, rather than relying on static retrieval from the initial user query.
  • The training-free framework generates natural-language pseudo-tool descriptions as retrieval probes, iteratively refines them using retrieval feedback, and explores multiple candidates via stochastic generation.
  • It introduces Memetic Retrieval, which applies evolutionary selection pressure over candidate descriptions while using a tool memory to avoid redundant search.
  • Reported results: on ToolRet (43k tools, 4 domains), average retrieval rank improves from 8.81 to 2.78 (lower is better); on StableToolBench (16,464 APIs), the average pass rate reaches 0.73, a 24-point absolute improvement over static retrieval.
  • The approach generalizes across base models that can act as strong semantic operators, but under weaker base models the evolutionary search may amplify noise, indicating model capacity requirements for effective exploration.
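The loop described in the key points (generate a probe, refine it with retrieval feedback, keep the fittest candidates, skip tools already in memory) can be illustrated with a toy sketch. This is not the authors' implementation. The tool catalogue, the Jaccard-overlap retriever, and the word-borrowing refine step are all hypothetical stand-ins: a real system would use an embedding retriever and an LLM to rewrite pseudo-tool descriptions.

```python
import random

# Hypothetical toy catalogue (tool name -> documentation); not from the paper.
TOOLS = {
    "geo_lookup": "resolve a city name to latitude and longitude coordinates",
    "weather_now": "current temperature and conditions for latitude longitude coordinates",
    "fx_rate": "currency exchange rate between two currency codes",
    "flight_search": "search airline flights between two airports on a date",
}

def tokens(text):
    return set(text.lower().split())

def retrieve(probe, memory=frozenset()):
    """Rank unseen tools by Jaccard token overlap with the probe.
    Assumption: stands in for the embedding retriever a real system would use."""
    q = tokens(probe)
    scored = []
    for name, doc in TOOLS.items():
        if name in memory:
            continue  # tool memory: skip tools that were already surfaced
        d = tokens(doc)
        scored.append((len(q & d) / len(q | d), name))
    return sorted(scored, key=lambda pair: (-pair[0], pair[1]))

def fitness(probe, memory):
    """Selection signal: best retrieval score among tools not yet in memory."""
    hits = retrieve(probe, memory)
    return hits[0][0] if hits else 0.0

def refine(probe, memory, rng):
    """Assumption: stands in for an LLM rewriting the probe from retrieval
    feedback. Here it borrows one random word from the top hit's documentation."""
    hits = retrieve(probe, memory)
    if not hits:
        return probe
    feedback_words = TOOLS[hits[0][1]].split()
    return probe + " " + rng.choice(feedback_words)

def memetic_retrieve(query, memory, generations=3, offspring=4, survivors=2, seed=0):
    rng = random.Random(seed)
    population = [query]
    for _ in range(generations):
        # Stochastic exploration: several refined variants per surviving probe.
        children = [refine(p, memory, rng) for p in population for _ in range(offspring)]
        # Selection pressure: parents compete with children, the fittest survive.
        pool = population + children
        pool.sort(key=lambda p: -fitness(p, memory))
        population = pool[:survivors]
    best = population[0]
    hits = retrieve(best, memory)
    tool = hits[0][1] if hits else None
    if tool is not None:
        memory.add(tool)  # remember the tool so later searches look elsewhere
    return best, tool

memory = set()
probe, tool = memetic_retrieve("what is the weather temperature in Paris", memory)
print(tool, "|", probe)
```

Because parents stay in the selection pool, the best probe never scores worse than the original query in this sketch. The noise-amplification failure noted for weaker base models corresponds to a refine step that drifts away from the task, which a word-overlap toy like this cannot reproduce.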

Abstract

A semantic gap separates how users describe tasks from how tools are documented. As API ecosystems scale to tens of thousands of endpoints, static retrieval from the initial query alone cannot bridge this gap: the agent's understanding of what it needs evolves during execution, but its tool set does not. We introduce FitText, a training-free framework that makes retrieval dynamic by embedding it directly in the agent's reasoning loop. FitText generates natural-language pseudo-tool descriptions as retrieval probes, refines them iteratively using retrieval feedback, and explores diverse alternatives through stochastic generation. Memetic Retrieval adds evolutionary selection pressure over candidate descriptions, guided by a tool memory that avoids redundant search. On ToolRet (43k tools, 4 domains), FitText improves average retrieval rank from 8.81 to 2.78; on StableToolBench (16,464 APIs), it achieves a 0.73 average pass rate, a 24-point absolute gain over static query retrieval. The gains transfer across base models capable of acting as competent semantic operators; under weaker base models, Memetic Retrieval's evolutionary search inverts, amplifying noise rather than refining signal, which surfaces model capacity as a prerequisite for evolutionary tool exploration.
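To make the two headline numbers concrete, here is a small sketch of how they could be read. It assumes "average retrieval rank" means the mean 1-based position of the correct tool in each ranked list, and that "24-point" means 24 percentage points on the 0 to 1 pass-rate scale. Neither definition is spelled out in this summary, so both are assumptions, and the example rankings are invented for illustration.

```python
def average_rank(rankings, gold_tools):
    """Mean 1-based position of the gold tool in each ranked list (lower is better).
    Assumed definition; the paper may compute the metric differently."""
    ranks = [ranking.index(gold) + 1 for ranking, gold in zip(rankings, gold_tools)]
    return sum(ranks) / len(ranks)

# Invented example: the gold tool "a" is ranked 1st in one query and 2nd in another.
example = average_rank([["a", "b", "c"], ["b", "a", "c"]], ["a", "a"])

# Static-retrieval pass rate implied by the reported figures, under the
# percentage-point reading of "24-point".
implied_static_pass_rate = round(0.73 - 0.24, 2)

print(example, implied_static_pass_rate)
```

Under that reading, the static baseline would sit near 0.49; the summary itself does not state the baseline value.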