FlowBot: Inducing LLM Workflows with Bilevel Optimization and Textual Gradients

arXiv cs.LG / 4/30/2026


Key Points

  • The paper proposes FlowBot, an automated method to induce and optimize LLM workflows instead of relying on manually designed pipelines and prompts.
  • Workflow induction is cast as a bilevel optimization problem, with an outer loop learning the workflow’s high-level call structure and an inner loop optimizing each LLM call sequentially.
  • The optimization uses “textual gradients,” including modular, layer-by-layer backpropagation of textual signals to improve individual workflow components.
  • Experiments show FlowBot-discovered workflows perform competitively versus baselines that use human-crafted or automatically generated workflow designs.

Abstract

LLM workflows, which coordinate structured calls to individual LLMs (each augmented with varying instructions and tools) to achieve a particular goal, offer a promising path towards extending the capabilities of LLMs and building powerful systems that can tackle diverse tasks. However, existing approaches for building such workflows generally rely on human-crafted pipelines and prompts, which presents a substantial bottleneck in real-world deployment. How can we automatically induce and optimize such workflows in a data-driven way? This paper describes a simple data-driven approach for automatically inducing LLM workflows. We formulate workflow induction as a bilevel optimization problem: an outer loop optimizes a high-level sketch of the workflow (in particular, how the LLM calls should be structured), and an inner loop optimizes each individual LLM call one-by-one. Both loops are optimized with "textual gradients," where for the inner loop we optimize each component in a modular way by "backpropagating" textual gradients layer-by-layer. We find that LLM workflows discovered through our FlowBot (work**flow** induction through **b**ilevel **o**ptimization and **t**extual gradients) approach perform competitively against strong baselines that use human-crafted or automatically generated workflows.
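The bilevel structure described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation; all function names (`textual_gradient`, `inner_loop`, `outer_loop`) and the simplified feedback handling are hypothetical, and the full method derives per-layer feedback rather than reusing one global signal.

```python
from typing import Callable, List

# An "LLM" here is just any prompt-to-text function, so the sketch is runnable
# with a mock model in place of a real API call.
LLM = Callable[[str], str]

def textual_gradient(llm: LLM, component_prompt: str, feedback: str) -> str:
    """Ask the LLM to revise one component's prompt given textual feedback.

    This plays the role of a 'gradient step' in natural language.
    """
    return llm(
        "Revise the prompt below using the feedback.\n"
        f"Prompt: {component_prompt}\nFeedback: {feedback}"
    )

def inner_loop(llm: LLM, workflow: List[str], feedback: str) -> List[str]:
    """Backpropagate textual gradients layer-by-layer, last call to first.

    Simplification: the same feedback signal is reused for every layer,
    whereas the actual method would derive earlier-layer feedback from
    later layers' revisions.
    """
    revised = list(workflow)
    for i in reversed(range(len(revised))):
        revised[i] = textual_gradient(llm, revised[i], feedback)
    return revised

def outer_loop(llm: LLM, sketch: str,
               score: Callable[[str], float], steps: int) -> str:
    """Propose edits to the high-level workflow sketch; keep improvements."""
    best, best_score = sketch, score(sketch)
    for _ in range(steps):
        candidate = llm(f"Propose an improved workflow sketch for: {best}")
        candidate_score = score(candidate)
        if candidate_score > best_score:
            best, best_score = candidate, candidate_score
    return best
```

In this toy form, the outer loop is a simple hill climb over sketches scored on validation data, and the inner loop is one reverse pass of prompt revisions; the paper's version replaces both with LLM-driven optimization over real task feedback.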