Adaptive Prompt Structure Factorization: A Framework for Self-Discovering and Optimizing Compositional Prompt Programs

arXiv cs.CL / 4/9/2026


Key Points

  • The paper introduces Adaptive Prompt Structure Factorization (aPSF), an API-only framework that automatically discovers task-specific prompt structures by decomposing prompts into semantic factors using an “Architect” model.
  • It performs interventional, single-factor updates by scoring each factor’s marginal contribution through validation-performance changes, improving controllability and clearer credit assignment versus monolithic prompt editing.
  • aPSF uses error-guided factor selection to target the dominant current failure source, making optimization more sample- and token-efficient.
  • Experiments on multiple advanced reasoning benchmarks show aPSF outperforms strong baselines, improving average accuracy by up to +2.16 percentage points and cutting optimization token cost by 45–87% on MultiArith while reaching peak validation accuracy in a single step.
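To make the "interventional, single-factor update" idea concrete, here is a minimal sketch of factor-level scoring under stated assumptions: the prompt is a list of semantic factors, and a factor's marginal contribution is the change in validation accuracy when only that factor is swapped. All names (`Factor`, `assemble`, `score_factor_update`, `evaluate`) are illustrative placeholders, not the paper's actual API.

```python
from dataclasses import dataclass

@dataclass
class Factor:
    """One semantic factor of a prompt (e.g. task statement, output format)."""
    name: str
    text: str

def assemble(factors):
    """Join the factors into a single monolithic prompt string."""
    return "\n\n".join(f.text for f in factors)

def score_factor_update(factors, idx, new_text, evaluate):
    """Swap exactly one factor and return the marginal validation gain.

    `evaluate` is assumed to map an assembled prompt to a validation
    accuracy in [0, 1]; only factor `idx` differs between the two calls,
    so the score isolates that factor's contribution.
    """
    baseline = evaluate(assemble(factors))
    trial = list(factors)
    trial[idx] = Factor(trial[idx].name, new_text)
    return evaluate(assemble(trial)) - baseline
```

Because only one factor changes per evaluation, credit assignment is unambiguous: any accuracy delta is attributable to that single edit, which is the contrast with monolithic prompt rewriting drawn above.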

Abstract

Automated prompt optimization is crucial for eliciting reliable reasoning from large language models (LLMs), yet most API-only prompt optimizers iteratively edit monolithic prompts, coupling components and obscuring credit assignment, limiting controllability, and wasting tokens. We propose Adaptive Prompt Structure Factorization (aPSF), an API-only framework (prompt-in/text-out; no access to model internals) that uses an Architect model to discover task-specific prompt structures as semantic factors. aPSF then performs interventional, single-factor updates: interventional factor-level scoring estimates each factor's marginal contribution via validation-performance changes, and error-guided factor selection routes updates to the current dominant failure source for more sample-efficient optimization. Across multiple advanced reasoning benchmarks, aPSF outperforms strong baselines including principle-aware optimizers, improving accuracy by up to +2.16 percentage points on average, and reduces optimization token cost by 45--87% on MultiArith while reaching peak validation in one step.
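The abstract's second mechanism, error-guided factor selection, routes the next update to the factor currently responsible for most failures. A minimal sketch of that routing step, assuming each failed validation example has already been attributed to a factor name by some upstream analysis (the attribution itself is not shown and the function names are hypothetical):

```python
from collections import Counter

def select_target_factor(failure_attributions):
    """Pick the factor blamed for the most current failures.

    `failure_attributions` is assumed to be a list of factor names, one per
    failed validation example. Returns None when there are no failures,
    i.e. nothing left to optimize.
    """
    if not failure_attributions:
        return None
    counts = Counter(failure_attributions)
    # most_common(1) yields [(name, count)]; take the top factor name.
    return counts.most_common(1)[0][0]
```

Concentrating each update on the dominant failure source, rather than rewriting the whole prompt, is what makes the procedure sample- and token-efficient in the sense claimed above: one factor edit per optimization step, targeted where it matters most.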