JTPRO: A Joint Tool-Prompt Reflective Optimization Framework for Language Agents

arXiv cs.AI / 4/23/2026


Key Points

  • The paper argues that LLM agents with many external, domain-specific tools often fail due to generic “one-size-fits-all” prompts and tool schemas that are underspecified for when to use each tool and how to format arguments.
  • It proposes JTPRO (Joint Tool-Prompt Reflective Optimization), which uses rollout-driven reflection in trace-supervised settings to jointly optimize both global agent instructions and per-tool schema/argument descriptions.
  • The framework aims to keep only tool-local cues needed for correct disambiguation and slot/value filling, improving reliability even in large tool inventories.
  • Experiments on multi-tool benchmarks measure Tool Selection Accuracy (TSA), Slot Filling Accuracy (SFA), and Overall Success Rate (OSR), with JTPRO outperforming strong baselines and reflective optimizers like GEPA by 5%–20% (relative) on OSR.
  • Ablation results indicate that jointly optimizing instructions and tool schemas is more effective and robust than optimizing either component alone.
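The rollout-driven reflective loop described in the key points can be sketched as a simple iterate-reflect-patch cycle. This is an illustrative reconstruction, not the paper's actual algorithm: the helper names (`run_rollouts`, `reflect_and_update`), the toy word-overlap agent, and the string-appending critic are all assumptions made for the example; in the real setting both the agent and the reflector would be LLM calls.

```python
from dataclasses import dataclass

@dataclass
class AgentConfig:
    instructions: str   # global agent prompt
    tool_schemas: dict  # tool name -> natural-language description

def run_rollouts(config, tasks, agent_fn):
    """Run the agent on each task; record whether the gold tool was selected."""
    return [(t, agent_fn(config, t)["selected_tool"] == t["gold_tool"]) for t in tasks]

def reflect_and_update(config, rollouts, critic_fn):
    """Jointly revise the instructions and per-tool descriptions from failures."""
    failures = [t for t, ok in rollouts if not ok]
    if not failures:
        return config
    new_instructions, patches = critic_fn(config, failures)
    return AgentConfig(new_instructions, {**config.tool_schemas, **patches})

# --- Toy stand-ins for the LLM agent and the LLM reflector ---
def toy_agent(config, task):
    # Pick the tool whose description overlaps most with the query words.
    q = set(task["query"].lower().split())
    return {"selected_tool": max(
        config.tool_schemas,
        key=lambda t: len(q & set(config.tool_schemas[t].lower().split())))}

def toy_critic(config, failures):
    # For each failure, add a disambiguating cue to the gold tool's description.
    patches = {}
    for task in failures:
        gold = task["gold_tool"]
        base = patches.get(gold, config.tool_schemas[gold])
        patches[gold] = base + " Use for: " + task["query"]
    return config.instructions, patches

tasks = [
    {"query": "weather forecast for Paris", "gold_tool": "weather_api"},
    {"query": "convert 10 USD to EUR", "gold_tool": "currency_api"},
]
config = AgentConfig("You are a tool-calling agent.",
                     {"weather_api": "A tool.", "currency_api": "A tool."})

before = sum(ok for _, ok in run_rollouts(config, tasks, toy_agent))
config = reflect_and_update(config, run_rollouts(config, tasks, toy_agent), toy_critic)
after = sum(ok for _, ok in run_rollouts(config, tasks, toy_agent))
print(before, after)
```

In this toy run, reflection patches only the tool description that caused the failure (a "tool-local cue"), and the second rollout succeeds on both tasks; JTPRO's point is that such patches and the global instructions are optimized jointly rather than one at a time.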

Abstract

Large language model (LLM) agents augmented with external tools often struggle as the number of tools grows large and domain-specific. In such settings, ambiguous tool descriptions and under-specified agent instructions frequently lead to tool mis-selection and incorrect slot/value instantiation. We hypothesize two root causes: generic, one-size-fits-all prompts that ignore tool-specific nuances, and underspecified tool schemas that lack clear guidance on when and how to use each tool and how to format its parameters. We introduce Joint Tool-Prompt Reflective Optimization (JTPRO), a framework for improving tool-calling reliability in trace-supervised settings that iteratively uses rollout-driven reflection to co-optimize global instructions and per-tool schema/argument descriptions for accurate tool selection and argument instantiation in large tool inventories. JTPRO is designed to preserve only the tool-local cues needed for correct disambiguation and slot filling. We evaluate JTPRO across multi-tool benchmarks with varying numbers of tools, using three metrics: Tool Selection Accuracy (TSA), Slot Filling Accuracy (SFA), and Overall Success Rate (OSR; correct tool + correct slots + correct values). JTPRO consistently outperforms strong baselines, including CoT-style agents and reflective prompt optimizers such as GEPA, by 5%-20% (relative) on OSR. Ablations show that joint optimization of instructions and tool schemas is more effective and robust than optimizing either component in isolation.
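As a concrete reading of the three metrics, a minimal scorer might look like the following. The exact definitions in the paper may differ (for instance, whether SFA is conditioned on having selected the correct tool); the function below is a hedged sketch under that assumption, with hypothetical data structures.

```python
def evaluate(predictions, references):
    """Toy TSA / SFA / OSR scorer over per-example tool calls.

    Each item is {"tool": name, "args": {slot: value}}. Here SFA requires the
    correct tool plus the correct slot *names*; OSR additionally requires the
    correct slot values (correct tool + correct slots + correct values).
    """
    n = len(references)
    tsa = sfa = osr = 0
    for pred, ref in zip(predictions, references):
        tool_ok = pred["tool"] == ref["tool"]
        slots_ok = tool_ok and set(pred["args"]) == set(ref["args"])
        values_ok = slots_ok and pred["args"] == ref["args"]
        tsa += tool_ok
        sfa += slots_ok
        osr += values_ok
    return tsa / n, sfa / n, osr / n

gold = [{"tool": "weather_api", "args": {"city": "Paris"}}] * 3
preds = [
    {"tool": "weather_api", "args": {"city": "Paris"}},  # fully correct
    {"tool": "weather_api", "args": {"city": "Lyon"}},   # right tool and slot, wrong value
    {"tool": "search", "args": {"city": "Paris"}},       # wrong tool
]
tsa, sfa, osr = evaluate(preds, gold)
```

On this toy data the three metrics decompose the failure modes cleanly: TSA = 2/3, SFA = 2/3, OSR = 1/3, matching the abstract's framing of OSR as the strictest, fully-compounded criterion.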