Don't Start What You Can't Finish: A Counterfactual Audit of Support-State Triage in LLM Agents

arXiv cs.AI / April 21, 2026


Key Points

  • The paper argues that many LLM agent evaluations overlook the ability to diagnose *why* a task is blocked before acting, focusing instead on execution outcomes for fully specified tasks or studying related behaviors (clarification, abstention, capability awareness) in isolation.
  • It introduces the Support-State Triage Audit (SSTA-32), a matched-item counterfactual diagnostic framework that flips the same base request across four support states: Complete, Clarifiable, Support-Blocked, and Unsupported-Now.
  • Testing a frontier model under four prompting conditions shows that default “Direct” execution overcommits on non-complete tasks (41.7% overcommitment rate), while scalar confidence mapping reduces overcommitment but fails to distinguish among the three deferral types.
  • In contrast, both “Action-Only” and a typed Preflight Support Check (PSC) reach 91.7% typed deferral accuracy by making the categorical decision ontology explicit in the prompt.
  • Ablations show that removing the support-sufficiency dimension selectively degrades REQUEST SUPPORT accuracy, while removing the evidence-sufficiency dimension increases overcommitment on unsupported items; the authors note that DPAA’s single-context-window design means the results are upper-bound capability estimates.
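The two headline metrics above can be made concrete with a small sketch. This is not the paper’s released code; it is a minimal illustration assuming each audit item is a (support state, chosen action) pair, with “overcommitment” meaning the agent answered a non-Complete item and “typed deferral accuracy” meaning it chose the correct deferral type on non-Complete items.

```python
from enum import Enum

class Action(Enum):
    ANSWER = "answer"
    CLARIFY = "clarify"
    REQUEST_SUPPORT = "request_support"
    ABSTAIN = "abstain"

# Hypothetical gold mapping, following the paper's four-way ontology:
# each support state has exactly one correct action.
GOLD = {
    "Complete": Action.ANSWER,
    "Clarifiable": Action.CLARIFY,
    "Support-Blocked": Action.REQUEST_SUPPORT,
    "Unsupported-Now": Action.ABSTAIN,
}

def overcommitment_rate(items):
    """Fraction of non-Complete items the agent answered anyway."""
    deferrals = [(s, a) for s, a in items if s != "Complete"]
    if not deferrals:
        return 0.0
    return sum(a is Action.ANSWER for _, a in deferrals) / len(deferrals)

def typed_deferral_accuracy(items):
    """Fraction of non-Complete items where the agent picked the
    correct deferral type (CLARIFY vs REQUEST SUPPORT vs ABSTAIN)."""
    deferrals = [(s, a) for s, a in items if s != "Complete"]
    if not deferrals:
        return 0.0
    return sum(a is GOLD[s] for s, a in deferrals) / len(deferrals)
```

On this framing, the Direct condition’s 41.7% overcommitment and Action-Only/PSC’s 91.7% typed deferral accuracy are just these two ratios computed over the audit set.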

Abstract

Current agent evaluations largely reward execution on fully specified tasks, while recent work studies clarification [11, 22, 2], capability awareness [9, 1], abstention [8, 14], and search termination [20, 5] mostly in isolation. This leaves open whether agents can diagnose why a task is blocked before acting. We introduce the Support-State Triage Audit (SSTA-32), a matched-item diagnostic framework in which minimal counterfactual edits flip the same base request across four support states: Complete (ANSWER), Clarifiable (CLARIFY), Support-Blocked (REQUEST SUPPORT), and Unsupported-Now (ABSTAIN). We evaluate a frontier model under four prompting conditions - Direct, Action-Only, Confidence-Only, and a typed Preflight Support Check (PSC) - using Dual-Persona Auto-Auditing (DPAA) with deterministic heuristic scoring. Default execution overcommits heavily on non-complete tasks (41.7% overcommitment rate). Scalar confidence mapping avoids overcommitment but collapses the three-way deferral space (58.3% typed deferral accuracy). Conversely, both Action-Only and PSC achieve 91.7% typed deferral accuracy by surfacing the categorical ontology in the prompt. Targeted ablations confirm that removing the support-sufficiency dimension selectively degrades REQUEST SUPPORT accuracy, while removing the evidence-sufficiency dimension triggers systematic overcommitment on unsupported items. Because DPAA operates within a single context window, these results represent upper-bound capability estimates; nonetheless, the structural findings indicate that frontier models possess strong latent triage capabilities that require explicit categorical decision paths to activate safely.
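The abstract’s “minimal counterfactual edits” idea can be sketched as follows. The concrete edit texts here are invented for illustration (the paper’s actual items are not reproduced); the point is the structure: one base request, four minimally edited variants, one per support state.

```python
# Hypothetical matched-item construction: the same base request is
# minimally edited so that exactly one support state holds per variant.
def matched_item_family(base_task: str) -> dict[str, str]:
    return {
        # All inputs and tools available: correct action is ANSWER.
        "Complete": f"{base_task} The relevant file is attached.",
        # Key parameter left ambiguous: correct action is CLARIFY.
        "Clarifiable": f"{base_task} (The time period is left unspecified.)",
        # A required tool is missing: correct action is REQUEST SUPPORT.
        "Support-Blocked": f"{base_task} No file-access tool is provided.",
        # The needed input does not exist yet: correct action is ABSTAIN.
        "Unsupported-Now": f"{base_task} The required data is not yet available.",
    }
```

Because the four variants share a base request, differences in the agent’s behavior across them can be attributed to the support-state edit rather than to task content, which is what makes the audit “matched-item.”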