Natural-Language Agent Harnesses

arXiv cs.CL / 3/27/2026



Key Points

The paper argues that agent performance depends heavily on “harness engineering,” but existing harness designs are often embedded in controller code and runtime-specific conventions that hinder transfer and scientific study.
It proposes Natural-Language Agent Harnesses (NLAHs) to express an agent harness’s high-level control logic as editable natural language.
It introduces an Intelligent Harness Runtime (IHR) that executes these harnesses via explicit contracts, durable artifacts, and lightweight adapters to improve portability.
The authors run controlled evaluations on coding and computer-use benchmarks, testing operational viability, module ablations, and migrating harness logic from code to text.

Abstract

Agent performance increasingly depends on harness engineering, yet harness design is usually buried in controller code and runtime-specific conventions, making it hard to transfer, compare, and study as a scientific object. We ask whether the high-level control logic of an agent harness can instead be externalized as a portable executable artifact. We introduce Natural-Language Agent Harnesses (NLAHs), which express harness behavior in editable natural language, and Intelligent Harness Runtime (IHR), a shared runtime that executes these harnesses through explicit contracts, durable artifacts, and lightweight adapters. Across coding and computer-use benchmarks, we conduct controlled evaluations of operational viability, module ablation, and code-to-text harness migration.
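
To make the idea concrete, here is a minimal Python sketch of what "control logic as editable text, executed through contracts, artifacts, and adapters" could look like. It is an illustration only, not the paper's IHR: the harness text format, the Contract class, and the parse_harness and run_harness functions are all hypothetical names invented for this example, and plain Python functions stand in for the model or tool calls a real runtime would make.

```python
import json
import tempfile
from dataclasses import dataclass
from pathlib import Path
from typing import Callable, Dict, List

# Hypothetical harness: control logic kept as editable natural-language text.
HARNESS_TEXT = """\
1. plan: write a short plan for the task.
2. act: carry out the plan with the available tool.
3. verify: check the result and report pass or fail.
"""


@dataclass
class Contract:
    """Explicit contract (assumed form): keys a step's output must contain."""
    required_keys: List[str]

    def check(self, output: Dict[str, str]) -> None:
        missing = [k for k in self.required_keys if k not in output]
        if missing:
            raise ValueError(f"contract violated, missing keys: {missing}")


def parse_harness(text: str) -> List[str]:
    """Extract step names from lines shaped like '1. name: instruction'."""
    steps = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        head = line.split(":", 1)[0]            # e.g. '1. plan'
        steps.append(head.split(".", 1)[1].strip())
    return steps


Adapter = Callable[[Dict[str, str]], Dict[str, str]]


def run_harness(text: str, adapters: Dict[str, Adapter],
                contracts: Dict[str, Contract], workdir: Path) -> Dict[str, str]:
    """Run each step in order, enforce its contract, and persist its output."""
    state: Dict[str, str] = {}
    for index, step in enumerate(parse_harness(text), start=1):
        output = adapters[step](state)          # adapter stands in for a model/tool call
        contracts[step].check(output)           # fail fast on a broken contract
        artifact = workdir / f"{index:02d}_{step}.json"
        artifact.write_text(json.dumps(output))  # durable artifact on disk
        state.update(output)
    return state


def demo() -> Dict[str, str]:
    """Run the toy harness with stub adapters and report what was produced."""
    adapters: Dict[str, Adapter] = {
        "plan": lambda s: {"plan": "add two numbers"},
        "act": lambda s: {"result": str(2 + 3)},
        "verify": lambda s: {"status": "pass" if s["result"] == "5" else "fail"},
    }
    contracts = {
        "plan": Contract(["plan"]),
        "act": Contract(["result"]),
        "verify": Contract(["status"]),
    }
    with tempfile.TemporaryDirectory() as tmp:
        final_state = run_harness(HARNESS_TEXT, adapters, contracts, Path(tmp))
        artifact_names = sorted(p.name for p in Path(tmp).iterdir())
    return {"status": final_state["status"], "artifacts": ",".join(artifact_names)}


if __name__ == "__main__":
    print(demo())
```

In this toy version, editing HARNESS_TEXT (for example, reordering or removing a step) changes the control flow without touching the runtime code, which is the portability property the abstract describes. How the actual IHR interprets harness text, defines contracts, or stores artifacts is not specified in this summary.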