Formal Architecture Descriptors as Navigation Primitives for AI Coding Agents

arXiv cs.AI / 4/16/2026

Key Points

The paper tests whether formal software architecture descriptors can reduce AI coding agents’ undirected codebase exploration, finding a 33–44% reduction in navigation steps in a controlled experiment.
No significant difference in navigation steps was detected across descriptor formats (S-expression, JSON, YAML, Markdown) in the measured setting. In a separate experiment, an automatically generated descriptor reached 100% localization accuracy versus 80% for blind exploration.
Across 7,012 Claude Code sessions, the authors report a 52% reduction in agent behavioral variance when architecture context is provided, suggesting more consistent agent behavior.
Writer-side experiments highlight a key robustness tradeoff: JSON fails atomically, YAML silently corrupts 50% of errors, and S-expressions detect all structural completeness errors.
The authors propose “intent.lisp” (an S-expression architecture descriptor) and release an open-source “Forge” toolkit to support this approach.
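
To make the idea of a descriptor as a navigation aid concrete, here is a minimal Python sketch. It assumes a made-up S-expression schema (`system`, `module`, `path`, `role`); the summary does not show the actual intent.lisp format or the Forge API, so every name below is hypothetical and for illustration only.

```python
import re

# HYPOTHETICAL descriptor: this schema is invented for illustration and is
# not the intent.lisp format from the paper.
DESCRIPTOR = """
(system shop
  (module auth (path "src/auth") (role "login and session handling"))
  (module billing (path "src/billing") (role "invoices and payments")))
"""

def tokenize(text):
    """Split text into parentheses, quoted strings, and bare symbols."""
    return re.findall(r'\(|\)|"[^"]*"|[^\s()"]+', text)

def parse(tokens):
    """Parse one expression, consuming tokens from the front of the list."""
    token = tokens.pop(0)
    if token == "(":
        items = []
        while tokens and tokens[0] != ")":
            items.append(parse(tokens))
        if not tokens:
            raise ValueError("unbalanced descriptor: missing ')'")
        tokens.pop(0)  # drop the closing ')'
        return items
    if token == ")":
        raise ValueError("unexpected ')'")
    return token.strip('"')

def module_paths(tree):
    """Map each module name to its declared path."""
    paths = {}
    for node in tree[2:]:
        if isinstance(node, list) and node[0] == "module":
            fields = {child[0]: child[1] for child in node[2:] if isinstance(child, list)}
            paths[node[1]] = fields["path"]
    return paths

tree = parse(tokenize(DESCRIPTOR))
paths = module_paths(tree)
print(paths["billing"])
```

With a map like this, an agent asked to change invoicing logic could open `src/billing` directly instead of listing and searching directories first, which is the kind of undirected exploration the paper aims to reduce.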

Abstract

AI coding agents spend a substantial fraction of their tool calls on undirected codebase exploration. We investigate whether providing agents with formal architecture descriptors can reduce this navigational overhead. We present three complementary studies. First, a controlled experiment (24 code localization tasks x 4 conditions, Claude Sonnet 4.6, temperature=0) demonstrates that architecture context reduces navigation steps by 33-44% (Wilcoxon p=0.009, Cohen's d=0.92), with no significant format difference detected across S-expression, JSON, YAML, and Markdown. Second, an artifact-vs-process experiment (15 tasks x 3 conditions) demonstrates that an automatically generated descriptor achieves 100% accuracy versus 80% blind (p=0.002, d=1.04), proving direct navigational value independent of developer self-clarification. Third, an observational field study across 7,012 Claude Code sessions shows 52% reduction in agent behavioral variance. A writer-side experiment (96 generation runs, 96 error injections) reveals critical failure mode differences: JSON fails atomically, YAML silently corrupts 50% of errors, S-expressions detect all structural completeness errors. We propose intent.lisp, an S-expression architecture descriptor, and open-source the Forge toolkit.
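
The writer-side failure modes can be illustrated with a small Python sketch. This is one reading of the abstract's claims, not the authors' experiment: it assumes truncated output as the injected error, and the two sample documents and helper names are hypothetical. YAML is left out because parsing it needs a third-party library; a YAML document truncated at a line boundary will often still parse, which is consistent with the silent corruption the abstract reports.

```python
import json

# Hypothetical sample documents describing the same two modules.
JSON_DOC = '{"modules": [{"name": "auth", "path": "src/auth"}, {"name": "billing", "path": "src/billing"}]}'
SEXP_DOC = '(system (module auth (path "src/auth")) (module billing (path "src/billing")))'

def json_status(text):
    """Return 'ok' if the text parses as JSON, otherwise 'rejected'."""
    try:
        json.loads(text)
        return "ok"
    except json.JSONDecodeError:
        return "rejected"

def unclosed_parens(text):
    """Count '(' that are never closed, ignoring parentheses inside strings."""
    depth = 0
    in_string = False
    for ch in text:
        if ch == '"':
            in_string = not in_string
        elif not in_string:
            if ch == "(":
                depth += 1
            elif ch == ")":
                depth -= 1
                if depth < 0:
                    raise ValueError("unexpected ')'")
    return depth

# Simulate a writer that stopped partway through its output.
truncated_json = JSON_DOC[:60]
truncated_sexp = SEXP_DOC[:45]

print(json_status(JSON_DOC), json_status(truncated_json))
print(unclosed_parens(SEXP_DOC), unclosed_parens(truncated_sexp) > 0)
```

The truncated JSON is rejected as a whole, so nothing can be recovered from it (the atomic failure), while a nonzero count of unclosed parentheses flags the S-expression as structurally incomplete before any of its content is trusted.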
