Agent Factories for High Level Synthesis: How Far Can General-Purpose Coding Agents Go in Hardware Optimization?

arXiv cs.AI / 3/27/2026


Key Points

  • The paper presents an empirical study showing that general-purpose coding agents, without hardware-specific training, can optimize hardware designs from high-level algorithm specifications using HLS toolchains.
  • It introduces an “agent factory” two-stage pipeline: decomposing designs into sub-kernels and using an ILP to assemble promising global configurations, then launching multiple expert agents to explore cross-function optimizations like pragma recombination, loop fusion, and memory restructuring.
  • Experiments on 12 HLS kernels (from HLS-Eval and Rodinia-HLS) using Claude Code (Opus 4.5/4.6) with AMD Vitis HLS show strong scaling: increasing from 1 to 10 agents yields an average 8.27× speedup.
  • Harder benchmarks see especially large gains, with streamcluster exceeding 20× and kmeans reaching around 10×, and the best results sometimes emerge from non–top-ranked ILP candidates.
  • The authors conclude that scaling agent populations is a practical and effective lever for HLS optimization and that agents can rediscover known hardware optimization patterns without domain-specific training.

Abstract

We present an empirical study of how far general-purpose coding agents, without hardware-specific training, can optimize hardware designs from high-level algorithmic specifications. We introduce an agent factory, a two-stage pipeline that constructs and coordinates multiple autonomous optimization agents. In Stage 1, the pipeline decomposes a design into sub-kernels, independently optimizes each using pragma and code-level transformations, and formulates an Integer Linear Program (ILP) to assemble globally promising configurations under an area constraint. In Stage 2, it launches N expert agents over the top ILP solutions, each exploring cross-function optimizations such as pragma recombination, loop fusion, and memory restructuring that are not captured by sub-kernel decomposition. We evaluate the approach on 12 kernels from HLS-Eval and Rodinia-HLS using Claude Code (Opus 4.5/4.6) with AMD Vitis HLS. Scaling from 1 to 10 agents yields a mean 8.27× speedup over baseline, with larger gains on harder benchmarks: streamcluster exceeds 20× and kmeans reaches approximately 10×. Across benchmarks, agents consistently rediscover known hardware optimization patterns without domain-specific training, and the best designs often do not originate from top-ranked ILP candidates, indicating that global optimization exposes improvements missed by sub-kernel search. These results establish agent scaling as a practical and effective axis for HLS optimization.
