AdaExplore: Failure-Driven Adaptation and Diversity-Preserving Search for Efficient Kernel Generation

arXiv cs.CL / 4/21/2026

Key Points

AdaExplore is a new agent framework for efficient kernel code generation that leverages accumulated execution feedback for self-improvement at test time.
It combines failure-driven adaptation, which turns recurring execution failures into reusable validity rules that keep later generations feasible, with diversity-preserving search, which optimizes performance.
Rather than treating each problem instance independently, the framework keeps a memory of validity constraints derived from past failures. This reduces its reliance on naive generation followed by local refinement, which is unreliable for constrained DSLs such as Triton.
AdaExplore performs tree-based exploration that alternates between small local refinements and larger structural regeneration, improving search coverage beyond local optima.
Experiments on kernel runtime optimization benchmarks show substantial runtime speedups (3.12x on KernelBench Level-2 and 1.72x on Level-3) within 100 steps, with continued gains given more computation.
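
The idea of turning recurring failures into reusable validity rules can be illustrated with a minimal Python sketch. It is not the paper's implementation: the `RuleMemory` class, the `min_occurrences` threshold, the failure signatures, and the rule wording are all hypothetical, chosen only to show a failure being promoted to a rule once it recurs.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class RuleMemory:
    """Hypothetical store that promotes recurring failures to validity rules."""

    min_occurrences: int = 2  # assumed threshold for calling a failure "recurring"
    failure_counts: Counter = field(default_factory=Counter)
    rules: list[str] = field(default_factory=list)

    def record_failure(self, signature: str, rule_text: str) -> None:
        """Count a failure signature and promote it to a rule once it recurs."""
        self.failure_counts[signature] += 1
        recurring = self.failure_counts[signature] >= self.min_occurrences
        if recurring and rule_text not in self.rules:
            self.rules.append(rule_text)

    def as_prompt_prefix(self) -> str:
        """Render stored rules as text to prepend to later generation prompts."""
        if not self.rules:
            return ""
        lines = [f"- {rule}" for rule in self.rules]
        return "Validity rules learned from past failures:\n" + "\n".join(lines)


# Illustrative failure signatures and rule texts (not taken from the paper).
memory = RuleMemory()
memory.record_failure("non_pow2_arange", "Use power-of-two sizes in tl.arange.")
memory.record_failure("non_pow2_arange", "Use power-of-two sizes in tl.arange.")
memory.record_failure("one_off_error", "This failure appeared only once.")
print(memory.as_prompt_prefix())
```

In this sketch only the failure seen twice becomes a rule; the one-off failure is counted but not promoted, which is one simple way to read "recurring".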

Abstract

Recent large language model (LLM) agents have shown promise in using execution feedback for test-time adaptation. However, robust self-improvement remains far from solved: most approaches still treat each problem instance independently, without accumulating reusable knowledge. This limitation is particularly pronounced in domain-specific languages such as Triton, which are underrepresented in LLM pretraining data. Their strict constraints and non-linear optimization landscape further make naive generation and local refinement unreliable. We propose AdaExplore, an agent framework that enables self-improvement via accumulated execution feedback for performance-critical kernel code generation through two complementary stages: failure-driven adaptation and diversity-preserving search, jointly improving correctness and optimization performance without additional fine-tuning or external knowledge. In the adaptation stage, the agent synthesizes tasks and converts recurring failures into a reusable memory of validity rules, helping subsequent generations remain within the feasible set. In the search stage, the agent organizes candidate kernels as a tree and alternates between small local refinements and larger structural regeneration, allowing it to explore the optimization landscape beyond local optima. Experiments on kernel runtime optimization benchmarks validate these gains: AdaExplore achieves 3.12x and 1.72x speedups on KernelBench Level-2 and Level-3, respectively, within 100 steps, and continues to improve with additional computation.
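
The search stage described in the abstract can be sketched on a toy problem. The following Python example is an illustration under stated assumptions, not the authors' algorithm: the fixed `regen_every` schedule, the choice of which node to expand, and the one-dimensional stand-in for a kernel (a single number with a made-up runtime function) are all hypothetical. It only shows how alternating small refinements with larger regenerations lets a tree of candidates move past a local optimum.

```python
import random
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass(eq=False)
class Node:
    """One candidate in the search tree (a number stands in for a kernel)."""

    candidate: float
    runtime: float
    parent: Optional["Node"] = None
    children: list["Node"] = field(default_factory=list)


def tree_search(
    evaluate: Callable[[float], float],
    refine: Callable[[float, random.Random], float],
    regenerate: Callable[[float, random.Random], float],
    root_candidate: float,
    steps: int = 100,
    regen_every: int = 4,  # assumed schedule: every 4th step is a regeneration
    seed: int = 0,
) -> tuple[Node, list[Node]]:
    rng = random.Random(seed)
    root = Node(root_candidate, evaluate(root_candidate))
    nodes = [root]
    best = root
    for step in range(1, steps + 1):
        if step % regen_every == 0:
            # Larger structural change, branching from a random node for diversity.
            parent = rng.choice(nodes)
            candidate = regenerate(parent.candidate, rng)
        else:
            # Small local refinement of the best candidate found so far.
            parent = best
            candidate = refine(parent.candidate, rng)
        child = Node(candidate, evaluate(candidate), parent)
        parent.children.append(child)
        nodes.append(child)
        if child.runtime < best.runtime:
            best = child
    return best, nodes


def toy_runtime(x: float) -> float:
    """Made-up runtime with a local optimum at x = -2 and a better one at x = 3."""
    return min((x + 2) ** 2 + 2.0, (x - 3) ** 2 + 1.0)


best, nodes = tree_search(
    evaluate=toy_runtime,
    refine=lambda x, rng: x + rng.uniform(-0.1, 0.1),
    regenerate=lambda x, rng: rng.uniform(-5.0, 5.0),
    root_candidate=-2.0,  # start exactly at the local optimum
)
print(f"best candidate {best.candidate:.3f}, runtime {best.runtime:.3f}")
```

Starting at the local optimum, refinement alone can only produce candidates near `x = -2`; the regeneration steps are what give the search a chance to reach the other basin. A real system would replace the toy functions with LLM-driven kernel edits and measured runtimes.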
