Can LLMs Beat Classical Hyperparameter Optimization Algorithms? A Study on autoresearch
arXiv cs.LG / 3/27/2026
💬 Opinion · Ideas & Deep Analysis · Tools & Practical Usage · Models & Research
Key Points
- The paper introduces “autoresearch,” an LLM-agent approach that optimizes hyperparameters by directly editing training source code in an unconstrained search space, and uses it as a testbed against classical HPO methods.
- Under a fixed, constrained hyperparameter search space, classical algorithms like CMA-ES and TPE consistently outperform LLM-based agents for tuning a small language model.
- In the unconstrained setting, LLM-based code editing substantially narrows the performance gap, and the study finds that avoiding out-of-memory failures is more important than maximizing search diversity.
- The authors argue that small/mid-sized LLMs struggle to maintain optimization state across trials, while classical HPO methods lack domain knowledge, motivating a hybrid solution.
- They propose “Centaur,” which shares CMA-ES internal state (mean vector, step size, covariance matrix) with an LLM. The best results come from a 0.8B Centaur variant, and scaling the LLM to 27B shows no advantage for fixed-space methods with the tested open-weight models.
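The mechanism behind the hybrid idea above — an evolution strategy whose internal state (mean, step size) is serialized so an LLM can condition on it between trials — can be sketched in a few lines. This is a toy illustration, not the paper's Centaur: it uses a simplified (mu, lambda)-ES without a covariance matrix, a made-up two-parameter objective standing in for validation loss, and a hypothetical `format_state_for_llm` prompt format that the paper does not specify.

```python
import random

random.seed(0)

def objective(log_lr, batch_exp):
    # Toy stand-in for validation loss: minimized near lr=1e-3, batch=2^5.
    return (log_lr + 3.0) ** 2 + 0.1 * (batch_exp - 5.0) ** 2

def es_step(mean, sigma, n_samples=8):
    """One simplified (mu, lambda)-ES step over a 2-D hyperparameter space.

    Unlike CMA-ES, this toy tracks only a mean vector and a global step
    size, omitting the covariance matrix and its adaptation rules.
    """
    samples = [
        [m + sigma * random.gauss(0.0, 1.0) for m in mean]
        for _ in range(n_samples)
    ]
    samples.sort(key=lambda s: objective(*s))          # rank by loss
    elite = samples[: n_samples // 4]                  # keep best quarter
    new_mean = [sum(col) / len(elite) for col in zip(*elite)]
    return new_mean, sigma * 0.9                       # decay step size

def format_state_for_llm(mean, sigma):
    """Serialize optimizer state into a prompt fragment for an LLM.

    Hypothetical format: the point is that the LLM sees where the search
    currently is, instead of rediscovering it from scratch each trial.
    """
    return (
        f"Current search mean: log10(lr)={mean[0]:.3f}, "
        f"log2(batch)={mean[1]:.3f}. Step size sigma={sigma:.3f}. "
        f"Propose the next trial near this region."
    )

mean, sigma = [0.0, 3.0], 1.0
for _ in range(20):
    mean, sigma = es_step(mean, sigma)

print(format_state_for_llm(mean, sigma))
```

In a real system the printed state summary would be injected into the LLM's context, and the model's proposed trial would be fed back via the optimizer's ask/tell loop; the sketch only shows the state-sharing half of that exchange.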