[R] Is autoresearch really better than classic hyperparameter tuning?

Reddit r/MachineLearning / 4/3/2026


Key Points

  • The post compares autoresearch against classic hyperparameter tuning using Optuna, finding that autoresearch converges faster and is more cost-efficient across budgets.
  • Experiments on NanoChat used Claude to define Optuna’s search space so that priors align between the two methods, with both approaches run three times.
  • Even though LLM tokens can cost as much as GPU compute, giving autoresearch a higher per-step cost, it still outperforms Optuna on overall cost-effectiveness.
  • The best solutions from autoresearch also generalize better, and the performance gap widens when those solutions receive additional training time.
  • A key reason suggested is that autoresearch expands search from “knobs” within a 16-parameter space into broader code-space changes as iterations increase.

We did experiments comparing Optuna & autoresearch.
Autoresearch converges faster, is more cost-efficient, and even generalizes better.

  • Experiments were done on NanoChat: we let Claude define Optuna’s search space so that priors were aligned between the two methods. Both optimization methods were run three times; autoresearch is far more sample-efficient on average.
  • In the 5-minute training setting, LLM tokens cost as much as GPU compute, yet despite a roughly 2× higher per-step cost, autoresearch still comes out ahead across all cost budgets.
  • What’s more, the solution found by autoresearch generalizes better than Optuna’s: when we gave the best solutions more training time, the absolute score gap widened and the statistical significance grew stronger.
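The cost-budget comparison described above can be sketched as ranking methods by their best score so far at equal cumulative cost. A minimal, illustrative helper (the function names and data are hypothetical, not from the post):

```python
# Hypothetical sketch: compare optimization methods at equal dollar budgets
# by tracking best-score-so-far as a function of cumulative cost.

def best_score_vs_cost(step_costs, step_scores):
    """Return (cumulative_cost, running_best_score) pairs for one run."""
    curve, total, best = [], 0.0, float("-inf")
    for cost, score in zip(step_costs, step_scores):
        total += cost           # accumulate per-step cost (GPU + LLM tokens)
        best = max(best, score) # track the best score seen so far
        curve.append((total, best))
    return curve

def score_at_budget(curve, budget):
    """Best score achievable within a given cost budget, or None."""
    best = None
    for total, score in curve:
        if total <= budget:
            best = score
    return best
```

With curves like these for both methods, "ahead across all cost budgets" means autoresearch's `score_at_budget` dominates Optuna's at every budget, even if its steps cost 2× more.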

  • An important contributor to autoresearch’s capability is that it searches directly in code space. In the early stages, it tunes knobs within Optuna’s 16-parameter search space; with more iterations, it starts to explore broader code changes.

submitted by /u/Educational_Strain_3