Evolving Deep Learning Optimizers [R]

Reddit r/MachineLearning / 5/3/2026


Key Points

  • The article presents a genetic algorithm framework that automatically discovers deep learning optimization algorithms by evolving optimizer “genomes.”
  • Optimizers are encoded as combinations of primitive update terms (e.g., gradient, momentum, RMS normalization, Adam-style adaptive terms, and sign-based updates) plus hyperparameters and scheduling choices.
  • In an evolutionary run of 50 generations with a population of 50 individuals, evaluated across multiple vision tasks, the evolved optimizer outperformed Adam by 2.6% in aggregate fitness and achieved a 7.7% relative improvement on CIFAR-10.
  • The best evolved optimizer blends sign-based gradient updates with adaptive moment estimation, uses lower momentum coefficients than Adam, disables bias correction, and employs learning-rate warmup with cosine decay.
  • The results suggest evolutionary search can yield competitive optimizers and surface design principles that differ from manually engineered ones.

We present a genetic algorithm framework for automatically discovering deep learning optimization algorithms.

Our approach encodes optimizers as genomes that specify combinations of primitive update terms (gradient, momentum, RMS normalization, Adam-style adaptive terms, and sign-based updates) along with hyperparameters and scheduling options.
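To make the encoding concrete, here is a minimal sketch of what such a genome might look like. The field names, value ranges, and the use of per-term weights are assumptions for illustration only; the post does not describe the actual genome layout.

```python
import random
from dataclasses import dataclass, field

# Primitive update terms named in the post; the identifiers are hypothetical.
PRIMITIVES = ("gradient", "momentum", "rms_norm", "adam", "sign")


@dataclass
class OptimizerGenome:
    # Assumed encoding: one weight per primitive term, 0.0 meaning "term off".
    term_weights: dict = field(default_factory=lambda: {p: 0.0 for p in PRIMITIVES})
    learning_rate: float = 1e-3
    beta1: float = 0.9
    beta2: float = 0.999
    bias_correction: bool = True
    warmup: bool = False
    cosine_decay: bool = False


def random_genome(rng):
    """Sample a genome uniformly from assumed (not source-given) ranges."""
    return OptimizerGenome(
        term_weights={p: rng.choice([0.0, rng.random()]) for p in PRIMITIVES},
        learning_rate=10 ** rng.uniform(-4, -2),  # log-uniform learning rate
        beta1=rng.uniform(0.8, 0.99),
        beta2=rng.uniform(0.9, 0.999),
        bias_correction=rng.random() < 0.5,
        warmup=rng.random() < 0.5,
        cosine_decay=rng.random() < 0.5,
    )
```

For example, `random_genome(random.Random(0))` yields one candidate optimizer description that a search procedure could then evaluate and mutate.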

Through evolutionary search over 50 generations with a population of 50 individuals, evaluated across multiple vision tasks, we discover an evolved optimizer that outperforms Adam by 2.6% in aggregate fitness and achieves a 7.7% relative improvement on CIFAR-10.
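The search loop itself can be sketched as a generic genetic algorithm. Only the generation count and population size come from the post; tournament selection, uniform crossover, Gaussian mutation, elitism, and the flat list-of-floats genome are assumptions, and `toy_fitness` is a placeholder for the expensive step of training models on the vision tasks and aggregating their scores.

```python
import random


def toy_fitness(genome):
    # Placeholder only: a real run would train networks with the encoded
    # optimizer and aggregate task scores. Here, genes near 0.5 score best.
    return -sum((g - 0.5) ** 2 for g in genome)


def mutate(genome, rng, rate=0.2, scale=0.1):
    # Perturb each gene with probability `rate`, clamped to [0, 1].
    return [
        min(1.0, max(0.0, g + rng.gauss(0.0, scale))) if rng.random() < rate else g
        for g in genome
    ]


def crossover(a, b, rng):
    # Uniform crossover: each gene comes from either parent with equal chance.
    return [x if rng.random() < 0.5 else y for x, y in zip(a, b)]


def tournament(population, scores, rng, k=3):
    # Pick k random individuals and return the fittest of them.
    contenders = rng.sample(range(len(population)), k)
    return population[max(contenders, key=lambda i: scores[i])]


def evolve(fitness, generations=50, pop_size=50, genome_len=4, seed=0):
    rng = random.Random(seed)
    population = [[rng.random() for _ in range(genome_len)] for _ in range(pop_size)]
    best_history = []
    for _ in range(generations):
        scores = [fitness(g) for g in population]
        best = population[max(range(pop_size), key=lambda i: scores[i])]
        best_history.append(max(scores))
        children = [best]  # elitism: the current best survives unchanged
        while len(children) < pop_size:
            parent_a = tournament(population, scores, rng)
            parent_b = tournament(population, scores, rng)
            children.append(mutate(crossover(parent_a, parent_b, rng), rng))
        population = children
    scores = [fitness(g) for g in population]
    return population[max(range(pop_size), key=lambda i: scores[i])], best_history
```

Calling `evolve(toy_fitness)` runs 50 generations of 50 individuals and returns the best genome along with the best fitness recorded at each generation. Because of elitism, that recorded fitness never decreases from one generation to the next.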

The evolved optimizer combines sign-based gradient terms with adaptive moment estimation, uses lower momentum coefficients than Adam (β₁ = 0.86, β₂ = 0.94), and notably disables bias correction while enabling learning rate warmup and cosine decay.
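One possible reading of that description is sketched below as a single update step with a warmup-plus-cosine schedule. The momentum coefficients are the ones quoted in the post; the equal blend weights, the linear warmup shape, the `eps` value, and all function names are assumptions, since the post does not give the exact update rule.

```python
import math


def lr_schedule(step, total_steps, base_lr, warmup_steps):
    # Assumed shape: linear warmup to base_lr, then cosine decay to zero.
    if step < warmup_steps:
        return base_lr * (step + 1) / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))


def sign(x):
    return (x > 0) - (x < 0)


def evolved_step(params, grads, state, lr, beta1=0.86, beta2=0.94,
                 sign_weight=0.5, adam_weight=0.5, eps=1e-8):
    """Blend a sign-of-gradient term with an Adam-style term.

    The moment estimates are used directly, with no bias correction.
    The blend weights are hypothetical.
    """
    new_params = []
    for i, (p, g) in enumerate(zip(params, grads)):
        m = beta1 * state["m"][i] + (1 - beta1) * g      # first moment
        v = beta2 * state["v"][i] + (1 - beta2) * g * g  # second moment
        state["m"][i], state["v"][i] = m, v
        update = sign_weight * sign(g) + adam_weight * m / (math.sqrt(v) + eps)
        new_params.append(p - lr * update)
    return new_params


def minimize_quadratic(x0=3.0, total_steps=200, base_lr=0.1, warmup_steps=10):
    # Toy usage: minimize f(x) = x^2, whose gradient is 2x.
    params, state = [x0], {"m": [0.0], "v": [0.0]}
    for step in range(total_steps):
        grads = [2.0 * p for p in params]
        lr = lr_schedule(step, total_steps, base_lr, warmup_steps)
        params = evolved_step(params, grads, state, lr)
    return params[0]
```

The sign term gives a step whose size depends only on the learning rate, while the Adam-style term scales with the ratio of the two moment estimates. Without bias correction that ratio starts out below one, so the Adam-style term is damped during the first steps.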

Our results demonstrate that evolutionary search can discover competitive optimization algorithms and reveal design principles that differ from hand-crafted optimizers.

submitted by /u/EducationalCicada