Introducing AutoMuon, a one-line drop-in for AdamW [P]

Reddit r/MachineLearning / 4/26/2026


Key Points

  • The author introduces AutoMuon, a small Python package that enables using the Muon optimizer as a drop-in replacement for AdamW in arbitrary PyTorch training pipelines.
  • AutoMuon inspects a model at initialization to automatically choose the appropriate optimizer per parameter, since Muon works best for 2D weight matrices while AdamW is still needed for embeddings, norms, biases, and other parameter types.
  • The project is positioned as broadly applicable beyond Transformers and CNNs, but the author cautions it may struggle with fully custom architectures and may require user tuning.
  • The author invites community PRs to handle edge cases (e.g., expanding module-type exclusion lists) and plans to add tests across time series forecasting, genomics, and language modeling to evaluate Muon’s generality.

Hey everyone, I've been working on a small Python package called AutoMuon that makes the Muon optimizer usable as a drop-in replacement for AdamW in arbitrary PyTorch training pipelines.

The core idea is relatively simple: Muon is meant for the 2D weight matrices that act on hidden states (linear projections, conv layers), but you still need AdamW for embeddings, norms, biases, etc. AutoMuon scans your model at init and figures out the right optimizer for each parameter automatically.
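To make the routing idea concrete, here is a rough sketch of what a per-parameter rule could look like. This is not AutoMuon's actual code or API: `ParamInfo`, `pick_optimizer`, `split_params`, and the contents of `EXCLUDED_MODULE_TYPES` are all hypothetical names chosen for illustration, and the rule itself (excluded module types and anything below 2D go to AdamW, the rest to Muon) is an assumption based on the description above.

```python
from dataclasses import dataclass
from typing import Iterable

# Hypothetical exclusion list: module types whose weights stay on AdamW
# even when they are 2D (an embedding table is 2D but is not a hidden-state projection).
EXCLUDED_MODULE_TYPES = {"Embedding", "LayerNorm", "BatchNorm2d"}


@dataclass(frozen=True)
class ParamInfo:
    name: str          # e.g. "block.q_proj.weight"
    module_type: str   # class name of the owning module, e.g. "Linear"
    shape: tuple       # parameter shape


def pick_optimizer(p: ParamInfo) -> str:
    """Return "muon" or "adamw" for one parameter (illustrative rule only)."""
    if p.module_type in EXCLUDED_MODULE_TYPES:
        return "adamw"
    if len(p.shape) < 2:
        # Biases, norm gains, and other vectors or scalars.
        return "adamw"
    return "muon"


def split_params(params: Iterable[ParamInfo]) -> dict:
    """Group parameter names by the optimizer the rule assigns them to."""
    groups = {"muon": [], "adamw": []}
    for p in params:
        groups[pick_optimizer(p)].append(p.name)
    return groups


# Toy model description; names and shapes are made up.
params = [
    ParamInfo("tok_emb.weight", "Embedding", (32000, 512)),
    ParamInfo("block.q_proj.weight", "Linear", (512, 512)),
    ParamInfo("block.q_proj.bias", "Linear", (512,)),
    ParamInfo("block.norm.weight", "LayerNorm", (512,)),
    ParamInfo("stem.conv.weight", "Conv2d", (64, 3, 3, 3)),
]
groups = split_params(params)
```

In a real PyTorch pipeline, records like these could be built by walking `model.named_modules()` and each module's `named_parameters(recurse=False)`, taking `type(module).__name__` and `tuple(param.shape)`. The two resulting groups would then be handed to a Muon implementation and to `torch.optim.AdamW` respectively.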

I am open to PRs, especially for expanding the module-type exclusion list if you hit edge cases in your architecture. Would love to know if anyone tries it on something other than transformers or CNNs and what they find. I suspect it will struggle with fully custom architectures (flash-linear-attention, for instance), so those would require some user tuning.
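As one idea of what "user tuning" might look like for a custom architecture, here is a small sketch of name-pattern overrides layered on top of whatever the automatic choice was. Everything in it is hypothetical: `apply_overrides` is not part of AutoMuon, and the parameter names and patterns are invented for illustration rather than taken from flash-linear-attention.

```python
from fnmatch import fnmatchcase


def apply_overrides(default_choice: str, param_name: str, overrides: dict) -> str:
    """Return the first matching override for param_name, else the default."""
    for pattern, choice in overrides.items():
        if fnmatchcase(param_name, pattern):
            return choice
    return default_choice


# Hypothetical user-supplied glob patterns for a custom attention block.
overrides = {
    "*.gate_proj.weight": "adamw",
    "*.decay": "adamw",
}

# A parameter the automatic rule would have sent to Muon, now forced to AdamW.
choice_a = apply_overrides("muon", "layers.3.attn.gate_proj.weight", overrides)
# A parameter with no matching pattern keeps its automatic choice.
choice_b = apply_overrides("muon", "layers.3.attn.q_proj.weight", overrides)
```

Matching on parameter names rather than module types is just one option; it is handy when a custom layer stores several differently-behaved tensors inside a single module class.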

I am planning to add more tests for time series forecasting, genomics, language modeling, etc. I want to see how generalizable Muon really is!

https://github.com/SkyeGunasekaran/automuon

pip install git+https://github.com/SkyeGunasekaran/automuon.git

submitted by /u/Skye7821