Second-Order, First-Class: A Composable Stack for Curvature-Aware Training
arXiv cs.LG / 3/30/2026
Key Points
- The paper argues that second-order, curvature-aware training methods are underused because existing approaches are hard to implement, brittle to tune, and lack composable APIs.
- It introduces Somax, an Optax-native “composable stack” that packages curvature-aware training into a single JIT-compiled step driven by a static execution plan.
- Somax provides first-class, swappable modules for curvature operators, estimators, linear solvers, preconditioners, and damping policies, and it remains compatible with standard Optax gradient transformations such as momentum, weight decay, and learning-rate schedules.
- By separating planning from execution, Somax reuses intermediate results and reduces per-step overhead compared with unplanned compositions, which recompute the same quantities redundantly.
- Reported ablations show that composition decisions significantly influence scaling behavior and time-to-accuracy, and that the planning mechanism improves efficiency.
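
To make the components named above concrete, here is a minimal sketch of one curvature-aware step in plain JAX. It is not Somax's API, which this summary does not show. The toy regression model and the names `loss_fn`, `hvp`, and `curvature_step` are assumptions made for illustration. The sketch pairs a curvature operator (a Hessian-vector product) with a fixed damping constant and a conjugate-gradient linear solver, all inside one JIT-compiled function.

```python
import jax
import jax.numpy as jnp
from jax.scipy.sparse.linalg import cg


def loss_fn(params, x, y):
    # Hypothetical toy model: linear regression with squared error.
    pred = x @ params["w"] + params["b"]
    return 0.5 * jnp.mean((pred - y) ** 2)


def hvp(params, vec, x, y):
    # Curvature operator: Hessian-vector product (forward-over-reverse),
    # so the Hessian is never materialized.
    grad_fn = lambda p: jax.grad(loss_fn)(p, x, y)
    return jax.jvp(grad_fn, (params,), (vec,))[1]


@jax.jit
def curvature_step(params, x, y, damping, lr):
    grads = jax.grad(loss_fn)(params, x, y)

    def damped_op(v):
        # Damping policy (here a fixed constant): v -> H v + damping * v.
        hv = hvp(params, v, x, y)
        return jax.tree_util.tree_map(lambda h, u: h + damping * u, hv, v)

    # Linear solver: conjugate gradient on (H + damping * I) d = g.
    direction, _ = cg(damped_op, grads, maxiter=20)
    new_params = jax.tree_util.tree_map(
        lambda p, d: p - lr * d, params, direction
    )
    return new_params, direction


# Synthetic data for the demonstration (assumed, not from the paper).
key = jax.random.PRNGKey(0)
x = jax.random.normal(key, (64, 3))
y = x @ jnp.array([1.0, -2.0, 0.5]) + 0.3

params = {"w": jnp.zeros(3), "b": jnp.zeros(1)}
initial_loss = loss_fn(params, x, y)
for _ in range(3):
    params, direction = curvature_step(params, x, y, 1e-3, 1.0)
final_loss = loss_fn(params, x, y)
```

The returned `direction` has the same pytree structure as the gradient, so it could be handed to Optax gradient transformations in place of the raw gradient. That is one plausible reading of how such a stack stays compatible with momentum, weight decay, and learning-rate schedules.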