minAction.net: Energy-First Neural Architecture Design -- From Biological Principles to Systematic Validation

arXiv cs.LG / 4/29/2026


Key Points

  • The study argues that modern ML often ignores intrinsic computational energy costs, and it tests energy-aware learning across 2,203 experiments covering vision, text, neuromorphic, and physiological data.
  • Results show that neural architecture by itself explains almost none of the accuracy variance (partial eta² = 0.001), while the architecture–dataset interaction is large (partial eta² = 0.44, p < 0.001), indicating no universal best architecture across tasks.
  • A lambda sweep validates a single-parameter energy-regularized loss of the form L = L_CE + lambda * E(theta, x): internal activation energy drops to 6% of baseline at moderate lambda with no accuracy degradation on MNIST (a sketch of this objective follows the list).
  • Energy-first architectures inspired by an action-principle framework deliver 5–33% training-efficiency gains within each modality versus conventional baselines.
  • The authors frame learning through a design hypothesis that links the action functional (classical mechanics), free energy (statistical physics), and KL-regularized objectives (variational inference).
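
Below is a minimal PyTorch-style sketch of that single-parameter objective. The specific energy term (mean squared hidden activations), the model class, and the function names are illustrative assumptions, not the paper's exact implementation.

```python
import torch.nn as nn
import torch.nn.functional as F


class EnergyMLP(nn.Module):
    """Small classifier that also reports an internal activation-energy term.

    Defining E(theta, x) as the mean squared hidden activation is an
    assumption made for illustration; the paper's energy functional may differ.
    """

    def __init__(self, in_dim=784, hidden=256, n_classes=10):
        super().__init__()
        self.fc1 = nn.Linear(in_dim, hidden)
        self.fc2 = nn.Linear(hidden, n_classes)

    def forward(self, x):
        h = F.relu(self.fc1(x))
        energy = h.pow(2).mean()  # internal activation energy E(theta, x)
        return self.fc2(h), energy


def energy_regularized_loss(logits, energy, targets, lam):
    """Single-parameter objective: L = L_CE + lambda * E(theta, x)."""
    return F.cross_entropy(logits, targets) + lam * energy
```

A sweep would then train one model per lambda value spanning several orders of magnitude (for example 1e-4 through 1e-1, plus a lambda = 0 baseline) and compare test accuracy and the recorded energy term against the baseline run.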

Abstract

Modern machine learning optimizes for accuracy without explicitly accounting for internal computational cost, even though physical and biological systems operate under intrinsic energy constraints. We evaluate energy-aware learning across 2,203 experiments spanning vision, text, neuromorphic, and physiological datasets, using 10 seeds per configuration and performing a factorial statistical analysis. Three findings emerge. First, architecture alone explains negligible variance in accuracy (partial eta² = 0.001). In contrast, the architecture × dataset interaction is large (partial eta² = 0.44, p < 0.001), demonstrating that optimal architecture depends critically on task modality and rejecting the assumption of a universal best architecture. Second, a controlled lambda-sweep over four orders of magnitude validates a single-parameter energy-regularized objective L = L_CE + lambda * E(theta, x): internal activation energy decreases to 6% of baseline at moderate lambda with no accuracy degradation on MNIST. Third, energy-first architectures inspired by an action-principle framework yield 5–33% within-modality training-efficiency gains over conventional baselines. These results emerge from a research program that interprets learning through a structural correspondence between the action functional in classical mechanics, free energy in statistical physics, and KL-regularized objectives in variational inference. We frame this correspondence as a design hypothesis rather than a derivation.
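
For readers who want the statistic behind the variance claims: partial eta² for an effect in a factorial ANOVA is SS_effect / (SS_effect + SS_error). The sketch below shows how such values could be computed from per-run results with statsmodels; the column names and the two-way design are assumptions for illustration, not the paper's analysis code.

```python
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm


def partial_eta_squared(runs: pd.DataFrame) -> pd.Series:
    """Two-way factorial ANOVA on per-run accuracy, returning partial eta² per effect.

    Expects one row per (architecture, dataset, seed) run with columns
    'arch', 'dataset', and 'accuracy' -- assumed names, not the paper's schema.
    """
    model = smf.ols("accuracy ~ C(arch) * C(dataset)", data=runs).fit()
    table = anova_lm(model, typ=2)                # Type-II sums of squares
    ss_error = table.loc["Residual", "sum_sq"]
    effects = table.drop(index="Residual")
    # partial eta² = SS_effect / (SS_effect + SS_error)
    return effects["sum_sq"] / (effects["sum_sq"] + ss_error)
```

Under this definition, an architecture main effect near 0.001 alongside an architecture × dataset interaction near 0.44 means that which architecture performs best is determined almost entirely by the dataset it is paired with.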