Cost-Penalized Fitness in FMA-Orchestrated Mixture of Experts: Experimental Evidence for Molecular Memory in Domain Adaptation
arXiv cs.LG / 4/2/2026
Key Points
- The paper reports controlled experiments on nanoFMT, a Free-Market Algorithm (FMA)-orchestrated transformer that uses dynamic Mixture-of-Experts (MoE) expert management to handle shifting data distributions when the expert pool is at full capacity.
- It finds that combining cost-penalized fitness with a linear grace period for newly created experts lets the model accumulate domain expertise through diversification rather than frequent expert replacement (see the sketch after this list).
- In a round-trip domain-shift test, the approach recovers 9–11× faster when returning to a previously learned domain, without requiring any expert births or replacements.
- The authors term this behavior a “molecular memory” effect, arguing that dormant experts persist and reactivate when their original domain reappears, in contrast to existing MoE management strategies that replace underused experts.
- A preliminary cost/energy analysis estimates potential annual savings of $39.1M and a 27.1 GWh energy reduction for an OpenAI-scale provider under a moderate scenario.
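The paper's reference implementation is not reproduced here; the following is a minimal sketch of one plausible reading of the mechanism: a fitness score of the form utility minus a cost penalty whose weight ramps in linearly over a grace period, and a management step that parks unfit experts as dormant rather than deleting them. All identifiers (`Expert`, `cost_penalized_fitness`, `grace_steps`, `cost_weight`, `fitness_floor`) are hypothetical and not taken from nanoFMT.

```python
from dataclasses import dataclass


@dataclass
class Expert:
    """Hypothetical bookkeeping record for one MoE expert."""
    utility: float = 0.0   # running estimate of how much the expert reduces loss
    cost: float = 1.0      # compute/parameter cost attributed to the expert
    age: int = 0           # optimizer steps since the expert was created
    dormant: bool = False  # low-fitness experts are parked, not deleted


def cost_penalized_fitness(e: Expert, cost_weight: float = 0.1,
                           grace_steps: int = 1000) -> float:
    """Fitness = utility minus a cost penalty that ramps in linearly.

    During the grace period a newly created expert pays only a fraction of
    its cost, so it is not culled before it has had time to specialize.
    """
    ramp = min(1.0, e.age / grace_steps)  # 0 -> 1 over the grace period
    return e.utility - cost_weight * ramp * e.cost


def manage_experts(experts: list[Expert], fitness_floor: float = 0.0) -> None:
    """Park unfit experts as dormant instead of replacing them.

    Dormant experts keep their weights, so if their original domain returns
    they can be reactivated without a fresh expert "birth".
    """
    for e in experts:
        f = cost_penalized_fitness(e)
        if e.dormant and f > fitness_floor:
            e.dormant = False   # domain reappeared: reactivate
        elif not e.dormant and f <= fitness_floor and e.age > 0:
            e.dormant = True    # underused: keep weights, stop routing to it
```

Under this reading, retaining dormant experts rather than replacing them is what would allow a returning domain to recover without new expert births, consistent with the round-trip result summarized above.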