Optimized Architectures for Kolmogorov-Arnold Networks

arXiv stat.ML / 4/22/2026


Key Points

  • The paper proposes architectural strategies that improve Kolmogorov-Arnold networks (KANs) while preserving their interpretability, a property that previous enhancements often compromised through added complexity.
  • It studies an approach that combines overprovisioned architectures with sparsification, deep supervision, and depth selection to produce compact and interpretable KANs without accuracy loss.
  • The method uses differentiable mechanisms optimized end-to-end under a minimum description length (MDL) objective, jointly learning activations, structure, and depth (see the sketch below this list).
  • Experiments across multiple settings (function approximation, dynamical systems forecasting, and real-world prediction) show that sparsification alone is not enough, but adding depth selection yields competitive or better accuracy with much smaller models.
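
To make the MDL objective concrete, here is a minimal sketch, written in PyTorch, of how a deep-supervised fit term, an L1 sparsity penalty on edge coefficients, and a differentiable depth gate might be combined. This is an illustration under stated assumptions, not the paper's implementation; every name (`mdl_objective`, `preds_per_depth`, `edge_coeffs`, `depth_logits`, the `lambda_*` weights) is hypothetical.

```python
import torch
import torch.nn.functional as F

def mdl_objective(preds_per_depth, target, edge_coeffs, depth_logits,
                  lambda_sparsity=1e-3, lambda_depth=1e-2):
    """Hypothetical MDL-style loss (an illustration, not the paper's code).

    preds_per_depth: predictions from deep-supervision heads, one per
    candidate depth. depth_logits: learnable scores over those depths.
    edge_coeffs: per-edge activation coefficients to be sparsified.
    """
    # Differentiable depth selection: a softmax over candidate depths.
    depth_weights = torch.softmax(depth_logits, dim=0)
    # Deep supervision: every intermediate head is trained, weighted by
    # the probability assigned to its depth.
    fit = sum(w * F.mse_loss(p, target)
              for w, p in zip(depth_weights, preds_per_depth))
    # Sparsification: an L1 penalty drives edge coefficients toward zero
    # so unused edges can be pruned from the overprovisioned network.
    sparsity = sum(c.abs().sum() for c in edge_coeffs)
    # Description-length term for depth: the expected number of layers.
    depths = torch.arange(1, len(preds_per_depth) + 1,
                          dtype=depth_weights.dtype)
    depth_cost = (depth_weights * depths).sum()
    return fit + lambda_sparsity * sparsity + lambda_depth * depth_cost
```

All three terms are differentiable, so activations, structure (via sparsity), and depth (via the gate) can be optimized jointly end-to-end, which is the property the paper emphasizes.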

Abstract

Efforts to improve Kolmogorov-Arnold networks (KANs) with architectural enhancements have been stymied by the complexity those enhancements bring, undermining the interpretability that makes KANs attractive in the first place. Here we study overprovisioned architectures combined with sparsification, deep supervision, and depth selection to learn compact, interpretable KANs without sacrificing accuracy. Crucially, we focus on differentiable mechanisms under a principled minimum description length objective, jointly optimizing activations, structure, and depth end-to-end. Experiments across function approximation benchmarks, dynamical systems forecasting, and real-world prediction tasks demonstrate that sparsification alone is insufficient, but the combination with depth selection achieves competitive or superior accuracy while discovering substantially smaller models. The result is a principled path toward models that are both more expressive and more interpretable, addressing a key tension in scientific machine learning.
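
For readers new to KANs: instead of fixed activations at nodes, a KAN places a learnable univariate function on every edge and sums them at each node. Below is a minimal sketch of one such layer, assuming Gaussian basis functions for simplicity rather than the B-splines of standard KAN implementations; the class name `KANLayer` and all parameters are illustrative, not the paper's code.

```python
import torch
import torch.nn as nn

class KANLayer(nn.Module):
    """Simplified KAN layer sketch: each edge (i -> j) applies its own
    learnable univariate function, here a linear combination of Gaussian
    basis functions, and node j sums its incoming edge outputs."""

    def __init__(self, in_dim, out_dim, num_basis=8, grid=(-2.0, 2.0)):
        super().__init__()
        self.register_buffer("centers",
                             torch.linspace(grid[0], grid[1], num_basis))
        self.inv_width = num_basis / (grid[1] - grid[0])
        # One coefficient vector per edge: shape (out_dim, in_dim, num_basis).
        # "Overprovisioning" means choosing these dimensions generously and
        # letting an L1 penalty on self.coeffs prune edges to zero later.
        self.coeffs = nn.Parameter(torch.randn(out_dim, in_dim, num_basis) * 0.1)

    def forward(self, x):                       # x: (batch, in_dim)
        z = (x.unsqueeze(-1) - self.centers) * self.inv_width
        phi = torch.exp(-z * z)                 # (batch, in_dim, num_basis)
        # Sum over basis functions (k) and incoming edges (i) per node (o).
        return torch.einsum("bik,oik->bo", phi, self.coeffs)

layer = KANLayer(in_dim=3, out_dim=2)
y = layer(torch.randn(16, 3))                   # y has shape (16, 2)
```

Stacking several such layers, attaching a prediction head after each one for deep supervision, and training under an objective like the one sketched earlier would give the overprovision-then-prune pipeline the abstract describes.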