Optimized Architectures for Kolmogorov-Arnold Networks

arXiv stat.ML / 4/22/2026


Key Points

  • The paper proposes architectural strategies that improve Kolmogorov-Arnold networks (KANs) while preserving their interpretability, a property that previous enhancements often compromised through added complexity.
  • It studies an approach that combines overprovisioned architectures with sparsification, deep supervision, and depth selection to produce compact and interpretable KANs without accuracy loss.
  • The method uses differentiable mechanisms optimized end-to-end under a minimum description length (MDL) objective, jointly learning activations, structure, and depth (see the sketch below this list).
  • Experiments across multiple settings (function approximation, dynamical systems forecasting, and real-world prediction) show that sparsification alone is not enough, but adding depth selection yields competitive or better accuracy with much smaller models.
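
To make the MDL objective concrete, here is a minimal sketch, written in PyTorch, of how a deep-supervised fit term, an L1 sparsity penalty on edge coefficients, and a differentiable depth gate might be combined. This is an illustration under stated assumptions, not the paper's implementation; every name (`mdl_objective`, `preds_per_depth`, `edge_coeffs`, `depth_logits`, the `lambda_*` weights) is hypothetical.

```python
import torch
import torch.nn.functional as F

def mdl_objective(preds_per_depth, target, edge_coeffs, depth_logits,
                  lambda_sparsity=1e-3, lambda_depth=1e-2):
    """Hypothetical MDL-style loss (an illustration, not the paper's code).

    preds_per_depth: predictions from deep-supervision heads, one per
    candidate depth. depth_logits: learnable scores over those depths.
    edge_coeffs: per-edge activation coefficients to be sparsified.
    """
    # Differentiable depth selection: a softmax over candidate depths.
    depth_weights = torch.softmax(depth_logits, dim=0)
    # Deep supervision: every intermediate head is trained, weighted by
    # the probability assigned to its depth.
    fit = sum(w * F.mse_loss(p, target)
              for w, p in zip(depth_weights, preds_per_depth))
    # Sparsification: an L1 penalty drives edge coefficients toward zero
    # so unused edges can be pruned from the overprovisioned network.
    sparsity = sum(c.abs().sum() for c in edge_coeffs)
    # Description-length term for depth: the expected number of layers.
    depths = torch.arange(1, len(preds_per_depth) + 1,
                          dtype=depth_weights.dtype)
    depth_cost = (depth_weights * depths).sum()
    return fit + lambda_sparsity * sparsity + lambda_depth * depth_cost
```

All three terms are differentiable, so activations, structure (via sparsity), and depth (via the gate) can be optimized jointly end-to-end, which is the property the paper emphasizes.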

Abstract

Efforts to improve Kolmogorov-Arnold networks (KANs) with architectural enhancements have been stymied by the complexity those enhancements bring, undermining the interpretability that makes KANs attractive in the first place. Here we study overprovisioned architectures combined with sparsification, deep supervision, and depth selection to learn compact, interpretable KANs without sacrificing accuracy. Crucially, we focus on differentiable mechanisms under a principled minimum description length objective, jointly optimizing activations, structure, and depth end-to-end. Experiments across function approximation benchmarks, dynamical systems forecasting, and real-world prediction tasks demonstrate that sparsification alone is insufficient, but the combination with depth selection achieves competitive or superior accuracy while discovering substantially smaller models. The result is a principled path toward models that are both more expressive and more interpretable, addressing a key tension in scientific machine learning.
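
For readers new to KANs: instead of fixed activations at nodes, a KAN places a learnable univariate function on every edge and sums them at each node. Below is a minimal sketch of one such layer, assuming Gaussian basis functions for simplicity rather than the B-splines of standard KAN implementations; the class name `KANLayer` and all parameters are illustrative, not the paper's code.

```python
import torch
import torch.nn as nn

class KANLayer(nn.Module):
    """Simplified KAN layer sketch: each edge (i -> j) applies its own
    learnable univariate function, here a linear combination of Gaussian
    basis functions, and node j sums its incoming edge outputs."""

    def __init__(self, in_dim, out_dim, num_basis=8, grid=(-2.0, 2.0)):
        super().__init__()
        self.register_buffer("centers",
                             torch.linspace(grid[0], grid[1], num_basis))
        self.inv_width = num_basis / (grid[1] - grid[0])
        # One coefficient vector per edge: shape (out_dim, in_dim, num_basis).
        # "Overprovisioning" means choosing these dimensions generously and
        # letting an L1 penalty on self.coeffs prune edges to zero later.
        self.coeffs = nn.Parameter(torch.randn(out_dim, in_dim, num_basis) * 0.1)

    def forward(self, x):                       # x: (batch, in_dim)
        z = (x.unsqueeze(-1) - self.centers) * self.inv_width
        phi = torch.exp(-z * z)                 # (batch, in_dim, num_basis)
        # Sum over basis functions (k) and incoming edges (i) per node (o).
        return torch.einsum("bik,oik->bo", phi, self.coeffs)

layer = KANLayer(in_dim=3, out_dim=2)
y = layer(torch.randn(16, 3))                   # y has shape (16, 2)
```

Stacking several such layers, attaching a prediction head after each one for deep supervision, and training under an objective like the one sketched earlier would give the overprovision-then-prune pipeline the abstract describes.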