Neural Autoregressive Flows for Markov Boundary Learning

arXiv cs.LG, 2026-03-24


Key Points

  • The paper proposes a new framework for discovering the Markov boundary -- the smallest set of predictors for a target variable -- using conditional entropy from information theory as a scoring criterion.
  • It introduces a masked autoregressive neural network to model complex variable dependencies more effectively than prior scoring/search approaches.
  • The method includes a parallelizable greedy search strategy that runs in polynomial time, supported by analytical evidence for its reliability.
  • It shows that using learned Markov boundaries to initialize causal discovery can accelerate convergence, improving performance beyond boundary learning alone.
  • Experiments on real-world and synthetic datasets indicate the approach is scalable and achieves superior results on both Markov boundary discovery and causal discovery tasks.
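To make the scoring-and-search idea concrete, here is a minimal sketch of greedy forward selection driven by conditional entropy on discrete data. It uses a simple plug-in entropy estimator rather than the paper's masked autoregressive network, and all function names are illustrative, not from the paper.

```python
from collections import Counter
from math import log2

def conditional_entropy(data, target_idx, cond_idxs):
    """Plug-in estimate of H(Y | S) from discrete samples.

    data: list of tuples (one sample per tuple); target_idx indexes Y,
    cond_idxs indexes the conditioning set S.
    """
    n = len(data)
    joint = Counter((tuple(row[i] for i in cond_idxs), row[target_idx]) for row in data)
    marg = Counter(tuple(row[i] for i in cond_idxs) for row in data)
    # H(Y|S) = -sum_{s,y} p(s,y) * log2 p(y|s)
    return -sum(c / n * log2((c / n) / (marg[s] / n)) for (s, _), c in joint.items())

def greedy_markov_boundary(data, target_idx, n_vars, tol=1e-6):
    """Greedily add the variable that most reduces H(Y | S); stop when no
    candidate lowers the conditional entropy by more than tol."""
    selected = []
    current = conditional_entropy(data, target_idx, selected)
    candidates = [i for i in range(n_vars) if i != target_idx]
    while candidates:
        scores = [(conditional_entropy(data, target_idx, selected + [c]), c)
                  for c in candidates]
        best_score, best_var = min(scores)
        if current - best_score <= tol:
            break
        selected.append(best_var)
        candidates.remove(best_var)
        current = best_score
    return selected
```

The inner loop over candidates is embarrassingly parallel, which is the intuition behind the paper's parallelizable search; the neural estimator replaces the plug-in entropy when variables are continuous or high-dimensional.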

Abstract

Recovering Markov boundary -- the minimal set of variables that maximizes predictive performance for a response variable -- is crucial in many applications. While recent advances improve upon traditional constraint-based techniques by scoring local causal structures, they still rely on nonparametric estimators and heuristic searches, lacking theoretical guarantees for reliability. This paper investigates a framework for efficient Markov boundary discovery by integrating conditional entropy from information theory as a scoring criterion. We design a novel masked autoregressive network to capture complex dependencies. A parallelizable greedy search strategy in polynomial time is proposed, supported by analytical evidence. We also discuss how initializing a graph with learned Markov boundaries accelerates the convergence of causal discovery. Comprehensive evaluations on real-world and synthetic datasets demonstrate the scalability and superior performance of our method in both Markov boundary discovery and causal discovery tasks.
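The abstract's "masked autoregressive network" refers to a network whose weight masks enforce an autoregressive ordering, so that output i depends only on inputs preceding i. The following is a minimal MADE-style sketch of that masking idea in NumPy; it is an assumed illustration of the general technique, not the paper's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def made_masks(n_in, n_hidden):
    """Build binary masks enforcing an autoregressive ordering (MADE-style).

    Each hidden unit gets a degree m in {1, ..., n_in - 1}: it may only see
    inputs with index <= m, and output i may only see hidden units with
    degree < i, so output i depends on inputs 1..i-1 only (1-based degrees).
    """
    degrees_in = np.arange(1, n_in + 1)
    degrees_h = rng.integers(1, n_in, size=n_hidden)
    mask_h = (degrees_h[:, None] >= degrees_in[None, :]).astype(float)   # hidden x in
    mask_out = (degrees_in[:, None] > degrees_h[None, :]).astype(float)  # out x hidden
    return mask_h, mask_out

def forward(x, W1, W2, mask_h, mask_out):
    """One masked forward pass: the masks zero out forbidden connections."""
    h = np.tanh((W1 * mask_h) @ x)
    return (W2 * mask_out) @ h
```

Because perturbing input j can only change outputs with index greater than j, a model built from such masks factorizes the joint distribution autoregressively, which is what lets a single network score conditional dependencies for the entropy criterion.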