Networked Information Aggregation for Binary Classification

arXiv cs.LG / 5/5/2026

Key Points

  • The paper studies sequential distributed binary classification on a DAG, where each agent sees only a subset of dataset features and passes prediction columns downstream.
  • It analyzes whether this “logit-passing” training protocol performs information aggregation, i.e., whether some agent can achieve low excess loss relative to a logistic predictor trained with all features.
  • Unlike prior work on linear regression under squared loss, the authors show that extending guarantees to logistic regression under binary cross-entropy is nontrivial due to the lack of quadratic structure.
  • Theoretical results include an excess-loss upper bound of O(M/√D) on depth-D paths, provided every contiguous run of M agents collectively observes all features.
  • A close lower-bound construction exhibits instances with excess loss at least Ω(k/D), where k is the dimension of the feature space, identifying network depth as a key bottleneck for aggregation quality in networked logistic regression.
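
The quantities in the points above can be written out as follows. This is a sketch in notation assumed here, not taken from the paper: n is the number of dataset rows, f_i is the logit function fitted by agent i, and the benchmark is the best logistic predictor over all k features. The summary does not say which agents the lower bound quantifies over, so that statement is left informal.

```latex
% Assumed notation: (x_j, y_j) are dataset rows, \sigma is the logistic link.
\[
  \mathcal{L}(f) \;=\; -\frac{1}{n}\sum_{j=1}^{n}
    \Bigl[\, y_j \log \sigma\bigl(f(x_j)\bigr)
          + (1-y_j)\log\bigl(1-\sigma(f(x_j))\bigr) \Bigr],
  \qquad \sigma(z) = \frac{1}{1+e^{-z}}
\]
% Excess loss of agent i relative to the full-feature logistic benchmark.
\[
  \mathrm{excess}(i) \;=\; \mathcal{L}(f_i)
    \;-\; \min_{w \in \mathbb{R}^{k},\, b \in \mathbb{R}}
          \mathcal{L}\bigl(x \mapsto w^{\top}x + b\bigr)
\]
% Upper bound (depth-D path, every M contiguous agents cover all features)
% and lower bound (there exist instances), as stated in the summary.
\[
  \min_{i}\, \mathrm{excess}(i) \;\le\; O\!\left(\frac{M}{\sqrt{D}}\right),
  \qquad
  \mathrm{excess} \;\ge\; \Omega\!\left(\frac{k}{D}\right)
  \ \text{on some instances}
\]
```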

Abstract

We study networked binary classification on a directed acyclic graph (DAG) where each agent observes only a subset of the feature columns of a shared dataset. Agents act sequentially along the DAG: each receives prediction columns from its parents (if any), augments its local features with these columns, fits a logistic predictor by minimizing binary cross-entropy (BCE), and forwards its prediction column to its outgoing neighbors. We ask whether this sequential distributed training procedure achieves information aggregation, meaning that some agent attains small excess loss compared to the best logistic predictor trained with access to all feature columns. This question was studied for linear regression under squared loss by Kearns, Roth, and Ryu (SODA 2026). Extending their guarantees to classification is nontrivial because their analysis relies on quadratic structure that does not directly transfer to BCE with a logistic link. We analyze the resulting sequential logit-passing protocol and prove: (i) an excess loss upper bound of O(M/√D) on depth-D paths under the condition that every contiguous subsequence of M agents collectively observes all features, and (ii) a close lower bound showing instances with excess loss of at least Ω(k/D), where k is the dimension of the feature space. Together, these results identify network depth as a fundamental bottleneck for information aggregation in networked logistic regression.
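
The protocol described in the abstract can be illustrated with a small simulation on a path. This is a minimal sketch under stated assumptions, not the authors' implementation: the synthetic data, the helper names (`fit_logistic`, `run_path`), the gradient-descent solver, and the warm start on the parent's logit are all choices made here for illustration. Each agent sees its own feature columns plus the logit column forwarded by its parent, fits a logistic predictor by minimizing BCE, and passes its own logits downstream.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bce(y, logit, eps=1e-12):
    """Mean binary cross-entropy of labels y against predicted logits."""
    p = np.clip(sigmoid(logit), eps, 1 - eps)
    return float(-np.mean(y * np.log(p) + (1 - y) * np.log(1 - p)))

def fit_logistic(Xb, y, w0, steps=500):
    """Gradient descent on mean BCE, starting from w0.

    The step size 1/L uses the smoothness bound
    L = 0.25 * lambda_max(Xb^T Xb / n), so the loss never increases.
    """
    n = len(y)
    smooth = 0.25 * np.linalg.eigvalsh(Xb.T @ Xb / n)[-1]
    w = w0.copy()
    for _ in range(steps):
        grad = Xb.T @ (sigmoid(Xb @ w) - y) / n
        w -= grad / smooth
    return w

def run_path(X, y, feature_sets):
    """Simulate logit passing along a path (hypothetical helper).

    Agent i observes X[:, feature_sets[i]] and, if it has a parent,
    the parent's logit column. Returns each agent's training BCE.
    """
    n = len(y)
    parent_logit = None
    losses = []
    for cols in feature_sets:
        parts = [X[:, cols]]
        if parent_logit is not None:
            parts.append(parent_logit[:, None])
        parts.append(np.ones((n, 1)))  # intercept column
        Xb = np.hstack(parts)
        w0 = np.zeros(Xb.shape[1])
        if parent_logit is not None:
            w0[-2] = 1.0  # warm start: reproduce the parent's predictor
        w = fit_logistic(Xb, y, w0)
        parent_logit = Xb @ w
        losses.append(bce(y, parent_logit))
    return losses

# Synthetic data (assumption): k = 4 features, labels from a logistic model.
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 4))
w_true = np.array([1.0, -1.0, 0.5, 2.0])
y = (rng.random(400) < sigmoid(X @ w_true)).astype(float)

# Depth-8 path in which every M = 4 contiguous agents cover all features.
feature_sets = [[0], [1], [2], [3], [0], [1], [2], [3]]
losses = run_path(X, y, feature_sets)
for depth, loss in enumerate(losses, start=1):
    print(f"agent {depth}: BCE = {loss:.4f}")
```

Because each agent starts from a predictor that copies its parent and the solver never increases the loss, the training BCE is non-increasing along the path in this sketch. How quickly it approaches the full-feature optimum as depth grows is the question the paper's bounds address.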
