Achieving the Kesten-Stigum bound in the non-uniform hypergraph stochastic block model

arXiv cs.LG / 4/24/2026


Key Points

  • The paper studies community detection in a non-uniform hypergraph stochastic block model (HSBM), where hyperedges of different sizes coexist, and asks whether uniform hypergraph layers that each fall below the detection threshold can be combined to enable weak recovery.
  • It proves a Kesten–Stigum-type threshold for weak recovery across a broad class of non-uniform HSBMs with r blocks generated from multiple symmetric probability tensors.
  • For the special case r=2, it shows weak recovery occurs when the sum of the signal-to-noise ratios across all uniform hypergraph layers exceeds one, confirming the positive part of a conjecture by Chodrow et al. (2023).
  • The authors give a polynomial-time spectral algorithm that achieves this threshold using an optimally weighted non-backtracking operator, and they show that the unweighted non-backtracking matrix attains a different algorithmic threshold, also conjectured by Chodrow et al. (2023).
  • Technically, the work develops spectral theory for weighted non-backtracking operators on non-uniform hypergraphs, including outlier eigenvalue characterization and a new weighted Ihara–Bass formula enabling efficient spectral reconstruction.
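
The r=2 criterion in the key points can be illustrated with a small check. This is only a sketch: the summary does not give the formula for a layer's signal-to-noise ratio, so the per-layer values below are hypothetical inputs, and the function name `combined_snr_exceeds_one` is made up for illustration.

```python
def combined_snr_exceeds_one(layer_snrs):
    """Return True when the summed per-layer SNR is strictly above one."""
    return sum(layer_snrs) > 1.0


# Hypothetical SNR values for two uniform layers (not taken from the paper).
layer_snrs = [0.6, 0.6]

# Treated as a single-layer model, each layer fails the same criterion...
each_layer_alone = [combined_snr_exceeds_one([snr]) for snr in layer_snrs]

# ...but the two layers together pass it.
combined = combined_snr_exceeds_one(layer_snrs)

print(each_layer_alone, combined)
```

With these inputs neither layer clears the threshold alone, while their sum does, which is the situation the paper's opening question is about.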

Abstract

We study the community detection problem in the non-uniform hypergraph stochastic block model (HSBM), where hyperedges of varying sizes coexist. This setting captures higher-order and multi-view interactions and raises a fundamental question: can multiple uniform hypergraph layers below the detection threshold be combined to enable weak recovery? We answer this question by establishing a Kesten–Stigum-type bound for weak recovery in a general class of non-uniform HSBMs with r blocks, generated according to multiple symmetric probability tensors. In the case r=2, we show that weak recovery is possible whenever the sum of the signal-to-noise ratios across all uniform hypergraph layers exceeds one, thereby confirming the positive part of a conjecture of Chodrow et al. (2023). Moreover, we provide a polynomial-time spectral algorithm that achieves this threshold via an optimally weighted non-backtracking operator. For the unweighted non-backtracking matrix, our spectral method attains a different algorithmic threshold, also conjectured by Chodrow et al. (2023). Our approach develops a spectral theory for weighted non-backtracking operators on non-uniform hypergraphs, including a precise characterization of outlier eigenvalues and eigenvector overlaps. We introduce a novel Ihara–Bass formula tailored to weighted non-uniform hypergraphs, which yields an efficient low-dimensional representation and leads to a provable spectral reconstruction algorithm. Taken together, these results provide a principled and computationally efficient approach to clustering in non-uniform hypergraphs, and highlight the role of optimal weighting in aggregating heterogeneous higher-order interactions.
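
For intuition about why an Ihara–Bass formula gives a low-dimensional representation, the sketch below checks the classical, unweighted identity on an ordinary graph, i.e. the 2-uniform case only. It is not the paper's weighted non-uniform formula, which this summary does not state. The classical identity is det(I - uB) = (1 - u^2)^(m-n) det(I - uA + u^2(D - I)), where B is the 2m x 2m non-backtracking matrix on directed edges, A the n x n adjacency matrix, and D the degree matrix. The helper names are illustrative, and exact rational arithmetic is used so the two sides can be compared for equality.

```python
from fractions import Fraction


def det(matrix):
    """Exact determinant by Gaussian elimination over Fractions."""
    a = [row[:] for row in matrix]
    size = len(a)
    result = Fraction(1)
    for col in range(size):
        pivot = next((r for r in range(col, size) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result  # a row swap flips the sign
        result *= a[col][col]
        for r in range(col + 1, size):
            factor = a[r][col] / a[col][col]
            for c in range(col, size):
                a[r][c] -= factor * a[col][c]
    return result


def non_backtracking_matrix(edges):
    """B[e][f] = 1 if directed edge f starts where e ends and f is not e reversed."""
    directed = list(edges) + [(v, u) for u, v in edges]
    B = [[0] * len(directed) for _ in directed]
    for i, (u, v) in enumerate(directed):
        for j, (x, y) in enumerate(directed):
            if v == x and y != u:
                B[i][j] = 1
    return B


def ihara_bass_sides(edges, n, u):
    """Return both sides of the classical Ihara-Bass identity at the point u."""
    u = Fraction(u)
    m = len(edges)
    A = [[0] * n for _ in range(n)]
    for a, b in edges:
        A[a][b] = A[b][a] = 1
    deg = [sum(row) for row in A]
    B = non_backtracking_matrix(edges)
    # Left side: a 2m x 2m determinant built from the non-backtracking matrix.
    big = [[Fraction(int(i == j)) - u * B[i][j] for j in range(2 * m)]
           for i in range(2 * m)]
    # Right side: an n x n determinant built from adjacency and degrees.
    small = [[Fraction(int(i == j)) * (1 + u * u * (deg[i] - 1)) - u * A[i][j]
              for j in range(n)] for i in range(n)]
    return det(big), (1 - u * u) ** (m - n) * det(small)


# Complete graph on 4 vertices: n = 4, m = 6, so B is 12 x 12.
k4_edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
lhs, rhs = ihara_bass_sides(k4_edges, n=4, u=Fraction(3, 10))
print(lhs == rhs)
```

The practical point is that spectral information about the 2m x 2m operator can be read off an n x n matrix, which is the kind of reduction the abstract attributes to its weighted non-uniform generalization.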