On the Identifiability of Tensor Ranks via Prior Predictive Matching

arXiv stat.ML / 4/3/2026

Key Points

  • The paper tackles the problem of choosing the latent rank in tensor factorization by deriving a rigorous criterion for rank identifiability in probabilistic tensor models using prior predictive moment matching.
  • It converts moment-matching conditions into a log-linear system whose solvability is shown to be equivalent to the identifiability of the tensor rank.
  • The approach is applied to four classic tensor models (PARAFAC/CP, Tensor Train, Tensor Ring, and Tucker), showing that PARAFAC/CP’s linear structure, Tensor Train’s chain structure, and Tensor Ring’s closed-loop structure produce solvable systems.
  • For the Tucker model, the authors prove the system is underdetermined due to its symmetric topology, so ranks are unidentifiable under this method.
  • For the identifiable cases, the paper provides explicit closed-form rank estimators that rely only on moments computed from observed data, and validates them empirically with robustness checks.
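A toy illustration of the log-linear idea follows. It is not taken from the paper; it is a sketch under assumed conditions. Consider a two-way CP-style model x_{ij} = \sum_{r=1}^{R} a_{ir} b_{jr} whose factors are i.i.d. and positive. The factor means and variances m_a, v_a, m_b, v_b and the moment names M, C_1, C_2, V, D are notation introduced here for the example only.

```latex
\begin{align*}
M   &= \mathbb{E}[x_{ij}] = R\, m_a m_b, \\
C_1 &= \operatorname{Cov}(x_{ij}, x_{ij'}) = R\, v_a m_b^2 \quad (j \neq j'), \\
C_2 &= \operatorname{Cov}(x_{ij}, x_{i'j}) = R\, m_a^2 v_b \quad (i \neq i'), \\
V   &= \operatorname{Var}(x_{ij}) = R\,\bigl(v_a v_b + v_a m_b^2 + m_a^2 v_b\bigr), \\
D   &= V - C_1 - C_2 = R\, v_a v_b. \\[6pt]
\log M   &= \log R + \log m_a + \log m_b, \\
\log C_1 &= \log R + \log v_a + 2\log m_b, \\
\log C_2 &= \log R + 2\log m_a + \log v_b, \\
\log D   &= \log R + \log v_a + \log v_b. \\[6pt]
2\log M + \log D - \log C_1 - \log C_2 &= \log R
\quad\Longrightarrow\quad
R = \frac{M^2 D}{C_1 C_2}.
\end{align*}
```

In this toy case the hyperparameter terms cancel in the last line, so the rank is expressed through moments of the data alone. When no such combination of the available moment equations isolates log R, the rank cannot be recovered this way, which is the kind of obstruction the paper reports for Tucker.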

Abstract

Selecting the latent dimensions (ranks) in tensor factorization is a central challenge that often relies on heuristic methods. This paper introduces a rigorous approach to determine rank identifiability in probabilistic tensor models, based on prior predictive moment matching. We transform a set of moment-matching conditions into a log-linear system of equations in terms of marginal moments, prior hyperparameters, and ranks, establishing an equivalence between rank identifiability and the solvability of such a system. We apply this framework to four foundational tensor models, demonstrating that the linear structure of the PARAFAC/CP model, the chain structure of the Tensor Train model, and the closed-loop structure of the Tensor Ring model yield solvable systems, making their ranks identifiable. In contrast, we prove that the symmetric topology of the Tucker model leads to an underdetermined system, rendering the ranks unidentifiable by this method. For the identifiable models, we derive explicit closed-form rank estimators based only on the moments of the observed data. We empirically validate these estimators and evaluate the robustness of the proposal.
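As a rough numerical illustration of what a moment-only rank estimator can look like, the sketch below simulates a toy two-way CP-style model x_ij = sum_r a_ir * b_jr with i.i.d. Gamma-distributed factors and recovers R from empirical moments. The model, the Gamma(2, 1) prior, the function names, and the formula R = M^2 * (V - C1 - C2) / (C1 * C2) are assumptions made for this example. They are not the estimators derived in the paper.

```python
import numpy as np


def rank_from_moments(M, C1, C2, V):
    """Toy rank estimator for x_ij = sum_r a_ir * b_jr with i.i.d. factors.

    M  : mean of an entry                      (= R * m_a * m_b)
    C1 : covariance of two entries in a row    (= R * v_a * m_b**2)
    C2 : covariance of two entries in a column (= R * m_a**2 * v_b)
    V  : variance of an entry
    Since V - C1 - C2 = R * v_a * v_b, the hyperparameters cancel below.
    """
    D = V - C1 - C2
    return M**2 * D / (C1 * C2)


def simulate_rank_estimate(R=3, n_draws=1_000_000, seed=0):
    """Monte Carlo check: draw many independent 2x2 matrices from the prior
    predictive distribution and plug empirical moments into the estimator."""
    rng = np.random.default_rng(seed)
    # Assumed prior for the illustration: Gamma(shape=2, scale=1) factors.
    a = rng.gamma(shape=2.0, scale=1.0, size=(n_draws, 2, R))
    b = rng.gamma(shape=2.0, scale=1.0, size=(n_draws, 2, R))

    # Three entries are enough: x00, its row neighbour x01, its column neighbour x10.
    x00 = np.sum(a[:, 0] * b[:, 0], axis=1)
    x01 = np.sum(a[:, 0] * b[:, 1], axis=1)
    x10 = np.sum(a[:, 1] * b[:, 0], axis=1)

    M = x00.mean()
    V = x00.var(ddof=1)
    C1 = np.cov(x00, x01)[0, 1]  # same row, different column
    C2 = np.cov(x00, x10)[0, 1]  # same column, different row
    return rank_from_moments(M, C1, C2, V)


if __name__ == "__main__":
    print("estimated rank:", simulate_rank_estimate(R=3))
```

The estimate is noisy because V - C1 - C2 is a difference of sample moments, so it should be read as a real-valued quantity that lands near the true rank for large sample sizes, not as an exact integer.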