How Hard Is Continuous Clustering? Lower Bounds from the Existential Theory of the Reals

arXiv cs.LG / 5/1/2026

💬 OpinionIdeas & Deep AnalysisModels & Research

Key Points

  • The paper analyzes clustering defined over a continuous probability density represented by polynomials, focusing on four natural cluster-structure decision questions.
  • It proves that detecting separated high-density points and detecting a low-density “valley” between two high-density points are exactly as hard as the existential theory of the reals (a class containing NP and widely believed to be strictly larger).
  • It shows that topological properties—such as whether the above-threshold region has enough connected components or contains a hole—are at least as hard as the existential theory of the reals, while their precise complexity classification remains unresolved.
  • The results provide an initial rigorous classification of exact continuous clustering problems within the real polynomial hierarchy and indicate that simple clustering criteria are unlikely to be NP-complete without major, unexpected complexity-theoretic collapses.

Abstract

This paper studies the computational difficulty of clustering problems that are defined directly on a continuous probability density. Rather than working with finite samples, we assume the density is given as a polynomial and ask whether it contains certain cluster structures. Four natural questions are examined. First, do there exist several points with high density that are far apart from each other. Second, do two high density points have a midpoint with low density, creating a valley between them. Third, does the region where the density is above a threshold have at least a given number of separate connected pieces. Fourth, does that same region contain a hole, meaning a loop that cannot be shrunk to a point. We prove that the first two problems, separated points and valley detection, are exactly as hard as the existential theory of the reals, a complexity class that contains NP and is believed to be strictly larger. In contrast, the topological problems of counting connected pieces and detecting holes are at least as hard as the existential theory of the reals, but their exact complexity remains open. Placing them inside that class would need a major advance in real algebraic geometry. These results give the first rigorous classification of exact continuous clustering inside the real polynomial hierarchy. They also show that even basic clustering criteria are not NP complete unless unexpected collapses occur.