PRIM-cipal components analysis

arXiv stat.ML / 4/20/2026


Key Points

  • The paper studies “unsupervised” No Free Lunch theorems by proving that, for elliptical distributions, two scientifically meaningful bump-hunting strategies can be exactly opposite yet equally optimal, with no universally best method.
  • It analyzes a principal-component “peeling” procedure that peels k orthogonal dimensions, retaining an inter-quantile region of probability $1-\alpha$ along each, and shows that the retained total variance and Frobenius norm are maximized when the k smallest (“pettiest”) principal components are peeled and minimized when the k largest are (see the sketch after this list).
  • The derived optima motivate PRIM-based bump-hunting algorithms that implement either variance-minimization or volume-minimization, thereby grounding an NFLT-style argument in specific selection rules.
  • Experiments on Fashion-MNIST indicate that peeling the largest principal components tends to capture multiplicity, while peeling the smallest components helps isolate popular styles.
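
The following is a minimal sketch of the peeling idea as I read it, not the paper's implementation: project the data onto its principal components, keep only points inside the central inter-quantile region along k chosen components, and compare the total variance that survives when peeling the largest vs. the smallest components. The function name, the synthetic Gaussian data, and the quantile-based cut are my assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def peel(X, components, alpha=0.05):
    """Keep rows of X whose scores on the given PCA components lie inside
    the central 1 - alpha inter-quantile region (per component)."""
    Xc = X - X.mean(axis=0)
    # Principal axes from the SVD of the centered data (rows of Vt,
    # ordered by decreasing singular value).
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = Xc @ Vt.T                      # coordinates in the PCA basis
    mask = np.ones(len(X), dtype=bool)
    for j in components:
        lo, hi = np.quantile(scores[:, j], [alpha / 2, 1 - alpha / 2])
        mask &= (scores[:, j] >= lo) & (scores[:, j] <= hi)
    return X[mask]

def total_variance(Z):
    """Trace of the sample covariance matrix."""
    return np.trace(np.cov(Z, rowvar=False))

# Synthetic elliptical (Gaussian) data with strongly unequal variances.
d, k = 5, 2
cov = np.diag([10.0, 5.0, 1.0, 0.5, 0.1])
X = rng.multivariate_normal(np.zeros(d), cov, size=20_000)

largest = peel(X, components=[0, 1])            # peel the k leading components
pettiest = peel(X, components=[d - 2, d - 1])   # peel the k smallest components

# Under the paper's result, peeling the leading components should give the
# smaller retained total variance, and peeling the pettiest the larger one.
print("total variance, peel largest :", total_variance(largest))
print("total variance, peel pettiest:", total_variance(pettiest))
```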

Abstract

Supervised No Free Lunch Theorems (NFLTs) are well studied, yet unsupervised NFLTs remain underexplored. For elliptical distributions, we prove that there exist two equally optimal, scientifically meaningful bump-hunting strategies that are exact opposites, with no universal winner. Specifically, peeling k orthogonal dimensions from $\mathbb{R}^d$ ($d \ge k$), retaining an inter-quantile region of probability $1-\alpha$ per peeled dimension, maximizes total variance and Frobenius norm when the k smallest principal components (called pettiest components) are selected, and minimizes them when the selected dimensions are the k leading principal components. These optima inspire PRIM-based bump-hunting algorithms that minimize either variance or volume, thereby motivating an NFLT. We test our results on the Fashion-MNIST database, showing that peeling the largest principal components captures multiplicity, while peeling the smallest principal components isolates popular styles.
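
For intuition, here is a minimal derivation sketch for the Gaussian special case, under my own simplifying assumptions (not necessarily the paper's proof): peeling is modeled as truncating each selected coordinate to its central $1-\alpha$ interval, and the coordinates are independent in the PCA basis.

```latex
% Gaussian sketch: X ~ N(0, \Lambda) in the PCA basis, \Lambda = diag(\lambda_1 \ge \dots \ge \lambda_d).
% Truncating coordinate i to its central 1-\alpha interval shrinks its variance by a fixed
% factor c(\alpha) < 1, while the remaining (independent) coordinates are unaffected:
\[
  c(\alpha) \;=\; 1 - \frac{2\,z\,\varphi(z)}{1-\alpha},
  \qquad z = \Phi^{-1}\!\left(1 - \tfrac{\alpha}{2}\right).
\]
% With S the set of k peeled components, the retained total variance is
\[
  \operatorname{tr}\bigl(\Sigma_{\text{peeled}}\bigr)
  \;=\; \sum_{j=1}^{d} \lambda_j \;-\; \bigl(1 - c(\alpha)\bigr) \sum_{i \in S} \lambda_i ,
\]
% which is maximized when S indexes the k smallest ("pettiest") eigenvalues and
% minimized when S indexes the k largest, matching the abstract's claim.
```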