The Wasserstein transform

arXiv stat.ML / 4/14/2026


Key Points

  • The Wasserstein Transform (WT) is proposed as a general unsupervised framework that represents each data point as a probability measure reflecting its neighborhood structure, and then updates the distance structure by computing Wasserstein distances between these measures.
  • WT extends the mean shift family of algorithms; by updating the distance structure it aims at feature enhancement and denoising.
  • In particular, the Gaussian Transform (GT) is computationally efficient because a closed-form expression exists for the ℓ^2-Wasserstein distance between Gaussian measures; the paper also relates GT to other WT instances and proves stability under perturbations.
  • Iterative algorithms are presented, together with acceleration strategies for GT, such as a linear-algebra observation that reduces the number of matrix square-root computations.
  • Effectiveness is examined on multiple tasks, including denoising, clustering, image segmentation, and word embeddings.
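The closed form mentioned above for the ℓ^2-Wasserstein distance between two Gaussians N(m₁, Σ₁) and N(m₂, Σ₂) is W₂² = ‖m₁ − m₂‖² + Tr(Σ₁ + Σ₂ − 2(Σ₁^{1/2} Σ₂ Σ₁^{1/2})^{1/2}). A minimal sketch of that formula (function name and numerical details are illustrative, not from the paper):

```python
import numpy as np
from scipy.linalg import sqrtm

def w2_gaussian(m1, S1, m2, S2):
    """Closed-form l2-Wasserstein distance between N(m1, S1) and N(m2, S2)."""
    rS1 = sqrtm(S1)
    cross = sqrtm(rS1 @ S2 @ rS1)
    # sqrtm can return tiny imaginary parts due to floating point; drop them.
    d2 = np.sum((m1 - m2) ** 2) + np.trace(S1 + S2 - 2 * np.real(cross))
    return np.sqrt(max(d2, 0.0))
```

For example, two isotropic unit Gaussians in the plane with means (0, 0) and (3, 4) are at distance 5, and two centered 1-D Gaussians with variances 1 and 4 are at distance |1 − 2| = 1.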

Abstract

We introduce the Wasserstein Transform (WT), a general unsupervised framework for updating distance structures on given data sets with the purpose of enhancing features and denoising. Our framework represents each data point by a probability measure reflecting the neighborhood structure of the point, and then updates the distance by computing the Wasserstein distance between these probability measures. The Wasserstein Transform is a general method which extends the mean shift family of algorithms. We study several instances of WT, and in particular, in one of the instances, which we call the Gaussian Transform (GT), we utilize Gaussian measures to model neighborhood structures of individual data points. GT is computationally cheaper than other instances of WT since a closed-form solution exists for the ℓ^2-Wasserstein distance between Gaussian measures. We study the relationship between different instances of WT and prove that each of the instances is stable under perturbations. We devise iterative algorithms for performing the above-mentioned WT and propose several strategies to accelerate GT, such as an observation from linear algebra for reducing the number of matrix square root computations. We examine the performance of the Wasserstein Transform method in many tasks, such as denoising, clustering, image segmentation and word embeddings.
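One pass of the GT instance can be sketched as: fit a local Gaussian to each point's neighborhood, then replace the pairwise distance matrix with closed-form ℓ^2-Wasserstein distances between those Gaussians. The sketch below is an assumption-laden illustration, not the paper's exact algorithm: it uses a simple k-nearest-neighbor estimate of each local mean and covariance, and precomputing one square root per point (rather than per pair) stands in for the kind of linear-algebra saving the abstract alludes to.

```python
import numpy as np
from scipy.linalg import sqrtm

def gaussian_transform(X, k=5):
    """One GT pass (sketch): fit a local Gaussian per point from its k
    nearest neighbors, then return the matrix of pairwise l2-Wasserstein
    distances between those Gaussians. The k-NN neighborhood estimate is
    an illustrative assumption, not the paper's exact scheme."""
    n, d = X.shape
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    means, covs = [], []
    for i in range(n):
        nbrs = X[np.argsort(D[i])[:k]]  # k nearest neighbors (including the point itself)
        means.append(nbrs.mean(axis=0))
        covs.append(np.cov(nbrs, rowvar=False) + 1e-9 * np.eye(d))  # ridge for stability
    roots = [sqrtm(S) for S in covs]    # one square root per point, reused across pairs
    W = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            cross = sqrtm(roots[i] @ covs[j] @ roots[i])
            d2 = np.sum((means[i] - means[j]) ** 2) + np.trace(
                covs[i] + covs[j] - 2 * np.real(cross))
            W[i, j] = W[j, i] = np.sqrt(max(d2, 0.0))
    return W
```

Iterating the transform would then re-estimate neighborhoods from the updated distance matrix W and repeat; points within a tight cluster end up with similar local Gaussians and hence small mutual distances, which is the feature-enhancement effect the abstract describes.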