Structure-Preserving Multi-View Embedding Using Gromov-Wasserstein Optimal Transport

arXiv stat.ML / 4/6/2026


Key Points

  • The paper addresses multi-view embedding: integrating multiple representations of the same samples into a coherent low-dimensional structure, even when the views exhibit heterogeneous geometries or nonlinear distortions.
  • It proposes two Gromov-Wasserstein (GW) optimal-transport-based methods: Mean-GWMDS, which averages views’ distance matrices and applies GW-based multidimensional scaling, and Multi-GWMDS, which generates geometry-consistent candidate embeddings via GW alignment and then selects a representative one.
  • Experiments on both synthetic manifolds and real-world datasets indicate the methods can effectively preserve intrinsic relational structures across different views without requiring strict alignment assumptions.
  • The authors position GW-based optimal transport as a flexible, principled framework for geometry-aware multi-view representation learning.
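The Mean-GWMDS idea described above (average the view-specific distance matrices, then embed the averaged geometry) can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: classical MDS stands in for the paper's GW-based multidimensional scaling step, and all function names here are hypothetical.

```python
import numpy as np

def pairwise_dists(X):
    """Euclidean distance matrix for one view (rows are samples)."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(-1))

def classical_mds(D, dim=2):
    """Classical MDS via double centering; a simplified stand-in
    for the paper's GW-based MDS step."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n          # centering matrix
    B = -0.5 * J @ (D ** 2) @ J                  # Gram matrix estimate
    vals, vecs = np.linalg.eigh(B)
    idx = np.argsort(vals)[::-1][:dim]           # top-`dim` eigenpairs
    return vecs[:, idx] * np.sqrt(np.clip(vals[idx], 0, None))

def mean_gwmds(views, dim=2):
    """Average each view's distance matrix, then embed the mean geometry."""
    D_mean = np.mean([pairwise_dists(V) for V in views], axis=0)
    return classical_mds(D_mean, dim)
```

Averaging operates on relational (distance) information rather than raw features, which is what lets the views live in different ambient spaces as long as they index the same samples.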

Abstract

Multi-view data analysis seeks to integrate multiple representations of the same samples in order to recover a coherent low-dimensional structure. Classical approaches often rely on feature concatenation or explicit alignment assumptions, which become restrictive under heterogeneous geometries or nonlinear distortions. In this work, we propose two geometry-aware multi-view embedding strategies grounded in Gromov-Wasserstein (GW) optimal transport. The first, termed Mean-GWMDS, aggregates view-specific relational information by averaging distance matrices and applying GW-based multidimensional scaling to obtain a representative embedding. The second strategy, referred to as Multi-GWMDS, adopts a selection-based paradigm in which multiple geometry-consistent candidate embeddings are generated via GW-based alignment and a representative embedding is selected. Experiments on synthetic manifolds and real-world datasets show that the proposed methods effectively preserve intrinsic relational structure across views. These results highlight GW-based approaches as a flexible and principled framework for multi-view representation learning.
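The selection-based Multi-GWMDS paradigm can likewise be sketched: generate one candidate embedding per view, then select the candidate whose internal geometry is, on average, closest to all the others'. This is a self-contained NumPy sketch under simplifying assumptions (classical MDS replaces GW-based alignment, and a Frobenius-norm discrepancy between distance matrices replaces the GW distance); the function names are hypothetical.

```python
import numpy as np

def pairwise_dists(X):
    """Euclidean distance matrix for one view (rows are samples)."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(-1))

def classical_mds(D, dim=2):
    """Classical MDS via double centering; stand-in for GW-based embedding."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (D ** 2) @ J
    vals, vecs = np.linalg.eigh(B)
    idx = np.argsort(vals)[::-1][:dim]
    return vecs[:, idx] * np.sqrt(np.clip(vals[idx], 0, None))

def multi_gwmds_select(views, dim=2):
    """Generate one candidate embedding per view, then pick the candidate
    minimizing the total geometric discrepancy to all other candidates."""
    candidates = [classical_mds(pairwise_dists(V), dim) for V in views]
    Ds = [pairwise_dists(Y) for Y in candidates]
    scores = [sum(np.linalg.norm(Di - Dj) for Dj in Ds) for Di in Ds]
    return candidates[int(np.argmin(scores))]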