AI Navigate

Harnessing Data Asymmetry: Manifold Learning in the Finsler World

arXiv cs.LG / 3/13/2026


Key Points

  • The paper proposes using Finsler geometry, an asymmetric generalisation of Riemannian geometry, to capture asymmetric relationships in data for manifold learning.
  • It develops a Finsler manifold learning pipeline and generalises reference embedding methods to the asymmetric setting, yielding Finsler t-SNE and Finsler UMAP.
  • Experiments on controlled synthetic and large real datasets show that the approach uncovers information, such as density hierarchies, that traditional symmetric methods miss, and produces embeddings that outperform their Euclidean counterparts.
  • This work broadens the applicability of asymmetric embedders beyond directed data, potentially improving data visualization and analysis workflows.

Abstract

Manifold learning is a fundamental task at the core of data analysis and visualisation. It aims to capture the simple underlying structure of complex high-dimensional data by preserving pairwise dissimilarities in low-dimensional embeddings. Traditional methods rely on symmetric Riemannian geometry, thus forcing symmetric dissimilarities and embedding spaces, e.g. Euclidean. In practice, however, this discards valuable asymmetric information inherent in the non-uniformity of data samples. We propose to harness this asymmetry by switching to Finsler geometry, an asymmetric generalisation of Riemannian geometry, and develop a Finsler manifold learning pipeline that constructs asymmetric dissimilarities and embeds them in a Finsler space. This greatly broadens the applicability of existing asymmetric embedders beyond traditionally directed data to any data. We also modernise asymmetric embedders by generalising current reference methods to asymmetry, yielding Finsler t-SNE and Finsler UMAP. On controlled synthetic and large real datasets, we show that our asymmetric pipeline reveals valuable information lost in the traditional pipeline, e.g. density hierarchies, and consistently provides higher-quality embeddings than their Euclidean counterparts.
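To make the idea of "constructing asymmetric dissimilarities" concrete, here is a minimal illustrative sketch, not the paper's actual construction: starting from symmetric Euclidean distances, we add a density-driven drift term (in the spirit of a Randers-type Finsler metric, where travelling "downhill" towards denser regions costs less than travelling "uphill"). The density estimate, the `alpha` weight, and the drift formula are all assumptions chosen for illustration; the resulting matrix could then be fed to any embedder that accepts precomputed asymmetric dissimilarities.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))  # toy high-dimensional samples

# Symmetric baseline: pairwise Euclidean distances.
diff = X[:, None, :] - X[None, :, :]
D = np.linalg.norm(diff, axis=-1)

# Crude local density estimate: inverse mean distance to the
# k nearest neighbours (column 0 is the point itself, so skip it).
k = 10
knn = np.sort(D, axis=1)[:, 1:k + 1]
density = 1.0 / knn.mean(axis=1)

# Randers-style asymmetry (illustrative, not the paper's formula):
# drift[i, j] > 0 when j is denser than i, so moving towards dense
# regions is cheaper. |alpha * drift| < 1 keeps all costs positive.
alpha = 0.5
drift = (density[None, :] - density[:, None]) / (density[None, :] + density[:, None])
D_asym = D * (1.0 + alpha * drift)
```

Because `drift` is antisymmetric while `D` is symmetric, `D_asym[i, j] != D_asym[j, i]` in general, which is exactly the asymmetric information a symmetric pipeline would average away.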