Riemannian Generative Decoder

arXiv stat.ML / 5/5/2026

💬 Opinion · Ideas & Deep Analysis · Models & Research

Key Points

  • The paper argues that standard Euclidean embeddings can distort data that actually has intrinsic non-Euclidean structure, motivating manifold-aware representation learning.
  • It introduces a “Riemannian generative decoder” that learns manifold-valued latents with a Riemannian optimizer while jointly training only a decoder network, avoiding an encoder altogether (see the sketch after this list).
  • By discarding the encoder and density-estimation steps used in prior Riemannian representation learning, the method sidesteps numerically brittle training objectives and simplifies the manifold constraint.
  • The approach is validated on three diverse studies—synthetic branching diffusion, mitochondrial DNA-based human migration inference, and cell division cycle dynamics—showing latents that respect the intended geometry.
  • The method is presented as architecture-compatible, interpretable, and broadly applicable to “any Riemannian manifold,” with released code on GitHub.
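
To make the training recipe concrete, here is a minimal sketch of the decoder-only setup, assuming PyTorch and the geoopt library for Riemannian optimization. The toy dataset `X`, the spherical latent space, and the `decoder` architecture are illustrative assumptions, not the authors' released code.

```python
# Minimal sketch: free manifold-valued latents + a jointly trained decoder.
# Assumes PyTorch and geoopt; data, dimensions, and architecture are toy choices.
import torch
import torch.nn as nn
import geoopt

X = torch.randn(1000, 50)                # toy data: 1000 samples, 50 features
manifold = geoopt.Sphere()               # unit sphere; any geoopt manifold works

# One free latent per data point, projected onto the manifold at initialization.
latents = geoopt.ManifoldParameter(
    manifold.projx(torch.randn(X.shape[0], 3)), manifold=manifold
)

decoder = nn.Sequential(nn.Linear(3, 128), nn.ReLU(), nn.Linear(128, 50))

# RiemannianAdam updates manifold parameters along the manifold (retraction)
# and treats the ordinary decoder weights as Euclidean.
optimizer = geoopt.optim.RiemannianAdam([latents, *decoder.parameters()], lr=1e-2)

for step in range(2000):
    optimizer.zero_grad()
    loss = nn.functional.mse_loss(decoder(latents), X)
    loss.backward()
    optimizer.step()                     # latents remain on the sphere
```

With no encoder and no density model on the manifold, the only geometry-specific machinery is the optimizer's retraction step, which is what keeps the manifold constraint simple.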

Abstract

Euclidean representations distort data with intrinsic non-Euclidean structure. While Riemannian representation learning offers a solution by embedding data onto matching manifolds, it typically relies on an encoder to estimate densities on chosen manifolds. This involves optimizing numerically brittle objectives, potentially harming model training and quality. To completely circumvent this issue, we introduce the Riemannian generative decoder, a unifying approach for finding manifold-valued latents on any Riemannian manifold. Latents are learned with a Riemannian optimizer while jointly training a decoder network. By discarding the encoder, we vastly simplify the manifold constraint compared to current approaches, which often handle only a few specific manifolds. We validate our approach on three case studies -- a synthetic branching diffusion process, human migrations inferred from mitochondrial DNA, and cells undergoing a cell division cycle -- each showing that learned representations respect the prescribed geometry and capture intrinsic non-Euclidean structure. Our method requires only a decoder, is compatible with existing architectures, and yields interpretable latent spaces aligned with data geometry. Code is available at https://github.com/yhsure/riemannian-generative-decoder.
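
As a sketch of the “any Riemannian manifold” claim: in a geoopt-based setup like the one above, changing the geometry is a one-line swap of the `manifold` object while the rest of the loop stays unchanged. The pairings below are illustrative, not prescriptions from the paper.

```python
manifold = geoopt.PoincareBall(c=1.0)    # hyperbolic latents, e.g. for branching/hierarchical data
# manifold = geoopt.Sphere()             # spherical latents, e.g. for cyclic processes
# manifold = geoopt.Euclidean()          # recovers a plain deterministic decoder
```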