Self-Organizing Maps with Optimized Latent Positions

arXiv cs.LG · April 16, 2026


Key Points

  • The paper proposes SOM-OLP, an objective-based self-organizing map variant that adds a continuous latent position per data point to improve topographic mapping quality while addressing inefficiencies in earlier STVQ-style methods.
  • By deriving a separable surrogate local cost from STVQ’s neighborhood distortion and adding entropy regularization, the authors obtain a block coordinate descent procedure with closed-form updates for assignment probabilities, latent positions, and reference vectors.
  • The optimization is designed to guarantee monotonic non-increase of the objective and to maintain linear per-iteration complexity with respect to both the number of data points and latent nodes.
  • Experiments demonstrate competitive neighborhood preservation and quantization performance on synthetic and digit datasets, with scalability studies on Digits/MNIST and broader evaluation across 16 benchmark datasets showing strong average ranking versus compared methods.

Abstract

Self-Organizing Maps (SOM) are a classical method for unsupervised learning, vector quantization, and topographic mapping of high-dimensional data. However, existing SOM formulations often involve a trade-off between computational efficiency and a clearly defined optimization objective. Objective-based variants such as Soft Topographic Vector Quantization (STVQ) provide a principled formulation, but their neighborhood-coupled computations become expensive as the number of latent nodes increases. In this paper, we propose Self-Organizing Maps with Optimized Latent Positions (SOM-OLP), an objective-based topographic mapping method that introduces a continuous latent position for each data point. Starting from the neighborhood distortion of STVQ, we construct a separable surrogate local cost from its local quadratic structure and formulate an entropy-regularized objective around this surrogate. This yields a simple block coordinate descent scheme with closed-form updates for assignment probabilities, latent positions, and reference vectors, while guaranteeing monotonic non-increase of the objective and retaining linear per-iteration complexity in the numbers of data points and latent nodes. Experiments on a synthetic saddle manifold, scalability studies on the Digits and MNIST datasets, and 16 benchmark datasets show that SOM-OLP achieves competitive neighborhood preservation and quantization performance, favorable scalability for large numbers of latent nodes and large datasets, and the best average rank among the compared methods on the benchmark datasets.
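To make the three-block structure concrete, here is a minimal NumPy sketch of one plausible block coordinate descent loop in this spirit. The update rules are stand-ins inferred from the abstract (an STVQ-style neighborhood distortion, softmax assignments arising from entropy regularization, and weighted-mean closed forms for latent positions and reference vectors); they are not the paper's exact equations, and the paper's monotone-descent and linear-complexity guarantees apply to its own updates, not necessarily to this sketch.

```python
import numpy as np

def som_olp_sketch(X, side=5, n_iter=30, T=0.5, sigma=1.0, seed=0):
    """Illustrative block coordinate descent in the spirit of SOM-OLP.

    NOTE: these updates are plausible stand-ins inferred from the
    abstract, not the paper's exact equations.
    """
    rng = np.random.default_rng(seed)
    n, _ = X.shape
    # Node coordinates on a 2-D grid, and the node-node Gaussian
    # neighborhood kernel used by STVQ-style distortions.
    grid = np.stack(np.meshgrid(np.arange(side), np.arange(side)),
                    axis=-1).reshape(-1, 2).astype(float)
    K = side * side
    H = np.exp(-((grid[:, None, :] - grid[None, :, :]) ** 2).sum(-1)
               / (2 * sigma ** 2))
    W = X[rng.choice(n, K, replace=False)].astype(float)  # reference vectors

    for _ in range(n_iter):
        d2 = ((X[:, None, :] - W[None, :, :]) ** 2).sum(-1)  # (n, K)
        # STVQ-style neighborhood distortion: E[i, k] = sum_l H[k, l] d2[i, l].
        E = d2 @ H.T
        # Block 1: closed-form soft assignments -- the softmax that
        # minimizes the entropy-regularized objective at temperature T.
        P = np.exp(-(E - E.min(axis=1, keepdims=True)) / T)
        P /= P.sum(axis=1, keepdims=True)
        # Block 2: closed-form continuous latent position per data point,
        # here the responsibility-weighted mean of node coordinates.
        Z = P @ grid
        # Block 3: closed-form reference vectors as neighborhood-smoothed
        # weighted means of the data.
        R = P @ H                                  # (n, K) smoothed weights
        W = (R.T @ X) / R.sum(axis=0)[:, None]
    return P, Z, W
```

Note that the dense kernel product `d2 @ H.T` above costs O(nK²) per iteration; the separable surrogate cost is precisely what the paper introduces to avoid this coupling and reach complexity linear in both n and K.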