Phase-Associative Memory: Sequence Modeling in Complex Hilbert Space

arXiv cs.LG / 4/8/2026


Key Points

  • The paper introduces Phase-Associative Memory (PAM), a recurrent sequence model that uses complex-valued representations and stores associations in a matrix state updated via outer products.
  • Retrieval is performed using a conjugate inner-product mechanism, designed to better support binding and recall than prior vector-state (holographic) approaches.
  • Experiments on WikiText-103 show PAM reaching a validation perplexity of 30.0 with ~100M parameters, achieving performance within ~10% of a matched transformer (27.1) under similar training conditions.
  • The authors argue that shifting from vector-state superposition to a matrix-state formulation avoids capacity degradation seen in holographic binding, improving associative memory behavior for sequence modeling.
  • They connect the architecture’s strengths to broader claims that semantic interpretation in both humans and large language models exhibits non-classical contextuality, suggesting that the choice of computational formalism matters for language modeling.
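The O(1/√n) capacity degradation the authors attribute to vector-state holographic binding can be seen in a small NumPy sketch. This uses circular-convolution binding in the style of holographic reduced representations as a stand-in for the paper's vector-state baseline; the dimension and pair counts are illustrative, not the paper's settings:

```python
import numpy as np

rng = np.random.default_rng(1)
d = 1024  # vector dimension (illustrative)

def rand_vec():
    v = rng.standard_normal(d)
    return v / np.linalg.norm(v)

def bind(a, b):
    # Circular convolution: the binding operator in holographic
    # reduced representations, computed via FFT.
    return np.real(np.fft.ifft(np.fft.fft(a) * np.fft.fft(b)))

def unbind(s, a):
    # Circular correlation: an approximate inverse of binding.
    return np.real(np.fft.ifft(np.fft.fft(s) * np.conj(np.fft.fft(a))))

def recall_similarity(n):
    # Superpose n key/value bindings into a single vector state,
    # then try to recall the first value with its key.
    keys = [rand_vec() for _ in range(n)]
    vals = [rand_vec() for _ in range(n)]
    s = sum(bind(k, v) for k, v in zip(keys, vals))
    est = unbind(s, keys[0])
    return float(np.dot(est, vals[0]) / np.linalg.norm(est))

# Recall quality falls as more associations share the vector state.
for n in (1, 4, 16, 64):
    print(n, round(recall_similarity(n), 3))
```

Each added binding contributes roughly unit-norm crosstalk to every retrieval, so the cosine similarity between the recalled and stored value decays on the order of 1/√n, which is the failure mode the matrix-state formulation is meant to avoid.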

Abstract

We present Phase-Associative Memory (PAM), a recurrent sequence model in which all representations are complex-valued, associations accumulate in a matrix state S_t ∈ ℂ^{d×d} via outer products, and retrieval operates through the conjugate inner product K_t* · Q_t / √d. At ~100M parameters on WikiText-103, PAM reaches validation perplexity 30.0, within ~10% of a matched transformer (27.1) trained under identical conditions, despite 4× arithmetic overhead from complex computation and no custom kernels. We trace the experimental path from vector-state models, where holographic binding fails due to the O(1/√n) capacity degradation of superposed associations, to the matrix state that resolves it. The competitiveness of an architecture whose native operations are complex-valued superposition and conjugate retrieval is consistent with recent empirical evidence that semantic interpretation in both humans and large language models exhibits non-classical contextuality, and we discuss what this implies for the choice of computational formalism in language modeling.
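Read literally, the abstract's update and retrieval rules can be sketched in NumPy. The pairing of keys and values via v·k^H outer products is one plausible reading of "associations accumulate in a matrix state via outer products", not the paper's exact parameterization, and the dimensions here are toy-scale:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 64  # state dimension (illustrative; the paper works at ~100M parameters)

def unit_complex(n):
    # Random complex-valued vector, normalized to unit length.
    v = rng.standard_normal(n) + 1j * rng.standard_normal(n)
    return v / np.linalg.norm(v)

keys = [unit_complex(d) for _ in range(5)]
values = [unit_complex(d) for _ in range(5)]

# Matrix state accumulates associations via outer products:
#   S_t = S_{t-1} + v_t k_t^H   (one reading of the abstract's rule)
S = np.zeros((d, d), dtype=complex)
for k, v in zip(keys, values):
    S += np.outer(v, np.conj(k))

# Retrieval through the conjugate inner product: S @ q weights each
# stored value v_i by <k_i, q>, so querying with a stored key should
# recover its paired value up to crosstalk from the other pairs.
q = keys[2]
retrieved = S @ q
similarity = abs(np.vdot(values[2], retrieved)) / np.linalg.norm(retrieved)
print(similarity)  # near 1 when crosstalk is small
```

Because the keys are near-orthogonal random complex vectors, the crosstalk terms scale like 1/√d per stored pair, rather than degrading the whole superposition as in the vector-state case.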