On the Expressive Power of Contextual Relations in Transformers

arXiv cs.LG · March 30, 2026


Key Points

  • The paper argues that while Transformers model contextual relationships well empirically, their expressive power is not fully characterized mathematically.
  • It proposes a measure-theoretic framework where texts are probability measures in a semantic embedding space and contextual relations are represented using coupling measures.
  • The authors introduce the “Sinkhorn Transformer,” a transformer-like architecture designed for this coupling-measure setting.
  • The main contribution is a universal approximation theorem showing that continuous coupling functions between probability measures can be uniformly approximated by a Sinkhorn Transformer with suitable parameters.

Abstract

Transformer architectures have achieved remarkable empirical success in modeling contextual relationships in natural language, yet a precise mathematical characterization of their expressive power remains incomplete. In this work, we introduce a measure-theoretic framework for contextual representations in which texts are modeled as probability measures over a semantic embedding space, and contextual relations between words are represented as coupling measures between them. Within this setting, we introduce the Sinkhorn Transformer, a transformer-like architecture. Our main result is a universal approximation theorem: any continuous coupling function between probability measures (that is, any function encoding the semantic relation as a coupling measure) can be uniformly approximated by a Sinkhorn Transformer with appropriate parameters.
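To make the coupling-measure idea concrete, here is a minimal sketch of the classical Sinkhorn-Knopp iteration, which computes an entropically regularized coupling between two discrete probability measures. This is the standard algorithm the architecture's name alludes to, not the paper's Sinkhorn Transformer itself; the toy "texts," the cost matrix, and all parameter values below are illustrative assumptions.

```python
import numpy as np

def sinkhorn_coupling(a, b, cost, reg=0.5, n_iter=1000):
    """Entropic-OT coupling between discrete measures a and b (Sinkhorn-Knopp).

    Returns a matrix P >= 0 whose row sums approximate a and whose
    column sums approximate b, i.e. a coupling of the two measures.
    """
    K = np.exp(-cost / reg)                # Gibbs kernel from the cost matrix
    u = np.ones_like(a)
    for _ in range(n_iter):
        v = b / (K.T @ u)                  # rescale columns toward marginal b
        u = a / (K @ v)                    # rescale rows toward marginal a
    return u[:, None] * K * v[None, :]     # coupling P = diag(u) K diag(v)

# Two toy "texts" as uniform measures over 1-D word embeddings (illustrative).
x = np.array([[0.0], [1.0], [2.0]])        # 3 embedded words in text 1
y = np.array([[0.5], [1.5]])               # 2 embedded words in text 2
a = np.full(3, 1 / 3)                      # uniform weights on text 1
b = np.full(2, 1 / 2)                      # uniform weights on text 2
C = (x - y.T) ** 2                         # squared-distance cost, shape (3, 2)

P = sinkhorn_coupling(a, b, C)
print(P.sum(axis=1))                       # approximately a
print(P.sum(axis=0))                       # approximately b
```

Each entry `P[i, j]` can be read as how strongly word `i` of one text relates to word `j` of the other; the regularization strength `reg` controls how diffuse that relation is.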