Stream separation improves Bregman conditioning in transformers

arXiv cs.LG · March 24, 2026


Key Points

  • The paper argues that linear steering approaches for transformer representations typically assume a Euclidean geometry. In fact, softmax induces a curved Bregman geometry whose metric tensor is a Hessian, and ignoring this curvature can leak probability mass to unintended tokens.
  • It extends the analysis beyond the output layer, measuring the conditioning of this Hessian metric at intermediate layers in a controlled 2×2 experiment crossing stream separation with per-layer supervision via a vocabulary decoding loss.
  • In standard single-stream transformers, the Hessian metric at intermediate layers is highly degenerate, while stream separation improves conditioning by up to 22× in effective rank (with matched model size and vocabulary).
  • The study finds that per-layer supervision further helps, though less than stream separation, and that the cosine similarity between “primal” and “dual” concept directions predicts steering effectiveness across downstream tasks with a threshold around 0.3.
  • The results are framed as implications for the reliability of linear safety interventions, since such methods depend on the geometry being well-conditioned at the layer where steering is applied.
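To make the "metric conditioning" claim concrete, here is a minimal sketch of the quantity being measured. For softmax, the Hessian of the log-normalizer has the closed form H(λ) = Cov[γ | λ] = diag(p) − p pᵀ with p = softmax(λ); the entropy-based effective rank below (the exponential of the eigenvalue-spectrum entropy) is one standard definition and an assumption on our part, since the paper may use a different estimator:

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax."""
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def softmax_hessian(lam):
    """Hessian of the log-normalizer log sum_i exp(lam_i):
    H(lam) = Cov[gamma | lam] = diag(p) - p p^T, p = softmax(lam)."""
    p = softmax(lam)
    return np.diag(p) - np.outer(p, p)

def effective_rank(H, eps=1e-12):
    """Entropy-based effective rank: exp of the Shannon entropy of the
    normalized eigenvalue spectrum (an assumed choice of estimator)."""
    ev = np.clip(np.linalg.eigvalsh(H), 0.0, None)
    q = ev / (ev.sum() + eps)
    q = q[q > eps]
    return float(np.exp(-(q * np.log(q)).sum()))

# Uniform logits: eigenvalues are 1/n (x n-1) plus one zero mode along
# the all-ones direction, so the effective rank is ~n-1.
print(effective_rank(softmax_hessian(np.zeros(64))))   # ≈ 63

# Peaked logits: probability mass concentrates, the covariance collapses,
# and the effective rank drops well below the ambient dimension.
lam = np.zeros(64)
lam[0] = 10.0
print(effective_rank(softmax_hessian(lam)))
```

A degenerate H at some layer means Euclidean steering there moves mostly through near-null directions of the metric, which is the failure mode the paper ties to probability-mass leakage.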

Abstract

Linear methods for steering transformer representations, including probing, activation engineering, and concept erasure, implicitly assume the geometry of representation space is Euclidean. Park et al. [2026] showed that softmax induces a curved Bregman geometry whose metric tensor is the Hessian of the log-normalizer, H(λ) = Cov[γ | λ]. Ignoring this curvature causes Euclidean steering to leak probability mass to unintended tokens. Their analysis applies at the output layer. We measure this Hessian at intermediate layers in a controlled 2×2 design crossing stream separation with per-layer supervision (vocabulary decoding loss at each layer), all at matched vocabulary and parameter count. In standard single-stream transformers, H is severely degenerate at intermediate layers (effective rank 8 in 516 dimensions). Stream separation improves conditioning by up to 22× in effective rank, even without auxiliary supervision. Per-layer supervision helps, but less. The cosine similarity between primal and dual concept directions predicts per-layer steering effectiveness on downstream tasks, with a threshold near 0.3. These results bear on the reliability of linear safety interventions, which depend on the geometry being well-conditioned at the layer where they are applied.
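The primal/dual cosine diagnostic can be sketched as follows. This is our illustrative reading, not the paper's definition: we take the "dual" image of a Euclidean ("primal") direction v to be its metric transform H(λ)v, so a cosine near 1 means the Bregman metric barely bends the direction, while a small cosine flags a layer where Euclidean steering is unreliable (the paper's reported threshold is near 0.3):

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def primal_dual_cosine(v, lam):
    """Cosine between a Euclidean ('primal') direction v and its
    metric-transformed ('dual') image H(lam) v, where
    H(lam) = diag(p) - p p^T is the softmax log-normalizer Hessian.
    (Hypothetical mapping; the paper's construction may differ.)"""
    p = softmax(lam)
    H = np.diag(p) - np.outer(p, p)
    w = H @ v
    return float(v @ w / (np.linalg.norm(v) * np.linalg.norm(w) + 1e-12))

# Under uniform logits, any zero-mean direction is an eigenvector of H,
# so primal and dual directions coincide (cosine 1).
n = 16
v = np.zeros(n)
v[0], v[1] = 1.0, -1.0
print(primal_dual_cosine(v, np.zeros(n)))   # ≈ 1.0

# The all-ones direction lies in the null space of H: the dual image
# vanishes and steering along it cannot move the output distribution.
print(primal_dual_cosine(np.ones(n), np.zeros(n)))   # ≈ 0.0
```

The design rationale the abstract points at: a linear intervention is only as reliable as this alignment at the layer where it is applied, which is why the paper reports the cosine as a predictor of downstream steering effectiveness.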