Prism-$\Delta$: Differential Subspace Steering for Prompt Highlighting in Large Language Models

arXiv cs.CL / 3/12/2026

Key Points

PRISM-$\Delta$ (Projection-based Relevance-Informed Steering Method) is a steering method for prompt highlighting in LLMs. It decomposes the difference between the positive and negative cross-covariance matrices to maximize discriminative energy while removing directions shared by both.
The method assigns a continuous softplus importance weight to each attention head, allowing weak-but-useful heads to contribute at reduced strength.
It also extends naturally to Value representations, capturing content-channel signals that Key-only methods leave unused.
Empirically, PRISM-$\Delta$ matches or exceeds the best existing method on 19 of 20 configurations across four benchmarks and five models, with relative gains of up to 10.6%, while halving the fluency cost of steering.
It scales to long-context retrieval, outperforming the best existing method by up to 4.8% (relative), and is compatible with FlashAttention with negligible memory overhead.
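
The idea of decomposing a cross-covariance difference can be illustrated with a minimal NumPy sketch. It is not the paper's implementation: the choice of pairing query and key vectors, the use of a plain SVD, the synthetic data, and the helper names `cross_covariance` and `differential_subspace` are all assumptions made for illustration.

```python
import numpy as np

def cross_covariance(x, y):
    """Sample cross-covariance between row-paired samples x and y, each (n, d)."""
    xc = x - x.mean(axis=0)
    yc = y - y.mean(axis=0)
    return xc.T @ yc / (len(x) - 1)

def differential_subspace(c_pos, c_neg, rank):
    """Top-`rank` right singular vectors of (c_pos - c_neg).

    Assumption: an SVD of the difference is used here as a stand-in for the
    decomposition the summary describes. Directions that contribute equally
    to both matrices cancel in the subtraction.
    """
    delta = c_pos - c_neg
    _, singular_values, vt = np.linalg.svd(delta)
    return vt[:rank].T, singular_values[:rank]

# Synthetic data (hypothetical): one direction shared by relevant and
# irrelevant contexts, one direction present only in relevant contexts.
rng = np.random.default_rng(0)
n, d = 5000, 8
shared = np.eye(d)[0]
discriminative = np.eye(d)[1]
z_shared = rng.normal(size=(n, 1))
z_disc = rng.normal(size=(n, 1))

def noise():
    return 0.1 * rng.normal(size=(n, d))

queries = z_shared * shared + z_disc * discriminative + noise()
keys_pos = z_shared * shared + z_disc * discriminative + noise()
keys_neg = z_shared * shared + noise()

basis, energy = differential_subspace(
    cross_covariance(queries, keys_pos),
    cross_covariance(queries, keys_neg),
    rank=1,
)
align_disc = abs(basis[:, 0] @ discriminative)
align_shared = abs(basis[:, 0] @ shared)
print(f"alignment with discriminative direction: {align_disc:.3f}")
print(f"alignment with shared direction:         {align_shared:.3f}")
```

In this toy setup the recovered direction lines up with the discriminative axis and is nearly orthogonal to the shared one, which is the qualitative behavior the first key point describes.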

Abstract

Prompt highlighting steers a large language model to prioritize user-specified text spans during generation. A key challenge is extracting steering directions that capture the difference between relevant and irrelevant contexts, rather than shared structural patterns common to both. We propose PRISM-$\Delta$ (Projection-based Relevance-Informed Steering Method), which decomposes the difference between positive and negative cross-covariance matrices to maximize discriminative energy while eliminating shared directions. Each attention head receives a continuous softplus importance weight, letting weak-but-useful heads contribute at reduced strength. The framework extends naturally to Value representations, capturing content-channel signal that Key-only methods leave unused. Across four benchmarks and five models, PRISM-$\Delta$ matches or exceeds the best existing method on 19 of 20 configurations, with relative gains up to +10.6%, while halving the fluency cost of steering. PRISM-$\Delta$ also scales to long-context retrieval, outperforming the best existing method by up to +4.8% relative gain. PRISM-$\Delta$ is compatible with FlashAttention and adds negligible memory overhead.
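
The continuous head weighting and the extension to Value representations can be sketched in the same spirit. Everything specific below is an assumption, since the abstract does not give the formulas: the per-head relevance scores, the threshold, the additive projection update in the hypothetical `steer` function, and the separate Key and Value bases are illustrative choices only.

```python
import numpy as np

def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return np.logaddexp(0.0, x)

def head_weights(scores, threshold, sharpness=1.0):
    """Continuous per-head importance (assumed form: softplus of a shifted score)."""
    return softplus(sharpness * (scores - threshold)) / sharpness

def steer(states, basis, weight, strength=1.0):
    """Amplify the component of `states` that lies in span(basis).

    states: (tokens, d) Key or Value vectors of the highlighted tokens.
    basis:  (d, r) orthonormal steering directions for one head.
    """
    projected = states @ basis @ basis.T
    return states + strength * weight * projected

# Hypothetical relevance scores for three heads: strong, weak-but-useful, poor.
scores = np.array([2.0, 0.4, -1.0])
hard = (scores > 0.5).astype(float)         # binary head selection drops head 1
soft = head_weights(scores, threshold=0.5)  # head 1 keeps a reduced weight

# The same update applied to Keys and to Values, each with its own basis.
key_basis = np.eye(4)[:, :1]
value_basis = np.eye(4)[:, 1:2]
keys = np.array([[1.0, 2.0, 3.0, 4.0]])
values = np.array([[1.0, 2.0, 3.0, 4.0]])
steered_keys = steer(keys, key_basis, weight=0.5, strength=2.0)
steered_values = steer(values, value_basis, weight=0.5, strength=2.0)

print("hard weights:", hard)
print("soft weights:", np.round(soft, 3))
print("steered keys:", steered_keys)
print("steered values:", steered_values)
```

Compared with a hard cutoff, the softplus weight never reaches exactly zero, so a head that falls just below the threshold still contributes at a reduced strength rather than being discarded.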

