When Context Sticks: Studying Interference in In-Context Learning

arXiv cs.LG / April 28, 2026

💬 Opinion · Models & Research

Key Points

  • The paper studies “context stickiness” in in-context learning, where earlier prompt examples can continue to bias a transformer’s predictions for later tasks.
  • Using synthetic regression benchmarks with linear-to-quadratic task switches, the authors measure how misleading context inflates prediction error and how quickly models recover after the switch (a minimal sketch of this sweep follows the list).
  • Results show persistent interference: more misleading linear examples earlier in the prompt consistently worsen quadratic prediction quality, while additional correct quadratic examples help but with diminishing returns.
  • The study finds that the training curriculum strongly affects robustness, with sequential training on the target function class enabling the fastest recovery and random training producing the least resilient behavior.
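
To make the sweep protocol concrete, here is a minimal sketch of the kind of measurement the bullets describe. Everything in it is an assumption for illustration: the `model.predict(context, x_query)` interface, the task parameters, and the sweep ranges are not taken from the paper.

```python
# Illustrative sketch of the stickiness sweep; the model interface and
# task parameters below are assumptions, not the paper's actual setup.
import numpy as np

rng = np.random.default_rng(0)

def sample_linear(n, w=2.0, b=0.5):
    """n (x, y) pairs from a misleading linear task y = w*x + b."""
    x = rng.uniform(-1.0, 1.0, size=n)
    return np.stack([x, w * x + b], axis=1)

def sample_quadratic(n, a=1.5, c=-0.3):
    """n (x, y) pairs from the target quadratic task y = a*x^2 + c."""
    x = rng.uniform(-1.0, 1.0, size=n)
    return np.stack([x, a * x * x + c], axis=1)

def stickiness_sweep(model, max_linear=8, max_quadratic=8, n_trials=100):
    """MSE on a quadratic query as a function of k misleading linear
    examples followed by m recovery quadratic examples in the prompt."""
    errors = np.zeros((max_linear + 1, max_quadratic + 1))
    for k in range(max_linear + 1):
        for m in range(max_quadratic + 1):
            trial_errs = []
            for _ in range(n_trials):
                # Context: k linear (misleading) then m quadratic (recovery).
                context = np.concatenate(
                    [sample_linear(k), sample_quadratic(m)], axis=0)
                x_q = rng.uniform(-1.0, 1.0)
                y_true = 1.5 * x_q * x_q - 0.3
                y_pred = model.predict(context, x_q)  # assumed interface
                trial_errs.append((y_pred - y_true) ** 2)
            errors[k, m] = float(np.mean(trial_errs))
    return errors
```

If the paper's findings hold, a heat map of `errors` should grow along the k axis (more misleading context hurts) and shrink with diminishing returns along the m axis (more recovery examples help, up to a point).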

Abstract

This paper investigates context stickiness in in-context learning (ICL), a phenomenon where earlier examples in a prompt interfere with a transformer's ability to adapt to later tasks. Using synthetic regression tasks over linear and quadratic functions, we examine how models trained under sequential, mixed, and random curricula handle abrupt task switches during inference. By sweeping over structured combinations of misleading linear examples followed by recovery quadratic examples, we quantify how prior context biases prediction error and how quickly models realign. Our results show strong evidence of persistent interference: more preceding linear examples reliably degrade quadratic predictions, while additional quadratic examples reduce error but with diminishing returns. We further find that training curricula significantly modulate resilience, with sequential training on the target function class yielding the fastest recovery, and surprisingly, random training producing the least robust behavior.
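
The three curricula compared in the abstract differ only in the order in which training prompts are drawn from the two function classes. The sketch below is a plausible reconstruction of those orderings around a hypothetical `make_curriculum` helper; the actual phase lengths and mixing scheme in the paper may differ.

```python
# Hypothetical reconstruction of the sequential / mixed / random curricula;
# the phase lengths and mixing rule are assumptions, not the paper's.
import numpy as np

rng = np.random.default_rng(1)

def make_curriculum(kind, n_tasks=10_000, switch_frac=0.5):
    """Return the ordered list of task classes used to draw training prompts."""
    if kind == "sequential":
        # Train on linear tasks first, then exclusively on the target
        # quadratic class; the paper reports this recovers fastest.
        split = int(n_tasks * switch_frac)
        return ["linear"] * split + ["quadratic"] * (n_tasks - split)
    if kind == "mixed":
        # Interleave the two classes deterministically throughout training.
        return ["linear" if i % 2 == 0 else "quadratic"
                for i in range(n_tasks)]
    if kind == "random":
        # Sample the class independently at every step; the paper reports
        # this yields the least robust behavior under task switches.
        return list(rng.choice(["linear", "quadratic"], size=n_tasks))
    raise ValueError(f"unknown curriculum kind: {kind!r}")
```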