When Prompts Interact: Assessing Prompt Arithmetic for Deconfounding under Distribution Shift

arXiv cs.LG / 5/6/2026


Key Points

  • Classification models can exploit confounding (spurious) features that look good in-distribution but cause large drops under distribution shift.
  • “Task arithmetic” can reduce unwanted signals by subtracting secondary updates, but doing this typically needs full fine-tuning and is computationally costly.
  • The paper studies whether applying task arithmetic at the prompt level—using parameter-efficient soft prompt tuning—can similarly reduce reliance on spurious features.
  • The authors introduce Hybrid Prompt Arithmetic (HyPA), which combines task prompts with linearized confounder prompts, and show it improves the robustness–performance trade-off across multiple benchmarks under distribution shift.
  • Additional analysis suggests HyPA mitigates confounding by either reducing the impact of confounder signals on predictions or suppressing them within hidden representations.

Abstract

In classification tasks, models may rely on confounding variables to achieve strong in-distribution performance, capturing spurious features that fail under distribution shift. This shortcut behavior leads to substantial degradation in out-of-distribution settings. Task arithmetic offers a potential remedy by removing unwanted signals through subtraction of secondary model updates, but it typically requires full fine-tuning, which is computationally expensive. Prompt tuning provides a parameter-efficient alternative that adapts models through a small set of trainable virtual tokens. Performing task arithmetic on the resulting prompts is an appealing alternative to operating on entire models, but the extent to which it can limit reliance on spurious features remains to be established. In this work, we study whether composing soft prompts through task arithmetic improves robustness to confounding shifts. We propose Hybrid Prompt Arithmetic (HyPA), which combines task prompts with linearized confounder prompts to counteract spurious correlations. Across multiple benchmarks, HyPA consistently improves the robustness–performance trade-off relative to prompt-arithmetic baselines under distribution shift. We further analyze how HyPA affects hidden representations and find evidence consistent with it mitigating confounding either by reducing the influence of confounder signals on predictions or by suppressing them in the representation. These results establish HyPA as a promising, parameter-efficient approach for improving robustness under confounding shifts in the evaluated setting.
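The summary does not spell out HyPA's exact combination rule. A minimal NumPy sketch of the underlying prompt-arithmetic idea — subtracting a confounder prompt's update (its delta from a shared initialization) from a task prompt — might look like the following. The function name `prompt_negation`, the scaling coefficient `alpha`, and the specific subtraction rule are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def prompt_negation(p_task, p_conf, p_init, alpha=1.0):
    """Illustrative prompt arithmetic: remove the confounder prompt's
    update (its delta from the shared initialization) from the task
    prompt, scaled by alpha. Each prompt is an (n_tokens, d) array of
    soft-prompt embeddings."""
    return p_task - alpha * (p_conf - p_init)

# Toy soft prompts: 8 virtual tokens, embedding dimension 16.
rng = np.random.default_rng(0)
n_tokens, d = 8, 16
p_init = rng.normal(size=(n_tokens, d))                      # shared init
p_task = p_init + 0.1 * rng.normal(size=(n_tokens, d))       # tuned on the task
p_conf = p_init + 0.1 * rng.normal(size=(n_tokens, d))       # tuned on the confounder

p_hybrid = prompt_negation(p_task, p_conf, p_init, alpha=0.5)
```

With `alpha=0` the task prompt is returned unchanged, and larger values subtract more of the confounder direction — mirroring the robustness–performance trade-off the paper reports, though the actual HyPA linearization may differ.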