Where Do Prompt Perturbations Break Generation? A Segment-Level View of Robustness in LoRA-Tuned Language Models

arXiv cs.CL / 5/5/2026

📰 News · Models & Research

Key Points

  • The paper argues that common prompt-robustness methods, which enforce whole-sequence consistency, can miss a key failure mode: outputs that look globally similar to the clean response while drifting on a critical entity, relation, or conclusion.
  • It proposes S^2R^2, a segment-level robustness framework for LoRA fine-tuning that decomposes clean and perturbed generations into semantic segments and aligns them with an optimal-transport objective.
  • S^2R^2 penalizes only the segments with the largest meaning drift, and adds an adapter-stability regularizer that connects the output-side objective to model adaptation, using LoRA norm control as a tractable proxy.
  • The authors give a PAC-Bayesian argument that limiting adapter growth can improve transfer beyond the perturbations seen during training.
  • Experiments on summarization benchmarks show that S^2R^2 improves robustness to typographical noise, deletion, synonym replacement, and paraphrasing, while preserving competitive clean performance and improving cross-dataset transfer over consistency-based baselines.
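To make the segment-level idea concrete, here is a minimal numpy sketch of the alignment step as described in the key points, not the authors' actual implementation: embed each semantic segment of the clean and perturbed outputs, align the two segment sets with entropic optimal transport (a standard Sinkhorn solver stands in for whatever OT variant the paper uses), and penalize only the top-k most-drifted segments. All function names, the cosine-distance cost, and the `reg`/`top_k` values are illustrative assumptions.

```python
import numpy as np

def sinkhorn_plan(cost, reg=0.1, n_iters=200):
    """Entropic OT plan between two uniform segment distributions.

    cost: (n, m) matrix of pairwise segment distances.
    Returns a transport plan whose rows sum to 1/n and columns to 1/m.
    """
    n, m = cost.shape
    a, b = np.full(n, 1.0 / n), np.full(m, 1.0 / m)
    K = np.exp(-cost / reg)          # Gibbs kernel
    u = np.ones(n)
    for _ in range(n_iters):          # alternating marginal scaling
        v = b / (K.T @ u)
        u = a / (K @ v)
    return u[:, None] * K * v[None, :]

def segment_drift_loss(clean_segs, pert_segs, top_k=2):
    """Top-k segment drift between clean and perturbed generations.

    clean_segs: (n, d) embeddings of the clean output's segments.
    pert_segs:  (m, d) embeddings of the perturbed output's segments.
    """
    # Cosine distance as a per-segment-pair measure of meaning drift.
    cn = clean_segs / np.linalg.norm(clean_segs, axis=1, keepdims=True)
    pn = pert_segs / np.linalg.norm(pert_segs, axis=1, keepdims=True)
    cost = 1.0 - cn @ pn.T
    plan = sinkhorn_plan(cost)
    # Transport-weighted drift attributed to each clean segment
    # (rescaled by n so each row is a weighted average, not a 1/n share).
    per_segment = (plan * cost).sum(axis=1) * cost.shape[0]
    # Penalize only the k segments that drifted the most.
    worst = np.sort(per_segment)[-top_k:]
    return worst.mean()
```

In a fine-tuning loop, this loss would be added to the task loss, so gradients flow only through the few segments whose meaning moved the most, rather than diluting the signal across an entire globally-similar sequence.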

Abstract

Large language models are sensitive to minor prompt perturbations, yet existing robustness methods usually enforce consistency at the whole-sequence level. This holistic view can hide an important failure mode: a perturbed response may remain globally similar to the clean one while drifting on a critical entity, relation, or conclusion. We introduce S^2R^2, a segment-level framework for robust LoRA fine-tuning. S^2R^2 decomposes clean and perturbed generations into semantic segments, aligns them with an optimal-transport objective, and penalises the segments with the largest meaning drift. To connect this output-side objective with model adaptation, we add an adapter-stability regulariser motivated by segment-level attention reallocation, using LoRA norm control as a tractable proxy for limiting perturbation-amplified evidence shifts. A PAC-Bayesian complexity view further explains why controlling adapter growth may support transfer beyond observed perturbations. Experiments on summarisation benchmarks show that S^2R^2 improves robustness under typographical noise, deletion, synonym replacement, and paraphrasing, while maintaining competitive clean performance and stronger cross-dataset transfer than consistency-based baselines.
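The adapter-stability regularizer described in the abstract reduces, as a tractable proxy, to controlling the norm of the LoRA update. A minimal numpy sketch of that proxy, assuming adapters are stored as (A, B) factor pairs with adapted weight W = W0 + B @ A; the function name and the `lam` hyperparameter are illustrative, not from the paper:

```python
import numpy as np

def adapter_stability_penalty(adapters, lam=1e-3):
    """Frobenius-norm proxy for adapter growth.

    adapters: list of (A, B) LoRA factor pairs, where A is (r, d_in) and
    B is (d_out, r), so B @ A is the full-rank-d update to the base weight.
    Penalizing ||B @ A||_F keeps the tuned model close to the base model,
    the norm-control idea the paper motivates via a PAC-Bayesian view.
    """
    return lam * sum(np.linalg.norm(B @ A, ord="fro") for A, B in adapters)
```

The total training objective would then combine the task loss, the segment-level drift penalty, and this term, trading off clean performance against both output-side and parameter-side stability.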