A Layer-wise Analysis of Supervised Fine-Tuning

arXiv cs.AI / 4/15/2026


Key Points

  • The paper studies how Supervised Fine-Tuning (SFT) gives rise to instruction-following behavior, and where the accompanying risk of catastrophic forgetting arises, by analyzing layer-level mechanisms across model scales from 1B to 32B parameters.
  • Experiments find a depth-dependent stability pattern: middle layers (about 20%–80%) remain stable, while final layers are significantly more sensitive to tuning.
  • Based on this, the authors propose Mid-Block Efficient Tuning, which selectively updates only the critical intermediate layers rather than applying uniform adaptation across the network.
  • The proposed method outperforms standard LoRA, including up to a 10.2% improvement on GSM8K for OLMo2-7B, with lower parameter overhead.
  • The authors report that alignment effects are more architecturally localized than fully distributed, and they provide public code for reproducibility.
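The selective-update idea above can be sketched as a simple layer-masking rule: freeze every transformer block outside the middle 20%–80% depth band and train only the blocks inside it. The helper below is a hypothetical illustration of that rule; the function names and the exact band fractions are assumptions, not the authors' implementation:

```python
def mid_block_indices(num_layers: int, lo: float = 0.2, hi: float = 0.8) -> list[int]:
    """Indices of the 'middle' transformer blocks kept trainable."""
    start = int(num_layers * lo)  # first block inside the band
    end = int(num_layers * hi)    # first block past the band
    return list(range(start, end))

def trainable_mask(num_layers: int) -> list[bool]:
    """True for blocks that receive gradient updates, False for frozen ones."""
    keep = set(mid_block_indices(num_layers))
    return [i in keep for i in range(num_layers)]

# For a 32-layer model, blocks 6 through 24 would stay trainable.
print(mid_block_indices(32)[0], mid_block_indices(32)[-1])  # → 6 24

# With a Hugging Face-style model, the mask would translate roughly to
# (the attribute path `model.model.layers` is an assumption):
#   for i, block in enumerate(model.model.layers):
#       block.requires_grad_(i in keep)
```

Compared with LoRA, which attaches low-rank adapters uniformly across depth, this approach concentrates all trainable parameters in the band the paper identifies as critical.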

Abstract

While critical for alignment, Supervised Fine-Tuning (SFT) incurs the risk of catastrophic forgetting, yet the layer-wise emergence of instruction-following capabilities remains elusive. We investigate this mechanism via a comprehensive analysis utilizing information-theoretic, geometric, and optimization metrics across model scales (1B–32B). Our experiments reveal a distinct depth-dependent pattern: middle layers (20%–80%) are stable, whereas final layers exhibit high sensitivity. Leveraging this insight, we propose Mid-Block Efficient Tuning, which selectively updates these critical intermediate layers. Empirically, our method outperforms standard LoRA by up to 10.2% on GSM8K (OLMo2-7B) with reduced parameter overhead, demonstrating that effective alignment is architecturally localized rather than distributed. The code is publicly available at https://anonymous.4open.science/r/base_sft.