A Ridge Too Far: Correcting Over-Shrinkage via Negative Regularization

arXiv stat.ML / 4/21/2026



Key Points

Conventional (positive) regularization is meant to reduce variance, but for small-data regression it can worsen underfitting when the useful predictive signal lies in weak directions of a restricted representation.
The paper analyzes a “negative-capable” ridge regression family that allows a feasible negative regularization region while keeping the estimator well-posed.
Within that negative region, negative regularization functions as controlled anti-shrinkage by increasing effective model complexity most strongly along weak eigen-directions.
The authors formalize weak-spectrum underfitting, prove a sign-switch phenomenon under conservative baseline shrinkage, and propose a criterion-based method to automatically select regularization across the full negative-capable family.
Experiments on synthetic and semi-synthetic data validate key theoretical claims, including feasibility of the negative region, spectral complexity growth, sign-switch behavior, and recovery of negative adjustments when appropriate.
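
The anti-shrinkage mechanism in the points above can be illustrated with a minimal sketch. It assumes the plain ridge estimator (X^T X + lam * I)^{-1} X^T y, which may not match the paper's exact negative-capable family. The function name `ridge_spectral_summary` and the toy design matrix are hypothetical and chosen only for illustration. Under that assumption, the estimator is well posed for any lam above the negative of the smallest Gram eigenvalue, and the fit along eigen-direction i is scaled by d_i / (d_i + lam).

```python
import numpy as np

def ridge_spectral_summary(X, lam):
    """Shrinkage factors and effective degrees of freedom for plain ridge.

    Assumes the estimator (X^T X + lam * I)^{-1} X^T y; the paper's
    negative-capable family may be parametrized differently.
    """
    # Eigenvalues of the Gram matrix, sorted ascending (weakest direction first).
    d = np.linalg.eigvalsh(X.T @ X)
    # X^T X + lam * I stays positive definite only if lam > -min eigenvalue.
    lam_floor = -d[0]
    if lam <= lam_floor:
        raise ValueError(f"lam={lam} is infeasible; need lam > {lam_floor}")
    # Fitted values along eigen-direction i are scaled by d_i / (d_i + lam).
    factors = d / (d + lam)
    # Effective degrees of freedom: the trace of the hat matrix.
    return d, factors, factors.sum()

# Toy design with one weak direction (eigenvalue 1) and one strong (eigenvalue 9).
X = np.array([[3.0, 0.0], [0.0, 1.0], [0.0, 0.0]])
for lam in (0.5, 0.0, -0.5):
    d, factors, df = ridge_spectral_summary(X, lam)
    print(lam, factors, df)
```

In this toy case, lam = -0.5 scales the weak direction by a factor of 2 and the strong direction by only about 1.06, so the added complexity concentrates on the weak direction. Any lam at or below -1 is rejected as infeasible.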

Abstract

Conventional regularization is designed to control variance, but in small-data regression it can also aggravate underfitting when predictive signal is concentrated in weak directions of a restricted representation. We study a negative-capable ridge family that permits a feasible negative region whenever the estimator remains well posed, and show that negative regularization acts there as controlled anti-shrinkage by increasing effective complexity most strongly along weak eigendirections. Building on this mechanism, we formalize weak-spectrum underfitting, derive a sign-switch result under conservative baseline shrinkage, and study criterion-based automatic selection over the full negative-capable family. Synthetic and semi-synthetic experiments support the theory by verifying feasibility, spectral complexity increase, sign-switch behavior, and effective recovery of negative adjustments in the predicted regimes.
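
One way to picture criterion-based selection over a range that includes negative values is a grid search. The sketch below uses generalized cross-validation (GCV) as a stand-in criterion, because the abstract does not say which criterion the authors use. It again assumes plain ridge with a full-column-rank design; the function name `gcv_select` and the simulated data are hypothetical.

```python
import numpy as np

def gcv_select(X, y, lams):
    """Pick lam from a grid (negative values allowed) by minimizing GCV.

    GCV is an assumed stand-in criterion, not necessarily the paper's.
    Requires X to have full column rank with n >= p.
    """
    n = X.shape[0]
    U, s, _ = np.linalg.svd(X, full_matrices=False)
    d = s ** 2                      # eigenvalues of X^T X
    Uty = U.T @ y
    best_lam, best_score = None, np.inf
    for lam in lams:
        if lam <= -d.min():         # outside the feasible region
            continue
        f = d / (d + lam)           # per-direction shrinkage factors
        df = f.sum()                # effective degrees of freedom
        if df >= n:                 # GCV denominator would be non-positive
            continue
        resid = y - U @ (f * Uty)   # y minus hat-matrix fit
        score = n * np.sum(resid ** 2) / (n - df) ** 2
        if score < best_score:
            best_lam, best_score = lam, score
    if best_lam is None:
        raise ValueError("no feasible lam in the grid")
    return best_lam, best_score

# Simulated small-sample regression; the grid starts inside the negative region.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=30)
d_min = np.linalg.svd(X, compute_uv=False).min() ** 2
lams = np.linspace(-0.9 * d_min, 5.0, 50)
print(gcv_select(X, y, lams))
```

The grid's lower end is set to 90% of the feasibility bound so that every candidate keeps X^T X + lam * I positive definite. Whether the selected value is negative depends on the data.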

