Robust Regression with Adaptive Contamination in Response: Optimal Rates and Computational Barriers

arXiv stat.ML / 4/7/2026


Key Points

The paper studies robust regression when covariates are clean but responses can be adaptively corrupted, contrasting this with Huber’s classical contamination model.
It shows that clean covariate information enables strictly improved statistical estimation rates over the Huber setting and, unlike Huber contamination, can yield consistency even when the contamination fraction is a non-vanishing constant.
The authors prove a matching minimax lower bound using Fano’s inequality and contamination process constructions that generalize earlier two-point arguments to handle multiple distributions.
Even though the information-theoretic rate improves over Huber’s model, the paper establishes strong information–computation gaps via Statistical Query and Low-Degree Polynomial lower bounds, implying polynomial-time algorithms may not achieve the optimal information-theoretic performance.
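For readers less familiar with Fano-type arguments, the generic textbook form of the bound is sketched below. This is not the paper's exact statement: the symbols (the hypotheses, the separation \delta, the index J, the data Z, the loss d) are illustrative assumptions. In this generic form the right-hand side is non-positive when m = 2, so the bound only becomes informative with more than two hypotheses.

```latex
% Generic Fano bound (textbook form; an illustration, not the paper's theorem).
% \theta_1,\dots,\theta_m : hypotheses with d(\theta_j,\theta_k) \ge 2\delta for all j \ne k
% J : uniform on \{1,\dots,m\};  Z : observed data drawn from P_{\theta_J}
% I(J;Z) : mutual information between the hypothesis index and the data
\inf_{\hat\theta}\ \max_{1 \le j \le m}\ \mathbb{E}_{\theta_j}\, d(\hat\theta, \theta_j)
  \;\ge\; \delta \left( 1 - \frac{I(J; Z) + \log 2}{\log m} \right)
```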

Abstract

We study robust regression under a contamination model in which covariates are clean while the responses may be corrupted in an adaptive manner. Unlike Huber's classical contamination model, where both covariates and responses may be contaminated and consistent estimation is impossible when the contamination proportion is a non-vanishing constant, it turns out that the clean-covariate setting admits strictly improved statistical guarantees. Specifically, we show that the additional information in the clean covariates can be carefully exploited to construct an estimator that achieves a better estimation rate than that attainable under Huber contamination. In contrast to the Huber model, this improved rate implies consistency even when the contamination proportion is a constant. A matching minimax lower bound is established using Fano's inequality together with the construction of contamination processes that match $m > 2$ distributions simultaneously, extending the previous two-point lower bound argument in Huber's setting. Despite the improvement over the Huber model from an information-theoretic perspective, we provide formal evidence -- in the form of Statistical Query and Low-Degree Polynomial lower bounds -- that the problem exhibits strong information-computation gaps. Our results strongly suggest that the information-theoretic improvements cannot be achieved by polynomial-time algorithms, revealing a fundamental gap between information-theoretic and computational limits in robust regression with clean covariates.
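To make the clean-covariate setting concrete, here is a minimal simulation sketch. Everything in it is an assumption chosen for illustration and is not taken from the paper: the Gaussian design, the linear model with unit noise, the particular adversary (which inspects the covariates and rewrites the responses of the highest-signal rows), and the function names `simulate` and `ols`. It shows only that ordinary least squares, used here as a naive baseline, degrades badly under adaptive response corruption. It does not implement the paper's estimator.

```python
import numpy as np


def simulate(n=2000, d=5, eps=0.1, seed=0):
    """Draw clean covariates, then let an adversary corrupt an eps-fraction of responses.

    Hypothetical setup for illustration only (not the paper's construction).
    """
    rng = np.random.default_rng(seed)
    beta = np.ones(d)                       # true regression coefficient (assumed)
    X = rng.standard_normal((n, d))         # covariates are never modified
    y_clean = X @ beta + rng.standard_normal(n)

    # Adaptive adversary: it sees X and targets the rows with the largest
    # |x^T beta|, replacing their responses with values generated by -beta.
    k = int(eps * n)
    idx = np.argsort(-np.abs(X @ beta))[:k]
    y = y_clean.copy()
    y[idx] = X[idx] @ (-beta)
    return X, y_clean, y, beta


def ols(X, y):
    """Ordinary least squares via numpy's least-squares solver."""
    return np.linalg.lstsq(X, y, rcond=None)[0]


X, y_clean, y_corrupt, beta = simulate()
err_clean = np.linalg.norm(ols(X, y_clean) - beta)
err_corrupt = np.linalg.norm(ols(X, y_corrupt) - beta)
print(f"OLS error, clean responses:     {err_clean:.3f}")
print(f"OLS error, corrupted responses: {err_corrupt:.3f}")
```

Only the responses change between the two fits; the covariate matrix `X` is identical in both, which is the structural feature the abstract says a well-designed estimator can exploit.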
