Perturb and Correct: Post-Hoc Ensembles using Affine Redundancy

arXiv cs.LG / 5/5/2026


Key Points

  • Perturb-and-Correct (P&C) is proposed as a post-hoc ensemble technique that builds epistemically diverse predictors from a single pretrained neural network.
  • It injects random perturbations into hidden layers and then applies a least-squares correction to the following affine layer, so that predictors stay consistent on calibration data but remain free to differ off-distribution (see the sketch after this list).
  • The paper explains why the method works by analyzing the post-correction residual and its first-order sensitivity: near the calibration distribution the residual is controlled by a leverage term, while the corrected sensitivity grows as inputs move away from the calibration geometry.
  • Experiments show P&C improves the ID/OOD (in-distribution vs out-of-distribution) tradeoff on MuJoCo dynamics prediction and CIFAR-10 OOD detection, matching or exceeding common post-hoc baselines while using only one pretrained model.
  • The results suggest that the overparameterization of deep models can be exploited as a practical asset for uncertainty estimation and robustness under distribution shift.
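
The construction in the second bullet is easy to sketch. The snippet below assumes the pretrained network factors into a feature map `phi` followed by a final affine layer `(W, b)`, and it replaces the paper's hidden-layer weight perturbations with a random linear perturbation of the features; both are simplifying assumptions of this sketch, not the authors' exact procedure.

```python
import numpy as np

def perturb_and_correct_member(phi, W, b, X_cal, noise_scale=0.05, rng=None):
    """Build one P&C ensemble member (sketch).

    phi   : callable mapping inputs to hidden features of shape (n, d)
    W, b  : final affine layer of the pretrained net, shapes (d, k) and (k,)
    X_cal : calibration inputs
    """
    rng = np.random.default_rng() if rng is None else rng

    # Targets are the pretrained model's own outputs on the calibration set.
    H = phi(X_cal)                # (n, d)
    Y = H @ W + b                 # (n, k)

    # Perturb the hidden representation (a feature-space stand-in for the
    # paper's hidden-layer perturbations).
    d = H.shape[1]
    P = np.eye(d) + noise_scale * rng.standard_normal((d, d))
    phi_pert = lambda X: phi(X) @ P

    # Least-squares correction of the subsequent affine layer so the member
    # reproduces Y on the calibration data as closely as possible.
    H_pert = phi_pert(X_cal)
    A = np.hstack([H_pert, np.ones((len(H_pert), 1))])
    sol, *_ = np.linalg.lstsq(A, Y, rcond=None)
    W_c, b_c = sol[:-1], sol[-1]

    # The corrected member agrees with the original near the calibration data
    # (up to the residual) but can disagree once the perturbed features extrapolate.
    return lambda X: phi_pert(X) @ W_c + b_c
```

Repeating the construction with independent random seeds yields an ensemble; the spread of its predictions off-distribution is the kind of epistemic-uncertainty signal that OOD detection can be built on.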

Abstract

Models that are indistinguishable on in-distribution data can behave very differently under distribution shift. We introduce Perturb-and-Correct (P&C), a post-hoc method for constructing epistemically diverse predictors from a single pretrained network. P&C applies random hidden layer perturbations with a least-squares correction in the subsequent affine layer, producing predictors that agree on calibration data while remaining free to disagree away from it. We analyze this mechanism through the post-correction residual and its first-order sensitivity: the residual is controlled near the calibration distribution by a leverage term, while corrected sensitivity grows as inputs deviate from the calibration geometry. Empirically, P&C achieves a strong ID/OOD tradeoff across MuJoCo dynamics prediction and CIFAR-10 OOD detection, matching or outperforming standard post-hoc baselines while requiring only a single pretrained model. Our findings highlight the potential of further exploiting overparameterization as a strength of deep learning models.
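
One natural reading of the abstract's "leverage term" is the standard least-squares hat-matrix quantity computed on the corrected layer's design matrix; whether this matches the paper's exact definition is an assumption of the sketch below, as are the illustrative names `phi_pert`, `X_cal`, and `X_new`.

```python
import numpy as np

def leverage(phi_pert, X_cal, X_new):
    """Hat-matrix leverage h(x)^T (A^T A)^+ h(x) of new inputs relative to the
    calibration design A = [phi_pert(X_cal), 1]. Low leverage means an input is
    well covered by the calibration geometry, so the corrected members are
    pinned down there; high leverage means extrapolation, where they are free
    to disagree. (Illustrative sketch, not necessarily the paper's exact quantity.)"""
    A = np.hstack([phi_pert(X_cal), np.ones((len(X_cal), 1))])
    G_pinv = np.linalg.pinv(A.T @ A)     # pseudo-inverse tolerates rank deficiency
    H_new = np.hstack([phi_pert(X_new), np.ones((len(X_new), 1))])
    return np.einsum('nd,de,ne->n', H_new, G_pinv, H_new)
```

The pseudo-inverse is used so the sketch still works when the perturbed calibration features are rank-deficient, which overparameterized feature maps often are.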