Neural Network Models for Contextual Regression

arXiv stat.ML / 3/26/2026


Key Points

  • The paper introduces SCtxtNN, a simple contextual neural network that performs contextual regression by using context features to select an “active” submodel for prediction.
  • It separates context identification from context-specific regression, yielding a structured and more interpretable architecture with fewer parameters than a standard fully connected feed-forward network.
  • The authors prove mathematically that the architecture can represent contextual linear regression models using only common neural-network components.
  • Experiments show SCtxtNN achieves lower excess mean squared error and more stable performance than feed-forward networks with similar parameter counts, while larger feed-forward networks gain accuracy only at the cost of increased complexity.
  • The work argues that explicitly modeling contextual structure can improve efficiency without sacrificing interpretability.
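
The summary does not give the paper's exact architecture or training algorithm, but the idea of using context features to select an "active" linear submodel can be sketched as a gated collection of linear regressors. Everything below (class name, shapes, the softmax gate) is an illustrative assumption, not the authors' implementation:

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax along the last axis."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

class GatedContextualRegressor:
    """Hypothetical SCtxtNN-style sketch: a gate over context features
    softly selects among K linear submodels for the prediction."""

    def __init__(self, d_ctx, d_x, k, seed=0):
        rng = np.random.default_rng(seed)
        self.Wg = rng.normal(scale=0.1, size=(d_ctx, k))  # gate weights
        self.W = rng.normal(scale=0.1, size=(k, d_x))     # submodel slopes
        self.b = np.zeros(k)                              # submodel intercepts

    def predict(self, c, x):
        g = softmax(c @ self.Wg)          # (n, k) gate: which submodel is "active"
        preds = x @ self.W.T + self.b     # (n, k) every submodel's linear output
        return (g * preds).sum(axis=1)    # (n,)  gate-weighted prediction
```

With a hard (argmax) gate this reduces to picking one active submodel per context, which matches the separation of context identification from context-specific regression described above; the soft gate keeps the model differentiable for gradient-based fitting.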

Abstract

We propose a neural network model for contextual regression, in which contextual features determine the active regression submodel, together with an algorithm to fit the model. The proposed simple contextual neural network (SCtxtNN) separates context identification from context-specific regression, resulting in a structured and interpretable architecture with fewer parameters than a fully connected feed-forward network. We show mathematically that the proposed architecture is sufficient to represent contextual linear regression models using only standard neural network components. Numerical experiments support the theoretical result, showing that the proposed model achieves lower excess mean squared error and more stable performance than feed-forward neural networks with comparable numbers of parameters, while larger networks improve accuracy only at the cost of increased complexity. The results suggest that incorporating contextual structure can improve model efficiency while preserving interpretability.