Neural Generalized Mixed-Effects Models

arXiv stat.ML, April 14, 2026

Key Points

  • The paper proposes Neural Generalized Mixed-Effects Models (NGMMs), which generalize generalized linear mixed-effects models (GLMMs) by replacing the linear predictor for the natural parameter with neural networks (a rough sketch follows this list).
  • Because exact marginalization over the random effects is typically intractable, the authors introduce an efficient optimization procedure that maximizes an approximate marginal likelihood and remains differentiable with respect to the network parameters.
  • The authors analyze the approximation error and show it decreases at a Gaussian-tail rate controlled by a user-chosen parameter.
  • Experiments on synthetic data and multiple real-world datasets indicate NGMM can outperform GLMMs and prior methods when covariate–response relationships are nonlinear.
  • The study also demonstrates an extension of NGMM to more complex latent-variable modeling using a large student proficiency dataset.
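
As a rough, non-authoritative illustration of the model structure described above, here is a minimal PyTorch sketch of an NGMM with Gaussian random intercepts and a Poisson response. The architecture, the Poisson choice, and all names (`NGMM`, `natural_param`, `f`) are assumptions made for illustration, not the authors' implementation.

```python
import torch
import torch.nn as nn


class NGMM(nn.Module):
    """Minimal sketch: a neural generalized mixed-effects model.

    The natural parameter of an exponential-family response (Poisson here,
    so the natural parameter is the log-rate) is a neural network applied to
    the covariates plus a group-specific Gaussian random intercept.
    Sizes and architecture are illustrative only, not the paper's.
    """

    def __init__(self, d_in: int, n_groups: int, hidden: int = 64):
        super().__init__()
        # f_theta(x): replaces the linear predictor x^T beta of a GLMM.
        self.f = nn.Sequential(
            nn.Linear(d_in, hidden), nn.ReLU(), nn.Linear(hidden, 1)
        )
        # Log standard deviation of the random-intercept distribution,
        # learned jointly with the network parameters.
        self.log_sigma_u = nn.Parameter(torch.zeros(1))
        self.n_groups = n_groups

    def natural_param(self, x: torch.Tensor, u: torch.Tensor,
                      group: torch.Tensor) -> torch.Tensor:
        # eta_ij = f_theta(x_ij) + u_{group(ij)}
        return self.f(x).squeeze(-1) + u[group]
```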

Abstract

Generalized linear mixed-effects models (GLMMs) are widely used to analyze grouped and hierarchical data. In a GLMM, each response is assumed to follow an exponential-family distribution where the natural parameter is given by a linear function of observed covariates and a latent group-specific random effect. Since exact marginalization over the random effects is typically intractable, model parameters are estimated by maximizing an approximate marginal likelihood. In this paper, we replace the linear function with neural networks. The result is a more flexible model, the neural generalized mixed-effects model (NGMM), which captures complex relationships between covariates and responses. To fit NGMM to data, we introduce an efficient optimization procedure that maximizes the approximate marginal likelihood and is differentiable with respect to network parameters. We show that the approximation error of our objective decays at a Gaussian-tail rate in a user-chosen parameter. On synthetic data, NGMM improves over GLMMs when covariate-response relationships are nonlinear, and on real-world datasets it outperforms prior methods. Finally, we analyze a large dataset of student proficiency to demonstrate how NGMM can be extended to more complex latent-variable models.
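
The abstract does not detail the authors' approximation, so the following is only a hedged sketch of one standard way to obtain a differentiable approximate marginal likelihood: marginalize the random intercepts by Monte Carlo sampling from their Gaussian distribution and combine the draws with a per-group log-sum-exp. The sample count `S` plays the role of a user-chosen accuracy knob here, but the paper's Gaussian-tail error guarantee refers to its own procedure, not necessarily to this one.

```python
def approx_marginal_nll(model: NGMM, x: torch.Tensor, y: torch.Tensor,
                        group: torch.Tensor, S: int = 64) -> torch.Tensor:
    """Monte Carlo sketch of the negative approximate marginal log-likelihood.

    For each of S draws, sample every group's random intercept from
    N(0, sigma_u^2), evaluate the Poisson log-likelihood of the responses,
    sum it within groups, and average the per-group likelihoods with a
    log-sum-exp over draws. Differentiable in the network parameters via
    autograd; this is a generic sketch, not the paper's specific procedure.
    """
    sigma_u = model.log_sigma_u.exp()
    per_draw = []
    for _ in range(S):
        u = sigma_u * torch.randn(model.n_groups)       # u_g ~ N(0, sigma_u^2)
        eta = model.natural_param(x, u, group)           # natural parameter
        # Poisson log-pmf: y * eta - exp(eta) - log(y!)
        ll = y * eta - eta.exp() - torch.lgamma(y + 1.0)
        # Marginalization is per group, so sum the log-likelihood within groups.
        per_draw.append(torch.zeros(model.n_groups).index_add_(0, group, ll))
    stacked = torch.stack(per_draw)                      # shape (S, n_groups)
    # log( (1/S) * sum_s p(y_g | u_g^(s)) ) for each group g
    log_marginal = torch.logsumexp(stacked, dim=0) - torch.log(torch.tensor(float(S)))
    return -log_marginal.sum()
```

Training would then be an ordinary gradient loop, e.g. `loss = approx_marginal_nll(model, x, y, group); loss.backward(); optimizer.step()`, where `x` holds the covariates, `y` is a float tensor of counts, and `group` is a `LongTensor` of group indices.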