Age-Dependent Heterogeneity in the Association Between Physical Activity and Mental Distress: A Causal Machine Learning Analysis of 3.2 Million U.S. Adults

arXiv cs.LG / 4/22/2026

Key Points

  • The study analyzes 3.24 million U.S. adults (2015–2024) to examine whether leisure-time physical activity’s protective effect against frequent mental distress varies by age.
  • Survey-weighted logistic regression finds a clear age gradient: physical activity is more strongly associated with lower odds of frequent mental distress at older ages, with the adjusted odds ratio falling from 0.89 (ages 18–24) to 0.50 (ages 55–64).
  • Over time, the beneficial association for young adults appears to be weakening: the 18–24 odds ratio reached 1.01, effectively null, in both 2018 and 2024, paralleling a worsening youth mental health crisis.
  • Causal Forest via Double Machine Learning identifies age as the dominant source of treatment-effect heterogeneity (feature importance 0.39, 2.5x the next predictor), and multiple robustness checks (E-value, overlap, placebo, and imputation sensitivity) support the findings.
  • The results imply that exercise-based interventions may not generalize well to the youngest adults, whose mental distress may increasingly be driven by stressors that physical activity alone cannot address.
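
To make the gap between the two reported odds ratios concrete, the sketch below converts an odds ratio into an implied probability of frequent mental distress. The baseline prevalence of 15% among inactive adults is a hypothetical value chosen for illustration, not a figure from the paper, and the function name `apply_odds_ratio` is likewise an assumption.

```python
def apply_odds_ratio(baseline_prob: float, odds_ratio: float) -> float:
    """Return the probability obtained by scaling the baseline odds by odds_ratio."""
    if not 0.0 < baseline_prob < 1.0:
        raise ValueError("baseline_prob must be strictly between 0 and 1")
    if odds_ratio <= 0.0:
        raise ValueError("odds_ratio must be positive")
    baseline_odds = baseline_prob / (1.0 - baseline_prob)
    new_odds = baseline_odds * odds_ratio
    return new_odds / (1.0 + new_odds)


# HYPOTHETICAL baseline prevalence among inactive adults (not from the paper).
BASELINE = 0.15

# Odds ratios as reported in the summary above.
for age_group, reported_or in [("18-24", 0.89), ("55-64", 0.50)]:
    implied = apply_odds_ratio(BASELINE, reported_or)
    print(
        f"ages {age_group}: OR={reported_or:.2f} -> implied prevalence "
        f"{implied:.3f} (absolute difference {BASELINE - implied:.3f})"
    )
```

Under that assumed baseline, an odds ratio near 0.89 moves the implied prevalence only slightly, while 0.50 cuts it by close to half. The real absolute differences depend on each age group's actual baseline, which this summary does not report.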

Abstract

Physical activity (PA) is widely recognized as protective against mental distress, yet whether this benefit varies systematically across population subgroups remains poorly understood. Using pooled data from ten consecutive annual waves of the U.S. Behavioral Risk Factor Surveillance System (2015-2024; n = 3,242,218), we investigate heterogeneity in the association between leisure-time PA and frequent mental distress (FMD, ≥14 days/month) across age groups. Survey-weighted logistic regression reveals a striking age gradient: the adjusted odds ratio for PA ranges from 0.89 among young adults (18-24) to 0.50 among adults aged 55-64, with the protective association strengthening monotonically with age. Temporal analysis across all ten years shows that the young-adult PA effect has been eroding over the past decade, with the 18-24 OR reaching 1.01 (null) in both 2018 and 2024, paralleling the deepening youth mental health crisis. Causal Forest via Double Machine Learning independently identifies age as the dominant driver of treatment effect heterogeneity (feature importance = 0.39, 2.5x the next predictor). E-value sensitivity analysis, propensity score overlap checks, placebo tests, and imputation comparisons confirm the robustness of the findings. These results suggest that the well-documented exercise–mental health link may not generalize to the youngest adult population, whose distress appears increasingly driven by stressors that PA alone cannot mitigate.
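
The abstract cites an E-value sensitivity analysis without giving the numbers. As a rough illustration of what such a check computes, the sketch below applies the standard E-value formula of VanderWeele and Ding to an odds ratio. Whether the authors treated the odds ratio as a risk ratio or applied the square-root approximation for common outcomes is not stated here, so the `common_outcome` switch is an assumption, as are the function names.

```python
import math


def e_value(ratio: float) -> float:
    """E-value for a risk ratio: the minimum strength of association an
    unmeasured confounder would need with both exposure and outcome to
    fully explain away the observed ratio."""
    if ratio <= 0.0:
        raise ValueError("ratio must be positive")
    # Protective ratios (< 1) are inverted so the formula applies symmetrically.
    rr = 1.0 / ratio if ratio < 1.0 else ratio
    return rr + math.sqrt(rr * (rr - 1.0))


def e_value_from_odds_ratio(odds_ratio: float, common_outcome: bool = False) -> float:
    """E-value from an odds ratio.

    ASSUMPTION: with a rare outcome the OR is used directly as an approximate
    risk ratio; with a common outcome, sqrt(OR) is used as the approximation.
    """
    if odds_ratio <= 0.0:
        raise ValueError("odds_ratio must be positive")
    ratio = math.sqrt(odds_ratio) if common_outcome else odds_ratio
    return e_value(ratio)


# Odds ratios as reported in the abstract.
for age_group, reported_or in [("18-24", 0.89), ("55-64", 0.50)]:
    rare = e_value_from_odds_ratio(reported_or)
    common = e_value_from_odds_ratio(reported_or, common_outcome=True)
    print(f"ages {age_group}: E-value {rare:.2f} (rare), {common:.2f} (common outcome)")
```

A larger E-value means a stronger unmeasured confounder would be needed to explain the association away, so the estimate for older adults is harder to attribute to confounding than the near-null estimate for the youngest group.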
