A Mathematical Framework for Temporal Modeling and Counterfactual Policy Simulation of Student Dropout

arXiv cs.LG / 4/13/2026


Key Points

  • The paper introduces a temporal modeling framework for higher-education student dropout using LMS engagement time series and administrative withdrawal records, treating dropout as a time-to-event outcome.
  • It models weekly dropout risk in discrete time with penalized, class-balanced logistic regression over person–period rows and, under a late-event temporal holdout, reports row-level AUCs of 0.8350 (train) and 0.8405 (test).
  • The authors add a scenario-indexed counterfactual policy-simulation layer with an explicit trigger/schedule contract to generate survival contrasts; the contrasts are positive only in the shock branch, while the mechanism-aware branch shows negative contrasts.
  • Feature-set ablations show performance is sensitive to how temporal engagement signals are represented, highlighting their importance for risk estimation.
  • Subgroup analysis by gender estimates scenario-induced survival gaps via bootstrap; effects are directionally stable but small, and the study emphasizes that results are not causally identified despite internal scenario comparison.
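
The person–period setup mentioned in the key points can be sketched in a few lines. This is a minimal illustration, not the authors' code: the record layout `(enrollment_id, last_week, dropped)`, the function names, and the toy data are all hypothetical, and the weighting rule shown (n_samples / (n_classes × class_count)) is one common way to class-balance a logistic regression, assumed here because the summary does not say which scheme the paper uses.

```python
from collections import Counter


def expand_person_period(enrollments):
    """Turn one record per enrollment into one row per enrollment-week.

    Each input record is (enrollment_id, last_week, dropped), where
    last_week is the final observed week (1-based) and dropped says
    whether a withdrawal happened in that week (False means censored).
    This record layout is an assumption made for illustration.
    """
    rows = []
    for enrollment_id, last_week, dropped in enrollments:
        for week in range(1, last_week + 1):
            # The event indicator is 1 only in the week of withdrawal.
            event = 1 if (dropped and week == last_week) else 0
            rows.append({"id": enrollment_id, "week": week, "event": event})
    return rows


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    return {cls: n_samples / (len(counts) * c) for cls, c in counts.items()}


# Hypothetical toy data: A withdraws in week 3, B is censored after week 4.
toy = [("A", 3, True), ("B", 4, False)]
rows = expand_person_period(toy)
weights = balanced_class_weights([r["event"] for r in rows])
```

Because event weeks are rare among person–period rows, the event class receives a much larger weight than the non-event class, which is the imbalance that class balancing is meant to offset.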

Abstract

This study proposes a temporal modeling framework with a counterfactual policy-simulation layer for student dropout in higher education, using LMS engagement data and administrative withdrawal records. Dropout is operationalized as a time-to-event outcome at the enrollment level; weekly risk is modeled in discrete time via penalized, class-balanced logistic regression over person–period rows. Under a late-event temporal holdout, the model attains row-level AUCs of 0.8350 (train) and 0.8405 (test), with aggregate calibration acceptable but sparsely supported in the highest-risk bins. Ablation analyses indicate performance is sensitive to feature set composition, underscoring the role of temporal engagement signals. A scenario-indexed policy layer produces survival contrasts $\Delta S(T)$ under an explicit trigger/schedule contract: positive contrasts are confined to the shock branch ($T_{\rm policy}=18$: 0.0102, 0.0260, 0.0819), while the mechanism-aware branch is negative ($\Delta S_{\rm mech}(18)=-0.0078$, $\Delta S_{\rm mech}(38)=-0.0134$). A subgroup analysis by gender quantifies scenario-induced survival gaps via bootstrap; contrasts are directionally stable but small. Results are not causally identified; they demonstrate the framework's capacity for internal structural scenario comparison under observational data constraints.
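
To make the survival contrast concrete, the sketch below computes a contrast from two sequences of weekly hazards. It rests on assumptions the abstract does not spell out: that survival at horizon T is the product of (1 − hazard) over weeks 1..T (the standard discrete-time identity), and that the contrast is scenario survival minus baseline survival. The function names and the toy hazards are hypothetical and are not the paper's values.

```python
def survival_curve(hazards):
    """Return S(1..T), where S(t) is the product of (1 - h_k) for k <= t."""
    curve = []
    surviving = 1.0
    for h in hazards:
        surviving *= 1.0 - h
        curve.append(surviving)
    return curve


def survival_contrast(baseline_hazards, scenario_hazards, horizon):
    """Scenario survival minus baseline survival at a 1-based horizon.

    The sign convention (scenario minus baseline) is an assumption.
    """
    base = survival_curve(baseline_hazards)[horizon - 1]
    scen = survival_curve(scenario_hazards)[horizon - 1]
    return scen - base


# Hypothetical hazards: the scenario halves the week-2 hazard.
baseline = [0.10, 0.10]
scenario = [0.10, 0.05]
delta = survival_contrast(baseline, scenario, horizon=2)
```

Under this convention a positive value means more enrollments survive to the horizon under the scenario than under the baseline, and a negative value means fewer.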