Machine learning models for estimating counterfactuals in a single-arm inflammatory bowel disease study

arXiv cs.LG / 4/28/2026

Key Points

The study evaluates machine-learning “virtual control arms” for single-arm IBD trials by predicting counterfactual outcomes for a treatment arm using models trained on external control data.
Five ML counterfactual outcome models were trained on infliximab (IFX)-treated pediatric Crohn’s disease patients to predict two 1-year outcomes for adalimumab (ADA)-treated patients: steroid-free clinical remission, and a composite of C-reactive protein (CRP) remission plus steroid-free clinical remission.
The authors compare the IFX-versus-ADA effect estimates derived from the virtual controls against estimates from propensity score matching to external controls, which serves as the reference approach.
Gradient-boosted (LGBM) modeling produced odds ratios closest to the propensity-score-matched reference, and all 95% confidence intervals supported the same conclusion: no statistically significant difference in primary or secondary outcomes between ADA and IFX.
The authors conclude that virtual controls are a viable alternative to costly, slow, or ethically difficult patient recruitment, and propose a pretrained gradient-boosted model for future studies subject to external validation and transportability checks.
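
The key points compare arms through odds ratios and 95% confidence intervals. As a minimal sketch of that kind of comparison, the snippet below computes an odds ratio with a Wald interval on the log scale from a 2x2 table. The function name `odds_ratio_wald_ci` and the counts are hypothetical and not taken from the study, and the Wald interval is an assumption: the summary does not say which interval method the authors used.

```python
import math


def odds_ratio_wald_ci(treated_events, treated_n, control_events, control_n, z=1.96):
    """Odds ratio (treated vs. control) with a Wald interval on the log-odds scale.

    Hypothetical helper for illustration; the interval method is an assumption.
    """
    a = treated_events               # treated patients in remission
    b = treated_n - treated_events   # treated patients not in remission
    c = control_events               # control patients in remission
    d = control_n - control_events   # control patients not in remission
    if min(a, b, c, d) <= 0:
        raise ValueError("every cell of the 2x2 table must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) for a 2x2 table.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, for illustration only (not results from the study).
or_est, ci_low, ci_high = odds_ratio_wald_ci(30, 50, 28, 50)

# An interval that contains 1 is read as "no statistically significant difference".
no_significant_difference = ci_low <= 1.0 <= ci_high
print(f"OR = {or_est:.2f}, 95% CI = ({ci_low:.2f}, {ci_high:.2f})")
```

When the control counts come from model predictions rather than observed patients, an interval like this treats them as if they were observed and so ignores the prediction model's own uncertainty.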

Abstract

Single-arm trials accelerate study timelines by reducing the number of patients that must be recruited for a concurrent control group. However, these designs require an alternative comparator to estimate treatment effects. One approach is to construct a virtual control arm using a machine learning (ML) model trained on external control data to predict the counterfactual outcomes of the treatment arm. Our aim in this study was to develop and evaluate ML-based counterfactual outcome models, trained on infliximab (IFX)-treated patients, that predict 1-year steroid-free clinical remission (SFCR) and a composite of C-reactive protein remission plus steroid-free clinical remission (CRP-SFCR) for adalimumab (ADA)-treated pediatric Crohn's disease patients, and to compare the resulting IFX-versus-ADA treatment effect estimates with those obtained using propensity score matching to external controls. Five ML algorithms were trained on the observed IFX cohort data to produce counterfactual outcome models, which were then used to predict the counterfactual outcomes for the ADA arm patients. The gradient-boosted model (LGBM) yielded the odds ratio (OR) closest to the propensity-score-matched reference, and the 95% confidence intervals (CIs) from all models agree with the reference study's conclusion that there is no statistically significant difference in the primary or secondary outcomes between patients treated with ADA and those treated with IFX. Our study supports virtual controls as a viable and effective substitute for expensive, lengthy, or unethical patient recruitment in an inflammatory bowel disease (IBD) trial. The developed gradient-boosted prediction model can be used as a pretrained model to generate IFX counterfactual predictions in future studies, pending external validation and assessment of transportability.
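
The virtual-control workflow described in the abstract can be sketched in a few steps: fit an outcome model on the external control cohort, predict each treated patient's outcome had they received the control therapy, and compare the observed and predicted remission rates. The sketch below makes several assumptions that are not from the paper. It substitutes a small logistic regression fitted by gradient descent for the gradient-boosted model so that it runs on the standard library alone, and it uses synthetic data, with hypothetical covariates, coefficients, and cohort sizes.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, lr=0.1, epochs=500):
    """Fit logistic regression by batch gradient descent; returns [bias, w1, ...]."""
    n, p = len(X), len(X[0])
    w = [0.0] * (p + 1)
    for _ in range(epochs):
        grad = [0.0] * (p + 1)
        for xi, yi in zip(X, y):
            err = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))) - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w


def predict_proba(w, X):
    return [sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))) for xi in X]


rng = random.Random(0)


def simulate_cohort(n):
    """Synthetic patients: two standardized baseline covariates and a binary outcome.

    The covariates and coefficients are invented for illustration.
    """
    X, y = [], []
    for _ in range(n):
        x = [rng.gauss(0, 1), rng.gauss(0, 1)]
        p = sigmoid(0.3 + 0.8 * x[0] - 0.5 * x[1])
        X.append(x)
        y.append(1 if rng.random() < p else 0)
    return X, y


X_control, y_control = simulate_cohort(300)  # stands in for the external control (IFX) cohort
X_treated, y_treated = simulate_cohort(100)  # stands in for the single-arm (ADA) cohort

# 1. Train the counterfactual outcome model on control patients only.
weights = fit_logistic(X_control, y_control)

# 2. Predict what each treated patient's outcome would have been under the control therapy.
counterfactual = predict_proba(weights, X_treated)

# 3. Compare the observed treated-arm rate with the predicted virtual-control rate.
observed_rate = sum(y_treated) / len(y_treated)
virtual_control_rate = sum(counterfactual) / len(counterfactual)
risk_difference = observed_rate - virtual_control_rate
print(f"observed {observed_rate:.3f}, virtual control {virtual_control_rate:.3f}")
```

Summing the predicted probabilities gives the expected number of remissions in the virtual control arm, which is one way to fill the control column of a 2x2 table for an odds ratio. Both synthetic cohorts here come from the same outcome model, so the printed rates should be close; in a real study a gap would reflect the treatment effect together with any model misspecification or covariate shift between cohorts.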
