Code Sharing In Prediction Model Research: A Scoping Review

arXiv cs.AI / 4/10/2026


Key Points

  • A scoping review of PubMed Central Open Access prediction model papers found that only 12.2% included code-sharing statements, though this increased over time to 15.8% in 2025.
  • Code sharing was more prevalent in studies citing TRIPOD+AI than in studies citing TRIPOD alone, with substantial variation across journals and countries.
  • The study used an LLM-assisted pipeline to extract code availability statements and evaluate repositories, revealing major heterogeneity in reproducibility-related features.
  • While most repositories included a README (80.5%), fewer specified dependencies (37.6%), constrained versions (21.6%), or used modular structure (42.4%), limiting reusability.
  • The results aim to support development of TRIPOD-Code, a reporting guideline extension that goes beyond “code availability” to require clearer expectations for documentation, dependencies, licensing, and executable structure.
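The extraction step described above can be illustrated with a simplified, rule-based stand-in. The study itself used an LLM-assisted pipeline for this task; the sketch below only shows the narrower sub-problem of pulling code-hosting URLs out of an availability statement with a regular expression. The hosting domains and the example URL are illustrative assumptions, not taken from the paper.

```python
import re

# Naive pattern for common code-hosting links found in availability statements.
# The domain list is an assumption for illustration, not the study's actual set.
REPO_LINK = re.compile(
    r"https?://(?:www\.)?(?:github\.com|gitlab\.com|zenodo\.org|osf\.io)/\S+",
    re.IGNORECASE,
)

def extract_repo_links(statement: str) -> list[str]:
    """Return code-hosting URLs mentioned in a code availability statement."""
    # Trailing punctuation often clings to URLs in running text; strip it off.
    return [m.rstrip(".,);") for m in REPO_LINK.findall(statement)]

# Hypothetical example statement (the repository URL is made up).
statement = ("The analysis code is freely available at "
             "https://github.com/example/prediction-model.")
print(extract_repo_links(statement))
```

A rule-based extractor like this handles well-formed statements but misses paraphrased or indirect availability claims, which is presumably why the review used an LLM for screening and extraction instead.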

Abstract

Analytical code is essential for reproducing diagnostic and prognostic prediction model research, yet code availability in the published literature remains limited. While the TRIPOD statements set standards for reporting prediction model methods, they do not define explicit standards for repository structure and documentation. This review quantifies current code-sharing practices to inform the development of TRIPOD-Code, a TRIPOD extension reporting guideline focused on code sharing. We conducted a scoping review of PubMed-indexed articles citing TRIPOD or TRIPOD+AI as of Aug 11, 2025, restricted to studies retrievable via the PubMed Central Open Access API. Eligible studies developed, updated, or validated multivariable prediction models. A large language model-assisted pipeline was developed to screen articles and extract code availability statements and repository links. Repositories were assessed with the same LLM against 14 predefined reproducibility-related features. Our code is publicly available. Among 3,967 eligible articles, 12.2% included code-sharing statements. Code sharing increased over time, reaching 15.8% in 2025, and was higher among TRIPOD+AI-citing studies than TRIPOD-citing studies. Sharing prevalence varied widely by journal and country. Repository assessment showed substantial heterogeneity in reproducibility features: most repositories contained a README file (80.5%), but fewer specified dependencies (37.6%; version-constrained 21.6%) or were modular (42.4%). In prediction model research, code sharing remains relatively uncommon, and shared code often falls short of being reusable. These findings provide an empirical baseline for the TRIPOD-Code extension and underscore the need for clearer expectations beyond code availability, including documentation, dependency specification, licensing, and executable structure.