Measuring research data reuse in scholarly publications using generative artificial intelligence: Open Science Indicator development and preliminary results
arXiv cs.CL / 5/1/2026
💬 Opinion · Ideas & Deep Analysis · Models & Research
Key Points
- PLOS and DataSeer developed an LLM-based Open Science Indicator focused on measuring the downstream impact of open science, specifically the reuse of research data in scholarly publications.
- Preliminary results indicate a 43% data reuse rate, which is higher than what traditional bibliometric approaches typically report.
- The study finds that generative AI and LLMs can measure research data reuse at scale across publications.
- The authors argue that the benefits of research data sharing and reuse may currently be underestimated because of limitations in existing measurement methods.