A unified data format for managing diabetes time-series data: DIAbetes eXchange (DIAX)

arXiv cs.LG / 4/15/2026


Key Points

  • The article proposes DIAX, a standardized JSON-based data format designed to unify diabetes time-series data from devices such as CGMs, smart insulin pens, and automated insulin delivery systems.
  • It aims to improve interoperability, reproducibility, and extensibility, with particular benefits for research workflows and machine learning pipelines that rely on consistent input data.
  • DIAX is positioned as a translational, interoperable resource (with conversion and visualization tooling) rather than a data hosting platform, so it imposes no data-sharing constraints on its users.
  • The open-source repository provides tools for dataset conversion and cross-format compatibility, and the format already supports several major datasets (DCLP3, DCLP5, IOBP2, PEDAP, T1Dexi, Loop) totaling over 10 million patient-hours.
  • Overall, the work is intended to enable easier dataset sharing and integration across studies while supporting community contributions to extend the format.
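
To make the idea of a unified JSON format concrete, here is a minimal sketch of converting rows from three differently formatted sources into one JSON document. The field names (`subject_id`, `cgm`, `bolus`, `meal`, `mg_dl`, `units`, `carbs_g`), the source column names, and the `to_unified` helper are all hypothetical illustrations; the actual DIAX schema and conversion tools are defined in the project repository and may differ.

```python
import json
from datetime import datetime, timezone


def _iso(ts: str, fmt: str) -> str:
    """Parse a source timestamp and emit ISO 8601 (UTC is assumed for this sketch)."""
    return datetime.strptime(ts, fmt).replace(tzinfo=timezone.utc).isoformat()


def to_unified(subject_id, cgm_rows, bolus_rows, meal_rows):
    """Map three source-specific row layouts onto one hypothetical unified layout.

    NOTE: the output keys are illustrative only, not the real DIAX schema.
    """
    return {
        "subject_id": subject_id,
        "cgm": [
            {"time": _iso(r["Timestamp"], "%m/%d/%Y %H:%M"), "mg_dl": float(r["Glucose"])}
            for r in cgm_rows
        ],
        "bolus": [
            {"time": _iso(r["dt"], "%Y-%m-%d %H:%M:%S"), "units": float(r["dose"])}
            for r in bolus_rows
        ],
        "meal": [
            {"time": _iso(r["when"], "%Y-%m-%dT%H:%M"), "carbs_g": float(r["carbs"])}
            for r in meal_rows
        ],
    }


# Made-up example rows, each using a different timestamp style and column naming.
record = to_unified(
    "subject-001",
    cgm_rows=[{"Timestamp": "01/02/2024 08:00", "Glucose": "112"}],
    bolus_rows=[{"dt": "2024-01-02 08:05:00", "dose": "4.5"}],
    meal_rows=[{"when": "2024-01-02T08:10", "carbs": "45"}],
)
print(json.dumps(record, indent=2))
```

The point of the sketch is that every downstream tool then reads one timestamp format and one set of keys, regardless of which device or study produced the data.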

Abstract

Diabetes devices, including Continuous Glucose Monitoring (CGM), Smart Insulin Pens, and Automated Insulin Delivery systems, generate rich time-series data widely used in research and machine learning. However, inconsistent data formats across sources hinder sharing, integration, and analysis. We present DIAX (DIAbetes eXchange), a standardized JSON-based format for unifying diabetes time-series data, including CGM, insulin, and meal signals. DIAX promotes interoperability, reproducibility, and extensibility, particularly for machine learning applications. An open-source repository provides tools for dataset conversion, cross-format compatibility, visualization, and community contributions. DIAX is a translational resource, not a data host, ensuring flexibility without imposing data-sharing constraints. Currently, DIAX is compatible with other standardization efforts and supports major datasets (DCLP3, DCLP5, IOBP2, PEDAP, T1Dexi, Loop), totaling over 10 million patient-hours of data. Repository: https://github.com/Center-for-Diabetes-Technology/DIAX
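
As an illustration of why a consistent format helps machine learning pipelines, the following sketch loads a small JSON document containing CGM, insulin, and meal signals and aligns them onto a shared 5-minute grid. The JSON layout, the key names, and the `align` helper are assumptions made for this example and are not taken from the DIAX specification; the 5-minute step is likewise only a common CGM sampling interval chosen for illustration.

```python
import json
from datetime import datetime, timedelta

# Hypothetical unified document; the real DIAX layout may differ.
doc = json.loads("""
{
  "subject_id": "subject-001",
  "cgm": [
    {"time": "2024-01-02T08:00:00+00:00", "mg_dl": 112.0},
    {"time": "2024-01-02T08:05:00+00:00", "mg_dl": 118.0},
    {"time": "2024-01-02T08:15:00+00:00", "mg_dl": 131.0}
  ],
  "bolus": [{"time": "2024-01-02T08:06:00+00:00", "units": 4.5}],
  "meal":  [{"time": "2024-01-02T08:11:00+00:00", "carbs_g": 45.0}]
}
""")


def align(doc, step_min=5):
    """Place CGM, bolus, and meal events on one regular time grid.

    Missing CGM samples stay None; bolus and meal amounts are summed per slot.
    """
    step = timedelta(minutes=step_min)
    parse = datetime.fromisoformat
    times = [parse(s["time"]) for s in doc["cgm"]]
    start, end = min(times), max(times)
    n = (end - start) // step + 1  # number of grid slots covering the CGM span

    grid = [
        {"time": (start + i * step).isoformat(), "mg_dl": None, "units": 0.0, "carbs_g": 0.0}
        for i in range(n)
    ]

    def slot(ts):
        # Floor division maps any timestamp to the slot that contains it.
        return (parse(ts) - start) // step

    for s in doc["cgm"]:
        grid[slot(s["time"])]["mg_dl"] = s["mg_dl"]
    for key, field in (("bolus", "units"), ("meal", "carbs_g")):
        for event in doc.get(key, []):
            i = slot(event["time"])
            if 0 <= i < n:  # ignore events outside the CGM span
                grid[i][field] += event[field]
    return grid


grid = align(doc)
for row in grid:
    print(row)
```

Because the input keys are the same for every dataset, a function like this would be written once rather than once per source format, which is the reproducibility benefit the abstract describes.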