AI Navigate

Knowledge Graph Extraction from Biomedical Literature for Alkaptonuria Rare Disease

arXiv cs.AI / 3/18/2026


Key Points

  • The paper presents a text-mining framework that uses PubTator3 to extract biomedical relations for alkaptonuria (AKU).
  • The authors construct two knowledge graphs (KGs) of different sizes to capture genes, diseases, and therapies related to AKU.
  • The KGs are validated against existing biochemical knowledge and then used to reveal systemic interactions, comorbidities, and potential therapeutic targets for AKU.
  • The work highlights the challenge of underrepresentation of ultra-rare diseases in biomedical knowledge bases and demonstrates KG-based analytics as a tool for rare disease research.

Abstract

Alkaptonuria (AKU) is an ultra-rare autosomal recessive metabolic disorder caused by mutations in the HGD (homogentisate 1,2-dioxygenase) gene, which lead to a pathological accumulation of homogentisic acid (HGA) in body fluids and tissues. This accumulation causes systemic manifestations, including premature spondyloarthropathy, renal and prostatic stones, and cardiovascular complications. Because the disease is ultra-rare, both clinical data and literature on it are limited. Knowledge graphs (KGs) can help connect the limited knowledge about the disease (basic mechanisms, manifestations and existing therapies) with other knowledge; however, AKU is frequently underrepresented or entirely absent in existing biomedical KGs. In this work, we apply a text-mining methodology based on PubTator3 for large-scale extraction of biomedical relations. We construct two KGs of different sizes, validate them using existing biochemical knowledge, and use them to extract genes, diseases and therapies possibly related to AKU. This computational framework reveals the systemic interactions of the disease, its comorbidities, and potential therapeutic targets, demonstrating the efficacy of our approach in analyzing rare metabolic disorders.
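
To make the KG idea concrete, here is a minimal Python sketch of how extracted relations could be assembled into a graph and queried for entities linked to AKU. It is an illustration only, not the authors' pipeline: the triples are hand-written from statements in the abstract rather than PubTator3 output, the `associate` relation label is a placeholder, and the function names `build_graph` and `neighbors_by_type` are hypothetical.

```python
from collections import defaultdict

# Toy triples written by hand from the abstract, NOT real PubTator3 output.
# Layout (an assumption): (subject, subject_type, relation, object, object_type).
TRIPLES = [
    ("HGD", "Gene", "associate", "Alkaptonuria", "Disease"),
    ("Alkaptonuria", "Disease", "associate", "Homogentisic acid", "Chemical"),
    ("Alkaptonuria", "Disease", "associate", "Spondyloarthropathy", "Disease"),
    ("Alkaptonuria", "Disease", "associate", "Kidney stones", "Disease"),
    ("Alkaptonuria", "Disease", "associate", "Cardiovascular complications", "Disease"),
]


def build_graph(triples):
    """Build an undirected adjacency map and an entity-type lookup."""
    graph = defaultdict(set)
    types = {}
    for subj, subj_type, rel, obj, obj_type in triples:
        types[subj] = subj_type
        types[obj] = obj_type
        # Store each edge in both directions so either endpoint can be queried.
        graph[subj].add((rel, obj))
        graph[obj].add((rel, subj))
    return graph, types


def neighbors_by_type(graph, types, entity):
    """Group the direct neighbors of `entity` by their entity type."""
    grouped = defaultdict(set)
    for _rel, other in graph.get(entity, ()):
        grouped[types[other]].add(other)
    return {etype: sorted(names) for etype, names in grouped.items()}


graph, types = build_graph(TRIPLES)
related = neighbors_by_type(graph, types, "Alkaptonuria")
for etype, names in sorted(related.items()):
    print(etype, names)
```

Grouping neighbors by entity type mirrors the abstract's goal of listing genes, diseases, and therapies possibly related to AKU; a real KG would additionally need entity normalization and multi-hop queries, which this sketch leaves out.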