HypEHR: Hyperbolic Modeling of Electronic Health Records for Efficient Question Answering

arXiv cs.AI / 4/25/2026


Key Points

  • HypEHR is a compact Lorentzian (hyperbolic geometry) model designed for electronic health record (EHR) question answering that explicitly uses the hierarchical structure of clinical data.
  • It embeds clinical codes, patient visits, and questions in hyperbolic space, then generates answers using geometry-consistent cross-attention with type-specific pointer heads.
  • The model is pretrained on next-visit diagnosis prediction, with hierarchy-aware regularization that aligns its representations with the ICD ontology.
  • On two MIMIC-IV-based EHR-QA benchmarks, HypEHR approaches the performance of LLM-based methods while using far fewer parameters.
  • The researchers provide a public implementation of HypEHR on GitHub for reproducibility and further development.
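
For readers new to the Lorentz model mentioned above, the following is a minimal sketch of its basic operations in plain Python, assuming unit negative curvature. The function names (lorentz_inner, expmap0, lorentz_distance) and the toy "root", "parent", and "child" points are illustrative assumptions, not taken from the HypEHR code base.

```python
import math

def lorentz_inner(x, y):
    """Lorentzian inner product: -x0*y0 + sum_i xi*yi."""
    return -x[0] * y[0] + sum(a * b for a, b in zip(x[1:], y[1:]))

def expmap0(v):
    """Map a Euclidean tangent vector at the origin onto the unit hyperboloid.

    Assumes curvature -1; the result x satisfies lorentz_inner(x, x) == -1.
    """
    norm = math.sqrt(sum(a * a for a in v))
    if norm == 0.0:
        return [1.0] + [0.0] * len(v)
    scale = math.sinh(norm) / norm
    return [math.cosh(norm)] + [scale * a for a in v]

def lorentz_distance(x, y):
    """Geodesic distance arccosh(-<x, y>_L), clamped to the valid domain."""
    return math.acosh(max(1.0, -lorentz_inner(x, y)))

# Hypothetical toy hierarchy: deeper nodes sit farther from the origin.
root = expmap0([0.0, 0.0])
parent = expmap0([0.5, 0.0])
child = expmap0([1.5, 0.0])

print(lorentz_distance(root, child))    # distance from the origin equals the tangent norm
print(lorentz_distance(parent, child))  # distance along the same geodesic ray
```

Because distance from the origin grows with the norm of the tangent vector, general concepts can sit near the origin and specific ones farther out, which is the usual intuition for embedding tree-like ontologies in hyperbolic space.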

Abstract

Electronic health record (EHR) question answering is often handled by LLM-based pipelines that are costly to deploy and do not explicitly leverage the hierarchical structure of clinical data. Motivated by evidence that medical ontologies and patient trajectories exhibit hyperbolic geometry, we propose HypEHR, a compact Lorentzian model that embeds codes, visits, and questions in hyperbolic space and answers queries via geometry-consistent cross-attention with type-specific pointer heads. HypEHR is pretrained with next-visit diagnosis prediction and hierarchy-aware regularization to align representations with the ICD ontology. On two MIMIC-IV-based EHR-QA benchmarks, HypEHR approaches LLM-based methods while using far fewer parameters. Our code is publicly available at https://github.com/yuyuliu11037/HypEHR.
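
The abstract does not spell out how the geometry-consistent cross-attention or the pointer heads are defined. As a rough illustration only, one common way to attend in hyperbolic space is to score each candidate by its negative geodesic distance to the query and normalize with a softmax. The sketch below makes that assumption; lift, distance_attention, the temperature parameter, and the toy coordinates are all hypothetical and are not taken from the paper or its repository.

```python
import math

def lift(v):
    """Place Euclidean coordinates v on the unit hyperboloid (curvature -1)
    by solving for the time coordinate x0 = sqrt(1 + |v|^2)."""
    return [math.sqrt(1.0 + sum(a * a for a in v))] + list(v)

def lorentz_distance(x, y):
    """Geodesic distance arccosh(-<x, y>_L) between two hyperboloid points."""
    inner = -x[0] * y[0] + sum(a * b for a, b in zip(x[1:], y[1:]))
    return math.acosh(max(1.0, -inner))

def distance_attention(query, keys, temperature=1.0):
    """Softmax over negative distances: closer keys receive more weight.

    This is an assumed scoring rule, not the formulation used by HypEHR.
    """
    scores = [-lorentz_distance(query, k) / temperature for k in keys]
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical question embedding and three candidate code embeddings.
question = lift([0.9, 0.1])
codes = [lift([1.0, 0.0]), lift([-1.0, 0.0]), lift([0.0, 2.0])]

weights = distance_attention(question, codes)
# A pointer-style selection picks the candidate with the largest weight.
best = max(range(len(codes)), key=weights.__getitem__)
print(weights, best)
```

In this toy setup the first candidate lies closest to the question, so it receives the largest weight. A pointer-style head in this spirit selects among existing candidates rather than generating free text, which is one reason such a model can stay small.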