OptimusKG: Unifying biomedical knowledge in a modern multimodal graph
arXiv cs.AI / 5/1/2026
Key Points
- The paper introduces OptimusKG, a multimodal biomedical labeled property graph (LPG) that unifies knowledge from structured and semi-structured sources while preserving schema-level constraints and type-specific metadata.
- OptimusKG is built as an LPG with a top-level schema for nodes and edges and retains granular properties, cross-references, and provenance across molecular, anatomical, clinical, and environmental domains.
- The released graph is large-scale: 190,531 nodes (10 entity types) and 21,813,816 edges (26 relation types), carrying over 67 million property instances across 150 property keys, with content sourced from 18 ontologies and controlled vocabularies.
- To validate the graph, the authors used a multimodal literature-checking agent (PaperQA3) and found that 70.0% of sampled edges had supporting evidence, while 83.4% of sampled false edges lacked such evidence.
- The dataset is distributed as Apache Parquet files to support graph-based machine learning and knowledge-grounded retrieval with large language models, including biomedical discovery tasks like hypothesis generation.
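
To make the labeled property graph idea in the points above concrete, here is a minimal sketch in plain Python of nodes and edges that carry their own properties, plus a schema check on which node types a relation may connect. The labels (`Drug`, `Gene`, `Disease`), the relation names, the property keys, and the identifiers are all hypothetical placeholders chosen for illustration; they are not taken from OptimusKG's actual schema, and this is not the authors' implementation.

```python
from dataclasses import dataclass, field

# Hypothetical schema: allowed (source label, relation, target label) triples.
# These names are illustrative assumptions, not OptimusKG's real types.
ALLOWED_EDGES = {
    ("Drug", "TARGETS", "Gene"),
    ("Gene", "ASSOCIATED_WITH", "Disease"),
}


@dataclass
class Node:
    id: str
    label: str  # entity type
    properties: dict = field(default_factory=dict)


@dataclass
class Edge:
    source: str
    target: str
    relation: str  # relation type
    properties: dict = field(default_factory=dict)  # e.g. provenance


class PropertyGraph:
    """Tiny in-memory labeled property graph with a schema check on edges."""

    def __init__(self, allowed_edges):
        self.allowed_edges = allowed_edges
        self.nodes = {}
        self.edges = []

    def add_node(self, node):
        self.nodes[node.id] = node

    def add_edge(self, edge):
        # Enforce the schema-level constraint before accepting the edge.
        src = self.nodes[edge.source]
        dst = self.nodes[edge.target]
        triple = (src.label, edge.relation, dst.label)
        if triple not in self.allowed_edges:
            raise ValueError(f"edge type not allowed by schema: {triple}")
        self.edges.append(edge)

    def neighbors(self, node_id, relation=None):
        # Targets of outgoing edges, optionally filtered by relation type.
        return [
            e.target
            for e in self.edges
            if e.source == node_id and (relation is None or e.relation == relation)
        ]


# Build a three-node example with placeholder identifiers.
graph = PropertyGraph(ALLOWED_EDGES)
graph.add_node(Node("drug:example", "Drug", {"name": "example drug"}))
graph.add_node(Node("gene:example", "Gene", {"symbol": "EXAMPLE"}))
graph.add_node(Node("disease:example", "Disease"))
graph.add_edge(Edge("drug:example", "gene:example", "TARGETS",
                    {"source": "hypothetical-database"}))
graph.add_edge(Edge("gene:example", "disease:example", "ASSOCIATED_WITH"))
```

Keeping properties on both nodes and edges is what lets a graph of this kind retain per-edge provenance, and the check in `add_edge` is one simple way to reject relations that connect the wrong entity types.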