Towards Automated Ontology Generation from Unstructured Text: A Multi-Agent LLM Approach

arXiv cs.AI / 4/28/2026


Key Points

  • The study investigates how to automatically generate formal ontologies from unstructured text, focusing on which LLM architectural choices drive output quality and why current approaches fail.
  • Using domain-specific insurance contracts, the authors first build a single-agent LLM baseline and identify key issues such as weak ontology design pattern compliance, structural redundancy, and ineffective iterative repair.
  • They propose a multi-agent LLM system that splits ontology construction into four artifact-driven roles (Domain Expert, Manager, Coder, and Quality Assurer) to improve generation discipline and validation; a sketch of this pipeline follows the list.
  • Evaluation combines assessments from a panel of heterogeneous LLM judges for architectural quality with SPARQL competency-question testing for functional usability, complemented by retrieval-augmented-generation-based scoring.
  • Results show the multi-agent approach significantly boosts structural quality and modestly improves queryability, with benefits largely attributed to front-loaded planning.
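
A minimal sketch of how such a four-role, artifact-driven pipeline could be wired together is shown below. The role names come from the paper; everything else (the `Artifacts` fields, the `call_llm` helper, the prompts, and the repair loop) is a hypothetical illustration, not the authors' implementation.

```python
# Hypothetical sketch of the four-role, artifact-driven pipeline described above.
from dataclasses import dataclass


@dataclass
class Artifacts:
    """Intermediate artifacts handed between agent roles."""
    source_text: str        # unstructured input, e.g. an insurance contract
    domain_notes: str = ""  # Domain Expert: concepts, relations, constraints
    ontology_plan: str = "" # Manager: modules, design patterns, naming rules
    ontology_ttl: str = ""  # Coder: OWL ontology serialized as Turtle
    qa_report: str = ""     # Quality Assurer: issues found in review


def call_llm(system_prompt: str, user_content: str) -> str:
    """Placeholder for any chat-completion API call; replace with a real client."""
    raise NotImplementedError


def run_pipeline(source_text: str, max_repair_rounds: int = 2) -> Artifacts:
    art = Artifacts(source_text=source_text)

    # 1. Domain Expert: extract domain knowledge from the raw text.
    art.domain_notes = call_llm(
        "Extract key concepts, relations, and constraints from the text.",
        art.source_text,
    )

    # 2. Manager: front-loaded planning before any ontology code is written.
    art.ontology_plan = call_llm(
        "Plan the ontology: modules, applicable design patterns, naming conventions.",
        art.domain_notes,
    )

    # 3-4. Coder writes the ontology; Quality Assurer reviews and triggers repair.
    for _ in range(max_repair_rounds):
        art.ontology_ttl = call_llm(
            "Write the ontology in Turtle, following the plan exactly.",
            art.ontology_plan + "\n\nQA feedback:\n" + art.qa_report,
        )
        art.qa_report = call_llm(
            "Review the ontology against the plan; reply 'OK' if no issues remain.",
            art.ontology_ttl,
        )
        if art.qa_report.strip() == "OK":
            break
    return art
```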

Abstract

Automatically generating formal ontologies from unstructured natural language remains a central challenge in knowledge engineering. While large language models (LLMs) show promise, it remains unclear which architectural design choices drive generation quality and why current approaches fail. We present a controlled experimental study using domain-specific insurance contracts to investigate these questions. We first establish a single-agent LLM baseline, identifying key failure modes such as poor Ontology Design Pattern compliance, structural redundancy, and ineffective iterative repair. We then introduce a multi-agent architecture that decomposes ontology construction into four artifact-driven roles: Domain Expert, Manager, Coder, and Quality Assurer. We evaluate performance across architectural quality (via a panel of heterogeneous LLM judges) and functional usability (via competency-question-driven SPARQL evaluation with complementary retrieval-augmented-generation-based assessment). Results show that the multi-agent approach significantly improves structural quality and modestly enhances queryability, with gains driven primarily by front-loaded planning. These findings highlight planning-first, artifact-driven generation as a promising and more auditable path toward scalable automated ontology engineering.
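
On the functional-usability side, competency-question (CQ) evaluation can be approximated as: pair each natural-language question with a SPARQL query and count how many queries return results against the generated ontology. The sketch below uses rdflib; the example question, namespace, and pass/fail rule are invented for illustration, and the paper's actual competency questions and scoring are not reproduced here.

```python
# Hypothetical sketch of competency-question-driven SPARQL evaluation.
from rdflib import Graph

# Each entry pairs a natural-language competency question with a SPARQL query.
COMPETENCY_QUESTIONS = [
    (
        "Which coverages does a policy include?",
        """
        PREFIX ex: <http://example.org/insurance#>
        SELECT ?policy ?coverage
        WHERE { ?policy ex:hasCoverage ?coverage . }
        """,
    ),
]


def evaluate_ontology(ttl_path: str) -> float:
    """Return the fraction of competency questions answerable by the ontology."""
    graph = Graph()
    graph.parse(ttl_path, format="turtle")

    passed = 0
    for question, query in COMPETENCY_QUESTIONS:
        try:
            results = list(graph.query(query))
        except Exception:  # malformed query or terms missing from the ontology
            results = []
        if results:
            passed += 1
        print(f"{'PASS' if results else 'FAIL'}: {question}")
    return passed / len(COMPETENCY_QUESTIONS)
```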