Towards Automated Ontology Generation from Unstructured Text: A Multi-Agent LLM Approach

arXiv cs.AI / 4/28/2026


Key Points

  • The study investigates how to automatically generate formal ontologies from unstructured text, focusing on which LLM architectural choices drive output quality and why current approaches fail.
  • Using domain-specific insurance contracts, the authors first build a single-agent LLM baseline and identify key issues such as weak ontology design pattern compliance, structural redundancy, and ineffective iterative repair.
  • They propose a multi-agent LLM system that splits ontology construction into four artifact-driven roles (Domain Expert, Manager, Coder, and Quality Assurer) to improve generation discipline and validation; a sketch of this pipeline follows the list.
  • Evaluation combines assessments from a panel of heterogeneous LLM judges for architectural quality with SPARQL competency-question testing for functional usability, complemented by retrieval-augmented-generation-based scoring.
  • Results show the multi-agent approach significantly boosts structural quality and modestly improves queryability, with benefits largely attributed to front-loaded planning.
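
A minimal sketch of how such a four-role, artifact-driven pipeline could be wired together is shown below. The role names come from the paper; everything else (the `Artifacts` fields, the `call_llm` helper, the prompts, and the repair loop) is a hypothetical illustration, not the authors' implementation.

```python
# Hypothetical sketch of the four-role, artifact-driven pipeline described above.
from dataclasses import dataclass


@dataclass
class Artifacts:
    """Intermediate artifacts handed between agent roles."""
    source_text: str        # unstructured input, e.g. an insurance contract
    domain_notes: str = ""  # Domain Expert: concepts, relations, constraints
    ontology_plan: str = "" # Manager: modules, design patterns, naming rules
    ontology_ttl: str = ""  # Coder: OWL ontology serialized as Turtle
    qa_report: str = ""     # Quality Assurer: issues found in review


def call_llm(system_prompt: str, user_content: str) -> str:
    """Placeholder for any chat-completion API call; replace with a real client."""
    raise NotImplementedError


def run_pipeline(source_text: str, max_repair_rounds: int = 2) -> Artifacts:
    art = Artifacts(source_text=source_text)

    # 1. Domain Expert: extract domain knowledge from the raw text.
    art.domain_notes = call_llm(
        "Extract key concepts, relations, and constraints from the text.",
        art.source_text,
    )

    # 2. Manager: front-loaded planning before any ontology code is written.
    art.ontology_plan = call_llm(
        "Plan the ontology: modules, applicable design patterns, naming conventions.",
        art.domain_notes,
    )

    # 3-4. Coder writes the ontology; Quality Assurer reviews and triggers repair.
    for _ in range(max_repair_rounds):
        art.ontology_ttl = call_llm(
            "Write the ontology in Turtle, following the plan exactly.",
            art.ontology_plan + "\n\nQA feedback:\n" + art.qa_report,
        )
        art.qa_report = call_llm(
            "Review the ontology against the plan; reply 'OK' if no issues remain.",
            art.ontology_ttl,
        )
        if art.qa_report.strip() == "OK":
            break
    return art
```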

Abstract

Automatically generating formal ontologies from unstructured natural language remains a central challenge in knowledge engineering. While large language models (LLMs) show promise, it remains unclear which architectural design choices drive generation quality and why current approaches fail. We present a controlled experimental study using domain-specific insurance contracts to investigate these questions. We first establish a single-agent LLM baseline, identifying key failure modes such as poor Ontology Design Pattern compliance, structural redundancy, and ineffective iterative repair. We then introduce a multi-agent architecture that decomposes ontology construction into four artifact-driven roles: Domain Expert, Manager, Coder, and Quality Assurer. We evaluate performance across architectural quality (via a panel of heterogeneous LLM judges) and functional usability (via competency-question-driven SPARQL evaluation with complementary retrieval-augmented-generation-based assessment). Results show that the multi-agent approach significantly improves structural quality and modestly enhances queryability, with gains driven primarily by front-loaded planning. These findings highlight planning-first, artifact-driven generation as a promising and more auditable path toward scalable automated ontology engineering.
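
On the functional-usability side, competency-question (CQ) evaluation can be approximated as: pair each natural-language question with a SPARQL query and count how many queries return results against the generated ontology. The sketch below uses rdflib; the example question, namespace, and pass/fail rule are invented for illustration, and the paper's actual competency questions and scoring are not reproduced here.

```python
# Hypothetical sketch of competency-question-driven SPARQL evaluation.
from rdflib import Graph

# Each entry pairs a natural-language competency question with a SPARQL query.
COMPETENCY_QUESTIONS = [
    (
        "Which coverages does a policy include?",
        """
        PREFIX ex: <http://example.org/insurance#>
        SELECT ?policy ?coverage
        WHERE { ?policy ex:hasCoverage ?coverage . }
        """,
    ),
]


def evaluate_ontology(ttl_path: str) -> float:
    """Return the fraction of competency questions answerable by the ontology."""
    graph = Graph()
    graph.parse(ttl_path, format="turtle")

    passed = 0
    for question, query in COMPETENCY_QUESTIONS:
        try:
            results = list(graph.query(query))
        except Exception:  # malformed query or terms missing from the ontology
            results = []
        if results:
            passed += 1
        print(f"{'PASS' if results else 'FAIL'}: {question}")
    return passed / len(COMPETENCY_QUESTIONS)
```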