Building Trust in the Skies: A Knowledge-Grounded LLM-based Framework for Aviation Safety

arXiv cs.AI / 4/16/2026

Opinion · Ideas & Deep Analysis · Models & Research

Key Points

  • The paper argues that using LLMs for aviation safety decision-making requires stronger trust mechanisms because standalone LLM outputs can be inaccurate, unverifiable, or hallucinated.
  • It proposes an end-to-end framework that combines LLMs with Knowledge Graphs to improve reliability for safety-critical analytics.
  • In the first phase, LLMs are used to automatically build and dynamically update an Aviation Safety Knowledge Graph (ASKG) from multimodal sources.
  • In the second phase, the framework applies a Retrieval-Augmented Generation (RAG) setup over the curated KG to ground, validate, and explain the LLM’s responses.
  • Experimental results indicate better accuracy and traceability than LLM-only methods, with improved support for complex queries and reduced hallucination, while future work targets relationship extraction and hybrid retrieval.
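
The first phase described above (building and incrementally updating a knowledge graph from extracted facts) can be illustrated with a minimal Python sketch. This is not the paper's implementation: the `Triple` and `KnowledgeGraph` classes, the `extract_triples` stub, and the `subject | relation | object` input format are all assumptions made for illustration. In a real system the extraction step would presumably be an LLM call; here it is replaced by a trivial parser so the example runs on its own. The flight and report names are invented.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Triple:
    """One (subject, relation, object) fact; frozen so it can be a dict key."""
    subject: str
    relation: str
    obj: str


@dataclass
class KnowledgeGraph:
    # Maps each triple to the set of source IDs that support it (provenance).
    provenance: dict = field(default_factory=dict)

    def upsert(self, triple, source_id):
        """Add a new triple, or record another supporting source for a known one."""
        self.provenance.setdefault(triple, set()).add(source_id)

    def neighbors(self, entity):
        """Return every triple that mentions the entity as subject or object."""
        return [t for t in self.provenance if entity in (t.subject, t.obj)]


def extract_triples(text):
    """Hypothetical stand-in for the LLM extraction step.

    Parses lines of the form 'subject | relation | object' and skips
    anything that does not match.
    """
    triples = []
    for line in text.splitlines():
        parts = [p.strip() for p in line.split("|")]
        if len(parts) == 3 and all(parts):
            triples.append(Triple(*parts))
    return triples


def ingest(kg, source_id, text):
    """Update the graph in place with the facts extracted from one document."""
    for triple in extract_triples(text):
        kg.upsert(triple, source_id)


if __name__ == "__main__":
    kg = KnowledgeGraph()
    # Invented example reports, not real incident data.
    ingest(kg, "report-001", "Flight 101 | experienced | bird strike")
    ingest(kg, "report-002", "bird strike | caused | engine failure")
    for t in kg.neighbors("bird strike"):
        print(t, sorted(kg.provenance[t]))
```

Keeping the source IDs alongside each triple is one simple way to support the traceability the summary mentions: a fact that appears in several documents accumulates several sources instead of being duplicated.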

Abstract

The integration of Large Language Models (LLMs) into aviation safety decision-making represents a significant technological advancement, yet their standalone application poses critical risks due to inherent limitations such as factual inaccuracies, hallucination, and lack of verifiability. These challenges undermine the reliability required for safety-critical environments where errors can have catastrophic consequences. To address these challenges, this paper proposes a novel, end-to-end framework that synergistically combines LLMs and Knowledge Graphs (KGs) to enhance the trustworthiness of safety analytics. The framework introduces a dual-phase pipeline: it first employs LLMs to automate the construction and dynamic updating of an Aviation Safety Knowledge Graph (ASKG) from multimodal sources. It then leverages this curated KG within a Retrieval-Augmented Generation (RAG) architecture to ground, validate, and explain LLM-generated responses. The implemented system demonstrates improved accuracy and traceability over LLM-only approaches, effectively supporting complex querying and mitigating hallucination. Results confirm the framework's capability to deliver context-aware, verifiable safety insights, addressing the stringent reliability requirements of the aviation industry. Future work will focus on enhancing relationship extraction and integrating hybrid retrieval mechanisms.
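
To make the second phase more concrete, the following Python sketch shows one way a knowledge graph could be used to ground and check an answer. It is an illustration under stated assumptions, not the authors' system: the `KG` contents are invented, `retrieve` uses naive substring matching in place of real entity linking or graph queries, and `validate` assumes the model's answer has already been reduced to (subject, relation, object) claims by some separate, hypothetical step. No LLM is called.

```python
# Invented toy facts: (subject, relation, object, source document ID).
KG = [
    ("Flight 101", "experienced", "bird strike", "report-001"),
    ("bird strike", "caused", "engine failure", "report-001"),
    ("Flight 202", "experienced", "runway excursion", "report-002"),
]


def retrieve(question, kg):
    """Return facts whose subject or object is mentioned in the question.

    Naive case-insensitive substring match, standing in for real retrieval.
    """
    q = question.lower()
    return [f for f in kg if f[0].lower() in q or f[2].lower() in q]


def build_prompt(question, facts):
    """Assemble a prompt that restricts the model to numbered, sourced facts."""
    lines = [
        f"[{i}] {s} {r} {o} (source: {src})"
        for i, (s, r, o, src) in enumerate(facts, 1)
    ]
    return (
        "Answer using only these facts and cite them by number:\n"
        + "\n".join(lines)
        + f"\n\nQuestion: {question}"
    )


def validate(claims, kg):
    """Split claimed (s, r, o) triples into supported and unsupported.

    Supported claims are paired with the source that backs them, so each
    statement in an answer can be traced to a document.
    """
    known = {(s, r, o): src for s, r, o, src in kg}
    supported = [(c, known[c]) for c in claims if c in known]
    unsupported = [c for c in claims if c not in known]
    return supported, unsupported


if __name__ == "__main__":
    question = "What happened to Flight 101?"
    facts = retrieve(question, KG)
    print(build_prompt(question, facts))

    # Claims a model might make; the second has no support in the graph.
    claims = [
        ("Flight 101", "experienced", "bird strike"),
        ("Flight 101", "experienced", "engine fire"),
    ]
    supported, unsupported = validate(claims, KG)
    print("supported:", supported)
    print("unsupported:", unsupported)
```

The unsupported list is where a hallucinated statement would surface in this sketch: any claim with no matching triple can be flagged or withheld rather than presented as fact.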