Unifying Runtime Monitoring Approaches for Safety-Critical Machine Learning: Application to Vision-Based Landing

arXiv cs.LG / 4/30/2026

💬 OpinionIdeas & Deep AnalysisModels & Research

共有:

Key Points

The paper addresses safety-critical ML runtime monitoring by noting that existing methods are fragmented across research communities.
It introduces a unified framework that categorizes runtime monitoring into three types: ODD monitoring, OOD monitoring, and OMS monitoring.
ODD monitoring focuses on verifying compliance with expected operating conditions, while OOD monitoring rejects inputs that differ from the training distribution.
OMS monitoring detects abnormal model behavior using internal states or outputs, complementing the other two monitor types.
The authors validate the framework with an experiment on vision-based runway detection for landing, using common safety-oriented metrics to compare monitors.
It aims to help practitioners design complementary monitoring activities and to evaluate different monitors consistently.
The proposed categorization supports clearer comparisons by standardizing evaluation approaches for safety-critical ML monitoring.

Abstract

Runtime monitoring is essential to ensure the safety of ML applications in safety-critical domains. However, current research is fragmented, with independent methods emerging from different communities. In this paper, we propose a unified framework categorising runtime monitoring approaches into three distinct types: Operational Design Domain (ODD) monitoring, which ensures compliance with expected operating conditions; Out-of-Distribution (OOD) monitoring, which rejects inputs that deviate from the training data; and Out-of-Model-Scope (OMS) monitoring, which detects anomalous model behaviour based its internal states or outputs. We demonstrate the benefits of this categorization with a dedicated experiment on an aeronautical safety-critical application: runway detection during landing. This framework facilitates design of monitoring activities, with complementary categories of monitors, and enables evaluation and comparison of different monitors using common, safety-oriented metrics.

Building a Local AI Agent (Part 2): Six UX and UI Design Challenges

Dev.to

We Built a DNS-Based Discovery Protocol for AI Agents — Here's How It Works

Dev.to

Your first business opportunity in 3 commands: /register_directory in @biznode_bot, wait for matches, then /my_pulse to view...

Dev.to

Building AI Evaluation Pipelines: Automating LLM Testing from Dataset to CI/CD

Dev.to

Function Calling Harness 2: CoT Compliance from 9.91% to 100%

Dev.to

Unifying Runtime Monitoring Approaches for Safety-Critical Machine Learning: Application to Vision-Based Landing

Key Points

Abstract

Related Articles

Building a Local AI Agent (Part 2): Six UX and UI Design Challenges

We Built a DNS-Based Discovery Protocol for AI Agents — Here's How It Works

Your first business opportunity in 3 commands: /register_directory in @biznode_bot, wait for matches, then /my_pulse to view...

Building AI Evaluation Pipelines: Automating LLM Testing from Dataset to CI/CD

Function Calling Harness 2: CoT Compliance from 9.91% to 100%

関連おすすめサービス

Notta搭載AI議事録イヤホン ZENCHORD1

AI搭載ボイスレコーダー Plaud

画像高画質化AIツール Aiarty Image Enhancer