Effort-Based Criticality Metrics for Evaluating 3D Perception Errors in Autonomous Driving

arXiv cs.RO / 3/31/2026


Key Points

  • The paper argues that common 3D safety criticality metrics like time-to-collision (TTC) mix up the effects of false-positive and false-negative perception errors, obscuring which perception failures truly matter for driving safety.
  • It introduces two effort-based longitudinal metrics—False Speed Reduction (FSR) for cumulative velocity loss from phantom detections and Maximum Deceleration Rate (MDR) for peak braking demand from missed objects.
  • It further adds a steering-focused metric, Lateral Evasion Acceleration (LEA), which estimates the minimum steering effort needed to avoid predicted collisions using adapted evasion kinematics and reachability-based collision timing.
  • A reachability-based ellipsoidal collision filter restricts scoring to dynamically plausible threats, while frame-level matching and track-level aggregation consolidate errors over time.
  • Experiments on nuScenes and Argoverse 2 show that 65–93% of perception errors are non-critical. The proposed metrics (FSR, MDR, LEA) carry safety-relevant signal that established time-based, deceleration-based, and normalized criticality measures do not capture, which supports targeted identification of the most critical perception failures.
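
To make the two longitudinal metrics concrete, here is a minimal Python sketch. It is not the paper's implementation: this summary does not give the exact definitions, so the sketch assumes plain constant-acceleration kinematics, where closing on an object at speed v across a gap d requires a deceleration of v^2 / (2d), and it treats FSR as braking deceleration integrated over time. The function names, argument names, and per-frame inputs are all hypothetical.

```python
from typing import Sequence


def max_deceleration_rate(gaps_m: Sequence[float],
                          closing_speeds_mps: Sequence[float]) -> float:
    """Peak braking demand (m/s^2) over a track of a missed object.

    Assumption: per frame, the constant deceleration that removes a
    closing speed v within a gap d is v^2 / (2 d). The track-level
    value is the maximum over frames.
    """
    peak = 0.0
    for gap, speed in zip(gaps_m, closing_speeds_mps):
        if gap <= 0.0:
            # No distance left to brake in: demand is unbounded.
            return float("inf")
        if speed > 0.0:  # only frames where the gap is shrinking
            peak = max(peak, speed * speed / (2.0 * gap))
    return peak


def false_speed_reduction(decels_mps2: Sequence[float], dt_s: float) -> float:
    """Cumulative speed lost (m/s) while braking for a phantom detection.

    Assumption: the ego vehicle brakes at decels_mps2[i] during frame i,
    each frame lasts dt_s seconds, and negative values (no braking)
    contribute nothing.
    """
    return sum(max(a, 0.0) for a in decels_mps2) * dt_s


# Missed object: 10 m/s closing speed at 50 m, then at 20 m.
mdr = max_deceleration_rate([50.0, 20.0], [10.0, 10.0])
# Phantom object: braking at 1 and 2 m/s^2 for 0.5 s each, then none.
fsr = false_speed_reduction([1.0, 2.0, 0.0], 0.5)
```

Under these assumptions the missed object demands a peak of 2.5 m/s^2 and the phantom costs 1.5 m/s of speed. The two numbers are in different units and describe different failure modes, which is the separation between FN and FP consequences that the key points describe.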

Abstract

Criticality metrics such as time-to-collision (TTC) quantify collision urgency but conflate the consequences of false-positive (FP) and false-negative (FN) perception errors. We propose two novel effort-based metrics: False Speed Reduction (FSR), the cumulative velocity loss from persistent phantom detections, and Maximum Deceleration Rate (MDR), the peak braking demand from missed objects under a constant-acceleration model. These longitudinal metrics are complemented by Lateral Evasion Acceleration (LEA), adapted from prior lateral evasion kinematics and coupled with reachability-based collision timing to quantify the minimum steering effort to avoid a predicted collision. A reachability-based ellipsoidal collision filter ensures only dynamically plausible threats are scored, with frame-level matching and track-level aggregation. Evaluation of different perception pipelines on nuScenes and Argoverse 2 shows that 65–93% of errors are non-critical, and Spearman correlation analysis confirms that all three metrics capture safety-relevant information inaccessible to established time-based, deceleration-based, or normalized criticality measures, enabling targeted mining of the most critical perception failures.
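
The Spearman analysis mentioned in the abstract compares how two metrics rank the same set of errors. As a rough illustration, the dependency-free Python sketch below computes Spearman's rank correlation with average ranks for ties. It is a generic textbook computation, not the authors' evaluation code, and the score lists are made-up placeholders rather than results from the paper.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rho: the Pearson correlation of the rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences of length >= 2")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sd_x = sum((a - mean_x) ** 2 for a in rx) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in ry) ** 0.5
    if sd_x == 0.0 or sd_y == 0.0:
        return float("nan")  # undefined when one input is constant
    return cov / (sd_x * sd_y)


# Placeholder scores for four errors under two hypothetical metrics.
metric_a = [0.2, 1.5, 3.0, 4.1]
metric_b = [9.0, 7.5, 2.0, 1.0]
rho = spearman(metric_a, metric_b)
```

A value of rho near +1 or -1 means the two metrics order the errors almost identically (or in exact reverse), so one adds little beyond the other. Values closer to 0 indicate that the metrics prioritize different errors, which is the kind of evidence the abstract cites for the new metrics carrying information the established ones miss.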