SemiFA: An Agentic Multi-Modal Framework for Autonomous Semiconductor Failure Analysis Report Generation

arXiv cs.AI / 4/16/2026


Key Points

  • The paper introduces SemiFA, an agentic multi-modal framework that autonomously generates structured semiconductor failure analysis (FA) reports from inspection images in under one minute.
  • SemiFA uses a four-agent LangGraph pipeline (DefectDescriber, RootCauseAnalyzer, SeverityClassifier, and RecipeAdvisor) plus a final node that assembles a PDF report.
  • The RootCauseAnalyzer fuses SECS/GEM equipment telemetry with historically similar defects retrieved from a Qdrant vector database to improve root-cause reasoning.
  • The authors release SemiFA-930, a dataset of 930 annotated defect images paired with structured FA narratives across nine defect classes, and report that their DINOv2-based classifier reaches 92.1% accuracy (macro F1 0.917) on 140 validation images.
  • In an ablation scored by a GPT-4o judge, multi-modal fusion improves root cause reasoning by +0.86 composite points (1-5 scale) over an image-only baseline, with equipment telemetry as the more load-bearing modality; the full pipeline runs in 48 seconds on an NVIDIA A100-SXM4-40 GB GPU.
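
The agent flow in the key points above can be pictured as a chain of nodes that each read and extend a shared state. The sketch below is a plain-Python stand-in, not the LangGraph API and not the authors' code: the node names come from the summary, while the state keys, the `run_pipeline` helper, and every PLACEHOLDER value are assumptions made purely for illustration.

```python
from typing import Any, Callable, Dict, List

State = Dict[str, Any]  # shared state passed from node to node (hypothetical schema)


def defect_describer(state: State) -> State:
    # Paper: DINOv2 classification plus LLaVA-1.6 narration. Stubbed here.
    return {**state, "defect_class": "PLACEHOLDER_CLASS",
            "description": "PLACEHOLDER morphology narrative"}


def root_cause_analyzer(state: State) -> State:
    # Paper: fuses SECS/GEM telemetry with similar past defects. Stubbed here.
    evidence = {"telemetry": state["telemetry"], "similar_cases": []}
    return {**state, "root_cause": "PLACEHOLDER hypothesis", "evidence": evidence}


def severity_classifier(state: State) -> State:
    # Paper: assigns severity and estimates yield impact. Stubbed here.
    return {**state, "severity": "PLACEHOLDER", "yield_impact": None}


def recipe_advisor(state: State) -> State:
    # Paper: proposes corrective process adjustments. Stubbed here.
    return {**state, "recommendations": ["PLACEHOLDER adjustment"]}


def report_assembler(state: State) -> State:
    # Paper: a final node builds a PDF. Here the sections are only joined as text.
    sections = ["defect_class", "description", "root_cause",
                "severity", "recommendations"]
    report = "\n".join(f"{name}: {state[name]}" for name in sections)
    return {**state, "report": report}


# Fixed linear order: four agents followed by the assembling node.
PIPELINE: List[Callable[[State], State]] = [
    defect_describer,
    root_cause_analyzer,
    severity_classifier,
    recipe_advisor,
    report_assembler,
]


def run_pipeline(image_path: str, telemetry: Dict[str, float]) -> State:
    state: State = {"image_path": image_path, "telemetry": telemetry, "trace": []}
    for node in PIPELINE:
        state = node(state)
        state["trace"].append(node.__name__)  # record execution order
    return state


# Hypothetical inputs; the sensor name is not a real SECS/GEM variable.
result = run_pipeline("wafer_map.png", {"hypothetical_sensor": 0.0})
print(result["report"])
```

Keeping every node as a function from state to state means any stub can be replaced by a real model call without touching the others, which is presumably part of the appeal of a graph-style orchestrator for this task.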

Abstract

Semiconductor failure analysis (FA) requires engineers to examine inspection images, correlate equipment telemetry, consult historical defect records, and write structured reports, a process that can consume several hours of expert time per case. We present SemiFA, an agentic multi-modal framework that autonomously generates structured FA reports from semiconductor inspection images in under one minute. SemiFA decomposes FA into a four-agent LangGraph pipeline: a DefectDescriber that classifies and narrates defect morphology using DINOv2 and LLaVA-1.6, a RootCauseAnalyzer that fuses SECS/GEM equipment telemetry with historically similar defects retrieved from a Qdrant vector database, a SeverityClassifier that assigns severity and estimates yield impact, and a RecipeAdvisor that proposes corrective process adjustments. A fifth node assembles a PDF report. We introduce SemiFA-930, a dataset of 930 annotated semiconductor defect images paired with structured FA narratives across nine defect classes, drawn from procedural synthesis, WM-811K, and MixedWM38. Our DINOv2-based classifier achieves 92.1% accuracy on 140 validation images (macro F1 = 0.917), and the full pipeline produces complete FA reports in 48 seconds on an NVIDIA A100-SXM4-40 GB GPU. A GPT-4o judge ablation across four modality conditions demonstrates that multi-modal fusion improves root cause reasoning by +0.86 composite points (1-5 scale) over an image-only baseline, with equipment telemetry as the more load-bearing modality. To our knowledge, SemiFA is the first system to integrate SECS/GEM equipment telemetry into a vision-language model pipeline for autonomous FA report generation.
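
To make the RootCauseAnalyzer's fusion step more concrete, here is a minimal sketch of the general idea: retrieve the most similar historical cases by embedding similarity, then combine them with telemetry into a single context for a language model. It assumes an in-memory cosine-similarity search as a stand-in for the Qdrant vector database, and the case IDs, embedding values, telemetry field name, and the `build_root_cause_context` helper are all hypothetical; the paper's actual retrieval settings and prompt format are not given in the abstract.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k_similar(query: Sequence[float],
                  history: Dict[str, Sequence[float]],
                  k: int = 2) -> List[Tuple[str, float]]:
    """Return the k historical cases most similar to the query embedding."""
    scored = [(case_id, cosine(query, vec)) for case_id, vec in history.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


def build_root_cause_context(description: str,
                             telemetry: Dict[str, float],
                             neighbors: List[Tuple[str, float]]) -> str:
    """Fuse the three evidence sources into one text block for an LLM prompt."""
    lines = [f"Defect: {description}", "Telemetry:"]
    lines += [f"  {name} = {value}" for name, value in sorted(telemetry.items())]
    lines.append("Similar past cases:")
    lines += [f"  {case_id} (similarity {score:.2f})" for case_id, score in neighbors]
    return "\n".join(lines)


# Toy 3-dimensional embeddings; real image embeddings would be far larger.
history = {
    "case_a": [1.0, 0.0, 0.0],
    "case_b": [0.7, 0.7, 0.0],
    "case_c": [0.0, 1.0, 0.0],
}
query = [1.0, 0.05, 0.0]

neighbors = top_k_similar(query, history, k=2)
context = build_root_cause_context(
    "PLACEHOLDER defect narrative",
    {"hypothetical_sensor": 1.2},  # not a real SECS/GEM variable name
    neighbors,
)
print(context)
```

Dropping the telemetry lines or the similar-case lines from the context gives the kind of modality conditions an ablation like the one in the abstract would compare, although the exact four conditions used by the authors are not spelled out here.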