WaferSAGE: Large Language Model-Powered Wafer Defect Analysis via Synthetic Data Generation and Rubric-Guided Reinforcement Learning
arXiv cs.AI / 5/1/2026
Key Points
- WaferSAGE is a wafer-defect visual question answering framework that uses small vision-language models to perform domain-specific semiconductor inspection tasks.
- To overcome scarce labeled data, it introduces a three-stage synthetic data pipeline that cleans noisy labels, generates detailed defect descriptions, and converts them into rubric-based criteria for evaluation.
- The framework uses a dual assessment approach that combines rule-based metrics with LLM-Judge scores, aligning them via Bayesian optimization for more reliable automated evaluation.
- It applies curriculum-based reinforcement learning with Group Sequence Policy Optimization (GSPO) and rubric-aligned rewards, enabling a 4B-parameter Qwen3-VL model to reach a reported score of 6.493 while remaining small enough for fully on-premise deployment.
- The authors argue that well-trained small, domain-specific models can outperform proprietary large models in specialized industrial visual understanding, supporting privacy-preserving and cost-effective deployment.
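
The evaluation and training ideas in the points above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the [0, 1] score range, the fixed mixing weight, and the clip value are all assumptions made for illustration. The summary says the rule-based and LLM-Judge scores are aligned via Bayesian optimization; here the weight is simply passed in. The sequence-level ratio and clipped objective follow the commonly described GSPO formulation (a length-normalized likelihood ratio per response, with group-normalized advantages).

```python
import math
from statistics import mean, pstdev


def blended_reward(rule_score, judge_score, weight):
    """Convex mix of a rule-based score and an LLM-judge score.

    Assumption: both scores are already scaled to [0, 1] and `weight`
    was tuned elsewhere (the paper reportedly uses Bayesian optimization).
    """
    return weight * rule_score + (1.0 - weight) * judge_score


def group_advantages(rewards, eps=1e-8):
    """Normalize rewards within one group of responses to the same prompt."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


def sequence_ratio(new_logprobs, old_logprobs):
    """Length-normalized sequence-level importance ratio.

    Inputs are per-token log-probabilities of one response under the
    current policy and the old (sampling) policy.
    """
    n = len(new_logprobs)
    log_ratio = sum(a - b for a, b in zip(new_logprobs, old_logprobs))
    return math.exp(log_ratio / n)


def clipped_objective(ratios, advantages, clip_eps=0.2):
    """Average clipped surrogate over the group (to be maximized).

    `clip_eps` is an illustrative placeholder, not a value from the paper.
    """
    terms = []
    for s, a in zip(ratios, advantages):
        clipped = min(max(s, 1.0 - clip_eps), 1.0 + clip_eps)
        terms.append(min(s * a, clipped * a))
    return sum(terms) / len(terms)


# Hypothetical (rule score, judge score) pairs for three sampled answers.
scores = [(1.0, 0.8), (0.0, 0.2), (0.5, 0.5)]
rewards = [blended_reward(r, j, weight=0.5) for r, j in scores]
advantages = group_advantages(rewards)
```

In this toy group the first answer gets a positive advantage, the second a negative one, and the third sits at the group mean, so only answers that beat their siblings on the blended rubric score are reinforced.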