From Concept to Practice: an Automated LLM-aided UVM Machine for RTL Verification

arXiv cs.AI / 4/7/2026


Key Points

  • The paper argues that RTL verification is a major IC development bottleneck, with verification often consuming around 70% of total effort, largely due to the cost of building UVM testbenches and generating adequate stimuli.
  • It introduces UVM^2, an automated UVM verification framework that uses LLMs to generate UVM testbenches and then iteratively improves them using coverage feedback.
  • The framework is evaluated on a new benchmark suite of RTL designs up to 1.6K lines of code to test both scalability and verification quality.
  • Reported results indicate UVM^2 can significantly reduce testbench setup time compared with experienced engineers while achieving strong average code coverage (87.44%) and function coverage (89.58%).
  • The authors report improvements over state-of-the-art approaches of 20.96% (code coverage) and 23.51% (function coverage), suggesting that LLM-guided, coverage-driven iteration can materially boost verification effectiveness.

Abstract

Verification presents a major bottleneck in Integrated Circuit (IC) development, consuming nearly 70% of the total development effort. While the Universal Verification Methodology (UVM) is widely used in industry to improve verification efficiency through structured and reusable testbenches, constructing these testbenches and generating sufficient stimuli remain challenging. These challenges arise from the considerable manual coding effort required, repetitive manual execution of multiple EDA tools, and the need for in-depth domain expertise to navigate complex designs. Here, we present UVM^2, an automated verification framework that leverages Large Language Models (LLMs) to generate UVM testbenches and iteratively refine them using coverage feedback, significantly reducing manual effort while maintaining rigorous verification standards. To evaluate UVM^2, we introduce a benchmark suite comprising Register Transfer Level (RTL) designs of up to 1.6K lines of code. The results show that UVM^2 significantly reduces testbench setup time compared to experienced engineers and achieves average code and function coverage of 87.44% and 89.58%, outperforming state-of-the-art solutions by 20.96% and 23.51%, respectively.
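
The abstract describes generating a testbench and then refining it iteratively with coverage feedback. The loop below is a minimal sketch of that general idea, not the paper's implementation. Every name in it (`CoverageReport`, `refine_until_covered`, `toy_propose`, `toy_simulate`) is hypothetical, and the LLM and the simulator are replaced by toy stand-in functions so the example runs on its own.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Set


@dataclass
class CoverageReport:
    """Coverage bins hit so far, measured against the full set of bins."""
    all_bins: Set[str]
    hit_bins: Set[str] = field(default_factory=set)

    @property
    def ratio(self) -> float:
        # An empty bin set is treated as fully covered.
        return len(self.hit_bins) / len(self.all_bins) if self.all_bins else 1.0

    @property
    def missing(self) -> Set[str]:
        return self.all_bins - self.hit_bins


def refine_until_covered(
    all_bins: Set[str],
    propose_tests: Callable[[Set[str]], List[str]],  # stand-in for the LLM step
    simulate: Callable[[str], Set[str]],             # stand-in for the simulator
    target: float = 0.9,
    max_rounds: int = 5,
) -> CoverageReport:
    """Repeat: ask for tests aimed at uncovered bins, run them, merge coverage."""
    report = CoverageReport(all_bins=set(all_bins))
    for _ in range(max_rounds):
        if report.ratio >= target:
            break
        # Coverage feedback: only the still-uncovered bins are passed back.
        for test in propose_tests(report.missing):
            report.hit_bins |= simulate(test) & report.all_bins
    return report


def toy_propose(missing: Set[str]) -> List[str]:
    # Hypothetical generator: targets at most two uncovered bins per round.
    return ["test_" + name for name in sorted(missing)[:2]]


def toy_simulate(test: str) -> Set[str]:
    # Hypothetical simulator: each test hits exactly the bin it is named after.
    return {test[len("test_"):]}


bins = {"reset", "fifo_full", "fifo_empty", "overflow"}
report = refine_until_covered(bins, toy_propose, toy_simulate, target=1.0)
print(f"coverage: {report.ratio:.0%}, missing: {sorted(report.missing)}")
```

In a real flow, `propose_tests` would prompt an LLM with the uncovered coverage items and `simulate` would invoke an EDA simulator and parse its coverage report. The `max_rounds` cap is an assumption added here so the loop stops even when some bins are unreachable.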