Exploring LLM-based Verilog Code Generation with Data-Efficient Fine-Tuning and Testbench Automation

arXiv cs.AI / 4/20/2026


Key Points

  • The paper argues that while LLMs have improved code generation, their application to hardware description languages like Verilog is still comparatively limited.
  • It proposes a workflow that uses multi-agent models to automatically generate testbenches, producing higher-quality fine-tuning data when such resources are scarce.
  • After fine-tuning, the model's performance on the specification-to-Verilog task is comparable to state-of-the-art approaches on the refined VerilogEval v2 benchmark.
  • The approach reaches that level of performance while requiring less training data than typical methods, and it is positioned as a foundation for future HDL generation and automated verification work.

Abstract

Recent advances in large language models have improved code generation, but their use in hardware description languages is still limited. Moreover, training data and testbenches for these models are often scarce. This paper presents a workflow that uses multi-agent models to generate testbenches for high-quality fine-tuning data. By automating testbench creation, the fine-tuned model for the specification-to-Verilog task achieves performance comparable to state-of-the-art methods on the refined VerilogEval v2 benchmark while using less training data. This study provides a basis for future work on LLM-based HDL generation and automated verification.