Intent-aligned Formal Specification Synthesis via Traceable Refinement

arXiv cs.LG / 4/14/2026

📰 NewsIdeas & Deep AnalysisModels & Research

共有:

Key Points

The paper introduces VeriSpecGen, a framework that synthesizes intent-aligned formal specifications in Lean from natural language while maintaining traceability to individual requirements.
It decomposes user intents into atomic requirements, generates requirement-targeted tests with explicit traceability maps, and uses localized clause-level repair when validation fails.
VeriSpecGen reports strong benchmark performance, reaching 86.6% on the VERINA SpecGen task with Claude Opus 4.5 and improving baselines by up to 31.8 points across model families.
Beyond inference, the method produces 343K training examples from refinement trajectories, and training on these trajectories yields reported 62–106% improvements and transfer to general reasoning abilities.

Abstract

Large language models are increasingly used to generate code from natural language, but ensuring correctness remains challenging. Formal verification offers a principled way to obtain such guarantees by proving that a program satisfies a formal specification. However, specifications are frequently missing in real-world codebases, and writing high-quality specifications remains expensive and expertise-intensive. We present VeriSpecGen, a traceable refinement framework that synthesizes intent-aligned specifications in Lean through requirement-level attribution and localized repair. VeriSpecGen decomposes natural language into atomic requirements and generates requirement-targeted tests with explicit traceability maps to validate generated specifications. When validation fails, traceability maps attribute failures to specific requirements, enabling targeted clause-level repairs. VeriSpecGen achieve 86.6% on VERINA SpecGen task using Claude Opus 4.5, improving over baselines by up to 31.8 points across different model families and scales. Beyond inference-time gains, we generate 343K training examples from VeriSpecGen refinement trajectories and demonstrate that training on these trajectories substantially improves specification synthesis by 62-106% relative and transfers gains to general reasoning abilities.