VeriAct: Beyond Verifiability -- Agentic Synthesis of Correct and Complete Formal Specifications
arXiv cs.AI / 4/2/2026
Key Points
- The paper evaluates how well classical and LLM prompt-based methods synthesize Java Modeling Language (JML) specifications, including attempts to improve results through prompt optimization driven by verification feedback.
- It finds a key limitation: higher verifier pass rates do not necessarily imply that synthesized specifications are correct and complete, since the verifier can miss over- or under-constrained specs.
- To measure this gap, the authors introduce Spec-Harness, an evaluation framework using symbolic verification to assess specification correctness and completeness beyond what standard verifier acceptance indicates.
- They propose VeriAct, a verification-guided agentic loop (LLM planning, synthesis/repair, code execution, verification, and Spec-Harness feedback) designed to iteratively produce specs that are both verifiable and genuinely correct and complete.
- Experiments on two benchmark datasets indicate VeriAct outperforms prompt-based and prompt-optimized baselines, reducing the fraction of “verifier-accepted but wrong/incomplete” specifications.
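
The gap between "verifier-accepted" and "correct and complete" described in the key points can be made concrete with a small, hypothetical Java example. The sketch below is not taken from the paper: the class `SpecGapDemo`, its methods, and the choice of `max` are assumptions made purely for illustration. JML annotations are ordinary comments to `javac`, so the postconditions are also written as plain boolean predicates to let the example run without a JML tool. It shows an under-constrained postcondition that holds for a correct implementation and would therefore plausibly pass a verifier, yet also accepts an obviously wrong result.

```java
// Illustrative sketch only; names and example are assumptions, not the paper's code.
// JML annotations (//@ ...) are comments to javac; a JML verifier would read them.
final class SpecGapDemo {

    // Under-constrained (incomplete) spec: true of max, but also of many wrong results.
    //@ ensures \result >= a && \result >= b;
    static int maxWeakSpec(int a, int b) {
        return a >= b ? a : b;
    }

    // Stronger spec: additionally requires the result to be one of the inputs.
    //@ ensures \result >= a && \result >= b;
    //@ ensures \result == a || \result == b;
    static int maxStrongSpec(int a, int b) {
        return a >= b ? a : b;
    }

    // The weak postcondition as a plain predicate over inputs and a candidate result.
    static boolean weakPost(int a, int b, int result) {
        return result >= a && result >= b;
    }

    // The strong postcondition as a plain predicate.
    static boolean strongPost(int a, int b, int result) {
        return weakPost(a, b, result) && (result == a || result == b);
    }

    public static void main(String[] args) {
        int a = 3, b = 5;
        int correct = maxWeakSpec(a, b);   // 5
        int wrong = Integer.MAX_VALUE;     // deliberately wrong "maximum"

        System.out.println(weakPost(a, b, correct));   // true
        System.out.println(weakPost(a, b, wrong));     // true: the weak spec admits a wrong result
        System.out.println(strongPost(a, b, correct)); // true
        System.out.println(strongPost(a, b, wrong));   // false: the strong spec rejects it
    }
}
```

Both specifications are satisfied by the same implementation, so verifier acceptance alone cannot tell them apart. Checking candidate outputs against the postcondition, as done here by hand, is only one simple way to expose the difference; the summary states that Spec-Harness uses symbolic verification for this purpose, and its actual mechanism is not reproduced here.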