Safe and Scalable Web Agent Learning via Recreated Websites

arXiv cs.CL / 3/12/2026



Key Points

VeriEnv proposes cloning real-world websites into fully executable synthetic environments to train web agents, addressing safety and verifiability issues of exploring live sites.
The framework uses language models as environment creators and exposes controlled internal access through a Python SDK, which yields deterministic, programmatically verifiable rewards and eliminates reliance on heuristic or LLM-based judges.
It decouples agent learning from unsafe real-world interaction and enables scalable self-evolution by expanding the number of training environments.
Experiments on web agent benchmarks show that agents trained with VeriEnv generalize to unseen websites, achieve site-specific mastery through self-evolving training, and benefit from scaling the number of training environments.
Code and resources will be released at https://github.com/kyle8581/VeriEnv upon acceptance.

Abstract

Training autonomous web agents is fundamentally limited by the environments they learn from: real-world websites are unsafe to explore, hard to reset, and rarely provide verifiable feedback. We propose VeriEnv, a framework that treats language models as environment creators, automatically cloning real-world websites into fully executable, verifiable synthetic environments. By exposing controlled internal access via a Python SDK, VeriEnv enables agents to self-generate tasks with deterministic, programmatically verifiable rewards, eliminating reliance on heuristic or LLM-based judges. This design decouples agent learning from unsafe real-world interaction while enabling scalable self-evolution through environment expansion. Through experiments on web agent benchmarks, we show that agents trained with VeriEnv generalize to unseen websites, achieve site-specific mastery through self-evolving training, and benefit from scaling the number of training environments. Code and resources will be released at https://github.com/kyle8581/VeriEnv upon acceptance.
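
To make the idea of a deterministic, programmatically verifiable reward concrete, here is a minimal sketch. It does not use the VeriEnv SDK, whose interface is not described in the abstract; `ToyShopEnv`, `Task`, and `reward` are hypothetical names invented for this illustration. The point is only that when the environment's internal state can be inspected, success can be checked with ordinary code instead of a heuristic or an LLM judge.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class ToyShopEnv:
    """Hypothetical stand-in for a recreated website with inspectable state."""
    cart: Dict[str, int] = field(default_factory=dict)

    def reset(self) -> None:
        # A synthetic environment can be restored to a known state at will.
        self.cart.clear()

    def add_to_cart(self, item: str, qty: int = 1) -> None:
        # Stands in for an agent action carried out through the site's UI.
        self.cart[item] = self.cart.get(item, 0) + qty


@dataclass
class Task:
    """A task pairs a natural-language instruction with a state check."""
    instruction: str
    verify: Callable[[ToyShopEnv], bool]


def reward(env: ToyShopEnv, task: Task) -> float:
    # Deterministic: the same final state always produces the same reward.
    return 1.0 if task.verify(env) else 0.0


# Hypothetical task whose verifier reads the environment state directly.
task = Task(
    instruction="Add two units of 'usb-cable' to the cart.",
    verify=lambda env: env.cart.get("usb-cable") == 2,
)

env = ToyShopEnv()

# Episode 1: the agent completes the task.
env.reset()
env.add_to_cart("usb-cable", 2)
success_reward = reward(env, task)

# Episode 2: the agent adds too few items.
env.reset()
env.add_to_cart("usb-cable", 1)
failure_reward = reward(env, task)

print(success_reward, failure_reward)
```

Because the verifier is a plain function over environment state, the reward needs no model call and repeated evaluations of the same trajectory agree with each other.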