AI2's fully open web agent MolmoWeb navigates the web using only screenshots

THE DECODER / 3/26/2026

Key Points

  • AI2 has released MolmoWeb, a fully open web agent that navigates websites using only visual inputs (screenshots) rather than traditional page text or DOM access.
  • Despite their relatively small sizes (4B and 8B parameters), the MolmoWeb models achieve strong results on standard web-navigation benchmarks.
  • MolmoWeb outperforms several larger proprietary systems on standard benchmarks, suggesting that an efficiency-focused approach to web agents can be competitive.
  • By keeping the agent “fully open,” AI2 aims to enable broader experimentation and integration by researchers and developers.
  • The screenshots-only interface highlights a design choice that may improve robustness to webpage structure changes compared with text/HTML-dependent methods.
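
To make the screenshots-only idea concrete, here is a minimal sketch of an observe-act loop in which the agent only ever sees pixels. It is not MolmoWeb's code or API: `ToyBrowser`, `toy_policy`, `Action`, and `run_agent` are hypothetical names invented for illustration, and the "policy" is a trivial rule standing in for a vision-language model.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Screenshot = List[List[int]]  # toy grayscale pixel grid, indexed [y][x]


@dataclass(frozen=True)
class Action:
    kind: str  # "click" or "stop" in this sketch
    x: int = 0
    y: int = 0


class ToyBrowser:
    """Hypothetical stand-in for a browser: exposes pixels only, no DOM or text."""

    def __init__(self, button_xy: Tuple[int, int], size: int = 8) -> None:
        self.button_xy = button_xy
        self.size = size
        self.clicked = False

    def screenshot(self) -> Screenshot:
        # Render a blank page with one bright pixel for the unclicked button.
        grid = [[0] * self.size for _ in range(self.size)]
        if not self.clicked:
            x, y = self.button_xy
            grid[y][x] = 255
        return grid

    def execute(self, action: Action) -> None:
        # Only a click exactly on the button changes the page state.
        if action.kind == "click" and (action.x, action.y) == self.button_xy:
            self.clicked = True


def toy_policy(task: str, shot: Screenshot) -> Action:
    """Hypothetical policy: click the first bright pixel, stop if none remain.

    A real agent would call a vision-language model here instead.
    """
    for y, row in enumerate(shot):
        for x, value in enumerate(row):
            if value == 255:
                return Action("click", x=x, y=y)
    return Action("stop")


def run_agent(
    task: str,
    browser: ToyBrowser,
    policy: Callable[[str, Screenshot], Action],
    max_steps: int = 10,
) -> List[Action]:
    """Loop: take a screenshot, ask the policy for an action, execute it."""
    history: List[Action] = []
    for _ in range(max_steps):
        shot = browser.screenshot()   # the only observation the agent gets
        action = policy(task, shot)
        history.append(action)
        if action.kind == "stop":
            break
        browser.execute(action)
    return history


if __name__ == "__main__":
    browser = ToyBrowser(button_xy=(3, 5))
    for step in run_agent("press the button", browser, toy_policy):
        print(step)
```

Because the policy receives only the pixel grid, nothing in the loop depends on how the page is structured underneath, which is the robustness argument made for screenshot-based agents.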

AI2 releases MolmoWeb, a fully open web agent that navigates websites using only screenshots. Despite having just 4 and 8 billion parameters, the models beat several larger proprietary systems on standard benchmarks.
