Speeding up agentic workflows with WebSockets in the Responses API

OpenAI Blog / 4/22/2026

Key Points

  • The article explains the “Codex agent loop” and how WebSockets can streamline communication within agentic workflows by reducing repeated API overhead.
  • It describes using connection-scoped caching to reuse relevant context during a session, which helps lower latency.
  • The piece focuses on practical architecture choices for the Responses API to improve end-to-end model responsiveness.
  • Overall, the article frames WebSocket-based, session-aware design as a way to speed up agent execution compared with more stateless request patterns.

A deep dive into the Codex agent loop, showing how WebSockets and connection-scoped caching reduced API overhead and improved model latency.
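
To make the connection-scoped caching idea from the key points concrete, here is a minimal in-memory sketch. It is a toy model, not the Responses API protocol: there is no network or WebSocket involved, and the names `StatelessServer`, `ConnectionSession`, `run_stateless`, and `run_connection_scoped` are hypothetical, introduced only for illustration. It counts how many history items the client has to send over a multi-turn agent loop in each pattern.

```python
from dataclasses import dataclass, field


@dataclass
class StatelessServer:
    """Hypothetical server: every request must carry the full history."""
    items_received: int = 0

    def handle(self, full_history: list[str]) -> str:
        # The whole conversation is re-sent and re-processed each turn.
        self.items_received += len(full_history)
        return f"reply-{len(full_history)}"


@dataclass
class ConnectionSession:
    """Hypothetical per-connection state: history is cached for the
    lifetime of one connection, so each turn sends only new items."""
    cached_history: list[str] = field(default_factory=list)
    items_received: int = 0

    def handle(self, new_items: list[str]) -> str:
        # Only the delta crosses the wire; the rest is already cached.
        self.items_received += len(new_items)
        self.cached_history.extend(new_items)
        reply = f"reply-{len(self.cached_history)}"
        self.cached_history.append(reply)
        return reply


def run_stateless(tool_outputs: list[str]) -> int:
    """Agent loop where the client resends everything on each turn."""
    server = StatelessServer()
    history: list[str] = []
    for output in tool_outputs:
        history.append(output)
        reply = server.handle(history)
        history.append(reply)
    return server.items_received


def run_connection_scoped(tool_outputs: list[str]) -> int:
    """Agent loop where the client sends only the newest tool output."""
    session = ConnectionSession()
    for output in tool_outputs:
        session.handle([output])
    return session.items_received


if __name__ == "__main__":
    outputs = ["ls", "cat README", "pytest", "git diff"]
    print("stateless items sent:        ", run_stateless(outputs))
    print("connection-scoped items sent:", run_connection_scoped(outputs))
```

In this toy model the stateless loop's payload grows with every turn, while the connection-scoped loop sends a constant amount per turn. That is the general shape of the saving the summary describes; the actual figures and mechanism in the article may differ.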