Speeding up agentic workflows with WebSockets in the Responses API
OpenAI Blog / 4/22/2026
Tags: Opinion · Developer Stack & Infrastructure · Ideas & Deep Analysis · Tools & Practical Usage
Key Points
- The article explains the “Codex agent loop” and how WebSockets can streamline communication in agentic workflows by reducing repeated per-request API overhead.
- It describes connection-scoped caching, which reuses relevant context across the turns of a session and thereby lowers latency.
- The piece focuses on practical architecture choices for the Responses API that improve end-to-end model responsiveness.
- Overall, it frames WebSocket-based, session-aware design as a way to speed up agent execution compared with stateless request-per-call patterns.
A deep dive into the Codex agent loop, showing how WebSockets and connection-scoped caching reduced API overhead and improved model latency.
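To make the contrast concrete, here is a minimal sketch of why a persistent, session-aware connection sends less data per turn than a stateless request pattern. All class and variable names are illustrative assumptions, not the actual Responses API: the `SessionClient` models a WebSocket session whose server caches context already sent on the connection, so each turn transmits only the delta, while the `StatelessClient` re-sends the full transcript every time.

```python
# Illustrative sketch only -- these classes are NOT the Responses API.
# They model the payload-size difference between stateless requests
# and a connection-scoped cache in a multi-turn agent loop.

class StatelessClient:
    """Every request re-sends the entire conversation context."""
    def __init__(self) -> None:
        self.bytes_sent = 0

    def request(self, context: list[str], prompt: str) -> None:
        payload = "\n".join(context + [prompt])
        self.bytes_sent += len(payload.encode())

class SessionClient:
    """Models a WebSocket session: context already transmitted on this
    connection is cached server-side, so each turn sends only new items."""
    def __init__(self) -> None:
        self.bytes_sent = 0
        self._cached_items = 0  # context items the server has already seen

    def request(self, context: list[str], prompt: str) -> None:
        new_items = context[self._cached_items:]  # unseen context only
        payload = "\n".join(new_items + [prompt])
        self.bytes_sent += len(payload.encode())
        self._cached_items = len(context)

# Simulate a 5-turn agent loop; the transcript grows every turn,
# as it would when tool calls and results accumulate.
context: list[str] = []
stateless, session = StatelessClient(), SessionClient()
for turn in range(5):
    prompt = f"tool call {turn}"
    stateless.request(context, prompt)
    session.request(context, prompt)
    context.append(prompt)
    context.append(f"result {turn}")

print(f"stateless: {stateless.bytes_sent} bytes, "
      f"session-cached: {session.bytes_sent} bytes")
```

The stateless client's traffic grows quadratically with the number of turns (each request repeats the whole transcript), while the session client's grows roughly linearly, which is the core argument for keeping agentic sessions on one connection.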


