Unlocking the Codex Harness — How OpenAI Built the App Server

OpenAI has published the architecture behind the Codex App Server — a bidirectional JSON-RPC layer that powers every Codex surface from CLI to IDE to web, all running the same agent harness.
One Harness to Rule Them All
Codex, OpenAI's AI coding agent, runs across very different surfaces: a web app, a CLI, a VS Code extension, a JetBrains plugin, an Xcode integration, and a native macOS desktop app. Despite that diversity, all of them run the same underlying agent loop. What makes this possible is the Codex App Server, a bidirectional JSON-RPC API that exposes the Codex harness to any client that wants to embed it.
OpenAI's post, written by engineer Celia Chen, is a rare look inside the infrastructure that keeps a production AI agent consistent across very different environments, and a useful guide for any developer who wants to build on top of it.
Why the App Server Exists
Codex CLI started as a terminal UI. When the VS Code extension was built, the team needed a way to reuse the same agent logic without re-implementing it. They first tried exposing Codex as an MCP server, but MCP semantics didn't translate well to VS Code's richer interaction needs — streaming progress, live diffs, approval flows. So they built a JSON-RPC protocol that mirrored the TUI loop. What started as an internal shortcut gradually became the standard protocol powering every Codex surface.
Three Primitives That Model Agent Interaction
Agent interactions aren't simple request/response exchanges. One user prompt can unfold into a long sequence of tool calls, file edits, reasoning steps, and diffs — all of which need to be faithfully represented to the client UI. The App Server handles this with three nested primitives:
- Item — the atomic unit of input or output. Each item has a lifecycle: started → delta events → completed. Examples include a user message, a reasoning block, a tool call, or a diff.
- Turn — a unit of agent work triggered by one user request. A turn contains many items and ends when the agent finishes or requests human input.
- Thread — a durable conversation session. Threads persist across turns and can be resumed, forked, or archived. This is what allows Codex Web to continue working even if the browser tab closes.
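The nesting of the three primitives can be sketched as a small data model plus a function that folds streamed events into it. This is only an illustration: the class fields and the event names (`turn_started`, `item_started`, `item_delta`, `item_completed`, `turn_completed`) are assumptions made for the example, not the App Server's actual schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Item:
    """Atomic unit of input or output (hypothetical fields)."""
    id: str
    kind: str              # e.g. "user_message", "reasoning", "tool_call", "diff"
    text: str = ""         # content accumulated from delta events
    completed: bool = False


@dataclass
class Turn:
    """One unit of agent work; holds the items it produced."""
    id: str
    items: Dict[str, Item] = field(default_factory=dict)
    done: bool = False


@dataclass
class Thread:
    """Durable session that outlives individual turns."""
    id: str
    turns: List[Turn] = field(default_factory=list)


def apply_event(thread: Thread, event: dict) -> None:
    """Fold one streamed event into the thread state.

    Event names are illustrative assumptions, chosen to mirror the
    started -> delta -> completed lifecycle described above.
    """
    etype = event["type"]
    if etype == "turn_started":
        thread.turns.append(Turn(id=event["turn_id"]))
        return
    turn = thread.turns[-1]  # events apply to the most recent turn
    if etype == "item_started":
        turn.items[event["item_id"]] = Item(id=event["item_id"], kind=event["kind"])
    elif etype == "item_delta":
        turn.items[event["item_id"]].text += event["delta"]
    elif etype == "item_completed":
        turn.items[event["item_id"]].completed = True
    elif etype == "turn_completed":
        turn.done = True
```

A client UI built this way can render partial items as deltas arrive and mark them final on completion. Because the thread object holds all accumulated state, a reconnecting client only needs the thread to catch up.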
The protocol is fully bidirectional — the server can initiate requests mid-turn, for example pausing to ask the user for an approval before executing a destructive command. The client responds and the turn resumes.
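In JSON-RPC 2.0 terms, a message carrying both `method` and `id` is a request that expects a reply, a message with `method` but no `id` is a notification, and anything else is a response. The sketch below shows how a client might answer a server-initiated approval request under that rule. The method name `approval/request` and the shape of `params` and `result` are hypothetical, chosen only for illustration.

```python
import json
from typing import Callable, Optional


def classify(msg: dict) -> str:
    """Classify an incoming message using standard JSON-RPC 2.0 rules."""
    if "method" in msg:
        return "request" if "id" in msg else "notification"
    return "response"


def handle_server_message(raw: str, approve: Callable[[dict], bool]) -> Optional[str]:
    """Return a serialized reply if the server asked for one, else None.

    `approve` is a callback that decides (for example, by prompting the
    user) whether the described action may proceed.
    """
    msg = json.loads(raw)
    if classify(msg) != "request":
        return None  # notifications and responses need no reply

    if msg["method"] == "approval/request":  # hypothetical method name
        decision = "approved" if approve(msg["params"]) else "denied"
        reply = {"jsonrpc": "2.0", "id": msg["id"], "result": {"decision": decision}}
    else:
        # -32601 is the standard JSON-RPC "Method not found" error code.
        reply = {
            "jsonrpc": "2.0",
            "id": msg["id"],
            "error": {"code": -32601, "message": "Method not found"},
        }
    return json.dumps(reply)
```

Echoing the request's `id` in the reply is what lets the server match the answer to the paused turn and resume it.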
How Different Clients Connect
The architecture is flexible by design, with each client type integrating differently:
- Local IDEs and desktop apps (VS Code, JetBrains, macOS app) bundle a platform-specific App Server binary, launch it as a child process, and keep a persistent stdio channel open for JSON-RPC. The binary is pinned to a tested version so the client always runs validated code.
- Xcode and partners with slower release cycles decouple client and server versions — keeping the client stable while pointing to a newer App Server binary. The backward-compatible protocol design makes this safe.
- Codex Web runs the harness in a container. A worker provisions the container, launches the App Server inside it, and the browser communicates via HTTP and Server-Sent Events. Work continues server-side even if the browser tab closes — new sessions reconnect and catch up without rebuilding state.
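The local pattern (launch a child process, keep stdio open, exchange JSON-RPC messages) can be sketched as follows. To keep the example runnable without the real binary, it spawns a tiny stand-in server written in Python. The newline-delimited framing, the echo behavior, and any method names passed to `call` are assumptions for illustration, not the App Server's documented behavior.

```python
import json
import subprocess
import sys

# Stand-in for the bundled App Server binary: echoes each request's params
# back as the result. A real client would launch the actual binary instead.
FAKE_SERVER = r"""
import json, sys
for line in sys.stdin:
    req = json.loads(line)
    resp = {"jsonrpc": "2.0", "id": req["id"], "result": {"echo": req["params"]}}
    sys.stdout.write(json.dumps(resp) + "\n")
    sys.stdout.flush()
"""


class StdioClient:
    """Minimal JSON-RPC client over a child process's stdin/stdout."""

    def __init__(self, argv):
        self.proc = subprocess.Popen(
            argv,
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
            bufsize=1,  # line-buffered, one JSON message per line (assumed framing)
        )
        self.next_id = 0

    def call(self, method, params):
        """Send one request and block until the next line arrives.

        Simplification: assumes the next line is the matching response.
        A real client would also dispatch interleaved notifications and
        server-initiated requests.
        """
        self.next_id += 1
        req = {"jsonrpc": "2.0", "id": self.next_id, "method": method, "params": params}
        self.proc.stdin.write(json.dumps(req) + "\n")
        self.proc.stdin.flush()
        resp = json.loads(self.proc.stdout.readline())
        return resp["result"]

    def close(self):
        """Close stdin so the child sees EOF, then wait for it to exit."""
        self.proc.stdin.close()
        self.proc.wait(timeout=5)
        self.proc.stdout.close()
```

With the stand-in server, `StdioClient([sys.executable, "-c", FAKE_SERVER])` creates a working client. An IDE extension would pass the path of its pinned binary in `argv` and keep the process alive for the lifetime of the session.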
Why JSON-RPC Instead of MCP
The team tried MCP first and moved away from it. MCP (Model Context Protocol) is designed as a general protocol for connecting AI applications to external tools and data sources, in the spirit of what LSP did for language tooling. The App Server solves a narrower, deeper problem: managing stateful, long-running agent sessions with approval flows, streaming diffs, and thread persistence. These semantics don't map cleanly onto MCP. The two coexist, and Codex still exposes an MCP interface for simpler integrations, but the App Server is the first-class protocol OpenAI will maintain going forward.
Client bindings are available in Go, Python, TypeScript, Swift, and Kotlin, with schema generation tools built into the CLI. All source code is open in the Codex CLI repository.
What This Means for Developers
The publication of the App Server architecture is an invitation. Any tool that wants to embed a production-grade coding agent — a custom IDE, an internal dev platform, a CI/CD system — now has a stable, documented, backward-compatible protocol to build on. The same harness that powers OpenAI's own surfaces is available to anyone willing to implement the JSON-RPC client. That's a meaningful opening in a space where most AI coding agent internals remain proprietary.