[RFC] Replace HTTP+SSE with new "Streamable HTTP" transport by jspahrsummers · Pull Request #206 · modelcontextprotocol/modelcontextprotocol

This PR introduces the Streamable HTTP transport for MCP, addressing key limitations of the current HTTP+SSE transport while maintaining its advantages.

Our deep appreciation to @atesgoral and @topherbullock (Shopify), @samuelcolvin and @Kludex (Pydantic), @calclavia, Cloudflare, LangChain, Vercel, the Anthropic team, and many others in the MCP community for their thoughts and input! This proposal was only possible thanks to the valuable feedback received in the GitHub Discussion.

TL;DR

As compared with the current HTTP+SSE transport:

We remove the /sse endpoint
All client → server messages go through the /message (or similar) endpoint
Any client → server request can be upgraded by the server to an SSE stream, which the server can then use to send notifications and requests
Servers can choose to establish a session ID to maintain state
Clients can initiate an SSE stream with an empty GET to /message

This approach can be implemented backwards compatibly, and allows servers to be fully stateless if desired.
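
As a rough illustration of what this means for a client, the sketch below decodes the body of a POST to /message whether the server answered with a single JSON response or upgraded to SSE. It assumes the server signals the upgrade through the standard `Content-Type: text/event-stream` header and that each SSE event carries one JSON-RPC message in its `data` field; `parse_sse_data` and `decode_response` are hypothetical helper names, not part of this proposal.

```python
import json
import re


def parse_sse_data(stream_text: str) -> list:
    """Return the data payload of every complete SSE event in the text."""
    events, data_lines = [], []
    # SSE lines end in CRLF, LF, or CR.
    for line in re.split(r"\r\n|\n|\r", stream_text):
        if line == "":
            # A blank line dispatches the event collected so far.
            if data_lines:
                events.append("\n".join(data_lines))
                data_lines = []
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # One leading space after the colon is not part of the value.
            data_lines.append(value[1:] if value.startswith(" ") else value)
        # Comments and other fields (event, id, retry) are ignored in this sketch.
    # An event with no terminating blank line is incomplete and is dropped.
    return events


def decode_response(content_type: str, body: str) -> list:
    """Turn an HTTP response body into a list of JSON-RPC messages.

    Assumption: the server picks between plain JSON and SSE via Content-Type.
    """
    media_type = content_type.split(";")[0].strip().lower()
    if media_type == "text/event-stream":
        return [json.loads(data) for data in parse_sse_data(body)]
    if media_type == "application/json":
        return [json.loads(body)]
    raise ValueError(f"unexpected content type: {content_type!r}")
```

A real client would parse the stream incrementally as bytes arrive rather than waiting for the full body; the buffered version here only shows the branching.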

Motivation

Remote MCP currently works over the HTTP+SSE transport, which:

Does not support resumability
Requires the server to maintain a long-lived connection with high availability
Can only deliver server messages over SSE

Benefits

Stateless servers are now possible—eliminating the requirement for high availability long-lived connections
Plain HTTP implementation—MCP can be implemented in a plain HTTP server without requiring SSE
Infrastructure compatibility—it's "just HTTP," ensuring compatibility with middleware and infrastructure
Backwards compatibility—this is an incremental evolution of our current transport
Flexible upgrade path—servers can choose to use SSE for streaming responses when needed

Example use cases

Stateless server

A completely stateless server, without support for long-lived connections, can be implemented under this proposal.

For example, a server that just offers LLM tools and utilizes no other features could be implemented like so:

Always acknowledge initialization (but no need to persist any state from it)
Respond to any incoming ListToolsRequest with a single JSON-RPC response
Handle any CallToolRequest by executing the tool, waiting for it to complete, then sending a single CallToolResponse as the HTTP response body
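
The three steps above could look roughly like the following, independent of any HTTP framework. This is a minimal sketch under stated assumptions: `handle_post`, the `TOOLS` table, and the `add` tool are hypothetical, the result shapes follow the usual MCP `initialize`, `tools/list`, and `tools/call` conventions in simplified form, and the returned dict stands for the JSON body of the HTTP response.

```python
from typing import Optional

# Hypothetical tool table: tool name -> callable taking the arguments dict.
TOOLS = {"add": lambda args: args["a"] + args["b"]}


def handle_post(message: dict) -> Optional[dict]:
    """Answer one JSON-RPC message without keeping any state between calls."""
    msg_id = message.get("id")
    if msg_id is None:
        # Notifications (no id) need no JSON-RPC response.
        return None

    method = message.get("method")
    params = message.get("params", {})

    if method == "initialize":
        # Acknowledge, but persist nothing from the handshake.
        result = {
            "protocolVersion": params.get("protocolVersion"),  # echoed back
            "capabilities": {"tools": {}},
            "serverInfo": {"name": "example-server", "version": "0.0.0"},
        }
    elif method == "tools/list":
        result = {
            "tools": [{"name": n, "inputSchema": {"type": "object"}} for n in TOOLS]
        }
    elif method == "tools/call":
        tool = TOOLS.get(params.get("name"))
        if tool is None:
            return {"jsonrpc": "2.0", "id": msg_id,
                    "error": {"code": -32602, "message": "unknown tool"}}
        # Run the tool to completion, then return one response.
        output = tool(params.get("arguments", {}))
        result = {"content": [{"type": "text", "text": str(output)}]}
    else:
        return {"jsonrpc": "2.0", "id": msg_id,
                "error": {"code": -32601, "message": "method not found"}}

    return {"jsonrpc": "2.0", "id": msg_id, "result": result}
```

Because every branch depends only on the incoming message, any node behind a load balancer can serve any request.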

Stateless server with streaming

A server that is fully stateless and does not support long-lived connections can still take advantage of streaming in this design.

For example, to issue progress notifications during a tool call:

When the incoming POST request is a CallToolRequest, server indicates the response will be SSE
Server starts executing the tool
Server sends any number of ProgressNotifications over SSE while the tool is executing
When the tool execution completes, the server sends a CallToolResponse over SSE
Server closes the SSE stream
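
The sequence above can be sketched as a generator that yields SSE frames for the response body. Everything here is illustrative: `sse_frame`, `stream_tool_call`, and `slow_sum` are hypothetical names, the tool is modelled as a Python generator that yields `(progress, total)` pairs and returns its final value, and the notification shape assumes MCP's `notifications/progress` with a `progressToken` supplied by the client in `params._meta`.

```python
import json
from typing import Callable, Iterator


def sse_frame(message: dict) -> str:
    """Encode one JSON-RPC message as a single SSE event."""
    return f"data: {json.dumps(message)}\n\n"


def stream_tool_call(request: dict, tool: Callable[[dict], Iterator]) -> Iterator[str]:
    """Yield progress notifications, then the final response, as SSE frames."""
    params = request.get("params", {})
    token = params.get("_meta", {}).get("progressToken")
    steps = tool(params.get("arguments", {}))
    while True:
        try:
            progress, total = next(steps)
        except StopIteration as done:
            result = done.value  # the generator's return value
            break
        if token is not None:
            # Only report progress if the client asked for it.
            yield sse_frame({
                "jsonrpc": "2.0",
                "method": "notifications/progress",
                "params": {"progressToken": token,
                           "progress": progress, "total": total},
            })
    yield sse_frame({
        "jsonrpc": "2.0",
        "id": request["id"],
        "result": {"content": [{"type": "text", "text": str(result)}]},
    })
    # Returning here ends the generator; the server then closes the stream.


def slow_sum(arguments: dict):
    """Hypothetical tool: sums numbers, reporting progress after each one."""
    numbers = arguments["numbers"]
    running = 0
    for done_count, n in enumerate(numbers, start=1):
        running += n
        yield done_count, len(numbers)
    return running
```

The server holds no state once the generator finishes, so the stream lives exactly as long as the tool call it belongs to.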

Stateful server

A stateful server would be implemented very similarly to today. The main difference is that the server will need to generate a session ID, and the client will need to pass that back with every request.

The server can then use the session ID for sticky routing or for routing messages on a message bus. That is, a POST message can arrive at any server node in a horizontally scaled deployment, so it must be routed to the existing session using a broker like Redis.
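
To make the message-bus option concrete, here is a small in-process sketch. It rests on several assumptions: the header name `Mcp-Session-Id` is a placeholder (this description does not fix how the session ID is transmitted), `InMemoryBroker` stands in for a real broker such as Redis, and the shared `sessions` set stands in for a shared session store. None of these names come from the proposal.

```python
import secrets
from collections import defaultdict, deque

SESSION_HEADER = "Mcp-Session-Id"  # assumed header name, for illustration only


class InMemoryBroker:
    """Stand-in for a real message bus: one queue per session ID."""

    def __init__(self):
        self._queues = defaultdict(deque)

    def publish(self, session_id: str, message: dict) -> None:
        self._queues[session_id].append(message)

    def drain(self, session_id: str) -> list:
        queue = self._queues[session_id]
        messages = list(queue)
        queue.clear()
        return messages


class Node:
    """One server node in a horizontally scaled deployment."""

    def __init__(self, broker: InMemoryBroker, sessions: set):
        self.broker = broker
        self.sessions = sessions  # shared across nodes in this sketch

    def handle_post(self, headers: dict, message: dict) -> dict:
        """Return the response headers; route the message by session ID."""
        if message.get("method") == "initialize":
            # The server, not the client, generates the session ID.
            session_id = secrets.token_urlsafe(16)
            self.sessions.add(session_id)
            return {SESSION_HEADER: session_id}
        session_id = headers.get(SESSION_HEADER)
        if session_id not in self.sessions:
            raise KeyError("unknown or missing session ID")
        # Whichever node received the POST forwards it to the session's owner.
        self.broker.publish(session_id, message)
        return {SESSION_HEADER: session_id}
```

In this arrangement the node that owns the long-lived session would consume its queue with `drain` (or a blocking equivalent), which is what lets a POST land on any node.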

Why not WebSocket?

The core team thoroughly discussed making WebSocket the primary remote transport (instead of SSE), and applying similar work to it to make it disconnectable and resumable. We ultimately decided not to pursue WS right now because:

Using MCP in an "RPC-like" way (e.g., a stateless MCP server that just exposes basic tools) would incur a lot of unnecessary operational and network overhead if a WebSocket were required for each call.
From a browser, there is no way to attach headers (like Authorization) to a WebSocket connection, and unlike SSE, third-party libraries cannot reimplement WebSocket from scratch in the browser.
Only GET requests can be transparently upgraded to WebSocket (other HTTP methods are not supported for upgrading), meaning that some kind of two-step upgrade process would be required on a POST endpoint, introducing complexity and latency.

We're also avoiding making WebSocket an additional option in the spec, because we want to limit the number of transports officially specified for MCP, to avoid a combinatorial compatibility problem between clients and servers. (Although this does not prevent community adoption of a non-standard WebSocket transport.)

The proposal in this doc does not preclude further exploration of WebSocket in future, if we conclude that SSE has not worked well.

To do

Move session ID responsibility to server
- Define acceptable space of session IDs
- Ensure session IDs are introspectable by middleware/WAF
Make cancellation explicit
Require centralized SSE GET for server -> client requests and notifications
Convert resumability into a per-stream concept
Design a way to proactively "end session"
"if the client has an auth token, it should include it in every MCP request"

Follow ups

Standardize support for JSON-RPC batching
Support for streaming request bodies?
Put some recommendations about timeouts into the spec, and maybe codify conventions like "issuing a progress notification should reset default timeouts."