AI Transport exists because the default pattern for interactive AI experiences, direct streamed HTTP requests between clients and agents, limits the quality and richness of the interactions you can build.
Most AI frameworks support simple client-driven interactions, with the agent streaming its response via server-sent events (SSE) or similar HTTP streaming. The client's request is handled by an agent instance, which pipes tokens back as the response. This approach is easy to build, surprisingly effective for basic interactions, and supported by every framework. However, that simplicity is also the source of its limitations.
Limitations of HTTP streaming
There are several limitations associated with HTTP streaming that arise from the coupling between the client-agent interaction and the transport layer.
Streams fail on disconnection
The operation of a response stream is tied to the health of the underlying connection. When the connection drops, the response stream fails.
This happens all the time in practice: a phone switches from Wi-Fi to cellular, a user refreshes the page, a laptop lid closes mid-response. The LLM continues to generate tokens, but there is no longer any way to deliver them to the client.
SSE is the default streaming transport for most AI frameworks. The SSE protocol does include a mechanism for a reconnecting client to specify the position in the stream to resume from (the Last-Event-ID request header). However, this is usually not supported in practice because it would require a significant increase in complexity in the backend. To support resumable streams with SSE, you would need to assign sequence numbers to token events for ordering, buffer those events in an external store, and build a resume handler. This is a big departure from a simple, stateless request handler. Even having done that, you have only addressed part of the problem: the solution would not support continuity of streams after a page refresh, because SSE's resume mechanism only applies to the automatic reconnection of an existing stream, not to a freshly loaded page.
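As a rough illustration of the extra machinery that resumable SSE calls for, the sketch below sequences token events, buffers them, and returns the ones a reconnecting client missed. The names ResumableStreamBuffer, eventsAfter and toSseFrame are hypothetical, and the in-memory array stands in for the external store that a real multi-instance backend would need.

```typescript
// Hypothetical sketch: not taken from any framework's API.
interface TokenEvent {
  id: number;   // monotonically increasing sequence number
  data: string; // token text
}

class ResumableStreamBuffer {
  private events: TokenEvent[] = [];
  private nextId = 1;

  // Called by the agent for every token, whether or not a client is connected.
  append(data: string): TokenEvent {
    const event: TokenEvent = { id: this.nextId++, data };
    this.events.push(event);
    return event;
  }

  // Resume handler: return every event after the id the client last saw.
  // A missing or unparsable id means "start from the beginning".
  eventsAfter(lastEventId: string | undefined): TokenEvent[] {
    const parsed = Number.parseInt(lastEventId ?? "", 10);
    const from = Number.isNaN(parsed) ? 0 : parsed;
    return this.events.filter((event) => event.id > from);
  }
}

// Serialise one event in the SSE wire format. Multi-line data is split
// across several "data:" lines, as the SSE format requires.
function toSseFrame(event: TokenEvent): string {
  const dataLines = event.data
    .split("\n")
    .map((line) => `data: ${line}`)
    .join("\n");
  return `id: ${event.id}\n${dataLines}\n\n`;
}

// Usage: the client saw event 1, dropped, and reconnects with Last-Event-ID: 1.
const buffer = new ResumableStreamBuffer();
["Hello", ",", " world"].forEach((token) => buffer.append(token));
const missed = buffer.eventsAfter("1");
const replayFrames = missed.map(toSseFrame);
```

Even this small version turns a stateless handler into one that owns per-stream state, and it still assumes the client remembers the last id it received.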
Sessions don't span devices
With HTTP streaming, the connection is exclusively between the requesting client and the agent. A second tab or a phone can't access that same stream. It only exists for the client that initiated the request.
In reality, users move between surfaces constantly, whether that's a second browser tab or an app on their phone. Without shared access to sessions, each surface is isolated. There's no way for a new client to see the in-progress stream, the conversation history, or the session's current state.
Clients can't reach the agent
An SSE request initiated by the client is one-way: server to client. The client has no way to send a signal to the agent through the same connection once the initial request has been made. The only thing the client can do is to read the stream to completion, or cancel it by closing the connection.
Having cancellation as the sole way to signal to the agent creates a fundamental conflict. Take a 'stop' button that cancels an in-progress stream, for example. You could implement it with request cancellation, but the agent cannot tell a deliberate cancellation from a dropped connection. If every closed connection is interpreted as a cancellation, then any disconnection also stops the LLM response, and you lose the ability to resume the stream.
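The conflict can be laid out as a small truth table. In this sketch, which uses hypothetical names throughout, the handler learns only that the connection closed, so its reaction depends on a fixed policy rather than on the reason. Whichever policy it picks, at least one reason is handled wrongly.

```typescript
// Hypothetical model of the stop-button conflict; not a real framework API.
type CloseReason = "stop-button" | "network-drop" | "page-refresh";
type ClosePolicy = "treat-close-as-cancel" | "keep-generating";

// The user only wants generation to end when they pressed stop.
function userWantsCancel(reason: CloseReason): boolean {
  return reason === "stop-button";
}

// The handler never sees the reason, so its decision depends on policy alone.
function handlerCancels(policy: ClosePolicy): boolean {
  return policy === "treat-close-as-cancel";
}

// Reasons for which a given policy does the wrong thing.
function mishandled(policy: ClosePolicy): CloseReason[] {
  const reasons: CloseReason[] = ["stop-button", "network-drop", "page-refresh"];
  return reasons.filter((reason) => handlerCancels(policy) !== userWantsCancel(reason));
}

const wrongWhenCancelling = mishandled("treat-close-as-cancel");
const wrongWhenContinuing = mishandled("keep-generating");
```

Treating a close as a cancel breaks resumption after drops and refreshes, while ignoring closes makes the stop button ineffective. An explicit cancel message, separate from the connection lifecycle, removes the ambiguity.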
Even with a bidirectional transport between client and agent, such as WebSockets, the connection would still be an exclusive pipe. Other devices have no upstream channel to the agent, so you can't interrupt or steer from a second device.
Multi-agent architectures are complex
In multi-agent systems, an orchestrator handles the client's connection and delegates to specialized subagents. When there is an exclusive, point-to-point connection between the client and the orchestrator agent, all interactions with subagents must be proxied by the orchestrator. If you want users to see intermediate events from subagents, such as progress or responses, every update must be mediated by the orchestrator, adding complexity and coupling.
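A minimal sketch of that proxying, assuming hypothetical subagents (researchAgent, summaryAgent) and a hypothetical sendToClient callback that represents the single client connection, might look like this:

```typescript
// Hypothetical shapes: none of these names come from a real framework.
interface SubagentEvent {
  agent: string;
  type: "progress" | "response";
  text: string;
}

type Emit = (event: SubagentEvent) => void;
type Subagent = (task: string, emit: Emit) => void;

const researchAgent: Subagent = (task, emit) => {
  emit({ agent: "research", type: "progress", text: `searching for ${task}` });
  emit({ agent: "research", type: "response", text: `notes on ${task}` });
};

const summaryAgent: Subagent = (task, emit) => {
  emit({ agent: "summary", type: "response", text: `summary of ${task}` });
};

// The orchestrator owns the only connection to the client, so every subagent
// event has to pass through this relay before the user can see it.
function orchestrate(task: string, sendToClient: (frame: string) => void): void {
  const relay: Emit = (event) => sendToClient(JSON.stringify(event));
  researchAgent(task, relay);
  summaryAgent(task, relay);
}

// Usage: collect the frames that would be written to the client's stream.
const clientFrames: string[] = [];
orchestrate("durable sessions", (frame) => clientFrames.push(frame));
```

Every new subagent or event type has to be threaded through the relay, which is the coupling the paragraph above describes.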
Durable sessions
These problems all stem from the coupling between client-to-agent interaction and the transport layer used to mediate that interaction. The transport, which includes the connection, request and streamed response, is ephemeral and only exists for the lifetime of that single interaction. It is also exclusive, so no other agent or client instance can interact with it.
The pattern that engineering teams are adopting to solve these problems is to break that coupling, through the idea of a durable session: a shared, persistent medium through which clients and agents interact, instead of an exclusive pipe between one client and one agent.
A durable session provides three capabilities that direct HTTP streaming does not:
- Resilient delivery. Streams survive connection drops, device switches, page refreshes, and process restarts. The client resumes from a known position. The agent continues publishing regardless of client connectivity. No events are lost and no events are duplicated.
- Continuity across surfaces. The session follows the user, not the connection. Open a second tab, switch to a phone, come back hours later. Every surface sees the same session state. Any client with the session identifier can attach and hydrate.
- Live control. Any participant can communicate with any other participant through the session while work is in progress. Cancel a generation from a different device. Steer an agent mid-response. Send a follow-up before the current response finishes. This requires bidirectional communication that is not coupled to the original request.
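To make these three capabilities concrete, here is a minimal in-memory model of a durable session. It is only a sketch under stated assumptions: InMemorySession, publish and attach are hypothetical names rather than the AI Transport API, and a real session would live in shared infrastructure instead of a single process.

```typescript
// Hypothetical in-memory model of a durable session; not a real SDK API.
interface SessionEvent {
  seq: number;                        // position in the session log
  from: string;                       // participant that published the event
  kind: "token" | "cancel" | "steer";
  payload: string;
}

type Listener = (event: SessionEvent) => void;

class InMemorySession {
  private log: SessionEvent[] = [];
  private listeners: Listener[] = [];

  // Any participant can publish; the event is logged before it is delivered.
  publish(from: string, kind: SessionEvent["kind"], payload: string = ""): SessionEvent {
    const event: SessionEvent = { seq: this.log.length + 1, from, kind, payload };
    this.log.push(event);
    this.listeners.slice().forEach((listener) => listener(event));
    return event;
  }

  // Replay everything after `afterSeq`, then keep delivering live events.
  // Returns a function that detaches the listener (for example, on disconnect).
  attach(afterSeq: number, listener: Listener): () => void {
    this.log.filter((event) => event.seq > afterSeq).forEach(listener);
    this.listeners.push(listener);
    return () => {
      this.listeners = this.listeners.filter((l) => l !== listener);
    };
  }
}

const session = new InMemorySession();

// Resilient delivery: the laptop drops, the agent keeps publishing,
// and the laptop resumes from the last position it saw.
const laptopSeen: string[] = [];
let laptopLastSeq = 0;
const onLaptopEvent: Listener = (event) => {
  laptopLastSeq = event.seq;
  if (event.kind === "token") laptopSeen.push(event.payload);
};
const detachLaptop = session.attach(0, onLaptopEvent);
session.publish("agent", "token", "Hel");
detachLaptop();                              // connection drops
session.publish("agent", "token", "lo");     // agent is unaffected
session.attach(laptopLastSeq, onLaptopEvent); // resume without duplicates

// Continuity across surfaces: a phone attaches later and hydrates from the log.
const phoneSeen: string[] = [];
session.attach(0, (event) => {
  if (event.kind === "token") phoneSeen.push(event.payload);
});

// Live control: the phone cancels through the session, not through a connection.
const agentSignals: string[] = [];
session.attach(laptopLastSeq, (event) => {
  if (event.kind !== "token") agentSignals.push(event.kind);
});
session.publish("phone", "cancel");
```

Because cancellation is an ordinary event in the log, closing a connection no longer has to mean anything beyond "this client went away".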
This changes what's possible:
| Feature | Direct HTTP | Durable session |
|---|---|---|
| Resume after disconnect | Build from scratch: buffer, sequence and resume handler. | Automatic. Client reconnects and picks up where it left off. |
| Multi-device sync | Not possible without custom infrastructure. | Any device subscribes to the same session. |
| Cancel mid-stream | Close the connection (loses resume). | Publish a cancel signal. Stream and session survive. |
| Steer or interrupt | Requires a separate back channel. | Signal the agent through the session. |
| Multi-agent visibility | Route all updates through orchestrator. | Each agent publishes directly to the session. |
How AI Transport implements this
Ably AI Transport implements durable sessions on top of Ably channels. Ably channels provide the properties that a durable session requires:
- Any client or agent connects to the session by specifying a channel name.
- Messages on the channel outlive any single connection, device, or agent.
- Events are received by subscribers in the order that they were published, even if there are disconnections.
- A client that drops its connection automatically reconnects and picks up where it left off.
- Any participant can publish to the channel. Cancel, steer, interrupt can all happen through the same session.
- Multiple participants subscribe to the same channel, and every participant sees every event.
The core idea is that no participant is special. A client that drops and reconnects, a serverless agent that spins up for one turn and terminates, a second client joining from another device, an orchestrator agent delegating to subagents: all interact with the same session on equal terms. The session persists independently of any participant's connection lifecycle.
The AI Transport SDK provides the abstractions that make this model practical:
- A codec layer that bridges domain-specific message models (Vercel AI SDK's UIMessage, or any other) and Ably's native message primitives, including support for streamed token-by-token delivery.
- A session layer that materialises conversation state from the channel (or from an external store) into a branching conversation tree with views for pagination and branch navigation.
- A transport layer that handles communication mechanics: publishing messages, routing streams, managing turn lifecycle, and delivering cancel signals.
- React hooks for building UIs with streaming, pagination, and branch navigation.
- Adapters that drop into various frameworks; for example, AI Transport can be used with Vercel AI SDK's useChat with one line of code.