Vercel AI SDK in production: when DefaultChatTransport needs a session layer

A self-audit for Vercel AI SDK developers: four production requirements DefaultChatTransport wasn't designed for, when they become blockers, and what a session layer adds.

You've built an AI chat app on the Vercel AI SDK. It works in development. The model responds, the stream comes through, and the UI updates cleanly. Then you ship to production, and the transport layer starts showing its edges.

Most of these failures are quiet: things work in demos, then break in ways that are hard to pin down until you know where to look. They share a common cause: DefaultChatTransport is built on HTTP, and some production requirements go beyond what a stateless HTTP request and response can provide. This piece explains what those limits are, which ones matter for your application, and what replacing the transport actually involves.

Key takeaways

  • DefaultChatTransport uses HTTP POST and Server-Sent Events (SSE). These protocols are one-way and point-to-point. That's correct behavior for a stateless serverless platform, not a bug in the SDK.
  • stop() fires the abort signal client-side and returns immediately. GitHub issue #9707 (open, October 2025) confirms the server cannot distinguish an intentional stop from a dropped connection, and may continue generating and billing until completion.
  • The official Vercel AI SDK stream resumption pattern requires Redis, the resumable-stream package, two custom API endpoints, and a dedicated stop handler. In a resumable stream setup, stop() is treated as a disconnect, not a cancel.
  • The ChatTransport interface is pluggable by design. Vercel's serverless platform cannot host persistent WebSocket connections, so they made the transport layer swappable. Replacing DefaultChatTransport with a WebSocket-based transport layer creates a durable session between your agent and client, without changing your agent, tool calls, or UI rendering.

How DefaultChatTransport works, and the conditions it was built for

When you call useChat() without a transport option, or pass a default config, DefaultChatTransport is what runs. It sends outgoing messages via HTTP POST, and receives responses as an SSE stream.
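As a rough illustration of the receiving half, the sketch below parses a raw SSE body into its data payloads. The parseSseData helper and the sample chunk shapes (the type and delta fields, and the [DONE] terminator) are assumptions made for illustration, not the SDK's own parser.

```typescript
// Minimal SSE frame parser: events are separated by a blank line,
// and each "data:" line contributes one line of payload.
function parseSseData(raw: string): string[] {
  return raw
    .split(/\r?\n\r?\n/)
    .map((frame) =>
      frame
        .split(/\r?\n/)
        .filter((line) => line.startsWith("data:"))
        .map((line) => line.slice(5).replace(/^ /, "")) // drop "data:" and one optional space
        .join("\n"),
    )
    .filter((data) => data.length > 0);
}

// Hypothetical response body; the exact chunk shapes are an assumption.
const sseBody =
  'data: {"type":"text-delta","delta":"Hel"}\n\n' +
  'data: {"type":"text-delta","delta":"lo"}\n\n' +
  "data: [DONE]\n\n";

// Reassemble the streamed text from the delta chunks.
const streamedText = parseSseData(sseBody)
  .filter((data) => data !== "[DONE]")
  .map((data) => JSON.parse(data) as { type: string; delta?: string })
  .map((chunk) => chunk.delta ?? "")
  .join("");
```

The point to notice is the direction of travel: the client can read frames as they arrive, but it has nothing to write back on the same response.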

For a single user, on a stable connection, sending a message and waiting for the response, this is the right choice. A stateless serverless function receives the request, calls the model, and streams the response back. HTTP is the right tool for that, and DefaultChatTransport uses it correctly.

This design follows from a platform constraint: Vercel's serverless functions terminate after responding, so there is no persistent process to hold a socket open. That constraint is the root of all four limits described below. They're architectural, not configurable, because HTTP on a stateless platform simply can't do what they require. The Ably guide to WebSockets on Vercel covers this constraint in depth if you want the full picture.

That's also why Vercel made ChatTransport pluggable in AI SDK 5. DefaultChatTransport is not broken: it's correct for the conditions it was built for. But Vercel designed the interface precisely so teams can swap in a transport that isn't bound by those conditions.

It's not just DefaultChatTransport that has this constraint. Even DirectChatTransport, the other built-in option, explicitly documents that it "does not support reconnection since there is no persistent server-side stream to reconnect to." Reconnection is a transport-layer property. The default implementations don't have it because the platform they're built for doesn't support it.

Four things DefaultChatTransport can't do in production

These are the limits that surface when you move beyond a single-user chatbot: a customer support agent that hands off between devices, a chat interface where a human and an AI both participate, or any application where the connection dropping mid-generation has a visible cost to the user.

Each follows from the same root: HTTP/SSE is built for one connection, one client, one response. When production asks for more, that constraint becomes visible.

Cancellation is ambiguous, and you may be paying for it. When a user clicks stop, stop() closes the HTTP connection client-side and returns immediately, without waiting for the server to acknowledge or terminate the generation. The server receives a connection close event, which it has no way to distinguish from a tab close, a network drop, or a mobile device going to sleep. So generation may simply continue.

GitHub issue #9707 (filed October 2025, still open) documents this directly: createUIMessageStream does not detect the abort signal server-side, making it "impossible to stop ongoing AI generation and leading to unnecessary costs and poor UX." GitHub issue #10844 adds that Vercel's own supportsCancellation: true config flag behaves unreliably in production deployments. The cost is real: orphaned generations run to completion, and there's no reliable mechanism to stop them without a custom server-side endpoint.
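The ambiguity can be sketched in a few lines. Every type and function name below is hypothetical and exists only for illustration; none of them are AI SDK APIs. The sketch contrasts what the server observes over plain HTTP with what it observes when stop travels as an explicit message.

```typescript
// Hypothetical event shapes for illustration; none of these are AI SDK types.
type ClientCause = "stop-clicked" | "tab-closed" | "network-drop" | "device-sleep";

// Over HTTP/SSE, every client-side cause reaches the server as the same event.
type HttpServerEvent = { kind: "connection-closed" };

function asSeenOverHttp(_cause: ClientCause): HttpServerEvent {
  return { kind: "connection-closed" }; // the cause is lost in transit
}

// With an explicit signal path, stop is a message rather than an inference.
type SessionServerEvent =
  | { kind: "cancel"; turnId: string }
  | { kind: "connection-closed" };

function asSeenOnSession(cause: ClientCause, turnId: string): SessionServerEvent {
  return cause === "stop-clicked"
    ? { kind: "cancel", turnId }
    : { kind: "connection-closed" };
}

// Abort only on explicit intent. A server could instead abort on every close,
// but that would also kill generations a reconnecting client still wants.
function shouldAbortGeneration(event: HttpServerEvent | SessionServerEvent): boolean {
  return event.kind === "cancel";
}
```

Under these assumptions, a stop click and a network drop are indistinguishable over HTTP, so the server has to pick one policy for both.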

Multi-device delivery silently fails. SSE is one-to-one: one HTTP connection, one client, one stream. A user with the same session open on their laptop and phone receives the response only on the device that sent the request. The second device gets nothing: no error, no partial content, no indication that anything is in flight. This isn't a useChat configuration gap. It's a structural property of delivering a response over a single HTTP connection, and DefaultChatTransport is no exception.

Stream resumption shares that root. Where multi-device delivery needs fan-out that a single HTTP response cannot provide, resumption needs session persistence that a stateless request cannot maintain.

Stream resumption requires infrastructure that you build and own. The Vercel AI SDK stream resumption documentation lists the prerequisites directly: a Redis instance, the resumable-stream package, a POST handler that creates resumable streams using consumeSseStream, a GET handler at /api/chat/[id]/stream that resumes them with resumeExistingStream, and a dedicated stop endpoint.

stop() and resumable streams are also architecturally incompatible. The docs state it directly: "In a resumable stream setup, client-side aborts are treated as disconnects. Closing a tab, refreshing the page, or calling stop() only closes the current HTTP connection and should not cancel the underlying generation." Adding a working stop button requires a separate server-side endpoint to cancel the underlying work and clear the active stream record.
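A minimal sketch of what such a stop endpoint has to track follows, assuming an in-memory Map as a stand-in for the Redis-backed active stream record. The registerStream and stopStream helpers are hypothetical names, not part of the AI SDK or the resumable-stream package.

```typescript
// In-memory stand-in for the per-chat "active stream" record.
// All names here are hypothetical.
type ActiveStream = { streamId: string; controller: AbortController };

const activeStreams = new Map<string, ActiveStream>();

// Called by the POST handler when a generation starts.
// The returned signal would be passed to the model call.
function registerStream(chatId: string, streamId: string): AbortSignal {
  const controller = new AbortController();
  activeStreams.set(chatId, { streamId, controller });
  return controller.signal;
}

// Called by the dedicated stop endpoint.
// The streamId check guards against a race: a stop request for an old
// generation arriving after a newer one has already started.
function stopStream(chatId: string, streamId: string): "stopped" | "stale" | "not-found" {
  const active = activeStreams.get(chatId);
  if (!active) return "not-found";
  if (active.streamId !== streamId) return "stale";
  active.controller.abort(); // cancel the underlying work
  activeStreams.delete(chatId); // clear the active stream record
  return "stopped";
}
```

In a multi-instance deployment the record and the abort would both need to cross process boundaries, which is where the shared store comes in.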

Tab switches and mobile backgrounding are a further gap: the resumable-stream pattern does not handle them the same way it handles a page reload. The Ably guide on Vercel AI SDK resumable streams covers the distinction.

The single-response assumption breaks multi-user sessions. Vercel designed useChat around one user sending one message and receiving one response. It tracks one activeResponse at a time. If a second user joins, or an observer device needs the same response lifecycle, the only available mechanism is setMessages. This bypasses lifecycle hooks, tool-call notifications, and onFinish callbacks entirely. It works, but it's a workaround. Zak Knill's post on building the Ably transport covers the implementation detail.

Each of the four limits above has the same root cause but surfaces differently. The table below maps them to their production cost:

| Limit | What breaks | Production cost | Configurable in DefaultChatTransport? |
| --- | --- | --- | --- |
| Cancellation | Server can't distinguish stop from disconnect | Orphaned generations; ongoing billing | No |
| Multi-device | SSE delivers to one client only | Silent failure on second device | No |
| Stream resumption | Requires Redis, two endpoints, stop handler | Significant custom infrastructure | No |
| Single-response assumption | setMessages bypasses lifecycle hooks | Broken tool calls, missing onFinish | No |

How a WebSocket-based transport layer creates a durable session between agent and client

Replacing DefaultChatTransport with a WebSocket-based transport layer swaps a stateless HTTP connection for a durable session between your agent and your users, one that persists beyond any single connection and addresses all four limits directly. It also removes the custom infrastructure that those limits force you to build. The Ably topic page on implementing a custom ChatTransport covers the full capability surface. This section covers what disappears from your backlog.

With a WebSocket-based transport layer, you no longer need:

  • The Redis buffer for resumable streams
  • The stop endpoint with race condition protection
  • The fan-out layer for multi-device delivery
  • The setMessages workaround for multi-user sessions

Figure: How a durable session works: session decoupled from connection, showing cancel signal and reconnect from position.

The mechanism that makes this possible is straightforward. A session is decoupled from the connection. The session persists independently; a connection is how a client subscribes to it. When a client disconnects and reconnects, it presents its last position to the session and receives only the messages it missed. A cancel signal is sent explicitly on the session: the server reads it as intent, not as a connection close event it has to interpret.
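The mechanism can be modelled in memory to make it concrete. The Session class below is a hypothetical sketch of the idea, not the Ably API: the session owns an ordered log, connections attach with the last position they saw, and cancel is an explicit call rather than a dropped connection.

```typescript
// A minimal in-memory model of a session decoupled from its connections.
// Hypothetical names throughout; this is not the Ably API.
type Chunk = { position: number; text: string };
type Listener = (chunk: Chunk) => void;

class Session {
  private log: Chunk[] = [];
  private listeners = new Set<Listener>();
  cancelled = false;

  // The agent appends output; every attached connection receives it.
  append(text: string): void {
    if (this.cancelled) return;
    const chunk: Chunk = { position: this.log.length, text };
    this.log.push(chunk);
    this.listeners.forEach((listener) => listener(chunk));
  }

  // A connection attaches with the last position it saw (-1 for none),
  // is replayed only what it missed, then receives live chunks.
  attach(lastSeen: number, listener: Listener): () => void {
    this.log.slice(lastSeen + 1).forEach((chunk) => listener(chunk));
    this.listeners.add(listener);
    return () => {
      this.listeners.delete(listener);
    };
  }

  // Cancel is an explicit message on the session, not a connection close.
  cancel(): void {
    this.cancelled = true;
  }
}

const session = new Session();
const laptop: string[] = [];
const phone: string[] = [];

const detachLaptop = session.attach(-1, (c) => laptop.push(c.text));
session.append("Hel");
detachLaptop(); // the connection drops; the session carries on
session.append("lo");
session.attach(0, (c) => laptop.push(c.text)); // reconnect after position 0
session.attach(-1, (c) => phone.push(c.text)); // a second device joins late
session.append("!");
session.cancel(); // explicit intent
session.append("ignored"); // nothing is delivered after cancel
// Both laptop.join("") and phone.join("") are now "Hello!".
```

A production session layer also has to persist the log and order messages across servers; the sketch only shows why decoupling makes reconnect, fan-out, and cancel fall out of the same structure.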

Ably AI Transport is built as the session layer for production AI applications: the infrastructure between your agent and your users that handles the delivery concerns that DefaultChatTransport can't. It plugs into useChat as a ChatTransport implementation via a single configuration change:

import { useChat } from '@ai-sdk/react';
import { DefaultChatTransport } from 'ai';

// Before: default HTTP transport
const { messages, sendMessage, stop } = useChat({
  transport: new DefaultChatTransport({ api: '/api/chat' }),
});

// After: Ably AI Transport (backed by an Ably session)
const { chatTransport } = useChatTransport(); // from <ChatTransportProvider>
const { messages, sendMessage, stop } = useChat({ transport: chatTransport });

In practice: stop() sends a typed signal the server can act on, instead of a connection close event that it has to guess at. Any device subscribed to the same session receives the stream, so a user switching from laptop to phone doesn't lose the conversation. If the connection drops mid-generation, the client reconnects and catches up from where it left off, because the session persists independently of any single connection.

What stays unchanged: your agent, tool calls, message persistence logic, and UI rendering. The swap is the transport option in useChat. Everything built on top of it carries over.

For the implementation detail on own-turns, observer-turns, and setMessages handling, see Zak Knill's post. For how transport options compare more broadly, see the durable sessions guide for Vercel AI SDK applications. The four questions in the next section will help you work out whether you're at that decision point yet.

When DefaultChatTransport is still the right choice

The four limits above are real, but they only become blockers if you need cancellation that reaches the server, multi-device delivery, stream resumption beyond page reloads, or more than one user in the same conversation. For many applications, DefaultChatTransport remains the right starting point.

A practical way to assess your own situation is to work through four questions:

  1. Do you need stop() to reliably cancel server-side generation, not just the UI update, but the actual model call?
  2. Do users access the same session from more than one device or tab?
  3. Do you need stream resumption across tab switches or mobile backgrounding, not just full page reloads?
  4. Does more than one user participate in the same conversation?

If the answer to all four is no, DefaultChatTransport is a defensible choice. If any answer is yes, the relevant section above describes the specific limit you'll encounter. The right time to replace the transport is when those limits start costing you.
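If it helps to make the audit explicit, it can be written down as a small lookup. The field names below are hypothetical labels for the four questions, and the returned strings match the limit names used in the table above.

```typescript
// Hypothetical field names, one per audit question.
type Audit = {
  needsServerSideCancel: boolean; // question 1
  multiDeviceOrTab: boolean; // question 2
  resumeBeyondReload: boolean; // question 3
  multiUser: boolean; // question 4
};

// Maps each question to the limit it exposes.
const limitByQuestion: Record<keyof Audit, string> = {
  needsServerSideCancel: "Cancellation",
  multiDeviceOrTab: "Multi-device",
  resumeBeyondReload: "Stream resumption",
  multiUser: "Single-response assumption",
};

// Returns the limits an application will run into, in question order.
function limitsHit(audit: Audit): string[] {
  return (Object.keys(limitByQuestion) as (keyof Audit)[])
    .filter((question) => audit[question])
    .map((question) => limitByQuestion[question]);
}
```

An empty result corresponds to the case where DefaultChatTransport remains a defensible choice.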

Once any of those limits is costing you, the transport layer is the right place to fix it, and replacing it leaves the rest of your application unchanged.

The next step is understanding the ChatTransport interface: what sendMessages and reconnectToStream require, and what to look for in an implementation. The Ably ChatTransport topic page covers that in full. To get started with Ably AI Transport directly, the Vercel AI SDK integration guide is the right starting point.
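As a rough orientation, the sketch below shows the general shape of those two methods using simplified local types so that it stands alone. The real interface in the ai package is generic over the message type and carries more options; TransportSketch, EchoTransport, and toStream are hypothetical names, and the exact field set should be treated as an assumption.

```typescript
// Simplified local stand-ins for the SDK's message and chunk types.
type Message = { id: string; role: "user" | "assistant"; text: string };
type Chunk = { type: "text-delta"; delta: string };

interface TransportSketch {
  // Sends the conversation and resolves to a stream of response chunks.
  sendMessages(options: {
    chatId: string;
    messages: Message[];
    abortSignal?: AbortSignal;
  }): Promise<ReadableStream<Chunk>>;
  // Resolves to the in-flight stream for a chat, or null if there is none.
  reconnectToStream(options: { chatId: string }): Promise<ReadableStream<Chunk> | null>;
}

// Wraps a fixed list of chunks in a stream that emits them and closes.
function toStream(chunks: Chunk[]): ReadableStream<Chunk> {
  return new ReadableStream<Chunk>({
    start(controller) {
      chunks.forEach((chunk) => controller.enqueue(chunk));
      controller.close();
    },
  });
}

// A toy in-memory transport that echoes the last message back.
class EchoTransport implements TransportSketch {
  private active = new Map<string, Chunk[]>();

  async sendMessages(options: { chatId: string; messages: Message[] }) {
    const last = options.messages[options.messages.length - 1];
    const chunks: Chunk[] = [{ type: "text-delta", delta: `echo: ${last ? last.text : ""}` }];
    this.active.set(options.chatId, chunks); // remember it so a reconnect can replay
    return toStream(chunks);
  }

  async reconnectToStream(options: { chatId: string }) {
    const chunks = this.active.get(options.chatId);
    return chunks ? toStream(chunks) : null;
  }
}
```

What to look for in a real implementation is what backs reconnectToStream: whether the in-flight response lives somewhere a new connection can reach it.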

Frequently asked questions

Does the Vercel AI SDK support multi-device AI chat out of the box?

Not with DefaultChatTransport. SSE is scoped to a single HTTP connection, so a second device has no way to join a stream already in progress. Multi-device delivery requires a transport where the session exists independently of the connection, so any subscribed client receives it. The Ably guide on why Vercel AI SDK can't stream to multiple devices provides the full picture.

Why doesn't stop() cancel server-side generation in Vercel AI SDK?

Because DefaultChatTransport has no signal path back to the server. When stop() closes the HTTP connection, the server receives a TCP close it can't distinguish from a network drop, so generation can continue and billing can run to completion. With a WebSocket-based transport layer, stop() sends a typed cancel message on the session; the server reads it as intent, not inference. The Ably guide on why stop() doesn't cancel the stream covers the full mechanism.

How much infrastructure does Vercel AI SDK stream resumption require?

The official pattern requires a Redis instance, the resumable-stream package, a POST handler with consumeSseStream, a GET handler at /api/chat/[id]/stream, and a dedicated stop endpoint with race condition handling. stop() and resumable streams are also architecturally incompatible. In a resumable stream setup, a client abort is treated as a disconnect, not a cancel. See the Ably guide to Vercel AI SDK resumable streams for the full breakdown.

When should I replace DefaultChatTransport?

When the limits start affecting your production application. The four-question self-audit in the "When DefaultChatTransport is still the right choice" section gives a practical framework. In short: if you need stop() to reliably cancel server-side generation, multi-device delivery, stream resumption beyond page reloads, or multi-user sessions, the default transport can't provide those. The Ably durable sessions guide for Vercel AI SDK covers the transport options available once you've decided to move on.

Why replace DefaultChatTransport with a WebSocket-based transport layer?

Because the limits described above are properties of HTTP/SSE, not of your application code. If you're hitting unacknowledged cancellations, single-device delivery, Redis-dependent stream resumption, or the setMessages workaround for multi-user sessions, a WebSocket-based transport layer resolves them at the transport level. Your agent, tool calls, and UI code don't change.

Vercel AI SDK custom transport vs default transport: what actually changes?

The delivery mechanism only. Your agent, tool calls, message persistence, and UI rendering stay the same. The swap is the transport option in useChat, one configuration change. For a full before/after and getting started guide, see the Ably AI Transport Vercel integration guide.