Vercel AI SDK in production: when DefaultChatTransport needs a session layer

You've built an AI chat app on the Vercel AI SDK. It works in development. The model responds, the stream comes through, and the UI updates cleanly. Then you ship to production, and the transport layer starts showing its edges.

Most of these failures are quiet: things that work in demos and break in ways that are hard to pin down until you know where to look. They share a common cause: DefaultChatTransport is built for HTTP, and HTTP has structural properties that some production requirements exceed. This piece explains what those limits are, which ones matter for your application, and what replacing the transport actually involves.

Key takeaways

DefaultChatTransport uses HTTP POST and Server-Sent Events (SSE). These protocols are one-way and point-to-point. That's correct behavior for a stateless serverless platform, not a bug in the SDK.
stop() fires the abort signal client-side and returns immediately. GitHub issue #9707 (filed October 2025, still open) confirms the server cannot distinguish an intentional stop from a dropped connection, so it may keep generating, and billing, until completion.
The official Vercel AI SDK stream resumption pattern requires Redis, the resumable-stream package, two custom API endpoints, and a dedicated stop handler. In a resumable stream setup, stop() is treated as a disconnect, not a cancel.
The ChatTransport interface is pluggable by design, and that holds even now that Vercel Functions can accept native WebSocket connections (public beta). A socket is not a session: native WebSockets are pinned to one instance and limited to at most 30 minutes, with no fan-out, presence, or history built in.
Replacing DefaultChatTransport with a WebSocket-based transport layer creates a durable session between your agent and client, without changing your agent, tool calls, or UI rendering.

How DefaultChatTransport works, and the conditions it was built for

When you call useChat() without a transport option, or pass a default config, DefaultChatTransport is what runs. It sends outgoing messages via HTTP POST, and receives responses as an SSE stream.

For a single user, on a stable connection, sending a message and waiting for the response, this is the right choice. A stateless serverless function receives the request, calls the model, and streams the response back. HTTP is the right tool for that, and DefaultChatTransport uses it correctly.

You might have read that Vercel Functions can now accept WebSocket connections directly (public beta, requires Fluid compute). A WebSocket connection on Vercel can now run as a single function invocation, but it's pinned to one instance for its lifetime and limited to at most 30 minutes. None of that removes the need for a session layer.

The reason is that a socket isn't a session. That gap is architectural, not configurable: a raw socket doesn't include fan-out, presence, or delivery guarantees, on Vercel or anywhere else.

That's also why Vercel made the ChatTransport interface pluggable in AI SDK 5. DefaultChatTransport is not broken: it's correct for the conditions it was built for. But Vercel designed the interface precisely so teams can swap in a transport that isn't bound by those conditions.

It's not just DefaultChatTransport that has this constraint. Even DirectChatTransport, the other built-in option, explicitly documents that it "does not support reconnection since there is no persistent server-side stream to reconnect to." Reconnection is a transport-layer property. The default implementations don't have it because the platform they're built for doesn't support it.

Four things DefaultChatTransport can't do in production

These are the limits that surface when you move beyond a single-user chatbot: a customer support agent that hands off between devices, a chat interface where a human and an AI both participate, or any application where the connection dropping mid-generation has a visible cost to the user.

Each follows from the same root: HTTP/SSE is built for one connection, one client, one response. When production asks for more, that constraint becomes visible.

Cancellation is ambiguous, and you may be paying for it. When a user clicks stop, stop() closes the HTTP connection client-side, and returns immediately, without waiting for the server to acknowledge or terminate the generation. The server receives a connection close event. It has no way to distinguish that from a tab close, a network drop, or a mobile device going to sleep. So it keeps generating.

GitHub issue #9707 (filed October 2025, still open) documents this directly: createUIMessageStream does not detect the abort signal server-side, making it "impossible to stop ongoing AI generation and leading to unnecessary costs and poor UX." GitHub issue #10844 adds that Vercel's own supportsCancellation: true config flag behaves unreliably in production deployments. The cost is real: orphaned generations run to completion, and there's no reliable mechanism to stop them without a custom server-side endpoint.
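To make the ambiguity concrete, here is a minimal sketch in TypeScript. It is a toy model, not SDK code: `ClientAction`, `ServerEvent`, `observeOverHttp`, `observeOverSession`, and `shouldAbortGeneration` are hypothetical names invented for illustration. It shows that over a one-way HTTP stream every client-side cause collapses into the same server-side event, while a transport with an explicit message path can carry intent.

```typescript
// Toy model (all names hypothetical): what a server can observe when a client goes away.
type ClientAction = "stop-clicked" | "tab-closed" | "network-dropped" | "device-slept";

type ServerEvent =
  | { kind: "connection-closed" }        // carries no reason
  | { kind: "cancel"; turnId: string };  // explicit, typed intent

// Over a one-way HTTP/SSE response, every cause looks identical to the server.
function observeOverHttp(_action: ClientAction): ServerEvent {
  return { kind: "connection-closed" };
}

// With a bidirectional session, stop can travel as its own message.
function observeOverSession(action: ClientAction, turnId: string): ServerEvent {
  return action === "stop-clicked"
    ? { kind: "cancel", turnId }
    : { kind: "connection-closed" };
}

// In this sketch the server aborts the model call only when intent is unambiguous.
function shouldAbortGeneration(event: ServerEvent): boolean {
  return event.kind === "cancel";
}

const actions: ClientAction[] = ["stop-clicked", "tab-closed", "network-dropped", "device-slept"];

// Every entry is false: the server never learns that the user clicked stop.
const httpDecisions = actions.map((a) => shouldAbortGeneration(observeOverHttp(a)));

// Only the first entry is true: stop is the one action that carries intent.
const sessionDecisions = actions.map((a) => shouldAbortGeneration(observeOverSession(a, "turn-1")));
```

A server could instead abort on every connection close, but that would also end generations the user expects to pick up again after a dropped connection, which is the trade-off the resumable-stream pattern runs into.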

Multi-device delivery silently fails. SSE is one-to-one: one HTTP connection, one client, one stream. A user with the same session open on their laptop and phone receives the response only on the device that sent the request. The second device gets nothing: no error, no partial content, no indication that anything is in flight. This isn't a useChat configuration gap. It's a structural property of HTTP, and it's why multi-device fan-out is absent from most SSE-based AI transport implementations. DefaultChatTransport is no exception.
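The difference between a one-to-one stream and fan-out can be sketched in a few lines. This is a toy in-memory model under stated assumptions, not how any real transport is implemented: `streamToRequester` and `FanOutSession` are hypothetical names, and real delivery would cross a network rather than call listeners directly.

```typescript
// Toy model (all names hypothetical): one-to-one response vs. session fan-out.
type Chunk = string;
type Listener = (chunk: Chunk) => void;

// HTTP/SSE: the response stream belongs to the connection that made the request.
function streamToRequester(chunks: Chunk[], requester: Listener): void {
  for (const chunk of chunks) requester(chunk);
}

// Session: clients subscribe to the session, and every subscriber gets every chunk.
class FanOutSession {
  private listeners: Listener[] = [];

  subscribe(listener: Listener): void {
    this.listeners.push(listener);
  }

  publish(chunk: Chunk): void {
    for (const listener of this.listeners) listener(chunk);
  }
}

const response: Chunk[] = ["Hel", "lo"];

// One-to-one: only the laptop, which sent the request, sees the tokens.
const laptopHttp: Chunk[] = [];
const phoneHttp: Chunk[] = [];
streamToRequester(response, (c) => laptopHttp.push(c));

// Fan-out: both devices are subscribed to the same session.
const laptopSession: Chunk[] = [];
const phoneSession: Chunk[] = [];
const shared = new FanOutSession();
shared.subscribe((c) => laptopSession.push(c));
shared.subscribe((c) => phoneSession.push(c));
for (const chunk of response) shared.publish(chunk);
```

In the first half the phone's buffer stays empty without any error being raised, which is the silent failure described above.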

The same architectural root connects the next limit. Where multi-device delivery requires fan-out that HTTP cannot provide, stream resumption requires session persistence that HTTP cannot maintain.

Stream resumption requires infrastructure that you build and own. The Vercel AI SDK stream resumption documentation lists the prerequisites directly: a Redis instance, the resumable-stream package, a POST handler that creates resumable streams using consumeSseStream, a GET handler at /api/chat/[id]/stream that resumes them with resumeExistingStream, and a dedicated stop endpoint.

stop() and resumable streams are also architecturally incompatible. The docs state it directly: "In a resumable stream setup, client-side aborts are treated as disconnects. Closing a tab, refreshing the page, or calling stop() only closes the current HTTP connection and should not cancel the underlying generation." Adding a working stop button requires a separate server-side endpoint to cancel the underlying work and clear the active stream record.
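The two jobs of that stop endpoint, cancelling the work and clearing the active stream record, can be sketched as follows. This is an in-memory illustration only: `ActiveGeneration`, `startGeneration`, `handleDisconnect`, and `handleStop` are hypothetical names, the record would normally live in your own database or Redis rather than in process memory, and the stream ID check is just one possible guard against stopping the wrong generation.

```typescript
// In-memory sketch (all names hypothetical) of the state a stop endpoint manages.
interface ActiveGeneration {
  streamId: string;
  cancelled: boolean;
}

// chatId -> generation currently in flight. Assumption: a real deployment keeps
// this in a shared store so that any server instance can read and clear it.
const active: Record<string, ActiveGeneration | undefined> = {};

function startGeneration(chatId: string, streamId: string): ActiveGeneration {
  const generation: ActiveGeneration = { streamId, cancelled: false };
  active[chatId] = generation;
  return generation;
}

// A closed connection (tab close, refresh, stop()) leaves the work running
// so that a later request can resume the stream.
function handleDisconnect(_chatId: string): void {
  // Intentionally empty: a disconnect is not a cancel.
}

// The dedicated stop endpoint cancels the work and clears the record.
// Comparing stream IDs avoids stopping a newer generation by mistake.
function handleStop(chatId: string, streamId: string): boolean {
  const generation = active[chatId];
  if (!generation || generation.streamId !== streamId) return false;
  generation.cancelled = true; // the generation loop would check this flag
  delete active[chatId];
  return true;
}

const generation = startGeneration("chat-1", "stream-a");
handleDisconnect("chat-1");
const stillRunning = !generation.cancelled;                  // disconnect did not cancel
const staleStop = handleStop("chat-1", "stream-old");        // rejected: wrong stream
const stopped = handleStop("chat-1", "stream-a");            // accepted
```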

Tab switches and mobile backgrounding are a further gap the resumable-stream pattern doesn't cover in the same way as a page reload. The Ably guide on Vercel AI SDK resumable streams covers the distinction.

The single-response assumption breaks multi-user sessions. Vercel designed useChat around one user sending one message and receiving one response. It tracks one activeResponse at a time. If a second user joins, or an observer device needs the same response lifecycle, the only available mechanism is setMessages. This bypasses lifecycle hooks, tool-call notifications, and onFinish callbacks entirely. It works, but it's a workaround. Zak Knill's post on building the Ably transport covers the implementation detail.
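A toy model shows why writing state directly is only a workaround. `ToyChat` below is a hypothetical stand-in, not the useChat implementation; it assumes only what is described above, namely that the normal response path fires completion hooks and that setting messages directly does not.

```typescript
// Toy model (all names hypothetical) of a chat client with a completion hook.
class ToyChat {
  messages: string[] = [];

  constructor(private readonly onFinish: (message: string) => void) {}

  // The normal path: the response is tracked and the hook fires on completion.
  receiveResponse(text: string): void {
    this.messages = this.messages.concat(text);
    this.onFinish(text);
  }

  // The workaround path: state is overwritten and no hook is involved.
  setMessages(messages: string[]): void {
    this.messages = messages;
  }
}

const finished: string[] = [];
const toyChat = new ToyChat((m) => finished.push(m));

// The sender's own response goes through the full lifecycle.
toyChat.receiveResponse("answer for the sender");

// A response mirrored in from another participant only changes what is displayed.
toyChat.setMessages(toyChat.messages.concat("answer mirrored from another participant"));
```

Both messages end up on screen, but only the first one ever reached `onFinish`, so anything that depends on that hook (persistence, analytics, follow-up tool handling) never sees the second.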

Each of the four limits above has the same root cause but surfaces differently. The table below maps them to their production cost:

| Limit | What breaks | Production cost | Configurable in DefaultChatTransport? |
| --- | --- | --- | --- |
| Cancellation | Server can't distinguish stop from disconnect | Orphaned generations; ongoing billing | No |
| Multi-device | SSE delivers to one client only | Silent failure on second device | No |
| Stream resumption | Requires Redis, two endpoints, stop handler | Significant custom infrastructure | No |
| Single-response assumption | setMessages bypasses lifecycle hooks | Broken tool calls, missing onFinish | No |

How a WebSocket-based transport layer creates a durable session between agent and client

Replacing DefaultChatTransport with a WebSocket-based transport layer swaps a stateless HTTP connection for a durable session between your agent and your users: one that persists beyond any single connection and addresses all four limits directly. It also removes the custom infrastructure that those limits force you to build. The Ably topic page on implementing a custom ChatTransport covers the full capability surface. This section covers what disappears from your backlog.

With a WebSocket-based transport layer, you no longer need:

The Redis buffer for resumable streams
The stop endpoint with race condition protection
The fan-out layer for multi-device delivery
The setMessages workaround for multi-user sessions

Figure: how a WebSocket-based transport layer works. The session is decoupled from the connection, with an explicit cancel signal and reconnection from the last position.

The mechanism that makes this possible is straightforward. A session is decoupled from the connection. The session persists independently; a connection is how a client subscribes to it. When a client disconnects and reconnects, it presents its last position to the session and receives only the messages it missed. A cancel signal is sent explicitly on the session: the server reads it as intent, not as a connection close event it has to interpret.
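The mechanism can be sketched as an append-only log with positions. This is a minimal in-memory illustration, not Ably's implementation or API: `DurableSession`, `SessionMessage`, `append`, and `resumeFrom` are hypothetical names, and a real session layer would persist and distribute the log rather than hold it in one process.

```typescript
// Toy model (all names hypothetical): a session that outlives its connections.
interface SessionMessage {
  serial: number;            // position in the session log
  kind: "token" | "cancel";
  data: string;
}

class DurableSession {
  private log: SessionMessage[] = [];

  // Append to the session regardless of who is connected right now.
  append(kind: SessionMessage["kind"], data: string): number {
    const serial = this.log.length + 1;
    this.log.push({ serial, kind, data });
    return serial;
  }

  // A reconnecting client presents its last position and gets only what it missed.
  resumeFrom(lastSerial: number): SessionMessage[] {
    return this.log.filter((m) => m.serial > lastSerial);
  }
}

const durable = new DurableSession();
durable.append("token", "The ");
const lastSeen = durable.append("token", "quick "); // the client disconnects after this
durable.append("token", "brown ");
durable.append("token", "fox");

// On reconnect the client catches up from its last position: "brown fox".
const missed = durable.resumeFrom(lastSeen);
const caughtUp = missed.map((m) => m.data).join("");

// Cancel is a message on the session, not a side effect of a socket closing.
durable.append("cancel", "turn-1");
const cancelSeen = durable.resumeFrom(0).some((m) => m.kind === "cancel");
```

Because the log belongs to the session rather than to a connection, the same `resumeFrom` call serves a reconnecting client and a second device joining late.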

Ably AI Transport is built as the session layer for production AI applications: the infrastructure between your agent and your users that handles the delivery concerns that DefaultChatTransport can't. It plugs into useChat as a ChatTransport implementation via a single configuration change:

// Before: default HTTP transport
const { messages, sendMessage, stop } = useChat()

// After: Ably AI Transport (backed by an Ably session)
const { chatTransport } = useChatTransport(); // from <ChatTransportProvider>
const { messages, sendMessage, stop } = useChat({ transport: chatTransport });

In practice: stop() sends a typed signal the server can act on, instead of a connection close event that it has to guess at. Any device subscribed to the same session receives the stream, so a user switching from laptop to phone doesn't lose the conversation. If the connection drops mid-generation, the client reconnects and catches up from where it left off, because the session persists independently of any single connection.

What stays unchanged: your agent, tool calls, message persistence logic, and UI rendering. The swap is the transport option in useChat. Everything built on top of it carries over.

For the implementation detail on own-turns, observer-turns, and setMessages handling, see Zak Knill's post. For how transport options compare more broadly, see the durable sessions guide for Vercel AI SDK applications. The four questions in the next section will help you work out whether you're at that decision point yet.

When DefaultChatTransport is still the right choice

The four limits above are real, but they only become blockers if you need cancellation that reaches the server, multi-device delivery, stream resumption beyond page reloads, or more than one user in the same conversation. For many applications, DefaultChatTransport remains the right starting point.

A practical way to assess your own situation is to work through four questions:

Do you need stop() to reliably cancel server-side generation, not just the UI update, but the actual model call?
Do users access the same session from more than one device or tab?
Do you need stream resumption across tab switches or mobile backgrounding, not just full page reloads?
Does more than one user participate in the same conversation?

If the answer to all four is no, DefaultChatTransport is a defensible choice. If any answer is yes, the relevant section above describes the specific limit you'll encounter. The right time to replace the transport is when those limits start costing you.
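If it helps to keep the audit next to your requirements, the four questions reduce to a small checklist. This is an illustrative sketch only; `TransportRequirements` and `limitsHit` are hypothetical names, and the returned labels are the limit names used in the table earlier in this piece.

```typescript
// The four-question audit as a checklist (all names hypothetical).
interface TransportRequirements {
  serverSideCancel: boolean;   // stop() must end the model call itself
  multiDevice: boolean;        // same session on several devices or tabs
  resumeBeyondReload: boolean; // survive tab switches and mobile backgrounding
  multiUser: boolean;          // more than one participant per conversation
}

// Returns the limits you would hit; an empty list means the default transport fits.
function limitsHit(req: TransportRequirements): string[] {
  const hits: string[] = [];
  if (req.serverSideCancel) hits.push("Cancellation");
  if (req.multiDevice) hits.push("Multi-device");
  if (req.resumeBeyondReload) hits.push("Stream resumption");
  if (req.multiUser) hits.push("Single-response assumption");
  return hits;
}

// A single-user chatbot with no special requirements.
const simpleBot = limitsHit({
  serverSideCancel: false,
  multiDevice: false,
  resumeBeyondReload: false,
  multiUser: false,
});

// A support tool where agents join a conversation and stop must reach the server.
const supportTool = limitsHit({
  serverSideCancel: true,
  multiDevice: false,
  resumeBeyondReload: false,
  multiUser: true,
});
```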

If the self-audit lands on yes for any of the four questions, the transport layer is the right place to fix it, and replacing it changes nothing else in your application.

The next step is understanding the ChatTransport interface: what sendMessages and reconnectToStream require, and what to look for in an implementation. The Ably ChatTransport topic page covers that in full. To get started with Ably AI Transport directly, the Vercel AI SDK integration guide is the right starting point.

Frequently asked questions

Does the Vercel AI SDK support multi-device AI chat out of the box?

Not with DefaultChatTransport. SSE is scoped to a single HTTP connection, so a second device has no way to join a stream already in progress. Multi-device delivery requires a transport where the session exists independently of the connection, so any subscribed client receives it. The Ably guide on why Vercel AI SDK can't stream to multiple devices provides the full picture.

Why doesn't `stop()` cancel server-side generation in Vercel AI SDK?

Because DefaultChatTransport has no signal path back to the server. When stop() closes the HTTP connection, the server receives a TCP close it can't distinguish from a network drop, so generation continues and billing runs to completion. With a WebSocket-based transport layer, stop() sends a typed cancel message on the session; the server reads it as intent, not inference. The Ably guide on why stop() doesn't cancel the stream covers the full mechanism.

How much infrastructure does Vercel AI SDK stream resumption require?

The official pattern requires a Redis instance, the resumable-stream package, a POST handler with consumeSseStream, a GET handler at /api/chat/[id]/stream, and a dedicated stop endpoint with race condition handling. stop() and resumable streams are also architecturally incompatible. In a resumable stream setup, a client abort is treated as a disconnect, not a cancel. See the Ably guide to Vercel AI SDK resumable streams for the full breakdown.

When should I replace `DefaultChatTransport`?

When the limits start affecting your production application. The four-question self-audit in the "When DefaultChatTransport is still the right choice" section gives a practical framework. In short: if you need stop() to reliably cancel server-side generation, multi-device delivery, stream resumption beyond page reloads, or multi-user sessions, the default transport can't provide those. The Ably durable sessions guide for Vercel AI SDK covers the transport options available once you've decided to move on.

Why replace `DefaultChatTransport` with a WebSocket-based transport layer?

Because the limits you're hitting are properties of HTTP/SSE, not of your application code. Unconfirmed cancellations, single-device delivery, Redis-dependent stream resumption, and the setMessages workaround for multi-user sessions are all resolved at the transport level by a WebSocket-based transport layer. Your agent, tool calls, and UI code don't change.

Vercel AI SDK custom transport vs default transport: what actually changes?

The delivery mechanism only. Your agent, tool calls, message persistence, and UI rendering stay the same. The swap is the transport option in useChat, one configuration change. For a full before/after and getting started guide, see the Ably AI Transport Vercel integration guide.
