Why does Vercel AI SDK streaming work in development but break in production?

Local development has no intermediaries between the application and the browser. In production, load balancers, reverse proxies, CDN layers, and corporate proxies sit in the request path. Many buffer HTTP responses before forwarding them, converting progressive SSE token delivery into a single delayed response. The application code is the same; the network path is not.
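One common mitigation is to send response headers that ask intermediaries not to buffer or transform the stream. The sketch below is a minimal illustration, assuming a runtime with the standard Fetch API globals (Node 18+ or an edge runtime). The helper names sseHeaders, sseFrame, and sseResponse are hypothetical. The X-Accel-Buffering header is honored by nginx, but other proxies, and corporate proxies in particular, may ignore all of these hints, so this reduces the problem rather than removing it.

```typescript
// Headers commonly used to discourage buffering and transformation of SSE.
function sseHeaders(): Headers {
  return new Headers({
    "Content-Type": "text/event-stream",
    // "no-transform" asks proxies not to modify (for example, compress) the body.
    "Cache-Control": "no-cache, no-transform",
    // Honored by nginx; other intermediaries may ignore it.
    "X-Accel-Buffering": "no",
  });
}

// Format one SSE event frame carrying a JSON-encoded text token.
function sseFrame(token: string): string {
  return `data: ${JSON.stringify(token)}\n\n`;
}

// Hypothetical helper: wrap a list of tokens in a streaming SSE response.
function sseResponse(tokens: string[]): Response {
  const encoder = new TextEncoder();
  const body = new ReadableStream<Uint8Array>({
    start(controller) {
      for (const token of tokens) {
        controller.enqueue(encoder.encode(sseFrame(token)));
      }
      controller.close();
    },
  });
  return new Response(body, { headers: sseHeaders() });
}
```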

What is the ChatTransport interface and why did Vercel introduce it?

ChatTransport is a pluggable interface in the AI SDK that decouples useChat from its transport implementation. Vercel introduced it because their serverless platform cannot host persistent WebSocket connections, making a bundled WebSocket transport impossible. Rather than ship a transport tied to their platform constraints, they made transport pluggable and recommend third-party providers for applications that need WebSocket or durable session behavior.
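To make the idea of a pluggable transport concrete, here is a rough sketch of its shape. It assumes the interface centers on two methods, sendMessages and reconnectToStream, as in AI SDK 5. The Message, Chunk, and TransportSketch types are simplified local stand-ins, not the SDK's real definitions, and EchoTransport is a hypothetical in-memory implementation. Check the SDK's current type definitions before building on this.

```typescript
// Simplified local stand-ins (assumptions): the SDK's real types carry more
// fields and a richer union of chunk types.
type Message = { id: string; role: "user" | "assistant"; text: string };
type Chunk = { type: "text-delta"; delta: string } | { type: "finish" };

interface TransportSketch {
  // Send the conversation and receive a stream of response chunks.
  sendMessages(options: {
    chatId: string;
    messages: Message[];
    abortSignal?: AbortSignal;
  }): Promise<ReadableStream<Chunk>>;
  // Re-attach to a response for this chat, or null if none is known.
  reconnectToStream(options: {
    chatId: string;
  }): Promise<ReadableStream<Chunk> | null>;
}

// Turn a list of chunks into a ReadableStream.
function toStream(chunks: Chunk[]): ReadableStream<Chunk> {
  return new ReadableStream<Chunk>({
    start(controller) {
      for (const chunk of chunks) controller.enqueue(chunk);
      controller.close();
    },
  });
}

// Hypothetical transport that echoes the last message word by word.
// A real one would publish to and subscribe from a WebSocket or pub/sub channel.
class EchoTransport implements TransportSketch {
  private responses = new Map<string, Chunk[]>();

  async sendMessages(options: {
    chatId: string;
    messages: Message[];
    abortSignal?: AbortSignal;
  }): Promise<ReadableStream<Chunk>> {
    const last = options.messages[options.messages.length - 1];
    const words = last ? last.text.split(" ") : [];
    const chunks: Chunk[] = words.map(
      (word): Chunk => ({ type: "text-delta", delta: word }),
    );
    chunks.push({ type: "finish" });
    this.responses.set(options.chatId, chunks);
    return toStream(chunks);
  }

  async reconnectToStream(options: {
    chatId: string;
  }): Promise<ReadableStream<Chunk> | null> {
    const chunks = this.responses.get(options.chatId);
    return chunks ? toStream(chunks) : null;
  }
}
```

The point of the design is that useChat only needs something with this shape; whether the chunks arrive over SSE, a WebSocket, or a third-party channel is the transport's concern.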

Does AI SDK stream resumption fix disconnect recovery for tab switches and device switches?

No. The AI SDK's built-in stream resumption (using the resume option and Redis) covers one case: reconnecting to an active stream after a full page reload. Tab switches, device switches, and mobile backgrounding are explicitly out of scope. The feature also has a documented incompatibility with abort functionality.

Why does stop() in useChat keep generating tokens after the user cancels?

SSE is unidirectional — there is no signal path from client to server on the active streaming connection. stop() closes the client connection and sends an AbortController signal as a separate HTTP request. In distributed deployments, that request may land on a different server instance than the one holding the stream. The instance receiving the abort has no stream to cancel, so generation continues and the full token count is billed.
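The routing problem can be modelled in a few lines. This is a simplified simulation, not SDK code: the Instance class and the stream ID "chat-42" are hypothetical, and each instance is assumed to track its active streams only in its own memory.

```typescript
// Each server instance keeps its own in-memory registry of active streams.
class Instance {
  private streams = new Map<string, AbortController>();

  // Start generation for a stream and remember how to cancel it.
  startStream(streamId: string): AbortSignal {
    const controller = new AbortController();
    this.streams.set(streamId, controller);
    return controller.signal;
  }

  // Handle a cancel request; returns false if this instance holds no such stream.
  cancel(streamId: string): boolean {
    const controller = this.streams.get(streamId);
    if (!controller) return false;
    controller.abort();
    this.streams.delete(streamId);
    return true;
  }
}

const instanceA = new Instance();
const instanceB = new Instance();
const generationSignal = instanceA.startStream("chat-42");

// The load balancer routes the cancel request to instance B.
const cancelledOnB = instanceB.cancel("chat-42"); // false: nothing to cancel here
// Generation on instance A keeps running.
const stillRunning = !generationSignal.aborted; // true
// Only a cancel that reaches instance A stops the stream.
const cancelledOnA = instanceA.cancel("chat-42"); // true
```

Fixing this requires either sticky routing or a shared signal path (for example a pub/sub channel) that every instance listens on.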

What are the options for replacing SSE transport in a Vercel AI SDK application?

Four approaches are in common use: a Redis buffer between the agent and client; a custom WebSocket server outside Vercel's serverless functions; a database with realtime subscriptions (Supabase, Firebase, Convex); or a purpose-built transport layer integrated via the ChatTransport interface. Each has different trade-offs on delivery guarantees, reconnect behavior, and operational overhead.
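As an illustration of the first approach, the sketch below shows the core of an offset-based buffer: every token gets a sequence number, and a reconnecting client asks for everything after the last offset it saw. It uses an in-memory array as a stand-in for Redis, which is an assumption for the sake of a self-contained example. A real deployment would need a shared store so that any server instance can serve the replay. StreamBuffer and its method names are hypothetical.

```typescript
// One buffered token together with its position in the stream.
type BufferedToken = { offset: number; token: string };

// In-memory stand-in for a Redis-backed buffer (assumption for illustration).
class StreamBuffer {
  private entries: BufferedToken[] = [];

  // Append a token and return the offset assigned to it.
  append(token: string): number {
    const offset = this.entries.length;
    this.entries.push({ offset, token });
    return offset;
  }

  // Return every token after the last offset the client received.
  // Pass -1 for a client that has received nothing yet.
  replayAfter(lastSeenOffset: number): BufferedToken[] {
    return this.entries.slice(lastSeenOffset + 1);
  }
}

const tokenBuffer = new StreamBuffer();
for (const token of ["Durable", " sessions", " survive", " reconnects"]) {
  tokenBuffer.append(token);
}

// A client that disconnected after offset 1 asks for what it missed.
const missed = tokenBuffer
  .replayAfter(1)
  .map((entry) => entry.token)
  .join(""); // " survive reconnects"
```

The same offset idea underlies the other options too; they differ mainly in who stores the history and who handles reconnection.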

7 min read • Published Apr 22, 2026

Durable sessions for Vercel AI SDK applications

What is the Vercel serverless function timeout limit for long-running AI agents?

Vercel Hobby plan functions time out at 10 seconds. Pro plan functions default to 15 seconds and are configurable up to a maximum of 300 seconds. Edge Functions must begin streaming within 25 seconds. Multi-step agents doing retrieval, reasoning, and tool calls can exceed these limits under normal operating conditions.
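A quick back-of-the-envelope check shows how easily an agent run crosses these ceilings. The limits are the figures quoted above and may change; the step durations are invented for illustration, and fitsWithin is a hypothetical helper.

```typescript
// Limits in seconds, taken from the figures quoted above; they may change.
const limits = { hobby: 10, proDefault: 15, proMax: 300 };

// True when the summed step durations stay within the function limit.
function fitsWithin(limitSeconds: number, stepSeconds: number[]): boolean {
  const total = stepSeconds.reduce((sum, seconds) => sum + seconds, 0);
  return total <= limitSeconds;
}

// Hypothetical agent run: retrieval 4s, reasoning 9s, two tool calls at 6s each.
const agentRun = [4, 9, 6, 6]; // 25 seconds in total

const fitsHobby = fitsWithin(limits.hobby, agentRun); // false
const fitsProDefault = fitsWithin(limits.proDefault, agentRun); // false
const fitsProMax = fitsWithin(limits.proMax, agentRun); // true
```

In a Next.js route handler on Vercel, the limit is typically raised with a maxDuration route segment export; confirm the exact ceiling for your plan in Vercel's current documentation.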

TL;DR: The Vercel AI SDK handles the application layer but explicitly delegates transport to external providers. The default SSE transport breaks in production in predictable ways. Vercel’s own ChatTransport interface exists specifically so you can replace that default.

Vercel AI SDK covers model calls, orchestration, and UI rendering. For simple chatbots, it works end to end. The trouble comes when agents run long tasks or users are on enterprise networks: streaming that works in development fails in production. You'll see the same failure modes across every production AI stack, not just Vercel's.

The failure modes are consistent and documented. Vercel has made transport pluggable. The question is what to put there.

Scenario	SSE sufficient?	What you need
Simple chatbot, single device, stable network	Yes	Default useChat: don’t add complexity you don’t need
“Streaming works locally but not deployed”	No	Protocol fallback: WebSocket → HTTP → long-poll, auto-negotiated
Session continuity across tabs or devices	No	Channel-based fan-out with offset history; SSE can’t provide it
Long-running agents (30s+)	No	Stateful transport with agent presence signals
Human oversight of AI interactions	Partial	needsApproval handles user-side; org-side escalation requires a session layer
Enterprise deployment	No	Proxy traversal, SOC 2, audit trail; Redis builds don’t scale here

Why SSE breaks in production

Proxy buffering and “fake streaming”

No stream recovery on disconnect

One connection, one device

Cancellation is ambiguous

Serverless timeout ceilings

How Vercel architected the fix point: ChatTransport

The gap in the stack

Your options for filling the gap

1. Redis buffer

2. Build your own WebSocket layer

3. Database with realtime subscriptions

4. Purpose-built transport layer

Which path is right for your situation

The AI UX challenge: where Ably fits
