Your Vercel AI SDK app is missing a session layer

If you have built an AI chat feature with the Vercel AI SDK, you have used its useChat hook. You give it your messages, and it streams the reply into your UI.

You may have seen our post on the custom transport we built for the Vercel AI SDK. It swaps useChat's default transport for Ably AI Transport, adding resumable streams, cross-device and multi-user sync, conversation branching, history compaction, and stop-and-approve controls.

This post tours a demo built on exactly that: the Vercel AI SDK, useChat, and Ably AI Transport. Try it live, watch the walkthrough below, or read on.

Key takeaways

The demo is a standard Vercel AI SDK useChat app with Ably AI Transport as its transport. That one change is what brings the features below.
Multi-device fan-out: the conversation stays in sync across every tab and device.
Multi-user conversations: several people share one thread, each with the full history.
Resumable streams: a refresh or reconnect resumes the answer instead of losing it.
Conversation branching: edit or regenerate without losing the previous answer.
History compaction: streamed tokens collapse into one final message, so history stays small.
A way back to the agent: stop a response, or approve an action before it runs.

What the demo shows

The change is small. Instead of useChat's default transport, you hand it an Ably-backed one, so the conversation runs on a shared durable session. It stays a Next.js app otherwise, and your model, tools, and message components do not change.

On the client, that is one swapped transport plus two hooks that pull the shared session into useChat.

// Inside a React client component.
const { chatTransport } = useChatTransport(); // the Ably-backed transport for useChat
const { setMessages, sendMessage, status, stop } = useChat({ transport: chatTransport });
useMessageSync({ setMessages }); // keep useChat in step with the shared session
const { messages } = useView(); // render from the session, so branches and other clients show up

So what does this custom transport give you? Start chatting, then open the same link on your phone or a second tab and watch the conversation stay in sync. Ask for a long poem and refresh halfway through.

Regenerate a reply you do not like, or hit Stop on one that goes wrong. Then open the debug pane on the right, watch the messages go by, and reload to see what the session kept.

Each of those does something a plain useChat app cannot. The rest of this post takes them one at a time: what each is, what breaks without it, and how to try it in the demo.

Multi-device fan-out keeps every tab and device in sync

The same conversation is live on every tab and device at once, streaming to all of them. Open it on a laptop and a phone, and both show the same tokens as they arrive.

By default, the live answer only reaches the tab that asked for it, so a phone or a second tab shows nothing of it. The sync here comes from the transport: the conversation runs on a durable session backed by a shared Ably channel. The agent streams its tokens over it, and any device that subscribes sees every message: user prompts, agent responses, and control signals.

In the demo, the "open in new tab" button opens the same session as a fresh client. Send from either tab and both update together, streaming included. Refresh one and it catches back up. Put the tabs side by side and the debug pane shows both receiving the same Ably messages.
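To make the fan-out idea concrete, here is a minimal in-memory sketch. It is not the Ably SDK: SharedChannel, its subscribe and publish methods, and the message kinds are hypothetical names chosen for illustration, loosely mirroring the three kinds of message listed above.

```typescript
// In-memory stand-in for a shared channel. Hypothetical names throughout:
// this models the fan-out idea only and is not the Ably SDK.
type FanoutKind = "user-prompt" | "agent-token" | "control";

interface FanoutMessage {
  kind: FanoutKind;
  text: string;
}

type FanoutListener = (msg: FanoutMessage) => void;

class SharedChannel {
  private listeners: FanoutListener[] = [];

  // Register a device. Returns a function that unsubscribes it.
  subscribe(listener: FanoutListener): () => void {
    this.listeners.push(listener);
    return () => {
      this.listeners = this.listeners.filter((l) => l !== listener);
    };
  }

  // Deliver one message to every subscribed device, the sender included.
  publish(msg: FanoutMessage): void {
    this.listeners.forEach((listener) => listener(msg));
  }
}

// Two "devices" watching the same conversation.
const fanoutChannel = new SharedChannel();
const laptopView: string[] = [];
const phoneView: string[] = [];
fanoutChannel.subscribe((m) => { laptopView.push(m.kind + ":" + m.text); });
fanoutChannel.subscribe((m) => { phoneView.push(m.kind + ":" + m.text); });

// One prompt goes out, then the agent streams two tokens back.
fanoutChannel.publish({ kind: "user-prompt", text: "Write a poem" });
fanoutChannel.publish({ kind: "agent-token", text: "Roses" });
fanoutChannel.publish({ kind: "agent-token", text: " are red" });
```

Both views end up with the same three entries in the same order, because neither device talks to the agent directly. Each one only subscribes to the channel.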

Multi-user conversations put more than one person on the same thread

A multi-user conversation is one thread that several distinct participants share, each able to see the full history and take part.

Without it, the reply reaches only the person who asked, so a teammate opening the same chat sees a blank thread. Here each participant joins one shared session with their own clientId. Every message carries that id, so the thread shows who said what, and anyone who joins reads the full history first.

In the demo, the clientId comes from a URL parameter, so changing it makes you a new participant. Each gets a color, and the Ably Messages tab shows their run-client-id on the wire. Client-side tools, such as reading the browser's location, run only on the participant who triggered them. Everyone sees the result, but only the right browser does the work.
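A small sketch of the attribution idea follows. It assumes a hypothetical SharedThread class and a shouldRunToolLocally helper, neither of which is part of the SDK, and it treats the agent as just another clientId for simplicity.

```typescript
// Hypothetical model of a shared thread. Names are illustrative, not the SDK's.
interface ThreadMessage {
  clientId: string; // who published the message
  text: string;
}

class SharedThread {
  private log: ThreadMessage[] = [];

  // A participant joins and receives a copy of the full history so far.
  join(): ThreadMessage[] {
    return this.log.slice();
  }

  // Every message is stamped with the sender's clientId.
  post(clientId: string, text: string): void {
    this.log.push({ clientId: clientId, text: text });
  }
}

// A client-side tool should run only in the browser that triggered it.
function shouldRunToolLocally(triggeredBy: string, localClientId: string): boolean {
  return triggeredBy === localClientId;
}

const teamThread = new SharedThread();
teamThread.post("alice", "Where should we host the offsite?");
teamThread.post("agent", "Lisbon has good options in May.");

// Bob joins late and still sees who said what.
const bobHistory = teamThread.join();
const transcript = bobHistory.map((m) => m.clientId + ": " + m.text);
```

Here transcript holds both earlier messages with their authors, and shouldRunToolLocally("alice", "bob") is false, so Bob's browser would display the result of Alice's tool call without executing it.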

Resumable streams survive refreshes and reconnects

A resumable stream survives a page reload or a reconnect. The conversation rebuilds itself from history, and an in-progress answer keeps arriving instead of vanishing.

This is the problem you hit first when you build your own apps. The default transport streams the reply once and does not retain it. If the connection drops or the page reloads, there is nothing to replay, so the answer vanishes and the agent starts over.

Recovering it is a build of its own. The Vercel AI SDK's own documentation for resumable streams walks through what it takes:

Redis to store the stream, wired up through the resumable-stream package.
A POST endpoint to start a stream and a GET endpoint to resume it.
Storage to track which stream belongs to which chat.
A separate stop endpoint, with its own support code, to cancel the underlying work.

It works, but every piece is infrastructure you build, run, and operate.

With AI Transport, the session is that store: it holds the conversation, so any client can rebuild it, and you run nothing extra. Reload the page in the demo and the conversation comes back, completed tool results included, and a "load older messages" control pages back through history. The client re-derives its state from the session rather than from a held-open HTTP stream. If a run is still going when you refresh the browser, it keeps streaming in, and the debug pane shows the history rehydrating.
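The difference between a held-open stream and a session that retains what it streamed can be sketched in memory. DurableSession, append, and attach are hypothetical names for this illustration, not the SDK's API. The point is the order of operations: store first, then deliver, and replay on attach.

```typescript
// Hypothetical in-memory session that retains every chunk it streams.
type ChunkHandler = (chunk: string) => void;

class DurableSession {
  private stored: string[] = [];
  private liveClients: ChunkHandler[] = [];

  // The agent appends a chunk: it is stored first, then pushed to live clients.
  append(chunk: string): void {
    this.stored.push(chunk);
    this.liveClients.forEach((deliver) => deliver(chunk));
  }

  // A client attaches: replay what is stored, then keep receiving live chunks.
  // Returns a function that detaches the client again.
  attach(deliver: ChunkHandler): () => void {
    this.stored.forEach((chunk) => deliver(chunk));
    this.liveClients.push(deliver);
    return () => {
      this.liveClients = this.liveClients.filter((d) => d !== deliver);
    };
  }
}

const poemSession = new DurableSession();

// The first page load sees the start of the answer.
let beforeReload = "";
const closeFirstPage = poemSession.attach((c) => { beforeReload += c; });
poemSession.append("Shall I ");
poemSession.append("compare thee ");
closeFirstPage(); // the page reloads mid-answer

// After the reload, a fresh client rebuilds from the session and keeps streaming.
let afterReload = "";
poemSession.attach((c) => { afterReload += c; });
poemSession.append("to a summer's day?");
```

The reloaded client ends up with the whole sentence even though it attached halfway through, which is the behavior the demo shows when you refresh during a long answer.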

Conversation branching lets you edit and regenerate without losing the thread

Conversation branching means an edit or a regeneration creates a new branch while keeping the old one. You can move between branches, and every client sees the same tree.

Skip branching and an edit becomes destructive. Change a prompt and it overwrites the previous answer. Regenerate a reply and the one you had is gone. You cannot compare two phrasings of a question, or return to an answer you preferred.

Hover a message in the demo and controls appear. Edit a prompt and it re-sends as a forked branch rooted at that message. Regenerate a reply and it forks a new branch while keeping the previous one. A small < n / m > navigator appears under any message with siblings. Open the Ably Messages tab and you can see the fork in the message headers. An edit carries a parent and a fork-of. A regeneration carries a parent and a msg-regenerate marker. The branches persist across reloads, and other participants see them too, because the whole tree lives on the session.
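As a rough model of that tree, the sketch below treats messages that share a parent as sibling branches. The parent, fork-of, and msg-regenerate headers are the ones named above. The BranchMessage field types and the siblingsOf and navigatorLabel helpers are assumptions for illustration and may not match how the SDK stores or walks the tree.

```typescript
// Hypothetical shape for one message in the conversation tree.
interface BranchMessage {
  id: string;
  parentId: string | null; // null for the first message in a thread
  text: string;
  forkOf?: string;      // set on an edit: the message this one replaces
  regenerate?: boolean; // set on a regenerated reply
}

// Messages that share a parent are alternative branches of the same turn.
function siblingsOf(tree: BranchMessage[], id: string): BranchMessage[] {
  const target = tree.filter((m) => m.id === id)[0];
  if (!target) return [];
  return tree.filter((m) => m.parentId === target.parentId);
}

// Label for the "< n / m >" navigator, or null when there is only one branch.
function navigatorLabel(tree: BranchMessage[], id: string): string | null {
  const siblings = siblingsOf(tree, id);
  if (siblings.length < 2) return null;
  const position = siblings.map((m) => m.id).indexOf(id) + 1;
  return "< " + position + " / " + siblings.length + " >";
}

const branchTree: BranchMessage[] = [
  { id: "u1", parentId: null, text: "Name a fast language" },
  { id: "a1", parentId: "u1", text: "C" },
  // Regenerate: a second reply under the same prompt, and a1 is kept.
  { id: "a2", parentId: "u1", text: "Rust", regenerate: true },
  // Edit: a new prompt forked from u1, sharing u1's parent.
  { id: "u2", parentId: null, text: "Name a safe language", forkOf: "u1" },
  { id: "a3", parentId: "u2", text: "Rust" },
];
```

Because nothing is overwritten, navigatorLabel(branchTree, "a2") can report its position among the replies to u1, while a3 has no siblings and gets no navigator.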

History compaction keeps the stored conversation small

On the wire, a streamed answer is a create plus many small append operations, one per chunk, as the tokens arrive. History compaction collapses that whole sequence into a single final message once the answer is complete.

Without it, the session's history fills with every token fragment. A reload, a new device, or a late joiner would replay thousands of append events to rebuild a single answer. Stored data grows fast, and hydrating a conversation gets slow and noisy.

Ably compacts that whole create-and-append sequence into one final message per answer once the response completes. Open the debug pane and watch the Ably Messages tab. While a response streams, you see the create and a run of appends. Reload the page, and history returns the compacted result, a handful of final messages instead of the full token log. The SDK reads that compacted history back as a clean conversation, so reloads and new participants start from a tidy state.
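The fold itself is simple to picture. The sketch below is an illustrative model of the idea, not Ably's implementation: the WireOp shape and the compact function are assumptions, with only the create and append operation names taken from the description above.

```typescript
// Hypothetical wire operations: one create per answer, then many appends.
type WireOp =
  | { op: "create"; id: string; text: string }
  | { op: "append"; id: string; text: string };

interface CompactedMessage {
  id: string;
  text: string;
}

// Fold a create-and-append log into one final message per answer,
// keeping answers in the order they were created.
function compact(ops: WireOp[]): CompactedMessage[] {
  const out: CompactedMessage[] = [];
  ops.forEach((o) => {
    if (o.op === "create") {
      out.push({ id: o.id, text: o.text });
      return;
    }
    // Append: extend the message created with the same id.
    // An append for an unknown id is ignored in this sketch.
    const target = out.filter((m) => m.id === o.id)[0];
    if (target) target.text += o.text;
  });
  return out;
}

const wireLog: WireOp[] = [
  { op: "create", id: "m1", text: "Ro" },
  { op: "append", id: "m1", text: "ses" },
  { op: "append", id: "m1", text: " are red" },
  { op: "create", id: "m2", text: "Vio" },
  { op: "append", id: "m2", text: "lets" },
];

const compacted = compact(wireLog);
```

Five wire operations become two stored messages here. With a real answer the ratio is far larger, which is why hydrating from compacted history stays fast.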

A way back to the agent: stop and approve

The transport is bidirectional, so the user can send signals back to the running agent. In the demo that means two things you can try: stopping a response, and approving or denying an action before the agent takes it.

With no return path, the user has no controls during a response. You cannot tell a one-way stream to stop. A wrong or expensive answer runs to the end, or the user gives up and closes the tab.

While a response streams, the send button becomes a red Stop button. Click it and the demo publishes a cancel, and the agent aborts the stream. The cancel is a message like any other, so a second tab watching the same session sees it arrive in the Ably Messages tab. Approvals work through the same session: an approval-gated tool pauses the run and shows a card offering Approve and Deny. The run suspends rather than holding a request open, and the suspend and resume show up as events on the session. The agent runs the tool only if you approve.
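Here is one way the agent side of that return path could look, as a sketch under stated assumptions. StoppableRun, the ControlMessage shape, and resolveGatedTool are hypothetical and not taken from the demo's code. The only real API used is the standard AbortController, which is also the usual way to hand a cancellation signal to a model call.

```typescript
// Hypothetical control message arriving over the session.
interface ControlMessage {
  type: "cancel";
}

class StoppableRun {
  private controller = new AbortController();
  private emitted: string[] = [];
  private cursor = 0;
  private tokens: string[];

  constructor(tokens: string[]) {
    this.tokens = tokens;
  }

  // Called when a control message arrives: a cancel aborts the run.
  onControl(msg: ControlMessage): void {
    if (msg.type === "cancel") this.controller.abort();
  }

  // Emit the next token unless the run was cancelled or has finished.
  // Returns false once nothing more will be emitted.
  step(): boolean {
    if (this.controller.signal.aborted || this.cursor >= this.tokens.length) {
      return false;
    }
    this.emitted.push(this.tokens[this.cursor]);
    this.cursor += 1;
    return true;
  }

  output(): string {
    return this.emitted.join("");
  }
}

// Approval gate: the tool runs only on an explicit approve.
type ApprovalDecision = "approve" | "deny";

function resolveGatedTool(decision: ApprovalDecision, runTool: () => string): string | null {
  return decision === "approve" ? runTool() : null;
}

// The user hits Stop after two tokens.
const demoRun = new StoppableRun(["The ", "answer ", "is ", "long"]);
demoRun.step();
demoRun.step();
demoRun.onControl({ type: "cancel" });
const continuedAfterCancel = demoRun.step();
```

After the cancel, step() returns false and the output stays at the two tokens already sent. resolveGatedTool returns null on a deny without ever calling the tool function.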

The pattern, in one move

The demo is a standard Vercel AI SDK useChat app, and the only change is that Ably carries the transport. Once the conversation lives on a shared session, the behaviors above stop being things you build and become things you watch happen. The shared, addressable session can carry structured app state and presence too, the foundations a collaborative app needs. Your useChat hook, model, and components stay the same throughout.

Try it yourself

Open the live demo
Watch the two-minute walkthrough
Get the code and run it yourself: the useChat demo on GitHub
Read the Ably AI Transport docs
Read the engineering story: We built a Custom Transport for Vercel's AI SDK

You keep your stack. The Vercel AI SDK exposes a pluggable transport, and Ably AI Transport fills that slot. Sign up free. You need an Ably account and an API key to run a session, and the free tier covers everything you need to start.
