When Temporal launched, a lot of people had the same reaction: "We have queues and retries. We don't need this." (Temporal's own blog addressed this directly.) That reaction made sense. Queues solve queue problems and they do it well.
What Temporal gave you was something different: a named execution context that survives a server restart and picks up from its last checkpoint. Not a better queue. A different abstraction entirely. If you built with it, you couldn't imagine going back. If you hadn't, you were one production incident away from understanding why.
The same shift is coming for AI sessions. When you build a multi-message assistant, it's fair to assume the session is covered: WebSockets for transport, Redis for state, and some reconnect logic bolted on. That works in development. Production can be a different story: a real user loses a conversation mid-response and opens a support ticket wondering why they have to start over.
Key takeaways
- Durable execution crash-proofs backend workflow jobs. It doesn't persist the conversational session between an AI agent and a user.
- Every multi-message AI assistant has an implicit session. In most applications today, that session lives in memory, not in a durable store.
- The standard toolkit (Redis, WebSockets, job IDs, sticky sessions) solves adjacent problems but leaves the session layer unaddressed.
- Ably AI Transport makes sessions durable: a persistent, addressable channel between agent and user that outlives any single connection, device, or participant.
The session failure that arrives after you ship
An AI assistant that spans multiple messages has a session: the accumulated context, in-flight messages, the shared state between the agent and the user. For most AI applications right now, all of that lives in memory on a server.
You don't see this in development. You see it when a real user loses a conversation mid-interaction and opens a support ticket asking why they have to start over.
A user's phone locks and they don't come back for an hour. They start on their laptop and continue on their tablet, or simply switch tabs. The network drops without warning, with nothing surfacing to the UI. Each of these ends the session for a different reason, and none of them are rare. Connection drops are normal production behavior and device switches are a core usage pattern.

The code is usually fine. The problem is that the session state is tied to a connection that was never designed to hold it. What's missing is a session layer: infrastructure that sits between your transport and your application, whose only job is to keep the session alive regardless of what happens to the connection.
What Temporal actually made durable, and where it stops
Temporal made workflow execution durable. When you define a workflow in Temporal, it becomes a named execution context: if the server restarts mid-job, the workflow picks up from its last checkpoint. No custom idempotency logic, no lost progress, no retries starting from the beginning. Temporal is the right tool for this problem, and the category it created, durable execution, is now established enough to reach for without justifying it.
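To make the checkpoint-and-resume idea concrete, here is a toy sketch in plain Python. It is not Temporal's API: `run_workflow`, the `checkpoints` dict, and the step names are hypothetical, and a real durable execution engine persists its progress in a durable store rather than a dict in process memory.

```python
from typing import Callable, Dict, List

# Hypothetical durable store: workflow name -> index of the next step to run.
# In a real system this would live in a database, not in process memory.
checkpoints: Dict[str, int] = {}


def run_workflow(name: str, steps: List[Callable[[], None]], crash_before: int = -1) -> None:
    """Run steps in order, recording progress after each one.

    `crash_before` simulates a server crash just before that step index.
    """
    start = checkpoints.get(name, 0)  # resume from the last checkpoint
    for i in range(start, len(steps)):
        if i == crash_before:
            raise RuntimeError("simulated server crash")
        steps[i]()
        checkpoints[name] = i + 1  # checkpoint: step i is done


executed: List[str] = []
steps = [
    lambda: executed.append("fetch"),
    lambda: executed.append("summarize"),
    lambda: executed.append("notify"),
]

try:
    run_workflow("report-42", steps, crash_before=2)
except RuntimeError:
    pass  # the process "died" after two completed steps

# A fresh worker finds the workflow by name and runs only the remaining step.
run_workflow("report-42", steps)
```

The point of the sketch is the name: because progress is stored against `"report-42"` rather than against a process, any worker can pick the job up where the last one stopped.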
But execution durability and session durability are different abstractions at different layers of your stack.
Temporal keeps the backend job alive. It crash-proofs the work your agent does on the server. What it doesn't address is the session between your agent and your user.

You can run your entire AI backend on Temporal (fully crash-proof, deterministically replayed workflows) and still have zero session durability at the user-facing layer.
Redis, WebSockets, sticky sessions: none of them are a session layer
When the session problem first surfaces, new infrastructure isn't the obvious answer. The natural move is to reach for what's already there. Most of these tools are good at what they do. But none of them are designed for this.
Job IDs and polling
Job IDs tell you whether a background task finished. They don't buffer tokens from an agent that was mid-stream when the connection dropped. They don't know whether the user is still there. They solve the async status problem, which is a different problem.
Redis pubsub
Redis is fast, good for fan-out, and a popular choice. The problem is that pubsub is volatile by design: messages go to whoever is connected right now, and that's where the story ends. If the user drops for 30 seconds and comes back, everything published in that window is permanently gone.
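A toy model shows the failure mode. This sketch does not use Redis; `VolatilePubSub` is a hypothetical in-memory stand-in that assumes fire-and-forget delivery to whoever is currently subscribed, which is the behavior described above.

```python
from typing import Callable, Dict, List


class VolatilePubSub:
    """Toy fire-and-forget broker: delivers only to currently connected subscribers."""

    def __init__(self) -> None:
        self.subscribers: Dict[str, Callable[[str], None]] = {}

    def subscribe(self, client_id: str, handler: Callable[[str], None]) -> None:
        self.subscribers[client_id] = handler

    def unsubscribe(self, client_id: str) -> None:
        self.subscribers.pop(client_id, None)

    def publish(self, message: str) -> int:
        """Deliver to whoever is connected right now; return the receiver count."""
        for handler in list(self.subscribers.values()):
            handler(message)
        return len(self.subscribers)


broker = VolatilePubSub()
received: List[str] = []

broker.subscribe("user-1", received.append)
broker.publish("Hello")

broker.unsubscribe("user-1")  # the connection drops
dropped = broker.publish("this token is never stored")

broker.subscribe("user-1", received.append)  # the user reconnects
broker.publish("world")
```

After the reconnect, `received` holds only the first and last messages. The middle one had zero receivers and was never stored anywhere, so there is nothing to replay.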
Vercel's own AI SDK documentation spells out the challenge directly. It states that to resume streams after a disconnect, developers need to build storage to track which stream belongs to each chat, Redis to store the message stream, two API endpoints, and their own integration layer to manage it all. That's a lot of work to assemble and maintain. And at the end of it you still don't have a durable session.
Sticky sessions
Sticky sessions keep a user on the same server, which is useful if you've got per-server state. The idea is simple: route every request from the same user to the same machine, so whatever is held in memory is always reachable. But it's server affinity, not session persistence. When that server restarts or you scale out, the in-memory state goes with it. The user reconnects and lands on a different server that has no record of what the agent was doing, what had been said, or where things were up to. The session was never really durable.
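The gap between affinity and persistence is easy to see in a small simulation. Everything here is hypothetical (`Server`, `StickyRouter`, and the routing rule are illustrative names, not any load balancer's API); the only assumption is that conversation state lives in server memory.

```python
from typing import Dict, List


class Server:
    """Toy app server holding conversation state in process memory."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.memory: Dict[str, List[str]] = {}

    def handle(self, user: str, message: str) -> List[str]:
        history = self.memory.setdefault(user, [])
        history.append(message)
        return list(history)

    def restart(self) -> None:
        self.memory = {}  # in-process state does not survive a restart


class StickyRouter:
    """Hypothetical load balancer that pins each user to one server."""

    def __init__(self, servers: List[Server]) -> None:
        self.servers = servers
        self.affinity: Dict[str, Server] = {}

    def route(self, user: str) -> Server:
        if user not in self.affinity:
            # Stable choice for the demo: sum of character codes.
            index = sum(map(ord, user)) % len(self.servers)
            self.affinity[user] = self.servers[index]
        return self.affinity[user]


router = StickyRouter([Server("a"), Server("b")])
first = router.route("alice").handle("alice", "Plan my trip")
second = router.route("alice").handle("alice", "Make it three days")

router.route("alice").restart()  # a deploy or a crash
after_restart = router.route("alice").handle("alice", "Where were we?")
```

Routing worked perfectly: the user hit the same server every time. The history is still gone after the restart, because affinity decides where requests land, not whether state survives.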
WebSockets, SSE, and HTTP/2
A WebSocket is a transport, and a great fit for AI use cases: bidirectional, low-latency, suited to the back-and-forth of agent interactions. Managed correctly, with reconnection handling and something that captures state when the connection drops, WebSockets come into their own. But 'managed correctly' is where the work lives. On its own, a WebSocket holds nothing: when the connection closes, whatever was in flight is gone. SSE and HTTP/2 response streaming have the same issue, with the added limitation that data flows one way, from server to client.
Cloudflare Durable Objects
Durable Objects give you stateful compute at the edge, co-located with users. They're well-suited for low-latency stateful operations and useful for a lot of things. But they're scoped to a Cloudflare origin, every request has to route through that origin, and they weren't built to be a session layer for multi-participant AI conversations where agent and user might reconnect independently from different devices.
Actor model (Orleans, Akka, and equivalents)
Actor frameworks are strong for per-entity state management. An actor holds its own state across messages; for managing agent state in a long-running task the actor model is actually a legitimate approach. What it doesn't give you is the layer between the actor and the user. The actor knows its own state. It doesn't manage the channel, and it doesn't handle the case where the user reconnects from a different device.
What a durable session actually requires
These approaches aren't wrong. They solve real problems adjacent to session durability. But none of them provides the four properties a session needs to survive in production.
- Persistent: Messages generated while a participant is disconnected need to be buffered and replayed in order when they reconnect. Not "try to deliver and log failures." Actually held and replayed. Without this, any connection drop is a data loss event and you're hoping users don't notice. → Reconnection and recovery
- Addressable: The session needs a stable identity that both sides can reconnect to, across devices and connection resets. An ephemeral socket ID that disappears when the connection closes doesn't cut it. You need a named thing that persists and can be found again. → Sessions concept
- Presence-aware: Both sides need to know when the other connects or disconnects. This sounds obvious until you've had an agent burning inference on a response nobody is receiving, or a UI showing "typing..." for an agent that crashed two minutes ago. First-class state transitions, not conditions you infer from timeouts. → Agent presence
- Multi-participant coherent: Agent and user can reconnect in any order, from any device. The session stays consistent regardless of which side comes back first. A device that joins late gets the full history. This is the property almost no production AI system has today. → Multi-device sessions
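The four properties fit in a surprisingly small data model. The sketch below is a single-process toy, not AI Transport's implementation or API: `Session`, `get_session`, and the session name are hypothetical, and a production system would keep the log and presence set in replicated storage.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Session:
    """Toy durable session: an append-only log plus a presence set."""

    name: str
    log: List[str] = field(default_factory=list)
    present: Set[str] = field(default_factory=set)

    def publish(self, message: str) -> int:
        """Persistent: store the message whether or not anyone is connected."""
        self.log.append(message)
        return len(self.log)  # serial of the message just stored

    def attach(self, participant: str, last_seen: int = 0) -> List[str]:
        """Presence-aware attach that replays everything after `last_seen`."""
        self.present.add(participant)
        return self.log[last_seen:]

    def detach(self, participant: str) -> None:
        self.present.discard(participant)


# Addressable: sessions are found by a stable name, not a connection ID.
sessions: Dict[str, Session] = {}


def get_session(name: str) -> Session:
    return sessions.setdefault(name, Session(name))


chat = get_session("chat:alice:trip-planning")
chat.attach("agent")
chat.attach("alice-phone")
chat.publish("Day 1: museums")  # serial 1, the phone has seen it
chat.detach("alice-phone")      # the phone locks
chat.publish("Day 2: hiking")   # stored even though the user is away

# Multi-participant coherent: the phone resumes, and a new device gets full history.
missed = get_session("chat:alice:trip-planning").attach("alice-phone", last_seen=1)
late_joiner = get_session("chat:alice:trip-planning").attach("alice-laptop")
```

The phone receives only what it missed, the laptop receives everything, and neither the agent nor the user had to be connected at the same time for that to work.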
This is the move Temporal made for workflows: a named context with stable identity, deterministic replay, and explicit state transitions. "What happens to my backend job when the server crashes?" was the Temporal problem. "What happens to my conversation when the connection drops?" is the same problem, one layer up.
This is what we built AI Transport to do.

How Ably AI Transport implements durable sessions
AI Transport is a drop-in infrastructure layer that adds a durable session to any AI application, regardless of what model or framework you're running. With AI Transport in place, agents and users can come and go, but the session persists.
Named session
Each conversation lives in a named session. Both the agent and the user connect to the same session, and the session persists independently of their connections. When either side disconnects and reconnects, on the same device or a different one, they rejoin the same session and pick up from where they left off. The session identity is not a connection ID.
Message history and replay on reconnect
When a user reconnects after a drop, AI Transport replays messages from where they left off, including individual tokens from an in-progress agent response. Tokens are automatically compacted on replay: the reconnecting client gets the full accumulated response, not a retransmit of every individual delta.
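Compaction on replay can be pictured as a fold over the missed deltas. This is an illustrative sketch only: the `Delta` shape, the `serial` field, and `compact` are assumptions made for the example, not AI Transport's wire format or SDK.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Delta:
    """One streamed fragment of an agent response (hypothetical shape)."""

    response_id: str
    serial: int
    text: str


def compact(deltas: List[Delta]) -> Dict[str, str]:
    """Fold token deltas into one accumulated string per response, in serial order."""
    accumulated: Dict[str, str] = {}
    for delta in sorted(deltas, key=lambda d: d.serial):
        accumulated[delta.response_id] = accumulated.get(delta.response_id, "") + delta.text
    return accumulated


# Deltas the client missed while disconnected, deliberately out of order.
missed = [
    Delta("resp-1", 3, " is"),
    Delta("resp-1", 2, " answer"),
    Delta("resp-1", 1, "The"),
    Delta("resp-1", 4, " 42."),
]
replayed = compact(missed)
```

The reconnecting client gets one entry per response with the text so far, rather than four separate events to reassemble in the UI.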
Presence: tracking connection state for both participants
Agents publish structured state as presence events (thinking, streaming, idle, offline) so the system always knows what the agent is doing and who is watching. When an agent crashes, the presence disconnect event fires immediately. Crash detection is a presence event, not a timeout. That distinction matters: you can pause inference when nobody is listening, surface an "agent offline" indicator right away, and resume exactly where the agent left off when the user comes back.
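As a rough sketch of presence as explicit events rather than inferred timeouts, consider the toy tracker below. `PresenceSet`, `should_run_inference`, and the member names are hypothetical and do not reflect the AI Transport SDK; the state names are the ones listed above.

```python
from enum import Enum
from typing import Dict, List, Tuple


class AgentState(Enum):
    THINKING = "thinking"
    STREAMING = "streaming"
    IDLE = "idle"
    OFFLINE = "offline"


class PresenceSet:
    """Toy presence tracker that records explicit enter/update/leave events."""

    def __init__(self) -> None:
        self.members: Dict[str, str] = {}
        self.events: List[Tuple[str, str, str]] = []

    def enter(self, member: str, state: str) -> None:
        self.members[member] = state
        self.events.append(("enter", member, state))

    def update(self, member: str, state: str) -> None:
        self.members[member] = state
        self.events.append(("update", member, state))

    def leave(self, member: str) -> None:
        self.members.pop(member, None)
        self.events.append(("leave", member, AgentState.OFFLINE.value))

    def has_audience(self) -> bool:
        """True if anyone other than the agent is present."""
        return any(member != "agent" for member in self.members)


def should_run_inference(presence: PresenceSet) -> bool:
    # Pause generation when nobody is there to receive it.
    return presence.has_audience()


presence = PresenceSet()
presence.enter("agent", AgentState.THINKING.value)
presence.enter("alice", "viewing")
with_user = should_run_inference(presence)

presence.leave("alice")  # the user's last device disconnects
without_user = should_run_inference(presence)

presence.leave("agent")  # an agent crash surfaces as an event
```

Because the agent's departure is recorded as an event, a UI subscribed to `events` can show "agent offline" as soon as it arrives instead of waiting for a timeout to expire.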
If you've hit the session wall, then AI Transport replaces the Redis buffer, the catch-up logic, the polling for agent status, and the ad-hoc reconnect handling. It gives you the session abstraction those systems were trying to approximate.
The move is the same one Temporal made for workflows: stop building reliability into every application that needs it and push it into the infrastructure layer.
Temporal made the backend job durable. We make the session durable.
FAQ: durable execution, durable sessions, and Ably AI Transport
What is the difference between durable execution and durable sessions?
Durable execution, Temporal's category, crash-proofs backend workflow jobs. The job survives a server restart and resumes from its last checkpoint without custom idempotency logic. Durable sessions persist the conversation between an AI agent and a user beyond any single connection: messages are buffered, session identity is stable, and both participants can reconnect without losing state. They operate at different layers of the stack and work well together.
What did Temporal do for backend workflows that AI sessions still need?
Temporal gave workflow execution a stable, named identity and deterministic replay. The job became a durable entity rather than a process that could silently disappear on a server restart. AI sessions need the same move at the session layer: stable identity, message buffering across disconnections, explicit state transitions when participants connect and disconnect. Structurally identical problem, one layer up.
How does Ably AI Transport relate to Temporal durable execution?
Complementary layers. Temporal handles your AI backend job. AI Transport handles the session between that job and the user. Most production AI assistants benefit from both: Temporal so the agent's work survives a crash, AI Transport so the conversation survives a reconnect. Temporal's own documentation notes there's no built-in way to get results to a frontend. That's the gap AI Transport addresses.
Does Ably AI Transport replace Temporal for AI agent workflows?
No. AI Transport is a session layer, managing the sessions between agent and user. Temporal is an execution layer, managing the lifecycle of backend jobs. Different failure modes, different layers.
Can I use Temporal and Ably together in the same AI stack?
Yes, and this is the intended architecture for long-running AI work. Temporal orchestrates agent execution; AI Transport manages the session through which the agent communicates with the user. Each handles a failure mode the other doesn't.
What happens to an AI session when the user disconnects?
The agent keeps streaming to the channel while the user is away. Messages generated during the disconnection are buffered at the session layer and replayed in sequence when the user reconnects, regardless of device. Catch-up happens automatically, and no messages are dropped.
AI Transport's documentation covers the full session model: reconnection handling, presence setup, and the AI Transport SDK.
Further reading: Why we're betting on Durable Sessions · The model is fine. The session is broken. · The Durable Sessions stack is forming



