6 min read · Updated Dec 17, 2025

The evolution of realtime AI: How Ably is powering Gen-2 conversational experiences

When we launched Ably in 2016, we set out to solve a fundamental problem: delivering reliable, low-latency realtime experiences at scale. We built a globally distributed system that didn't force developers to choose between latency, integrity, and reliability – trade-offs that had defined the realtime infrastructure space for years.

Fast forward to today, and we're reaching 2 billion devices monthly, processing 2 trillion operations for customers who demand rock-solid infrastructure for their mission-critical features. But over the past year, as AI has transformed from a backend optimisation tool into a front-and-centre user experience, we've been asking ourselves a critical question: What's Ably's role in the AI ecosystem?

From "nice-to-have" to essential infrastructure

A year ago, if you'd asked us about Ably's AI story, we would have told you that yes, customers were using us. Companies like HubSpot and Intercom were leveraging Ably for token streaming and realtime AI features. But honestly? The value proposition felt incremental. Traditional LLM interactions followed a simple request-response pattern: send a query, stream back tokens, done. HTTP streaming handled this reasonably well, and while Ably offered benefits, there wasn't a compelling reason to use us specifically for AI.

That's changed dramatically.

The shift to Gen 2 AI experiences

What we're calling "Gen 2" AI experiences are fundamentally different from what came before. Instead of simply querying a model's training data, today's AI agents reason, search the web, call APIs, interact with tools via MCP (Model Context Protocol), and orchestrate complex multi-step workflows. Just look at how Perplexity searches, or how ChatGPT now breaks down complex requests into observable reasoning steps.

[Table: Gen 1 vs. Gen 2 AI UX experiences]

This shift introduces an entirely new set of challenges:

The problems Gen 2 creates

Async by default: When an AI agent needs 30 seconds or a minute to complete a task (not 3 seconds), user behaviour changes. Users switch tabs, check their phones, or start other work. A simple HTTP request suddenly needs to handle disconnections, reconnections, and state recovery.

Continuous feedback is mandatory: Users need to know what's happening. "Searching the web... Analysing documents... Calling your CRM..." This isn't a nice-to-have anymore. Without feedback, users assume the system has failed.

Multi-threading conversations: Imagine asking a support agent about your order status. While they're checking, you ask another question. Now you have two parallel operations that need coordination. The agent needs to know what else is in flight and potentially prioritise or sequence responses intelligently.

Cross-device continuity: Users want to set tasks running and then pick up later from where they left off. They may start a deep research query on their laptop, close it, and then want to check progress on their phone an hour later. The entire conversation state needs to transfer seamlessly.
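To make the state-recovery and cross-device requirements concrete, here is a minimal sketch of a resumable conversation. It assumes a hypothetical server-side store that assigns sequence numbers to agent updates, so a reconnecting device (or a different device entirely) asks for the delta since the last update it saw, rather than replaying the whole token stream. None of these names are Ably APIs – this is just the shape of the problem.

```typescript
// Hypothetical server-side conversation state. Each agent update gets
// a monotonically increasing sequence number so any device can resume.
type Update = { seq: number; text: string };

class ConversationStore {
  private updates: Update[] = [];
  private nextSeq = 1;

  // The agent appends progress as it works.
  append(text: string): Update {
    const u = { seq: this.nextSeq++, text };
    this.updates.push(u);
    return u;
  }

  // A reconnecting device asks: "what happened after seq N?"
  // It gets only the updates it missed, not a full replay.
  resumeFrom(lastSeenSeq: number): Update[] {
    return this.updates.filter((u) => u.seq > lastSeenSeq);
  }
}

// Laptop kicks off a deep-research task...
const convo = new ConversationStore();
convo.append("Searching the web...");
convo.append("Analysing documents...");

// ...phone opens an hour later, having seen nothing (seq 0):
const missed = convo.resumeFrom(0);
console.log(missed.map((u) => u.text));
// The phone is caught up without the laptop ever reconnecting.
```

The key design point is that recovery is driven by the device's last-seen position, not by the connection that originally issued the request – which is exactly what a plain HTTP stream cannot offer.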

Introducing Ably AI Transport

Our vision for addressing these challenges centres on what we're calling the Ably AI Transport Layer: a drop-in solution that handles the complexity of Gen 2 AI communications so developers can focus on building great agent experiences, not wrestling with networking problems.

The architecture is deliberately narrow in scope. We focus on everything between your AI agents and end-user devices, leaving orchestration, LLM selection, and business logic where they belong – in your control.

Layer 1: The foundation

At the base level, the AI Transport provides what you'd expect from Ably: bulletproof reliability, multi-device synchronisation, and automatic resume capabilities. But the real shift is architectural. Instead of your agent responding directly to requests, it returns a conversation ID. Devices subscribe to that conversation, and from that point forward, the agent pushes updates through Ably.

This simple change unlocks powerful capabilities:

  • Decoupling: Agents and devices can disconnect and reconnect independently without losing continuity
  • Bidirectional control: Need to stop an agent mid-task or ask a follow-up question? There's a direct communication channel that doesn't require complex routing infrastructure
  • State recovery: Reconnecting devices don't replay every token. They get current state and resume from there
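The conversation-ID pattern above can be sketched in a few lines. This is an illustrative in-memory stand-in, not the Ably SDK: the agent returns an ID immediately instead of holding a response open, both sides publish to a channel keyed by that ID, and the channel retains messages so late joiners can recover state.

```typescript
// A message on the conversation channel, from either side.
type Msg = { from: "agent" | "device"; data: string };
type Handler = (msg: Msg) => void;

class ConversationChannel {
  private subscribers: Handler[] = [];
  readonly log: Msg[] = []; // retained so reconnecting devices can catch up

  subscribe(h: Handler) {
    this.subscribers.push(h);
  }

  publish(from: "agent" | "device", data: string) {
    const msg: Msg = { from, data };
    this.log.push(msg);
    this.subscribers.forEach((h) => h(msg));
  }
}

// Agent side: start work and return a conversation ID at once.
const channels = new Map<string, ConversationChannel>();
function startTask(): string {
  const id = `conv-${channels.size + 1}`;
  channels.set(id, new ConversationChannel());
  return id;
}

const id = startTask();
const channel = channels.get(id)!;

// Device side: subscribe to agent updates on the conversation...
const seen: string[] = [];
channel.subscribe((m) => {
  if (m.from === "agent") seen.push(m.data);
});

channel.publish("agent", "Calling your CRM...");

// ...and use the same channel for control messages back to the agent,
// e.g. stopping it mid-task – no extra routing infrastructure needed.
channel.publish("device", "stop");
```

Because both parties address the conversation rather than each other, either side can drop and rejoin independently – the decoupling described above falls out of the addressing model.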

Layer 2: Richer orchestration

The next layer introduces shared state on channels using our CRDT-based collaborative objects. This enables sophisticated coordination:

  • Agent presence and status: Devices know if agents are active, thinking, or have crashed. Agents can broadcast their current focus ("Analysing Q4 data...") as state rather than events.
  • Multi-agent coordination: When multiple agents work simultaneously – say, one handling a technical query while another processes a billing question – they can see each other's state and coordinate without stepping on each other's work.
  • Context-aware prioritisation: Agents can see if a user is actively waiting versus having backgrounded their session, enabling smarter resource allocation.
  • Client-side tool calls: In co-pilot scenarios, agents can query the client directly about user context ("Is the user currently editing this field?") without round-tripping through backend systems.
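The difference between status-as-state and status-as-events can be shown with a simple last-write-wins map, sketched below. Each agent owns one key, so a late-joining device reads the current status directly instead of replaying a history of status events. This illustrative register stands in for (and is far simpler than) Ably's CRDT-based collaborative objects.

```typescript
// Per-agent status entry; updatedAt provides last-write-wins ordering.
type Status = { focus: string; updatedAt: number };

class StatusMap {
  private entries = new Map<string, Status>();

  // Last write wins per key: a stale update never clobbers a newer one.
  set(agentId: string, focus: string, updatedAt: number) {
    const current = this.entries.get(agentId);
    if (!current || updatedAt >= current.updatedAt) {
      this.entries.set(agentId, { focus, updatedAt });
    }
  }

  get(agentId: string): string | undefined {
    return this.entries.get(agentId)?.focus;
  }

  // What every device – and every other agent – sees right now.
  snapshot(): Record<string, string> {
    return Object.fromEntries(
      [...this.entries].map(([id, s]) => [id, s.focus])
    );
  }
}

const statuses = new StatusMap();
statuses.set("support-agent", "Checking order status...", 1);
statuses.set("billing-agent", "Analysing Q4 data...", 2);
statuses.set("support-agent", "Drafting reply...", 3);

// An out-of-order (stale) update arrives late and is simply ignored:
statuses.set("support-agent", "Checking order status...", 1);

console.log(statuses.snapshot());
```

Real CRDTs handle concurrent writers and merging far more carefully, but even this sketch shows why shared state suits presence: the answer to "what is this agent doing?" is a read, not a replay.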

Layer 3: Enterprise-grade observability

Because everything flows through Ably, you gain comprehensive visibility into the last mile of your AI experience. Stream observability into your existing systems, integrate with Kafka for audit trails, and leverage enterprise features like SSO and SOC 2 compliance that come standard with Ably's infrastructure.

A new way of building

What excites us most isn't replacing streaming protocols – it's providing a stateful conversation layer that removes infrastructure concerns from the developer's plate. Think of it as abstract storage for conversation state, combined with the realtime capabilities developers need for modern AI UX.

The developers building these experiences don't want to solve networking problems. They want to focus on prompts, orchestration, RAG pipelines, and agent logic. The transport layer shouldn't be where they spend their time – but it will become critical as user expectations evolve to match what ChatGPT, Perplexity, and Claude are demonstrating daily.

Framework-agnostic by design

One pattern we've noticed: as engineering teams mature in their AI journey, they tend to move away from monolithic frameworks and build custom orchestration logic. This makes sense – these systems become core to their business differentiation.

That's why the Ably AI Transport is deliberately framework-agnostic. Yes, we're building drop-in integrations with OpenAI's agent framework, LangChain, LangGraph, and others to make getting started trivially easy. But the architecture doesn't lock you in. Swap out your orchestration layer, change LLM providers, rebuild your agent logic – your transport layer and device communication remain consistent.

The road ahead

We're still early in this journey, actively validating these concepts with customers and refining our approach based on real-world usage patterns. Not every team needs Gen 2 capabilities today, but the industry is moving fast, and user expectations are being set by the leaders in this space.

If you're building AI experiences and wrestling with questions like "How do I handle interruptions?", "What happens when users switch devices mid-conversation?", or "How do I coordinate multiple parallel agent tasks?" – we'd love to talk. We're convinced there's a better way to build these experiences, and it starts with not having to rebuild the realtime infrastructure layer from scratch.

The plumbing shouldn't be your problem. Building great AI experiences should be.

Reach out for early access to Ably's AI Transport.
