6 min read · Updated Dec 17, 2025

The evolution of realtime AI: How Ably is powering Gen-2 conversational experiences

When we launched Ably in 2016, we set out to solve a fundamental problem: delivering reliable, low-latency realtime experiences at scale. We built a globally distributed system that didn't force developers to choose between latency, integrity, and reliability – trade-offs that had defined the realtime infrastructure space for years.

Fast forward to today, and we're reaching 2 billion devices monthly, processing 2 trillion operations for customers who demand rock-solid infrastructure for their mission-critical features. But over the past year, as AI has transformed from a backend optimisation tool into a front-and-centre user experience, we've been asking ourselves a critical question: What's Ably's role in the AI ecosystem?

From "nice-to-have" to essential infrastructure

A year ago, if you'd asked us about Ably's AI story, we would have told you that yes, customers were using us. Companies like HubSpot and Intercom were leveraging Ably for token streaming and realtime AI features. But honestly? The value proposition felt incremental. Traditional LLM interactions followed a simple request-response pattern: send a query, stream back tokens, done. HTTP streaming handled this reasonably well, and while Ably offered benefits, there wasn't a compelling reason to use us specifically for AI.

That's changed dramatically.

The shift to Gen 2 AI experiences

What we're calling "Gen 2" AI experiences are fundamentally different from what came before. Instead of simply querying a model's training data, today's AI agents reason, search the web, call APIs, interact with tools via MCP (Model Context Protocol), and orchestrate complex multi-step workflows. Just look at how Perplexity searches, or how ChatGPT now breaks down complex requests into observable reasoning steps.

[Table: Gen 1 vs. Gen 2 AI UX experiences]

This shift introduces an entirely new set of challenges:

The problems Gen 2 creates

Async by default: When an AI agent needs 30 seconds or a minute to complete a task (not 3 seconds), user behaviour changes. Users switch tabs, check their phones, or start other work. A simple HTTP request suddenly needs to handle disconnections, reconnections, and state recovery.

Continuous feedback is mandatory: Users need to know what's happening. "Searching the web... Analysing documents... Calling your CRM..." This isn't a nice-to-have anymore. Without feedback, users assume the system has failed.

Multi-threading conversations: Imagine asking a support agent about your order status. While they're checking, you ask another question. Now you have two parallel operations that need coordination. The agent needs to know what else is in flight and potentially prioritise or sequence responses intelligently.

Cross-device continuity: Users want to set tasks running and then pick up later from where they left off. They may start a deep research query on their laptop, close it, and then want to check progress on their phone an hour later. The entire conversation state needs to transfer seamlessly.
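To make the state-recovery and cross-device requirements concrete, here is a minimal sketch of a resumable conversation. It assumes a hypothetical server-side store that assigns sequence numbers to agent updates, so a reconnecting device (or a different device entirely) asks for the delta since the last update it saw, rather than replaying the whole token stream. None of these names are Ably APIs – this is just the shape of the problem.

```typescript
// Hypothetical server-side conversation state. Each agent update gets
// a monotonically increasing sequence number so any device can resume.
type Update = { seq: number; text: string };

class ConversationStore {
  private updates: Update[] = [];
  private nextSeq = 1;

  // The agent appends progress as it works.
  append(text: string): Update {
    const u = { seq: this.nextSeq++, text };
    this.updates.push(u);
    return u;
  }

  // A reconnecting device asks: "what happened after seq N?"
  // It gets only the updates it missed, not a full replay.
  resumeFrom(lastSeenSeq: number): Update[] {
    return this.updates.filter((u) => u.seq > lastSeenSeq);
  }
}

// Laptop kicks off a deep-research task...
const convo = new ConversationStore();
convo.append("Searching the web...");
convo.append("Analysing documents...");

// ...phone opens an hour later, having seen nothing (seq 0):
const missed = convo.resumeFrom(0);
console.log(missed.map((u) => u.text));
// The phone is caught up without the laptop ever reconnecting.
```

The key design point is that recovery is driven by the device's last-seen position, not by the connection that originally issued the request – which is exactly what a plain HTTP stream cannot offer.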

Introducing Ably AI Transport

Our vision for addressing these challenges centres on what we're calling the Ably AI Transport Layer: a drop-in solution that handles the complexity of Gen 2 AI communications so developers can focus on building great agent experiences, not wrestling with networking problems.

The architecture is deliberately narrow in scope. We focus on everything between your AI agents and end-user devices, leaving orchestration, LLM selection, and business logic where they belong – in your control.

Layer 1: The foundation

At the base level, the AI Transport provides what you'd expect from Ably: bulletproof reliability, multi-device synchronisation, and automatic resume capabilities. But the real shift is architectural. Instead of your agent responding directly to requests, it returns a conversation ID. Devices subscribe to that conversation, and from that point forward, the agent pushes updates through Ably.

This simple change unlocks powerful capabilities:

  • Decoupling: Agents and devices can disconnect and reconnect independently without losing continuity
  • Bidirectional control: Need to stop an agent mid-task or ask a follow-up question? There's a direct communication channel that doesn't require complex routing infrastructure
  • State recovery: Reconnecting devices don't replay every token. They get current state and resume from there
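The conversation-ID pattern above can be sketched in a few lines. This is an illustrative in-memory stand-in, not the Ably SDK: the agent returns an ID immediately instead of holding a response open, both sides publish to a channel keyed by that ID, and the channel retains messages so late joiners can recover state.

```typescript
// A message on the conversation channel, from either side.
type Msg = { from: "agent" | "device"; data: string };
type Handler = (msg: Msg) => void;

class ConversationChannel {
  private subscribers: Handler[] = [];
  readonly log: Msg[] = []; // retained so reconnecting devices can catch up

  subscribe(h: Handler) {
    this.subscribers.push(h);
  }

  publish(from: "agent" | "device", data: string) {
    const msg: Msg = { from, data };
    this.log.push(msg);
    this.subscribers.forEach((h) => h(msg));
  }
}

// Agent side: start work and return a conversation ID at once.
const channels = new Map<string, ConversationChannel>();
function startTask(): string {
  const id = `conv-${channels.size + 1}`;
  channels.set(id, new ConversationChannel());
  return id;
}

const id = startTask();
const channel = channels.get(id)!;

// Device side: subscribe to agent updates on the conversation...
const seen: string[] = [];
channel.subscribe((m) => {
  if (m.from === "agent") seen.push(m.data);
});

channel.publish("agent", "Calling your CRM...");

// ...and use the same channel for control messages back to the agent,
// e.g. stopping it mid-task – no extra routing infrastructure needed.
channel.publish("device", "stop");
```

Because both parties address the conversation rather than each other, either side can drop and rejoin independently – the decoupling described above falls out of the addressing model.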

Layer 2: Richer orchestration

The next layer introduces shared state on channels using our CRDT-based collaborative objects. This enables sophisticated coordination:

  • Agent presence and status: Devices know if agents are active, thinking, or have crashed. Agents can broadcast their current focus ("Analysing Q4 data...") as state rather than events.
  • Multi-agent coordination: When multiple agents work simultaneously – say, one handling a technical query while another processes a billing question – they can see each other's state and coordinate without stepping on each other's work.
  • Context-aware prioritisation: Agents can see if a user is actively waiting versus having backgrounded their session, enabling smarter resource allocation.
  • Client-side tool calls: In co-pilot scenarios, agents can query the client directly about user context ("Is the user currently editing this field?") without round-tripping through backend systems.
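The difference between status-as-state and status-as-events can be shown with a simple last-write-wins map, sketched below. Each agent owns one key, so a late-joining device reads the current status directly instead of replaying a history of status events. This illustrative register stands in for (and is far simpler than) Ably's CRDT-based collaborative objects.

```typescript
// Per-agent status entry; updatedAt provides last-write-wins ordering.
type Status = { focus: string; updatedAt: number };

class StatusMap {
  private entries = new Map<string, Status>();

  // Last write wins per key: a stale update never clobbers a newer one.
  set(agentId: string, focus: string, updatedAt: number) {
    const current = this.entries.get(agentId);
    if (!current || updatedAt >= current.updatedAt) {
      this.entries.set(agentId, { focus, updatedAt });
    }
  }

  get(agentId: string): string | undefined {
    return this.entries.get(agentId)?.focus;
  }

  // What every device – and every other agent – sees right now.
  snapshot(): Record<string, string> {
    return Object.fromEntries(
      [...this.entries].map(([id, s]) => [id, s.focus])
    );
  }
}

const statuses = new StatusMap();
statuses.set("support-agent", "Checking order status...", 1);
statuses.set("billing-agent", "Analysing Q4 data...", 2);
statuses.set("support-agent", "Drafting reply...", 3);

// An out-of-order (stale) update arrives late and is simply ignored:
statuses.set("support-agent", "Checking order status...", 1);

console.log(statuses.snapshot());
```

Real CRDTs handle concurrent writers and merging far more carefully, but even this sketch shows why shared state suits presence: the answer to "what is this agent doing?" is a read, not a replay.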

Layer 3: Enterprise-grade observability

Because everything flows through Ably, you gain comprehensive visibility into the last mile of your AI experience. Stream observability into your existing systems, integrate with Kafka for audit trails, and leverage enterprise features like SSO and SOC 2 compliance that come standard with Ably's infrastructure.

A new way of building

What excites us most isn't replacing streaming protocols – it's providing a stateful conversation layer that removes infrastructure concerns from the developer's plate. Think of it as abstract storage for conversation state, combined with the realtime capabilities developers need for modern AI UX.

The developers building these experiences don't want to solve networking problems. They want to focus on prompts, orchestration, RAG pipelines, and agent logic. The transport layer shouldn't be where they spend their time – but it will become critical as user expectations evolve to match what ChatGPT, Perplexity, and Claude are demonstrating daily.

Framework-agnostic by design

One pattern we've noticed: as engineering teams mature in their AI journey, they tend to move away from monolithic frameworks and build custom orchestration logic. This makes sense – these systems become core to their business differentiation.

That's why the Ably AI Transport is deliberately framework-agnostic. Yes, we're building drop-in integrations with OpenAI's agent framework, LangChain, LangGraph, and others to make getting started trivially easy. But the architecture doesn't lock you in. Swap out your orchestration layer, change LLM providers, rebuild your agent logic – your transport layer and device communication remain consistent.

The road ahead

We're still early in this journey, actively validating these concepts with customers and refining our approach based on real-world usage patterns. Not every team needs Gen 2 capabilities today, but the industry is moving fast, and user expectations are being set by the leaders in this space.

If you're building AI experiences and wrestling with questions like "How do I handle interruptions?", "What happens when users switch devices mid-conversation?", or "How do I coordinate multiple parallel agent tasks?" – we'd love to talk. We're convinced there's a better way to build these experiences, and it starts with not having to rebuild the realtime infrastructure layer from scratch.

The plumbing shouldn't be your problem. Building great AI experiences should be.

Reach out for early access to Ably's AI Transport.
