Infrastructure
A durable session layer is only as good as the infrastructure under it. AI Transport runs on Ably's global platform, which is built on four pillars: integrity, reliability, performance, and availability.
A durable session layer solves several hard problems at once. Token streams need to arrive complete, in order, and exactly once across reconnects. Session state needs to be present wherever clients and agents are, not stuck behind a regional failover. Latency must stay low no matter where a client connects from. The system must absorb traffic spikes without capacity planning. Few systems do all of this together; this is what Ably's platform is built for.
The platform is organised around four pillars: integrity, reliability, performance, and availability. Each pillar maps to a specific guarantee that the session layer depends on.
Integrity: tokens arrive complete, in order, exactly once
A durable session layer must be dependable and predictable. When a user reconnects mid-stream and resumes, the token stream must be exactly correct: no duplicates, no gaps, no reordering. When a tab reloads, the state must be exactly as it was when streaming started. When an agent crashes and recovers, retries must be invisible to the user.
The platform provides exactly-once delivery and guaranteed message ordering across distributed infrastructure. Token streams are persisted and accumulated, so a reconnecting client gets the assembled state, not a sequence of deltas to reassemble. Operations are idempotent, so an agent retry after a crash is architecturally invisible. These properties are engineered at the protocol level, not implemented as application logic.
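As an illustration of why these properties matter to a client, the sketch below models an accumulated token stream that tolerates duplicate and out-of-order deltas. It is a conceptual model only, not the platform's protocol or SDK API: the `TokenAccumulator` class, the `serial` numbering, and the `apply` method are all hypothetical names introduced for this example.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class TokenAccumulator:
    """Hypothetical model of an accumulated token stream (not an SDK class)."""

    text: str = ""
    next_serial: int = 0  # serial of the next delta we expect to apply
    pending: Dict[int, str] = field(default_factory=dict)  # early arrivals

    def apply(self, serial: int, delta: str) -> None:
        # A serial below next_serial was already applied, for example a
        # retry after a crash. Ignoring it makes apply() idempotent.
        if serial < self.next_serial:
            return
        # Buffer the delta; buffering the same serial twice is harmless.
        self.pending[serial] = delta
        # Drain every contiguous delta so the text never has gaps and is
        # always assembled in serial order.
        while self.next_serial in self.pending:
            self.text += self.pending.pop(self.next_serial)
            self.next_serial += 1
```

Feeding this accumulator the same delta twice, or feeding deltas out of order, yields the same assembled text as a clean in-order stream. With these guarantees provided at the protocol level, application code does not need to carry this kind of logic itself.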
See the four pillars of dependability for the detailed technical claims.
Reliability: always on, in every region
The session layer is a single point of failure for every AI conversation in your application. If it goes down, every user across every session is affected at once. The platform is engineered so this does not happen.
Session state is present in every region simultaneously, not held in one region and replicated elsewhere. Publish and subscribe operations stay low-latency wherever the participant connects from, because the state lives in their region. When a region fails, there is nothing to relocate: the state already exists in other regions. No developer action is required.
This is the architectural choice that distinguishes the platform. Designs that keep state in one region and fail over to another accept a window of unavailability while the failover completes. The platform avoids that window by keeping state present everywhere.
The platform has maintained 99.999% availability with zero global downtime over a multi-year operating history.
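For a sense of scale, an availability percentage can be converted into a yearly downtime budget. The sketch below is plain arithmetic on the percentage, not a statement about measured downtime; the function name is illustrative.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def downtime_budget_minutes(availability_percent: float) -> float:
    """Maximum downtime per year implied by an availability percentage."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)
```

At 99.999%, the budget works out to roughly 5.3 minutes per year, compared with about 8.8 hours per year at 99.9%.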
Performance and latency: low latency wherever the user is
Token streams should arrive as fast as the LLM generates them. The infrastructure should not add perceptible latency. Users connecting from different regions should experience the same quality.
The platform runs a globally distributed edge network. Clients connect to the nearest edge node, and because state is present in every region, every operation (token delivery, presence update, control signal) is served locally rather than crossing the network to a single region. The platform handles protocol negotiation automatically (WebSocket with fallback for restrictive networks) so connectivity works without developer configuration.
See latency for the detailed performance characteristics.
Availability: no ceiling, no capacity planning
Sessions are stateful: that is what makes them useful (persistence, recovery, presence) and what makes them hard to run at scale.
The platform scales horizontally. There is no ceiling, no capacity planning, and no sharding decision to make. Sessions are replicated in the global cluster. The infrastructure absorbs demand spikes and regional failures without developer intervention.
Presence (whether a device is online, an agent is healthy, a session is active) is a first-class infrastructure primitive. Presence state is managed using data types that resolve consistently across regions without introducing extra latency. No polling is required.
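To show what "resolves consistently across regions" can mean in practice, here is a minimal sketch of a last-writer-wins presence map. The text above does not name the data types the platform uses, so this is one illustrative possibility rather than a description of the actual implementation. The entry layout, the region identifiers, and the `merge` function are assumptions made for the example.

```python
from typing import Dict, Tuple

# member id -> (logical timestamp, region id, status); layout is illustrative
PresenceEntry = Tuple[int, str, str]
PresenceMap = Dict[str, PresenceEntry]


def merge(a: PresenceMap, b: PresenceMap) -> PresenceMap:
    """Merge two regional views of presence.

    For each member, the entry with the greater (timestamp, region id)
    pair wins, so every region reaches the same result regardless of the
    order in which it receives updates.
    """
    merged = dict(a)
    for member, entry in b.items():
        current = merged.get(member)
        if current is None or entry[:2] > current[:2]:
            merged[member] = entry
    return merged
```

Because the merge is commutative and idempotent, two regions can exchange views in any order, any number of times, and still converge without a coordination round trip.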
See platform scalability for the underlying architecture.
Security
The platform includes DDoS protection, hardened network layers, token-based authentication, and an active bug bounty programme with independent security researchers. A dedicated security engineering team owns the posture. Ably is SOC 2 Type II certified and HIPAA compliant. See security and compliance for the current certifications.
SDKs and integrations
Multi-language SDKs provide the same session and the same guarantees whether your agent is in Python, your client is in React, or your mobile app is in Swift. Session data integrates into third-party systems through webhooks, Kafka, or other integrations.
See the SDK directory and integrations for the full list.
Read next
- Sessions: the durable conversation layer the infrastructure carries.
- Transport: how the SDK connects your code to a session.
- Going to production: the production checklist for shipping AI Transport.