Performance

For a system such as Ably, performance is about the speed of operations. The single most important metric is the latency of message delivery, but performance also covers the latency and throughput of operations more broadly.

Why performance matters

For the majority of use-cases, applications employ Ably because the timeliness of message delivery is very important. Performance in a realtime messaging platform directly affects user experience and, ultimately, business outcomes.

Applications such as live chat, collaborative editing or multiplayer games depend on low latency for a responsive user experience. When messages are delayed, chat applications feel sluggish rather than instantaneous, collaborative editing tools develop conflicts that frustrate users, and multiplayer games become unplayable due to synchronization issues. Even small increases in latency can have major impacts on how users perceive application quality.

From a business perspective, performance directly impacts outcomes. Research consistently shows that user engagement drops significantly with each additional 100ms of latency, conversion rates decline when applications feel sluggish, and operational costs increase when inefficient systems require more resources to achieve the same results.

Low and predictable latencies underpin the predictability of any system built using Ably. Designing for performance delivers the best experience for end users directly and, because it forces efficiency into the design, it also helps to deliver scalability indirectly. When each operation is individually performant, the system can handle more operations with the same resources.

Key performance design goals and metrics

Ably's platform design focuses on two primary performance objectives that complement each other to ensure messages are delivered not only quickly but also efficiently, even at high volumes.

The first objective is minimizing latency and latency variance for message delivery. Latency represents the time delay between when a message is sent and when it is received. Just as important as reducing the average latency is minimizing the variance or "jitter" in these delays, as inconsistent performance can be more disruptive to applications than slightly higher but consistent latency.

The second objective is maximizing single-channel throughput in terms of both message rate and bandwidth. This ensures that even when a channel handles high volumes of traffic, such as during live events or peak usage periods, the system can efficiently process and distribute messages without degradation.

Ably achieves a global mean latency of 37ms to provide the best possible realtime experience to your users.

Beyond average latency, Ably focuses on the performance of the slowest percentiles of messages (p95, p99, p99.9) to ensure consistent performance for all messages. These tail latencies often reveal performance issues that might be hidden by average measurements.
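The relationship between mean latency, jitter and tail percentiles can be illustrated with a small calculation. This is a minimal sketch rather than Ably's measurement method: the sample data is invented, jitter is approximated here as the population standard deviation, and `percentile` and `summarize_latency` are hypothetical helper names using the nearest-rank definition of a percentile.

```python
import math
import statistics


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def summarize_latency(samples_ms):
    """Summarize delivery latencies given in milliseconds."""
    return {
        "mean": statistics.fmean(samples_ms),
        # Assumption: jitter is expressed as the population standard deviation.
        "jitter": statistics.pstdev(samples_ms),
        "p95": percentile(samples_ms, 95),
        "p99": percentile(samples_ms, 99),
        "p99.9": percentile(samples_ms, 99.9),
    }


# Invented data: 98 fast deliveries and 2 very slow ones.
samples = [20.0] * 98 + [500.0] * 2
summary = summarize_latency(samples)
print(summary)
```

With this invented data the mean is 29.6ms and p95 is 20ms, yet p99 is 500ms: the two slow deliveries barely move the average but dominate the tail, which is why the tail percentiles are tracked separately.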

In addition to latency, Ably measures throughput across several dimensions including messages per second per channel (the maximum rate at which messages can be published and delivered on a single channel), bandwidth per channel (the maximum data throughput a single channel can sustain), and aggregate throughput (the total message and data throughput across all channels in an application or across the entire platform).
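As a rough illustration of these dimensions, the sketch below derives per-channel and aggregate rates from a list of observed messages. The record format, the `throughput` function and the channel names are assumptions made for the example, not part of any Ably API.

```python
from collections import defaultdict


def throughput(records, window_seconds):
    """Compute message rate and bandwidth per channel and in aggregate.

    records: iterable of (channel_name, payload_bytes) pairs observed
    during a measurement window of window_seconds.
    """
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    counts = defaultdict(int)
    sizes = defaultdict(int)
    for channel, payload_bytes in records:
        counts[channel] += 1
        sizes[channel] += payload_bytes
    per_channel = {
        channel: {
            "msgs_per_s": counts[channel] / window_seconds,
            "bytes_per_s": sizes[channel] / window_seconds,
        }
        for channel in counts
    }
    aggregate = {
        "msgs_per_s": sum(counts.values()) / window_seconds,
        "bytes_per_s": sum(sizes.values()) / window_seconds,
    }
    return per_channel, aggregate


# Invented traffic: many small chat messages and a few large score updates.
records = [("chat", 100)] * 50 + [("scores", 1000)] * 10
per_channel, aggregate = throughput(records, window_seconds=10)
print(per_channel)
print(aggregate)
```

In this example the "chat" channel has the higher message rate (5 messages per second) while "scores" has the higher bandwidth (1000 bytes per second), which shows why message rate and bandwidth are tracked as separate dimensions.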

Channels as the foundation of performance

Ably organizes its pub/sub service using the concept of channels. A channel is a named entity: publishers send messages by publishing to a channel, and clients express interest in receiving messages by subscribing to it. As long as publishers and subscribers agree on the channel names they use, Ably efficiently routes messages between them.

Channels are ephemeral — they don't require explicit creation ahead of time and exist virtually as a result of being referenced in publisher or subscriber operations. This approach eliminates administrative overhead and allows applications to scale seamlessly without pre-provisioning infrastructure for specific communication patterns.
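The idea that a channel exists only because it is referenced can be sketched with a toy in-memory broker. This is an illustration of the concept only, not Ably's implementation or SDK: the `Broker` class, its methods and the channel names are all hypothetical.

```python
class Broker:
    """Toy pub/sub broker in which channels are never created explicitly."""

    def __init__(self):
        self._subscribers = {}  # channel name -> list of callbacks

    def subscribe(self, channel, callback):
        # Referencing a channel name is enough to bring it into existence.
        self._subscribers.setdefault(channel, []).append(callback)

    def unsubscribe(self, channel, callback):
        callbacks = self._subscribers.get(channel, [])
        if callback in callbacks:
            callbacks.remove(callback)
        if not callbacks:
            # The last subscriber has gone, so the channel disappears.
            self._subscribers.pop(channel, None)

    def publish(self, channel, message):
        """Deliver to current subscribers and return how many received it."""
        delivered = 0
        for callback in list(self._subscribers.get(channel, [])):
            callback(message)
            delivered += 1
        return delivered

    def active_channels(self):
        return set(self._subscribers)


broker = Broker()
received = []
handler = received.append

broker.subscribe("orders:42", handler)             # no prior setup needed
delivered = broker.publish("orders:42", "created")
ignored = broker.publish("nobody-listening", "x")  # no subscribers, no state
active_before = broker.active_channels()
broker.unsubscribe("orders:42", handler)
active_after = broker.active_channels()
```

Here `orders:42` is active only while something is subscribed to it, and publishing to a channel with no subscribers leaves no state behind.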

Channels are also the key building blocks of Ably's internal implementation. Internally, the channel is where Ably performs its core per-message processing; persistence is organized on a per-channel basis, as is the processing of integration rules. Ably minimizes the cost of processing each individual message by making channels stateful internally: while a channel is active, enough state is held that processing a message takes as few operations as possible.

Once a message is decoded, the channel's state already records where to place the message in the persistence layer, which metadata to associate with it, and which other regions and connections are interested in the channel. Each message can therefore be processed efficiently without repeatedly computing or looking up the same information.
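The benefit of holding state for an active channel can be sketched as a simple cache: the expensive lookups happen once, when the channel becomes active, and every later message reuses the result. Everything in this sketch is hypothetical, including the `ChannelState` fields, the `ActiveChannels` class and the made-up values returned by `resolve_state`; it does not describe Ably's actual data structures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChannelState:
    """Illustrative per-channel state held while the channel is active."""
    storage_shard: int
    interested_regions: tuple
    connection_ids: tuple


class ActiveChannels:
    def __init__(self, resolve_state):
        self._resolve_state = resolve_state  # expensive lookup (hypothetical)
        self._states = {}
        self.resolutions = 0  # how many times the expensive lookup ran

    def process(self, channel, message):
        state = self._states.get(channel)
        if state is None:
            # First message on an inactive channel: resolve the state once.
            state = self._resolve_state(channel)
            self._states[channel] = state
            self.resolutions += 1
        # Every later message only reads the cached state.
        return {
            "store_in_shard": state.storage_shard,
            "forward_to_regions": state.interested_regions,
            "deliver_to_connections": state.connection_ids,
            "message": message,
        }

    def deactivate(self, channel):
        self._states.pop(channel, None)


def resolve_state(channel):
    # Stand-in for placement and interest lookups; the values are made up.
    return ChannelState(
        storage_shard=len(channel) % 4,
        interested_regions=("eu-west",),
        connection_ids=("conn-1", "conn-2"),
    )


channels = ActiveChannels(resolve_state)
plans = [channels.process("chat:lobby", f"msg-{i}") for i in range(1000)]
print(channels.resolutions)
```

A thousand messages on the same channel trigger a single resolution; the remaining 999 are handled with one dictionary lookup each.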

While channels maintain state to optimize processing, they remain lightweight enough that a production application can maintain millions of them simultaneously. This contrasts with heavier abstractions like Kafka partitions, which are typically limited to hundreds or thousands per cluster. The lightweight nature of channels enables fine-grained resource allocation, efficient memory utilization, and horizontal scalability as channel processing can scale across multiple nodes.

Several technical optimizations make channel performance possible. Carefully chosen data structures minimize memory footprint and CPU usage, and multiple channels share network connections to reduce overhead. Messages are batched when beneficial for throughput, binary protocols reduce parsing overhead and bandwidth usage, and common operations follow highly optimized code paths.
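Of these optimizations, batching is the easiest to show in isolation. The following is a minimal sketch of size-and-age based batching under stated assumptions: the `Batcher` class, its thresholds and the `send` callback are hypothetical, and a production batcher would also need a timer so that a lone message is not left waiting for the next call to `add`.

```python
import time


class Batcher:
    """Collects messages and hands them to `send` as lists (batches)."""

    def __init__(self, send, max_messages=10, max_delay_s=0.02,
                 clock=time.monotonic):
        self._send = send
        self._max_messages = max_messages
        self._max_delay_s = max_delay_s
        self._clock = clock
        self._pending = []
        self._first_at = None  # time the oldest pending message was added

    def add(self, message):
        if not self._pending:
            self._first_at = self._clock()
        self._pending.append(message)
        too_many = len(self._pending) >= self._max_messages
        too_old = self._clock() - self._first_at >= self._max_delay_s
        if too_many or too_old:
            self.flush()

    def flush(self):
        """Send whatever is pending as one batch; do nothing if empty."""
        if self._pending:
            batch, self._pending = self._pending, []
            self._send(batch)


sent = []
batcher = Batcher(sent.append, max_messages=3, max_delay_s=60.0)
for i in range(7):
    batcher.add(i)
batcher.flush()  # push out the final partial batch
print(sent)
```

Seven messages result in three sends instead of seven. The two thresholds express the trade-off: a larger batch improves throughput, while the delay bound caps the latency added to any single message.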

Global infrastructure for performance

While many systems can achieve very low latencies within single datacenters, they're typically constrained to one region. Ably has optimized for both local edge processing and global performance.

In addition to optimizing intra-datacenter processing, Ably is designed to optimize every other part of the message delivery journey through its global infrastructure. Ably uses AWS CloudFront as a CDN and access network, which provides Points of Presence around the world. Using this network infrastructure, Ably can ensure that messages transit the public internet over the minimum distance possible.

Ably always routes messages between datacenters on a peer-to-peer basis, so messages are never diverted to a processing location that is off the most direct path from publisher to subscriber. This direct routing provides several performance advantages: reduced internet transit distance, as most users connect to a nearby edge node; optimized backbone usage, as inter-datacenter traffic travels over high-performance network backbones; reduced hop count, as fewer network hops mean less opportunity for latency or packet loss; and path redundancy, as multiple routes ensure reliability without sacrificing performance.
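The cost of routing through a fixed processing location, compared with routing directly between peers, comes down to simple arithmetic. The region names and latency figures below are invented for illustration and are not measurements of Ably's network; `link`, `direct` and `via_hub` are hypothetical helpers.

```python
# Invented one-way latencies in milliseconds between pairs of regions.
LATENCY_MS = {
    frozenset({"eu-west", "ap-southeast"}): 90,
    frozenset({"eu-west", "us-east"}): 40,
    frozenset({"us-east", "ap-southeast"}): 110,
}


def link(a, b):
    """Latency of a single inter-region link (zero within a region)."""
    return 0 if a == b else LATENCY_MS[frozenset({a, b})]


def direct(src, dst):
    """Peer-to-peer: the message goes straight from src to dst."""
    return link(src, dst)


def via_hub(src, dst, hub):
    """Hub routing: every message is first sent to a fixed location."""
    return link(src, hub) + link(hub, dst)


p2p_ms = direct("eu-west", "ap-southeast")
hub_ms = via_hub("eu-west", "ap-southeast", hub="us-east")
print(p2p_ms, hub_ms)
```

With these invented figures the direct path takes 90ms, while the same message routed through a hub in a third region takes 150ms.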

Capacity management and quality of service

Ably manages the capacity of every element of its infrastructure, covering both message processing and networking, so that service performance is not degraded by capacity limitations and the associated queueing delays, irrespective of demand.

This proactive capacity management involves predictive scaling, where capacity is adjusted ahead of anticipated changes in demand; headroom maintenance, where systems operate with sufficient margin to absorb spikes; and resource balancing, where workloads are distributed to optimize utilization across the infrastructure.

Quality of service mechanisms include traffic prioritization, where critical messages receive preferential treatment; fair usage enforcement, which prevents any single client from monopolizing resources; graceful degradation, so that system behavior remains predictable under extreme load; and backpressure signaling, where clients receive early warnings when approaching limits.
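Fair usage enforcement and backpressure signaling are commonly built on a token bucket, and the sketch below shows that general technique. It is not a description of how Ably enforces limits: the `TokenBucket` class, its parameters and the "ok", "warn" and "rejected" results are assumptions chosen for the example, and the clock is injected so that the behavior can be demonstrated without waiting.

```python
import time


class TokenBucket:
    """Per-client rate limiter that also gives an early warning."""

    def __init__(self, rate_per_s, burst, warn_below=2, clock=time.monotonic):
        self._rate = rate_per_s        # tokens added per second
        self._capacity = burst         # maximum tokens that can accumulate
        self._warn_below = warn_below  # warn when fewer tokens than this remain
        self._clock = clock
        self._tokens = float(burst)
        self._updated = clock()

    def try_publish(self):
        """Return 'ok', 'warn' (allowed, but near the limit) or 'rejected'."""
        now = self._clock()
        elapsed = now - self._updated
        self._tokens = min(self._capacity, self._tokens + elapsed * self._rate)
        self._updated = now
        if self._tokens < 1:
            return "rejected"
        self._tokens -= 1
        return "warn" if self._tokens < self._warn_below else "ok"


# A fake clock keeps the example deterministic.
now = [0.0]
bucket = TokenBucket(rate_per_s=1, burst=5, warn_below=2, clock=lambda: now[0])

burst_results = [bucket.try_publish() for _ in range(6)]
now[0] = 2.0  # two seconds later, two tokens have been refilled
later_result = bucket.try_publish()
print(burst_results, later_result)
```

A burst of six publishes yields three "ok" results, two "warn" results and one "rejected". The warnings act as the early signal that lets a well-behaved client slow down before it is refused, and the rejection keeps a single client from taking more than its share.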