Live streaming has evolved from a novelty to the backbone of modern digital events. When major brands host virtual conferences, product launches, or community gatherings, they're no longer dealing with hundreds of viewers – they're managing tens of thousands of concurrent participants, all expecting to engage in realtime chat.
We recently worked with a team building live chat for a major creative software company's annual event. The technical challenges they faced offer valuable lessons for anyone building realtime infrastructure at scale.
Here's what we learned about engineering live chat that doesn't break under stadium-scale concurrency.
Why YouTube Live chat isn't enough
Many event organisers start with the obvious choice: YouTube's native live chat. It's built-in, it's tested, and it requires zero custom development. But when our customer tried this approach for their previous event, they discovered critical limitations.
The problem wasn't technical capacity – YouTube can certainly handle the load. The issue was friction. Viewers without YouTube accounts couldn't participate. Those with accounts but no YouTube channel (a requirement for posting) got redirected mid-event to complete YouTube's setup flows. Many never found their way back to the stream.
For a brand event where engagement drives value, losing viewers to authentication flows isn't acceptable. The chat needed to work for anonymous users, support profile customization, and most importantly, scale to 50,000+ concurrent chatters without overwhelming either the infrastructure or the viewers themselves.
The real challenges aren't just about scale
When engineers hear "50,000 concurrent users," they immediately think about infrastructure: load balancers, horizontal scaling, message buses, autoscaling policies. Those matter, but they're only part of the puzzle.
The harder challenges emerge at the intersection of technology and human behaviour.
Challenge 1: Moderation at scale
Anonymous chat rooms give audiences powerful engagement tools – and equally powerful disruption tools. Without proper moderation, chatters can spam, post harmful content, or derail conversations from the main event.
You need multiple defensive layers:
- AI-powered content filtering that catches obvious violations in milliseconds, with sensitivity depending on your configured moderation levels
- Human moderators who can review nuanced cases in real-time
- User blocking that persists even for anonymous users
- Emergency controls that let you shut down chat entirely if things spiral
The catch? Each layer adds latency. Send every message to a moderation API before publishing, and your "realtime" chat develops a noticeable lag. Process moderation asynchronously after publishing, and harmful content reaches viewers before you can stop it.
Challenge 2: Browser performance under message floods
Infrastructure can handle thousands of messages per second. Web browsers? Not so much.
When a major announcement drops during a keynote, chat explodes. Hundreds of messages arrive in seconds. If your frontend naively appends each message to the DOM, browsers grind to a halt. Users watching on mobile devices see their tabs crash.
The solution requires careful engineering:
- Message virtualisation that only renders visible messages
- Progressive loading that prioritises recent messages over history
- Throttling and debouncing that smooths rapid updates
- Payload optimisation that reduces bandwidth and the overall number of transactions
- Memory management that caps stored message counts
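The buffering and capping parts of that list can be sketched as a small, framework-agnostic class. This is an illustrative sketch, not the customer's code: names like MessageBuffer are assumptions, and flush() would be driven by a timer or requestAnimationFrame rather than called per message.

```typescript
type ChatMessage = { id: string; text: string };

class MessageBuffer {
  private pending: ChatMessage[] = [];
  private stored: ChatMessage[] = [];

  constructor(
    private maxStored: number,
    private render: (messages: readonly ChatMessage[]) => void,
  ) {}

  // Called for every incoming message; cheap, no DOM work here.
  push(message: ChatMessage): void {
    this.pending.push(message);
  }

  // Called on a timer (e.g. every 100ms): one batched render
  // instead of one render per message.
  flush(): void {
    if (this.pending.length === 0) return;
    this.stored.push(...this.pending);
    this.pending = [];
    // Cap memory by dropping the oldest messages beyond the limit.
    if (this.stored.length > this.maxStored) {
      this.stored = this.stored.slice(-this.maxStored);
    }
    this.render(this.stored);
  }

  get messages(): readonly ChatMessage[] {
    return this.stored;
  }
}
```

Under a flood, hundreds of push() calls collapse into a handful of flush() renders, which is what keeps mobile browsers responsive.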
Challenge 3: Late joiners and connection recovery
Not everyone joins at the event start. Viewers arrive mid-keynote, drop off Wi-Fi and reconnect, or refresh their browsers. Each scenario needs handling:
- Late joiners need recent message history without fetching the entire archive
- Reconnecting users need to resume from their last received message
- Message ordering must remain consistent even with network hiccups
- Presence updates (who's online) can't flood channels on every reconnection
These aren't edge cases – they're the norm for live events spanning hours with global audiences on varying network quality.
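The reconnection case above boils down to a merge step: combine locally held messages with a freshly fetched history page, deduplicate, and restore a single order. This is a hedged sketch under the assumption that each message carries a unique id and timestamp; the managed SDK described below handles recovery for you, so this is only to show the shape of the problem.

```typescript
type StoredMessage = { id: string; timestamp: number; text: string };

function mergeHistory(
  local: StoredMessage[],
  fetched: StoredMessage[],
): StoredMessage[] {
  const byId = new Map<string, StoredMessage>();
  // Insert fetched history first, then local messages, so the local
  // copy wins when the same message appears in both sets.
  for (const msg of [...fetched, ...local]) {
    byId.set(msg.id, msg);
  }
  // Re-sort so ordering stays consistent despite network hiccups.
  return [...byId.values()].sort((a, b) => a.timestamp - b.timestamp);
}
```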
The architecture: Managed infrastructure with surgical control
Our customer's solution combined managed services with precise control points. The goal: eliminate infrastructure complexity while retaining granular control over behavior.
Global WebSocket infrastructure
Rather than self-hosting WebSocket servers, they used Ably's globally distributed infrastructure. This provided:
Automatic geo-routing: Users connect to the nearest edge location – US users to US nodes, EU users to EU nodes – without manual configuration. Channels stay active in all regions where subscribers exist.
Built-in message recovery: If a connection drops, the SDK automatically requests missed messages using connection state rather than requiring application-level recovery logic.
Server-side batching: Multiple messages are grouped together before being sent, significantly reducing the number of messages counted and therefore lowering costs.
Message ordering guarantees: Guaranteed message ordering from any single realtime or non-realtime publisher to all subscribers, without developers implementing sequence numbers or ordering logic.
Message history: Archive messages for compliance and replay without managing databases or retention policies.
The infrastructure handled billions of messages monthly for other customers, so capacity wasn't a concern. The team focused entirely on the user experience.
Production-ready chat features
Rather than building chat features from scratch – message editing, deletion, reactions, presence – they used Ably Chat's pre-built components:
React hooks for Next.js: TypeScript-supported hooks that integrated directly into their existing stack without framework wrestling.
Presence metadata: Name and avatar changes propagated instantly to all connected clients through the presence system, no separate synchronisation logic needed.
Room-level reactions: Emoji reactions, designed for high-concurrency scenarios where thousands might react simultaneously.
UI components: Customisable components that provide a foundation rather than requiring everything from scratch.
AI moderation with surgical integration
For moderation, they integrated Hive's ML-powered filtering through Ably's native integration. The architecture is straightforward:
- Message published to Ably channel
- Ably forwards message to Hive via integration rule (configured in dashboard, no code required)
- Hive's AI analyses content in parallel
- If violations are detected, Hive's webhooks call the customer's endpoint to revoke the user's access, and call Ably directly to delete the offending message
- Customer revokes user's token, disconnecting them immediately
This creates a multi-tiered system:
Tier 1 – Automated filtering: Hive's models catch explicit content, hate speech, and spam patterns, blocking harmful messages within milliseconds.
Tier 2 – Human moderation: Community managers access Hive's dashboard to review edge cases, manually remove messages, and block users who repeatedly violate guidelines.
The key insight: moderation happens after publishing, using token revocation to enforce decisions. This keeps latency low for legitimate messages while still catching violations quickly.
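The enforcement step can be sketched as a small handler with the Hive and Ably calls injected, which keeps the control flow testable without credentials. The payload shape here is an assumption for illustration, not Hive's documented schema.

```typescript
// Illustrative verdict shape – not Hive's actual webhook schema.
type ModerationVerdict = {
  clientId: string;
  messageSerial: string;
  violation: boolean;
};

async function handleVerdict(
  verdict: ModerationVerdict,
  actions: {
    deleteMessage: (serial: string) => Promise<void>; // e.g. a call to Ably
    revokeToken: (clientId: string) => Promise<void>; // e.g. the customer's revocation endpoint
  },
): Promise<boolean> {
  if (!verdict.violation) return false;
  // Remove the offending message first so it disappears for all viewers,
  // then cut off the sender so they cannot reconnect with the same identity.
  await actions.deleteMessage(verdict.messageSerial);
  await actions.revokeToken(verdict.clientId);
  return true;
}
```

Because enforcement runs after publishing, legitimate messages never wait on this path.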
Emergency controls through orchestration channels
For worst-case scenarios – coordinated attacks, critical bugs, or community meltdowns – they needed an instant kill switch.
The solution: an orchestration channel, separate from the chat channel, that stores the latest state in an Ably LiveObject. Admins with the appropriate permissions can run a simple script that updates the object to signal that chat has been shut down; connected clients receive the update in realtime.
const modes = ['object_subscribe', 'object_publish'] as ChannelMode[];
const channelOptions = { modes };
const channel = realtimeClient.channels.get('chat-control', channelOptions);
await channel.attach();

// Get the root object from Ably LiveObjects associated with the channel
const root = await channel.objects.getRoot();

// Initialise from the current state, coercing to a boolean
let shutdown = !!root.get('shutdown');
console.log(`Shutdown initialised as ${shutdown}`);

// Listen for updates to the root object
root.subscribe((update) => {
  const [key, change] = Object.entries(update.update)[0];
  if (key === 'shutdown') {
    shutdown = change === 'updated' ? !!root.get('shutdown') : false;
    if (shutdown) {
      // Disable the chat input here
    }
  }
});

// Then, to shut down the chat from the admin client:
await root.set('shutdown', true);
Every connected client receives the command and disables chat input instantly. No deployment, no API calls, no coordination complexity.
Token authentication for anonymous users
Even "anonymous" users need identity – if only to block bad actors. The solution uses Ably's token authentication with dynamically generated client IDs:
- User visits chat (no login required)
- Frontend generates random client ID
- Token service issues Ably token with that client ID
- User connects and chats anonymously
- If user violates rules, revoke their specific token
- User disconnected, cannot reconnect with same ID
This enables user blocking without requiring accounts, emails, or any PII collection.
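A minimal sketch of the token service's core, with the Ably call injected so the logic runs without credentials. With the ably package, the injected function would roughly wrap rest.auth.createTokenRequest({ clientId }); the naming here is illustrative.

```typescript
import { randomUUID } from "node:crypto";

// Mint a random, unguessable client ID: enough identity to block a
// bad actor later, with no PII collected.
function mintAnonymousClientId(): string {
  return `anon-${randomUUID()}`;
}

async function issueAnonymousToken<T>(
  createTokenRequest: (opts: { clientId: string }) => Promise<T>,
): Promise<{ clientId: string; tokenRequest: T }> {
  const clientId = mintAnonymousClientId();
  // The token is bound to this client ID, so revoking it later
  // disconnects exactly one "anonymous" user.
  const tokenRequest = await createTokenRequest({ clientId });
  return { clientId, tokenRequest };
}
```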
Implementation: Five weeks from planning to production
The timeline was tight: just over a month from kickoff to launch. Here's how managed infrastructure made it possible.
Week 1: Integration and proof-of-concept
Day one, we set up their production account and configured the Hive integration. Configuration took minutes: add an integration rule in the Ably dashboard, provide Hive credentials, done.
They spent the rest of the week building a minimal proof-of-concept: send messages through Ably, verify Hive receives them, test token revocation. No production UI, just infrastructure validation.
Week 2: Load testing and edge cases
With the happy path proven, they stress-tested edge cases:
- Simulated 1,000+ simultaneous messages
- Tested message ordering under load
- Verified reconnection logic and message recovery
- Confirmed moderation workflow under high throughput
They discovered browser performance limits and implemented message virtualization. They found reconnection storms after network blips and added exponential backoff.
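The backoff they added can be sketched as a capped exponential delay with full jitter; the base and cap values below are illustrative, and the jitter source is injected for testability.

```typescript
function backoffDelayMs(
  attempt: number,
  baseMs = 500,
  capMs = 30_000,
  jitter: () => number = Math.random,
): number {
  // Exponential growth capped at capMs, with full jitter so thousands
  // of clients don't reconnect in lockstep after a network blip.
  const exp = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.floor(exp * jitter());
}
```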
Weeks 3-4: UI development
Infrastructure proven, they focused on the interface. Ably's React hooks simplified the integration:
import { useMessages, useRoom } from '@ably/chat/react';

function LiveChat() {
  const { room } = useRoom();
  const { send, historyBeforeSubscribe } = useMessages();
  // That's it. The complexity is hidden.
}
They built anonymous profiles with customisable names and avatars, added live reactions alongside the video player, and implemented responsive layouts for mobile viewers.
Week 5: Final testing and launch prep
The final week focused on operational readiness:
- Training moderators on Hive's dashboard
- Documenting emergency procedures and CLI commands
- Running full end-to-end rehearsals
- Preparing monitoring and alerting
Launch day arrived with confidence, not panic.
Why managed infrastructure matters for tight timelines
Could they have built this with self-hosted WebSocket servers? Technically, yes. Practically, not in five weeks.
A self-hosted solution would require:
Horizontal scaling infrastructure
- Load balancing across multiple WebSocket servers
- Message bus (Redis/Kafka) to pass events between servers
- Autoscaling policies triggered by connection or message metrics
- Health checks and failover logic
Message persistence
- Database for message history
- Retention policies and cleanup jobs
- APIs for retrieving message ranges
- Handling schema migrations as features evolve
Global distribution
- Deploying to multiple regions
- Geo-routing users to nearest servers
- Ensuring message consistency across regions
- Handling cross-region latency
Moderation integration
- Webhooks to forward messages to Hive
- Rate limiting and retry logic
- Token revocation APIs
- Keeping track of which users are blocked
Reliability and observability
- Monitoring connection states and message throughput
- Alerting on anomalies or failures
- Log aggregation across distributed servers
- Debugging tools for production issues
Each piece takes weeks to build and test. Together, they'd consume months of engineering time – time this team didn't have.
With managed infrastructure, they focused entirely on the problems that mattered: crafting a great user experience and ensuring community safety. Ably handled the undifferentiated heavy lifting.
Five lessons for engineering live chat at scale
1. Managed infrastructure accelerates time-to-market
When timelines are tight – and they usually are – managed services let you ship faster. Focus engineering resources on differentiating features, not rebuilding solved problems.
2. Layer your moderation strategy
Don't choose between AI and humans. Use AI for obvious cases, humans for nuance. This creates a system that scales efficiently while maintaining quality.
3. Plan for emergencies from day one
Build kill switches and emergency controls into your architecture before launch. You don't want to be writing deployment scripts while your chat spirals out of control during a keynote.
4. Test beyond expected capacity
If you expect 50,000 concurrent users, test for 75,000. Live events consistently exceed projections. Discover your limits in controlled testing, not during the main event.
5. Ruthlessly prioritise features
Perfect is the enemy of shipped. Build core functionality reliably rather than shipping everything with bugs. You can always add features after proving the foundation works.
The bigger picture: Community as a competitive advantage
Live chat isn't just a feature – it's infrastructure for community building. When viewers feel connected to each other and to your brand, they return. They evangelise. They become customers.
Twitch built a $1.3 billion business in 2024 on this principle. Viewers don't just watch content; they participate in communities through chat. Those communities drive subscription revenue, donations, and platform loyalty.
For brands hosting live events, the dynamic is similar. Chat transforms passive viewing into active participation. It creates memorable moments – the collective reaction when a feature is announced, the jokes that become memes, the questions that spark real conversations.
But only if it works. A broken chat experience – laggy messages, missing moderation, browser crashes – destroys the magic. Users lose trust, disengage, and don't return.
The technical foundations matter enormously. Get them right, and you create experiences that stick.
Ready to build?
Whether you're planning a live event, building a SaaS product with collaboration features, or creating community spaces, the principles remain the same: reliable infrastructure, layered moderation, and ruthless focus on user experience.
Explore Ably Chat documentation to see how it works, or sign up for a free account to start building today.