Scaling the Firebase Realtime Database beyond 200k users

Google’s Firebase is a complete Backend as a Service (BaaS) that offers a wide range of features. It’s a great choice for getting your app up and running quickly.

There is a lot to understand about its pricing structure, data modelling, and scalability. Maybe your application already has Firebase embedded in its core, or you are just evaluating it against alternatives. Either way, this article gives you an overview of Firebase, explores some technical considerations around databases and realtime messaging in general, and ends with a workaround for scaling Firebase beyond 200k concurrent users without sharding the database.

A quick overview of Firebase

Firebase is a one-stop shop to quickly build your app’s backend. This BaaS allows you to keep your app’s architecture simple by offloading critical pieces of functionality.

You immediately get many useful features out of the box, some of which are listed below:

  • Database: At its core, Firebase is a cloud storage solution that allows you to store and retrieve data related to your application and service.

  • Authentication: Firebase’s authentication service is a natural extension to its database offering, allowing you to securely store user information. You can quickly enable various auth mechanisms with multi-platform support.

  • Notifications: Another important feature and a key differentiator from other databases is Firebase’s ability to push realtime notifications to subscribed clients following any updates to the database. This feature enables the Firebase Realtime Database to be used for several realtime use-cases such as chat and newsfeeds.

  • Offline support: With Firebase, devices can continue publishing data to the database store even when offline. Firebase’s SDKs store the messages locally and wait for the device to connect back to the internet. As soon as that happens, they transparently publish that data to the cloud storage. Definitely a handy feature.

  • App analytics: Firebase comes with a nice dashboard for visualizing your app’s activity, so you can make critical business decisions based on this data.

  • Rest of the backend: The only part of an app’s “backend” we haven't covered yet is the server-side business logic. Because Firebase is part of Google's cloud offering, it integrates with Google Pub/Sub, Google Cloud Functions, and other pieces of that infrastructure. You could spin up your own server and host it with Google, or go serverless and use cloud functions to manage all the logic.

Firebase Realtime Database vs Firestore

Firebase offers two cloud-based, client-accessible database solutions that support realtime data syncing:

  • Realtime Database is Firebase's original database. It's an efficient, low-latency solution for mobile apps that require synced states across clients in realtime.

  • Cloud Firestore is Firebase's newest database for mobile app development. It builds on the successes of the Firebase Realtime Database with a new, more intuitive data model. Cloud Firestore also features richer, faster queries, and scales further than the Realtime Database.

Now that we have a better idea of what the service is all about, let’s look at it more closely with an example. Say we were building a scalable chat application for our users. Let’s see what this would look like in terms of architecture.

A video by the team behind Firebase explains why the Realtime Database is more suitable than Cloud Firestore for such scenarios, so we’ll focus specifically on the Realtime Database.

The architecture of a scalable chat app built with Firebase

Firebase’s Realtime Database allows users to listen to changes in the database. They can subscribe to insertion, update, or deletion events, and Firebase will publish each change to all the subscribers as soon as it is made.

[Figure: Pub/sub architecture with Firebase]

In a scenario where one person makes a change to the database, all the subscribers to that database will receive immediate notification about the changes.

This architecture is very database-centric. Any updates being streamed in realtime between the users must go via the database. There’s no way for them to communicate directly, without having to store that data.
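
To make the database-centric flow concrete, here is a minimal in-memory sketch. It is not the Firebase SDK: `TinyRealtimeStore` and its methods are hypothetical names invented for illustration. It only shows the shape of the pattern, where every message must be written to the store before any subscriber hears about it.

```python
from typing import Any, Callable, Dict, List, Tuple

# A listener receives (event, key, value), where event is
# "added", "updated", or "removed".
Listener = Callable[[str, str, Any], None]


class TinyRealtimeStore:
    """Toy in-memory model of a database that notifies listeners on writes.

    Hypothetical illustration only; this is not the Firebase SDK.
    """

    def __init__(self) -> None:
        self._data: Dict[str, Any] = {}
        self._listeners: List[Listener] = []

    def subscribe(self, listener: Listener) -> None:
        self._listeners.append(listener)

    def set(self, key: str, value: Any) -> None:
        event = "updated" if key in self._data else "added"
        self._data[key] = value          # the write is stored first...
        self._notify(event, key, value)  # ...and only then fanned out

    def remove(self, key: str) -> None:
        value = self._data.pop(key)
        self._notify("removed", key, value)

    def _notify(self, event: str, key: str, value: Any) -> None:
        for listener in list(self._listeners):
            listener(event, key, value)


store = TinyRealtimeStore()
alice_inbox: List[Tuple[str, str, Any]] = []
bob_inbox: List[Tuple[str, str, Any]] = []
store.subscribe(lambda event, key, value: alice_inbox.append((event, key, value)))
store.subscribe(lambda event, key, value: bob_inbox.append((event, key, value)))

store.set("msg1", "hello")   # insertion
store.set("msg1", "hello!")  # update
store.remove("msg1")         # deletion
print(alice_inbox)
```

Both inboxes end up with the same three events. Note that there is no code path in this model that reaches a subscriber without touching `_data`, which is exactly the coupling described above.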

A tightly coupled system design

The dev community has mixed feelings about almost any technology. We’ve all seen Twitter wars over the best JavaScript framework or the best library for a given task. In reality, it all boils down to different perspectives and trade-offs for the specific scenario under consideration.

Tightly coupled systems rarely leave us any flexibility to pick the constituent pieces of our choosing. Coupling a realtime messaging (publish/subscribe) service with a core storage database is a somewhat peculiar idea, because the two differ not only in semantics but also in how they need to scale.

Trade-offs – Generalist vs specialist in SaaS

Developers increasingly recognize that specialist Software as a Service (SaaS) is seeing wider adoption than generalist SaaS.

[Figure: Twitter screenshot on best-in-breed solutions]

With individual specialized components, you can pick and choose the best-in-breed solution for each piece in your architecture. Meanwhile, opting for generalist tools might give you a quick head start into developing your product, but it seldom leaves you with any flexibility. 

This is elaborated on by a fellow Ablyan in Realtime and databases – a discussion on coupling versus modularity.

Forced database sharding to scale realtime messaging between simultaneous connections

Realtime messaging and data storage work at different levels of scale. With Firebase, this means that to scale up concurrent connections, and the realtime messaging between them, we’d also need to scale up the database’s storage, which is mostly unnecessary. Firebase’s documentation suggests sharding your database to raise the simultaneous connection limit, specifically to go beyond the 200k mark.

If you’ve worked extensively with databases, you know that sharding is no joke. It’s an extremely tedious process, and the onus of handling it falls almost entirely on the developer. A 20-minute video by popular YouTuber Hussein Nasser shows how frustrating it can be to shard your database and why it should be done only as a last resort.
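
As a rough illustration of what the developer ends up owning, the sketch below routes chat rooms to database shards by hashing the room ID. The 200,000 figure is the per-database connection limit discussed above; the function names and the hash-modulo scheme are assumptions chosen for brevity, not a method prescribed by Firebase.

```python
import hashlib
from typing import Iterable, List

MAX_CONNECTIONS_PER_DB = 200_000  # per-database limit discussed in the article


def shards_needed(concurrent_users: int) -> int:
    """Smallest number of databases that can hold this many connections."""
    return -(-concurrent_users // MAX_CONNECTIONS_PER_DB)  # ceiling division


def shard_for(room_id: str, shard_count: int) -> int:
    """Map a room to a shard index with a stable hash (hypothetical scheme)."""
    digest = hashlib.sha256(room_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def moved_rooms(room_ids: Iterable[str], old_count: int, new_count: int) -> List[str]:
    """Rooms whose data would have to migrate if the shard count changed."""
    return [
        room
        for room in room_ids
        if shard_for(room, old_count) != shard_for(room, new_count)
    ]


rooms = [f"room-{i}" for i in range(1000)]
print(shards_needed(500_000))             # databases required for 500k users
print(len(moved_rooms(rooms, 2, 3)))      # rooms to migrate when growing 2 -> 3
```

Even in this toy version, every client has to agree on `shard_for`, and changing the shard count forces a migration for every room that `moved_rooms` returns. That bookkeeping is where most of the tedium comes from.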

Having said all that, let’s talk about workarounds to get around this complexity. Firebase’s core offering is a database, along with other cloud backend features like auth, analytics, and cloud function triggers. The complexity only arises when trying to couple the database and realtime messaging functionality at scale. So, let’s see what happens when we decouple the two orthogonal concerns by using a dedicated service like Ably for realtime communications in a chat app. Let’s take a look at the architecture of such an app with Ably in the mix.

Ably in a nutshell

Ably provides APIs to implement pub/sub messaging for the realtime features in your apps. You also get a globally distributed, scalable infrastructure out of the box, along with a suite of services. These include presence (which shows the online/offline status of participants), automatic reconnection and resumption of messages after intermittent network issues, message ordering, guaranteed delivery, and easy ways to integrate with third-party APIs.

Ably enables pub/sub messaging, primarily over WebSockets. The concept of channels allows you to categorize the data and decide which components have access to which channels. You can also specify capabilities for various participants on these channels like publish-only, subscribe-only, message history, etc.
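
The idea of per-channel capabilities can be sketched as a small permission check. This is a simplified model loosely inspired by the description above, not Ably's actual matching rules: the dictionary layout, the wildcard handling, and the `is_allowed` helper are all assumptions made for illustration.

```python
from typing import Dict, List

# Maps a channel pattern to the operations permitted on matching channels.
Capability = Dict[str, List[str]]


def pattern_matches(pattern: str, channel: str) -> bool:
    """Match "*", a "namespace:*" prefix, or an exact channel name."""
    if pattern == "*":
        return True
    if pattern.endswith(":*"):
        return channel.startswith(pattern[:-1])  # keep the trailing ":"
    return pattern == channel


def is_allowed(capability: Capability, channel: str, operation: str) -> bool:
    """True if any matching pattern grants the operation (or grants "*")."""
    return any(
        pattern_matches(pattern, channel) and (operation in ops or "*" in ops)
        for pattern, ops in capability.items()
    )


# A subscribe-only viewer for every chat room, and a publish-only device.
viewer: Capability = {"chat:*": ["subscribe", "history"]}
sensor: Capability = {"telemetry:device-1": ["publish"]}

print(is_allowed(viewer, "chat:room1", "subscribe"))
print(is_allowed(viewer, "chat:room1", "publish"))
```

In this model the viewer can read any `chat:` channel but cannot publish to it, while the device can publish to its own channel and nothing else.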

Architecture of the chat app with Firebase for database and Ably for realtime comms

If you are new to Ably, here’s a quick introduction to the service – it’s a realtime messaging infrastructure provider that offers multiprotocol and platform support to enable reliable realtime messaging between any kind of clients at a global scale.

With Ably, we can completely off-load realtime messaging from the Firebase database and use it only for what it does best – storage. Here’s a high-level view of how this workaround will function:

[Figure: Chat app architecture with Firebase and Ably]

When a user makes an update to the database, we trigger a Google Cloud Function. The function publishes this update to Ably on a specific channel. Any subscribers to that channel then receive the update in realtime via Ably. This way, we’ve essentially decoupled realtime communication from storage in Firebase, while still using Firebase for the rest of its BaaS features.
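
The flow can be sketched with plain Python standing in for each piece. Nothing here calls Firebase, Cloud Functions, or Ably: `on_message_written` plays the role of the triggered function, `FakeBroker` plays the role of the messaging service, and the `chat:<room>` channel naming is an assumption.

```python
from typing import Any, Callable, Dict, List

# (channel, event name, data) -> None
Publish = Callable[[str, str, Dict[str, Any]], None]
Subscriber = Callable[[str, Dict[str, Any]], None]


def channel_for_room(room_id: str) -> str:
    """Hypothetical naming scheme: one channel per chat room."""
    return f"chat:{room_id}"


def on_message_written(room_id: str, message: Dict[str, Any], publish: Publish) -> None:
    """Body of a hypothetical database-write trigger.

    The database has already stored `message`; all that is left to do
    is hand it to the messaging service for fan-out.
    """
    publish(channel_for_room(room_id), "message", message)


class FakeBroker:
    """In-memory stand-in for the realtime messaging service."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Subscriber]] = {}

    def subscribe(self, channel: str, callback: Subscriber) -> None:
        self._subscribers.setdefault(channel, []).append(callback)

    def publish(self, channel: str, name: str, data: Dict[str, Any]) -> None:
        for callback in self._subscribers.get(channel, []):
            callback(name, data)


broker = FakeBroker()
room1_feed: List[Dict[str, Any]] = []
room2_feed: List[Dict[str, Any]] = []
broker.subscribe("chat:room1", lambda name, data: room1_feed.append(data))
broker.subscribe("chat:room2", lambda name, data: room2_feed.append(data))

# Simulate the trigger firing after a write to room1.
on_message_written("room1", {"from": "alice", "text": "hi"}, broker.publish)
print(room1_feed, room2_feed)
```

Passing `publish` in as a parameter keeps the trigger body independent of any particular SDK, so the same function could be wired to a real client in a deployment. Clients hold their connections to the broker rather than to the database, which is why the database's connection limit stops being the bottleneck.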

This does away with the complexities of dealing with varying levels of scale and reduces the chance of having to shard the database long before it’s actually needed.

Making the database a subsidiary concept

In Firebase, the database is the central concept, with realtime messaging and easy connection to other Google Cloud services as add-on features. Only the updates to your database are published to all the subscribers. This forces you to permanently store even the transient events that need to be streamed to your users, leading to an unnecessary increase in storage costs. Alternatively, you’d need to implement some maintenance logic to clean up unnecessary messages after they’ve been stored, which is not fun either. An example of transient messages in a chat app scenario would be indicators that others are typing – events that occur frequently but have zero need for permanent storage.

In reality, you may want the users to be able to interact with each other directly, without needing to make an update to the database each time. Let’s see how we can achieve this with Ably.

Connecting DB via Ably WebHooks

Ably allows realtime integrations with external services via WebHooks. Using this feature, we can consider the database to be just like an end-user subscribing to the data via an Ably channel. This way, all the transient messages and events can be streamed directly via Ably and only a subset of those messages which require permanent storage can be sent over to the database in realtime via WebHooks.

This takes the database out of the crossfire of mostly ephemeral messages, which reduces its load. It also restricts direct contact from front-end clients, which may be crucial for some applications.
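
Here is a minimal sketch of that split, again using in-memory stand-ins rather than real Ably or Firebase calls. It assumes that transient events travel on `typing:` channels and durable messages on `chat:` channels, and that the forwarding rule selects channels with a regular expression. Both the naming and the `persist_filter` parameter are hypothetical.

```python
import re
from typing import Any, Callable, Dict, List, Tuple

Subscriber = Callable[[str, Dict[str, Any]], None]
Persist = Callable[[str, str, Dict[str, Any]], None]


class BrokerWithWebhook:
    """In-memory messaging service that forwards selected channels to storage."""

    def __init__(self, persist: Persist, persist_filter: str = r"^chat:") -> None:
        self._persist = persist                    # stands in for the webhook call
        self._filter = re.compile(persist_filter)  # which channels get stored
        self._subscribers: Dict[str, List[Subscriber]] = {}

    def subscribe(self, channel: str, callback: Subscriber) -> None:
        self._subscribers.setdefault(channel, []).append(callback)

    def publish(self, channel: str, name: str, data: Dict[str, Any]) -> None:
        # Every subscriber hears every message on its channel...
        for callback in self._subscribers.get(channel, []):
            callback(name, data)
        # ...but only channels matching the filter reach the database.
        if self._filter.search(channel):
            self._persist(channel, name, data)


database: List[Tuple[str, str, Dict[str, Any]]] = []
broker = BrokerWithWebhook(lambda ch, name, data: database.append((ch, name, data)))

bob_screen: List[str] = []
broker.subscribe("typing:room1", lambda name, data: bob_screen.append(name))
broker.subscribe("chat:room1", lambda name, data: bob_screen.append(name))

broker.publish("typing:room1", "typing", {"user": "alice"})            # transient
broker.publish("chat:room1", "message", {"user": "alice", "text": "hi"})  # durable
print(bob_screen, len(database))
```

Bob sees both the typing indicator and the message, while the database receives only the message. The database behaves like one more subscriber that happens to listen to a subset of channels.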

Final thoughts

Firebase is a useful suite of backend services and off-loading the realtime messaging part to Ably allows us to scale it beyond its initial limits easily. Besides, not having a database at the centre of the architecture allows for a lot of flexibility.

Hopefully this article has given you enough food for thought to build apps at scale easily. Sign up for a free account and try it out for yourself. Got questions or concerns? Talk to us, we are always happy to help.

Srushtika Neelakantam

Srushtika Neelakantam is a Developer Relations and Partner Engineer at Ably. She loves spending time fiddling around with tech and then simplifying that for others by speaking or writing about it. She is a co-author of “Learning Web-Based Virtual Reality” and supports the open web by volunteering with Mozilla's Tech Speaker and Reps programs.