Firebase is frequently used when you need to keep frontend clients and your backend in realtime sync, for example in chat apps and multiplayer collaboration features. But how (well) does it scale? That’s what we’ll investigate in this article, using a chat use case as the example.
A quick overview of Firebase
In a nutshell, Firebase is a Backend as a Service (BaaS) platform that helps you build, test, release, and monitor mobile and web applications. With Firebase, you immediately get many useful features out of the box, some of which are listed below:
Realtime database. At its core, Firebase is a cloud service that allows you to store data related to your application and service. Firebase offers two different non-relational database options: the Firebase Realtime Database, and Cloud Firestore.
Cloud and in-app messaging. Both Firebase Realtime Database and Cloud Firestore can push realtime updates to client devices following any changes to the database. There’s also Firebase Cloud Messaging, a service that allows you to send cross-platform push notifications and notification messages.
Authentication. Firebase’s authentication service is a natural extension to its database offering, allowing you to securely store user information.
Hosting. Firebase provides secure hosting on Google Cloud.
Integrations. Because Firebase is powered by Google, it integrates naturally with services such as Google Pub/Sub, Google Cloud Functions, and Google Analytics.
App analytics. Firebase comes with a useful dashboard for visualizing your app’s activity.
Firebase Realtime Database vs. Cloud Firestore: What are the differences?
Firebase offers two cloud-based, NoSQL database solutions that support realtime data syncing:
The Realtime Database is Firebase's original database. It's a low-latency solution for mobile apps that require synced states across clients in realtime.
Cloud Firestore is Firebase's newer database for mobile app development. It builds on the successes of the Firebase Realtime Database with a new, more intuitive data model. Cloud Firestore also features richer, faster queries, and scales further than the Realtime Database.
Both databases offer:
Client SDKs, with no servers to manage.
Realtime updates over WebSockets.
And here are some of their key differences:
The Realtime Database stores data as one large JSON tree, while Cloud Firestore stores data as collections of documents.
The Realtime Database supports presence; in contrast, there’s no native support for presence included with Cloud Firestore.
The Realtime Database offers deep queries with limited sorting and filtering; meanwhile, Cloud Firestore has indexed queries with compound sorting and filtering.
With the Realtime Database, you can only perform basic write and transaction operations. In comparison, Cloud Firestore enables advanced write and transaction operations.
While the Realtime Database is a regional solution that scales through sharding, Cloud Firestore auto-scales and may be used in a multi-region configuration.
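To make the data-model difference concrete, here is a minimal sketch that uses plain Python dictionaries rather than any Firebase SDK. The room ID, message IDs, and field names are hypothetical and chosen only for illustration.

```python
# Realtime Database style: everything lives in one JSON tree, addressed by path.
realtime_db_tree = {
    "rooms": {
        "room1": {
            "messages": {
                "m1": {"sender": "alice", "text": "hi"},
                "m2": {"sender": "bob", "text": "hello"},
            }
        }
    }
}


def read_path(tree, path):
    """Walk a slash-separated path through the JSON tree."""
    node = tree
    for key in path.strip("/").split("/"):
        node = node[key]
    return node


# Cloud Firestore style: documents grouped into collections. Each document is
# modeled here as an entry keyed by its alternating collection/document path.
firestore_docs = {
    ("rooms", "room1"): {"name": "General"},
    ("rooms", "room1", "messages", "m1"): {"sender": "alice", "text": "hi"},
    ("rooms", "room1", "messages", "m2"): {"sender": "bob", "text": "hello"},
}


def list_collection(docs, collection_path):
    """Return only the documents that sit directly inside a collection."""
    depth = len(collection_path) + 1
    return {
        path[-1]: data
        for path, data in docs.items()
        if len(path) == depth and path[:-1] == collection_path
    }


print(read_path(realtime_db_tree, "rooms/room1/messages/m1"))
print(sorted(list_collection(firestore_docs, ("rooms", "room1", "messages"))))
```

In this toy model, reading `rooms/room1` from the tree returns every nested message with it, while the `rooms/room1` document holds only its own fields and its messages are fetched separately from the subcollection.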
Now that we have a better idea of the Firebase database offerings, let’s try to understand them a bit deeper with an example. Say we are building a live chat experience for end-users. Let’s see what our Firebase chat architecture looks like and how it works.
The architecture of a chat app built with Firebase
At a high level, your chat architecture with Firebase would look something like this (note that the diagram is valid regardless of which Firebase database option you choose: the Realtime Database or Cloud Firestore):
Both the Realtime Database and Cloud Firestore allow chat users to listen to changes in the database. Users can subscribe to insertion, update, or deletion events, and Firebase will sync new messages to all the subscribers over WebSocket connections when a change is made.
When one person makes a change to the database, every subscriber to that data is notified of the change immediately. This architecture is very database-centric: any update synced in realtime between chat users must go via the database. There’s no way for users to communicate directly without first storing that data.
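The database-centric flow described above can be modeled with a toy in-memory store. This is not the Firebase API; `TinyRealtimeStore` and its methods are hypothetical names used only to show that every update is stored before it is fanned out.

```python
from collections import defaultdict


class TinyRealtimeStore:
    """Toy in-memory stand-in for a database that notifies listeners on write."""

    def __init__(self):
        self.messages = defaultdict(list)   # room_id -> stored messages
        self.listeners = defaultdict(list)  # room_id -> subscriber callbacks

    def subscribe(self, room_id, callback):
        self.listeners[room_id].append(callback)

    def insert(self, room_id, message):
        # Step 1: the message is persisted first...
        self.messages[room_id].append(message)
        # Step 2: ...and only then fanned out to every subscriber.
        for callback in self.listeners[room_id]:
            callback(message)


store = TinyRealtimeStore()
alice_inbox, bob_inbox = [], []
store.subscribe("room1", alice_inbox.append)
store.subscribe("room1", bob_inbox.append)

store.insert("room1", {"sender": "alice", "text": "hi"})
store.insert("room1", {"sender": "bob", "typing": True})  # transient, yet stored

print(len(bob_inbox), len(store.messages["room1"]))
```

Note that the typing indicator reaches subscribers only because it was written to storage first, which is the coupling the rest of this article examines.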
What are the challenges of using Firebase at scale?
The architecture presented in the previous section might seem straightforward. But what are the challenges and limitations of using Firebase when you’re attempting to build a chat experience you can trust to deliver at scale?
A tightly coupled system design
Tightly coupled systems rarely leave us with any flexibility to pick the constituent pieces of our choosing. The idea of coupling a realtime messaging (publish/subscribe pattern) service and a core database for storage is a somewhat peculiar one, because the two differ not only in semantics but also in how they scale.
Related reading: Realtime and databases — a discussion on coupling versus modularity
Developers increasingly recognize that specialist Software as a Service (SaaS) products are seeing wider adoption than generalist ones.
With individual specialized components, you can pick and choose the best-in-breed solution for each piece in your architecture. Meanwhile, opting for generalist tools (like Firebase) might give you a quick head start into developing your product, but it seldom leaves you with any flexibility.
For example, you can’t use Firebase to push updates to client devices over WebSockets from a non-Firebase database. It’s also very hard to migrate to another (NoSQL) database because of the way data is stored: as a JSON tree in the Realtime Database, and as collections of documents (which are very similar to JSON data) in Cloud Firestore.
Limited scalability and trade-offs
Both Firebase database options come with some scaling limitations. A single instance of the Firebase Realtime Database is limited to 200,000 concurrent WebSocket connections (that is, chat users). Firebase’s documentation suggests sharding your data across multiple database instances to support more simultaneous connections and go over the 200k mark.
If you’ve worked extensively with databases, then you know that sharding is no joke. It’s an extremely tedious process, with the onus of handling it falling almost entirely on the developer. A 20-minute video by popular YouTuber Hussein Nasser shows how frustrating it can be to shard your database and why it should be done only as a last resort.
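As a rough illustration of what sharding asks of the developer, the sketch below maps each chat room to one of several database instances with a stable hash. The instance URLs and the `shard_for_room` helper are hypothetical, and this is one common approach rather than the scheme Firebase prescribes.

```python
import hashlib

# Hypothetical database instance URLs; real ones depend on your project.
SHARD_URLS = [
    "https://chat-shard-0.example.com",
    "https://chat-shard-1.example.com",
    "https://chat-shard-2.example.com",
]


def shard_for_room(room_id, shard_urls=SHARD_URLS):
    """Pick a database instance for a room using a stable hash of its ID."""
    digest = hashlib.sha256(room_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shard_urls)
    return shard_urls[index]


# Every client and server must agree on this mapping, or users in the same
# room would connect to different instances and never see each other.
print(shard_for_room("room-42"))

# Adding a fourth instance changes the mapping for many existing rooms,
# which means their data would have to be migrated.
bigger = SHARD_URLS + ["https://chat-shard-3.example.com"]
rooms = [f"room-{n}" for n in range(1000)]
moved = sum(1 for room in rooms if shard_for_room(room) != shard_for_room(room, bigger))
print(f"{moved} of {len(rooms)} rooms would move to a different instance")
```

The second half of the sketch hints at why resharding is painful: with simple modulo placement, changing the number of instances relocates a large share of rooms.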
If you want to avoid the pain of sharding the Realtime Database, you can opt for Cloud Firestore, which auto-scales up to a more generous limit: roughly 1 million concurrent connections. However, if you need to scale beyond 1 million connections, Cloud Firestore can’t help you; you’re back to sharding the Realtime Database (or you might consider a Firebase alternative). In addition, unlike the Realtime Database, Cloud Firestore does not support presence natively, which is a key feature for use cases like chat apps.
Limited reliability and availability
To meet user expectations and business needs, any large-scale system needs to be highly reliable and available. However, this can be tricky to achieve with the Realtime Database in particular. That’s because the Realtime Database is limited to zonal availability within a single region.
The obvious downside of a single-region design is that it negatively impacts the overall latency, reliability, and availability of your system. What happens if the region where your Realtime Database setup is deployed goes through an outage and is temporarily unavailable? Your chat app would experience downtime and become unavailable to users.
High and unpredictable costs
Due to Firebase’s database-centric design, scaling to support a higher number of concurrent connections and realtime messaging between those connections also involves scaling up the database storage limits. Only the updates to your database are published to connected clients. This forces you to permanently store even the transient events that need to be streamed to your users, leading to an unnecessary increase in data storage costs. An example of transient messages in a chat app scenario would be indicators that others are typing – events that occur frequently but have zero need for permanent storage.
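A back-of-the-envelope calculation shows how quickly transient events can dominate what gets stored. All of the numbers below are hypothetical assumptions, not measurements or Firebase pricing figures, and the sketch looks at stored volume only.

```python
def stored_bytes_per_day(active_users, events_per_user_per_day, bytes_per_event):
    """Bytes written to storage per day for one kind of event."""
    return active_users * events_per_user_per_day * bytes_per_event


# Hypothetical workload assumptions.
ACTIVE_USERS = 100_000
CHAT_MESSAGES_PER_USER = 50    # messages worth keeping
TYPING_EVENTS_PER_USER = 500   # transient "is typing" indicators
BYTES_PER_EVENT = 200

chat_bytes = stored_bytes_per_day(ACTIVE_USERS, CHAT_MESSAGES_PER_USER, BYTES_PER_EVENT)
typing_bytes = stored_bytes_per_day(ACTIVE_USERS, TYPING_EVENTS_PER_USER, BYTES_PER_EVENT)
transient_share = typing_bytes / (chat_bytes + typing_bytes)

print(f"chat messages: {chat_bytes / 1e9:.1f} GB/day")
print(f"typing events: {typing_bytes / 1e9:.1f} GB/day")
print(f"share of stored data that is transient: {transient_share:.0%}")
```

Under these assumptions, the data that never needed to be stored outweighs the actual chat history many times over.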
There are plenty of horror stories of costs escalating out of control, especially if you are new to Firebase, and don’t yet have a good grasp on the pricing model and how to engineer your apps in a cost-effective manner. See How not to get a $30k bill from Firebase for details.
Scaling a Firebase app with Ably as your realtime messaging layer
We’ve seen the complexities of using Firebase at scale. Let’s now talk about a workaround to bypass these challenges and limitations. Firebase’s core offering is a database and other cloud backend features like auth, analytics, cloud function triggers, etc. The complexity only arises when trying to couple the database and realtime messaging functionality at scale. So, let’s try to see what happens when we decouple the two orthogonal concerns using a dedicated service like Ably for realtime communications in a chat app.
Ably in a nutshell
Ably is a realtime PaaS. Our APIs and SDKs help developers ship realtime experiences like chat and multiplayer collaboration for millions of users without having to worry about maintaining and scaling messy infrastructure.
We provide pub/sub messaging over WebSockets to power realtime communication in your apps. You also get a globally-distributed scalable infrastructure out-of-the-box, along with a suite of capabilities. These include features like presence – which shows the online/offline status of various participants, automatic reconnection and resumption of messages in case of intermittent network issues, message interactions, message ordering and guaranteed delivery, and easy ways to integrate with third-party APIs.
Architecture of the chat app with Firebase for database and Ably for realtime messaging
With Ably, you can completely offload realtime messaging from the Firebase database and use the database only for what it does best – storage. Here’s a high-level view of how this workaround functions:
When a user makes an update to the database, we trigger a Google Cloud Function. The function publishes this update to Ably on a specific channel. Any subscribers to that channel will then receive this update in realtime via Ably. This way, we’ve essentially decoupled realtime communication from storage in Firebase, while still using the Firebase platform for the rest of its BaaS features.
This does away with the complexity of dealing with different levels of scale, and makes it less likely that you’ll have to shard the database long before it’s actually needed.
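The publish step inside the Cloud Function could look roughly like the sketch below, which only builds the HTTP request and does not send it. It assumes Ably's REST publish endpoint (`POST /channels/<channel>/messages` on `rest.ably.io`, with basic authentication using the API key). The `chat:<room>` channel naming, the `on_message_written` handler, and the placeholder API key are hypothetical, and the trigger wiring itself is omitted.

```python
import base64
import json
import urllib.parse
import urllib.request

ABLY_REST_HOST = "https://rest.ably.io"  # assumed REST endpoint


def build_ably_publish_request(api_key, channel, event_name, payload):
    """Build (but do not send) an HTTP request that publishes one message."""
    encoded_channel = urllib.parse.quote(channel, safe="")
    url = f"{ABLY_REST_HOST}/channels/{encoded_channel}/messages"
    # The payload is serialized to a string here, so subscribers would need
    # to parse it on receipt.
    body = json.dumps({"name": event_name, "data": json.dumps(payload)})
    token = base64.b64encode(api_key.encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
    )


def on_message_written(room_id, message, api_key):
    """Hypothetical handler a database trigger would call with the new record."""
    request = build_ably_publish_request(api_key, f"chat:{room_id}", "message", message)
    # In a deployed function you would send it, for example with
    # urllib.request.urlopen(request), and handle any errors.
    return request


request = on_message_written(
    "room1", {"sender": "alice", "text": "hi"}, "appId.keyId:secret"
)
print(request.get_method(), request.full_url)
```

In practice you would more likely use an Ably SDK inside the function and load the API key from the function's configuration rather than hard-coding it.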
Making the database a subsidiary concept
In Firebase, the database is the central concept, with realtime messaging and easy connection to other Google Cloud services as add-on features. Only the updates to your database are published to all the subscribers, which forces you to store everything, even transient events, thus increasing your costs. Alternatively, you’d need to implement some maintenance logic to clean up unnecessary messages after they’ve been stored, which is not fun either.
In reality, you may want users to be able to interact with each other directly, without needing to make an update to the database each time. Let’s see how we can achieve this with Ably.
Ably allows realtime integrations with external services via webhooks (among other options). Using this feature, we can consider the database to be just like an end-user subscribing to the data via an Ably channel. All the transient messages and events can be streamed directly via Ably and only a subset of those messages which require permanent storage can be sent over to the database in realtime via webhooks.
With this approach, the database is taken out of the crossfire of mostly ephemeral messages. This reduces the load on the database and restricts contact from frontend clients, which may be crucial for some applications.
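On the receiving end, the webhook handler only has to decide which messages deserve permanent storage. The sketch below uses a simplified, hypothetical batch shape (a list of dictionaries with `channel`, `name`, and `data` keys) rather than Ably's actual webhook payload format, and the event names are assumptions.

```python
# Hypothetical rule: only these event names are worth storing permanently.
PERSISTED_EVENT_NAMES = {"message"}


def select_for_storage(webhook_batch):
    """Keep the events that should be written to the database; drop the rest."""
    return [event for event in webhook_batch if event["name"] in PERSISTED_EVENT_NAMES]


batch = [
    {"channel": "chat:room1", "name": "typing", "data": {"sender": "bob"}},
    {"channel": "chat:room1", "name": "message", "data": {"sender": "bob", "text": "hello"}},
    {"channel": "chat:room1", "name": "typing", "data": {"sender": "alice"}},
]

to_store = select_for_storage(batch)
print(f"received {len(batch)} events, storing {len(to_store)}")
```

All three events would still reach subscribed chat users in realtime; only the one chat message would be written to the database.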
We hope this article helps you understand the key challenges and limitations you would face trying to use the Firebase database to deliver chat apps (or any other type of realtime experience for end-users) at scale. As we have seen, due to its database-centric nature, Firebase comes with a set of disadvantages. However, by using a Firebase database in combination with Ably for realtime messaging, you can ship scalable chat functionality in a cost-effective manner, while avoiding the complexity that comes with managing and scaling a realtime messaging layer.