Here at Ably, we’ve helped many people solve some truly interesting, and at times wacky, data distribution problems. Whether it’s helping to organise transport for the healthcare sector or acting as one of the key communication layers on a VR platform, in more and more aspects of our lives we’re expecting online experiences to happen in realtime.
Unsurprisingly though, one key use-case reemerges every time: chat. Not only chat in chat-focused applications, but chat in games, chat in support desks, chat in waiting rooms. In any form of collaborative application, chat will usually worm its way in somewhere.
Building basic chat functionality is pretty easy with Ably, thankfully. Our Channels act as the perfect analogy to chat rooms, allowing for both 1-on-1 conversations as well as group conversations. Our multi-protocol support means that, regardless of the technical requirements of users, they’ll be able to communicate with everyone in a room. Finally, our Presence feature ensures that clients are accurately representing their current state, be it online, busy, or offline.
However, many of our clients quickly find that things start to get challenging once they want to implement more complex or bespoke features.
What’s so hard about chat?
Although the simple chat setup described above is functional, it’s usually not enough for production-level applications.
Storing and retrieving messages
We need to be able to search messages and edit old ones. So, how should we store the messages? What data structures should we use to make messages easily searchable? How do we allow users to edit the messages that we’ve stored, and ensure that we can provide guarantees that data from our realtime system is successfully and correctly stored in a database?
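To make these questions a little more concrete, here is a minimal sketch of one possible message record that supports editing and a naive search. It is only an illustration: the `StoredMessage` fields, the `MessageStore` class, and the in-memory `Map` standing in for a database are all assumptions of ours, not a design the series has settled on.

```typescript
// Hypothetical shape for a stored chat message; all field names are illustrative.
interface StoredMessage {
  id: string;
  roomId: string;
  authorId: string;
  text: string;
  sentAt: number; // Unix epoch milliseconds
  editedAt?: number; // set only once the message has been edited
  previousTexts: string[]; // earlier versions, oldest first
}

// In-memory stand-in for a database table keyed by message id.
class MessageStore {
  private messages = new Map<string, StoredMessage>();

  save(message: StoredMessage): void {
    this.messages.set(message.id, message);
  }

  // Only the original author may edit; the replaced text is kept as history.
  edit(id: string, editorId: string, newText: string, now: number): StoredMessage {
    const existing = this.messages.get(id);
    if (!existing) throw new Error(`Unknown message: ${id}`);
    if (existing.authorId !== editorId) throw new Error("Only the author can edit");
    const updated: StoredMessage = {
      ...existing,
      text: newText,
      editedAt: now,
      previousTexts: [...existing.previousTexts, existing.text],
    };
    this.messages.set(id, updated);
    return updated;
  }

  // Naive case-insensitive substring search over the current text in one room.
  search(roomId: string, term: string): StoredMessage[] {
    const needle = term.toLowerCase();
    return Array.from(this.messages.values())
      .filter((m) => m.roomId === roomId && m.text.toLowerCase().includes(needle))
      .sort((a, b) => a.sentAt - b.sentAt);
  }
}
```

Keeping the previous text alongside the current one is just one way to handle edits. A real system would also need an index for search and a way to confirm that every realtime message actually reached the database, neither of which this sketch attempts.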
Authentication, authorization, and roles
Although some chat use cases will have anonymous users, many will expect their users to be able to log in and set themselves a username and potentially an avatar and other credentials, all of which will have varying permissions attached.
They’ll have roles that provide authorization: unique permissions for certain users, and actions they can take with those permissions. For example, an Admin User might be able to create new chat rooms, add and ban users from chat rooms, remove other users’ messages, and more.
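As a rough illustration of roles mapping to permitted actions, a sketch might look like the following. The role names, the action names, and the `can` helper are hypothetical and chosen only to mirror the Admin example above; the `moderator` role in particular is our own addition.

```typescript
// Hypothetical actions and roles, loosely based on the Admin example in the text.
type Action = "createRoom" | "addUser" | "banUser" | "deleteAnyMessage" | "sendMessage";
type Role = "admin" | "moderator" | "member";

// Each role maps to the set of actions it is allowed to perform.
const permissions: Record<Role, Set<Action>> = {
  admin: new Set<Action>(["createRoom", "addUser", "banUser", "deleteAnyMessage", "sendMessage"]),
  moderator: new Set<Action>(["banUser", "deleteAnyMessage", "sendMessage"]),
  member: new Set<Action>(["sendMessage"]),
};

// Returns true when the given role is allowed to perform the action.
function can(role: Role, action: Action): boolean {
  return permissions[role].has(action);
}

console.log(can("admin", "createRoom")); // true
console.log(can("member", "banUser")); // false
```

A check like this would need to run on the server side, since anything enforced only in the client can be bypassed.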
Users often expect particular actions and formatting to happen when messages are sent containing certain information. For example, if I wrote ‘Hi @tom!’, I would expect Tom to get a notification of some kind indicating that they’ve been pinged, in addition to the message being delivered.
If the user is offline, they should get a push notification. I’d also expect the @tom in the message to be formatted, perhaps in a different colour to the rest of the message text, with a link to profile information and a hover state.
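One way to approach the ‘Hi @tom!’ case is to split a message into plain-text and mention segments, so the UI can style each mention and the backend knows whom to notify. The sketch below assumes usernames contain only letters, digits, and underscores; the `MentionSegment` type and both function names are hypothetical.

```typescript
// A message is split into ordered segments so mentions can be rendered differently.
interface MentionSegment {
  kind: "text" | "mention";
  value: string; // for mentions, the username without the leading '@'
}

// Assumption: usernames are made of letters, digits, and underscores only.
function splitMentions(text: string): MentionSegment[] {
  const segments: MentionSegment[] = [];
  const pattern = /@([A-Za-z0-9_]+)/g;
  let lastIndex = 0;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(text)) !== null) {
    if (match.index > lastIndex) {
      segments.push({ kind: "text", value: text.slice(lastIndex, match.index) });
    }
    segments.push({ kind: "mention", value: match[1] });
    lastIndex = match.index + match[0].length;
  }
  if (lastIndex < text.length) {
    segments.push({ kind: "text", value: text.slice(lastIndex) });
  }
  return segments;
}

// Unique, lower-cased usernames that should receive a notification.
function mentionedUsers(text: string): string[] {
  const names = splitMentions(text)
    .filter((s) => s.kind === "mention")
    .map((s) => s.value.toLowerCase());
  return Array.from(new Set(names));
}

console.log(mentionedUsers("Hi @tom!")); // [ 'tom' ]
```

This simple pattern would also treat the part after the `@` in an email address as a mention, and it does not check that the user exists, so a real implementation would need to validate names against the room’s members.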
Equally, users may want some useful shortcut commands. ‘!currently-active’ could magically get converted to be a message saying ‘There are currently 50,000 people active in this room’. On top of explicit commands, we may also want implicit actions to be performed on messages. If a message contains inappropriate words for the community or links to an untrustworthy site, we’d expect the message to be filtered out, or flagged, or some other action which we’ve defined.
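A message pipeline along these lines could be sketched as a single function that decides what should happen to each incoming message. Everything here is an assumption for illustration: the `Outcome` and `RoomContext` types, the `processMessage` name, and the choice to flag rather than drop messages. Link checking is left out to keep the sketch short.

```typescript
// Possible results of processing a message; names are illustrative.
type Outcome =
  | { kind: "deliver"; text: string } // pass the message through unchanged
  | { kind: "reply"; text: string } // replace the command with a generated message
  | { kind: "flag"; reason: string }; // hold the message for moderation

// Hypothetical per-room state needed to evaluate commands and filters.
interface RoomContext {
  activeUsers: number;
  blockedWords: string[]; // assumed to be stored in lower case
}

function processMessage(text: string, room: RoomContext): Outcome {
  const trimmed = text.trim();

  // Explicit command: report how many people are active in the room.
  if (trimmed === "!currently-active") {
    const count = room.activeUsers.toLocaleString("en-US");
    return { kind: "reply", text: `There are currently ${count} people active in this room` };
  }

  // Implicit action: flag messages containing a blocked word (whole-word match).
  const words = trimmed.toLowerCase().split(/[^a-z0-9]+/).filter((w) => w.length > 0);
  const hit = words.find((w) => room.blockedWords.indexOf(w) !== -1);
  if (hit !== undefined) {
    return { kind: "flag", reason: `Contains blocked word: ${hit}` };
  }

  return { kind: "deliver", text: trimmed };
}
```

Returning a description of what to do, rather than performing the action inside the function, keeps the rules easy to test and lets each community define its own response to a flagged message.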
Although Presence with Ably makes the ‘knowing if X is online’ bit of user information easy to do, other information can be a bit more tricky. How do we want to indicate the user’s profile image, bio, and status message?
In addition to these more global user states, we’d also need to indicate more chat room-specific information such as if a user is typing or not.
If we had a room with 50,000 users in it, and each status change were sent as a message to all the other users, we have the potential for hundreds of millions of messages to be sent between clients within seconds if many users change state rapidly. Most clients will have no chance of keeping up with this rate of messages. How can we get around this?
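To put a number on that fan-out: if 10,000 of the 50,000 users change status, and each change goes to the other 49,999 users, that is 10,000 × 49,999 = 499,990,000 deliveries. The sketch below shows that arithmetic, along with one common mitigation, which is to collect changes over a short window and send each client a single combined update. The function names and the `StatusChange` type are hypothetical, and batching is only one possible answer, not necessarily the one this series will settle on.

```typescript
// Deliveries needed when each of `changes` status updates goes to every other user.
function fanOut(users: number, changes: number): number {
  return changes * (users - 1);
}

interface StatusChange {
  userId: string;
  status: string;
}

// Collapse a window of changes so each user appears once, with their latest status.
function coalesce(changes: StatusChange[]): StatusChange[] {
  const latest = new Map<string, string>();
  for (const change of changes) {
    latest.set(change.userId, change.status);
  }
  const result: StatusChange[] = [];
  latest.forEach((status, userId) => result.push({ userId, status }));
  return result;
}

console.log(fanOut(50000, 10000)); // 499990000
```

With batching, the same 10,000 changes could reach each of the 50,000 clients as one combined message per window, so 50,000 deliveries instead of roughly 500 million, at the cost of a small delay and larger individual payloads.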
The above is only really scraping the surface of the considerations required for building a fully-featured and scalable chat solution. There are a lot of fiddly bits and more generic decisions to be made. What database do you use? How do we ensure consistency between all clients? How do we support further features such as plugins, API access, and more?
How we plan to solve it
Due to how crucial chat can be for so many applications and the complexities that each developer will face, we at Ably are now starting work on a series of blog posts that will help to break down the entire process.
We’ll be demonstrating best practices and providing code on GitHub, which should allow anyone to start integrating chat into their apps.
Tech stack of the Fully Featured Scalable Chat App
Currently, our plan is to use the following technologies to demonstrate this:
Frontend - React
React has become one of the most popular JS frameworks in the world, and for good reason. Its component-based architecture makes it easy to logically structure your applications, and promotes flexibility in development. In general, the concepts React makes use of are both simple and powerful, making it perfect for developing elegant solutions.
Overall this makes it perfect for us to use to demonstrate the creation of the client-side part of our application. This will mean all the UI of the chat application, receiving and sending of messages, and correlation of user interactions to API calls will be handled with React.
Backend - Azure Functions
We will need a way for clients to request tokens and interact with our database. Serverless functions are an ideal way to make scalable endpoints for these sorts of requests where we're expecting a single request to invoke a series of actions followed by a response. As we'll be hosting our chat app on Azure, Azure Functions are the obvious choice for us.
Database - CosmosDB
We will predominately have two types of information to store:
- Data which we need to only know the most recent version of. Think user information, chat room information, etc.
- Data where we need logs kept. The main example of this would be messages sent to channels.
We'll be hosting our solution on Azure, and so CosmosDB is an obvious integration for us to make use of for our database needs.
Authentication - Auth0
Whilst we will have our own login system, with user data stored within our database, developing more complex login features such as social media login can be quite complicated.
Auth0 make it simple to get social logins, 2FA, and more through a simple redirect to one of their endpoints during the login or signup process. Through a combination of our own login system, and using Auth0 for more complex functionality, we should have a production-tier authentication system.
Problems we’ll be solving
Current common problems we intend to create solutions for are:
- Setting up an integrated backend for a chat application
- Authenticating and authorizing users
- Creating a UI for chat
- Directory of existing chat rooms
- Entering/leaving chat rooms
- Sending messages to chat rooms
- Editing and deleting messages
- Searching previous messages
- Chat room information
- Pinned messages
- User profile pages
- Online/offline indicator
- Typing indicators
- Complex messages allowing for commands, filtering, and more
- Notifications and push notifications
- Media previews and sharing media
- User-made plugin support
Over a series of future blog posts, we’ll be digging deeper into these topics. We’ll look at what options are available to us for each feature, weigh up the pros and cons of each, and then look to create handy components that anyone can use for each feature.
However, this is just our current plan. We want our work to be as beneficial for anyone making a chat app as possible. If you have any features of chat applications you’d like us to investigate, requests for us to look at different technologies, or simply want to let us know that this sort of content is important to you, please let us know on Twitter, or via email at [email protected]!
Chat App Resources
We've written a lot about chat over the years. Here is a collection of some of our best posts:
- Database-driven realtime architectures: building a serverless and editable chat app - Part 1
- Database-driven realtime architectures: building a serverless and editable chat app - Part 2
- Build your own live chat web component with Ably and AWS
- Building a realtime chat app with Next.js and Vercel
- Guide to Pub/Sub in Golang