WebSocket and WebRTC are key technologies for building modern, low-latency web apps. This blog post explores the differences between the two.
Web Real-Time Communication (WebRTC) is a framework that enables you to add real-time communication (RTC) capabilities to your web and mobile applications. WebRTC allows the transmission of arbitrary data (video, voice, and generic data) in a peer-to-peer fashion.
Almost every modern browser supports WebRTC. Additionally, there are WebRTC SDKs targeting different platforms, such as iOS or Android.
WebRTC consists of several interrelated APIs. Here are the key ones:
RTCPeerConnection. Allows you to connect to a remote peer, maintain and monitor the connection, and close it once it has fulfilled its purpose.
RTCDataChannel. Provides a bi-directional network communication channel that allows peers to transfer arbitrary data.
MediaStream. Designed to let you access streams of media from local input devices like cameras and microphones. It serves as a way to manage actions on a data stream, like recording, sending, resizing, and displaying the stream’s content.
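As a rough sketch of how these interfaces fit together, the snippet below opens a data channel and prepares a local offer. The `OfferingPeer`, `DataChannelLike`, and `SessionDescriptionLike` interfaces and the `prepareOffer` helper are hypothetical, pared-down stand-ins introduced for illustration; they mirror only the `createDataChannel`, `createOffer`, and `setLocalDescription` methods of the browser's `RTCPeerConnection`.

```typescript
// Pared-down stand-ins (assumptions) for the browser types used below.
interface SessionDescriptionLike {
  type: string;
  sdp?: string;
}

interface DataChannelLike {
  label: string;
}

interface OfferingPeer {
  createDataChannel(label: string): DataChannelLike;
  createOffer(): Promise<SessionDescriptionLike>;
  setLocalDescription(description: SessionDescriptionLike): Promise<void>;
}

// Hypothetical helper: opens a data channel, then creates and applies the
// local offer. The returned offer still has to reach the remote peer
// through some signaling channel.
async function prepareOffer(
  pc: OfferingPeer,
  label: string,
): Promise<{ channel: DataChannelLike; offer: SessionDescriptionLike }> {
  // Create the channel first so the offer describes it.
  const channel = pc.createDataChannel(label);
  const offer = await pc.createOffer();
  await pc.setLocalDescription(offer);
  return { channel, offer };
}
```

In a browser, passing a real `RTCPeerConnection` as `pc` should satisfy this interface, although a complete app would also handle ICE candidates and the remote peer's answer.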
WebSocket is a realtime technology that enables full-duplex, bi-directional communication between a web client and a web server over a persistent, single-socket connection.
A WebSocket connection starts as an HTTP request/response handshake. If this initial handshake is successful, the client and server have agreed to use the existing TCP connection that was established for the HTTP request as a WebSocket connection. This connection is kept alive for as long as needed (in theory, it can last forever), allowing the server and the client to independently send data at will.
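One small computation sits at the heart of the handshake described above. The client sends a random `Sec-WebSocket-Key` header, and the server proves it understands the protocol by returning a `Sec-WebSocket-Accept` header derived from it, as specified in RFC 6455. The sketch below assumes a Node.js runtime (for `node:crypto`); the `computeAcceptKey` name is hypothetical.

```typescript
import { createHash } from "node:crypto";

// Fixed GUID that RFC 6455 defines for the opening handshake.
const WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11";

// Hypothetical helper: derives the Sec-WebSocket-Accept value a server
// returns for the Sec-WebSocket-Key sent by the client.
function computeAcceptKey(clientKey: string): string {
  return createHash("sha1")
    .update(clientKey + WEBSOCKET_GUID)
    .digest("base64");
}

// Sample key from RFC 6455, section 1.3.
console.log(computeAcceptKey("dGhlIHNhbXBsZSBub25jZQ=="));
// prints "s3pPLMBiTxaQ9kYGzzhZRbK+xOo="
```

WebSocket servers and libraries perform this step for you; it is shown here only to make the upgrade from HTTP less opaque.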
The WebSocket technology includes two core building blocks:
The WebSocket protocol. Standardized in December 2011 through RFC 6455, the WebSocket protocol enables realtime communication between a WebSocket client and a WebSocket server over the web. It supports transmission of binary data and text strings.
The WebSocket API. Allows you to perform necessary actions, like managing the WebSocket connection, sending and receiving messages, and listening for events triggered by the WebSocket server. Almost all modern web browsers support the WebSocket API.
WebRTC apps provide strong security guarantees; media transmitted over WebRTC is encrypted and authenticated with the help of the Secure Real-Time Transport Protocol (SRTP), and data channels are secured with DTLS.
WebRTC is open-source and free to use. The project is backed by a strong and active community, and it's supported by organizations such as Apple, Google, and Microsoft.
WebRTC is platform and device-independent. A WebRTC application will work on any browser that supports WebRTC, irrespective of operating systems or the types of devices.
Even though WebRTC is a peer-to-peer technology, you still have to manage and pay for web servers. For two peers to talk to each other, you need to use a signaling server to set up, manage, and terminate the WebRTC communication session. In one-to-many WebRTC broadcast scenarios, you'll probably need a WebRTC media server to act as a multimedia middleware.
WebRTC can be extremely CPU-intensive, especially when dealing with video content and large groups of users. This makes it costly and hard to reliably use and scale WebRTC applications.
WebRTC is hard to get started with. There are plenty of concepts you need to explore and master: the various WebRTC interfaces, codecs and media processing, network address translation (NAT) and firewalls, UDP (the main underlying communications protocol used by WebRTC), and many more.
Before WebSocket, HTTP techniques like AJAX long polling and Comet were the standard for building realtime apps. Compared to HTTP, WebSocket eliminates the need for a new connection with every request and drastically reduces per-message overhead (no HTTP headers are sent with each message). This helps save bandwidth, reduces latency, and makes WebSockets less taxing on the server side compared to HTTP.
Flexibility is ingrained into the design of the WebSocket technology, which allows for the implementation of application-level protocols and extensions for additional functionality (such as pub/sub messaging).
As an event-driven technology, WebSocket allows data to be transferred without the client requesting it. This characteristic is desirable in scenarios where the client needs to react quickly to an event (especially ones it cannot predict, such as a fraud alert).
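The bandwidth point above can be made concrete. Under the RFC 6455 framing rules, a message sent as a single frame carries only a few bytes of header, whereas HTTP request headers commonly run to hundreds of bytes. The sketch below computes that header size; `frameHeaderSize` is a hypothetical helper, and it ignores fragmentation and extensions.

```typescript
// Bytes of framing that RFC 6455 puts in front of a message sent as a single
// frame. `masked` is true for client-to-server frames, which must be masked.
function frameHeaderSize(payloadBytes: number, masked: boolean): number {
  let size = 2; // FIN/opcode byte plus mask-bit/7-bit-length byte
  if (payloadBytes > 65535) {
    size += 8; // 64-bit extended payload length
  } else if (payloadBytes > 125) {
    size += 2; // 16-bit extended payload length
  }
  if (masked) {
    size += 4; // masking key
  }
  return size;
}

console.log(frameHeaderSize(50, false));    // 2
console.log(frameHeaderSize(50, true));     // 6
console.log(frameHeaderSize(1000, true));   // 8
console.log(frameHeaderSize(70000, false)); // 10
```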
WebSocket is stateful. This can be tricky to handle, especially at scale, because it requires the server layer to keep track of each individual WebSocket connection and maintain state information.
WebSockets don’t automatically recover when connections are terminated – this is something you need to implement yourself, and is part of the reason why there are many WebSocket client-side libraries in existence.
Certain environments (such as corporate networks with proxy servers) may block WebSocket connections.
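To illustrate the reconnection point above, here is a minimal sketch of the retry logic a client might implement itself, using exponential backoff. Every name here (`backoffDelay`, `connectWithRetry`, `ReconnectableSocket`) and the default delays are assumptions made for illustration, not part of the WebSocket API.

```typescript
// Hypothetical helper: delay before reconnect attempt `attempt` (0-based),
// doubling each time and capped at `maxMs`.
function backoffDelay(attempt: number, baseMs = 500, maxMs = 30_000): number {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

// Pared-down stand-in for the socket events this sketch relies on.
interface ReconnectableSocket {
  onopen: (() => void) | null;
  onclose: (() => void) | null;
}

// Opens a socket via `open` and reopens it, after a growing delay,
// every time the connection closes.
function connectWithRetry(
  open: () => ReconnectableSocket,
  schedule: (fn: () => void, ms: number) => void = (fn, ms) => {
    setTimeout(fn, ms);
  },
): void {
  let attempt = 0;
  const connect = (): void => {
    const socket = open();
    socket.onopen = () => {
      attempt = 0; // a successful connection resets the backoff
    };
    socket.onclose = () => {
      schedule(connect, backoffDelay(attempt));
      attempt += 1;
    };
  };
  connect();
}

console.log([0, 1, 2, 3].map((n) => backoffDelay(n))); // [ 500, 1000, 2000, 4000 ]
```

In a browser, the `open` factory would typically create a `WebSocket` and adapt it to this interface. Production clients often add random jitter to the delay and also restore state, such as subscriptions and missed messages, after reconnecting.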
WebSocket provides a client-server computer communication protocol, whereas WebRTC offers a peer-to-peer protocol and communication capabilities for browsers and mobile apps.
While WebSocket works only over TCP, WebRTC is primarily used over UDP (although it can work over TCP as well).
WebSocket is a better choice when data integrity is crucial, as you benefit from the underlying reliability of TCP. On the other hand, if speed is more important and losing some packets is acceptable, WebRTC over UDP is a better choice.
WebRTC is primarily designed for streaming audio and video content. It is possible to stream media with WebSockets too, but the WebSocket technology is better suited for transmitting text/string data using formats such as JSON.
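On the WebRTC side, the reliability trade-off described above can be tuned per data channel: `RTCPeerConnection.createDataChannel` accepts `ordered` and `maxRetransmits` options. The sketch below picks between the two extremes; `ChannelOptions` is a pared-down stand-in for the browser's `RTCDataChannelInit` dictionary, and `channelOptionsFor` is a hypothetical helper.

```typescript
// Subset (assumption) of the browser's RTCDataChannelInit dictionary.
interface ChannelOptions {
  ordered?: boolean;
  maxRetransmits?: number;
}

// Hypothetical helper: chooses channel options depending on whether
// complete, in-order delivery or low latency matters more.
function channelOptionsFor(priority: "integrity" | "latency"): ChannelOptions {
  if (priority === "integrity") {
    // Reliable, in-order delivery (also the default behaviour).
    return { ordered: true };
  }
  // Never retransmit lost messages and do not wait to preserve order.
  return { ordered: false, maxRetransmits: 0 };
}

console.log(channelOptionsFor("latency")); // { ordered: false, maxRetransmits: 0 }
```

In a browser, the result could be passed as the second argument to `createDataChannel`, for example for game state where a stale update is worthless by the time it would be resent.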
WebRTC is a good choice for the following use cases:
Audio and video communications, such as video calls, video chat, video conferencing, and browser-based VoIP.
File sharing apps.
Screen sharing apps.
Broadcasting live events (such as sports events).
IoT devices (e.g., drones or baby monitors streaming live audio and video data).
We can broadly group WebSocket use cases into two distinct categories:
Realtime updates, where the communication is unidirectional, and the server is streaming low-latency (and often frequent) updates to the client. Think of live score updates or alerts and notifications, to name just a few use cases.
Bidirectional communication, where both the client and the server send and receive messages. Examples include chat, virtual events, and virtual classrooms (the last two usually involve features like live polls, quizzes, and Q&As). WebSockets can also be used to underpin multi-user synchronized collaboration functionality, such as multiple people editing the same document simultaneously.
WebSockets and WebRTC are complementary technologies. As mentioned before, WebRTC allows for peer-to-peer communication, but it still needs servers, so that these peers can coordinate communication, through a process called signaling. Generally, signaling involves transferring information such as media metadata (e.g., codecs and media types), network data (for example, the host’s IP address and port), and session-control messages for opening and closing communication.
A key thing to bear in mind: WebRTC does not provide a standard signaling implementation, allowing developers to use different protocols for this purpose. The WebSocket protocol is often used as a signaling mechanism for WebRTC applications, allowing peers to exchange network and media metadata in realtime.
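Since the message format is left to the developer, here is one possible sketch of a signaling relay that a WebSocket server could run. The `SignalingMessage` shapes, the `TextSink` interface, and the `relaySignal` function are all assumptions made for illustration; WebRTC does not define them.

```typescript
// Hypothetical message shapes for offers, answers, ICE candidates, and hang-up.
type SignalingMessage =
  | { kind: "offer"; from: string; to: string; sdp: string }
  | { kind: "answer"; from: string; to: string; sdp: string }
  | { kind: "ice-candidate"; from: string; to: string; candidate: string }
  | { kind: "bye"; from: string; to: string };

// Anything that can deliver a text message, such as a server-side
// WebSocket connection.
interface TextSink {
  send(data: string): void;
}

// Forwards a raw message to the peer named in its `to` field.
// Returns false when that peer is not currently connected.
function relaySignal(peers: Map<string, TextSink>, raw: string): boolean {
  const message = JSON.parse(raw) as SignalingMessage;
  const target = peers.get(message.to);
  if (!target) {
    return false;
  }
  target.send(JSON.stringify(message));
  return true;
}

// Usage with an in-memory sink standing in for Bob's connection.
const bobInbox: string[] = [];
const connectedPeers = new Map<string, TextSink>([
  ["bob", { send: (data) => bobInbox.push(data) }],
]);
const sampleOffer: SignalingMessage = { kind: "offer", from: "alice", to: "bob", sdp: "v=0" };
console.log(relaySignal(connectedPeers, JSON.stringify(sampleOffer))); // true
console.log(bobInbox.length); // 1
```

A real relay would also validate incoming messages and authenticate peers before forwarding anything.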
Ably is a serverless WebSocket platform optimized for high-scale data distribution. We make it easy for developers to build live experiences such as chat, live dashboards, alerts and notifications, asset tracking, and collaborative apps, without having to worry about managing and scaling infrastructure. Additionally, you can use our WebSocket APIs to quickly implement dependable signaling mechanisms for your WebRTC apps.
Robust and diverse features, including pub/sub messaging, automatic reconnections with continuity, and presence.
Dependable guarantees: <65 ms round trip latency for 99th percentile, guaranteed ordering and delivery, global fault tolerance, and a 99.999% uptime SLA.
An elastically-scalable, globally-distributed edge network capable of streaming billions of messages to millions of concurrently-connected devices.
25+ client SDKs targeting every major programming language.