In real life, a handshake is the act of two people gently grasping each other's hands, followed by a brief up-and-down movement. If you have ever greeted someone this way, then you already understand the basic concept of the HTML5 WebSocket protocol.
WebSockets define a persistent two-way communication channel between web servers and web clients, meaning that both parties can exchange message data at the same time. WebSockets introduce true concurrency, are optimized for high performance, and result in much more responsive and rich web applications.
The following diagram shows a server handshake with multiple clients:
![](https://static.packt-cdn.com/products/9781782166962/graphics/6962_01_01.jpg)
For the record, the WebSocket protocol has been standardized by the Internet Engineering Task Force (IETF), and the WebSocket API for web browsers is currently being standardized by the World Wide Web Consortium (W3C). Yes, it's a work in progress. No, you do not need to worry about enormous changes, as the current specification has been published as a "proposed standard".
Before diving into the world of WebSockets, let's have a look at the existing techniques used for bidirectional communication between servers and clients.
Web engineers initially dealt with the issue using a technique called polling. Polling is a synchronous method (that is, no concurrency) in which the client performs periodic requests, regardless of whether any data exists for transmission. The client makes consecutive requests at a specified time interval. Each time, the server responds with the available data or with an appropriate warning message.
Though polling "just works", it is easy to see that this method is overkill for most situations and extremely resource-consuming for modern web apps.
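To make that cost concrete, here is a minimal sketch that simulates polling against an in-memory stand-in for a server, so no real network is involved. The names `createServer` and `simulatePolling`, as well as the sample messages, are assumptions made purely for illustration.

```javascript
// Hypothetical in-memory stand-in for a server: it queues messages and
// hands over everything queued so far whenever a request arrives.
function createServer() {
  const queue = [];
  return {
    publish(message) { queue.push(message); },
    handleRequest() { return queue.splice(0, queue.length); },
  };
}

// Simulate `cycles` polling intervals. `eventsByCycle` maps a cycle number
// to the messages that become available on the server during that interval.
function simulatePolling(cycles, eventsByCycle) {
  const server = createServer();
  const received = [];
  let requests = 0;
  let emptyResponses = 0;

  for (let cycle = 0; cycle < cycles; cycle++) {
    (eventsByCycle[cycle] || []).forEach((m) => server.publish(m));

    // The client asks on every interval, whether or not data exists.
    requests++;
    const data = server.handleRequest();
    if (data.length === 0) emptyResponses++;
    received.push(...data);
  }
  return { requests, emptyResponses, received };
}

const result = simulatePolling(10, { 3: ["hello"], 7: ["how are you?"] });
console.log(result);
```

In this run only two of the ten requests carry data; the other eight are answered with nothing new.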
Long polling is a similar technique where, as its name indicates, the client opens a connection and the server keeps the connection active until some data is fetched or a timeout occurs. The client can then start over and perform a subsequent request. Long polling is a performance improvement over polling, but the constant requests might still slow down the process.
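The difference can be sketched with a small simulation over discrete time steps. This is only a model of the idea, not a networked implementation: the function name `simulateLongPolling`, the tick-based clock, and the sample messages are all assumptions.

```javascript
// Simulate long polling over `totalTicks` time steps.
// `timeoutTicks` is how long the server may hold a request open.
// `eventsByTick` maps a tick number to the messages that appear at that tick.
function simulateLongPolling(totalTicks, timeoutTicks, eventsByTick) {
  if (timeoutTicks < 1) throw new RangeError("timeoutTicks must be at least 1");

  const received = [];
  let requests = 0;
  let tick = 0;

  while (tick < totalTicks) {
    requests++; // the client opens a connection
    const deadline = Math.min(tick + timeoutTicks, totalTicks);

    // The server holds the connection until data appears or the timeout hits.
    while (tick < deadline && !eventsByTick[tick]) tick++;

    if (tick < deadline) {
      received.push(...eventsByTick[tick]); // data arrived: respond at once
      tick++;
    }
    // Either way, the client starts over with a new request.
  }
  return { requests, received };
}

console.log(simulateLongPolling(10, 5, { 3: ["hello"], 7: ["how are you?"] }));
```

With two messages spread over ten time steps, this sketch needs three requests rather than one per time step, yet the client still has to reconnect after every response or timeout.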
Streaming seemed like the best option for real-time data transmission. When using streaming, the client performs a request and the server keeps the connection open indefinitely, sending new data when it is ready. Although this is a big improvement, streaming still includes HTTP headers, which increase the amount of data transferred and cause unnecessary delays.
The web has been built around the HTTP request-response model. HTTP is a stateless protocol, meaning that the communication between two parties consists of independent pairs of requests and responses. In plain words, the client asks the server for some information, the server responds with the proper HTML document and the page is refreshed (that's actually called a postback). Nothing happens in between, until a new action is performed (such as the click of a button or a selection from a drop-down menu). Any page load is followed by an annoying (in terms of user experience) flickering effect.
It was not until 2005 that the postback flickering was bypassed thanks to Asynchronous JavaScript and XML (AJAX). AJAX is based on JavaScript's XMLHttpRequest object and allows asynchronous execution of JavaScript code without interfering with the rest of the user interface. Instead of reloading the whole page, AJAX sends and receives only a portion of the web page.
Imagine you are using Facebook and want to post a comment on your Timeline. You type a status update in the proper text field, hit Enter and... voilà! Your comment is automatically published without a single page load. If Facebook did not use AJAX, the browser would need to refresh the whole page in order to display your new status.
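A partial update of this kind could be sketched with the XMLHttpRequest object roughly as follows. This is not how any particular site implements it: the function name `loadFragment`, the `/comments/latest` URL, and the `comments` element id are hypothetical, and the optional third parameter exists only so that the sketch can also run outside a browser.

```javascript
// Request a fragment of content asynchronously and pass it to `onLoad`.
// `Xhr` defaults to the browser's XMLHttpRequest; a stand-in can be supplied.
function loadFragment(url, onLoad, Xhr = globalThis.XMLHttpRequest) {
  const request = new Xhr();
  request.open("GET", url, true); // third argument: true = asynchronous

  request.onreadystatechange = function () {
    // readyState 4 means the response is complete; 200 means HTTP OK.
    if (request.readyState === 4 && request.status === 200) {
      onLoad(request.responseText);
    }
  };

  request.send();
  return request;
}

// Hypothetical browser usage: refresh only the comments area, not the page.
// loadFragment("/comments/latest", function (html) {
//   document.getElementById("comments").innerHTML = html;
// });
```

Only the returned fragment is swapped into the page, so the rest of the user interface stays untouched while the request is in flight.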
AJAX, accompanied by popular JavaScript libraries such as jQuery, has strongly improved the end user experience and is widely considered a must-have attribute for every website. It was only after AJAX that JavaScript became a respectable programming language, instead of being thought of as a necessary evil.
But it's still not enough. Long polling is a useful technique that makes it seem like your browser maintains a persistent connection, while the truth is that the client makes continuous calls! This might be extremely resource-intensive, especially on mobile devices, where speed and data size really matter.
All of the methods previously described provide real-time bidirectional communication, but have three obvious disadvantages in comparison with WebSockets:
They send HTTP headers, making the total file size larger
The communication type is half duplex, meaning that each party (client/server) must wait for the other one to finish
The web server consumes more resources
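As a rough illustration of the first point, the sketch below compares the size of a hypothetical block of HTTP request headers with the framing overhead that the WebSocket protocol adds to a small message. The frame sizes follow the base framing rules of RFC 6455; the sample headers, the `/updates` path, and the function name `webSocketFrameOverhead` are assumptions, and real header sizes vary widely.

```javascript
// Bytes of framing that RFC 6455 places in front of a payload.
// Frames sent by a client are masked, which costs 4 extra bytes.
function webSocketFrameOverhead(payloadBytes, masked) {
  let overhead = 2;                            // flags/opcode byte + length byte
  if (payloadBytes > 65535) overhead += 8;     // 64-bit extended length
  else if (payloadBytes > 125) overhead += 2;  // 16-bit extended length
  if (masked) overhead += 4;                   // masking key
  return overhead;
}

// Hypothetical headers that a polling client might resend with every request.
const sampleHttpHeaders = [
  "GET /updates HTTP/1.1",
  "Host: example.com",
  "User-Agent: ExampleBrowser/1.0",
  "Accept: application/json",
  "Cookie: session=abc123",
  "",
  "",
].join("\r\n");

const encoder = new TextEncoder();
const payloadBytes = encoder.encode("hello").length;
const httpHeaderBytes = encoder.encode(sampleHttpHeaders).length;

console.log("HTTP header bytes per request:", httpHeaderBytes);
console.log("WebSocket overhead, client to server:", webSocketFrameOverhead(payloadBytes, true));
console.log("WebSocket overhead, server to client:", webSocketFrameOverhead(payloadBytes, false));
```

The WebSocket opening handshake still exchanges HTTP headers once; after that, only the small frame header accompanies each message.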
The postback world seems like a walkie-talkie—you need to wait for the other guy to finish speaking (half-duplex). In the WebSocket world, the participants can speak concurrently (full-duplex)!
The web was initially built for displaying text documents, but think how it is used today. We display multimedia content, add location capabilities, accomplish complex tasks and, hence, transmit data other than text. AJAX and browser plugins such as Flash are all great, but a more native way of doing things is required. The way we use the web nowadays calls for a holistic new application development framework.
HTML5 is generating a huge, yet justifiable, buzz nowadays as it introduces vital solutions to the problems discussed previously. If you are already familiar with HTML5, feel free to skip this section and move on.
HTML5 is a robust framework for developing and designing web applications.
HTML5 is not just a new markup or some new styling selectors, nor is it a new programming language. HTML5 stands for a collection of technologies, programming languages and tools, each of which has a discrete role, and all of these together accomplish a specific task: building rich web apps for any kind of device.
The main HTML5 pillars are markup, CSS3, and JavaScript APIs, working together.
The following diagram shows HTML5 components:
![](https://static.packt-cdn.com/products/9781782166962/graphics/6962_01_02.jpg)
Here are the dominant members of the HTML5 family. As this book does not cover the whole set of HTML5, I suggest you visit html5rocks.com and get started with hands-on examples and demos.
| Category | Members |
| --- | --- |
| Markup | Structural elements, Form elements, Attributes |
| Graphics | Style sheets, Canvas, SVG, WebGL |
| Multimedia | Audio, Video |
| Storage | Cache, Local storage, Web SQL |
| Connectivity | WebMessaging, WebSocket, WebWorkers |
| Location | Geolocation |
Although Storage and Connectivity are supposed to be the most advanced topics, you do not need to worry if you are not an experienced web developer. Throughout this book, we will explain how to accomplish common tasks and we'll create some step-by-step examples, which you can later download and experiment with. Moreover, managing WebSockets via the HTML5 API is pretty simple to grasp, so take a deep breath and dive in with no fear.
The WebSocket protocol redefines full-duplex communication from the ground up. Actually, WebSockets, along with WebWorkers, take a really enormous step in bringing desktop-rich functionality to web browsers. Concurrency and multi-threading did not truly exist in the postback world. They were emulated in a rather restrictive manner.
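As a first taste of what full-duplex looks like in code, here is a minimal sketch built on the browser's WebSocket object. The function name `openChat`, the greeting text, and the injectable `SocketCtor` parameter are assumptions; the parameter is there only so the sketch can be exercised without a live server.

```javascript
// Open a connection, greet the server once connected, and forward every
// incoming message to `onMessage`. Sending and receiving are independent:
// neither side has to wait for the other to finish.
function openChat(url, onMessage, SocketCtor = globalThis.WebSocket) {
  const socket = new SocketCtor(url);

  socket.onopen = function () {
    socket.send("Hello, server!"); // the client may send at any time...
  };

  socket.onmessage = function (event) {
    onMessage(event.data);         // ...and the server may push at any time
  };

  return socket;
}

// Hypothetical browser usage:
// openChat("ws://example.com:8000/chat.php", function (text) {
//   console.log("Server says:", text);
// });
```

Unlike the polling approaches, a single connection stays open here, and the server can deliver a message without the client having asked for it.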
The HTTP protocol requires its own schemes (http and https). So does the WebSocket protocol. Here is a typical WebSocket URL example:
ws://example.com:8000/chat.php
The first thing to notice is the ws prefix. This is pretty normal, as we need a new URL scheme for the new protocol. wss is supported as well and is the WebSocket equivalent of https for secure connections (SSL/TLS). The rest of the URL is similar to plain old HTTP URLs and is illustrated in the following image.
The following image shows the WebSocket URL in tokens:
![](https://static.packt-cdn.com/products/9781782166962/graphics/6962_01_03.jpg)
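The same breakdown can be reproduced in code with the standard `URL` object, which understands the ws and wss schemes. The helper name `tokenizeWebSocketUrl` and the shape of the returned object are assumptions made for this sketch.

```javascript
// Split a WebSocket URL into its tokens, rejecting any other scheme.
function tokenizeWebSocketUrl(address) {
  const url = new URL(address);

  if (url.protocol !== "ws:" && url.protocol !== "wss:") {
    throw new Error("Not a WebSocket URL: " + address);
  }

  return {
    scheme: url.protocol.replace(":", ""), // "ws" or "wss"
    secure: url.protocol === "wss:",       // wss is the secure variant
    host: url.hostname,
    port: url.port,
    path: url.pathname,
  };
}

console.log(tokenizeWebSocketUrl("ws://example.com:8000/chat.php"));
```

For the example URL, the tokens are the ws scheme, the host example.com, the port 8000, and the path /chat.php.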
For the time being, the latest specification of the WebSocket protocol is RFC 6455, and it's a blessing that the latest versions of every modern web browser support it. More specifically, RFC 6455 is supported in the following browsers:
Internet Explorer 10+
Mozilla Firefox 11+
Google Chrome 16+
Safari 6+
Opera 12+
It is worth mentioning that the mobile versions of Safari (for iOS), Firefox (Android), Chrome (Android, iOS), and Opera Mobile all support WebSockets, bringing the WebSocket power to smartphones and tablets!
But, wait. What about the older browser versions that many people still use worldwide? Well, no need to worry, as throughout this book, we'll have a look at some fallback techniques that make our websites accessible to the largest audience possible.
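One common way to decide whether a fallback is needed is feature detection: checking whether the environment exposes a `WebSocket` object at all. The sketch below assumes the helper names `supportsWebSocket` and `chooseTransport` and the transport labels it returns; the `scope` parameter defaults to the global object and exists so the check can be tried against stand-in objects.

```javascript
// True when the given global scope exposes a WebSocket implementation.
function supportsWebSocket(scope = globalThis) {
  return "WebSocket" in scope;
}

// Pick a transport label: WebSockets when available, otherwise a fallback.
// The labels are hypothetical; a real app would start the matching client.
function chooseTransport(scope = globalThis) {
  return supportsWebSocket(scope) ? "websocket" : "long-polling";
}

console.log("Transport for this environment:", chooseTransport());
```

Checking for the feature itself, rather than for a browser name or version, keeps the decision correct as browsers are updated.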
Although WebSocket is a brand-new technology, quite a few promising companies utilize its various capabilities in order to deliver a richer experience to their users. The most well-known example is Kaazing (http://demo.kaazing.com/livefeed/), a startup that raised an investment of 17 million dollars for its real-time communication platform.
Other businesses include the following:
Two great resources containing a large variety of WebSocket demos are as follows:
WebSockets, as the name indicates, are related to the web. As you know, the web is much more than a bunch of techniques for some browsers; rather, it's a broad communication platform for a vast number of devices, including desktop computers, smartphones, and tablets.
Obviously, any HTML5 app that utilizes WebSockets will work on (almost) any HTML5-enabled mobile web browser. Imagine you want to implement the same functionality using the enhanced features of a native mobile app. Are WebSockets supported in the mainstream mobile operating systems? The short answer: yes. Currently, all key players in the mobile industry (Apple, Google, Microsoft) provide a WebSocket API you can use in your own native apps. iOS, Android, and Windows smartphones and tablets integrate WebSockets in a similar way to HTML5.
New neuroscience research confirms the old adage about the power of a handshake: people do form a better impression of those who proffer their hand in greeting (http://www.sciencedaily.com/releases/2012/10/121019141300.htm). Just as a human handshake can lead to better deals, a WebSocket handshake can lead to a better user experience. We think of user experience as a combination of performance (the user waits less) and simplicity (the developer builds in a straightforward and quick way).
So, it's up to you: do you want to build modern, truly real-time web applications? Do you want to provide your users with the maximum experience? Do you want to offer a terrific performance boost to your existing web apps? If the answer to any of these questions is yes, then it's time to realize that the WebSocket API is mature enough to offer its goodies right here right now.
Throughout this book, we are going to implement a real-world project: a simple, multi-user, WebSocket-based, chatting application. Live chat is a very common feature among all modern social networks. We will learn, step-by-step, how to configure the web server, implement the HTML5 client, and transfer messages between them.
Apart from plain text messages, we'll see how WebSockets handle various types of data, such as binary files, images, and videos. Yeah, we'll demonstrate real-time media streaming, too!
Moreover, we are going to enhance the security of our app, examine some known security risks and find out how to avoid common pitfalls. Furthermore, we'll take a glance at some fallback techniques targeting those poor guys who cannot (or do not want to) update their browsers yet.
Last but not least, we'll get mobile. You chat using a desktop browser, a phone, or a tablet. Wouldn't it be nice if you could use the same techniques and principles across multiple targets? Well, by reading this book, you'll find out how to easily convert your web app into a native mobile and tablet application as well.
In this first chapter we introduced the WebSocket protocol, mentioned the existing techniques for real-time communication and determined the specific needs that WebSockets fulfill. Moreover, we examined its relationship with HTML5 and illustrated how the users can benefit from such enhancements. It's now time to introduce the WebSocket client API in more detail.