You're reading from Socket.IO Cookbook
A single Node server can typically handle several thousand simultaneous connections. However, as the audience of an application grows, it is important to make sure that the application is scalable. On the server side, this means that we may want to distribute our application across multiple processes or Node instances.
The issue with distributing your application across nodes is that when we emit a message, it will only be received by one of the distributed servers. Sockets that are not connected to the same server as the one that receives the message will not be able to receive it without some additional handling. Luckily, there are some great ways to pass session data between servers with a caching or messaging system, such as Redis, Memcached, or RabbitMQ. By using adapters for one of these distributed mechanisms, we can easily scale our servers without compromising our Socket.IO connections.
Nginx is a free, open source, high-performance HTTP server and reverse proxy. Unlike traditional servers, Nginx doesn't rely on threads to handle requests. Instead, it uses a much more scalable asynchronous architecture, which uses small and predictable amounts of memory under load.
We can use Nginx to load-balance our node servers and, if it is configured correctly, we won't have to worry about requests being lost between the original handshake and the callback when events are received.
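As a sketch, an nginx.conf for balancing two Node instances might look like the following. The ports 3000 and 3001 are assumptions for this example; the ip_hash directive pins each client to the same backend so that the Socket.IO handshake and its follow-up requests reach the same node, and the Upgrade headers let WebSocket connections pass through the proxy:

```nginx
http {
    upstream socket_nodes {
        ip_hash;                 # pin each client IP to one backend
        server 127.0.0.1:3000;   # assumed Node instance
        server 127.0.0.1:3001;   # assumed Node instance
    }

    server {
        listen 80;

        location / {
            proxy_pass http://socket_nodes;
            proxy_http_version 1.1;                    # required for WebSocket
            proxy_set_header Upgrade $http_upgrade;
            proxy_set_header Connection "upgrade";
            proxy_set_header Host $host;
        }
    }
}
```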
Before we can do effective load balancing with the Nginx server, we will need to install it. Nginx can be installed with Homebrew with the following code:
brew install nginx
Once Nginx is installed, you can start it by running the following code:
sudo nginx
You can also stop it by running the following code:
sudo nginx -s stop
Node.js comes with a cluster package that can be used to run Node across multiple processes, as opposed to the single process it normally runs in. The child processes that cluster creates can all share the same port, which means that you can effectively load-balance without running your server on multiple ports.
Unfortunately, there is some boilerplate needed to determine the number of CPUs available and to fork the original Node process. For this, we can use a module called sticky-session. This is a load balancer that automatically spawns and manages multiple Node processes with the cluster module.
For this recipe, we will use the sticky-session npm module. This can be installed by running the following command:
npm install sticky-session
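A sketch of how sticky-session wires this together is shown below. It assumes the sticky-session and socket.io packages have been installed via npm, and the 'news' event is only an example:

```javascript
function listenSticky(port) {
  const http = require('http');
  const sticky = require('sticky-session'); // assumed installed via npm
  const socketio = require('socket.io');    // assumed installed via npm

  const server = http.createServer();
  const io = socketio(server);

  // sticky.listen() forks a worker per CPU in the master process (where
  // it returns false); in each worker it attaches the server to the
  // port and returns true.
  if (!sticky.listen(server, port)) {
    server.once('listening', function () {
      console.log('master listening on port ' + port);
    });
  } else {
    io.on('connection', function (socket) {
      socket.emit('news', { hello: 'world' }); // example event
    });
  }
}
```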
Now that we are able to run multiple nodes simultaneously with Socket.IO and not lose our socket connection between events, we will also need a way to ensure that, when an event is emitted on one node, it is also emitted across all of our other nodes.
For this, Socket.IO uses an interface called an adapter to route messages, and it allows us to use something other than the default memory-based adapter, so we can use our own instead. For a distributed system, we will need to use an adapter that lives outside of our server nodes.
Redis is a perfect solution for this problem. Redis is a key-value store whose data lives outside the web servers, which means that we can spin server instances up and down without losing the data stored in Redis. By plugging Redis into our Socket.IO adapter, we can propagate events across our nodes rather painlessly.
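As a sketch, plugging Redis in takes only a couple of lines with the socket.io-redis adapter. It assumes `npm install socket.io-redis` has been run and a Redis server is reachable on its default port 6379:

```javascript
function createRedisBackedServer(port) {
  const io = require('socket.io')(port);              // assumed installed
  const redisAdapter = require('socket.io-redis');    // assumed installed

  // Every node configured with the same Redis instance will see
  // events emitted by any other node.
  io.adapter(redisAdapter({ host: 'localhost', port: 6379 }));
  return io;
}
```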
Memcached is an in-memory key-value store designed to handle small chunks of arbitrary data. Typically, Memcached is used for caching server and API responses in memory so that we can render the cached data, instead of hitting the database and waiting for a response, whenever the data has already been persisted in the cache.
Similar to Redis, Memcached runs in a separate server instance outside the web server. This means that we can use it in the same way that we used Redis to propagate events across multiple server nodes.
There are a couple of projects on GitHub with the intention of providing the ability to use Memcached with Socket.IO, but at the time of writing there were none that had been updated after the 1.0 release of Socket.IO. As a result, the implementations all appeared to be either incomplete or buggy. The good news is that the lack of quality Memcached Socket.IO adapters will provide us with an opportunity to explore how we can...
RabbitMQ is a message-oriented middleware that implements Advanced Message Queuing Protocol (AMQP) for extremely robust messaging across a distributed system.
In this recipe, we will use RabbitMQ, which allows you to use multiple servers and broadcast messages across them. One big advantage that RabbitMQ holds over Memcached (for instance) for this sort of task is that it is actually designed for publish/subscribe-style messaging. This means that we won't have to poll a server to determine whether or not there are changes; RabbitMQ will emit changes as they happen, which makes RabbitMQ a perfect fit for the event-driven style of Socket.IO.
At the time of writing, there were no satisfactory open source RabbitMQ adapters for Socket.IO. This means that we will need to write our own abstraction.
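One way to sketch such an abstraction is with the amqplib package (assumed installed via npm); the 'socket.io' exchange name and the `{ event, data }` message shape are conventions of our own for this example:

```javascript
function bridgeToRabbit(io, amqpUrl) {
  const amqp = require('amqplib/callback_api'); // assumed installed via npm

  amqp.connect(amqpUrl, function (err, conn) {
    if (err) throw err;
    conn.createChannel(function (chErr, ch) {
      if (chErr) throw chErr;
      // A fanout exchange broadcasts every published message to all
      // bound queues -- that is, to every subscribed server node.
      ch.assertExchange('socket.io', 'fanout', { durable: false });
      ch.assertQueue('', { exclusive: true }, function (qErr, q) {
        if (qErr) throw qErr;
        ch.bindQueue(q.queue, 'socket.io', '');
        ch.consume(q.queue, function (msg) {
          const payload = JSON.parse(msg.content.toString());
          io.emit(payload.event, payload.data); // re-emit on local sockets
        }, { noAck: true });
      });
    });
  });
}
```

On the publishing side, a node would call `ch.publish('socket.io', '', Buffer.from(JSON.stringify({ event, data })))` instead of emitting directly, so that every node, including itself, receives the message through the exchange.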
For this recipe, we will need to install RabbitMQ and have it running locally on our machine. It can be installed from https...