Welcome to ZeroMQ! This chapter is an introduction to ZeroMQ and gives the reader a general idea of what a message queuing system is and most importantly what ZeroMQ is. In this chapter we will learn about the following topics:
An overview of what a message queue is
Why use ZeroMQ and what makes it different from other message queuing technologies
Basic client/server architecture
Introducing the first pattern, request-reply
How we can handle strings in C
Detecting the installed ZeroMQ version
Humans are social and will always socially interact with each other for as long as they exist. Programs are no different. A program has to communicate with another program since we are living in a connected world. We have UDP, TCP, HTTP, IPX, WebSocket, and other relevant protocols to connect applications.
However, such low-level approaches make things harder and we need something easier and faster. High-level abstractions sacrifice speed and flexibility whereas directly dealing with low-level details is not easy to master and use. That is where ZeroMQ shows up as the savior, giving us the usability features of high-level techniques with the speed of low-level approaches.
Before we start digging into ZeroMQ, let's first have a brief introduction on the general concept of message queues.
A message queue, or technically a FIFO (First In, First Out) queue, is a fundamental and well-studied data structure. There are different queue implementations, such as priority queues or double-ended queues, that have different features, but the general idea is that data is added to a queue and fetched when the data or the caller is ready.
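To make the FIFO idea concrete, here is a minimal sketch of an in-memory queue of integers built on a fixed-size ring buffer. The names fifo_queue, fifo_init, fifo_push, fifo_pop, and QUEUE_CAPACITY are made up for this illustration and have nothing to do with the ZeroMQ API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define QUEUE_CAPACITY 8   /* arbitrary size chosen for this sketch */

/* Hypothetical FIFO queue of ints stored in a ring buffer. */
typedef struct {
    int items[QUEUE_CAPACITY];
    size_t head;    /* index of the oldest item */
    size_t count;   /* number of items currently stored */
} fifo_queue;

void fifo_init(fifo_queue *q)
{
    assert(q != NULL);
    q->head = 0;
    q->count = 0;
}

/* Adds an item at the tail; returns false if the queue is full. */
bool fifo_push(fifo_queue *q, int item)
{
    assert(q != NULL);
    if (q->count == QUEUE_CAPACITY)
        return false;
    q->items[(q->head + q->count) % QUEUE_CAPACITY] = item;
    q->count++;
    return true;
}

/* Removes the oldest item into *out; returns false if the queue is empty. */
bool fifo_pop(fifo_queue *q, int *out)
{
    assert(q != NULL && out != NULL);
    if (q->count == 0)
        return false;
    *out = q->items[q->head];
    q->head = (q->head + 1) % QUEUE_CAPACITY;
    q->count--;
    return true;
}
```

Items come out in the order in which they were pushed. Everything lives in the memory of one process, so nothing survives if that process stops.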
Imagine we are using a basic in-memory queue. In case of an issue, such as power outage or a hardware failure, the entire queue could be lost. Hence, another program that expects to receive a message will not receive any messages.
However, a message queuing system is designed so that messages still reach their destination when such failures occur. Message queuing enables asynchronous communication between loosely coupled components and also provides solid queuing consistency. If insufficient resources prevent you from immediately processing the data that is sent, you can queue it up in the message queue server, which stores the data until the destination is ready to accept the messages.
Message queuing has an important role in large-scale distributed systems and enables asynchronous communication. Let's have a quick overview of the difference between synchronous and asynchronous systems.
In ordinary synchronous systems, tasks are processed one at a time. A task is not started until the task in progress has finished. This is the simplest way to get the job done.
We could also implement this system with threads. In this case threads process each task in parallel.
In the threading model, threads are managed by the operating system itself on a single processor or multiple processors/cores.
Asynchronous Input/Output (AIO) allows a program to continue its execution while processing input/output requests. AIO is mandatory in real-time applications. By using AIO, we could map several tasks to a single thread.
The traditional way of programming is to start a process and wait for it to complete. The downside of this approach is that it blocks the execution of the program while there is a task in progress. However, AIO has a different approach. In AIO, a task that does not depend on the pending operation can continue while that operation is in progress. We will cover AIO and how to use it with ZeroMQ in depth in Chapter 2, Introduction to Sockets.
You may wonder why you would use a message queue instead of handling all processes with a single-threaded queue approach or a multi-threaded queue approach. Let's consider a scenario where you have a web application similar to Google Images in which you let users type some URLs. Once they submit the form, your application fetches all the images from the given URLs. However:
If you use a single-threaded queue, your application would not be able to process all the given URLs if there are too many users
If you use a multi-threaded queue approach, your application would be vulnerable to a distributed denial of service attack (DDoS)
You would lose all the given URLs in case of a hardware failure
In this scenario, you know that you need to add the given URLs into a queue and process them. So, you would need a message queuing system.
Until now we have covered what a message queue is, which brings us to the purpose of this book, that is, ZeroMQ.
The community identifies ZeroMQ as "sockets on steroids". Formally, ZeroMQ is a messaging library that helps developers design distributed and concurrent applications.
The first thing we need to know about ZeroMQ is that it is not a traditional message queuing system, such as ActiveMQ, WebSphereMQ, or RabbitMQ. ZeroMQ is different. It gives us the tools to build our own message queuing system. It is a library.
It runs on different architectures from ARM to Itanium, and has support for more than 20 programming languages.
ZeroMQ is simple. We can do some asynchronous I/O operations and ZeroMQ can queue the message in an I/O thread. ZeroMQ I/O threads are asynchronous when handling network traffic, so they can do the rest of the job for us. If you have worked with sockets before, you will know that they are quite painful to work with. However, ZeroMQ makes it easy to work with sockets.
ZeroMQ is fast. The website Second Life managed to get 13.4 microseconds end-to-end latencies and up to 4,100,000 messages per second. ZeroMQ can use multicast transport protocol, which is an efficient method to transmit data to multiple destinations.
Unlike other traditional message queuing systems, ZeroMQ is brokerless. In traditional message queuing systems, there is a central message server (broker) in the middle of the network and every node is connected to this central node, and each node communicates with other nodes via the central broker. They do not directly communicate with each other.
However, ZeroMQ is brokerless. In a brokerless design, applications can directly communicate with each other without any broker in the middle. We will cover this topic in depth in Chapter 2, Introduction to Sockets.
We can start writing some code after our introduction to message queuing and ZeroMQ and of course we will start with the famous "hello world" program.
Let's consider a scenario where we have a server and a client. The server replies world whenever it receives a hello message from the clients. The server runs on port 4040 and clients send messages to port 4040.
The following is the server code, which sends the world message to clients:
#include <string.h>
#include <stdio.h>
#include <unistd.h>
#include "zmq.h"

int main (int argc, char const *argv[]) {
  void* context = zmq_ctx_new();
  void* respond = zmq_socket(context, ZMQ_REP);
  zmq_bind(respond, "tcp://*:4040");
  printf("Starting…\n");
  for(;;) {
    zmq_msg_t request;
    zmq_msg_init(&request);
    zmq_msg_recv(&request, respond, 0);
    printf("Received: hello\n");
    zmq_msg_close(&request);
    sleep(1); // sleep one second
    zmq_msg_t reply;
    zmq_msg_init_size(&reply, strlen("world"));
    memcpy(zmq_msg_data(&reply), "world", 5);
    zmq_msg_send(&reply, respond, 0);
    zmq_msg_close(&reply);
  }
  zmq_close(respond);
  zmq_ctx_destroy(context);
  return 0;
}
The following is the client code that sends the hello message to the server:
#include <string.h>
#include <stdio.h>
#include <unistd.h>
#include "zmq.h"

int main (int argc, char const *argv[]) {
  void* context = zmq_ctx_new();
  printf("Client Starting….\n");
  void* request = zmq_socket(context, ZMQ_REQ);
  zmq_connect(request, "tcp://localhost:4040");
  int count = 0;
  for(;;) {
    zmq_msg_t req;
    zmq_msg_init_size(&req, strlen("hello"));
    memcpy(zmq_msg_data(&req), "hello", 5);
    printf("Sending: hello - %d\n", count);
    zmq_msg_send(&req, request, 0);
    zmq_msg_close(&req);
    zmq_msg_t reply;
    zmq_msg_init(&reply);
    zmq_msg_recv(&reply, request, 0);
    printf("Received: hello - %d\n", count);
    zmq_msg_close(&reply);
    count++;
  }
  // We never get here though.
  zmq_close(request);
  zmq_ctx_destroy(context);
  return 0;
}
Note
Please note that the examples in this book are written for ZeroMQ 3.2. Bear in mind that some examples may not work properly with ZeroMQ 2.2 or older: methods that were deprecated in 2.x were removed in 3.x, and some other methods have been deprecated since those versions.
We have our first basic request-reply architecture, as shown in the following diagram:
Let's have a closer look at the code to understand how it works.
First we create a context and a socket. The zmq_ctx_new() method creates a new context. It is thread safe, so one context can be used from multiple threads.
zmq_socket(2) creates a new socket in the defined context. ZeroMQ sockets are not thread safe, so a socket should be used only by the thread in which it was created. Traditional sockets are synchronous whereas ZeroMQ sockets have a queue on the client side and another on the server side for managing the request-reply pattern asynchronously. ZeroMQ automatically arranges setting up the connection, reconnecting, disconnecting, and content delivery. We will cover the difference between traditional sockets and ZeroMQ sockets in depth in Chapter 3, Using Socket Topology.
The server binds the ZMQ_REP socket to port 4040, then waits for requests and sends back a reply whenever it receives a message.
This basic "hello world" example introduces us to our first pattern, the request-reply pattern.
We use the request-reply pattern to send messages from a client to one or multiple services and receive a reply for each message sent. This is most likely the easiest way to use ZeroMQ. The replies to the requests have to be strictly in order.
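One way to picture "strictly in order" is as a lockstep: a request socket alternates between sending a request and receiving its reply, and ZeroMQ reports an error (EFSM) if the calls arrive out of turn. The following is only a toy model of that alternation in plain C. The names req_model, req_model_send, and req_model_recv are hypothetical and are not part of ZeroMQ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the send/receive lockstep of a request socket. */
typedef enum { READY_TO_SEND, AWAITING_REPLY } req_state;

typedef struct {
    req_state state;
} req_model;

/* A send is only allowed when no reply is outstanding. */
bool req_model_send(req_model *m)
{
    assert(m != NULL);
    if (m->state != READY_TO_SEND)
        return false;   /* a real socket would refuse this call */
    m->state = AWAITING_REPLY;
    return true;
}

/* A receive is only allowed after a request has been sent. */
bool req_model_recv(req_model *m)
{
    assert(m != NULL);
    if (m->state != AWAITING_REPLY)
        return false;   /* nothing was requested, so no reply can arrive */
    m->state = READY_TO_SEND;
    return true;
}
```

In this model, two sends in a row fail and so do two receives in a row, which is the same shape as the send-then-receive loop in our client.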
The following is the reply part of the request-reply pattern:
void* context = zmq_ctx_new();
void* respond = zmq_socket(context, ZMQ_REP);
zmq_bind(respond, "tcp://*:4040");
A server uses the ZMQ_REP socket to receive messages from and send replies to the clients. If the connection between a client and the server is lost, then the replied message is thrown away without any notice. The incoming routing strategy of ZMQ_REP is fair-queue and the outgoing strategy is last-peer.
This book is all about queues. You may wonder what we mean when we refer to a fair-queue strategy. It is a scheduling algorithm that, by definition, allocates resources fairly.
To understand how it works, let's say that the Flows in the preceding figure send 16, 2, 6, and 8 packets/second respectively, but the output can handle only 12 packets per second. In this case an equal split would let each Flow transmit 3 packets/second, but Flow 2 transmits only 2 packets/second. The rule of fair-queue is that there should not be any idle output unless all inputs are idle. Thus, we could allow Flow 2 to transmit its 2 packets/second and share the remaining 10 packets/second between the rest of the Flows.
This is the incoming routing strategy used by ZMQ_REP. Round-robin scheduling is the simplest way of implementing the fair-queue strategy, and it is used by ZeroMQ as well.
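The sharing rule in the Flows example can be worked out in a few lines of C. The sketch below is an assumption about how such a fair allocation might be computed (often called max-min fairness). It is not how ZeroMQ is implemented internally, and the function name fair_share is made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Splits capacity fairly between n flows. A flow that asks for less than
 * an equal share gets what it asks for, and the capacity it leaves unused
 * is shared equally between the flows that are still waiting. */
void fair_share(const double *demand, double *alloc, size_t n, double capacity)
{
    assert(demand != NULL && alloc != NULL && capacity >= 0.0);

    size_t unsatisfied = n;
    for (size_t i = 0; i < n; i++) {
        assert(demand[i] >= 0.0);
        alloc[i] = -1.0;                 /* -1 marks "not decided yet" */
    }

    while (unsatisfied > 0) {
        double share = capacity / (double)unsatisfied;
        size_t capped = 0;

        /* Flows that need no more than the current share are satisfied. */
        for (size_t i = 0; i < n; i++) {
            if (alloc[i] < 0.0 && demand[i] <= share) {
                alloc[i] = demand[i];
                capacity -= demand[i];
                capped++;
            }
        }

        if (capped == 0) {
            /* Every remaining flow wants more than the share: split evenly. */
            for (size_t i = 0; i < n; i++) {
                if (alloc[i] < 0.0)
                    alloc[i] = share;
            }
            unsatisfied = 0;
        } else {
            unsatisfied -= capped;
        }
    }
}
```

With the demands 16, 2, 6, and 8 and a capacity of 12, Flow 2 receives its 2 packets/second and the other three Flows receive 10/3 packets/second each, so no output capacity is left idle.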
The following is the request part of the request-reply pattern:
void* context = zmq_ctx_new();
printf("Client Starting….\n");
void* request = zmq_socket(context, ZMQ_REQ);
zmq_connect(request, "tcp://localhost:4040");
A client uses ZMQ_REQ for sending messages to and receiving replies from a server. All messages are sent with the round-robin routing strategy. The incoming routing strategy is last-peer.
ZMQ_REQ does not throw away any messages. If there are no available services to send the message to, or if all the services are busy, all send operations (zmq_send(3)) are blocked until a service becomes available.
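To illustrate round-robin routing together with the "wait until a service becomes available" behavior, here is a small model in plain C. It is a sketch under our own assumptions rather than ZeroMQ code: peer_ring and pick_peer are hypothetical names, and where the function returns -1 a real request socket would block.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the services a client is connected to. */
typedef struct {
    const bool *available;  /* available[i] is true if service i can take a message */
    size_t count;           /* number of services */
    size_t next;            /* index to try first on the next pick */
} peer_ring;

/* Picks the next available service in round-robin order.
 * Returns its index, or -1 if no service is available right now. */
int pick_peer(peer_ring *ring)
{
    assert(ring != NULL);
    for (size_t tried = 0; tried < ring->count; tried++) {
        size_t i = (ring->next + tried) % ring->count;
        if (ring->available[i]) {
            ring->next = (i + 1) % ring->count;  /* start after this one next time */
            return (int)i;
        }
    }
    return -1;
}
```

Each call starts from the service after the one that was used last, so the load is spread evenly and busy services are simply skipped.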
ZMQ_REQ is compatible with the ZMQ_REP and ZMQ_ROUTER types. We will cover ZMQ_ROUTER in Chapter 4, Advanced Patterns.
This part combines the request and reply sections and shows how to send a request and how to respond to it.
printf("Sending: hello - %d\n", count);
zmq_msg_send(&req, request, 0);
zmq_msg_close(&req);
The client sends the message to the server using zmq_msg_send(3). It queues the message and sends it to the socket.
int zmq_msg_send(zmq_msg_t *msg, void *socket, int flags)
zmq_msg_send takes three parameters, namely, message, socket, and flags.
The message parameter is nullified during the request, so if you want to send the message to multiple sockets you need to copy it.
A successful zmq_msg_send() call does not indicate that the message has been sent over the network.
The flags parameter can be 0, ZMQ_DONTWAIT, or ZMQ_SNDMORE. ZMQ_DONTWAIT indicates that the send should not block: if the message cannot be queued on the socket, the call fails with errno set to EAGAIN. ZMQ_SNDMORE indicates that the message is a multipart message and that the rest of its parts are on the way.
After sending the message, the client waits to receive a response. This is done by using zmq_msg_recv(3).
zmq_msg_recv(&reply, request, 0);
printf("Received: hello - %d\n", count);
zmq_msg_close(&reply);
zmq_msg_recv(3) receives a part of the message from the socket, as specified in the socket parameter, and stores the reply in the message parameter.
int zmq_msg_recv (zmq_msg_t *msg, void *socket, int flags)
zmq_msg_recv takes three parameters, namely, message, socket, and flags.
The previously received message (if any) is nullified
The flags parameter could be ZMQ_DONTWAIT, which indicates that the operation should be performed in non-blocking mode: if no message is available, the call fails with errno set to EAGAIN
Every programming language has a different approach to handling strings. Erlang does not even have strings. In the C programming language, strings are null-terminated. Strings in C are basically character arrays where \0 marks the end of the string. String manipulation errors are common and are the cause of many security vulnerabilities.
According to Miller and others (1995), 65 percent of Unix failures are due to string manipulation errors such as null-terminated byte and buffer overflow; therefore, handling strings in C should be done carefully.
When you send a message with ZeroMQ, it is your responsibility to format it safely, so that other applications can read it. ZeroMQ only knows the size of the message. That's about it.
It is common to use different programming languages in one application. An application written in a language that does not add a null byte at the end of strings and an application written in C need to agree on how strings are sent; otherwise, you will get strange results.
You could send a message such as world from our example together with the null byte, as follows:
zmq_msg_init_data(&request, "world", 6, NULL, NULL);
However, you would send the same message in Erlang as follows:
erlzmq:send(Request, <<"world">>)
Let's say our C client connects to a ZeroMQ service written in Erlang and we send the message world to this service. In this case Erlang will see it as world. If we send the message with the null byte, Erlang will see it as [119,111,114,108,100,0]. Instead of a string, we would get a list that contains some numbers! Well, those numbers are the ASCII-encoded characters. However, it is not interpreted as a string anymore.
You cannot rely on the fact that a message coming from a ZeroMQ service is safely terminated when you work in C.
Strings in ZeroMQ are fixed in length and are sent without the null byte. So, ZeroMQ strings are transmitted as some bytes (the string itself in this example) along with the length.
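On the receiving side in C, one cautious approach is to copy the received bytes into a buffer of our own and add the null byte ourselves. The following is a minimal sketch of such a helper. The name payload_to_cstring is hypothetical; in a real program the payload pointer and size would typically come from zmq_msg_data() and zmq_msg_size().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copies a length-delimited payload into dst and null-terminates it.
 * If the payload does not fit, it is truncated to dst_size - 1 bytes.
 * Returns the number of bytes copied, not counting the terminator. */
size_t payload_to_cstring(const void *payload, size_t size,
                          char *dst, size_t dst_size)
{
    assert(dst != NULL && dst_size > 0);
    assert(payload != NULL || size == 0);

    size_t n = size < dst_size - 1 ? size : dst_size - 1;
    if (n > 0)
        memcpy(dst, payload, n);
    dst[n] = '\0';   /* the terminator is added by us, never taken from the wire */
    return n;
}
```

Because the helper never reads past the given size and always writes the terminator itself, the result is safe to pass to functions such as printf or strlen even when the sender did not include a null byte.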
It is quite useful to know which ZeroMQ version you are using. Knowing the exact version is helpful in some scenarios to avoid unwanted surprises. For example, there are some differences between ZeroMQ 2.x and ZeroMQ 3.x, such as deprecated methods; therefore, if you know the exact ZeroMQ version you have on your machine, you would avoid using deprecated methods.
#include <stdio.h>
#include "zmq.h"

int main (int argc, char const *argv[]) {
  int major, minor, patch;
  zmq_version(&major, &minor, &patch);
  printf("Installed ZeroMQ version: %d.%d.%d\n", major, minor, patch);
  return 0;
}
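Once we have the three numbers, we may want to check them against the version that our code was written for. The sketch below combines them into a single integer, assuming the same major * 10000 + minor * 100 + patch encoding that zmq.h uses for its ZMQ_VERSION macro. The helper names make_version and version_at_least are our own and are not part of ZeroMQ.

```c
#include <assert.h>

/* Packs a version into one comparable integer, for example 3.2.4 -> 30204. */
int make_version(int major, int minor, int patch)
{
    assert(major >= 0);
    assert(minor >= 0 && minor < 100);
    assert(patch >= 0 && patch < 100);
    return major * 10000 + minor * 100 + patch;
}

/* Returns 1 if the installed version is at least the wanted one, else 0. */
int version_at_least(int major, int minor, int patch,
                     int want_major, int want_minor, int want_patch)
{
    return make_version(major, minor, patch)
        >= make_version(want_major, want_minor, want_patch);
}
```

For example, a program could pass the values filled in by zmq_version() to version_at_least(major, minor, patch, 3, 2, 0) and print a warning when the result is 0.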