Chapter 1. Introduction
This book aims to provide a practical approach to learning and using gRPC. It covers not only the basics of gRPC, which you can also find in countless blog posts and on the gRPC home page itself, but also the more interesting (and perhaps less well-documented) aspects of gRPC. The book also demonstrates some of gRPC's pitfalls and what you need to know to overcome them.
The first few chapters in this book introduce gRPC, describing what it is and how it compares with other technologies in the same space. We’ll also dive into how to actually use and apply it. Next, the book ventures into more advanced techniques. These advanced chapters will arm you with the tools to use gRPC to the fullest, so you can truly harness its features for solving problems and building software systems. They also cover more practical concerns, such as best practices for evolving your RPC interfaces and schemas. The later chapters in the book detour into related technologies, some that complement gRPC in production environments and some that aid developers in building and testing applications that use gRPC.
Computer networks and distributed computing
The earliest ideas for connecting multiple computers to form a network came about in the 1960s. Shortly after, the ARPANET (the precursor to the Internet) was created. In the 1970s, email was invented, and it became the most widely used distributed application on the ARPANET. In these early days, the network was used mostly for sharing information by sending data from one computer to another, often on the other side of the country. Distributed computing became its own field of computer science in the 1970s, and the study of how multiple computers could be used to solve larger problems than a single computer could solve was of great interest. This fledgling Internet connected the computing resources of numerous universities and government organizations, creating a large pool of compute power.
The ARPANET eventually grew into the Internet we know today, connecting millions of computers all across the globe. It powered the rise of the World Wide Web in the 1990s, and today the Internet is a utility, much like electricity or water. It is available almost everywhere in the USA and in most of the rest of the world. With more recent innovations in mobile computing and embedded systems, the Internet has become an integral component of modern life, not just of business.
Request-response protocols
Numerous networking protocols have been developed for sharing information from one point on the Internet to another. Almost all of these protocols are built on the TCP/IP Internet protocol suite. This has enabled the free flow of information across the globe, and it has had an immeasurable impact on modern business and commerce, which rely greatly on computer networks and distributed computing.
This diagram shows a very simple but typical arrangement. In enterprise settings, the two halves will likely be on the same network, or perhaps a VPN is used to securely connect clients to the servers. For eCommerce, SaaS (Software as a Service) and other Internet offerings, the servers might be hosted in the provider’s datacenter or even with a third-party provider like Amazon Elastic Compute Cloud (EC2), Google Cloud Platform, or Microsoft Azure. The main components of interest here are clients, on the left, and servers, on the right.
A client is a program that initiates communication, usually by creating a TCP connection. It may be an end-user program, initiating communication to request information or resources that it presents to a user. In web applications, for example, the client is a web browser such as Chrome or Firefox.
A server is a program that accepts these client connections and in turn processes their requests. There are many kinds of servers. The diagram shows an application server, so named because it serves data to the client application. (It is also called a web server if it uses HTTP as the communication protocol.) As shown in the diagram, the application server may involve other computers in order to serve responses. In that case, the server also acts as a client. In the diagram, the application server acts as a database client, requesting information from the database server.
Most networking protocols follow this pattern: a client creates a network connection to the server and sends requests. The server accepts requests, performs some processing, and then sends responses. This request-response flow was conceived in the early days of networking in the 1960s and has been foundational to distributed computing ever since.
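The request-response flow described above can be sketched in a few lines of Python using only the standard library. This is a minimal illustration, not a production server: it handles a single connection, assumes the whole request fits in one small read, and the "processing" step (upper-casing the payload) is invented purely for the example. The function names serve_once and send_request are hypothetical.

```python
import socket
import threading


def serve_once(listener: socket.socket) -> None:
    """Server side: accept one connection, read one request, send one response."""
    conn, _addr = listener.accept()
    with conn:
        request = conn.recv(1024)    # request bytes sent by the client
        response = request.upper()   # stand-in for real processing
        conn.sendall(response)


def send_request(port: int, payload: bytes) -> bytes:
    """Client side: open a TCP connection, send a request, wait for the response."""
    with socket.create_connection(("127.0.0.1", port)) as sock:
        sock.sendall(payload)
        return sock.recv(1024)


# Bind to port 0 so the operating system picks a free port.
listener = socket.create_server(("127.0.0.1", 0))
port = listener.getsockname()[1]

# Run the server in a background thread so the client can run in this one.
server_thread = threading.Thread(target=serve_once, args=(listener,))
server_thread.start()

reply = send_request(port, b"hello")
server_thread.join()
listener.close()

print(reply)
```

The client initiates the connection and the server only reacts to it, which is the asymmetry that defines the two roles.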
Remote Procedure Calls
RPC stands for Remote Procedure Call. It is a programming model built on top of request-response network protocols. Issuing an RPC in a client amounts to invoking a procedure from application code. For a server, servicing an RPC amounts to implementing a procedure with a particular signature.
In the client, the objects that expose these procedures are called stubs. When application code invokes a procedure on a stub, the stub translates the arguments into bytes and then sends a request to the server, the contents of which are the serialized arguments. When it gets back a response from the server, it translates the bytes into a result value, which is then returned back to the application code that called it.
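As a rough sketch of what a stub does, the Python below defines a hypothetical CalculatorStub with a single add procedure. Everything here is an assumption made for illustration: the class and method names, the use of JSON to turn arguments into bytes (this is not the wire format gRPC uses), and the fake_transport function that stands in for the network and the server so the example can run on its own.

```python
import json
from typing import Callable

# A transport takes request bytes and returns response bytes. In a real system
# it would write to a network connection; here it can be any callable.
Transport = Callable[[bytes], bytes]


class CalculatorStub:
    """Hypothetical client stub exposing a remote `add` procedure."""

    def __init__(self, transport: Transport) -> None:
        self._transport = transport

    def add(self, a: int, b: int) -> int:
        # 1. Translate the arguments into bytes.
        request = json.dumps({"method": "add", "args": [a, b]}).encode("utf-8")
        # 2. Send the request and wait for the response bytes.
        response = self._transport(request)
        # 3. Translate the response bytes back into a result value.
        return json.loads(response.decode("utf-8"))["result"]


def fake_transport(request: bytes) -> bytes:
    """Stand-in for the network and server: decodes the request and answers it."""
    message = json.loads(request.decode("utf-8"))
    return json.dumps({"result": sum(message["args"])}).encode("utf-8")


stub = CalculatorStub(fake_transport)
total = stub.add(2, 3)  # reads like an ordinary local call
print(total)
```

From the caller's point of view, stub.add(2, 3) is indistinguishable from a local method call; the serialization and the round trip are hidden inside the stub.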
In the server, the objects that expose these procedures are service implementations. The server machinery receives the request, translates the bytes back into procedure arguments, and then invokes a procedure on the service implementation, passing it those arguments. The service implementation performs its business logic and then returns a result, either a value on success or an error code on failure. (In some RPC implementations, servers can return both values and an error code.) The server machinery then translates this result into bytes, which then become the response that is sent back to the client.
RPC is not a new programming idiom: proposals for remote procedure call semantics were written in the 1970s, and practical RPC implementations appeared in the 1980s, such as the RPC system underlying the Network File System (NFS).
gRPC is a cross-platform RPC system that supports a wide variety of programming languages. It combines high performance with ease of use, which greatly simplifies the construction of all types of distributed systems.
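The server side of this exchange can be sketched in the same spirit. In the Python below, CalculatorService is a hypothetical service implementation containing only business logic, and handle_request plays the role of the server machinery. The JSON encoding and the error strings are assumptions chosen for illustration, not the format or codes of any particular RPC system, and the network layer is omitted so that the function simply maps request bytes to response bytes.

```python
import json


class CalculatorService:
    """Hypothetical service implementation: business logic only, no networking."""

    def divide(self, a: float, b: float) -> float:
        return a / b


def encode(result: dict) -> bytes:
    """Translate a result (value or error code) into response bytes."""
    return json.dumps(result).encode("utf-8")


def handle_request(service: object, request: bytes) -> bytes:
    """Server machinery: decode the request, invoke the procedure, encode the result."""
    message = json.loads(request.decode("utf-8"))
    method = message["method"]

    # Only public, callable attributes of the service count as procedures.
    procedure = None if method.startswith("_") else getattr(service, method, None)
    if not callable(procedure):
        return encode({"error": "UNIMPLEMENTED"})

    try:
        value = procedure(*message["args"])
    except ZeroDivisionError:
        # Failure: return an error code instead of a value.
        return encode({"error": "INVALID_ARGUMENT"})
    return encode({"result": value})


service = CalculatorService()
ok = handle_request(service, b'{"method": "divide", "args": [6, 3]}')
bad = handle_request(service, b'{"method": "divide", "args": [1, 0]}')
print(ok, bad)
```

The service implementation never sees bytes, and the machinery never needs to know what divide does. That separation is what lets an RPC system supply the machinery while application developers write only the procedures.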
Summary
This chapter introduced the domain in which gRPC is used (distributed computing), and provided a brief overview of how this domain has evolved over time.
In the next chapter, we will go into greater detail as to what exactly gRPC is, and how it compares to similar technologies.