You're reading from Protocol Buffers Handbook

Product type: Book
Published in: Apr 2024
Publisher: Packt
ISBN-13: 9781805124672
Edition: 1st Edition
Author (1)
Clément Jean

Clément Jean is the CTO of Education for Ethiopia, a start-up focusing on educating K-12 students in Ethiopia. On top of that, he is also an online instructor (on Udemy, Linux Foundation, and others) teaching people about different kinds of technologies. In both his occupations, he deals with technologies such as Protobuf and gRPC and how to apply them to real-life use cases. His overall goal is to empower people through education and technology.

Protobuf is a Language

It is time to discover the Protobuf language and its syntax. In this chapter, we are going to see all the concepts that we need in order to write Protobuf schemas. The chapter is therefore designed as a kind of reference: it can be read from start to end, but you can also come back to it from later chapters. As such, you might not understand every implication of each concept right away, and that is fine. We will make sure that you get all the knowledge you need throughout this book.

In this chapter, we will learn about the following:

  • Top-level statements
  • User-defined types
  • Out-of-the-box types
  • Services

At the end of this chapter, you will know all the most common concepts in the Protobuf language. You will understand what they are used for, and you will be able to start writing proto files.

Technical requirements

All the code examples that you will see in this section can be found under the directory called chapter2 in the GitHub repository (https://github.com/PacktPublishing/Protocol-Buffers-Handbook).

In the following sections, I will be using some extended Backus-Naur form (EBNF) notation (https://en.wikipedia.org/wiki/Extended_Backus%E2%80%93Naur_form) to describe the syntax of all the elements. The following elements will be used:

|   alternation (or)
()  grouping
[]  option (zero or one time)
{}  repetition (any number of times)

Do not worry too much, though. I am only including the EBNF for people who are interested and to emphasize that Protobuf is a language. If you find it overwhelming, you can skip these parts and look at the examples I will be providing.

On top of that, I will be omitting details for simplicity. However, all the details are available in the official specifications:

...

Top-level statements

In this section, we will see all the top-level statements in the order they should appear in a proto file according to the Protobuf Style Guide (https://protobuf.dev/programming-guides/style/). We are going to go through their meaning, and we are going to see some simple examples.

Syntax

The syntax statement is one of the easiest statements to understand. It tells the compiler (protoc) which version of the language the file uses and, therefore, which features we can and cannot access:

EBNF – Syntax statement

version = "proto2" | "proto3" | "editions"
syntax = "syntax" "=" ("'" version "'" | '"' version '"') ";"

As you can see, there are three versions that we can pass to the syntax statement:

  • proto2
  • proto3
  • editions
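
As a minimal sketch, a file that opts into proto3 would start with the statement below. Per the EBNF above, the version can be wrapped in either single or double quotes. The file name in the comment is hypothetical.

```protobuf
// person.proto (hypothetical file name)
//
// The syntax statement must be the first non-empty, non-comment line of the file.
// Writing syntax = 'proto3'; with single quotes would be equivalent.
syntax = "proto3";
```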

Now, all of this is a little bit obscure, and these names, especially proto2 and proto3, are...

User-defined types

enum

You are certainly already familiar with enums in your favorite language, and they are pretty much the same in Protobuf. When we know all the possible values of a type, we use enums to create a lightweight representation of each value:

EBNF – Enum syntax

enumValueOption = optionName "=" constant
enumField = ident "=" [ "-" ] intLit [ "[" enumValueOption { "," enumValueOption } "]" ] ";"
enumBody = "{" { option | enumField | reserved } "}"
enum = "enum" ident enumBody

As such, an enum in Protobuf looks like the following:

enum PhoneType {
  PHONE_TYPE_MOBILE = 0;
  PHONE_TYPE_HOME = 1;
  //...
}

You can see that we are adding the name of the enum (in UPPER_SNAKE_CASE) as the prefix of each value, and then we have this magic number following the equal sign. The naming is purely done according to convention...
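
The EBNF above also allows value options and reserved statements inside an enum body. The following is a sketch of how those pieces could fit together, reusing the PhoneType example. The package name, the PHONE_TYPE_WORK value, and the reserved entries are assumptions made for illustration.

```protobuf
syntax = "proto3";

package example.v1; // hypothetical package name

enum PhoneType {
  // Reserved numbers and names cannot be reused by later edits of this enum.
  reserved 3;
  reserved "PHONE_TYPE_FAX";

  PHONE_TYPE_MOBILE = 0;
  PHONE_TYPE_HOME = 1;
  // An enum value option, written between square brackets.
  PHONE_TYPE_WORK = 2 [deprecated = true];
}
```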

Message

Messages are the most complex concept in Protobuf. This is why we are going to split this section into multiple subsections. We will talk about these concepts in the following order:

  • Options: They are repeated in all the following concepts
  • Field, reserved, map, and oneof: These concepts are all about defining fields and specifying some serialization behavior
  • Nested messages

EBNF – Message syntax

messageBody = "{" { field | enum | message | option | oneof | mapField | reserved } "}"
message = "message" ident messageBody
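
To make the grammar more concrete, here is a sketch of a message that uses each element listed in messageBody: an option, reserved statements, a nested enum, a nested message, plain fields, a map field, and a oneof. All names and field numbers are hypothetical.

```protobuf
syntax = "proto3";

package example.v1; // hypothetical package name

message Person {
  // Message-level option.
  option deprecated = false;

  // Field number and field name that must not be reused.
  reserved 6;
  reserved "nickname";

  // Nested enum, only relevant in the context of Person.
  enum PhoneType {
    PHONE_TYPE_MOBILE = 0;
    PHONE_TYPE_HOME = 1;
  }

  // Nested message.
  message PhoneNumber {
    string number = 1;
    PhoneType type = 2;
  }

  // Plain and repeated fields.
  string name = 1;
  repeated PhoneNumber phones = 2;

  // Map field.
  map<string, string> labels = 3;

  // At most one of the following fields is set at a time.
  oneof contact {
    string email = 4;
    string username = 5;
  }
}
```

Outside of Person, the nested types would be referenced by their qualified names, for example Person.PhoneNumber.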

Option

Since we are already familiar with options at this point, we can skip them. However, I still want to mention the option types defined in the descriptor.proto file (https://github.com/protocolbuffers/protobuf/blob/main/src/google/protobuf/descriptor.proto) so that you can look up all the available options. Here is the list per concept:

message -> MessageOptions...
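
As a rough illustration of where options can appear, the sketch below sets one standard option at several levels. The comments name the descriptor.proto message that defines each option. The package, type, and field names are hypothetical.

```protobuf
syntax = "proto3";

package example.v1; // hypothetical package name

// File-level option, defined in FileOptions.
option java_multiple_files = true;

enum Status {
  // Enum-level option, defined in EnumOptions.
  option deprecated = true;

  STATUS_UNSPECIFIED = 0;
  // Enum value option, defined in EnumValueOptions.
  STATUS_OLD = 1 [deprecated = true];
}

message Account {
  // Message-level option, defined in MessageOptions.
  option deprecated = true;

  // Field option, defined in FieldOptions.
  string legacy_id = 1 [deprecated = true];
}
```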

Services

Finally, the last concept we are going to see is the service. Services are designed for Protobuf's interaction with RPC frameworks, such as gRPC. They define a type-safe contract that the server should implement and that the client can call:

EBNF – Service syntax

rpc = "rpc" ident "(" [ "stream" ] messageType ")" "returns" "(" [ "stream" ] messageType ")" (( "{" option "}" ) | ";")
service = "service" ident "{" { option | rpc } "}"

For example, we could have the following service:

service BookService {
  rpc ListBooks(ListBooksRequest) returns (ListBooksResponse);
  rpc GetBook(GetBookRequest) returns (Book);
  rpc CreateBook(CreateBookRequest) returns (Book);
  rpc UpdateBook(UpdateBookRequest) returns (Book);
  rpc DeleteBook(DeleteBookRequest) returns (google.protobuf.Empty...
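
The rpc grammar above also allows the optional stream keyword on the request, the response, or both, as well as an option body in place of the final semicolon. Here is a sketch of those variants. The service, method, and message names are hypothetical and separate from the BookService example.

```protobuf
syntax = "proto3";

package example.v1; // hypothetical package name

// Minimal placeholder messages so that the file compiles on its own.
message Book {
  string title = 1;
}

message WatchBooksRequest {
  string shelf = 1;
}

message UploadBooksResponse {
  int32 count = 1;
}

service BookStreamService {
  // Server streaming: one request, many responses.
  rpc WatchBooks(WatchBooksRequest) returns (stream Book);

  // Client streaming: many requests, one response.
  rpc UploadBooks(stream Book) returns (UploadBooksResponse);

  // Bidirectional streaming, with an option body instead of ";".
  rpc SyncBooks(stream Book) returns (stream Book) {
    option deprecated = true;
  }
}
```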

Out-of-the-box types

On top of all the scalar types that Protobuf provides as part of the language, it also provides some already-defined types called well-known types (WKTs). All these types can be found in the src/google/protobuf folder in the GitHub repository (https://github.com/protocolbuffers/protobuf/tree/main/src/google/protobuf), and they are all defined under the google.protobuf package.

Here is a list of the most common WKTs available:

  • Duration
  • Timestamp
  • FieldMask
  • Any
  • Struct
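
To use a WKT, we import its proto file and reference the type through the google.protobuf package. The sketch below shows the five types from the list in use. The message and field names are hypothetical.

```protobuf
syntax = "proto3";

package example.v1; // hypothetical package name

import "google/protobuf/any.proto";
import "google/protobuf/duration.proto";
import "google/protobuf/field_mask.proto";
import "google/protobuf/struct.proto";
import "google/protobuf/timestamp.proto";

message Task {
  google.protobuf.Timestamp created_at = 1; // a point in time
  google.protobuf.Duration timeout = 2;     // a span of time
  google.protobuf.Any payload = 3;          // an arbitrary embedded message
  google.protobuf.Struct metadata = 4;      // free-form, JSON-like data
}

message UpdateTaskRequest {
  Task task = 1;
  // Lists which fields of task should be updated.
  google.protobuf.FieldMask update_mask = 2;
}
```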

Let’s go through all of these types and see what they are used for.

Duration and timestamp

These two types are pretty interesting because they show us the importance of naming, documenting with comments, and separating definitions into different files for reusability. In fact, these two types have the same definitions except for the message name. Here is the (simplified) output of the diff command between the two:

-message Timestamp {
-  // Represents...

Summary

We have come a long way. This chapter was intense, and there are probably things that you are still not entirely sure about. That is normal; do not worry. You can refer to it later if you have questions about a certain Protobuf concept.

In this chapter, we saw how to write top-level statements such as Edition, Syntax, Import, Package, and Options. We then saw how to write enums and messages to create type definitions. After that, we saw what services are. Finally, we talked about the well-known types that are provided out of the box when you use Protobuf.

In the next chapter, we will talk about the Protobuf text format and how we can set values to fields in a nonprogrammatic way.

Quiz

  1. Where can you find the proto files for the well-known types?
    A. Nowhere; this is hidden inside the Protobuf language.
    B. Nowhere; we need to define them ourselves.
    C. On the GitHub repository under the src/google/protobuf folder.
  2. Where can you find the message definitions for the option types?
    A. On the GitHub repository under the folder src/google/protobuf in the file called descriptor.proto.
    B. Nowhere; this is hidden inside the Protobuf language.
    C. Nowhere; we need to define them ourselves.
  3. Why do we define an UNSPECIFIED value in enums for proto3?
    A. In proto3, enums are closed, so we do not need to.
    B. In proto3, enums are open, and UNSPECIFIED is used as the default value.
    C. It is not needed in proto3, only in proto2.
  4. When do we need nesting types inside a message?
    A. When we want to specialize a type with fully qualified names.
    B. When a type is only relevant in the parent context.
    C. All the above.

Answers

  1. C
  2. A
  3. B
  4. C