Caffe2 Quick Start Guide

By Ashwin Nanjappa
    What do you get with a Packt Subscription?

  • Instant access to this title and 7,500+ eBooks & Videos
  • Constantly updated with 100+ new titles each month
  • Breadth and depth in over 1,000+ technologies
About this book

Caffe2 is a popular deep learning library used for fast and scalable training and inference of deep learning models on various platforms. This book introduces you to the Caffe2 framework and shows how you can leverage its power to build, train, and deploy efficient neural network models at scale.

It will cover the topics of installing Caffe2, composing networks using its operators, training models, and deploying models to different architectures. It will also show how to import models from Caffe and from other frameworks using the ONNX interchange format. It covers the topic of deep learning accelerators such as CPU and GPU and shows how to deploy Caffe2 models for inference on accelerators using inference engines. Caffe2 is built for deployment to a diverse set of hardware, using containers on the cloud and resource constrained hardware such as Raspberry Pi, which will be demonstrated.

By the end of this book, you will be able to not only compose and train popular neural network models with Caffe2, but also be able to deploy them on accelerators, to the cloud and on resource constrained platforms such as mobile and embedded hardware.

Publication date:
May 2019


Introduction and Installation

Welcome to the Caffe2 Quick Start Guide. This book aims to provide you with a quick introduction to the Caffe2 deep learning framework and how to use it for training and deployment of deep learning models. This book uses code samples to create, train, and run inference on actual deep learning models that solve real problems. In this way, its code can be applied quickly by readers to their own applications.

This chapter provides a brief introduction to Caffe2 and shows you how to build and install it on your computer. In this chapter, we will cover the following topics:

  • Introduction to deep learning and Caffe2
  • Building and installing Caffe2
  • Testing Caffe2 Python API
  • Testing Caffe2 C++ API

Introduction to deep learning

Terms such as artificial intelligence (AI), machine learning (ML), and deep learning (DL) are popular right now. This popularity can be attributed to significant improvements that deep learning techniques have brought about in the last few years in enabling computers to see, hear, read, and create. First and foremost, we'll introduce these three fields and how they intersect:

Figure 1.1: Relationship between deep learning, ML, and AI


Artificial intelligence (AI) is a general term used to refer to the intelligence of computers, specifically their ability to reason, sense, perceive, and respond. It is used to refer to any non-biological system that has intelligence, and this intelligence is a consequence of a set of rules. It does not matter in AI if those sets of rules were created manually by a human, or if those rules were automatically learned by a computer by analyzing data. Research into AI started in 1956, and it has been through many ups and a couple of downs, called AI winters, since then.


Machine learning (ML) is a subset of AI that uses statistics, data, and learning algorithms to teach computers to learn from given data. This data, called training data, is specific to the problem being solved, and contains examples of input and the expected output for each input. ML algorithms learn models or representations automatically from training data, and these models can be used to obtain predictions for new input data.

There are many popular types of models in ML, including artificial neural networks (ANNs), Bayesian networks, support vector machines (SVM), and random forests. The ML model that is of interest to us in this book is ANN. The structure of ANNs are inspired by the connections in the brain. These neural network models were initially popular in ML, but later fell out of favor since they required enormous computing power that was not available at that time.

Deep learning

Over the last decade, utilization of the parallel processing capability of graphics processing units (GPUs) to solve general computation problems became popular. This type of computation came to be known as general-purpose computing on GPU (GPGPU). GPUs were quite affordable and were easy to use as accelerators by using GPGPU programming models and APIs such as Compute Unified Device Architecture (CUDA) and Open Computing Language (OpenCL). Starting in 2012, neural network researchers harnessed GPUs to train neural networks with a large number of layers and started to generate breakthroughs in solving computer vision, speech recognition, and other problems. The use of such deep neural networks with a large number of layers of neurons gave rise to the term deep learning. Deep learning algorithms form a subset of ML and use multiple layers of abstraction to learn and parameterize multi-layer neural network models of data.


Introduction to Caffe2

The popularity and success of deep learning has been motivated by the creation of many popular and open source deep learning frameworks that can be used for training and inference of neural networks. Caffe was one of the first popular deep learning frameworks. It was created by Yangqing Jia at UC Berkeley for his PhD thesis and released to the public at the end of 2013. It was primarily written in C++ and provided a C++ API. Caffe also provided a rudimentary Python API wrapped around the C++ API. The Caffe framework created networks using layers. Users created networks by listing down and describing its layers in a text file commonly referred to as a prototxt.

Following the popularity of Caffe, universities, corporations, and individuals created and launched many deep learning frameworks. Some of the popular ones today are Caffe2, TensorFlow, MXNet, and PyTorch. TensorFlow is driven by Google, MXNet has the support of Amazon, and PyTorch was primarily developed by Facebook.

Caffe's creator, Yangqing Jia, moved to Facebook, where he created a follow-up to Caffe called Caffe2. Compared to the other deep learning frameworks, Caffe2 was designed to focus on scalability, high performance, and portability. Written in C++, it has both a C++ API and a Python API.

Caffe2 and PyTorch

Caffe2 and PyTorch are both popular DL frameworks, maintained and driven by Facebook. PyTorch originates from the Torch DL framework. It is characterized by a Python API that is easy for designing different network structures and experimenting with training parameters and regimens on them. While PyTorch could be used for inference in production applications on the cloud and in the edge, it is not as efficient when it comes to this.

Caffe2 has a Python API and a C++ API. It is designed for practitioners who tinker with existing network structures and use pre-trained models from PyTorch, Caffe, and other DL frameworks, and ready them for deployment inside applications, local workstations, low-power devices at the edge, mobile devices, and in the cloud.

Having observed the complementary features of PyTorch and Caffe2, Facebook has a plan to merge the two projects. As we will see later, Caffe2 source code is already organized as a subdirectory under the PyTorch Git repository. In the future, expect more intermingling of these two projects, with a final goal of fusing the two together to create a single DL framework that is easy to experiment with and tinker, efficient to train and deploy, and that can scale from the cloud to the edge, from general-purpose processors to special-purpose accelerators.

Hardware requirements

Working with deep learning models, especially the training process, requires a lot of computing power. While you could train a popular neural network on the CPU, it could typically take many hours or days, depending on the complexity of the network. Using GPUs for training is highly recommended since they typically reduce the training time by an order of magnitude or more compared to CPUs. Caffe2 uses CUDA to access the parallel processing capabilities of NVIDIA GPUs. CUDA is an API that enables developers to use the parallel computation capabilities of an NVIDIA GPU, so you will need to use an NVIDIA GPU. You can either install an NVIDIA GPU on your local computer, or use a cloud service provider such as Amazon AWS that provides instances with NVIDIA GPUs. Please take note of the running costs of such cloud instances before you use them for extended periods of training.

Once you have trained a model using Caffe2, you can use CPUs, GPUs, or many other processors for inference. We will explore a few such options in Chapter 6, Deploying Models to Accelerators for Inference, and Chapter 7, Caffe2 at the Edge and in the cloud, later in the book.

Software requirements

A major portion of deep learning research and development is currently taking place on Linux computers. Ubuntu is a distribution of Linux that happens to be very popular for deep learning research and development. We will be using Ubuntu as the operating system of choice in this book. If you are using a different flavor of Linux, you should be able to search online for commands similar to Ubuntu commands for most of the operations described here. If you use Windows or macOS, you will need to replace the Linux commands in this book with equivalent commands. All the code samples should work on Linux, Windows, and macOS with zero or minimal changes.


Building and installing Caffe2

Caffe2 can be built and installed from source code quite easily. Installing Caffe2 from source gives us more flexibility and control over our application setup. The build and install process has four stages:

  1. Installing dependencies
  2. Installing acceleration libraries
  1. Building Caffe2
  2. Installing Caffe2

Installing dependencies

We first need to install packages that Caffe2 is dependent on, as well as the tools and libraries required to build it.

  1. First, obtain information about the newest versions of Ubuntu packages by querying their online repositories using the apt-get tool:
$ sudo apt-get update
  1. Next, using the apt-get tool, install the libraries that are required to build Caffe2, and that Caffe2 requires for its operation:
$ sudo apt-get install -y --no-install-recommends \
build-essential \
cmake \
git \
libgflags2 \
libgoogle-glog-dev \
libgtest-dev \
libiomp-dev \
libleveldb-dev \
liblmdb-dev \
libopencv-dev \
libopenmpi-dev \
libsnappy-dev \
libprotobuf-dev \
openmpi-bin \
openmpi-doc \
protobuf-compiler \
python-dev \

These packages include tools required to download Caffe2 source code (Git) and to build Caffe2 (build-essential, cmake, and python-dev). The rest are libraries that Caffe2 is dependent on, including Google Flags (libgflags2), Google Log (libgoogle-glog-dev), Google Test (libgtest-dev), LevelDB (libleveldb-dev), LMDB (liblmdb-dev), OpenCV (libopencv-dev), OpenMP (libiomp-dev), OpenMPI (openmpi-bin and openmpi-doc), Protobuf (libprotobuf-dev and protobuf-compiler), and Snappy (libsnappy-dev).

  1. Finally, install the Python Pip tool and use it to install other Python libraries such as NumPy and Protobuf Python APIs that are useful when working with Python:
$ sudo apt-get install -y --no-install-recommends python-pip                   

$ pip install --user \
future \
numpy \

Installing acceleration libraries

Using Caffe2 to train DL networks and using them for inference involves a lot of math computation. Using acceleration libraries of math routines and deep learning primitives helps Caffe2 users by speeding up training and inference tasks. Vendors of CPUs and GPUs typically offer such libraries, and Caffe2 has support to use such libraries if they are available on your system.

Intel Math Kernel Library (MKL) is key to faster training and inference on Intel CPUs. This library is free for personal and community use. It can be downloaded by registering here: Installation involves uncompressing the downloaded package and running the installer script as a superuser. The library files are installed by default to the /opt/intel directory. The Caffe2 build step, described in the next section, finds and uses the BLAS and LAPACK routines of MKL automatically, if MKL was installed at the default directory.

CUDA and CUDA Deep Neural Network (cuDNN) libraries are essential for faster training and inference on NVIDIA GPUs. CUDA is free to download after registering here: cuDNN can be downloaded from here: Note that you need to have a modern NVIDIA GPU and an NVIDIA GPU driver already installed. As an alternative to the GPU driver, you could use the driver that is installed along with CUDA. Files of the CUDA and cuDNN libraries are typically installed in the /usr/local/cuda directory on Linux. The Caffe2 build step, described in the next section, finds and uses CUDA and cuDNN automatically if installed in the default directory.

Building Caffe2

Using Git, we can clone the Git repository containing Caffe2 source code and all the submodules it requires:

$ git clone --recursive && cd pytorch

$ git submodule update --init

Notice how the Caffe2 source code now exists in a subdirectory inside the PyTorch source repository. This is because of Facebook's cohabitation plan for these two popular DL frameworks as it endeavors to merge the best features of both frameworks over a period of time.

Caffe2 uses CMake as its build system. CMake enables Caffe2 to be easily built for a wide variety of compilers and operating systems.

To build Caffe2 source code using CMake, we first create a build directory and invoke CMake from within it:

$ mkdir build
$ cd build

$ cmake ..

CMake checks available compilers, operating systems, libraries, and packages, and figures out which Caffe2 features to enable and compilation options to use. These options can be seen listed in the CMakeLists.txt file present at the root directory. Options are listed in the form of option(USE_FOOBAR "Use Foobar library" OFF). You can enable or disable those options by setting them to ON or OFF in CMakeLists.txt.

These options can also be configured when invoking CMake. For example, if your Intel CPU has support for AVX/AVX2/FMA, and you would wish to use those features to speed up Caffe2 operations, then enable the USE_NATIVE_ARCH option as follows:


Installing Caffe2

CMake produces a Makefile file at the end. We can build Caffe2 and install it on our system using the following command:

$ sudo make install

This step involves building a large number of CUDA files, which can be very slow. It is recommended to use the parallel execution feature of make to use all the cores of your CPU for a faster build. We can do this by using the following command:

$ sudo make -j install

Using the make install method to build and install makes it difficult to update or uninstall Caffe2 later.
Instead, I prefer to create a Debian package of Caffe2 and install it. That way, I can uninstall or update it conveniently. We can do this using the checkinstall tool.

To install checkinstall, and to use it to build and install Caffe2, use the following commands:

$ sudo apt-get install checkinstall
$ sudo checkinstall --pkgname caffe2

This command also produces a Debian .deb package file that you can use to install on other computers or share with others. For example, on my computer, this command produced a file named caffe2_20181207-1_amd64.deb.

If you need a faster build, use the parallel execution feature of make along with checkinstall:

$ sudo checkinstall --pkgname caffe2 make -j install

If you need to uninstall Caffe2 in the future, you can now do that easily using the following command:

$ sudo dpkg -r caffe2

Testing the Caffe2 Python API

We have now installed Caffe2, but we need to make sure it is correctly installed and that its Python API is working. An easy way to do that is to return to your home directory and check whether the Python API of Caffe2 is imported and can execute correctly. This can be done using the following commands:

$ cd ~
$ python -c "from caffe2.python import core"

Do not run the preceding command from within the Caffe2 directories. This is to avoid the ambiguity of Python having to pick between your installed Caffe2 files and those in the source or build directories.

If your Caffe2 is not installed correctly, you may see an error of some kind, such as the one shown in the following code block, for example:

$ python -c "from caffe2.python import core"
Traceback (most recent call last):
File "<string>", line 1, in <module>
ImportError: No module named caffe2.python

If your Caffe2 has been installed correctly, then you may not see an error. However, you may still get a warning if you don't have a GPU:

$ python -c "from caffe2.python import core"
WARNING:root:This caffe2 python run does not have GPU support. Will run in CPU only mode.

Testing the Caffe2 C++ API

We have now installed Caffe2, but we need to make sure it is correctly installed and that its C++ API is working. An easy way to do that is to create a small C++ program that initializes the global environment of Caffe2. This is done by calling a method named GlobalInit and passing it the program's arguments. This is typically the first call in a Caffe2 C++ application.

Create a C++ source file named ch1.cpp with this code:

// ch1.cpp
#include "caffe2/core/init.h"

int main(int argc, char** argv)
caffe2::GlobalInit(&argc, &argv);
return 0;

We can compile this C++ source file using the following command:

$ g++ ch1.cpp -lcaffe2

We ask the linker to link with the shared library file by using the -lcaffe2 option. The compiler uses the default include file locations, and the linker uses the default shared library file locations, so we do not need to specify those.

By default, Caffe2 header files are installed to a caffe2 subdirectory in /usr/local/include. This location is usually automatically included in a C++ compilation. Similarly, the Caffe2 shared library files are installed to /usr/local/lib by default. If you installed Caffe2 to a different location, you would need to specify the include directory location using the -I option and the shared library file location using the -L option.

We can now execute the compiled binary:

$ ./a.out

If it executes successfully, then your Caffe2 installation is fine. You are now ready to write Caffe2 C++ applications.



Congratulations! This chapter provided a brief introduction to deep learning and Caffe2. We examined the process of building and installing Caffe2 on our system. We are now ready to explore the world of deep learning by building our own networks, training our own models, and using them for inference on real-world problems.

In the next chapter, we will learn about Caffe2 operators and learn how to compose them to build simple computation graphs. We will then proceed to build a neural network that can recognize handwritten digits.

About the Author
  • Ashwin Nanjappa

    Ashwin Nanjappa is a senior architect at NVIDIA, working in the TensorRT team on improving deep learning inference on GPU accelerators. He has a PhD from the National University of Singapore in developing GPU algorithms for the fundamental computational geometry problem of 3D Delaunay triangulation. As a post-doctoral research fellow at the BioInformatics Institute (Singapore), he developed GPU-accelerated machine learning algorithms for pose estimation using depth cameras. As an algorithms research engineer at Visenze (Singapore), he implemented computer vision algorithm pipelines in C++, developed a training framework built upon Caffe in Python, and trained deep learning models for some of the world's most popular online shopping portals.

    Browse publications by this author
Caffe2 Quick Start Guide
Unlock this book and the full library FREE for 7 days
Start now