Composing Networks

In this chapter, we will learn about Caffe2 operators and how we can compose networks using these operators. To learn how to use operators, we will start off by building a simple computation graph from scratch. After that, we will tackle a real computer vision problem, MNIST, by building a genuine neural network with trained parameters and using it for inference.

This chapter covers the following topics:

  • Introduction to Caffe2 operators
  • The difference between operators and layers
  • How to use operators to compose a network
  • Introduction to the MNIST problem
  • Composing a network for the MNIST problem
  • Inference through a Caffe2 network

Operators

In Caffe2, a neural network can be thought of as a directed graph, where the nodes are operators and the edges represent the flow of data between operators. Operators are the basic units of computation in a Caffe2 network. Every operator is defined with a certain number of inputs and a certain number of outputs. When the operator is executed, it reads its inputs, performs the computation it is associated with, and writes the results to its outputs.
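For instance, the following minimal sketch (assuming a working Caffe2 Python installation; the blob names x and y are our own) creates a single ReLU operator, feeds its input blob, runs the operator once, and fetches its output blob:

import numpy as np
from caffe2.python import core, workspace

# An operator is specified by its type, its input blob names, and its output blob names
relu_op = core.CreateOperator("Relu", ["x"], ["y"])

# Feed the input, execute the operator once, and read back the result
workspace.FeedBlob("x", np.array([-1.0, 0.0, 2.0], dtype=np.float32))
workspace.RunOperatorOnce(relu_op)
print(workspace.FetchBlob("y"))  # [0. 0. 2.]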

To obtain the best possible performance, Caffe2 operators are typically implemented in C++ for execution on CPUs and implemented in CUDA for execution on GPUs. All operators in Caffe2 are derived from a common interface. You can see this common interface defined in the caffe2/proto/caffe2.proto file in the Caffe2 source code.

The following is the Caffe2 operator interface found in my caffe2.proto file:

// Operator Definition...

Difference between layers and operators

Older deep learning frameworks, such as Caffe, did not have operators. Instead, their basic units of computation were called layers, a name inspired by the layers of a neural network.

However, contemporary frameworks, such as Caffe2, TensorFlow, and PyTorch, prefer to use the term operator for their basic units of computation. There is a subtle difference between operators and layers. A layer in older frameworks, such as Caffe, was composed of both the computation function of that layer and the trained parameters of that layer. In contrast to this, an operator in Caffe2 only holds the computation function. Both the trained parameters and the inputs are external to the operator and need to be fed to it explicitly.
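To make this concrete, here is a minimal sketch (the blob names and shapes are our own) using the fully connected (FC) operator: its weight w and bias b are ordinary blobs that must be fed to the operator explicitly, exactly like the input data:

import numpy as np
from caffe2.python import core, workspace

# The FC operator holds no parameters itself; data, w, and b are all external inputs
fc_op = core.CreateOperator("FC", ["data", "w", "b"], ["y"])

workspace.FeedBlob("data", np.random.rand(2, 4).astype(np.float32))  # batch of 2, 4 features
workspace.FeedBlob("w", np.random.rand(3, 4).astype(np.float32))     # 3 outputs x 4 inputs
workspace.FeedBlob("b", np.zeros(3, dtype=np.float32))
workspace.RunOperatorOnce(fc_op)
print(workspace.FetchBlob("y").shape)  # (2, 3)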

...

Building a computation graph

In this section, we will learn how to build a network in Caffe2 using model_helper. (model_helper was introduced earlier in this chapter.) To keep this example simple, we use mathematical operators that require no trained parameters. Our network is therefore a computation graph rather than a neural network, since it holds no parameters learned from training data. The network we will build is illustrated by the graph shown in Figure 2.5:

Figure 2.5: Our simple computation graph with three operators

As you can see, we provide two inputs to the network: a matrix, A, and a vector, B. A MatMul operator is applied to A and B and its result is fed to a Sigmoid function, designated by σ in Figure 2.5. The result of the Sigmoid function is fed to a SoftMax function. (We will learn a bit more about the Sigmoid and SoftMax operators...
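A minimal sketch of this graph in the Caffe2 Python API might look as follows (the blob names and shapes here are our own assumptions, not the book's exact code):

import numpy as np
from caffe2.python import model_helper, workspace

# Compose the three operators of Figure 2.5 into one network
model = model_helper.ModelHelper(name="computation_graph")
c = model.net.MatMul(["A", "B"], "C")  # C = A x B
d = model.net.Sigmoid(c, "D")          # D = sigmoid(C), applied elementwise
model.net.Softmax(d, "E", axis=0)      # axis=0: normalize over all values of D

# Feed the two external inputs: a 3 x 3 matrix A and a 3 x 1 vector B
workspace.FeedBlob("A", np.random.rand(3, 3).astype(np.float32))
workspace.FeedBlob("B", np.random.rand(3, 1).astype(np.float32))
workspace.RunNetOnce(model.net)
print(workspace.FetchBlob("E"))        # three values that sum to 1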

Building a multilayer perceptron neural network

In this section, we introduce the MNIST problem and learn how to build a MultiLayer Perceptron (MLP) network using Caffe2 to solve it. We also learn how to load pretrained parameters into the network and use it for inference.
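As a preview, a network of this kind can be composed with the brew helper functions in just a few lines. The following is a minimal sketch under our own assumptions (the layer sizes and blob names are illustrative; the chapter's actual network may differ):

from caffe2.python import brew, model_helper

model = model_helper.ModelHelper(name="mnist_mlp")
# Input: a batch of 28 x 28 grayscale MNIST images, flattened to 784 values each
fc1 = brew.fc(model, "data", "fc1", dim_in=28 * 28, dim_out=128)
relu1 = brew.relu(model, fc1, "relu1")
fc2 = brew.fc(model, relu1, "fc2", dim_in=128, dim_out=10)
softmax = brew.softmax(model, fc2, "softmax")  # one probability per digit class

To execute such a network, we would run model.param_init_net once to create the weight and bias blobs, feed a batch of images into the data blob, and then run model.net.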

MNIST problem

The MNIST problem is a classic image classification problem that used to be popular in machine learning. State-of-the-art methods can now achieve greater than 99% accuracy on it, so it is no longer considered a challenging research problem. However, it acts as a stepping stone for us to learn how to build a Caffe2 network that solves a real machine learning problem.

The MNIST problem lies in identifying the handwritten digit that is present in a grayscale...

Summary

In this chapter, we learned about Caffe2 operators and how they differ from the layers used in older deep learning frameworks. We built a simple computation graph by composing several operators. We then tackled the MNIST machine learning problem and built an MLP network using the brew helper functions. We loaded pretrained weights into this network and used it for inference on a batch of input images. Along the way, we introduced several common operators, such as matrix multiplication, fully connected, Sigmoid, SoftMax, and ReLU.

We learned about performing inference on our networks in this chapter. In the next chapter, we will learn about training and how to train a network to solve the MNIST problem.
