Training Networks

In Chapter 2, Composing Networks, we learned how to create Caffe2 operators and compose networks from them. In this chapter, we focus on training neural networks. We will learn how to create a network intended for training and how to train it using Caffe2. We will continue to use the MNIST dataset as an example. However, instead of the MLP network we built in the previous chapter, we will create a popular network named LeNet.

This chapter will cover the following topics:

  • Introduction to training a neural network
  • Building the training network for LeNet
  • Training and monitoring the LeNet network

Introduction to training

In this section, we provide a brief overview of how a neural network is trained. This will help us to understand the later sections where we use Caffe2 to actually train a network.

Components of a neural network

We employ neural networks to solve problems for which devising an explicit computer algorithm would be difficult. For example, in the MNIST problem (introduced in Chapter 2, Composing Networks), handcrafting an algorithm that detects the characteristic stroke patterns of each digit, and thereby identifies the digit, would be tedious. Instead, it is easier to design a neural network suited to this problem and then train it (as shown later in this chapter) using a lot...

LeNet network

In Chapter 2, Composing Networks, we built an MLP network composed of multiple pairs of fully connected layers and activation layers. In this chapter, we will build and train a convolutional neural network (CNN). This type of network is so named because it primarily uses convolution layers (introduced in the next section). For computer vision problems, CNNs have been shown to deliver better results with fewer parameters than MLPs. One of the first successful CNNs was used to solve the MNIST problem we looked at earlier. This network, named LeNet-5, was created by Yann LeCun and his colleagues:

Figure 3.4: Structure of our LeNet model

We will construct a network similar in spirit to LeNet, and we will refer to it as the LeNet model in the remainder of this book. From Figure 3.4, we can see that our LeNet network has eight layers...

Training data

We use brew in this chapter to simplify the process of building our LeNet network. We begin by first initializing the model using ModelHelper, which was introduced in the previous chapter:

# Create the model helper for the train model
train_model = model_helper.ModelHelper(name="mnist_lenet_train_model")
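
The snippets in this chapter assume that the relevant Caffe2 modules have been imported. A plausible set of imports for the code that follows might look like this (the book's full script may organize them differently):

# Assumed imports for the snippets in this chapter; the book's
# full script may differ
import os
from caffe2.python import brew, core, model_helper, optimizer, workspace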

We then add inputs to the training network using our add_model_inputs method:

# Specify the input is from the train lmdb
data, label = add_model_inputs(
    train_model,
    batch_size=64,
    db=os.path.join(data_folder, "mnist-train-nchw-lmdb"),
    db_type="lmdb",
)

Training data is typically stored in a database (DB) so that it can be accessed efficiently; reading from a DB is usually much faster than reading thousands of individual files from the filesystem. For every training image in the MNIST dataset, the DB stores the grayscale pixel...
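
The add_model_inputs helper itself is abridged here. A minimal sketch of what such a helper might look like, following the data-loading pattern of the official Caffe2 MNIST tutorial (the blob names and pixel scaling are assumptions, not necessarily the book's exact code), is:

# Hypothetical sketch of an add_model_inputs helper, based on the
# data-loading pattern in the official Caffe2 MNIST tutorial
def add_model_inputs(model, batch_size, db, db_type):
    # Read raw uint8 image pixels and integer labels from the DB
    data_uint8, label = brew.db_input(
        model,
        blobs_out=["data_uint8", "label"],
        batch_size=batch_size,
        db=db,
        db_type=db_type,
    )
    # Cast the pixels to float and scale them from [0, 255] to [0, 1)
    data = model.Cast(data_uint8, "data", to=core.DataType.FLOAT)
    data = model.Scale(data, data, scale=float(1.0 / 256))
    # The input is data, not a parameter: no gradient flows back to it
    data = model.StopGradient(data, data)
    return data, label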

Building LeNet

We build the LeNet layers required for inference by calling the build_mnist_lenet method in our script:

# Build the LeNet network
softmax_layer = build_mnist_lenet(train_model, data)

Note how we only pass the image pixel data to this network, and not the labels. Labels are not required for inference; they are needed only during training and testing, where they serve as the ground truth that is compared against the prediction of the network's final layer.

The following subsections describe how we add pairs of convolution and pooling layers, the fully connected and ReLU layers, and the final SoftMax layer to create the LeNet network.

Layer 1 – Convolution

The first layer in LeNet is a convolution...
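
The layer-by-layer walkthrough is abridged here. For orientation, here is a hedged sketch of the whole eight-layer network built with brew; the layer dimensions follow the standard Caffe2 MNIST LeNet tutorial and are assumptions rather than the book's exact values:

# Hedged sketch of build_mnist_lenet; the layer dimensions are taken
# from the standard Caffe2 MNIST LeNet tutorial, not from the book
def build_mnist_lenet(model, data):
    # Layer 1: convolution with 1 input channel (grayscale),
    # 20 output channels, and 5x5 kernels
    conv1 = brew.conv(model, data, "conv1", dim_in=1, dim_out=20, kernel=5)
    # Layer 2: 2x2 max pooling with a stride of 2
    pool1 = brew.max_pool(model, conv1, "pool1", kernel=2, stride=2)
    # Layers 3 and 4: a second convolution-pooling pair
    conv2 = brew.conv(model, pool1, "conv2", dim_in=20, dim_out=50, kernel=5)
    pool2 = brew.max_pool(model, conv2, "pool2", kernel=2, stride=2)
    # Layers 5 and 6: fully connected layer and ReLU activation;
    # two conv+pool stages reduce a 28x28 input to 50 channels of 4x4
    fc3 = brew.fc(model, pool2, "fc3", dim_in=50 * 4 * 4, dim_out=500)
    relu3 = brew.relu(model, fc3, "relu3")
    # Layers 7 and 8: fully connected layer producing 10 digit scores,
    # followed by SoftMax to turn the scores into probabilities
    pred = brew.fc(model, relu3, "pred", dim_in=500, dim_out=10)
    softmax = brew.softmax(model, pred, "softmax")
    return softmax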

Training layers

In earlier sections, we built the layers of the LeNet network required for inference and added inputs of image pixels and the label corresponding to each image. In this section, we add a few layers at the end of the network that are needed to compute the loss function and to perform backpropagation. These layers are only required during training and can be discarded when using the trained network for inference.

Loss layer

As we noted in the Introduction to training section, we need a loss function at the end of the network to determine the error of the network. Caffe2 provides implementations of many common loss functions as operators in its operator catalog.

For this example, we compute the loss value using...
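
The exact loss computation is abridged here. A hedged sketch of the training layers, following the common Caffe2 MNIST tutorial pattern of a cross-entropy loss, gradient operators, and an SGD optimizer (the hyperparameter values below are illustrative assumptions), might look like this:

# Hedged sketch of the training layers; the hyperparameter values
# are illustrative, not necessarily the book's
xent = train_model.LabelCrossEntropy([softmax_layer, label], "xent")
loss = train_model.AveragedLoss(xent, "loss")
# An accuracy operator, useful for monitoring training progress
brew.accuracy(train_model, [softmax_layer, label], "accuracy")
# Add gradient operators to the network for backpropagation
train_model.AddGradientOperators([loss])
# Stochastic gradient descent with a stepped learning-rate policy
optimizer.build_sgd(
    train_model,
    base_learning_rate=0.1,
    policy="step",
    stepsize=1,
    gamma=0.999,
)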

Training and monitoring

We begin the training process by creating the network in the workspace and initializing all of its parameter blobs. This is done by calling the workspace RunNetOnce method on the parameter initialization network:

# The parameter initialization network only needs to be run once.
workspace.RunNetOnce(train_model.param_init_net)

Next, we ask Caffe2 to create the network in memory:

# Create the actual network as a C++ object in memory.
# We create it once and reuse it, since the network will be
# run many times during training.
workspace.CreateNet(train_model.net, overwrite=True)

We are finally ready to train. We iterate a predetermined number of times and, in each iteration, we use the workspace RunNet method to run a forward pass and a backward pass.
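
A minimal sketch of such a training loop, assuming the accuracy and loss blobs added earlier (the iteration count is an illustrative value), is:

# Minimal sketch of the training loop; total_iters is illustrative
import numpy as np

total_iters = 200
accuracy = np.zeros(total_iters)
loss = np.zeros(total_iters)
for i in range(total_iters):
    # Each RunNet call performs one forward pass and one backward pass
    workspace.RunNet(train_model.net)
    # Fetch the monitoring blobs from the workspace after each iteration
    accuracy[i] = workspace.FetchBlob("accuracy")
    loss[i] = workspace.FetchBlob("loss")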

Training a small network such as our LeNet model is fast on both the CPU and the GPU. However, many of...

Summary

In this chapter, we learned about the general process of training a neural network using a gradient-based optimization algorithm. We learned about CNNs and about the classic LeNet CNN for solving the MNIST problem. We built this network and learned how to add training and test layers to it so that it could be trained. Finally, we trained the network and learned how to monitor it during training using Caffe2. In the following chapters, we will learn how to work with models trained using other frameworks, such as Caffe, TensorFlow, and PyTorch.
