Deep Learning with PyTorch: A practical approach to building neural network models using PyTorch

By Vishnu Subramanian
Book Feb 2018 262 pages 1st Edition

Getting Started with Deep Learning Using PyTorch

Deep learning (DL) has revolutionized industry after industry. It was once famously described by Andrew Ng on Twitter:

Artificial Intelligence is the new electricity!

Electricity transformed countless industries; artificial intelligence (AI) will now do the same.

AI and DL are often used as synonyms, but there are substantial differences between the two. Let's demystify the terminology used in the industry so that you, as a practitioner, will be able to differentiate between signal and noise.

In this chapter, we will cover the following topics:

  • AI itself and its origins
  • Machine learning in the real world
  • Applications of deep learning
  • Why deep learning now?
  • Deep learning framework: PyTorch

Artificial intelligence

Countless articles discussing AI are published every day. The trend has increased in the last two years. There are several definitions of AI floating around the web, my favorite being the automation of intellectual tasks normally performed by humans.

The history of AI

The term artificial intelligence was coined by John McCarthy, who in 1956 organized the first academic conference on the subject. The question of whether machines can think, however, is much older than that. In the early days of AI, machines were able to solve problems that were difficult for humans to solve.

For example, the Enigma machine was developed around the end of World War I and was used for military communications during World War II. Alan Turing helped build an early code-breaking machine that helped to crack the Enigma code. Cracking the Enigma code was a very challenging task for a human, and it could take weeks for an analyst to do. The machine was able to crack the code in hours.

Computers have a tough time solving problems that are intuitive to us, such as differentiating between dogs and cats, telling whether your friend is angry at you for arriving late at a party (emotions), differentiating between a truck and a car, taking notes during a seminar (speech recognition), or converting notes to another language for your friend who does not understand your language (for example, French to English). Most of these tasks are intuitive to us, but we were unable to program or hard code a computer to do these kinds of tasks. Most of the intelligence in early AI machines was hard coded, such as a computer program playing chess.

In the early years of AI, a lot of researchers believed that AI could be achieved by hard coding rules. This kind of AI is called symbolic AI and was useful in solving well-defined, logical problems, but it was almost incapable of solving complex problems such as image recognition, object detection, object segmentation, language translation, and natural-language-understanding tasks. Newer approaches to AI, such as machine learning and DL, were developed to solve these kinds of problems.

To better understand the relationships among AI, ML, and DL, let's visualize them as concentric circles: AI, the idea that came first, is the largest; machine learning, which blossomed later, fits inside it; and DL, which is driving today's AI explosion, fits inside both:

How AI, machine learning, and DL fit together

Machine learning

Machine learning (ML) is a sub-field of AI that has become popular in the last 10 years; at times, the two terms are used interchangeably. AI has a lot of other sub-fields aside from machine learning. ML systems are built by showing them lots of examples, unlike symbolic AI, where we hard code rules to build the system. At a high level, machine learning systems look at tons of data and come up with rules to predict outcomes for unseen data:

Machine learning versus traditional programming
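
To make the contrast concrete, here is a minimal sketch in plain Python. The spam-filter setting, the `num_links` feature, the function names, and the tiny dataset are all made up for illustration; they are not taken from any real system. The first function hard codes a rule, while the second derives a comparable rule (a single threshold) from labeled examples:

```python
# Traditional programming: a human writes the rule.
def is_spam_hard_coded(num_links):
    return num_links > 5  # threshold picked by hand (hypothetical rule)


# Machine learning: the rule is derived from labeled examples.
def learn_threshold(examples):
    """Return the threshold on num_links that misclassifies the fewest examples.

    `examples` is a list of (num_links, is_spam) pairs.
    """
    candidates = sorted({x for x, _ in examples})
    best_threshold, best_errors = None, None
    for t in candidates:
        # Predict "spam" when num_links > t and count the disagreements.
        errors = sum((x > t) != label for x, label in examples)
        if best_errors is None or errors < best_errors:
            best_threshold, best_errors = t, errors
    return best_threshold


# Toy labeled data: (number of links in the message, is it spam?)
training_data = [(0, False), (1, False), (2, False), (3, True), (4, True), (7, True)]
learned = learn_threshold(training_data)


def is_spam_learned(num_links):
    return num_links > learned


print(learned)                 # threshold found from the data
print(is_spam_hard_coded(4))   # the hand-written rule misses this case
print(is_spam_learned(4))      # the learned rule catches it
```

Real ML algorithms search over far richer families of rules than a single threshold, but the division of labor is the same: we supply data and answers, and the algorithm supplies the rule.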

Most ML algorithms perform well on structured data, in applications such as sales predictions, recommendation systems, and marketing personalization. An important factor for any ML algorithm is feature engineering, and data scientists need to spend a lot of time getting the features right for ML algorithms to perform well. In certain domains, such as computer vision and natural language processing (NLP), feature engineering is challenging because the data is high-dimensional.

Until recently, problems like these were challenging for organizations to solve using typical machine learning techniques, such as linear regression and random forests, because of the feature engineering effort and the high dimensionality involved. Consider an image of size 224 x 224 x 3 (height x width x channels), where the 3 represents the red, green, and blue color channels of a color image. To store this image in computer memory, our matrix will contain 150,528 dimensions for a single image. If you want to build a classifier on top of 1,000 images of size 224 x 224 x 3, the data will contain 1,000 times 150,528 values. A special branch of machine learning called deep learning allows you to handle these problems using modern techniques and hardware.
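
The arithmetic above is easy to check in a few lines of Python. The storage figures assume 1 byte per value for 8-bit images and 4 bytes per value for 32-bit floats; these are common choices, but they are our assumption here rather than something the text specifies:

```python
height, width, channels = 224, 224, 3

# Number of values needed to represent one image.
values_per_image = height * width * channels

# Number of values for a small dataset of 1,000 such images.
num_images = 1_000
total_values = num_images * values_per_image

# Rough memory footprint under two assumed storage formats.
bytes_as_uint8 = total_values * 1    # 8-bit integers, 1 byte each
bytes_as_float32 = total_values * 4  # 32-bit floats, 4 bytes each

print(values_per_image)             # 150528
print(total_values)                 # 150528000
print(bytes_as_float32 / 1e6, "MB") # 602.112 MB
```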

Examples of machine learning in real life

The following are some cool products that are powered by machine learning:

  • Example 1: Google Photos uses a specific form of machine learning called deep learning for grouping photos
  • Example 2: Recommendation systems, which are a family of ML algorithms, are used for recommending movies, music, and products by major companies such as Netflix, Amazon, and iTunes

Deep learning

Traditional ML algorithms rely on handcrafted feature extraction, while DL algorithms use modern techniques to extract these features automatically.

For example, in a DL algorithm predicting whether an image contains a face, the first layer detects edges, the second layer detects shapes such as noses and eyes, and the final layer detects face shapes or more complex structures. Each layer is trained on the previous layer's representation of the data. It's OK if you find this explanation hard to understand; the later chapters of the book will help you to intuitively build and inspect such networks:

Visualizing the output of intermediate layers
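
As a rough sketch of this layer-by-layer idea, the following code stacks a few PyTorch layers and records the shape of each intermediate output. The layer sizes, the 32 x 32 input, and the two-class output are arbitrary choices for illustration, and the network is untrained, so it only shows how each layer consumes the previous layer's output, not what a trained face detector actually learns:

```python
import torch
from torch import nn

# A small stack of layers; each one takes the previous layer's output as input.
layers = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),   # early layer: low-level patterns
    nn.ReLU(),
    nn.MaxPool2d(2),                             # halves height and width
    nn.Conv2d(8, 16, kernel_size=3, padding=1),  # later layer: built on earlier features
    nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(16 * 8 * 8, 2),                    # final decision: two classes (assumed)
)

x = torch.randn(1, 3, 32, 32)  # one random RGB "image", 32 x 32 pixels

# Feed the input through one layer at a time and record each output shape.
shapes = []
out = x
for layer in layers:
    out = layer(out)
    shapes.append((layer.__class__.__name__, tuple(out.shape)))

for name, shape in shapes:
    print(f"{name:10s} -> {shape}")
```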

The use of DL has grown tremendously in the last few years with the rise of GPUs, big data, cloud providers such as Amazon Web Services (AWS) and Google Cloud, and frameworks such as Torch, TensorFlow, Caffe, and PyTorch. In addition to this, large companies share algorithms trained on huge datasets, thus helping startups to build state-of-the-art systems on several use cases with little effort.

Applications of deep learning

Some popular applications that were made possible using DL are as follows:

  • Near-human-level image classification
  • Near-human-level speech recognition
  • Machine translation
  • Autonomous cars
  • Voice assistants such as Siri, Google Voice, and Alexa, which have become more accurate in recent years
  • Cucumber sorting by a Japanese farmer
  • Lung cancer detection
  • Language translation beating human-level accuracy

The following screenshot shows a short example of summarization, where the computer takes a large paragraph of text and summarizes it in a few lines:

Summary of a sample paragraph generated by computer

In the following image, a computer has been given a plain image without being told what it shows. Using object detection and some help from a dictionary, it returns the image caption "two young girls are playing with lego toy". Isn't it brilliant?

Object detection and image captioning

Hype associated with deep learning

People in the media and those outside the field of AI, who are not real practitioners of AI and DL, have been suggesting that things like the story line of the film Terminator 2: Judgment Day could become reality as AI/DL advances. Some of them even talk about a time in which we will be controlled by robots, where robots decide what is good for humanity. At present, the ability of AI is exaggerated far beyond its true capabilities. Currently, most DL systems are deployed in very controlled environments and are given a limited decision boundary.

My guess is that when these systems can learn to make intelligent decisions, rather than merely performing pattern matching, and when hundreds or thousands of DL algorithms can work together, then maybe we can expect to see robots that behave like the ones we see in science fiction movies. In reality, we are nowhere close to general artificial intelligence, where machines can do anything without being told to do so. The current state of DL is more about finding patterns in existing data to predict future outcomes. As DL practitioners, we need to differentiate between signal and noise.

The history of deep learning

Though deep learning has become popular in recent years, the theory behind deep learning has been evolving since the 1950s. The following table shows some of the most popular techniques used today in DL applications and their approximate timeline:

Technique                      Approximate year
Neural networks
                               Early 1960s
Convolution Neural Networks
Recurrent neural networks
Long Short-Term Memory

Deep learning has been given several names over the years. It was called cybernetics from the 1940s to the 1960s, connectionism in the 1980s and 1990s, and now it is known as either deep learning or neural networks. We will use DL and neural networks interchangeably. Neural networks are often referred to as algorithms inspired by the workings of the human brain. However, as practitioners of DL, we need to understand that the field is mainly inspired and backed by strong theories in math (linear algebra and calculus), statistics (probability), and software engineering.

Why now?

Why has DL become so popular now? Some of the crucial reasons are as follows:

  • Hardware availability
  • Data and algorithms
  • Deep learning frameworks

Hardware availability

Deep learning requires complex mathematical operations to be performed on millions, sometimes billions, of parameters. Existing CPUs take a long time to perform these kinds of operations, although this has improved over the last several years. A different kind of hardware, the graphics processing unit (GPU), can complete these huge mathematical operations, such as matrix multiplications, orders of magnitude faster.
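
To get a feel for why matrix multiplication dominates the cost, here is a deliberately naive pure-Python version that also counts its multiply-add operations. It is an illustration only (real frameworks call heavily optimized CPU or GPU routines instead), and the function name and the tiny matrices are made up for this sketch:

```python
def matmul(a, b):
    """Multiply matrix a (rows x inner) by matrix b (inner x cols).

    Returns the product and the number of multiply-add operations performed.
    """
    rows, inner, cols = len(a), len(b), len(b[0])
    result = [[0.0] * cols for _ in range(rows)]
    multiply_adds = 0
    for i in range(rows):
        for j in range(cols):
            # Each output cell is a dot product of one row and one column.
            for k in range(inner):
                result[i][j] += a[i][k] * b[k][j]
                multiply_adds += 1
    return result, multiply_adds


a = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]]      # 2 x 3
b = [[1.0, 0.0],
     [0.0, 1.0],
     [1.0, 1.0]]           # 3 x 2

product, ops = matmul(a, b)
print(product)  # [[4.0, 5.0], [10.0, 11.0]]
print(ops)      # 12, that is rows * cols * inner
```

Each output cell is computed independently of the others, which is exactly the kind of work a GPU can spread across thousands of cores. By the same count, a single fully connected layer that maps 150,528 input values to, say, 1,000 outputs needs 150,528,000 multiply-adds for every image it processes.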

GPUs were initially built for the gaming industry by companies such as Nvidia and AMD. It turned out that this hardware is extremely efficient, not only for rendering high-quality video games, but also for speeding up DL algorithms. One recent GPU from Nvidia, the GTX 1080 Ti, takes a few days to train an image-classification system on the ImageNet dataset, which previously could have taken around a month.

If you are planning to buy hardware for running deep learning, I would recommend choosing a GPU from Nvidia based on your budget. Choose one with a good amount of memory. Remember, your computer memory and GPU memory are two different things. The GTX 1080 Ti comes with 11 GB of memory and costs around $700 at the time of writing.

You can also use various cloud providers such as AWS, Google Cloud, or Floyd (this company offers GPU machines optimized for DL). Using a cloud provider is economical if you are just starting with DL or if you are setting up machines for organization usage where you may have more financial freedom.

Performance can vary depending on how well these systems are optimized.

The following image shows some benchmarks that compare performance between CPUs and GPUs:

Performance benchmark of neural architectures on CPUs and GPUs

Data and algorithms

Data is the most important ingredient for the success of deep learning. Due to the wide adoption of the internet and the growing use of smartphones, several companies, such as Facebook and Google, have been able to collect a lot of data in various formats, particularly text, images, videos, and audio. In the field of computer vision, ImageNet competitions have played a huge role in providing datasets of 1.4 million images in 1,000 categories.

The images are hand-annotated with these categories, and every year hundreds of teams compete. Some of the architectures that were successful in the competition are VGG, ResNet, Inception, DenseNet, and many more. These architectures are used today in industry to solve various computer vision problems. Some of the other popular datasets that are often used in the deep learning space to benchmark various algorithms are as follows:

  • COCO dataset
  • The Street View House Numbers
  • Wikipedia dump
  • 20 Newsgroups
  • Penn Treebank
  • Kaggle

The development of techniques such as batch normalization, activation functions, skip connections, Long Short-Term Memory (LSTM), dropout, and many more has made it possible in recent years to train very deep networks faster and more successfully. In the coming chapters of this book, we will get into the details of each technique and how it helps in building better models.

Deep learning frameworks

In the earlier days, people needed to have expertise in C++ and CUDA to implement DL algorithms. With a lot of organizations now open sourcing their deep learning frameworks, people with knowledge of a scripting language, such as Python, can start building and using DL algorithms. Some of the popular deep learning frameworks used today in the industry are TensorFlow, Caffe2, Keras, Theano, PyTorch, Chainer, DyNet, MXNet, and CNTK.

The adoption of deep learning would not have been this huge if it had not been for these frameworks. They abstract away a lot of underlying complications and allow us to focus on the applications. We are still in the early days of DL where, with a lot of research, breakthroughs are happening every day across companies and organizations. As a result of this, various frameworks have their own pros and cons.


PyTorch

PyTorch, and most of the other deep learning frameworks, can be used for two different things:

  • Replacing NumPy-like operations with GPU-accelerated operations
  • Building deep neural networks
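
A minimal sketch of the first use follows, assuming PyTorch and NumPy are installed. The tensor sizes are arbitrary; the code runs on a GPU when one is available and falls back to the CPU otherwise:

```python
import torch

# Pick the GPU when PyTorch can see one, otherwise stay on the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Create tensors directly on the chosen device.
a = torch.rand(512, 256, device=device)
b = torch.rand(256, 128, device=device)

# NumPy-style operations, executed on whichever device holds the tensors.
c = a @ b                           # matrix multiplication
scaled = (c - c.mean()) / c.std()   # element-wise arithmetic and reductions

# Move the result back to host memory and hand it to NumPy.
as_numpy = scaled.cpu().numpy()

print(device, tuple(c.shape), as_numpy.dtype)
```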

What makes PyTorch increasingly popular is its ease of use and simplicity. Unlike most other popular deep learning frameworks, which use static computation graphs, PyTorch uses dynamic computation, which allows greater flexibility in building complex architectures.

PyTorch extensively uses Python concepts, such as classes, structures, conditionals, and loops, allowing us to build DL algorithms in a pure object-oriented fashion. Most of the other popular frameworks bring their own programming style, which sometimes makes it complex to write new algorithms and does not support intuitive debugging. In the later chapters, we will discuss computation graphs in detail.

Though PyTorch was released recently and is still in its beta version, it has become immensely popular among data scientists and deep learning researchers for its ease of use, better performance, easier-to-debug nature, and strong, growing support from various companies such as Salesforce.

As PyTorch was primarily built for research, it is not recommended for production usage in certain scenarios where latency requirements are very strict. However, this is changing with a new project called Open Neural Network Exchange (ONNX), which focuses on deploying a model developed in PyTorch to a production-ready platform such as Caffe2. At the time of writing, it is too early to say much about this project as it has only just been launched. The project is backed by Facebook and Microsoft.
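
The following sketch illustrates what this looks like in practice. `RepeatNet` is a hypothetical toy model, not something taken from a later chapter: its forward pass uses an ordinary Python `while` loop whose number of iterations depends on the data, and PyTorch still records whichever operations actually ran so that gradients can be computed:

```python
import torch
from torch import nn


class RepeatNet(nn.Module):
    """Toy model (hypothetical) that reuses one layer a data-dependent number of times."""

    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(4, 4)

    def forward(self, x, max_steps=5):
        steps = 0
        # Plain Python control flow: the graph is built as this loop runs,
        # so its depth can differ from one input to the next.
        while steps < max_steps and x.norm() < 10:
            x = torch.relu(self.linear(x)) + x
            steps += 1
        return x.sum(), steps


torch.manual_seed(0)
model = RepeatNet()
loss, steps = model(torch.ones(1, 4))

# Backpropagate through exactly the operations that were executed.
loss.backward()

print("loop iterations:", steps)
print("gradient shape:", tuple(model.linear.weight.grad.shape))
```

Because the model is ordinary Python, you can step through `forward` with a standard debugger or drop in a `print` call, which is a large part of the easier-to-debug experience.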

Throughout the rest of the book, we will learn about the various Lego blocks (smaller concepts or techniques) for building powerful DL applications in the areas of computer vision and NLP.


Summary

In this introductory chapter, we explored what artificial intelligence, machine learning, and deep learning are, and we discussed the differences among the three. We also looked at applications powered by them in our day-to-day lives. We dug deeper into why DL is only now becoming more popular. Finally, we gave a gentle introduction to PyTorch, which is a deep learning framework.

In the next chapter, we will train our first neural network in PyTorch.


Key benefits

  • Learn PyTorch for implementing cutting-edge deep learning algorithms
  • Train your neural networks for higher speed and flexibility and learn how to implement them in various scenarios
  • Cover various advanced neural network architectures such as ResNet, Inception, DenseNet, and more with practical examples


Deep learning powers the most intelligent systems in the world, such as Google Voice, Siri, and Alexa. Advancements in powerful hardware, such as GPUs, and in software frameworks, such as PyTorch, Keras, TensorFlow, and CNTK, along with the availability of big data, have made it easier to implement solutions to problems in the areas of text, vision, and advanced analytics.

This book will get you up and running with one of the most cutting-edge deep learning libraries: PyTorch. PyTorch is grabbing the attention of deep learning researchers and data science professionals due to its accessibility, its efficiency, and its being more native to the Python way of development. You'll start off by installing PyTorch, then quickly move on to learn the various fundamental blocks that power modern deep learning. You will also learn how to use CNN, RNN, LSTM, and other networks to solve real-world problems.

This book explains the concepts of various state-of-the-art deep learning architectures, such as ResNet, DenseNet, Inception, and Seq2Seq, without diving deep into the math behind them. You will also learn about GPU computing during the course of the book. You will see how to train a model with PyTorch and dive into complex neural networks such as generative networks for producing text and images. By the end of the book, you'll be able to implement deep learning applications in PyTorch with ease.

What you will learn

  • Use PyTorch for GPU-accelerated tensor computations
  • Build custom datasets and data loaders for images and test the models using torchvision and torchtext
  • Build an image classifier by implementing CNN architectures using PyTorch
  • Build systems that do text classification and language modeling using RNN, LSTM, and GRU
  • Learn advanced CNN architectures such as ResNet, Inception, and DenseNet, and learn how to use them for transfer learning
  • Learn how to mix multiple models for a powerful ensemble model
  • Generate new images using GANs and generate artistic images using style transfer

Product Details

Publication date: Feb 23, 2018
Length: 262 pages
Edition: 1st Edition
Language: English
ISBN-13: 9781788624336

Table of Contents

11 Chapters
Preface
1. Getting Started with Deep Learning Using PyTorch
2. Building Blocks of Neural Networks
3. Diving Deep into Neural Networks
4. Fundamentals of Machine Learning
5. Deep Learning for Computer Vision
6. Deep Learning with Sequence Data and Text
7. Generative Networks
8. Modern Network Architectures
9. What Next?
10. Other Books You May Enjoy
