You're reading from Hands-On Natural Language Processing with PyTorch 1.x

Product type: Book

Published in: Jul 2020

Reading level: Beginner

Publisher: Packt

ISBN-13: 9781789802740

Edition: 1st Edition

Languages

Python

Tools

PyTorch

Concepts

Mobile Application Development

Author (1)

Thomas Dop

Chapter 2: Getting Started with PyTorch 1.x for NLP

PyTorch is a Python-based machine learning library. It offers two main features: efficient tensor operations with hardware acceleration (using GPUs) and the ability to build deep neural networks. PyTorch also uses dynamic computational graphs instead of static ones, which has traditionally set it apart from similar libraries such as TensorFlow. By demonstrating how language can be represented using tensors and how neural networks can be used to learn from language data, we will show that both these features are particularly useful for natural language processing.

In this chapter, we will show you how to get PyTorch up and running on your computer, as well as demonstrate some of its key functionalities. We will then compare PyTorch to some other deep learning frameworks, before exploring some of the NLP functionality of PyTorch, such as its ability to perform tensor operations, and finally demonstrate how to build a simple neural...

Technical requirements

For this chapter, Python needs to be installed. It is recommended to use the latest version of Python (3.6 or higher). It is also recommended to use the Anaconda package manager to install PyTorch. A CUDA-compatible GPU is required to run tensor operations on a GPU. All the code for this chapter can be found at https://github.com/PacktPublishing/Hands-On-Natural-Language-Processing-with-PyTorch-1.x.

Installing and using PyTorch 1.x

Like most Python packages, PyTorch is simple to install. There are two main ways of doing so. The first is to install it using pip from the command line:

```
pip install torch torchvision
```

While this installation method is quick, it is recommended to install using Anaconda instead, as this includes all the required dependencies and binaries for PyTorch to run. Anaconda is also the route used later to enable training models on a GPU using CUDA. Note that the conda package is named pytorch rather than torch, so PyTorch can be installed through Anaconda by entering the following in the command line:

```
conda install pytorch torchvision -c pytorch
```

To check that PyTorch is working correctly, we can open a Jupyter Notebook and run a few simple commands:

To define a Tensor in PyTorch, we can do the following:
```
import torch
x = torch.tensor([1.,2.])
print(x)
```
This results in the following output, tensor([1., 2.]):
Figure 2.1 – Tensor output
This shows that tensors within PyTorch...
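
As a further quick check, the following minimal sketch confirms that the package imports and that basic tensor arithmetic behaves as expected. The variable names are illustrative and are not taken from the book.

```python
import torch

# Report the installed version (the exact string depends on your install).
print(torch.__version__)

# Create two small float tensors and combine them element-wise.
x = torch.tensor([1., 2.])
y = torch.tensor([3., 4.])
total = x + y          # element-wise addition
product = x * y        # element-wise multiplication
dot = torch.dot(x, y)  # 1*3 + 2*4

print(total)    # tensor([4., 6.])
print(product)  # tensor([3., 8.])
print(dot)      # tensor(11.)
```

If these lines run without an import error, the installation is most likely working.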

Enabling PyTorch acceleration using CUDA

One of the main benefits of PyTorch is its ability to accelerate computation through the use of a graphics processing unit (GPU). Deep learning workloads are highly parallelizable, meaning that the calculations can be broken down into smaller tasks and spread across many smaller processors. This means that instead of executing the task on a single CPU, it is often more efficient to perform the calculation on a GPU.

GPUs were originally created to efficiently render graphics, but since deep learning has grown in popularity, GPUs have been frequently used for their ability to perform many calculations simultaneously. While a traditional CPU may consist of around four or eight cores, a GPU consists of hundreds or even thousands of smaller cores. Because calculations can be executed across all these cores simultaneously, GPUs can greatly reduce the time taken to perform deep learning tasks.

Consider a single pass within a neural network...
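
In code, using the GPU largely comes down to choosing a device and placing tensors on it. The following is a minimal sketch that falls back to the CPU when no CUDA device is available. The matrix sizes and variable names are arbitrary choices for illustration.

```python
import torch

# Use the GPU when a CUDA device is visible, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Allocate two matrices directly on the chosen device.
a = torch.rand(256, 256, device=device)
b = torch.rand(256, 256, device=device)

# The matrix product is computed on whichever device holds the inputs.
c = a @ b

print(device.type)  # "cuda" or "cpu", depending on the machine
print(c.shape)
```

Writing code against a device variable like this keeps the same script usable on machines with and without a GPU.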

Comparing PyTorch to other deep learning frameworks

PyTorch is one of the main frameworks used in deep learning today. There are other widely used frameworks available too, such as TensorFlow, Theano, and Caffe. While these are very similar in many ways, there are some key differences in how they operate. These include the following:

How the models are computed
The way in which the computational graphs are compiled
The ability to create dynamic computational graphs with variable layers
Differences in syntax

Arguably, the main difference between PyTorch and other frameworks is in the way that the models themselves are computed. PyTorch uses an automatic differentiation engine called autograd, which builds the computational graph dynamically as operations are executed. This is in contrast to static-graph frameworks such as TensorFlow 1.x (TensorFlow 2.x executes eagerly by default). In these static frameworks, computational graphs must be defined and compiled before finally being executed...
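
The practical effect of a dynamic graph can be sketched with ordinary Python control flow. In the example below, the function `piecewise` is a hypothetical name chosen for illustration. Which branch runs depends on the input value, and autograd differentiates whichever graph was actually built.

```python
import torch

def piecewise(x):
    # Ordinary Python control flow decides which operations are recorded,
    # so the graph can differ from one call to the next.
    if x.item() > 0:
        return x ** 2
    return -3 * x

x = torch.tensor(3.0, requires_grad=True)
y = piecewise(x)   # takes the x ** 2 branch
y.backward()       # autograd differentiates the graph that was just built
print(x.grad)      # tensor(6.)

z = torch.tensor(-2.0, requires_grad=True)
w = piecewise(z)   # takes the -3 * x branch
w.backward()
print(z.grad)      # tensor(-3.)
```

No separate compilation step is needed between the two calls, even though they produce different graphs.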

Building a simple neural network in PyTorch

We will now walk through building a neural network from scratch in PyTorch. Here, we have a small .csv file containing several examples of images from the MNIST dataset. The MNIST dataset consists of a collection of hand-drawn digits between 0 and 9 that we want to attempt to classify. The following is an example from the MNIST dataset, consisting of a hand-drawn digit 1:

Figure 2.11 – Sample image from the MNIST dataset

These images are 28x28 in size: 784 pixels in total. Our dataset in train.csv consists of 1,000 of these images, with each consisting of 784 pixel values, as well as the correct classification of the digit (in this case, 1).
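
To make these shapes concrete, here is a minimal sketch of a classifier that maps 784 pixel values to 10 digit scores. The class name, the single hidden layer, and its size of 128 units are assumptions made for illustration and are not necessarily the architecture used in the book.

```python
import torch
import torch.nn as nn

class SimpleClassifier(nn.Module):
    # Hypothetical architecture: one hidden layer of 128 units.
    def __init__(self, n_pixels=784, n_hidden=128, n_classes=10):
        super().__init__()
        self.fc1 = nn.Linear(n_pixels, n_hidden)
        self.fc2 = nn.Linear(n_hidden, n_classes)

    def forward(self, x):
        x = x.view(x.shape[0], -1)      # flatten each image to 784 values
        x = torch.relu(self.fc1(x))
        return self.fc2(x)              # raw scores (logits), one per digit

model = SimpleClassifier()
batch = torch.rand(4, 28, 28)           # stand-in for four 28x28 images
logits = model(batch)
print(logits.shape)                     # torch.Size([4, 10])
```

Each row of the output holds one score per digit class, and the largest score gives the predicted digit.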

Loading the data

We will begin by loading our training dataset, as follows:

```
import pandas as pd

train = pd.read_csv("train.csv")
train_labels = train['label'].values
train = train.drop("label",axis=1...
```
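
Because train.csv is not reproduced here, the following sketch builds a small synthetic DataFrame with the same layout (a label column followed by 784 pixel columns) and shows one way to turn it into tensors. The column names, the random pixel values, and the scaling to the range 0 to 1 are assumptions for illustration only.

```python
import numpy as np
import pandas as pd
import torch

# Synthetic stand-in for train.csv: a "label" column plus 784 pixel columns.
rng = np.random.default_rng(0)
pixels = rng.integers(0, 256, size=(5, 784))
frame = pd.DataFrame(pixels, columns=[f"pixel{i}" for i in range(784)])
frame.insert(0, "label", [1, 0, 4, 1, 9])

# Separate the labels from the pixel values.
labels = frame["label"].values
images = frame.drop("label", axis=1).values

# Convert to tensors; scaling to [0, 1] is a common choice, assumed here.
X = torch.tensor(images, dtype=torch.float32) / 255.0
y = torch.tensor(labels, dtype=torch.long)

print(X.shape, y.shape)  # torch.Size([5, 784]) torch.Size([5])
```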

NLP for PyTorch

Now that we have learned how to build neural networks, we will see how it is possible to build models for NLP using PyTorch. In this example, we will create a basic bag-of-words classifier in order to classify the language of a given sentence.

Setting up the classifier

For this example, we'll take a selection of sentences in Spanish and English:

First, we split each sentence into a list of words and take the language of each sentence as a label. We take one portion of the sentences to train our model on and set a small portion aside as our test set, so that we can evaluate the performance of our model after it has been trained:
```
# (tokens, label) pairs; the list name here is illustrative
data = [
    ("This is my favourite chapter".lower().split(), "English"),
    ("Estoy en la biblioteca".lower().split(), "Spanish"),
]
```
Note that we also transform each word into lowercase, which stops words being double counted in our bag-of-words. If we have the word book and the word Book...
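
The bag-of-words representation itself can be sketched as follows. This is a minimal illustration built on the two example sentences above. The names `data`, `word_to_index`, and `bag_of_words` are hypothetical, and the sketch assumes every word passed in is already in the vocabulary.

```python
import torch

data = [
    ("This is my favourite chapter".lower().split(), "English"),
    ("Estoy en la biblioteca".lower().split(), "Spanish"),
]

# Assign each distinct word an index in order of first appearance.
word_to_index = {}
for sentence, _ in data:
    for word in sentence:
        if word not in word_to_index:
            word_to_index[word] = len(word_to_index)

def bag_of_words(sentence, word_to_index):
    # Count vector with one slot per vocabulary word.
    # Assumes every word is in the vocabulary (unknown words raise KeyError).
    vector = torch.zeros(len(word_to_index))
    for word in sentence:
        vector[word_to_index[word]] += 1
    return vector

print(len(word_to_index))  # 9
print(bag_of_words("la biblioteca".split(), word_to_index))
```

Because the sentences are lowercased before the vocabulary is built, Book and book would share a single index rather than occupying two slots.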

Summary

In this chapter, we introduced PyTorch and some of its key features. Hopefully, you now have a better understanding of how PyTorch differs from other deep learning frameworks and how it can be used to build basic neural networks. While these simple examples are just the tip of the iceberg, we have illustrated that PyTorch is an immensely powerful tool for NLP analysis and learning.

In future chapters, we will demonstrate how the unique properties of PyTorch can be utilized to build highly sophisticated models for solving very complex machine learning tasks.


Author (1)

Thomas Dop

Thomas Dop is a data scientist at MagicLab, a company that creates leading dating apps, including Bumble and Badoo. He works on a variety of areas within data science, including NLP, deep learning, computer vision, and predictive modeling. He holds an MSc in data science from the University of Amsterdam.