You're reading from Artificial Intelligence with Python - Second Edition

Product type: Book
Published in: Jan 2020
Reading level: Beginner
Publisher: Packt
ISBN-13: 9781839219535
Edition: 2nd Edition
Languages: Python
Tools: TensorFlow
Concepts: Artificial Intelligence
Author: Prateek Joshi

Recurrent Neural Networks and Other Deep Learning Models

In this chapter, we are going to learn about deep learning and Recurrent Neural Networks (RNNs). Like the CNNs covered in previous chapters, RNNs have gained a lot of momentum over the last few years. They are heavily used in speech recognition, and many of today's chatbots are built on RNN technologies. There has also been some success in predicting financial markets using RNNs. A typical task looks like this: given a text made up of a sequence of words, predict the next word in the sequence.

We will discuss the architecture of RNNs and their components. We will continue using TensorFlow, which we started learning about in the previous chapter, to quickly build RNNs. We will also learn how to build an RNN classifier using a single-layer neural network. We will then build an image classifier using a CNN.

By the end of this chapter...

The basics of Recurrent Neural Networks

RNNs are another type of popular model that is currently gaining a lot of traction. As we discussed in Chapter 1, Introduction to Artificial Intelligence, the study of neural networks in general and RNNs in particular is the domain of the connectionist tribe (as described in Pedro Domingos' AI classification). RNNs are frequently used to tackle Natural Language Processing (NLP) and Natural Language Understanding (NLU) problems.

The math behind RNNs can be overwhelming at times. Before we get into the nitty-gritty of RNNs, keep this thought in mind: a race car driver does not need to fully understand the mechanics of their car to make it go fast and win races. Similarly, we don't necessarily need to fully understand how RNNs work under the hood to make them do useful and sometimes impressive work for us. François Chollet, the creator of the Keras library, describes Long Short-Term Memory (LSTM) networks – which are a form...

Architecture of RNNs

The main concept behind an RNN is to take advantage of previous information in a sequence. In a traditional neural network, it is assumed that all inputs and outputs are independent of one another. In some domains and use cases, this assumption does not hold: each element of a sequence depends on the elements that came before it, and we can take advantage of that dependence.
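To make the idea of reusing previous information concrete, here is a minimal NumPy sketch of a vanilla recurrent layer. It assumes the standard textbook recurrence h_t = tanh(W_x x_t + W_h h_{t-1} + b); the function name `rnn_forward`, the weight names, and the sizes are illustrative choices, not taken from this chapter.

```python
import numpy as np

def rnn_forward(inputs, W_x, W_h, b):
    """Run a vanilla RNN over a sequence and return every hidden state."""
    hidden_size = W_h.shape[0]
    h = np.zeros(hidden_size)  # initial state: no history yet
    states = []
    for x_t in inputs:
        # The new state mixes the current input with the previous state,
        # so information from earlier steps is carried forward.
        h = np.tanh(W_x @ x_t + W_h @ h + b)
        states.append(h)
    return np.array(states)

# Hypothetical sizes and random weights, for illustration only
rng = np.random.default_rng(0)
input_size, hidden_size, seq_len = 3, 4, 5
W_x = rng.normal(scale=0.5, size=(hidden_size, input_size))
W_h = rng.normal(scale=0.5, size=(hidden_size, hidden_size))
b = np.zeros(hidden_size)
sequence = rng.normal(size=(seq_len, input_size))

states = rnn_forward(sequence, W_x, W_h, b)
print(states.shape)  # (5, 4): one hidden state per time step
```

Because each state is computed from the one before it, changing the first input of the sequence also changes the last hidden state. A feed-forward network applied to each input independently would not show this behavior.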

I will use a personal example. I believe that in many cases, I can predict what my wife will say next based on a couple of initial sentences. I tend to believe that I have a high accuracy rate with my predictive ability. That said, if you ask my wife, she may tell you a quite different story! A similar concept is being used by Google's email service, Gmail. If you are a user of the service, you will have noticed that, from 2019, it started making suggestions when it thinks it can complete a sentence. If it guesses right, all you do is hit the Tab key and the sentence is completed. If it doesn't, you can continue typing and it might...

A language modeling use case

Our goal is to build a language model using an RNN. Here's what that means. Let's say we have a sentence of m words, w_1, ..., w_m. A language model allows us to predict the probability of observing the sentence (in a given dataset) as:

$$P(w_1, \ldots, w_m) = \prod_{i=1}^{m} P(w_i \mid w_1, \ldots, w_{i-1})$$

In words, the probability of a sentence is the product of probabilities of each word given the words that came before it. So, the probability of the sentence "Please let me know if you have any questions" would be the probability of "questions" given "Please let me know if you have any..." multiplied by the probability of "any" given "Please let me know if you have..." and so on.
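The product described above can be sketched in a few lines of Python. The conditional probabilities below are made-up numbers for illustration only; in practice they would come from a trained model. The sketch also computes the sum of log-probabilities, a common way to avoid numerical underflow when many small probabilities are multiplied.

```python
import math

# Hypothetical values of P(word | all preceding words) for each word in
# "Please let me know if you have any questions". Not real model output.
conditional_probs = [
    ("Please", 0.05),
    ("let", 0.30),
    ("me", 0.80),
    ("know", 0.70),
    ("if", 0.60),
    ("you", 0.75),
    ("have", 0.50),
    ("any", 0.40),
    ("questions", 0.45),
]

def sentence_probability(word_probs):
    """Multiply the per-word conditional probabilities together."""
    return math.prod(p for _, p in word_probs)

def sentence_log_probability(word_probs):
    """Equivalent computation in log space: a sum instead of a product."""
    return sum(math.log(p) for _, p in word_probs)

prob = sentence_probability(conditional_probs)
log_prob = sentence_log_probability(conditional_probs)
print(prob, math.exp(log_prob))  # the two values agree up to rounding
```

Note how quickly the product shrinks even for a nine-word sentence, which is why implementations usually work with log-probabilities.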

How is that useful? Why is it important to assign a probability to the observation of a given sentence?

First, a model like this can be used as a scoring mechanism. A language model can be used to pick the most probable next word. Intuitively, the most probable next word is likely...
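As a rough illustration of both uses, the sketch below picks the most probable next word from a distribution and ranks two candidate sentences by score. All probabilities and the helper name `score` are hypothetical; a real language model would supply the numbers.

```python
import math

# Hypothetical probabilities for the word following
# "Please let me know if you have any". Only a few candidates are listed.
next_word_probs = {
    "questions": 0.45,
    "concerns": 0.25,
    "doubts": 0.10,
    "bananas": 0.001,
}

# Picking the most probable next word is an argmax over the distribution.
best_word = max(next_word_probs, key=next_word_probs.get)
print(best_word)  # questions

def score(word_probs):
    """Score a sentence as the sum of its per-word log-probabilities."""
    return sum(math.log(p) for p in word_probs)

# The two candidates share a (made-up) prefix and differ in the last word.
prefix_probs = [0.05, 0.30, 0.80, 0.70, 0.60, 0.75, 0.50, 0.40]
candidates = {
    "Please let me know if you have any questions":
        prefix_probs + [next_word_probs["questions"]],
    "Please let me know if you have any bananas":
        prefix_probs + [next_word_probs["bananas"]],
}
ranked = sorted(candidates, key=lambda s: score(candidates[s]), reverse=True)
print(ranked[0])  # the candidate ending in "questions" scores higher
```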

Training an RNN

As we discussed at the beginning of the chapter, RNNs are applied in a wide variety of industries. Here, we will work through one quick example to get a firmer grasp of the basic mechanics of RNNs.

The input data that we will be trying to model with our RNN is the mathematical cosine function.

First, let's define our input data and store it in a NumPy array.

import math
import numpy as np
import matplotlib.pyplot as plt
# Sample cos(x) at the integers x = 0, 1, ..., 199
input_data = np.array([math.cos(x) for x in np.arange(200)])
# Plot the first 50 samples
plt.plot(input_data[:50])
# show is a function, so it must be called to display the figure
plt.show()

The preceding code plots the first 50 samples so we can visualize what our input data looks like. You should get an output like this:

Figure 8: Visualization of input data

Let's now split the input data into two sets so we can use one portion for training and another portion for validation. Perhaps not the optimal split from a training standpoint, but to keep...
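A split like this can be sketched as follows. The 50/50 ratio and the names `train_fraction`, `train_data`, and `valid_data` are assumptions chosen for illustration; the chapter's actual split may differ. The data is cut at a single index rather than shuffled, so the time order of the sequence is preserved.

```python
import math
import numpy as np

# Same input data as defined earlier in this section
input_data = np.array([math.cos(x) for x in np.arange(200)])

# Hypothetical split ratio, not taken from the chapter
train_fraction = 0.5
split_index = int(len(input_data) * train_fraction)

# Earlier samples are used for training, later samples for validation
train_data = input_data[:split_index]
valid_data = input_data[split_index:]

print(len(train_data), len(valid_data))  # 100 100
```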

Summary

In this chapter, we continued to learn about deep learning and covered the basics of RNNs. We then discussed the basic concepts of the RNN architecture and why these concepts are important. After learning the basics, we looked at some of the potential uses of RNNs and settled on using them to implement a language model. Initially, we implemented the language model using basic techniques, and then we added more and more complexity to the model to understand higher-level concepts.

We hope you are as excited as we are to go to the next chapter where we will learn how to create intelligent agents using reinforcement learning.

Prateek Joshi is the founder of Plutoshift and a published author of 9 books on Artificial Intelligence. He has been featured on Forbes 30 Under 30, NBC, Bloomberg, CNBC, TechCrunch, and The Business Journals. He has been an invited speaker at conferences such as TEDx, Global Big Data Conference, Machine Learning Developers Conference, and Silicon Valley Deep Learning. Apart from Artificial Intelligence, some of the topics that excite him are number theory, cryptography, and quantum computing. His greater goal is to make Artificial Intelligence accessible to everyone so that it can impact billions of people around the world.