6. Neural Networks and Deep Learning

Overview

In this chapter, you will be introduced to the final topic of this book: neural networks and deep learning. You will learn about TensorFlow, Convolutional Neural Networks (CNNs), and Recurrent Neural Networks (RNNs). You will use key deep learning concepts to determine the creditworthiness of individuals and to predict housing prices in a neighborhood. Later on, you will also implement an image classification program using the skills you have learned. By the end of this chapter, you will have a firm grasp of the concepts of neural networks and deep learning.

Introduction

In the previous chapter, we learned about clustering problems and saw several algorithms, such as k-means, that can group similar data points together automatically. In this chapter, we will learn about neural networks and deep learning networks.

The difference between neural networks and deep learning networks is the complexity and depth of the networks. Traditionally, neural networks have only one hidden layer, while deep learning networks have more than one.

Although we will use neural networks and deep learning for supervised learning, note that neural networks can also model unsupervised learning techniques. This kind of model was actually quite popular in the 1980s, but because the computation power available at the time was limited, it's only recently that such models have been widely adopted. With the democratization of Graphics Processing Units (GPUs) and cloud computing, we now have access to a tremendous amount of computation power. This...

Artificial Neurons

Artificial Neural Networks (ANNs), as the name implies, try to replicate how a human brain works, and more specifically how neurons work.

A neuron is a cell in the brain that communicates with other cells via electrical signals. Neurons can respond to stimuli such as sound, light, and touch. They can also trigger actions such as muscle contractions. The human cerebral cortex alone contains an estimated 10 to 20 billion neurons. That's a pretty huge network, right? This is the reason why humans can achieve so many amazing things. It is also why researchers have tried to emulate how the brain operates and, in doing so, created ANNs.

ANNs are composed of multiple artificial neurons that connect to each other and form a network. An artificial neuron is simply a processing unit that performs mathematical operations on some inputs (x1, x2, …, xn) and returns the final results (y) to the next unit, as shown here:

Figure 6.1: Representation of an artificial neuron
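
To make this concrete, here is a minimal NumPy sketch of a single artificial neuron: it computes a weighted sum of its inputs plus a bias, then applies an activation function (the sigmoid, which is covered later in this chapter). The input values, weights, and bias below are hypothetical:

```python
import numpy as np

def artificial_neuron(x, w, b):
    """A single processing unit: a weighted sum of the inputs
    plus a bias, passed through a sigmoid activation."""
    z = np.dot(w, x) + b           # weighted sum of the inputs
    return 1 / (1 + np.exp(-z))    # sigmoid activation (covered later)

# Hypothetical inputs (x1, x2, x3), weights, and bias
x = np.array([0.5, -1.2, 3.0])
w = np.array([0.4, 0.3, -0.2])
y = artificial_neuron(x, w, b=0.1)
print(y)  # the final result passed on to the next unit
```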

Neurons in TensorFlow

TensorFlow is currently the most popular neural network and deep learning framework. It was created and is maintained by Google. TensorFlow is used for voice recognition and voice search, and it is also the brain behind translate.google.com. Later in this chapter, we will use TensorFlow to recognize written characters.

The TensorFlow API is available in many languages, including Python, JavaScript, Java, and C. TensorFlow works with tensors. You can think of a tensor as a container composed of a matrix (usually with high dimensions) and additional information related to the operations it will perform (such as weights and biases, which you will be looking at later in this chapter). A tensor with no dimensions (rank 0) is a scalar. A tensor of rank 1 is a vector, rank 2 tensors are matrices, and a rank 3 tensor is a three-dimensional matrix. The rank indicates the number of dimensions of a tensor. In this chapter, we will be looking at tensors of ranks 2 and...
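
As a quick illustration of ranks, here is a short sketch (assuming TensorFlow 2.x) that builds tensors of rank 0 through 3 and prints the rank and shape of each:

```python
import tensorflow as tf

scalar = tf.constant(3.0)                       # rank 0: a scalar
vector = tf.constant([1.0, 2.0, 3.0])           # rank 1: a vector
matrix = tf.constant([[1.0, 2.0],
                      [3.0, 4.0]])              # rank 2: a matrix
cube = tf.zeros([2, 3, 4])                      # rank 3: a 3D matrix

for tensor in (scalar, vector, matrix, cube):
    print(tensor.ndim, tensor.shape)
```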

Neural Network Architecture

Neural networks are one of the most prominent branches of Artificial Intelligence (AI). They are inspired by how the human brain works and were invented in the 1940s by Warren McCulloch and Walter Pitts. Their neural network was a mathematical model describing how the human brain might solve problems.

We will use the term ANN to refer to the mathematical model, and biological neural network when talking about the human brain.

The way a neural network learns is more complex compared to other classification or regression models. The neural network model has a lot of internal variables, and the relationship between the input and output variables may involve multiple internal layers. On many problems, neural networks achieve higher accuracy than other supervised learning algorithms.

Note

Mastering neural networks with TensorFlow is a complex process. The purpose of this section is to provide you with an introductory resource to get started.

In this...

Activation Functions

As seen previously, a single neuron needs to perform a transformation by applying an activation function. Different activation functions can be used in neural networks. Without these functions, a neural network would simply be a linear model that could easily be described using matrix multiplication.

The activation function of a neuron provides non-linearity, which lets the network model more complex patterns. Two very common activation functions are sigmoid and tanh (the hyperbolic tangent function).

Sigmoid

The formula of sigmoid is as follows:

sigmoid(x) = 1 / (1 + e^(-x))

Figure 6.4: The sigmoid formula

The output values of a sigmoid function range from 0 to 1. This activation function is usually used at the last layer of a neural network for a binary classification problem.

Tanh

The formula of the hyperbolic tangent is as follows:

tanh(x) = (e^x - e^(-x)) / (e^x + e^(-x))

Figure 6.5: The tanh formula

The tanh activation function is very similar to the sigmoid function...
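
Both functions are easy to compute directly. The following minimal NumPy sketch evaluates them on a few sample values to show their output ranges: sigmoid stays within (0, 1), while tanh stays within (-1, 1):

```python
import numpy as np

def sigmoid(x):
    # Squashes any real value into the (0, 1) range
    return 1 / (1 + np.exp(-x))

xs = np.array([-2.0, 0.0, 2.0])
print(sigmoid(xs))   # approx. [0.12, 0.5, 0.88]
print(np.tanh(xs))   # approx. [-0.96, 0.0, 0.96]
```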

Forward Propagation and the Loss Function

So far, we have seen how a neuron can take an input and perform some mathematical operations on it and get an output. We learned that a neural network is a combination of multiple layers of neurons.

The process of transforming the inputs of a neural network into a result is called forward propagation (or the forward pass). What we are asking the neural network to do is to make a prediction (the final output of the neural network) by applying multiple neurons to the input data:

Figure 6.11: Figure showing forward propagation

The neural network relies on the weight matrices, biases, and activation function of each neuron to calculate the predicted output value. For now, let's assume the values of the weight matrices and biases are set in advance. The activation functions are defined when you design the architecture of the neural network.
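
As an illustration, here is a minimal NumPy sketch of forward propagation through a network with one hidden layer. The weight matrices and biases are hypothetical, randomly initialized values standing in for parameters that are set in advance:

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# Hypothetical parameters: 3 inputs, a hidden layer of 4 neurons,
# and a single output neuron
W1, b1 = np.random.randn(4, 3), np.zeros(4)
W2, b2 = np.random.randn(1, 4), np.zeros(1)

def forward(x):
    hidden = sigmoid(W1 @ x + b1)      # hidden layer activations
    return sigmoid(W2 @ hidden + b2)   # predicted output

x = np.array([0.5, -1.0, 2.0])
print(forward(x))
```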

As for any supervised machine learning algorithm, the goal is...

Backpropagation

Previously, we learned how a neural network makes predictions by using weight matrices and biases (we can combine them into a single matrix) from its neurons. Using the loss function, a network determines how good or bad the predictions are. It would be great if it could use this information and update the parameters accordingly. This is exactly what backpropagation is about: optimizing a neural network's parameters.

Training a neural network involves executing forward propagation and backpropagation multiple times in order to make predictions and update the parameters from the errors. During the first pass (or propagation), we start by initializing all the weights of the neural network. Then, we apply forward propagation, followed by backpropagation, which updates the weights.

We apply this process several times and the neural network will optimize its parameters iteratively. You can decide to stop this learning process by setting the maximum number of...
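
The following sketch (assuming TensorFlow 2.x; the toy data and step count are hypothetical) shows one way to make this loop explicit: each iteration runs forward propagation, measures the loss, and backpropagates to update the weights:

```python
import tensorflow as tf

model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(2,))])
loss_fn = tf.keras.losses.MeanSquaredError()
optimizer = tf.keras.optimizers.SGD(learning_rate=0.1)

x = tf.constant([[1.0, 2.0]])   # toy input
y = tf.constant([[3.0]])        # toy target

for step in range(5):
    with tf.GradientTape() as tape:
        y_pred = model(x)            # forward propagation
        loss = loss_fn(y, y_pred)    # how good or bad the prediction is
    # Backpropagation: compute gradients and update the parameters
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    print(step, float(loss))
```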

Optimizers and the Learning Rate

In the previous section, we saw that a neural network follows an iterative process to find the best solution for any input dataset. Its learning process is an optimization process. You can use different optimization algorithms (also called optimizers) for a neural network. The most popular ones are Adam, SGD, and RMSprop.

One important parameter for a neural network's optimizer is the learning rate. This value defines how quickly the neural network updates its weights. Setting the learning rate too low will slow down the learning process, and the neural network will take a long time to find the right parameters. On the other hand, setting the learning rate too high can prevent the neural network from learning a solution, as it makes bigger weight changes than required. A good practice is to start with a not-too-small learning rate (such as 0.01 or 0.001), then stop the neural network's training once its score starts to plateau or get worse, and...
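
In practice, the optimizer and its learning rate are chosen when the model is compiled. Here is a minimal sketch, assuming TensorFlow 2.x with the Keras API; the architecture and values are hypothetical:

```python
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Dense(8, activation='relu', input_shape=(10,)),
    tf.keras.layers.Dense(1)
])

# Adam with a common starting learning rate; SGD or RMSprop
# can be swapped in the same way
optimizer = tf.keras.optimizers.Adam(learning_rate=0.001)
model.compile(optimizer=optimizer, loss='mse')
```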

Regularization

As with any machine learning algorithm, neural networks can face the problem of overfitting when they learn patterns that are only relevant to the training set. In such a case, the model will not be able to generalize to unseen data.

Luckily, there are multiple techniques that can help reduce the risk of overfitting:

  • L1 regularization, which adds a penalty parameter (absolute value of the weights) to the loss function
  • L2 regularization, which adds a penalty parameter (squared value of the weights) to the loss function
  • Early stopping, which stops the training if the error for the validation set increases while the error decreases for the training set
  • Dropout, which will randomly remove some neurons during training

All these techniques can be added at each layer of a neural network we create. We will be looking at this in the next exercise.
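
As a preview, here is a hedged sketch (assuming TensorFlow 2.x with the Keras API; the layer sizes and values are hypothetical) showing how these techniques attach to individual layers, with early stopping added as a callback at training time:

```python
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Dense(
        64, activation='relu', input_shape=(13,),
        kernel_regularizer=tf.keras.regularizers.l2(0.01)),  # L2 penalty
    tf.keras.layers.Dropout(0.2),  # randomly drops 20% of neurons in training
    tf.keras.layers.Dense(1)
])
model.compile(optimizer='adam', loss='mse')

# Early stopping: halt training when the validation error stops improving
early_stop = tf.keras.callbacks.EarlyStopping(
    monitor='val_loss', patience=5, restore_best_weights=True)
# X_train and y_train are placeholders for your training data:
# model.fit(X_train, y_train, validation_split=0.2, callbacks=[early_stop])
```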

Exercise 6.04: Predicting Boston House Prices with Regularization

In this exercise, you will...

Deep Learning

Now that we are comfortable building and training a neural network with one hidden layer, we can look at the more complex architectures of deep learning.

Deep learning is an extension of traditional neural networks with deeper and more complex architectures. Deep learning can model very complex patterns and can be applied to tasks such as detecting objects in images and translating text into a different language.

Shallow versus Deep Networks

As mentioned earlier, we can add more hidden layers to a neural network. This will increase the number of parameters to be learned but can potentially help to model more complex patterns. This is what deep learning is about: increasing the depth of a neural network to tackle more complex problems.

For instance, we can add a second layer to the neural network we presented...
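
To see what this looks like in code, here is a minimal sketch (assuming TensorFlow 2.x with the Keras API; the layer sizes are hypothetical) contrasting a shallow network with a deeper one, and printing how many parameters each has to learn:

```python
import tensorflow as tf

# Shallow: a single hidden layer
shallow = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation='relu', input_shape=(10,)),
    tf.keras.layers.Dense(1)
])

# Deep: extra hidden layers add more parameters to be learned,
# but can model more complex patterns
deep = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation='relu', input_shape=(10,)),
    tf.keras.layers.Dense(16, activation='relu'),
    tf.keras.layers.Dense(16, activation='relu'),
    tf.keras.layers.Dense(1)
])

print(shallow.count_params(), deep.count_params())
```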

Computer Vision and Image Classification

Deep learning has achieved amazing results in computer vision and natural language processing. Computer vision is a field that involves analyzing digital images. A digital image is a matrix composed of pixels. Each pixel has a value between 0 and 255, which represents its intensity. An image can be black and white and have only one channel. But it can also be in color, in which case it has three channels for the colors red, green, and blue. It is this digital version of an image that can be fed to a deep learning model.
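
The following minimal NumPy sketch illustrates this representation with hypothetical 28 x 28 images: a grayscale image has a single channel, while a color image has three (red, green, and blue), with pixel intensities running from 0 to 255:

```python
import numpy as np

# A black-and-white image: one channel, intensities from 0 to 255
gray = np.random.randint(0, 256, size=(28, 28), dtype=np.uint8)

# A color image: three channels (red, green, blue)
color = np.random.randint(0, 256, size=(28, 28, 3), dtype=np.uint8)

print(gray.shape, color.shape)   # (28, 28) (28, 28, 3)

# Pixel values are usually scaled to [0, 1] before being fed to a model
scaled = color.astype(np.float32) / 255.0
```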

There are multiple applications of computer vision, such as image classification (recognizing the main object in an image), object detection (localizing different objects in an image), and image segmentation (finding the edges of objects in an image). In this book, we will only look at image classification.

In the next section, we will look at a specific type of architecture: CNNs.

Convolutional Neural Networks (CNNs)

Recurrent Neural Networks (RNNs)

In the previous section, we learned how we can use CNNs for computer vision tasks such as classifying images. With deep learning, computers are now capable of matching, and sometimes surpassing, human performance. Another field that is attracting a lot of interest from researchers is natural language processing. This is a field where RNNs excel.

In the last few years, we have seen many different applications of RNN technology, such as speech recognition, chatbots, and text translation. RNNs also perform well at predicting time series patterns, which makes them useful for forecasting stock markets.

RNN Layers

The common point among all the applications mentioned earlier is that the inputs are sequential: there is a time component to the input. For instance, a sentence is a sequence of words where the order matters, and stock market data consists of a sequence of dates with corresponding stock prices.
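
As a preview of how such sequences are fed to an RNN, here is a hedged sketch (assuming TensorFlow 2.x with the Keras API; the sequence length and layer sizes are hypothetical) of a model that reads 30 time steps of a single value, such as a daily stock price, and predicts the next one:

```python
import tensorflow as tf

model = tf.keras.Sequential([
    # Input: sequences of 30 time steps with 1 feature per step
    tf.keras.layers.SimpleRNN(32, input_shape=(30, 1)),
    tf.keras.layers.Dense(1)   # predict the next value in the sequence
])
model.compile(optimizer='adam', loss='mse')
model.summary()
```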

To accommodate...

Summary

We have now completed The Applied Artificial Intelligence Workshop. In this workshop, we learned about the fundamentals of AI and its applications. We wrote a Python program to play tic-tac-toe. We learned about search techniques, such as breadth-first search and depth-first search, and how they can help us solve the tic-tac-toe game.

In the following chapters, we learned about supervised learning using regression and classification. These chapters covered data preprocessing, train-test splitting, and models that were used in several real-life scenarios. Linear regression, polynomial regression, and support vector machines all came in handy when it came to predicting stock data. Classification was performed using k-nearest neighbors and support vector classifiers. Several activities helped you apply the basics of classification to an interesting real-life use case: credit scoring.

In Chapter 4, An Introduction...
