Training a neural network

The process of building a neural network using a given dataset is called training a neural network. Let's look into the anatomy of a typical neural network. When we talk about training a neural network, we are talking about calculating the best values for the weights. The training is done iteratively by using a set of examples in the form of training data. The examples in the training data have the expected values of the output for different combinations of input values. The training process for neural networks is different from the way traditional models are trained (which were discussed in Chapter 7, Traditional Supervised Learning Algorithms).

Understanding the anatomy of a neural network

Let's see what a neural network consists of:

  • Layers: Layers are the core building blocks of a neural network. Each layer is a data-processing module that acts as a filter. It takes one or more inputs, processes them in a certain way, and then produces one or more outputs. Each time data passes through a layer, it goes through a processing phase and reveals patterns that are relevant to the business question we are trying to answer.
  • Loss function: A loss function provides the feedback signal that is used in the various iterations of the learning process. The loss function provides the deviation for a single example.
  • Cost function: The cost function is the loss function on a complete set of examples.
  • Optimizer: An optimizer determines how the feedback signal provided by the loss function will be interpreted.
  • Input data: Input data is the data that is used to train the neural network... A minimal sketch tying these building blocks together appears right after this list.
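To see how these building blocks fit together, here is a minimal NumPy sketch (our own illustration, not code from the book): a single sigmoid layer, a squared-error loss, a cost averaged over the examples, and plain gradient descent as the optimizer. All names and the synthetic dataset are invented for the example.

import numpy as np

rng = np.random.default_rng(0)

# Layer: a single dense layer (weights W, bias b) with a sigmoid activation
W = rng.normal(size=(3, 1))          # 3 input features -> 1 output
b = np.zeros(1)

def forward(X):
    return 1 / (1 + np.exp(-(X @ W + b)))

def loss(y_pred, y):
    # Loss function: squared-error deviation for a single example
    return (y_pred - y) ** 2

def cost(X, y):
    # Cost function: the loss averaged over the complete set of examples
    return loss(forward(X), y).mean()

# Input data: a synthetic training set with known expected outputs
X = rng.normal(size=(100, 3))
y = (X.sum(axis=1, keepdims=True) > 0).astype(float)

# Optimizer: plain gradient descent on the weights
learning_rate = 0.5
for _ in range(200):
    y_pred = forward(X)
    grad_z = 2 * (y_pred - y) * y_pred * (1 - y_pred)  # chain rule through the sigmoid
    W -= learning_rate * (X.T @ grad_z) / len(X)
    b -= learning_rate * grad_z.mean(axis=0)

print(round(float(cost(X, y)), 4))   # the cost shrinks as training proceeds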

Defining Gradient Descent

The purpose of training a neural network model is to find the right values for the weights. We start training a neural network with random or default values for the weights. Then, we iteratively use an optimizer algorithm, such as gradient descent, to change the weights in such a way that our predictions improve. The starting point of a gradient descent algorithm is the random values of the weights that need to be optimized as we iterate through the algorithm. In each subsequent iteration, the algorithm proceeds by changing the values of the weights in such a way that the cost is minimized. The following diagram explains the logic of the gradient descent algorithm:

Figure 8.6: Gradient descent algorithm

In the preceding diagram, the input is the feature vector X. The actual value of the target variable is Y, and the predicted value of the target variable is Y'. We determine the deviation of the actual value from the...
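To make this loop concrete, here is a minimal sketch of gradient descent on a toy, one-weight cost function; the cost J(w) = (w - 3)^2 and the learning rate are invented purely for illustration:

import numpy as np

# Toy cost J(w) = (w - 3)**2, whose minimum lies at w = 3
rng = np.random.default_rng(42)
w = float(rng.normal())             # random starting value for the weight
learning_rate = 0.1

for _ in range(50):
    gradient = 2 * (w - 3)          # dJ/dw
    w -= learning_rate * gradient   # step against the gradient to cut the cost

print(round(w, 4))                  # converges toward 3.0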

Activation Functions

An activation function formulates how the inputs to a particular neuron will be processed to generate an output. As shown in the following diagram, each of the neurons in a neural network has an activation function that determines how inputs will be processed:

Figure 8.9: Activation Function

In the preceding diagram, we can see that the results generated by an activation function are passed on to the output. The activation function sets the criteria for how the values of the inputs are to be interpreted to generate an output. For exactly the same input values, different activation functions will produce different outputs. Understanding how to select the right activation function is important when using neural networks to solve problems. Let's now look into these activation functions one by one.

Step Function

The simplest possible activation function is the threshold function. The output of the threshold function is binary: 0 or 1. It generates 1 as the output if the weighted sum of the inputs is greater than zero, and 0 otherwise. This behavior is shown in the following diagram:

Figure 8.10: Step Function

Note that as soon as there are any signs of life detected in the weighted sum of inputs, the output (y) becomes 1. This makes the threshold activation function very sensitive: it is vulnerable to being wrongly triggered by even the slightest signal in the input, such as a glitch or some noise.
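In code, this behavior amounts to a one-line comparison (a minimal sketch; the function name and the explicit zero threshold are our own):

def step_function(weighted_sum, threshold=0):
    # Fires (returns 1) as soon as the weighted sum of inputs crosses the threshold
    return 1 if weighted_sum > threshold else 0

print(step_function(0.0001), step_function(-5))   # 1 0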

Sigmoid

The sigmoid function can be thought of as an improvement of the threshold function. Here, we have control over the sensitivity of the activation function:

Figure 8.11: Sigmoid Activation Function

The sigmoid function, y, is defined as follows:

y = \frac{1}{1 + e^{-z}}

It can be implemented in Python as follows:

import numpy as np

def sigmoidFunction(z):
    return 1 / (1 + np.exp(-z))

Note that by reducing the sensitivity of the activation function, we make glitches in the input less disruptive. Note also that the output of the sigmoid activation function is confined between 0 and 1, approaching these two binary extremes smoothly rather than jumping between them.
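As a quick illustration of this reduced sensitivity (the sample inputs are arbitrary), small perturbations around zero now shift the output only slightly instead of flipping it:

# Usage example for sigmoidFunction, defined above
for z in (-4.0, -0.1, 0.0, 0.1, 4.0):
    print(z, round(float(sigmoidFunction(z)), 3))
# prints 0.018, 0.475, 0.5, 0.525, and 0.982, respectively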

Rectified linear unit (ReLU)

The output of the first two activation functions presented in this chapter was confined between 0 and 1. That means that they take a set of input variables and squash them into that narrow range. ReLU is an activation function that takes a set of input variables as input and converts them into a single continuous output. In neural networks, ReLU is the most popular activation function and is usually used in the hidden layers, where we do not want to convert continuous variables into categorical variables. The following diagram summarizes the ReLU activation function:

Figure 8.12: Rectified linear unit

Note that when x ≤ 0, y = 0. This means that any signal from the input that is zero or less than zero is translated into a zero output:

y = \begin{cases} 0 & \text{for } x \le 0 \\ x & \text{for } x > 0 \end{cases}

As soon as x becomes greater than zero, the output is x. The ReLU function is one of the most used activation functions in neural networks. It can...
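A minimal NumPy implementation matching this piecewise definition (the function name is our own):

import numpy as np

def relu(x):
    # Zero for x <= 0, identity for x > 0
    return np.maximum(0, x)

print(relu(np.array([-2.0, 0.0, 3.5])))   # [0.  0.  3.5]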

Leaky ReLU

In ReLU, a negative value for x results in a zero value for y. This means that some information is lost in the process, which makes training cycles longer, especially at the start of training. The Leaky ReLU activation function resolves this issue. The following applies for Leaky ReLU:

y = \begin{cases} \beta x & \text{for } x < 0 \\ x & \text{for } x \ge 0 \end{cases}

This is shown in the following diagram:

Figure 8.13: Leaky ReLU

Here, β is a parameter with a value less than one. It can be implemented in Python as follows:

def leakyReLU(x, beta=0.01):
    # Pass positive inputs through; scale negative inputs by beta
    if x < 0:
        return beta * x
    else:
        return x

There are three ways of specifying the value for β:

  • We can specify a default value of β.
  • We can make β a parameter in our neural network and let the network decide the value (this is called parametric ReLU).
  • We can make β a random value (this is called randomized ReLU). A short sketch of the first and third options follows this list.
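Here is a small, illustrative sketch of the first and third options, reusing the leakyReLU function defined above; the sampling range for the randomized variant is an arbitrary choice, and parametric ReLU is only indicated in a comment because it requires β to be learned during training:

import random

# Option 1: rely on the default beta
print(leakyReLU(-10))                     # -0.1

# Option 3: randomized ReLU draws beta at random (range chosen arbitrarily)
beta = random.uniform(0.01, 0.1)
print(leakyReLU(-10, beta=beta))          # some value between -1.0 and -0.1

# Option 2 (parametric ReLU) would instead treat beta as a trainable
# parameter, updated by the optimizer along with the other weights.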

Hyperbolic tangent (tanh)

The tanh function is similar to the sigmoid function, but it has the ability to give a negative signal as well. The following diagram illustrates this:

Figure 8.14: Hyperbolic tangent

The function y is defined as follows:

y = \tanh(x) = \frac{1 - e^{-2x}}{1 + e^{-2x}}

It can be implemented by the following Python code:

import numpy as np

def tanh(x):
    # Equivalent to np.tanh(x); output ranges from -1 to 1
    numerator = 1 - np.exp(-2 * x)
    denominator = 1 + np.exp(-2 * x)
    return numerator / denominator
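As a quick sanity check (our own addition), the hand-rolled version above agrees with NumPy's built-in np.tanh:

x = np.linspace(-3, 3, 7)
print(np.allclose(tanh(x), np.tanh(x)))   # True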

Now let's look at the softmax function.

Softmax

Sometimes we need more than two levels for the output of the activation function. Softmax is an activation function that provides us with more than two levels for the output. It is best suited to multiclass classification problems. Let's assume that we have n classes, with input values that map to the classes as follows:

x = {x(1), x(2), ..., x(n)}

Softmax operates on probability theory. The output probability of the i-th class of the softmax is calculated as follows:

\text{softmax}(x)_i = \frac{e^{x_i}}{\sum_{j=1}^{n} e^{x_j}}
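A common, numerically stable implementation of this formula looks as follows (a sketch in the spirit of the chapter's other snippets; the function name and sample scores are our own):

import numpy as np

def softmax(x):
    # Subtracting the max is a standard numerical-stability trick;
    # it does not change the resulting probabilities
    exps = np.exp(x - np.max(x))
    return exps / exps.sum()

scores = np.array([2.0, 1.0, 0.1])    # raw outputs for n = 3 classes
probs = softmax(scores)
print(probs, probs.sum())             # three probabilities summing to 1.0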

For binary classifiers, the activation function in the final layer will be sigmoid, and for multiclass classifiers it will be softmax.
