You're reading from 50 Algorithms Every Programmer Should Know - Second Edition

Product type: Book
Published in: Sep 2023
Publisher: Packt
ISBN-13: 9781803247762
Edition: 2nd Edition
Concepts: Data Structures and Algorithms
Author: Imran Ahmad

When is linear regression used?

Linear regression is used to solve many real-world problems, including the following:

Sales forecasting
Predicting optimum product prices
Quantifying the causal relationship between an event and the response, such as in clinical drug trials, engineering safety tests, or marketing research
Identifying patterns that can be used to forecast future behavior, given known criteria—for example, predicting insurance claims, natural disaster damage, election results, and crime rates

The weaknesses of linear regression

The weaknesses of linear regression are as follows:

It only works with numerical features.
Categorical data needs to be preprocessed.
It does not cope well with missing data.
It makes assumptions about the data.
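
To make the first three weaknesses concrete, here is a minimal sketch of the preprocessing that is typically needed before fitting a linear regression. The toy data frame and its column names (horsepower, origin) are hypothetical and are not part of the chapter's dataset. Filling gaps with the median is only one of several reasonable choices.

```python
import pandas as pd

# Hypothetical toy data: one numeric feature with a gap, one categorical feature.
df = pd.DataFrame({
    "horsepower": [130.0, None, 150.0, 95.0],
    "origin": ["usa", "japan", "usa", "europe"],
})

# Linear regression cannot handle missing values, so fill the gap first.
# The column median is used here as one common, simple strategy.
df["horsepower"] = df["horsepower"].fillna(df["horsepower"].median())

# Linear regression needs numeric inputs, so one-hot encode the category.
features = pd.get_dummies(df, columns=["origin"])

print(features.columns.tolist())
```

After these two steps, every column is numeric and free of missing values, which is what the algorithm expects.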

The regression tree algorithm

The regression tree algorithm is similar to the classification tree algorithm, except that the label is a continuous variable, not a categorical variable.

Using the regression tree algorithm for the regressors challenge

In this section, we will see how a regression tree algorithm can be used for the regressors challenge:

First, we train the model using a regression tree algorithm:
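
The training code is not reproduced in this excerpt. As a minimal sketch, assuming scikit-learn's DecisionTreeRegressor and a synthetic dataset in place of the chapter's data, training could look like this. The max_depth value, the random seeds, and the synthetic features are assumptions made for illustration only.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in data (an assumption): 200 rows, 3 numeric features.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 3))
y = 2.0 * X[:, 0] - X[:, 1] + rng.normal(0, 0.5, size=200)

# Hold out 25% of the rows for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# A shallow tree; max_depth=3 is an illustrative choice, not a tuned value.
regressor = DecisionTreeRegressor(max_depth=3, random_state=0)
regressor.fit(X_train, y_train)
```

The names regressor, X_test, and y_test are chosen to match the prediction and RMSE steps that follow.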


Once the regression tree model is trained, we use the trained model to predict the values:

y_pred = regressor.predict(X_test)

Then, we calculate RMSE to quantify the performance of the model:

from sklearn.metrics import mean_squared_error
from math import sqrt
sqrt(mean_squared_error(y_test, y_pred))

We get an RMSE of 5.2771702288377, the value listed for the regression tree in the comparison table later in this section.
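
To see what this metric computes, here is a small sketch of RMSE written in plain Python. The helper name rmse and the sample values are made up for illustration. It takes the square root of the mean of the squared differences between the true and predicted values, which is the same quantity as sqrt(mean_squared_error(...)) above.

```python
from math import sqrt

def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical values: the errors are 1, 0, and -2.
print(rmse([3.0, 5.0, 2.0], [2.0, 5.0, 4.0]))
```

Because the errors are squared before averaging, large misses weigh more heavily than small ones, and the result is in the same units as the label.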

The gradient boost regression algorithm

Let's now look at the gradient boost regression algorithm. It uses an ensemble of decision trees, built one after another so that each new tree corrects the errors of the trees before it, to better capture the underlying patterns in the data.

Using the gradient boost regression algorithm for the regressors challenge

In this section, we will see how we can use the gradient boost regression algorithm for the regressors challenge:

First, we train the model using the gradient boost regression algorithm:
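
The training code is not reproduced in this excerpt. As a minimal sketch, assuming scikit-learn's GradientBoostingRegressor and a synthetic dataset in place of the chapter's data, training could look like this. The hyperparameter values shown are illustrative assumptions, not the settings used to produce the results reported in this chapter.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (an assumption): 200 rows, 3 numeric features.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 3))
y = 2.0 * X[:, 0] - X[:, 1] + rng.normal(0, 0.5, size=200)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# 100 shallow trees are added one at a time; each new tree is fitted to
# the errors left by the ensemble built so far.
regressor = GradientBoostingRegressor(
    n_estimators=100, learning_rate=0.1, max_depth=3, random_state=0
)
regressor.fit(X_train, y_train)
```

A smaller learning_rate usually calls for more trees, so these two settings are normally tuned together.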

Once the gradient boost regression model is trained, we use it to predict the values:

y_pred = regressor.predict(X_test)

Finally, we calculate RMSE to quantify the performance of the model:

from sklearn.metrics import mean_squared_error
from math import sqrt
sqrt(mean_squared_error(y_test, y_pred))

Running this gives an RMSE of 4.034836373089085, the value listed for gradient boost regression in the comparison table that follows.

For regression algorithms, the winner is

Let's look at the performance of the three regression algorithms that we used on the same data and exactly the same use case:

Algorithm	RMSE
Linear regression	4.36214129677179
Regression tree	5.2771702288377
Gradient boost regression	4.034836373089085

Comparing the three regression algorithms, gradient boost regression performs best on this problem, as it has the lowest RMSE (about 4.03). Linear regression follows (about 4.36), and the regression tree algorithm performs worst (about 5.28).

Practical example – how to predict the weather

Let's see how we can use the concepts developed in this chapter to predict the weather. Let's assume that we want to predict whether it will rain tomorrow based on the data collected over a year for a particular city. The data available to train this model is in a CSV file called weather.csv.

Let's import the data as a pandas data frame:

import numpy as np 
import pandas as pd
df = pd.read_csv("weather.csv")

Let's look at the columns of the data frame:

Next, let's look at the header of the first 13 columns of the weather.csv data:


Now, let's look at the last 10 columns of the weather.csv data:


Let's use x to represent the input features. We will drop the Date field from the feature list as it is not useful in the context of predictions. We will also drop the RainTomorrow label, since it is the value we want to predict:

x = df.drop(['Date','RainTomorrow...
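
The statement above is cut off in this excerpt. As a sketch of the split that the preceding paragraph describes, here is the same idea applied to a tiny hypothetical data frame. The MinTemp and Rainfall columns and all the values are assumptions made for illustration. Only the Date and RainTomorrow names come from the text.

```python
import pandas as pd

# Hypothetical stand-in for weather.csv with just four columns.
df = pd.DataFrame({
    "Date": ["2020-01-01", "2020-01-02", "2020-01-03"],
    "MinTemp": [8.0, 14.0, 13.7],
    "Rainfall": [0.0, 3.6, 3.6],
    "RainTomorrow": [0, 1, 0],
})

# Features: everything except the date and the label.
x = df.drop(["Date", "RainTomorrow"], axis=1)

# Label: the column we want to predict.
y = df["RainTomorrow"]

print(x.columns.tolist())
```

Keeping the label out of x matters: if RainTomorrow stayed among the features, the model would be handed the answer it is supposed to predict.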

Summary

In this chapter, we started by looking at the basics of supervised machine learning. Then, we looked at various classification algorithms in more detail and at different methods to evaluate the performance of classifiers. Next, we studied various regression algorithms and the methods that can be used to evaluate their performance.

In the next chapter, we will look at neural networks and deep learning algorithms. We will look at the methods used to train a neural network and we will also look at the various tools and frameworks available for evaluating and deploying a neural network.

Understanding the types of neural networks

Neural networks can be designed in various ways, depending on how the neurons are interconnected. In a dense, or fully connected, neural network, every single neuron in a given layer is linked to each neuron in the next layer. This means each input from the preceding layer is fed into every neuron of the subsequent layer, maximizing the flow of information.

However, neural networks aren’t always fully connected. Some may have specific patterns of connections based on the problem they are designed to solve. For instance, in convolutional neural networks used for image processing, each neuron in a layer may only be connected to a small region of neurons in the previous layer. This mirrors the way neurons in the human visual cortex are organized and helps the network efficiently process visual information.
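
One way to see the difference between the two connection patterns is to count trainable parameters. The sketch below uses the standard parameter-count formulas for a dense layer and a 2D convolutional layer. The layer sizes (a 28 x 28 single-channel image, 32 units or filters, a 3 x 3 kernel) are illustrative assumptions, and the function names are made up for this example.

```python
def dense_params(n_inputs, n_units):
    """Weights plus biases when every input feeds every unit."""
    return n_inputs * n_units + n_units

def conv2d_params(kernel_size, in_channels, out_channels):
    """Weights plus biases when each filter sees only a small square
    region and the same filter is reused across the whole image."""
    return kernel_size * kernel_size * in_channels * out_channels + out_channels

n_pixels = 28 * 28  # a flattened 28 x 28 grayscale image (assumed size)

print("dense layer:", dense_params(n_pixels, 32))
print("conv layer: ", conv2d_params(3, 1, 32))
```

The convolutional layer needs far fewer parameters because its connections are local and its weights are shared across positions.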

Remember, the specific architecture of a neural network – how the neurons are interconnected – greatly impacts...

Using transfer learning

Throughout the years, countless organizations, research entities, and contributors within the open-source community have meticulously built sophisticated models for general use cases. These models, often trained with vast amounts of data, have been optimized over years of hard work and are suited for various applications, such as:

Detecting objects in videos or images
Transcribing audio
Analyzing sentiment in text

When initiating the training of a new ML model, it’s worth questioning, rather than starting from a blank slate, whether we can modify an already established, pre-trained model to suit our needs. Put simply, could we leverage the learning of existing models to tailor a custom model that addresses our specific needs? Such an approach, known as transfer learning, can provide several advantages:

It gives a head start to our model training.
It potentially enhances the quality of our model by utilizing...
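
The head start can be illustrated with a deliberately tiny, linear stand-in for transfer learning, written with NumPy. This is a sketch under strong assumptions, not how a real pre-trained network is used: all data is synthetic and noise-free, and the "pre-trained model" is just a feature map learned by least squares on a data-rich source task. In practice, the frozen part would be a deep network trained by someone else. The names W_pretrained and head are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
d, k = 20, 3  # 20 raw inputs, 3 useful underlying features

# Hidden structure shared by the source and target tasks (assumed).
A = rng.normal(size=(d, k))

# Source task: plenty of data, used to "pre-train" a feature map.
X_src = rng.normal(size=(500, d))
Y_src = X_src @ A
W_pretrained, *_ = np.linalg.lstsq(X_src, Y_src, rcond=None)

# Target task: only 10 labelled rows, fewer than the 20 raw inputs.
b = np.array([1.0, -2.0, 0.5])
X_tgt = rng.normal(size=(10, d))
y_tgt = (X_tgt @ A) @ b

# Transfer: keep W_pretrained frozen and fit only a small new head
# on top of the 3 features it produces.
features = X_tgt @ W_pretrained
head, *_ = np.linalg.lstsq(features, y_tgt, rcond=None)

# Evaluate on unseen target data.
X_new = rng.normal(size=(100, d))
y_new = (X_new @ A) @ b
y_hat = (X_new @ W_pretrained) @ head
transfer_rmse = float(np.sqrt(np.mean((y_new - y_hat) ** 2)))
print(transfer_rmse)
```

With only 10 target rows and 20 raw inputs, fitting from scratch would be underdetermined. Reusing the frozen feature map reduces the problem to learning 3 head weights, which 10 rows are enough to pin down.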

Case study – using deep learning for fraud detection

Using ML techniques to identify fraudulent documents is an active and challenging field of research. Researchers are investigating to what extent the pattern recognition power of neural networks can be exploited for this purpose. Instead of relying on manual attribute extractors, several deep learning architectures can work directly on raw pixels.

Methodology

The technique presented in this section uses a type of neural network architecture called Siamese neural networks, which features two branches that share identical architectures and parameters.
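
The idea of two branches with shared parameters can be sketched in a few lines of plain Python. This is an illustration only, not the architecture used in the case study. The weights are hypothetical fixed numbers rather than trained values, the inputs are three-element vectors rather than document images, and the Euclidean distance and the threshold of 0.5 are assumptions.

```python
from math import sqrt, tanh

# One set of weights used by BOTH branches (hypothetical values).
SHARED_WEIGHTS = [
    [0.5, -0.2, 0.1],
    [0.3, 0.8, -0.5],
]

def embed(x, weights=SHARED_WEIGHTS):
    """A single branch: map an input vector to a small embedding."""
    return [tanh(sum(w * xi for w, xi in zip(row, x))) for row in weights]

def distance(a, b):
    """Run both inputs through the same branch and compare the embeddings."""
    ea, eb = embed(a), embed(b)
    return sqrt(sum((p - q) ** 2 for p, q in zip(ea, eb)))

def is_suspicious(document, template, threshold=0.5):
    """Flag the document if it lies too far from its expected template."""
    return distance(document, template) > threshold

template = [1.0, 0.0, 0.0]
print(is_suspicious(template, template))           # identical inputs
print(is_suspicious([-1.0, 0.0, 0.0], template))   # very different input
```

Because both inputs pass through the same function with the same weights, identical inputs always produce a distance of zero, which is what makes the comparison meaningful.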

The use of Siamese neural networks to flag fraudulent documents is shown in the following diagram:

Figure 8.17: Siamese neural networks

When a particular document needs to be verified for authenticity, we first classify the document based on its layout and type, and then we compare it against its expected template and pattern. If it deviates beyond a certain...

Summary

In this chapter, we journeyed through the evolution of neural networks, examining different types, key components like activation functions, and the significant gradient descent algorithm. We touched upon the concept of transfer learning and its practical application in identifying fraudulent documents.

As we proceed to the next chapter, we’ll delve into natural language processing, exploring areas such as word embedding and recurrent networks. We will also learn how to implement sentiment analysis. The captivating realm of neural networks continues to unfold.

Learn more on Discord

To join the Discord community for this book – where you can share feedback, ask questions to the author, and learn about new releases – follow the QR code below:

https://packt.link/WHLel


Imran Ahmad has been a part of cutting-edge research about algorithms and machine learning for many years. He completed his PhD in 2010, in which he proposed a new linear programming-based algorithm that can be used to optimally assign resources in a large-scale cloud computing environment. In 2017, Imran developed a real-time analytics framework named StreamSensing. He has since authored multiple research papers that use StreamSensing to process multimedia data for various machine learning algorithms. Imran is currently working at Advanced Analytics Solution Center (A2SC) at the Canadian Federal Government as a data scientist. He is using machine learning algorithms for critical use cases. Imran is a visiting professor at Carleton University, Ottawa. He has also been teaching for Google and Learning Tree for the last few years.