This chapter describes how genetic algorithms can be used to improve the performance of supervised machine learning models by selecting the best subset of features from the provided input data. This chapter will start with a brief introduction to machine learning and then describe the two main types of supervised machine learning tasks – regression and classification. We will then discuss the potential benefits of feature selection when it comes to the performance of these models. Next, we will demonstrate how genetic algorithms can be utilized to pinpoint the genuine features that are generated by the Friedman-1 Test regression problem. Then, we will use the real-life Zoo dataset to create a classification model and improve its accuracy – again by applying genetic algorithms to isolate the best features for...
You're reading from Hands-On Genetic Algorithms with Python
Technical requirements
In this chapter, we will be using Python 3 with the following supporting libraries:
- deap
- numpy
- pandas
- matplotlib
- seaborn
- sklearn – introduced in this chapter
In addition, we will be using the UCI Zoo Dataset (https://archive.ics.uci.edu/ml/datasets/zoo).
The programs that will be used in this chapter can be found in this book's GitHub repository at https://github.com/PacktPublishing/Hands-On-Genetic-Algorithms-with-Python/tree/master/Chapter07.
Check out the following video to see the Code in Action:
http://bit.ly/37HCKyr
Supervised machine learning
The term machine learning typically refers to a computer program that receives inputs and produces outputs. Our goal is to train this program, also known as the model, to produce the correct outputs for the given inputs, without being explicitly programmed to do so.
During this training process, the model learns the mapping between the inputs and the outputs by adjusting its internal parameters. One common way to train the model is by providing it with a set of inputs, for which the correct output is known. For each of these inputs, we tell the model what the correct output is so that it can adjust, or tune itself, aiming to eventually produce the desired output for each of the given inputs. This tuning is at the heart of the learning process.
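As a toy illustration of this fit-and-tune cycle (the model and data here are ours, not the chapter's), scikit-learn models express it as a call to fit() with known input/output pairs, followed by predict() on new inputs:

```python
# Toy sketch of supervised training: show the model inputs with known
# correct outputs, let it tune its internal parameters, then query it.
from sklearn.tree import DecisionTreeClassifier

X_train = [[0, 0], [0, 1], [1, 0], [1, 1]]  # inputs (features)
y_train = [0, 1, 1, 0]                      # known correct outputs

model = DecisionTreeClassifier(random_state=0)
model.fit(X_train, y_train)           # the tuning/learning step
prediction = model.predict([[0, 1]])  # the trained model maps input to output
print(prediction)
```

Here, the model's "internal parameters" are the split decisions of the tree; other model types (linear models, neural networks) tune different parameters, but expose the same fit/predict interface.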
Over the years, many types of machine learning models have been developed. Each model has its own particular internal...
Feature selection in supervised learning
As we saw in the previous section, a supervised learning model receives a set of inputs, called features, and maps them to a set of outputs. The assumption is that the information described by the features is useful for determining the value of the corresponding outputs. At first glance, it may seem that the more information we can use as input, the better our chances of predicting the output(s) correctly. However, in many cases, the opposite holds true; if some of the features we use are irrelevant or redundant, the consequence could be a (sometimes significant) decrease in the accuracy of the models.
Feature selection is the process of selecting the most beneficial and essential set of features out of the entire given set of features. Besides increasing the accuracy of the model, a successful feature selection can provide the following...
Selecting the features for the Friedman-1 regression problem
The Friedman-1 regression problem, which was created by Friedman and Breiman, describes a single output value, y, which is a function of five input values, x0..x4, and randomly generated noise, according to the following formula:

y = 10 · sin(π · x0 · x1) + 20 · (x2 − 0.5)² + 10 · x3 + 5 · x4 + noise · N(0, 1)
The input variables, x0..x4, are independent and uniformly distributed over the interval [0, 1]. The last component of the formula is normally distributed, randomly generated noise, multiplied by the constant noise, which determines the noise level.
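The formula can be sanity-checked in code: with the noise level set to 0.0, the y values returned by scikit-learn's make_friedman1() should equal the deterministic part exactly (this assumes the standard Friedman-1 definition, y = 10·sin(π·x0·x1) + 20·(x2 − 0.5)² + 10·x3 + 5·x4):

```python
# Verify the Friedman-1 formula against sklearn's generator (noise=0.0
# removes the random term, so the match should be exact).
import numpy as np
from sklearn.datasets import make_friedman1

X, y = make_friedman1(n_samples=20, n_features=5, noise=0.0, random_state=0)
y_manual = (10 * np.sin(np.pi * X[:, 0] * X[:, 1])
            + 20 * (X[:, 2] - 0.5) ** 2
            + 10 * X[:, 3]
            + 5 * X[:, 4])
print(np.allclose(y, y_manual))
```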
In Python, the scikit-learn (sklearn) library provides us with the make_friedman1() function, which can be used to generate a dataset containing the desired number of samples. Each of the samples consists of randomly generated x0..x4 values and their corresponding calculated y value. The interesting part, however, is that we can tell...
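For example (the parameter values here are illustrative), asking make_friedman1() for more than five features pads each sample with extra randomly generated columns that have no effect on y — exactly the kind of irrelevant features a genetic algorithm should learn to discard:

```python
from sklearn.datasets import make_friedman1

# 15 columns per sample: x0..x4 drive y, the other 10 are irrelevant
X, y = make_friedman1(n_samples=60, n_features=15, noise=1.0,
                      random_state=42)
print(X.shape)  # (60, 15)
print(y.shape)  # (60,)
```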
Selecting the features for the classification Zoo dataset
The UCI Machine Learning Repository (https://archive.ics.uci.edu/ml/index.php) maintains over 350 datasets as a service to the machine learning community. These datasets can be used for experimentation with various models and algorithms. A typical dataset contains a number of features (inputs) and the desired output, in the form of columns, accompanied by a description of their meaning.
In this section, we will use the UCI Zoo dataset (https://archive.ics.uci.edu/ml/datasets/zoo). This dataset describes 101 different animals using the following 18 features:
| No. | Feature Name | Data Type |
|-----|--------------|-----------|
| 1 | animal name | Unique for each instance |
| 2 | hair | Boolean |
| 3 | feathers | Boolean |
| 4 | eggs | Boolean |
| 5 | milk | Boolean |
| 6 | airborne | Boolean |
| 7 | aquatic | Boolean |
| 8 | predator | Boolean |
Summary
In this chapter, you were introduced to machine learning and the two main types of supervised machine learning tasks – regression and classification. Then, you were presented with the potential benefits of feature selection on the performance of the models carrying out these tasks. At the heart of this chapter were two demonstrations of how genetic algorithms can be utilized to enhance the performance of such models via feature selection. In the first case, we pinpointed the genuine features that were generated by the Friedman-1 Test regression problem, while, in the other case, we selected the most beneficial features of the Zoo classification dataset.
In the next chapter, we will look at another possible way of enhancing the performance of supervised machine learning models, namely hyperparameter tuning.
Further reading
For more information about the topics that were covered in this chapter, please refer to the following resources:
- Applied Supervised Learning with Python, Benjamin Johnston and Ishita Mathur, April 26, 2019
- Feature Engineering Made Easy, Sinan Ozdemir and Divya Susarla, January 22, 2018
- Feature selection for classification, M.Dash and H.Liu, 1997: https://doi.org/10.1016/S1088-467X(97)00008-5
- UCI Machine Learning Repository: https://archive.ics.uci.edu/ml/index.php