You're reading from Machine Learning Infrastructure and Best Practices for Software Engineers

Product type: Book
Published in: Jan 2024
Reading level: Intermediate
Publisher: Packt
ISBN-13: 9781837634064
Edition: 1st Edition

Author: Miroslaw Staron

Miroslaw Staron is a professor of Applied IT at the University of Gothenburg in Sweden with a focus on empirical software engineering, measurement, and machine learning. He is currently editor-in-chief of Information and Software Technology and co-editor of the regular Practitioner's Digest column of IEEE Software. He has authored books on automotive software architectures, software measurement, and action research. He also leads several projects in AI for software engineering and leads an AI and digitalization theme at Software Center. He has written over 200 journal and conference articles.
Training and Evaluating Classical Machine Learning Systems and Neural Networks

Modern machine learning frameworks are designed to be user-friendly for programmers. The popularity of the Python programming environment (and R) has shown that designing, developing, and testing machine learning models can focus on the machine learning task rather than on programming tasks. The developers of machine learning models can concentrate on building the entire system rather than on programming the internals of the algorithms. However, this convenience has a darker side – a lack of understanding of the internals of the models and of how they are trained, evaluated, and validated.

In this chapter, I’ll dive a bit deeper into the process of training and evaluation. We’ll start with the basic theory behind different algorithms before learning how they are trained. We’ll start with the classical machine learning models, exemplified by decision trees. Then, we’ll gradually...

Training and testing processes

Machine learning has revolutionized the way we solve complex problems by enabling computers to learn from data and make predictions or decisions without being explicitly programmed. One crucial aspect of machine learning is training models, which involves teaching algorithms to recognize patterns and relationships in data. Two fundamental methods in this workflow are model.fit(), which trains the model, and model.predict(), which uses the trained model to make predictions.

The model.fit() function lies at the heart of training a machine learning model. It is the process by which a model learns from a labeled dataset to make accurate predictions. During training, the model adjusts its internal parameters to minimize the discrepancy between its predictions and the true labels in the training data. This iterative optimization process, often referred to as “learning,” allows the model to generalize its knowledge and perform well on unseen data.
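To make the two calls concrete, here is a minimal, self-contained sketch; the synthetic dataset and the LogisticRegression model are illustrative assumptions, not the book's defect data:

# a minimal sketch of the fit/predict workflow on a synthetic
# dataset (an assumption for illustration, not the book's data)
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X_demo, y_demo = make_classification(n_samples=200, n_features=5,
                                     random_state=42)
X_tr, X_te, y_tr, y_te = train_test_split(X_demo, y_demo,
                                          random_state=42)
model = LogisticRegression()
model.fit(X_tr, y_tr)         # learn parameters from labeled data
y_pred = model.predict(X_te)  # predict labels for unseen data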

In addition to the training data and labels,...

Training classical machine learning models

We’ll start by training a model that lets us look inside it. We’ll use the CART decision tree classifier, where we can visualize the actual decision tree that is trained. We’ll use the same numerical data we used in the previous chapter. First, let’s read the data and create the train/test split:

# read the file with data using openpyxl
import pandas as pd
# we read the data from the excel file,
# which is the defect data from the ant 1.3 system
dfDataAnt13 = pd.read_excel('./chapter_6_dataset_numerical.xlsx',
                            sheet_name='ant_1_3',
                            index_col=0)
# prepare the dataset...
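The preparation step is elided above; a plausible continuation is sketched below. The label column name ('Defect') and the 80/20 split ratio are assumptions about the dataset, not the book's exact code:

# a hedged sketch of the elided preparation step; the label column
# name 'Defect' is an assumption about the sheet's schema
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X = dfDataAnt13.drop(columns=['Defect'])  # features: the code metrics
y = dfDataAnt13['Defect']                 # label: defective or not
X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=0.8, random_state=42)

# train the CART decision tree and predict on the held-out test set
decisionTreeModel = DecisionTreeClassifier(random_state=42)
decisionTreeModel.fit(X_train, y_train)
y_pred = decisionTreeModel.predict(X_test)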

Understanding the training process

From the software engineer’s perspective, the training process is rather simple – we fit the model, validate it, and use it. We check how good the model is in terms of the performance metrics. If the model is good enough, and we can explain it, then we develop the entire product around it, or we use it in a larger software product.

When the model does not learn anything useful, we need to understand why this is the case and whether another model could do better. We can use the visualization techniques we learned about in Chapter 6 to explore the data, and the techniques from Chapter 4 to clean it of noise.

Now, let’s explore the process of how the decision tree model learns from the data. The DecisionTree classifier learns from the provided data by recursively partitioning the feature space based on the values of the features in the training dataset. It constructs a binary tree where each internal node represents...
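One way to look inside the trained tree is scikit-learn's plot_tree; a minimal sketch, assuming the decisionTreeModel from the hedged sketch in the previous section:

# visualize the trained tree; decisionTreeModel and X come from
# the hedged sketch in the previous section
import matplotlib.pyplot as plt
from sklearn import tree

plt.figure(figsize=(16, 8))
tree.plot_tree(decisionTreeModel,
               feature_names=list(X.columns),
               class_names=['no defect', 'defect'],
               filled=True)
plt.show()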

Random forest and opaque models

Let’s train the random forest classifier on the same data as in the counter-example and check whether it performs better and whether it uses features similar to those of the DecisionTree classifier in the original counter-example.

Let’s instantiate, train, and validate the model on the same data using the following fragment of code:

from sklearn.ensemble import RandomForestClassifier

# instantiate and train the random forest on the same train/test split
randomForestModel = RandomForestClassifier()
randomForestModel.fit(X_train, y_train)

# predict the labels of the held-out test set
y_pred_rf = randomForestModel.predict(X_test)
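The metric calculation itself is not shown in this fragment; a minimal sketch using scikit-learn's metric helpers (the weighted averaging is an assumption, since this step is not shown):

# compute the performance metrics; weighted averaging for
# precision/recall is an assumption
from sklearn.metrics import accuracy_score, precision_score, recall_score

accuracy = accuracy_score(y_test, y_pred_rf)
precision = precision_score(y_test, y_pred_rf, average='weighted')
recall = recall_score(y_test, y_pred_rf, average='weighted')
print(f'Accuracy: {accuracy:.2f}')
print(f'Precision: {precision:.2f}, Recall: {recall:.2f}')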

After evaluating the model, we obtain the following performance metrics:

Accuracy: 0.62
Precision: 0.63, Recall: 0.62

Admittedly, these metrics are different from the metrics of the decision tree, but the overall performance is not that much different. The difference in accuracy of 0.03 is negligible. First, we can extract the important features by reusing the same techniques that were presented in Chapter...
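For a random forest, the importances are available directly on the fitted model; a minimal sketch:

# rank features by the forest's impurity-based importances;
# X_train is the feature DataFrame from the earlier split
import pandas as pd

importances = pd.Series(randomForestModel.feature_importances_,
                        index=X_train.columns)
print(importances.sort_values(ascending=False).head(10))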

Training deep learning models

Training a dense neural network involves various steps. First, we prepare the data. This typically involves tasks such as feature scaling, handling missing values, encoding categorical variables, and splitting the data into training and validation sets.

Then, we define the architecture of the dense neural network. This includes specifying the number of layers, the number of neurons in each layer, the activation functions to be used, and any regularization techniques, such as dropout or batch normalization.

Once the model has been defined, we need to initialize it. We create an instance of the neural network model based on the defined architecture. This involves creating an instance of the neural network class or using a predefined model architecture available in a deep learning library. We also need to define a loss function that quantifies the error between the predicted output of the model and the actual target values. The choice of loss function...
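A hedged sketch of these steps in PyTorch follows; the layer sizes, dropout rate, optimizer, and the BCEWithLogitsLoss loss function are illustrative assumptions, not the book's exact architecture:

# a minimal dense network in PyTorch; the layer sizes, dropout rate,
# optimizer, and loss function are assumptions for illustration
import torch
import torch.nn as nn

class DenseNet(nn.Module):
    def __init__(self, n_features):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Linear(n_features, 64), nn.ReLU(), nn.Dropout(0.2),
            nn.Linear(64, 32), nn.ReLU(),
            nn.Linear(32, 1))  # one logit for the binary defect label

    def forward(self, x):
        return self.layers(x)

# initialize the model, the loss function, and the optimizer
model = DenseNet(n_features=X_train.shape[1])
loss_fn = nn.BCEWithLogitsLoss()  # error between predictions and targets
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# a basic training loop over the whole training set
X_t = torch.tensor(X_train.values, dtype=torch.float32)
y_t = torch.tensor(y_train.values, dtype=torch.float32).unsqueeze(1)
for epoch in range(100):
    optimizer.zero_grad()
    loss = loss_fn(model(X_t), y_t)
    loss.backward()
    optimizer.step()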

Misleading results – data leaking

In the training process, we use one set of data points and, in testing, we use another. The training process works best when these two datasets are completely separate. If they are not, we run into what is called the data leakage problem – the same data points appear in both the training and test sets. Let’s illustrate this with an example.

First, we need to create a new split where some data points appear in both sets. We can do that by using the split function and assigning 20% of the data points to the test set. This means that at least 10% of the data points end up in both sets:

import sklearn.model_selection  # needed for train_test_split below
X_trainL, X_testL, y_trainL, y_testL = \
        sklearn.model_selection.train_test_split(X, y, random_state=42,
                                                 train_size=0.8)

Now, we can use the same code to make predictions on this data and then calculate the performance metrics:

# now, let's evaluate the model on this new data
with torch...
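The evaluation code is cut off above; a hedged sketch of how it could continue, reusing the model from the PyTorch sketch earlier (the 0.5 decision threshold is an assumption):

# a hedged continuation of the truncated evaluation; model comes from
# the earlier PyTorch sketch, and 0.5 is an assumed decision threshold
with torch.no_grad():
    X_eval = torch.tensor(X_testL.values, dtype=torch.float32)
    probs = torch.sigmoid(model(X_eval))
    y_pred_leak = (probs > 0.5).int().squeeze(1).numpy()

# metrics computed on a leaked split tend to look overly optimistic
print(f'Accuracy: {(y_pred_leak == y_testL.values).mean():.2f}')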

Summary

In this chapter, we discussed various topics related to machine learning and neural networks. We explained how to read data from an Excel file using the pandas library and prepare the dataset for training a machine learning model. We explored the use of decision tree classifiers and demonstrated how to train a decision tree model using scikit-learn. We also showed how to make predictions using the trained model.

Then, we discussed how to switch from a decision tree classifier to a random forest classifier, which is an ensemble of decision trees. We explained the necessary code modifications and provided an example. Next, we shifted our focus to using a dense neural network in PyTorch. We described the process of creating the neural network architecture, training the model, and making predictions using the trained model.

Lastly, we explained the steps involved in training a dense neural network, including data preparation, model architecture, initializing the model, defining...
