You're reading from  MATLAB for Machine Learning - Second Edition

Product type: Book
Published in: Jan 2024
Reading level: Intermediate
Publisher: Packt
ISBN-13: 9781835087695
Edition: 2nd Edition
Author (1)
Giuseppe Ciaburro

Giuseppe Ciaburro holds a PhD and two master's degrees. He works at the Built Environment Control Laboratory - Università degli Studi della Campania "Luigi Vanvitelli". He has over 25 years of work experience in programming, first in the field of combustion and then in acoustics and noise control. His core programming knowledge is in MATLAB, Python and R. As an expert in AI applications to acoustics and noise control problems, Giuseppe has wide experience in researching and teaching. He has several publications to his credit: monographs, scientific journals, and thematic conferences. He was recently included in the world's top 2% scientists list by Stanford University (2022).

Prediction Using Classification and Regression

Classification algorithms predict a class label for each observation. Starting from a set of predefined class labels, the classifier assigns each input a label according to the trained model. Classification algorithms learn linear or non-linear associations between independent variables and a categorical dependent variable. For example, a classifier may learn to predict the weather as clear sky, gentle showers, or heavy rain. Regression, by contrast, relates a set of independent variables to a numeric (continuous) dependent variable, for example, predicting rainfall in millimeters. Through this technique, it is possible to understand how the value of the dependent variable changes as the independent variables vary. This chapter shows us how to classify an object using nearest neighbors and how to perform an accurate regression analysis in a MATLAB environment. The aim of this chapter is to provide...

Technical requirements

In this chapter, we will introduce basic concepts relating to machine learning. To understand these topics, a basic knowledge of algebra and mathematical modeling is needed. A working knowledge of the MATLAB environment is also required.

To work with the MATLAB code in this chapter, you need the following files (available on GitHub at https://github.com/PacktPublishing/MATLAB-for-Machine-Learning-second-edition):

  • datatraining.txt
  • VehiclesItaly.xlsx
  • Employees.xlsx
  • AirfoilSelfNoise.xlsx

Introducing classification methods using MATLAB

Classification methods are an essential component of machine learning and data analysis. These methods allow us to categorize data into predefined classes or groups based on specific characteristics or attributes. By utilizing classification algorithms, we can train models to make predictions or assign labels to new, unseen data points. Classification plays a vital role in various domains, including image recognition, spam filtering, sentiment analysis, fraud detection, and medical diagnosis. It enables us to make informed decisions, identify patterns, and gain insights from data.

Numerous classification algorithms are available, each with its own strengths, assumptions, and applications. Common classification methods include decision trees, support vector machines (SVMs), random forests, logistic regression, and naive Bayes classifiers. SVMs come in two variants: support vector classification (SVC) and support vector regression (SVR). To effectively...

Building an effective and accurate classifier

Classification in machine learning is a supervised learning task that assigns data to predefined classes. It is one of the most fundamental and widely used techniques in machine learning and data mining. The goal is to build a classifier that accurately assigns new, unseen instances to the correct class based on their features. The classifier learns patterns and relationships from a labeled training dataset, in which each instance is associated with a known class label.

We will first discuss SVMs.

SVMs explained

SVMs are powerful supervised machine learning algorithms used for classification and regression tasks. They are particularly effective in solving complex problems with a clear margin of separation between classes. SVMs can handle both linearly separable and non-linearly separable data by transforming the input space into a higher-dimensional...

Exploring different types of regression

Regression analysis is a statistical method for examining the relationship between a set of independent variables (also known as explanatory variables) and a dependent variable (the response variable). It lets us understand how the value of the response variable changes when an explanatory variable changes.

Regression analysis serves a dual purpose: explanatory and predictive. The explanatory role helps us understand and assess the impact of independent variables on the dependent variable based on a specific theoretical model. It allows us to quantify the relationship and determine the magnitude and significance of the effects. In the predictive role, regression analysis aims to identify the optimal linear combination of independent variables to predict the value of the dependent variable accurately. By utilizing this technique, we can make predictions based on the observed relationships...
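As a minimal sketch of the predictive role described above, assuming a single predictor and invented data (the names x, y, beta, and xNew are illustrative and do not come from the book), an ordinary least-squares line can be fitted in base MATLAB with the backslash operator:

```matlab
% Hypothetical data: one predictor x and a response y (values invented for illustration)
x = [1; 2; 3; 4; 5];
y = [3.1; 4.9; 7.2; 8.8; 11.0];

% Design matrix with a column of ones for the intercept
X = [ones(size(x)), x];

% Least-squares estimate: beta(1) is the intercept, beta(2) is the slope
beta = X \ y;

% Predict the response for a new, unseen value of the predictor
xNew = 6;
yNew = [1, xNew] * beta;

fprintf('intercept = %.2f, slope = %.2f, prediction at x = %g: %.2f\n', ...
    beta(1), beta(2), xNew, yNew);
```

The slope estimates how much the response changes per unit change in the predictor, which is the explanatory reading of the same fit. With the Statistics and Machine Learning Toolbox installed, fitlm produces the same coefficients along with significance statistics.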

Making predictions with regression analysis in MATLAB

Having worked through several examples of linear regression, we now understand how the method works. Non-linear regression models the relationship between a dependent variable and one or more independent variables when that relationship is not linear. In contrast to linear regression, which assumes a straight-line relationship, non-linear regression allows for more complex and flexible relationships between variables.
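One simple way to capture a curved relationship is a polynomial fit. The sketch below is an illustration under stated assumptions: the data are invented, and a quadratic fitted with polyfit is still linear in its coefficients, so it is a stand-in for, not an instance of, a general non-linear model that would need an iterative solver.

```matlab
% Hypothetical data generated from a known quadratic (invented for illustration)
x = (0:5)';
y = 2 + 3*x + 0.5*x.^2;

% Fit a degree-2 polynomial; p holds coefficients from highest to lowest power
p = polyfit(x, y, 2);

% Evaluate the fitted curve at the observed points
yFit = polyval(p, x);

% Largest absolute deviation between data and fitted curve
maxAbsError = max(abs(y - yFit));

fprintf('x^2 coeff = %.3f, x coeff = %.3f, constant = %.3f\n', p(1), p(2), p(3));
```

Because the data here follow the quadratic exactly, the fit recovers the generating coefficients up to rounding; with real measurements, the residuals would indicate whether the chosen degree is adequate.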

Until now, we have used only continuous variables as predictors. What happens when the predictors are categorical? The fundamental principles of regression remain unchanged.

Multiple linear regression with categorical predictor

Categorical variables differ from numerical ones as they do not stem from measurement operations but rather from classification and comparison...
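A common way to feed a categorical predictor into a linear model is dummy (indicator) coding. The following is a minimal sketch, assuming an invented fuel-type variable and response; the names fuel, levels, D, and beta are hypothetical and are not taken from the chapter's datasets.

```matlab
% Hypothetical data: a categorical predictor and a numeric response
fuel = {'petrol'; 'diesel'; 'petrol'; 'electric'; 'diesel'; 'electric'};
y = [10; 14; 10; 20; 14; 20];

% Map each observation to the index of its category (levels are sorted alphabetically)
[levels, ~, idx] = unique(fuel);

% One indicator column per category: D(i,j) is 1 when observation i is in category j
D = double(idx(:) == 1:numel(levels));

% Drop the first category (the reference level) and add an intercept column
X = [ones(numel(y), 1), D(:, 2:end)];

% Least-squares fit: beta(1) is the mean of the reference level,
% the remaining entries are offsets of the other levels from it
beta = X \ y;

fprintf('reference level %s: %.1f\n', levels{1}, beta(1));
for j = 2:numel(levels)
    fprintf('offset for %s: %+.1f\n', levels{j}, beta(j));
end
```

Dropping one indicator column avoids perfect collinearity with the intercept; each remaining coefficient is then read as a difference from the reference category.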

Evaluating model performance

Model performance refers to how well a model fits the given data and accurately predicts outcomes. It is important to evaluate model performance to assess its reliability and effectiveness in making predictions or in capturing the underlying patterns in the data. One commonly used metric to evaluate model performance is the R-squared value, also known as the coefficient of determination. R-squared measures the proportion of the variance in the dependent variable that can be explained by the independent variables in the model. A higher R-squared value indicates a better fit, as it means a larger proportion of the variability in the data is accounted for by the model.
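In symbols, and assuming the notation $y_i$ for the observed values, $\hat{y}_i$ for the model's predictions, and $\bar{y}$ for the mean of the observations (these symbols are introduced here for illustration), the usual definition of the coefficient of determination is:

```latex
R^2 = 1 - \frac{\sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2}{\sum_{i=1}^{n} \left( y_i - \bar{y} \right)^2}
```

The numerator is the residual sum of squares and the denominator is the total sum of squares, so $R^2 = 1$ corresponds to a perfect fit and $R^2 = 0$ to a model that does no better than predicting the mean.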

However, R-squared alone may not provide a complete picture of model performance. Other metrics, such as mean squared error (MSE) or mean absolute error (MAE), can be used to assess the average prediction error of the model. Lower values of MSE or MAE indicate better predictive performance...
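The metrics mentioned above can be computed directly from observed and predicted values. This is a small sketch in base MATLAB using invented numbers; y and yHat are hypothetical vectors, not output from any model in the book.

```matlab
% Hypothetical observed values and model predictions (invented for illustration)
y    = [3; 5; 7; 9];
yHat = [2.5; 5.5; 7; 10];

% Residuals: observed minus predicted
res = y - yHat;

% Mean squared error and mean absolute error
mse = mean(res.^2);
mae = mean(abs(res));

% R-squared: one minus residual sum of squares over total sum of squares
ssRes = sum(res.^2);
ssTot = sum((y - mean(y)).^2);
r2 = 1 - ssRes / ssTot;

fprintf('MSE = %.3f, MAE = %.3f, R-squared = %.3f\n', mse, mae, r2);
```

MSE penalizes large errors more heavily than MAE because the residuals are squared, which is why the two can rank models differently when a few predictions are far off.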

Using advanced techniques for model evaluation and selection in MATLAB

Model evaluation and selection are crucial steps in machine learning to ensure the chosen model performs well on unseen data and generalizes effectively. When it comes to advanced techniques for model evaluation and selection in MATLAB, there are several approaches you can consider.

In the following subsection, we will look at the most important techniques for model evaluation and selection.

Understanding k-fold cross-validation

K-fold cross-validation is a widely used technique for model evaluation and selection. It partitions the dataset into k subsets, or folds, of (approximately) equal size. The model is trained and evaluated k times; each iteration uses a different fold as the validation set and the remaining k-1 folds as the training set. The results of the iterations are then averaged to obtain an overall performance estimate. This is the essence of how k-fold...
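The procedure just described can be sketched by hand in base MATLAB. The example below is an illustration under several assumptions: the data are synthetic, the model is a simple least-squares line, and folds are assigned in a fixed interleaved pattern so the result is reproducible (in practice the observations would normally be shuffled first).

```matlab
% Synthetic data: a linear trend plus a deterministic perturbation (invented)
n = 20;
k = 5;
x = (1:n)';
y = 3*x + 2 + sin(x);

% Assign each observation to one of k folds (interleaved, equal-sized folds)
foldId = mod((0:n-1)', k) + 1;

foldMse = zeros(k, 1);
for f = 1:k
    isVal   = (foldId == f);   % current fold is the validation set
    isTrain = ~isVal;          % remaining k-1 folds form the training set

    % Train: least-squares line on the training folds
    Xtrain = [ones(sum(isTrain), 1), x(isTrain)];
    beta = Xtrain \ y(isTrain);

    % Validate: mean squared error on the held-out fold
    Xval = [ones(sum(isVal), 1), x(isVal)];
    foldMse(f) = mean((y(isVal) - Xval * beta).^2);
end

% Average the per-fold errors into a single performance estimate
cvMse = mean(foldMse);
fprintf('Cross-validated MSE over %d folds: %.4f\n', k, cvMse);
```

Every observation is used for validation exactly once and for training k-1 times. With the Statistics and Machine Learning Toolbox, cvpartition(n,'KFold',k) can generate randomized folds instead of the hand-built foldId vector.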

Summary

In this chapter, we have gained valuable insights into performing accurate classification tasks within the MATLAB environment. We began by delving into the realm of decision tree methods, where we familiarized ourselves with key concepts such as nodes, branches, and leaf nodes. By repeatedly dividing records into homogeneous subsets based on the target attribute, we learned how to classify objects into distinct classes effectively. Moreover, we explored the prediction aspect of SVMs, which are particularly effective in solving complex problems with a clear margin of separation between classes. SVMs can handle both linearly separable and non-linearly separable data by transforming the input space into a higher-dimensional feature space.

In the subsequent section, our focus shifted toward conducting precise regression analysis within the MATLAB environment. We commenced by delving into simple linear regression, gaining an understanding of its definition and the process of...
