Chapter 8. Beyond Linearity – When Curving Is Much Better

Some problems cannot be solved with linear models. Often, we must go beyond the simple linearity of models by introducing features that take into account the complexity of the phenomenon. Nonlinear models are more complex (and more prone to overfitting), but sometimes they are the only solution.

In this chapter, we will introduce the most widely used nonlinear models and see how to train and apply them. First, the nonlinear least squares method will be treated, where the parameters of the regression function to be estimated are nonlinear. In this technique, given the nonlinearity of the coefficients, the problem is solved by means of iterative numerical methods. Then, Multivariate Adaptive Regression Splines (MARS) will be covered. This is a nonparametric regression procedure that makes no assumption about the underlying functional relationship between the response and predictor variables. This relationship...

Nonlinear least squares


In Chapter 3, More Than Just One Predictor – MLR, we already handled a case in which linear regression was unable to model the relationship between the response and predictors. In that case, we solved the problem by applying polynomial regression. When the relationships between variables are not linear, three solutions are possible:

  • Linearize the relationship by transforming the data
  • Fit polynomial or complex spline models
  • Fit a nonlinear model

The first two solutions you have already faced in some manner in the previous chapters. Now we will focus on the third solution. If the regression function to be estimated is nonlinear in its parameters, that is, the parameters appear at a degree other than the first, Ordinary Least Squares (OLS) can no longer be applied and other methods are needed.

In a multiple nonlinear regression model, the dependent variable is related to two or more independent variables as follows:

y = f(x1, x2, …, xr; β1, β2, …, βp) + ε

Here, the model is not linear with respect...
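As a minimal sketch of how such a model can be fitted in R (on simulated data, not a dataset used in this book), the built-in nls() function estimates the parameters iteratively; because the solution is found numerically, starting values for the parameters must be supplied:

# Nonlinear least squares with nls(): fitting an exponential decay model
# on simulated data (hypothetical example)
set.seed(1)
x <- seq(0, 10, length.out = 100)
y <- 5 * exp(-0.4 * x) + rnorm(100, sd = 0.2)

# Starting values are required because the fit is computed iteratively
# (by default, nls() uses the Gauss-Newton algorithm)
fit <- nls(y ~ a * exp(-b * x), start = list(a = 4, b = 0.5))
summary(fit)                               # estimated coefficients and standard errors
predict(fit, newdata = data.frame(x = 2))  # prediction at a new point

Note that, unlike OLS, the iterations may fail to converge if the starting values are far from the solution, so choosing reasonable starting values is part of the modeling work.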

Multivariate Adaptive Regression Splines


MARS is a form of regression analysis introduced by Jerome H. Friedman (1991); its main purpose is to predict the values of a response variable from a set of predictor variables.

MARS is a nonparametric regression procedure that makes no assumption about the underlying functional relationship between the response and predictor variables.

This relationship is constructed from a set of coefficients and basis functions that are derived from the regression data. The method divides the input space into regions, each with its own regression equation. This makes MARS particularly suitable for problems with a large number of predictors. The following figure shows a distribution with two regression regions:

The MARS algorithm operates as a multiple piecewise linear regression, where each breakpoint (estimated from the data) defines the region of application for a very simple linear regression equation.

The general MARS model equation is as follows...
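As a minimal sketch, assuming the third-party earth package (an implementation of Friedman's MARS algorithm) is installed, a MARS model can be fitted as follows, here on R's built-in trees dataset:

# MARS with the 'earth' package (install.packages("earth") if needed)
library(earth)

data(trees)                                 # built-in dataset: girth, height, volume of trees
mars_fit <- earth(Volume ~ Girth + Height, data = trees)
summary(mars_fit)                           # selected hinge basis functions and coefficients
predict(mars_fit, newdata = trees[1:3, ])   # predictions for the first three observations

The summary lists the hinge functions selected by the algorithm, with the breakpoints (knots) estimated from the data.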

Generalized Additive Model


A GAM is a GLM in which the linear predictor is given by a user-specified sum of smooth functions of the covariates plus a conventional parametric component. Assume that a sample of n objects has a response variable y and r explanatory variables x1, …, xr. Under these assumptions, the regression equation becomes:

y = β0 + f1(x1) + f2(x2) + … + fr(xr) + ε

Here, the functions f1, f2, …, fr are different nonlinear functions of the variables x. In a GAM, the linear relationships between the response and the predictors are replaced by several nonlinear smooth functions to model and capture the nonlinearities in the data.

We can see the GAM as a generalization of a multiple regression model without interactions between predictors. Among the advantages of this approach, in addition to greater flexibility than the linear model, its good algorithmic convergence rate on problems with many explanatory variables should also be mentioned. The biggest drawback lies in the complexity of the parameter...
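As a minimal sketch using the mgcv package (which is distributed with R and specifies smooth terms with s()), here on the built-in airquality dataset:

# GAM with the 'mgcv' package (distributed with R)
library(mgcv)

data(airquality)                       # built-in dataset
aq <- na.omit(airquality)              # drop rows with missing values

# Ozone modeled as a sum of smooth functions of temperature and wind
gam_fit <- gam(Ozone ~ s(Temp) + s(Wind), data = aq)
summary(gam_fit)                       # effective degrees of freedom of each smooth
plot(gam_fit, pages = 1)               # plot the estimated smooth functions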

Regression trees


Decision trees are used to predict a response or class y from several input variables x1, x2, …, xn. If y is a continuous response, the tree is called a regression tree; if y is categorical, it is called a classification tree. That's why these methods are often referred to as Classification and Regression Trees (CART). The algorithm is based on the following procedure: at each node of the tree, we check the value of one of the inputs xi and, depending on the (binary) answer, we continue to the left or the right branch. When we reach a leaf, we find the prediction.

The algorithm starts with the data grouped into a single node (the root node) and, at every step, performs an exhaustive search over all possible subdivisions. At each step, the best subdivision is chosen, that is, the one that produces branches that are as homogeneous as possible.

In regression trees, we try to partition the data space into small enough parts so that we can apply a different simple model to each part. The non-leaf part of...
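As a minimal sketch using the rpart package (which is distributed with R and implements CART-style trees), here on the built-in mtcars dataset:

# Regression tree with the 'rpart' package (distributed with R)
library(rpart)

data(mtcars)                                  # built-in dataset
# method = "anova" requests a regression tree (continuous response)
tree_fit <- rpart(mpg ~ wt + hp, data = mtcars, method = "anova")

printcp(tree_fit)                             # complexity table, useful for pruning
plot(tree_fit); text(tree_fit)                # draw the tree and label the splits
predict(tree_fit, newdata = mtcars[1:3, ])    # predictions at the leaves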

Support Vector Regression


SVR is based on the same principles as the Support Vector Machine (SVM). In fact, SVR is the adapted form of SVM when the dependent variable is numeric rather than categorical. One of the main advantages of using SVR is that it is a nonparametric technique.

To build the model, the SVR technique uses kernel functions. The commonly used kernel functions are:

  • Linear
  • Polynomial
  • Sigmoid
  • Radial basis

This technique allows us to fit a nonlinear model without transforming the explanatory variables, which helps in interpreting the resulting model.

In SVR, we do not have to worry about prediction errors as long as they remain below a certain value (ε); errors smaller than ε are simply ignored. This method is called the maximal margin principle, and it allows SVR to be formulated as a convex optimization problem.

The regression can also be penalized using a cost parameter, which is useful for avoiding overfitting. SVR is a useful technique that provides the user with a great flexibility in distributing...
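As a minimal sketch, assuming the third-party e1071 package (an interface to libsvm) is installed, an SVR model with a radial basis kernel can be fitted as follows, here on the built-in mtcars dataset:

# SVR with the 'e1071' package (install.packages("e1071") if needed)
library(e1071)

data(mtcars)
svr_fit <- svm(mpg ~ wt + hp, data = mtcars,
               type = "eps-regression",   # epsilon-SVR rather than classification
               kernel = "radial",         # radial basis kernel
               cost = 1,                  # penalty parameter, guards against overfitting
               epsilon = 0.1)             # width of the error-insensitive tube
summary(svr_fit)
predict(svr_fit, newdata = mtcars[1:3, ])

Increasing cost penalizes training errors more heavily, so it trades flexibility against the risk of overfitting.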

Summary


In this chapter, several advanced techniques for solving regression problems that cannot be handled with linear models were treated. First, the nonlinear least squares method was explored, where the parameters of the regression function to be estimated were nonlinear. In this technique, given the nonlinearity of the coefficients, the problem is solved by means of iterative numerical methods. Then, MARS was presented. This is a nonparametric regression procedure that makes no assumption about the underlying functional relationship between the response and predictor variables. This relationship is constructed from a set of coefficients and basis functions that are derived from the regression data.

Later, we focused our attention on GAMs. A GAM is a GLM in which the linear predictor is given by a user-specified sum of smooth functions of the covariates plus a conventional parametric component. Then, we introduced regression trees...
