
You're reading from Learning Predictive Analytics with R

Product type: Book

Published in: Sep 2015

Reading level: Intermediate

Publisher: Packt

ISBN-13: 9781782169352

Edition: 1st Edition

Concepts: Predictive Analytics

Author (1)

Eric Mayor

Chapter 12. Multilevel Analyses

In Chapter 10, Classification with k-Nearest Neighbors and Naïve Bayes, we discussed classification with k-Nearest Neighbors and Naïve Bayes. In the previous chapter, we examined classification trees, notably C4.5, C5.0, CART, random forests, and conditional inference trees.

In this chapter, we will discuss:

Nested data and the importance of dealing with them appropriately
Multilevel regression including random intercepts and random slopes
The comparison of multilevel models
Prediction using multilevel modeling

Nested data

If you have nested data, this chapter is essential for you! Nested data are data in which observations share a common context. Examples include:

Consumers nested within shops
Employees nested within managers
Teachers and/or students nested within schools
Nurses, patients, and/or physicians nested within hospitals
Inhabitants nested in neighborhoods

Many more cases of data nesting could be imagined. What they all have in common is a data structure similar to the one depicted in the following figure:

A depiction of nested data

We will only discuss two levels of data with unique membership in this chapter, but of course, more complex situations can arise. For instance, in all the preceding examples, shops, managers, schools, hospitals, and neighborhoods can themselves be nested within higher-level units (for example, companies or cities), which could form a third level in the analyses. Crossed memberships are also possible, for example, patients sharing a hospital but not a neighborhood...
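As a rough illustration of what two levels with unique membership look like in practice, the following sketch builds a tiny data frame of nurses nested within hospitals and checks that each nurse belongs to exactly one hospital. The names `nested`, `nurse`, and the values are made up for this example and do not come from the chapter's dataset.

```r
# Toy two-level data: nurses (level 1) nested within hospitals (level 2).
# All names and values are illustrative assumptions.
nested <- data.frame(
  nurse = 1:6,
  hosp  = c(1, 1, 2, 2, 3, 3)
)

# Unique membership: every nurse appears in exactly one hospital
membership <- tapply(nested$hosp, nested$nurse,
                     function(h) length(unique(h)))
unique_membership <- all(membership == 1)

# Number of level-1 units within each level-2 unit
units_per_hosp <- table(nested$hosp)

print(unique_membership)
print(units_per_hosp)
```

With crossed memberships, the same check would report more than one context per observation for at least one grouping attribute.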

Multilevel regression

To deal with the problems that nesting creates, we can rely on a kind of analysis that can partial out (take away) the variance due to the context. This can be done using multilevel regression analysis (also known as mixed-effects regression). We will not go into the details of the computations behind these highly complex analyses, but will provide the information necessary to understand and perform the analysis at a basic level. The necessary diagnostic checks are not fully presented here: the diagnostics for linear regression apply, and additional diagnostics should be performed, such as checking the normality of the level-2 residuals. The Handbook of Multilevel Analysis, edited by de Leeuw and Meijer, provides the necessary information on diagnostics for multilevel models.

When we discussed regression in Chapter 9, Linear Regression, we showed that the value of a criterion attribute for an observation is computed as...

Multilevel modeling in R

Now that we have (briefly) examined the basics of multilevel modeling equations, we can turn to how to build multilevel models in R and predict unseen data.

For this purpose, we will first load our dataset produced using the same procedure as mentioned previously (except that the attributes are not scaled). Here again, there are 100 generated observations for each of the 17 hospitals:

# Read the space-separated file; the first row holds the attribute names
NursesML = read.table("NursesML.dat", header = TRUE, sep = " ")
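If NursesML.dat is not at hand, a stand-in with the same shape (17 hospitals with 100 observations each) can be simulated and read back with the same read.table() call. This is only a sketch: the attribute names `WorkSat` and `hosp` appear in the chapter, but the values, the effect sizes, and the object names `toy` and `toyML` are assumptions, and the real file has more columns.

```r
set.seed(1)
n_hosp <- 17   # number of hospitals, as in the text
n_obs  <- 100  # observations per hospital, as in the text

hosp <- rep(seq_len(n_hosp), each = n_obs)
hosp_effect <- rnorm(n_hosp, sd = 1)  # assumed hospital-level shift

# Simulated work satisfaction: overall level + hospital shift + noise
toy <- data.frame(
  WorkSat = 5 + hosp_effect[hosp] + rnorm(n_hosp * n_obs, sd = 1),
  hosp    = hosp
)

# Write to a temporary file and read it back as a space-separated table
path <- tempfile(fileext = ".dat")
write.table(toy, path, row.names = FALSE, sep = " ")
toyML <- read.table(path, header = TRUE, sep = " ")

# Check the structure: 1700 rows, 100 per hospital
str(toyML)
obs_per_hosp <- table(toyML$hosp)
```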

The null model

We will examine the variation in our attributes using hospitals and observations as units of analysis; that is, we will examine whether there is more variation at the hospital level or at the observation level. One way to do this is to compute it by hand.

The following will compute the mean for the attribute we want to predict (WorkSat) for each of the hospitals:

# Mean of WorkSat (column 4) for each hospital (grouping by column 5)
means = aggregate(NursesML[,4], by=list(NursesML[,5]),
   FUN=mean)[2]
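The same aggregation can be tried on simulated data, referring to attributes by name rather than by column position. The sketch below also contrasts the variance of the hospital means with the average variance inside hospitals. It is a simplified by-hand comparison on made-up data, not the book's computation; note that the variance of the hospital means also contains some sampling noise.

```r
set.seed(2)
hosp <- rep(1:17, each = 100)
# Assumed data-generating process: hospital shift plus individual noise
WorkSat <- 5 + rnorm(17, sd = 1)[hosp] + rnorm(1700, sd = 1)
toy <- data.frame(WorkSat = WorkSat, hosp = hosp)

# Hospital means, using the formula interface of aggregate()
means <- aggregate(WorkSat ~ hosp, data = toy, FUN = mean)

# Variation between hospitals: variance of the 17 hospital means
var_between <- var(means$WorkSat)

# Variation within hospitals: average of the per-hospital variances
var_within <- mean(tapply(toy$WorkSat, toy$hosp, var))

print(round(c(between = var_between, within = var_within), 2))
```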

We can display the variance of work satisfaction in hospitals and observations as follows...

Predictions using multilevel models

Now that we have our model ready, we can predict work satisfaction in the testing dataset.

Using the predict() function

One way to do so is simply to use the predict() function. The allow.new.levels argument specifies whether hospitals that were not present when fitting the model are allowed in the new data. As we have the same hospitals in the training and testing sets, we set its value to FALSE (which is actually the default value):

# Predict work satisfaction for the test set from the fitted model
NursesMLtest$predicted = predict(modelRS, NursesMLtest,
   allow.new.levels = FALSE)
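To see what allow.new.levels changes, here is a minimal sketch on simulated data. It assumes that the lme4 package is installed and that the model is fitted with lmer(); the visible text does not name the package. The objects `toy`, `Pred`, and `modelToy` are hypothetical, and `modelToy` is a simple random-intercept model rather than the chapter's `modelRS`.

```r
library(lme4)  # assumption: the chapter's models are fitted with lme4

set.seed(3)
hosp <- rep(1:17, each = 100)
toy <- data.frame(hosp = factor(hosp), Pred = rnorm(1700))
# Assumed process: fixed slope for Pred, hospital shift, individual noise
toy$WorkSat <- 5 + 0.5 * toy$Pred + rnorm(17, sd = 1)[hosp] +
  rnorm(1700, sd = 1)

# Alternate rows so that every hospital is in both training and testing sets
train <- toy[c(TRUE, FALSE), ]
test  <- toy[c(FALSE, TRUE), ]

# Random-intercept model: one intercept per hospital
modelToy <- lmer(WorkSat ~ Pred + (1 | hosp), data = train)

# Same hospitals in both sets, so new levels do not need to be allowed
test$predicted <- predict(modelToy, newdata = test,
                          allow.new.levels = FALSE)

# A hospital the model has never seen
new_hosp <- data.frame(hosp = factor("99"), Pred = 0)

# With the default (FALSE), predict() stops with an error
failed <- inherits(try(predict(modelToy, new_hosp), silent = TRUE),
                   "try-error")

# With TRUE, the prediction relies on the fixed effects only
pop_pred <- predict(modelToy, new_hosp, allow.new.levels = TRUE)
```

For an unseen hospital with `Pred = 0`, the prediction with allow.new.levels = TRUE is simply the fixed intercept of the model, since no hospital-specific adjustment is available.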

Assessing prediction quality

There is no perfect way to measure the quality of predictions for nested data. A simple estimate is the correlation between the predicted and the observed values, assessed with a correlation test. Because of the nested structure of our dataset, we will perform the test for each hospital separately:

correls = matrix(nrow=17,ncol=3)
colnames(correls) = c("Correlation", "p value", "r squared")
for (i in 1:17){
   dat = subset(NursesMLtest, hosp == i)
   correls[i,1] = cor.test(dat$predicted...
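Since the listing above is cut off, here is a self-contained sketch of the same idea on simulated data. It is not the book's full listing: the test set `toy_test`, its stand-in predictions, and the choice of correlating `predicted` with `WorkSat` are assumptions made for illustration.

```r
set.seed(4)
n_hosp <- 17

# Simulated test set: 50 observations per hospital
toy_test <- data.frame(hosp = rep(seq_len(n_hosp), each = 50))
toy_test$WorkSat <- rnorm(nrow(toy_test), mean = 5)
# Stand-in predictions: observed values plus noise
toy_test$predicted <- toy_test$WorkSat + rnorm(nrow(toy_test), sd = 1)

# One row per hospital: correlation, its p value, and the squared correlation
correls <- matrix(nrow = n_hosp, ncol = 3,
                  dimnames = list(NULL,
                                  c("Correlation", "p value", "r squared")))

for (i in seq_len(n_hosp)) {
  dat <- subset(toy_test, hosp == i)
  ct  <- cor.test(dat$predicted, dat$WorkSat)
  correls[i, ] <- c(ct$estimate, ct$p.value, ct$estimate^2)
}

print(round(correls, 3))
```

Storing the result of cor.test() once per hospital avoids running the test several times to extract each value.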

Summary

In this chapter, we saw why it is necessary to use analyses that account for the structure of the data when dealing with nested data. We examined how to fit several types of multilevel models and saw how to make predictions for new data. In the next chapter, we will deal with text mining, including document classification.


Author (1)

Eric Mayor

Eric Mayor is a senior researcher and lecturer at the University of Neuchatel, Switzerland. He is an enthusiastic user of open source and proprietary predictive analytics software packages, such as R, Rapidminer, and Weka. He analyzes data on a daily basis and is keen to share his knowledge in a simple way.