R Statistics Cookbook

Product type: Book
Published in: Mar 2019
Reading level: Expert
Publisher: Packt
ISBN-13: 9781789802566
Edition: 1st Edition
Tools: ggplot
Concepts: Statistics
Author: Francisco Juretig

Mixed Effects Models

We will cover the following recipes in this chapter:

The standard model and ANOVA
Some useful plots for mixed effects models
Nonlinear mixed effects models
Crossed and nested designs
Robust mixed effects models with robustlmm
Choosing the best linear mixed model
Mixed generalized linear models

Introduction

In Chapter 2, Univariate and Multivariate Tests for Equality of Means, we discussed mixed effects models in the context of the analysis of variance (ANOVA). These models arise when we have a mixture of fixed and random effects. Fixed effects are associated with the standard coefficients that appear in every regression problem, and random effects are variance components that govern shocks that are shared by members of the same group. For example, the grade of any student can be thought of as the sum of a term that depends on how many hours the student spent studying (this would be the fixed effect) and a random shock that is shared across all students from the same school. The idea is to capture the fact that students belonging to the same school have correlated grades.
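The school example can be written down as a random-intercept model. The following is a sketch in notation that is assumed here, not taken from the book: $y_{ij}$ is the grade of student $i$ in school $j$, $h_{ij}$ is the number of hours studied, $u_j$ is the shock shared by everyone in school $j$, and $\varepsilon_{ij}$ is an individual error that is independent of $u_j$.

```latex
\begin{aligned}
y_{ij} &= \beta_0 + \beta_1 h_{ij} + u_j + \varepsilon_{ij}, \\
u_j &\sim \mathcal{N}(0, \sigma_u^2), \qquad \varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2), \\
\operatorname{Corr}\left(y_{ij}, y_{i'j} \mid h\right) &= \frac{\sigma_u^2}{\sigma_u^2 + \sigma^2} \qquad (i \neq i').
\end{aligned}
```

Under these assumptions, two students from the same school are correlated through $u_j$, whereas students from different schools are uncorrelated once the hours studied are accounted for.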

The standard model and ANOVA

In this recipe, we will be more interested in the regression part than in the ANOVA part. In the previous ANOVA chapter, we only used random effects for the intercepts, but this is usually not the only way that random effects are introduced. Imagine that we model sales in terms of price for certain customers, where we have several observations for each one of them. The standard ordinary least squares (OLS) approach would be to ignore this heterogeneity and pool all the observations together.

Naturally, this would introduce a problem, because the residuals would then be correlated (observations belonging to the same individual will produce similar residuals). The correct approach would be to introduce a random effect per individual, but there is a subtle point here: we are not expecting the response to differ in terms of an intercept...

Some useful plots for mixed effects models

In this recipe, we will explore some interesting plots that are useful for presenting and analyzing the results from mixed effects models. In the simplest formulation of mixed effects models, we have a random intercept by group. Every observation belonging to the same group will share that very same shock, rendering all of them correlated. But this can be extended to other coefficients (not just the intercept). For example, a slope coefficient, beta, could be the sum of beta1 (which would be fixed) and beta_random (which would be a random effect). What this would imply is that the slope relating the regressor to the response would have two parts: a part that is the same for all the observations, and another part that depends on each group.
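The decomposition of the slope into a fixed part and a group-specific part can be sketched as follows. The symbols are assumptions made for illustration: $x_{ij}$ is the regressor for observation $i$ in group $j$, $u_{0j}$ is the random intercept, and $u_{1j}$ plays the role of beta_random.

```latex
\begin{aligned}
y_{ij} &= (\beta_0 + u_{0j}) + (\beta_1 + u_{1j})\, x_{ij} + \varepsilon_{ij}, \\
u_{0j} &\sim \mathcal{N}(0, \sigma_0^2), \qquad
u_{1j} \sim \mathcal{N}(0, \sigma_1^2), \qquad
\varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2).
\end{aligned}
```

Here, $\beta_1$ is the slope shared by all observations, and $\beta_1 + u_{1j}$ is the slope that applies within group $j$.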

...

Nonlinear mixed effects models

Linear mixed effects models assume that a linear relationship exists between the predictors and the target variable. In many cases, this is a problematic assumption; whenever the target is expected to show any kind of saturation effect or have an exponential response with respect to any of the regressors, the linearity assumption needs to be removed.

In medicine and biology, this is usually the case, as dose-response studies almost always exhibit a certain kind of saturation effect. The same happens in marketing studies, because spending increasing amounts of resources in order to drive sales up might be effective, but only up to a point: once the spend is too large, additional spending has little further effect.
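To make the idea of saturation concrete, here is one commonly used saturating curve (the Michaelis-Menten form) with a group-specific plateau. It is chosen purely for illustration and is not necessarily the model used in the recipe. The symbols $\phi_1$, $\phi_2$, and $u_j$ are assumed names, and $x_{ij} \geq 0$ denotes the dose or spend.

```latex
\begin{aligned}
y_{ij} &= \frac{(\phi_1 + u_j)\, x_{ij}}{\phi_2 + x_{ij}} + \varepsilon_{ij},
\qquad u_j \sim \mathcal{N}(0, \sigma_u^2), \quad \varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2), \\
\lim_{x_{ij} \to \infty} \mathbb{E}\left[y_{ij} \mid u_j\right] &= \phi_1 + u_j .
\end{aligned}
```

The response rises almost linearly for small $x_{ij}$ and flattens out towards the plateau $\phi_1 + u_j$, reaching half of that plateau at $x_{ij} = \phi_2$. Because the parameters enter nonlinearly, the model cannot be written as a linear combination of the regressors.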

Fitting nonlinear mixed effects models is much harder than fitting their linear counterparts. Here, we can’t rely on any matrix techniques and we need to attack the problem...

Crossed and nested designs

Whenever we collect data for a model with the intention of testing something, we are implicitly working with an experimental design. Experimental design refers to the setup that defines which experimental units are used, and how they are allocated to each treatment. For example, if we want to measure whether clients are more likely to buy a product after receiving a discount, we need to define which clients will be in the control group and which will be in the test group. Furthermore, we need to define how many of them will fall in each group. All these decisions have implications for the effects and contrasts that we can estimate, and for the precision of each estimate. This is why experimental design has far-reaching consequences for our ANOVA and regression models.

Understanding the underlying design for an experiment is of prime importance. The design type...

Robust mixed effects models with robustlmm

The lme4 package is the de facto standard package for linear mixed models. Its syntax has become a standard in the industry, and most researchers working with applied linear models use it. As with many of the techniques we have seen so far, the problem is that its estimates can be greatly affected by outliers. Even minor contamination can cause major estimation problems.

Getting ready

The lme4 and robustlmm packages are needed for this recipe. They can be installed using install.packages().

How to do it...

In this recipe, we will use the robustlmm...

Choosing the best linear mixed model

When using OLS models, choosing the best one is not a complex task: we have a set of variables that we use, and we just pick whichever model has the lowest Akaike information criterion (AIC) (or any other appropriate metric that we choose).
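For reference, the AIC mentioned above has a simple closed form. In the sketch below, $k$ denotes the number of estimated parameters, $\hat{L}$ the maximized likelihood, $n$ the number of observations, and $\mathrm{RSS}$ the residual sum of squares. These symbols are introduced here for illustration.

```latex
\begin{aligned}
\mathrm{AIC} &= 2k - 2 \log \hat{L}, \\
\mathrm{AIC}_{\text{OLS}} &= n \log\!\left(\frac{\mathrm{RSS}}{n}\right) + 2k + n\left(1 + \log 2\pi\right)
\quad \text{(Gaussian errors)}.
\end{aligned}
```

The last term in the second line does not depend on the model, so when comparing OLS models that were fitted to the same data, only $n \log(\mathrm{RSS}/n) + 2k$ matters: a better fit lowers the first part, and each extra parameter adds a penalty of 2.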

Mixed models entail an extra level of complexity, as we can define the random effects in many ways. Returning to our previous example of deal_size versus time_spent and salespeople, we could choose a model with random effects only for the deal_size or for both the deal_size and salespeople. We can also decide whether or not to add a random intercept, and we can force the model to assume that the shocks impacting each one of these are either uncorrelated or correlated.
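The choice between correlated and uncorrelated shocks amounts to a choice of covariance structure for the random effects. The following sketch uses assumed notation, with $u_{0j}$ a random intercept and $u_{1j}$ a random slope for group $j$. In lme4's formula syntax, these two cases are typically written as (1 + x | g) and (1 + x || g), where x and g are placeholder names for a numeric covariate and a grouping factor.

```latex
\begin{pmatrix} u_{0j} \\ u_{1j} \end{pmatrix}
\sim \mathcal{N}\!\left(
\begin{pmatrix} 0 \\ 0 \end{pmatrix},\;
\Sigma = \begin{pmatrix}
\sigma_0^2 & \rho\, \sigma_0 \sigma_1 \\
\rho\, \sigma_0 \sigma_1 & \sigma_1^2
\end{pmatrix}
\right),
\qquad
\text{uncorrelated shocks: } \rho = 0 .
```

The correlated version estimates one extra parameter, $\rho$. Dropping the random intercept corresponds to removing $u_{0j}$ altogether, which leaves only $\sigma_1^2$ to estimate.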

Choosing models by comparing the AIC is quite hard for mixed models, since we have a random and a fixed part. There are two types of analysis that we might...

Mixed generalized linear models

Generalized linear models are a set of techniques that generalize the linear regression model (which assumes that the dependent variable is Gaussian) to a wide variety of distributions for the response variable. The response no longer needs to be Gaussian; it can follow any distribution that is part of the so-called exponential family. In fact, there are many distributions that fall into this category, such as the binomial, gamma, Poisson, or negative binomial distributions. This allows us to work with a wide array of situations, such as count data or binary responses.
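As an illustration of how a random effect enters such a model, consider count data with a Poisson response and a random intercept per group. This is a sketch with assumed notation ($y_{ij}$ is the count for observation $i$ in group $j$, $x_{ij}$ is a covariate, and $u_j$ is the group-level shock), not the specific model fitted in the recipe.

```latex
\begin{aligned}
y_{ij} \mid u_j &\sim \operatorname{Poisson}(\mu_{ij}), \\
\log \mu_{ij} &= \beta_0 + \beta_1 x_{ij} + u_j, \qquad u_j \sim \mathcal{N}(0, \sigma_u^2).
\end{aligned}
```

The logarithm links the mean of the response to the linear predictor, and the shared shock $u_j$ makes counts within the same group correlated, just as in the Gaussian case.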

Generalized linear models (referred to as GLMs in the literature) are defined by three things: first, a linear predictor that relates the covariates to the response variable; second, a probability distribution for the dependent variable from the exponential...


Author (1)

Francisco Juretig

Francisco Juretig has worked for over a decade in a variety of industries, such as retail, gambling, and finance, deploying data science solutions. He has written several R packages, and is a frequent contributor to the open source community.
