You're reading from R Statistics Cookbook

Product type: Book
Published in: Mar 2019
Reading level: Expert
Publisher: Packt
ISBN-13: 9781789802566
Edition: 1st Edition
Author: Francisco Juretig

Francisco Juretig has worked for over a decade in a variety of industries, such as retail, gambling, and finance, deploying data-science solutions. He has written several R packages and is a frequent contributor to the open source community.

Introduction

Classical statistical methods don't handle outliers well. The worst part is that even the most basic methods suffer from this problem: for example, the sample mean, which is the maximum likelihood estimate for the µ parameter (assuming that the distribution is Gaussian), can be badly distorted by even a single contaminated observation. For example, the average of the numbers 3, 4, and 5 is 4; if we replace the last value (5) with a contaminated value of 100, the new average becomes 107/3 ≈ 35.67. Let's introduce the concept of the breakdown point of an estimator. The breakdown point is the proportion of contaminated values that the estimator can tolerate before yielding arbitrarily wrong results. In the case of the mean that we just explained, the breakdown point is 0, meaning that even a single contaminated observation can make the estimator give wrong results. The median...
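
The arithmetic in the example above can be checked with a minimal base R sketch. The variable names are illustrative and not taken from the book. The median is computed here only for contrast, as an assumption about where the truncated text is heading; the source is cut off before it discusses the median.

```r
# Clean sample from the text, and a copy where the last value (5)
# is replaced by a contaminated value of 100.
clean <- c(3, 4, 5)
contaminated <- c(3, 4, 100)

# The sample mean moves from 4 to 107/3 after a single replacement.
mean_clean <- mean(clean)
mean_contaminated <- mean(contaminated)

# The sample median is shown for contrast (not part of the source example).
median_clean <- median(clean)
median_contaminated <- median(contaminated)

print(c(mean_clean = mean_clean, mean_contaminated = mean_contaminated))
print(c(median_clean = median_clean, median_contaminated = median_contaminated))
```

In this small sample the mean is pulled far from the bulk of the data by one bad value, while the median stays at 4.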
