You're reading from R Statistics Cookbook

Product type: Book
Published in: Mar 2019
Reading level: Expert
Publisher: Packt
ISBN-13: 9781789802566
Edition: 1st Edition
Author: Francisco Juretig

Francisco Juretig has worked for over a decade deploying data-science solutions in a variety of industries, such as retail, gambling, and finance. He has written several R packages and is a frequent contributor to the open source community.

Robust Gaussian mixture models with the qclust package

Clustering is usually done with the k-means algorithm. It assigns each observation to the closest centroid (the vector of means for a group), recalculates the centroids, and repeats this over all observations. The algorithm stops when no observation changes cluster. But k-means has a major flaw: because each observation is assigned to the closest centroid, we are implicitly assuming that the clusters are spherical (no correlation between the variables, and a similar spread in every direction). In many cases this is not a realistic assumption.
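
The assign-then-recompute loop described above can be sketched in a few lines. This is a minimal illustration in plain Python, not the book's R code: it uses the batch variant (all observations are assigned before the centroids are recomputed), and the toy data, the starting centroids, and the function names are assumptions made up for the example.

```python
import math

def closest(point, centroids):
    """Index of the centroid nearest to point (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda j: math.dist(point, centroids[j]))

def mean(points):
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coord) / n for coord in zip(*points))

def kmeans(data, centroids, max_iter=100):
    """Batch k-means: assign every point, then recompute every centroid."""
    labels = None
    for _ in range(max_iter):
        new_labels = [closest(p, centroids) for p in data]
        if new_labels == labels:  # no observation changed cluster: stop
            break
        labels = new_labels
        for j in range(len(centroids)):
            members = [p for p, lab in zip(data, labels) if lab == j]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[j] = mean(members)
    return labels, centroids

# Hypothetical toy data: two well-separated groups of three points each.
data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
        (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
labels, centroids = kmeans(data, centroids=[data[0], data[3]])
print(labels)
print(centroids)
```

Because the only thing that matters in `closest` is the Euclidean distance to each centroid, the shape of a cluster never enters the assignment, which is the spherical assumption mentioned above.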

A different approach is to assume that each observation comes from one of three possible distributions. These distributions are assumed to be multivariate Gaussian, with possibly different covariance matrices. Of course, since the algorithm relies on estimating covariance matrices using standard techniques...
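
To see how a covariance matrix can change the answer, the sketch below compares nearest-centroid assignment with assignment under three bivariate Gaussian components. It is an illustration only, again in plain Python: the means, covariances, equal weights, and test point are invented, the parameters are treated as known rather than estimated, and nothing here uses the qclust API.

```python
import math

def gauss2d(x, mean, cov):
    """Density of a bivariate Gaussian with covariance [[a, b], [b, d]]."""
    (a, b), (_, d) = cov
    det = a * d - b * b
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    # Squared Mahalanobis distance via the closed-form 2x2 inverse.
    maha = (d * dx * dx - 2 * b * dx * dy + a * dy * dy) / det
    return math.exp(-0.5 * maha) / (2 * math.pi * math.sqrt(det))

# Three hypothetical components as (mean, covariance); equal weights assumed.
components = [
    ((0.0, 0.0), ((1.0, 0.9), (0.9, 1.0))),   # strongly correlated, elongated
    ((3.0, 0.0), ((1.0, 0.0), (0.0, 1.0))),   # spherical
    ((-3.0, 3.0), ((1.0, 0.0), (0.0, 1.0))),  # spherical, far away
]
weights = [1.0 / 3.0] * 3

x = (2.0, 2.0)  # hypothetical observation to classify

# k-means style rule: pick the closest mean in Euclidean distance.
nearest = min(range(3), key=lambda j: math.dist(x, components[j][0]))

# Mixture rule: posterior probability of each component given x.
dens = [w * gauss2d(x, m, c) for w, (m, c) in zip(weights, components)]
resp = [d / sum(dens) for d in dens]
most_likely = max(range(3), key=lambda j: resp[j])

print(nearest, most_likely, [round(r, 3) for r in resp])
```

With these made-up parameters the point is closest to the second mean, yet the first component is the most probable one, because the point lies along the direction in which that component is stretched. In a real fit the means, covariances, and weights are not known in advance and have to be estimated from the data.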
