Rapid - Apache Mahout Clustering designs


Product type: Book
Published: Oct 2015
ISBN-13: 9781783284436
Pages: 130
Edition: 1st Edition

Evaluating clusters


Cluster evaluation involves cluster validation. We can apply multiple algorithms to the same data and get different clustering results, so we need a way to tell whether one result is better than another.

Two types of methods are available to evaluate clusters:

  • Extrinsic methods

  • Intrinsic methods

Let's take a look at each of these types.

Extrinsic methods

Extrinsic methods evaluate a clustering against data that was not used to produce it. This data consists of known class labels and external benchmarks, which are treated as gold standards and are often created by experts. A measure of clustering quality is effective if it satisfies the following four criteria (A Comparison of Extrinsic Clustering Evaluation Metrics Based on Formal Constraints, Enrique Amigó, Julio Gonzalo, Javier Artiles, and Felisa Verdejo):

  • Cluster Homogeneity: Clusters should not mix items belonging to different categories. Look at the following diagram:

    Cluster 1 has all six data points in one cluster, while...
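
To make the idea of comparing clusters against gold-standard labels concrete, here is a minimal sketch in Python. It computes purity, one simple extrinsic measure that rewards homogeneous clusters. This is an illustration only: it is not Mahout code, the function name `purity` is hypothetical, and the six-point data set is invented for the example rather than taken from the diagram.

```python
from collections import Counter


def purity(cluster_ids, gold_labels):
    """Fraction of points that carry the majority gold label of their cluster."""
    if len(cluster_ids) != len(gold_labels):
        raise ValueError("cluster_ids and gold_labels must have the same length")
    if not cluster_ids:
        raise ValueError("at least one data point is required")

    # Count how often each gold label occurs inside each cluster.
    label_counts = {}
    for cluster, label in zip(cluster_ids, gold_labels):
        label_counts.setdefault(cluster, Counter())[label] += 1

    # For every cluster, keep only the points that match its majority label.
    majority_total = sum(max(counts.values()) for counts in label_counts.values())
    return majority_total / len(cluster_ids)


# Hypothetical gold-standard labels for six data points.
gold = ["a", "a", "a", "b", "b", "b"]

# One cluster that mixes both categories.
mixed = purity([1, 1, 1, 1, 1, 1], gold)

# Two clusters, each containing a single category.
homogeneous = purity([1, 1, 1, 2, 2, 2], gold)

print(mixed, homogeneous)
```

Under these assumptions, the mixed clustering scores 0.5 and the homogeneous one scores 1.0. Purity on its own is easy to game, because placing every point in its own cluster also yields 1.0, which is one reason a quality measure is expected to satisfy more than one criterion.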
