Cluster Evaluation Metrics
Evaluating the results of clustering is crucial for assessing the quality and relevance of the groupings discovered by unsupervised algorithms. Unlike supervised learning, however, clustering usually has no true labels or target values to predict, so we rely on evaluation metrics designed for this setting. Internal metrics, such as the silhouette score and the Davies-Bouldin index, judge cluster cohesion and separation using only the data and the assigned cluster labels. External metrics, such as the adjusted Rand index, compare the assigned clusters against reference labels when they happen to be available, as they are for synthetic data. Again, with unsupervised learning techniques, evaluation can be more of an art than a science, but we can still make educated decisions with the right tools. This recipe explores some methods for evaluating your clustering results and optimizing your solution.
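As a rough illustration of how the three metrics named above are called, here is a minimal sketch using scikit-learn. The dataset shape, the number of clusters, and the random seeds are assumptions chosen only for the example; they are not taken from this recipe.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import (
    adjusted_rand_score,
    davies_bouldin_score,
    silhouette_score,
)

# Synthetic data with known generating labels (all parameter values are illustrative assumptions).
X, y_true = make_blobs(n_samples=300, centers=4, cluster_std=0.6, random_state=42)

# Fit K-means and get one cluster label per sample.
labels = KMeans(n_clusters=4, n_init=10, random_state=42).fit_predict(X)

# Internal metrics: need only the data and the assigned labels.
sil = silhouette_score(X, labels)      # ranges from -1 to 1; higher is better
dbi = davies_bouldin_score(X, labels)  # 0 or greater; lower is better

# External metric: needs reference labels, which real unlabeled data will not have.
ari = adjusted_rand_score(y_true, labels)  # 1.0 means the two partitions are identical

print(f"Silhouette score:     {sil:.3f}")
print(f"Davies-Bouldin index: {dbi:.3f}")
print(f"Adjusted Rand index:  {ari:.3f}")
```

Note that the two internal metrics point in opposite directions: a better clustering tends to raise the silhouette score and lower the Davies-Bouldin index.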
Getting ready
To begin, we’ll load our evaluation metrics, create a dummy dataset, and fit a K-means clustering model.
Load the libraries:
from sklearn.metrics import silhouette_score, davies_bouldin_score, adjusted_rand_score
from sklearn.datasets import make_blobs
...