Summary
In this chapter, we discussed how to improve cluster quality. We looked at different techniques for measuring cluster quality, and we discussed intrinsic and extrinsic methods of cluster evaluation. Then, we saw how to use inter-cluster distances to calculate the Dunn index. We also discussed custom distance measures in Mahout. Choosing the wrong distance measure can badly degrade the quality of the clusters. In the next, and final, chapter of this book, we will use Hadoop to run our clustering job and see how to approach clustering in production.