DBSCAN
Another clustering method that can work well for strange cluster shapes is DBSCAN, which stands for Density-Based Spatial Clustering of Applications with Noise. The algorithm works quite differently from k-means or hierarchical clustering: it groups points by local density instead of by distance to a centroid or by merging clusters step by step. With DBSCAN, our clusters are composed of core points and non-core points. A core point has at least n points within a distance epsilon of it (epsilon is the eps parameter in sklearn, and n is the min_samples parameter, which in sklearn counts the point itself). Core points that lie within epsilon of each other belong to the same cluster. Then, any other points within the distance epsilon of a core point are also in that cluster, even though they are not core points themselves. If any points are not within the epsilon distance of any core points, these are outliers (noise). This algorithm assumes we have some lower-density space between clusters, so our clusters must have at least some separation. We can also tune the eps and min_samples hyperparameters to optimize clustering metrics.
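The core, non-core, and outlier distinction described above can be sketched with sklearn's DBSCAN class. This is a minimal illustration, not a tuned example: the make_moons toy dataset, the feature scaling step, and the values eps=0.3 and min_samples=5 are all assumptions chosen for demonstration and would need adjusting for real data.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_moons
from sklearn.preprocessing import StandardScaler

# Toy data (an assumption for illustration): two interleaved half-moons,
# a shape that centroid-based methods such as k-means tend to split badly.
X, _ = make_moons(n_samples=300, noise=0.05, random_state=42)

# eps is a distance, so put the features on a comparable scale first.
X = StandardScaler().fit_transform(X)

# eps and min_samples are illustrative values, not recommendations.
db = DBSCAN(eps=0.3, min_samples=5).fit(X)
labels = db.labels_  # cluster index per point; -1 marks outliers (noise)

# Core points: at least min_samples points (including itself) within eps.
core_mask = np.zeros(len(X), dtype=bool)
core_mask[db.core_sample_indices_] = True

# Outliers: not within eps of any core point.
noise_mask = labels == -1

# Non-core cluster members: within eps of a core point, but not core themselves.
border_mask = ~core_mask & ~noise_mask

n_clusters = len(set(labels)) - (1 if -1 in labels else 0)
print(f"clusters found: {n_clusters}")
print(f"core points: {core_mask.sum()}")
print(f"non-core cluster points: {border_mask.sum()}")
print(f"outliers: {noise_mask.sum()}")
```

Every point falls into exactly one of the three groups, so the three counts sum to the number of samples. Re-running with a smaller eps or a larger min_samples generally moves points out of the clusters and into the outlier group, which is a quick way to see how sensitive the result is to these two hyperparameters.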
The min_samples hyperparameter should generally be between the number of features and two times the...