Kevin Jolly is a formally educated data scientist with a master's degree in data science from the prestigious King's College London. Kevin works as a statistical analyst with a digital healthcare start-up, Connido Limited, in London, where he is primarily involved in leading the data science projects that the company undertakes. He has built machine learning pipelines for small and big data, with a focus on scaling such pipelines into production for the products that the company has built. Kevin is also the author of a book titled Hands-On Data Visualization with Bokeh, published by Packt. He is the editor-in-chief of Linear, a weekly online publication on data science software and products.
In this section, you will learn how the k-means algorithm works under the hood to cluster data into groups that make logical sense.
Let's consider a set of points, as illustrated in the following diagram:
The algorithm's first step is to place a set of centroids at random. Assuming that we want to find two distinct clusters or groups, the algorithm places two centroids, as shown in the following diagram:
In the preceding diagram, the stars represent the centroids of the algorithm. Note that in this case, the clusters' centers perfectly fit the two distinct groups. This is the most...
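The random centroid placement described above can be sketched in a few lines of plain Python. This is only a minimal illustration under stated assumptions: the toy `points` list, the helper names `init_random_centroids` and `assign_to_nearest`, and the choice of picking existing data points as the starting centroids are all hypothetical and do not come from the text.

```python
import math
import random


def init_random_centroids(points, k, seed=None):
    """Pick k distinct data points at random to act as initial centroids.

    Assumption: sampling existing points is one common way to choose
    random starting centroids; the text does not say which way is used.
    """
    rng = random.Random(seed)
    return rng.sample(points, k)


def assign_to_nearest(points, centroids):
    """For each point, return the index of its closest centroid (Euclidean)."""
    labels = []
    for p in points:
        distances = [math.dist(p, c) for c in centroids]
        labels.append(distances.index(min(distances)))
    return labels


# Hypothetical toy data: two loose groups of 2-D points.
points = [
    (1.0, 1.2), (0.8, 0.9), (1.3, 1.1),
    (5.0, 5.2), (5.3, 4.9), (4.8, 5.1),
]

centroids = init_random_centroids(points, k=2, seed=0)
labels = assign_to_nearest(points, centroids)

print("initial centroids:", centroids)
print("cluster labels:   ", labels)
```

Because the starting centroids are chosen at random, they will not always land one per group as in the diagram; a different `seed` can put both centroids in the same group.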