What Is K-Means Clustering?

K-means is an unsupervised learning algorithm that partitions a dataset into k clusters, grouping each data point with the cluster whose centroid is nearest. It follows the sequence of steps described below:

  1. Choose how many clusters to create and call that number k.
  2. Randomly choose k points from the dataset to serve as the initial centroids.
  3. Take each data point and group it with the closest centroid. This will lead to the formation of k clusters.
  4. Calculate the mean of the data points in each cluster and make that mean the cluster's new centroid.
  5. Now repeat the third step by reassigning each data point to the closest of the new centroids.
  6. If any reassignments have taken place, then repeat the fourth step. If not, the model is ready.
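
The steps above can be sketched in a few lines of plain Python. This is a minimal illustration rather than a reference implementation: the function names (`k_means`, `closest_centroid`, `squared_distance`), the `max_iterations` and `seed` parameters, the use of squared Euclidean distance, and the choice to leave an empty cluster's centroid where it is are all assumptions made for the example.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def closest_centroid(point, centroids):
    """Index of the centroid nearest to `point`."""
    return min(range(len(centroids)),
               key=lambda i: squared_distance(point, centroids[i]))


def k_means(points, k, max_iterations=100, seed=0):
    """Cluster `points` (a list of equal-length tuples) into k groups.

    Returns (centroids, labels), where labels[i] is the index of the
    cluster that points[i] was assigned to.
    """
    rng = random.Random(seed)
    # Steps 1-2: pick k data points at random as the initial centroids.
    centroids = [tuple(p) for p in rng.sample(points, k)]
    labels = None

    for _ in range(max_iterations):
        # Steps 3 and 5: assign every point to its closest centroid.
        new_labels = [closest_centroid(p, centroids) for p in points]
        # Step 6: stop once no point has changed cluster.
        if new_labels == labels:
            break
        labels = new_labels
        # Step 4: move each centroid to the mean of its cluster.
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:  # assumption: an empty cluster keeps its old centroid
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, labels


# Two visibly separate groups of 2-D points (made-up illustration data).
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5),
          (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centroids, labels = k_means(points, k=2)
print(centroids, labels)
```

Because the starting centroids are chosen at random, different seeds can number the clusters differently, and on less cleanly separated data they can also produce different final clusters. A common remedy is to run the algorithm several times and keep the best result.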