K-means is an unsupervised learning algorithm for clustering data. It proceeds through the following steps:
- Choose the number of clusters to create and call it k.
- Randomly choose k points from the dataset to serve as the initial centroids.
- Take each data point and group it with the closest centroid. This will lead to the formation of k clusters.
- Calculate the mean of the data points in each cluster and make it that cluster's new centroid.
- Now repeat the third step, reassigning each data point to the closest of the new centroids.
- If any reassignments have taken place, then repeat the fourth step. If not, the algorithm has converged and the model is ready.
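
The steps above can be sketched in plain Python. This is a minimal illustration, not a reference implementation. It assumes squared Euclidean distance as the measure of closeness and uses the per-cluster mean for the centroid update, which is the standard k-means choice. The function names (`k_means`, `assign`, `update`), the `seed` and `max_iter` parameters, and the rule of leaving an empty cluster's centroid where it is are all choices made for this example.

```python
import random


def squared_distance(p, q):
    # Squared Euclidean distance between two points (an assumed metric).
    return sum((a - b) ** 2 for a, b in zip(p, q))


def assign(points, centroids):
    # Step 3: label each point with the index of its closest centroid.
    return [
        min(range(len(centroids)), key=lambda j: squared_distance(p, centroids[j]))
        for p in points
    ]


def update(points, labels, centroids):
    # Step 4: move each centroid to the mean of the points assigned to it.
    new_centroids = []
    for j, old in enumerate(centroids):
        members = [p for p, label in zip(points, labels) if label == j]
        if not members:
            # Assumption: an empty cluster keeps its previous centroid.
            new_centroids.append(old)
            continue
        mean = tuple(sum(p[d] for p in members) / len(members) for d in range(len(old)))
        new_centroids.append(mean)
    return new_centroids


def k_means(points, k, seed=0, max_iter=100):
    # Steps 1-2: k is given; pick k data points at random as initial centroids.
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    labels = assign(points, centroids)
    for _ in range(max_iter):
        centroids = update(points, labels, centroids)
        new_labels = assign(points, centroids)  # Step 5: reassign
        if new_labels == labels:                # Step 6: no reassignments, stop
            break
        labels = new_labels
    return centroids, labels
```

Calling `k_means(points, 2)` on a list of coordinate tuples returns the final centroids and one cluster label per point. The `max_iter` cap is only a safeguard; the loop normally ends earlier, when an assignment pass changes no labels.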