How will you define the number of clusters in a clustering algorithm?

palak-virmani-30ac6553 · 28 August 2020 04:17

iftekar-patel-f1e6bf65 · 29 August 2020 00:10

Elbow Method

Run a clustering algorithm (e.g., k-means) for different values of k, for instance k = 1 to 10.
For each k, calculate the total within-cluster sum of squares (WSS).
Plot WSS against the number of clusters k.
The location of a bend (the "elbow" or knee) in the plot is generally taken as an indicator of the appropriate number of clusters.
ref: https://towardsdatascience.com/10-tips-for-choosing-the-optimal-number-of-clusters-277e93d72d92
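
The steps above can be sketched roughly like this with scikit-learn. This is only an illustration: the synthetic data, the explicit blob centers, and the helper name wss_curve are my own assumptions, not something from the linked article. scikit-learn exposes the within-cluster sum of squares as the inertia_ attribute of a fitted KMeans model.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Assumed toy data: four well-separated blobs, so the elbow is easy to see.
centers = [(-5, -5), (-5, 5), (5, -5), (5, 5)]
X, _ = make_blobs(n_samples=400, centers=centers, cluster_std=1.0, random_state=0)


def wss_curve(X, k_values):
    """Fit k-means for each k and return the total within-cluster sum of squares."""
    wss = []
    for k in k_values:
        model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
        wss.append(model.inertia_)  # inertia_ is scikit-learn's name for WSS
    return wss


k_values = list(range(1, 11))
wss = wss_curve(X, k_values)

# Print the curve; the drop should flatten out sharply after the "elbow".
for k, w in zip(k_values, wss):
    print(f"k={k:2d}  wss={w:.1f}")
```

To get the actual plot, something like plt.plot(k_values, wss, marker="o") with matplotlib should do. On this toy data the curve drops steeply up to k = 4 and then flattens, but on real data the bend is often much less obvious.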

iftekar-patel-f1e6bf65 · 29 August 2020 00:11

https://towardsdatascience.com/10-tips-for-choosing-the-optimal-number-of-clusters-277e93d72d92

chirag-garg · 15 August 2021 16:31

Before clustering any dataset, you may want to check whether the dataset is clusterable at all. There is a group of methods that assess whether a dataset can be meaningfully clustered. This is called assessing clustering tendency.
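
One commonly used clustering tendency check is the Hopkins statistic (my choice for illustration; the methods meant above are not named). Below is a minimal sketch, assuming the simplified variant that uses raw nearest-neighbor distances rather than distances raised to the power of the dimension. The function name hopkins_statistic and the toy data are hypothetical. With this convention, values near 0.5 suggest roughly uniform data and values close to 1 suggest cluster structure.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.neighbors import NearestNeighbors


def hopkins_statistic(X, sample_size=50, seed=0):
    """Simplified Hopkins statistic: ~0.5 for uniform data, near 1 for clustered data."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    m = min(sample_size, n - 1)
    nn = NearestNeighbors(n_neighbors=2).fit(X)

    # u: distance from m uniform random points (inside the bounding box of X)
    # to their nearest real data point.
    uniform_points = rng.uniform(X.min(axis=0), X.max(axis=0), size=(m, d))
    u = nn.kneighbors(uniform_points, n_neighbors=1)[0][:, 0]

    # w: distance from m sampled real points to their nearest *other* real point
    # (column 0 is the point itself at distance 0, so take column 1).
    idx = rng.choice(n, size=m, replace=False)
    w = nn.kneighbors(X[idx], n_neighbors=2)[0][:, 1]

    return u.sum() / (u.sum() + w.sum())


# Assumed toy data: clustered blobs versus uniform noise.
centers = [(-5, -5), (-5, 5), (5, -5), (5, 5)]
X_blobs, _ = make_blobs(n_samples=400, centers=centers, cluster_std=1.0, random_state=0)
X_noise = np.random.default_rng(1).uniform(-8, 8, size=(400, 2))

h_blobs = hopkins_statistic(X_blobs)
h_noise = hopkins_statistic(X_noise)
print(f"blobs: {h_blobs:.2f}  uniform noise: {h_noise:.2f}")
```

Note that some references define the statistic the other way around (near 0 means clustered), so check the convention before comparing numbers across tools.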

After this step, another group of methods can be used to determine the optimal number of clusters for the dataset. This is called relative clustering validation. It consists of running a clustering algorithm on the same data while varying the parameters of the algorithm (e.g., the number of clusters). After each run, the quality of the model is measured with a clustering validation metric (e.g., the silhouette index). After all runs, the best model in terms of quality is chosen.
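
As a rough sketch of that procedure, assuming k-means as the algorithm and the silhouette score as the validation metric (the toy data and the helper name silhouette_by_k are hypothetical):

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Assumed toy data: four well-separated blobs.
centers = [(-5, -5), (-5, 5), (5, -5), (5, 5)]
X, _ = make_blobs(n_samples=400, centers=centers, cluster_std=1.0, random_state=0)


def silhouette_by_k(X, k_values):
    """Run k-means for each k and score the resulting labels with the silhouette index."""
    scores = {}
    for k in k_values:
        labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
        scores[k] = silhouette_score(X, labels)  # in [-1, 1], higher is better
    return scores


# The silhouette score needs at least 2 clusters, so start from k = 2.
scores = silhouette_by_k(X, range(2, 11))
best_k = max(scores, key=scores.get)

for k, s in scores.items():
    print(f"k={k:2d}  silhouette={s:.3f}")
print("best k:", best_k)
```

The same loop works with other clustering algorithms or other validation metrics; only the two scikit-learn calls inside the loop need to change.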