How to Find the Optimal Number of Clusters in the Agglomerative Clustering Algorithm?

To find the optimal number of clusters, the Silhouette Score is one of the most popular approaches. It measures how close each observation is to the other observations in its own cluster, compared with the observations in the nearest neighboring cluster.

Let a_i be the mean distance between observation i and the other observations in the cluster to which observation i is assigned.

Let b_i be the minimum, taken over all other clusters, of the mean distance between observation i and the observations in that cluster.

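The Silhouette Score of observation i is then:

$$ s_i = \frac{b_i - a_i}{\max(a_i, b_i)} $$

The score of a whole clustering is the mean of s_i over all observations, and the optimal number of clusters is the one that gives the highest mean score.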
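To make the two mean distances concrete, here is a minimal sketch that computes (b_i - a_i) / max(a_i, b_i) for every observation using only the Python standard library. The function name `silhouette_values`, the toy points, and the labels are made up for illustration, and Euclidean distance is an assumption; the sketch also assumes at least two clusters and scores a single-member cluster as 0, which is a common convention.

```python
from math import dist  # Euclidean distance between two points (Python 3.8+)

def silhouette_values(points, labels):
    """Return the silhouette value s_i of every observation (needs >= 2 clusters)."""
    values = []
    for i, (p, own) in enumerate(zip(points, labels)):
        # Distances from observation i to the other members of its own cluster.
        same = [dist(p, q) for j, (q, lab) in enumerate(zip(points, labels))
                if lab == own and j != i]
        if not same:
            # Assumed convention: an observation alone in its cluster scores 0.
            values.append(0.0)
            continue
        a_i = sum(same) / len(same)
        # Mean distance to each other cluster; b_i is the smallest of these means.
        b_i = min(
            sum(dist(p, q) for q, lab in zip(points, labels) if lab == other)
            / labels.count(other)
            for other in set(labels) if other != own
        )
        denom = max(a_i, b_i)
        values.append((b_i - a_i) / denom if denom > 0 else 0.0)
    return values

# Toy data (hypothetical): two tight groups on a line.
points = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (11.0, 0.0)]
labels = [0, 0, 1, 1]
scores = silhouette_values(points, labels)
mean_score = sum(scores) / len(scores)
print(scores, mean_score)
```

For the first point, a_i = 1 and b_i = (10 + 11) / 2 = 10.5, so its value is 9.5 / 10.5, roughly 0.905. All four values are close to +1 because the two groups are far apart relative to their width.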

Conclusion:

  • Silhouette Scores range from -1 to +1. The higher the score, the better the observations are clustered.
  • A Silhouette Score close to +1 indicates that data point i is well matched to its own cluster and far from neighboring clusters. A score near 0 places it on the boundary between two clusters, and a negative score suggests it may belong to a different cluster.
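In practice, the number of clusters can be chosen by running agglomerative clustering for several candidate values and keeping the one with the highest mean Silhouette Score. The sketch below assumes scikit-learn is installed. The synthetic data, the candidate range of 2 to 6, and the default Ward linkage are illustrative choices, not part of the method itself.

```python
from sklearn.cluster import AgglomerativeClustering
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Synthetic data (an assumption for illustration): three well-separated blobs.
X, _ = make_blobs(
    n_samples=150,
    centers=[[0, 0], [10, 10], [-10, 10]],
    cluster_std=1.0,
    random_state=42,
)

# Mean Silhouette Score for each candidate number of clusters.
scores = {}
for k in range(2, 7):
    labels = AgglomerativeClustering(n_clusters=k).fit_predict(X)
    scores[k] = silhouette_score(X, labels)

# The candidate with the highest mean score is taken as the optimal number.
best_k = max(scores, key=scores.get)
print(best_k, round(scores[best_k], 3))
```

Because the blobs here are generated around three distant centers, the highest score should be at three clusters. On real data the peak is often less sharp, so it helps to look at the scores of all candidates rather than only the maximum.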