What are the conditions to stop combining clusters in the hierarchical clustering algorithm?

Some popular approaches for deciding when to stop combining clusters are as follows:

Approach 1: Pick the number of clusters (k) upfront

If we don’t want to end up with, say, 250 clusters, we choose the value of k upfront. That is, we decide at the start how many clusters we need (say, five or six) and stop merging once only k clusters remain. This puts a limit on how much information flows out of the clustering step.

This matters especially when the cluster assignments are fed into another algorithm that expects a fixed, small number of groups.

Possible challenges: This approach only makes sense when you know the data well or have domain knowledge that justifies a particular k. If you’re analyzing a brand-new dataset, you may have no idea how many clusters are appropriate.
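As a concrete illustration, here is a minimal sketch of Approach 1 using scikit-learn’s AgglomerativeClustering; the synthetic data and the choice k = 3 are assumptions for the example, not part of the discussion above.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering

rng = np.random.default_rng(42)
# Three synthetic blobs, so a sensible k is known in advance for this toy case.
X = np.vstack([
    rng.normal(loc=0.0, scale=0.5, size=(50, 2)),
    rng.normal(loc=5.0, scale=0.5, size=(50, 2)),
    rng.normal(loc=10.0, scale=0.5, size=(50, 2)),
])

# n_clusters=3 tells the algorithm to stop merging once three clusters remain.
model = AgglomerativeClustering(n_clusters=3, linkage="ward")
labels = model.fit_predict(X)
print(np.bincount(labels))  # sizes of the three resulting clusters
```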

Approach 2: Stop combining clusters when the next merge would produce a cluster with low cohesion.

We keep merging until the next combination would yield a poor cluster with low cohesion, meaning the points in the two candidate clusters are so far apart that it doesn’t make sense to put them in the same cluster.
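One practical proxy for "the next merge has low cohesion" is the merge distance recorded in the linkage matrix: a sudden jump means the next merge would join clusters that are far apart. The sketch below, using SciPy on synthetic data (both assumptions on my part), cuts the dendrogram just before the largest jump.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(0)
# Two well-separated blobs: merges inside a blob are cheap, while the final
# cross-blob merge is expensive (low cohesion).
X = np.vstack([
    rng.normal(loc=0.0, scale=0.5, size=(30, 2)),
    rng.normal(loc=6.0, scale=0.5, size=(30, 2)),
])

Z = linkage(X, method="average")   # Z[:, 2] holds successive merge distances
gaps = np.diff(Z[:, 2])            # gap between one merge and the next
i = np.argmax(gaps)                # the largest gap marks the low-cohesion merge
cut = (Z[i, 2] + Z[i + 1, 2]) / 2  # cut the dendrogram inside that gap
labels = fcluster(Z, t=cut, criterion="distance")
print(f"stopped with {labels.max()} clusters")
```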

Approach 3: Stop combining clusters when a size measure of the newly formed cluster exceeds a threshold.

Approach 3.1: Diameter of a cluster

The diameter of a cluster is defined as the maximum distance between any pair of observations in the cluster.

We stop combining clusters when the diameter of the newly formed cluster would exceed the threshold. A merge that pushes the diameter past the threshold is stretching one cluster over points that are too far apart to belong together.
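One way to sketch this rule is with complete linkage, where the height at which two clusters merge equals the diameter of the merged cluster, so cutting the dendrogram at a diameter threshold enforces the rule exactly. The data and the threshold value below are assumptions for illustration.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import pdist

rng = np.random.default_rng(1)
X = np.vstack([
    rng.normal(loc=0.0, scale=0.4, size=(40, 2)),
    rng.normal(loc=5.0, scale=0.4, size=(40, 2)),
])

# Complete linkage merges at a height equal to the new cluster's diameter,
# so cutting at t caps every cluster's diameter at t.
Z = linkage(X, method="complete")
labels = fcluster(Z, t=2.5, criterion="distance")

# Verify: no cluster's diameter (max pairwise distance) exceeds the threshold.
for c in np.unique(labels):
    members = X[labels == c]
    diam = pdist(members).max() if len(members) > 1 else 0.0
    print(f"cluster {c}: diameter = {diam:.2f}")
```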

Approach 3.2: Radius of a cluster

The radius of a cluster is defined as the maximum distance of any point in the cluster from its centroid.

We stop combining clusters when the radius of the newly formed cluster would exceed the threshold (chosen by the user according to the problem at hand).
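Here is a minimal sketch of the radius check, assuming a helper that tests a candidate merge against a user-chosen threshold; the function names, data, and threshold value are illustrative, not from the text above.

```python
import numpy as np

def radius(points: np.ndarray) -> float:
    """Maximum distance of any point from the cluster centroid."""
    centroid = points.mean(axis=0)
    return float(np.linalg.norm(points - centroid, axis=1).max())

def can_merge(a: np.ndarray, b: np.ndarray, threshold: float) -> bool:
    """Accept the merge only if the merged cluster's radius stays under the threshold."""
    return radius(np.vstack([a, b])) <= threshold

rng = np.random.default_rng(2)
near = rng.normal(loc=0.0, scale=0.2, size=(20, 2))   # tight cluster at the origin
close = rng.normal(loc=1.0, scale=0.2, size=(20, 2))  # tight cluster nearby
far = rng.normal(loc=8.0, scale=0.2, size=(20, 2))    # tight cluster far away

print(can_merge(near, close, threshold=2.0))  # True: merged radius stays small
print(can_merge(near, far, threshold=2.0))    # False: merged radius is far too large
```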