Types of unsupervised learning

Unsupervised learning problems are further grouped into clustering and association problems.

Clustering

Clustering is a central concept in unsupervised learning. It deals with finding structure or patterns in a collection of uncategorized data. Clustering algorithms process your data and find natural clusters (groups) if they exist in the data. With many algorithms you can also set how many clusters should be identified, which lets you adjust the granularity of the groups.

There are different types of clustering you can utilize:

Exclusive (partitioning)

In this clustering method, data are grouped so that each data point belongs to exactly one cluster.

Example: K-means
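
To make the "one cluster per point" idea concrete, here is a minimal pure-Python sketch of the K-means loop (Lloyd's algorithm). The function name `kmeans`, the sample points, and the choice of starting centroids are assumptions made for illustration. A real project would normally use a library implementation with a better initialization.

```python
from math import dist

def kmeans(points, centroids, iterations=10):
    """Alternate between assigning points and recomputing centroids."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(iterations):
        # Assignment step: each point gets exactly one label,
        # the index of its nearest centroid (exclusive membership).
        labels = [
            min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for i in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == i]
            if members:  # leave an empty cluster's centroid where it is
                centroids[i] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels, centroids

# Hypothetical sample data: two visibly separate groups of 2-D points.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
labels, centroids = kmeans(points, centroids=[points[0], points[3]])
print(labels)  # one cluster index per point
```

Changing the number of starting centroids changes how many clusters are produced, which is the granularity control described earlier.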

Agglomerative

In this clustering technique, every data point starts as its own cluster. The two nearest clusters are then merged repeatedly, so the number of clusters shrinks with each iteration.

Example: Hierarchical clustering
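
The merging process can be sketched in a few lines of pure Python. This is only an illustration under some assumptions: the data are one-dimensional numbers, the distance between two clusters is taken as the distance between their closest members (single linkage, one of several possible choices), and the name `agglomerative` is hypothetical.

```python
def agglomerative(values, n_clusters):
    """Merge the two nearest clusters until n_clusters remain."""
    # Start with every value in its own cluster.
    clusters = [[v] for v in values]

    def linkage(a, b):
        # Single linkage: distance between the closest pair of members.
        return min(abs(x - y) for x in a for y in b)

    while len(clusters) > n_clusters:
        # Find the pair of clusters with the smallest linkage distance.
        i, j = min(
            ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda pair: linkage(clusters[pair[0]], clusters[pair[1]]),
        )
        # Union the pair; j > i, so deleting j does not shift index i.
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

result = agglomerative([1.0, 1.2, 5.0, 5.1, 9.0], n_clusters=3)
print(result)
```

Stopping at a different `n_clusters` gives a coarser or finer grouping of the same data.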

Overlapping

In this technique, fuzzy sets are used to cluster data. Each point may belong to two or more clusters, with a separate degree of membership in each.

Here, each data point is associated with a membership value for every cluster.

Example: Fuzzy C-Means
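
As a rough illustration of membership values, the sketch below computes the degrees of membership of a single one-dimensional point for a fixed set of cluster centers, using the standard Fuzzy C-Means membership formula. It covers only this one step, not the full algorithm (which also updates the centers). The function name `fuzzy_memberships`, the sample centers, and the fuzzifier value `m=2.0` are assumptions.

```python
def fuzzy_memberships(point, centers, m=2.0):
    """Return one membership degree per center; the degrees sum to 1."""
    dists = [abs(point - c) for c in centers]
    zeros = dists.count(0.0)
    if zeros:
        # The point sits exactly on a center: it belongs fully to that center.
        return [1.0 / zeros if d == 0.0 else 0.0 for d in dists]
    exponent = 2.0 / (m - 1.0)
    # u_i = 1 / sum_k (d_i / d_k) ** (2 / (m - 1))
    return [1.0 / sum((d_i / d_k) ** exponent for d_k in dists) for d_i in dists]

# A point close to the first center belongs mostly, but not only, to it.
near_first = fuzzy_memberships(2.0, centers=[0.0, 10.0])
# A point halfway between the centers belongs equally to both.
halfway = fuzzy_memberships(5.0, centers=[0.0, 10.0])
print(near_first, halfway)
```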

Probabilistic

This technique uses probability distributions to create the clusters.

Example: The following keywords

  • “men’s shoe”
  • “women’s shoe”
  • “women’s glove”
  • “men’s glove”

can be clustered into two categories in two different ways: “shoe” and “glove”, or “men” and “women”.
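
The point of the keyword example is that the same data can be grouped in more than one valid way, depending on which feature the grouping looks at. The sketch below shows this with plain string handling. It is not a probabilistic model, and the helper name `group_by` is hypothetical.

```python
from collections import defaultdict

keywords = ["men's shoe", "women's shoe", "women's glove", "men's glove"]

def group_by(keywords, token_index):
    """Group keywords by the word at the given position."""
    groups = defaultdict(list)
    for keyword in keywords:
        groups[keyword.split()[token_index]].append(keyword)
    return dict(groups)

by_item = group_by(keywords, token_index=1)    # groups "shoe" and "glove"
by_wearer = group_by(keywords, token_index=0)  # groups "men's" and "women's"
print(by_item)
print(by_wearer)
```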

Unsupervised learning means there is no training phase in which we feed labelled data to the learning algorithm to train the model. Instead, the algorithm has to figure things out by itself.

Two types of unsupervised learning are Clustering and Association.

Clustering algorithms group data into clusters based on similar patterns. For example, if you feed in a large number of pictures of various animals, the clustering algorithm will group them into clusters that may correspond to cats, dogs, and so on, without ever being told those names.

Association algorithms identify relationships between variables. A frequently quoted example is sales data: the algorithm can identify patterns such as people who bought item X having a probability of p% of also buying item Y.
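
The "p%" in the sales example is usually called the confidence of the rule "X implies Y". A minimal sketch of how it could be computed from a list of transactions follows. The function names and the shopping baskets are made up for illustration, and real association-rule miners (such as Apriori) also search over candidate item sets rather than checking a single pair.

```python
def support(transactions, items):
    """Fraction of all transactions that contain every item in `items`."""
    return sum(1 for t in transactions if items <= t) / len(transactions)

def confidence(transactions, x, y):
    """Among transactions containing x, the fraction that also contain y."""
    with_x = [t for t in transactions if x in t]
    if not with_x:
        return 0.0  # x never occurs, so the rule has no evidence
    return sum(1 for t in with_x if y in t) / len(with_x)

# Hypothetical sales data: each set is one customer's basket.
transactions = [
    {"bread", "butter"},
    {"bread", "butter", "milk"},
    {"bread", "milk"},
    {"milk"},
    {"bread", "butter", "jam"},
]

rule_confidence = confidence(transactions, "bread", "butter")
rule_support = support(transactions, {"bread", "butter"})
print(rule_confidence)  # 0.75: 3 of the 4 bread baskets also contain butter
print(rule_support)     # 0.6: 3 of the 5 baskets contain both items
```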