Explain Univariate Analysis in Machine Learning?

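Univariate analysis examines one variable at a time. In machine learning it is typically applied to a single predictor, either to summarize that predictor's distribution or to assess its relationship with the response variable in isolation from the other predictors. The prefix “uni” means one, emphasizing that the analysis accounts for only one variable’s effect on the dependent variable.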

Univariate analysis makes it easy to identify central tendency (mean, median, and mode), dispersion (range, variance, and standard deviation), and quartiles (including the interquartile range).
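As a minimal sketch of these measures, the snippet below computes them with Python's standard `statistics` module. It uses the IQ scores from the example in this article; the variable names are illustrative only, and the variance and standard deviation shown are the sample versions (an assumption, since the article does not say whether the scores are a sample or a population).

```python
import statistics

# IQ scores taken from the example in this article.
iq_scores = [118, 139, 124, 125, 127, 128, 129, 130, 130,
             133, 136, 138, 141, 142, 149, 130, 154]

# Central tendency
mean = statistics.mean(iq_scores)
median = statistics.median(iq_scores)
mode = statistics.mode(iq_scores)

# Dispersion (sample variance and sample standard deviation; assumption)
value_range = max(iq_scores) - min(iq_scores)
variance = statistics.variance(iq_scores)
std_dev = statistics.stdev(iq_scores)

# Quartiles: quantiles(n=4) returns the three cut points Q1, Q2, Q3
q1, q2, q3 = statistics.quantiles(iq_scores, n=4)
iqr = q3 - q1

print(f"mean={mean:.2f} median={median} mode={mode}")
print(f"range={value_range} variance={variance:.2f} std_dev={std_dev:.2f}")
print(f"Q1={q1} Q3={q3} IQR={iqr}")
```

Note that quartile values depend on the interpolation method; `statistics.quantiles` defaults to the "exclusive" method, so other tools may report slightly different cut points.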

Univariate data can be described through:

  • Frequency Distribution Tables: A frequency distribution table shows how often each value, or range of values, occurs in the data. It gives a quick summary of the data and makes patterns easier to spot.

Example:

The list of IQ scores is: 118, 139, 124, 125, 127, 128, 129, 130, 130, 133, 136, 138, 141, 142, 149, 130, 154.

IQ Range    Number
118-125     3
126-133     7
134-141     4
142-149     2
150-157     1

  • Bar Charts: A bar chart is convenient for comparing categories or groups of data, and it can also be used to track changes over time. It is best suited to discrete data.

  • Histograms: Histograms look similar to bar charts, but instead of categories they group numeric values into bins, with each bar showing the number of data points that fall within that range. They are best suited to continuous data.

  • Pie Charts: Pie charts are mainly used to show how a whole is broken down into parts. The whole pie represents 100 percent, and each slice denotes the relative size of its category.

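  • Frequency Polygons: Like a histogram, a frequency polygon shows a frequency distribution, but it plots a point at the midpoint of each bin and connects the points with straight lines. It is useful for comparing datasets or displaying a cumulative frequency distribution.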
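The frequency table from the IQ example can be reproduced with a short script, and the same counts supply the numbers behind the charts described above. This is only a sketch: the bin width of 8 and the starting point of 118 are assumptions inferred from the ranges in the table, and the variable names are illustrative. It prints a text histogram, then derives the percentage shares a pie chart would show and the midpoints and cumulative counts a frequency polygon would plot.

```python
from itertools import accumulate

# IQ scores taken from the example in this article.
iq_scores = [118, 139, 124, 125, 127, 128, 129, 130, 130,
             133, 136, 138, 141, 142, 149, 130, 154]

bin_width = 8           # assumption: inferred from the ranges in the table
start = min(iq_scores)  # first bin starts at the lowest score

# Build inclusive bins: (118, 125), (126, 133), ...
bins = []
low = start
while low <= max(iq_scores):
    bins.append((low, low + bin_width - 1))
    low += bin_width

# Count how many scores fall into each bin.
freq = {b: 0 for b in bins}
for score in iq_scores:
    index = (score - start) // bin_width
    freq[bins[index]] += 1

# Frequency table with a simple text histogram.
for (lo, hi), count in freq.items():
    print(f"{lo}-{hi}  {count:>2}  {'#' * count}")

counts = list(freq.values())
total = sum(counts)

# Pie chart: each slice is the bin's share of all observations.
percentages = [100 * c / total for c in counts]

# Frequency polygon: one point per bin, placed at the bin midpoint.
midpoints = [(lo + hi) / 2 for lo, hi in bins]

# Cumulative frequency: running total of the counts.
cumulative = list(accumulate(counts))
```

Plotting `midpoints` against `counts` (or against `cumulative`) with any charting library would then give the frequency polygon; the choice of bin width changes the shape of both the histogram and the polygon.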