Skewness describes how asymmetric the distribution of data is. In a normal distribution, the data is spread evenly around the mean; in other words, it is symmetric about the mean.

However, when the data points stretch further toward higher values (the right side) or lower values (the left side), the distribution is said to be skewed: right skewed or left skewed, respectively.
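To put a number on this, here is a minimal sketch of the moment-based skewness coefficient (third central moment divided by the second central moment raised to the power 1.5). The function name `skewness` and the small sample lists are made up for illustration; other skewness formulas, such as bias-corrected ones, will give slightly different values.

```python
def skewness(values):
    """Moment-based sample skewness: m3 / m2**1.5 (no bias correction)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    return m3 / m2 ** 1.5

# Hypothetical samples, chosen only to show the sign of the result.
symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 2, 2, 3, 3, 3, 4, 20]       # one large value stretches the right tail
left_skewed = [-x for x in right_skewed]       # mirror image stretches the left tail

print(skewness(symmetric))     # 0.0 for perfectly symmetric data
print(skewness(right_skewed))  # positive value
print(skewness(left_skewed))   # negative value
```

A value near zero suggests symmetry, a positive value a longer right tail, and a negative value a longer left tail.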

For a right (positively) skewed distribution, we typically have

Mean >= Median >= Mode

Here, the mean is pulled upward because the tail on the right side is longer and flatter. The mode is lower because most of the data is concentrated on the left side of the distribution.

For a left (negatively) skewed distribution, we typically have

Mode >= Median >= Mean

Here, the mean is pulled downward because the tail on the left side is longer and flatter. The mode is higher because most of the data is concentrated on the right side of the distribution.
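The two orderings can be checked on a small example. This is only a sketch using Python's standard `statistics` module; the helper `summarize` and the two sample lists are invented for illustration, and real data does not always follow the ordering this neatly.

```python
import statistics

def summarize(values):
    """Return (mean, median, mode) for a list of numbers."""
    return (
        statistics.mean(values),
        statistics.median(values),
        statistics.mode(values),
    )

# Hypothetical samples: most values are small, with a long tail to the right.
right_skewed = [1, 2, 2, 2, 3, 3, 4, 5, 9, 15]
# Mirror the sample so the long tail points to the left instead.
left_skewed = [16 - x for x in right_skewed]

r_mean, r_median, r_mode = summarize(right_skewed)
l_mean, l_median, l_mode = summarize(left_skewed)

print(r_mean, r_median, r_mode)  # 4.6 3.0 2   -> Mean >= Median >= Mode
print(l_mean, l_median, l_mode)  # 11.4 13.0 14 -> Mode >= Median >= Mean
```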

But how do we deal with it?

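To deal with skewness, we can apply a transformation to the data so that its information is preserved and yet the transformed data follows a more symmetric curve. Common choices include taking the square root, cube root, log, or reciprocal of each data point and plotting again. These are usually applied to right-skewed data; the log requires strictly positive values, and the reciprocal requires nonzero values.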
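As a rough illustration of the effect, the sketch below generates a right-skewed sample and compares its skewness before and after a few of these transformations. The synthetic log-normal sample, the seed, and the `skewness` helper are all assumptions made for the demo, not part of any particular library workflow. The reciprocal is left out because it reverses the order of the values, which makes it harder to compare side by side.

```python
import math
import random

def skewness(values):
    """Moment-based sample skewness: m3 / m2**1.5 (no bias correction)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    return m3 / m2 ** 1.5

# Synthetic right-skewed, strictly positive data (assumed for the demo).
rng = random.Random(0)
raw = [rng.lognormvariate(0.0, 1.0) for _ in range(1000)]

transformed = {
    "raw": raw,
    "sqrt": [math.sqrt(x) for x in raw],
    "cbrt": [x ** (1 / 3) for x in raw],
    "log": [math.log(x) for x in raw],
}

# Skewness of each version of the data; values closer to 0 are more symmetric.
results = {name: skewness(vals) for name, vals in transformed.items()}
for name, value in results.items():
    print(f"{name:>5}: skewness = {value:.2f}")
```

On a sample like this, the stronger the transformation (square root, then cube root, then log), the more the right tail is pulled in. Which one works best depends on the data, so it is worth checking the skewness or a histogram after each attempt.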

#statistics #datascience #machinelearning