The probability distributions which can be either discrete or continuous. The most well-known continuous distribution is the Normal Distribution, which is also known as the Gaussian distribution or the “Bell Curve.”
Consider a normal distribution of heights of people. Most of the heights are clustered in the middle part which is taller and gradually reduces towards left and right extremes which denote a lower probability of getting that value randomly.
This curve is centered at its mean and can be tall and slim or it can be short and spread out. A slim one denotes that there are less number of distinct values that we can sample. And a more spread out curve shows that there are a larger range of values. This spread is defined by its Standard Deviation.
Greater the SD, more spread will be your data. SD is just a mathematical derivation of another property called the Variance, which defines how much the data ‘varies’. And variance is what data is all about, Variance is information. No Variance, no information.
A lot of measurements are normally distributed in real scenarios. And this distribution has a crucial use, which I’ll be talking about in my next post!
#datascience #statistics #machinelearning #probability