This is part of a conversation series explaining different data science topics.
If you have not been following along, the links for each part are mentioned below.
Data Science different topic’s explanation – Part-9 – Normal Distribution
from scipy.stats import norm
import matplotlib.pyplot as plt
import numpy as np

def normal() -> None:
    fig, ax = plt.subplots(1, 1)
    # calculate the first few moments
    mean, var, skew, kurt = norm.stats(moments='mvsk')
    # display the probability density function (`pdf`)
    x = np.linspace(norm.ppf(0.01), norm.ppf(0.99), 100)
    ax.plot(x, norm.pdf(x), 'r-', lw=5, alpha=0.6, label='norm pdf')
    ax.plot(x, norm.cdf(x), 'b-', lw=5, alpha=0.6, label='norm cdf')
    # check accuracy of `cdf` and `ppf`
    vals = norm.ppf([0.001, 0.5, 0.999])
    np.allclose([0.001, 0.5, 0.999], norm.cdf(vals))
    # generate random numbers:
    r = norm.rvs(size=1000)
    # and compare the histogram
    ax.hist(r, density=True, histtype='stepfilled', alpha=0.2)
    ax.legend(loc='best')
    plt.show()
Continuing the remaining content from the above post.
At the heart of statistics lies the normal distribution, known to millions of people as the bell-shaped curve. It is a two-parameter family of curves that represent plots of probability density functions:
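The density formula itself appears to be missing here (it was likely an image in the original post); the standard form, written out for reference, is:

```latex
f(x) = \frac{1}{\sigma\sqrt{2\pi}} \, \exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)
```

Here µ is the expectation and σ² the variance, matching the two parameters discussed next.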
So, the specific form of the normal distribution depends on 2 parameters: the expectation (µ) and the variance (σ²), briefly denoted N(µ, σ²). The parameter µ (expectation) determines the distribution's centre, which corresponds to the maximum height of the graph. The variance σ² characterises the range of variation, that is, the "spread" of the data.
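In scipy terms, these two parameters map onto `loc` (the mean µ) and `scale` (the standard deviation σ, i.e. the square root of σ²). A small sketch of how shifting µ and growing σ changes the curve, using illustrative values of my own choosing:

```python
from scipy.stats import norm

# N(0, 1): centred at 0, standard deviation 1
standard = norm(loc=0, scale=1)
# N(2, 4): centre shifted to 2, standard deviation 2, so a wider, flatter curve
wide = norm(loc=2, scale=2)

# Each density peaks at its own mean; the wider curve's peak is lower,
# since the total area under every pdf must stay equal to 1.
print(standard.pdf(0))  # peak height of N(0, 1), about 0.3989
print(wide.pdf(2))      # peak height of N(2, 4), about 0.1995
```

The peak height is 1/(σ√(2π)), so doubling σ halves the maximum of the curve.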
Another interesting detail of this distribution is when we calculate the standard deviation we find that:
about 68% of values are within 1 standard deviation of the mean.
about 95% of values are within 2 standard deviations of the mean.
about 99.7% of values are within 3 standard deviations of the mean.
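This 68–95–99.7 rule can be checked directly with the `norm.cdf` function used in the code above: the probability of falling within k standard deviations of the mean is cdf(k) − cdf(−k) for the standard normal.

```python
from scipy.stats import norm

# Probability mass within k standard deviations of the mean for N(0, 1)
for k in (1, 2, 3):
    p = norm.cdf(k) - norm.cdf(-k)
    print(f"within {k} standard deviation(s): {p:.4f}")
# within 1 standard deviation(s): 0.6827
# within 2 standard deviation(s): 0.9545
# within 3 standard deviation(s): 0.9973
```

The same fractions hold for any N(µ, σ²), since standardising (subtracting µ and dividing by σ) maps it back to N(0, 1).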
Please feel free to share your thoughts on this post in the comment section, and make sure to give it a like to support and motivate fresh content.