Probability Distribution

kalyani-dhanwada-55d091d2 · 15 June 2022 07:25

Have you ever thought about how a single decision of ours may lead to various possible outcomes? In our daily lives, we make decisions by weighing the possible outcomes, and this kind of decision making is just as common in business. Every businessperson or entrepreneur decides to pursue a project only when he or she sees value in it. Our decisions may not be backed by huge sheets of data and analysis, but theirs certainly are. So, let us delve into the details.

As defined by the famous “Wikipedia” (a little tweaked with my understanding), a probability distribution is a listing of all the possible outcomes that could result if an experiment were done, together with the probability of each outcome. For example, let us take a coin and flip it. What do you think the outcomes are? (Does she really not know that?) Obviously, it is either heads or tails. Here, for a fair coin, the probability of each outcome is equal (0.5 each). That is, you are going to get either a head or a tail. (Yes, there are cases where the coin lands on its edge, but that is in the movies. Let us stay in reality.)

So when an experiment is performed, we note all the possible outcomes of the experiment, just as we list all the possible outcomes of our decision. This list is known as the sample space. A probability distribution is then the mathematical function that assigns to each outcome (or event) in the sample space the probability of it happening.

While I was learning probability in school, I remember my teacher using the term frequency distribution. So when I read about probability distributions, the first thing I wanted to know was how the two differ. A frequency distribution is the list of observed frequencies of each outcome after the experiment has actually been carried out. A probability distribution (PD), on the other hand, is the list of all possible outcomes, with their probabilities, that could result if the experiment were done (as I told you, what we anticipate the result could be).
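
To make the difference concrete, here is a small Python sketch of my own (not taken from any textbook). It assumes a fair coin, and the names `probability_distribution` and `frequency_distribution` are made up for this illustration. The first is what we anticipate before flipping; the second is what a simulated experiment actually produced.

```python
import random
from collections import Counter

# Probability distribution: what we expect before flipping (assumes a fair coin).
probability_distribution = {"heads": 0.5, "tails": 0.5}

def frequency_distribution(n_flips, seed=0):
    """Flip a simulated fair coin n_flips times and count what actually happened."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    flips = [rng.choice(["heads", "tails"]) for _ in range(n_flips)]
    return Counter(flips)

n_flips = 10_000
observed = frequency_distribution(n_flips)
# Relative frequencies are the observed counts divided by the number of flips.
relative = {side: count / n_flips for side, count in observed.items()}

print("theoretical:", probability_distribution)
print("observed   :", relative)
```

With many flips, the observed relative frequencies should land close to the theoretical 0.5 each, but they will rarely match it exactly.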

Types of Probability Distribution

When the random variable can take only a countable number of distinct values that we can list out, we have a discrete probability distribution. A coin toss is one such case, as we can list exactly what the outcomes might be. But think about the temperature of your room. It is something that varies continuously. Distributions like this, where one cannot list all the possible outcomes because the value can fall anywhere within a range, are called continuous probability distributions.

Life would have been so easy if the types just stopped here. But we are not so lucky. Each of these distributions has been further divided into various types according to their uses. The best thing is that this learning is not going to go to waste, because these distributions apply to a lot of real-world problems and help us make predictions.

Before we move any further with the discussion, we need to understand two important terms: random variable and expected value. A variable is random if it takes on different values as a result of the outcomes of a random experiment. The expected value of a random variable is its probability-weighted average, which is the value that the arithmetic mean of a large number of independent realizations of that variable approaches.
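
Here is a rough sketch of the expected value idea in Python, assuming a fair six-sided die as the random variable (the die and the function name `simulated_mean` are my own choices for illustration). It compares the exact probability-weighted average with the mean of many simulated rolls.

```python
import random

faces = [1, 2, 3, 4, 5, 6]

# Exact expected value: sum of value * probability (each face has probability 1/6).
exact_mean = sum(face * (1 / 6) for face in faces)

def simulated_mean(n_rolls, seed=1):
    """Average of n_rolls independent rolls of a simulated fair die."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return sum(rng.choice(faces) for _ in range(n_rolls)) / n_rolls

print("exact    :", exact_mean)
print("simulated:", simulated_mean(100_000))
```

The more rolls we simulate, the closer the simulated mean tends to get to the exact value of 3.5.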

Discrete Probability Distribution:

a.) Binomial Distribution: It describes discrete, not continuous, data resulting from an experiment known as a Bernoulli process. A single Bernoulli trial takes the value 1 with probability p and the value 0 with probability q = 1-p; in other words, each trial is a yes-no question. The binomial distribution gives the probability of getting a given number of successes in n such trials.

Mean of a binomial distribution = np.
n = no. of trials; p = probability of success.
Standard Deviation of binomial distribution = sqrt(npq).
n = no. of trials; p = probability of success; q = probability of failure (1-p).

Conditions: i.) Each trial should have only two possible outcomes.
ii.) The probability of each outcome remains fixed from trial to trial.
iii.) The trials are statistically independent, i.e., the outcome of one trial does not affect the outcome of any other trial.
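
The formulas above can be tried out with a short Python sketch. The example values (10 flips of a fair coin) are my own assumption, and `binomial_pmf` is just a name I picked for the standard binomial probability formula.

```python
from math import comb, sqrt

def binomial_pmf(k, n, p):
    """P(exactly k successes in n independent trials with success probability p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 10, 0.5      # assumed example: 10 flips of a fair coin
q = 1 - p           # probability of failure

mean = n * p                # mean = np
std_dev = sqrt(n * p * q)   # standard deviation = sqrt(npq)

print("P(exactly 5 heads):", binomial_pmf(5, n, p))
print("mean:", mean, "standard deviation:", std_dev)
```

Summing `binomial_pmf(k, n, p)` over every k from 0 to n gives 1, which is a handy sanity check that all the outcomes have been covered.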

b.) Poisson Distribution: It gives the probability of a given number of events occurring in a fixed interval of time or space, when the events occur at a constant average rate and independently of the time since the last event. For example, the number of phone calls received by a call center in an hour, where each call does not depend on the previous one, can be modelled with a Poisson distribution.

Conditions: i.) It describes rare events.
ii.) Each occurrence is independent of any other occurrences.
iii.) The number of occurrences in each interval of time can be any whole number from 0 upwards, with no upper limit.
iv.) The average rate of occurrences must stay constant throughout the experiment; the actual count in each interval can still vary.
v.) Two events cannot occur at exactly the same instant.
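
As a rough illustration of the call center example, the sketch below uses the standard Poisson probability formula. The average of 4 calls per hour is a number I assumed for the example, and `poisson_pmf` is my own name for the helper.

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """P(exactly k events in an interval) when the average count per interval is lam."""
    return lam**k * exp(-lam) / factorial(k)

calls_per_hour = 4  # assumed average rate, for illustration only

# Probability of receiving 0, 1, 2, ... 8 calls in one hour.
for k in range(9):
    print(k, round(poisson_pmf(k, calls_per_hour), 4))
```

The probabilities peak around the average rate and then tail off, but they never become exactly zero, which matches the condition that the count has no upper limit.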

c.) Hyper-geometric Distribution: A hyper-geometric distribution applies when we draw n items without replacement from a finite population of size N that contains exactly K objects with a particular feature, wherein each draw is a success (the object has the feature) or a failure. It gives the probability of getting a given number of successes in those draws.

Conditions: i.) The draws are made without replacement, unlike the binomial distribution, which corresponds to drawing with replacement so that the success probability stays fixed.
ii.) Each draw should have only two possible outcomes.
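
A quick sketch of the hyper-geometric probability in Python, using a deck of cards as an assumed example (52 cards, 4 aces, 5 cards drawn). The function name `hypergeom_pmf` and the parameter n for the number of draws are my own labels.

```python
from math import comb

def hypergeom_pmf(k, N, K, n):
    """P(k successes in n draws without replacement from N items, K of which are successes)."""
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

# Assumed example: a 52-card deck with 4 aces, 5 cards drawn without replacement.
p_two_aces = hypergeom_pmf(2, N=52, K=4, n=5)
print("P(exactly 2 aces in 5 cards):", p_two_aces)
```

Because the cards are not put back, the chance of an ace changes after every draw, which is exactly why the binomial formula does not apply here.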

d.) Discrete Uniform Distribution: It is a symmetric probability distribution wherein a finite number of values are equally likely to be observed (every one of n values has equal probability 1/n).

Continuous Probability Distribution:

a.) Uniform Distribution: All values within the bounds are equally likely, so intervals of equal length have equal probability. The outcome lies between two bounds defined by the parameters a and b, which are the minimum and maximum values.

For example, when we go to a restaurant and order a meal, suppose the meal is served within 15-20 minutes. If the serving time is equally likely to fall anywhere in that 5-minute window, so that each one-minute stretch has the same probability (1/5), the serving time is a uniformly distributed random variable.
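
The restaurant example can be sketched in Python as below. The bounds a = 15 and b = 20 come from the example; the helper names `uniform_cdf` and `uniform_prob` are my own.

```python
def uniform_cdf(x, a, b):
    """P(X <= x) for a variable uniformly distributed between a and b."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)

def uniform_prob(lo, hi, a, b):
    """P(lo <= X <= hi), computed as a difference of two cdf values."""
    return uniform_cdf(hi, a, b) - uniform_cdf(lo, a, b)

a, b = 15, 20  # serving time in minutes, from the restaurant example

print("P(served between 15 and 17 min):", uniform_prob(15, 17, a, b))
print("P(served between 18 and 20 min):", uniform_prob(18, 20, a, b))
print("expected serving time:", (a + b) / 2)
```

Both two-minute windows get the same probability, because only the length of the window matters, not where it sits between the bounds.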

b.) Normal/Gaussian Distribution: It is a probability function that describes how the values of a variable are distributed. It is a symmetric distribution where most of the observations cluster around the central peak and the probabilities for values further away from the mean taper off equally in both directions.

Conditions: i.) It is a bell-shaped and symmetrical curve.
ii.) The mean, median, and mode are equal.
iii.) Location is characterized by mean and spread is characterized by the standard deviation.
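
To see the "clustering around the central peak" in numbers, here is a small sketch using `NormalDist` from Python's standard `statistics` module (available from Python 3.8). The standard normal (mean 0, standard deviation 1) is my assumed example, and `prob_within` is a helper name I made up.

```python
from statistics import NormalDist

standard = NormalDist(mu=0, sigma=1)  # assumed example: the standard normal

def prob_within(k, dist=standard):
    """Probability that a value falls within k standard deviations of the mean."""
    upper = dist.mean + k * dist.stdev
    lower = dist.mean - k * dist.stdev
    return dist.cdf(upper) - dist.cdf(lower)

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s):", round(prob_within(k), 4))
```

This is the familiar rule of thumb that roughly 68%, 95% and 99.7% of the values fall within 1, 2 and 3 standard deviations of the mean.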

c.) Exponential Distribution: It describes the time between events in a process in which events occur continuously and independently at a constant average rate. It is a special case of the Gamma distribution, which is used to model continuous variables that are always positive and have skewed distributions.

For example, the time required to complete a questionnaire is always positive and is skewed in one direction.
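
Here is a rough simulation sketch of exponential waiting times in Python. The rate of 2 events per minute is a value I assumed, and `exponential_cdf` is my own helper; `random.expovariate` is the standard library function for drawing exponential samples.

```python
import random
from math import exp

def exponential_cdf(t, rate):
    """P(waiting time <= t) when events occur at the given average rate."""
    return 1 - exp(-rate * t) if t > 0 else 0.0

rate = 2.0  # assumed: on average 2 events per minute

rng = random.Random(42)  # fixed seed so the run is repeatable
waits = [rng.expovariate(rate) for _ in range(100_000)]
sample_mean = sum(waits) / len(waits)

print("average simulated wait:", sample_mean)  # should be near 1 / rate
print("P(wait <= 1 minute):", exponential_cdf(1.0, rate))
```

All the simulated waits are positive, and most are short with a few long ones, which is the skew described above.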

d.) Log-Normal Distribution: It is a continuous probability distribution of a random variable whose logarithm is normally distributed. Thus, if the random variable X is log-normally distributed, then Y = ln(X) has a normal distribution. Equivalently, if Y has a normal distribution, then the exponential function of Y, X = exp(Y), has a log-normal distribution.

For example, it is used to analyse stock prices, since prices cannot fall below zero. The log-normal curve can help identify the compound return that a stock can be expected to achieve over a period of time.
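
The relationship X = exp(Y) can be checked with a small simulation. The parameters mu = 0 and sigma = 0.5 are values I assumed for illustration; they are not taken from any stock data.

```python
import random
from math import exp, log

mu, sigma = 0.0, 0.5  # assumed parameters of the underlying normal variable Y

rng = random.Random(7)  # fixed seed so the run is repeatable
y = [rng.gauss(mu, sigma) for _ in range(50_000)]  # Y is normally distributed
x = [exp(v) for v in y]                            # X = exp(Y) is log-normal

# Taking logs of X should bring us back to something centred on mu.
log_x_mean = sum(log(v) for v in x) / len(x)
x_mean = sum(x) / len(x)

print("smallest X:", min(x))            # always positive
print("mean of ln(X):", log_x_mean)     # close to mu
print("mean of X:", x_mean)             # close to exp(mu + sigma**2 / 2)
```

Notice that the mean of X is pulled above exp(mu) by the long right tail, which is part of why this distribution is skewed.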

There is another distribution that I would like to discuss here, one that is extensively used in real life: the power law distribution.

Power Law Distribution: It is mainly associated with the Pareto distribution. It describes a relationship between two quantities where a relative change in one quantity results in a proportional relative change in the other, irrespective of their initial size. Precisely, one quantity varies as a power of the other.

For example, the area of a square is the square of its side length, so as the side increases, the area increases as a power of 2: doubling the side quadruples the area, whatever the starting size. Surprisingly, the frequencies of the words in a language also follow a power-law distribution.
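
The "irrespective of initial size" part is easy to check with a few lines of Python. This is only a sketch of the power-law relationship y = a * x^k using the square example (k = 2); the function names and the constant a are my own.

```python
def power_law(x, a=1.0, k=2.0):
    """y = a * x**k; with a = 1 and k = 2 this is the area of a square of side x."""
    return a * x**k

def scaling_ratio(x, c, k=2.0):
    """How much y grows when x is multiplied by c."""
    return power_law(c * x, k=k) / power_law(x, k=k)

# Doubling the side of a small, medium or large square.
for side in (1.0, 3.0, 10.0):
    print("side", side, "-> area grows by a factor of", scaling_ratio(side, 2.0))
```

The factor comes out the same for every starting side, which is the defining feature of a power law.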

chirag-garg · 17 June 2022 06:43

The most common use is for a Bayesian prior distribution. These are usually distributions of parameters, but each set of parameters represents a specific probability distribution within a family. And there’s no reason a Bayesian prior can’t include distributions of different types.

But more generally, it’s just a probability distribution that assigns either probabilities or likelihoods to a set of other probability distributions.

Let f(x) be the pdf of a random variable X. Then the mean μ of the distribution is given by $E(X) = \mu = \int_{-\infty}^{\infty} x \, f(x) \, dx$, where E is the expected value operator.
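
As a rough numerical check of this integral, the sketch below approximates it with the midpoint rule. The two example pdfs (uniform between 15 and 20, and exponential with rate 2) and all function names are assumptions made for illustration. Since a computer cannot integrate to infinity, the limits are cut off where the pdf is zero or negligibly small.

```python
from math import exp

def expected_value(pdf, lo, hi, steps=100_000):
    """Approximate the integral of x * pdf(x) over [lo, hi] with the midpoint rule."""
    width = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * width  # midpoint of the i-th slice
        total += x * pdf(x) * width
    return total

def uniform_pdf(x, a=15.0, b=20.0):
    """Density of a uniform variable on [a, b]; zero outside the bounds."""
    return 1 / (b - a) if a <= x <= b else 0.0

def exponential_pdf(x, rate=2.0):
    """Density of an exponential variable with the given rate; zero for x < 0."""
    return rate * exp(-rate * x) if x >= 0 else 0.0

print("uniform mean    :", expected_value(uniform_pdf, 15.0, 20.0))
print("exponential mean:", expected_value(exponential_pdf, 0.0, 50.0))
```

The results should agree with the known means of these distributions, (a + b) / 2 for the uniform and 1 / rate for the exponential.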