Correlation in Data Science – An introduction

Correlation is a relationship or connection between two or more variables. Its essence is that when the value of one variable changes, the value of the other tends to change as well (increasing or decreasing).

When calculating correlations, we try to determine whether there is a statistically significant relationship between two or more variables in one or more data samples. Examples include the relationship between height and weight, between performance and IQ test results, or between experience and performance.
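As a rough illustration of what "statistically significant" can mean in practice, the sketch below computes a linear correlation coefficient for a small height/weight sample and estimates a p-value with a permutation test. Several things here are assumptions made for the example rather than anything the text prescribes: the choice of the Pearson coefficient as the test statistic, the permutation test itself, the helper names `pearson_r` and `permutation_p_value`, and the data, which is made up and not real measurements.

```python
import math
import random


def pearson_r(xs, ys):
    """Linear (Pearson) correlation coefficient of two equal-length samples."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the means (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def permutation_p_value(xs, ys, n_permutations=10_000, seed=0):
    """Two-sided p-value: share of random shuffles of ys whose |r| is at
    least as large as the observed |r|."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # break any real pairing between xs and ys
        if abs(pearson_r(xs, shuffled)) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_permutations + 1)


# Made-up illustrative data (NOT real measurements): height in cm, weight in kg.
heights = [150, 155, 160, 165, 170, 175, 180, 185]
weights = [52, 55, 61, 60, 68, 72, 75, 83]

r = pearson_r(heights, weights)
p = permutation_p_value(heights, weights)
print(f"r = {r:.3f}, p = {p:.4f}")
```

A small p-value here only says that a coefficient this large would be unlikely if heights and weights were paired at random. It says nothing about cause and effect, and with a sample this small the estimate should be read with caution.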

If you read a phrase in a newspaper like “it turned out that there is such-and-such a correlation between these events”, then in the vast majority of cases, unless otherwise stated, it refers to the Pearson correlation coefficient. This is the default measure of correlation.
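For reference, the sample Pearson correlation coefficient of two variables is commonly written as below. The notation is introduced here for illustration and is not taken from the text: x_i and y_i are the paired observations, n is the number of pairs, and the barred symbols are the sample means.

```latex
r_{xy} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
              {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,
               \sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
```

The value always lies between -1 and 1: values near 1 indicate that the variables tend to increase together, values near -1 that one tends to decrease as the other increases, and values near 0 that there is little linear relationship.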