TechTarget defines correlation as "a statistical measure that indicates the extent to which two or more variables fluctuate together". There are two major components to correlation: its strength and its type.
Let’s look at some types:
Now that we know what the types of correlation are, let's look at the strengths:
So, this is all great, but it’s all a bit subjective. I might think a correlation is moderate, while someone else believes it’s strong. We can quantify this with some statistics!
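We use the correlation coefficient to do this, denoted r. It quantifies how strongly X and Y are linearly related and runs from -1 to +1: the sign gives the type of correlation (negative or positive), and the closer the value is to either end, the stronger the relationship. A value near 0 means little or no linear relationship.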
When we look at the score, we can use the below table to determine how strong the relationship is:
Now, here comes the fun part, the formula – it looks much worse than it is, let’s break it down.
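For reference, here is a sketch of the formula, assuming it is the standard computational form of Pearson's correlation coefficient (the original figure is not reproduced here, so treat the notation as an assumption). Here n is the number of (X, Y) pairs and each sum runs over all rows of the dataset:

```latex
r = \frac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}
         {\sqrt{\left[\,n\sum x^{2} - \left(\sum x\right)^{2}\right]
                \left[\,n\sum y^{2} - \left(\sum y\right)^{2}\right]}}
```

Written this way, the only per-row quantities the formula needs beyond x and y themselves are xy, x² and y².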
X and Y form my dataset. We need to add 3 columns:
Once we have calculated each of these, we can sum them (as shown above). I have denoted which piece of the formula each refers to, just below the summed number.
As below, we can then map those numbers into the formula.
We can then start to work out the result:
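If you would rather let code do the arithmetic, the same steps can be sketched in Python. This is a minimal illustration, assuming the standard computational form of Pearson's r; the function name `pearson_r` and the sample data are made up for the example and are not the dataset used in this post.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r via the computational (sum-based) formula.

    r = (n*Sxy - Sx*Sy) / sqrt((n*Sx2 - Sx^2) * (n*Sy2 - Sy^2))
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("xs and ys must have the same length (at least 2)")

    n = len(xs)
    # Column sums: X, Y, and the three extra columns XY, X^2, Y^2.
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)

    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    if denominator == 0:
        # Happens when X or Y is constant: r is undefined.
        raise ValueError("r is undefined when X or Y has zero variance")
    return numerator / denominator


# Hypothetical sample data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
print(round(pearson_r(xs, ys), 2))  # 0.77
```

For this made-up data the sums are ΣX = 15, ΣY = 20, ΣXY = 66, ΣX² = 55 and ΣY² = 86, giving 30 / √1500 ≈ 0.77.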
So, in this case, we have a very strong positive correlation of 0.94.
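Note: this correlation coefficient (Pearson's r) only captures linear relationships, and significance tests on it assume the data are approximately normally distributed. Remember, too, that correlation does not imply causation.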