Spearman's rank correlation coefficient
Author: Dr. Hannah Volk-Jesussek
What is Spearman rank correlation?
The Spearman rank correlation examines the relationship between two variables and is the non-parametric counterpart of Pearson's correlation. Therefore, a normal distribution of the data is not required.
There is an important difference between the two correlation coefficients. Spearman correlation uses ranks rather than the original values, hence the name rank correlation.
Example of Spearman correlation
We measured the reaction time of 8 computer gamers and asked them about their age.
If we use a Pearson correlation, we simply take the two variables, reaction time and age, and calculate the Pearson correlation coefficient. However, we now want to calculate the Spearman rank correlation, so we first assign a rank to each person for reaction time and age.
The reaction time is already sorted by value. 12 is the smallest value, so it gets rank 1, 15 is the second smallest, so it gets rank 2 and so on. We do the same for age.
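The ranking step can be sketched in a few lines of Python. The values below are illustrative assumptions, not the article's dataset (only the two smallest reaction times, 12 and 15, come from the text), and the helper `rank_values` is a hypothetical name that assumes there are no tied values.

```python
def rank_values(values):
    """Return 1-based ranks (smallest value gets rank 1). Assumes no ties."""
    # Indices of the values, ordered from smallest to largest value.
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        ranks[index] = rank
    return ranks


# Illustrative (hypothetical) data for 8 gamers.
reaction_time = [12, 15, 17, 18, 20, 21, 22, 26]
age = [16, 15, 24, 22, 33, 30, 45, 41]

rank_reaction_time = rank_values(reaction_time)
rank_age = rank_values(age)

print(rank_reaction_time)  # [1, 2, 3, 4, 5, 6, 7, 8]
print(rank_age)            # [2, 1, 4, 3, 6, 5, 8, 7]
```

With tied values, the usual convention is to give each tied case the average of the ranks it would occupy; this sketch does not handle that case.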
Let's take a look at this in a scatter plot. On the left side we see the raw data for age and reaction time, and on the right side the ranks.
We studied 8 people, and because there are no rank ties, we have 8 distinct ranks to assign. With this transformation we now have a more even distribution of the data.
To calculate the Spearman correlation, we simply calculate the Pearson correlation of the ranks. So the Spearman correlation is the same as the Pearson correlation, except that the ranks are used instead of the original values.
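As a minimal sketch of this idea, the following Python code computes the Pearson correlation of the ranks. The data are illustrative assumptions rather than the article's dataset, and `rank_values`, `pearson` and `spearman` are hypothetical helper names; ties are not handled.

```python
import math


def rank_values(values):
    """Return 1-based ranks (smallest value gets rank 1). Assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        ranks[index] = rank
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation applied to the ranks."""
    return pearson(rank_values(x), rank_values(y))


# Illustrative (hypothetical) data for 8 gamers.
reaction_time = [12, 15, 17, 18, 20, 21, 22, 26]
age = [16, 15, 24, 22, 33, 30, 45, 41]

r_s = spearman(reaction_time, age)
print(round(r_s, 3))  # 0.905
```

Calling `pearson(reaction_time, age)` on the raw values instead would generally give a different number, because the raw values carry distance information that the ranks discard.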
Let's look at this in numiqo. On one hand, we have reaction time and age; on the other, we have the newly created ranks for reaction time and age.
Now we can either calculate the Spearman rank correlation from the reaction time and age, or we can calculate the Pearson correlation from the ranks. In both cases we get a correlation of 0.9.
Spearman rank correlation and Kendall's tau
Kendall's tau is very similar to the Spearman correlation. However, Kendall's tau should be preferred to Spearman's correlation when only a small dataset with many ties is available.
Spearman Correlation Calculation
If there are no rank ties, the following equation can also be used to calculate the Spearman correlation:

$$ r_s = 1 - \frac{6 \sum_{i=1}^{n} d_i^2}{n \, (n^2 - 1)} $$

where n is the number of cases and $d_i$ is the difference between the ranks of the two variables for case i.
For our example, the sum of $d_i^2$ is 8 and n, which is the number of people, is also 8. Substituting these values, we get a correlation coefficient of about 0.9:

$$ r_s = 1 - \frac{6 \cdot 8}{8 \cdot (8^2 - 1)} = 1 - \frac{48}{504} \approx 0.9 $$
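The arithmetic can be checked with a short Python sketch. The rank lists are illustrative assumptions, chosen so that the sum of squared rank differences is 8 with n = 8 as in the example; they are not taken from the article's data.

```python
# Illustrative (hypothetical) ranks for 8 people, no ties.
rank_reaction_time = [1, 2, 3, 4, 5, 6, 7, 8]
rank_age = [2, 1, 4, 3, 6, 5, 8, 7]

n = len(rank_reaction_time)

# Squared difference between the two ranks of each person.
d_squared = [(a - b) ** 2 for a, b in zip(rank_reaction_time, rank_age)]
sum_d_squared = sum(d_squared)

# Shortcut formula, valid only when there are no rank ties.
r_s = 1 - 6 * sum_d_squared / (n * (n ** 2 - 1))

print(sum_d_squared)   # 8
print(round(r_s, 3))   # 0.905
```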
Strength and direction of the Spearman coefficient
Like Pearson's correlation coefficient r, the Spearman correlation coefficient rs also varies between -1 and 1.
Using the coefficient, we can determine two things:
- The strength of the correlation and
- the direction of the correlation.
The strength of the correlation can be read from a table.
| Absolute value of rs | Strength of correlation |
|---|---|
| 0.0 to < 0.1 | no correlation |
| 0.1 to < 0.3 | low correlation |
| 0.3 to < 0.5 | medium correlation |
| 0.5 to < 0.7 | high correlation |
| 0.7 to 1 | very high correlation |
If we have a coefficient between -1 and 0, there is a negative correlation, that is, a negative relationship between the variables. If we have a coefficient between 0 and 1, there is a positive correlation, that is, a positive relationship between the two variables. If the result is 0, there is no correlation.
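The table and the sign rule can be combined into a small lookup, sketched here in Python. The function name `describe_correlation` is a hypothetical choice, and the thresholds are simply the ones from the table above applied to the absolute value of the coefficient.

```python
def describe_correlation(r_s):
    """Return (strength, direction) for a coefficient between -1 and 1."""
    if not -1 <= r_s <= 1:
        raise ValueError("a correlation coefficient must lie between -1 and 1")

    # Strength depends only on the absolute value of the coefficient.
    magnitude = abs(r_s)
    if magnitude < 0.1:
        strength = "no correlation"
    elif magnitude < 0.3:
        strength = "low correlation"
    elif magnitude < 0.5:
        strength = "medium correlation"
    elif magnitude < 0.7:
        strength = "high correlation"
    else:
        strength = "very high correlation"

    # Direction depends only on the sign.
    if r_s > 0:
        direction = "positive"
    elif r_s < 0:
        direction = "negative"
    else:
        direction = "none"
    return strength, direction


print(describe_correlation(0.9))    # ('very high correlation', 'positive')
print(describe_correlation(-0.35))  # ('medium correlation', 'negative')
```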
Testing the significance of correlation coefficients
Often our aim is to test a hypothesis about the population from a sample.
We have calculated the correlation coefficient for the sample data. We can now test whether the correlation coefficient is significantly different from 0.
The null hypothesis and the alternative hypothesis are as follows:
- Null hypothesis: The correlation coefficient rs = 0 (there is no correlation).
- Alternative hypothesis: The correlation coefficient rs ≠ 0 (there is a correlation).
Whether the correlation coefficient is significantly different from zero, based on the sample collected, can be tested using a t-test. The test statistic is

$$ t = \frac{r_s \cdot \sqrt{n - 2}}{\sqrt{1 - r_s^2}} $$

Here, rs is the correlation coefficient and n is the sample size. A p-value can then be calculated from the test statistic t, using a t-distribution with n − 2 degrees of freedom. If the p-value is less than the specified significance level (usually 5%), then the null hypothesis is rejected, otherwise it is not.
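The test can be sketched in Python using only the standard library. The values rs ≈ 0.905 and n = 8 are taken from the worked example; the function names are hypothetical, and the p-value is obtained by numerically integrating the density of the t-distribution with n − 2 degrees of freedom, which is one possible approach rather than the method any particular statistics package uses.

```python
import math


def t_statistic(r_s, n):
    """Test statistic for H0: r_s = 0. Undefined for r_s = 1 or r_s = -1."""
    return r_s * math.sqrt(n - 2) / math.sqrt(1 - r_s ** 2)


def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p_value(t, df, steps=2000):
    """Two-sided p-value via Simpson's rule on [0, |t|]; steps must be even."""
    upper = abs(t)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # probability mass between 0 and |t|
    return max(0.0, 1 - 2 * area)


r_s = 1 - 48 / 504  # about 0.905, from the worked example
n = 8

t = t_statistic(r_s, n)
p = two_sided_p_value(t, df=n - 2)

print(round(t, 2))  # 5.2
print(round(p, 3))  # 0.002
```

For small samples, some software reports an exact permutation-based p-value instead of this t approximation, so results can differ slightly between tools.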
Calculation with numiqo
If we use numiqo to calculate the example, we get a p-value of 0.002.
Therefore, the p-value is less than 0.05 and we can reject the null hypothesis that the correlation coefficient is zero in the population.