Spearman Rank Correlation Coefficient

A method for calculating the strength of the relationship between two variables.

A correlation coefficient measures how closely the data points follow a trend. Spearman's coefficient does this using the ranks of the values rather than the values themselves.

The size of the coefficient is always between 0 and 1: 0 indicates no correlation and 1 indicates a perfect relationship between the two variables.

The coefficient also carries a positive or negative sign, showing whether the relationship between the two variables is positive or negative, so its full range is –1 to +1.

Requirements:

  • Ordinal or interval data
  • Scatter diagram indicates a possible continuously increasing or decreasing relationship
  • At least 5 pairs of measurements (preferably 10 or more)

 

Example: Moisture content of soil across the saltmarsh

1. Write a null hypothesis (note: this is not the same as your hypothesis in your investigation)

There is no positive correlation between the distance from high water and the water content of the soil.

Data:

Distance (m)         0     4     8    12    16    20    24    28    32    36
% Water Content   31.5  28.2  34.9  57.6  60.1  49.4  62.2  60.4  68.5  54.7

2. Give a rank to each value for each variable

A Distance (m)                0     4     8    12    16    20    24    28    32    36
B Rank for distance           1     2     3     4     5     6     7     8     9    10
C % Water Content          31.5  28.2  34.9  57.6  60.1  49.4  62.2  60.4  68.5  54.7
D Rank for water content      2     1     3     6     7     4     9     8    10     5
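
If you would rather check the ranking by computer, the step can be sketched in Python. This is only an illustration and not part of the original worksheet: the function name `rank_values` is made up for this example, and it gives tied values the average of the ranks they would share, which is a common convention (there are no ties in this data set).

```python
def rank_values(values):
    """Rank values from smallest (rank 1) to largest; ties share the average rank."""
    # Positions of the values, ordered from smallest value to largest.
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        # Find the run of equal values that starts at pos.
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        # Sorted positions pos..end correspond to ranks pos+1..end+1.
        average_rank = (pos + 1 + end + 1) / 2
        for k in range(pos, end + 1):
            ranks[order[k]] = average_rank
        pos = end + 1
    return ranks


distance = [0, 4, 8, 12, 16, 20, 24, 28, 32, 36]
water_content = [31.5, 28.2, 34.9, 57.6, 60.1, 49.4, 62.2, 60.4, 68.5, 54.7]

distance_ranks = rank_values(distance)
water_ranks = rank_values(water_content)
print(distance_ranks)
print(water_ranks)
```

The printed ranks should match rows B and D above (shown as decimals, e.g. 2.0 rather than 2).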

3. Work out the difference, d, between each rank for each pair of values (Row B – Row D)

Difference in Ranks (d) -1 1 0 -2 -2 2 -2 0 -1 5

4. Square the differences

d² 1 1 0 4 4 4 4 0 1 25

5. Add up the squared differences

∑d² = 1+1+0+4+4+4+4+0+1+25 = 44
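
Steps 3 to 5 can be checked with a few lines of Python. This is a minimal sketch using the ranks from rows B and D; the variable names are chosen for this example only.

```python
# Ranks copied from rows B and D of the table in step 2.
distance_ranks = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
water_ranks = [2, 1, 3, 6, 7, 4, 9, 8, 10, 5]

# Step 3: difference in ranks for each pair (Row B - Row D).
d = [b - w for b, w in zip(distance_ranks, water_ranks)]

# Step 4: square each difference.
d_squared = [diff ** 2 for diff in d]

# Step 5: add up the squared differences.
sum_d_squared = sum(d_squared)

print(d)
print(d_squared)
print(sum_d_squared)  # 44
```

A quick check on your arithmetic: the differences in step 3 should always add up to zero before they are squared.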

6. Calculate the Correlation Coefficient, rs, using the formula below:
rs = 1 – (6 ∑d²) / (n(n² – 1))

Where n is the size of sample. In this example, n = 10.

Therefore, rs = 1 – (6 × 44) / (10 × (100 – 1)) = 1 – 264/990 = +0.733
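
The same calculation can be written as a short Python function. This is a sketch only: the function name `spearman_rs` is an assumption for this example, and it uses the simplified formula from step 6, which strictly applies when there are no tied ranks.

```python
def spearman_rs(sum_d_squared, n):
    """Spearman's rank correlation from the sum of squared rank differences.

    Uses rs = 1 - 6 * sum(d^2) / (n * (n^2 - 1)), the formula from step 6.
    """
    if n < 2:
        raise ValueError("need at least 2 pairs of measurements")
    return 1 - (6 * sum_d_squared) / (n * (n ** 2 - 1))


rs = spearman_rs(44, 10)
print(round(rs, 3))  # 0.733
```

If every pair had the same rank in both rows, the sum of d² would be 0 and the function would return +1, a perfect positive correlation.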

This value indicates a strong positive correlation…
…but how close to +1 does it have to be for the correlation not to be attributed to natural variability and for the null hypothesis to be rejected?

Compare the calculated value for rs with a critical value for that particular number of samples (n) in a statistical table at the 5% significance level, see below.

You use a one-tailed test if you know which direction the correlation will be, i.e. positive or negative. If you do not know this information, use a two-tailed test.

If the calculated value is greater than or equal to the critical value in the table, the null hypothesis is rejected at the 5% significance level: if the null hypothesis were true, a correlation this strong would be expected by chance in no more than 5% of samples.

Example: At the 5% level, using a one-tailed test (refer back to the null hypothesis to see direction), the calculated value of +0.733 is greater than the critical value of +0.648, so the null hypothesis is rejected.

Therefore, there is a statistically significant positive correlation between the distance from high water and the % water content in the soil.
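
The comparison with the critical value can be sketched in Python as well. The function name `rejects_null` and its `direction` argument are made up for this illustration. The code does not work out the critical value for you: it must still be read from a statistical table for your number of pairs, significance level and one- or two-tailed test. The value 0.648 below is the one quoted in the example above.

```python
def rejects_null(rs, critical_value, direction="positive"):
    """Return True if rs is at least as extreme as the critical value.

    direction: "positive" or "negative" for a one-tailed test,
               "either" for a two-tailed test.
    """
    if direction == "positive":
        return rs >= critical_value
    if direction == "negative":
        return rs <= -critical_value
    if direction == "either":
        return abs(rs) >= critical_value
    raise ValueError("direction must be 'positive', 'negative' or 'either'")


rs = 0.733
critical_value = 0.648  # quoted in the example above for n = 10 at the 5% level
print(rejects_null(rs, critical_value, direction="positive"))  # True
```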

Don’t forget to relate this back to your results – think about zonation in this example! And make sure when writing up statistical results in your project you quote the calculated value, the number of samples, whether you used a one- or two-tailed test, the critical value and the significance level.

© Medina Valley Centre 2017. Charity Reg No: 236153.