# Spearman Rank Correlation Coefficient

Spearman's rank correlation coefficient is a method for calculating the strength of the relationship between two variables.

When looking at correlations, a correlation coefficient is calculated to determine how close the data points are to the trend line.

The coefficient always lies between -1 and +1. Its size, between 0 and 1, shows the strength of the relationship: 0 indicates there is no correlation and 1 indicates there is a perfect relationship between the two variables.

The sign of the coefficient shows the direction of the relationship: positive if one variable increases as the other increases, negative if one variable decreases as the other increases.
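To illustrate the two extremes, here is a minimal Python sketch that applies the formula given in step 6 of the worked example below to two made-up sets of ranks. The function name `spearman_rs` and the five-item rankings are assumptions chosen purely for illustration, and the sketch assumes there are no tied ranks.

```python
def spearman_rs(rank_x, rank_y):
    """Spearman's r_s from two lists of ranks (assumes no tied ranks)."""
    n = len(rank_x)
    sum_d_squared = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - (6 * sum_d_squared) / (n * (n ** 2 - 1))

ranks = [1, 2, 3, 4, 5]                          # illustrative ranks only
same_order = spearman_rs(ranks, ranks)           # both variables ranked identically
reverse_order = spearman_rs(ranks, ranks[::-1])  # rankings exactly reversed

print(same_order)     # 1.0  (perfect positive correlation)
print(reverse_order)  # -1.0 (perfect negative correlation)
```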

**Requirements:**

- Ordinal or interval data
- Scatter diagram indicates a possible continuously increasing or decreasing relationship
- At least 5 pairs of measurements (preferably 10 or more)

**Example: Moisture content of soil across the saltmarsh**

1. Write a null hypothesis (note: this is not the same as your hypothesis in your investigation)

**There is no positive correlation between the distance from high water and the water content of the soil.**

Data:

| Distance (m) | 0 | 4 | 8 | 12 | 16 | 20 | 24 | 28 | 32 | 36 |
|---|---|---|---|---|---|---|---|---|---|---|
| % Water Content | 31.5 | 28.2 | 34.9 | 57.6 | 60.1 | 49.4 | 62.2 | 60.4 | 68.5 | 54.7 |

2. Give a rank to each value for each variable

| A | Distance (m) | 0 | 4 | 8 | 12 | 16 | 20 | 24 | 28 | 32 | 36 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| B | Rank for distance | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
| C | % Water Content | 31.5 | 28.2 | 34.9 | 57.6 | 60.1 | 49.4 | 62.2 | 60.4 | 68.5 | 54.7 |
| D | Rank for water content | 2 | 1 | 3 | 6 | 7 | 4 | 9 | 8 | 10 | 5 |
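The ranking in rows B and D can be reproduced with a short Python sketch. The helper name `rank_values` is an assumption for illustration. It gives rank 1 to the smallest value and assumes no two values are equal (tied values would need averaged ranks, which this sketch does not handle).

```python
def rank_values(values):
    """Rank each value from smallest (rank 1) to largest; assumes no ties."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]

distance = [0, 4, 8, 12, 16, 20, 24, 28, 32, 36]                            # row A
water_content = [31.5, 28.2, 34.9, 57.6, 60.1, 49.4, 62.2, 60.4, 68.5, 54.7]  # row C

rank_distance = rank_values(distance)    # row B
rank_water = rank_values(water_content)  # row D

print(rank_distance)  # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(rank_water)     # [2, 1, 3, 6, 7, 4, 9, 8, 10, 5]
```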

3. Work out the difference, d, between each rank for each pair of values (Row B – Row D)

| Difference in ranks (d) | -1 | 1 | 0 | -2 | -2 | 2 | -2 | 0 | -1 | 5 |
|---|---|---|---|---|---|---|---|---|---|---|

4. Square the differences

| d² | 1 | 1 | 0 | 4 | 4 | 4 | 4 | 0 | 1 | 25 |
|---|---|---|---|---|---|---|---|---|---|---|

5. Add up the squared differences

∑d² = 1+1+0+4+4+4+4+0+1+25 = 44
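Steps 3 to 5 can be checked with a few lines of Python. This is only a sketch using the ranks from rows B and D; the variable names are assumptions for illustration. Because both sets of ranks use the numbers 1 to 10 exactly once, the differences should add up to zero, which is a quick way to catch a ranking mistake.

```python
rank_distance = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]  # row B
rank_water = [2, 1, 3, 6, 7, 4, 9, 8, 10, 5]     # row D

# Step 3: difference between each pair of ranks (row B minus row D)
d = [b - w for b, w in zip(rank_distance, rank_water)]

# Step 4: square each difference
d_squared = [x ** 2 for x in d]

# Step 5: add up the squared differences
sum_d_squared = sum(d_squared)

print(d)              # [-1, 1, 0, -2, -2, 2, -2, 0, -1, 5]
print(sum(d))         # 0 (sanity check on the ranking)
print(d_squared)      # [1, 1, 0, 4, 4, 4, 4, 0, 1, 25]
print(sum_d_squared)  # 44
```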

6. Calculate the Correlation Coefficient, r_{s}, using the formula below:

$$r_s = 1 - \frac{6 \sum d^2}{n(n^2 - 1)}$$

Where n is the sample size, i.e. the number of pairs of measurements. In this example, n = 10.

Therefore:

$$r_s = 1 - \frac{6 \times 44}{10(100 - 1)} = 1 - \frac{264}{990} = +0.733$$
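The same arithmetic can be checked in Python. This is a minimal sketch of the formula in step 6; the function name `spearman_from_sum` is an assumption for illustration, and the inputs are the ∑d² and n from this example.

```python
def spearman_from_sum(sum_d_squared, n):
    """Spearman's r_s from the sum of squared rank differences and sample size n."""
    return 1 - (6 * sum_d_squared) / (n * (n ** 2 - 1))

r_s = spearman_from_sum(sum_d_squared=44, n=10)
print(round(r_s, 3))  # 0.733
```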

This value indicates a strong positive correlation…

…but how close to +1 does it have to be for the correlation not to be attributed to natural variability and for the null hypothesis to be rejected?

Compare the calculated value for r_{s} with a critical value for that particular number of samples (n) in a statistical table at the 5% significance level, see below.

You use a one-tailed test if you know which direction the correlation will be, i.e. positive or negative. If you do not know this information, use a two-tailed test.

If the calculated value is greater than or equal to the critical value in the table, the null hypothesis is rejected (a correlation this strong would be expected less than 5% of the time by chance alone if the null hypothesis were true).

Example: At the 5% level, using a one-tailed test (refer back to the null hypothesis to see direction), the **calculated value of +0.733** is greater than the **critical value of +0.648**, so the null hypothesis is rejected.

Therefore, there is a statistically significant positive correlation between the distance from high water and the % water content in the soil.
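The decision rule can be written as a simple comparison. The sketch below assumes a one-tailed test for a positive correlation and uses the critical value quoted in this example; the function name `reject_null_hypothesis` is an assumption for illustration, and for any other sample size or significance level the critical value must be looked up in the statistical table.

```python
def reject_null_hypothesis(calculated_rs, critical_value):
    """One-tailed test for a positive correlation: reject the null
    hypothesis when the calculated r_s is at least the critical value."""
    return calculated_rs >= critical_value

calculated_rs = 0.733   # from step 6
critical_value = 0.648  # critical value quoted in this example (n = 10, 5% level)

reject_null = reject_null_hypothesis(calculated_rs, critical_value)
print(reject_null)  # True, so the positive correlation is statistically significant
```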

Don’t forget to relate this back to your results – think about zonation in this example! And make sure when writing up statistical results in your project you quote the calculated value, the number of samples, whether you used a one- or two-tailed test, the critical value and the significance level.