Nayland College - Mathematics

# Key Concepts: correlation coefficient 'r'

1. Correlation Coefficient 'r' has no units

2. It is designed to measure only linear relationships (it is NOT appropriate for curved relationships/models)

3. Scaling data (e.g. changing the units of measurement) has no effect on 'r'

4. The order of the data does not affect 'r'

5. The order of the variables has no effect on 'r'

6. Both variables must be quantitative

7. Correlation coefficient is NOT resistant to outliers (see outliers)
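Several of these properties can be checked numerically. The following is a minimal sketch only: the height and weight values are hypothetical illustration data (not from this page), and `pearson_r` is a helper written for this example rather than a library function.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient computed from deviations about the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data: heights in cm and weights in kg.
heights_cm = [150, 160, 165, 172, 180, 191]
weights_kg = [52, 58, 63, 70, 77, 90]
r = pearson_r(heights_cm, weights_kg)

# Point 3: changing units (cm to m) rescales the data but not 'r'.
heights_m = [h / 100 for h in heights_cm]
r_scaled = pearson_r(heights_m, weights_kg)

# Point 4: listing the same (x, y) pairs in a different order.
pairs = list(zip(heights_cm, weights_kg))
random.Random(1).shuffle(pairs)
shuffled_x, shuffled_y = zip(*pairs)
r_shuffled = pearson_r(shuffled_x, shuffled_y)

# Point 5: swapping which variable is called x and which is called y.
r_swapped = pearson_r(weights_kg, heights_cm)

# Point 7: a single unusual point (short but very heavy) changes 'r' a lot.
r_outlier = pearson_r(heights_cm + [155], weights_kg + [120])

print(f"original      r = {r:.4f}")
print(f"rescaled      r = {r_scaled:.4f}")
print(f"reordered     r = {r_shuffled:.4f}")
print(f"swapped       r = {r_swapped:.4f}")
print(f"with outlier  r = {r_outlier:.4f}")
```

The first four values should agree (up to rounding), while the value with the outlier added should be much lower.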

### Always plot the data and decide VISUALLY, before rushing into a linear model and 'r' calculation!

No linear relationship, but there is a relationship!

Reasonable linear relationship, but there is a better non-linear relationship!
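Both situations can be reproduced with small made-up data sets. This is a sketch under stated assumptions: the quadratic and doubling data are illustrative only (they are not the data behind the plots on this page), and `pearson_r` is a helper defined here for the example.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient computed from deviations about the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Case 1: a perfect curved (quadratic) relationship with no linear trend.
xs_sym = [-3, -2, -1, 0, 1, 2, 3]
ys_sym = [x ** 2 for x in xs_sym]
r_quadratic = pearson_r(xs_sym, ys_sym)

# Case 2: doubling growth. A straight line fits reasonably well,
# but the relationship is exactly linear once y is replaced by log(y).
xs_growth = [0, 1, 2, 3, 4, 5, 6]
ys_growth = [2 ** x for x in xs_growth]
r_linear_fit = pearson_r(xs_growth, ys_growth)
r_after_log = pearson_r(xs_growth, [math.log(y) for y in ys_growth])

print(f"quadratic data:          r = {r_quadratic:.4f}")
print(f"growth data (raw y):     r = {r_linear_fit:.4f}")
print(f"growth data (log of y):  r = {r_after_log:.4f}")
```

In the first case 'r' is zero even though y is completely determined by x. In the second, 'r' on the raw data looks respectable, yet a plot would show the curve immediately.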

Pearson product-moment correlation coefficient

Which is 'obviously' the same as....

This can be used to calculate 'r'
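For reference, the standard textbook forms of the Pearson coefficient are shown below. The second form is assumed to be the one intended for hand calculation here, since it uses exactly the column totals (Σx, Σy, Σxy, Σx², Σy²) asked for in the table that follows; n is the number of data pairs.

```latex
r = \frac{\sum (x - \bar{x})(y - \bar{y})}
         {\sqrt{\sum (x - \bar{x})^2 \, \sum (y - \bar{y})^2}}
  = \frac{n \sum xy - \sum x \sum y}
         {\sqrt{\left( n \sum x^2 - \left(\sum x\right)^2 \right)
                \left( n \sum y^2 - \left(\sum y\right)^2 \right)}}
```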

e.g. The old percentage assessment system

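| Student | Stats x | Calc y | xy | x² | y² |
|---|---|---|---|---|---|
| Bill | 72% | 65% | | | |
| Ted | 58% | 52% | | | |
| B Jelly | 85% | 90% | | | |
| D Boot | 12% | 8% | | | |
| D Mouse | 34% | 41% | | | |
| Jim | 25% | 28% | | | |
| Σ | | | | | |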
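The blank columns and the Σ row can be filled in by hand or checked with a short script. The sketch below assumes the percentages are treated as plain numbers (72% as 72) and uses the standard computational formula for 'r' built from the column totals; the variable names are chosen for this example only.

```python
import math

# Percentages from the table, written as plain numbers.
students = ["Bill", "Ted", "B Jelly", "D Boot", "D Mouse", "Jim"]
stats_x = [72, 58, 85, 12, 34, 25]
calc_y = [65, 52, 90, 8, 41, 28]

n = len(stats_x)

# Column totals (the Σ row of the table).
sum_x = sum(stats_x)
sum_y = sum(calc_y)
sum_xy = sum(x * y for x, y in zip(stats_x, calc_y))
sum_x2 = sum(x ** 2 for x in stats_x)
sum_y2 = sum(y ** 2 for y in calc_y)

# Computational form of the Pearson correlation coefficient.
numerator = n * sum_xy - sum_x * sum_y
denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
r = numerator / denominator

# Print the completed table rows, then the totals and 'r'.
for name, x, y in zip(students, stats_x, calc_y):
    print(f"{name:8} {x:4} {y:4} {x * y:6} {x ** 2:6} {y ** 2:6}")
print(f"{'Σ':8} {sum_x:4} {sum_y:4} {sum_xy:6} {sum_x2:6} {sum_y2:6}")
print(f"r = {r:.4f}")
```

For these six students 'r' comes out at roughly 0.98, a very strong positive linear relationship between the Stats and Calc marks.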

Or it is much easier to use the EXCEL CORREL function

Activity: Construct a scatter plot of the data

Form an 'aim for an investigation' relating to the data

Use the 'correl' Excel function to find the correlation coefficient for the scatter plot (learn how to use the correlation function)

Correl Function Spreadsheet to check out (sigma Ex13.02 #3)

Describe the relationship between the variables including the 'r' value

# Adding a trend line in iNZight

Make a scatterplot in iNZight

(Achieve) Add a linear trend line

To add the line of fit:

'Add to plot' 'Trend Curves' 'OK' 'Linear'

To add the equation of the line of fit:

(Merit) Add non-linear trend lines (find out more)

Comparing groups
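To see what a linear trend line and its equation represent, here is a minimal sketch that assumes the line is the ordinary least-squares line of fit. It reuses the Stats and Calc percentages from the student table earlier on this page; `predict` is a hypothetical helper for this example and is not part of iNZight.

```python
# Stats (x) and Calc (y) percentages from the student table, as plain numbers.
stats_x = [72, 58, 85, 12, 34, 25]
calc_y = [65, 52, 90, 8, 41, 28]

n = len(stats_x)
mean_x = sum(stats_x) / n
mean_y = sum(calc_y) / n

# Least-squares slope and intercept.
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(stats_x, calc_y))
sxx = sum((x - mean_x) ** 2 for x in stats_x)
slope = sxy / sxx
intercept = mean_y - slope * mean_x


def predict(x):
    """Predicted Calc mark for a given Stats mark, using the fitted line."""
    return intercept + slope * x


print(f"Calc = {intercept:.2f} + {slope:.2f} * Stats")
print(f"Predicted Calc mark for a Stats mark of 50: {predict(50):.1f}")
```

The least-squares line always passes through the point of means, which is a quick check on any trend-line equation.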

# McDonalds Example: Scatterplots

The scatterplot of energy content versus fat content indicates that the higher the fat content (explanatory variable), the greater the energy content of a product (dependent variable).

The correlation coefficient of 0.9456 indicates a very strong positive correlation between fat content (g) and energy content (kJ).

There is a reasonably even scatter of the data, with one possible outlier: the 'chocolate sundae' (4.5 g fat, 1200 kJ energy).

The scatterplot of energy content versus carbohydrate content indicates that the higher the carbohydrate content (explanatory variable), the greater the energy content of a product (dependent variable).

The correlation coefficient of 0.6175 indicates a moderate positive correlation between carbohydrate content (g) and energy content (kJ).

There is an uneven scatter of data: values above 40 g of carbohydrate are scattered further from the positive trend than those below.

There is a stronger relationship between fat content and energy content than between carbohydrate content and energy content.

This suggests that fat content is a better predictor of energy content than carbohydrate content (as expected, because fat is a more concentrated form of energy).

Sigma practice