    Maths and Stats Support

    Correlation and regression


    Correlation and regression are used to investigate the relationship between two continuous variables. Multiple regression extends this to testing several continuous or binary independent (explanatory) variables at the same time.

    Pearson's correlation

    Use: Summarising and testing the strength of a relationship between two continuous variables 

    Dependent (Outcome): Continuous;
    Independent (predictor): Continuous

    Example: Is there a relationship between % attendance and % grade?

    Summary statistics/graphs: Scatterplot and Pearson's correlation coefficient
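
    As a rough illustration of what the coefficient measures, the sketch below computes Pearson's r by hand in Python. The function name pearson_r and the attendance and grade values are made up for this example; it returns only the coefficient and does not carry out the significance test that statistical software would report.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson's correlation coefficient for two equal-length numeric lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squared deviations from the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Made-up illustration data: % attendance and % grade for six students
attendance = [50, 60, 70, 80, 90, 100]
grade = [45, 50, 62, 60, 75, 80]

r = pearson_r(attendance, grade)
print(f"Pearson's r = {r:.2f}")
```

    Values of r close to 1 or -1 indicate a strong linear relationship, and values close to 0 indicate a weak one.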

    Spearman's correlation: If the variables are ordinal or very skewed, Spearman's correlation is more appropriate. For example, if grade were measured from Fail to 1st, or attendance from poor to excellent, these would be ordinal variables.
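
    Spearman's correlation can be thought of as Pearson's correlation applied to the ranks of the data. The Python sketch below shows this under assumed numeric codings (attendance poor = 1 up to excellent = 4, grade Fail = 1 up to 1st = 5); the function names and the data are made up for illustration.

```python
from math import sqrt

def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of tied values starting at position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson_r(x, y):
    """Pearson's correlation coefficient for two equal-length numeric lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def spearman_rho(x, y):
    """Spearman's rho: Pearson's r computed on the ranks of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))

# Made-up ordinal data using the assumed codings described above
attendance = [1, 2, 2, 3, 4, 4]
grade = [1, 2, 3, 3, 4, 5]

rho = spearman_rho(attendance, grade)
print(f"Spearman's rho = {rho:.2f}")
```

    Because only the ordering of the categories is used, the result does not depend on the particular numbers chosen for the codes, provided they keep the same order.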

     

    Linear regression

    Simple linear regression tests for a relationship between two continuous variables and produces a line of best fit, which can be used to predict the dependent variable from values of the independent variable.
    Multiple linear regression tests the relationships for several continuous and binary independent variables simultaneously, controlling for the other variables when assessing the significance of each.

    Dependent (outcome) variable: Continuous

    Example (simple): Is attendance a predictor of grade? Can attendance predict grade?
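
    To make the line of best fit concrete, here is a minimal Python sketch of least-squares simple linear regression. The function name fit_line and the data are made up for illustration; the sketch estimates only the intercept and slope and leaves significance tests and assumption checks to statistical software.

```python
def fit_line(x, y):
    """Least-squares intercept and slope for the line y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Made-up illustration data: % attendance and % grade for six students
attendance = [50, 60, 70, 80, 90, 100]
grade = [45, 50, 62, 60, 75, 80]

intercept, slope = fit_line(attendance, grade)

# Use the fitted line to predict the grade of a student with 85% attendance
predicted = intercept + slope * 85
print(f"grade = {intercept:.2f} + {slope:.2f} * attendance")
print(f"Predicted grade at 85% attendance: {predicted:.1f}")
```

    The slope is the predicted change in grade for each additional percentage point of attendance.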

    Example (multiple): Which of the following factors affect grade, and how: attendance, gender, interest in learning, self-efficacy, asking for help, having maths A level?
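
    The sketch below shows, in plain Python, how multiple regression coefficients can be estimated with one continuous predictor (attendance) and one binary predictor (maths A level, coded 0/1). It solves the normal equations directly; the function names and data are made up for illustration. It returns coefficients only, so the p-values needed to judge the significance of each predictor would still come from statistical software.

```python
def solve(a, b):
    """Solve the linear system a * coef = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix [a | b]
    for col in range(n):
        # Move the row with the largest entry in this column into the pivot position
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back-substitution on the upper-triangular system
    coef = [0.0] * n
    for i in range(n - 1, -1, -1):
        rest = sum(m[i][c] * coef[c] for c in range(i + 1, n))
        coef[i] = (m[i][n] - rest) / m[i][i]
    return coef

def fit_multiple(rows, y):
    """Least-squares coefficients [intercept, b1, b2, ...] via the normal equations."""
    design = [[1.0] + list(r) for r in rows]  # leading 1 for the intercept
    p = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * t for d, t in zip(design, y)) for i in range(p)]
    return solve(xtx, xty)

# Made-up data: (% attendance, has maths A level: 1 = yes, 0 = no) and % grade
rows = [(50, 0), (60, 1), (70, 0), (80, 1), (90, 0), (100, 1)]
grade = [45, 55, 58, 66, 70, 83]

b0, b_attendance, b_alevel = fit_multiple(rows, grade)
print(f"grade = {b0:.2f} + {b_attendance:.2f} * attendance + {b_alevel:.2f} * a_level")
```

    The coefficient for the binary variable is the estimated difference in grade between the two groups at the same level of attendance, which is what controlling for the other variables means here.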

    Summary statistics/graphs: Correlation coefficients and scatterplots for continuous independent variables; means with a boxplot or mean bar chart for binary independent variables

    Ordinal variables: If your dependent variable is the mean of a set of related ordinal questions and the assumptions have been met, parametric techniques such as regression can be used. If you have a single ordinal question, opinion is divided on whether regression is suitable. Ordinal regression can be used, but it is a complex technique, so if your variable has only a few categories, consider reducing them to two and using logistic regression. If you have many categories and they can be considered equally spaced, use linear regression.
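
    As a small illustration of the first case, the Python sketch below turns a set of related ordinal questions into a single scale score per person by taking the mean. The responses and the 1 to 5 scoring are made up for this example.

```python
# Made-up responses: each row holds one student's answers to four related
# questions, each scored from 1 (strongly disagree) to 5 (strongly agree)
responses = [
    [4, 5, 4, 3],
    [2, 1, 2, 2],
    [3, 4, 4, 5],
]

# The scale score for each student is the mean of their item scores
scale_scores = [sum(items) / len(items) for items in responses]
print(scale_scores)
```

    The resulting scores can take many more values than a single question, which is why they are often treated as continuous once the assumptions have been checked.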

     

    Logistic regression

    Use: Tests which of multiple independent variables are significant predictors of a binary outcome such as survival and produces a model (regression equation) to predict the likelihood of the event happening.

    Dependent (Outcome): Binary (2 categories);
    Independent (predictor): Any number of continuous or binary variables.

    Example: Car insurance companies use logistic regression to identify the factors which increase the likelihood of someone crashing. Car insurance premiums are then based on the predicted probability of you having a crash.
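
    The regression equation from a logistic model gives the log odds of the event, which is then converted to a probability. The Python sketch below shows that conversion only; the coefficients and predictor names (a previous claim coded 0/1, and years of driving) are hypothetical and are not estimated from any real data.

```python
from math import exp

def predicted_probability(intercept, coefficients, values):
    """Probability of the event from a logistic regression equation."""
    # Linear predictor: the log odds of the event
    log_odds = intercept + sum(b * x for b, x in zip(coefficients, values))
    # The logistic function converts log odds to a probability between 0 and 1
    return 1 / (1 + exp(-log_odds))

# Hypothetical coefficients for illustration only
intercept = -3.0
coefficients = [0.8, -0.05]  # [previous_claim (0/1), years_of_driving]

# Driver with a previous claim and 10 years of driving experience
p = predicted_probability(intercept, coefficients, [1, 10])
print(f"Predicted probability of a crash: {p:.3f}")

# exp(coefficient) is the odds ratio for a one-unit increase in that predictor
print(f"Odds ratio for a previous claim: {exp(coefficients[0]):.2f}")
```

    A positive coefficient raises the predicted probability of the event and a negative one lowers it.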

    Summary statistics/graphs: Percentages for binary outcomes and means/standard deviations for continuous independent variables

    Note: If your dependent variable has more than two categories, consider combining them into two and using logistic regression rather than more complex techniques such as multinomial or ordinal regression.
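
    Combining categories amounts to recoding the outcome as 0 or 1 before fitting the model. A minimal Python sketch follows, with made-up category names and a hypothetical helper to_binary.

```python
def to_binary(category, event_categories):
    """Return 1 if the category belongs to the event group, otherwise 0."""
    return 1 if category in event_categories else 0

# Made-up outcome with four categories
outcomes = ["Fail", "Pass", "Merit", "Distinction", "Fail", "Merit"]

# Combine Pass, Merit and Distinction into a single "passed" group
passed = [to_binary(o, {"Pass", "Merit", "Distinction"}) for o in outcomes]
print(passed)
```

    Which categories to combine is a judgement about the research question, and it should be made before looking at the results.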

     

    Resources by software

    The following resources show you how to carry out analyses in SPSS and understand the output, including checking assumptions.

    The Jamovi videos cover everything from carrying out the analysis to reporting, including suitable summary statistics. Videos are currently available only to SHU students.

    The resources guide you through the R code and the interpretation of the relevant summary statistics and tests. The program code files contain all the code, which can be easily adapted to run on your own data.

    These resources contain the SAS code, output and interpretation.

    These resources show the calculations for the specified techniques.
