Correlation and Correlation Analysis
Transcript of Correlation and Correlation Analysis
Group Members: Maurice Dormer

What is Correlation?
Correlation is a statistical measurement of the relationship between two variables. It can be summarized by what is known as a correlation coefficient, which ranges from -1 to +1.

What is a scatterplot?
A scatterplot is a visual representation of the relationship or association between two variables. Each observation is plotted as a point (dot) positioned by its value on the horizontal (x) axis and the vertical (y) axis. Scatterplots assist in interpreting the correlation coefficient.
Correlation coefficient (r) by direction:
Positive Correlation: 0 < r ≤ +1 (A and B move in the same direction)
Perfect Positive Correlation: r = 1
No Correlation: r = 0
Negative Correlation: -1 ≤ r < 0 (A and B move in opposite directions)
Perfect Negative Correlation: r = -1

Scatterplot
Positive Correlation: Positive correlation is indicated by an upward trend. A perfect positive correlation is given the value 1, where the points form a perfect straight line.
Negative Correlation: In a negative correlation, small values on the x-axis correspond to larger values on the y-axis. A perfect negative correlation is therefore given the value -1. The closer the value is to 1 or -1, the stronger the relationship between the variables.
Zero Correlation: Zero correlation suggests there isn’t sufficient evidence that a linear relationship exists. The closer the number is to zero, the weaker the correlation.

Pearson’s r, Spearman’s rho and Kendall’s tau
Pearson’s r is a parametric method that is used when variables are interval and the relationship is linear.
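As a rough illustration of how Pearson’s r falls between -1 and +1, the sketch below computes it directly from its definition in plain Python. The function names and the small data lists are made up for illustration; they are not taken from the session or from SPSS.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r: co-variation of x and y scaled by the spread of each."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    sxy = sum(a * b for a, b in zip(dx, dy))   # sum of cross-products
    sxx = sum(a * a for a in dx)               # sum of squares for x
    syy = sum(b * b for b in dy)               # sum of squares for y
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)

def direction(r):
    """Label the sign of r as positive, negative or none."""
    if r > 0:
        return "positive"
    if r < 0:
        return "negative"
    return "none"

# Hypothetical data: points on a rising line and on a falling line.
x = [1, 2, 3, 4, 5]
rising = [2, 4, 6, 8, 10]
falling = [10, 8, 6, 4, 2]

r_up = pearson_r(x, rising)      # perfect positive correlation
r_down = pearson_r(x, falling)   # perfect negative correlation
print(r_up, direction(r_up))
print(r_down, direction(r_down))
```

Points that lie exactly on a straight line give r = 1 or r = -1; scattered points give values closer to zero.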
However, when variables are at the ordinal level, two measures are employed: Spearman’s rho (ρ) and Kendall’s tau (τ). Spearman’s rho and Kendall’s tau differ from Pearson’s r in that they are both non-parametric methods. The coefficients computed by all of these methods lie between -1 and +1.
Select the appropriate test, then select OK.
However, for X to cause Y, they must at least be correlated.

Objectives
By the end of this session, students should be able to:
Clearly define correlation and differentiate between correlation, causation and regression.
Understand and create scatter plots using SPSS.
Understand parametric and non-parametric measures of the correlation coefficient
Calculate the correlation coefficient using the SPSS software
Test hypotheses of variable independence using the SPSS software
Understand and apply the three types of correlational analysis
Test hypotheses using the different methods of correlational analysis
Understand and create contingency tables using SPSS
Understand relationship between variables in the contingency table
Use Chi-Square to test for a relationship between variables in the contingency tables

Parametric Statistics: Pearson’s r
Non-parametric Statistics: Kendall’s tau and Spearman’s rho

Characteristics of Correlation
•An indicator of linearity in the relationship between the two employed variables
•Used to complement regression
•Does not always presume causality
Correlation coefficient scale: -1, 0, +1

Correlation vs Regression
Correlation is the association or relationship between two quantitative variables.
Causation: if a prior change in a variable X effects a change in the variable Y, then, ceteris paribus, X causes Y.
The sufficient condition that is implicitly a part of causality is ceteris paribus, which means “other relevant factors being equal”, and it is the key distinction from correlation.

Correlation vs Causation
If changes in the value of one variable are associated with changes in the value of another, the variables are said to be correlated.

Select Analyze -> Correlate, then select the two variables to use.
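For readers without SPSS, the two non-parametric coefficients named in this session can be sketched in plain Python. This is a minimal illustration, not what SPSS does internally. Spearman’s rho is computed here as Pearson’s r on ranks. The Kendall function is the simple tau-a form with no tie correction, whereas SPSS reports tau-b, which matches tau-a only when there are no ties. All function names and data are hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r from sums of squares and cross-products."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def average_ranks(values):
    """Rank values 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_rho(xs, ys):
    """Spearman's rho: Pearson's r applied to the ranks of the data."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

def kendall_tau_a(xs, ys):
    """Kendall's tau-a: (concordant - discordant) pairs over all pairs."""
    n = len(xs)
    concordant = discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            product = (xs[i] - xs[j]) * (ys[i] - ys[j])
            if product > 0:
                concordant += 1
            elif product < 0:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)

# Hypothetical data: y_curved rises with x, but not along a straight line.
x = [1, 2, 3, 4, 5]
y_curved = [1, 4, 9, 16, 25]
y_mixed = [2, 1, 4, 3, 5]

print(pearson_r(x, y_curved), spearman_rho(x, y_curved), kendall_tau_a(x, y_curved))
print(spearman_rho(x, y_mixed), kendall_tau_a(x, y_mixed))
```

With the curved data the rank-based coefficients reach +1 while Pearson’s r stays below 1. This reflects the point that Pearson’s r assumes a linear relationship, while the other two depend only on order.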
•The chi-square test is used along with contingency tables to determine the probability of a relationship between two variables
•Chi-square analysis entails a comparison of the actual frequencies with those that would have resulted by chance (i.e. difference between observed and expected frequencies)
•A null hypothesis (Ho) of no relationship between the two variables is stated prior to the chi-square test. Acceptance or rejection of this hypothesis is determined by comparing the Asymp. Sig. value with the chosen conventional significance level.
•Chi-square is not a strong statistic because it does not convey information about the strength of the relationship; a measure such as Cramer’s V is used for that.
•Variables usually evaluated are either nominal or ordinal variables.

Cross Tabulation (Cross Classification / Contingency Tables)
This is a table showing the frequencies for the combination of two or more variables, indicating the absence or presence of a relationship.
Within the table, possible combinations are compared by the frequencies of their occurrence in order to summarize and reduce the data, making it more readable and analyzable (similar to frequency tables).
Variables are either categorical variables or numerical variables that have been arranged into groups (i.e. nominal or ordinal).
The cross tabulation between two variables is often referred to as a contingency table. When each variable has only two categories, there are only four possible combinations of the two variables (a 2 X 2 table).
Column percentages and frequencies are generally evaluated to discern whether there is a relationship between the two variables being evaluated.

Cross Tabulation and Chi-square
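The comparison of observed and expected frequencies can be sketched in plain Python as follows. This is an illustrative sketch only: the function names are hypothetical, the counts are invented and are not the data behind the SPSS output discussed in this session, and the p-value helper is valid only for 1 degree of freedom (a 2 X 2 table).

```python
import math
from collections import Counter

def crosstab(row_values, col_values):
    """Build a contingency table of counts from two categorical variables."""
    row_levels = sorted(set(row_values))
    col_levels = sorted(set(col_values))
    counts = Counter(zip(row_values, col_values))
    table = [[counts[(r, c)] for c in col_levels] for r in row_levels]
    return row_levels, col_levels, table

def chi_square(table):
    """Return (chi-square statistic, degrees of freedom, Cramer's V)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Frequency expected by chance if the variables were unrelated.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    k = min(len(table), len(table[0])) - 1
    cramers_v = math.sqrt(stat / (n * k))  # strength of association, 0 to 1
    return stat, dof, cramers_v

def p_value_one_dof(stat):
    """Upper-tail chi-square probability, valid only when dof == 1."""
    return math.erfc(math.sqrt(stat / 2))

# Invented example: 100 respondents, smoker (no/yes) by trouble sleeping (no/yes).
smoker = ["no"] * 50 + ["yes"] * 50
trouble = ["no"] * 30 + ["yes"] * 20 + ["no"] * 20 + ["yes"] * 30

rows, cols, table = crosstab(smoker, trouble)
stat, dof, v = chi_square(table)
p = p_value_one_dof(stat)
print(table, stat, dof, v, p)
```

The p-value plays the role of the Asymp. Sig. value in the SPSS output: the null hypothesis of no relationship is rejected when it falls below the chosen significance level. Cramer’s V then describes how strong the association is.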
Ho: Smoking does not have a negative effect on an individual’s ability to fall asleep.
H1: Smoking has a negative effect on an individual’s ability to fall asleep.
The test uses the 10% significance level.
Conclusion: the Asymp. Sig. value is .060, which is less than the 10% significance level (.10). Therefore we reject the null hypothesis and conclude that smoking does have a negative effect on sleeping.

Creating a scatterplot in SPSS
Step 1: Select Graphs -> Legacy Dialogs -> Scatter/Dot on the top menu.
Step 2: Select Simple Scatter and click Define.
Step 3: Select variables from the list on the left and assign them to the Y Axis and X Axis (double click, or click the arrow beside each axis), then select OK.
Step 4: In the output window, double click on the displayed scatterplot to open the Chart Editor. Select Elements in the top menu, then select Create Fit Line at Total to create a line of best fit.

Types of Correlation Analysis
Partial and semi-partial correlation improve upon the zero-order correlation by identifying a variable’s unique contribution to explaining another variable.
Zero Order Correlation
A measure of the strength of the straight-line or linear relationship between two variables. Zero-order correlation values range between -1 and +1. Zero-order correlation is the same as the correlation coefficient.

Partial Correlation
The partial correlation is a measure of the unique relationship between two variables, obtained by removing the effect of one or more other variables on both of them. In other words, it is the correlation between two variables when the effects of one or more related variables are removed.

Semi-Partial Correlation
Semi-partial correlation measures a variable Y’s unique share of the total variation in another variable Z by removing the effect that one or more variables have on Y. It has a more intuitive interpretation in the context of regression analysis.

Testing Partial Correlation
Research question:
Does an improvement in physical fitness improve the general health of an individual irrespective of their current weight?
H0: There is no relationship (between the two variables)
H1: There exists a statistically significant relationship (between said two variables)
Analyze -> Correlate -> Partial -> Select Variables -> Select the variable to control for -> OK
With the significance reported as .000 (that is, p < .001), we have sufficient evidence to reject the null hypothesis.
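To make the idea of "controlling for" a variable concrete, the sketch below computes a first-order partial correlation from the three zero-order correlations, using the standard formula r_xy.z = (r_xy - r_xz r_yz) / sqrt((1 - r_xz^2)(1 - r_yz^2)). The function names and the fitness, health and weight scores are hypothetical. They are not the data set used in the SPSS example above.

```python
import math

def pearson_r(xs, ys):
    """Zero-order (ordinary) Pearson correlation."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def partial_r_from_coefficients(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y, controlling for z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

def partial_r(xs, ys, zs):
    """Partial correlation of xs and ys with zs removed from both."""
    return partial_r_from_coefficients(
        pearson_r(xs, ys), pearson_r(xs, zs), pearson_r(ys, zs)
    )

# Hypothetical scores for eight individuals.
fitness = [3, 5, 4, 7, 8, 6, 9, 5]
health = [4, 6, 5, 7, 9, 6, 9, 4]
weight = [90, 80, 85, 70, 72, 78, 65, 88]

bivariate = pearson_r(fitness, health)            # zero-order correlation
controlled = partial_r(fitness, health, weight)   # controlling for weight
print(bivariate, controlled)
```

Comparing the two printed values shows how much of the fitness and health relationship remains once weight is held constant.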
For the purpose of comparison, Pearson’s r was also run to show the difference in correlation values between Bivariate and Partial Correlation.

There are two methods of measuring Semi-Partial (Part) correlation with SPSS:
•Linear regression method
If we are interested in the effect of the covariate on the independent variable then this method is most suitable.
Analyze -> Regression -> Linear -> Select Y and X Variables -> Statistics -> Part and Partial Correlation -> Continue -> OK
•Residualizing Independent variables
If we are interested in the effect of the covariate on the dependent variable then this method is most suitable.
Analyze -> Regression -> Linear -> Select Y and one (1) X variable -> Save Unstandardized Residuals -> Continue -> OK
Adjusting the new variable
Variable View -> Res_1 -> Clear/delete the Label “Unstandardized Residual”
Find the Correlation Coefficient
Analyze -> Correlate -> Bivariate -> Select Y and Res_1 variable -> Select the appropriate Correlation Coefficient -> OK
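The residualizing approach can be sketched in plain Python. This sketch assumes that the independent variable x is regressed on the covariate z, that the unstandardized residuals are saved (the role Res_1 plays in SPSS), and that y is then correlated with those residuals. Which variables go into each SPSS dialog is an assumption here, and the function names and data are hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Zero-order Pearson correlation."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def residuals(target, predictor):
    """Unstandardized residuals from a simple linear regression."""
    mean_t = sum(target) / len(target)
    mean_p = sum(predictor) / len(predictor)
    slope = sum(
        (p - mean_p) * (t - mean_t) for p, t in zip(predictor, target)
    ) / sum((p - mean_p) ** 2 for p in predictor)
    intercept = mean_t - slope * mean_p
    return [t - (intercept + slope * p) for p, t in zip(predictor, target)]

def semipartial_r(ys, xs, zs):
    """Correlate y with the part of x that z does not explain."""
    return pearson_r(ys, residuals(xs, zs))

# Hypothetical data: y is the dependent variable, x the independent
# variable, z the covariate removed from x only.
y = [1, 3, 2, 5, 4, 6]
x = [1, 2, 3, 4, 5, 6]
z = [2, 1, 4, 3, 6, 5]

part = semipartial_r(y, x, z)
print(part)
```

Unlike the partial correlation, the covariate is removed from only one of the two variables, so the result is expressed relative to the total variation in y.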
APA Style
Select File -> Apply Chart Template -> Select APA_Styles.sgt