Correlation and Regression - Coggle Diagram
Correlation and Regression
Correlation = a statistical technique used to measure and describe the relationship between two variables
Scatter plot
adding a constant to (or subtracting a constant from) each X and/or Y value does not change the pattern of data points and does not change the correlation
multiplying (or dividing) each X or each Y value by a positive constant does not change the pattern and does not change the value of the correlation
Multiplying by a negative constant produces a mirror image of the pattern and changes the sign of the correlation
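The three transformation rules above can be checked numerically. The sketch below is only an illustration: the scores are made up, and `pearson_r` is a hypothetical helper that computes the Pearson correlation from the sum of products of deviations.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation: r = SP / sqrt(SS_X * SS_Y)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / sqrt(ss_x * ss_y)

# Made-up illustrative scores, not data from the source.
x = [1, 2, 4, 5, 8]
y = [3, 1, 6, 4, 9]

r_original = pearson_r(x, y)
r_shifted = pearson_r([v + 10 for v in x], y)  # add a constant to each X
r_scaled = pearson_r(x, [v * 3 for v in y])    # multiply each Y by a positive constant
r_flipped = pearson_r([v * -2 for v in x], y)  # multiply each X by a negative constant

print(r_original, r_shifted, r_scaled, r_flipped)
```

Under these assumptions the first three values agree up to floating-point rounding, and the last has the same magnitude with the opposite sign.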
A correlation is a numerical value that describes and measures three characteristics of the relationship between X and Y
1) The Direction of the Relationship
The sign of the correlation, positive or negative, describes the direction of the relationship.
In a positive correlation, the two variables tend to change in the same direction
In a negative correlation, the two variables tend to go in opposite directions
2) The Form of the Relationship
3) The Strength or Consistency of the Relationship
The Pearson Correlation
measures the degree and the direction of the linear relationship between two variables.
Requires the sum of products of deviations (SP) to calculate
Pearson correlation can be expressed entirely in terms of z-scores
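As a rough sketch of both routes to the Pearson correlation, the code below computes r once from the sum of products of deviations, r = SP / sqrt(SS_X * SS_Y), and once from z-scores, r = sum(z_X * z_Y) / (n - 1). The z-score version assumes sample standard deviations (dividing by n - 1); with population standard deviations the divisor would be n instead. The function names and scores are hypothetical.

```python
from math import sqrt

def r_from_sp(xs, ys):
    """Pearson r from the sum of products of deviations (SP)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / sqrt(ss_x * ss_y)

def z_scores(values):
    """z-scores using the sample standard deviation (n - 1 in the denominator)."""
    n = len(values)
    mean = sum(values) / n
    sd = sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]

def r_from_z(xs, ys):
    """Pearson r as the sum of z-score products divided by n - 1."""
    zx, zy = z_scores(xs), z_scores(ys)
    return sum(a * b for a, b in zip(zx, zy)) / (len(xs) - 1)

# Made-up illustrative scores, not data from the source.
x = [1, 2, 4, 5, 8]
y = [3, 1, 6, 4, 9]

r_sp = r_from_sp(x, y)
r_z = r_from_z(x, y)
print(r_sp, r_z)
```

The two results should match up to floating-point rounding, since the standard deviations in the z-scores cancel against the n - 1 divisor.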
Where Correlations are used
Prediction
If two variables are known to be related in some systematic way, it is possible to use one of the variables to make accurate predictions about the other
Validity
One common technique for demonstrating validity is to use a correlation
Reliability
One way to evaluate reliability is to use correlations to determine the relationship between two sets of measurements. When reliability is high, the correlation between two measurements should be strong and positive.
Theory Verification
the prediction of the theory could be tested by determining the correlation between the two variables
Interpreting Correlations
1) Correlation simply describes a relationship between two variables. It does not explain why the two variables are related.
Correlation does not establish causation
2) The value of a correlation can be affected greatly by the range of scores represented in the data
For a correlation to provide an accurate description for the general population, there should be a wide range of X and Y values in the data
3) One or two extreme data points can have a dramatic effect on the value of a correlation.
If you only “go by the numbers,” you might overlook the fact that one extreme data point inflated the size of the correlation.
4) A correlation should not be interpreted as a proportion.
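To make the point about extreme data points concrete, here is a small sketch with made-up scores: five points with almost no linear trend, then the same points plus one hypothetical extreme point. `pearson_r` is an illustrative helper, not something defined in the source.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation: r = SP / sqrt(SS_X * SS_Y)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / sqrt(ss_x * ss_y)

# Made-up illustrative scores with very little linear trend.
x = [1, 2, 3, 4, 5]
y = [3, 1, 4, 2, 3]
r_without = pearson_r(x, y)

# The same scores plus one hypothetical extreme point at (15, 14).
r_with = pearson_r(x + [15], y + [14])

print(round(r_without, 2), round(r_with, 2))  # prints 0.14 0.95
```

A scatter plot of the six points would show the single point far from the cluster, which is why inspecting the plot matters alongside the number.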
Hypothesis Tests with the Pearson Correlation
When you obtain a nonzero correlation for a sample, the purpose of the hypothesis test is to decide between the following two interpretations
There is no correlation in the population, and the sample value is the result of sampling error.
A sample correlation near zero supports the conclusion that the population correlation is also zero.
The nonzero sample correlation accurately represents a real, nonzero correlation in the population.
A sample correlation that is substantially different from zero supports the conclusion that there is a real, nonzero correlation in the population.
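The map does not say how the decision between the two interpretations is made. One standard approach, assumed here, converts the sample correlation into a t statistic, t = r * sqrt((n - 2) / (1 - r^2)), with df = n - 2, and compares it with a critical value from a t table. The function name and the sample values below are hypothetical.

```python
from math import sqrt

def t_for_correlation(r, n):
    """t statistic and degrees of freedom for testing H0: population correlation = 0."""
    df = n - 2
    t = r * sqrt(df / (1 - r ** 2))
    return t, df

# Hypothetical sample values chosen for illustration only.
t, df = t_for_correlation(r=0.6, n=27)
print(round(t, 2), df)  # prints 3.75 25
```

If the absolute value of t exceeds the critical value for the chosen alpha level and df, the sample supports a real, nonzero correlation in the population; otherwise the result is consistent with sampling error.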
The Spearman Correlation
When the Pearson correlation formula is used with data from an ordinal scale (ranks), the result is called the Spearman correlation
is used to measure the relationship between X and Y when both variables are measured on ordinal scales
the Spearman correlation can be used to measure the degree to which a relationship is consistently one directional, independent of its form.
Ranking Tied Scores
1) List the scores in order from smallest to largest. Include tied values in the list.
2) Assign a rank (first, second, and so on) to each position in the ordered list
3) When two (or more) scores are tied, compute the mean of their ranked positions, and assign this mean value as the final rank for each score.
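The ranking procedure, followed by the Pearson formula applied to the ranks, can be sketched as below. This is a minimal illustration; `rank_with_ties`, `pearson_r`, and `spearman_r` are hypothetical helper names and the scores are made up.

```python
from math import sqrt

def rank_with_ties(scores):
    """Rank from smallest to largest; tied scores share the mean of their positions."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])  # ordered list of indices
    ranks = [0.0] * len(scores)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next ordered score ties with the current one.
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[start]]:
            end += 1
        # Positions are 1-based; every score in the run gets the mean position.
        mean_rank = ((start + 1) + (end + 1)) / 2
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks

def pearson_r(xs, ys):
    """Pearson correlation: r = SP / sqrt(SS_X * SS_Y)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / sqrt(ss_x * ss_y)

def spearman_r(xs, ys):
    """Spearman correlation: the Pearson formula applied to the ranks."""
    return pearson_r(rank_with_ties(xs), rank_with_ties(ys))

# Made-up scores with ties.
tied_ranks = rank_with_ties([3, 3, 5, 6, 6, 6, 12])
print(tied_ranks)  # prints [1.5, 1.5, 3.0, 5.0, 5.0, 5.0, 7.0]

# A consistently increasing but non-linear relationship (made-up data).
rs = spearman_r([1, 2, 3, 4, 5], [1, 4, 9, 16, 100])
print(rs)  # prints 1.0
```

The second example shows the Spearman correlation reaching 1.0 for a relationship that is consistently one directional even though it is not a straight line.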
Point-Biserial Correlation
The point-biserial correlation is used to measure the relationship between two variables in situations in which one variable consists of regular, numerical scores, but the second variable has only two values.
To compute the point-biserial correlation, the dichotomous variable is first converted to numerical values by assigning a value of zero (0) to one category and a value of one (1) to the other category
Then the regular Pearson correlation formula is used with the converted data
dichotomous variable
A variable with only two values
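A minimal sketch of the point-biserial procedure: code the dichotomous variable as 0 and 1, then apply the regular Pearson formula. The group labels, scores, and function names are hypothetical.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation: r = SP / sqrt(SS_X * SS_Y)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / sqrt(ss_x * ss_y)

def point_biserial(scores, groups, category_coded_one):
    """Code the dichotomous variable as 0/1, then apply the Pearson formula."""
    coded = [1 if g == category_coded_one else 0 for g in groups]
    return pearson_r(scores, coded)

# Made-up illustrative data; the group labels are hypothetical.
scores = [1, 2, 3, 7, 8, 9]
groups = ["control", "control", "control", "treatment", "treatment", "treatment"]

r_pb = point_biserial(scores, groups, category_coded_one="treatment")
r_pb_swapped = point_biserial(scores, groups, category_coded_one="control")
print(r_pb, r_pb_swapped)
```

Which category receives the 1 is arbitrary, so swapping the coding changes only the sign of the result, not its magnitude.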
The Phi-Coefficient
When both variables (X and Y) measured for each individual are dichotomous, the correlation between the two variables is called the phi-coefficient
1) Convert each of the dichotomous variables to numerical values by assigning a 0 to one category and a 1 to the other category for each of the variables.
2) Use the regular Pearson formula with the converted scores
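The same idea with two dichotomous variables can be sketched as follows. The category labels, data, and function names are made up for illustration.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation: r = SP / sqrt(SS_X * SS_Y)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / sqrt(ss_x * ss_y)

def phi_coefficient(xs, ys, x_one, y_one):
    """Code both dichotomous variables as 0/1, then apply the Pearson formula."""
    coded_x = [1 if v == x_one else 0 for v in xs]
    coded_y = [1 if v == y_one else 0 for v in ys]
    return pearson_r(coded_x, coded_y)

# Made-up illustrative data; the variables and labels are hypothetical.
handedness = ["left", "left", "left", "right", "right", "right", "right", "left"]
answer = ["no", "no", "yes", "yes", "yes", "no", "yes", "no"]

phi = phi_coefficient(handedness, answer, x_one="right", y_one="yes")
print(phi)  # prints 0.5
```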
Linear Equation
In the general linear equation Y = bX + a, a and b are fixed constants
the value of b is called the slope
The slope determines how much the Y variable changes when X is increased by one point
The value of a in the general equation is called the Y-intercept because it determines the value of Y when X = 0
Regression
The statistical technique for finding the best-fitting straight line for a set of data
the resulting straight line is called the regression line
The goal for regression is to find the best-fitting straight line for a set of data
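The map does not define "best-fitting". The sketch below assumes the usual least-squares criterion, under which the line Y = bX + a has slope b = SP / SS_X and Y-intercept a = M_Y - b * M_X, where M_X and M_Y are the means. The function names and scores are hypothetical.

```python
def fit_regression_line(xs, ys):
    """Least-squares line Y = bX + a; returns (a, b)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    b = sp / ss_x              # slope: change in Y when X increases by one point
    a = mean_y - b * mean_x    # Y-intercept: value of Y when X = 0
    return a, b

def predict(x, a, b):
    """Predicted Y for a given X on the regression line."""
    return b * x + a

# Made-up illustrative scores, not data from the source.
x = [1, 2, 4, 5, 8]
y = [3, 1, 6, 4, 9]
a, b = fit_regression_line(x, y)
print(a, b, predict(6, a, b))

# Points that lie exactly on Y = 2X + 1 recover that line.
a_exact, b_exact = fit_regression_line([0, 1, 2], [1, 3, 5])
print(a_exact, b_exact)  # prints 1.0 2.0
```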