Unit 2 Exploring Two Variable Data
Relationships between two quantitative variables
Explanatory Variable
: Variable that helps explain or predict the value of the other variable.
(X-axis)
Response Variable
: Variable that measures the outcome of a study; the focus of the study.
(Y-axis)
Positive Association
: Both variables increase together
Negative Association
: As one variable increases, the other variable decreases
Residuals
: the differences between actual and predicted values of the data.
Scatterplot:
Describing a scatterplot:
Form
- Linear/ curved/ no form
Outliers
- unusual values that fall outside the overall pattern of the data
Strength
- Weak/ Moderate/ Strong
Direction
- Positive/ Negative/ No direction
Correlation
: A quantifiable measurement of the direction and strength of a linear relationship between two variables. (Represented with 'R')
Formula: R = (1/(n-1)) * Σ [ (x - x(bar))/s_x ] * [ (y - y(bar))/s_y ]
R>0: Positive Association
R<0: Negative Association
R value close to 1 or -1:
Strong linear relationship
R value close to 0:
Weak linear relationship
R describes association (not causation) and has no units
Describe a correlation:
strong/weak, positive/negative, and linear/nonlinear
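The correlation can be computed by hand as the sum of the products of the z-scores of x and y, divided by n - 1. The sketch below shows one way to do that with Python's standard library; the `correlation` helper and the hours/scores data are hypothetical, made up only for illustration.

```python
from statistics import mean, stdev

def correlation(xs, ys):
    """Correlation R: sum of z-score products divided by n - 1."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)  # sample standard deviations
    return sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y)
               for x, y in zip(xs, ys)) / (n - 1)

# Hypothetical data: hours studied (x) and test score (y)
hours = [1, 2, 3, 4, 5]
scores = [52, 60, 61, 70, 77]

r = correlation(hours, scores)
print(round(r, 3))  # positive and close to 1: strong positive linear association
```

Flipping the sign of every y value flips the sign of R but not its size, which matches the idea that the sign gives direction and the distance from 0 gives strength.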
Linear Regression Model
y(hat) = a + bx
a = Y-intercept
b = Slope
y(hat) = Predicted response (y-value)
x = Explanatory variable (x-value)
Slope
-> For every 1-unit increase in x, the predicted y value increases/decreases by b units
Intercept
-> When x is zero, the predicted y is 'a' units
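A small sketch of how the slope and intercept interpretations play out numerically. The model (predicted score = 46 + 6 * hours) and the `predict` helper are assumptions chosen for illustration, not values from the notes.

```python
def predict(a, b, x):
    """Predicted response y(hat) = a + b*x."""
    return a + b * x

# Hypothetical model: predicted score = 46 + 6 * (hours studied)
a, b = 46, 6

at_zero = predict(a, b, 0)                              # intercept: prediction when x = 0
change_per_unit = predict(a, b, 4) - predict(a, b, 3)   # slope: change for +1 unit of x

print(at_zero, change_per_unit)  # 46 6
```

Here the intercept says a student who studies 0 hours has a predicted score of 46, and the slope says each extra hour raises the predicted score by 6 points.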
Residuals
Residual = Actual - Predicted (R = A - P)
If there is a pattern in the residual plot, a linear model is not appropriate. On the other hand, if there is no pattern (random scatter) in the residual plot, then a linear model is appropriate.
The mean of the residuals from a least-squares regression line is ALWAYS 0
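The residual rule and the mean-zero property can be checked on a small example. The data below are hypothetical, and the values a = 46 and b = 6 are assumed to be the least-squares intercept and slope for exactly these data.

```python
from statistics import mean

# Hypothetical data: hours studied (x) and test score (y)
hours = [1, 2, 3, 4, 5]
scores = [52, 60, 61, 70, 77]

# Assumed least-squares line for these data: y(hat) = 46 + 6x
a, b = 46, 6

predicted = [a + b * x for x in hours]
residuals = [y - y_hat for y, y_hat in zip(scores, predicted)]  # Actual - Predicted

print(residuals)        # [0, 2, -3, 0, 1]
print(mean(residuals))  # 0
```

A positive residual means the actual value is above the line (the model under-predicts); a negative residual means it is below the line (the model over-predicts).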
Least Square Regression Line
A straight line that describes how a response variable y changes as an explanatory variable x changes, chosen so that the sum of the squared residuals is as small as possible
R^2
: Tells us the proportion of variation in the values of y that can be explained by the values of x (how well the data fit the regression model)
No units; usually reported as a percentage
Standard deviation of Residuals
: Gives us the "typical" or "average" prediction error
Slope of LSRL
: b = R * (s_y / s_x)
Y-intercept of LSRL
: a = y(bar) - b * x(bar)
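As a rough sketch, the LSRL slope and intercept can be built from summary statistics (slope = R times s_y over s_x, intercept = mean of y minus slope times mean of x), and R^2 and the standard deviation of the residuals follow from them. The data are hypothetical, and s is computed with n - 2 in the denominator, which is the usual convention for a regression line.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical data: hours studied (x) and test score (y)
hours = [1, 2, 3, 4, 5]
scores = [52, 60, 61, 70, 77]
n = len(hours)

x_bar, y_bar = mean(hours), mean(scores)
s_x, s_y = stdev(hours), stdev(scores)

# Correlation: sum of z-score products divided by n - 1
r = sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y)
        for x, y in zip(hours, scores)) / (n - 1)

b = r * s_y / s_x        # slope of the LSRL
a = y_bar - b * x_bar    # y-intercept of the LSRL

residuals = [y - (a + b * x) for x, y in zip(hours, scores)]
s = sqrt(sum(e ** 2 for e in residuals) / (n - 2))  # typical prediction error
r_squared = r ** 2       # proportion of variation in y explained by x

print(round(b, 2), round(a, 2), round(r_squared, 3), round(s, 2))
```

For these made-up data the line is about y(hat) = 46 + 6x, roughly 96% of the variation in scores is explained by hours, and predictions are typically off by about 2 points.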
Analyzing Departures from Linearity
Influential Observation
: a point that, when removed, drastically changes the slope, y-intercept, and/or correlation
High leverage point
: a point that has a substantially larger or smaller x value compared to the other observations. (Extreme 'x' value)
Outlier
: a point that does not follow the overall pattern of the data and has a large residual
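One way to see what "influential" means is to fit the line with and without a suspect point and compare. In this sketch the data, the extra point (15, 50), and the `lsrl` helper are all hypothetical; the slope is computed directly from sums of deviations, which is algebraically the same slope the least-squares line has.

```python
from statistics import mean

def lsrl(xs, ys):
    """Return (a, b) for the least-squares line y(hat) = a + b*x."""
    x_bar, y_bar = mean(xs), mean(ys)
    b = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))
    return y_bar - b * x_bar, b

# Hypothetical data: hours studied (x) and test score (y)
hours = [1, 2, 3, 4, 5]
scores = [52, 60, 61, 70, 77]

# Same data plus one high-leverage point with an extreme x value
a_without, b_without = lsrl(hours, scores)
a_with, b_with = lsrl(hours + [15], scores + [50])

print(round(b_without, 2), round(b_with, 2))  # slope without vs. with the point
```

With the extra point included the slope changes from positive to negative, so that single observation is both a high leverage point and an influential observation.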
A good LSRL has a small S and a large R^2 (close to 1)