Simple Linear Regression
chp 8
elegant method for estimating a continuous target variable
least squares estimate
aims to minimize the sum of squared residuals
y = mx + c + e
e - accounts for indeterminacy (random error) in the model
formula specifics in Appendix 1, SS4
Avoid extrapolating (estimating and predicting) outside the range of x - we don't know if the relationship is still linear outside it
if extrapolation can't be avoided - inform the end user
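The least squares fit can be sketched numerically. A minimal sketch with made-up data, using the standard closed-form slope and intercept (formula specifics are in Appendix 1, SS4):

```python
# Minimal least-squares sketch for y = m*x + c (data are made up).
def least_squares(x, y):
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    # slope: sum((x - x_bar)(y - y_bar)) / sum((x - x_bar)^2)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    m = sxy / sxx
    c = y_bar - m * x_bar  # the fitted line passes through (x_bar, y_bar)
    return m, c

x = [1, 2, 3, 4, 5]
y = [2.1, 4.0, 6.2, 7.9, 10.1]
m, c = least_squares(x, y)  # roughly m ≈ 1.99, c ≈ 0.09
```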
Sum of Sq total, SSE, SSR
summation ( y - ȳ )^2
where ȳ = mean of y
measure of total variability in the response var
differs from sum of sq error - SSE uses the predicted value ŷ, while SST uses the mean ȳ
if SST > SSE - it is meaningful to use regression as it improves our predictive ability
Sum of Sq Regression - summation (ŷ - ȳ)^2 where ŷ = predicted value and ȳ = mean
SSR measures the improvement from using the predictors rather than just the mean
SST = SSR + SSE, Appendix 1 SS5 for eq
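The SST = SSR + SSE identity can be checked numerically. A sketch with made-up data (the identity holds for the least-squares line fitted with an intercept):

```python
x = [1, 2, 3, 4, 5]
y = [2.1, 4.0, 6.2, 7.9, 10.1]
n = len(y)
x_bar, y_bar = sum(x) / n, sum(y) / n

# least-squares slope and intercept
m = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum((xi - x_bar) ** 2 for xi in x)
c = y_bar - m * x_bar
y_hat = [m * xi + c for xi in x]  # predicted values

sst = sum((yi - y_bar) ** 2 for yi in y)               # total variability
sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))  # unexplained (error)
ssr = sum((yh - y_bar) ** 2 for yh in y_hat)           # explained by the regression
# sst == ssr + sse, up to floating-point rounding
```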
goodness of fit, "r^2"
r^2 = SSR/SST
physical sciences tend to have better r^2 than social sciences - because of person-to-person variability
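Given the decomposition, r^2 can also be computed as 1 − SSE/SST. A short sketch with made-up data:

```python
# r^2 sketch: 1 - SSE/SST, equivalent to SSR/SST since SST = SSR + SSE.
x = [1, 2, 3, 4, 5]
y = [2.1, 4.0, 6.2, 7.9, 10.1]
n = len(y)
x_bar, y_bar = sum(x) / n, sum(y) / n
m = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum((xi - x_bar) ** 2 for xi in x)
c = y_bar - m * x_bar

sst = sum((yi - y_bar) ** 2 for yi in y)
sse = sum((yi - (m * xi + c)) ** 2 for xi, yi in zip(x, y))
r2 = 1 - sse / sst  # fraction of variability in y explained by the regression
```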
Standard Error
s = root (Mean sq error) = root ( (sum of sq error) / (n-m-1) )
m = no. of predictor variables
This is the typical estimation error as SD is the size of typical deviation
measured in units of y - smaller is better
typical prediction error of the baseline (mean-only) model
computed using SST instead of SSE - goal is for this to be larger than the standard error, which shows the regression is an improvement
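The two error measures can be put side by side. A sketch with made-up data and m = 1 predictor: the standard error uses SSE, the baseline spread of y uses SST.

```python
import math

x = [1, 2, 3, 4, 5]
y = [2.1, 4.0, 6.2, 7.9, 10.1]
n, m_pred = len(y), 1  # m_pred = number of predictor variables
x_bar, y_bar = sum(x) / n, sum(y) / n
slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum((xi - x_bar) ** 2 for xi in x)
intercept = y_bar - slope * x_bar

sse = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
sst = sum((yi - y_bar) ** 2 for yi in y)

s = math.sqrt(sse / (n - m_pred - 1))  # standard error, in units of y
s_y = math.sqrt(sst / (n - 1))         # typical error when predicting with the mean alone
# s < s_y here, so the regression improves on the mean-only prediction
```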
Correlation Coefficient
r - tells the strength of the relationship
sometimes even a small r is useful for large data sets
Formula in Appendix 1, SS6
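A sketch of r on made-up data, assuming the formula in Appendix 1, SS6 is the usual Pearson form:

```python
import math

x = [1, 2, 3, 4, 5]
y = [2.1, 4.0, 6.2, 7.9, 10.1]
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n

sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
sxx = sum((xi - x_bar) ** 2 for xi in x)
syy = sum((yi - y_bar) ** 2 for yi in y)

r = sxy / math.sqrt(sxx * syy)  # Pearson correlation coefficient
# for simple linear regression, r**2 equals SSR/SST
```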
Anova Table
- Appendix 1, SS7
Special cases
Outliers
Very large standardized residual in abs value
High leverage points
Influential observations
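These special cases can be flagged numerically. A sketch on made-up data, assuming the usual simple-regression formulas: leverage h_i = 1/n + (x_i − x̄)²/Sxx, standardized residual e_i / (s·√(1 − h_i)), with |standardized residual| > 2 as a common outlier rule of thumb:

```python
import math

x = [1, 2, 3, 4, 5]
y = [2.1, 4.0, 6.2, 7.9, 10.1]
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
intercept = y_bar - slope * x_bar

resid = [yi - (slope * xi + intercept) for xi, yi in zip(x, y)]
s = math.sqrt(sum(e ** 2 for e in resid) / (n - 2))  # standard error, 1 predictor

# leverage: how far x_i sits from x_bar; large values flag high-leverage points
h = [1 / n + (xi - x_bar) ** 2 / sxx for xi in x]
# standardized residuals; |value| > 2 flags a potential outlier
std_resid = [e / (s * math.sqrt(1 - hi)) for e, hi in zip(resid, h)]
```

One handy sanity check: the leverages always sum to the number of fitted parameters (2 here: slope and intercept).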