Linear Regression

Type of Plot:
Scatterplot

Simple Linear Regression:
library(ggplot2)

# explanatory, response: placeholder column names in dataframe
qplot(x = explanatory,
      y = response,
      data = dataframe,
      geom = "point")

Parameters: \( \beta_i \) for \( i = 0, \ldots, k - 1 \)
Point Estimates: \( B_i \) for \( i = 0, \ldots, k - 1 \)
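
As a minimal sketch of how the observed point estimates can be obtained in R, the block below fits a simple linear regression with base R's `lm()`. The built-in `mtcars` data, with `mpg` as response and `wt` as explanatory variable, is an illustrative assumption and not part of the original notes.

```r
# Illustrative data: mtcars ships with R; mpg and wt are assumed example columns.
fit <- lm(mpg ~ wt, data = mtcars)

# Observed point estimates b_0 (intercept) and b_1 (slope on wt)
b_obs <- coef(fit)
print(b_obs)
```

Here `b_obs[["(Intercept)"]]` plays the role of \( b_{0, obs} \) and `b_obs[["wt"]]` the role of \( b_{1, obs} \).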

Hypothesis Test

Hypotheses:
\( H_0: \beta_i = 0 \)
1) \( H_a: \beta_i < 0 \)
2) \( H_a: \beta_i \ne 0 \)
3) \( H_a: \beta_i > 0 \)

Test Statistic Random Variable (Assuming \(H_0\) is true):
\( T = \dfrac{B_i - 0}{{SE}_i} \sim t(df = n - k) \) where \( SE_i \) is defined here

Observed Test Statistic:
\( t_{obs} = \dfrac{b_{i, obs} - 0}{{SE}_{i, obs}}\) where \( {SE}_{i, obs} \) is defined here
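
The observed test statistic can be computed by hand from the coefficient table, as sketched below. The `mtcars` data and the choice of `wt` as the predictor under test are assumptions made only for illustration.

```r
# Illustrative fit on built-in data (assumed example, not from the notes)
fit <- lm(mpg ~ wt, data = mtcars)

# Coefficient table columns: Estimate, Std. Error, t value, Pr(>|t|)
coefs <- summary(fit)$coefficients

b_obs  <- coefs["wt", "Estimate"]     # b_{i, obs}
se_obs <- coefs["wt", "Std. Error"]   # SE_{i, obs}

# t_obs = (b_{i, obs} - 0) / SE_{i, obs}
t_obs <- (b_obs - 0) / se_obs

# Degrees of freedom: n observations minus k estimated parameters
n  <- nrow(mtcars)
k  <- length(coef(fit))
df <- n - k
```

The hand-computed `t_obs` should agree with the `t value` column that `summary()` reports, and `df` with `df.residual(fit)`.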

\( \mathbf{\textit{P}}\)-value:
1) \( \mathbb{P}(B_i \le b_{i, obs}) = \mathbb{P}(T \le t_{obs}) \)
2) \( \mathbb{P}\left(\big| B_i \big| \ge \big| b_{i, obs} \big| \right) = \mathbb{P}\left(\big| T \big| \ge \big| t_{obs} \big|\right) \)
3) \( \mathbb{P}(B_i \ge b_{i, obs}) = \mathbb{P}(T \ge t_{obs}) \)
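
The three P-values correspond to three tail areas of the \( t(df = n - k) \) distribution, which `pt()` returns. The sketch below assumes the same illustrative `mtcars` fit; the variable names are hypothetical.

```r
# Illustrative fit on built-in data (assumed example, not from the notes)
fit   <- lm(mpg ~ wt, data = mtcars)
t_obs <- summary(fit)$coefficients["wt", "t value"]
df    <- df.residual(fit)   # n - k

# 1) H_a: beta_i < 0   ->  P(T <= t_obs)
p_less <- pt(t_obs, df = df)

# 2) H_a: beta_i != 0  ->  P(|T| >= |t_obs|)
p_two_sided <- 2 * pt(abs(t_obs), df = df, lower.tail = FALSE)

# 3) H_a: beta_i > 0   ->  P(T >= t_obs)
p_greater <- pt(t_obs, df = df, lower.tail = FALSE)
```

The two-sided value is the one `summary()` prints in its `Pr(>|t|)` column; the one-sided values have to be computed separately, as above.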

Conditions for Distributional Approximation (To \( T\)):

  1. Linear relationship between response and predictors (Check residual plot for randomly distributed errors)
  2. Independent observations, errors, and predictor variables (Check residual plot for no time series-like patterns and plot the predictors pairwise)
  3. Nearly normal residuals (Check Q-Q plot of standardized residuals)
  4. Equal variance of the residuals across values of the explanatory variables (Check residual plot for fan-shaped patterns)
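
The checks listed above can be sketched with base R graphics. The model `mpg ~ wt + hp` on the built-in `mtcars` data is an assumed example, chosen so that the pairwise predictor plot has something to show.

```r
# Illustrative fit with two predictors (assumed example, not from the notes)
fit <- lm(mpg ~ wt + hp, data = mtcars)

res     <- resid(fit)       # raw residuals
std_res <- rstandard(fit)   # standardized residuals

# Conditions 1, 2, 4: residuals vs fitted values.
# Look for curvature, time series-like runs, or a fan shape.
plot(fitted(fit), res, xlab = "Fitted values", ylab = "Residuals")
abline(h = 0, lty = 2)

# Condition 3: normal Q-Q plot of the standardized residuals
qqnorm(std_res)
qqline(std_res)

# Condition 2: pairwise plot of the predictors
pairs(mtcars[, c("wt", "hp")])
```

These plots are judged by eye; none of them yields a formal pass or fail.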

Confidence Interval

Formula for CI:
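\( b_{i, obs} \pm t^* \cdot {SE}_{i, obs} \) where \( t^* \) is the critical value from \( t(df = n - k) \) for the chosen confidence level and \( {SE}_{i, obs} \) is defined here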
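
A minimal sketch of the interval computation, taking \( t^* \) to be the critical value of the \( t(df = n - k) \) distribution. The 95% level and the `mtcars` fit are assumptions for illustration only.

```r
# Illustrative fit on built-in data (assumed example, not from the notes)
fit   <- lm(mpg ~ wt, data = mtcars)
coefs <- summary(fit)$coefficients

b_obs  <- coefs["wt", "Estimate"]
se_obs <- coefs["wt", "Std. Error"]

# Critical value t* for a 95% interval (assumed confidence level)
conf_level <- 0.95
t_star <- qt(1 - (1 - conf_level) / 2, df = df.residual(fit))

# estimate +/- t* times standard error
ci <- c(lower = b_obs - t_star * se_obs,
        upper = b_obs + t_star * se_obs)
print(ci)
```

The same interval is available directly from `confint(fit, "wt", level = 0.95)`, which is a convenient cross-check on the hand computation.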

Conditions for Distributional Approximation (To \( T\)):

  1. Linear relationship between response and predictors (Check residual plot for randomly distributed errors)
  2. Independent observations, errors, and predictor variables (Check residual plot for no time series-like patterns and plot the predictors pairwise)
  3. Nearly normal residuals (Check Q-Q plot of standardized residuals)
  4. Equal variance of the residuals across values of the explanatory variables (Check residual plot for fan-shaped patterns)

\( Y = \beta_0 + \beta_1 X_1 + \ldots + \beta_{k - 1} X_{k - 1} + \varepsilon \)


\( k = 2 \) in Simple Linear Regression
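
Because \( k \) counts the intercept \( \beta_0 \) along with the slopes, it equals the number of columns in the design matrix. The sketch below illustrates this with `model.matrix()`; the `mtcars` formulas are assumed examples.

```r
# Design matrices for illustrative models (assumed examples, not from the notes)
X_simple <- model.matrix(mpg ~ wt, data = mtcars)        # intercept + 1 predictor
X_multi  <- model.matrix(mpg ~ wt + hp, data = mtcars)   # intercept + 2 predictors

k_simple <- ncol(X_simple)   # 2
k_multi  <- ncol(X_multi)    # 3

colnames(X_simple)           # "(Intercept)" "wt"
```

With \( k \) defined this way, the degrees of freedom \( n - k \) match what `df.residual()` reports for the corresponding fit.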

Plots for Multiple Linear Regression commonly require \( \ge 3 \) dimensions, since the response and each predictor need their own axis