Inference for Categorical & Quantitative Data: Chi-Square & Slopes

Chi-square Test

Goodness of Fit (one-way table)

Observed Count - The actual observed counts from a sample/study

Expected Count - The counts expected if the null hypothesis is true

One-way table - displays counts for categories of a single categorical variable

Chi-square Statistic: χ² = Σ (Observed - Expected)² / Expected, summed over all categories

df: number of categories - 1

As df increases, the chi-square distribution becomes less right-skewed

Conditions for a GOF test

Independence: Independent treatments, or N >= 10n (population at least 10 times the sample size) when sampling without replacement

Large Counts: All expected counts are >= 5

Randomness: Random sampling or random assignment

Ho: The distribution of the variable is the SAME as the claimed distribution

Ha: The distribution of the variable is DIFFERENT from the claimed distribution

p-value = χ²cdf(lower: χ² statistic, upper: very large number, df), i.e. the area to the right of the χ² statistic
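The GOF steps above (expected counts, Large Counts check, χ² statistic, df, p-value) can be sketched in Python. This is only an illustration under stated assumptions: the die-roll data are made up, the function names gof_test and chi_square_sf are hypothetical, and the right-tail area is computed with a standard incomplete-gamma recurrence in place of the calculator's χ²cdf.

```python
import math

def chi_square_sf(stat, df):
    """Right-tail area P(X >= stat) for a chi-square distribution with integer df."""
    x = stat / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-x)               # Q(1, x) = e^(-x)
    else:
        a, q = 0.5, math.erfc(math.sqrt(x))    # Q(1/2, x) = erfc(sqrt(x))
    # Recurrence: Q(a + 1, x) = Q(a, x) + x^a * e^(-x) / Gamma(a + 1)
    while a < df / 2.0:
        q += x ** a * math.exp(-x) / math.gamma(a + 1)
        a += 1.0
    return q

def gof_test(observed, claimed_proportions):
    """Chi-square goodness-of-fit test for a one-way table of counts."""
    n = sum(observed)
    # Expected count = sample size * proportion claimed by Ho
    expected = [n * p for p in claimed_proportions]
    if min(expected) < 5:
        raise ValueError("Large Counts condition fails: an expected count is below 5")
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1
    return expected, stat, df, chi_square_sf(stat, df)

# Hypothetical data: 60 rolls of a die claimed to be fair
observed = [8, 12, 9, 11, 6, 14]
expected, stat, df, p_value = gof_test(observed, [1 / 6] * 6)
print(f"chi-square = {stat:.2f}, df = {df}, p-value = {p_value:.3f}")
```

A large p-value here would mean failing to reject Ho, so the sample would give no convincing evidence that the die's distribution differs from the claim.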

Chi-square test for Homogeneity (1 categorical variable, compared across 2 or more populations/groups)

Chi-square test for Independence (2 categorical variables, measured on 1 sample)

Same Conditions: Randomness, Independence, Large Counts

Homogeneity: Ho: The distribution of the variable is the same for every population/group Ha: The distribution of the variable is not the same for every population/group

Independence: Ho: There is no association between var. A & var. B Ha: There is an association between var. A & var. B (they are dependent)

Expected Count = (row total * column total) / (table total)

2-way table
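As a rough illustration of the expected-count formula, the sketch below builds the expected table for a two-way table and computes the χ² statistic from it. The table values and function names are hypothetical, and df = (rows - 1)(columns - 1) is the standard formula for two-way tables, which these notes do not state explicitly.

```python
def expected_counts(table):
    """Expected count for each cell = (row total * column total) / (table total)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    return [[r * c / total for c in col_totals] for r in row_totals]

def chi_square_two_way(table):
    """Chi-square statistic, df, and Large Counts check for a two-way table."""
    expected = expected_counts(table)
    stat = sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(table, expected)
               for o, e in zip(obs_row, exp_row))
    # Standard df for a two-way table (assumption: not stated in the notes)
    df = (len(table) - 1) * (len(table[0]) - 1)
    large_counts_ok = all(e >= 5 for row in expected for e in row)
    return expected, stat, df, large_counts_ok

# Hypothetical 2x2 table of observed counts (rows = groups, columns = responses)
table = [[20, 30],
         [30, 20]]
expected, stat, df, large_counts_ok = chi_square_two_way(table)
print(expected, stat, df, large_counts_ok)
```

The same arithmetic serves both the homogeneity and the independence test; only the hypotheses and the way the data were collected differ.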

Confidence Interval for Slopes

Conditions

Normality - no strong skew or outliers in the distribution of the residuals

Equal Variance - No fanning on the residual plot / same spread around residual = 0

Independence - N >= 10n or treatments are independent of each other

Randomness - random assignment/random selection

Linear - The relationship between x and y is linear (NO pattern in the residual plot)

Standard Deviation:

Margin of Error: (t*)(SEb) =

Confidence Interval: b±(t*)(SEb) =

Interpretations

If all conditions are met, perform a linear regression t-interval for slope

Constant (α): when (explanatory variable) is 0, the predicted (response variable) is ___

Slope (β): for every 1-unit increase in (explanatory variable), the predicted (response variable) increases by ___

SD (σ): the actual (response variable) values typically vary by ___ from the values predicted by the regression line

Standard error of slope (SEb): the slope of the sample regression line typically varies by ___ from the slope of the true regression line for predicting (response variable) from (explanatory variable)

Conclusion for 4-step plan: We are (confidence level) confident that the interval from ___ to ___ captures the slope of the true regression line for predicting (response variable) from (explanatory variable)

Mean:

df = n-2
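The interval b ± (t*)(SEb) can be sketched from raw data as follows. This is a minimal illustration, not a full procedure: the data are invented, slope_interval is a hypothetical name, SEb is computed with the usual least-squares formula s / sqrt(Σ(x - x̄)²), and t* is assumed to be looked up separately (t table or invT) with df = n - 2.

```python
import math

def slope_interval(x, y, t_star):
    """Least-squares slope b, its standard error SEb, and b ± t* * SEb."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    a = y_bar - b * x_bar                        # y-intercept of the sample line
    residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    # s = standard deviation of the residuals, using df = n - 2
    s = math.sqrt(sum(r ** 2 for r in residuals) / (n - 2))
    se_b = s / math.sqrt(sxx)
    margin = t_star * se_b                       # margin of error
    return b, se_b, (b - margin, b + margin)

# Hypothetical data with n = 5, so df = 3; t* = 3.182 for 95% confidence
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
b, se_b, (low, high) = slope_interval(x, y, t_star=3.182)
print(f"b = {b:.3f}, SEb = {se_b:.4f}, interval = ({low:.3f}, {high:.3f})")
```

The conditions still need to be checked on a residual plot before the interval is interpreted; the code only does the arithmetic.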

Test for Slopes

Test Statistic: t = (b - hypothesized slope) / SEb, where the hypothesized slope = 0 when checking for a linear association

Ho: β = 0 (when testing whether there is a linear relationship)

Ho: β = hypothesized slope (if given)

Ha: β ≠ 0 (there is a linear relationship)

Ha: β > 0 (there is a positive linear relationship)

Ha: β < 0 (there is a negative linear relationship)

df = n-2

If testing for β ≠ 0, remember to multiply the tcdf tail area by 2 to get the p-value
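A hedged sketch of the slope test: it computes t = (b - hypothesized slope) / SEb with df = n - 2, then approximates the tail area that tcdf would return by integrating the t density with Simpson's rule. The function names and the example numbers are hypothetical, and the numerical integration is a stand-in for a calculator or statistics library.

```python
import math

def t_pdf(t, df):
    """Density of the t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + t * t / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=2000):
    """Approximate P(T >= t) using Simpson's rule on [0, |t|] (steps must be even)."""
    h = abs(t) / steps
    area = t_pdf(0.0, df) + t_pdf(abs(t), df)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * t_pdf(i * h, df)
    tail = max(0.0, 0.5 - area * h / 3)          # area beyond |t| on one side
    return tail if t >= 0 else 1 - tail

def slope_t_test(b, se_b, n, beta_0=0.0, alternative="two-sided"):
    """t statistic, df, and p-value for Ho: slope = beta_0."""
    t = (b - beta_0) / se_b
    df = n - 2
    if alternative == "greater":                 # Ha: slope > beta_0
        p = t_upper_tail(t, df)
    elif alternative == "less":                  # Ha: slope < beta_0
        p = 1 - t_upper_tail(t, df)
    else:                                        # Ha: slope != beta_0, double the tail
        p = 2 * t_upper_tail(abs(t), df)
    return t, df, p

# Hypothetical summary values: b = 0.8, SEb = 0.35, n = 12
t_stat, df, p_value = slope_t_test(0.8, 0.35, 12)
print(f"t = {t_stat:.3f}, df = {df}, p-value = {p_value:.4f}")
```

For very large samples math.gamma can overflow, so this sketch is only suitable for classroom-sized data sets.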

Conditions

LINER: Linear, Independence, Normality, Equal Variance, Randomness

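If all conditions are met, perform a linear regression t-test for slope

If p-value > α (significance level), we fail to reject Ho. Therefore we do not have convincing evidence of a (positive/negative) linear relationship between (explanatory variable) and (response variable)

If p-value < α (significance level), we reject Ho. Therefore we do have convincing evidence of a (positive/negative) linear relationship between (explanatory variable) and (response variable)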