Statistical Tests

Confidence Interval

Z confidence interval

t confidence interval

Hypothesis testing of means

2 samples

1 sample

Data is normal: 1-sample t-test

Data is non-normal: Sign test of median

Data is normal and independent: 2-sample t-test

Data is non-normal & independent: Wilcoxon rank-sum test

P confidence interval

Hypothesis testing of proportions

Two-sample Z test for p1-p2=0

Exact binomial procedure

One-sample z test

Chi-square test for independence

3+ samples: ANOVA

Two factors: Two-way ANOVA

Post-hoc comparisons

One factor: One-way ANOVA

Bonferroni correction: Tends to be overly conservative

Tukey's HSD Test: More accurate (less conservative)
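
The Bonferroni correction listed above can be sketched in a few lines: each of the k pairwise comparisons is tested at α/k instead of α. The function name and the p-values below are hypothetical, chosen only for illustration.

```python
def bonferroni_reject(p_values, alpha=0.05):
    """Return one True/False flag per comparison: reject H0 at alpha / k?"""
    k = len(p_values)          # number of comparisons being made
    threshold = alpha / k      # Bonferroni-adjusted significance level
    return [p <= threshold for p in p_values]

# Hypothetical p-values for the 3 pairwise comparisons among 3 groups.
flags = bonferroni_reject([0.010, 0.020, 0.030])
```

With three comparisons the per-test threshold drops to about 0.0167, which illustrates why the method becomes conservative as the number of comparisons grows.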

SRS, normal x, known σ.

SRS, x̅ normal or n large. σ unknown!

CI (1-α) = x̅ ± t(α/2) · s/√n

CI (1-α) = x̅ ± z(α/2) · σ/√n

t(α/2) is the value of t (df = n-1) with upper-tail probability α/2
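
A minimal sketch of the z confidence interval (known σ) using only the Python standard library. The function name and the sample values are hypothetical. The t interval has the same shape with s in place of σ and a t critical value in place of z; the standard library has no t quantile, so it is not shown here.

```python
from math import sqrt
from statistics import NormalDist, mean

def z_confidence_interval(sample, sigma, alpha=0.05):
    """(1 - alpha) confidence interval for the mean when sigma is known."""
    n = len(sample)
    x_bar = mean(sample)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # upper-tail area alpha/2
    margin = z_crit * sigma / sqrt(n)
    return x_bar - margin, x_bar + margin

# Hypothetical data with an assumed known sigma of 0.3.
low, high = z_confidence_interval([4.8, 5.1, 5.3, 4.9, 5.4], sigma=0.3)
```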

SRS, x̅ normal or n large.

t obs = (x̅ - μ0)/(s/√n), where μ0 is the mean under H0

Include tail(s) consistent with Ha
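
As a rough illustration, the observed one-sample t statistic can be computed as below. The function name and data are hypothetical. The p-value would come from the t distribution with n-1 degrees of freedom, which needs a statistics library and is left out of this sketch.

```python
from math import sqrt
from statistics import mean, stdev

def one_sample_t_statistic(sample, mu0):
    """Observed t statistic and its degrees of freedom for H0: mean = mu0."""
    n = len(sample)
    standard_error = stdev(sample) / sqrt(n)  # stdev uses the n-1 divisor
    t_obs = (mean(sample) - mu0) / standard_error
    return t_obs, n - 1

# Hypothetical sample tested against mu0 = 5.0.
t_obs, df = one_sample_t_statistic([5.2, 4.9, 5.6, 5.1, 5.3], mu0=5.0)
```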

Independent SRSs, both populations normal or n1 & n2 large

Include tail(s) consistent with Ha

Ha: Population median ≠ value (or appropriate tail)

H0: Population median = value

SRS & continuous data
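
A possible sketch of the two-sided sign test: count observations above and below the hypothesized median, then compare the smaller count with a Binomial(n, 0.5) distribution. The function name and data are hypothetical, and dropping values equal to the hypothesized median is an assumption of this sketch.

```python
from math import comb

def sign_test(sample, median0):
    """Two-sided exact sign test for H0: population median = median0."""
    above = sum(x > median0 for x in sample)
    below = sum(x < median0 for x in sample)
    n = above + below                 # ties with median0 are dropped
    k = min(above, below)
    # P(X <= k) for X ~ Binomial(n, 0.5)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return above, below, min(1.0, 2 * tail)

# Hypothetical data tested against a median of 2.0.
above, below, p_value = sign_test([1.2, 3.4, 2.8, 4.1, 3.9, 5.0, 2.2, 4.4], 2.0)
```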


H0: Distribution A = Distribution B

Ha: Distribution A is shifted above B, below B, or centred at a different place than B

2 independent SRS and continuous data
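
One way to sketch the Wilcoxon rank-sum test is to rank the pooled data, sum the ranks of sample A, and standardize. The normal approximation used for the p-value is an assumption of this sketch (it is rough for samples this small), as is the absence of ties, which fits the continuous-data condition. The function name and data are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def rank_sum_test(a, b):
    """Rank sum W of sample a, with a normal-approximation z and p-value."""
    pooled = sorted([(x, "a") for x in a] + [(x, "b") for x in b])
    # Ranks run 1..n1+n2 over the pooled, sorted data (no ties assumed).
    w = sum(rank for rank, (_, label) in enumerate(pooled, start=1)
            if label == "a")
    n1, n2 = len(a), len(b)
    mean_w = n1 * (n1 + n2 + 1) / 2
    sd_w = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (w - mean_w) / sd_w
    p_two_sided = 2 * NormalDist().cdf(-abs(z))
    return w, z, p_two_sided

# Hypothetical samples.
w, z, p_value = rank_sum_test([1.1, 2.3, 3.8], [4.2, 5.0, 2.9, 6.1])
```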

H0: all μ's equal

F obs = MS(treatments)/MS(error)

Independent SRSs, normality/large n, constant σ

Follows F distribution with df = (m - 1, m(n-1)) for m groups of n observations each
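
The F statistic above can be computed by hand as in this sketch. The function name and data are hypothetical. The error degrees of freedom are written as N - m, which equals m(n-1) when every group has n observations. The p-value from the F distribution is not computed here.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Observed F = MS(treatments) / MS(error) and its degrees of freedom."""
    m = len(groups)
    total_n = sum(len(g) for g in groups)
    grand_mean = mean(x for g in groups for x in g)
    # Between-group and within-group sums of squares.
    ss_treat = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_error = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_treat, df_error = m - 1, total_n - m
    f_obs = (ss_treat / df_treat) / (ss_error / df_error)
    return f_obs, (df_treat, df_error)

# Hypothetical data: 3 groups of 3 observations each.
f_obs, df = one_way_anova_f([[4, 5, 6], [6, 7, 8], [8, 9, 10]])
```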

Y(ijk) = μ + α(i) + β(j) + αβ(ij) + ε(ijk)

Y(ijk) follows N(μ(ij), σ)

Three F-tests: effect of α, effect of β, interaction effect of αβ

SRS, categorical variable with 2 outcomes

if p known: np AND n(1-p) >= 10

if p unknown: np̂ AND n(1-p̂) >= 15

CI (1-α) = p̂ ± z(α/2) · √(p̂q̂/n)
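
A minimal sketch of the confidence interval for a proportion, standard library only. The function name and the counts are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def proportion_confidence_interval(successes, n, alpha=0.05):
    """(1 - alpha) large-sample confidence interval for a proportion."""
    p_hat = successes / n
    q_hat = 1 - p_hat
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    margin = z_crit * sqrt(p_hat * q_hat / n)
    return p_hat - margin, p_hat + margin

# Hypothetical data: 60 successes in 100 trials.
p_low, p_high = proportion_confidence_interval(60, 100)
```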

SRS, 2 possible outcomes, np0 & nq0 >= 10

follows Z distribution: z = (p̂ - p0)/√(p0q0/n)
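
The one-sample z test for a proportion might be sketched as follows. The function name and counts are hypothetical, and the two-sided p-value is only one choice; a one-sided Ha would use a single tail.

```python
from math import sqrt
from statistics import NormalDist

def one_proportion_z_test(successes, n, p0):
    """z statistic and two-sided p-value for H0: p = p0."""
    p_hat = successes / n
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)  # standard error uses p0
    p_two_sided = 2 * NormalDist().cdf(-abs(z))
    return z, p_two_sided

# Hypothetical data: 60 successes in 100 trials, tested against p0 = 0.5.
z_one, p_one = one_proportion_z_test(60, 100, 0.5)
```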

Independent SRSs, # of successes and # of failures > 5 for both samples

z = ((p̂1 - p̂2) - 0)/√(p̂q̂/n1 + p̂q̂/n2)

Using pooled estimate p̂ = (x1 + x2)/(n1 + n2) in the denominator, where x1 and x2 are the numbers of successes and q̂ = 1 - p̂
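
A sketch of the two-sample z test with the pooled proportion in the standard error. The function name and counts are hypothetical, and x1, x2 denote the numbers of successes in each sample.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z_test(x1, n1, x2, n2):
    """z statistic and two-sided p-value for H0: p1 - p2 = 0."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)   # pooled estimate of p under H0
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_two_sided = 2 * NormalDist().cdf(-abs(z))
    return z, p_two_sided

# Hypothetical data: 45/100 successes versus 30/100 successes.
z_two, p_two = two_proportion_z_test(45, 100, 30, 100)
```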

SRS with 2 categorical variables, counts can be organized into 2-way table, 5+ observations in each cell

H0: no association between row and column variables
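
The chi-square statistic for a two-way table can be sketched as below: expected counts come from row total × column total / grand total. The function name and the table are hypothetical. The p-value from the chi-square distribution is not computed here.

```python
def chi_square_statistic(table):
    """Chi-square statistic and degrees of freedom for a two-way table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under H0 (no association).
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df

# Hypothetical 2x2 table of counts.
chi2, chi2_df = chi_square_statistic([[20, 30], [30, 20]])
```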

SRS and independence only!

Follows Z distribution (standard normal, N(0,1))

Follows t distribution: not standard normal (heavier tails, lower peak)