Statistical Tests
Confidence Interval
Z confidence interval
t confidence interval
Hypothesis testing of means
2 samples
1 sample
Data is normal: 1-sample t-test
Data is non-normal: Sign test of median
Data is normal and independent: 2-sample t-test
Data is non-normal & independent: Wilcoxon rank-sum test
P confidence interval
Hypothesis testing of proportions
Two-sample Z test for p1-p2=0
Exact binomial procedure
One-sample z test
Chi-square test for independence
3+ samples: ANOVA
Two factors: Two-way ANOVA
Post-hoc comparisons
One factor: One-way ANOVA
Bonferroni correction: Tends to be overly conservative
Tukey's HSD Test: More accurate (less conservative)
SRS, normal x, known σ.
SRS, x̅ normal or n large. σ unknown!
CI(1-α) = x̅ ± t(α/2) · s/√n
CI(1-α) = x̅ ± z(α/2) · σ/√n
t(α/2) is the value of t (df = n-1) with tail probability of α/2
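A minimal sketch of the two interval formulas, using only the Python standard library. The function name mean_ci and the example numbers are illustrative assumptions, not part of these notes. The standard library has no t quantile, so the t critical value is entered by hand from a t table.

```python
from math import sqrt
from statistics import NormalDist


def mean_ci(xbar, sd, n, crit):
    """Return (lower, upper) for xbar ± crit * sd / sqrt(n)."""
    margin = crit * sd / sqrt(n)
    return xbar - margin, xbar + margin


alpha = 0.05
# Standard normal value with upper-tail probability alpha/2
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

# Z interval: sigma known (hypothetical values: mean 50, sigma 10, n 25)
z_interval = mean_ci(50.0, 10.0, 25, z_crit)

# t interval: sigma unknown, so s replaces sigma and t replaces z.
# 2.064 is the table value of t with df = 24 and tail probability 0.025.
t_interval = mean_ci(50.0, 10.0, 25, 2.064)

print(z_interval, t_interval)
```

With the same spread and sample size, the t interval comes out wider than the Z interval, which reflects the extra uncertainty from estimating σ with s.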
SRS, x̅ normal or n large.
t obs = (x̅ - μ0)/(s/√n), where μ0 is the mean under H0; df = n-1
Include tail(s) consistent with Ha
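A small sketch of the 1-sample t statistic, assuming a hypothetical sample and a hypothesised mean of 5. The function name one_sample_t is an illustrative choice. The p-value is left to a t table (df = n-1), since the standard library has no t distribution.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(data, mu0):
    """Return (t_obs, df) for H0: population mean = mu0."""
    n = len(data)
    s = stdev(data)  # sample standard deviation (n - 1 in the denominator)
    t_obs = (mean(data) - mu0) / (s / sqrt(n))
    return t_obs, n - 1


# Hypothetical sample, testing H0: population mean = 5
t_obs, df = one_sample_t([5.1, 4.9, 5.6, 5.8, 6.0], 5.0)
print(t_obs, df)
```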
Independent SRSs, both populations normal or n1 & n2 large
Include tail(s) consistent with Ha
Ha: Population median =/= value (or appropriate tail)
H0: Population median = value
SRS & continuous data
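One way to sketch the sign test: under H0 each observation falls above the hypothesised median with probability 0.5, so the count above it is Binomial(n, 0.5). The function name sign_test and the data are assumptions for illustration; observations equal to the hypothesised median are dropped, which is a common convention rather than something stated in these notes.

```python
from math import comb


def sign_test(data, median0):
    """Two-sided sign test of H0: population median = median0.

    Returns (number above, number below, p-value).
    """
    above = sum(1 for x in data if x > median0)
    below = sum(1 for x in data if x < median0)
    n = above + below  # observations equal to median0 are dropped
    k = min(above, below)
    # Under H0 the count above the median is Binomial(n, 0.5)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return above, below, min(1.0, 2 * tail)


# Hypothetical data, testing H0: median = 10
above, below, p_value = sign_test([12, 15, 9, 20, 17, 14, 22, 11], 10)
print(above, below, p_value)
```

For a one-sided Ha, use the single tail in the direction of Ha instead of doubling.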
H0: Distribution A = Distribution B
Ha: Distribution A is shifted above or below B, or centred at a different place than B (choose the tail to match)
2 independent SRS and continuous data
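A rough sketch of the rank-sum calculation: pool both samples, rank them, and sum the ranks of sample A. The function name rank_sum_test and the data are illustrative assumptions. The p-value here uses the large-sample normal approximation without a tie correction, which is only a rough guide for samples this small; exact tables are preferable there.

```python
from math import sqrt
from statistics import NormalDist


def rank_sum_test(a, b):
    """Return (W, z, two-sided p), where W is the rank sum of sample a."""
    pooled = sorted(a + b)
    # Average rank for each distinct value (ties share the mean rank)
    ranks = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        ranks[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j
    w = sum(ranks[x] for x in a)
    n1, n2 = len(a), len(b)
    mean_w = n1 * (n1 + n2 + 1) / 2
    sd_w = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)  # no tie correction
    z = (w - mean_w) / sd_w
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return w, z, p


# Hypothetical samples (lists of continuous measurements)
w, z, p_value = rank_sum_test([1.1, 2.3, 3.8], [4.2, 5.0, 6.7, 7.1])
print(w, z, p_value)
```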
H0: all μ's equal
F obs = MS(treatments)/MS(error)
Independent SRSs, normality/large n, constant σ
Follows F distribution with df = (m - 1, m(n - 1)) for m groups of n observations each
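A minimal sketch of the one-way ANOVA F statistic, assuming hypothetical data in three groups. The function name one_way_anova is illustrative. MS(treatments) is the between-group sum of squares over m - 1, and MS(error) is the within-group sum of squares over its degrees of freedom.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (F_obs, (df_treatments, df_error)) for a list of samples."""
    m = len(groups)
    total_n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / total_n
    ss_treat = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_error = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_treat = m - 1
    df_error = total_n - m  # equals m(n - 1) when every group has n values
    f_obs = (ss_treat / df_treat) / (ss_error / df_error)
    return f_obs, (df_treat, df_error)


# Hypothetical balanced design: m = 3 groups, n = 3 observations each
f_obs, dfs = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(f_obs, dfs)
```

A large F obs relative to the F table value for these df is evidence against H0.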
Y(ijk) = μ + α(i) + β(j) + αβ(ij) + ε(ijk)
Y(ijk) follows N(μ(ij), σ)
Three F-tests: effect of α, effect of β, interaction effect of αβ
SRS, categorical variable with 2 outcomes
if p known: np AND n(1-p) >= 10
if p unknown: np̂ AND n(1-p̂) >= 15
CI(1-α) = p̂ ± z(α/2) · √(p̂q̂/n), where q̂ = 1 - p̂
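A short sketch of the proportion interval, with the large-sample check for unknown p built in. The function name proportion_ci and the counts are assumptions for illustration.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(successes, n, alpha=0.05):
    """Return (lower, upper) for p_hat ± z * sqrt(p_hat * q_hat / n)."""
    failures = n - successes
    # Large-sample condition for unknown p: at least 15 of each outcome
    if successes < 15 or failures < 15:
        raise ValueError("need at least 15 successes and 15 failures")
    p_hat = successes / n
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    margin = z_crit * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


# Hypothetical sample: 60 successes in 100 trials, 95% interval
lower, upper = proportion_ci(60, 100)
print(lower, upper)
```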
SRS, 2 possible outcomes, np0 & nq0 >= 10
z obs = (p̂ - p0)/√(p0q0/n), which follows the Z distribution under H0
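A sketch of the one-sample z test for a proportion, assuming hypothetical counts and a two-sided Ha. The function name one_sample_prop_z is illustrative. Note that the standard error uses p0, the value under H0, not p̂.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_prop_z(successes, n, p0):
    """Return (z_obs, two-sided p-value) for H0: p = p0."""
    p_hat = successes / n
    z_obs = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    p_value = 2 * (1 - NormalDist().cdf(abs(z_obs)))
    return z_obs, p_value


# Hypothetical sample: 60 successes in 100 trials, H0: p = 0.5
z_obs, p_value = one_sample_prop_z(60, 100, 0.5)
print(z_obs, p_value)
```

For a one-sided Ha, use the single tail in the direction of Ha rather than doubling.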
Independent SRSs, # of successes and # of failures > 5 for both samples
z obs = ((p̂1 - p̂2) - 0)/√(p̂q̂/n1 + p̂q̂/n2)
Using the pooled estimate p̂ = (x1 + x2)/(n1 + n2) in the denominator, where x1 and x2 are the success counts and q̂ = 1 - p̂
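A sketch of the two-sample z test with the pooled estimate in the denominator. The function name two_sample_prop_z and the counts are assumptions for illustration; the p-value is two-sided.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_prop_z(x1, n1, x2, n2):
    """Return (z_obs, two-sided p-value) for H0: p1 - p2 = 0."""
    p1, p2 = x1 / n1, x2 / n2
    # Pooled proportion: under H0 both samples share one p
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z_obs = (p1 - p2) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z_obs)))
    return z_obs, p_value


# Hypothetical samples: 45 of 100 successes versus 30 of 100
z_obs, p_value = two_sample_prop_z(45, 100, 30, 100)
print(z_obs, p_value)
```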
SRS with 2 categorical variables, counts can be organized into 2-way table, 5+ observations in each cell
H0: no association between row and column variables
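A sketch of the test statistic for a two-way table. These notes do not spell out the formula; the code uses the standard Pearson statistic, the sum of (observed - expected)²/expected with expected = row total × column total / grand total, and df = (rows - 1)(columns - 1). The function name chi_square_independence and the counts are illustrative.

```python
def chi_square_independence(table):
    """Return (chi2, df) for a two-way table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if row and column variables are unrelated
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df


# Hypothetical 2x2 table of counts
chi2, df = chi_square_independence([[10, 20], [20, 10]])
print(chi2, df)
```

Compare chi2 with the chi-square table value for df to decide whether to reject H0.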
SRS and independence only!
Follows Z distribution (standard normal distribution, ~N(0,1))
Follows t distribution (df = n-1): heavier tails and lower peak than the standard normal