Basic Stats
QuaNTitative variables
numerical
graph
grouped frequency table
cumulative relative frequency table
histogram
cumulative relative frequency table
symmetric distribution
mean
standard deviation
non-symmetric distribution
median (Q50)
lower quartile (Q25)
upper quartile (Q75)
cumulative relative frequency plot
box plot
positive (right skew)
geometric mean
mode
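The summary measures listed above can be computed with Python's standard library. This is a minimal sketch: the sample values and the summarize helper are hypothetical, chosen only to show a right-skewed sample where the mean sits above the median.

```python
import statistics

def summarize(values):
    """Return the summary measures named in the outline for one sample."""
    # Quartile cut points; statistics.quantiles uses the 'exclusive' method by default.
    q25, q50, q75 = statistics.quantiles(values, n=4)
    return {
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values),          # sample SD (divides by n - 1)
        "median": statistics.median(values),     # Q50
        "q25": q25,                              # lower quartile
        "q75": q75,                              # upper quartile
        "geometric_mean": statistics.geometric_mean(values),
        "mode": statistics.mode(values),
    }

# Hypothetical sample with a positive (right) skew.
values = [2, 3, 3, 4, 5, 6, 8, 12, 20]
summary = summarize(values)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

For a skewed sample like this one, the median and quartiles describe the centre and spread better than the mean and standard deviation, which is why the outline pairs them with non-symmetric distributions.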
P Value
P > 0.1: no evidence against H0 (H0 might be correct)
P < 0.05: null hypothesis rejected due to evidence against it
0.05 < P < 0.1: weak / suggestive evidence against H0
0.01 < P < 0.05: quite strong evidence against H0
0.001 < P < 0.01: strong evidence against H0
P < 0.001: very strong evidence against H0
Statistical table gives the one-sided P-value (multiply by 2 for the two-sided P-value)
Calculation
Z test (for n>20)
T test (for n<20)
if the 95% CI includes 0 (difference in proportions) or 1 (risk ratio), then P > 0.05
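As a sketch of the P-value rules above, the table lookup can be replaced by the normal distribution in Python's standard library. The function names p_from_z and evidence_label are hypothetical, and the handling of values that fall exactly on a boundary (e.g. P = 0.05) is an assumption, since the notes only give strict inequalities.

```python
from statistics import NormalDist

def p_from_z(z, two_sided=True):
    """P-value for a Z statistic under the standard normal distribution."""
    one_sided = 1.0 - NormalDist().cdf(abs(z))   # upper-tail area, as in a table
    return 2.0 * one_sided if two_sided else one_sided

def evidence_label(p):
    """Map a P-value to the wording used in the notes (boundaries are assumed)."""
    if p < 0.001:
        return "very strong evidence against H0"
    if p < 0.01:
        return "strong evidence against H0"
    if p < 0.05:
        return "quite strong evidence against H0"
    if p < 0.1:
        return "weak / suggestive evidence against H0"
    return "no evidence against H0"

p = p_from_z(2.5)
print(f"Z = 2.5, two-sided P = {p:.4f}: {evidence_label(p)}")
```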
QuaLItative variables
Proportions
Comparison of Two Proportions
Standard error
SE= root[ p-bar(100-p-bar)/n1 + p-bar (100- p-bar)/n2 ]
95% CI for difference
= (p1 - p2) +/- 1.96 * root[ p1(100-p1)/n1 + p2(100-p2)/n2 ]
sample n>20
2 X P = two-sided P-value
with Z value: look for one-sided P
Z test= (p1 - p2)/SE(p1-p2)
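The comparison of two proportions can be sketched as follows, working in percentages as the notes do. The function name and the counts in the example are hypothetical. The test uses the standard error based on the overall percentage p-bar, while the confidence interval uses the separate percentages p1 and p2.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_test(cases1, n1, cases2, n2):
    """Large-sample Z test and 95% CI for a difference in percentages."""
    p1 = 100.0 * cases1 / n1
    p2 = 100.0 * cases2 / n2
    p_bar = 100.0 * (cases1 + cases2) / (n1 + n2)   # overall percentage of cases

    # Standard error under H0 (both groups share p-bar), used for the test.
    se_null = sqrt(p_bar * (100 - p_bar) / n1 + p_bar * (100 - p_bar) / n2)
    z = (p1 - p2) / se_null
    p_two_sided = 2.0 * (1.0 - NormalDist().cdf(abs(z)))

    # Standard error with separate percentages, used for the 95% CI.
    se_diff = sqrt(p1 * (100 - p1) / n1 + p2 * (100 - p2) / n2)
    diff = p1 - p2
    ci = (diff - 1.96 * se_diff, diff + 1.96 * se_diff)
    return {"diff": diff, "z": z, "p_two_sided": p_two_sided, "ci95": ci}

# Hypothetical data: 30 of 100 cases in group 1, 20 of 100 in group 2.
result = two_proportion_test(30, 100, 20, 100)
print(result)
```

In this example the 95% CI includes 0 and the two-sided P-value is above 0.05, which matches the rule of thumb given in the P Value notes.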
Confidence Interval
Standard error = root[ p*(100-p)/n ]
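A minimal sketch of the standard error of a single percentage, with the large-sample 95% CI that these notes use for n > 20. The function name and the example figures are hypothetical.

```python
from math import sqrt

def proportion_ci(p, n, multiplier=1.96):
    """Standard error and CI for a percentage p (0-100) from a sample of size n."""
    se = sqrt(p * (100 - p) / n)
    return se, (p - multiplier * se, p + multiplier * se)

# Hypothetical example: 40% observed in a sample of 200.
se, (low, high) = proportion_ci(40, 200)
print(f"SE = {se:.2f}, 95% CI = {low:.1f}% to {high:.1f}%")
```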
Graph
bar chart
frequency table
ordered categorical
categorical
binary
p-bar = overall percentage of cases
Sampling Variability of a Mean
Standard Error
SE = s / root[ n ]
95% CI
n > 20
95% CI = mean +/- 1.96 * SE
n < 20
95% CI = mean +/- t * SE
df = n-1
look for t in table
95% CI for a proportion
n > 20
95% CI = p +/- 1.96 * SE
n < 20
95% CI = p +/- t * SE
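The standard error and 95% CI for a mean can be sketched as below. Python's standard library has no t distribution, so the t value is passed in after being looked up in a table at df = n - 1, as the notes describe. The function name and the sample are hypothetical; 2.262 is the tabulated two-sided 95% value for df = 9.

```python
from math import sqrt
import statistics

def mean_ci(values, t_value=None):
    """Mean, SE and 95% CI; uses 1.96 unless a tabulated t value is supplied."""
    n = len(values)
    mean = statistics.mean(values)
    se = statistics.stdev(values) / sqrt(n)      # SE = s / root[n]
    multiplier = 1.96 if t_value is None else t_value
    return mean, se, (mean - multiplier * se, mean + multiplier * se)

# Hypothetical small sample (n = 10, so df = 9 and t = 2.262 from a t table).
sample = [4.1, 5.0, 5.5, 6.2, 4.7, 5.3, 5.9, 4.8, 5.1, 5.4]
mean, se, ci_t = mean_ci(sample, t_value=2.262)
_, _, ci_z = mean_ci(sample)
print(f"mean = {mean:.2f}, SE = {se:.3f}")
print(f"95% CI with t:    {ci_t[0]:.2f} to {ci_t[1]:.2f}")
print(f"95% CI with 1.96: {ci_z[0]:.2f} to {ci_z[1]:.2f}")
```

The interval based on t is wider than the one based on 1.96, which reflects the extra uncertainty in estimating s from a small sample.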
Significance Test
Z = (mean - "population mean")/ SE (mean)
n < 20
df = n-1
look in the t table at the intersection of the test statistic and df
P-value lies between two tabulated values (e.g. 0.001 and 0.002)
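A sketch of the one-sample significance test described above, working from summary statistics. For n > 20 the P-value comes from the normal distribution. For small samples the function returns the statistic and df so that the P-value can be bracketed from a t table. The function name and figures are hypothetical, and treating n = 20 as a small sample is an assumption, since the notes only give n > 20 and n < 20.

```python
from math import sqrt
from statistics import NormalDist

def one_sample_test(mean, pop_mean, s, n):
    """Test statistic (mean - population mean) / SE(mean), with P for large n."""
    se = s / sqrt(n)
    stat = (mean - pop_mean) / se
    if n > 20:
        # Large sample: treat the statistic as Z and read P from the normal curve.
        p = 2.0 * (1.0 - NormalDist().cdf(abs(stat)))
        return {"statistic": stat, "df": None, "p_two_sided": p}
    # Small sample: look up the statistic in a t table at df = n - 1.
    return {"statistic": stat, "df": n - 1, "p_two_sided": None}

large = one_sample_test(mean=52, pop_mean=50, s=8, n=64)
small = one_sample_test(mean=52, pop_mean=50, s=4, n=16)
print(large)
print(small)
```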
Analysis of matched pairs
difference within each pair: d = x1 - x2 (the two paired measurements)
calculate mean (d-bar) of all d
SE (d-bar) = s / root[ n ]
look for t at df= n-1
95% CI = d-bar +/- t*SE
t-test: t = (d-bar - 0) / SE(d-bar)
look for P in the t table at df = n-1 and t
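The matched-pairs steps can be sketched as follows. The function name and the paired measurements are hypothetical, and the t value is supplied from a table (2.776 is the tabulated two-sided 95% value for df = 4). Here s is the standard deviation of the differences.

```python
from math import sqrt
import statistics

def paired_analysis(first, second, t_value):
    """Mean difference, SE, t statistic and 95% CI for matched pairs."""
    d = [a - b for a, b in zip(first, second)]    # difference within each pair
    n = len(d)
    d_bar = statistics.mean(d)
    se = statistics.stdev(d) / sqrt(n)            # SE(d-bar) = s / root[n]
    t_stat = (d_bar - 0) / se                     # H0: mean difference is 0
    ci = (d_bar - t_value * se, d_bar + t_value * se)
    return {"d_bar": d_bar, "se": se, "t": t_stat, "df": n - 1, "ci95": ci}

# Hypothetical paired measurements on 5 subjects.
first = [10, 12, 9, 11, 13]
second = [8, 11, 9, 9, 10]
paired = paired_analysis(first, second, t_value=2.776)
print(paired)
```

In this example the t statistic exceeds the tabulated value and the 95% CI excludes 0, so the two ways of reading the result agree.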
Comparing Two Means (unpaired data)
Significance Test
n > 20
95% Confidence Interval
(mean 1- mean 2) +/- 1.96 * SE (mean1 - mean2)
Standard Error
SE (mean1 - mean2) = root[ s1^2/n1 + s2^2/n2 ]
Significance Test
n < 20
pooled standard deviation
s(p) = root[ ((n1-1)s1^2 + (n2-1)s2^2) / ((n1-1) + (n2-1)) ]
Standard Error
SE (mean 1-mean2) = s(p) * root[ 1/n1 + 1/n2 ]
95% CI
(mean1 - mean2) +/- t*SE
Z = (mean1 - mean2)/ SE (mean1-mean2)
Significance Test
T = (mean1 - mean2) / SE (mean1-mean2)
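Both versions of the unpaired comparison can be sketched from summary statistics. The function names and figures are hypothetical. The small-sample version uses the pooled standard deviation with df = (n1-1) + (n2-1), and takes the t value from a table (2.086 is the tabulated two-sided 95% value for df = 20).

```python
from math import sqrt
from statistics import NormalDist

def compare_means_large(m1, s1, n1, m2, s2, n2):
    """Large-sample Z test and 95% CI for a difference in means."""
    se = sqrt(s1**2 / n1 + s2**2 / n2)
    diff = m1 - m2
    z = diff / se
    p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return {"z": z, "p_two_sided": p, "ci95": (diff - 1.96 * se, diff + 1.96 * se)}

def compare_means_small(m1, s1, n1, m2, s2, n2, t_value):
    """Small-sample t statistic and 95% CI using the pooled standard deviation."""
    df = (n1 - 1) + (n2 - 1)
    s_pooled = sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df)
    se = s_pooled * sqrt(1 / n1 + 1 / n2)
    diff = m1 - m2
    return {"t": diff / se, "df": df, "s_pooled": s_pooled,
            "ci95": (diff - t_value * se, diff + t_value * se)}

big = compare_means_large(75, 10, 50, 70, 12, 60)
little = compare_means_small(20, 4, 10, 16, 5, 12, t_value=2.086)
print(big)
print(little)
```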