Basic Stats

QuaNTitative variables

numerical

graph

grouped frequency table

cumulative relative frequency table

histogram

symmetric distribution

mean

standard deviation

non-symmetric distribution

median (Q50)

lower quartile (Q25)

upper quartile (Q75)

cumulative relative frequency plot

box plot

positive (right skew)

geometric mean

mode

P Value

P > 0.1: no evidence against H0 (H0 might be correct)

P < 0.05: null hypothesis rejected due to evidence against it

0.05 < P < 0.1: weak / suggestive evidence against H0

0.01 < P < 0.05: quite strong evidence against H0

0.001 < P < 0.01: strong evidence against H0

P<0.001: very strong evidence against H0

Statistical table gives the one-sided P-value (multiply by 2 for the two-sided P-value)
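The verbal scale above can be written as a small lookup. This is only a sketch: the function names are hypothetical, the weak-evidence band is read as 0.05 to 0.1, and how exact boundary values (e.g. P = 0.05) are assigned is an assumption, since the notes use strict inequalities.

```python
def evidence_against_h0(p_value: float) -> str:
    """Map a P-value to the verbal scale used in these notes."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("P-value must lie between 0 and 1")
    if p_value < 0.001:
        return "very strong evidence against H0"
    if p_value < 0.01:
        return "strong evidence against H0"
    if p_value < 0.05:
        return "quite strong evidence against H0"
    if p_value < 0.1:
        return "weak / suggestive evidence against H0"
    # Assumption: everything from 0.1 upwards counts as "no evidence".
    return "no evidence against H0"


def two_sided_from_one_sided(one_sided_p: float) -> float:
    """Double a one-sided table P-value; cap at 1 because P cannot exceed 1."""
    return min(1.0, 2.0 * one_sided_p)


print(evidence_against_h0(two_sided_from_one_sided(0.02)))
```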

Calculation

Z test (for n>20)

T test (for n<20)

if the 95% CI includes 0 (difference in proportions) or 1 (risk ratio), then P > 0.05

QuaLItative variables

Proportions

Comparison of Two Proportions

Standard error

SE= root[ p-bar(100-p-bar)/n1 + p-bar (100- p-bar)/n2 ]

95% CI for difference

= (p1 - p2) +/- 1.96 * root[ p1(100-p1)/n1 + p2(100-p2)/n2 ]

sample n>20

two-sided P-value = 2 * one-sided P

with Z value: look up the one-sided P in the table

Z = (p1 - p2) / SE(p1 - p2)
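A minimal sketch of the two-proportion comparison for large samples, with percentages on the 0-100 scale as in the notes. The function name is hypothetical, p-bar is taken to be the size-weighted overall percentage, and the normal tail area from Python's standard library stands in for the printed table.

```python
import math
from statistics import NormalDist


def compare_two_proportions(p1, n1, p2, n2):
    """Compare two percentages (0-100 scale) from independent samples."""
    # p-bar: overall percentage of cases across both samples (assumed weighted).
    p_bar = (p1 * n1 + p2 * n2) / (n1 + n2)

    # Standard error for the test uses the pooled p-bar.
    se_test = math.sqrt(p_bar * (100 - p_bar) / n1 + p_bar * (100 - p_bar) / n2)
    if se_test == 0:
        raise ValueError("p-bar is 0 or 100, so the Z statistic is undefined")
    z = (p1 - p2) / se_test

    # One-sided tail area from the standard normal, doubled for two-sided.
    one_sided_p = 1 - NormalDist().cdf(abs(z))
    two_sided_p = 2 * one_sided_p

    # Standard error for the CI uses each sample's own percentage.
    se_ci = math.sqrt(p1 * (100 - p1) / n1 + p2 * (100 - p2) / n2)
    diff = p1 - p2
    ci_95 = (diff - 1.96 * se_ci, diff + 1.96 * se_ci)
    return z, two_sided_p, ci_95


z, p, ci = compare_two_proportions(60, 100, 40, 100)
print(round(z, 2), round(p, 4), [round(x, 1) for x in ci])
```

Note that the test and the interval use different standard errors: the test pools the two samples because H0 assumes a common proportion, while the interval does not.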

Confidence Interval

Standard error = root[ p*(100-p)/n ]
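As a quick illustration of the single-proportion standard error, the sketch below computes it together with the large-sample interval p +/- 1.96 * SE. The function name is hypothetical and p is assumed to be a percentage on the 0-100 scale.

```python
import math


def proportion_ci_95(p, n):
    """Return (SE, lower, upper) for a percentage p observed in n subjects."""
    if n <= 0 or not 0 <= p <= 100:
        raise ValueError("need n > 0 and p between 0 and 100")
    se = math.sqrt(p * (100 - p) / n)
    # Large-sample (n > 20) interval based on the normal multiplier 1.96.
    return se, p - 1.96 * se, p + 1.96 * se


se, lower, upper = proportion_ci_95(50, 100)
print(se, round(lower, 1), round(upper, 1))
```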

Graph

bar chart

frequency table

ordered categorical

categorical

binary

p-bar = overall percentage of cases

Sampling Variability of a Mean

Standard Error

95% CI

n > 20

n < 20

n >20

95% CI: p +/- 1.96* SE

n < 20

95% CI= p +/- t*SE

95% CI: mean +/- 1.96*SE

95% CI = mean +/- t* SE

df = n-1

look for t in table

SE = s / root[ n ]
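The standard error and 95% CI of a mean can be sketched as follows. The function name and the `t_crit` parameter are hypothetical: for n < 20 the caller passes the t value read from the table at df = n-1, otherwise 1.96 is used.

```python
import math
import statistics


def mean_ci_95(values, t_crit=None):
    """Return (mean, SE, (lower, upper)) for a list of measurements."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = statistics.mean(values)
    s = statistics.stdev(values)      # sample standard deviation (divides by n-1)
    se = s / math.sqrt(n)             # SE = s / root[ n ]
    # Assumption: 1.96 for large samples, table t value for small ones.
    multiplier = 1.96 if t_crit is None else t_crit
    return mean, se, (mean - multiplier * se, mean + multiplier * se)


# Small sample (n = 8, df = 7): 2.365 is the two-sided 5% t value at df = 7.
mean, se, ci = mean_ci_95([2, 4, 4, 4, 5, 5, 7, 9], t_crit=2.365)
print(mean, round(se, 3), [round(x, 2) for x in ci])
```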

Significance Test

Z = (mean - "population mean")/ SE (mean)

n < 20

df = n-1

look up the table entry at the crossing of T and df

P-value lies between two tabulated values (e.g. 0.001 and 0.002)
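A sketch of the one-sample significance test from summary statistics. The function name is hypothetical. The returned P-value uses the normal distribution, so it only applies to the large-sample (n > 20) case; for small samples the statistic and df are returned so that P can be read from a t table.

```python
import math
from statistics import NormalDist


def one_sample_test(sample_mean, s, n, population_mean):
    """Return (statistic, df, two-sided normal P) for H0: mean = population_mean."""
    if n < 2 or s <= 0:
        raise ValueError("need n >= 2 and s > 0")
    se = s / math.sqrt(n)
    stat = (sample_mean - population_mean) / se
    # Normal approximation: valid as a P-value only for large n.
    two_sided_p = 2 * (1 - NormalDist().cdf(abs(stat)))
    return stat, n - 1, two_sided_p


stat, df, p = one_sample_test(sample_mean=52, s=10, n=100, population_mean=50)
print(round(stat, 2), df, round(p, 4))
```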

Analysis of matched pairs

difference d= p1 - p2

calculate mean (d-bar) of all d

SE (d-bar) = s / root[ n ]

look for t at df= n-1

95% CI = d-bar +/- t*SE

T-test: T = (d-bar - 0) / SE(d-bar)

look for P in table at df and T
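The matched-pairs steps can be sketched as below. The function name and the example data are made up for illustration, and `t_crit` is assumed to be the t value the reader has looked up at df = n-1.

```python
import math
import statistics


def paired_analysis(first, second, t_crit):
    """Return (d_bar, SE, T, df, 95% CI) for matched pairs."""
    if len(first) != len(second):
        raise ValueError("each observation needs a matched partner")
    d = [a - b for a, b in zip(first, second)]   # difference within each pair
    n = len(d)
    if n < 2:
        raise ValueError("need at least two pairs")
    d_bar = statistics.mean(d)
    se = statistics.stdev(d) / math.sqrt(n)      # SE(d-bar) = s / root[ n ]
    if se == 0:
        raise ValueError("all differences are identical, so T is undefined")
    t_stat = (d_bar - 0) / se                    # H0: mean difference is 0
    ci_95 = (d_bar - t_crit * se, d_bar + t_crit * se)
    return d_bar, se, t_stat, n - 1, ci_95


# Illustrative data only; 2.776 is the two-sided 5% t value at df = 4.
result = paired_analysis([10, 12, 9, 11, 13], [8, 11, 9, 9, 10], t_crit=2.776)
print(result)
```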

Comparing Two Means (unpaired data)

Significance Test

n > 20

95% Confidence Interval

(mean 1- mean 2) +/- 1.96 * SE (mean1 - mean2)

Standard Error

SE (mean1 - mean2) = root[ s1^2/n1 + s2^2/n2 ]
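For two large unpaired samples, the standard error, 95% CI and Z statistic can be computed from summary statistics as in this sketch. The function name is hypothetical and the normal tail area from the standard library replaces the printed table.

```python
import math
from statistics import NormalDist


def compare_means_large(mean1, s1, n1, mean2, s2, n2):
    """Return (SE, Z, two-sided P, 95% CI) for mean1 - mean2, both n > 20."""
    se = math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    if se == 0:
        raise ValueError("both standard deviations are 0, so Z is undefined")
    diff = mean1 - mean2
    z = diff / se
    two_sided_p = 2 * (1 - NormalDist().cdf(abs(z)))
    ci_95 = (diff - 1.96 * se, diff + 1.96 * se)
    return se, z, two_sided_p, ci_95


se, z, p, ci = compare_means_large(10, 3, 36, 8, 4, 64)
print(round(se, 3), round(z, 2), round(p, 4), [round(x, 2) for x in ci])
```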

Significance Test

n < 20

pooled standard deviation

s(p) = root[ ((n1-1)s1^2 + (n2-1)s2^2) / ((n1-1) + (n2-1)) ]

Standard Error

SE (mean 1-mean2) = s(p) * root[ 1/n1 + 1/n2 ]

95% CI

(mean1 - mean2) +/- t*SE

Z = (mean1 - mean2)/ SE (mean1-mean2)

Significance Test

T = (mean1 - mean2) / SE (mean1-mean2)
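The small-sample comparison with a pooled standard deviation can be sketched as follows. The function name is hypothetical, df = (n1-1) + (n2-1) is assumed from the pooled denominator, and `t_crit` is the table value the reader supplies at that df.

```python
import math


def compare_means_small(mean1, s1, n1, mean2, s2, n2, t_crit):
    """Return (pooled SD, SE, T, df, 95% CI) for mean1 - mean2."""
    if n1 < 2 or n2 < 2:
        raise ValueError("each group needs at least two observations")
    df = (n1 - 1) + (n2 - 1)
    # Pooled standard deviation: df-weighted average of the two variances.
    s_p = math.sqrt(((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df)
    se = s_p * math.sqrt(1 / n1 + 1 / n2)
    if se == 0:
        raise ValueError("pooled SD is 0, so T is undefined")
    diff = mean1 - mean2
    t_stat = diff / se
    ci_95 = (diff - t_crit * se, diff + t_crit * se)
    return s_p, se, t_stat, df, ci_95


# 2.101 is the two-sided 5% t value at df = 18.
s_p, se, t_stat, df, ci = compare_means_small(12, 2, 10, 9, 4, 10, t_crit=2.101)
print(round(s_p, 3), round(se, 3), round(t_stat, 2), df, [round(x, 2) for x in ci])
```

The P-value is then read from the t table at the returned df and T, as in the other small-sample tests.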