Choosing a Statistical Test

Outcome

Continuous

Independent observation

2 groups/ time points

2-sample t-test

Mechanics

Difference in means: t-distribution (Z-distribution for larger samples)

SE of the difference in means (pooled)

More precise estimate of SE

T-distribution has more df

Requires homogeneity of variances

SE of the difference in means (unpooled)
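The pooled vs. unpooled choice above can be sketched with scipy's `ttest_ind`, whose `equal_var` flag switches between the two SEs (the data below are simulated, not from the source):

```python
# Sketch: 2-sample t-test with pooled vs. unpooled (Welch) SE.
# Group values are simulated for illustration only.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
a = rng.normal(10.0, 2.0, size=30)   # group 1
b = rng.normal(11.0, 4.0, size=40)   # group 2, larger variance

# Pooled SE: assumes homogeneity of variances, gains df.
t_pooled, p_pooled = stats.ttest_ind(a, b, equal_var=True)

# Unpooled (Welch) SE: drops the equal-variance assumption.
t_welch, p_welch = stats.ttest_ind(a, b, equal_var=False)

print(p_pooled, p_welch)
```

When the group variances really do differ, the Welch version is the safer default.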

> 2 groups/ time points

F test

Is the difference in means between groups larger than the background noise (= variability within groups)?

F = (Variability btw groups/ Variability within groups)

Null hypothesis: between-group and within-group variability are equal (F statistic = 1), i.e. all group means are equal

Global test

Total sum of square (TSS)

Sum of Squares within (SSW) or Sum of Squares Error (SSE)

Sum of Squares Between (SSB) or Sum of Squares Regression (SSR)

TSS = SSW + SSB
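The decomposition TSS = SSW + SSB can be checked numerically; the three groups below are made-up toy data:

```python
# Sketch: one-way ANOVA sum-of-squares decomposition, TSS = SSW + SSB.
import numpy as np
from scipy import stats

groups = [np.array([4.0, 5.0, 6.0]),
          np.array([7.0, 8.0, 9.0]),
          np.array([5.0, 6.0, 10.0])]   # fabricated values

all_y = np.concatenate(groups)
grand_mean = all_y.mean()

tss = ((all_y - grand_mean) ** 2).sum()                            # total
ssw = sum(((g - g.mean()) ** 2).sum() for g in groups)             # within (error)
ssb = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)   # between

# F = (SSB / df_between) / (SSW / df_within); scipy's ANOVA gives the same F.
f_stat, p_val = stats.f_oneway(*groups)
df_b, df_w = len(groups) - 1, len(all_y) - len(groups)
print(tss, ssw + ssb, f_stat)
```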

Coefficient of Determination

The amount of variation in the outcome variable (dependent variable) that is explained by the predictor (independent variable).

Correction for multiple comparisons post-hoc

Bonferroni correction

Holm/Hochberg

Tukey (adjusts p)

Scheffé (adjusts p)

Pairwise t-tests?

Type I error (5% each test)

1 - (0.95)^3 ≈ 14%
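The ~14% figure is just 1 - 0.95³ for three independent tests; a quick check, plus the Bonferroni fix mentioned above:

```python
# Sketch: familywise type I error for 3 tests at alpha = 0.05,
# and the Bonferroni correction (test each comparison at alpha / m).
alpha, m = 0.05, 3

familywise = 1 - (1 - alpha) ** m
print(round(familywise, 3))        # ~0.143, i.e. ~14%

alpha_bonf = alpha / m
familywise_bonf = 1 - (1 - alpha_bonf) ** m
print(round(familywise_bonf, 3))   # back near 0.05
```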

Continuous variable

Linear correlation (Pearson): the two variables are treated equally (no predictor/outcome distinction)

Covariance

cov(x,y)=0 : X and Y uncorrelated (independence implies cov = 0, but not the converse)

cov(x,y)>0 : X and Y positively correlated

cov(x,y)<0 : X and Y inversely correlated

Var(x) = Cov(x,x)

Correlation coefficient

Strength of linear relationship

0: no linear correlation (uncorrelated, not necessarily independent)

-1 : perfect inverse correlation

+1: perfect positive correlation

Unitless

PEARSON's Correlation Coefficient

R2 (R-squared)

Proportion of variability explained by the predictors

Measure of model fit

Distribution

Normal for larger n

T-distribution for smaller n (n<100)
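The covariance/correlation relationships above (Var(x) = Cov(x,x), r as the unitless, scale-free version of covariance) can be sketched in numpy with fabricated x/y values:

```python
# Sketch: covariance vs. Pearson correlation on made-up data.
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.0, 9.9])   # roughly y = 2x

cov_xy = np.cov(x, y)[0, 1]    # sample covariance (carries units of x*y)
var_x = np.cov(x, x)[0, 1]     # Var(x) = Cov(x, x)

# Pearson r: covariance standardized by both SDs, hence unitless, in [-1, 1].
r = np.corrcoef(x, y)[0, 1]
print(var_x, round(r, 3))
```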

Linear regression: Outcome ~ Predictor

Simple linear regression

Y = a + bX

Intercept

Slope (beta coefficient)

Distribution of beta coefficient

T distribution

Multiple linear regression

Each regression coefficient is the amount of change in the outcome variable that would be expected per one-unit change of the predictor, if all other variables in the model were held constant
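A minimal least-squares sketch of that interpretation, using plain numpy rather than a dedicated stats package; the coefficients (1.0, 2.0, -0.5) and data are fabricated:

```python
# Sketch: multiple linear regression by least squares. Each fitted beta is
# the expected change in y per one-unit change in that predictor,
# holding the other predictor constant.
import numpy as np

rng = np.random.default_rng(1)
n = 200
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 2.0 * x1 - 0.5 * x2 + rng.normal(scale=0.1, size=n)

X = np.column_stack([np.ones(n), x1, x2])      # intercept + 2 predictors
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

print(np.round(beta, 2))   # close to [1.0, 2.0, -0.5]
```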

Functions

Control for confounders

Rule of thumb: a confounder changes the adjusted beta by more than 10%

Do not judge confounders by their effect on p-values

Improve predictions

Test of interactions btw predictors (Effect modification)

Categorical predictors?

Binary: Treat as numbers (0 and 1)

Categorical : Dummy coding!
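Dummy coding can be sketched with pandas' `get_dummies`; the smoking variable below is hypothetical:

```python
# Sketch: dummy coding a categorical predictor (hypothetical data).
import pandas as pd

df = pd.DataFrame({"smoking": ["never", "former", "current", "never"]})

# k categories -> k-1 dummy columns; the dropped level is the reference group.
dummies = pd.get_dummies(df["smoking"], prefix="smoking", drop_first=True)
print(dummies)
```

Here "current" (first alphabetically) becomes the reference category against which the other coefficients are interpreted.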

Residual analysis

Residual = Observed - Predicted

Residual analysis for normality

Residual analysis for homogeneity of variances

Residual plots

WATCH OUT

Overfitting

Missing data

Variable transformation

NOT normally distributed (n < 100)

NON-homogeneous variances

Predictor and Outcome do not have a linear relationship

Correlated observations

2 groups/ time points

Paired t-test
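A toy comparison of the paired vs. unpaired t-test on made-up before/after measurements, showing why pairing matters for correlated data:

```python
# Sketch: paired vs. unpaired t-test on fabricated before/after data.
# The paired test works on within-subject differences.
import numpy as np
from scipy import stats

before = np.array([140.0, 152.0, 138.0, 160.0, 145.0, 150.0])
after = before - np.array([5.0, 7.0, 4.0, 6.0, 5.0, 8.0])   # consistent drop

t_paired, p_paired = stats.ttest_rel(before, after)
t_unpaired, p_unpaired = stats.ttest_ind(before, after)

# Pairing removes between-subject variability, so it is far more
# powerful here than (wrongly) treating the samples as independent.
print(p_paired, p_unpaired)
```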

> 2 time points

Repeated-measures ANOVA

Question

Are there significant differences across time periods? (Time factor)

Are there significant differences between groups (=your categorical predictor)? (Group factor)

Are there significant differences between groups in their changes over time? (Group × Time factor)

Serial paired t-tests inflate the type I error

Linear Models Assumptions

Normally distributed outcome

Homogeneity of variances

Models are fairly robust to violations of this assumption

NOT required for 2 sample t-test if using unpooled variance

Violated?

Wilcoxon rank-sum test ~ Mann-Whitney U test ("t-test" )

Wilcoxon signed-rank test ("paired t-test")

Kruskal-Wallis test ("ANOVA")

Spearman rank correlation coefficient ("Pearson's correlation coefficient")
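The rank-based alternatives above map directly onto scipy functions; the toy data below (one group skewed by an outlier) are fabricated:

```python
# Sketch: rank-based alternatives when normality fails.
import numpy as np
from scipy import stats

a = np.array([1.2, 3.4, 2.2, 100.0, 2.8])   # skewed by an outlier
b = np.array([5.1, 6.3, 4.9, 5.7, 6.0])

u, p_mw = stats.mannwhitneyu(a, b)       # Wilcoxon rank-sum / Mann-Whitney U
w, p_w = stats.wilcoxon(a - b)           # Wilcoxon signed-rank (paired)
h, p_kw = stats.kruskal(a, b, a + 1)     # Kruskal-Wallis (>= 2 groups)
rho, p_sp = stats.spearmanr(a, b)        # Spearman rank correlation

print(p_mw, p_w, p_kw, rho)
```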

Binary or Categorical (proportions)

Independent

Risk difference/ Relative risk (2x2 table)

Z-distribution

Risk ratio

Odds ratio from logistic regression
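The 2x2-table measures above reduce to simple arithmetic; counts below are hypothetical (rows = exposed/unexposed, columns = event/no event):

```python
# Sketch: risk difference, relative risk, and odds ratio from a 2x2 table.
a, b = 30, 70     # exposed:   30 events, 70 non-events
c, d = 10, 90     # unexposed: 10 events, 90 non-events

risk_exp = a / (a + b)       # 0.30
risk_unexp = c / (c + d)     # 0.10

risk_diff = risk_exp - risk_unexp      # ~0.20
rel_risk = risk_exp / risk_unexp       # 3.0
odds_ratio = (a * d) / (b * c)         # cross-product ratio, ~3.86

print(risk_diff, rel_risk, round(odds_ratio, 2))
```

Note the odds ratio overstates the relative risk when the event is common, as here.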

Chi-square test (RxC table)

Independence: P(A) × P(B) = P(A&B)

Expected cell count = P(A) × P(B) × N

Chi-square distribution
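Expected counts built from the margins via P(A) × P(B) × N can be checked against scipy's chi-square test of independence; the table is hypothetical:

```python
# Sketch: chi-square test on an RxC table, with expected counts
# computed from the marginal probabilities under independence.
import numpy as np
from scipy import stats

table = np.array([[30, 70],
                  [10, 90]])   # fabricated counts
n = table.sum()

row_p = table.sum(axis=1) / n          # P(A) for each row
col_p = table.sum(axis=0) / n          # P(B) for each column
expected = np.outer(row_p, col_p) * n  # P(A) * P(B) * N per cell

chi2, p, dof, exp_scipy = stats.chi2_contingency(table, correction=False)
print(round(chi2, 2), dof)
```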

Logistic regression (multivariate regression technique)

Correlated

McNemar's chi-square test (2x2 table)

Conditional logistic regression (multivariate regression technique)

GEE modeling (multivariate regression technique)

Alternatives if sparse data

McNemar's exact test ("McNemar's chi-square test")

Fisher's exact test ("Chi-square")

Time to event (Survival analysis)

Independent

Rate ratio (2 groups)

Kaplan-Meier statistics (2 or more groups)

Non-parametric estimate of the survival function

Empirical probability of surviving

Takes censoring into account

Compare groups with the log-rank test (a type of chi-square test)
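A hand-rolled Kaplan-Meier sketch showing how censoring enters the estimate: censored subjects leave the risk set without contributing an event. Times and event flags are made up; a real analysis would use a survival library:

```python
# Sketch: Kaplan-Meier survival estimate with censoring.
# S(t) = product over event times of (1 - d_i / n_i), where n_i is the
# number at risk just before t_i and d_i the events at t_i.
times = [2, 3, 3, 5, 7, 8]
events = [1, 1, 0, 1, 0, 1]   # 0 = censored at that time

at_risk = len(times)
surv = 1.0
curve = []
# Sort by time; at tied times, process events before censorings (convention).
for t, d in sorted(zip(times, events), key=lambda te: (te[0], -te[1])):
    if d == 1:
        surv *= 1 - 1 / at_risk
    at_risk -= 1   # events and censorings both leave the risk set
    curve.append((t, round(surv, 3)))

print(curve)
```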

Cox regression (multivariate regression technique)

Correlated

Frailty model (multivariate regression technique)