Statistical tests

Experimental / Quasi-experimental - manipulation, control and random assignment

Single-factor (1 IV; 1 DV)

t-statistic (comparing 2 means)

Categorical IV, continuous DV

Assumptions

Sampling distribution: normal distribution

If violated:

Independent t-test: Mann-Whitney test or Wilcoxon rank-sum test

Repeated measures t-test: Wilcoxon signed-rank test

DV must be continuous

Independent t-test

Homogeneity of variances

Levene's p>.05

Repeated measures t-test

Paired samples correlation

p<.05

Independent t-test

Assumptions: Normality and Levene's

t-statistic and p-value: t(df) = t-value, p = ...

Descriptive stat (M = ..., SD = ...)
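
A minimal sketch of running and reporting an independent t-test in Python, assuming scipy is available; the scores and group names are hypothetical:

    from scipy import stats

    group_a = [4.2, 5.1, 3.8, 5.5, 4.9, 5.0]   # hypothetical scores, condition A
    group_b = [3.1, 3.9, 4.0, 3.5, 4.4, 3.2]   # hypothetical scores, condition B

    # Assumption checks: normality per group and homogeneity of variances
    print(stats.shapiro(group_a), stats.shapiro(group_b))
    print(stats.levene(group_a, group_b))       # p > .05 -> variances homogeneous

    # Independent t-test, reported as t(df) = t-value, p = ...
    t_stat, p_val = stats.ttest_ind(group_a, group_b)
    df = len(group_a) + len(group_b) - 2
    print(f"t({df}) = {t_stat:.2f}, p = {p_val:.3f}")

    # If normality is violated, fall back to the Mann-Whitney / Wilcoxon rank-sum test
    u_stat, u_p = stats.mannwhitneyu(group_a, group_b)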

Repeated measures t-test

Assumptions: Normality and paired samples correlation

t-stat, p-value

Descriptive stat
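
A minimal sketch of a repeated measures (paired) t-test in Python, assuming scipy; the pre/post scores are hypothetical:

    from scipy import stats

    pre  = [12.1, 10.4, 11.8, 13.0, 9.9, 12.5]    # hypothetical pre-test scores
    post = [13.5, 11.0, 12.9, 14.2, 10.8, 13.1]   # same participants at post-test

    # Paired t-test, reported as t(n - 1) = t-value, p = ...
    t_stat, p_val = stats.ttest_rel(pre, post)
    print(f"t({len(pre) - 1}) = {t_stat:.2f}, p = {p_val:.3f}")

    # If normality of the difference scores is violated,
    # fall back to the Wilcoxon signed-rank test
    w_stat, w_p = stats.wilcoxon(pre, post)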

Analysis of Variance ANOVA (more than 2 means, i.e. more than 2 levels)

Conducting multiple t-tests increases Type I error

Assumptions:

Nominal IV; continuous DV

Scores are independent from each other

Normal sampling distribution

Homogeneity of variance

Levene's p>.05

If violated: Welch's F or Brown-Forsythe F

ANOVA is robust; results are still interpretable even when certain assumptions are violated

Between-subjects ANOVA

Levene's p>.05

F-statistic and p-value: F(df between, df within) = F-value, p = ...

Descriptive stat

Post hoc
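
A minimal sketch of a one-way between-subjects ANOVA with Levene's test and Tukey post hoc comparisons, assuming pandas, scipy and statsmodels; the data are hypothetical:

    import pandas as pd
    from scipy import stats
    from statsmodels.stats.multicomp import pairwise_tukeyhsd

    # Hypothetical long-format data: one score per participant, three groups
    df = pd.DataFrame({
        "group": ["low"] * 5 + ["medium"] * 5 + ["high"] * 5,
        "score": [3, 4, 4, 5, 3, 5, 6, 5, 7, 6, 8, 7, 9, 8, 8],
    })
    groups = [g["score"].values for _, g in df.groupby("group")]

    print(stats.levene(*groups))                 # p > .05 -> homogeneity of variance

    # One-way ANOVA, reported as F(df between, df within) = F-value, p = ...
    f_stat, p_val = stats.f_oneway(*groups)
    print(f"F(2, {len(df) - 3}) = {f_stat:.2f}, p = {p_val:.3f}")

    # Post hoc pairwise comparisons (Tukey's HSD), only if F is significant
    print(pairwise_tukeyhsd(df["score"], df["group"]))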

Repeated measures 1-way ANOVA

Assumptions: sphericity, Mauchly's p > .05; reported as chi-square(df) = approximate chi-square, p = ...

F-stat, p-value

Descriptive stat

Post hoc

If sphericity is not assumed:

Greenhouse-Geisser

Huynh-Feldt

Lower-bound
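
A minimal sketch of a one-way repeated measures ANOVA in Python, assuming pandas and statsmodels; the data are hypothetical. Note that statsmodels' AnovaRM assumes sphericity, so Mauchly's test and the Greenhouse-Geisser/Huynh-Feldt corrections have to come from other tools:

    import pandas as pd
    from statsmodels.stats.anova import AnovaRM

    # Hypothetical long-format data: each subject measured under three conditions
    df = pd.DataFrame({
        "subject":   [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4],
        "condition": ["a", "b", "c"] * 4,
        "score":     [5, 6, 8, 4, 6, 7, 5, 7, 9, 4, 5, 7],
    })

    res = AnovaRM(data=df, depvar="score", subject="subject",
                  within=["condition"]).fit()
    print(res)   # F value, numerator/denominator df, p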

Factorial (more than 1 IV)

Between-subjects ANOVA

Assumptions

Nominal IV, interval/ratio DV

Normality

Independence

Homogeneity of variances

Reading output

Descriptive stat

Levene's

F-stat, p-value

Post hoc

If violated: assuming sample sizes are equal and large, ANOVA is robust to the violation

Pairwise comparisons

Multiple comparisons

Only for IVs with more than 2 levels and a significant F-statistic
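
A minimal sketch of a factorial (2 x 2) between-subjects ANOVA, assuming pandas and statsmodels; the factor names and data are hypothetical:

    import pandas as pd
    import statsmodels.api as sm
    from statsmodels.formula.api import ols

    # Hypothetical 2 x 2 design: two nominal IVs, one continuous DV
    df = pd.DataFrame({
        "drug":    ["placebo", "placebo", "active", "active"] * 4,
        "therapy": ["none", "cbt"] * 8,
        "score":   [4, 6, 5, 9, 3, 7, 6, 10, 4, 6, 5, 9, 5, 7, 6, 11],
    })

    # Full factorial model: both main effects plus the drug x therapy interaction
    model = ols("score ~ C(drug) * C(therapy)", data=df).fit()
    print(sm.stats.anova_lm(model, typ=2))   # F and p for each effect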

Repeated measures ANOVA

Assumptions

Nominal IV, interval/ratio DV

Normality

Sphericity

Reading output

Descriptive stat

Mauchly's

If sphericity is not assumed

Greenhouse-Geisser

Huynh-Feldt

F-stat, p-value

Post hoc

Mixed ANOVA

Advanced design

Analysis of Covariance ANCOVA (1 or more IV, 1 or more covariates, 1 DV)

Categorical IV; continuous covariate (CV) and DV

Assumptions

Correlation between CV and DV

Pearson's r

Homogeneity of regression slopes

Correlation between CV and DV does not differ across groups

No interaction between IV and CV on DV
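
A minimal sketch of an ANCOVA in Python, assuming pandas and statsmodels; the group, pretest and posttest variables are hypothetical:

    import pandas as pd
    import statsmodels.api as sm
    from statsmodels.formula.api import ols

    # Hypothetical data: nominal IV (group), continuous covariate (pretest), DV (posttest)
    df = pd.DataFrame({
        "group":    ["control"] * 6 + ["treatment"] * 6,
        "pretest":  [10, 12, 11, 9, 13, 10, 11, 12, 10, 13, 9, 12],
        "posttest": [12, 14, 13, 11, 15, 12, 15, 17, 14, 18, 13, 16],
    })

    # Homogeneity of regression slopes: the group x pretest interaction
    # should be non-significant
    slopes = ols("posttest ~ C(group) * pretest", data=df).fit()
    print(sm.stats.anova_lm(slopes, typ=2))

    # ANCOVA proper: effect of group on the DV, adjusting for the covariate
    ancova = ols("posttest ~ C(group) + pretest", data=df).fit()
    print(sm.stats.anova_lm(ancova, typ=2))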

Multivariate Analysis of Variance MANOVA (1 or more IVs, 2 or more DVs)

Multivariate effect: effect of the IVs and their interactions on the combination of DVs

Univariate effect: effect of IVs and their interactions on each DV. Examined only when multivariate effects are significant

Running multiple ANOVAs results in accumulation of Type I error

Assumptions

Multivariate normality (normality test for each DV)

Moderate correlation between DVs (r roughly between +.3 and +.9, or negative up to about -.4)

Homogeneity of variance-covariance matrices between groups

Box's M-test p>.001

Homogeneity of between groups variance

Levene's

Sphericity of within groups variance (if more than 2 levels)

Mauchly's

If violated: Pillai's criterion

Commonly used test: Wilks' lambda
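
A minimal sketch of a one-way MANOVA in Python, assuming pandas and statsmodels; the group and the two DVs are hypothetical:

    import pandas as pd
    from statsmodels.multivariate.manova import MANOVA

    # Hypothetical data: one nominal IV (group) and two continuous DVs
    df = pd.DataFrame({
        "group":   ["a"] * 5 + ["b"] * 5 + ["c"] * 5,
        "anxiety": [4, 5, 4, 6, 5, 7, 8, 6, 7, 8, 9, 8, 9, 10, 9],
        "stress":  [3, 4, 3, 5, 4, 6, 7, 5, 6, 7, 8, 7, 8, 9, 8],
    })

    # Multivariate test of group on the combination of DVs; the output reports
    # Wilks' lambda, Pillai's trace, Hotelling-Lawley trace and Roy's root
    mv = MANOVA.from_formula("anxiety + stress ~ group", data=df)
    print(mv.mv_test())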

Correlational

Pearson's r

Linear relationship between 2 continuous variables

Continuous variables
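
A minimal sketch of Pearson's r (with its non-parametric equivalents) in Python, assuming scipy; the two variables are hypothetical:

    from scipy import stats

    hours_studied = [2, 4, 5, 7, 8, 10, 12]      # hypothetical continuous variables
    exam_score    = [52, 58, 60, 65, 70, 74, 80]

    r, p = stats.pearsonr(hours_studied, exam_score)
    print(f"r = {r:.2f}, p = {p:.3f}")

    # Non-parametric equivalents if the assumptions are not met
    rho, p_rho = stats.spearmanr(hours_studied, exam_score)
    tau, p_tau = stats.kendalltau(hours_studied, exam_score)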

Simple linear regression

Assumptions

Relationship between predictor and criterion is linear

Outcome variable scores are normally distributed

Reading output

F-value and p-value: F(df regression, df residual) = F-value, p = ...

Beta, t-value and p-value: beta = standardized coefficient, t(df) = t-value, p = ...

Regression equation: read from the unstandardized coefficients column
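
A minimal sketch of a simple linear regression in Python, assuming pandas and statsmodels; the predictor and criterion are hypothetical:

    import pandas as pd
    from statsmodels.formula.api import ols

    df = pd.DataFrame({
        "hours": [2, 4, 5, 7, 8, 10, 12],         # hypothetical predictor
        "score": [52, 58, 60, 65, 70, 74, 80],    # hypothetical criterion
    })

    model = ols("score ~ hours", data=df).fit()
    print(model.summary())
    # Overall fit: model.fvalue, model.f_pvalue -> F(df regression, df residual), p
    # Regression equation: model.params gives the unstandardized coefficients
    # Predictor test: model.tvalues["hours"], model.pvalues["hours"]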

Multiple regression

Types

Forced entry

Hierarchical

Stepwise

Reading output

Correlation matrix: check correlations between predictors (r > .8 suggests multicollinearity)

Model summary

R square

R square change and F change

ANOVA overall fit of model

F and p values

Coefficients

Unstandardized coefficient (b)

Standardized coefficient (Beta)

t and p values

Assumptions

Linearity

Normally distributed residuals

No multicollinearity

Pearson's r <.8

Variance Inflation Factor (VIF) <10

Tolerance >.2

Homoscedasticity

Independence of errors

Durbin-Watson value should be close to 2
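
A minimal sketch of a multiple regression with the multicollinearity and independence-of-errors checks above, assuming pandas and statsmodels; the predictors and criterion are hypothetical:

    import pandas as pd
    from statsmodels.formula.api import ols
    from statsmodels.stats.outliers_influence import variance_inflation_factor
    from statsmodels.stats.stattools import durbin_watson

    # Hypothetical data: two predictors and one criterion
    df = pd.DataFrame({
        "anxiety": [3, 5, 4, 6, 7, 8, 6, 9, 10, 7],
        "sleep":   [8, 7, 7, 6, 6, 5, 6, 4, 4, 5],
        "exam":    [70, 65, 68, 60, 58, 52, 61, 48, 45, 55],
    })

    model = ols("exam ~ anxiety + sleep", data=df).fit()
    print(model.summary())   # R square, F and p, unstandardized b, t and p per predictor

    # Multicollinearity: VIF < 10 (tolerance = 1 / VIF > .2)
    X = df[["anxiety", "sleep"]].assign(const=1)
    for i, name in enumerate(["anxiety", "sleep"]):
        print(name, variance_inflation_factor(X.values, i))

    # Independence of errors: Durbin-Watson should be close to 2
    print(durbin_watson(model.resid))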

Basics

Scales of measurement

Nominal - categorical

Ordinal - rank order

Ratio - true zero

Interval - no true zero

Test statistic = (Variance explained by model) / (Variance not explained by model) = Effect/Error

Power = ability of a test to detect an effect; the effect should be larger than the error. Conventional criterion for high power = .8

Errors

Type I error: rejecting the null hypothesis when it is true

Type II error: failing to reject the null when it is false

Probability of falsely rejecting the null: a (alpha)

Conventional alpha level = .05 or .01

Probability of correctly accepting the null: 1 - a

Probability of falsely accepting the null: beta

Probability of correctly rejecting the null: Power(1 - beta)

Depends on

Alpha

Sample size

Effect size (d)

(Mean of experimental group - Mean of control group) / Standard deviation

0.2 - 0.5 = small; 0.5 - 0.8 = medium; > 0.8 = large
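
A minimal sketch of computing Cohen's d and the sample size needed for .8 power, assuming numpy and statsmodels; the group scores are hypothetical:

    import numpy as np
    from statsmodels.stats.power import TTestIndPower

    experimental = np.array([14.2, 15.1, 13.8, 16.0, 15.5])   # hypothetical scores
    control      = np.array([12.0, 12.8, 11.5, 13.2, 12.4])

    # Cohen's d: mean difference divided by the (pooled) standard deviation
    pooled_sd = np.sqrt((experimental.var(ddof=1) + control.var(ddof=1)) / 2)
    d = (experimental.mean() - control.mean()) / pooled_sd
    print(f"d = {d:.2f}")

    # Participants per group needed to detect a medium effect (d = 0.5)
    # at alpha = .05 with power = .8
    n_per_group = TTestIndPower().solve_power(effect_size=0.5, alpha=0.05, power=0.8)
    print(round(n_per_group))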

Normality: p-value > .05 for normality to be assumed

Skewness

Negatively skewed

Normal (no skew)

Positively skewed

Kurtosis

Leptokurtic

Mesokurtic (normal)

Platykurtic

Acceptable skewness/kurtosis values: between -2 and +2

Lenient: -3 to +3; conservative: -1 to +1

Kolmogorov-Smirnov (KS): sample size >2000

Shapiro-Wilk: sample size <2000
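
A minimal sketch of these normality checks in Python, assuming numpy and scipy; the sample is simulated purely for illustration:

    import numpy as np
    from scipy import stats

    scores = np.random.default_rng(1).normal(loc=50, scale=10, size=100)  # hypothetical sample

    # Shapiro-Wilk (smaller samples); p > .05 -> normality assumed
    print(stats.shapiro(scores))

    # Kolmogorov-Smirnov against a normal distribution (larger samples)
    print(stats.kstest(scores, "norm", args=(scores.mean(), scores.std(ddof=1))))

    # Skewness and kurtosis (roughly between -2 and +2 is acceptable)
    print(stats.skew(scores), stats.kurtosis(scores))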

Categorical variables must be dummy coded or effect coded

Non-parametric equivalents: Spearman's rho and Kendall's tau