Statistical tests
Experimental / Quasi-experimental - manipulation, control and random assignment
Single-factor (1 IV; 1 DV)
t-test (comparing 2 means)
Categorical IV, continuous DV
Assumptions
Sampling distribution: normal distribution
If violated:
Independent t-test: Mann-Whitney U test (Wilcoxon rank-sum test)
Repeated measures t-test: Wilcoxon signed-rank test
DV must be continuous
Independent
Homogeneity of variances
Levene's p>.05
Repeated measures
Paired samples correlation
p<.05
Independent t-test
Assumptions: Normality and Levene's
t-statistic, p-value: t(df) = t-statistic, p = ...
Descriptive stat (M=..., SD=...)
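A minimal sketch of this workflow in Python (scipy), assuming two hypothetical score arrays group_a and group_b with placeholder data:

```python
# Independent t-test: Levene's check, t-test, and non-parametric fallback.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
group_a = rng.normal(loc=5.0, scale=1.0, size=30)   # placeholder data
group_b = rng.normal(loc=5.5, scale=1.0, size=30)

# Levene's test: p > .05 supports homogeneity of variances
lev_stat, lev_p = stats.levene(group_a, group_b)

if lev_p > .05:
    t_stat, p_val = stats.ttest_ind(group_a, group_b)                   # equal variances assumed
else:
    t_stat, p_val = stats.ttest_ind(group_a, group_b, equal_var=False)  # Welch's t

# Non-parametric fallback if normality is violated: Mann-Whitney U / Wilcoxon rank-sum
u_stat, u_p = stats.mannwhitneyu(group_a, group_b, alternative="two-sided")

# Report as t(df) = ..., p = ..., with M and SD per group
print(f"t = {t_stat:.2f}, p = {p_val:.3f}; "
      f"M_a = {group_a.mean():.2f}, SD_a = {group_a.std(ddof=1):.2f}")
```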
Repeated measures t-test
Assumptions: Normality and paired samples correlation
t-stat, p-value
Descriptive stat
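A minimal sketch of the repeated measures (paired) case, assuming hypothetical pre/post arrays measured on the same participants:

```python
# Paired t-test: paired samples correlation, t-test, and non-parametric fallback.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
pre = rng.normal(10, 2, 25)            # placeholder data
post = pre + rng.normal(0.5, 1, 25)

r, r_p = stats.pearsonr(pre, post)          # paired samples correlation (want p < .05)
t_stat, p_val = stats.ttest_rel(pre, post)  # report t(df) = ..., p = ...

# Non-parametric fallback if normality is violated: Wilcoxon signed-rank test
w_stat, w_p = stats.wilcoxon(pre, post)

print(f"r = {r:.2f} (p = {r_p:.3f}); t = {t_stat:.2f}, p = {p_val:.3f}")
```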
Analysis of Variance ANOVA (comparing more than 2 means, i.e. more than 2 levels)
Conducting multiple t-tests increases Type I error
Assumptions:
Nominal IV; continuous DV
Scores are independent from each other
Normal sampling distribution
Homogeneity of variance
Levene's p>.05
If violated: Welch's F or Brown-Forsythe F
ANOVA is robust: results are still interpretable even when certain assumptions are violated
Between-subject ANOVA
Levene's p>.05
F-statistic, p-value: F(df between, df within) = F-statistic, p = ...
Descriptive stat
Post hoc
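A minimal sketch of a one-way between-subjects ANOVA with a Tukey post hoc, assuming three hypothetical groups g1, g2, g3 with placeholder data:

```python
# One-way ANOVA: Levene's check, omnibus F, and Tukey HSD post hoc comparisons.
import numpy as np
from scipy import stats
from statsmodels.stats.multicomp import pairwise_tukeyhsd

rng = np.random.default_rng(2)
g1, g2, g3 = rng.normal(5, 1, 30), rng.normal(5.5, 1, 30), rng.normal(6, 1, 30)

lev_stat, lev_p = stats.levene(g1, g2, g3)   # homogeneity of variances: want p > .05
f_stat, p_val = stats.f_oneway(g1, g2, g3)   # report F(df between, df within) = ..., p = ...
# If Levene's is violated, a Welch-type F (e.g. pingouin's welch_anova) is one option.

# Post hoc pairwise comparisons (only if F is significant and the IV has > 2 levels)
scores = np.concatenate([g1, g2, g3])
groups = ["g1"] * 30 + ["g2"] * 30 + ["g3"] * 30
print(pairwise_tukeyhsd(scores, groups, alpha=0.05))
```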
Repeated measures 1-way ANOVA
Assumptions: sphericity (Mauchly's test p>.05), reported as chi-square(df) = approximate chi-square, p = ...
F-stat, p-value
Descriptive stat
Post hoc
If sphericity is not assumed (corrections):
Greenhouse-Geisser
Huynh-Feldt
Lower-bound
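A minimal sketch of a one-way repeated measures ANOVA, assuming a hypothetical long-format DataFrame with columns subject, condition and score; AnovaRM reports the uncorrected F, while Mauchly's test and the Greenhouse-Geisser / Huynh-Feldt corrections are reported by packages such as pingouin:

```python
# One-way repeated measures ANOVA on long-format data.
import numpy as np
import pandas as pd
from statsmodels.stats.anova import AnovaRM

rng = np.random.default_rng(3)
df = pd.DataFrame({
    "subject": np.repeat(np.arange(20), 3),
    "condition": np.tile(["c1", "c2", "c3"], 20),
    "score": rng.normal(10, 2, 60),          # placeholder data
})

res = AnovaRM(data=df, depvar="score", subject="subject", within=["condition"]).fit()
print(res.anova_table)   # report F(df1, df2) = ..., p = ...
```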
Factorial (more than 1 IV)
Between-subject ANOVA
Assumptions
Nominal IV, interval/ratio DV
Normality
Independence
Homogeneity of variances
Reading output
Descriptive stat
Levene's
F-stat, p-value
Post hoc
If violated: with equal and large sample sizes, ANOVA is robust to the violation
Pairwise comparisons
Multiple comparisons
Only when the IV has more than 2 levels and the F-statistic is significant
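A minimal sketch of a 2 x 2 between-subjects factorial ANOVA using a formula, assuming hypothetical factors a and b and outcome y:

```python
# Factorial ANOVA: main effects and interaction via OLS + anova_lm.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

rng = np.random.default_rng(4)
df = pd.DataFrame({
    "a": np.repeat(["a1", "a2"], 40),
    "b": np.tile(np.repeat(["b1", "b2"], 20), 2),
    "y": rng.normal(5, 1, 80),               # placeholder data
})

model = smf.ols("y ~ C(a) * C(b)", data=df).fit()   # main effects + interaction
print(anova_lm(model, typ=2))                        # F and p for each effect
```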
Repeated measures ANOVA
Assumptions
Nominal IV, interval/ratio DV
Normality
Sphericity
Reading output
Descriptive stat
Mauchly's
Not assumed
Greenhouse-Geisser
Huynh-Feldt
F-stat, p-value
Post hoc
Mixed ANOVA
Advanced design
Analysis of Covariance ANCOVA (1 or more IV, 1 or more covariates, 1 DV)
Categorical IV, continuous CV and DV
Assumptions
Correlation between CV and DV
Pearson's r
Homogeneity of regression slopes
Correlation between CV and DV is not different across groups
No interaction between IV and CV on DV
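A minimal sketch of an ANCOVA run as a linear model (group effect adjusted for a covariate), assuming hypothetical columns group, cv and y:

```python
# ANCOVA: check homogeneity of regression slopes, then fit the adjusted model.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

rng = np.random.default_rng(5)
df = pd.DataFrame({
    "group": np.repeat(["control", "treatment"], 40),
    "cv": rng.normal(50, 10, 80),
})
df["y"] = 0.3 * df["cv"] + rng.normal(0, 5, 80)      # placeholder data

# Homogeneity of regression slopes: the group x covariate interaction should be non-significant
slopes = smf.ols("y ~ C(group) * cv", data=df).fit()
print(anova_lm(slopes, typ=2))

# ANCOVA proper: IV plus covariate, no interaction term
ancova = smf.ols("y ~ C(group) + cv", data=df).fit()
print(anova_lm(ancova, typ=2))
```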
Multivariate Analysis of Variance MANOVA (1 or more IV, 2 or more DV)
Multivariate effect: effect of IVs and their interactions on a combination of DVs
Univariate effect: effect of IVs and their interactions on each DV. Examined only when multivariate effects are significant
Running multiple ANOVAs results in accumulation of Type I error
Assumptions
Multivariate normality (normality test for each DV)
Moderate correlation between DVs (r-value: +.3 to +.9, or up to -.4)
Homogeneity of variance-covariance matrices between groups
Box's M-test p>.001
Homogeneity of between groups variance
Levene's
Sphericity of within groups variance (if more than 2 levels)
Mauchly's
If violated: Pillai's criterion
Commonly used test: Wilks' lambda
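A minimal sketch of a one-way MANOVA, assuming hypothetical DVs dv1 and dv2 and a grouping variable group; mv_test() reports Wilks' lambda and Pillai's trace, among others:

```python
# One-way MANOVA: multivariate test of group differences on two DVs.
import numpy as np
import pandas as pd
from statsmodels.multivariate.manova import MANOVA

rng = np.random.default_rng(6)
df = pd.DataFrame({
    "group": np.repeat(["g1", "g2", "g3"], 30),
    "dv1": rng.normal(5, 1, 90),             # placeholder data
    "dv2": rng.normal(10, 2, 90),
})

manova = MANOVA.from_formula("dv1 + dv2 ~ group", data=df)
print(manova.mv_test())   # multivariate effect; follow up with univariate ANOVAs if significant
```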
Correlational
Pearson's r
Linear relationship between 2 continuous variables
Continuous variables
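A minimal sketch of the correlation coefficients for two hypothetical continuous variables x and y, including the non-parametric equivalents:

```python
# Pearson's r plus rank-based alternatives (Spearman's rho, Kendall's tau).
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
x = rng.normal(0, 1, 50)
y = 0.5 * x + rng.normal(0, 1, 50)           # placeholder data

r, p = stats.pearsonr(x, y)                  # linear relationship between 2 continuous variables
rho, p_rho = stats.spearmanr(x, y)           # non-parametric equivalent
tau, p_tau = stats.kendalltau(x, y)
print(f"r = {r:.2f} (p = {p:.3f}), rho = {rho:.2f}, tau = {tau:.2f}")
```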
Simple linear regression
Assumptions
Relationship between predictor and criterion is linear
Outcome variable scores are normally distributed
Reading output
F-value and p-value: F(df regression, df residual) = F-statistic, p = ...
Beta, t-value and p-value: beta = beta coefficient, t(df) = t-value, p = ...
Regression equation: read from the unstandardized coefficients column
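A minimal sketch of a simple linear regression, assuming a hypothetical predictor x and criterion y; the summary contains F(df regression, df residual), the coefficients, and their t and p values:

```python
# Simple linear regression fit and regression equation.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(8)
df = pd.DataFrame({"x": rng.normal(0, 1, 100)})
df["y"] = 2.0 + 1.5 * df["x"] + rng.normal(0, 1, 100)   # placeholder data

model = smf.ols("y ~ x", data=df).fit()
print(model.summary())                                    # F, R-squared, b, t, p
print("y_hat =", model.params["Intercept"], "+", model.params["x"], "* x")  # regression equation
```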
Multiple regression
Types
Forced entry
Hierarchical
Stepwise
Reading output
Correlation matrix: correlations between predictors (flag r > .8)
Model summary
R square
R square change and F change
ANOVA: overall fit of the model
F and p values
Coefficients
Unstandardized coefficient (b)
Standardized coefficient (Beta)
t and p values
Assumptions
Linearity
Normally distributed residuals
No multicollinearity
Pearson's r <.8
Variance Inflation Factor (VIF) <10
Tolerance >.2
Homoscedasticity
Independence of errors
Durbin-Watson value should be close to 2
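A minimal sketch of a multiple regression with multicollinearity (VIF) and independence-of-errors (Durbin-Watson) checks, assuming hypothetical predictors x1, x2 and outcome y:

```python
# Multiple regression with assumption checks: VIF and Durbin-Watson.
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor
from statsmodels.stats.stattools import durbin_watson

rng = np.random.default_rng(9)
df = pd.DataFrame({"x1": rng.normal(0, 1, 100), "x2": rng.normal(0, 1, 100)})
df["y"] = 1 + 0.8 * df["x1"] - 0.4 * df["x2"] + rng.normal(0, 1, 100)   # placeholder data

X = sm.add_constant(df[["x1", "x2"]])        # forced entry of both predictors
model = sm.OLS(df["y"], X).fit()
print(model.summary())                        # R-squared, F, b, Beta (if standardized), t, p

# VIF < 10 and tolerance (1/VIF) > .2 suggest no multicollinearity
for i, name in enumerate(X.columns):
    if name != "const":
        print(name, variance_inflation_factor(X.values, i))

print("Durbin-Watson:", durbin_watson(model.resid))   # should be close to 2
```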
Basics
Scales of measurement
Nominal - categorical
Ordinal - rank order
Interval - no true zero
Ratio - true zero
Test statistic = (Variance explained by model) / (Variance not explained by model) = Effect/Error
Power = ability of a test to detect an effect; effect should be greater than error. Conventional threshold for high power = .8
Errors
Type I error: rejecting the null hypothesis when it is true
Type II error: failing to reject the null when it is false
Probability of falsely rejecting the null: alpha
Conventional alpha level = .05 or .01
Probability of correctly accepting the null: 1 - alpha
Probability of falsely accepting the null: beta
Probability of correctly rejecting the null: power (1 - beta)
Depends on
Alpha
Sample size
Effect size (d)
(Mean of experimental group - Mean of control group) / Standard deviation
0.2 - 0.5 = small
0.5 - 0.8 = medium
>0.8 = large
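A minimal sketch of an effect size (Cohen's d, here using the pooled standard deviation) and a power calculation, assuming hypothetical experimental and control score arrays:

```python
# Cohen's d and required sample size for .8 power at alpha = .05.
import numpy as np
from statsmodels.stats.power import TTestIndPower

rng = np.random.default_rng(10)
exp, ctrl = rng.normal(5.5, 1, 40), rng.normal(5.0, 1, 40)   # placeholder data

# d = (M_experimental - M_control) / pooled SD
pooled_sd = np.sqrt((exp.var(ddof=1) + ctrl.var(ddof=1)) / 2)
d = (exp.mean() - ctrl.mean()) / pooled_sd

# Sample size per group needed to detect this effect with power = .8
n_per_group = TTestIndPower().solve_power(effect_size=d, alpha=0.05, power=0.8)
print(f"d = {d:.2f}, n per group for .8 power = {n_per_group:.1f}")
```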
Normality: p-value > .05 for normality to be assumed
Skewness
Negatively skewed
Normal (no skew)
Positively skewed
Kurtosis
Leptokurtic
Mesokurtic (normal)
Platykurtic
Acceptable range: between -2 and +2
Lenient: -3 to +3; conservative: -1 to +1
Kolmogorov-Smirnov (KS): sample size >2000
Shapiro-Wilk: sample size <2000
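A minimal sketch of normality screening for a hypothetical score array: skewness, kurtosis, and the Shapiro-Wilk / Kolmogorov-Smirnov tests (p > .05 supports normality):

```python
# Normality checks: skew, kurtosis, Shapiro-Wilk, and KS test.
import numpy as np
from scipy import stats

rng = np.random.default_rng(11)
scores = rng.normal(100, 15, 200)                        # placeholder data

print("skew:", stats.skew(scores))                       # ~0 for no skew
print("kurtosis:", stats.kurtosis(scores))               # excess kurtosis, ~0 for mesokurtic
print("Shapiro-Wilk:", stats.shapiro(scores))            # smaller samples
print("KS:", stats.kstest(scores, "norm",
                          args=(scores.mean(), scores.std(ddof=1))))  # larger samples
```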
Regression: categorical predictors must be dummy coded or effect coded
Non-parametric equivalents of Pearson's r: Spearman's rho and Kendall's tau