RMA
Hypothesis testing
Sampling error
- Difference between pop & sample mean
- Variability due to chance
Sampling distribution
- Degree of variability between samples expected by chance
- Frequency distribution of sampling means
- Sampling distrib. mean = population mean
Standard error
- SD of set of means / sampling distribution
- How much pop & sample means vary
- Estimate of sampling error
- Large samples -> more info -> less standard error
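The sampling-distribution and standard-error points above can be demonstrated by simulation; a minimal sketch with a hypothetical population (names and numbers are illustrative, not from the notes):

```python
import random
import statistics

random.seed(42)

# hypothetical population: IQ-like scores, mean ~100, SD ~15
population = [random.gauss(100, 15) for _ in range(10_000)]

def empirical_standard_error(pop, n, draws=2000):
    """SD of the sampling distribution: draw many samples of size n,
    take each sample's mean, then take the SD of those means."""
    means = [statistics.mean(random.choices(pop, k=n)) for _ in range(draws)]
    return statistics.stdev(means)

se_small = empirical_standard_error(population, n=10)
se_large = empirical_standard_error(population, n=100)

# larger samples -> less standard error (means cluster near the population mean)
print(se_small, se_large)
```

The empirical SE for n = 100 comes out close to the theoretical SD/√n = 15/10 = 1.5.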
Null hypothesis H0
- No difference / relationship
- From same population
Given null hypothesis is true, what is probability of results?
Check against rejection / significance level
- p < .05 - reject null hypothesis
- p >= .05 - fail to reject null hypothesis - no evidence of a difference (not proof of no difference)
- If χ²/t/F > critical value -> reject null hypothesis
- If χ²/t/F <= critical value -> fail to reject null hypothesis
Type I error - α
- False positive, say difference when none
- H0 true, but rejected; probability = α (the significance level)
- Under researcher's control
Type II error - β
- False negative, say no difference when there is
- H0 false, but don't reject
One & two-tailed tests
- Identify before collecting data
- One-tail: predict direction
- Two-tail: no direction (there will be a difference)
** Two-tail: split α between tails
Sample | descriptive statistic: characteristic of sample
- e.g. SD, mean
Test | inferential statistic: assoc. with stat. procedure
- e.g. t-statistic, z-score
Degrees of freedom
- Once values for all categories except 1 are fixed, the value for the last category is automatically determined
General
Statistics
Descriptive: done first, e.g. means, outliers
Inferential: after, to answer research question
Variables
Continuous variables -> measurement / quant data -> means, variance, SD
Discrete variables -> categorical data -> percentages, frequencies
Scales
Nominal: categories, no sequence
Ordinal: categories, sequence
Interval: categories, sequence, same size interval, NO absolute/true zero, 0 doesn't mean absence
Ratio: categories, sequence, same size interval, absolute/true zero - no negative numbers
Distributions
Kurtosis: Leptokurtic = high peak, platykurtic = flat
Central tendency
Mean: average
- Used algebraically
- Influenced by extreme scores
Variability
Average deviation: deviation from mean, average => 0
Mean absolute deviation (m.a.d.): absolute deviation, average
Variance s² | σ² = sum of squared deviations from mean / (N - 1)
- Mean squared deviation score
- Sample variance smaller than pop variance
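The variance bullets above can be checked numerically; a minimal sketch with hypothetical scores:

```python
# hypothetical scores
scores = [4, 8, 6, 5, 3, 7]

n = len(scores)
mean = sum(scores) / n

# plain deviations average to 0, so square them instead
ss = sum((x - mean) ** 2 for x in scores)   # sum of squared deviations

sample_var = ss / (n - 1)   # dividing by N - 1 corrects the underestimate
population_var = ss / n     # dividing by N gives a smaller value

print(sample_var, population_var)
```

Dividing by N - 1 is why the sample variance formula compensates for samples underestimating population variability.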
Linear Relationships
Correlation
Scatterplot
- Visual representation
- Must be linear
Variables continuous (one can be categorical)
Degree of linear association / relationship between 2 variables
Scores, not groups
Correlation coefficient
- Strength / Degree: -1 to 1, amount of scatter, 0 = random, 1 or -1 = perfect
- Direction: - or +
- Pearson's correlation coefficient r =
covariance (extent vary together) /
(SD of X × SD of Y) (extent vary separately)
- Adjusted r: because smaller samples overestimate the population correlation
Coefficient of determination r²
- Proportion of variance accounted for in 1 variable by other
- x% of variability in Y can be explained by X
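Pearson's r and r² can be computed directly from the definitions above; a minimal sketch with hypothetical paired scores:

```python
import math

# hypothetical paired scores for two continuous variables
X = [2, 4, 5, 7, 9]
Y = [1, 3, 5, 6, 8]

n = len(X)
mean_x, mean_y = sum(X) / n, sum(Y) / n

# covariance: extent X and Y vary together
cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(X, Y)) / (n - 1)

# SDs: extent each variable varies separately
sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in X) / (n - 1))
sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in Y) / (n - 1))

r = cov / (sd_x * sd_y)   # Pearson's r: between -1 and 1
r_squared = r ** 2        # proportion of variance in Y explained by X

print(r, r_squared)
```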
Significance test - t-statistic
Factors affecting correlation
- Large N -> trivial correlations may be statistically significant
- Outliers
- Range restriction
- Heterogeneous samples (between groups)
Regression
Regression line
- Line of best fit ('centre' of relationship)
- Used for prediction - predict Y based on X
- X = predictor = IV = e.g. anxiety = x-axis
- Y = criterion = predicted = DV = e.g. negative mood = y-axis
- Stronger correlation -> more reliable prediction
- Bivariate (simple) - 1 predictor variable VS multivariate
Equation
Y hat = bX + a
- b = slope = cov / var x = difference in Y associated with 1 unit difference in X = as X increases by 1, Y increases by slope = degree of association
- a = intercept = mean Y - b * mean X = score on DV when IV is 0 = where line crosses Y, X is 0
Standard error of estimate
SE = SDy × SQRT(1 - r²)
Confidence intervals
CI(Y) = Y hat +- critical t * SE of estimate
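The slope, intercept, and standard error of estimate above can be worked through with hypothetical X/Y data (predictor and criterion labels are illustrative):

```python
import math

# hypothetical data: X = predictor (e.g. anxiety), Y = criterion (e.g. negative mood)
X = [1, 2, 3, 4, 5]
Y = [2, 4, 5, 4, 7]

n = len(X)
mean_x, mean_y = sum(X) / n, sum(Y) / n

cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(X, Y)) / (n - 1)
var_x = sum((x - mean_x) ** 2 for x in X) / (n - 1)

b = cov / var_x           # slope: difference in Y per 1-unit difference in X
a = mean_y - b * mean_x   # intercept: predicted Y when X = 0

y_hat = [b * x + a for x in X]   # predictions from the regression line

# standard error of estimate, using the SE = SDy * sqrt(1 - r^2) form in the notes
sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in Y) / (n - 1))
r = cov / (math.sqrt(var_x) * sd_y)
se_est = sd_y * math.sqrt(1 - r ** 2)

# a CI around a prediction would be y_hat +- critical t * se_est,
# with critical t looked up in a t table
print(b, a, se_est)
```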
Mean Differences
t-statistic
- Ratio of mean diff / mean diff expected by chance (standard error)
Matched-Samples t-test
- Sample means of 1 group tested on 2 occasions
- 1 categorical IV (pre/post); 1 continuous DV
- e.g. does treatment reduce impulsiveness?
- t = mean diff / (SD diff / SQRT N)
- d = (pre mean - post mean) / pre SD
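A matched-samples t computed from the difference scores, as in the formula above; a minimal sketch with hypothetical pre/post scores:

```python
import math

# hypothetical impulsiveness scores for the same people, pre and post treatment
pre = [10, 12, 9, 14, 11]
post = [8, 11, 8, 12, 10]

diffs = [p - q for p, q in zip(pre, post)]   # pre - post for each person
n = len(diffs)
mean_diff = sum(diffs) / n
sd_diff = math.sqrt(sum((d - mean_diff) ** 2 for d in diffs) / (n - 1))

t = mean_diff / (sd_diff / math.sqrt(n))   # compare to critical t with n - 1 df

mean_pre = sum(pre) / n
sd_pre = math.sqrt(sum((x - mean_pre) ** 2 for x in pre) / (n - 1))
d = mean_diff / sd_pre   # effect size in pre-SD units

print(t, d)
```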
Independent Samples t-test
- Sample mean of 2 groups
- 1 categorical IV (2 levels); 1 continuous DV
- e.g. are men more aggressive than women?
- t = mean diff / SQRT(s₁² / n₁ + s₂² / n₂)
- Use pooled variance if group sizes are unequal
- CI 95% = (mean diff) ± critical t × SE of mean diff (the denominator of the t formula)
- d = mean diff / SQRT pooled variance
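The independent-samples t, pooled variance, and Cohen's d can be computed from the formulas above; a minimal sketch with hypothetical group scores:

```python
import math

# hypothetical aggression scores for two independent groups
men = [14, 16, 15, 17, 13]
women = [12, 13, 11, 14, 12]

def sample_var(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

n1, n2 = len(men), len(women)
mean_diff = sum(men) / n1 - sum(women) / n2

# t = mean diff / SE of the mean diff
se = math.sqrt(sample_var(men) / n1 + sample_var(women) / n2)
t = mean_diff / se

# pooled variance weights each group's SS by its df
pooled = ((n1 - 1) * sample_var(men) + (n2 - 1) * sample_var(women)) / (n1 + n2 - 2)
d = mean_diff / math.sqrt(pooled)   # Cohen's d

print(t, d)
```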
One-Sample t-test
- Compare single sample mean to pop mean (pop SD unknown)
- Continuous
- e.g. does mean impulsiveness differ from pop mean?
- t = mean diff / (sample SD / SQRT n)
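The one-sample t follows the same pattern; a minimal sketch with hypothetical scores and an assumed population mean of 10:

```python
import math

# hypothetical: does mean impulsiveness differ from a known population mean
# of 10, with the population SD unknown?
scores = [11, 13, 9, 12, 14, 10]
pop_mean = 10

n = len(scores)
mean = sum(scores) / n
sd = math.sqrt(sum((x - mean) ** 2 for x in scores) / (n - 1))

t = (mean - pop_mean) / (sd / math.sqrt(n))   # compare to critical t, n - 1 df
print(t)
```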
Central limit theorem
- Standard error = SD / SQRT n
- Larger sample -> less error -> lower standard error
- Changes from z to t when estimating SE using sample SD
Cohen's d
- Difference in SD units
- .20 = small, .50 = medium, .80 = large
ANOVA
- Difference between means with 3+ groups
- One-way: 1 categorical IV (3+ levels); 1 continuous DV
- Two-way factorial: 2 categorical IV; 1 continuous DV
- Use post-hoc after if significant
Is between-treatments variance (treatment + chance) > within-treatments / error variance (chance alone)?
F = 1 means no treatment effect
Conditions
- Homogeneity of variance: scores spread equally around their group means (similar variance in each group)
- Normality: scores normal distribution
- Independence of observation: scores unrelated
If violated, be cautious & use larger sample size
Sum of squares
- SS treat: n × (treatment mean - grand mean)², summed over treatments
- SS error: (individual score - group mean)², summed over all data points
- SS total: SS treat + SS error
Degrees of freedom
- df treat = k - 1
- df error = k(n - 1)
- df total = N - 1
k = no. treatments, n = in each group, N = total participants
Other
- MS = SS / df (compute 2: MS treat, MS error)
- F = MS treat / MS error (compute 1)
- MS = variance
- If F > critical F value in table, then reject null hypothesis - there is difference somewhere between groups
- Effect of treatment is F times greater than error within groups
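The SS, df, MS, F, and η² steps above can be run end to end; a minimal sketch for a hypothetical balanced one-way design (group names and scores are illustrative):

```python
# hypothetical balanced one-way design: k = 3 treatments, n = 4 per group
groups = {
    "control": [4, 5, 3, 4],
    "drug_a": [6, 7, 5, 6],
    "drug_b": [8, 7, 9, 8],
}

k = len(groups)
n = 4
N = k * n
all_scores = [x for xs in groups.values() for x in xs]
grand_mean = sum(all_scores) / N
group_means = {g: sum(xs) / n for g, xs in groups.items()}

ss_treat = n * sum((m - grand_mean) ** 2 for m in group_means.values())
ss_error = sum((x - group_means[g]) ** 2 for g, xs in groups.items() for x in xs)
ss_total = ss_treat + ss_error

df_treat, df_error = k - 1, k * (n - 1)
ms_treat, ms_error = ss_treat / df_treat, ss_error / df_error

F = ms_treat / ms_error             # compare to critical F(df_treat, df_error)
eta_squared = ss_treat / ss_total   # proportion of variability due to treatment
print(F, eta_squared)
```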
Effect size
- How much of variability due to difference between treatments
- η² = SS treat / SS total
- From 0 - 1
- η² = proportion of variation in the DV attributable to differences between treatments
Error rate per comparison (EC) vs Family-wise (FW)
- EC = alpha = .05
- FW = 1 - (1 - α)^c, c = no. of comparisons = prob. of making at least 1 Type I error
More analyses -> more chance of Type I error
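The family-wise formula above makes the inflation concrete:

```python
# family-wise error rate grows with the number of comparisons c
alpha = 0.05

for c in (1, 3, 10):
    fw = 1 - (1 - alpha) ** c   # prob. of at least 1 Type I error
    print(c, round(fw, 3))
```

With 10 comparisons at α = .05, the family-wise rate is already about .40.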
A Priori vs Post Hoc Tests
- A priori = setup before ANOVA, test specific hypothesis
- Post hoc = conducted after significant ANOVA, with 3+ groups - which groups differ
Studentized Range Statistic (q)
- Use diff between largest & smallest mean to calc q
- If q > critical q, reject null hypothesis
OR
- Calc minimum mean difference needed for significance (adjust r, the no. of means spanned, to test different pairs)
- Look up critical q in the q table with r (no. of means) and df error
Tukey's Honestly Significant Difference Test
- Use max r -> max q -> larger mean diff required -> harder to reach significance -> more conservative
- Each mean diff compared to higher r - FW constant over all comparisons
Two-way ANOVAS
- IV = factors, factors have levels
- e.g. age (3 levels), gender (2 levels) = 3x2 factorial OR two-way ANOVA
Effects
- Main 1: ignore factor 2, average, effect regardless of factor 2
- Main 2: ignore factor 1
- Interaction: effect of 1 factor on DV is not same at all levels of other factor (depends on levels of 2nd IV)
** Lines parallel -> no interaction
** Be cautious about interpreting main effects if there is a significant interaction
Sum of squares
- SS total: (every score - grand mean)², summed
- SS A: n × (no. of B levels) × (A mean - grand mean)², summed over A levels
- SS cells: n × (cell mean - grand mean)², summed over cells
- SS AB = SS cells - SS A - SS B
- SS error = SS total - SS cells
Degrees of freedom
- df A = levels - 1
- df AB = df A * df B
- df error = df total - df A - df B - df AB
- df total = N - 1
MS = SS / df (compute 4: A, B, AB, error)
F = MS effect / MS error (compute 3: A, B, AB)
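The two-way decomposition above can be checked on a small balanced design; a minimal sketch with hypothetical cells (factor and level names are illustrative):

```python
# hypothetical balanced 2x2 design: factor A x factor B, n = 3 per cell
cells = {
    ("a1", "b1"): [3, 4, 5],
    ("a1", "b2"): [5, 6, 7],
    ("a2", "b1"): [7, 8, 9],
    ("a2", "b2"): [6, 7, 8],
}

n = 3
a_levels, b_levels = ["a1", "a2"], ["b1", "b2"]
N = n * len(a_levels) * len(b_levels)

all_scores = [x for xs in cells.values() for x in xs]
grand = sum(all_scores) / N

cell_mean = {k: sum(v) / n for k, v in cells.items()}
a_mean = {a: sum(cell_mean[(a, b)] for b in b_levels) / len(b_levels) for a in a_levels}
b_mean = {b: sum(cell_mean[(a, b)] for a in a_levels) / len(a_levels) for b in b_levels}

ss_total = sum((x - grand) ** 2 for x in all_scores)
ss_a = n * len(b_levels) * sum((m - grand) ** 2 for m in a_mean.values())
ss_b = n * len(a_levels) * sum((m - grand) ** 2 for m in b_mean.values())
ss_cells = n * sum((m - grand) ** 2 for m in cell_mean.values())
ss_ab = ss_cells - ss_a - ss_b         # interaction SS
ss_error = ss_total - ss_cells

df_a, df_b = len(a_levels) - 1, len(b_levels) - 1
df_ab = df_a * df_b
df_error = (N - 1) - df_a - df_b - df_ab

ms_error = ss_error / df_error
f_a = (ss_a / df_a) / ms_error         # main effect of A
f_ab = (ss_ab / df_ab) / ms_error      # interaction

print(ss_a, ss_b, ss_ab, ss_error, f_a, f_ab)
```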