Ch 5: test power and effect size
sample size and test requirements
t-test: 30 or more participants
chi-square: at least 5 expected observations per category (cell)
regression analysis: 20 or more cases per independent variable
anova: group sizes should be (roughly) equal, so unequal variances across groups are less of a problem
a test on a larger sample is more likely to be statistically significant
small standard error = larger test statistic values = smaller p-values
more precision = more likely to reject the null because there is less spread; the rejection region starts closer to the hypothesized mean
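a minimal Python sketch of this (all numbers invented for illustration): draw repeated samples from a population whose true mean is 103 while testing H0: mu = 100, and watch how often the t-test rejects as n grows

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)
    mu0, mu_true, sigma, reps = 100, 103, 15, 2000

    for n in (20, 80, 320):
        # count how often a one-sample t-test rejects H0: mu = mu0
        # when the true population mean is mu_true
        rejected = sum(
            stats.ttest_1samp(rng.normal(mu_true, sigma, n), mu0).pvalue < 0.05
            for _ in range(reps)
        )
        print(f"n={n:4d}  rejection rate={rejected / reps:.2f}")

the effect in the population never changes here; only the sample size does, yet the rejection rate climbs toward 1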
effect size
: the size of the difference between the sample outcome and the hypothesized population value
the more the sample outcome deviates from the hypothesized value, the more likely the null is rejected
metal detector analogy: a large effect size is like a large amount of metal --> the statistical test picks it up more easily
probability of rejecting null hypothesis depends on sample size and effect size
large sample size = statistical test is more sensitive: the test will be significant even for small effect sizes
large effect size = easily picked up by the statistical test, so it reaches statistical significance more easily
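the mirror image as a sketch (again with invented numbers): hold n fixed and increase the distance between the true mean and the hypothesized one; the larger "amount of metal" is detected far more often

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(2)
    mu0, sigma, n, reps = 100, 15, 50, 2000

    for mu_true in (101, 103, 106, 112):  # increasing effect size
        # rejection rate of the one-sample t-test at a fixed n
        rejected = sum(
            stats.ttest_1samp(rng.normal(mu_true, sigma, n), mu0).pvalue < 0.05
            for _ in range(reps)
        )
        print(f"true mean={mu_true}  rejection rate={rejected / reps:.2f}")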
practical significance
the minimum effect size that makes the test result worthwhile in practice
it should fall in the rejection region, because we want to reject the null of no improvement in the population (after exposure to a campaign, for example)
statistical significance (the metal detector) is used to signal practical significance
we have the original effect (the hypothesized mean) and the new effect (the sample outcome)
the new effect should be in the rejection region
statistical significance will beep (signal it)
the null (of no improvement) is rejected; we want this because we want to see an improvement
unstandardized effect size
: the difference between sample outcome and hypothesized value
no rule of thumb for interpreting it: changing the measurement scale changes the number but not the actual effect (only the units differ)
it depends on the sample size used, but if the effect size means a great deal to us (falls in the rejection region), it is practically significant
variation in scores is important: with a lot of variation, a small difference between the sample outcome and the hypothesized population value is not relevant; with (nearly) constant scores, even a small difference is relevant
standardized effect size
takes variation into account and tells us whether the effect size is relevant (it expresses strength; direction is not relevant)
Cohen's d = (sample mean - hypothesized population value) / standard deviation in the sample
can be applied to the one-sample, paired-samples, and independent-samples t-test (see notebook for the formulas)
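a sketch of that formula in Python for the one-sample and paired cases (the scores and the hypothesized value 5.0 are invented):

    import numpy as np

    # one-sample: d = (sample mean - hypothesized value) / sample SD
    scores = np.array([5.8, 6.1, 4.9, 7.0, 5.5, 6.3, 5.9, 6.7])
    mu0 = 5.0
    d_one = (scores.mean() - mu0) / scores.std(ddof=1)

    # paired samples: the same formula applied to the difference scores
    before = np.array([4.2, 5.1, 3.8, 4.9, 5.0])
    after = np.array([5.0, 5.6, 4.1, 5.8, 5.2])
    diffs = after - before
    d_paired = diffs.mean() / diffs.std(ddof=1)

    print(f"one-sample d = {d_one:.2f}, paired d = {d_paired:.2f}")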
calculating Cohen's d in SPSS (take the values from the output table)
for the independent-samples t-test, if Levene's test is not significant, use the first row (equal variances assumed) to plug in the values
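for the independent-samples case the denominator is the pooled SD; a sketch using the kind of values you would copy from an SPSS group-statistics table (all numbers hypothetical):

    import math

    def cohens_d_independent(m1, s1, n1, m2, s2, n2):
        # pooled standard deviation of the two groups
        pooled_sd = math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
        return (m1 - m2) / pooled_sd

    # means, SDs, and group sizes read off a (hypothetical) SPSS table
    print(cohens_d_independent(m1=6.4, s1=1.2, n1=40, m2=5.8, s2=1.4, n2=38))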
association measures as effect sizes (for the regression coefficient, the regression model, and ANOVA)
if the null hypothesis expects no correlation (a correlation of 0), Pearson's product-moment correlation coefficient / Spearman's rank correlation coefficient express the effect size
in SPSS, tests on a regression coefficient (b), the regression model (R2), and ANOVA (eta2) use a hypothesized effect size of 0
use b* (beta in SPSS), R2, and eta2 as the effect sizes in SPSS
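as one illustration, eta2 is just SS-between over SS-total, so it can be checked by hand; a sketch with three invented groups:

    import numpy as np

    groups = [
        np.array([5.1, 6.0, 5.5, 5.8]),
        np.array([6.9, 7.2, 6.5, 7.0]),
        np.array([5.9, 6.1, 6.4, 6.0]),
    ]
    allscores = np.concatenate(groups)
    grand_mean = allscores.mean()

    # eta2 = between-groups sum of squares / total sum of squares
    ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
    ss_total = ((allscores - grand_mean) ** 2).sum()
    print(f"eta2 = {ss_between / ss_total:.2f}")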
use previous research to decide sample size
standardized effect sizes are used to express the effects we are interested in
go for the sample size that detects such effects more frequently
with larger significance levels, rejection regions are bigger, so we are more likely to reject a false null (higher test power)
test power
: probability of rejecting a false null
higher significance level --> larger risk of rejecting a true null, so the Type I error probability increases (while the Type II error probability decreases)
power = 1 - Type II error probability; the Type II error is the area outside the rejection region (under the true distribution), so the power is the area inside the rejection regions
type 2 error
: not rejecting null when it is false
if there is a small effect in the population, the null hypothesis is not true (the population value differs from the hypothesized one), yet we might still fail to reject it because the sample outcome does not land in the rejection region
the probability of making this error is the area in the middle (outside the rejection region) under the true distribution
the world of a researcher
two graphs: the hypothesized population (top) and the true, unknown population (bottom); researchers take the critical values from the top graph, but whether the null is rejected plays out under the bottom graph
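a sketch of exactly this two-graph logic for a two-sided one-sample z-test (hypothesized mean 100, true mean 103; all numbers invented): critical values come from the top distribution, the error areas from the bottom one

    import math
    from scipy.stats import norm

    mu0, mu_true, sigma, n, alpha = 100, 103, 15, 50, 0.05
    se = sigma / math.sqrt(n)

    # critical values from the hypothesized (top) distribution
    lo = norm.ppf(alpha / 2, loc=mu0, scale=se)
    hi = norm.ppf(1 - alpha / 2, loc=mu0, scale=se)

    # Type II error = area between the critical values under the TRUE distribution
    beta = norm.cdf(hi, loc=mu_true, scale=se) - norm.cdf(lo, loc=mu_true, scale=se)
    print(f"Type II error = {beta:.2f}, power = {1 - beta:.2f}")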
when wondering which sample size to use, think of sig level, effect size, test power
smaller test power= larger risk of not rejecting false null
keep the significance level fixed, otherwise the test power changes (smaller significance level = smaller test power)
effect size --> use previous knowledge (small effect sizes need large samples); if standardized effect sizes are available, use those
power: rule of thumb is an 80% chance of rejecting a false null hypothesis
probability of not rejecting a true null = 95% (1 - Type I error probability, i.e. 1 - the significance level)
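a sketch of the usual normal-approximation shortcut for picking n (d = 0.5 and the 80%/5% figures are just the conventions above, not fixed rules):

    import math
    from scipy.stats import norm

    d, alpha, power = 0.5, 0.05, 0.80
    z_alpha = norm.ppf(1 - alpha / 2)  # two-sided critical z
    z_power = norm.ppf(power)

    # one-sample test: smallest n that detects a standardized effect d
    # with the desired power (normal approximation to the t-test)
    n = math.ceil(((z_alpha + z_power) / d) ** 2)
    print(f"required n is about {n}")  # roughly 32 here

the exact t-based answer is slightly larger, so treating this as a lower bound is safer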
determining sample size
previous research
large sample just in case
effect size is more important than statistical significance because it relates to practical significance
test power and Type II error matter when we do not reject the null hypothesis (they determine our confidence in that result)
with a large sample size, even a small effect size will be significant
with a small sample size, only a large effect size will be significant
Cohen's d: 0.2 = weak effect; 0.5 = moderate effect; 0.8 = strong effect
association: 0 to .1 = no to weak; .1 to .3 = weak; .3 to .5 = moderate; .5 to .8 = strong; .8 to 1 = very strong; 1 = perfect
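these cut-offs are easy to wrap in a helper for quick reading of output; a hypothetical convenience function, not part of any library:

    def label_association(r):
        # interpret the absolute value of a correlation-type effect size
        r = abs(r)
        if r < 0.1:
            return "no to weak"
        if r < 0.3:
            return "weak"
        if r < 0.5:
            return "moderate"
        if r < 0.8:
            return "strong"
        if r < 1.0:
            return "very strong"
        return "perfect"

    print(label_association(0.42))  # moderate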