Module 8: criterion-related validation
the criterion-related approach to test validation examines the empirical relationship between scores on a test and scores on a criterion of interest, typically by means of a correlation coefficient
criterion-related validation strategies remind us to ask what, exactly, the test is valid for
a college intelligence exam may predict grade point average but not student morality
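as a minimal illustration (my own sketch, not from the module), the validity coefficient is just the pearson correlation between test scores and criterion scores; the data below are synthetic stand-ins for, say, admissions-test scores and first-year GPA:

```python
# a minimal sketch, assuming scipy is available: the validity coefficient is
# simply the pearson correlation between test scores and criterion scores.
# all data below are synthetic and purely illustrative.
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(0)

# hypothetical admissions-test scores and a criterion (e.g., first-year GPA)
test_scores = rng.normal(loc=500, scale=100, size=150)
gpa = 0.003 * test_scores + rng.normal(loc=1.5, scale=0.5, size=150)

r, p = pearsonr(test_scores, gpa)
print(f"validity coefficient r = {r:.2f} (p = {p:.3f})")
```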
criterion-related validation research designs
predictive validity:
studies correlate test scores collected at one time with criterion scores collected at some future date. the goal is to examine how well test scores predict future criterion scores. predictive designs typically use less restricted samples than other criterion-related designs, and in some cases even random samples
concurrent validity:
studies collect test and criterion scores at about the same time. because there is no lag between collection times, the validation can be completed much more quickly. the sample is predetermined, however, so it is usually not random.
postdictive validity:
criterion scores are collected prior to obtaining test scores. as is the case for concurrent designs, postdictive criterion-related validation studies use a predetermined sample. in this case, the sample is limited to those individuals for whom we already have criterion data.
examples of criterion-related validation
the correlation between an employment test and years of performance reviews collected after hire
these data can be used to determine whether a new hiring test needs to be developed
interpreting the validity coefficient
correlation coefficients range from -1 through 0 to +1
in practice, a validity coefficient typically falls between 0 and +1, yet it rarely exceeds 0.50
Cohen's benchmarks treat 0.1 as small, 0.3 as medium, and 0.5 as large
random selection of the sample avoids restriction of range, so the observed coefficient tends to fall closer to its true (usually larger) population value
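a tiny helper (my own, not from the module) makes cohen's benchmarks concrete:

```python
# attach Cohen's conventional effect-size labels to an observed validity
# coefficient; thresholds follow the benchmarks noted above.
def cohen_label(r: float) -> str:
    """return Cohen's rough effect-size label for a correlation."""
    r = abs(r)
    if r >= 0.5:
        return "large"
    if r >= 0.3:
        return "medium"
    if r >= 0.1:
        return "small"
    return "negligible"

for r in (0.08, 0.25, 0.35, 0.52):
    print(f"r = {r:.2f} -> {cohen_label(r)}")
```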
attenuation and inflation of observed validity coefficients
inadequate sample size:
small samples are convenient to collect, but an inadequate sample size often results in a failure to detect a real relationship between the test and the criterion in the population. the fix is to increase the sample size; a common rule of thumb is to use more than 200 individuals. if we cannot gather that many, we can turn to synthetic validity, borrowing validity evidence from existing studies that used a similar test and criterion.
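a rough power calculation (my own sketch using the fisher z approximation, not part of the module) shows why samples of roughly 200 or more are often needed to detect modest validity coefficients:

```python
# approximate sample size needed to detect a population correlation rho with a
# two-tailed test at the given alpha and power, via the Fisher z approximation.
# the function name and defaults are my own choices for illustration.
from math import atanh, ceil
from scipy.stats import norm

def n_for_correlation(rho: float, alpha: float = 0.05, power: float = 0.80) -> int:
    z_alpha = norm.ppf(1 - alpha / 2)   # critical value for the two-tailed test
    z_beta = norm.ppf(power)            # quantile corresponding to desired power
    return ceil(((z_alpha + z_beta) / atanh(rho)) ** 2 + 3)

# detecting a modest validity coefficient takes a fairly large sample
for rho in (0.1, 0.2, 0.3):
    print(f"rho = {rho:.1f} -> n = {n_for_correlation(rho)}")
```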
criterion contamination:
occurs when the criterion measure reflects influences unrelated to what it is supposed to capture. a common source is when an individual with knowledge of the test scores also assigns the criterion scores, which artificially inflates the observed validity coefficient. keeping criterion raters blind to test scores guards against this.
attenuation due to unreliability:
the criterion measure itself must be reliable. no criterion is perfectly reliable, and measurement error in the criterion (or the test) attenuates the observed correlation, so the observed validity coefficient underestimates the true relationship.
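spearman's classic correction for attenuation estimates what the validity coefficient would be if the test and criterion were measured without error; the sketch below uses placeholder reliabilities:

```python
# correction for attenuation: r_corrected = r_xy / sqrt(r_xx * r_yy).
# reliability values below are placeholders for illustration only.
def correct_for_attenuation(r_xy: float, r_xx: float, r_yy: float) -> float:
    """disattenuated correlation between test and criterion."""
    return r_xy / (r_xx * r_yy) ** 0.5

observed_r = 0.35             # observed test-criterion correlation
test_reliability = 0.90       # e.g., coefficient alpha of the test
criterion_reliability = 0.60  # e.g., inter-rater reliability of supervisor ratings

corrected = correct_for_attenuation(observed_r, test_reliability, criterion_reliability)
print(f"corrected r = {corrected:.2f}")
```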
restriction of range:
the variability of test scores in our sample may be considerably smaller than in the population of interest. reducing the variability of the test scores typically reduces the magnitude of the observed correlation, so we would erroneously conclude that the test is less valid than it actually is.
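one standard remedy is thorndike's case 2 correction for direct range restriction on the test; the sketch below uses invented numbers for an employment-selection scenario:

```python
# Thorndike's Case 2 correction: estimates the validity coefficient in the
# unrestricted applicant pool from the correlation observed in the restricted
# (e.g., hired-only) sample. all numbers are invented for illustration.
def correct_range_restriction(r: float, sd_restricted: float, sd_unrestricted: float) -> float:
    u = sd_unrestricted / sd_restricted
    return (r * u) / (1 - r**2 + (r**2) * (u**2)) ** 0.5

observed_r = 0.25       # correlation among those actually hired
sd_applicants = 100.0   # test-score SD in the full applicant pool
sd_hired = 60.0         # test-score SD among those hired

print(f"corrected r = {correct_range_restriction(observed_r, sd_hired, sd_applicants):.2f}")
```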
additional considerations
differential validity asks whether the test's validity coefficient differs when the sample is broken into subgroups (for example, demographic groups); if it does, the test predicts the criterion better for some groups than for others.
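a quick sketch (my own, with invented subgroups and data) of how differential validity is checked in practice:

```python
# compute the validity coefficient separately for each subgroup and compare.
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(1)

for name, n in {"group A": 120, "group B": 80}.items():
    test = rng.normal(size=n)
    criterion = 0.4 * test + rng.normal(size=n)  # same true relationship in both groups here
    r, _ = pearsonr(test, criterion)
    print(f"{name}: validity r = {r:.2f}")
```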
concluding comments
criterion-related validation attempts to provide evidence of the accuracy of test scores by empirically relating them to scores on a chosen criterion.
best practices
when conducting criterion-related validation, choose the most relevant criterion, not necessarily the most easily measured.
attempt criterion-related validation only if the sample is sufficiently large (perhaps more than 200) to provide a stable validity estimate
correct for statistical artifacts that may attenuate the criterion-related validity estimate