Reliability + Validity
Content validity: How well does the content (i.e., the questions) of the scale relate to what we are measuring? For example, if we are measuring the mathematical fluency of an 8th grader, does the test include 8th grade math problems?
How to establish: Have subject-matter experts review the items and judge whether they cover the intended domain (e.g., the 8th grade math curriculum)
Criterion-related validity: Is what we are measuring correlated with a gold standard? For example, if we create a test to predict a student's grade in a class, does the test correlate with the grade the student actually earns?
Concurrent validity: Little time has elapsed between taking the initial predictive test and knowing the grade outcome
How to establish: Run a Pearson correlation between predictive test scores and outcomes that are already known (e.g., past students' grades); a higher correlation = higher concurrent validity (see the code sketch below)
Predictive validity: A block of time has elapsed between taking the initial predictive test and knowing the grade outcome
How to establish: Run a Pearson correlation between the predictive test and the actual outcome once it becomes available; a higher correlation = higher predictive validity
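Both criterion-related checks reduce to the same computation. A minimal sketch using SciPy's `pearsonr`; the scores below are hypothetical:

```python
# Sketch: criterion-related validity as a Pearson correlation.
# `test_scores` and `grades` are hypothetical illustrations, not real data.
from scipy.stats import pearsonr

test_scores = [72, 85, 64, 90, 78, 55, 88, 69]  # scores on the new predictive test
grades      = [74, 88, 60, 93, 75, 58, 91, 65]  # gold-standard outcome (course grades)

r, p_value = pearsonr(test_scores, grades)
print(f"Pearson r = {r:.2f} (p = {p_value:.3f})")
# Higher r = higher criterion-related validity: concurrent if the grades are
# already known, predictive if they were collected after a block of time.
```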
Construct validity: Are the questions of the scale actually measuring what we intend them to? For example, if we step on a scale, is it actually measuring weight?
Convergent validity: If we are developing a new measure, we need to ensure that it measures the same construct as an established measure
How to establish: run a Pearson correlation between the two measures; a correlation > 0.8 indicates high convergent validity
Divergent validity: When developing measures, we need to ensure that two scales intended to capture different constructs do not in fact measure the same one
How to establish: run a Pearson correlation between the two measures; a low correlation is indicative of divergent validity (both checks are sketched below)
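Convergent and divergent validity can be checked side by side. A sketch assuming a hypothetical new anxiety scale; all names and numbers are made up:

```python
# Sketch: convergent vs. divergent validity with Pearson correlations.
# All three scales and their scores are hypothetical.
from scipy.stats import pearsonr

new_anxiety_scale  = [12, 18, 9, 22, 15, 7, 20, 11]
old_anxiety_scale  = [14, 19, 8, 24, 13, 6, 21, 10]    # established measure, same construct
math_fluency_scale = [55, 40, 62, 35, 50, 70, 38, 58]  # unrelated construct

r_conv, _ = pearsonr(new_anxiety_scale, old_anxiety_scale)
r_div, _  = pearsonr(new_anxiety_scale, math_fluency_scale)

print(f"convergent r = {r_conv:.2f}  (want high, e.g. > 0.8)")
print(f"divergent  r = {r_div:.2f}  (want low)")
```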
Reliability: When the same user takes a test multiple times, how consistent are the results? Note that reliability is required to establish validity, but a measure can be reliable without being valid
Internal consistency reliability: Multiple respondents should answer the same set of questions in a similar manner; internal consistency measures how consistent the items of a set are with one another across users
How to establish: Run Cronbach's alpha on a set of data; it should be > 0.70, though for high-stakes questionnaires the standard is > 0.90
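Cronbach's alpha can be computed directly from item and total-score variances: alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). A minimal sketch with a made-up respondents-by-items matrix:

```python
# Sketch: computing Cronbach's alpha by hand with NumPy.
# `responses` is a hypothetical (respondents x items) matrix of Likert answers.
import numpy as np

responses = np.array([
    [4, 5, 4, 4],
    [3, 3, 2, 3],
    [5, 5, 5, 4],
    [2, 1, 2, 2],
    [4, 4, 3, 4],
])

k = responses.shape[1]                          # number of items
item_vars = responses.var(axis=0, ddof=1)       # variance of each item
total_var = responses.sum(axis=1).var(ddof=1)   # variance of respondents' total scores

alpha = (k / (k - 1)) * (1 - item_vars.sum() / total_var)
print(f"Cronbach's alpha = {alpha:.2f}")        # want > 0.70 (> 0.90 for high-stakes use)
```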
Parallel forms reliability: Create two different forms that contain similar, slightly altered questions; another common method is to write 20 questions and split the test in half to create two forms of 10 questions each (the split-half approach)
How to establish: Run a correlation between the two parallel forms; they should correlate highly and show no systematic differences between them
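A sketch of the split-half variant on simulated responses. The Spearman-Brown correction at the end (a standard adjustment, not mentioned in the notes above) estimates full-test reliability from the half-test correlation:

```python
# Sketch: split-half reliability with the Spearman-Brown correction.
import numpy as np
from scipy.stats import pearsonr

# Simulate 50 respondents whose 20 item scores share a common latent trait.
rng = np.random.default_rng(0)
trait = rng.normal(0, 1, size=(50, 1))
responses = trait + rng.normal(0, 1, size=(50, 20))

half_a = responses[:, 0::2].sum(axis=1)   # odd-numbered items -> form A
half_b = responses[:, 1::2].sum(axis=1)   # even-numbered items -> form B

r, _ = pearsonr(half_a, half_b)
r_full = 2 * r / (1 + r)                  # Spearman-Brown: reliability of the full test
print(f"half-form r = {r:.2f}, corrected full-test reliability = {r_full:.2f}")
```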
Test-retest reliability: Despite fluctuations within the user, he/she should still give similar responses to the same test when retaking it. Have the same user answer the questionnaire twice, separated by a set amount of time; especially useful when running online questionnaires
How to establish: Run a Pearson correlation between the first and second administrations; a higher correlation = higher test-retest reliability
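A sketch with hypothetical scores from the same eight users across two administrations:

```python
# Sketch: test-retest reliability as the correlation between two administrations.
# Hypothetical scores from the same 8 users, taken some weeks apart.
from scipy.stats import pearsonr

time_1 = [31, 27, 40, 22, 35, 29, 38, 25]
time_2 = [30, 29, 41, 20, 33, 30, 36, 27]

r, _ = pearsonr(time_1, time_2)
print(f"test-retest r = {r:.2f}")   # higher r = more stable measure over time
```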
Intra-rater reliability: Even within the same rater, variation exists depending on factors such as mood; however, a rater's repeated ratings of the same measure should be consistent
How to establish: Have the same rater rate the same items on separate occasions and check the agreement between the two sets of ratings
Inter-rater reliability: Subjectivity is inevitable because people bring different perspectives, so some variability between raters' opinions is to be expected. However, there should be a certain degree of consistency from one person's rating to another's
How to establish: Compute an agreement statistic such as Cohen's kappa (for categorical ratings) or an intraclass correlation (for continuous ratings); see the sketch below
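A minimal sketch using scikit-learn's `cohen_kappa_score` for two raters scoring the same ten items on a 1–3 categorical scale; the ratings are hypothetical:

```python
# Sketch: inter-rater reliability via Cohen's kappa (two raters, categorical ratings).
# Hypothetical ratings of the same 10 essays on a 1-3 scale.
from sklearn.metrics import cohen_kappa_score

rater_1 = [1, 2, 3, 2, 1, 3, 2, 2, 1, 3]
rater_2 = [1, 2, 3, 3, 1, 3, 2, 1, 1, 3]

kappa = cohen_kappa_score(rater_1, rater_2)
print(f"Cohen's kappa = {kappa:.2f}")   # 1 = perfect agreement, 0 = chance-level
```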