Validity and Reliability
Reliability (Consistency):
Reliability is the degree of consistency, stability, and dependability of a measuring instrument. A reliable tool produces similar results under consistent conditions.
Reliability Coefficient (r): A numeric index ranging from .00 (no reliability) to 1.00 (perfect reliability). Coefficients above .70 are generally acceptable, but .85 to .95 are preferred.
Aspects of Reliability:
A. Stability (Test-Retest Reliability): Assesses the consistency of a measure over time. The same instrument is administered to the same subjects twice (e.g., with a 2-week interval), and the scores are correlated.
Best for: Stable, enduring traits (e.g., personality, abilities).
Limitations: Traits can change naturally, responses can be influenced by memory, and participants may drop out.
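The test-retest procedure above reduces to a correlation between the two administrations. A minimal sketch in Python, using illustrative scores for five hypothetical subjects measured two weeks apart:

```python
# Hypothetical test-retest data: the same scale administered twice,
# two weeks apart, to the same five subjects (scores are illustrative).
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two paired score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

time1 = [12, 18, 25, 9, 30]   # first administration
time2 = [14, 17, 27, 10, 29]  # second administration, 2 weeks later

r = pearson_r(time1, time2)
print(f"Test-retest reliability: r = {r:.2f}")
```

A coefficient this close to 1.00 would indicate high stability; values below .70 would suggest the trait changed, the measure is unstable, or memory effects distorted the second administration.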
B. Internal Consistency (Homogeneity): Assesses the degree to which all items within a single instrument measure the same concept.
Assessed by: Cronbach's Alpha (α). A high alpha (≥ .70) indicates that the items are consistent and interrelated.
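Cronbach's alpha can be computed directly from the item-by-respondent score matrix using the standard formula α = (k/(k−1))·(1 − Σ item variances / total-score variance). A minimal sketch with illustrative data (a hypothetical 4-item scale answered by five respondents):

```python
# Minimal Cronbach's alpha sketch; the 4-item, 5-respondent data are illustrative.
from statistics import pvariance

def cronbach_alpha(item_scores):
    """item_scores: one inner list per item, aligned by respondent."""
    k = len(item_scores)
    item_vars = sum(pvariance(item) for item in item_scores)
    totals = [sum(resp) for resp in zip(*item_scores)]  # total score per respondent
    return (k / (k - 1)) * (1 - item_vars / pvariance(totals))

items = [
    [4, 3, 5, 2, 4],
    [4, 2, 5, 3, 4],
    [3, 3, 4, 2, 5],
    [5, 3, 5, 2, 4],
]
alpha = cronbach_alpha(items)
print(f"Cronbach's alpha = {alpha:.2f}")
```

Here the items rank respondents very similarly, so alpha comfortably exceeds the .70 threshold; unrelated items would drive the sum of item variances toward the total variance and alpha toward zero.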
C. Equivalence:
Inter-Rater Reliability: The degree of agreement between two or more different raters observing or scoring the same event at the same time. High agreement (r ≥ .70) indicates consistency between observers.
Intra-Rater Reliability: The degree of consistency of one rater across different time points (i.e., the same rater scoring the same thing twice and getting the same result).
Parallel-Forms Reliability: The degree of agreement between two different versions of the same instrument (Form A and Form B) administered to the same people at the same time.
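For categorical ratings, a simple index of inter-rater reliability is percent agreement: the proportion of observations on which the raters gave the same score. A minimal sketch with illustrative pass/fail ratings from two hypothetical raters (note that chance-corrected indices such as Cohen's kappa are often preferred, and for continuous scores a correlation coefficient is used, as in the r ≥ .70 criterion above):

```python
# Hypothetical example: two raters scoring the same 10 observations
# on a pass/fail checklist (ratings are illustrative).
rater_a = ["pass", "fail", "pass", "pass", "fail", "pass", "pass", "fail", "pass", "pass"]
rater_b = ["pass", "fail", "pass", "fail", "fail", "pass", "pass", "fail", "pass", "pass"]

agreements = sum(a == b for a, b in zip(rater_a, rater_b))
percent_agreement = agreements / len(rater_a)
print(f"Inter-rater agreement: {percent_agreement:.0%}")
```

The same calculation applies to intra-rater reliability by pairing one rater's scores from two occasions instead of two raters' scores from one occasion.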
Validity (Accuracy):
Validity is the degree to which an instrument measures what it is intended to measure. A test can be reliable (consistent) without being valid (accurate), but it cannot be valid unless it is reliable.
Types of Validity:
A. Content Validity: Assesses whether the instrument's content covers all relevant parts of the construct it is supposed to measure.
Evaluated by: Comparing items to the literature and, most importantly, by a panel of expert judges (a jury). Experts rate each item on its relevance (e.g., on a 4-point scale from "not relevant" to "very relevant"). 80% expert agreement on item relevance indicates good content validity.
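The expert-panel procedure is often summarized as an item-level content validity index (I-CVI): the proportion of judges rating an item 3 or 4 on the 4-point relevance scale, checked against the 80% agreement criterion. A minimal sketch with illustrative ratings from five hypothetical experts:

```python
# Hypothetical I-CVI sketch: five experts rate each item's relevance
# on a 4-point scale; ratings of 3 or 4 count as "relevant".
expert_ratings = {
    "item_1": [4, 4, 3, 4, 3],
    "item_2": [4, 2, 3, 4, 4],
    "item_3": [2, 3, 2, 4, 3],
}

i_cvi = {item: sum(r >= 3 for r in ratings) / len(ratings)
         for item, ratings in expert_ratings.items()}

for item, score in i_cvi.items():
    verdict = "acceptable" if score >= 0.80 else "needs revision"
    print(f"{item}: I-CVI = {score:.2f} ({verdict})")
```

Items falling below the 80% criterion (item_3 here) would be revised or dropped before the instrument is used.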
B. Criterion-Related Validity: Assesses how well the instrument's scores correlate with an external criterion (a gold standard or another measure).
Concurrent Validity: The instrument's scores are correlated with a criterion measure administered at the same time. It checks if the instrument can distinguish between groups that are known to be different now (e.g., a new depression scale vs. a clinician's current diagnosis).
Predictive Validity (Not explicitly covered but implied): The instrument's ability to predict future outcomes or performance on a criterion (e.g., an entrance exam predicting future academic success).
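Predictive validity is likewise assessed by correlating instrument scores with the criterion, except that the criterion is measured later. A minimal sketch for the entrance-exam example, with illustrative scores for six hypothetical students:

```python
# Hypothetical predictive-validity check: entrance-exam scores (predictor)
# correlated with first-year GPA recorded a year later (criterion).
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two paired score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / (sqrt(sum((a - mx) ** 2 for a in x)) *
                  sqrt(sum((b - my) ** 2 for b in y)))

exam_scores = [55, 62, 70, 81, 90, 48]  # entrance exam (predictor)
gpa = [2.4, 2.8, 3.0, 3.4, 3.8, 2.1]    # first-year GPA (criterion)

r = pearson_r(exam_scores, gpa)
print(f"Predictive validity: r = {r:.2f}")
```

The only structural difference from concurrent validity is timing: here the criterion (GPA) did not exist when the predictor (exam) was administered.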