Chapter 7: Scale Reliability and Validity

Reliability: the degree to which the measure of a construct is consistent or dependable.

Validity: refers to the extent to which a measure adequately represents the underlying construct that it is supposed to measure.

Theory of Measurement

An Integrated Approach to Measurement Validation

Inter-rater reliability: a measure of consistency between two or more independent raters (observers) of the same construct.
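Inter-rater consistency is often quantified with Cohen's kappa, a chance-corrected agreement statistic. A minimal pure-Python sketch, using hypothetical category ratings from two raters:

```python
# Minimal sketch of inter-rater reliability via Cohen's kappa,
# a chance-corrected agreement statistic. Ratings are hypothetical.

from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Agreement between two raters, corrected for chance agreement."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: probability both raters independently pick
    # the same category, given each rater's marginal frequencies.
    expected = sum(counts_a[c] * counts_b.get(c, 0) for c in counts_a) / n ** 2
    return (observed - expected) / (1 - expected)

ratings_a = ["yes", "yes", "no", "yes", "no", "no", "yes", "no", "yes", "yes"]
ratings_b = ["yes", "no", "no", "yes", "no", "yes", "yes", "no", "yes", "yes"]
print(round(cohens_kappa(ratings_a, ratings_b), 3))
```

Kappa near 1 indicates strong agreement beyond chance; values near 0 indicate agreement no better than chance.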

Test-retest reliability: a measure of consistency between two measurements (tests) of the same construct administered to the same sample at two different points in time.
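In practice, test-retest reliability is usually the correlation between the two administrations. A minimal sketch, with hypothetical scale scores for eight respondents:

```python
# Minimal sketch of test-retest reliability: Pearson correlation between
# two administrations of the same scale. Scores are hypothetical.

import math

def pearson_r(x, y):
    """Pearson product-moment correlation between two score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

time1 = [12, 15, 9, 20, 14, 11, 18, 16]   # first administration
time2 = [13, 14, 10, 19, 15, 10, 17, 17]  # same respondents, later
print(round(pearson_r(time1, time2), 3))
```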

Split-half reliability: a measure of consistency between two halves of a construct measure.
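Split-half reliability is typically computed by correlating the two half-scores and then applying the Spearman-Brown step-up correction to estimate full-length reliability. A minimal sketch with hypothetical item scores (one row per respondent, six items per row):

```python
# Minimal sketch of split-half reliability: odd-even split of a six-item
# scale, correlation of the two half-scores, then the Spearman-Brown
# step-up correction. Item scores are hypothetical.

import math

def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

items = [
    [4, 5, 4, 3, 4, 5],
    [2, 1, 2, 2, 1, 2],
    [5, 5, 4, 5, 5, 4],
    [3, 2, 3, 3, 2, 2],
    [4, 4, 5, 4, 5, 4],
]
half1 = [sum(row[0::2]) for row in items]  # odd-numbered items
half2 = [sum(row[1::2]) for row in items]  # even-numbered items
r = pearson_r(half1, half2)
reliability = 2 * r / (1 + r)  # Spearman-Brown: full-length estimate
print(round(reliability, 3))
```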

Internal consistency reliability: a measure of consistency between different items of the same construct.

Translational validity: how well the ideas of a theoretical construct are translated into or represented in an operational measure. Consists of two subtypes: face and content validity.

Criterion-related validity: an empirical assessment of how well a given measure relates to one or more external criteria, based on empirical observations. Four subtypes: convergent, discriminant, concurrent, and predictive validity.

Face validity: refers to whether an indicator seems to be a reasonable measure of its underlying construct "on its face".

Content validity: an assessment of how well a set of scale items matches with the relevant content domain of the construct that it is trying to measure.

Convergent validity: refers to the closeness with which a measure relates to the construct that it is purported to measure.

Discriminant validity: refers to the degree to which a measure does not measure other constructs that it is not supposed to measure.

Predictive validity: the degree to which a measure successfully predicts a future outcome that it is theoretically expected to predict.

Concurrent validity: examines how well one measure relates to another concrete criterion that is presumed to occur simultaneously.

Random error: the error that can be attributed to a set of unknown uncontrollable external factors that randomly influence some observations but not others.

Systematic error: an error that is introduced by factors that systematically affect all observations of a construct across an entire sample.

The first step is conceptualizing the constructs of interest: defining each construct and identifying its constituent domains and/or dimensions.

The second step is selecting items or indicators for each construct based on our conceptualization of these constructs.

In the third step, a panel of expert judges can be employed to examine each indicator and conduct a Q-sort analysis.

After the judges' review, the validation procedure moves to the empirical realm. The data collected are tabulated and subjected to correlational analysis or exploratory factor analysis using a software program.
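The correlational-analysis step can be sketched as an inter-item correlation matrix; items tapping the same construct should correlate highly with each other (convergent pattern) and weakly with items of other constructs (discriminant pattern). The 1-5 item scores below are hypothetical; real analyses would use a statistics package.

```python
# Minimal sketch of the correlational-analysis step: an inter-item
# correlation matrix for four hypothetical items (one row per respondent).

import math

def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

rows = [
    [4, 5, 2, 1],
    [2, 1, 4, 5],
    [5, 5, 2, 2],
    [3, 2, 3, 4],
    [4, 4, 1, 2],
]
items = list(zip(*rows))  # one tuple of scores per item
matrix = [[round(pearson_r(a, b), 2) for b in items] for a in items]
for line in matrix:
    print(line)
```

Here items 1-2 and items 3-4 each form a correlated pair, while the pairs correlate negatively with each other.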

The remaining scales are evaluated for reliability using a measure of internal consistency such as Cronbach's alpha.
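Cronbach's alpha can be computed directly from the item variances and the variance of the total score: alpha = k/(k-1) * (1 - sum of item variances / variance of totals). A minimal sketch with hypothetical scores for a four-item scale:

```python
# Minimal sketch of Cronbach's alpha for a k-item scale
# (hypothetical item scores, one row per respondent).

def variance(xs):
    """Sample variance (n - 1 denominator), as most stats packages use."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

def cronbach_alpha(rows):
    """alpha = k/(k-1) * (1 - sum of item variances / variance of totals)."""
    k = len(rows[0])
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    total_var = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

scores = [
    [4, 5, 4, 4],
    [2, 1, 2, 1],
    [5, 5, 4, 5],
    [3, 2, 3, 3],
    [4, 4, 5, 4],
]
print(round(cronbach_alpha(scores), 3))
```

Values above roughly 0.7 are conventionally taken as adequate internal consistency.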

Finally, the predictive ability of each construct is evaluated within a theoretically specified nomological network of constructs using regression analysis or structural equation modeling.
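The final step can be illustrated with the simplest case: an ordinary-least-squares regression of a hypothetical outcome on a construct score. Real validations would typically use multiple regression or SEM software.

```python
# Minimal sketch of the final step: ordinary-least-squares regression of
# a hypothetical outcome on a construct score.

def ols(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum(
        (xi - mx) ** 2 for xi in x
    )
    return my - slope * mx, slope

construct = [10, 12, 15, 11, 18, 14]  # hypothetical construct scores
outcome = [20, 25, 31, 22, 37, 29]    # hypothetical outcome, measured later
intercept, slope = ols(construct, outcome)
print(round(slope, 3), round(intercept, 3))
```

A significantly nonzero slope in the theoretically expected direction supports the construct's predictive validity.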