Reliability and Validity
Reliability
- refers to the consistency/repeatability of the results of a measurement
- are the results of the measurement consistent?
- the reliability of a measure is relative and depends on the situation
types of reliability
observations: internal (split-half) reliability
- internal: the degree to which all of the specific items or observations in a multiple-item measure behave the same way.
- measuring intelligence: all the items should equally measure intelligence.
- high internal reliability shows the entire measure is consistently measuring what it should be.
- using more items reduces error > it is very important that all of these items consistently measure the construct we are interested in.
- split-half: test this by dividing the measure into two halves.
- need to compare like with like
- don't just split in half down the middle.
- look at the correlation between individuals' scores on the two halves
- a high correlation between scores indicates good internal reliability
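The split-half check above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the response data are made up, the odd/even split is one common way to avoid splitting "down the middle", and the helper names `pearson` and `split_half_scores` are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def split_half_scores(responses):
    """Total each person's odd-numbered and even-numbered items separately."""
    odd = [sum(person[0::2]) for person in responses]   # items 1, 3, 5, ...
    even = [sum(person[1::2]) for person in responses]  # items 2, 4, 6, ...
    return odd, even

# Hypothetical data: 5 people x 6 items, each item scored 1-5.
responses = [
    [5, 4, 5, 5, 4, 5],
    [2, 1, 2, 2, 1, 2],
    [3, 3, 4, 3, 3, 3],
    [4, 4, 4, 5, 4, 4],
    [1, 2, 1, 1, 2, 1],
]

odd_half, even_half = split_half_scores(responses)
r = pearson(odd_half, even_half)
print(f"split-half correlation: {r:.2f}")
```

A correlation close to 1 between the two halves would indicate good internal reliability; a value near 0 would suggest the items are not measuring the same thing.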
occasions: test-retest reliability
- the ability of a measure to produce the same results at different points in time or occasions.
- important to show that the test or measure consistently measures the construct we are interested in, provided no other variables have changed.
- a large difference between the scores the same person obtains at test 1 and test 2 would suggest low test-retest reliability.
- practice effects undermine test-retest reliability.
- randomly assign people to different orders.
observers: inter-observer reliability
- degree to which observers agree upon an observation or judgement.
- measure this reliability with correlations
- positive relationship between the scores of each observer.
- to have high inter-observer reliability, we want both observers to agree > very important for scientific research.
- the higher the correlation between observer judgements, the more reliable the results are.
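As a rough illustration of measuring inter-observer reliability with a correlation, the sketch below compares made-up ratings from three hypothetical observers who scored the same six participants. The ratings, the 0-10 scale, and the `pearson` helper are assumptions for the example only.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Hypothetical ratings of the same 6 participants on a 0-10 scale.
observer_a = [2, 7, 5, 9, 1, 4]
observer_b = [3, 6, 5, 8, 2, 4]   # judges much like observer A
observer_c = [5, 4, 6, 3, 5, 6]   # judges quite differently

r_ab = pearson(observer_a, observer_b)
r_ac = pearson(observer_a, observer_c)
print(f"A vs B: {r_ab:.2f}")
print(f"A vs C: {r_ac:.2f}")
```

Here observers A and B would count as agreeing (strong positive correlation), while A and C would not.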
replication: the reliability of results across experiments
- can we replicate the results when all the variables and conditions remain the same?
- need clear and detailed method sections.
- critical to the scientific method:
- must have evidence from multiple experiments
- the more times a result is replicated, the more likely it is the findings are accurate and not due to error
Validity
- refers to how well a measure or construct actually measures or represents what it claims to.
- validity relates to accuracy
- very important in psychology where we often measure abstract constructs.
- requirements for causality: J. S. Mill proposed three requirements for causality
- covariation: is there evidence for a relationship between variables?
- temporal sequence: one variable occurs before the other
- eliminate confounds: explain or rule out other possible explanations
types of validity
construct validity:
- how well do your operationalised variables (independent and/or dependent) represent the abstract variables of interest?
- experimentally: are you measuring what you think/say you're measuring
- construct validity = the strength of operational definitions
measurement validity:
- measurement validity refers to how well a measure or an operationalised variable corresponds to what it is supposed to measure/represent.
- shows that the measurement procedure actually measures what it claims to measure.
- we use a number of methods to assess the validity of a measurement > critical for scientific research.
ecological validity:
- ecological validity is how well you can generalise the results outside of a laboratory environment to the real world
- laboratory experiments vs real-life settings
- laboratory settings are very controlled and different from real-life settings.
- people are aware they are under experimental conditions and behave differently.
content validity:
- degree to which the items on a multi-faceted measure accurately sample the target domain
- how well does a measure/task represent all the facets of a construct.
- many constructs are multi-faceted and sometimes multiple measures must be used to achieve adequate content validity.
- content validity vs internal reliability: content validity demonstrates that all of the items on a multi-faceted measure accurately measure the construct.
- extraversion scales need all the questions to accurately measure extraversion, not another construct.
- internal reliability relates to whether all the items on a multiple-item measure consistently measure the construct.
- all questions about a construct should produce consistent scores in the same individual
population validity:
- population validity refers to how well your experimental findings can be replicated in a wider population
- aim to have the findings generalised from our experimental sample to the wider population.
- it is difficult to obtain high external validity in controlled experimental settings.
external validity:
- how well a causal relationship holds across different people, settings, treatment, variables, measurements, and time.
- how well we can generalise the causal relationship outside the specifics of the experiment.
- is your sample representative?
- is the context representative?
- can results from animal labs generalise to humans?
- high external validity occurs when we are able to generalise our experimental findings.
criterion validity:
- criterion validity measures how well scores on one measure predict the outcome for another measure.
- the extent a procedure or measure can be used to infer or predict some criterion (another outcome)
concurrent validity (now):
- compares the scores on two current measures to determine whether they are consistent
- how well do scores on one measure predict the outcome for another existing measure.
- if the two tests produce similar and consistent results, you can say that they have concurrent validity.
predictive validity (future):
- scientific theories make predictions about how scores on a measure for a certain construct affect future behaviour
- if the measurement of a construct accurately predicts future behaviour then the measurement would have high predictive validity.
internal validity:
- internal validity is focused on whether the research design and evidence allows us to demonstrate a clear cause and effect relationship
- high internal validity occurs when the research design can establish a clear and unambiguous explanation for the relationship between two variables.
- relative statement rather than an absolute measure:
- can we rule out other explanations?
- are the variables accurately manipulating or measuring the construct?
- does the research design support the causal claim?
- can't directly measure internal validity with a correlation
- crucial to make claims about the causal relationship between variables
Measurement and Error
- all measurements can be broken down into at least two components: the true value of what is being measured, and the measurement error
- measured score = true score + error
- X = t + e
- ideally, the measured score would equal the true score (e = 0).
reducing error
- error is reduced with:
- many participants > averages out individual differences error
- many measurements > averages out measurement error
- many occasions
- averages of scores are more reliable than individual scores
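The claim that averages are more reliable than individual scores can be checked with a small simulation of X = t + e. This is a sketch under assumptions that are not in the notes: the error is random, normally distributed with mean 0, and the true score (100) and error spread (10) are arbitrary values. Averaging only helps with random error of this kind; a constant bias would not average out.

```python
import random
from statistics import mean, pstdev

rng = random.Random(0)  # fixed seed so the sketch is repeatable

TRUE_SCORE = 100.0  # t: hypothetical true value
ERROR_SD = 10.0     # spread of the random error e (assumed normal, mean 0)

def measure():
    """One measured score: X = t + e."""
    return TRUE_SCORE + rng.gauss(0.0, ERROR_SD)

def spread_of_estimates(n_measurements, n_repeats=2000):
    """How much the average of n measurements varies from repeat to repeat."""
    estimates = [
        mean(measure() for _ in range(n_measurements))
        for _ in range(n_repeats)
    ]
    return pstdev(estimates)

single = spread_of_estimates(1)     # individual scores
averaged = spread_of_estimates(25)  # averages of 25 scores
print(f"spread of single scores:    {single:.2f}")
print(f"spread of 25-score average: {averaged:.2f}")
```

Under these assumptions the spread of the averages should be roughly one fifth of the spread of single scores, since random error shrinks with the square root of the number of measurements.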
Reliability vs Validity
- reliability: the consistency and repeatability of the results of a measurement
- validity: the degree to which a measure or experiment actually measures what it claims to measure.
- ideally, we want scientific measures to be both reliable and valid.
- reliability demonstrates that the measure consistently performs the same way.
- validity demonstrates that the measure actually measures what it claims to measure
- a valid measure is also reliable > the measure accurately measures what it claims to and does this consistently