Task 4: Norms and Test Standardisation
Gregory 3A
Norms - comparative frame of reference for interpreting scores, many varieties (e.g. percentile ranks, age/grade equivalents, standard scores)
- Indicate an examinee's standing on the test relative to the performance of other persons
- Norm-referenced tests: tests that are interpreted using norms (in comparison to criterion-referenced tests)
Test standardisation: determine the distribution of raw scores in the norm group
ESSENTIAL STATISTICAL CONCEPTS
Raw scores: most basic level of information from a test, initial outcome of testing; alone they are totally meaningless and need some reference point to compare the results against
Frequency distributions: specifying a small number of usually equal-sized class intervals and then tallying how many scores fall within each interval
Histogram
Polygon
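The tallying step behind a frequency distribution can be sketched as follows. This is a minimal illustration, assuming integer raw scores that are all at or above a chosen starting value; the function name `frequency_distribution` and the scores are invented for the example.

```python
from collections import Counter

def frequency_distribution(scores, width, start):
    """Tally integer scores into equal-sized class intervals."""
    # Index of the class interval each score falls into.
    counts = Counter((s - start) // width for s in scores)
    table = []
    for i in range(max(counts) + 1):
        low = start + i * width
        # Each row: lower limit, upper limit, number of scores in the interval.
        table.append((low, low + width - 1, counts.get(i, 0)))
    return table

# Hypothetical raw scores, invented for illustration only.
raw_scores = [12, 15, 17, 21, 22, 22, 25, 28, 31, 34, 35, 38]
table = frequency_distribution(raw_scores, width=10, start=10)

for low, high, count in table:
    # A text "histogram": one mark per score in the interval.
    print(f"{low}-{high}: {'#' * count}")
```

A histogram draws each count as a bar over its interval; a frequency polygon plots the same counts as points at the interval midpoints and joins them with lines.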
measures of central tendency
median
mode
mean
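As a small illustration of the three measures, the sketch below uses Python's standard `statistics` module on a set of scores invented for the example.

```python
import statistics

# Hypothetical raw scores, invented for illustration only.
raw_scores = [2, 3, 3, 3, 4, 5, 6, 8, 11]

mean = statistics.mean(raw_scores)      # arithmetic average
median = statistics.median(raw_scores)  # middle score when ranked
mode = statistics.mode(raw_scores)      # most frequent score

print(mean, median, mode)
```

In this made-up sample the mean is the largest of the three and the mode the smallest, which is the pattern typically seen when a few high scores stretch the upper tail.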
measures of dispersion/variability
Standard deviation
Variance
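A short sketch of both measures, again with invented scores. It assumes the population formulas (dividing by N); `statistics.variance` and `statistics.stdev` would give the sample versions (dividing by N - 1).

```python
import statistics

# Hypothetical raw scores, invented for illustration only.
raw_scores = [2, 4, 4, 4, 5, 5, 7, 9]

# Variance: average squared deviation from the mean (population formula).
variance = statistics.pvariance(raw_scores)
# Standard deviation: square root of the variance, in raw-score units.
sd = statistics.pstdev(raw_scores)

print(variance, sd)
```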
Normal Distribution
positive skew - when most scores are on the low end (left end) of the scale
RAW SCORE TRANSFORMATIONS
Percentiles
standard scores (z-scores)
- Retain the relative magnitudes of the distances between successive values found in the original raw scores, so their use does not distort the underlying measurement scale (advantage over percentiles)
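The contrast between the two transformations can be sketched as below. The helper names and the norm-group scores are invented; the percentile rank here uses the "percentage of the norm group scoring below" convention, which is an assumption (other conventions also count half of the tied scores).

```python
import statistics

def percentile_rank(score, norm_scores):
    """Percentage of the norm group scoring below the given raw score."""
    below = sum(1 for s in norm_scores if s < score)
    return 100 * below / len(norm_scores)

def z_score(score, norm_scores):
    """Distance from the norm-group mean in standard deviation units."""
    m = statistics.mean(norm_scores)
    sd = statistics.pstdev(norm_scores)
    return (score - m) / sd

# Hypothetical norm-group scores, invented for illustration only.
norm_scores = [2, 4, 4, 4, 5, 5, 7, 9]

for raw in (5, 7, 9):
    print(raw, percentile_rank(raw, norm_scores), z_score(raw, norm_scores))
```

The raw scores 5, 7 and 9 are equally spaced. Their z-scores stay equally spaced, while their percentile ranks do not, which is the distortion the note above refers to.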
Standardised scores: conceptually identical to standard scores, BUT always expressed as positive whole numbers
- T-score: mean of 50 and SD of 10, common scale for personality tests
- Transformation: T = 10(X - M)/SD + 50, or simply: T = 10z + 50
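A minimal sketch of the T-score transformation. The function name `t_score` is invented, and rounding to the nearest whole number is an assumption that follows from standardised scores being expressed as whole numbers.

```python
def t_score(z):
    """Convert a z-score to a T-score (mean 50, SD 10), rounded to a whole number."""
    return round(10 * z + 50)

# An examinee 1.5 SD below the mean, at the mean, and 2 SD above the mean.
print(t_score(-1.5), t_score(0), t_score(2))
```

Other standardised scales follow the same pattern with a different mean and SD in place of 50 and 10.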
Normalising standard scores: conversion of percentiles into normalised standard scores
- Use the percentile for each raw score to determine the corresponding standard score; doing this for every case in a non-normal distribution yields normally distributed standard scores
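The percentile-to-standard-score conversion can be sketched with the inverse normal distribution from the standard library. The function name and scores are invented. As an assumption made for this sketch, the percentile counts half of the tied scores, so that no score lands exactly at 0% or 100% (where the inverse normal is undefined).

```python
from statistics import NormalDist

def normalised_z_scores(raw_scores):
    """Map each distinct raw score to the z-score matching its percentile."""
    n = len(raw_scores)
    result = {}
    for score in sorted(set(raw_scores)):
        below = sum(1 for s in raw_scores if s < score)
        equal = raw_scores.count(score)
        # Mid-point convention (an assumption): count half of the tied scores.
        proportion = (below + 0.5 * equal) / n
        # z-score below which that proportion of a normal distribution falls.
        result[score] = NormalDist().inv_cdf(proportion)
    return result

# Hypothetical, positively skewed raw scores, invented for illustration only.
skewed = [1, 1, 1, 2, 2, 3, 4, 6, 9, 15]
normalised = normalised_z_scores(skewed)

for score, z in normalised.items():
    print(score, round(z, 2))
```

The rank order of the scores is preserved, but the distances between them are reshaped to fit the normal curve, so the long upper tail of the raw scores is pulled in.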
Selecting a norm group
Stratified random sampling - stratifying/classifying the target population on important background variables (age, gender, race, social class, educational level etc.) and then selecting an appropriate percentage of persons at random from each stratum
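A minimal sketch of the procedure, assuming a single stratification variable and the same sampling fraction in every stratum. The function name, the population and the `education` field are all invented for the example.

```python
import random
from collections import defaultdict

def stratified_sample(population, stratum_of, fraction, seed=0):
    """Draw the same fraction of persons at random from every stratum."""
    rng = random.Random(seed)  # fixed seed so the example is repeatable
    strata = defaultdict(list)
    for person in population:
        strata[stratum_of(person)].append(person)
    sample = []
    for members in strata.values():
        k = round(len(members) * fraction)
        sample.extend(rng.sample(members, k))  # sampling without replacement
    return sample

# Hypothetical target population: 60 persons with low and 40 with high education.
population = [{"id": i, "education": "low" if i < 60 else "high"} for i in range(100)]
sample = stratified_sample(population, lambda p: p["education"], fraction=0.1)

print(len(sample), sum(p["education"] == "low" for p in sample))
```

Because each stratum contributes in proportion to its size, the sample mirrors the population's 60/40 split, which simple random sampling would only do on average.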
Age norms: as we grow older, we change in measurable ways; age norms depict the level of test performance for each separate age group in the normative sample
Random sampling - all subjects have equal chance of being selected
Expectancy tables: portray the established relationship between test scores and expected outcome on a relevant task
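One way to build a simple expectancy table is sketched below, assuming a yes/no outcome and equal-width score bands. The function name and the records are invented for the example.

```python
from collections import defaultdict

def expectancy_table(records, width):
    """Map each test-score band to the proportion of examinees who succeeded."""
    totals = defaultdict(int)
    successes = defaultdict(int)
    for score, succeeded in records:
        band = (score // width) * width  # lower limit of the score band
        totals[band] += 1
        successes[band] += int(succeeded)
    return {band: successes[band] / totals[band] for band in sorted(totals)}

# Hypothetical (test score, succeeded on the later task) pairs, invented for illustration.
records = [
    (42, False), (45, False), (47, False), (48, True),
    (51, False), (55, True), (58, True),
    (63, True), (66, True), (69, True),
]
table = expectancy_table(records, width=10)

for band, proportion in table.items():
    print(f"{band}-{band + 9}: {proportion:.0%} succeeded")
```

A new examinee's score is then looked up in the table to read off the expected chance of success for persons who scored in the same band.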
Grade norms - depict level of performance per grade in school (used in schooling)
Local norms - representative of local examinees, as opposed to a national sample
Subgroup norms - consist of the scores obtained from an individual subgroup, as opposed to a diversified national sample
Criterion-referenced test: compare test results to a predetermined standard and not in relation to other examinees
- Best for basic academic skills but inappropriate for higher-level abilities because it is difficult to define specific objectives for such content domains
- Cut-off value is set subjectively
Van Breukelen
Test norms: obtained traditionally by splitting a large representative sample into subgroups and computing summary statistics
- Raw test scores can then be converted into z-scores, percentile ranks etc.
Multiple regression: regress test scores on all measured and possibly relevant person variables
- Makes it possible to decide which variables are predictive of the scores
- Makes it possible to obtain smooth and reliable norms for evaluating the observed test scores in terms such as average, modestly or seriously above/below average
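The idea can be sketched with ordinary least squares in NumPy. Everything here is an assumption made for illustration: the simulated normative sample, the choice of age and education as predictors, and the use of a standardised residual as the normed score. It shows the general approach, not the specific procedure of the article.

```python
import numpy as np

# Simulated normative sample, invented for illustration only.
rng = np.random.default_rng(0)
n = 200
age = rng.uniform(20, 80, n)
education = rng.integers(1, 8, n)  # education level coded 1..7
score = 30 - 0.2 * age + 1.5 * education + rng.normal(0, 3, n)

# Regress test scores on the person variables (with an intercept).
X = np.column_stack([np.ones(n), age, education])
coef, *_ = np.linalg.lstsq(X, score, rcond=None)

# Residual SD: spread of observed scores around the predicted scores.
residuals = score - X @ coef
residual_sd = np.sqrt(np.sum(residuals**2) / (n - X.shape[1]))

def normed_z(observed, person_age, person_education):
    """Standardised distance between an observed score and the predicted score."""
    predicted = coef @ np.array([1.0, person_age, person_education])
    return (observed - predicted) / residual_sd

# A hypothetical 65-year-old with education level 3 who scored 15.
print(round(normed_z(15, 65, 3), 2))
```

The resulting z-score can be labelled as average or as modestly or seriously above/below average, and it changes smoothly with age instead of jumping at the boundaries of age groups. A predictor whose coefficient is close to zero relative to its uncertainty would be a candidate for dropping from the norms.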
Regression vs traditional norming
Regression: advantage of greater validity and reliability
Limitation: unattractive for daily use by clinicians
Compared to traditional method of separate norm tables for subgroups based on age and gender, the regression method has 2 advantages:
- Allows the clinician to distinguish which background variables are/are not predictive of scale scores and therefore relevant to a valid norming procedure
o Example: here gender was irrelevant, but educational level was relevant for the PCL scales
- Norms are more continuous and more reliable than those obtained by tabulating the mean and SD of the scale score for different age x gender groups
Only advantage of the traditional method is ease of use; in all other respects it's doomed to fail