Task 4: Norms and Test Standardisation

Gregory 3A

Norms - comparative frame of reference for interpreting scores; many varieties (e.g. percentile ranks, age/grade equivalents, standard scores)

  • Indicate an examinee's standing on the test relative to the performance of other persons
  • Norm-referenced tests: tests that are interpreted using norms (as opposed to criterion-referenced tests)

Test standardisation: determine the distribution of raw scores in the norm group

ESSENTIAL STATISTICAL CONCEPTS

Raw scores: most basic level of information from a test, the initial outcome of testing; meaningless on their own, so a reference point is needed to compare the results

Frequency distributions: specifying a small number of usually equal-sized class intervals and then tallying how many scores fall within each interval
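A minimal sketch of such a tally, assuming integer raw scores; the scores, the interval width and the function name are invented for illustration:

```python
from collections import Counter

def frequency_distribution(scores, width, start):
    """Tally integer scores into equal-sized class intervals [low, low + width - 1]."""
    counts = Counter((s - start) // width for s in scores)
    table = []
    for k in range(min(counts), max(counts) + 1):
        low = start + k * width
        table.append((low, low + width - 1, counts.get(k, 0)))
    return table

# Hypothetical raw scores, not taken from any real test.
raw_scores = [12, 15, 17, 21, 22, 22, 25, 28, 31, 34]

# Printing one star per score gives a rough text histogram.
for low, high, n in frequency_distribution(raw_scores, width=5, start=10):
    print(f"{low}-{high}: {'*' * n} ({n})")
```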

Histogram

Polygon

measures of central tendency

median

mode

mean

measures of dispersion/variability

Standard deviation

Variance
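The measures listed above can be computed with Python's statistics module. This is only a sketch on invented scores; it assumes the population formulas (dividing by N), whereas a text may use the sample formulas (dividing by N - 1) instead:

```python
import statistics

# Hypothetical raw scores, invented for illustration.
raw_scores = [12, 15, 17, 21, 22, 22, 25, 28, 31, 34]

# Central tendency
mean = statistics.mean(raw_scores)
median = statistics.median(raw_scores)
mode = statistics.mode(raw_scores)

# Dispersion (population formulas: assumption, see lead-in)
variance = statistics.pvariance(raw_scores)
sd = statistics.pstdev(raw_scores)  # square root of the variance

print(f"mean={mean}, median={median}, mode={mode}")
print(f"variance={variance:.2f}, SD={sd:.2f}")
```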

Normal Distribution

positive skew - when most scores are on the low end (left end) of the scale

RAW SCORE TRANSFORMATIONS

Percentiles

standard scores (z-scores)

  • Retain the relative magnitudes of the distances between successive raw scores, so their use does not distort the underlying measurement scale (advantage over percentiles)
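A small sketch contrasting the two transformations, on an invented norm group. It assumes a percentile is the percentage of the norm group scoring below a given raw score, and uses the population SD for z; both function names are made up for this example:

```python
import statistics

def percentile_rank(scores, x):
    """Percentage of scores in the norm group falling below x."""
    below = sum(1 for s in scores if s < x)
    return 100 * below / len(scores)

def z_scores(scores):
    """z = (X - M) / SD for every score, using the population SD."""
    mean = statistics.mean(scores)
    sd = statistics.pstdev(scores)
    return [(x - mean) / sd for x in scores]

# Hypothetical norm group with one extreme score.
norm_group = [10, 20, 30, 40, 100]

for x, z in zip(norm_group, z_scores(norm_group)):
    print(f"raw={x:3d}  percentile={percentile_rank(norm_group, x):5.1f}  z={z:+.2f}")
```

In the printout the percentiles rise in equal steps even though the last raw score is far from the rest, while the z-scores keep that gap six times larger than the gaps between the other scores.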

Standardised scores: conceptually identical to standard scores, BUT always expressed as positive whole numbers

  • T-score: mean of 50 and SD of 10, common scale for personality tests
  • Transformation: T = 10(X - M)/SD + 50, or simply: T = 10z + 50
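As a quick worked example of T = 10z + 50, the sketch below converts a raw score to a T-score. The norm-group mean and SD are invented, and rounding to a whole number reflects the point that standardised scores are reported as positive whole numbers:

```python
def t_score(raw, norm_mean, norm_sd):
    """Convert a raw score to a T-score (mean 50, SD 10), rounded to a whole number."""
    z = (raw - norm_mean) / norm_sd
    return round(10 * z + 50)

# Hypothetical norm group: mean 22, SD 8.
print(t_score(34, norm_mean=22, norm_sd=8))  # z = 1.5, so T = 65
print(t_score(22, norm_mean=22, norm_sd=8))  # at the mean, so T = 50
```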

Normalising standard scores: conversion of percentiles into normalised standard scores

  • Use the percentile for each raw score to determine the corresponding standard score → do this for every case in a nonnormal distribution → normally distributed standard scores
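The conversion can be sketched as follows. One assumption is added here that is not in the notes: the percentile is taken at the midpoint of tied scores, so that the proportion never reaches 0 or 1 (where the normal curve has no finite z). The data and function name are invented:

```python
from statistics import NormalDist

def normalised_z(scores):
    """Map each distinct raw score to the z-score that matches its percentile
    under a standard normal curve."""
    n = len(scores)
    result = {}
    for x in sorted(set(scores)):
        below = sum(1 for s in scores if s < x)
        ties = sum(1 for s in scores if s == x)
        # Midpoint percentile (assumption): keeps p strictly between 0 and 1.
        p = (below + 0.5 * ties) / n
        result[x] = NormalDist().inv_cdf(p)
    return result

# Hypothetical, positively skewed raw scores.
skewed = [1, 1, 1, 2, 2, 3, 5, 9]
for raw, z in normalised_z(skewed).items():
    print(f"raw={raw}  normalised z={z:+.2f}")
```

The resulting z-scores can then be rescaled like any other standard score, for example with T = 10z + 50.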

Selecting a norm group

Stratified random sampling - stratifying/classifying the target population on important background variables (age, gender, race, social class, educational level etc.) and then select an appropriate percentage of persons at random from each stratum
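A minimal sketch of the procedure, assuming the same sampling fraction is drawn from every stratum (proportional allocation). The population, the stratifying variable and the function name are hypothetical:

```python
import random
from collections import defaultdict

def stratified_sample(population, stratum_of, fraction, seed=0):
    """Group people by stratum, then draw the same fraction at random from each."""
    rng = random.Random(seed)  # fixed seed only to make the example repeatable
    strata = defaultdict(list)
    for person in population:
        strata[stratum_of(person)].append(person)
    sample = []
    for members in strata.values():
        k = round(fraction * len(members))
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical target population: 60 people with low, 40 with high education.
population = [{"id": i, "education": "low" if i < 60 else "high"} for i in range(100)]
sample = stratified_sample(population, lambda p: p["education"], fraction=0.1)
print(len(sample), sum(p["education"] == "low" for p in sample))
```

The sample keeps the 60/40 split of the population, which simple random sampling would only reproduce on average.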

Age norms: as we grow older, we change in measurable ways → age norms depict the level of test performance for each separate age group in the normative sample

Random sampling - all subjects have equal chance of being selected

Expectancy tables: portray the established relationship between test scores and the expected outcome on a relevant task

Grade norms - depict the level of performance for each grade in school (used in schooling)

Local norms - representative of local examinees, as opposed to a national sample

Subgroup norms - consist of the scores obtained from an individual subgroup, as opposed to a diversified national sample

Criterion-referenced test: compares test results to a predetermined standard rather than to the performance of other examinees

  • Best for basic academic skills but inappropriate for higher-level abilities because it is difficult to define specific objectives for such content domains
  • Cut-off value is set subjectively

Van Breukelen

Test norms: traditionally obtained by splitting a large representative sample into subgroups and computing summary statistics for each

  • Raw test scores can then be converted into z-scores, percentile ranks etc.

Multiple regression: of test scores on all measured and possibly relevant person variables

  • Allows one to decide which variables are predictive of the scores
  • Allows one to obtain smooth and reliable norms for evaluating the observed test scores in terms such as average, or modestly or seriously above/below average
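The idea can be sketched with a single predictor; this is a simplification for brevity, since the notes describe multiple regression on all relevant person variables. The observed score is compared with the score predicted from the person's background, and the difference is scaled by the residual SD. The data, variable names and both function names are invented, not taken from Van Breukelen:

```python
import statistics

def fit_norm_model(predictor, scores):
    """Least-squares line score = a + b * predictor, plus the residual SD."""
    mx, my = statistics.mean(predictor), statistics.mean(scores)
    sxx = sum((x - mx) ** 2 for x in predictor)
    sxy = sum((x - mx) * (y - my) for x, y in zip(predictor, scores))
    b = sxy / sxx
    a = my - b * mx
    residuals = [y - (a + b * x) for x, y in zip(predictor, scores)]
    # n - 2 degrees of freedom because two parameters (a and b) were estimated.
    resid_sd = (sum(r ** 2 for r in residuals) / (len(scores) - 2)) ** 0.5
    return a, b, resid_sd

def normed_z(observed, x, model):
    """Standardised distance between the observed and the predicted score."""
    a, b, resid_sd = model
    return (observed - (a + b * x)) / resid_sd

# Hypothetical norm sample: years of education and scale scores.
education = [8, 10, 12, 14, 16, 18]
scale_scores = [20, 25, 27, 34, 36, 41]
model = fit_norm_model(education, scale_scores)

# A new examinee with 12 years of education and a score of 33.
print(f"z = {normed_z(33, 12, model):+.2f}")
```

A z near 0 reads as average for someone with that background, and larger absolute values as modestly or seriously above/below average. Because the predicted score changes smoothly with the predictor, no separate norm table per subgroup is needed.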

Regression vs traditional norming

Regression: advantage of greater validity and reliability

  • Limitation: unattractive for daily use by clinicians 
    

Compared to traditional method of separate norm tables for subgroups based on age and gender, the regression method has 2 advantages:

  1. Allows clinician to distinguish which background variables are/ are not predictive of scale scores and therefore relevant to a valid norming procedure 
    
    o Example: here gender was irrelevant, but educational level was relevant for the PCL scales
  2. Norms are more continuous and more reliable than those obtained by tabulating the mean and SD of the scale score for different age × gender groups
    

The only advantage of the traditional method is ease of use; in all other respects it is doomed to fail