Psychometrics Quiz 2 mind map from flash cards
Spearman-Brown formula
n(rxx) / [1 + (n-1)rxx]
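As a rough illustration of the formula above, here is a minimal Python sketch. The function name `spearman_brown` and the example numbers are made up for illustration; `r_xx` is the reliability of the current test and `n` is the factor by which the test length changes.

```python
def spearman_brown(r_xx: float, n: float) -> float:
    """Predicted reliability when test length is multiplied by n.

    Implements n * r_xx / (1 + (n - 1) * r_xx).
    """
    return (n * r_xx) / (1 + (n - 1) * r_xx)


# Hypothetical example: a half-length test with reliability 0.60,
# brought back to full length (n = 2).
print(round(spearman_brown(0.60, 2), 2))  # 0.75
```

With n = 1 the formula returns the original reliability unchanged, which is a quick sanity check.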
types of error
random error: something weird happens, such as two mechanical pencils running out of lead
observed score: an example of this would be a percentile rank
true score: this is the average of numerous tests taken over a period of time
error score: an example of this would be someone guessing. they may do well once, but it is not their true score because they may not be as lucky the next time they take the test
test-retest: taking the test once, waiting a certain amount of time, and then taking it a second time to estimate the true score (averaged-out score) of a participant
there is a second interview but not a second interviewer
researchers calculate the consistency between the two tests
too short of a time between tests results in test wiseness
too long a break results in error as well
error comes from memory effects, administration, guessing, and fluctuation of the true score over time
error in test retest
too long a break between the two tests results in error
too short a break results in test-wiseness
error can result from administration
error can come from guessing but it is unlikely as they will not get as lucky or unlucky the second time around
fluctuation in true score over time is another form of error
parallel form: results in two similar but different tests that are supposed to be both psychometrically sound
estimating 2 versions of the same test
trying to make two psychometrically sound tests is difficult and can result in error
parallel forms are supposed to have equal true score variances, observed score means, and error variances; when they do not, the difference shows up as error
error in parallel form reliability
lack of agreement between the scorers
administration issues where the person administering the test does not do it correctly or read the instructions clearly
the two versions of the test might not both be psychometrically sound, resulting in one test being more reliable than the other
internal consistency
split the test in half and compare the consistency between the two values
since it is split in half, the Spearman-Brown formula would be used to bring it back to full scale
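The split-half idea can be sketched in Python as follows. This is only one possible version: the odd/even split, the function names, and the tiny data set are assumptions for illustration, since the notes only say the test is split in half.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


def split_half_reliability(item_scores):
    """item_scores holds one list of item scores per participant.

    Assumption: items are split into odd- and even-numbered halves.
    """
    first_half = [sum(row[0::2]) for row in item_scores]
    second_half = [sum(row[1::2]) for row in item_scores]
    r_half = pearson_r(first_half, second_half)
    # Spearman-Brown correction with n = 2 brings it back to full length.
    return (2 * r_half) / (1 + r_half)


# Made-up scores (1 = correct, 0 = incorrect) for four participants.
scores = [
    [1, 1, 1, 0, 1, 1],
    [1, 0, 0, 0, 1, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0, 0],
]
print(round(split_half_reliability(scores), 2))
```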
error in internal consistency
because the test is split in half and scored using the Spearman-Brown formula, the reduced length of the test could result in a lower reliability or alpha coefficient
there could be error in agreement between the scorers
could be error in scoring
some participants could be guessing, and because they are not doing a test-retest there is no way to get a better estimate of the participants' true scores
inter-rater reliability
have multiple interviewers and have them rate the tests
Kappa is used if the scores were consistent
this will show consistency but not level of agreement
reliability measures error in scoring but not error in candidate
error in inter-rater reliability
the most common type of error in inter-rater reliability is that of lack of agreement between the scorers of the test
other types of error
error with memory: too short a break between test-retest results in test-wiseness. too long a break between test-retest results in fluctuation in the true score over time
error in administration: when there are two administrators they both have to give the instructions the same way to increase the reliability of the test. the environment also has to be the same when the test is taken
error with guessing: this is a result of someone getting lucky. if there were a test-retest they may not get as lucky the second time around and it could result in a lower or higher true score when the tests are averaged out
ways to combat errors with guessing
write suitable distractors
change the order of items and correct answers
error with fluctuation of true score: waiting too long between test and retest lets the true score fluctuate over time, while too short a time frame results in test-wiseness
error in lack of agreement: a way to ensure this does not affect the test reliability is to make sure the two versions of the test are equivalent.
error in scoring: a way to combat this is to give graders clear scoring rules and standards
Kappa
(O - E) / (N - E)
O is observed count of agreement between scorers
E is expected amount of agreement (multiply corresponding margins, divide each product by N, and add them together)
N is the total number of items
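A small Python sketch of the Kappa calculation, following the O, E, and N definitions above. The function name `kappa` and the 2x2 table of counts are invented for illustration; with two scorers this count-based formula corresponds to what is usually called Cohen's kappa.

```python
def kappa(table):
    """Kappa from a square table of counts.

    table[i][j] = number of items scorer A put in category i
    and scorer B put in category j.
    """
    n = sum(sum(row) for row in table)                       # N: total items
    observed = sum(table[i][i] for i in range(len(table)))   # O: agreements
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    # E: multiply corresponding margins, divide each by N, add together.
    expected = sum(r * c / n for r, c in zip(row_totals, col_totals))
    return (observed - expected) / (n - expected)


# Made-up example: two scorers rate 50 items as pass/fail.
ratings = [
    [20, 5],   # A = pass: B = pass 20 times, B = fail 5 times
    [10, 15],  # A = fail: B = pass 10 times, B = fail 15 times
]
print(kappa(ratings))  # 0.4
```

Here O = 35, E = 25, and N = 50, so Kappa = (35 - 25) / (50 - 25) = 0.4.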
finding the confidence interval
CI = observed score ± z(SEM), where SEM = SD (square root [1-r]) and z is the critical value for the chosen confidence level
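The quantity SD × square root [1 - r] is commonly called the standard error of measurement (SEM), and an interval around an observed score is usually built as score ± z × SEM. The sketch below assumes that reading; the function names, the default z = 1.96 (a 95% interval), and the example numbers are assumptions, not part of the notes.

```python
from math import sqrt


def standard_error_of_measurement(sd: float, reliability: float) -> float:
    """SEM = SD * sqrt(1 - r)."""
    return sd * sqrt(1 - reliability)


def confidence_interval(observed: float, sd: float, reliability: float,
                        z: float = 1.96):
    """Interval around an observed score; z = 1.96 assumes a 95% level."""
    sem = standard_error_of_measurement(sd, reliability)
    return observed - z * sem, observed + z * sem


# Hypothetical test: SD = 15, reliability = 0.91, observed score = 100.
sem = standard_error_of_measurement(15, 0.91)
low, high = confidence_interval(100, 15, 0.91)
print(round(sem, 2))                   # about 4.5
print(round(low, 2), round(high, 2))   # about 91.18 and 108.82
```

A higher reliability shrinks the SEM, so the interval around the observed score gets narrower.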
what are some of the major assumptions of CTT?
the observed score is equal to the sum of the true score and measurement error (x = t + e)
error is random, such that the mean of error is zero
true scores are uncorrelated with error scores since error scores are random
reliability is the fraction of observed variance that is due to true score variance
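These assumptions can be checked on a toy example. In practice true scores are never known, so the sketch below uses made-up true scores and errors purely to show how the pieces fit together; the function name and data are hypothetical.

```python
from statistics import mean, pvariance


def reliability_from_components(true_scores, errors):
    """Reliability = true score variance / observed score variance."""
    observed = [t + e for t, e in zip(true_scores, errors)]  # x = t + e
    return pvariance(true_scores) / pvariance(observed)


# Made-up data chosen so that error has mean zero and is
# uncorrelated with the true scores.
true_scores = [10, 10, 20, 20]
errors = [-1, 1, -1, 1]

print(mean(errors))  # 0, so the mean of error is zero
print(round(reliability_from_components(true_scores, errors), 3))  # 0.962
```

Here the true score variance is 25 and the error variance is 1, so the observed variance is 26 and reliability is 25/26.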
how is reliability defined in terms of CTT?
the degree to which measures are free from error and yield consistent results
under what conditions are tests reliable
when there is no measurement error, t = x
applied purpose above 0.80
research purpose above 0.70
why is reliability important?
for a test to be valid it must be reliable
under what conditions do we accept low reliability?
in experimental research
when the test is shorter
what are the limitations of CTT?
reflects both individual differences and measurement fluctuations
assumes standard error is the same for all test takers
reliability of overall test, not individual items
it is not possible to get an individual's actual true score because in order to do that they would have to take the test an infinite number of times
what are the important things to keep in mind with alpha coefficient?
test length
average inter-item correlation
the dimensionality of the scale
adding questions should increase the reliability but they have to be psychometrically sound questions
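The effect of test length and average inter-item correlation can be sketched with the standardized form of alpha, k × r / [1 + (k - 1) × r], where k is the number of items and r is the average inter-item correlation. This assumes the added items correlate with the others about as well as the existing ones do; the function name and numbers are illustrative only.

```python
def standardized_alpha(k: int, mean_inter_item_r: float) -> float:
    """Standardized alpha from item count and average inter-item correlation."""
    return (k * mean_inter_item_r) / (1 + (k - 1) * mean_inter_item_r)


# Hypothetical scale with an average inter-item correlation of 0.30.
print(round(standardized_alpha(10, 0.30), 2))  # 0.81
print(round(standardized_alpha(20, 0.30), 2))  # 0.9

# Same length, but weaker items (average correlation 0.10).
print(round(standardized_alpha(20, 0.10), 2))  # 0.69
```

Doubling the length raises alpha only when the new items are as good as the old ones; twenty weakly related items end up below ten well-related ones.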
what is the difference between inter-rater reliability and inter-rater agreement
inter-rater reliability is the degree to which the scorers are within the same range on each item consistently within the standard deviation of the mean. the inter-rater agreement is the actual number of times the scorers agree on an item
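A quick way to see the difference is a case where two scorers are perfectly consistent but never agree. The sketch below uses a Pearson correlation as a stand-in for consistency, which is an assumption (the notes do not name a statistic), and counts exact matches for agreement. The ratings are made up.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation, used here as a simple consistency index."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


def exact_agreements(xs, ys):
    """Number of items on which both scorers gave the same score."""
    return sum(x == y for x, y in zip(xs, ys))


# Scorer B is always exactly one point higher than scorer A.
scorer_a = [1, 2, 3, 4]
scorer_b = [2, 3, 4, 5]

print(round(pearson_r(scorer_a, scorer_b), 2))  # 1.0
print(exact_agreements(scorer_a, scorer_b))     # 0
```

The scorers rank every item the same way, so consistency is perfect, yet they never give the same score, so agreement is zero.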