Ch. 3 Validity of Assessment Results
General Nature of Validity
Four Principles for Validation
Use
Appropriate Uses
Values
Appropriate Values
Interpretations
Appropriate Interpretations
Consequences
Appropriate Consequences
Validity of Teacher-Made Classroom Assessment Results
Content Representativeness and Relevance
Does my assessment procedure emphasize what I have taught?
Do my assessment tasks and scoring schemes accurately represent the outcomes specified in my school's and state's curriculum framework?
Are my assessment tasks in line with the current thinking about what should be taught and how it should be assessed?
Is the content in my assessment important and worth learning?
Thinking Processes and Skills Represented
Does my assessment instrument represent the kinds of thinking skills that my school's curriculum framework and state's standards view as important?
During the assessment, do students actually use the types of thinking I expect them to use?
Do the tasks on my assessment instrument require students to use important thinking skills and processes?
Do I allow enough time for students to demonstrate the type of thinking I am trying to assess?
Consistency with Other Classroom Assessments
Is the pattern of results in the class consistent with what I expected based on my other assessments of them?
Do I make the assessment tasks too difficult or too easy for my students?
Reliability and Objectivity
Do I use a scoring guide for obtaining quality ratings or scores from students' performance on the assessment?
Is my assessment instrument long enough to be a representative sample of the types of learning outcomes I am assessing?
Fairness to Different Types of Students
Do I word the problems or tasks on my assessment so that students with different ethnic and socioeconomic backgrounds will interpret them in appropriate ways?
Do I modify the wording or the administrative conditions of the assessment tasks to accommodate students with disabilities or special learning problems?
Do the pictures, stories, verbal statements, or other aspects of my assessment procedure perpetuate racial, ethnic, or gender stereotypes?
Economy, Efficiency, Practicality, Instructional Features
Is the assessment relatively easy for me to construct and not too cumbersome to use to evaluate students?
Would the time needed to use this assessment be better spent directly teaching my students instead?
Does my assessment represent the best use of my time?
Multiple Assessment Usage
Do I use one assessment result in conjunction with other assessment results?
Positive Consequences for Learning
Do my assessments give both my students and me information that helps students learn?
Do my assessments avoid inappropriate negative consequences?
Validity of Large-Scale Assessments
Construct an Argument for Validity, Supported by Evidence
An interpretive argument
The validity argument
You can appropriately assess students' success in the calculus course (i.e., a suitable criterion assessment procedure is available).
You can identify the algebra concepts and thinking skills that students will use frequently in the calculus course.
The algebra content and thinking skills assessed by the placement test match those frequently used in the calculus course.
The remedial course to which low-scoring students will be assigned will succeed in teaching students the algebra concepts and skills needed in the calculus course.
Scores on the placement test are reliable (i.e., students' scores are consistent across different samples of test items, different testing occasions, and different persons scoring the test).
It is not helpful for students with high ability in algebra to take the remedial algebra course (i.e., students who score high on the placement tests will not significantly improve their chances of success in calculus by first taking this particular remedial algebra course).
The placement test scores are not affected by systematic errors that would lower the validity of your interpretation that the placement test measures algebra knowledge and thinking skills.
Content Representativeness and Relevance: Content Evidence
Content
Depth
Emphasis
Performances
Implied applicability
Thinking Skills and Processes: Substantive Evidence
Relationships Among Parts of Assessment: Internal Structure Evidence
Relationships of Results to Other Variables: External Structure Evidence
Correlation Coefficient
Students' Scores on Different Tests
Comparing Students' Rank Orders
Scatter Diagrams
Pearson Product-Moment Correlation Coefficients
Degrees of Relationships
Correlation and Causation
Correlation Coefficients and Sample Sizes
Factors that Raise or Lower Correlation Coefficients
Validity Coefficients
Expectancy Tables
The Criterion
Judging the Worth of Criteria
Low Criterion Reliability Limits Validity
Systematic Errors
Practical Considerations
Reliability Over Time, Assessors, and Content Domains: Reliability Evidence
Generalization of Interpretation Over People, Conditions, or Special Instructions and Interventions: Generalization Evidence
Intended and Unintended Consequences: Consequential Evidence
Cost, Efficiency, Practicality, and Instructional Features: Practicality Evidence
Validity Issues When Accommodating Students With Disabilities
Validity of Scores From Test Accommodations
How Should Accommodated Norm-Referenced Scores Be Reported?
How Should Accommodated Criterion-Referenced Scores Be Reported?
Measurement Perspective on Accommodations and Modifications
Will changes in format or testing conditions change the skill being measured?
Will the scores of examinees tested under standard conditions have a different meaning than scores for examinees tested with the requested accommodation?
Would examinees who do not need accommodations benefit if they were nevertheless allowed the same accommodations?
Do examinees requesting or granted accommodations have any capacity for adjusting to standard test administration conditions?
Is the disability evidence or testing accommodation policy based on procedures with doubtful validity and reliability? (adapted from Phillips, 1994, p. 104).
Conclusion
The validity of classroom and large-scale assessment results depends on intended purposes and uses. This chapter has outlined the various types of evidence that should be considered in arguments that particular assessment results are valid for a particular purpose or use. We introduced the concept of reliability as a necessary but not sufficient condition for validity. Chapter 4 discusses reliability in more detail.
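The outline items "Pearson Product-Moment Correlation Coefficients", "Validity Coefficients", and "Low Criterion Reliability Limits Validity" can be illustrated with a small calculation. The following is a minimal sketch, not the chapter's own procedure. The function names (pearson_r, max_validity), the score lists, and the reliability values are all hypothetical and made up for illustration. The upper bound used in max_validity is the standard classical test theory result that a validity coefficient cannot exceed the square root of the product of the test's reliability and the criterion's reliability.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares of deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when all scores are identical")
    return sxy / sqrt(sxx * syy)


def max_validity(test_reliability, criterion_reliability):
    """Upper bound on a validity coefficient under classical test theory."""
    return sqrt(test_reliability * criterion_reliability)


# Hypothetical data (made up): algebra placement scores and later calculus grades.
placement = [12, 15, 18, 20, 25, 27, 30, 34]
calculus = [55, 60, 58, 70, 72, 80, 78, 90]

# The validity coefficient is the correlation between test and criterion.
validity_coefficient = pearson_r(placement, calculus)

# Assumed reliabilities: a reliable test (0.90) paired with a weak criterion (0.50).
ceiling = max_validity(0.90, 0.50)

print(f"validity coefficient: {validity_coefficient:.2f}")
print(f"ceiling imposed by reliabilities: {ceiling:.2f}")
```

With these assumed reliabilities the ceiling is well below 1.0 even though the test itself is highly reliable, which is the point of the item on low criterion reliability: an unreliable criterion caps the validity coefficient that any test can show against it.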