The Role of Domain Specifications in Improving the Technical Quality of Performance Assessments. Project 2.2: Alternative Approaches to Measuring Liberal Arts Subjects: History, Geography, and Writing.

Author	Baker, Eva L.
Institution	Center for Research on Evaluation, Standards, and Student Testing, Los Angeles, CA.
Title	The Role of Domain Specifications in Improving the Technical Quality of Performance Assessments. Project 2.2: Alternative Approaches to Measuring Liberal Arts Subjects: History, Geography, and Writing.
Source	(1992), 145 pages; PDF full text available as a free file
Language	English
Document type	print; online; monograph
Keywords	Educational Assessment; Essay Tests; Grade 11; High School Students; High Schools; Objective Tests; Performance Based Assessment; Rating Scales; Scoring; Scoring Rubrics; Sex Differences; Test Construction; Test Use; Test Validity; United States History
Abstract	Work on the development of history performance assessments is described. So far, six complete sets of assessments in United States history have been developed as part of this project. Students are first assessed on their historical knowledge of the period through a short-answer test. They are then asked to write an essay explaining the positions of the authors of the primary source texts provided. A series of studies conducted to determine how scoring rubrics should be developed found that the best strategy was to examine differences between expert and novice performance. The use of task specifications controlled some of the variation commonly associated with performance tasks. Results from the research to date include: (1) development of a valid scoring scheme; (2) development of rater training procedures; (3) a task structure that reduces score variability; (4) distinguishing between assessment purposes and the utility of overall score and subscores; (5) detecting gender differences in scores; (6) supporting the validity of the measures through grade point averages and a scale ensuring effort through studies with 68 eleventh graders; and (7) addressing validity criteria with the same data. There are 27 tables and 4 graphs of study data, and a 40-item list of references. Five appendices provide samples of the content assessments and information about validity and reliability results. (SLD)
Indexed by	ERIC (Education Resources Information Center), Washington, DC
Update	2004/1/01