Construction of psychometrically sound written university exams.

Authors	Frey, Andreas; Spoden, Christian; Born, Sebastian
Title	Construction of psychometrically sound written university exams.
Source	In: Psychological test and assessment modeling, 62 (2020) 4, pp. 472-486
Supplements	References
Additional information	Research data, study details, and survey instruments
Language	English
Document type	online; print; journal article
ISSN	2190-0493; 2190-0507
Keywords	Research method; Reliability; Test construction; Validity; Competence acquisition; Test diagnostics; Test administration; Test development; Written exam; Exam item; Coursework; Higher education; Studies; University; Seminar; Subject of study; Degree program; Seminar paper; Higher education institution; Course material; Assessment; Introductory course; Item; Competence development
Abstract	Written university exams typically used at German-speaking universities often do not represent the learning objectives of the respective course appropriately. Moreover, they do not allow for criterion-referenced inferences regarding the degree to which the learning objectives have been met, and they are statistically unconnected across different test cycles. To overcome these shortcomings, we propose applying a combination of established methods from the fields of educational measurement and psychometrics to written university exams. The key elements of the proposed procedure are (a) the definition of the content domain of interest in relation to the learning objectives of the course, (b) the specification of an assessment framework, (c) the operationalization of the assessment framework with test items, (d) the standardized administration of the exam, (e) the scaling of gathered responses with item response theory models, and (f) the setting of grade levels with standard-setting procedures. Empirical results obtained from six test cycles of a real university exam at the end of an introductory course on research methods in education show that this procedure can successfully be applied in a typical university setting. It was possible to constitute a reliable and valid scale and maintain it across the six test cycles based on a common item nonequivalent group design. The comparison of the observed student competence distributions across the six years gave interesting insights that can be used to optimize the course. (Orig.).
Recorded by	DIPF | Leibniz-Institut für Bildungsforschung und Bildungsinformation, Frankfurt am Main
Update	2021/4
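
Illustration	The abstract names item response theory scaling and a common item nonequivalent group design without spelling out either. As a rough illustration only, the sketch below shows a Rasch item response function and a simple mean-mean shift that places a new test cycle's item difficulties on a reference scale via shared items. The record does not state which IRT model or linking method the authors used, so the model choice, the function names, and all numeric values here are assumptions made up for the example.

```python
import math


def rasch_probability(theta, b):
    """Probability of a correct response under the Rasch model (assumed model)."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def mean_mean_shift(b_reference, b_new, common_items):
    """Average difficulty difference on the common items (mean-mean linking)."""
    diffs = [b_reference[item] - b_new[item] for item in common_items]
    return sum(diffs) / len(diffs)


# Hypothetical item difficulties from two test cycles, each scaled separately.
b_cycle1 = {"item_a": -0.5, "item_b": 0.2, "item_c": 1.0}
b_cycle2 = {"item_b": -0.1, "item_c": 0.7, "item_d": 0.4}
common = ["item_b", "item_c"]  # items administered in both cycles

# Shift cycle 2 onto the cycle 1 scale using the common items.
shift = mean_mean_shift(b_cycle1, b_cycle2, common)
b_cycle2_linked = {item: b + shift for item, b in b_cycle2.items()}

# A student whose ability equals an item's difficulty has a 50% success chance.
p_equal = rasch_probability(0.0, 0.0)
```

Once the difficulties share one scale, ability estimates from different cycles can be compared directly, which is the kind of cross-year comparison the abstract describes.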