Bibliographic record - detail view
Authors | Levine, Michael V.; Rubin, Donald B. |
---|---|
Institution | Educational Testing Service, Princeton, NJ. |
Title | Measuring the Appropriateness of Multiple-Choice Test Scores. |
Source | (1976), (31 pages) |
Full text as PDF |
Language | English |
Document type | print; online; monograph |
Keywords | Academic Ability; Aptitude Tests; College Entrance Examinations; High Schools; Item Analysis; Mathematical Formulas; Maximum Likelihood Statistics; Multiple Choice Tests; Predictive Measurement; Probability; Research Reports; Response Style (Tests); Scores; Test Bias; Test Interpretation; Test Reliability; Test Validity; SAT (College Admission Test); Aptitude test; Entrance examination; High school; Secondary school; Multiple-choice examinations; Multiple-choice method; Probability theory; Research report; Response behavior; Test criticism; Test analysis; Test evaluation |
Abstract | Appropriateness indices (statistical formulas) for detecting suspiciously high or low scores on aptitude tests are presented, based on a simulation of the Scholastic Aptitude Test (SAT) with 3,000 simulated scores: 2,800 normal and 200 suspicious. The traditional index, marginal probability, uses a model of the normal examinee's test-taking behavior only, based on item characteristic curve theory. The other two indices use a generalization of the traditional index that allows ability to vary during testing. One uses the standard likelihood ratio to quantify the improvement in fit achieved by permitting ability to vary across items. The other estimates the parameter values of the varying-ability model and uses those estimates to indicate the degree of aberrance. Files of candidates with 4%, 10%, 20%, and 40% aberrance were generated by modifying the item scores of normal examinees. With 20% aberrance, the suspiciously low group was detected surprisingly well by all three indices. Suspiciously high candidates were even more easily detected. The results are significant because they suggest that candidates with inappropriate scores (such as low-ability students who cheat or high-ability students who misinterpret instructions) can be detected without reference to background variables. (CP) |
Recorded by | ERIC (Education Resources Information Center), Washington, DC |