Bibliographic Record - Detail View

Author: Bogorevich, Valeriia
Title: Native and Non-Native Raters of L2 Speaking Performance: Accent Familiarity and Cognitive Processes
Source: (2018), (271 pages)
Ph.D. Dissertation, Northern Arizona University
Document type: printed; online; monograph
Keywords: University thesis; Dissertation; Pronunciation; Familiarity; English (Second Language); Second Language Learning; Evaluators; Interrater Reliability; Scores; Language Tests; Classification; Oral Language; Native Speakers; Test Validity; Scoring Rubrics; Language Usage; Teaching Experience; Student Characteristics; Second Language Instruction; North American English; Russian; Native Language; Semitic Languages; Chinese; Language Proficiency; Protocol Analysis; Content Analysis
Abstract: Rater variation in performance assessment can affect test-takers' scores and undermine an assessment's fairness and validity (Crooks, Kane, & Cohen, 1996); it is therefore important to investigate raters' scoring patterns in order to inform rater training. Substantial work has analyzed rater cognition in writing assessment (e.g., Cumming, 1990; Eckes, 2008); however, few studies have tried to classify factors that could contribute to rater variation in speaking assessment (e.g., May, 2006). The present study used a mixed methods approach (Tashakkori & Teddlie, 1998; Greene, Caracelli, & Graham, 1989) to investigate potential differences between native English-speaking and non-native English-speaking raters in how they assess L2 students' speaking performance. Kane's (2006) argument-based approach to validity served as the theoretical framework. The study challenged the plausibility of the assumptions for the evaluation inference, which links the observed performance to the observed score and depends on the assumption that raters apply the scoring rubric accurately and consistently. The study analyzed raters' scoring patterns when using a TOEFL iBT speaking rubric analytically. The raters provided scores for each rubric criterion (i.e., Overall, Delivery, Language Use, and Topic Development). Each rater received individual training, practice, and calibration experience. All the raters filled out a background questionnaire asking about their teaching experience, language learning history, the backgrounds of students in their classrooms, and their exposure to and familiarity with the non-native accents used in the study. For the quantitative analysis, two groups of raters, 23 native (North American) and 23 non-native (Russian), graded and commented on speech samples from speakers of Arabic (n = 25), Chinese (n = 25), and Russian (n = 25) L1 backgrounds.
Students' samples were responses to two independent speaking tasks, and the responses ranged from low to high proficiency levels. For the qualitative part, 16 raters (7 native and 9 non-native) shared their scoring behavior through think-aloud protocols and interviews. The speech samples graded during the think-aloud included Arabic (n = 4), Chinese (n = 4), and Russian (n = 4) speakers. Raters' scores were examined with Multi-Faceted Rasch Measurement using the FACETS software (Linacre, 2014) to test group differences between native and non-native raters, as well as between raters who were familiar and unfamiliar with the accents of the students in the study. In addition, raters' comments were coded and used to explore rater group differences. The qualitative analyses involved thematic coding of transcribed think-aloud and interview sessions using content analysis (Strauss & Corbin, 1998) to investigate raters' cognitive processes and their perceptions of their rating processes. The coding included themes such as decision-making and re-listening patterns, perceived severity, criteria importance, and non-rubric criteria (e.g., accent familiarity, L1 match). Afterward, the quantitative and qualitative results were analyzed together to describe potential sources of rater variability, employing a side-by-side comparison of qualitative and quantitative data (Onwuegbuzie & Teddlie, 2003). The results revealed no radical differences between native and non-native raters; however, some differing patterns were observed: non-native raters showed more lenient grading towards students whose L1 matched their own. In addition, all raters, regardless of group, demonstrated several rating patterns depending on their focus while listening to examinees' performances and their interpretations of the rating criteria during the decision-making process.
The findings can motivate professionals who oversee and train raters at testing companies and intensive English programs to study their raters' scoring behaviors and individualize training, helping to make exam ratings fair and raters interchangeable. [The dissertation citations contained here are published with the permission of ProQuest LLC. Further reproduction is prohibited without permission. Copies of dissertations may be obtained by telephone: 800-521-0600. Web page:] (As Provided).
Notes: ProQuest LLC. 789 East Eisenhower Parkway, P.O. Box 1346, Ann Arbor, MI 48106. Tel: 800-521-0600; Web site:
Indexed by: ERIC (Education Resources Information Center), Washington, DC