Suche

Wo soll gesucht werden?
Erweiterte Literatursuche

Ariadne Pfad:

Inhalt

Literaturnachweis - Detailanzeige

 
Autor/inn/enOrganisciak, Peter; Acar, Selcuk; Dumas, Denis; Berthiaume, Kelly
TitelBeyond Semantic Distance: Automated Scoring of Divergent Thinking Greatly Improves with Large Language Models
Quelle(2023), (32 Seiten)
PDF als Volltext Verfügbarkeit 
ZusatzinformationORCID (Organisciak, Peter)
ORCID (Acar, Selcuk)
ORCID (Dumas, Denis)
ORCID (Berthiaume, Kelly)
Weitere Informationen
Spracheenglisch
Dokumenttypgedruckt; online; Monographie
SchlagwörterAutomation; Computer Assisted Testing; Scoring; Creative Thinking; Creativity Tests; Semantics; Prompting; Test Items; Alternative Assessment; Models; Natural Language Processing; Effect Size; Robustness (Statistics)
AbstractAutomated scoring for divergent thinking (DT) seeks to overcome a key obstacle to creativity measurement: the effort, cost, and reliability of scoring open-ended tests. For a common test of DT, the Alternate Uses Task (AUT), the primary automated approach casts the problem as a semantic distance between a prompt and the resulting idea in a text model. This work presents an alternative approach that greatly surpasses the performance of the best existing semantic distance approaches. Our system, "Ocsai," fine-tunes deep neural network-based large-language models (LLMs) on human-judged responses. Trained and evaluated against one of the largest collections of human-judged AUT responses, with 27 thousand responses collected from nine past studies, our fine-tuned large-language-models achieved up to r = 0.81 correlation with human raters, greatly surpassing current systems (r = 0.12-0.26). Further, learning transfers well to new test items and the approach is still robust with small numbers of training labels. We also compare prompt-based zero-shot and few-shot approaches, using GPT-3, ChatGPT, and GPT-4. This work also suggests a limit to the underlying assumptions of the semantic distance model, showing that a purely semantic approach that uses the stronger language representation of LLMs, while still improving on existing systems, does not achieve comparable improvements to our fine-tuned system. The increase in performance can support stronger applications and interventions in DT and opens the space of automated DT scoring to new areas for improving and understanding this branch of methods. [This paper was published in "Thinking Skills and Creativity" v49 Article 101356 2023.] (As Provided).
Erfasst vonERIC (Education Resources Information Center), Washington, DC
Update2024/1/01
Literaturbeschaffung und Bestandsnachweise in Bibliotheken prüfen
 

Standortunabhängige Dienste
Da keine ISBN zur Verfügung steht, konnte leider kein (weiterer) URL generiert werden.
Bitte rufen Sie die Eingabemaske des Karlsruher Virtuellen Katalogs (KVK) auf
Dort haben Sie die Möglichkeit, in zahlreichen Bibliothekskatalogen selbst zu recherchieren.
Tipps zum Auffinden elektronischer Volltexte im Video-Tutorial

Trefferlisten Einstellungen

Permalink als QR-Code

Permalink als QR-Code

Inhalt auf sozialen Plattformen teilen (nur vorhanden, wenn Javascript eingeschaltet ist)

Teile diese Seite: