Bibliographic record - Detail view
Author | Johnson, Rusty |
---|---|
Title | Authorship Attribution with Function Word N-Grams |
Source | (2013), (192 pages); Ph.D. Dissertation, Nova Southeastern University |
Language | English |
Document type | print; online; monograph |
ISBN | 978-1-3031-4673-2 |
Keywords | University thesis; Dissertation; Authors; Word Order; Word Frequency; Accuracy; Probability; Classification; Scores; Computational Linguistics; Language Styles; Language Patterns; Thesis; Dissertations; Academic thesis; Author; Word analysis; Frequency; Probability calculus; Probability theory; Classification system; Linguistics; Language style; Language model; Language structure |
Abstract | Prior research has considered the sequential order of function words, after the contextual words of the text have been removed, as a stylistic indicator of authorship. This research describes an effort to enhance authorship attribution accuracy based on this same information source with alternate classifiers, alternate n-gram construction methods, and a genetically tuned configuration. The approach is original in that it is the first time that probabilistic versions of Burrows's Delta have been used. Instead of using z-scores as an input for a classifier, the z-scores were converted to probabilistic equivalents (since z-scores cannot be subtracted, added, or divided without the possibility of distorting their probabilistic meaning); this adaptation enhanced accuracy. Multiple versions of Burrows's Delta were evaluated; this includes a hybrid of the Probabilistic Burrows's Delta and the version proposed by Smith & Aldridge (2011); in this case accuracy was enhanced when individual frequent words were evaluated as indicators of style. Other novel aspects include alternate n-gram construction methods; a reconciliation process that allows texts of various lengths from different authors to be compared; and a GA selection process that determines which function (or frequent) words (see Smith & Rickards, 2008; see also Shaker, Corne, & Everson, 2007) may be used in the construction of function word n-grams. [The dissertation citations contained here are published with the permission of ProQuest LLC. Further reproduction is prohibited without permission. Copies of dissertations may be obtained by telephone: 1-800-521-0600. Web page: http://www.proquest.com/en-US/products/dissertations/individuals.shtml.] (As Provided). |
Notes | ProQuest LLC. 789 East Eisenhower Parkway, P.O. Box 1346, Ann Arbor, MI 48106. Tel: 800-521-0600; Web site: http://www.proquest.com/en-US/products/dissertations/individuals.shtml |
Recorded by | ERIC (Education Resources Information Center), Washington, DC |
Update | 2020/1/01 |
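
The abstract's core idea (drop the contextual words, build n-grams from the remaining function words, then compare texts with a Burrows's Delta style distance) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration and not the dissertation's implementation: the tiny `FUNCTION_WORDS` list is hypothetical (the dissertation selects its words with a GA), the classic Delta is taken as the mean absolute difference of z-scores, and the "probabilistic" variant simply maps each z-score through the standard normal CDF, since this record does not say how the conversion was actually done. The reconciliation process for texts of different lengths is not shown.

```python
import math
import re
from collections import Counter
from statistics import mean, pstdev

# Hypothetical, deliberately tiny function-word list (assumption for illustration).
FUNCTION_WORDS = {"the", "of", "and", "to", "a", "in", "that", "it", "is",
                  "was", "for", "with", "as", "but", "on", "not"}


def function_word_ngrams(text, n=2):
    """Remove contextual words, then build n-grams over the remaining function words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    kept = [t for t in tokens if t in FUNCTION_WORDS]
    return [tuple(kept[i:i + n]) for i in range(len(kept) - n + 1)]


def relative_freqs(text, n=2):
    """Relative frequency of each function-word n-gram in one text."""
    grams = function_word_ngrams(text, n)
    counts = Counter(grams)
    return {g: c / len(grams) for g, c in counts.items()}


def normal_cdf(z):
    """Standard normal CDF; used here as an assumed z-score-to-probability mapping."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def delta_scores(test_text, candidates, n=2, probabilistic=False):
    """Return {author: distance}; a smaller distance means a closer stylistic match."""
    profiles = {name: relative_freqs(t, n) for name, t in candidates.items()}
    test = relative_freqs(test_text, n)
    totals = {name: 0.0 for name in profiles}
    used = 0
    for gram in sorted(set().union(*profiles.values())):
        values = [p.get(gram, 0.0) for p in profiles.values()]
        mu, sigma = mean(values), pstdev(values)
        if sigma == 0:
            continue  # feature does not discriminate between candidates

        def score(freq):
            z = (freq - mu) / sigma
            return normal_cdf(z) if probabilistic else z

        s_test = score(test.get(gram, 0.0))
        for name, p in profiles.items():
            totals[name] += abs(s_test - score(p.get(gram, 0.0)))
        used += 1
    if used == 0:
        raise ValueError("no discriminating n-grams among the candidates")
    return {name: total / used for name, total in totals.items()}
```

A possible use: pass one known text per candidate author as `candidates`, call `delta_scores(disputed_text, candidates, probabilistic=True)`, and attribute the disputed text to the author with the smallest score. Population statistics over author profiles are one simple choice; a real study would estimate means and standard deviations from a larger reference corpus.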