Bibliographic Record - Detail View
Author | Londhe, Nikhil |
---|---|
Title | A Bootstrapped Approach to Multilingual Text Stream Parsing |
Source | (2017), (146 pages)
Ph.D. Dissertation, State University of New York at Buffalo |
Language | English |
Document type | print; online; monograph |
ISBN | 978-0-3553-1016-0 |
Keywords | University thesis; Dissertation; Multilingualism; Social Media; Social Networks; Grammar; Language Styles; Code Switching (Language); Computer Mediated Communication; Computer conferencing; Spelling; Language Usage; Visual Aids; Nonverbal Communication; Language Processing; Discourse Analysis; Computational Linguistics Thesis; Dissertations; Academic thesis; Linguistics |
Abstract | The ubiquitous hashtag has disruptively transformed how news stories are reported and shared across social media networks. Often, such text streams are massively multilingual, with 50 different languages on average, and contain a combination of subjective user opinion, objective evolving information about the story, and unrelated spam. This is in addition to the usual challenges of processing social media content, such as lack of grammar, stylized spellings, and the use of slang, emojis, and emoticons. Further, language-dense regions frequently exhibit code switching and code mixing, where users switch between languages in a single post with or without retaining a single writing system. So far, most research on parsing such streams has resorted to piecemeal and language-specific approaches. As part of this work, we propose a processing pipeline with two salient features. First, we show how the topical and temporal relationships between the posts can be utilized for language-agnostic discourse interpretation. Second, we show how bootstrapping for incremental parsing can lead to improved system performance and propose an end-to-end pipeline to that effect. We explore how this pipeline can be utilized for two sample use cases: question answering and summarization. [The dissertation citations contained here are published with the permission of ProQuest LLC. Further reproduction is prohibited without permission. Copies of dissertations may be obtained by telephone: 1-800-521-0600. Web page: http://www.proquest.com/en-US/products/dissertations/individuals.shtml.] (As Provided). |
Notes | ProQuest LLC. 789 East Eisenhower Parkway, P.O. Box 1346, Ann Arbor, MI 48106. Tel: 800-521-0600; Web site: http://www.proquest.com/en-US/products/dissertations/individuals.shtml |
Recorded by | ERIC (Education Resources Information Center), Washington, DC |
Update | 2020/01/01 |