Literature Record - Detailed View

 
Author: Almquist, Brian Alan
Title: Mining for Evidence in Enterprise Corpora
Source: (2011), 152 pages
Ph.D. Dissertation, The University of Iowa
Language: English
Document type: print; online; monograph
ISBN: 978-1-1247-4079-9
Keywords: University thesis; Dissertation; Information Retrieval; Search Strategies; Online Searching; Active Learning; Evidence; Feedback (Response); Costs; Heuristics; Information Science
Abstract: The primary research aim of this dissertation is to identify the strategies that best meet the information retrieval needs as expressed in the "e-discovery" scenario. This task calls for a high-recall system that, in response to a request for all available relevant documents to a legal complaint, effectively prioritizes documents from an enterprise document collection in order of likelihood of relevance. High recall information retrieval strategies, such as those employed for e-discovery and patent or medical literature searches, reflect high costs when relevant documents are missed, but they also carry high document review costs. Our approaches parallel the evaluation opportunities afforded by the TREC Legal Track. Within the ad hoc framework, we propose an approach that includes query field selection, techniques for mitigating OCR error, term weighting strategies, query language reduction, pseudo-relevance feedback using document metadata and terms extracted from documents, merging result sets, and biasing results to favor documents responsive to lawyer-negotiated queries. We conduct several experiments to identify effective parameters for each of these strategies. Within the relevance feedback framework, we use an active learning approach informed by signals from collected prior relevance judgments and ranking data. We train a classifier to prioritize the unjudged documents retrieved using different ad hoc information retrieval techniques applied to the same topic. We demonstrate significant improvements over heuristic rank aggregation strategies when choosing from a relatively small pool of documents. With a larger pool of documents, we validate the effectiveness of the merging strategy as a means to increase recall, but find that sparseness of judgment data prevents effective ranking by the classifier-based ranker. We conclude our research by optimizing the classifier-based ranker and applying it to other high recall datasets.
Our concluding experiments consider the potential benefits to be derived by modifying the merged runs using methods derived from social choice models. We find that this technique, Local Kemenization, is hampered by the large number of documents and the minimal number of contributing result sets to the ranked list. This two-stage approach to high-recall information retrieval tasks continues to offer a rich set of research questions for future research. [The dissertation citations contained here are published with the permission of ProQuest LLC. Further reproduction is prohibited without permission. Copies of dissertations may be obtained by telephone at 1-800-521-0600. Web page: http://www.proquest.com/en-US/products/dissertations/individuals.shtml.] (As Provided).
Notes: ProQuest LLC. 789 East Eisenhower Parkway, P.O. Box 1346, Ann Arbor, MI 48106. Tel: 800-521-0600; Web site: http://www.proquest.com/en-US/products/dissertations/individuals.shtml
Recorded by: ERIC (Education Resources Information Center), Washington, DC
Updated: 2017/4/10