Exploring the legal discovery and enterprise tracks at the University of Iowa

Author

Almquist, Brian; Ha-Thuc, Viet; Sehgal, Aditya K.; Arens, Robert; Srinivasan, Padmini

Date

2007

Keywords

information retrieval, rank aggregation

Abstract (summary)

In designing our own toolset for the TREC Legal Track, we opted to use the Lucene library of indexing and search tools. Lucene, developed in Java, is highly scalable and extensible, and indexing and searching the TREC-Legal collection proved well within its capabilities. We indexed the entire TREC collection, merging the document content and the title into a single field and using the Lucene StandardAnalyzer, which strips punctuation but recognizes and retains elements such as e-mail addresses. The StandardAnalyzer stoplist was used for indexing. For our explorations, we converted topic fields into term vectors for querying the collection. For each topic, our system returned a ranked list long enough to match either the number of documents retrieved by a reference Boolean query executed on behalf of the TREC 2006 evaluators or a set cap on the number of documents returned, whichever was greater.
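
The result-depth rule in the last sentence of the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names and the cap value used in the usage note are hypothetical, and the abstract does not state the actual cap.

```python
from typing import List, Sequence


def result_depth(boolean_count: int, cap: int) -> int:
    """Number of ranked documents to return for one topic.

    Follows the rule as worded in the abstract: the larger of the
    reference Boolean result count and the set cap.
    """
    return max(boolean_count, cap)


def truncate_ranking(ranked_doc_ids: Sequence[str],
                     boolean_count: int,
                     cap: int) -> List[str]:
    """Keep only the top of a ranked list, down to the computed depth.

    `ranked_doc_ids` is assumed to be sorted best-first already; if it is
    shorter than the computed depth, it is returned whole.
    """
    depth = result_depth(boolean_count, cap)
    return list(ranked_doc_ids)[:depth]
```

With an assumed cap of 1000, a topic whose reference Boolean query retrieved 120 documents would get a ranked list of 1000, while a topic whose Boolean query retrieved 5000 would get a ranked list of 5000.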

Publication Information

Almquist, B., Ha-Thuc, V., Sehgal, A.K., Arens, R. and Srinivasan, P. Exploring the Legal Discovery and Enterprise Tracks at the University of Iowa. Proceedings of the 16th Text REtrieval Conference (2007).

Item Type

Presentation

Language

English

Permalink

https://hdl.handle.net/20.500.14078/1653

Collections

Department of Decision Sciences

Files

Exploring_the_legal_discovery_and_enterprise-_2007_roam.pdf(390.89 KB)
