Browsing by Author "Almquist, Brian"
- Electronic evidence and technology-assisted review (2019)
  Almquist, Brian
  We consider the technological developments of the last two decades that are relevant to the procedures characterized as “discovery” in legal proceedings. As storage technology has improved, the idea of a “document” as a discrete piece of discoverable information has become increasingly outdated. The courts, while adapting to the new technologies, have also had to reckon with the increasing cost of document production and, more expensively, of reviewing information for relevance and for privilege or work-product protection. Technology-Assisted Review, which uses machine learning, is rapidly being adopted to handle large document productions. This brings eDiscovery in line with some of the most interesting current developments in information technology.
- Exploring the legal discovery and enterprise tracks at the University of Iowa (2007)
  Almquist, Brian; Ha-Thuc, Viet; Sehgal, Aditya K.; Arens, Robert; Srinivasan, Padmini
  In designing our own toolset for the TREC Legal Track, we opted to use the Lucene library of indexing and search tools. Lucene, developed in Java, is highly scalable and extensible. Indexing and searching the TREC-Legal collection proved well within Lucene’s capabilities. We indexed the entire TREC collection, merging the document content and the title into a single field and using the Lucene StandardAnalyzer, which strips punctuation but recognizes and retains elements such as e-mail addresses. The StandardAnalyzer stoplist was used for indexing. For our explorations, we converted topic fields into term vectors for querying the collection. For each topic, our system returned a ranked set of results sized to the greater of two quantities: the number of documents retrieved by a reference Boolean query executed on behalf of the TREC 2006 evaluators, or a set cap on the number of documents returned.
- MacEwan University at the TREC 2020 Fair Ranking Track (2020)
  Almquist, Brian
  The MacEwan University School of Business submitted two runs for the TREC 2020 Fair Ranking Track. For this task, we indexed the document abstracts and the associated metadata from the provided Semantic Scholar dataset into a single Solr node using a standard Tokenizer chain. For each of the evaluation queries, we executed a query for each of the documents that required reranking. For each query-document pairing, we collected the BM25 similarity score for the “paperAbstract” and “title” fields. Each document is ranked for each field based on its similarity score with the query text, with ties sharing their combined rank. This resulted in two ranked lists for each query.
- Refining ranked retrieval results for legal discovery search through supervised rank aggregation (2013)
  Almquist, Brian; Srinivasan, Padmini
  We propose and evaluate a data mining system that uses a set of document features describing each document in the context of partially evaluated ranked results. We find our system to be competitive with existing metasearch ranking strategies for prioritizing the review of evidence for legal relevance.
- The University of Iowa at TREC 2008 legal and relevance feedback tracks (2008)
  Almquist, Brian; Mejova, Yelena; Ha-Thuc, Viet; Srinivasan, Padmini
  This is the second year that our research group has participated in the TREC Legal Track. Our ad hoc retrieval system has been modified to extract the additional Boolean query fields added to the 2008 topics, and to privilege documents found by the Boolean reference run when conducting our queries. We have also submitted runs that fuse the results from existing runs. For the relevance feedback task, our system uses ranking information of relevant and non-relevant documents from previously submitted runs to the TREC Legal Track to train a classifier. The classifier is applied to the remaining unjudged documents to create a new ranked list. This approach is applied to sets of input runs, including a hybrid run where a classifier trained on one set of runs is applied to the unjudged documents from another set of runs.
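
The TREC 2020 Fair Ranking Track entry above describes ranking documents per field by BM25 score, "with ties sharing their combined rank". The abstract does not spell out the tie rule, so the following is only a minimal sketch under one assumed reading: tied documents each receive the mean of the positions they jointly occupy. The function name, document IDs, and scores are hypothetical and are not taken from the paper.

```python
from typing import Dict


def ranks_with_shared_ties(scores: Dict[str, float]) -> Dict[str, float]:
    """Rank documents by descending score.

    Assumption (not confirmed by the abstract): documents with equal
    scores share the mean of the rank positions they occupy.
    """
    ordered = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    ranks: Dict[str, float] = {}
    i = 0
    while i < len(ordered):
        # Extend j to the last document whose score equals ordered[i]'s.
        j = i
        while j + 1 < len(ordered) and ordered[j + 1][1] == ordered[i][1]:
            j += 1
        # Positions are 1-based: the tie group occupies i+1 .. j+1.
        shared_rank = ((i + 1) + (j + 1)) / 2
        for k in range(i, j + 1):
            ranks[ordered[k][0]] = shared_rank
        i = j + 1
    return ranks


# Hypothetical BM25 scores for one query, one dictionary per field.
abstract_scores = {"d1": 3.2, "d2": 1.5, "d3": 3.2, "d4": 0.7}
title_scores = {"d1": 0.0, "d2": 2.1, "d3": 0.4, "d4": 0.0}

# Two ranked lists per query, one for each field.
abstract_ranks = ranks_with_shared_ties(abstract_scores)
title_ranks = ranks_with_shared_ties(title_scores)
```

Here `d1` and `d3` tie on the abstract field and share rank 1.5, while `d1` and `d4` tie on the title field and share rank 3.5. Other tie conventions (for example, giving every tied document the best position) would fit the same interface.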
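
The TREC 2008 relevance feedback entry above describes training a classifier on the ranking information of judged documents in previously submitted runs, then applying it to the unjudged documents. The abstract names neither the features nor the classifier, so the sketch below swaps in assumptions of its own: reciprocal rank in each input run as the features, and a simple centroid-difference linear scorer in place of whatever classifier the authors used. All names and data are hypothetical.

```python
from typing import Dict, List


def rank_features(doc_id: str, runs: List[List[str]]) -> List[float]:
    """One feature per input run: reciprocal rank, or 0.0 if the run
    did not retrieve the document (assumed feature design)."""
    return [1.0 / (run.index(doc_id) + 1) if doc_id in run else 0.0
            for run in runs]


def train_weights(judged: Dict[str, bool], runs: List[List[str]]) -> List[float]:
    """Stand-in classifier: weight each run by how much higher it ranks
    judged-relevant documents than judged-non-relevant ones.
    Requires at least one relevant and one non-relevant judgment."""
    relevant = [rank_features(d, runs) for d, rel in judged.items() if rel]
    non_relevant = [rank_features(d, runs) for d, rel in judged.items() if not rel]

    def mean(rows: List[List[float]], i: int) -> float:
        return sum(row[i] for row in rows) / len(rows)

    return [mean(relevant, i) - mean(non_relevant, i) for i in range(len(runs))]


def rerank_unjudged(runs: List[List[str]], judged: Dict[str, bool]) -> List[str]:
    """Score every unjudged document in the pooled runs and sort by
    descending score (document ID breaks ties deterministically)."""
    weights = train_weights(judged, runs)
    pool = {d for run in runs for d in run if d not in judged}

    def score(doc_id: str) -> float:
        return sum(w * f for w, f in zip(weights, rank_features(doc_id, runs)))

    return sorted(pool, key=lambda d: (-score(d), d))


# Hypothetical input runs (best-first) and relevance judgments.
runs = [["a", "b", "c", "d"], ["c", "a", "e", "b"]]
judged = {"a": True, "b": False}
new_ranking = rerank_unjudged(runs, judged)
```

The hybrid run mentioned in the abstract would correspond to calling `train_weights` on one set of runs and scoring the unjudged documents of another set; that only makes sense in this sketch if the two sets are aligned so that each feature position means the same thing in both.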