Researching and building IR applications using Terrier
Terrier is a flexible open-source platform for developing Information Retrieval (IR) applications and for performing IR research. It has a growing user community and an active discussion forum, which keep the platform lively and its quality continually improving. The platform is well suited to researchers who want to explore the field and the existing retrieval models, knowing that it can easily be extended later to support their own research. Terrier also provides a comprehensive teaching platform for lecturers running undergraduate and postgraduate information retrieval courses.
2. The main design of an IR system
Terrier comes with Divergence From Randomness (DFR) weighting models, as well as other popular weighting models such as TF-IDF, Okapi's BM25 and language modelling. We will introduce the general idea behind a weighting model and explain the DFR approach to weighting terms. We then show that the same DFR idea extends naturally to Query Expansion. In each case, the weighting model is explained and its implementation within Terrier is illustrated.
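To make the general idea of a weighting model concrete, the following minimal sketch scores one query term in one document using the textbook form of Okapi BM25. It is self-contained and does not use Terrier's classes; the class name Bm25Sketch, the method names and the parameter values k1 = 1.2 and b = 0.75 are assumptions chosen for illustration, not Terrier's own implementation.

```java
// Illustrative sketch only: textbook Okapi BM25 for a single term/document pair.
// Class, method names and parameter values are hypothetical, not Terrier's API.
class Bm25Sketch {
    static final double K1 = 1.2;  // term-frequency saturation (assumed common default)
    static final double B = 0.75;  // strength of document-length normalisation (assumed)

    // Robertson-Sparck Jones style IDF: rarer terms receive a higher weight.
    static double idf(long numDocs, long docFreq) {
        return Math.log((numDocs - docFreq + 0.5) / (docFreq + 0.5));
    }

    // Score of one term in one document.
    static double score(double tf, double docLen, double avgDocLen,
                        long numDocs, long docFreq) {
        // Documents longer than average are penalised, shorter ones are boosted.
        double lengthNorm = 1.0 - B + B * (docLen / avgDocLen);
        // The tf component grows with tf but saturates as tf becomes large.
        double tfPart = tf * (K1 + 1.0) / (tf + K1 * lengthNorm);
        return idf(numDocs, docFreq) * tfPart;
    }

    public static void main(String[] args) {
        // A term occurring 3 times in a 120-token document, in a collection of
        // 1000 documents (average length 100) where 10 documents contain the term.
        System.out.println(score(3, 120, 100, 1000, 10));
    }
}
```

A document's score for a query is then the sum of these per-term scores over the query terms. Other weighting models keep the same overall shape, a function of within-document and collection statistics, and differ in how those statistics are combined.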
3. Experimenting and Researching with Terrier
Terrier supports experimentation on many standard test collections, with off-the-shelf support for TREC experiments. In addition, Terrier is a flexible platform that makes it easy to implement your own research ideas, giving researchers a rapid path from idea development to experimentation.
In this part of the tutorial, we will focus on how to implement your own idea or method to facilitate your research. For instance, we will show with examples how to extract text from your own collection of documents, and how to determine the most informative terms in a set of documents. In particular, we provide an overview of how to implement current state-of-the-art applications, such as opinion-finding retrieval (cf. the TREC Blog track), document prior integration (cf. Web IR) and others.
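As one possible illustration of determining the most informative terms in a set of documents, the sketch below ranks terms by a Kullback-Leibler style weight, which compares a term's relative frequency in the document set with its relative frequency in the whole collection. This is only one of several possible term-weighting choices. The class InformativeTermsSketch, its methods and the toy frequencies are hypothetical; the code uses plain Java collections rather than Terrier's index structures.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Illustrative sketch only: rank the terms of a document set by how much more
// likely they are in the set than in the collection. Names are hypothetical.
class InformativeTermsSketch {

    // KL-style weight: P(t|set) * log2( P(t|set) / P(t|collection) ).
    static double klWeight(long tfSet, long setLen, long tfColl, long collLen) {
        double pSet = (double) tfSet / setLen;
        double pColl = (double) tfColl / collLen;
        return pSet * (Math.log(pSet / pColl) / Math.log(2));
    }

    // Returns the k highest-weighted terms, best first.
    static List<String> topTerms(Map<String, Long> setFreq,
                                 Map<String, Long> collFreq,
                                 long collLen, int k) {
        // Total number of tokens in the document set.
        long setLen = setFreq.values().stream().mapToLong(Long::longValue).sum();
        return setFreq.entrySet().stream()
                // Skip terms with no collection statistics (avoids division by zero).
                .filter(e -> collFreq.getOrDefault(e.getKey(), 0L) > 0)
                // Negate the weight so that the sort is in descending order.
                .sorted(Comparator.comparingDouble((Map.Entry<String, Long> e) ->
                        -klWeight(e.getValue(), setLen, collFreq.get(e.getKey()), collLen)))
                .limit(k)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Toy statistics, invented for illustration.
        Map<String, Long> setFreq =
                Map.of("the", 40L, "of", 25L, "terrier", 20L, "retrieval", 15L);
        Map<String, Long> collFreq =
                Map.of("the", 350_000L, "of", 300_000L, "terrier", 50L, "retrieval", 2_000L);
        System.out.println(topTerms(setFreq, collFreq, 1_000_000L, 2));
    }
}
```

Terms that are common everywhere, such as stopwords, have a ratio close to 1 and therefore a weight close to zero, while terms concentrated in the document set rise to the top of the ranking.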
4. Course Materials
Handouts containing slides, a Terrier "crib sheet", and detailed examples of implementations of common research problems will be provided, in addition to a bibliography of informative related papers.