30th March - 3rd April 2008, Glasgow, Scotland
Industry Day

ECIR 2008 will be followed by a special day on Thursday 3rd April 2008 devoted to the interests and needs of Information Retrieval practitioners. The Industry Day focuses on designing and developing practical solutions for information retrieval products and services, and aims to build bridges between IR specialists in industry and academia. This forum presents an opportunity for commercial organisations and individuals to share their work with a wider audience, and for researchers to learn more about the issues and problems faced by IR practitioners in developing practical solutions for the information search and retrieval industry.

While this is the first time an industry day has been held as an intrinsic part of the ECIR programme and in the same venue as the main conference, the previous editions of the BCS IRSG industry day held in London (see [1] and [2]) were very successful.

Registration
Thursday 3rd April 2008 from 08:30 in Sir Alwyn Williams (SAW) Building (D16 on campus map).
 
Venue:
The Industry Day will take place at the Department of Computing Science, Sir Alwyn Williams (SAW) Building - Level 5 (D16 on campus map).
 
Schedule:
08:50 - 09:00 Welcome
 
09:00 - 10:30 Session: Language
Chair: John Tait
Trying to Find Answers to Complex Questions -- Hugo Zaragoza (Yahoo! Research) [Slides]
Guided Summarization -- Daniel Tunkelang (Endeca) [Slides]
What can Language Technologies do for Information Retrieval? -- Antonio Valderrábanos (Bitext) [Slides]
 
10:30 - 11:00 Coffee break
 
11:00 - 12:30 Session: Search
Chair: Elizabeth Liddy
A Brief Tour of "Query Space" -- Nick Craswell (Microsoft Live Search) [Slides]
Innovation in search -- Mihai Stroe (Google)
The Challenge of Engineering Vertical Search -- Jeffery Dalton (GlobalSpec) [Slides]
 
14:00 - 15:30 Session: Enterprise & Business
Chair: Alex Bailey
The perfect storm - IR and Business -- Theo Huibers (Thaesis / University of Twente) [Slides]
Supporting enterprise search with open source tools -- Richard Boulton (Lemur Consulting Limited)
The Information Retrieval Facility and its Role in Professional Information Research -- Francisco Webber (Matrixware) [Slides]
 
15:30 - 16:00 Coffee break
 
16:00 - 17:30 Session: Society
Chair: Fidel Cacheda
Great Expectations or Mass Extinction? (Public Libraries in an informed society) -- Friso Visser (Bibliotheek.nl / Netherlands Public Library Association) [Slides]
How to get from "is that them?" to "who is that?" -- Iain Drummond (Memex Technology Limited) [Slides]
How Semantics Changes Information Retrieval -- Antonio Linari (Expert System) [Slides]
 
17:30 - 17:45 Close
 

Title: Trying to Find Answers to Complex Questions

Speaker:
Hugo Zaragoza, Researcher, Yahoo! Research

Abstract:
Web 2.0 applications are generating a wealth of "user generated content" of different forms. Such content provides unique opportunities for research in information retrieval, natural language processing and machine learning. At the same time, using this content effectively for research poses some challenges. In my talk I will discuss ongoing work at Yahoo! Research Barcelona on two such collections: Wikipedia and Yahoo! Answers. (1) Wikipedia publishes explicitly encoded semantic information (infoboxes) in parallel with natural language text (the entry itself). We are trying to use this to learn better semantic taggers and to retrieve entities and relations between them. (2) Yahoo! Answers is a site where users ask, answer and read each other's questions; we will show initial experiments that use this data to learn complex retrieval functions capable of exploiting rich linguistic features of text.

Slides:
[pdf]


Title: Guided Summarization

Speaker:
Daniel Tunkelang, Chief Scientist, Endeca

Abstract:
A recent trend in the information retrieval community is a focus on exploratory search, often described as interactive or human-computer information retrieval. The central theme of these efforts is that information access systems need to move past the traditional approach of optimizing the one-time batch retrieval process of ranking results based on their estimated relevance to a query. Instead, the goal is to optimize the communication between the user and the system in the context of an iterative dialog. Endeca's information access platform enables this dialog with the data through Guided Summarization, a set-oriented retrieval approach that responds to user queries with both an overview of the user's current context and an organized set of options for incremental exploration. In this talk, we will demonstrate Endeca's technology through examples that show the breadth with which we can apply this technique in enterprise settings.

Slides:
[pdf]


Title: What can Language Technologies do for Information Retrieval?

Speaker:
Antonio Valderrábanos, CEO and Founder, Bitext

Abstract:
Traditional search engines handle language according to its form, ignoring its content and context. People, by contrast, use words because of their content and within a particular context. As a result, the linguistic properties of text are not exploited in information retrieval and a gap is left open between users and search engines. The talk will describe what language technologies can do to make search engines more efficient and easier to use for the general public. These ideas will be presented in action with a demo that integrates Live, Microsoft's web search engine, and NaturalFinder, a complement developed by Bitext for any search engine.

Slides:
[pdf]


Title: A Brief Tour of "Query Space"

Speaker:
Nick Craswell, Applied Researcher, Microsoft Live Search

Abstract:
Modelling relationships between queries is an area of growing interest in Information Retrieval. A user who types 'dog toy' is probably looking for something similar to a user who typed 'dog toys' or 'toys for dogs', whereas a user who types 'toy dog' probably wants something more like 'toy dogs' and 'toy dog breeds'. How can we detect these relationships between queries? Information retrieval practitioners have an advantage in this area, because they have easier access to query/click logs. I will illustrate some log-based techniques for finding relationships between queries, based on how often queries are typed, and what other queries and clicks tend to follow. Such techniques may be used for selective query alteration (for example selective stemming), query rewriting and query suggestion. I will give examples based on Live Search logs.

Slides:
[pdf]


Title: Innovation in search

Speaker:
Mihai Stroe, Senior Engineer, Google

Abstract:
The problem of helping internet users find the information they are looking for presents challenges in Information Retrieval and related fields - Distributed Systems, User Interface Design etc. As one of the major search engines, Google plays an important role in connecting users to relevant information. Innovation is a driving force for many improvements in this area. In this talk we look at how Google encourages and supports innovation. We present our general principles and a brief overview of some of our major successes. We then describe the process of transforming an initial idea into a launched project that helps millions of users per day. As a case study, we use our URL correction project, based on our research on approximate string matching. We are using this technology to improve web navigation, search engine results for URL queries, and user experience. This can also be applied in other environments such as enterprise search.


Title: The Challenge of Engineering Vertical Search

Speaker:
Jeffery Dalton, Engineer, GlobalSpec

Abstract:
Topic-specific search engines leverage structure from deep domain knowledge to provide better ranking with more powerful search capability than a general search engine. However, our experience at GlobalSpec is that realizing this vision is quite difficult. In this talk I will use GlobalSpec's search systems as a model and outline some of the challenges that make topic-specific search hard. I will also talk briefly about our experiences using open source search technology. Finally, I will explore challenging problems for future research and opportunities for academic-industry collaboration in vertical search.

Slides:
[pdf]


Title: The perfect storm - IR and Business

Speaker:
Theo Huibers, Managing Partner, Thaesis / University of Twente

Abstract:
In the past few years, the application of Information Retrieval technology in businesses has broadened significantly. Nowadays, basic search functionality is not only used in workflow and customer interaction; new business services and portal components are also developed using search technology. This anchors search technology in the commercial heart of many information service providers. The talk homes in on all these changes.

Slides:
[pdf]


Title: Supporting enterprise search with open source tools

Speaker:
Richard Boulton, Senior Engineer, Lemur Consulting Limited

Abstract:
Over the last decade, many open source technologies for information retrieval have been developed. Several of these have now matured to the extent that production, customer-facing, and business-critical enterprise search needs can often be satisfied using purely open source technologies. We discuss some of the available software and the advantages brought to search projects by using the available open software platforms, with reference to case studies from the e-commerce and publishing arenas.


Title: The Information Retrieval Facility and its Role in Professional Information Research

Speaker:
Francisco Webber, CEO, Matrixware

Abstract:
The IRF represents the beginning of a dialogue between the Information Retrieval (IR) community and the community of information searching professionals working in industry, with a special focus on Intellectual Property Documentation (IP). For the IR community, the IP data in the form of 20 million full-text documents structured in XML is very attractive. In addition, the IP environment has real users, with real tasks and real work situations, who can help evaluate IR research by offering high quality relevance judgements. The IP industry is attracted by methods and tools available through IR research that can be introduced to improve its efficiency. IP research is increasingly an area that can attract funding both from the industrial IP members of the IRF and from governments, who increasingly identify the importance of IP to their national economic interest. Out of the IRF workshops at the last symposium have come a number of initiatives for further elaboration. The work and research themes of the IRF are practically implemented on the available Semantic Supercomputing infrastructure, assisted by the provided support and scientific project management. Access to the experimental environment will be gained through direct access, web interfaces and an Eclipse-based rich client environment.

Slides:
[pdf]


Title: Great Expectations or Mass Extinction? (Public Libraries in an informed society)

Speaker:
Friso Visser, Manager, Bibliotheek.nl / Netherlands Public Library Association

Abstract:
Consumption of traditional media (books, papers, CDs) is decreasing rapidly. Public Libraries, as loan-industries, will lose their significance in society. Search engines and online information have replaced the portfolio of printed non-fiction materials and the catalogue as interface to the information stored within these materials. Online communities create their own reliable resources. Adding up these facts, the extinction of the public library seems inevitable. But maybe there is another role for libraries to play? That has to do with resource discovery, maybe also with trust, selectiveness, facilitating communities and adding value to products and services. The Dutch Public Libraries have taken some joint initiatives. By developing a common repository and search portal, the notion of ‘information brokerage’ and serving communities is introduced more widely. This talk addresses the following questions: What does this look like? And are these the Public Libraries’ Great Expectations?

Slides:
[pdf]


Title: How to get from "is that them?" to "who is that?"

Speaker:
Iain Drummond, Director of Solution Strategy, Memex Technology Limited

Abstract:
Memex have over a quarter of a century of experience providing IR solutions to the intelligence community and would like to share with you what we perceive as some irritating "real world" issues we run into on a regular basis, and how these relate to IR. Memex has provided software to organisations as diverse as umbrella groups for CD piracy, international insurance company fraud departments, financial institutions, some well-known police forces and military specialists. Brutal summarisation of their mission statements ends up with a single common thread - reduction (or elimination) of threats to their organisation. You can only do this if you can establish, ideally rather quickly, who threats have been in the past, who they are now and who they might be in the future. But that is not easy, especially when you look at some of the lessons learned post 9/11 and 7/7.

Slides:
[pdf]


Title: How Semantics Changes Information Retrieval

Speaker:
Antonio Linari, CTO for Semantic Intelligence Division, Expert System

Abstract:
Nowadays there are two different approaches to Information Retrieval: the linguistic and the semantic one. Google has cut Natural Language into keyword slices, changing our habits, while Semantic Search helps people without modifying their habits. The talk gives a brief overview of the new "way of searching" and the "old fashion" of Natural Language Processing, and explains how semantics enables businesses to leverage their knowledge wealth through intelligent Text Mining, Categorization and Data Fusion.

Slides:
[pdf]