ITL’s Annual Information Retrieval Conference Draws Strong International Participation

The Information Access Division of ITL held the eleventh Text REtrieval Conference (TREC 2002) on November 19-22, 2002, at NIST in Gaithersburg, drawing strong international participation. The 90 participating groups represented 21 different countries. Approximately two-thirds of all participants and half of the speakers invited to present their work in the plenary sessions are based in countries other than the United States. The conference also included reports from retrieval evaluation efforts modeled after TREC that have been held in Europe and Japan.

TREC is a series of evaluation workshops designed to foster research on technologies for information retrieval. Participants produce retrieval results for one or more focus areas, called tracks, before the workshop, then meet at the workshop to discuss the results. TREC 2002 contained seven tracks. Six covered question answering, content-based access to digital video, cross-language document retrieval, document filtering, interactive retrieval, and retrieval of web documents; the seventh, a new novelty track, focused on the problem of eliminating redundant information from the retrieved set.

The web track used a new document collection specifically created for TREC. This collection is based on the results of a web crawl of the .gov domain in January 2002, which simulated the type of crawl that might be used by an actual .gov search service: breadth first, stopping after the first million HTML pages, and including an additional 250,000 non-HTML pages (such as images and PDF files). Documents contain the information returned by the HTTP daemon as well as the page content. The collection is now available to the public through CSIRO, the Australian government research organization that created the collection.
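
For readers unfamiliar with the crawl strategy described above, the following is a minimal Python sketch of a bounded breadth-first crawl. It is an illustration only, not the software used to build the collection. The function name, the `fetch` callable, the toy link graph, and the rule for capping non-HTML pages are all assumptions made for the example, and the limits are scaled down from the one million and 250,000 figures.

```python
from collections import deque


def bounded_bfs_crawl(seeds, fetch, html_limit, other_limit):
    """Crawl breadth first until html_limit HTML pages have been collected.

    fetch(url) is a hypothetical callable returning (is_html, outlinks).
    Non-HTML pages are kept until other_limit is reached (an assumed rule).
    """
    queue = deque(dict.fromkeys(seeds))  # FIFO queue gives breadth-first order
    seen = set(queue)
    html_pages, other_pages = [], []
    while queue and len(html_pages) < html_limit:
        url = queue.popleft()
        is_html, outlinks = fetch(url)
        if is_html:
            html_pages.append(url)
            for link in outlinks:
                if link not in seen:  # never enqueue the same URL twice
                    seen.add(link)
                    queue.append(link)
        elif len(other_pages) < other_limit:
            other_pages.append(url)
    return html_pages, other_pages


# Toy in-memory "web" (made-up URLs) standing in for real HTTP fetches.
TOY_WEB = {
    "example.gov/": (True, ["example.gov/x", "example.gov/report.pdf", "example.gov/y"]),
    "example.gov/x": (True, ["example.gov/z", "example.gov/logo.gif"]),
    "example.gov/y": (True, ["example.gov/"]),
    "example.gov/z": (True, []),
    "example.gov/report.pdf": (False, []),
    "example.gov/logo.gif": (False, []),
}

html, other = bounded_bfs_crawl(
    ["example.gov/"], TOY_WEB.__getitem__, html_limit=3, other_limit=1
)
print(html)
print(other)
```

Because the queue is first-in, first-out, pages closer to the seed are visited before deeper ones, so the HTML limit cuts the crawl off by depth rather than by site.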

Part of each TREC meeting is spent planning the tasks for future years. The set of tracks to be offered in TREC 2003 will be quite different from the TREC 2002 set. The video retrieval track will be spun off into its own evaluation series with a workshop that will meet immediately before TREC. TREC itself will include new tracks in bioinformatics (specifically, supporting genomic research; see the Genomics Pre-Track Information link on the TREC website), exploiting user data for personalized retrieval, and reducing the variability of retrieval performance across different search requests. The TREC website is http://trec.nist.gov.

CONTACT: Ellen Voorhees, ext. 3761