ITL’s Annual Information Retrieval Conference
Draws Strong International Participation
The Information Access Division of ITL held the eleventh Text
REtrieval Conference (TREC 2002) on November 19-22, 2002, at NIST in
Gaithersburg, drawing strong international participation. The 90
participating groups represented 21 different countries. Approximately
two-thirds of all participants and half of the speakers invited to present
their work in the plenary sessions are based in countries other than the United
States. The conference also included reports from retrieval evaluation efforts
modeled after TREC that have been held in Europe and Japan.
TREC is a series of evaluation workshops
designed to foster research on technologies for information retrieval. Participants
produce retrieval results for one or more focus areas, called tracks, prior to
the workshop, then meet during the workshop to discuss the results. TREC 2002 contained seven tracks, including
question answering, content-based access to digital video, cross-language
document retrieval, document filtering, interactive retrieval, and retrieval of
web documents. A new novelty track focused on the problem of eliminating
redundant information from the retrieved set.
The web track used a new document collection
specifically created for TREC. This collection is based on the results of a
web crawl of the .gov domain in January 2002. The crawl simulated the type that might be used by an actual .gov
search service: breadth first, stopping after the first million HTML pages, and
including an additional 250,000 non-HTML pages (such as images and .pdf files).
Documents contain the information returned by the HTTP daemon as well as the
page content. The collection is now available to the public through CSIRO, the Australian
government research organization that created the collection.
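The crawl strategy described above (breadth first, with a cap on HTML pages and a separate allowance for non-HTML pages) can be illustrated with a minimal sketch. This is not the crawler CSIRO used. The function name, its parameters, the toy link graph, and the rule for deciding which non-HTML pages to keep are all assumptions made for illustration, and the sketch runs on an in-memory graph rather than the live web.

```python
from collections import deque


def bfs_crawl(seeds, get_links, is_html, max_html, max_other):
    """Visit pages breadth first, stopping once max_html HTML pages are kept.

    Hypothetical helper: get_links(url) returns outgoing links and
    is_html(url) classifies a page. Non-HTML pages are kept in the order
    they are reached, up to max_other (an assumed selection rule).
    """
    queue = deque(seeds)
    seen = set(seeds)
    html_pages, other_pages = [], []
    while queue and len(html_pages) < max_html:
        url = queue.popleft()
        if is_html(url):
            html_pages.append(url)
            # Only HTML pages contribute new links to the frontier.
            for link in get_links(url):
                if link not in seen:
                    seen.add(link)
                    queue.append(link)
        elif len(other_pages) < max_other:
            other_pages.append(url)
    return html_pages, other_pages


# Toy link graph standing in for the .gov domain (invented URLs).
toy_web = {
    "a.gov/": ["a.gov/x.html", "a.gov/doc.pdf", "b.gov/"],
    "a.gov/x.html": ["a.gov/y.html"],
    "b.gov/": ["b.gov/img.png", "a.gov/"],
    "a.gov/y.html": [],
}


def toy_links(url):
    return toy_web.get(url, [])


def toy_is_html(url):
    return not url.endswith((".pdf", ".png"))


html_pages, other_pages = bfs_crawl(
    ["a.gov/"], toy_links, toy_is_html, max_html=3, max_other=1
)
print(html_pages)
print(other_pages)
```

With the limits scaled up to 1,000,000 and 250,000, the same loop would mirror the stopping rule quoted in the text, although a real crawler would also need fetching, politeness delays, and content-type detection.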
Part of each TREC meeting is spent planning the
tasks for future years. The set of tracks to be offered in TREC 2003 will be
quite different from the TREC 2002 set. The video retrieval track will be spun
off into its own evaluation series with a workshop that will meet immediately
before TREC. TREC itself will include new tracks in bioinformatics
(specifically, supporting genomic research – see the Genomics Pre-Track Information
link on the TREC website), exploiting user data for personalized retrieval, and
reducing the variability of retrieval performance across different search
requests. The TREC website is http://trec.nist.gov.
CONTACT: Ellen Voorhees, ext. 3761