Genomics Track introduced in TREC 2003
The Text REtrieval Conference (TREC) is a series of evaluation workshops managed by ITL’s
Information Access Division and designed to foster research on technologies for
information retrieval. Before each workshop, participants produce retrieval results for one or
more focus areas, called tracks; they then meet at the workshop to discuss the results.
The twelfth TREC conference, TREC 2003, was held November 18-21, 2003, at NIST (Gaithersburg).
TREC 2003 contained six tracks, including tracks on question
answering, retrieving web documents, and eliminating redundant information in a
response. Two new tracks focused on improving baseline retrieval effectiveness.
A third new track, the genomics track, examined retrieval effectiveness when the
information sought is restricted to a particular domain, using genomics data as that domain.
The primary task in the genomics track was to retrieve documents describing gene function.
Systems were given a gene name and an organism (e.g., "human" or "mouse"); each such pair was interpreted
as a request to retrieve documents describing the basic biology of the gene and
its protein products in the specified organism. The motivating scenario for the task was that of a biological
researcher or graduate student (someone who already has considerable domain
knowledge) confronted with the need to learn about a new gene very
quickly. The document collection used
for the test consisted of approximately 526,000 MEDLINE records donated to the
track by the National Library of Medicine.
Twenty-five groups, including academic (Berkeley, Stanford, University of
Maryland, University Hospital of Geneva), commercial (Erasmus MC, Tarragon
Consulting Corp.), and governmental (the National Library of Medicine, the
Canadian National Research Council) research groups, participated in the track.
More information regarding TREC can be found on the TREC web site, http://trec.nist.gov.
CONTACT: Ellen Voorhees, ext. 3761