ITL Publishes Proceedings of Text Retrieval Conference TREC 2006

 

The Information Access Division has published the proceedings of the fifteenth annual Text REtrieval Conference, TREC 2006, as NIST Special Publication 500-272. The TREC workshop series is sponsored by NIST and the Disruptive Technology Office (DTO) of the Office of the Director of National Intelligence to support the text retrieval industry by providing the infrastructure necessary for large-scale evaluation of text retrieval methodologies. TREC 2006, the most recent workshop in the series, was held at NIST on November 14-17, 2006, and was attended by approximately 175 people. The 107 participating organizations included academic, commercial, and government groups from 17 countries.

 

The proceedings contain an overview summarizing the retrieval tasks and main results of the conference, papers that were presented at the conference, and evaluation reports for each organization's results. The proceedings also contain "track" reports, where a track is a focus on a particular retrieval subproblem. TREC 2006 contained seven tracks, including question answering, detecting spam in an email stream, enterprise search, search on (almost) terabyte-scale document sets, and information access within the genomics domain. Two new tracks explored blog search and providing support for legal discovery of electronic documents. For each track, TREC provides participants with a document set and a set of questions; participating systems return the best responses from the document set for each question. Document sets ranged in size from approximately 160,000 biomedical journal articles to 25 million web pages.

 

Two main themes in TREC 2006 were supported by the different tracks. The first theme was exploring broader information contexts than in previous TRECs. This was accomplished by exploring both different document genres (blogs, email, corporate repositories, newswire, scientific documents, web pages) and different retrieval tasks (ad hoc and known-item search, focused responses, classification). The second theme of the conference was a focus on creating new evaluation methodologies. The need for new methodologies arises from the extension to new retrieval tasks as well as the increased size of the data collections.

 

CONTACT:  Ellen Voorhees, ext. 3761.