Pilot Study Summary
The objective of the ACE pilot study was to lay the groundwork for the
ACE program. A major part of this effort was exploring research possibilities
and defining the research program appropriately. This required answering
some important questions and then choosing productive research directions
and establishing performance baselines. The key questions used to guide
the pilot study included the following three:
The research objectives for the full ACE program were
characterized as the detection and characterization of Entities, Relations,
and Events; and the pilot study then focused on the first of these, namely
Entities. With this perspective and general direction an ACE pilot study
task was developed. This task, called "Entity Detection and Tracking"
(EDT), is actually a suite of four related tasks and is defined in a web
accessible EDT task definition document. The EDT tasks include two primary
tasks and two secondary tasks. The primary tasks are Entity Detection and Entity Attribute
Recognition, and the secondary tasks are Mention Detection and Mention
Extent Recognition.

Another major part of the ACE pilot study was the creation
of a pilot corpus and the annotation of entities mentioned in this corpus
according to the EDT task definition. The pilot corpus was defined to
include text from three different kinds of sources, namely newswire, ASR
transcriptions of broadcast news programs, and OCR transcriptions of newspapers.
This corpus is accessible from the NIST ACE web site. Specifically, the
training and development portions of the pilot corpus may be downloaded.
More detailed information on the
OCR part of this corpus is also web accessible. Cross-site annotation
consistency is an important criterion for task definition and was measured
across all sites that provided annotations.

At the conclusion of the pilot study, research sites
created preliminary systems that performed all four EDT tasks. These systems
were evaluated on the evaluation portion of the pilot corpus, with encouraging
results. NIST has performed various analyses of the performance of these
systems and of the corpus and the EDT tasks in general. These
analyses are web accessible in PowerPoint form and include the following:
Additional information on the ACE program may be accessed from NIST’s ACE web
site. This information includes an overview of the ACE pilot study. The LDC
has also assembled and made web accessible some helpful information regarding
EDT annotation tools and procedures.
Page Created: September 6, 2007