Information Technology Lab, Information Access Division NIST: National Institute of Standards and Technology

    ACE Phase 1 (2000)
    Pilot Study Summary

    The objective of the ACE pilot study was to lay the groundwork for the ACE program. A major part of this effort was exploring research possibilities and defining the research program appropriately. This required answering some important questions, then choosing productive research directions and establishing performance baselines. Three key questions guided the pilot study:

      • What are the right technical goals?
      • What is the impact of degraded text?
      • How should performance be measured?

    The research objectives for the full ACE program were characterized as the detection and characterization of Entities, Relations, and Events. The pilot study focused on the first of these, namely Entities. From this perspective an ACE pilot study task was developed. This task, called "Entity Detection and Tracking" (EDT), is a suite of four related tasks and is defined in a web-accessible EDT task definition document. The EDT tasks comprise two primary tasks and two secondary tasks. The primary tasks are Entity Detection and Entity Attribute Recognition, and the secondary tasks are Mention Detection and Mention Extent Recognition.
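As a rough illustration of how the four EDT tasks relate to one another, the sketch below models an entity as a typed object that groups its mentions, each with a text extent. This is only a minimal sketch: the class names, field names, character-offset convention, and the "PERSON" label are assumptions made for illustration and are not taken from the EDT task definition document.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Mention:
    """One reference to an entity in the text (Mention Detection)."""
    start: int  # inclusive character offset (assumed convention)
    end: int    # exclusive character offset (assumed convention)

    def extent(self, text: str) -> str:
        """Return the mention's text span (Mention Extent Recognition)."""
        return text[self.start:self.end]


@dataclass
class Entity:
    """One entity, grouping all of its mentions (Entity Detection)."""
    entity_id: str
    entity_type: str  # an entity attribute; the label used below is a placeholder
    mentions: List[Mention] = field(default_factory=list)


# Hypothetical example text with two mentions of the same entity.
text = "Maria Lopez arrived on Monday. She left on Friday."
entity = Entity(
    entity_id="E1",
    entity_type="PERSON",  # placeholder label, not from the task definition
    mentions=[Mention(0, 11), Mention(31, 34)],
)

extents = [m.extent(text) for m in entity.mentions]
print(entity.entity_id, entity.entity_type, extents)
```

In this sketch, finding the entity and assigning its attributes corresponds to the two primary tasks, while locating each mention and its extent corresponds to the two secondary tasks.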

    Another major part of the ACE pilot study was the creation of a pilot corpus and the annotation of entities mentioned in this corpus according to the EDT task definition. The pilot corpus was defined to include text from three different kinds of sources, namely newswire, ASR transcriptions of broadcast news programs, and OCR transcriptions of newspapers. This corpus is accessible from the NIST ACE web site. Specifically, the training and development portions of the pilot corpus may be downloaded. More detailed information on the OCR part of this corpus is also web accessible. Cross-site annotation consistency is an important criterion for task definition and was measured across all sites that provided annotations.

    At the conclusion of the pilot study, research sites created preliminary systems that performed all four EDT tasks. These systems were evaluated on the evaluation portion of the pilot corpus, with encouraging results. NIST has performed various analyses of the performance of these systems and of the corpus and the EDT tasks in general. These analyses are web accessible in PowerPoint form and include the following:

      • Analysis of a priori statistics of Entity Type, Mention Count, etc., as a function of source type.
      • Analysis of Entity Detection and Type Recognition Performance as a function of source type.
      • Contrast of Entity Detection and Type Recognition Performance between ground truth and ASR/OCR.
      • Analysis of Entity Detection and Type Recognition Performance for various entity characteristics.
      • Contrast of Entity Detection and Type Recognition Performance across different research sites.
      • Analysis of Name and Mention Detection and Extent Recognition Performance.

    Additional information on the ACE program may be accessed from NIST's ACE web site. This information includes an overview of the ACE pilot study. The LDC has also assembled and made web accessible some helpful information regarding EDT annotation tools and procedures.


    Page Created: September 6, 2007
    Last Updated: November 4, 2008


    Multimodal Information Group is part of IAD and ITL
    NIST is an agency of the U.S. Department of Commerce