NIST Speech Group Website
Information Technology Lab, Information Access Division NIST: National Institute of Standards and Technology

  • RT-02 Annotation Data Samples

    The goal of the RT-02 evaluation is to automatically build "rich transcripts", which means that recognition systems must generate both word sequences and higher levels of annotation.

    The data is available as a *.tgz collection. It will serve as sample data for the metadata annotation experiment currently under way and will also inform the community.
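
    Because the samples are distributed as a gzip-compressed tar archive, it may help to inspect the contents before unpacking. The following is only a minimal sketch using Python's standard `tarfile` module; the function name `list_archive_members` and the file name `rt02_samples.tgz` are hypothetical placeholders, not names taken from this page.

```python
import tarfile
from pathlib import Path


def list_archive_members(archive_path):
    """Return the sorted names of regular files in a gzip-compressed tar archive."""
    # Mode "r:gz" opens the archive read-only with gzip decompression.
    with tarfile.open(Path(archive_path), mode="r:gz") as archive:
        # Skip directories and other non-file entries.
        return sorted(m.name for m in archive.getmembers() if m.isfile())


if __name__ == "__main__":
    # Hypothetical file name; substitute the archive that was actually downloaded.
    for name in list_archive_members("rt02_samples.tgz"):
        print(name)
```

    Listing the members first shows how the archive is organized without writing anything to disk.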

    There are three sets of data, one for each RT domain: Switchboard, Broadcast News, and Meeting Room. For each domain there are three samples, each nominally 100 seconds long. As new annotation types are defined by NIST for the RT evaluation, the transcripts for each sample will be updated. Currently, speaker change information is the only annotation type.


    Page Created: September 18, 2007
    Last Updated: December 17, 2007

    Speech Group is part of IAD and ITL
    NIST is an agency of the U.S. Department of Commerce