ITL hosts Workshop on Language Recognition

 

The Information Access Division (IAD) in ITL hosted the 2003 NIST Language Recognition Workshop at NIST on April 28-29, 2003.  The workshop, held in cooperation with DoD sponsors, reviewed the recent NIST evaluation of language recognition research systems.  Six sites representing organizations from around the world participated in the evaluation, demonstrating current state-of-the-art capabilities for detecting the languages used in segments of conversational telephone speech.

 

The participants were MIT Lincoln Laboratory; the OGI School of Science and Engineering of the Oregon Health & Science University, working in collaboration with the Institute of Acoustics of the Chinese Academy of Sciences; the Speech Research Lab of Queensland University of Technology; R523 (Department of Defense); the Department of Electrical Engineering of the University of Washington; and a collaboration of the Institut de Recherche en Informatique de Toulouse and the Laboratoire Dynamique du Langage (Lyon).  In the evaluation, each system was presented with numerous test segments of conversational speech with durations of approximately three, ten, or thirty seconds.  For each of twelve target languages, the system had to decide whether the speech segment was in that language.  The target languages were Arabic, English, Farsi, French, German, Hindi, Japanese, Korean, Mandarin, Spanish, Tamil, and Vietnamese.  The test segments came from previously collected corpora of telephone conversations in each of these languages and also in Russian.

 

Alvin Martin and Mark Przybocki of IAD gave presentations summarizing the overall performance results and analyzing how performance varied with segment duration, speaker sex, and the languages being tested.  One surprising finding was that language detection performance was generally better on female speech than on male speech.

 

The last such evaluation and workshop was conducted by NIST in 1996.  Two of the participating sites in 2003, MIT and OGI, also participated in the 1996 evaluation.  Each of these sites had results this year that were considerably better than its results seven years earlier.

 

More information about the 2003 NIST Language Recognition Evaluation is available on the Web at:  http://www.nist.gov/speech/tests/lang/index.htm.

 

Contact:   Alvin Martin, ext. 3169

                 Mark Przybocki, ext. 3347