NIST Machine Translation Evaluation for GALE
Phase 4
The GALE Translation evaluation will test machine translation of text and recorded speech data. The test will include language data from both Arabic and Chinese, with system performance tallied separately for each language and separately for text and recorded speech sources.
GALE contractors will be the only participants in this evaluation, and the participants must meet specific Go/No-Go levels of performance. This page provides information regarding the 2009 GALE Phase 4 Translation evaluation.
Documentation
- Evaluation plan (updated 11/19/2009)
- Data selection guidelines v2.2 (updated 01/02/2007)
- Post-editing guidelines v3.0.2 (updated 05/25/2007)
- P3.5 sequestered data list (updated 03/18/2009)
- About one third (~5k reference words) of the P3.5 documents in each language and genre are to be sequestered for Phase 4.
- Although the list identifies the snippets that were used in P3.5, all snippets within the given document IDs are also being sequestered.
For example:
ABUDHABI_ABUDHNEWS_ARB_20061216_115800-S1
This means all snippets in ABUDHABI_ABUDHNEWS_ARB_20061216_115800 (not just snippet S1) are to be sequestered.
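The document-level sequestration rule above can be sketched in a few lines. This is only an illustration: the `-S<number>` suffix pattern is inferred from the single example ID shown, and the function names (`document_id`, `sequestered_documents`, `is_sequestered`) are hypothetical, not part of any NIST tool.

```python
import re

# Assumption: a snippet ID is "<document ID>-S<number>", as in the example above.
SNIPPET_SUFFIX = re.compile(r"-S\d+$")


def document_id(snippet_id: str) -> str:
    """Return the document ID by stripping a trailing -S<n> suffix, if present."""
    return SNIPPET_SUFFIX.sub("", snippet_id)


def sequestered_documents(listed_snippets):
    """Collect the document IDs behind every snippet named in the list."""
    return {document_id(s) for s in listed_snippets}


def is_sequestered(snippet_id: str, sequestered_docs) -> bool:
    """A snippet is sequestered if its parent document is, whichever snippet was listed."""
    return document_id(snippet_id) in sequestered_docs


listed = ["ABUDHABI_ABUDHNEWS_ARB_20061216_115800-S1"]
docs = sequestered_documents(listed)
print(is_sequestered("ABUDHABI_ABUDHNEWS_ARB_20061216_115800-S2", docs))
```

Matching on the document ID rather than on the listed snippet ID is what makes snippet S2 sequestered even though only S1 appears in the list.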
Software
For the P4 evaluation, we will use the following software:
- BBN/UMD-created Java scoring software v0.7.25. This code can be obtained here or directly from the author's website at http://www.cs.umd.edu/~snover/tercom (link updated 06/06/2008)
* Matthew Snover, Bonnie Dorr, Richard Schwartz, Linnea Micciulla, and John Makhoul, "A Study of Translation Edit Rate with Targeted Human Annotation," Proceedings of Association for Machine Translation in the Americas, 2006.
- Post-editing software v1.2.2. NIST is developing the post-editing software package in Java; it is available as the MTPostEditor JAR file (link updated 12/02/2008)
Schedule
| Dates | Event |
| Nov-01-06 to Dec-22-06 | P2/P2.5 evaluation epoch |
| Jun-01-07 to Jun-30-07 | P3/P3.5 evaluation epoch |
| Jun-01-08 to Jun-30-08 | P4 evaluation epoch |
Non-GALE data to be used for additional training/development must predate the cut-off date (August 31, 2009) and must not overlap in time with the above epochs.
| | P4 |
| Jan-05-10 | GALE translation evaluation starts |
| Jan-26-10 | Translations due at NIST |
| Feb-01-10 | Post-editing begins |
| Mar-26-10 | Post-editing ends |
| Mar-30-10 | Final scores to DARPA |
| TBD | GALE PI meeting |
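The date constraint on non-GALE training/development data stated in the schedule can be checked mechanically. The sketch below rests on assumptions the page does not spell out: each document has a single date, "before the cut-off date" is read as strictly earlier than August 31, 2009, and the epoch boundaries are treated as inclusive. The names `CUTOFF`, `EPOCHS`, and `is_eligible` are hypothetical.

```python
from datetime import date

# Cut-off date and evaluation epochs as listed in the schedule.
CUTOFF = date(2009, 8, 31)
EPOCHS = [
    (date(2006, 11, 1), date(2006, 12, 22)),  # P2/P2.5 evaluation epoch
    (date(2007, 6, 1), date(2007, 6, 30)),    # P3/P3.5 evaluation epoch
    (date(2008, 6, 1), date(2008, 6, 30)),    # P4 evaluation epoch
]


def is_eligible(doc_date: date) -> bool:
    """Return True if a document date satisfies both schedule constraints."""
    # Assumption: "before the cut-off date" means strictly earlier.
    if doc_date >= CUTOFF:
        return False
    # Assumption: epoch start and end dates are inclusive.
    return not any(start <= doc_date <= end for start, end in EPOCHS)


print(is_eligible(date(2007, 6, 15)))  # inside the P3/P3.5 epoch
print(is_eligible(date(2007, 7, 1)))   # after that epoch, before the cut-off
print(is_eligible(date(2009, 9, 1)))   # after the cut-off
```

Under these assumptions the three calls print `False`, `True`, and `False` respectively. The evaluation plan remains the authoritative statement of the rule.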
Page Created: May 6, 2008
Last Updated: December 16, 2009