Data Set

The MetricsMATR 2008 evaluation data set is not to be publicly released. Portions will be reused for future NIST MT evaluations.

Primary Evaluation Set

Origin            Source Language  Target Language  Genre(s)  Words (est.)  Systems
MT08              Arabic           English          NW, WB    15,000        10
MT08              Chinese          English          NW, WB    15,000        10
GALE P2           Arabic           English          NW, WB    11,500        3
GALE P2           Chinese          English          NW, WB    10,000        3
GALE P2.5         Arabic           English          BN        5,500         2
GALE P2.5         Chinese          English          BC, BN    10,000        3
Transtac, Jul 07  Arabic           English          Dialog    6,500         5
Transtac, Jul 07  Farsi            English          Dialog    4,500         5
Transtac, Jan 07  Arabic           English          Dialog    5,000         5

Secondary Evaluation Set

Origin            Source Language  Target Language  Genre(s)  Words (est.)  Systems
CESTA, run1       Arabic           French           General   28,000        2
CESTA, run1       English          French           General   21,500        5
CESTA, run2       Arabic           French           Health    20,000        1
CESTA, run2       English          French           Health    22,500        5