Correlation Results

Current Conditions

  • Human Assessment Type: Adequacy, 4-point scale
  • Target Language: English
  • Correlation Level: system

Subdivisions

By track:

Ranking

Single Reference Track
| Rank | Metric Name | Spearman's Rho | Rho 95% confidence interval | Kendall's Tau | Tau 95% confidence interval | Pearson's R | R 95% confidence interval |
|---|---|---|---|---|---|---|---|
| 1 | SEPIA2 | 0.7473 | (0.3336, 0.9196) | 0.5641 | (0.0190, 0.8507) | 0.6405 | (0.1384, 0.8807) |
| 2 | CDer | -0.8022 | (-0.9384, -0.4502) | -0.6410 | (-0.8809, -0.1392) | -0.7613 | (-0.9245, -0.3623) |
| 3 | ULCh | 0.5824 | (0.0463, 0.8581) | 0.4359 | (-0.1515, 0.7958) | 0.5482 | (-0.0040, 0.8442) |
| 4 | TER-v0.7.25 | -0.8242 | (-0.9457, -0.5004) | -0.6667 | (-0.8905, -0.1828) | -0.8964 | (-0.9689, -0.6826) |
| 5 | DP-Orp | 0.3956 | (-0.1987, 0.7772) | 0.2821 | (-0.3184, 0.7210) | 0.2467 | (-0.3521, 0.7023) |
| 6 | NIST-v11b | 0.7912 | (0.4259, 0.9347) | 0.5897 | (0.0574, 0.8610) | 0.7626 | (0.3650, 0.9249) |
| 7 | ATEC4 | 0.8022 | (0.4502, 0.9384) | 0.6410 | (0.1392, 0.8809) | 0.7591 | (0.3577, 0.9237) |
| 8 | ATEC1 | 0.8132 | (0.4750, 0.9421) | 0.6667 | (0.1828, 0.8905) | 0.7408 | (0.3207, 0.9174) |
| 9 | mBLEU | -0.0659 | (-0.5953, 0.5033) | 0.0000 | (-0.5510, 0.5510) | 0.4146 | (-0.1768, 0.7860) |
| 10 | SNR | 0.6099 | (0.0887, 0.8689) | 0.4359 | (-0.1515, 0.7958) | 0.6061 | (0.0827, 0.8674) |
| 11 | 4-GRR | 0.8407 | (0.5396, 0.9511) | 0.7179 | (0.2762, 0.9093) | 0.7632 | (0.3662, 0.9252) |
| 12 | ATEC2 | 0.8132 | (0.4750, 0.9421) | 0.6667 | (0.1828, 0.8905) | 0.7513 | (0.3418, 0.9210) |
| 13 | SEPIA1 | 0.7088 | (0.2589, 0.9060) | 0.4872 | (-0.0872, 0.8185) | 0.6054 | (0.0816, 0.8671) |
| 14 | ULCopt | 0.6538 | (0.1608, 0.8857) | 0.5128 | (-0.0532, 0.8294) | 0.6223 | (0.1085, 0.8737) |
| 15 | mTER | -0.5989 | (-0.8646, -0.0715) | -0.4359 | (-0.7958, 0.1515) | -0.5184 | (-0.8318, 0.0457) |
| 16 | EDPM | 0.7967 | (0.4380, 0.9366) | 0.6154 | (0.0974, 0.8710) | 0.7250 | (0.2897, 0.9118) |
| 17 | BLEU-4 | 0.7912 | (0.4259, 0.9347) | 0.5897 | (0.0574, 0.8610) | 0.6920 | (0.2280, 0.8999) |
| 18 | METEOR-v0.6 | 0.5879 | (0.0546, 0.8602) | 0.4615 | (-0.1200, 0.8072) | 0.6255 | (0.1137, 0.8749) |
| 19 | RTE-MT | 0.4835 | (-0.0920, 0.8169) | 0.3333 | (-0.2666, 0.7471) | 0.5146 | (-0.0508, 0.8302) |
| 20 | BadgerLite | 0.4725 | (-0.1061, 0.8121) | 0.3846 | (-0.2111, 0.7720) | 0.4600 | (-0.1219, 0.8065) |
| 21 | METEOR-ranking | 0.6209 | (0.1062, 0.8732) | 0.4872 | (-0.0872, 0.8185) | 0.6197 | (0.1043, 0.8727) |
| 22 | LET | 0.7473 | (0.3336, 0.9196) | 0.5385 | (-0.0178, 0.8402) | 0.6203 | (0.1053, 0.8729) |
| 23 | DP-Or | 0.7143 | (0.2693, 0.9079) | 0.5128 | (-0.0532, 0.8294) | 0.6117 | (0.0915, 0.8696) |
| 24 | ATEC3 | 0.8132 | (0.4750, 0.9421) | 0.6667 | (0.1828, 0.8905) | 0.7614 | (0.3624, 0.9245) |
| 25 | BLEU-v12 | 0.7912 | (0.4259, 0.9347) | 0.5897 | (0.0574, 0.8610) | 0.6948 | (0.2330, 0.9009) |
| 26 | BEwT-E | 0.4505 | (-0.1336, 0.8024) | 0.3333 | (-0.2666, 0.7471) | 0.4151 | (-0.1762, 0.7862) |
| 27 | RTE | 0.4945 | (-0.0776, 0.8216) | 0.3333 | (-0.2666, 0.7471) | 0.5211 | (-0.0420, 0.8329) |
| 28 | DR-Or | 0.4945 | (-0.0776, 0.8216) | 0.3590 | (-0.2394, 0.7597) | 0.5195 | (-0.0441, 0.8323) |
| 29 | BleuSP | 0.7967 | (0.4380, 0.9366) | 0.6154 | (0.0974, 0.8710) | 0.6453 | (0.1464, 0.8825) |
| 30 | SVM-Rank | 0.5879 | (0.0546, 0.8602) | 0.4615 | (-0.1200, 0.8072) | 0.5458 | (-0.0074, 0.8432) |
| 31 | BLEU-1 | 0.7747 | (0.3904, 0.9291) | 0.5897 | (0.0574, 0.8610) | 0.7034 | (0.2489, 0.9040) |
| 32 | Bleu-sbp | 0.7912 | (0.4259, 0.9347) | 0.5897 | (0.0574, 0.8610) | 0.6895 | (0.2233, 0.8990) |
| 33 | invWer | -0.8352 | (-0.9493, -0.5264) | -0.6923 | (-0.9000, -0.2285) | -0.8546 | (-0.9556, -0.5738) |
| 34 | BLEU-v11b | 0.7912 | (0.4259, 0.9347) | 0.5897 | (0.0574, 0.8610) | 0.6906 | (0.2253, 0.8994) |
| 35 | SR-Or | 0.6099 | (0.0887, 0.8689) | 0.4359 | (-0.1515, 0.7958) | 0.5388 | (-0.0173, 0.8403) |
| 36 | Badger | 0.4066 | (-0.1861, 0.7823) | 0.3333 | (-0.2666, 0.7471) | 0.4509 | (-0.1331, 0.8025) |
| 37 | Meteor-v0.7 | 0.6923 | (0.2285, 0.9000) | 0.5385 | (-0.0178, 0.8402) | 0.6723 | (0.1927, 0.8926) |
| 38 | MaxSim | 0.6538 | (0.1608, 0.8857) | 0.4872 | (-0.0872, 0.8185) | 0.6023 | (0.0769, 0.8659) |
| 39 | TERp | -0.8022 | (-0.9384, -0.4502) | -0.6410 | (-0.8809, -0.1392) | -0.7522 | (-0.9214, -0.3437) |

39 metrics (including 7 baseline metrics)
13 data points (total number of systems used)

Multiple References Track
| Rank | Metric Name | Spearman's Rho | Rho 95% confidence interval | Kendall's Tau | Tau 95% confidence interval | Pearson's R | R 95% confidence interval |
|---|---|---|---|---|---|---|---|
| 1 | SEPIA2 | 0.7143 | (0.2693, 0.9079) | 0.5128 | (-0.0532, 0.8294) | 0.7009 | (0.2443, 0.9031) |
| 2 | CDer | -0.7802 | (-0.9310, -0.4021) | -0.6154 | (-0.8710, -0.0974) | -0.7224 | (-0.9108, -0.2848) |
| 3 | ULCh | 0.6154 | (0.0974, 0.8710) | 0.4615 | (-0.1200, 0.8072) | 0.6247 | (0.1125, 0.8747) |
| 4 | TER-v0.7.25 | -0.8352 | (-0.9493, -0.5264) | -0.6923 | (-0.9000, -0.2285) | -0.9131 | (-0.9740, -0.7287) |
| 5 | DP-Orp | 0.1898 | (-0.4034, 0.6707) | 0.1677 | (-0.4223, 0.6579) | 0.1158 | (-0.4649, 0.6268) |
| 6 | NIST-v11b | 0.7967 | (0.4380, 0.9366) | 0.6154 | (0.0974, 0.8710) | 0.7209 | (0.2818, 0.9103) |
| 7 | ATEC4 | 0.8022 | (0.4502, 0.9384) | 0.6410 | (0.1392, 0.8809) | 0.7611 | (0.3618, 0.9244) |
| 8 | ATEC1 | 0.8022 | (0.4502, 0.9384) | 0.6410 | (0.1392, 0.8809) | 0.7463 | (0.3317, 0.9193) |
| 9 | SNR | 0.6099 | (0.0887, 0.8689) | 0.4359 | (-0.1515, 0.7958) | 0.6176 | (0.1010, 0.8719) |
| 10 | mBLEU | -0.2857 | (-0.7229, 0.3148) | -0.1538 | (-0.6498, 0.4339) | 0.4146 | (-0.1768, 0.7860) |
| 11 | 4-GRR | 0.8407 | (0.5396, 0.9511) | 0.7179 | (0.2762, 0.9093) | 0.9140 | (0.7313, 0.9743) |
| 12 | ATEC2 | 0.8132 | (0.4750, 0.9421) | 0.6667 | (0.1828, 0.8905) | 0.7594 | (0.3583, 0.9238) |
| 13 | SEPIA1 | 0.7857 | (0.4139, 0.9329) | 0.5641 | (0.0190, 0.8507) | 0.7024 | (0.2470, 0.9037) |
| 14 | ULCopt | 0.6154 | (0.0974, 0.8710) | 0.4615 | (-0.1200, 0.8072) | 0.6456 | (0.1469, 0.8826) |
| 15 | EDPM | 0.8077 | (0.4625, 0.9402) | 0.6154 | (0.0974, 0.8710) | 0.8137 | (0.4763, 0.9423) |
| 16 | mTER | -0.5659 | (-0.8514, -0.0217) | -0.4103 | (-0.7840, 0.1818) | -0.5065 | (-0.8268, 0.0617) |
| 17 | BLEU-4 | 0.8022 | (0.4502, 0.9384) | 0.5897 | (0.0574, 0.8610) | 0.8150 | (0.4792, 0.9427) |
| 18 | METEOR-v0.6 | 0.5824 | (0.0463, 0.8581) | 0.4359 | (-0.1515, 0.7958) | 0.6246 | (0.1122, 0.8746) |
| 19 | BadgerLite | 0.3352 | (-0.2647, 0.7480) | 0.3077 | (-0.2930, 0.7342) | 0.4030 | (-0.1903, 0.7806) |
| 20 | METEOR-ranking | 0.6154 | (0.0974, 0.8710) | 0.4615 | (-0.1200, 0.8072) | 0.6234 | (0.1103, 0.8741) |
| 21 | LET | 0.7692 | (0.3788, 0.9272) | 0.5385 | (-0.0178, 0.8402) | 0.7421 | (0.3234, 0.9178) |
| 22 | DP-Or | 0.6520 | (0.1576, 0.8850) | 0.5032 | (-0.0661, 0.8254) | 0.6612 | (0.1733, 0.8885) |
| 23 | ATEC3 | 0.8077 | (0.4625, 0.9402) | 0.6410 | (0.1392, 0.8809) | 0.7667 | (0.3735, 0.9264) |
| 24 | BLEU-v12 | 0.7912 | (0.4259, 0.9347) | 0.5641 | (0.0190, 0.8507) | 0.8216 | (0.4944, 0.9449) |
| 25 | BEwT-E | 0.4835 | (-0.0920, 0.8169) | 0.3333 | (-0.2666, 0.7471) | 0.4277 | (-0.1613, 0.7920) |
| 26 | DR-Or | 0.5604 | (0.0137, 0.8492) | 0.3846 | (-0.2111, 0.7720) | 0.6017 | (0.0759, 0.8657) |
| 27 | BleuSP | 0.7363 | (0.3117, 0.9158) | 0.5128 | (-0.0532, 0.8294) | 0.6782 | (0.2032, 0.8948) |
| 28 | SVM-Rank | 0.6209 | (0.1062, 0.8732) | 0.4872 | (-0.0872, 0.8185) | 0.5447 | (-0.0090, 0.8427) |
| 29 | BLEU-1 | 0.6758 | (0.1989, 0.8939) | 0.4872 | (-0.0872, 0.8185) | 0.6647 | (0.1794, 0.8898) |
| 30 | Bleu-sbp | 0.8022 | (0.4502, 0.9384) | 0.5897 | (0.0574, 0.8610) | 0.8112 | (0.4704, 0.9414) |
| 31 | invWer | -0.8132 | (-0.9421, -0.4750) | -0.6410 | (-0.8809, -0.1392) | -0.8221 | (-0.9450, -0.4956) |
| 32 | BLEU-v11b | 0.8022 | (0.4502, 0.9384) | 0.5897 | (0.0574, 0.8610) | 0.8350 | (0.5260, 0.9493) |
| 33 | SR-Or | 0.6154 | (0.0974, 0.8710) | 0.4615 | (-0.1200, 0.8072) | 0.5978 | (0.0698, 0.8642) |
| 34 | Badger | 0.3132 | (-0.2874, 0.7370) | 0.2308 | (-0.3669, 0.6936) | 0.3666 | (-0.2310, 0.7634) |
| 35 | Meteor-v0.7 | 0.6154 | (0.0974, 0.8710) | 0.4615 | (-0.1200, 0.8072) | 0.6578 | (0.1675, 0.8872) |
| 36 | MaxSim | 0.6813 | (0.2087, 0.8960) | 0.4872 | (-0.0872, 0.8185) | 0.6545 | (0.1619, 0.8860) |
| 37 | TERp | -0.7473 | (-0.9196, -0.3336) | -0.5641 | (-0.8507, -0.0190) | -0.7414 | (-0.9176, -0.3220) |

37 metrics (including 7 baseline metrics)
13 data points (total number of systems used)
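
Note on the confidence intervals

The page does not say how the 95% confidence intervals were obtained. The listed values appear consistent with a Fisher z-transformation applied to each coefficient with n = 13 data points, so the sketch below assumes that method. The function name fisher_ci and the z_crit parameter are illustrative choices, not part of the original results.

```python
import math

def fisher_ci(r, n, z_crit=1.959964):
    """Approximate confidence interval for a correlation coefficient.

    Assumes the Fisher z-transformation: atanh(r) is treated as roughly
    normal with standard error 1 / sqrt(n - 3). The default z_crit is the
    two-sided 95% critical value of the standard normal distribution.
    """
    if n <= 3:
        raise ValueError("need more than 3 data points")
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    z = math.atanh(r)                        # transform to z-space
    half_width = z_crit / math.sqrt(n - 3)   # interval half-width in z-space
    return math.tanh(z - half_width), math.tanh(z + half_width)

# Spearman's Rho for SEPIA2 in the Single Reference Track, 13 systems.
low, high = fisher_ci(0.7473, 13)
print(f"({low:.4f}, {high:.4f})")
```

The result lands close to the (0.3336, 0.9196) interval listed for SEPIA2. Because the input coefficient is itself rounded to four decimals, the last digit may differ slightly. With only 13 systems the intervals are wide, which is why many metrics whose point estimates differ still have overlapping intervals.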