Correlation Results

Current Conditions

  • Human Assessment Type: Adequacy, Yes-No qualitative question, proportion of Yes assigned
  • Target Language: English
  • Correlation Level: document

Subdivisions

By track:

Ranking

Single Reference Track
| Rank | Metric Name | Spearman's Rho | 95% confidence interval | Kendall's Tau | 95% confidence interval | Pearson's R | 95% confidence interval |
|---|---|---|---|---|---|---|---|
| 1 | SEPIA2 | 0.7403 | (0.7208, 0.7587) | 0.5399 | (0.5095, 0.5690) | 0.7043 | (0.6825, 0.7248) |
| 2 | CDer | -0.7484 | (-0.7663, -0.7293) | -0.5464 | (-0.5752, -0.5163) | -0.7356 | (-0.7543, -0.7158) |
| 3 | ULCh | 0.5082 | (0.4763, 0.5387) | 0.3540 | (0.3167, 0.3902) | 0.5408 | (0.5104, 0.5699) |
| 4 | TER-v0.7.25 | -0.7094 | (-0.7297, -0.6879) | -0.5108 | (-0.5412, -0.4791) | -0.6943 | (-0.7154, -0.6719) |
| 5 | DP-Orp | 0.4113 | (0.3758, 0.4456) | 0.2815 | (0.2424, 0.3197) | 0.4507 | (0.4166, 0.4835) |
| 6 | NIST-v11b | 0.7380 | (0.7183, 0.7566) | 0.5376 | (0.5071, 0.5668) | 0.7214 | (0.7007, 0.7410) |
| 7 | ATEC4 | 0.6058 | (0.5785, 0.6317) | 0.4283 | (0.3934, 0.4620) | 0.5911 | (0.5630, 0.6177) |
| 8 | ATEC1 | 0.6056 | (0.5783, 0.6315) | 0.4279 | (0.3930, 0.4616) | 0.5895 | (0.5614, 0.6162) |
| 9 | mBLEU | 0.4319 | (0.3971, 0.4655) | 0.3142 | (0.2759, 0.3516) | 0.4039 | (0.3681, 0.4384) |
| 10 | SNR | 0.5243 | (0.4932, 0.5541) | 0.3671 | (0.3302, 0.4029) | 0.4890 | (0.4564, 0.5203) |
| 11 | 4-GRR | 0.7230 | (0.7023, 0.7425) | 0.5236 | (0.4924, 0.5534) | 0.7088 | (0.6873, 0.7291) |
| 12 | ATEC2 | 0.6071 | (0.5799, 0.6330) | 0.4296 | (0.3947, 0.4632) | 0.5904 | (0.5624, 0.6171) |
| 13 | SEPIA1 | 0.7563 | (0.7378, 0.7738) | 0.5546 | (0.5248, 0.5830) | 0.7313 | (0.7111, 0.7502) |
| 14 | ULCopt | 0.5217 | (0.4904, 0.5516) | 0.3647 | (0.3277, 0.4006) | 0.5080 | (0.4762, 0.5385) |
| 15 | mTER | -0.4001 | (-0.4347, -0.3642) | -0.2907 | (-0.3287, -0.2518) | -0.2356 | (-0.2749, -0.1956) |
| 16 | EDPM | 0.7505 | (0.7316, 0.7683) | 0.5501 | (0.5202, 0.5788) | 0.7394 | (0.7198, 0.7579) |
| 17 | BLEU-4 | 0.7199 | (0.6990, 0.7395) | 0.5213 | (0.4901, 0.5512) | 0.7070 | (0.6853, 0.7274) |
| 18 | METEOR-v0.6 | 0.7749 | (0.7576, 0.7912) | 0.5740 | (0.5451, 0.6015) | 0.7578 | (0.7394, 0.7752) |
| 19 | RTE-MT | 0.6931 | (0.6706, 0.7143) | 0.5083 | (0.4764, 0.5388) | 0.6681 | (0.6442, 0.6908) |
| 20 | BadgerLite | 0.5813 | (0.5528, 0.6084) | 0.4122 | (0.3767, 0.4465) | 0.5815 | (0.5530, 0.6086) |
| 21 | METEOR-ranking | 0.7698 | (0.7521, 0.7863) | 0.5677 | (0.5386, 0.5955) | 0.7558 | (0.7372, 0.7732) |
| 22 | LET | 0.7272 | (0.7068, 0.7464) | 0.5297 | (0.4988, 0.5593) | 0.7139 | (0.6927, 0.7339) |
| 23 | DP-Or | 0.5291 | (0.4982, 0.5587) | 0.3706 | (0.3338, 0.4062) | 0.5763 | (0.5475, 0.6037) |
| 24 | ATEC3 | 0.5987 | (0.5710, 0.6250) | 0.4229 | (0.3878, 0.4568) | 0.5660 | (0.5367, 0.5938) |
| 25 | BLEU-v12 | 0.7241 | (0.7035, 0.7435) | 0.5246 | (0.4935, 0.5544) | 0.7077 | (0.6861, 0.7281) |
| 26 | BEwT-E | 0.5860 | (0.5577, 0.6129) | 0.4186 | (0.3834, 0.4526) | 0.5609 | (0.5314, 0.5890) |
| 27 | RTE | 0.6584 | (0.6339, 0.6815) | 0.4785 | (0.4454, 0.5102) | 0.6205 | (0.5940, 0.6457) |
| 28 | DR-Or | 0.4753 | (0.4421, 0.5072) | 0.3290 | (0.2910, 0.3659) | 0.5138 | (0.4822, 0.5441) |
| 29 | BleuSP | 0.7480 | (0.7289, 0.7659) | 0.5486 | (0.5186, 0.5773) | 0.7267 | (0.7063, 0.7460) |
| 30 | SVM-Rank | 0.7408 | (0.7212, 0.7592) | 0.5390 | (0.5085, 0.5681) | 0.7180 | (0.6970, 0.7378) |
| 31 | BLEU-1 | 0.7340 | (0.7140, 0.7528) | 0.5344 | (0.5037, 0.5637) | 0.7191 | (0.6982, 0.7387) |
| 32 | Bleu-sbp | 0.7227 | (0.7020, 0.7422) | 0.5240 | (0.4928, 0.5538) | 0.7068 | (0.6852, 0.7272) |
| 33 | invWer | -0.7148 | (-0.7347, -0.6936) | -0.5166 | (-0.5467, -0.4851) | -0.6924 | (-0.7136, -0.6698) |
| 34 | BLEU-v11b | 0.7209 | (0.7001, 0.7405) | 0.5225 | (0.4912, 0.5523) | 0.7046 | (0.6828, 0.7252) |
| 35 | SR-Or | 0.4883 | (0.4556, 0.5196) | 0.3356 | (0.2978, 0.3724) | 0.5417 | (0.5113, 0.5707) |
| 36 | Badger | 0.5742 | (0.5454, 0.6017) | 0.4078 | (0.3722, 0.4422) | 0.5816 | (0.5531, 0.6087) |
| 37 | Meteor-v0.7 | 0.7667 | (0.7488, 0.7835) | 0.5644 | (0.5351, 0.5923) | 0.7516 | (0.7328, 0.7694) |
| 38 | MaxSim | 0.5415 | (0.5111, 0.5705) | 0.3775 | (0.3409, 0.4129) | 0.5556 | (0.5259, 0.5840) |
| 39 | TERp | -0.7463 | (-0.7643, -0.7270) | -0.5517 | (-0.5803, -0.5218) | -0.7397 | (-0.7582, -0.7201) |

39 metrics (including 7 baseline metrics)
2179 data points (total number of documents used)
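
The page does not say how the 95% confidence intervals were obtained. As a minimal sketch, the published bounds look consistent with the standard Fisher z-transform approximation, using a standard error of 1/sqrt(n - 3) on the z scale; that is an inference from the numbers, not a documented method. The helper name `fisher_ci` is hypothetical and introduced only for illustration.

```python
import math
from statistics import NormalDist


def fisher_ci(r, n, level=0.95):
    """Approximate confidence interval for a correlation coefficient r
    computed from n paired observations, via the Fisher z-transform.

    Assumption: this is one common way to build such intervals; the
    source page does not state which method it actually used.
    """
    if n <= 3:
        raise ValueError("need more than 3 data points")
    z = math.atanh(r)                              # map r to the z scale
    se = 1.0 / math.sqrt(n - 3)                    # standard error on the z scale
    crit = NormalDist().inv_cdf(0.5 + level / 2)   # two-sided critical value
    return math.tanh(z - crit * se), math.tanh(z + crit * se)


# Pearson's R for METEOR-v0.6 in the Single Reference Track (2179 documents).
low, high = fisher_ci(0.7578, 2179)
print(f"({low:.4f}, {high:.4f})")
```

For this row the sketch agrees with the published interval (0.7394, 0.7752) to about three decimal places; the small remaining gap is what one would expect from feeding in a coefficient already rounded to four digits. The same check applied to the Spearman and Kendall columns gives similarly close bounds.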

Multiple References Track
| Rank | Metric Name | Spearman's Rho | 95% confidence interval | Kendall's Tau | 95% confidence interval | Pearson's R | 95% confidence interval |
|---|---|---|---|---|---|---|---|
| 1 | SEPIA2 | 0.7094 | (0.6821, 0.7347) | 0.5252 | (0.4858, 0.5624) | 0.6703 | (0.6402, 0.6985) |
| 2 | CDer | -0.7357 | (-0.7591, -0.7105) | -0.5474 | (-0.5834, -0.5093) | -0.7345 | (-0.7579, -0.7092) |
| 3 | ULCh | 0.1040 | (0.0514, 0.1560) | 0.0775 | (0.0248, 0.1299) | 0.1287 | (0.0764, 0.1803) |
| 4 | TER-v0.7.25 | -0.6499 | (-0.6795, -0.6183) | -0.4691 | (-0.5093, -0.4268) | -0.6574 | (-0.6864, -0.6263) |
| 5 | DP-Orp | 0.0684 | (0.0155, 0.1208) | 0.0473 | (-0.0056, 0.0999) | 0.0435 | (-0.0094, 0.0962) |
| 6 | NIST-v11b | 0.7084 | (0.6810, 0.7338) | 0.5232 | (0.4837, 0.5606) | 0.6947 | (0.6663, 0.7211) |
| 7 | ATEC4 | 0.5660 | (0.5290, 0.6009) | 0.4097 | (0.3648, 0.4528) | 0.5580 | (0.5205, 0.5934) |
| 8 | ATEC1 | 0.5655 | (0.5285, 0.6004) | 0.4098 | (0.3649, 0.4529) | 0.5564 | (0.5188, 0.5918) |
| 9 | SNR | 0.1285 | (0.0762, 0.1802) | 0.0971 | (0.0445, 0.1492) | 0.0773 | (0.0245, 0.1296) |
| 10 | mBLEU | 0.5251 | (0.4857, 0.5624) | 0.3644 | (0.3176, 0.4094) | 0.5257 | (0.4863, 0.5629) |
| 11 | 4-GRR | 0.6609 | (0.6301, 0.6897) | 0.4804 | (0.4387, 0.5201) | 0.6916 | (0.6630, 0.7182) |
| 12 | ATEC2 | 0.5639 | (0.5267, 0.5989) | 0.4088 | (0.3638, 0.4519) | 0.5537 | (0.5159, 0.5893) |
| 13 | SEPIA1 | 0.7309 | (0.7053, 0.7546) | 0.5443 | (0.5060, 0.5804) | 0.7269 | (0.7010, 0.7509) |
| 14 | ULCopt | 0.1209 | (0.0685, 0.1727) | 0.0943 | (0.0417, 0.1465) | 0.1060 | (0.0534, 0.1580) |
| 15 | EDPM | 0.7043 | (0.6766, 0.7300) | 0.5170 | (0.4772, 0.5547) | 0.7152 | (0.6883, 0.7400) |
| 16 | mTER | -0.4844 | (-0.5239, -0.4429) | -0.3321 | (-0.3783, -0.2842) | -0.3724 | (-0.4170, -0.3259) |
| 17 | BLEU-4 | 0.6865 | (0.6574, 0.7134) | 0.5041 | (0.4636, 0.5425) | 0.7069 | (0.6794, 0.7324) |
| 18 | METEOR-v0.6 | 0.7616 | (0.7384, 0.7829) | 0.5707 | (0.5340, 0.6053) | 0.7505 | (0.7264, 0.7727) |
| 19 | BadgerLite | 0.3344 | (0.2866, 0.3806) | 0.2331 | (0.1825, 0.2825) | 0.3390 | (0.2913, 0.3849) |
| 20 | METEOR-ranking | 0.7388 | (0.7138, 0.7619) | 0.5507 | (0.5128, 0.5865) | 0.7446 | (0.7201, 0.7673) |
| 21 | LET | 0.6977 | (0.6695, 0.7239) | 0.5142 | (0.4742, 0.5520) | 0.6844 | (0.6552, 0.7115) |
| 22 | DP-Or | 0.1093 | (0.0568, 0.1612) | 0.0841 | (0.0314, 0.1364) | 0.1812 | (0.1295, 0.2318) |
| 23 | ATEC3 | 0.5556 | (0.5180, 0.5911) | 0.4004 | (0.3551, 0.4439) | 0.5196 | (0.4799, 0.5572) |
| 24 | BLEU-v12 | 0.6884 | (0.6595, 0.7152) | 0.5052 | (0.4648, 0.5436) | 0.6985 | (0.6703, 0.7246) |
| 25 | BEwT-E | 0.6672 | (0.6368, 0.6955) | 0.4801 | (0.4383, 0.5198) | 0.6581 | (0.6271, 0.6871) |
| 26 | DR-Or | 0.1302 | (0.0779, 0.1818) | 0.0944 | (0.0418, 0.1466) | 0.1184 | (0.0659, 0.1702) |
| 27 | BleuSP | 0.7390 | (0.7140, 0.7621) | 0.5503 | (0.5123, 0.5861) | 0.7438 | (0.7192, 0.7665) |
| 28 | SVM-Rank | 0.7709 | (0.7486, 0.7915) | 0.5809 | (0.5447, 0.6149) | 0.7694 | (0.7470, 0.7902) |
| 29 | BLEU-1 | 0.6751 | (0.6452, 0.7028) | 0.4921 | (0.4509, 0.5311) | 0.6513 | (0.6197, 0.6807) |
| 30 | Bleu-sbp | 0.6880 | (0.6591, 0.7148) | 0.5053 | (0.4648, 0.5436) | 0.7036 | (0.6758, 0.7293) |
| 31 | invWer | -0.7065 | (-0.7320, -0.6790) | -0.5194 | (-0.5570, -0.4797) | -0.7062 | (-0.7318, -0.6787) |
| 32 | BLEU-v11b | 0.6859 | (0.6569, 0.7130) | 0.5036 | (0.4631, 0.5420) | 0.6979 | (0.6698, 0.7241) |
| 33 | SR-Or | 0.1055 | (0.0530, 0.1575) | 0.0757 | (0.0229, 0.1280) | 0.1368 | (0.0846, 0.1883) |
| 34 | Badger | 0.3357 | (0.2879, 0.3818) | 0.2330 | (0.1823, 0.2824) | 0.3567 | (0.3097, 0.4020) |
| 35 | Meteor-v0.7 | 0.7413 | (0.7165, 0.7642) | 0.5540 | (0.5163, 0.5896) | 0.7378 | (0.7128, 0.7610) |
| 36 | MaxSim | 0.1697 | (0.1179, 0.2206) | 0.1217 | (0.0693, 0.1735) | 0.1804 | (0.1288, 0.2311) |
| 37 | TERp | -0.7235 | (-0.7478, -0.6973) | -0.5424 | (-0.5787, -0.5040) | -0.7303 | (-0.7541, -0.7047) |

37 metrics (including 7 baseline metrics)
1375 data points (total number of documents used)