Correlation Results

Current Conditions

  • Human Assessment Type: Adequacy, 7-point scale, straight average
  • Target Language: English
  • Correlation Level: segment
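
As an illustration of what a segment-level correlation with a 95% confidence interval involves, here is a minimal Python sketch. It rests on assumptions: the page does not say how its intervals were computed, so the Fisher z-transformation used here is only one common choice, and the function names (pearson_r, average_ranks, spearman_rho, fisher_ci) are hypothetical. Kendall's Tau is left out to keep the sketch short.

```python
import math
from statistics import NormalDist


def pearson_r(xs, ys):
    """Pearson's R between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(xs, ys):
    """Spearman's Rho: Pearson's R computed on the ranks."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


def fisher_ci(r, n, level=0.95):
    """Confidence interval for a correlation r over n points.

    Assumption: Fisher z-transformation with standard error 1/sqrt(n - 3).
    """
    z = math.atanh(r)
    se = 1.0 / math.sqrt(n - 3)
    crit = NormalDist().inv_cdf(0.5 + level / 2)
    return math.tanh(z - crit * se), math.tanh(z + crit * se)


# Hypothetical usage: metric scores and averaged adequacy judgments per segment.
metric_scores = [0.31, 0.52, 0.47, 0.80, 0.66]
human_scores = [2.0, 4.5, 4.0, 6.5, 5.0]
rho = spearman_rho(metric_scores, human_scores)
low, high = fisher_ci(0.6290, 25473)  # first Single Reference row
```

Under this assumption, calling fisher_ci with the Spearman's Rho and data-point count of the first Single Reference Track row gives bounds very close to the interval listed for that row. That is consistent with a Fisher-style interval but does not confirm it.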

Subdivisions

By track:

Ranking

Single Reference Track
| Rank | Metric Name | Spearman's Rho | 95% confidence interval | Kendall's Tau | 95% confidence interval | Pearson's R | 95% confidence interval |
|---|---|---|---|---|---|---|---|
| 1 | SEPIA2 | 0.6290 | (0.6215, 0.6364) | 0.4753 | (0.4658, 0.4848) | 0.5263 | (0.5173, 0.5351) |
| 2 | CDer | -0.6535 | (-0.6605, -0.6464) | -0.4994 | (-0.5086, -0.4901) | -0.6536 | (-0.6606, -0.6465) |
| 3 | ULCh | 0.4478 | (0.4379, 0.4575) | 0.3255 | (0.3145, 0.3364) | 0.4684 | (0.4588, 0.4780) |
| 4 | TER-v0.7.25 | -0.5796 | (-0.5877, -0.5714) | -0.4391 | (-0.4490, -0.4291) | -0.5244 | (-0.5332, -0.5154) |
| 5 | DP-Orp | 0.3334 | (0.3224, 0.3443) | 0.2405 | (0.2289, 0.2521) | 0.3442 | (0.3334, 0.3550) |
| 6 | NIST-v11b | 0.6193 | (0.6116, 0.6268) | 0.4680 | (0.4583, 0.4775) | 0.6246 | (0.6171, 0.6321) |
| 7 | ATEC4 | 0.5811 | (0.5729, 0.5892) | 0.4394 | (0.4294, 0.4493) | 0.5810 | (0.5728, 0.5890) |
| 8 | ATEC1 | 0.5831 | (0.5749, 0.5912) | 0.4417 | (0.4318, 0.4515) | 0.5817 | (0.5736, 0.5898) |
| 9 | mBLEU | 0.3872 | (0.3767, 0.3976) | 0.2854 | (0.2741, 0.2966) | 0.3926 | (0.3821, 0.4029) |
| 10 | SNR | 0.4464 | (0.4366, 0.4562) | 0.3243 | (0.3133, 0.3352) | 0.4421 | (0.4322, 0.4519) |
| 11 | 4-GRR | 0.5825 | (0.5744, 0.5906) | 0.4381 | (0.4281, 0.4479) | 0.5300 | (0.5211, 0.5387) |
| 12 | ATEC2 | 0.5823 | (0.5741, 0.5904) | 0.4406 | (0.4306, 0.4504) | 0.5804 | (0.5722, 0.5885) |
| 13 | SEPIA1 | 0.6317 | (0.6243, 0.6390) | 0.4783 | (0.4688, 0.4877) | 0.5585 | (0.5500, 0.5669) |
| 14 | ULCopt | 0.4598 | (0.4501, 0.4695) | 0.3358 | (0.3249, 0.3467) | 0.4775 | (0.4680, 0.4869) |
| 15 | mTER | -0.3302 | (-0.3411, -0.3192) | -0.2543 | (-0.2657, -0.2427) | -0.3102 | (-0.3212, -0.2990) |
| 16 | EDPM | 0.6348 | (0.6274, 0.6420) | 0.4809 | (0.4714, 0.4903) | 0.6264 | (0.6189, 0.6338) |
| 17 | BLEU-4 | 0.5813 | (0.5731, 0.5894) | 0.4307 | (0.4207, 0.4407) | 0.5168 | (0.5077, 0.5257) |
| 18 | METEOR-v0.6 | 0.6809 | (0.6742, 0.6874) | 0.5209 | (0.5119, 0.5298) | 0.6855 | (0.6790, 0.6920) |
| 19 | RTE-MT | 0.6061 | (0.5982, 0.6138) | 0.4498 | (0.4400, 0.4596) | 0.5815 | (0.5733, 0.5896) |
| 20 | BadgerLite | 0.4412 | (0.4313, 0.4511) | 0.3243 | (0.3133, 0.3352) | 0.3653 | (0.3546, 0.3759) |
| 21 | METEOR-ranking | 0.6691 | (0.6622, 0.6758) | 0.5132 | (0.5041, 0.5222) | 0.6527 | (0.6456, 0.6597) |
| 22 | LET | 0.6304 | (0.6230, 0.6378) | 0.4827 | (0.4732, 0.4921) | 0.6382 | (0.6308, 0.6454) |
| 23 | DP-Or | 0.4455 | (0.4356, 0.4553) | 0.3286 | (0.3176, 0.3395) | 0.4641 | (0.4545, 0.4737) |
| 24 | ATEC3 | 0.5734 | (0.5651, 0.5816) | 0.4320 | (0.4220, 0.4419) | 0.5715 | (0.5632, 0.5797) |
| 25 | BLEU-v12 | 0.4388 | (0.4289, 0.4487) | 0.3504 | (0.3396, 0.3611) | 0.4513 | (0.4414, 0.4610) |
| 26 | BEwT-E | 0.4854 | (0.4759, 0.4947) | 0.3742 | (0.3636, 0.3847) | 0.5009 | (0.4916, 0.5100) |
| 27 | RTE | 0.5659 | (0.5575, 0.5742) | 0.4142 | (0.4039, 0.4243) | 0.5272 | (0.5182, 0.5360) |
| 28 | DR-Or | 0.3941 | (0.3837, 0.4045) | 0.2864 | (0.2751, 0.2977) | 0.4281 | (0.4181, 0.4381) |
| 29 | BleuSP | 0.6210 | (0.6134, 0.6285) | 0.4679 | (0.4583, 0.4775) | 0.5853 | (0.5771, 0.5933) |
| 30 | SVM-Rank | 0.6119 | (0.6042, 0.6195) | 0.4636 | (0.4539, 0.4732) | 0.5985 | (0.5905, 0.6063) |
| 31 | BLEU-1 | 0.6200 | (0.6124, 0.6275) | 0.4683 | (0.4586, 0.4778) | 0.6360 | (0.6286, 0.6432) |
| 32 | Bleu-sbp | 0.4281 | (0.4180, 0.4380) | 0.3423 | (0.3314, 0.3531) | 0.4414 | (0.4315, 0.4513) |
| 33 | invWer | -0.5978 | (-0.6056, -0.5899) | -0.4546 | (-0.4643, -0.4448) | -0.5418 | (-0.5504, -0.5331) |
| 34 | BLEU-v11b | 0.4281 | (0.4180, 0.4380) | 0.3423 | (0.3314, 0.3531) | 0.4414 | (0.4315, 0.4513) |
| 35 | SR-Or | 0.3958 | (0.3854, 0.4061) | 0.2877 | (0.2764, 0.2989) | 0.3858 | (0.3753, 0.3962) |
| 36 | Badger | 0.3871 | (0.3767, 0.3975) | 0.2827 | (0.2714, 0.2940) | 0.3405 | (0.3296, 0.3513) |
| 37 | Meteor-v0.7 | 0.6652 | (0.6583, 0.6720) | 0.5107 | (0.5016, 0.5198) | 0.6789 | (0.6722, 0.6855) |
| 38 | MaxSim | 0.4631 | (0.4534, 0.4727) | 0.3404 | (0.3295, 0.3512) | 0.4844 | (0.4749, 0.4937) |
| 39 | TERp | -0.6840 | (-0.6905, -0.6774) | -0.5246 | (-0.5334, -0.5156) | -0.6737 | (-0.6803, -0.6669) |

39 metrics (including 7 baseline metrics)
25473 data points (total number of segments used)

Multiple References Track
| Rank | Metric Name | Spearman's Rho | 95% confidence interval | Kendall's Tau | 95% confidence interval | Pearson's R | 95% confidence interval |
|---|---|---|---|---|---|---|---|
| 1 | SEPIA2 | 0.6660 | (0.6575, 0.6745) | 0.5100 | (0.4986, 0.5212) | 0.6301 | (0.6208, 0.6392) |
| 2 | CDer | -0.7130 | (-0.7204, -0.7054) | -0.5518 | (-0.5624, -0.5411) | -0.7199 | (-0.7272, -0.7124) |
| 3 | ULCh | 0.1971 | (0.1824, 0.2117) | 0.1414 | (0.1264, 0.1563) | 0.3051 | (0.2912, 0.3189) |
| 4 | TER-v0.7.25 | -0.5939 | (-0.6037, -0.5839) | -0.4473 | (-0.4594, -0.4350) | -0.5354 | (-0.5462, -0.5244) |
| 5 | DP-Orp | 0.1430 | (0.1280, 0.1580) | 0.1019 | (0.0868, 0.1170) | 0.1797 | (0.1649, 0.1945) |
| 6 | NIST-v11b | 0.6511 | (0.6422, 0.6598) | 0.4963 | (0.4847, 0.5078) | 0.6686 | (0.6601, 0.6770) |
| 7 | ATEC4 | 0.6484 | (0.6394, 0.6571) | 0.4961 | (0.4845, 0.5076) | 0.6571 | (0.6484, 0.6657) |
| 8 | ATEC1 | 0.6450 | (0.6359, 0.6538) | 0.4941 | (0.4825, 0.5056) | 0.6533 | (0.6445, 0.6620) |
| 9 | SNR | 0.2132 | (0.1986, 0.2277) | 0.1528 | (0.1379, 0.1677) | 0.2371 | (0.2226, 0.2514) |
| 10 | mBLEU | 0.4498 | (0.4375, 0.4619) | 0.3289 | (0.3152, 0.3424) | 0.4567 | (0.4445, 0.4687) |
| 11 | 4-GRR | 0.6157 | (0.6062, 0.6251) | 0.4649 | (0.4528, 0.4768) | 0.5533 | (0.5426, 0.5638) |
| 12 | ATEC2 | 0.6415 | (0.6324, 0.6504) | 0.4906 | (0.4789, 0.5021) | 0.6487 | (0.6397, 0.6575) |
| 13 | SEPIA1 | 0.6739 | (0.6654, 0.6821) | 0.5158 | (0.5045, 0.5269) | 0.6527 | (0.6439, 0.6614) |
| 14 | ULCopt | 0.2011 | (0.1864, 0.2157) | 0.1469 | (0.1319, 0.1618) | 0.3022 | (0.2883, 0.3160) |
| 15 | EDPM | 0.6681 | (0.6596, 0.6765) | 0.5124 | (0.5010, 0.5236) | 0.6871 | (0.6789, 0.6950) |
| 16 | mTER | -0.3876 | (-0.4005, -0.3746) | -0.2898 | (-0.3038, -0.2758) | -0.3482 | (-0.3616, -0.3347) |
| 17 | BLEU-4 | 0.6203 | (0.6108, 0.6297) | 0.4650 | (0.4529, 0.4769) | 0.6064 | (0.5966, 0.6159) |
| 18 | METEOR-v0.6 | 0.7196 | (0.7121, 0.7268) | 0.5575 | (0.5469, 0.5679) | 0.7331 | (0.7260, 0.7401) |
| 19 | BadgerLite | 0.3330 | (0.3193, 0.3465) | 0.2457 | (0.2313, 0.2600) | 0.2186 | (0.2040, 0.2331) |
| 20 | METEOR-ranking | 0.7108 | (0.7032, 0.7183) | 0.5517 | (0.5410, 0.5622) | 0.7103 | (0.7026, 0.7178) |
| 21 | LET | 0.6735 | (0.6651, 0.6818) | 0.5214 | (0.5102, 0.5325) | 0.6779 | (0.6696, 0.6861) |
| 22 | DP-Or | 0.2306 | (0.2161, 0.2450) | 0.1679 | (0.1530, 0.1827) | 0.3421 | (0.3285, 0.3555) |
| 23 | ATEC3 | 0.6383 | (0.6291, 0.6472) | 0.4861 | (0.4743, 0.4977) | 0.6473 | (0.6384, 0.6561) |
| 24 | BLEU-v12 | 0.5058 | (0.4943, 0.5170) | 0.4019 | (0.3890, 0.4147) | 0.5254 | (0.5143, 0.5364) |
| 25 | BEwT-E | 0.5686 | (0.5582, 0.5789) | 0.4359 | (0.4235, 0.4482) | 0.5778 | (0.5675, 0.5879) |
| 26 | DR-Or | 0.1956 | (0.1809, 0.2103) | 0.1414 | (0.1264, 0.1563) | 0.3208 | (0.3070, 0.3344) |
| 27 | BleuSP | 0.6864 | (0.6783, 0.6944) | 0.5246 | (0.5134, 0.5356) | 0.6873 | (0.6791, 0.6952) |
| 28 | SVM-Rank | 0.7187 | (0.7112, 0.7260) | 0.5570 | (0.5463, 0.5674) | 0.7183 | (0.7108, 0.7256) |
| 29 | BLEU-1 | 0.6279 | (0.6185, 0.6371) | 0.4769 | (0.4650, 0.4886) | 0.6532 | (0.6444, 0.6619) |
| 30 | Bleu-sbp | 0.4975 | (0.4859, 0.5089) | 0.3955 | (0.3826, 0.4083) | 0.5181 | (0.5068, 0.5292) |
| 31 | invWer | -0.6634 | (-0.6719, -0.6548) | -0.5095 | (-0.5207, -0.4981) | -0.6154 | (-0.6248, -0.6059) |
| 32 | BLEU-v11b | 0.4975 | (0.4859, 0.5089) | 0.3955 | (0.3826, 0.4083) | 0.5181 | (0.5068, 0.5292) |
| 33 | SR-Or | 0.2017 | (0.1870, 0.2163) | 0.1454 | (0.1304, 0.1603) | 0.2818 | (0.2677, 0.2959) |
| 34 | Badger | 0.2931 | (0.2791, 0.3070) | 0.2136 | (0.1989, 0.2281) | 0.2397 | (0.2253, 0.2541) |
| 35 | Meteor-v0.7 | 0.7157 | (0.7082, 0.7231) | 0.5572 | (0.5465, 0.5676) | 0.7366 | (0.7295, 0.7435) |
| 36 | MaxSim | 0.2486 | (0.2342, 0.2629) | 0.1806 | (0.1658, 0.1954) | 0.3247 | (0.3109, 0.3383) |
| 37 | TERp | -0.7127 | (-0.7202, -0.7051) | -0.5488 | (-0.5594, -0.5381) | -0.7216 | (-0.7289, -0.7142) |

37 metrics (including 7 baseline metrics)
16450 data points (total number of segments used)