Correlation Results

Current Conditions

  • Human Assessment Type: Adequacy, Yes-No qualitative question, proportion of Yes assigned
  • Target Language: English
  • Correlation Level: segment

Subdivisions

By track:

Ranking

Single Reference Track
| Rank | Metric Name | Spearman's Rho | 95% confidence interval | Kendall's Tau | 95% confidence interval | Pearson's R | 95% confidence interval |
|---|---|---|---|---|---|---|---|
| 1 | SEPIA2 | 0.5108 | (0.5017, 0.5199) | 0.4029 | (0.3926, 0.4132) | 0.4183 | (0.4082, 0.4284) |
| 2 | CDer | -0.5342 | (-0.5429, -0.5254) | -0.4230 | (-0.4330, -0.4129) | -0.5129 | (-0.5219, -0.5038) |
| 3 | ULCh | 0.4027 | (0.3924, 0.4129) | 0.3144 | (0.3033, 0.3254) | 0.4390 | (0.4290, 0.4489) |
| 4 | TER-v0.7.25 | -0.4793 | (-0.4887, -0.4698) | -0.3803 | (-0.3907, -0.3697) | -0.4234 | (-0.4334, -0.4133) |
| 5 | DP-Orp | 0.3093 | (0.2981, 0.3203) | 0.2408 | (0.2292, 0.2523) | 0.3407 | (0.3298, 0.3515) |
| 6 | NIST-v11b | 0.5027 | (0.4934, 0.5118) | 0.3964 | (0.3860, 0.4067) | 0.5027 | (0.4934, 0.5118) |
| 7 | ATEC4 | 0.4678 | (0.4581, 0.4773) | 0.3689 | (0.3582, 0.3795) | 0.4619 | (0.4522, 0.4715) |
| 8 | ATEC1 | 0.4697 | (0.4600, 0.4792) | 0.3708 | (0.3601, 0.3813) | 0.4620 | (0.4523, 0.4716) |
| 9 | mBLEU | 0.3046 | (0.2934, 0.3157) | 0.2394 | (0.2278, 0.2509) | 0.3373 | (0.3264, 0.3481) |
| 10 | SNR | 0.3912 | (0.3808, 0.4016) | 0.3059 | (0.2947, 0.3170) | 0.3964 | (0.3860, 0.4067) |
| 11 | 4-GRR | 0.4819 | (0.4724, 0.4912) | 0.3806 | (0.3700, 0.3910) | 0.4559 | (0.4461, 0.4656) |
| 12 | ATEC2 | 0.4702 | (0.4606, 0.4797) | 0.3710 | (0.3603, 0.3815) | 0.4626 | (0.4529, 0.4722) |
| 13 | SEPIA1 | 0.5138 | (0.5047, 0.5228) | 0.4056 | (0.3953, 0.4159) | 0.4661 | (0.4564, 0.4757) |
| 14 | ULCopt | 0.4074 | (0.3971, 0.4176) | 0.3183 | (0.3072, 0.3292) | 0.4397 | (0.4297, 0.4495) |
| 15 | mTER | -0.2536 | (-0.2650, -0.2421) | -0.2043 | (-0.2160, -0.1925) | -0.2197 | (-0.2314, -0.2080) |
| 16 | EDPM | 0.5241 | (0.5152, 0.5330) | 0.4141 | (0.4039, 0.4242) | 0.5229 | (0.5139, 0.5317) |
| 17 | BLEU-4 | 0.4859 | (0.4764, 0.4952) | 0.3818 | (0.3712, 0.3922) | 0.4624 | (0.4526, 0.4720) |
| 18 | METEOR-v0.6 | 0.5544 | (0.5458, 0.5629) | 0.4389 | (0.4289, 0.4488) | 0.5475 | (0.5389, 0.5561) |
| 19 | RTE-MT | 0.5144 | (0.5053, 0.5234) | 0.4053 | (0.3950, 0.4155) | 0.5048 | (0.4955, 0.5139) |
| 20 | BadgerLite | 0.3873 | (0.3768, 0.3977) | 0.3027 | (0.2915, 0.3138) | 0.3759 | (0.3653, 0.3864) |
| 21 | METEOR-ranking | 0.5447 | (0.5360, 0.5533) | 0.4315 | (0.4214, 0.4414) | 0.5288 | (0.5199, 0.5376) |
| 22 | LET | 0.5067 | (0.4975, 0.5158) | 0.4009 | (0.3906, 0.4112) | 0.4942 | (0.4848, 0.5034) |
| 23 | DP-Or | 0.3943 | (0.3839, 0.4047) | 0.3099 | (0.2987, 0.3209) | 0.4367 | (0.4267, 0.4466) |
| 24 | ATEC3 | 0.4684 | (0.4588, 0.4779) | 0.3695 | (0.3588, 0.3801) | 0.4578 | (0.4481, 0.4675) |
| 25 | BLEU-v12 | 0.3599 | (0.3492, 0.3706) | 0.3057 | (0.2945, 0.3168) | 0.3899 | (0.3794, 0.4002) |
| 26 | BEwT-E | 0.3774 | (0.3668, 0.3879) | 0.3041 | (0.2929, 0.3152) | 0.3963 | (0.3859, 0.4066) |
| 27 | RTE | 0.4786 | (0.4691, 0.4880) | 0.3753 | (0.3647, 0.3858) | 0.4622 | (0.4524, 0.4718) |
| 28 | DR-Or | 0.3505 | (0.3397, 0.3612) | 0.2745 | (0.2631, 0.2858) | 0.4019 | (0.3915, 0.4121) |
| 29 | BleuSP | 0.5106 | (0.5015, 0.5196) | 0.4029 | (0.3926, 0.4131) | 0.5009 | (0.4916, 0.5100) |
| 30 | SVM-Rank | 0.4930 | (0.4837, 0.5022) | 0.3887 | (0.3782, 0.3991) | 0.4874 | (0.4780, 0.4967) |
| 31 | BLEU-1 | 0.5090 | (0.4999, 0.5181) | 0.4022 | (0.3918, 0.4124) | 0.5075 | (0.4983, 0.5166) |
| 32 | Bleu-sbp | 0.3520 | (0.3412, 0.3627) | 0.2998 | (0.2886, 0.3110) | 0.3819 | (0.3714, 0.3923) |
| 33 | invWer | -0.4893 | (-0.4986, -0.4799) | -0.3885 | (-0.3988, -0.3780) | -0.4307 | (-0.4407, -0.4207) |
| 34 | BLEU-v11b | 0.3520 | (0.3412, 0.3627) | 0.2998 | (0.2886, 0.3110) | 0.3819 | (0.3714, 0.3923) |
| 35 | SR-Or | 0.3518 | (0.3409, 0.3625) | 0.2781 | (0.2667, 0.2894) | 0.3603 | (0.3496, 0.3710) |
| 36 | Badger | 0.3426 | (0.3317, 0.3534) | 0.2677 | (0.2563, 0.2791) | 0.3450 | (0.3342, 0.3558) |
| 37 | Meteor-v0.7 | 0.5404 | (0.5316, 0.5490) | 0.4280 | (0.4179, 0.4379) | 0.5297 | (0.5208, 0.5384) |
| 38 | MaxSim | 0.4001 | (0.3897, 0.4103) | 0.3130 | (0.3019, 0.3240) | 0.4295 | (0.4194, 0.4394) |
| 39 | TERp | -0.5681 | (-0.5764, -0.5597) | -0.4532 | (-0.4629, -0.4434) | -0.5730 | (-0.5812, -0.5647) |

39 metrics (including 7 baseline metrics)
25473 data points (total number of segments used)
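
The page does not say how the correlation values or their 95% confidence intervals were computed. As a rough illustration only, the sketch below computes Pearson's R for paired segment-level scores and an interval using the Fisher z-transformation, a common choice for correlation coefficients. The function names `pearson_r` and `fisher_ci` are hypothetical, and the Fisher method is an assumption, not something the page confirms.

```python
import math


def pearson_r(xs, ys):
    """Pearson's R between two equal-length score lists
    (e.g. metric scores and human adequacy proportions per segment)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def fisher_ci(r, n, z_crit=1.959964):
    """Approximate confidence interval for a correlation r over n points.

    Assumption: Fisher z-transformation with standard error 1/sqrt(n - 3);
    z_crit = 1.959964 gives a two-sided 95% interval.
    """
    if n <= 3:
        raise ValueError("need more than 3 data points")
    z = math.atanh(r)
    half_width = z_crit / math.sqrt(n - 3)
    return math.tanh(z - half_width), math.tanh(z + half_width)


# Example: SEPIA2's Pearson's R on the Single Reference Track (25473 segments).
low, high = fisher_ci(0.4183, 25473)
print(f"({low:.4f}, {high:.4f})")
```

Under these assumptions the result for SEPIA2 comes out close to the listed interval (0.4082, 0.4284), which suggests, but does not prove, that the page's intervals were produced in a similar way.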

Multiple References Track
| Rank | Metric Name | Spearman's Rho | 95% confidence interval | Kendall's Tau | 95% confidence interval | Pearson's R | 95% confidence interval |
|---|---|---|---|---|---|---|---|
| 1 | SEPIA2 | 0.5308 | (0.5197, 0.5417) | 0.4237 | (0.4111, 0.4362) | 0.4702 | (0.4582, 0.4820) |
| 2 | CDer | -0.5641 | (-0.5744, -0.5536) | -0.4511 | (-0.4632, -0.4388) | -0.5553 | (-0.5658, -0.5446) |
| 3 | ULCh | 0.1548 | (0.1398, 0.1697) | 0.1209 | (0.1059, 0.1360) | 0.2736 | (0.2594, 0.2877) |
| 4 | TER-v0.7.25 | -0.4788 | (-0.4905, -0.4669) | -0.3813 | (-0.3942, -0.3681) | -0.4318 | (-0.4441, -0.4192) |
| 5 | DP-Orp | 0.1255 | (0.1104, 0.1405) | 0.0979 | (0.0827, 0.1130) | 0.1875 | (0.1727, 0.2022) |
| 6 | NIST-v11b | 0.5158 | (0.5045, 0.5269) | 0.4106 | (0.3978, 0.4232) | 0.5157 | (0.5044, 0.5269) |
| 7 | ATEC4 | 0.5125 | (0.5011, 0.5237) | 0.4084 | (0.3956, 0.4210) | 0.5042 | (0.4927, 0.5155) |
| 8 | ATEC1 | 0.5103 | (0.4989, 0.5215) | 0.4070 | (0.3942, 0.4197) | 0.5003 | (0.4887, 0.5116) |
| 9 | SNR | 0.1645 | (0.1496, 0.1793) | 0.1283 | (0.1133, 0.1433) | 0.1842 | (0.1694, 0.1989) |
| 10 | mBLEU | 0.3424 | (0.3288, 0.3558) | 0.2698 | (0.2556, 0.2839) | 0.3976 | (0.3847, 0.4104) |
| 11 | 4-GRR | 0.4824 | (0.4706, 0.4941) | 0.3832 | (0.3701, 0.3961) | 0.4669 | (0.4548, 0.4787) |
| 12 | ATEC2 | 0.5084 | (0.4970, 0.5197) | 0.4053 | (0.3924, 0.4180) | 0.4977 | (0.4861, 0.5091) |
| 13 | SEPIA1 | 0.5330 | (0.5220, 0.5438) | 0.4253 | (0.4127, 0.4378) | 0.5305 | (0.5194, 0.5414) |
| 14 | ULCopt | 0.1554 | (0.1405, 0.1703) | 0.1216 | (0.1065, 0.1366) | 0.2632 | (0.2489, 0.2774) |
| 15 | EDPM | 0.5361 | (0.5251, 0.5469) | 0.4274 | (0.4148, 0.4398) | 0.5497 | (0.5389, 0.5603) |
| 16 | mTER | -0.2819 | (-0.2959, -0.2678) | -0.2240 | (-0.2385, -0.2094) | -0.2480 | (-0.2623, -0.2336) |
| 17 | BLEU-4 | 0.5072 | (0.4957, 0.5184) | 0.4033 | (0.3904, 0.4160) | 0.5418 | (0.5309, 0.5525) |
| 18 | METEOR-v0.6 | 0.5720 | (0.5616, 0.5822) | 0.4576 | (0.4455, 0.4696) | 0.5822 | (0.5720, 0.5922) |
| 19 | BadgerLite | 0.3182 | (0.3044, 0.3319) | 0.2507 | (0.2363, 0.2650) | 0.2838 | (0.2697, 0.2978) |
| 20 | METEOR-ranking | 0.5633 | (0.5528, 0.5737) | 0.4509 | (0.4386, 0.4630) | 0.5796 | (0.5693, 0.5896) |
| 21 | LET | 0.5373 | (0.5264, 0.5481) | 0.4300 | (0.4174, 0.4423) | 0.4947 | (0.4831, 0.5062) |
| 22 | DP-Or | 0.1678 | (0.1529, 0.1826) | 0.1321 | (0.1170, 0.1470) | 0.3053 | (0.2913, 0.3190) |
| 23 | ATEC3 | 0.5084 | (0.4970, 0.5197) | 0.4046 | (0.3918, 0.4173) | 0.4933 | (0.4817, 0.5048) |
| 24 | BLEU-v12 | 0.3972 | (0.3842, 0.4100) | 0.3345 | (0.3209, 0.3480) | 0.4391 | (0.4267, 0.4513) |
| 25 | BEwT-E | 0.4362 | (0.4238, 0.4485) | 0.3505 | (0.3370, 0.3638) | 0.4486 | (0.4363, 0.4607) |
| 26 | DR-Or | 0.1494 | (0.1345, 0.1643) | 0.1174 | (0.1023, 0.1325) | 0.2983 | (0.2843, 0.3122) |
| 27 | BleuSP | 0.5499 | (0.5391, 0.5605) | 0.4385 | (0.4261, 0.4508) | 0.5821 | (0.5719, 0.5921) |
| 28 | SVM-Rank | 0.5707 | (0.5603, 0.5809) | 0.4559 | (0.4437, 0.4679) | 0.5763 | (0.5660, 0.5864) |
| 29 | BLEU-1 | 0.5032 | (0.4917, 0.5146) | 0.4008 | (0.3879, 0.4136) | 0.4802 | (0.4684, 0.4919) |
| 30 | Bleu-sbp | 0.3917 | (0.3786, 0.4045) | 0.3304 | (0.3168, 0.3440) | 0.4346 | (0.4221, 0.4469) |
| 31 | invWer | -0.5250 | (-0.5360, -0.5138) | -0.4207 | (-0.4332, -0.4080) | -0.4832 | (-0.4949, -0.4714) |
| 32 | BLEU-v11b | 0.3917 | (0.3786, 0.4045) | 0.3304 | (0.3168, 0.3440) | 0.4346 | (0.4221, 0.4469) |
| 33 | SR-Or | 0.1475 | (0.1325, 0.1624) | 0.1168 | (0.1017, 0.1319) | 0.2498 | (0.2354, 0.2641) |
| 34 | Badger | 0.2813 | (0.2672, 0.2953) | 0.2209 | (0.2063, 0.2354) | 0.2892 | (0.2751, 0.3031) |
| 35 | Meteor-v0.7 | 0.5665 | (0.5560, 0.5768) | 0.4535 | (0.4413, 0.4656) | 0.5588 | (0.5482, 0.5692) |
| 36 | MaxSim | 0.1747 | (0.1598, 0.1895) | 0.1369 | (0.1219, 0.1519) | 0.2512 | (0.2369, 0.2655) |
| 37 | TERp | -0.5782 | (-0.5883, -0.5679) | -0.4629 | (-0.4748, -0.4508) | -0.5945 | (-0.6043, -0.5846) |

37 metrics (including 7 baseline metrics)
16450 data points (total number of segments used)