Correlation Results

Current Conditions

  • Human Assessment Type: Adequacy, 5-point scale
  • Target Language: French
  • Correlation Level: segment

Subdivisions

By track:

Ranking

Single Reference Track
Rank  Metric Name     Spearman's Rho  Rho 95% CI          Kendall's Tau  Tau 95% CI          Pearson's R  R 95% CI
1     BadgerLite      0.4967          (0.4787, 0.5144)    0.3715         (0.3509, 0.3917)    0.4925       (0.4743, 0.5102)
2     METEOR-ranking  0.5636          (0.5472, 0.5795)    0.4274         (0.4078, 0.4465)    0.5857       (0.5699, 0.6010)
3     LET             0.5551          (0.5385, 0.5712)    0.4229         (0.4033, 0.4422)    0.5890       (0.5734, 0.6043)
4     SEPIA2          0.5605          (0.5440, 0.5765)    0.4241         (0.4045, 0.4433)    0.3842       (0.3638, 0.4042)
5     CDer            -0.5789         (-0.5944, -0.5630)  -0.4407        (-0.4595, -0.4214)  -0.5014      (-0.5189, -0.4834)
6     ATEC3           0.4711          (0.4525, 0.4894)    0.3510         (0.3300, 0.3716)    0.5193       (0.5018, 0.5363)
7     TER-v0.7.25     -0.5617         (-0.5777, -0.5452)  -0.4267        (-0.4459, -0.4071)  -0.3774      (-0.3976, -0.3569)
8     BLEU-v12        0.4196          (0.3999, 0.4389)    0.3320         (0.3108, 0.3529)    0.4820       (0.4636, 0.5000)
9     BleuSP          0.5978          (0.5824, 0.6128)    0.4540         (0.4350, 0.4726)    0.5947       (0.5792, 0.6098)
10    NIST-v11b       0.5594          (0.5429, 0.5755)    0.4232         (0.4036, 0.4425)    0.5821       (0.5662, 0.5975)
11    SVM-Rank        0.5721          (0.5560, 0.5879)    0.4317         (0.4122, 0.4508)    0.5747       (0.5586, 0.5904)
12    BLEU-1          0.5531          (0.5364, 0.5693)    0.4187         (0.3990, 0.4381)    0.5900       (0.5743, 0.6052)
13    ATEC4           0.4916          (0.4735, 0.5094)    0.3681         (0.3474, 0.3883)    0.5397       (0.5227, 0.5563)
14    Bleu-sbp        0.4097          (0.3898, 0.4292)    0.3263         (0.3050, 0.3473)    0.4747       (0.4561, 0.4928)
15    ATEC1           0.4888          (0.4706, 0.5066)    0.3658         (0.3451, 0.3862)    0.5374       (0.5203, 0.5540)
16    invWer          -0.5764         (-0.5920, -0.5604)  -0.4385        (-0.4574, -0.4192)  -0.3854      (-0.4054, -0.3651)
17    SNR             0.5211          (0.5036, 0.5381)    0.3929         (0.3727, 0.4127)    0.4971       (0.4791, 0.5147)
18    mBLEU           0.5256          (0.5082, 0.5425)    0.3923         (0.3721, 0.4122)    0.4916       (0.4734, 0.5093)
19    4-GRR           0.5624          (0.5460, 0.5784)    0.4270         (0.4074, 0.4461)    0.4850       (0.4667, 0.5029)
20    BLEU-v11b       0.4097          (0.3898, 0.4292)    0.3264         (0.3050, 0.3474)    0.4747       (0.4561, 0.4928)
21    Badger          0.4854          (0.4671, 0.5033)    0.3614         (0.3407, 0.3818)    0.4938       (0.4757, 0.5115)
22    ATEC2           0.4883          (0.4700, 0.5061)    0.3654         (0.3447, 0.3857)    0.5367       (0.5196, 0.5533)
23    SEPIA1          0.5646          (0.5483, 0.5805)    0.4277         (0.4081, 0.4469)    0.4810       (0.4625, 0.4990)
24    Meteor-v0.7     0.5749          (0.5589, 0.5906)    0.4384         (0.4190, 0.4573)    0.5485       (0.5317, 0.5649)
25    MaxSim          0.4456          (0.4264, 0.4644)    0.3317         (0.3104, 0.3526)    0.5015       (0.4836, 0.5190)
26    mTER            -0.5292         (-0.5461, -0.5120)  -0.3994        (-0.4191, -0.3793)  -0.3403      (-0.3611, -0.3192)
27    BLEU-4          0.5543          (0.5377, 0.5705)    0.4178         (0.3981, 0.4372)    0.5417       (0.5248, 0.5583)
28    METEOR-v0.6     0.5648          (0.5484, 0.5807)    0.4277         (0.4081, 0.4468)    0.6016       (0.5863, 0.6165)
29    TERp            -0.5829         (-0.5984, -0.5671)  -0.4476        (-0.4664, -0.4285)  -0.5941      (-0.6092, -0.5786)

29 metrics (including 7 baseline metrics)
6850 data points (total number of segments used)

Multiple References Track
Rank  Metric Name     Spearman's Rho  Rho 95% CI          Kendall's Tau  Tau 95% CI          Pearson's R  R 95% CI
1     BadgerLite      0.5191          (0.5008, 0.5370)    0.3883         (0.3671, 0.4091)    0.4800       (0.4607, 0.4988)
2     METEOR-ranking  0.5496          (0.5321, 0.5667)    0.4133         (0.3926, 0.4336)    0.5775       (0.5608, 0.5938)
3     LET             0.5059          (0.4873, 0.5241)    0.3814         (0.3600, 0.4023)    0.4858       (0.4667, 0.5045)
4     SEPIA2          0.5165          (0.4981, 0.5344)    0.3875         (0.3663, 0.4083)    0.3192       (0.2968, 0.3412)
5     CDer            -0.5437         (-0.5609, -0.5260)  -0.4081        (-0.4286, -0.3873)  -0.5531      (-0.5701, -0.5357)
6     ATEC3           0.4908          (0.4718, 0.5094)    0.3653         (0.3437, 0.3866)    0.4870       (0.4679, 0.5057)
7     TER-v0.7.25     -0.5274         (-0.5451, -0.5093)  -0.3964        (-0.4170, -0.3753)  -0.5045      (-0.5227, -0.4858)
8     BLEU-v12        0.4233          (0.4028, 0.4434)    0.3218         (0.2995, 0.3438)    0.4387       (0.4185, 0.4584)
9     BleuSP          0.5717          (0.5548, 0.5881)    0.4293         (0.4089, 0.4493)    0.5971       (0.5809, 0.6128)
10    NIST-v11b       0.4743          (0.4549, 0.4932)    0.3509         (0.3290, 0.3724)    0.4980       (0.4792, 0.5164)
11    SVM-Rank        0.5632          (0.5460, 0.5798)    0.4221         (0.4016, 0.4422)    0.5771       (0.5604, 0.5934)
12    BLEU-1          0.4970          (0.4782, 0.5154)    0.3711         (0.3496, 0.3922)    0.4996       (0.4808, 0.5180)
13    Bleu-sbp        0.4280          (0.4075, 0.4480)    0.3264         (0.3041, 0.3483)    0.4508       (0.4309, 0.4703)
14    ATEC4           0.5077          (0.4890, 0.5258)    0.3786         (0.3572, 0.3996)    0.5063       (0.4877, 0.5245)
15    invWer          -0.5480         (-0.5650, -0.5304)  -0.4128        (-0.4331, -0.3921)  -0.5296      (-0.5472, -0.5116)
16    ATEC1           0.4991          (0.4803, 0.5174)    0.3720         (0.3505, 0.3931)    0.5005       (0.4817, 0.5188)
17    SNR             0.4754          (0.4561, 0.4944)    0.3510         (0.3291, 0.3725)    0.3951       (0.3740, 0.4157)
18    mBLEU           0.4410          (0.4208, 0.4607)    0.3242         (0.3019, 0.3462)    0.4588       (0.4390, 0.4781)
19    BLEU-v11b       0.4286          (0.4082, 0.4486)    0.3269         (0.3046, 0.3488)    0.4514       (0.4315, 0.4709)
20    4-GRR           0.5472          (0.5297, 0.5643)    0.4103         (0.3895, 0.4307)    0.5398       (0.5220, 0.5571)
21    Badger          0.5000          (0.4812, 0.5184)    0.3730         (0.3515, 0.3941)    0.4694       (0.4499, 0.4884)
22    ATEC2           0.4981          (0.4793, 0.5165)    0.3711         (0.3496, 0.3923)    0.4992       (0.4804, 0.5175)
23    SEPIA1          0.5238          (0.5056, 0.5415)    0.3907         (0.3695, 0.4114)    0.4566       (0.4368, 0.4760)
24    Meteor-v0.7     0.5430          (0.5253, 0.5602)    0.4077         (0.3869, 0.4282)    0.5548       (0.5375, 0.5717)
25    MaxSim          0.4371          (0.4169, 0.4569)    0.3226         (0.3003, 0.3446)    0.4600       (0.4402, 0.4793)
26    mTER            -0.4924         (-0.5109, -0.4734)  -0.3686        (-0.3898, -0.3470)  -0.4697      (-0.4887, -0.4502)
27    BLEU-4          0.5102          (0.4916, 0.5282)    0.3801         (0.3588, 0.4011)    0.5347       (0.5168, 0.5521)
28    METEOR-v0.6     0.5525          (0.5351, 0.5695)    0.4158         (0.3951, 0.4361)    0.5825       (0.5659, 0.5986)
29    TERp            -0.5454         (-0.5625, -0.5278)  -0.4092        (-0.4296, -0.3884)  -0.5722      (-0.5886, -0.5554)

29 metrics (including 7 baseline metrics)
6274 data points (total number of segments used)
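
Note on the confidence intervals

The page does not state how the 95% confidence intervals were computed. One common approach for a correlation coefficient is the Fisher z-transformation, and the sketch below applies it under that assumption. The function name fisher_ci and the constant z_crit are illustrative choices, not taken from the source.

```python
import math

def fisher_ci(r, n, z_crit=1.959964):
    """Approximate 95% CI for a correlation r estimated from n pairs.

    Assumption: the interval is built with the Fisher z-transformation,
    which the source page does not confirm.
    """
    z = math.atanh(r)                # map r to the z scale
    se = 1.0 / math.sqrt(n - 3)      # standard error on the z scale
    lo = math.tanh(z - z_crit * se)  # map the bounds back to the r scale
    hi = math.tanh(z + z_crit * se)
    return lo, hi

# BadgerLite, Spearman's rho, Single Reference Track (6850 segments)
lo, hi = fisher_ci(0.4967, 6850)
print(f"({lo:.4f}, {hi:.4f})")
```

For this row the result agrees with the tabulated interval (0.4787, 0.5144) to about three decimal places, which suggests, but does not prove, that a similar method was used. The same call with n = 6274 would apply to the Multiple References Track.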