Correlation Results

Current Conditions

  • Human Assessment Type: Fluency, 5-point scale
  • Target Language: French
  • Correlation Level: segment

Subdivisions

By track:

Ranking

Single Reference Track
Rank  Metric Name     Spearman's Rho                    Kendall's Tau                     Pearson's R
                      Value    95% confidence interval  Value    95% confidence interval  Value    95% confidence interval
1     BadgerLite      0.3858   (0.3654, 0.4058)         0.2850   (0.2630, 0.3066)         0.4543   (0.4353, 0.4729)
2     METEOR-ranking  0.3598   (0.3390, 0.3802)         0.2681   (0.2460, 0.2900)         0.4242   (0.4046, 0.4435)
3     LET             0.3428   (0.3217, 0.3635)         0.2568   (0.2345, 0.2787)         0.3658   (0.3451, 0.3861)
4     SEPIA2          0.3618   (0.3411, 0.3823)         0.2702   (0.2481, 0.2920)         0.2145   (0.1918, 0.2370)
5     CDer            -0.3891  (-0.4090, -0.3688)       -0.2915  (-0.3130, -0.2697)       -0.2954  (-0.3169, -0.2737)
6     ATEC3           0.2498   (0.2275, 0.2719)         0.1813   (0.1583, 0.2041)         0.3160   (0.2945, 0.3372)
7     TER-v0.7.25     -0.4027  (-0.4224, -0.3827)       -0.3016  (-0.3230, -0.2800)       -0.2147  (-0.2371, -0.1920)
8     BLEU-v12        0.2416   (0.2192, 0.2638)         0.1917   (0.1688, 0.2144)         0.3586   (0.3378, 0.3790)
9     BleuSP          0.4465   (0.4273, 0.4653)         0.3343   (0.3131, 0.3552)         0.4928   (0.4747, 0.5105)
10    NIST-v11b       0.3466   (0.3256, 0.3673)         0.2585   (0.2362, 0.2804)         0.3925   (0.3723, 0.4123)
11    SVM-Rank        0.3729   (0.3524, 0.3931)         0.2768   (0.2548, 0.2985)         0.4217   (0.4020, 0.4410)
12    BLEU-1          0.3451   (0.3241, 0.3658)         0.2575   (0.2352, 0.2794)         0.3790   (0.3586, 0.3991)
13    ATEC4           0.2696   (0.2475, 0.2914)         0.1971   (0.1742, 0.2197)         0.3368   (0.3157, 0.3576)
14    Bleu-sbp        0.2531   (0.2308, 0.2751)         0.2011   (0.1783, 0.2237)         0.3671   (0.3465, 0.3875)
15    ATEC1           0.2644   (0.2423, 0.2863)         0.1930   (0.1701, 0.2157)         0.3331   (0.3119, 0.3540)
16    invWer          -0.4203  (-0.4397, -0.4007)       -0.3155  (-0.3366, -0.2940)       -0.2236  (-0.2460, -0.2010)
17    SNR             0.3402   (0.3191, 0.3609)         0.2535   (0.2312, 0.2755)         0.2621   (0.2399, 0.2841)
18    mBLEU           0.3796   (0.3592, 0.3997)         0.2802   (0.2582, 0.3019)         0.4145   (0.3947, 0.4340)
19    4-GRR           0.3932   (0.3730, 0.4130)         0.2938   (0.2720, 0.3152)         0.3462   (0.3252, 0.3669)
20    BLEU-v11b       0.2531   (0.2308, 0.2751)         0.2011   (0.1783, 0.2237)         0.3671   (0.3465, 0.3875)
21    Badger          0.3866   (0.3663, 0.4065)         0.2841   (0.2622, 0.3057)         0.4566   (0.4377, 0.4752)
22    ATEC2           0.2646   (0.2425, 0.2865)         0.1931   (0.1702, 0.2158)         0.3331   (0.3119, 0.3540)
23    SEPIA1          0.3685   (0.3478, 0.3887)         0.2754   (0.2533, 0.2971)         0.3295   (0.3082, 0.3505)
24    Meteor-v0.7     0.3970   (0.3769, 0.4168)         0.2989   (0.2772, 0.3203)         0.4564   (0.4374, 0.4749)
25    MaxSim          0.2518   (0.2295, 0.2738)         0.1839   (0.1610, 0.2067)         0.3270   (0.3057, 0.3480)
26    mTER            -0.3706  (-0.3909, -0.3500)       -0.2761  (-0.2978, -0.2540)       -0.1790  (-0.2018, -0.1560)
27    BLEU-4          0.3935   (0.3733, 0.4133)         0.2922   (0.2704, 0.3137)         0.4568   (0.4379, 0.4754)
28    METEOR-v0.6     0.3644   (0.3437, 0.3848)         0.2713   (0.2492, 0.2931)         0.4052   (0.3852, 0.4248)
29    TERp            -0.4057  (-0.4253, -0.3857)       -0.3069  (-0.3282, -0.2853)       -0.4517  (-0.4704, -0.4327)

29 metrics (including 7 baseline metrics)
6850 data points (total number of segments used)
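
The page does not state how the 95% confidence intervals were computed. One common approach, assumed here purely for illustration, is the Fisher z-transformation: transform the coefficient with atanh, add and subtract a normal critical value times 1/sqrt(n - 3), and transform back with tanh. The function name `fisher_interval` is hypothetical, and for Spearman's Rho and Kendall's Tau this formula is only a rough approximation.

```python
import math
from statistics import NormalDist
from typing import Tuple


def fisher_interval(r: float, n: int, confidence: float = 0.95) -> Tuple[float, float]:
    """Approximate confidence interval for a correlation coefficient r
    estimated from n paired observations (Fisher z-transformation).

    Assumption: this is an illustrative method, not necessarily the one
    used to produce the intervals listed on this page.
    """
    if n <= 3:
        raise ValueError("need more than 3 observations")
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    z = math.atanh(r)                  # Fisher z-transform of r
    se = 1.0 / math.sqrt(n - 3)        # approximate standard error of z
    crit = NormalDist().inv_cdf(0.5 + confidence / 2.0)  # about 1.96 for 95%
    return math.tanh(z - crit * se), math.tanh(z + crit * se)


# Pearson's R for BadgerLite in the Single Reference Track, 6850 segments.
low, high = fisher_interval(0.4543, 6850)
print(f"({low:.4f}, {high:.4f})")
```

With the BadgerLite Pearson's R value (0.4543) and 6850 data points, this sketch gives an interval of approximately (0.4353, 0.4729), close to the one listed in the table. With this many segments the interval is narrow, so ranks separated by a few hundredths can still have non-overlapping intervals.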

Multiple References Track
Rank  Metric Name     Spearman's Rho                    Kendall's Tau                     Pearson's R
                      Value    95% confidence interval  Value    95% confidence interval  Value    95% confidence interval
1     BadgerLite      0.5094   (0.4909, 0.5275)         0.3816   (0.3603, 0.4026)         0.5312   (0.5132, 0.5487)
2     METEOR-ranking  0.5265   (0.5084, 0.5442)         0.3969   (0.3759, 0.4176)         0.5671   (0.5501, 0.5836)
3     LET             0.4724   (0.4529, 0.4914)         0.3554   (0.3335, 0.3768)         0.4261   (0.4056, 0.4462)
4     SEPIA2          0.4944   (0.4755, 0.5129)         0.3731   (0.3516, 0.3942)         0.2574   (0.2341, 0.2803)
5     CDer            -0.5306  (-0.5481, -0.5126)       -0.4001  (-0.4207, -0.3791)       -0.5267  (-0.5443, -0.5085)
6     ATEC3           0.4108   (0.3900, 0.4312)         0.3043   (0.2817, 0.3266)         0.3859   (0.3646, 0.4067)
7     TER-v0.7.25     -0.5254  (-0.5431, -0.5073)       -0.3959  (-0.4166, -0.3748)       -0.4859  (-0.5046, -0.4668)
8     BLEU-v12        0.3814   (0.3601, 0.4024)         0.2934   (0.2706, 0.3158)         0.3928   (0.3717, 0.4136)
9     BleuSP          0.5572   (0.5399, 0.5740)         0.4204   (0.3999, 0.4406)         0.5904   (0.5740, 0.6063)
10    NIST-v11b       0.4201   (0.3995, 0.4402)         0.3117   (0.2892, 0.3338)         0.4250   (0.4045, 0.4450)
11    SVM-Rank        0.5289   (0.5108, 0.5465)         0.3966   (0.3756, 0.4173)         0.5382   (0.5204, 0.5556)
12    BLEU-1          0.4622   (0.4425, 0.4815)         0.3469   (0.3249, 0.3684)         0.4410   (0.4209, 0.4608)
13    Bleu-sbp        0.3824   (0.3610, 0.4033)         0.2952   (0.2724, 0.3176)         0.4050   (0.3841, 0.4255)
14    ATEC4           0.4287   (0.4083, 0.4487)         0.3179   (0.2955, 0.3400)         0.4077   (0.3869, 0.4281)
15    invWer          -0.5465  (-0.5636, -0.5289)       -0.4138  (-0.4341, -0.3930)       -0.5130  (-0.5310, -0.4945)
16    ATEC1           0.4205   (0.3999, 0.4406)         0.3118   (0.2893, 0.3339)         0.4022   (0.3813, 0.4228)
17    SNR             0.4687   (0.4492, 0.4878)         0.3466   (0.3246, 0.3682)         0.3294   (0.3071, 0.3512)
18    mBLEU           0.4398   (0.4196, 0.4595)         0.3231   (0.3008, 0.3451)         0.4711   (0.4516, 0.4901)
19    BLEU-v11b       0.3829   (0.3616, 0.4038)         0.2956   (0.2729, 0.3181)         0.4053   (0.3844, 0.4258)
20    4-GRR           0.4855   (0.4664, 0.5042)         0.3639   (0.3422, 0.3852)         0.4950   (0.4761, 0.5135)
21    Badger          0.4975   (0.4786, 0.5159)         0.3713   (0.3497, 0.3924)         0.5235   (0.5052, 0.5412)
22    ATEC2           0.4199   (0.3993, 0.4400)         0.3112   (0.2887, 0.3334)         0.4018   (0.3809, 0.4224)
23    SEPIA1          0.5083   (0.4897, 0.5264)         0.3816   (0.3603, 0.4025)         0.4226   (0.4021, 0.4428)
24    Meteor-v0.7     0.5342   (0.5163, 0.5516)         0.4026   (0.3817, 0.4232)         0.5703   (0.5534, 0.5868)
25    MaxSim          0.3405   (0.3185, 0.3622)         0.2502   (0.2269, 0.2733)         0.3629   (0.3413, 0.3842)
26    mTER            -0.4681  (-0.4872, -0.4485)       -0.3499  (-0.3714, -0.3280)       -0.4363  (-0.4561, -0.4160)
27    BLEU-4          0.4956   (0.4767, 0.5140)         0.3704   (0.3489, 0.3916)         0.5340   (0.5161, 0.5515)
28    METEOR-v0.6     0.5249   (0.5067, 0.5426)         0.3954   (0.3743, 0.4160)         0.5556   (0.5383, 0.5725)
29    TERp            -0.5362  (-0.5536, -0.5183)       -0.4049  (-0.4254, -0.3840)       -0.5660  (-0.5826, -0.5490)

29 metrics (including 7 baseline metrics)
6274 data points (total number of segments used)