Correlation Results

Current Conditions

  • Human Assessment Type: Fluency, 5-point scale
  • Target Language: French
  • Correlation Level: document

Subdivisions

By track:

Ranking

Single Reference Track
Rank  Metric Name     Spearman's Rho (95% CI)     Kendall's Tau (95% CI)      Pearson's R (95% CI)
1     BadgerLite      0.5228 (0.4262, 0.6077)     0.3709 (0.2585, 0.4734)     0.5198 (0.4228, 0.6050)
2     METEOR-ranking  0.4527 (0.3479, 0.5462)     0.3364 (0.2214, 0.4422)     0.3917 (0.2811, 0.4921)
3     LET             0.4148 (0.3063, 0.5127)     0.3145 (0.1979, 0.4223)     0.3633 (0.2503, 0.4666)
4     SEPIA2          0.4397 (0.3337, 0.5348)     0.3319 (0.2166, 0.4382)     0.4028 (0.2931, 0.5020)
5     CDer            -0.4116 (-0.5098, -0.3027)  -0.3159 (-0.4236, -0.1995)  -0.3663 (-0.4692, -0.2535)
6     ATEC3           0.1738 (0.0506, 0.2918)     0.1254 (0.0011, 0.2459)     0.2078 (0.0857, 0.3238)
7     TER-v0.7.25     -0.4051 (-0.5040, -0.2957)  -0.3105 (-0.4187, -0.1937)  -0.3950 (-0.4950, -0.2846)
8     BLEU-v12        0.5114 (0.4134, 0.5978)     0.3890 (0.2781, 0.4896)     0.4933 (0.3931, 0.5820)
9     BleuSP          0.5052 (0.4065, 0.5924)     0.3792 (0.2675, 0.4809)     0.5029 (0.4039, 0.5903)
10    NIST-v11b       0.4405 (0.3346, 0.5355)     0.3355 (0.2204, 0.4414)     0.4187 (0.3106, 0.5162)
11    SVM-Rank        0.4520 (0.3472, 0.5456)     0.3401 (0.2253, 0.4456)     0.4612 (0.3574, 0.5538)
12    BLEU-1          0.4204 (0.3124, 0.5177)     0.3182 (0.2018, 0.4256)     0.3904 (0.2797, 0.4909)
13    ATEC4           0.2084 (0.0863, 0.3243)     0.1521 (0.0283, 0.2713)     0.2320 (0.1109, 0.3464)
14    Bleu-sbp        0.4972 (0.3975, 0.5853)     0.3779 (0.2660, 0.4796)     0.4807 (0.3790, 0.5709)
15    ATEC1           0.2031 (0.0808, 0.3193)     0.1478 (0.0239, 0.2672)     0.2306 (0.1094, 0.3450)
16    invWer          -0.4023 (-0.5015, -0.2926)  -0.3064 (-0.4149, -0.1893)  -0.4044 (-0.5034, -0.2949)
17    SNR             0.4159 (0.3075, 0.5136)     0.3099 (0.1930, 0.4181)     0.3190 (0.2027, 0.4264)
18    mBLEU           0.5283 (0.4323, 0.6124)     0.4042 (0.2946, 0.5032)     0.5021 (0.4030, 0.5896)
19    4-GRR           0.4894 (0.3888, 0.5785)     0.3736 (0.2615, 0.4759)     0.4764 (0.3742, 0.5671)
20    BLEU-v11b       0.5043 (0.4054, 0.5915)     0.3813 (0.2697, 0.4827)     0.4861 (0.3850, 0.5756)
21    Badger          0.5254 (0.4291, 0.6099)     0.3761 (0.2642, 0.4781)     0.5196 (0.4226, 0.6048)
22    ATEC2           0.2053 (0.0831, 0.3215)     0.1488 (0.0249, 0.2681)     0.2317 (0.1106, 0.3461)
23    SEPIA1          0.4604 (0.3565, 0.5531)     0.3482 (0.2340, 0.4529)     0.4372 (0.3309, 0.5326)
24    Meteor-v0.7     0.5007 (0.4014, 0.5884)     0.3816 (0.2701, 0.4830)     0.4645 (0.3610, 0.5567)
25    MaxSim          0.3234 (0.2075, 0.4304)     0.2261 (0.1047, 0.3408)     0.3428 (0.2282, 0.4480)
26    mTER            -0.3938 (-0.4939, -0.2833)  -0.3027 (-0.4116, -0.1854)  -0.3785 (-0.4803, -0.2668)
27    BLEU-4          0.5074 (0.4089, 0.5942)     0.3840 (0.2727, 0.4852)     0.4888 (0.3880, 0.5780)
28    METEOR-v0.6     0.4122 (0.3034, 0.5104)     0.3062 (0.1891, 0.4148)     0.3645 (0.2515, 0.4676)
29    TERp            -0.5050 (-0.5922, -0.4062)  -0.3746 (-0.4768, -0.2626)  -0.5224 (-0.6073, -0.4257)

29 metrics (including 7 baseline metrics)
249 data points (total number of documents used)
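
The page does not say how the 95% confidence intervals were computed. The reported values look consistent with a Fisher z-transformation using a standard error of 1/sqrt(n - 3), applied in the same way to all three coefficients. The following is a minimal sketch under that assumption; the function name `fisher_ci` and the critical value 1.959964 are illustrative choices, not taken from the source.

```python
import math

def fisher_ci(r, n, z_crit=1.959964):
    """Approximate 95% interval for a correlation r over n data points.

    Assumption: Fisher z-transform with standard error 1/sqrt(n - 3).
    The source page does not confirm that this is the method it used.
    """
    z = math.atanh(r)                        # map r onto the z scale
    half_width = z_crit / math.sqrt(n - 3)   # interval half-width on the z scale
    return math.tanh(z - half_width), math.tanh(z + half_width)

# BadgerLite, Pearson's R, Single Reference Track (249 documents)
low, high = fisher_ci(0.5198, 249)
print(f"({low:.4f}, {high:.4f})")
```

Under this assumption the result should land close to the interval listed for BadgerLite, (0.4228, 0.6050). For Spearman's Rho and Kendall's Tau the same formula is only an approximation.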

Multiple References Track
Rank  Metric Name     Spearman's Rho (95% CI)     Kendall's Tau (95% CI)      Pearson's R (95% CI)
1     BadgerLite      0.5029 (0.3932, 0.5984)     0.3629 (0.2380, 0.4760)     0.6624 (0.5780, 0.7328)
2     METEOR-ranking  0.5320 (0.4263, 0.6234)     0.3885 (0.2659, 0.4987)     0.7075 (0.6319, 0.7698)
3     LET             0.5405 (0.4361, 0.6306)     0.3915 (0.2692, 0.5014)     0.7254 (0.6535, 0.7843)
4     SEPIA2          0.5820 (0.4838, 0.6657)     0.4264 (0.3076, 0.5321)     0.7594 (0.6948, 0.8118)
5     CDer            -0.5038 (-0.5993, -0.3943)  -0.3697 (-0.4821, -0.2454)  -0.7251 (-0.7840, -0.6531)
6     ATEC3           0.3142 (0.1855, 0.4323)     0.2210 (0.0869, 0.3472)     0.4547 (0.3391, 0.5568)
7     TER-v0.7.25     -0.4976 (-0.5939, -0.3872)  -0.3650 (-0.4778, -0.2403)  -0.7459 (-0.8010, -0.6784)
8     BLEU-v12        0.5520 (0.4492, 0.6404)     0.4030 (0.2818, 0.5115)     0.7454 (0.6778, 0.8006)
9     BleuSP          0.5687 (0.4684, 0.6545)     0.4165 (0.2967, 0.5234)     0.7603 (0.6959, 0.8126)
10    NIST-v11b       0.5392 (0.4345, 0.6295)     0.3908 (0.2685, 0.5008)     0.7287 (0.6574, 0.7870)
11    SVM-Rank        0.5129 (0.4046, 0.6071)     0.3729 (0.2488, 0.4849)     0.6898 (0.6107, 0.7553)
12    BLEU-1          0.5409 (0.4364, 0.6309)     0.3919 (0.2697, 0.5018)     0.7324 (0.6619, 0.7900)
13    Bleu-sbp        0.5498 (0.4467, 0.6385)     0.3997 (0.2782, 0.5086)     0.7434 (0.6753, 0.7989)
14    ATEC4           0.3568 (0.2314, 0.4706)     0.2560 (0.1236, 0.3794)     0.4894 (0.3780, 0.5869)
15    invWer          -0.5090 (-0.6037, -0.4001)  -0.3755 (-0.4872, -0.2517)  -0.7486 (-0.8031, -0.6817)
16    ATEC1           0.3539 (0.2283, 0.4680)     0.2524 (0.1198, 0.3761)     0.4844 (0.3723, 0.5825)
17    SNR             0.5489 (0.4456, 0.6378)     0.3974 (0.2757, 0.5066)     0.6755 (0.5936, 0.7435)
18    mBLEU           0.6140 (0.5211, 0.6926)     0.4532 (0.3374, 0.5555)     0.7559 (0.6905, 0.8090)
19    BLEU-v11b       0.5501 (0.4471, 0.6388)     0.4016 (0.2802, 0.5103)     0.7480 (0.6809, 0.8026)
20    4-GRR           0.4978 (0.3875, 0.5941)     0.3620 (0.2370, 0.4752)     0.6911 (0.6123, 0.7564)
21    Badger          0.5009 (0.3909, 0.5967)     0.3590 (0.2338, 0.4725)     0.6676 (0.5842, 0.7370)
22    ATEC2           0.3553 (0.2297, 0.4692)     0.2534 (0.1208, 0.3770)     0.4863 (0.3745, 0.5842)
23    SEPIA1          0.5585 (0.4567, 0.6459)     0.4065 (0.2857, 0.5146)     0.7478 (0.6806, 0.8024)
24    Meteor-v0.7     0.5485 (0.4451, 0.6374)     0.4015 (0.2801, 0.5102)     0.7256 (0.6538, 0.7845)
25    MaxSim          0.2958 (0.1658, 0.4157)     0.2105 (0.0760, 0.3375)     0.4388 (0.3214, 0.5429)
26    mTER            -0.5627 (-0.6494, -0.4614)  -0.4200 (-0.5265, -0.3005)  -0.7775 (-0.8264, -0.7171)
27    BLEU-4          0.5487 (0.4454, 0.6376)     0.3993 (0.2777, 0.5082)     0.7444 (0.6766, 0.7997)
28    METEOR-v0.6     0.4942 (0.3834, 0.5910)     0.3562 (0.2307, 0.4700)     0.6954 (0.6174, 0.7599)
29    TERp            -0.5462 (-0.6355, -0.4425)  -0.3959 (-0.5053, -0.2740)  -0.7688 (-0.8194, -0.7063)

29 metrics (including 7 baseline metrics)
206 data points (total number of documents used)