Correlation Results

Current Conditions

  • Human Assessment Type: Adequacy, 5-point scale
  • Target Language: French
  • Correlation Level: document

Subdivisions

By track:

Ranking

Single Reference Track
                      Spearman's Rho                Kendall's Tau                 Pearson's R
Rank  Metric Name     Value    95% CI               Value    95% CI               Value    95% CI
1     BadgerLite      0.6373   (0.5572, 0.7057)     0.4654   (0.3621, 0.5575)     0.6143   (0.5304, 0.6862)
2     METEOR-ranking  0.7652   (0.7082, 0.8122)     0.5824   (0.4938, 0.6590)     0.7409   (0.6791, 0.7922)
3     LET             0.7140   (0.6471, 0.7700)     0.5459   (0.4523, 0.6277)     0.7646   (0.7075, 0.8117)
4     SEPIA2          0.7344   (0.6714, 0.7869)     0.5703   (0.4800, 0.6486)     0.7734   (0.7182, 0.8190)
5     CDer            -0.7189  (-0.7740, -0.6529)   -0.5580  (-0.6381, -0.4660)   -0.7994  (-0.8402, -0.7496)
6     ATEC3           0.5068   (0.4082, 0.5937)     0.3646   (0.2517, 0.4678)     0.5958   (0.5092, 0.6705)
7     TER-v0.7.25     -0.6854  (-0.7462, -0.6134)   -0.5325  (-0.6160, -0.4371)   -0.7691  (-0.8155, -0.7130)
8     BLEU-v12        0.7528   (0.6934, 0.8021)     0.5904   (0.5031, 0.6659)     0.7093   (0.6416, 0.7661)
9     BleuSP          0.7559   (0.6970, 0.8046)     0.5931   (0.5061, 0.6681)     0.7384   (0.6762, 0.7902)
10    NIST-v11b       0.7227   (0.6574, 0.7772)     0.5623   (0.4709, 0.6417)     0.7590   (0.7008, 0.8072)
11    SVM-Rank        0.7729   (0.7175, 0.8186)     0.6022   (0.5165, 0.6759)     0.7844   (0.7314, 0.8279)
12    BLEU-1          0.7114   (0.6440, 0.7678)     0.5529   (0.4602, 0.6336)     0.7768   (0.7223, 0.8218)
13    ATEC4           0.5393   (0.4448, 0.6219)     0.3917   (0.2810, 0.4920)     0.6291   (0.5476, 0.6987)
14    Bleu-sbp        0.7523   (0.6928, 0.8017)     0.5909   (0.5036, 0.6663)     0.6968   (0.6268, 0.7557)
15    ATEC1           0.5349   (0.4399, 0.6182)     0.3873   (0.2763, 0.4881)     0.6235   (0.5412, 0.6941)
16    invWer          -0.6884  (-0.7487, -0.6169)   -0.5365  (-0.6195, -0.4417)   -0.7781  (-0.8228, -0.7238)
17    SNR             0.7293   (0.6654, 0.7827)     0.5460   (0.4524, 0.6277)     0.8071   (0.7589, 0.8465)
18    mBLEU           0.7503   (0.6904, 0.8000)     0.5839   (0.4956, 0.6603)     0.7003   (0.6309, 0.7586)
19    4-GRR           0.7528   (0.6934, 0.8021)     0.5926   (0.5055, 0.6677)     0.7287   (0.6646, 0.7822)
20    BLEU-v11b       0.7458   (0.6849, 0.7963)     0.5837   (0.4953, 0.6601)     0.6968   (0.6268, 0.7557)
21    Badger          0.6443   (0.5653, 0.7116)     0.4719   (0.3693, 0.5632)     0.6238   (0.5414, 0.6942)
22    ATEC2           0.5365   (0.4416, 0.6195)     0.3883   (0.2774, 0.4890)     0.6246   (0.5424, 0.6950)
23    SEPIA1          0.7505   (0.6906, 0.8001)     0.5855   (0.4974, 0.6616)     0.7233   (0.6582, 0.7777)
24    Meteor-v0.7     0.7758   (0.7210, 0.8209)     0.6060   (0.5209, 0.6792)     0.6784   (0.6051, 0.7403)
25    MaxSim          0.6392   (0.5593, 0.7073)     0.4645   (0.3611, 0.5567)     0.6892   (0.6178, 0.7493)
26    mTER            -0.6712  (-0.7343, -0.5967)   -0.5183  (-0.6037, -0.4211)   -0.7650  (-0.8120, -0.7080)
27    BLEU-4          0.7480   (0.6876, 0.7981)     0.5865   (0.4985, 0.6625)     0.6961   (0.6260, 0.7551)
28    METEOR-v0.6     0.7222   (0.6569, 0.7768)     0.5444   (0.4506, 0.6263)     0.7825   (0.7291, 0.8264)
29    TERp            -0.7702  (-0.8164, -0.7143)   -0.5926  (-0.6677, -0.5055)   -0.7921  (-0.8342, -0.7407)

29 metrics (including 7 baseline metrics)
249 data points (total number of documents used)

Multiple References Track
                      Spearman's Rho                Kendall's Tau                 Pearson's R
Rank  Metric Name     Value    95% CI               Value    95% CI               Value    95% CI
1     BadgerLite      0.4835   (0.3714, 0.5818)     0.3583   (0.2330, 0.4719)     0.5729   (0.4733, 0.6581)
2     METEOR-ranking  0.6307   (0.5406, 0.7065)     0.4620   (0.3472, 0.5631)     0.7187   (0.6454, 0.7789)
3     LET             0.5599   (0.4582, 0.6470)     0.4201   (0.3007, 0.5266)     0.6867   (0.6070, 0.7527)
4     SEPIA2          0.5908   (0.4940, 0.6731)     0.4352   (0.3174, 0.5398)     0.7105   (0.6355, 0.7722)
5     CDer            -0.6027  (-0.6832, -0.5079)   -0.4415  (-0.5453, -0.3244)   -0.7362  (-0.7931, -0.6666)
6     ATEC3           0.4512   (0.3352, 0.5538)     0.3160   (0.1874, 0.4339)     0.5514   (0.4485, 0.6399)
7     TER-v0.7.25     -0.5598  (-0.6470, -0.4581)   -0.4062  (-0.5144, -0.2854)   -0.7146  (-0.7755, -0.6404)
8     BLEU-v12        0.5553   (0.4530, 0.6432)     0.4137   (0.2936, 0.5209)     0.6875   (0.6080, 0.7534)
9     BleuSP          0.6268   (0.5360, 0.7032)     0.4629   (0.3482, 0.5639)     0.7397   (0.6708, 0.7959)
10    NIST-v11b       0.5622   (0.4609, 0.6490)     0.4114   (0.2911, 0.5189)     0.7025   (0.6259, 0.7657)
11    SVM-Rank        0.6199   (0.5280, 0.6975)     0.4549   (0.3392, 0.5569)     0.7417   (0.6733, 0.7975)
12    BLEU-1          0.5390   (0.4343, 0.6293)     0.3955   (0.2736, 0.5049)     0.6835   (0.6031, 0.7501)
13    Bleu-sbp        0.5727   (0.4730, 0.6579)     0.4218   (0.3025, 0.5281)     0.6977   (0.6201, 0.7617)
14    ATEC4           0.4922   (0.3812, 0.5893)     0.3488   (0.2227, 0.4634)     0.5864   (0.4888, 0.6694)
15    invWer          -0.5925  (-0.6746, -0.4960)   -0.4321  (-0.5371, -0.3140)   -0.7262  (-0.7850, -0.6544)
16    ATEC1           0.4807   (0.3682, 0.5793)     0.3412   (0.2145, 0.4566)     0.5772   (0.4782, 0.6617)
17    SNR             0.6304   (0.5402, 0.7062)     0.4557   (0.3402, 0.5577)     0.7065   (0.6307, 0.7690)
18    mBLEU           0.5457   (0.4420, 0.6351)     0.3976   (0.2759, 0.5068)     0.6620   (0.5776, 0.7325)
19    BLEU-v11b       0.5674   (0.4669, 0.6534)     0.4195   (0.3000, 0.5261)     0.6953   (0.6172, 0.7598)
20    4-GRR           0.6077   (0.5136, 0.6873)     0.4438   (0.3269, 0.5473)     0.7308   (0.6600, 0.7887)
21    Badger          0.4877   (0.3760, 0.5854)     0.3539   (0.2282, 0.4679)     0.5841   (0.4863, 0.6675)
22    ATEC2           0.4828   (0.3705, 0.5811)     0.3416   (0.2149, 0.4570)     0.5781   (0.4793, 0.6624)
23    SEPIA1          0.5983   (0.5027, 0.6794)     0.4423   (0.3252, 0.5460)     0.7094   (0.6342, 0.7713)
24    Meteor-v0.7     0.6233   (0.5319, 0.7003)     0.4605   (0.3456, 0.5619)     0.7038   (0.6274, 0.7667)
25    MaxSim          0.5039   (0.3943, 0.5993)     0.3582   (0.2329, 0.4718)     0.5958   (0.4998, 0.6773)
26    mTER            -0.5819  (-0.6656, -0.4836)   -0.4385  (-0.5427, -0.3211)   -0.7214  (-0.7810, -0.6486)
27    BLEU-4          0.5746   (0.4752, 0.6595)     0.4239   (0.3049, 0.5299)     0.6986   (0.6212, 0.7625)
28    METEOR-v0.6     0.5977   (0.5021, 0.6790)     0.4334   (0.3154, 0.5383)     0.7303   (0.6594, 0.7883)
29    TERp            -0.6239  (-0.7008, -0.5326)   -0.4505  (-0.5531, -0.3343)   -0.7403  (-0.7964, -0.6716)

29 metrics (including 7 baseline metrics)
206 data points (total number of documents used)
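
Note on the confidence intervals

The page does not state how the 95% confidence intervals were computed. The reported bounds appear consistent with the standard Fisher z-transform applied to each coefficient, using the number of data points listed under each track (249 and 206). That is an inference from the numbers, not a documented method. The sketch below shows the calculation under that assumption; the function name fisher_ci is hypothetical and uses only the Python standard library.

```python
import math


def fisher_ci(r, n, z_crit=1.959964):
    """Approximate confidence interval for a correlation coefficient.

    Assumption: the interval is built with the Fisher z-transform,
    z = atanh(r), with standard error 1 / sqrt(n - 3).
    z_crit defaults to the two-sided 95% normal critical value.
    """
    if n <= 3:
        raise ValueError("need more than 3 data points")
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    z = math.atanh(r)
    half_width = z_crit / math.sqrt(n - 3)
    # Transform the bounds back to the correlation scale.
    return math.tanh(z - half_width), math.tanh(z + half_width)


# BadgerLite, Single Reference Track, Spearman's Rho (249 documents)
low, high = fisher_ci(0.6373, 249)
print(f"({low:.4f}, {high:.4f})")
```

For the BadgerLite row of the Single Reference Track, this gives bounds that agree with the reported (0.5572, 0.7057) to about three decimal places; small differences in the last digit are expected because the input coefficient is itself rounded. The same check works for the negative coefficients (for example CDer or TERp), since the transform is symmetric around zero.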