Correlation Results

Current Conditions

  • Human Assessment Type: Adequacy, 4-point scale
  • Target Language: English
  • Correlation Level: document

Subdivisions

By track:

Ranking

Single Reference Track
Rank | Metric Name | Spearman's Rho | 95% confidence interval | Kendall's Tau | 95% confidence interval | Pearson's R | 95% confidence interval
1 | SEPIA2 | 0.6282 | (0.4894, 0.7360) | 0.4551 | (0.2802, 0.6008) | 0.6374 | (0.5009, 0.7429)
2 | CDer | -0.7334 | (-0.8142, -0.6248) | -0.5393 | (-0.6676, -0.3799) | -0.7127 | (-0.7990, -0.5976)
3 | ULCh | 0.5685 | (0.4154, 0.6903) | 0.4033 | (0.2207, 0.5586) | 0.5602 | (0.4052, 0.6839)
4 | TER-v0.7.25 | -0.7240 | (-0.8073, -0.6124) | -0.5239 | (-0.6555, -0.3613) | -0.7041 | (-0.7927, -0.5864)
5 | DP-Orp | 0.1952 | (-0.0055, 0.3808) | 0.1252 | (-0.0772, 0.3177) | 0.2398 | (0.0413, 0.4201)
6 | NIST-v11b | 0.6809 | (0.5563, 0.7755) | 0.4889 | (0.3197, 0.6278) | 0.6535 | (0.5213, 0.7551)
7 | ATEC4 | 0.3309 | (0.1397, 0.4984) | 0.2371 | (0.0385, 0.4178) | 0.2856 | (0.0903, 0.4598)
8 | ATEC1 | 0.3216 | (0.1295, 0.4905) | 0.2303 | (0.0313, 0.4118) | 0.2589 | (0.0616, 0.4368)
9 | mBLEU | 0.3444 | (0.1546, 0.5097) | 0.2480 | (0.0500, 0.4272) | 0.1399 | (-0.0624, 0.3311)
10 | SNR | 0.5934 | (0.4460, 0.7095) | 0.4169 | (0.2362, 0.5698) | 0.5816 | (0.4315, 0.7005)
11 | 4-GRR | 0.7533 | (0.6512, 0.8287) | 0.5584 | (0.4031, 0.6825) | 0.6822 | (0.5580, 0.7765)
12 | ATEC2 | 0.3106 | (0.1174, 0.4811) | 0.2222 | (0.0227, 0.4047) | 0.2518 | (0.0541, 0.4306)
13 | SEPIA1 | 0.7023 | (0.5841, 0.7914) | 0.5090 | (0.3436, 0.6438) | 0.6395 | (0.5036, 0.7446)
14 | ULCopt | 0.6056 | (0.4611, 0.7188) | 0.4266 | (0.2472, 0.5777) | 0.5651 | (0.4112, 0.6877)
15 | mTER | -0.3264 | (-0.4945, -0.1347) | -0.1814 | (-0.3685, 0.0198) | -0.2745 | (-0.4502, -0.0783)
16 | EDPM | 0.6926 | (0.5715, 0.7842) | 0.5042 | (0.3379, 0.6400) | 0.6634 | (0.5339, 0.7625)
17 | BLEU-4 | 0.6188 | (0.4775, 0.7288) | 0.4346 | (0.2564, 0.5842) | 0.5891 | (0.4406, 0.7062)
18 | METEOR-v0.6 | 0.5487 | (0.3912, 0.6749) | 0.3832 | (0.1979, 0.5420) | 0.5491 | (0.3918, 0.6753)
19 | RTE-MT | 0.6007 | (0.4550, 0.7151) | 0.4253 | (0.2457, 0.5766) | 0.5953 | (0.4483, 0.7109)
20 | BadgerLite | 0.2808 | (0.0851, 0.4556) | 0.1840 | (-0.0171, 0.3708) | 0.3263 | (0.1346, 0.4944)
21 | METEOR-ranking | 0.5828 | (0.4328, 0.7013) | 0.4204 | (0.2402, 0.5727) | 0.6180 | (0.4766, 0.7283)
22 | LET | 0.6234 | (0.4833, 0.7324) | 0.4435 | (0.2667, 0.5914) | 0.6174 | (0.4758, 0.7278)
23 | DP-Or | 0.4929 | (0.3245, 0.6311) | 0.3415 | (0.1514, 0.5072) | 0.4837 | (0.3137, 0.6237)
24 | ATEC3 | 0.3058 | (0.1122, 0.4770) | 0.2189 | (0.0193, 0.4018) | 0.2483 | (0.0503, 0.4275)
25 | BLEU-v12 | 0.6174 | (0.4758, 0.7278) | 0.4321 | (0.2536, 0.5822) | 0.5869 | (0.4379, 0.7045)
26 | BEwT-E | 0.3127 | (0.1197, 0.4829) | 0.2209 | (0.0213, 0.4035) | 0.3603 | (0.1722, 0.5230)
27 | RTE | 0.5660 | (0.4123, 0.6884) | 0.3968 | (0.2132, 0.5532) | 0.5681 | (0.4148, 0.6900)
28 | DR-Or | 0.4377 | (0.2601, 0.5867) | 0.2967 | (0.1023, 0.4693) | 0.4450 | (0.2684, 0.5926)
29 | BleuSP | 0.6807 | (0.5561, 0.7754) | 0.4906 | (0.3218, 0.6292) | 0.6370 | (0.5004, 0.7427)
30 | SVM-Rank | 0.5737 | (0.4218, 0.6944) | 0.4007 | (0.2177, 0.5565) | 0.5731 | (0.4210, 0.6939)
31 | BLEU-1 | 0.6399 | (0.5040, 0.7448) | 0.4565 | (0.2818, 0.6019) | 0.6473 | (0.5135, 0.7504)
32 | Bleu-sbp | 0.6201 | (0.4792, 0.7299) | 0.4393 | (0.2619, 0.5880) | 0.5924 | (0.4447, 0.7087)
33 | invWer | -0.7342 | (-0.8147, -0.6258) | -0.5409 | (-0.6689, -0.3818) | -0.7117 | (-0.7983, -0.5962)
34 | BLEU-v11b | 0.6188 | (0.4775, 0.7288) | 0.4346 | (0.2564, 0.5842) | 0.5891 | (0.4406, 0.7062)
35 | SR-Or | 0.5257 | (0.3635, 0.6569) | 0.3731 | (0.1865, 0.5336) | 0.5528 | (0.3963, 0.6782)
36 | Badger | 0.1418 | (-0.0604, 0.3328) | 0.0958 | (-0.1067, 0.2907) | 0.2032 | (0.0028, 0.3879)
37 | Meteor-v0.7 | 0.6196 | (0.4786, 0.7295) | 0.4472 | (0.2710, 0.5944) | 0.6495 | (0.5163, 0.7521)
38 | MaxSim | 0.5470 | (0.3892, 0.6736) | 0.3823 | (0.1969, 0.5413) | 0.5614 | (0.4066, 0.6848)
39 | TERp | -0.7269 | (-0.8094, -0.6162) | -0.5294 | (-0.6599, -0.3680) | -0.7264 | (-0.8091, -0.6156)

39 metrics (including 7 baseline metrics)
96 data points (total number of documents used)

Multiple References Track
Rank | Metric Name | Spearman's Rho | 95% confidence interval | Kendall's Tau | 95% confidence interval | Pearson's R | 95% confidence interval
1 | SEPIA2 | 0.7056 | (0.5883, 0.7938) | 0.5139 | (0.3494, 0.6476) | 0.7022 | (0.5839, 0.7913)
2 | CDer | -0.7428 | (-0.8210, -0.6372) | -0.5503 | (-0.6762, -0.3932) | -0.7376 | (-0.8173, -0.6304)
3 | ULCh | 0.6638 | (0.5345, 0.7628) | 0.4884 | (0.3192, 0.6275) | 0.6325 | (0.4947, 0.7392)
4 | TER-v0.7.25 | -0.7427 | (-0.8209, -0.6370) | -0.5467 | (-0.6734, -0.3888) | -0.7278 | (-0.8101, -0.6174)
5 | DP-Orp | 0.3793 | (0.1935, 0.5388) | 0.2467 | (0.0487, 0.4261) | 0.3759 | (0.1897, 0.5360)
6 | NIST-v11b | 0.6996 | (0.5805, 0.7893) | 0.5139 | (0.3494, 0.6476) | 0.6696 | (0.5418, 0.7671)
7 | ATEC4 | 0.3627 | (0.1750, 0.5250) | 0.2716 | (0.0752, 0.4477) | 0.2841 | (0.0886, 0.4585)
8 | ATEC1 | 0.3636 | (0.1759, 0.5258) | 0.2676 | (0.0709, 0.4443) | 0.2713 | (0.0749, 0.4475)
9 | SNR | 0.7323 | (0.6234, 0.8134) | 0.5419 | (0.3831, 0.6697) | 0.7132 | (0.5982, 0.7994)
10 | mBLEU | 0.3220 | (0.1299, 0.4908) | 0.2301 | (0.0311, 0.4116) | 0.1384 | (-0.0639, 0.3297)
11 | 4-GRR | 0.7756 | (0.6810, 0.8447) | 0.5826 | (0.4326, 0.7012) | 0.7149 | (0.6004, 0.8006)
12 | ATEC2 | 0.3593 | (0.1711, 0.5222) | 0.2660 | (0.0692, 0.4429) | 0.2572 | (0.0598, 0.4352)
13 | SEPIA1 | 0.7492 | (0.6457, 0.8257) | 0.5542 | (0.3980, 0.6792) | 0.7241 | (0.6125, 0.8074)
14 | ULCopt | 0.6471 | (0.5131, 0.7502) | 0.4639 | (0.2904, 0.6078) | 0.6247 | (0.4850, 0.7334)
15 | EDPM | 0.7364 | (0.6287, 0.8163) | 0.5441 | (0.3857, 0.6714) | 0.7238 | (0.6122, 0.8072)
16 | mTER | -0.2277 | (-0.4095, -0.0285) | -0.1459 | (-0.3365, 0.0563) | -0.2114 | (-0.3951, -0.0113)
17 | BLEU-4 | 0.7052 | (0.5879, 0.7935) | 0.5248 | (0.3625, 0.6563) | 0.6828 | (0.5588, 0.7769)
18 | METEOR-v0.6 | 0.6377 | (0.5013, 0.7432) | 0.4494 | (0.2735, 0.5962) | 0.6156 | (0.4736, 0.7264)
19 | BadgerLite | 0.2261 | (0.0269, 0.4081) | 0.1559 | (-0.0460, 0.3456) | 0.2727 | (0.0763, 0.4486)
20 | METEOR-ranking | 0.6962 | (0.5761, 0.7868) | 0.5064 | (0.3405, 0.6417) | 0.6987 | (0.5794, 0.7887)
21 | LET | 0.6795 | (0.5546, 0.7745) | 0.4897 | (0.3208, 0.6285) | 0.6646 | (0.5354, 0.7633)
22 | DP-Or | 0.5838 | (0.4341, 0.7021) | 0.4134 | (0.2322, 0.5669) | 0.5846 | (0.4351, 0.7027)
23 | ATEC3 | 0.3599 | (0.1718, 0.5226) | 0.2669 | (0.0702, 0.4437) | 0.2568 | (0.0594, 0.4349)
24 | BLEU-v12 | 0.7091 | (0.5930, 0.7964) | 0.5242 | (0.3618, 0.6558) | 0.6852 | (0.5619, 0.7787)
25 | BEwT-E | 0.3655 | (0.1781, 0.5274) | 0.2691 | (0.0725, 0.4456) | 0.4210 | (0.2408, 0.5731)
26 | DR-Or | 0.5383 | (0.3787, 0.6668) | 0.3744 | (0.1880, 0.5347) | 0.5465 | (0.3886, 0.6733)
27 | BleuSP | 0.7388 | (0.6320, 0.8181) | 0.5542 | (0.3980, 0.6792) | 0.7158 | (0.6016, 0.8013)
28 | SVM-Rank | 0.6246 | (0.4849, 0.7333) | 0.4472 | (0.2710, 0.5944) | 0.6161 | (0.4741, 0.7268)
29 | BLEU-1 | 0.6352 | (0.4981, 0.7413) | 0.4527 | (0.2774, 0.5989) | 0.6417 | (0.5064, 0.7462)
30 | Bleu-sbp | 0.7080 | (0.5915, 0.7956) | 0.5283 | (0.3667, 0.6590) | 0.6856 | (0.5624, 0.7790)
31 | invWer | -0.7751 | (-0.8444, -0.6804) | -0.5772 | (-0.6970, -0.4260) | -0.7724 | (-0.8424, -0.6766)
32 | BLEU-v11b | 0.7057 | (0.5885, 0.7939) | 0.5242 | (0.3618, 0.6558) | 0.6838 | (0.5601, 0.7777)
33 | SR-Or | 0.6264 | (0.4871, 0.7347) | 0.4520 | (0.2766, 0.5983) | 0.6152 | (0.4731, 0.7261)
34 | Badger | 0.0937 | (-0.1088, 0.2888) | 0.0656 | (-0.1367, 0.2626) | 0.1242 | (-0.0782, 0.3168)
35 | Meteor-v0.7 | 0.7067 | (0.5897, 0.7946) | 0.5191 | (0.3557, 0.6518) | 0.7109 | (0.5952, 0.7977)
36 | MaxSim | 0.6220 | (0.4815, 0.7313) | 0.4415 | (0.2644, 0.5898) | 0.6208 | (0.4801, 0.7304)
37 | TERp | -0.7609 | (-0.8341, -0.6613) | -0.5604 | (-0.6840, -0.4054) | -0.7525 | (-0.8281, -0.6502)

37 metrics (including 7 baseline metrics)
96 data points (total number of documents used)
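
Note on the confidence intervals

The page does not say how the 95% confidence intervals were computed. The reported bounds appear consistent with the Fisher z-transformation applied to each coefficient with n = 96 (the number of documents). The Python sketch below rests on that assumption; the function name fisher_ci and the critical value 1.96 are illustrative choices, not taken from the source. Applying the same transformation to Spearman's Rho and Kendall's Tau is only an approximation.

```python
import math


def fisher_ci(r, n, z_crit=1.96):
    """Approximate confidence interval for a correlation coefficient.

    r      -- observed correlation, strictly between -1 and 1
    n      -- number of paired observations (must exceed 3)
    z_crit -- standard normal critical value (1.96 for roughly 95%)

    Assumption: the interval is built with the Fisher z-transformation.
    """
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    if n <= 3:
        raise ValueError("n must be greater than 3")
    z = math.atanh(r)                 # map r onto the z scale
    se = 1.0 / math.sqrt(n - 3)       # standard error on the z scale
    lower = math.tanh(z - z_crit * se)
    upper = math.tanh(z + z_crit * se)
    return lower, upper


# Example: Spearman's Rho for SEPIA2, Single Reference Track, 96 documents
low, high = fisher_ci(0.6282, 96)
print(f"({low:.4f}, {high:.4f})")
```

For SEPIA2 in the Single Reference Track, fisher_ci(0.6282, 96) agrees with the reported interval (0.4894, 0.7360) to about three decimal places. Small differences in the last digit are expected because the tabulated coefficients are themselves rounded.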