StRD: How to Use

How to Use Statistical Reference Datasets

The Statistical Reference Dataset pages are intended to complement the testing of statistical software by providing datasets and corresponding "certified values" of the statistical parameters that software packages commonly estimate. To obtain statistical reference datasets for this purpose, read this entire list of instructions and then proceed.

1) Click on Dataset Archives

2) Click on the statistical method of interest:

Under each statistical method, you will see several datasets for testing statistical software. These datasets are roughly graded according to difficulty, so for a quick test, one may try the most difficult problems. Alternatively, since a fairly wide variety of examples is provided, one might choose the examples that are closest to a particular application.

3) Click on a dataset of interest to view the following:

The data files are in ASCII format, with header information in lines 1-60 and the data beginning on line 61. For convenience, the header information and certified results are included in the file so that only one file needs to be downloaded for each dataset of interest.
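The file layout described above can be handled with a few lines of code. The following is a minimal sketch, not an official reader: it assumes the data rows are whitespace-separated numbers, and the function name `split_strd_file` and its `header_lines` parameter are hypothetical names introduced for illustration. The default of 60 header lines follows the description above; it is worth checking the header of each downloaded file before relying on it.

```python
from typing import Iterable, List, Tuple


def split_strd_file(
    lines: Iterable[str], header_lines: int = 60
) -> Tuple[List[str], List[List[float]]]:
    """Split an StRD-style text file into header text and numeric data rows.

    Assumption: the first `header_lines` lines are header text and every
    later non-blank line holds whitespace-separated numbers.
    """
    header: List[str] = []
    rows: List[List[float]] = []
    for number, line in enumerate(lines, start=1):
        if number <= header_lines:
            # Keep the header (including certified results) as plain text.
            header.append(line.rstrip("\n"))
        elif line.strip():
            # Parse each remaining non-blank line into floats.
            rows.append([float(token) for token in line.split()])
    return header, rows
```

A typical call would be `with open("dataset.dat") as f: header, rows = split_strd_file(f)`, where `dataset.dat` stands in for whichever file was downloaded.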

4) If the dataset is of interest, then:

The "certified values" represent the "best available" solution. Further discussion of certified values can be found on the pages dealing with the various statistical methods.

5) Repeat this procedure for several datasets that are in your domain of interest.

The question of how close the results from a particular software package should be to the certified values cannot (at least at present) be answered in general. Testing software with these datasets can provide useful information for evaluation, but deciding whether a particular computer algorithm has a potential problem requires judgement that includes, but should not be limited to, comparison with certified values.
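One way to make the comparison itself concrete is to estimate how many significant digits of a computed value agree with the certified value. The sketch below uses the log relative error, a measure often used for this kind of comparison; it is not prescribed by this page, and the function name `log_relative_error` is a hypothetical choice. The result is only one input to the judgement described above and does not set a pass or fail threshold.

```python
import math


def log_relative_error(computed: float, certified: float) -> float:
    """Approximate number of significant digits on which two values agree."""
    if computed == certified:
        # Exact agreement: the measure is unbounded.
        return math.inf
    if certified == 0.0:
        # Relative error is undefined at zero; fall back to absolute error.
        return -math.log10(abs(computed))
    return -math.log10(abs(computed - certified) / abs(certified))
```

For example, a computed value of 1.001 against a certified value of 1.0 gives a result of about 3, meaning roughly three digits agree.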