Image Group

Census & NIST Sponsored OCR Conferences

The following lists contain information from the Census & NIST sponsored OCR conferences. These files are also available from our anonymous FTP server, sequoyah.nist.gov.

If there is a discrepancy between the online and published versions of a document, the published version is authoritative.


First OCR Conference

test1_an.zip [710K]
PKZIP file, from the floppy disk included with the First Census Conference test CD-ROM, containing files with the answers, plurality hypotheses, and information identifying the writers of each character.

test1_readme.txt
The file "readme.txt" from the above floppy disk, further documenting the files.

test1.tar.Z [476K]
Compressed tar file, designed for UNIX users, containing the files in the PKZIP file above and the file "readme.txt".

ir_4912.ps.Z [4,985K]
The First Census OCR Systems Conference report. This report is very large; a paper copy is available from webmaster@magi.nist.gov.


Second OCR Conference

announce.tar.Z [20K]
Files describing the Second OCR Systems Conference task, file formats, etc.

samples_1.tar [12,665K]
Contains a directory with a sampling of Industry and Occupation miniform images. It has the same directory structure that was used for the Second OCR Systems Conference training CD-ROMs (Special Databases 11 and 12). The subdirectory "data" contains images from microfilm (100 files, 500 miniforms, 1500 total fields) and no reference files, while the subdirectory "data3" contains images from paper (60 files, 300 miniforms, 900 total fields) and the corresponding reference files.

samples_2.tar [15,409K]
Contains a directory with a sampling of Industry and Occupation miniform images. It has the same general directory structure that was used for the Second OCR Systems Conference training CD-ROMs (Special Databases 11 and 12). The directory contains images from microfilm (200 files, 1000 miniforms, 3000 total fields) and the corresponding reference files, and is a copy of the subdirectory "data" from the Special Database 11 CD-ROM.

refs_paper.tar.Z [103K]
Reference files for paper Conference test images (9000 total fields), available on CD-ROM (Special Database 13).

refs_microfilm.tar.Z [102K]
Reference files for microfilm Conference test images (9000 total fields), available on CD-ROM (Special Database 13).

ir_5452.pt1.tar.Z [827K]
The first 103 pages of the Second Census OCR Systems Conference report. This report is very large; a paper copy is available from webmaster@magi.nist.gov.

ir_5452.pt2.tar.Z [13,143K]
Tar file containing PostScript files of pages 104-261 of the Second Census OCR Systems Conference report, mainly viewgraphs from Conference participants and plots of system performance curves. This report is very large; a paper copy is available from webmaster@magi.nist.gov. Many of the pages are several megabytes in size because they are composed of multiple scanned viewgraphs. To avoid exceeding your printer's spooling queue capacity, print these pages one at a time in a loop that either waits for the queue to empty or sleeps for the typical printing time of a large file (a few minutes on a SparcPrinter, but dozens of minutes on older printers such as a LaserWriter II).
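
The one-page-at-a-time printing loop recommended for ir_5452.pt2.tar.Z might look like the following. This is only a minimal sketch, not a tested procedure for any particular printer. It assumes the "lpr" and "lpq" commands are installed and that "lpq" reports an empty queue with the text "no entries"; the function names and the 30-second polling interval are illustrative choices, not part of the distribution.

```python
import subprocess
import time
from typing import Callable, Iterable, List


def lpr_submit(path: str) -> None:
    """Send one file to the default printer (assumes the lpr command exists)."""
    subprocess.run(["lpr", path], check=True)


def lpq_queue_is_empty() -> bool:
    """Assumption: lpq prints 'no entries' when nothing is queued."""
    result = subprocess.run(["lpq"], capture_output=True, text=True, check=False)
    return "no entries" in result.stdout.lower()


def print_one_at_a_time(
    paths: Iterable[str],
    submit: Callable[[str], None] = lpr_submit,
    queue_is_empty: Callable[[], bool] = lpq_queue_is_empty,
    poll_seconds: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Submit each file only after the queue has drained; return the files sent."""
    submitted: List[str] = []
    for path in paths:
        # Wait for any earlier job to leave the queue before sending the next page.
        while not queue_is_empty():
            sleep(poll_seconds)
        submit(path)
        submitted.append(path)
    return submitted
```

To use the fixed-delay alternative instead, pass a queue check that always returns True and sleep for the typical printing time between calls. The page files would be supplied as a sorted list of whatever names the extracted tar file actually contains.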

-----------------------------
Created April 1999.
Last modified April 9, 2003.
Contact webmaster@magi.nist.gov with corrections/comments.