Federal organizations need to have an appropriate level of confidence that the security features of IT products work as intended and meet security specifications. The basis for such confidence is security assurance. Products with an appropriate degree of assurance contribute to the security and assurance of the system as a whole and thus should be an important factor in IT procurement decisions. NIST helps agencies obtain security assurance in products through two programs for product evaluation and testing: the National Information Assurance Partnership (NIAP) Common Criteria Evaluation and Validation Scheme and the Cryptographic Module Validation Program (CMVP). Both programs use accredited private sector laboratories to conduct the actual testing and issue government certificates upon successful completion of testing. The NIAP evaluation program focuses on evaluations of products against a set of security specifications drawn from the "Common Criteria" (ISO 15408). Testing under the CMVP helps provide customers with assurance that 1) a cryptographic module meets one of the four security specification levels of Federal Information Processing Standard (FIPS) 140-1, Security Requirements for Cryptographic Modules, and 2) the FIPS-approved algorithms (e.g., for encryption or digital signatures) are correctly implemented. Both programs help agencies have confidence in the security aspects of the IT products they use.
Over the past decade, the number of electronic display technologies available to consumers has risen dramatically, and the capabilities of existing technologies have expanded. This proliferation of choices provides new opportunities for visual stereo presentation, but also new challenges. The methods of implementing stereo on an electronic display, optimized for the original capabilities of the original displays, may no longer be the best choices. Features such as response time, frame rate, aspect ratio, sync timing, pixel registration, and temporal modulation of grayscale (brightness) and color can strongly influence the process of selecting an optimum presentation format for a given display technology. Display performance issues such as brightness, contrast, flicker, image distortion, defective pixels, and mura are more critical in 3D imagery than in 2D. Susceptibility to burn-in limits the implementation choices for a display that is to be used for both 3D and 2D applications. Resolution and frame rate establish the overall capability for representing depth, and also establish the performance requirements for the engine providing the 3D material.
This paper surveys the capabilities and characteristics of traditional displays such as CRT and LCD panel, and a broad assortment of newer display technologies, including color plasma, field emission, micromirror and other reflective systems, and the general classes of microdisplays. Relevance of display characteristics to various stereo presentation formats is discussed, with description of laboratory experimentation to provide hard numbers. Recommendations are made regarding the stereo formats to be used with various display technologies, and the display technologies to be used with various stereo formats.
The increasingly pervasive influence of information technology on daily life makes accessibility a higher priority than ever before. Millions of blind and visually impaired people in the U.S. (and far higher numbers worldwide) need some form of non-visual access to information. Non-visual displays differ from visual displays, but some features and issues are strikingly similar to those of visual displays. Significant progress has been made with text-to-speech systems, but many users prefer the precision and the reading experience of touch-based Braille systems. The widespread use of Braille displays has been limited primarily by cost and reliability issues. The cost to the user of a conventional 80-character Braille display is often $10000-15000 U.S., and maintenance costs can be around $500 per year. The primary cost and reliability factor is the large number of electromechanical actuators. Each 6- or 8-dot Braille cell requires six or eight actuators, with hundreds needed for the entire display. Smaller displays (e.g., 8-character) are available, but require the user to move a finger back and forth, raising issues of convenience and repetitive stress injuries. Our objective in undertaking this project was to find a new approach to Braille display design that would significantly lower cost and improve reliability, and still provide a worthwhile reading experience approaching that of full-line (80-character) displays. Our target was a factor of ten reduction in display cost.
This document discusses some aspects of selecting and testing random and pseudorandom number generators. The outputs of such generators may be used in many cryptographic applications, such as the generation of key material. Random or pseudorandom outputs are needed in contexts in which it is necessary to prevent unauthorized parties from ascertaining quantities such as keys. Generators suitable for use in cryptographic contexts may need to meet stronger requirements than in other contexts. In particular, their outputs may need to be unpredictable in the absence of knowledge of the inputs. Some criteria for characterizing and selecting appropriate generators are discussed in this document. The subject of statistical testing and its relation to cryptanalysis is also discussed, and some recommended statistical tests are provided. These tests may be useful as a first step in determining whether or not a generator is suitable for a particular cryptographic application. However, no set of statistical tests can absolutely certify a generator as appropriate for usage in a particular application, i.e., statistical testing cannot serve as a substitute for cryptanalysis. The design and cryptanalysis of generators is outside the scope of this document.
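As an illustration of the kind of first-step statistical check described above, the following is a minimal sketch of a frequency (monobit) test. The abstract does not list the recommended tests, so the choice of this test, the function name, and the 0.01 significance level mentioned in the note are assumptions made for illustration only.

```python
import math


def monobit_frequency_test(bits):
    """Return the p-value of a frequency (monobit) test on a sequence of 0/1 bits.

    Under the hypothesis that the bits are independent and unbiased, the
    normalized sum of +1/-1 values is approximately standard normal.
    """
    n = len(bits)
    if n == 0:
        raise ValueError("need at least one bit")
    # Map each 0 to -1 and each 1 to +1, then add them up.
    s = sum(1 if b else -1 for b in bits)
    s_obs = abs(s) / math.sqrt(n)
    # Two-sided tail probability of a standard normal at s_obs.
    return math.erfc(s_obs / math.sqrt(2))
```

A small p-value (for example, below an assumed threshold of 0.01) suggests the proportion of ones and zeros is unlikely for a random source. A large p-value does not certify the generator, which is consistent with the caveat that statistical testing cannot substitute for cryptanalysis.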
Although virtually unchanged since its initial publication in 1964, the National Bureau of Standards (NBS) Handbook of Mathematical Functions continues to be widely used by the mathematical and scientific community. As a result, the National Institute of Standards and Technology (NIST), the successor organization to NBS, is engaged in a large-scale project to update and expand the handbook and disseminate it on the World Wide Web as the NIST Digital Library of Mathematical Functions (DLMF). A key feature of the DLMF will be 3D graphics and visualization capabilities that allow a user to interactively examine the unique features of complicated mathematical functions. The authors have discovered that many commercial packages produce adequate surface plots of functions, but improperly clip the surface when the plot must be rescaled to emphasize interesting features. This paper discusses some initial results in using a "contour" fitted mesh to generate an appropriately clipped surface plot and examines some of the issues involved in extending the technique to more complicated surfaces.
This paper is a proposal for conducting a Special Interest Group (SIG) at the CHI2000 Conference. The focus of the meeting will be the use of the Common Industry Format (CIF) developed by the Industry Usability Reporting (IUSR) project, a group managed by NIST.
Usability testing for web sites is difficult for several reasons. First, the development time of web sites is far shorter than that of traditional software, leaving fewer iterations for usability refinements. Second, the user population for web sites can be very broad. Third, there is no way to determine the type of hardware or browser software that users will use, nor their bandwidth. All of these factors affect the usability of the site. In addition, the content of web sites is tightly coupled with the structure and design of the site. Traditional methods of usability testing must be modified in order to be useful for web site design. In this paper, we present our initial research into a methodology for testing web sites for usability. We view this only as a beginning. There are many more aspects of usability testing that we must investigate and modify for web site testing.
The Industry Usability Report (IUSR) Project is designed to help potential corporate purchasers of software obtain information about the usability of supplier products. There are two parts to the IUSR project: a proposed format for sharing usability information, and a pilot study which will allow both supplier (the developer of the software) and consumer (the purchaser of the software) companies to test the effectiveness of using usability test results as procurement criteria and to verify the usefulness of the reporting format.
Mars, RC6, Rijndael, Serpent and Twofish were selected as finalists for the Advanced Encryption Standard (AES). To evaluate the finalists' suitability as random number generators, empirical statistical testing is commonly employed. Although it is widely believed that these five algorithms are indeed random, randomness testing was conducted to show that there is empirical evidence supporting this belief. In this paper, NIST reports on the studies that were conducted on the finalists for the 192-bit key size and 256-bit key size. The results to date suggest that all five of the finalists appear to be random.
Interoperable Message Passing Interface (IMPI) is a protocol specification to allow multiple MPI implementations to cooperate on a single MPI job. Unlike portable MPI implementations, an IMPI-connected parallel job allows the use of vendor-tuned message passing libraries on given target architectures, thus potentially allowing higher levels of performance than previously possible. Additionally, the IMPI protocol uses a low number of connections, which may be suitable for parallel computations across WAN (wide area network) distances. The IMPI specification defines a low-level wire protocol when communicating with a remote MPI implementation. When running IMPI jobs, the only change visible to the user is the sequence of steps necessary to run the job; any correct MPI program will run correctly under IMPI. In this paper, we provide an overview of IMPI, describe its incorporation into the LAM (Local Area Multicomputer) implementation of MPI, and show an example of its use.
CSPP provides the guidance necessary to develop "compliant," Common Criteria protection profiles for near-term achievable, security baselines using commercial off-the-shelf (COTS) information technology. CSPP accomplishes this purpose by describing a largely policy-neutral, notional information system in the format of a protection profile (PP), specifying a subset of the common criteria to be used in developing "compliant" protection profiles, and providing the basis for refining policy neutral guidance into specific policy requirements and system security threats, objectives, and requirements into a subset which is appropriate for a specific PP. CSPP provides the requirements necessary to specify needs for both stand-alone and distributed, multi-user information systems. This covers general-purpose operating systems, database management systems, and other applications.
Fiber-reinforced polymeric composites provide light weight, high strength, and corrosion resistance under severe environmental exposure. These composites have been extensively used in aerospace and military applications over the last three decades and are being extended into civil engineering applications. However, there is little quantitative research on the effects of civil engineering environments, namely, water, sea water, temperature, concrete pore solution, ultraviolet light, and loading, on the fatigue of polymeric composites. We have developed a fatigue model for predicting the fatigue life of fiber-reinforced polymeric composites that incorporates applied maximum stress, stress amplitude, loading frequency, residual tensile modulus, and material constants. The model has been verified with experimental fatigue data of a glass fiber/vinyl ester composite in various environments: air, fresh water, and saltwater at 30 °C. This study extends the investigation to the effect of temperature on fatigue life. Both the residual mechanical properties at specified loading cycles and the number of cycles at which the specimens fail are measured. The results show, for the material used in this study, that the fatigue life in these aqueous environments at 65 °C is about the same as that at 30 °C, but the fatigue life at 4 °C is significantly longer than that at 30 °C. Based on these experimental data, the material constants, m and C, are derived as functions of temperature, T.
A superresolution phase pupil filter used in conjunction with a solid immersion lens for high-density optical data storage is described. The frequency response of the system is studied. A binary annular phase filter is used in this study. Simulation results are included to show the effectiveness of using such a system with acceptable side lobe effect.
Test collections have traditionally been used by information retrieval researchers to improve their retrieval strategies. To be viable as a laboratory tool, a collection must reliably rank different retrieval variants according to their true effectiveness. In particular, the relative effectiveness of two retrieval strategies should be insensitive to modest changes in the relevant document set since individual relevance assessments are known to vary widely. The test collections developed in the TREC workshops have become the collections of choice in the retrieval research community. To verify their reliability, NIST investigated the effect changes in the relevance assessments have on the evaluation of retrieval results. Very high correlations were found among the rankings of systems produced using different relevance judgment sets. The high correlations indicate that the comparative evaluation of retrieval performance is stable despite substantial differences in relevance judgments, and thus reaffirm the use of the TREC collections as laboratory tools.
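The stability check described above compares how two sets of relevance judgments rank the same systems. The sketch below computes a rank correlation between two such rankings. The abstract does not name the correlation measure, so the use of Kendall's tau (here without tie correction), the function name, and the system names and scores in the usage note are all assumptions for illustration.

```python
from itertools import combinations


def kendall_tau(scores_a, scores_b):
    """Kendall's tau (tau-a, no tie correction) between two system rankings.

    Each argument maps a system name to its effectiveness score under one
    set of relevance judgments. Both must cover the same systems.
    """
    systems = sorted(scores_a)
    if set(systems) != set(scores_b):
        raise ValueError("both score sets must cover the same systems")
    if len(systems) < 2:
        raise ValueError("need at least two systems to compare rankings")
    concordant = discordant = 0
    for s, t in combinations(systems, 2):
        # A pair is concordant when both judgment sets order it the same way.
        product = (scores_a[s] - scores_a[t]) * (scores_b[s] - scores_b[t])
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    n_pairs = len(systems) * (len(systems) - 1) // 2
    return (concordant - discordant) / n_pairs
```

For hypothetical scores `{"s1": 0.30, "s2": 0.25, "s3": 0.10}` under one judgment set and `{"s1": 0.25, "s2": 0.30, "s3": 0.10}` under another, one of the three system pairs swaps order, giving a tau of 1/3. A value near 1 would indicate that the comparative evaluation is largely unaffected by the change in judgments.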
The TREC-8 Question Answering track was the first large-scale evaluation of systems that return answers, as opposed to lists of documents, in response to a question. As a first evaluation, it is important to examine the evaluation methodology itself to understand any limits on the conclusions that can be drawn from the evaluation and possibly to find ways to improve subsequent evaluations. This paper has two main goals: to describe in detail how the evaluation was implemented, and to examine the consequences of the methodology on the comparative performance of the systems participating in the evaluation. The examination uncovered no serious flaws in the methodology, supporting its continued use for question answering evaluation. Nonetheless, redefining the specific task to be performed so that it more closely matches an actual user task does appear warranted.
The TREC-8 Question Answering track was the first large-scale evaluation of domain-independent question answering systems. This paper summarizes the results of the track, including both an overview of the approaches taken to the problem and an analysis of the evaluation methodology. Retrieval results for the more stringent condition in which system responses were limited to 50 bytes showed that explicit linguistic processing was more effective than the bag-of-words approaches that are effective for document retrieval. The use of multiple human assessors to judge the correctness of the systems' responses demonstrated that assessors have legitimate differences of opinion as to correctness even for fact-based, short-answer questions. Evaluations of question answering technology will need to accommodate these differences since eventual end-users of the technology will have similar differences.
Achieving 100 percent software reliability may seem an unreasonable goal. Software developers and consumers of many software products are largely unsure about the reliability of their product or purchase. Today, many opportunities exist for some assurance of software products. Current practices addressing process (e.g., CMM, ISO 9000), people (e.g., software engineering degrees, certification exams, licensing), and product (e.g., measurement of the product) encompass major areas of progress toward software reliability. This presentation discusses one aspect of the product: the usage of history data of faults and failures of software systems, collected from either the development and assurance processes or operational use, to improve reliability of software products. Information contained in these histories characterizes the nature of faults, or defects, for a specific product line. The objectives are to use the history to determine how to prevent faults from entering into the product, to remove faults before the product is released, and to measure a product's frequency profile against others in the same domain. Finally, the histories may indicate problems for which better prevention or detection methods are needed, hence providing justification for research ideas.
Most complex systems today contain software, and systems failures activated by software faults can provide lessons for software development practices and software quality assurance. This paper presents an analysis of software-related failures of medical devices that caused no death or injury but led to recalls by the manufacturers. The analysis categorizes the failures by their symptoms and faults, and discusses methods of preventing and detecting faults in each category. The nature of the faults provides lessons about the value of generally accepted quality practices for prevention and detection methods applied prior to system release. It also provides some insight into the need for formal requirements specification and for improved testing of complex hardware-software systems.
Most complex systems today contain software, and systems failures activated by software faults can provide lessons for software development practices and software quality assurance. This report presents an analysis of 342 software-related failures of medical devices that caused no death or injury but led to recalls by the manufacturers. The analysis categorizes the failures by their symptoms and faults. Tables provide methods for preventing and detecting the faults. The nature of the faults provides lessons about the value of generally accepted quality practices for prevention and detection methods applied prior to system release. It also provides some insight into the need for formal requirements specification and for improved testing of complex hardware-software systems.
This paper gives results for using distortion-tolerant filters to improve performance of fingerprint correlation matching. Three types of distortion-tolerant filters were tested: summation, weighted, and MINACE. A set of 55 fingers from NIST Special Database 24 was used to evaluate the filters. Our results show performance was improved from 49% correct, using one training fingerprint, to 100% correct, using multiple training fingerprints and a distortion-tolerant MINACE filter, with no false alarms.
An auto-focusing method is described and practically applied to an optical disc test-bed. In this application, an optical microscope was used to create clear pit images of optical discs. A CCD camera generated the graphic signals for the images. A standard deviation value of the gray levels for all pixels in the image is used as a feedback for auto-focusing control. The performance and possible applications of this method are discussed and test results for optical discs are given in this paper.
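The focus measure described above, the standard deviation of the gray levels over all pixels, can be sketched as follows. This is a minimal illustration, not the test-bed's control loop: it assumes the sharpest image is the one with the largest standard deviation (a common contrast-based heuristic that the abstract does not state explicitly), and it replaces feedback control with a simple sweep over candidate lens positions. The function names and the image representation (a list of rows of gray levels) are hypothetical.

```python
import statistics


def focus_metric(image):
    """Population standard deviation of the gray levels of all pixels.

    `image` is a 2D list (rows of gray-level values).
    """
    pixels = [p for row in image for p in row]
    return statistics.pstdev(pixels)


def best_focus(images_by_position):
    """Return the lens position whose image has the highest focus metric.

    `images_by_position` maps a candidate lens position to the image
    captured there. Assumes a higher standard deviation means sharper focus.
    """
    return max(images_by_position,
               key=lambda pos: focus_metric(images_by_position[pos]))
```

In a real controller the metric would more likely drive an incremental search around the current position rather than an exhaustive sweep; the sweep is used here only to keep the sketch short.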
At the NIST Cold Neutron Research Facility, a two-stage experiment, which consists of neutron production and neutron decay stages, is underway to determine the neutron lifetime or time to decay. In this paper, statistical methods are developed for estimating the mean neutron lifetime from such experiments. Salient features of the experimental data are that the number, N, of neutrons whose decay times are being recorded is an unobservable random variable and that the decay times are contaminated by background noise in the recording. Under the assumption that neutron productions and decays follow a Markovian birth and death process, we deduce the distribution of N and the decay times subject to a variety of experimental constraints. These distributions serve as likelihood models for estimation of the mean neutron lifetime. The method of maximum likelihood and the method of minimum chi-square are employed and compared for estimation. The minimum chi-square estimates suitable for binned decay data are shown, by simulation, to be comparable to the maximum likelihood estimates based on the data of exact decay times. The loss of efficiency in using binned data instead of the exact decay times is minimal provided the bin widths are reasonably small and the observation period for decay is not too short. Our systematic and unified approach facilitates the application of the developed statistical methods to the study of other similar two-stage experiments of atom decays. It also clarifies some of the ambiguities in the literature in using the conditional distributions for statistical estimation.
Meta-analysis, the statistical synthesis of results from several different studies, often involves combining estimates of effect size from several experiments. The standardized mean difference is one of the most commonly used estimates of effect size; however, when the data are not normally distributed, nonparametric effect size estimates may be preferable. This paper presents a nonparametric estimate of effect size based on the Mann-Whitney test that combines excellent efficiency with robustness. In addition, estimates of variance and confidence interval procedures for this estimator are provided.
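To make the idea of a Mann-Whitney-based effect size concrete, the sketch below computes the Mann-Whitney U statistic scaled by the number of pairs, which estimates the probability that a treatment observation exceeds a control observation (ties counted as one half). The abstract does not give the exact form of the paper's estimator, so this particular statistic and the function name are assumptions for illustration.

```python
def mann_whitney_effect_size(control, treatment):
    """Estimate P(treatment > control) + 0.5 * P(treatment == control).

    This is the Mann-Whitney U statistic divided by the number of
    (control, treatment) pairs. A value of 0.5 indicates no effect.
    """
    m, n = len(control), len(treatment)
    if m == 0 or n == 0:
        raise ValueError("both samples must be non-empty")
    u = 0.0
    for x in control:
        for y in treatment:
            if y > x:
                u += 1.0
            elif y == x:
                u += 0.5  # ties contribute half a count
    return u / (m * n)
```

Because the statistic depends only on the ordering of observations, a single extreme value changes it by at most a bounded amount, which is the kind of robustness the abstract contrasts with the standardized mean difference.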
The estimation of effect sizes is a crucial part of meta-analysis. This paper uses influence functions as a fundamental tool in the analysis of estimates of effect size. The generalization to several variables of the influence function provides heuristic information about the robustness of the estimators and a way to calculate their large sample variances, including for non-normal situations. Some commonly used parametric estimators are organized into one of two classes of estimators that we call first order quotient estimators and second order quotient estimators. We provide the influence functions and large sample variance estimates for quotient estimators in general and for certain specific cases. In addition, influence functions are used to analyze several nonparametric estimates and to provide variance estimates where none existed previously. The performance of the various estimators, along with their proposed variance estimates and confidence intervals, is tested in Monte Carlo experiments.
In many situations, weighted means can achieve much lower variances than unweighted means by placing more weight on those observations that are more precise. However, the fact that weighted means can be dominated by one or two observations with very large weights raises questions about their robustness. The trimmed weighted mean, by trimming portions of the weights off of either end, retains much of the precision of weighted means while adding robustness. We present asymptotic results based on the theory of weighted empirical functions, including a Central Limit-type theorem, estimates of variance, and procedures for forming confidence intervals. Finally, these procedures are demonstrated and tested using Monte Carlo studies.
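One plausible reading of "trimming portions of the weights off of either end" is sketched below: the observations are sorted, the weights are normalized to sum to one, and a fraction `alpha` of the total weight is removed from each end, with observations that straddle a cut keeping only the part of their weight inside the retained range. The abstract does not define the estimator precisely, so this construction, the function name, and the parameter `alpha` are assumptions for illustration.

```python
def trimmed_weighted_mean(values, weights, alpha):
    """Weighted mean after trimming a fraction `alpha` of total weight per end.

    `values` and `weights` are equal-length sequences; weights must be
    non-negative with a positive sum, and 0 <= alpha < 0.5.
    """
    if not 0 <= alpha < 0.5:
        raise ValueError("alpha must satisfy 0 <= alpha < 0.5")
    if len(values) != len(weights) or len(values) == 0:
        raise ValueError("values and weights must be non-empty and equal length")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")

    lo, hi = alpha, 1.0 - alpha
    cumulative = 0.0
    numerator = denominator = 0.0
    for x, w in sorted(zip(values, weights)):
        start = cumulative
        cumulative += w / total
        # Portion of this observation's normalized weight inside [lo, hi].
        kept = max(0.0, min(cumulative, hi) - max(start, lo))
        numerator += kept * x
        denominator += kept
    return numerator / denominator
```

With `alpha = 0` this reduces to the ordinary weighted mean. With equal weights on the values 1, 2, 3, 100 and `alpha = 0.25`, the two extreme observations are removed entirely and the result is 2.5.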
In this correspondence, we describe a holistic face recognition method based on subspace Linear Discriminant Analysis (LDA). Like existing methods, this method consists of two steps: first, the face image is projected into a face subspace via Principal Component Analysis (PCA) where the subspace dimension is carefully chosen, and then the PCA projection vectors are projected into the LDA to construct a linear classifier in the subspace. Experiments show that the dimension of the face subspace is fixed regardless of the image size as long as the image size exceeds the subspace dimension. The property of relative invariance of the subspace dimension enables the system to work with smaller face images without sacrificing performance. The choice of such a fixed subspace dimension is mainly based on the characteristics of the eigenvectors instead of the commonly used eigenvalues. Such a choice of the subspace dimension enables the system to generate class-separable features via LDA from the full subspace representation. Hence the generalization/overfitting problem can be addressed. In addition, a weighted distance metric guided by the LDA eigenvalues is employed to improve the performance of the subspace LDA method. Experimental results using FERET datasets and the MPEG-7 content set are presented.