Statistical Engineering Division Collaboration with
Software Diagnostics and Conformance Testing Division
8/22/97
PROJECT: Software Testing by Statistical Methods
Competence proposal presented to NIST Director on 7/10/97.
INITIAL FOCUS: We will place initial priority on statistical methods potentially applicable to
the effective design, construction, and implementation of conformance testing techniques for
functional specifications, i.e. conformance testing of software products for which source code is
not available to the testing authority (often called black box testing using only published
interfaces defined by the functional specification). We will remain open to potential application
of statistical methods to all forms of software testing, including tests with source code
availability, tests derived from formal specifications, proofs of correctness, and even non-testing
methodologies such as clean room techniques and clinical trials.
Task 0 - Task Success Scenarios
The following Tasks 1 through 4 pursue four different directions for potential application of
statistical methods to conformance testing of functional specifications. Early in the study period,
estimate the likelihood of success for each method and its potential for having a significant
impact on black-box conformance testing.
Task 1 - Binomial Trials.
Identify and describe some straightforward statistical methods that will enable conformance
testers to quantify uncertainty and reduce testing costs using existing falsification testing methods
on a fixed number of independent conformance requirements. Consider the Binomial model for
a finite population as well as Bayesian analysis with a Beta-Binomial model. Consider the effect
of stratified sampling as an approximation of various classical techniques for functional testing
of software. Identify an appropriate class of functional specifications for which these methods
provide an acceptable and useful measure of software reliability relative to the functional
specification.
Task 1 Deliverable: A report containing:
* Delineation of each potential binomial method.
* For each method, savings that may accrue in existing testing.
* Description of protocol to implement each method.
Task 2 - Coverage Designs.
Evaluate the applicability of covering designs and methodologies recently developed at Bellcore
and AT&T Research by Dalal and Mallows. Examine the role of design factors in the context of
black-box testing. Identify an appropriate class of functional specifications for which these
methods provide an acceptable and useful measure of software reliability relative to the
functional specification.
Task 2 Deliverable: A report containing:
* Description of how covering designs relate to black-box testing.
* Description of "when to stop testing" and "covering design" methods.
* Blueprint for implementation of coverage design testing.
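To illustrate how a covering design relates to black-box testing, the sketch below builds a pairwise covering test suite with a simple greedy heuristic: every pair of levels from any two design factors appears together in at least one test. This is a generic illustration under assumed inputs, not the specific method of Dalal and Mallows, and it enumerates all candidate tests, so it is only practical for a small number of factors.

```python
from itertools import combinations, product


def pairs_of(test):
    """All pairs of (factor index, level) exercised together by one test."""
    return {
        ((i, test[i]), (j, test[j]))
        for i, j in combinations(range(len(test)), 2)
    }


def pairwise_suite(factors):
    """Greedy pairwise covering suite.

    `factors` is a list of lists; each inner list holds the levels of
    one design factor. Returns a list of tests (tuples of levels).
    """
    candidates = list(product(*factors))

    # Every pair that some exhaustive test would exercise.
    uncovered = set()
    for test in candidates:
        uncovered |= pairs_of(test)

    suite = []
    while uncovered:
        # Pick the candidate that covers the most still-uncovered pairs.
        best = max(candidates, key=lambda t: len(pairs_of(t) & uncovered))
        suite.append(best)
        uncovered -= pairs_of(best)
    return suite


if __name__ == "__main__":
    # Hypothetical design factors for some interface under test.
    factors = [["a", "b"], ["x", "y"], [0, 1]]
    for test in pairwise_suite(factors):
        print(test)
```

The point of the design is that the suite can be much smaller than the exhaustive product of all factor levels while still exercising every two-way interaction.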
Task 3 - Mutation Testing.
Evaluate the applicability of mutation testing methods to products claiming conformance to a
functional specification. Focus on techniques developed by Budd, DeMillo, Lipton, Mathur,
Offutt, Untch, Voas, and others over the past few years. Identify an appropriate class of
functional specifications for which these methods provide an acceptable and useful measure of
software reliability relative to the functional specification.
Task 3 Deliverable: A report containing:
* Applicability of mutation testing to black-box conformance testing.
* Recommendations for further study.
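The basic idea behind mutation testing can be sketched as follows: seed small faults ("mutants") into an implementation, run the test suite against each mutant, and report the fraction of mutants the suite detects. Everything in this example is hypothetical: the `clamp` function stands in for a specified behaviour, the mutants are written by hand, and the tests are input and expected-output pairs of the kind a conformance suite would derive from a specification.

```python
def clamp(value, low, high):
    """Reference behaviour: restrict `value` to the range [low, high]."""
    return max(low, min(value, high))


# Hypothetical hand-written mutants, each containing one small fault.
MUTANTS = {
    "swap_min_max": lambda v, lo, hi: min(lo, max(v, hi)),
    "ignore_low": lambda v, lo, hi: min(v, hi),
    "off_by_one_high": lambda v, lo, hi: max(lo, min(v, hi - 1)),
}


def is_killed(mutant, tests):
    """A mutant is killed if any test observes a wrong result or a crash."""
    for args, expected in tests:
        try:
            if mutant(*args) != expected:
                return True
        except Exception:
            return True
    return False


def mutation_score(mutants, tests):
    """Return (fraction of mutants killed, sorted names of killed mutants)."""
    killed = sorted(name for name, m in mutants.items() if is_killed(m, tests))
    return len(killed) / len(mutants), killed


if __name__ == "__main__":
    weak_suite = [((5, 0, 10), 5)]
    strong_suite = weak_suite + [((-3, 0, 10), 0), ((12, 0, 10), 10)]
    print("weak suite:", mutation_score(MUTANTS, weak_suite))
    print("strong suite:", mutation_score(MUTANTS, strong_suite))
```

In a strict black-box setting the tester cannot mutate the product's source code, which is part of what the applicability evaluation in Task 3 has to address; the sketch only shows how a mutation score measures the adequacy of a test suite.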
Task 4 - Usage Models.
Evaluate the applicability of generalized usage models for functional testing of software. Given
the known usage of a certain class of input functions, determine what can be said about the
reliability of the software restricted to that usage pattern. Determine what can be said about
usage patterns that differ slightly (or significantly) from the tested usage pattern. Also consider
Markov chain usage model techniques developed by the Software Quality Research Laboratory
at the Univ. of Tennessee. Identify an appropriate class of functional specifications for which
usage model methods provide an acceptable and useful measure of software reliability relative to
the functional specification.
Task 4 Deliverables:
* Input data on usage distributions for CGM specification.
* Analysis of data and creation of reliability estimates.
* Evaluation and applicability conclusions.
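A Markov chain usage model can be sketched as a table of states and transition probabilities from which random test sequences are drawn. The states and probabilities below are invented for illustration only; they are not CGM usage data, and the code is not taken from any existing tool. Each generated path from "Start" to "Exit" is one test case whose frequency reflects the assumed usage pattern.

```python
import random

# Hypothetical usage model: state -> list of (next state, probability).
USAGE_MODEL = {
    "Start": [("Open", 1.0)],
    "Open": [("Render", 0.7), ("Query", 0.2), ("Exit", 0.1)],
    "Render": [("Render", 0.3), ("Query", 0.3), ("Exit", 0.4)],
    "Query": [("Render", 0.5), ("Exit", 0.5)],
}


def validate(model, tol=1e-9):
    """Check that each state's outgoing probabilities sum to 1."""
    for state, transitions in model.items():
        total = sum(p for _, p in transitions)
        if abs(total - 1.0) > tol:
            raise ValueError(f"probabilities for {state!r} sum to {total}")


def generate_test(model, rng, start="Start", end="Exit", max_steps=1000):
    """Draw one usage sequence by walking the chain from start to end."""
    path = [start]
    while path[-1] != end and len(path) < max_steps:
        targets, weights = zip(*model[path[-1]])
        path.append(rng.choices(targets, weights=weights, k=1)[0])
    return path


if __name__ == "__main__":
    validate(USAGE_MODEL)
    rng = random.Random(1997)  # fixed seed so runs are repeatable
    for _ in range(3):
        print(" -> ".join(generate_test(USAGE_MODEL, rng)))
```

Because the tests are sampled according to the usage model, pass and fail counts from such a suite describe reliability under that usage pattern only; changing the transition probabilities is one way to explore usage patterns that differ from the tested one.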
Task 5 - Conformance Testing Report
Produce a report that focuses on the potential and the limitations of quantitative measures of
software reliability in the context of black-box conformance testing. Include the individual
conclusions reached during analysis of Tasks 1 through 4 as well as general recommendations for
a foundation on which to base more focused efforts. Prepare key findings for publication in
selected journals or distribution via Web pages, as appropriate.
Task 5 Deliverables:
* Comprehensive Conformance Testing Report.
* Publication or other distribution of key findings.
Task 6 - Experimental Prototype.
Choose one or more of the most promising approaches resulting from Tasks 1 through 4 and
develop a plan for designing and implementing an experimental test scenario applicable in a
general way to black-box conformance testing. Begin implementation of an experimental
prototype for conformance testing of products claiming to satisfy functional specifications.
Task 6 Deliverables:
* Design plan for implementing conformance testing scenario.
* Alpha implementation of conformance testing tool(s).
Task 7 - Clinical Trials.
Investigate application of non-testing methods, e.g. clinical trials, to quantitative measurements
of software quality. Identify an appropriate role for government to play in industry-wide beta testing of products claiming conformance to public specifications, e.g. WWW
specifications. If comprehensive data is collected on the frequency, type, and severity of errors in
proposed implementations, then mature statistical methods exist for estimating software
reliability of resulting products.
Task 7 Deliverable: A report containing:
* Description of the application of clinical trials methods to software testing.
* Conclusions and recommendations for subsequent efforts.
MILESTONES
| Task | Milestone | Date |
|---|---|---|
| Task 0 - Task Success Scenarios | Success Estimates for Tasks 1-4 | 10/97 |
| Task 1 - Binomial Models | Preliminary Status Report | 11/97 |
| | Evaluation and Applicability Report | 4/98 |
| Task 2 - Coverage Designs | Preliminary Status Report | 6/98 |
| | Evaluation and Applicability Report | 9/98 |
| Task 3 - Mutation Testing | Evaluation and Applicability Report | 12/97 |
| Task 4 - Usage Models | Input data for CGM usage distributions | 10/97 |
| | CGM reliability estimates | 1/98 |
| | Analysis of assumptions and estimates | 2/98 |
| | Evaluation and Applicability Report | 5/98 |
| Task 5 - Conformance Testing Report | Preliminary Draft | 7/98 |
| | Status and Planning Report | 9/98 |
| Task 6 - Experimental Prototype | Requirements Plan | 8/98 |
| | Alpha Implementation | 6/99 |
| Task 7 - Clinical Trials | Preliminary Status Report | 5/98 |