 # Dataplot Analysis of Four Typical Problems

## Introduction

A computer language is a tool--a means of generating solutions to problems. Before delving into the details of the DATAPLOT language, let us first consider the types of problems that the scientist, engineer, or researcher typically encounters. This will provide motivation for how the computer language (as a tool) was developed. All computer languages have their own areas of strength. The problems chosen below serve as a frame of reference, for reader and developer alike, for the types of problems that DATAPLOT considers "important" and for which it has been designed to be strong.
## List of Problems

The following are four typical problems.
1. A Graphics Problem--
An analyst has a data set consisting of (x,y) pairs. Plot the data. "Blow up" any interesting sub-regions of the plot.
2. A Non-Linear Fitting Problem--
An analyst has a data set consisting of (x,y) pairs. Read the data into the computer. Plot them. Perform a non-linear fit for the model y = exp(-alpha*x)/(a+b*x). Generate a superimposed plot of raw data and predicted values from the fit. Generate a plot of residuals versus x. Generate a normal probability plot of the residuals.
3. A Data Analysis Problem--
An analyst has data consisting of a response variable and 3 independent variables (factors). Determine if the factors affect the response. Determine if there is interaction between the factors. Perform an analysis of variance. Carry out a graphical analysis of variance.
4. A Mathematics Problem--
An analyst wishes to examine the function x*exp(-x) + sin(x**2) over the interval 0 to 3. Plot the function over the interval. Determine any roots in the interval. Determine its definite integral over the interval.
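To make the mathematics problem concrete, the sketch below works through its root-finding and integration steps in plain Python. It is not DATAPLOT code and says nothing about how DATAPLOT solves the problem internally; the helper names (`f`, `bisect_root`, `find_roots`, `simpson`), the grid sizes, and the choice of bisection and Simpson's rule are all assumptions made for illustration.

```python
import math


def f(x):
    """The function from problem 4: x*exp(-x) + sin(x**2)."""
    return x * math.exp(-x) + math.sin(x ** 2)


def bisect_root(func, lo, hi, tol=1e-12):
    """Refine a root of func inside [lo, hi], assuming a sign change there."""
    flo = func(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        fmid = func(mid)
        if fmid == 0.0:
            return mid
        if (flo < 0) != (fmid < 0):
            hi = mid              # sign change lies in the left half
        else:
            lo, flo = mid, fmid   # sign change lies in the right half
    return 0.5 * (lo + hi)


def find_roots(func, a, b, steps=3000):
    """Scan [a, b] on a uniform grid and refine each bracketed root."""
    xs = [a + (b - a) * i / steps for i in range(steps + 1)]
    roots = []
    for x0, x1 in zip(xs, xs[1:]):
        f0, f1 = func(x0), func(x1)
        if f0 == 0.0:
            roots.append(x0)      # a grid point that is exactly a root
        elif f0 * f1 < 0:
            roots.append(bisect_root(func, x0, x1))
    if func(xs[-1]) == 0.0:
        roots.append(xs[-1])      # right endpoint is not covered by the loop
    return roots


def simpson(func, a, b, n=2000):
    """Composite Simpson's rule with n (forced even) subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3


if __name__ == "__main__":
    print("roots in [0, 3]:", find_roots(f, 0.0, 3.0))
    print("integral over [0, 3]:", simpson(f, 0.0, 3.0))
```

For this function the scan reports the endpoint x = 0, where the function is exactly zero, along with two interior roots, one on each side of x = 2. A grid scan can miss roots that do not change sign or that lie closer together than the grid spacing, so the step count is a parameter worth varying.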
## Some Noteworthy Points

Several points are noteworthy:
1. Graphics as a Core Activity.
Note the graphics component that exists in all of the above problems. Graphics is a key activity in both data analysis and mathematics.
2. Time.
These problems should all take less than 10 minutes to solve.
3. Number of Lines of Code.
These problems should all be solvable with fewer than 10 lines of code.
4. Interactive Analysis.
These problems should be solvable interactively so that in case some interesting tangent arises in the course of the solution, the analyst can immediately pursue it.
5. Graphics Quality.
Ideally, the graphics should be fully continuous and of manuscript quality, if so desired.
6. Variety of Graphics Devices.
On the other hand, if the analyst is working at a discrete terminal or in batch, neither the logic of the analysis nor the entered plot commands should be any different.
7. Subset Analyses.
The analyst should not need to worry about whether the graphics, fitting, or data analysis is being performed over the full data set or over any complicated subsets of the data. Performing analyses over subsets should not require preliminary data extraction and manipulation that is irrelevant as far as the scientist is concerned. It should be as easy to carry out any (and all) graphics and analysis operations over a subset as it is to carry them out over the full set.
8. Sample Size.
The analyst should not need to worry about whether the data set consists of 7 data points or 700 data points. The number of data points is a nuisance parameter that the analyst should not need to concern himself with.
9. Data Format.
The analyst should not need to worry about how the data is formatted upon input. This also is an unimportant nuisance item.
10. Predicted Values and Residuals.
The analyst should not need to worry about predicted values and residuals from the fit. They should be automatically available for further analysis and plotting.
11. Scope of Capabilities.
The analyst should be able to fluidly glide from graphics to fitting to data analysis to mathematics activities with no interruption and within the context of the language.
12. Ease of Use.
The ultimate objective of the analyst is not to learn a computer language but to gain insight into the problem at hand; thus the computer language should be natural, easy to learn, and easy to use. The language that the analyst uses should preserve the continuity of thought that is so important in scientific research.
13. English-Syntax.
Ideally, the language should correspond as closely as possible to the English-language and mathematical representation of the solution. This will allow the analyst to "think science" as opposed to "think computing" and will eliminate an unnecessary mapping from conceptual solution to computer-language solution.

NIST is an agency of the U.S. Commerce Department.

Date created: 06/05/2001
Last updated: 09/28/2016