1.3.3.26.11. Scatterplot Matrix

Purpose:
Check Pairwise Relationships Between Variables

Given a set of variables X₁, X₂, ... , X_k, the scatterplot matrix contains all the pairwise scatter plots of the variables on a single page in a matrix format. That is, if there are k variables, the scatterplot matrix will have k rows and k columns, and the panel in the ith row and jth column of this matrix is a plot of X_i versus X_j.
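The layout described above can be sketched in a few lines of Python. This is a minimal illustration, assuming the matplotlib and NumPy libraries are available; the function name `scatterplot_matrix` and the synthetic data are hypothetical and are not part of the handbook.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to view plots on screen
import matplotlib.pyplot as plt
import numpy as np


def scatterplot_matrix(data, names):
    """Draw a k-by-k grid; panel (i, j) plots X_i (vertical) versus X_j (horizontal).

    data  : array-like of shape (n_observations, k)
    names : list of k variable labels
    """
    data = np.asarray(data, dtype=float)
    k = data.shape[1]
    fig, axes = plt.subplots(k, k, figsize=(2.2 * k, 2.2 * k), squeeze=False)
    for i in range(k):
        for j in range(k):
            ax = axes[i, j]
            ax.scatter(data[:, j], data[:, i], s=8)
            if i == k - 1:          # label the horizontal axis on the bottom row only
                ax.set_xlabel(names[j])
            if j == 0:              # label the vertical axis on the first column only
                ax.set_ylabel(names[i])
    return fig, axes


# Synthetic illustration data (assumption: not the handbook's sample data).
rng = np.random.default_rng(0)
x1 = rng.normal(size=50)
x2 = 2.0 * x1 + rng.normal(scale=0.5, size=50)
x3 = rng.uniform(size=50)
fig, axes = scatterplot_matrix(np.column_stack([x1, x2, x3]), ["X1", "X2", "X3"])
```

Calling `fig.savefig("matrix.png")` afterwards writes the grid to a file. With this plain layout each diagonal panel shows the points of X_i plotted against itself, which fall on a 45-degree line.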

Although the basic concept of the scatterplot matrix is simple, there are numerous alternatives in the details of the plots.

The diagonal plot is simply a 45-degree line since we are plotting X_i versus X_i. Although this is of some use in showing the univariate distribution of the variable, other alternatives are common. Some users prefer to use the diagonal to print the variable label. Another alternative is to plot the univariate histogram on the diagonal. Alternatively, we could simply leave the diagonal blank.
Since X_i versus X_j is equivalent to X_j versus X_i with the axes reversed, some prefer to omit the plots below the diagonal.
It can be helpful to overlay some type of fitted curve on the scatter plot. Although a linear or quadratic fit can be used, the most common alternative is to overlay a lowess curve.
Due to the potentially large number of plots, it can be somewhat tricky to provide the axis labels in a way that is both informative and visually pleasing. One alternative that seems to work well is to provide axis labels on alternating rows and columns. That is, row one will have tick marks and axis labels on the left vertical axis for the first plot only, while row two will have tick marks and axis labels on the right vertical axis for the last plot in the row only. This alternating pattern continues for the remaining rows. A similar pattern is used for the columns and the horizontal axis labels. Another alternative is to put the minimum and maximum scale values in the diagonal plot with the variable name.
Some analysts prefer to connect the scatter plots. Others prefer to leave a little gap between each plot.
Although this plot type is most commonly used for scatter plots, the basic concept is both simple and powerful and extends easily to other plot formats that involve pairwise plots such as the quantile-quantile plot and the bihistogram.
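Several of the alternatives listed above can be combined in one sketch: a histogram with the variable label on the diagonal, the panels below the diagonal omitted, a fitted curve overlaid on each scatter plot, and a small gap between panels. The code assumes matplotlib and NumPy; the function name `scatterplot_matrix_variant` and its parameters are hypothetical. Note that the overlay here is an ordinary least-squares polynomial fit (linear by default), used as a simple stand-in; it is not a lowess curve.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to view plots on screen
import matplotlib.pyplot as plt
import numpy as np


def scatterplot_matrix_variant(data, names, upper_only=True, fit_degree=1):
    """Scatterplot matrix with histograms on the diagonal and a polynomial fit overlay."""
    data = np.asarray(data, dtype=float)
    k = data.shape[1]
    fig, axes = plt.subplots(k, k, figsize=(2.2 * k, 2.2 * k), squeeze=False)
    for i in range(k):
        for j in range(k):
            ax = axes[i, j]
            if i == j:
                # Diagonal: univariate histogram, titled with the variable label.
                ax.hist(data[:, i], bins=10)
                ax.set_title(names[i], fontsize=9)
            elif upper_only and i > j:
                # Below the diagonal: mirror image of panel (j, i), so hide it.
                ax.set_visible(False)
            else:
                x, y = data[:, j], data[:, i]
                ax.scatter(x, y, s=8)
                # Least-squares polynomial fit (assumption: stand-in for lowess).
                coeffs = np.polyfit(x, y, fit_degree)
                grid = np.linspace(x.min(), x.max(), 100)
                ax.plot(grid, np.polyval(coeffs, grid), color="red")
    # Leave a little gap between plots; set both values to 0 to connect them.
    fig.subplots_adjust(wspace=0.3, hspace=0.3)
    return fig, axes


# Synthetic illustration data (assumption: not the handbook's sample data).
rng = np.random.default_rng(1)
a = rng.normal(size=60)
b = a ** 2 + rng.normal(scale=0.3, size=60)
c = rng.uniform(size=60)
fig, axes = scatterplot_matrix_variant(np.column_stack([a, b, c]), ["A", "B", "C"])
```

Passing `fit_degree=2` overlays a quadratic fit instead, and `upper_only=False` restores the panels below the diagonal.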

Sample Plot

This sample plot was generated from pollution data collected by NIST chemist Lloyd Currie.

There are a number of ways to view this plot. If we are primarily interested in a particular variable, we can scan the row and column for that variable. If we are interested in finding the strongest relationship, we can scan all the plots and then determine which variables are related.

Definition

Given k variables, scatter plot matrices are formed by creating k rows and k columns. Each row-column pair defines a single scatter plot.

The individual plot for row i and column j is defined as

Vertical axis: Variable X_i
Horizontal axis: Variable X_j

Questions

The scatterplot matrix can provide answers to the following questions:

Are there pairwise relationships between the variables?
If there are relationships, what is the nature of these relationships?
Are there outliers in the data?
Is there clustering by groups in the data?

Linking and Brushing

The scatterplot matrix serves as the foundation for the concepts of linking and brushing.

By linking, we mean showing how a point, or set of points, behaves in each of the plots. This is accomplished by highlighting these points in some fashion. For example, the highlighted points could be drawn as filled circles while the remaining points could be drawn as unfilled circles. A typical application of this would be to show how an outlier shows up in each of the individual pairwise plots.

Brushing extends this concept a bit further. In brushing, the points to be highlighted are selected interactively with a mouse and the scatterplot matrix is dynamically updated (ideally in real time). That is, we can select a rectangular region of points in one plot and see how those points are reflected in the other plots. Brushing is discussed in detail by Becker, Cleveland, and Wilks in the paper "Dynamic Graphics for Data Analysis" (Cleveland and McGill, 1988).
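The logic behind a rectangular brush can be sketched without any graphics: the brush yields one true/false flag per observation, and linking means applying that same set of flags in every panel. The sketch below uses only the Python standard library; the function names `brush` and `linked_points` and the small data set are hypothetical, and a real implementation would redraw the highlighted points interactively.

```python
from itertools import permutations


def brush(data, x_var, y_var, x_range, y_range):
    """Flag each observation that lies inside the brushed rectangle.

    The rectangle is drawn on the panel with x_var on the horizontal axis
    and y_var on the vertical axis.
    """
    (x_lo, x_hi), (y_lo, y_hi) = x_range, y_range
    return [
        x_lo <= x <= x_hi and y_lo <= y <= y_hi
        for x, y in zip(data[x_var], data[y_var])
    ]


def linked_points(data, selected):
    """For every off-diagonal panel, list the (x, y) coordinates to highlight."""
    panels = {}
    for row_var, col_var in permutations(data, 2):
        # Panel (row_var, col_var): row variable is vertical, column variable horizontal.
        panels[(row_var, col_var)] = [
            (x, y)
            for x, y, keep in zip(data[col_var], data[row_var], selected)
            if keep
        ]
    return panels


# Made-up data for illustration; the fourth observation is an outlier in A and B.
data = {
    "A": [1.0, 2.0, 3.0, 9.0, 4.0],
    "B": [1.1, 2.1, 2.9, 9.5, 4.2],
    "C": [5.0, 3.0, 4.0, 4.5, 3.5],
}
selected = brush(data, "A", "B", x_range=(8.0, 10.0), y_range=(9.0, 10.0))
highlighted = linked_points(data, selected)
```

Here the brush on the A-versus-B panel selects only the fourth observation, and `highlighted` gives its position in each of the six off-diagonal panels, for example (9.0, 4.5) in the panel with C on the vertical axis and A on the horizontal axis.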

Related Techniques

Star plot
Scatter plot
Conditioning plot
Locally weighted least squares

Software

Scatterplot matrices are becoming increasingly common in general purpose statistical software programs. If a software program does not generate scatterplot matrices, but it does provide multiple plots per page and scatter plots, it should be possible to write a macro to generate a scatterplot matrix. Brushing is available in a few of the general purpose statistical software programs that emphasize graphical approaches.