1. Exploratory Data Analysis
1.3. EDA Techniques
1.3.3. Graphical Techniques: Alphabetic Scatter Plot

Scatter Plot: Outlier

Scatter Plot Showing Outliers (figure)
Discussion: The scatter plot here reveals
  1. a basic linear relationship between X and Y for most of the data, and
  2. a single outlier (at X = 375).
An outlier is defined as a data point that comes from a different model than the rest of the data. The data here appear to come from a linear model with a given slope and variation, except for the outlier, which appears to have been generated by some other model.

Outlier detection is important for effective modeling, and outliers should be excluded when the model is fitted. If all the data here are included in a linear regression, the fitted model will be poor virtually everywhere. If the outlier is omitted from the fitting process, the resulting fit will be excellent almost everywhere (for all points except the outlying point).
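
The effect of a single outlier on a straight-line fit can be sketched numerically. The example below is only an illustration under stated assumptions: the data are synthetic (not the data behind the plot), the true line, the noise level, and the size of the shift at X = 375 are all invented, and the helper names `fit_line` and `rms_residual` are hypothetical. It fits a least-squares line once with all points and once with the outlier omitted, then compares how well each line describes the non-outlying points.

```python
import numpy as np

# Hypothetical data, NOT the handbook's data set: a noisy straight line
# with one point at X = 375 shifted far off the line.
rng = np.random.default_rng(0)
x = np.arange(0.0, 501.0, 25.0)                     # 21 points: 0, 25, ..., 500
y = 2.0 * x + 10.0 + rng.normal(0.0, 5.0, x.size)   # assumed slope, intercept, noise
outlier = x == 375.0                                # boolean mask for the outlying point
y[outlier] += 600.0                                 # assumed size of the shift


def fit_line(xs, ys):
    """Ordinary least-squares straight line; returns (slope, intercept)."""
    slope, intercept = np.polyfit(xs, ys, 1)
    return float(slope), float(intercept)


def rms_residual(xs, ys, slope, intercept):
    """Root-mean-square vertical distance of the points from the line."""
    return float(np.sqrt(np.mean((ys - (slope * xs + intercept)) ** 2)))


# Fit with every point, then with the outlier omitted.
slope_all, icpt_all = fit_line(x, y)
slope_clean, icpt_clean = fit_line(x[~outlier], y[~outlier])

# Judge both fits on the non-outlying points only.
rms_all = rms_residual(x[~outlier], y[~outlier], slope_all, icpt_all)
rms_clean = rms_residual(x[~outlier], y[~outlier], slope_clean, icpt_clean)

print(f"all points:      slope={slope_all:.3f}  intercept={icpt_all:.2f}  rms={rms_all:.2f}")
print(f"outlier omitted: slope={slope_clean:.3f}  intercept={icpt_clean:.2f}  rms={rms_clean:.2f}")
```

In this sketch the outlier is identified by its known X value. In practice it would be identified from the scatter plot itself before deciding to omit it.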
