
4-Plot

NIST/SEMATECH Section 1.3.3.32 4-Plot

[Figure: 4-plot with four panels. Run Sequence (Value vs. Observation), Lag Plot (Y(i) vs. Y(i-1)), Histogram (Frequency), Normal Probability (Ordered Response vs. Normal N(0,1) Order Statistic Medians).]

What It Is

The 4-plot is a composite diagnostic display that combines four graphical panels into a single figure: a run-sequence plot, a lag plot, a histogram, and a normal probability plot. Together, the panels test the four underlying assumptions of a univariate measurement process: fixed location, fixed variation, randomness, and distributional form.

The four panels occupy a $2 \times 2$ grid. Upper-left: run-sequence plot ($Y_i$ vs run order $i$) tests fixed location and fixed variation. Upper-right: lag plot ($Y_i$ vs $Y_{i-1}$) tests randomness and detects serial correlation. Lower-left: histogram tests distributional shape, modality, and symmetry. Lower-right: normal probability plot specifically tests for normality. Together, these four panels answer: is the process stable, is it random, and what does the distribution look like? The 4-plot also applies to residuals from fitted models $Y_i = f(X_1, \ldots, X_k) + E_i$, making it useful beyond simple univariate analysis.
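The residual case can be sketched as follows. This is a minimal illustration on synthetic data: the straight-line model, the coefficients, and the noise level are assumptions chosen for the example, not values from the handbook. The resulting `residuals` array is what would be passed to the four panels in place of the raw response.

```python
import numpy as np

# Synthetic data from an assumed straight-line model Y = 3 + 2*X + E
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 100)
y = 3.0 + 2.0 * x + rng.normal(scale=1.0, size=x.size)

# Least-squares fit; np.polyfit returns the highest-degree coefficient first
slope, intercept = np.polyfit(x, y, deg=1)

# Residuals E_i = Y_i - f(X_i), kept in the original run order
residuals = y - (intercept + slope * x)

print(f"fitted slope = {slope:.3f}, intercept = {intercept:.3f}")
print(f"residual mean = {residuals.mean():.2e}, residual sd = {residuals.std(ddof=2):.3f}")
```

If the fitted model is adequate, the 4-plot of `residuals` should look like the ideal case: a flat run-sequence band, a structureless lag plot, and a roughly normal histogram and probability plot.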

Questions This Plot Answers

  • Is the process in-control, stable, and predictable?
  • Is the process drifting with respect to location?
  • Is the process drifting with respect to variation?
  • Are the data random?
  • Is an observation related to an adjacent observation?
  • If the data are a time series, is it white noise?
  • If not white noise, is it sinusoidal, autoregressive, etc.?
  • If the data are non-random, what is a better model?
  • Does the process follow a normal distribution?
  • If non-normal, what distribution does the process follow?
  • Is the underlying model $Y_i = A_0 + E_i$ valid and sufficient?
  • If the default model is insufficient, what is a better model?
  • Is the formula $s/\sqrt{N}$ valid for computing the uncertainty of the mean?
  • Is the sample mean a good estimator of the process location?
  • If not, what would be a better estimator?
  • Are there any outliers?
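
Two of the questions above have simple numeric companions. The sketch below computes the uncertainty of the mean, $s/\sqrt{N}$, and the lag-1 autocorrelation for a synthetic sample. The helper names `mean_uncertainty` and `lag1_autocorrelation` are hypothetical, and the $\pm 2/\sqrt{N}$ band is a common rule of thumb for white noise rather than something stated in this section.

```python
import numpy as np

def mean_uncertainty(y):
    """Standard deviation of the mean, s / sqrt(N), using the sample standard deviation."""
    y = np.asarray(y, dtype=float)
    return y.std(ddof=1) / np.sqrt(y.size)

def lag1_autocorrelation(y):
    """Lag-1 sample autocorrelation coefficient."""
    y = np.asarray(y, dtype=float)
    d = y - y.mean()
    return np.sum(d[1:] * d[:-1]) / np.sum(d * d)

# Synthetic data, for illustration only
rng = np.random.default_rng(42)
y = rng.normal(loc=10, scale=2, size=200)

r1 = lag1_autocorrelation(y)
band = 2.0 / np.sqrt(y.size)  # rough 95% band for white noise (rule of thumb)
print(f"mean = {y.mean():.3f} +/- {mean_uncertainty(y):.3f}")
print(f"lag-1 autocorrelation = {r1:.3f} (white-noise band about +/-{band:.3f})")
```

The $s/\sqrt{N}$ figure is only trustworthy when the run-sequence and lag panels look clean; a lag-1 autocorrelation well outside the band is a sign that it understates or overstates the real uncertainty.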

Why It Matters

The 4-plot is the universal first-step diagnostic for any univariate measurement process. It simultaneously tests all four foundational assumptions (fixed location, fixed variation, randomness, and distributional form) in a single display. If any panel shows a problem, the analyst knows immediately which assumption is violated and can choose appropriate corrective methods before proceeding with analysis.

When to Use a 4-Plot

Use the 4-plot as a comprehensive screening tool to simultaneously check all four assumptions that underlie most univariate statistical analyses. Rather than examining each assumption separately, the 4-plot provides a single-page summary that reveals whether the data are suitable for standard statistical methods. It is the recommended starting point in the NIST/SEMATECH handbook for univariate process characterization and is particularly valuable during initial data exploration before committing to specific modeling or testing approaches.

How to Interpret a 4-Plot

The run-sequence plot (upper left) plots $Y_i$ vs run order $i$ and checks for fixed location and fixed variation over time: a horizontal band of points with constant spread indicates stability. The lag plot (upper right) plots $Y_i$ vs $Y_{i-1}$ and checks for randomness: a structureless cloud indicates independence, while any pattern indicates serial correlation. The histogram (lower left) provides a visual summary of the distributional shape, center, and spread, and flags potential outliers or multimodality. The normal probability plot (lower right) plots ordered $Y_i$ against theoretical $N(0,1)$ quantiles and specifically assesses normality: points along the reference line indicate a normal distribution. When all four panels show ideal patterns, the analyst can proceed with standard methods. When any panel shows departures, the nature of the departure guides the choice of alternative methods.
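
As a rough numeric counterpart to reading the run-sequence panel, the data can be cut into consecutive segments and the segment means and standard deviations compared. This is only a sketch: the helper `segment_summary` and the choice of four segments are assumptions made for illustration, not part of the 4-plot itself.

```python
import numpy as np

def segment_summary(y, n_segments=4):
    """Mean and sample standard deviation of consecutive, roughly equal segments."""
    y = np.asarray(y, dtype=float)
    parts = np.array_split(y, n_segments)  # preserves run order
    means = np.array([p.mean() for p in parts])
    sds = np.array([p.std(ddof=1) for p in parts])
    return means, sds

# Synthetic stable process, for illustration only
rng = np.random.default_rng(1)
stable = rng.normal(loc=10, scale=2, size=200)

means, sds = segment_summary(stable)
print("segment means:", np.round(means, 2))
print("segment sds:  ", np.round(sds, 2))
```

Segment means that climb or fall steadily suggest a location drift, and segment standard deviations that change markedly suggest a variation drift; both would also be visible as a tilted or fanning band in the run-sequence panel.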

Examples

Ideal Process

Run-sequence plot shows a stable horizontal band. Lag plot shows a structureless cloud. Histogram is bell-shaped. Normal probability plot follows the reference line. All four assumptions are satisfied — the process is stable, random, and normally distributed.

Trending Process

Run-sequence plot shows a clear upward or downward drift. Lag plot shows a tight diagonal band. Histogram may appear normal but the time-dependence makes summary statistics misleading. The trend must be modeled or removed before further analysis.
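
A small simulation illustrates this case. The drift rate, noise level, and straight-line detrending below are assumptions made for the sketch; a real trend may call for a different model. The correlation between $Y_{i-1}$ and $Y_i$ is the numeric counterpart of the diagonal band in the lag plot.

```python
import numpy as np

def lag1_corr(y):
    """Correlation between Y(i-1) and Y(i), i.e. the points shown in a lag plot."""
    return np.corrcoef(y[:-1], y[1:])[0, 1]

# Synthetic process with a linear drift in location (illustrative values)
rng = np.random.default_rng(7)
n = 200
run_order = np.arange(1, n + 1)
trended = 0.05 * run_order + rng.normal(scale=1.0, size=n)

# Model the trend with a least-squares line and remove it
slope, intercept = np.polyfit(run_order, trended, deg=1)
detrended = trended - (intercept + slope * run_order)

print(f"lag-1 correlation before detrending: {lag1_corr(trended):.3f}")
print(f"lag-1 correlation after detrending:  {lag1_corr(detrended):.3f}")
```

Before detrending the lag-1 correlation is strongly positive; after the fitted line is removed it should fall close to zero, and the detrended values can be examined with a fresh 4-plot.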

Non-Normal Process

Run-sequence plot and lag plot look well behaved (stable, random), but the histogram is skewed and the normal probability plot curves away from the reference line. The process is in control, but a normal-theory analysis would give incorrect confidence intervals.
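
The curvature in the normal probability plot can be summarized by the correlation coefficient of the plotted points, which `scipy.stats.probplot` returns alongside the fitted line. The sketch below compares a normal sample with a skewed one; the exponential distribution and the sample size are assumptions picked for illustration.

```python
import numpy as np
from scipy.stats import probplot, skew

# Synthetic samples: one normal, one right-skewed (illustrative choices)
rng = np.random.default_rng(3)
normal_sample = rng.normal(loc=10, scale=2, size=200)
skewed_sample = rng.exponential(scale=2.0, size=200)

# probplot returns ((theoretical quantiles, ordered data), (slope, intercept, r))
_, (_, _, r_normal) = probplot(normal_sample)
_, (_, _, r_skewed) = probplot(skewed_sample)

print(f"normal sample: r = {r_normal:.4f}, skewness = {skew(normal_sample):.2f}")
print(f"skewed sample: r = {r_skewed:.4f}, skewness = {skew(skewed_sample):.2f}")
```

A value of `r` close to 1 corresponds to points hugging the reference line; the skewed sample gives a visibly lower value, matching the curved pattern described above.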

Assumptions and Limitations

The 4-plot requires time-ordered data for the run-sequence and lag plot panels to be meaningful. If the data do not have a natural time ordering, only the histogram and probability plot panels are interpretable. The 4-plot is a screening tool, not a definitive test, and unusual patterns should be investigated with more specialized techniques.

Reference: NIST/SEMATECH e-Handbook of Statistical Methods, Section 1.3.3.32

Formulas

4-Plot Diagnostic Ensemble

$$\text{4-Plot} = \begin{bmatrix} Y_i \text{ vs } i & Y_i \text{ vs } Y_{i-1} \\ \text{Histogram}(Y) & \text{Normal Prob. Plot}(Y) \end{bmatrix}$$

The 4-plot combines four panels in a $2 \times 2$ grid: run-sequence plot ($Y_i$ vs $i$, tests fixed location and variation), lag plot ($Y_i$ vs $Y_{i-1}$, tests randomness), histogram (shows distributional shape), and normal probability plot (ordered $Y_i$ vs $N(0,1)$ quantiles, tests normality).
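
The horizontal axis of the normal probability plot uses normal order statistic medians. One common way to compute them, and the one `scipy.stats.probplot` uses, is Filliben's approximation to the uniform order statistic medians followed by the inverse normal CDF. The sketch below assumes that approximation; the function name `normal_order_statistic_medians` is hypothetical.

```python
import numpy as np
from scipy.stats import norm, probplot

def normal_order_statistic_medians(n):
    """N(0,1) order statistic medians via Filliben's approximation (assumed here)."""
    i = np.arange(1, n + 1)
    m = (i - 0.3175) / (n + 0.365)   # interior points, i = 2 .. n-1
    m[-1] = 0.5 ** (1.0 / n)         # largest order statistic
    m[0] = 1.0 - m[-1]               # smallest order statistic
    return norm.ppf(m)               # map uniform medians to N(0,1) quantiles

# Compare with the theoretical quantiles that probplot computes
rng = np.random.default_rng(42)
data = rng.normal(loc=10, scale=2, size=50)
(osm, osr), _ = probplot(data)

q = normal_order_statistic_medians(data.size)
print("max abs difference from probplot:", np.max(np.abs(q - osm)))
```

Plotting the sorted data (`osr`) against these medians reproduces the lower-right panel without relying on `probplot` for the drawing.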

Python Example

import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import probplot
# Generate sample data from a normal process
rng = np.random.default_rng(42)
n = 200
data = rng.normal(loc=10, scale=2, size=n)
fig, axes = plt.subplots(2, 2, figsize=(10, 8))
# Panel 1: Run sequence plot
axes[0, 0].plot(range(1, n + 1), data, "o", markersize=2)
axes[0, 0].axhline(y=np.mean(data), color="r", linestyle="--")
axes[0, 0].set_xlabel("Run Order")
axes[0, 0].set_ylabel("Response")
axes[0, 0].set_title("Run Sequence Plot")
# Panel 2: Lag plot
axes[0, 1].scatter(data[:-1], data[1:], alpha=0.5, s=10)
axes[0, 1].set_xlabel("Y(i-1)")
axes[0, 1].set_ylabel("Y(i)")
axes[0, 1].set_title("Lag Plot")
# Panel 3: Histogram
axes[1, 0].hist(data, bins=20, density=True, edgecolor="black")
axes[1, 0].set_xlabel("Value")
axes[1, 0].set_ylabel("Density")
axes[1, 0].set_title("Histogram")
# Panel 4: Normal probability plot
probplot(data, plot=axes[1, 1])
axes[1, 1].set_title("Normal Probability Plot")
fig.suptitle("4-Plot Diagnostic", fontsize=14)
plt.tight_layout()
plt.show()