Linear Plots

NIST/SEMATECH Section 1.3.3.16-19 Linear Plots

[Figure: four panels, each plotting a per-batch statistic against Batch: Linear Correlation Plot (Correlation(Y,X)), Linear Intercept Plot (Intercept(Y,X)), Linear Slope Plot (Slope(Y,X)), and Linear RESSD Plot (Residual SD).]

Linear plots are a set of four companion plots used to assess whether linear fits are consistent across groups: the linear correlation plot (1.3.3.16), the linear intercept plot (1.3.3.17), the linear slope plot (1.3.3.18), and the linear residual standard deviation plot (1.3.3.19). Each plot shows a per-group statistic on the vertical axis against the group identifier on the horizontal axis, with a reference line at the corresponding statistic computed from all the data.

What It Is

For each group, a linear fit is computed and four statistics are extracted: the correlation coefficient, the intercept, the slope, and the residual standard deviation. Each statistic is plotted against the group identifier, giving four companion panels, and a reference line on each panel shows the corresponding statistic from a linear fit to all the data combined. Roughly flat traces in all four panels indicate that a single linear relationship describes every group. Trends or jumps in any panel indicate that the fit parameters differ across groups, in which case separate models per group may be needed.
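
The source does not spell out the four statistics. As a sketch, assuming an ordinary least-squares fit to the n_g observations (x_i, y_i) of group g (the symbols below are introduced here for illustration only), they can be written as:

```latex
\begin{aligned}
S_{xx} &= \sum_{i=1}^{n_g} (x_i - \bar{x})^2, \qquad
S_{yy} = \sum_{i=1}^{n_g} (y_i - \bar{y})^2, \qquad
S_{xy} = \sum_{i=1}^{n_g} (x_i - \bar{x})(y_i - \bar{y}) \\
b_g &= \frac{S_{xy}}{S_{xx}} \quad \text{(slope)} \\
a_g &= \bar{y} - b_g \, \bar{x} \quad \text{(intercept)} \\
r_g &= \frac{S_{xy}}{\sqrt{S_{xx} \, S_{yy}}} \quad \text{(correlation)} \\
s_g &= \sqrt{\frac{1}{n_g - 2} \sum_{i=1}^{n_g} \left( y_i - a_g - b_g x_i \right)^2} \quad \text{(residual standard deviation)}
\end{aligned}
```

The n_g - 2 divisor reflects the two fitted parameters and matches the residual SD computed in the Python example. The reference-line values follow from the same formulas applied to all observations at once.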

Questions This Plot Answers

  • Are there linear relationships across groups?
  • Is the strength of the linear relationships relatively constant across the groups?
  • Is the intercept from linear fits relatively constant across groups?
  • If the intercepts vary across groups, is there a discernible pattern?
  • Do you get the same slope across groups for linear fits?
  • If the slopes differ, is there a discernible pattern in the slopes?
  • Is the residual standard deviation from a linear fit constant across groups?
  • If the residual standard deviations vary, is there a discernible pattern across the groups?

Why It Matters

For grouped data, it may be important to know whether the different groups are homogeneous (similar) or heterogeneous (different). Linear plots help answer this question in the context of linear fitting. A regression can have a high overall correlation but be driven by different relationships in different groups. Detecting this heterogeneity prevents the incorrect application of a single model to data that require group-specific models.
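
To make the point about a high overall correlation hiding group differences concrete, here is a minimal sketch using made-up, noise-free data (the two groups, their slopes of 1.0 and 1.2, and all variable names are hypothetical, chosen only for illustration):

```python
import numpy as np

# Hypothetical data: two groups measured at the same x values,
# each perfectly linear but with a different slope.
x = np.linspace(1.0, 10.0, 20)
y_a = 1.0 * x  # group A, slope 1.0
y_b = 1.2 * x  # group B, slope 1.2

# Pool both groups as if they were one data set.
x_pooled = np.concatenate([x, x])
y_pooled = np.concatenate([y_a, y_b])

# Correlation of the pooled data and least-squares slopes (degree-1 polyfit).
pooled_r = np.corrcoef(x_pooled, y_pooled)[0, 1]
pooled_slope = np.polyfit(x_pooled, y_pooled, 1)[0]
slope_a = np.polyfit(x, y_a, 1)[0]
slope_b = np.polyfit(x, y_b, 1)[0]

print(f"pooled r     = {pooled_r:.3f}")
print(f"pooled slope = {pooled_slope:.3f}")
print(f"group slopes = {slope_a:.3f}, {slope_b:.3f}")
```

The pooled correlation stays high even though neither group follows the pooled slope. A linear slope plot would show this as two points sitting on opposite sides of the reference line.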

When to Use Linear Plots

Use linear plots when your data have groups and you need to know whether a single linear fit can be used across all groups or whether separate fits are required for each group. The correlation plot is often examined first: if correlations are high across groups, it is worthwhile to continue with the slope, intercept, and residual standard deviation plots. If correlations are weak, a different model should be pursued. When you do not have predefined groups, treat each distinct data set as a group.
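
The "correlation first" workflow can be sketched as a small screening step. The function name screen_correlations and the cutoff min_abs_r=0.9 are assumptions made for this illustration; the source gives no numeric threshold for what counts as a high correlation.

```python
import numpy as np

def screen_correlations(x, y, groups, min_abs_r=0.9):
    """Return per-group correlations and whether all of them pass the cutoff.

    min_abs_r is a hypothetical cutoff, not a value taken from the handbook.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    groups = np.asarray(groups)

    r_by_group = {}
    for g in np.unique(groups):
        mask = groups == g
        # Pearson correlation between x and y within this group only.
        r_by_group[g.item()] = float(np.corrcoef(x[mask], y[mask])[0, 1])

    proceed = all(abs(r) >= min_abs_r for r in r_by_group.values())
    return r_by_group, proceed

# Usage: group 1 is perfectly linear, group 2 is not.
x = [1, 2, 3, 4, 1, 2, 3, 4]
y = [2, 4, 6, 8, 1, -1, 1, -1]
groups = [1, 1, 1, 1, 2, 2, 2, 2]
r_by_group, proceed = screen_correlations(x, y, groups)
print(r_by_group, proceed)
```

If proceed is False, the groups with weak correlations are the ones to inspect before looking at the slope, intercept, and residual standard deviation plots.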

How to Interpret Linear Plots

Each plot shows one per-group statistic against the group identifier:

  • Linear correlation plot: the correlation for each group. High, roughly constant values indicate that linear fitting is appropriate across all groups.
  • Linear intercept plot: the fitted intercept for each group. Constant values confirm a stable baseline, while shifts suggest different operating conditions.
  • Linear slope plot: the fitted slope for each group. Constant values mean the rate of change is uniform, while variation suggests group-specific relationships.
  • Linear residual standard deviation plot: the residual standard deviation for each group. Constant values indicate homogeneous fit quality, while trends indicate that the fit is better for some groups than others.

A reference line at the overall statistic (computed from all data) appears on each plot.
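
As a numeric companion to reading the panels by eye, the sketch below summarizes how far a set of per-group values strays from the overall reference value. The helper spread_summary and the example slopes are hypothetical. This is a descriptive summary under those assumptions, not a formal test of homogeneity.

```python
import numpy as np

def spread_summary(values, overall):
    """Describe the spread of per-group statistics around the overall value."""
    values = np.asarray(values, dtype=float)
    abs_dev = np.abs(values - overall)
    return {
        "min": float(values.min()),
        "max": float(values.max()),
        "range": float(values.max() - values.min()),
        "max_abs_deviation": float(abs_dev.max()),
        # Zero-based position of the group farthest from the reference line.
        "index_of_max_deviation": int(abs_dev.argmax()),
    }

# Usage with made-up per-group slopes and a made-up overall slope.
slopes = [0.174, 0.175, 0.173, 0.180]
summary = spread_summary(slopes, overall=0.1755)
print(summary)
```

Here the last group stands out, which corresponds to a visible jump in the slope panel.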

Assumptions and Limitations

Linear plots require grouped data with enough observations per group to compute a meaningful linear fit: at least three, because the residual standard deviation divides by n - 2, and in practice considerably more for the per-group statistics to be stable. The four plots assume the analyst has already established that a linear model is a reasonable starting point, and they serve to diagnose whether that model holds uniformly across groups.
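
Before fitting, it can help to check group sizes. The sketch below separates groups that have enough observations from those that do not. The helper name split_by_group_size and the default min_n=3 are assumptions for illustration; three is only the arithmetic minimum for a residual standard deviation with n - 2 degrees of freedom, not a recommended sample size.

```python
import numpy as np

def split_by_group_size(groups, min_n=3):
    """Return (usable, too_small) lists of group labels based on group size."""
    labels, counts = np.unique(np.asarray(groups), return_counts=True)
    usable = [lab.item() for lab, c in zip(labels, counts) if c >= min_n]
    too_small = [lab.item() for lab, c in zip(labels, counts) if c < min_n]
    return usable, too_small

# Usage: group "b" has only two observations and is set aside.
groups = ["a", "a", "a", "b", "b", "c", "c", "c", "c"]
usable, too_small = split_by_group_size(groups)
print(usable, too_small)
```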

Reference: NIST/SEMATECH e-Handbook of Statistical Methods, Sections 1.3.3.16-19

Python Example

import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

# Simulate grouped calibration data (15 batches of 50 observations)
rng = np.random.default_rng(42)
n_batches, n_per_batch = 15, 50
x_all, y_all, batches = [], [], []
for b in range(1, n_batches + 1):
    x = np.linspace(1, 11, n_per_batch)
    slope = 0.174 + rng.normal(0, 0.001)      # slope varies slightly by batch
    intercept = 0.01 + rng.normal(0, 0.015)   # intercept varies by batch
    y = slope * x + intercept + rng.normal(0, 0.002, n_per_batch)
    x_all.extend(x)
    y_all.extend(y)
    batches.extend([b] * n_per_batch)
x_all, y_all, batches = np.array(x_all), np.array(y_all), np.array(batches)

# Compute per-batch linear fit statistics
batch_ids = np.unique(batches)
correlations, intercepts, slopes, res_sds = [], [], [], []
for b in batch_ids:
    mask = batches == b
    res = stats.linregress(x_all[mask], y_all[mask])
    correlations.append(res.rvalue)
    intercepts.append(res.intercept)
    slopes.append(res.slope)
    residuals = y_all[mask] - (res.slope * x_all[mask] + res.intercept)
    n_b = mask.sum()
    # Residual SD with n - 2 degrees of freedom (two fitted parameters)
    res_sds.append(np.sqrt(np.sum(residuals**2) / (n_b - 2)))

# Overall statistics for reference lines
overall = stats.linregress(x_all, y_all)
overall_resid = y_all - (overall.slope * x_all + overall.intercept)
overall_ressd = np.sqrt(np.sum(overall_resid**2) / (len(x_all) - 2))

# Draw the four panels: per-batch statistic plus the overall value as a reference line
fig, axes = plt.subplots(2, 2, figsize=(10, 8))
panels = [
    # (axis, per-batch values, overall value, title, y-axis label, legend prefix)
    (axes[0, 0], correlations, overall.rvalue,
     "Linear Correlation Plot", "Correlation(Y,X)", "Overall r"),
    (axes[0, 1], intercepts, overall.intercept,
     "Linear Intercept Plot", "Intercept(Y,X)", "Overall"),
    (axes[1, 0], slopes, overall.slope,
     "Linear Slope Plot", "Slope(Y,X)", "Overall"),
    (axes[1, 1], res_sds, overall_ressd,
     "Linear RESSD Plot", "Residual SD", "Overall"),
]
for ax, values, ref, title, ylabel, prefix in panels:
    ax.plot(batch_ids, values, 'o-', color='steelblue')
    ax.axhline(ref, color='red', linestyle='--', label=f'{prefix} = {ref:.4f}')
    ax.set_title(title)
    ax.set_xlabel("Batch")
    ax.set_ylabel(ylabel)
    ax.legend(fontsize=8)

plt.suptitle("Linear Plots (NIST 1.3.3.16-19)", y=1.02)
plt.tight_layout()
plt.show()