When Assumptions Fail

NIST/SEMATECH Section 1.2.5 Consequences

Consequences of Assumption Violations

When the four underlying assumptions of a statistical analysis are violated, the consequences range from mild inaccuracy to completely misleading conclusions. Section 1.2.5 of the NIST/SEMATECH Engineering Statistics Handbook addresses this directly: understanding what happens when assumptions fail is just as important as checking them.

The handbook frames assumption-testing positively — as a framework for learning about the process. Assumption-testing promotes insight into important aspects of the process that may not have surfaced otherwise. The primary goal is to have correct, validated, and complete scientific and engineering conclusions flowing from the analysis.

The four underlying assumptions are:

  1. Random drawings (randomness)
  2. Fixed location
  3. Fixed variation
  4. Fixed distribution

This ordering follows the structure of Section 1.2.5, which addresses consequences in subsections 1.2.5.1 through 1.2.5.4.

1. Consequences of Non-Randomness (Section 1.2.5.1)

The randomness assumption is the most critical but the least tested.

If the randomness assumption does not hold, then:

  1. All of the usual statistical tests are invalid.
  2. The calculated uncertainties for commonly used statistics become meaningless.
  3. The calculated minimal sample size required for a pre-specified tolerance becomes meaningless.
  4. The simple model Y = constant + error becomes invalid.
  5. The parameter estimates become suspect and non-supportable.

Autocorrelation-specific consequences. One specific and common type of non-randomness is autocorrelation: the correlation between Y_t and Y_{t-k}, where k is an integer that defines the lag. That is, autocorrelation is a time-dependent non-randomness, meaning the value of the current point is highly dependent on the previous point (if k = 1) or on the point k steps ago. If the data are not random due to autocorrelation, then:

  1. Adjacent data values may be related.
  2. There may not be nn independent snapshots of the phenomenon under study.
  3. There may be undetected “junk” outliers.
  4. There may be undetected “information-rich” outliers.

Detection: Structure in the lag plot; significant spikes in the autocorrelation plot.
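
The lag-1 check behind these plots can also be computed directly. A minimal stdlib-Python sketch (the `lag_autocorrelation` helper is my own illustration, not a handbook routine) contrasting white noise with a strongly autocorrelated AR(1) series:

```python
import random
import statistics

def lag_autocorrelation(y, k=1):
    """Sample autocorrelation of y at lag k (a rough sketch)."""
    n = len(y)
    mean = statistics.fmean(y)
    denom = sum((v - mean) ** 2 for v in y)
    num = sum((y[t] - mean) * (y[t - k] - mean) for t in range(k, n))
    return num / denom

random.seed(1)
# White noise: the lag-1 autocorrelation should be near zero.
noise = [random.gauss(0, 1) for _ in range(2000)]

# AR(1) process: each point depends strongly on the previous one.
ar = [0.0]
for _ in range(1999):
    ar.append(0.9 * ar[-1] + random.gauss(0, 1))

print(lag_autocorrelation(noise))  # near 0
print(lag_autocorrelation(ar))     # near 0.9
```

A "significant spike" in the autocorrelation plot corresponds to a value like the second one, far outside the roughly ±2/sqrt(n) band expected for random data.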

2. Consequences of Non-Fixed Location (Section 1.2.5.2)

If the run sequence plot does not support the assumption of fixed location, then:

  1. The location may be drifting.
  2. The single location estimate may be meaningless (if the process is drifting).
  3. The choice of location estimator (e.g., the sample mean) may be sub-optimal.
  4. The usual formula for the uncertainty of the mean may be invalid and the numerical value optimistically small.
  5. The location estimate may be poor.
  6. The location estimate may be biased.

Detection: A trend or step change in the run sequence plot.
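
What the eye does on a run sequence plot can be roughed out numerically. A stdlib sketch (the `split_half_means` helper is an illustration of mine, not a handbook procedure) that compares the first and second halves of the series:

```python
import random
import statistics

def split_half_means(y):
    """Compare the mean of the first half to the second half;
    a large gap suggests the location is drifting."""
    half = len(y) // 2
    return statistics.fmean(y[:half]), statistics.fmean(y[half:])

random.seed(2)
# A stable process versus one whose location drifts upward over time.
stable = [random.gauss(10, 1) for _ in range(1000)]
drifting = [random.gauss(10 + 0.005 * t, 1) for t in range(1000)]

print(split_half_means(stable))    # halves roughly agree
print(split_half_means(drifting))  # second half clearly higher
```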

3. Consequences of Non-Fixed Variation (Section 1.2.5.3)

If the run sequence plot does not support the assumption of fixed variation, then:

  1. The variation may be drifting.
  2. The single variation estimate may be meaningless (if the process variation is drifting).
  3. The variation estimate may be poor.
  4. The variation estimate may be biased.

Detection: A widening or narrowing pattern in the run sequence plot.
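
The widening pattern can be quantified the same way, by comparing spread rather than level across halves of the series (again a sketch of mine, not a handbook procedure):

```python
import random
import statistics

def split_half_stdevs(y):
    """Compare spread in the first and second halves; a large ratio
    suggests the variation is not fixed."""
    half = len(y) // 2
    return statistics.stdev(y[:half]), statistics.stdev(y[half:])

random.seed(3)
constant = [random.gauss(0, 1) for _ in range(1000)]
# Variance that grows over time, as in a fanning run sequence plot.
widening = [random.gauss(0, 1 + 0.004 * t) for t in range(1000)]

print(split_half_stdevs(constant))   # roughly equal
print(split_half_stdevs(widening))   # second half much larger
```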

4. Consequences of Non-Fixed Distribution (Section 1.2.5.4)

The variability and noisiness of the mean as a location estimator are intrinsically linked to the underlying distribution of the data. For certain distributions, the mean is a poor choice. For any given distribution there exists an optimal choice, the estimator with minimum variability; it may be the median, the midrange, the midmean, the mean, or something else. The implication is to estimate the distribution first and then, based on that distribution, choose the optimal estimator. The resulting parameter estimates will have less variability than if this approach is not followed.
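
This efficiency claim is easy to verify by simulation. A stdlib sketch (helper and sampler names are mine; the heavy-tailed draw is a Cauchy-like ratio of two normals) comparing the sampling variability of the mean and the median under two distributions:

```python
import random
import statistics

def estimator_spread(sampler, estimator, n=50, reps=2000):
    """Monte Carlo spread of an estimator over repeated samples of size n."""
    estimates = [estimator([sampler() for _ in range(n)]) for _ in range(reps)]
    return statistics.stdev(estimates)

def normal():
    return random.gauss(0, 1)

def heavy():
    # Cauchy-like heavy tails: the ratio of two independent normals.
    return random.gauss(0, 1) / random.gauss(0, 1)

random.seed(4)
# Under normality the mean is the lower-variability estimator;
# under heavy tails the median is dramatically better.
print(estimator_spread(normal, statistics.fmean))
print(estimator_spread(normal, statistics.median))
print(estimator_spread(heavy, statistics.fmean))
print(estimator_spread(heavy, statistics.median))
```

The heavy-tailed case is the extreme one: for Cauchy-like data the sample mean never settles down at all, while the median remains well behaved.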

Other consequences that flow from problems with distributional assumptions are organized into three categories:

Distribution:

  1. The distribution may be changing.
  2. The single distribution estimate may be meaningless (if the process distribution is changing).
  3. The distribution may be markedly non-normal.
  4. The distribution may be unknown.
  5. The true probability distribution for the error may remain unknown.

Model:

  1. The model may be changing.
  2. The single model estimate may be meaningless.
  3. The default model Y = constant + error may be invalid.
  4. If the default model is insufficient, information about a better model may remain undetected.
  5. A poor deterministic model may be fit.
  6. Information about an improved model may go undetected.

Process:

  1. The process may be out-of-control.
  2. The process may be unpredictable.
  3. The process may be un-modelable.

Detection: Curvature in the normal probability plot; formal testing via the Anderson-Darling test or Kolmogorov-Smirnov test.
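
The Kolmogorov-Smirnov statistic mentioned above can be sketched with only the standard library. This computes the one-sample statistic against a normal distribution fitted to the data; a real test would also compare it to critical values (Lilliefors-adjusted, since the parameters are estimated), which this sketch omits:

```python
import random
import statistics

def ks_statistic(data):
    """One-sample Kolmogorov-Smirnov statistic against a normal
    distribution fitted to the data (a rough sketch, not the full test)."""
    n = len(data)
    fitted = statistics.NormalDist(statistics.fmean(data), statistics.stdev(data))
    xs = sorted(data)
    d = 0.0
    for i, x in enumerate(xs):
        cdf = fitted.cdf(x)
        # Largest gap between the empirical and fitted CDFs, checked
        # just before and just after each jump of the empirical CDF.
        d = max(d, abs(cdf - i / n), abs((i + 1) / n - cdf))
    return d

random.seed(5)
gaussian = [random.gauss(0, 1) for _ in range(500)]
skewed = [random.expovariate(1.0) for _ in range(500)]

print(ks_statistic(gaussian))  # small: consistent with normality
print(ks_statistic(skewed))    # large: markedly non-normal
```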

Remedial Actions

The following remedial strategies are general statistical practice and are not part of Section 1.2.5, which covers only consequences.

Data Transformations

A common first response to assumption violations is to transform the data. Logarithmic, square root, or reciprocal transformations can stabilize variance, reduce skewness, and improve normality simultaneously. The Box-Cox normality plot provides a data-driven method for selecting the optimal power transformation.
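
As a sketch of why this works, consider multiplicative noise, where spread grows in proportion to the signal level. A log transform (the Box-Cox transform with lambda = 0) makes the spread roughly constant; the `noisy` generator here is my own illustration:

```python
import math
import random
import statistics

random.seed(6)

def noisy(level, n=500):
    # Multiplicative noise: the spread is proportional to the level.
    return [level * random.lognormvariate(0, 0.4) for _ in range(n)]

low, high = noisy(10), noisy(100)
# Raw scale: the high-level group is roughly 10x more spread out.
print(statistics.stdev(low), statistics.stdev(high))

# Log scale: the two groups have nearly identical spread.
log_low = [math.log(v) for v in low]
log_high = [math.log(v) for v in high]
print(statistics.stdev(log_low), statistics.stdev(log_high))
```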

Robust Methods

When outliers or heavy tails are the primary concern, robust statistical methods reduce their influence. Trimmed means, Winsorized estimators, and M-estimators all downweight extreme observations. These methods sacrifice some efficiency under perfect normality in exchange for much better performance when the normality assumption is violated.
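
A trimmed mean is the simplest of these to demonstrate. This stdlib sketch (a hand-rolled helper; libraries such as SciPy provide equivalents) shows a few gross outliers dragging the ordinary mean while the trimmed mean stays put:

```python
import random
import statistics

def trimmed_mean(data, proportion=0.1):
    """Mean after dropping the lowest and highest `proportion` of values."""
    xs = sorted(data)
    k = int(len(xs) * proportion)
    return statistics.fmean(xs[k:len(xs) - k])

random.seed(7)
clean = [random.gauss(50, 5) for _ in range(200)]
contaminated = clean + [500.0, 650.0, 720.0]  # a few gross outliers

print(statistics.fmean(contaminated))  # dragged well above 50
print(trimmed_mean(contaminated))      # close to 50
```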

Non-Parametric Alternatives

When the distributional assumption cannot be rescued by transformation, non-parametric methods provide a distribution-free alternative. The Wilcoxon rank-sum test replaces the two-sample t-test, the Kruskal-Wallis test replaces one-factor ANOVA, and rank-based correlation (Spearman) replaces Pearson correlation. These methods rely on ranks rather than raw values and make no assumption about the underlying distribution.

Time-Series Methods

When randomness is violated because the data exhibit autocorrelation, time-series methods — autoregressive models, moving-average models, and their ARIMA generalizations — explicitly model the dependence structure. Alternatively, the analyst can subsample the data at intervals large enough to achieve approximate independence.
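
Both ideas can be sketched together. The least-squares AR(1) fit below is a minimal illustration of mine (real work would use a time-series library), and the subsampling line shows why spacing the data out weakens dependence: for an AR(1) process the correlation at lag m is phi**m:

```python
import random

def fit_ar1(y):
    """Least-squares estimate of phi in y[t] = phi * y[t-1] + e[t]
    (a minimal sketch, assuming a zero-mean series)."""
    num = sum(y[t] * y[t - 1] for t in range(1, len(y)))
    den = sum(v * v for v in y[:-1])
    return num / den

random.seed(9)
series = [0.0]
for _ in range(4999):
    series.append(0.8 * series[-1] + random.gauss(0, 1))

phi = fit_ar1(series)
print(phi)  # near 0.8

# Subsampling every 10th point: the remaining lag-1 dependence is
# roughly 0.8**10, i.e. close to zero, so the subsample is nearly random.
subsampled = series[::10]
print(fit_ar1(subsampled))
```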

Cross-References