Runs Test for Randomness

NIST/SEMATECH Section 1.3.5.13 Runs Test for Randomness

What It Is

The runs test (Wald-Wolfowitz test) is a non-parametric test for randomness that counts the number of runs -- uninterrupted sequences of observations above or below the median. Too few or too many runs indicate non-random patterns in the data.

When to Use It

Use the runs test to assess whether the order of observations is random, without making assumptions about the underlying distribution. It detects trends, oscillations, clustering, and other departures from randomness that might not be captured by the autocorrelation function. The test is particularly useful for residual analysis after model fitting and for validating the randomness assumption in control chart applications.

How to Interpret

For a two-sided test at the 5% significance level, reject the null hypothesis of randomness if the absolute value of Z exceeds 1.96. Too few runs (negative Z) suggest positive autocorrelation or trending behavior -- adjacent observations tend to be on the same side of the median. Too many runs (positive Z) suggest negative autocorrelation or oscillatory behavior -- observations frequently alternate above and below the median. The runs test complements the autocorrelation function by detecting patterns that may not manifest as simple lag-1 dependence. Always visualize the data with a run-sequence plot to supplement the test results.
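The decision rule above can be written as a small helper. This is only a sketch: the function name interpret_runs_z and the wording of the returned messages are illustrative assumptions, and the default critical value of 1.96 corresponds to the 5% level stated above.

```python
def interpret_runs_z(z, critical=1.96):
    """Map a runs-test Z value to the two-sided decision described above."""
    if abs(z) <= critical:
        return "fail to reject randomness"
    if z < 0:
        # Fewer runs than expected: neighbours stay on the same side.
        return "reject: too few runs (trend or positive autocorrelation)"
    # More runs than expected: neighbours keep switching sides.
    return "reject: too many runs (oscillation or negative autocorrelation)"


print(interpret_runs_z(-2.5))  # reject: too few runs (trend or positive autocorrelation)
print(interpret_runs_z(0.4))   # fail to reject randomness
print(interpret_runs_z(3.1))   # reject: too many runs (oscillation or negative autocorrelation)
```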

Assumptions and Limitations

The runs test requires a sequence of observations in their original time order. It is a non-parametric test and makes no assumptions about the underlying distribution. The normal approximation for the Z-statistic requires that both n1 > 10 and n2 > 10 (where n1 and n2 are the counts above and below the median); for smaller samples, exact critical value tables should be used. The test is sensitive to ties at the median, so observations equal to the median must be handled by one consistent rule, for example by dropping them or by always assigning them to the same side.

Reference: NIST/SEMATECH e-Handbook, Section 1.3.5.13

Formulas

Expected Number of Runs

E(R) = \frac{2 n_1 n_2}{n_1 + n_2} + 1

The expected number of runs under the null hypothesis of randomness, where n1 and n2 are the counts of observations above and below the median.

Standard Deviation of Runs

\sigma_R = \sqrt{\frac{2n_1 n_2(2n_1 n_2 - n_1 - n_2)}{(n_1 + n_2)^2(n_1 + n_2 - 1)}}

The standard deviation of the number of runs under randomness, used to standardize the test statistic.
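
As a worked illustration of the two formulas above, take a hypothetical sample with n_1 = n_2 = 12 (values chosen only for the arithmetic, not taken from the handbook):

E(R) = \frac{2 \cdot 12 \cdot 12}{12 + 12} + 1 = 13

\sigma_R = \sqrt{\frac{2 \cdot 12 \cdot 12 \, (2 \cdot 12 \cdot 12 - 12 - 12)}{(12 + 12)^2 (12 + 12 - 1)}} = \sqrt{\frac{288 \cdot 264}{576 \cdot 23}} \approx 2.40

Under randomness, a sequence of this size is therefore expected to show about 13 runs, give or take roughly 2.4.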

Z-Statistic

Z = \frac{R - E(R)}{\sigma_R}

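The standardized test statistic, where R is the observed number of runs. Under the null hypothesis of randomness, Z follows an approximately standard normal distribution for large samples (n1 > 10 and n2 > 10).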
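
The three formulas above can be combined into one short Python sketch. It is an illustration under stated assumptions, not a definitive implementation: the function name runs_test and the returned dictionary keys are hypothetical, ties at the median are dropped, and the normal approximation is only appropriate when both counts exceed 10.

```python
import math
from statistics import median


def runs_test(values):
    """Compute the runs-test quantities for a sequence in time order."""
    med = median(values)
    # Assumption: observations exactly equal to the median are dropped.
    signs = [v > med for v in values if v != med]
    n1 = sum(signs)               # count above the median
    n2 = len(signs) - n1          # count below the median
    if n1 == 0 or n2 == 0:
        raise ValueError("need observations on both sides of the median")

    # Observed number of runs R: one plus the number of side changes.
    runs = 1 + sum(1 for a, b in zip(signs, signs[1:]) if a != b)

    # E(R) and sigma_R under the null hypothesis of randomness.
    expected = 2 * n1 * n2 / (n1 + n2) + 1
    variance = (2 * n1 * n2 * (2 * n1 * n2 - n1 - n2)) / (
        (n1 + n2) ** 2 * (n1 + n2 - 1)
    )
    if variance == 0:
        raise ValueError("too few observations to standardize")
    sd = math.sqrt(variance)

    z = (runs - expected) / sd
    return {"runs": runs, "n1": n1, "n2": n2,
            "expected": expected, "sd": sd, "z": z}


# A steadily increasing series of 24 points: only 2 runs against 13 expected.
result = runs_test(list(range(1, 25)))
print(result["runs"], result["expected"], round(result["z"], 2))  # 2 13.0 -4.59
```

Here Z is far below -1.96, which is consistent with the trend built into the example series.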