Box-Cox Linearity Plot
NIST/SEMATECH Section 1.3.3.5 Box-Cox Linearity Plot
What It Is
A Box-Cox linearity plot helps identify the optimal power transformation of the predictor (X) variable that maximizes the linear correlation between a response Y and a predictor X. It evaluates a range of power transformation exponents and displays the correlation coefficient as a function of the transformation parameter lambda.
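The procedure evaluates $T(X) = (X^\lambda - 1)/\lambda$ for a range of $\lambda$ values (typically $-2$ to $2$) and computes the Pearson correlation between $Y$ and $T(X)$ at each $\lambda$. The $\lambda = 0$ case uses $T(X) = \ln(X)$ by convention. The plot displays correlation vs. $\lambda$, and the peak identifies the optimal transformation. Common special cases: $\lambda = 1$ (no transform), $\lambda = 0.5$ (square root), $\lambda = 0$ (log), $\lambda = -1$ (reciprocal).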
Questions This Plot Answers
- Would a suitable transformation improve my linear fit?
- What is the optimal value of the transformation parameter?
Why It Matters
Many bivariate relationships are non-linear in their original scale but become linear after a power transformation of the predictor. The Box-Cox linearity plot automates the search for this transformation, turning a difficult non-linear modeling problem into a simple linear regression. Note that the Box-Cox transformation can also be applied to the response variable Y to satisfy error assumptions such as normality and constant variance; that usage is covered by the Box-Cox normality plot.
When to Use a Box-Cox Linearity Plot
Use a Box-Cox linearity plot when a scatter plot suggests a curvilinear relationship between a predictor and a response and a linear model is desired. The technique finds the value of lambda that maximizes the correlation between the response and the transformed predictor, effectively straightening the relationship. This is particularly useful in regression analysis when the analyst wants to apply a simple linear model but the raw data violate the linearity assumption.
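As a rough illustration of the "straightening" effect, the sketch below compares the R² of a straight-line fit before and after transforming the predictor. The synthetic data (a response that depends on 1/X), the seed, and the helper names `box_cox` and `r_squared` are assumptions made for this example, not part of the NIST procedure.

```python
import numpy as np

def box_cox(x, lam):
    """Box-Cox power transform of x; natural log is used at lambda = 0."""
    x = np.asarray(x, dtype=float)
    if abs(lam) < 1e-10:
        return np.log(x)
    return (x**lam - 1.0) / lam

def r_squared(x, y):
    """R^2 of a simple least-squares line y ~ a + b*x (squared Pearson r)."""
    r = np.corrcoef(x, y)[0, 1]
    return float(r**2)

# Synthetic data (an assumption for illustration): Y depends on 1/X
rng = np.random.default_rng(0)
X = rng.uniform(1, 20, size=200)
Y = 3.0 + 10.0 / X + rng.normal(0, 0.1, size=200)

r2_raw = r_squared(X, Y)                         # straight line in raw X
r2_transformed = r_squared(box_cox(X, -1.0), Y)  # straight line in the reciprocal-type transform

print(f"R^2 with raw X:          {r2_raw:.3f}")
print(f"R^2 with lambda = -1:    {r2_transformed:.3f}")
```

For simple linear regression, R² equals the squared Pearson correlation, which is why maximizing the correlation over lambda is equivalent to finding the best straight-line fit.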
How to Interpret a Box-Cox Linearity Plot
The horizontal axis shows the range of $\lambda$ values tested, typically from $-2$ to $2$, and the vertical axis shows the corresponding correlation coefficient between $Y$ and the transformed $X$. The peak of the curve identifies the optimal $\lambda$ value. Common interpretable values include $\lambda = 1$ (no transformation needed), $\lambda = 0.5$ (square root), $\lambda = 0$ (log transform by convention), and $\lambda = -1$ (reciprocal). If the curve is relatively flat near the peak, multiple transformations give similar results and the analyst may choose the most interpretable one. A sharply peaked curve indicates that the linearity of the relationship is highly sensitive to the choice of transformation.
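One way to act on a flat peak is to list the interpretable values of lambda whose correlation falls within a small tolerance of the maximum. The sketch below does this under stated assumptions: the tolerance of 0.01, the candidate set, the synthetic square-root data, and the helper names `correlation_at` and `interpretable_choices` are all illustrative choices rather than a prescribed rule.

```python
import numpy as np

def box_cox(x, lam):
    """Box-Cox power transform of x; natural log is used at lambda = 0."""
    x = np.asarray(x, dtype=float)
    if abs(lam) < 1e-10:
        return np.log(x)
    return (x**lam - 1.0) / lam

def correlation_at(x, y, lam):
    """Pearson correlation between y and the transformed predictor."""
    return float(np.corrcoef(y, box_cox(x, lam))[0, 1])

def interpretable_choices(x, y, lambdas,
                          candidates=(-1.0, 0.0, 0.5, 1.0), tol=0.01):
    """Candidates whose correlation is within `tol` of the peak over `lambdas`.

    Both `candidates` and `tol` are assumptions for this sketch.
    """
    peak = max(correlation_at(x, y, lam) for lam in lambdas)
    return [c for c in candidates if peak - correlation_at(x, y, c) <= tol]

# Synthetic data (an assumption): Y = sqrt(X) + noise
rng = np.random.default_rng(1)
X = rng.uniform(1, 50, size=200)
Y = np.sqrt(X) + rng.normal(0, 0.3, size=200)

lambdas = np.linspace(-2, 2, 201)
choices = interpretable_choices(X, Y, lambdas)
print("Interpretable lambdas close to the peak:", choices)
```

If several candidates are returned, the curve is flat enough that the most interpretable one can be chosen; if none are returned, the peak lies between the conventional values and the exact optimum may be worth using.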
Assumptions and Limitations
The Box-Cox linearity plot requires that the predictor values be strictly positive for most values of $\lambda$, since raising negative numbers to fractional powers is not defined in the real numbers and the logarithm used at $\lambda = 0$ requires $X > 0$. It assumes a monotonic relationship between the predictor and response; if the relationship is not monotonic, no power transformation will produce linearity. The method targets linearity only and does not address heteroscedasticity or non-normality of residuals.
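Both assumptions can be screened before drawing the plot. The sketch below checks that the predictor is strictly positive and uses a rank (Spearman-type) correlation as an informal indicator of monotonicity. The helper names `all_positive` and `spearman_rho`, the synthetic data, and the use of rank correlation for this purpose are assumptions of this example; the rank computation also assumes continuous data with no ties.

```python
import numpy as np

def all_positive(x):
    """True if every predictor value is strictly greater than zero."""
    return bool(np.all(np.asarray(x, dtype=float) > 0))

def spearman_rho(x, y):
    """Rank correlation of x and y, assuming no tied values."""
    rank_x = np.argsort(np.argsort(x))
    rank_y = np.argsort(np.argsort(y))
    return float(np.corrcoef(rank_x, rank_y)[0, 1])

# Synthetic data (an assumption): one monotonic and one non-monotonic response
rng = np.random.default_rng(2)
x = rng.uniform(1, 10, size=300)
y_mono = np.log(x) + rng.normal(0, 0.05, size=300)
y_nonmono = (x - 5.5) ** 2 + rng.normal(0, 0.05, size=300)

rho_mono = spearman_rho(x, y_mono)
rho_nonmono = spearman_rho(x, y_nonmono)

print("Predictor strictly positive:", all_positive(x))
print(f"Rank correlation, monotonic response:     {rho_mono:.2f}")
print(f"Rank correlation, non-monotonic response: {rho_nonmono:.2f}")
```

A rank correlation near ±1 is consistent with a monotonic relationship that a power transformation might straighten; a value near zero alongside visible structure in the scatter plot suggests a non-monotonic pattern that this plot cannot fix.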
Reference: NIST/SEMATECH e-Handbook of Statistical Methods, Section 1.3.3.5
Formulas
Box-Cox Transformation
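$$T(X) = \begin{cases} \dfrac{X^\lambda - 1}{\lambda}, & \lambda \neq 0 \\ \ln(X), & \lambda = 0 \end{cases}$$
The Box-Cox family of power transformations applied to the predictor variable X. The $\lambda = 0$ case uses the natural logarithm by convention as the limiting case.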
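The limiting-case convention can be checked with a short series expansion of the power transform $(X^\lambda - 1)/\lambda$ for $X > 0$. The derivation below is a standard calculus argument added for illustration:

```latex
\begin{align*}
X^\lambda &= e^{\lambda \ln X}
           = 1 + \lambda \ln X + \frac{(\lambda \ln X)^2}{2} + O(\lambda^3), \\
\frac{X^\lambda - 1}{\lambda}
          &= \ln X + \frac{\lambda (\ln X)^2}{2} + O(\lambda^2)
           \;\longrightarrow\; \ln X \quad \text{as } \lambda \to 0 .
\end{align*}
```

This is why the transformed values vary smoothly as lambda passes through zero, so the correlation curve has no jump there.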
Linearity Measure
$$r(\lambda) = \operatorname{corr}\bigl(Y,\, T(X)\bigr)$$
The Pearson correlation between the response Y and the transformed predictor $T(X)$ is computed for each $\lambda$ value. The $\lambda$ that maximizes $r(\lambda)$ yields the most linear relationship.
Python Example
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

# Generate nonlinear relationship: Y = sqrt(X) + noise
rng = np.random.default_rng(42)
X = rng.uniform(1, 50, size=100)
Y = np.sqrt(X) + rng.normal(0, 0.5, size=100)

# Evaluate correlation for a range of lambda values
lambdas = np.linspace(-2, 2, 201)
correlations = []
for lam in lambdas:
    if abs(lam) < 1e-10:
        # lambda = 0: natural log by convention
        T = np.log(X)
    else:
        T = (X**lam - 1) / lam
    r, _ = stats.pearsonr(Y, T)
    correlations.append(r)

# The lambda with the highest correlation gives the most linear relationship
optimal_idx = np.argmax(correlations)
optimal_lambda = lambdas[optimal_idx]

# Plot correlation vs. lambda and mark the optimum
fig, ax = plt.subplots(figsize=(10, 5))
ax.plot(lambdas, correlations, 'b-', linewidth=2)
ax.axvline(optimal_lambda, color='r', linestyle='--',
           label=f'Optimal lambda = {optimal_lambda:.2f}')
ax.set_xlabel("Lambda")
ax.set_ylabel("Correlation r(lambda)")
ax.set_title("Box-Cox Linearity Plot")
ax.legend()
plt.tight_layout()
plt.show()