F-Test for Equality of Two Variances

NIST/SEMATECH Section 1.3.5.9 F-Test for Equality of Two Variances

What It Is

The F-test for equality of two variances compares the variances of two independent samples by computing their ratio. Under the null hypothesis of equal population variances, this ratio follows an F-distribution.

When to Use It

Use the F-test when you need to determine whether two independent groups have the same variability. It is often used as a preliminary test before the two-sample t-test to decide whether the pooled or Welch version is appropriate. The test is also important in its own right when comparing the precision of two measurement methods or processes.
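As a rough illustration of the preliminary-test workflow described above, the following sketch runs a two-sided F-test and then passes the outcome to SciPy's `ttest_ind` through its `equal_var` argument. The helper names `f_test_two_sided` and `compare_means` and the default `alpha=0.05` are assumptions made for this example, not part of the NIST handbook.

```python
import numpy as np
from scipy import stats


def f_test_two_sided(x, y):
    """Return (F, p) for H0: equal variances, with F = var(x) / var(y)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    f_stat = np.var(x, ddof=1) / np.var(y, ddof=1)
    df1, df2 = x.size - 1, y.size - 1
    # Two-sided p-value: double the smaller of the two tail probabilities.
    p = 2 * min(stats.f.sf(f_stat, df1, df2), stats.f.cdf(f_stat, df1, df2))
    return float(f_stat), float(min(p, 1.0))


def compare_means(x, y, alpha=0.05):
    """Hypothetical helper: choose pooled or Welch t-test from the F-test."""
    _, p_var = f_test_two_sided(x, y)
    equal_var = bool(p_var > alpha)  # fail to reject -> pooled t-test
    t_stat, p_mean = stats.ttest_ind(x, y, equal_var=equal_var)
    return {"equal_var": equal_var, "t": float(t_stat), "p": float(p_mean)}
```

Calling `compare_means(sample_a, sample_b)` returns which t-test variant was used along with its statistic and p-value. Because the F-test is itself sensitive to non-normality, some practitioners skip the pretest and use the Welch version by default; the sketch only mirrors the workflow named in the text.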

How to Interpret

For a one-sided (upper-tail) test at significance level alpha, reject the null hypothesis of equal variances if the computed F-statistic exceeds the upper critical value of the F-distribution with (nu_1, nu_2) degrees of freedom. For a two-sided test, reject if F falls below the lower alpha/2 critical value or above the upper alpha/2 critical value; equivalently, double the smaller of the two tail probabilities to obtain the p-value. An F-value close to 1 indicates similar variances, while an F-value far from 1 in either direction (much larger than 1, or much closer to 0) indicates that one group is considerably more variable than the other. When the F-test rejects equality of variances, use the Welch t-test instead of the pooled t-test for comparing means.
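The two-sided decision rule can be sketched with the F-distribution quantile function `scipy.stats.f.ppf`. The function names `f_critical_values` and `reject_equal_variances` are illustrative assumptions, as is the default `alpha=0.05`.

```python
from scipy import stats


def f_critical_values(df1, df2, alpha=0.05):
    """Lower and upper critical values for a two-sided F-test."""
    lower = stats.f.ppf(alpha / 2, df1, df2)      # lower alpha/2 quantile
    upper = stats.f.ppf(1 - alpha / 2, df1, df2)  # upper alpha/2 quantile
    return float(lower), float(upper)


def reject_equal_variances(f_stat, df1, df2, alpha=0.05):
    """True if f_stat falls in either rejection region."""
    lower, upper = f_critical_values(df1, df2, alpha)
    return bool(f_stat < lower or f_stat > upper)
```

With equal degrees of freedom the two critical values are reciprocals of each other, which is why an F-value far below 1 is as much evidence against equal variances as one far above 1.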

Assumptions and Limitations

The F-test assumes that both samples come from independent, normally distributed populations. Like the chi-square variance test, it is highly sensitive to non-normality, which can produce misleading results. The Levene test is a more robust alternative when normality cannot be assured.
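When normality is in doubt, the Levene test mentioned above is available in SciPy as `scipy.stats.levene`. The sketch below applies it to simulated heavy-tailed data; the data, the seed, and the choice of `center="median"` (the Brown-Forsythe variant, which is SciPy's default) are assumptions made for illustration only.

```python
import numpy as np
from scipy import stats

# Hypothetical heavy-tailed samples (Student t with 3 degrees of freedom),
# drawn from the same distribution so the true scales are equal.
rng = np.random.default_rng(0)
group_a = rng.standard_t(df=3, size=200)
group_b = rng.standard_t(df=3, size=200)

# Levene test centered at the median, which is more robust to non-normality
# than the F-test.
levene_stat, levene_p = stats.levene(group_a, group_b, center="median")

print(f"Levene statistic: {levene_stat:.4f}")
print(f"Levene p-value:   {levene_p:.4f}")
```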

Reference: NIST/SEMATECH e-Handbook, Section 1.3.5.9

Formulas

F-Statistic

F = \frac{s_1^2}{s_2^2}

Here s_1^2 and s_2^2 are the sample variances of the two groups. The further their ratio deviates from 1, in either direction, the stronger the evidence for unequal population variances.

Degrees of Freedom

\nu_1 = n_1 - 1, \quad \nu_2 = n_2 - 1

The numerator and denominator degrees of freedom correspond to the sample sizes of the two groups minus one.
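As a small worked example with made-up numbers (not taken from the NIST handbook), suppose the first group has n_1 = 10 observations with sample variance 4.0 and the second has n_2 = 12 observations with sample variance 2.5. The two formulas above then give:

```latex
F = \frac{s_1^2}{s_2^2} = \frac{4.0}{2.5} = 1.6,
\qquad
\nu_1 = n_1 - 1 = 10 - 1 = 9,
\qquad
\nu_2 = n_2 - 1 = 12 - 1 = 11
```

The statistic F = 1.6 would then be compared against the F-distribution with (9, 11) degrees of freedom.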

Python Example

import numpy as np
from scipy import stats

# Two independent samples
sample1 = np.array([24.5, 23.8, 25.1, 22.9, 24.2, 23.6, 25.0, 24.8])
sample2 = np.array([26.3, 27.1, 25.8, 26.9, 27.5, 26.0, 24.5, 28.1])

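# Compute the F-statistic with the larger sample variance in the numerator,
# so that F >= 1 regardless of which sample is labeled 1 or 2
var1 = np.var(sample1, ddof=1)  # ddof=1 gives the unbiased sample variance
var2 = np.var(sample2, ddof=1)
f_stat = var1 / var2 if var1 >= var2 else var2 / var1

# Degrees of freedom, ordered to match the numerator and denominator above
df1 = len(sample1) - 1 if var1 >= var2 else len(sample2) - 1
df2 = len(sample2) - 1 if var1 >= var2 else len(sample1) - 1

# Two-sided p-value: double the smaller of the two tail probabilities
p_value = 2 * min(stats.f.sf(f_stat, df1, df2),
                  stats.f.cdf(f_stat, df1, df2))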

print(f"Variance 1: {var1:.4f}")
print(f"Variance 2: {var2:.4f}")
print(f"F-statistic: {f_stat:.4f}")
print(f"p-value:     {p_value:.6f}")
# A large p-value means only that equality is not rejected, not that it is proven
print(f"Fail to reject equal variances (alpha=0.05): {p_value > 0.05}")