Two-Sample t-Test | EDA Visual Encyclopedia

What It Is

The two-sample t-test is a hypothesis test of whether the means of two independent groups differ significantly. It computes a t-statistic by dividing the observed difference in sample means by its standard error, which is estimated from either a pooled or an unpooled (Welch) variance.

When to Use It

Use the two-sample t-test when you have two independent groups and want to test whether their population means are equal. It is one of the most commonly used statistical tests in science and engineering, applicable to comparing treatments, processes, or populations, and it is a routine step in many quality improvement and experimental analysis workflows.

How to Interpret

For a two-sided test, compare the absolute value of the computed t-statistic to the critical value from the t-distribution at the chosen significance level. If |t| exceeds the critical value, reject the null hypothesis that the two means are equal. Equivalently, if the p-value is less than the significance level (commonly 0.05), the difference is statistically significant. A large |t| indicates that the difference between means is large relative to the variability within the groups. Statistical significance does not imply practical importance, so always pair this test with an examination of the confidence interval for the mean difference.
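The decision rule and the confidence interval described above can be sketched in a few lines. This is a minimal illustration, assuming a two-sided test at alpha = 0.05, the equal-variance (pooled) standard error defined under Formulas, and the same illustrative data used in the Python Example section; the variable names are hypothetical.

```python
import numpy as np
from scipy import stats

# Illustrative data (same values as the Python Example section)
group_a = np.array([24.5, 23.8, 25.1, 22.9, 24.2, 23.6, 25.0, 24.8])
group_b = np.array([26.3, 27.1, 25.8, 26.9, 27.5, 26.0, 27.2, 26.5])
alpha = 0.05  # assumed significance level

n1, n2 = len(group_a), len(group_b)
df = n1 + n2 - 2
mean_diff = group_a.mean() - group_b.mean()

# Pooled standard error of the mean difference (equal-variance form)
pooled_var = ((n1 - 1) * group_a.var(ddof=1)
              + (n2 - 1) * group_b.var(ddof=1)) / df
std_err = np.sqrt(pooled_var * (1 / n1 + 1 / n2))

t_stat = mean_diff / std_err
t_crit = stats.t.ppf(1 - alpha / 2, df)    # two-sided critical value
p_value = 2 * stats.t.sf(abs(t_stat), df)  # two-sided p-value

# The two decision rules are equivalent
reject_by_critical_value = abs(t_stat) > t_crit
reject_by_p_value = p_value < alpha

# (1 - alpha) confidence interval for the difference in means
ci_low = mean_diff - t_crit * std_err
ci_high = mean_diff + t_crit * std_err

print(f"|t| = {abs(t_stat):.4f}, critical value = {t_crit:.4f}")
print(f"Reject H0 (critical value rule): {reject_by_critical_value}")
print(f"Reject H0 (p-value rule):        {reject_by_p_value}")
print(f"{100 * (1 - alpha):.0f}% CI for mean difference: "
      f"({ci_low:.3f}, {ci_high:.3f})")
```

An interval that excludes zero corresponds to rejecting the null hypothesis at the same alpha, while its width and distance from zero show whether the difference is large enough to matter in practice.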

Assumptions and Limitations

The test assumes both samples are drawn independently from normally distributed populations. The pooled version additionally assumes equal population variances; when this assumption is violated, use the Welch (unequal variance) version with Welch-Satterthwaite degrees of freedom. For large samples, the Central Limit Theorem makes the test robust to moderate departures from normality, because the sample means are then approximately normally distributed.

Reference: NIST/SEMATECH e-Handbook, Section 1.3.5.3

Formulas

t-Statistic (Equal Variances)

t = \frac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}

The test statistic measures the difference between two sample means in units of the pooled standard error.

Pooled Standard Deviation

s_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}

The pooled standard deviation combines the two sample variances, weighted by their degrees of freedom, under the equal-variance assumption.

Degrees of Freedom (Equal Variances)

\nu = n_1 + n_2 - 2

Under the equal-variance assumption, the degrees of freedom equal the total sample size minus two.
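The three equal-variance formulas above can be checked numerically against `scipy.stats.ttest_ind`. The following is a rough sketch rather than a reference implementation: `pooled_t_test` is a hypothetical helper name, and the data are the illustrative values from the Python Example section.

```python
import numpy as np
from scipy import stats

def pooled_t_test(x, y):
    """Return (t, df, s_p) for the equal-variance two-sample t-test."""
    n1, n2 = len(x), len(y)
    s1_sq = np.var(x, ddof=1)  # sample variance of group 1
    s2_sq = np.var(y, ddof=1)  # sample variance of group 2
    df = n1 + n2 - 2
    # Pooled standard deviation: variances weighted by degrees of freedom
    s_p = np.sqrt(((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / df)
    # Difference in means in units of the pooled standard error
    t = (np.mean(x) - np.mean(y)) / (s_p * np.sqrt(1 / n1 + 1 / n2))
    return t, df, s_p

group_a = np.array([24.5, 23.8, 25.1, 22.9, 24.2, 23.6, 25.0, 24.8])
group_b = np.array([26.3, 27.1, 25.8, 26.9, 27.5, 26.0, 27.2, 26.5])

t_manual, df_manual, s_p = pooled_t_test(group_a, group_b)
p_manual = 2 * stats.t.sf(abs(t_manual), df_manual)  # two-sided p-value

# Cross-check against SciPy's implementation
t_ref, p_ref = stats.ttest_ind(group_a, group_b, equal_var=True)

print(f"s_p = {s_p:.4f}, df = {df_manual}")
print(f"manual: t = {t_manual:.4f}, p = {p_manual:.6g}")
print(f"scipy:  t = {t_ref:.4f}, p = {p_ref:.6g}")
```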

t-Statistic (Unequal Variances / Welch)

t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}

When the equal-variance assumption is not met, each sample variance is divided by its own sample size rather than using a pooled estimate.

Welch-Satterthwaite Degrees of Freedom

\nu = \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{(s_1^2/n_1)^2}{n_1-1} + \dfrac{(s_2^2/n_2)^2}{n_2-1}}

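The approximate degrees of freedom for the Welch t-test, accounting for unequal variances. The result is generally not an integer. Statistical software typically uses the fractional value directly; when looking up a critical value in a printed t-table, rounding down to the nearest integer is the conservative choice.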
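As a worked illustration of the Welch statistic and the Welch-Satterthwaite formula, the sketch below computes both by hand and compares the result with `scipy.stats.ttest_ind(..., equal_var=False)`. The helper name `welch_t_test` is hypothetical, and the data are the illustrative values from the Python Example section. The sketch keeps the fractional degrees of freedom when computing the p-value, which is what makes it agree with SciPy's reported p-value.

```python
import numpy as np
from scipy import stats

def welch_t_test(x, y):
    """Return (t, df) for Welch's unequal-variance t-test."""
    n1, n2 = len(x), len(y)
    v1 = np.var(x, ddof=1) / n1  # s_1^2 / n_1
    v2 = np.var(y, ddof=1) / n2  # s_2^2 / n_2
    t = (np.mean(x) - np.mean(y)) / np.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation (generally not an integer)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

group_a = np.array([24.5, 23.8, 25.1, 22.9, 24.2, 23.6, 25.0, 24.8])
group_b = np.array([26.3, 27.1, 25.8, 26.9, 27.5, 26.0, 27.2, 26.5])

t_welch, df_welch = welch_t_test(group_a, group_b)
p_welch = 2 * stats.t.sf(abs(t_welch), df_welch)  # fractional df is allowed

# Cross-check against SciPy's implementation
t_ref, p_ref = stats.ttest_ind(group_a, group_b, equal_var=False)

print(f"Welch df = {df_welch:.3f}")
print(f"manual: t = {t_welch:.4f}, p = {p_welch:.6g}")
print(f"scipy:  t = {t_ref:.4f}, p = {p_ref:.6g}")
```

Because the two groups here have equal sample sizes, the Welch t-statistic coincides with the pooled one; only the degrees of freedom, and therefore the p-value, differ.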

Python Example

import numpy as np
from scipy import stats

# Sample data: two independent groups
group_a = np.array([24.5, 23.8, 25.1, 22.9, 24.2, 23.6, 25.0, 24.8])
group_b = np.array([26.3, 27.1, 25.8, 26.9, 27.5, 26.0, 27.2, 26.5])

# Two-sample t-test (assuming equal variances)
t_stat, p_value = stats.ttest_ind(group_a, group_b, equal_var=True)

print(f"t-statistic: {t_stat:.4f}")
print(f"p-value:     {p_value:.6f}")
print(f"Reject H0 at alpha=0.05: {p_value < 0.05}")

# Welch's t-test (unequal variances)
t_welch, p_welch = stats.ttest_ind(group_a, group_b, equal_var=False)
print(f"\nWelch t-statistic: {t_welch:.4f}")
print(f"Welch p-value:     {p_welch:.6f}")