Heat Flow Meter 1 Case Study
NIST/SEMATECH Section 1.4.2.8 Heat Flow Meter 1
Background and Data
This case study applies exploratory data analysis to the NIST ZARR13.DAT dataset, which contains 195 calibration factor measurements from a heat flow meter calibration and stability analysis. The data were collected by Bob Zarr of NIST in January 1990. The response variable is a calibration factor, and the motivation for studying this dataset is to illustrate a well-behaved process where the underlying assumptions hold and the process is in statistical control.
The dataset originates from NIST/SEMATECH Section 1.4.2.8. With 195 observations, this study demonstrates how the standard EDA methodology confirms that a univariate measurement process meets the assumptions required for valid statistical inference.
Dataset
Bob Zarr, NIST, heat flow meter calibration factor (Jan 1990)
NIST source description
Heat Flow Meter (HFM) Calibration & Stability Analysis (ASTM C-16). Bob Zarr, NIST. Coded file ID: set F29, scan rate = 1, date of test = 1/24/90. Response variable = computed calibration factor. Number of observations = 195.
Preview data
| # | Value |
|---|---|
| 1 | 9.206343 |
| 2 | 9.299992 |
| 3 | 9.277895 |
| 4 | 9.305795 |
| 5 | 9.275351 |
| 6 | 9.288729 |
| 7 | 9.287239 |
| 8 | 9.260973 |
| 9 | 9.303111 |
| 10 | 9.275674 |
| ... 185 more rows | |
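Reading the data takes only a few lines. A minimal sketch, assuming the single-column text layout of the NIST ZARR13.DAT file — here an in-memory sample of the first ten values stands in for the real 195-row file:

```python
import io

import numpy as np

# Stand-in for ZARR13.DAT: one calibration-factor value per line.
# (The real NIST file has 195 such rows.)
sample_file = io.StringIO(
    "9.206343\n9.299992\n9.277895\n9.305795\n9.275351\n"
    "9.288729\n9.287239\n9.260973\n9.303111\n9.275674\n"
)

# np.loadtxt handles a plain whitespace-delimited column of numbers.
y = np.loadtxt(sample_file)
print(f"n = {y.size}, mean of preview = {y.mean():.6f}")
```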
Test Underlying Assumptions
Goals
The analysis has three primary objectives:
- Model validation — assess whether the univariate model (a fixed constant plus random error, Yᵢ = C + Eᵢ) is an appropriate fit for the heat flow meter calibration data.
- Assumption testing — evaluate whether the data satisfy the four standard assumptions for a measurement process in statistical control:
  - Random sampling — the data are uncorrelated
  - Fixed distribution — the data come from a fixed distribution
  - Fixed location — the distribution location (mean) is constant
  - Fixed variation — the distribution scale (standard deviation) is constant
- Confidence interval validity — determine whether the standard confidence interval formula

  Ȳ ± t · s/√N

  is appropriate, where s is the sample standard deviation and t is the 97.5th percentile of the t distribution with N − 1 degrees of freedom. This formula relies on all four assumptions holding; if they are violated, the confidence interval has no statistical meaning.
Graphical Output and Interpretation
4-Plot Overview
The 4-plot is the primary graphical tool for testing all four assumptions simultaneously. For the heat flow meter data, it reveals a well-behaved dataset. The run sequence plot indicates no significant shifts in location or scale over time. The lag plot does not indicate any non-random pattern. The histogram shows the data are reasonably symmetric and consistent with a normal distribution. The normal probability plot verifies that the normality assumption is reasonable.
The assumptions are addressed by the four diagnostic plots:
- The run sequence plot (upper left) shows 195 measurements fluctuating around a stable central value of approximately 9.261 with no systematic drift — the fixed-location and fixed-variation assumptions appear satisfied.
- The lag plot (upper right) displays a roughly circular scatter cloud, consistent with approximately independent observations — no strong evidence of non-randomness.
- The histogram (lower left) is approximately symmetric and bell-shaped, centered near 9.261 — consistent with a normal distribution.
- The normal probability plot (lower right) shows data points closely following the theoretical straight line — confirming that the normality assumption is reasonable.
From the above plots, we conclude that the underlying assumptions are valid and the data follow approximately a normal distribution. The standard confidence interval is appropriate for quantifying the uncertainty of the calibration factor.
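The 4-plot itself is straightforward to reproduce. A minimal sketch with matplotlib, using synthetic normal data near the reported mean and standard deviation in place of the real measurements; the panel layout (run sequence, lag plot, histogram, normal probability plot) follows the NIST convention:

```python
import matplotlib

matplotlib.use("Agg")  # render off-screen
import matplotlib.pyplot as plt
import numpy as np
from scipy import stats

# Synthetic stand-in: a stable process near the reported mean and sd.
rng = np.random.default_rng(42)
y = rng.normal(loc=9.2615, scale=0.0228, size=195)

fig, axes = plt.subplots(2, 2, figsize=(8, 6))

# Run sequence plot: response vs. run order.
axes[0, 0].plot(np.arange(1, y.size + 1), y, marker=".", linewidth=0.5)
axes[0, 0].set_title("Run Sequence Plot")

# Lag plot: Y[i] vs. Y[i-1]; a shapeless cloud suggests independence.
axes[0, 1].scatter(y[:-1], y[1:], s=8)
axes[0, 1].set_title("Lag Plot")

# Histogram: checks symmetry and bell shape.
axes[1, 0].hist(y, bins=20)
axes[1, 0].set_title("Histogram")

# Normal probability plot: linearity supports normality.
stats.probplot(y, dist="norm", plot=axes[1, 1])
axes[1, 1].set_title("Normal Probability Plot")

fig.tight_layout()
fig.savefig("4plot.png")
```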
Run Sequence Plot
The run sequence plot shows 195 measurements fluctuating around a stable central value of approximately 9.261. No systematic drift or trend is visible. The data are consistent with a stable measurement process.
Lag Plot
The lag plot at lag 1 displays a roughly circular scatter cloud, consistent with approximately independent observations. There is no strong linear or curvilinear pattern that would indicate severe autocorrelation.
Histogram
The histogram is approximately symmetric and bell-shaped, centered near 9.261. The shape is consistent with a normal distribution. An overlaid normal PDF with mean 9.261 and standard deviation 0.023 fits the data well.
Normal Probability Plot
The normal probability plot shows data points closely following the theoretical straight line (fitted intercept 9.261, slope 0.023). The linearity confirms that the normality assumption is reasonable for these data.
Autocorrelation Plot
The autocorrelation plot quantifies the serial dependence in the data. With 95% confidence bands at ±1.96/√N ≈ ±0.140, autocorrelation coefficients exceeding these bounds indicate significant non-randomness.
The lag-1 autocorrelation of 0.281 exceeds the significance bounds, confirming the mild non-randomness detected by the runs test. However, the autocorrelation decays quickly, and the departure from independence is not severe enough to warrant a more complex model.
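The lag-1 autocorrelation and its significance band are easy to compute directly. A sketch on simulated AR(1) data — the φ = 0.5 series is illustrative only; the real measurements would be substituted for `y`:

```python
import numpy as np

def lag1_autocorr(y):
    """Sample lag-1 autocorrelation r1."""
    d = y - y.mean()
    return np.dot(d[:-1], d[1:]) / np.dot(d, d)

# Simulate a mildly autocorrelated AR(1) process: y[t] = phi*y[t-1] + e[t].
rng = np.random.default_rng(7)
n, phi = 500, 0.5
e = rng.normal(size=n)
y = np.empty(n)
y[0] = e[0]
for t in range(1, n):
    y[t] = phi * y[t - 1] + e[t]

r1 = lag1_autocorr(y)
band = 1.96 / np.sqrt(n)  # approximate 95% band under independence
print(f"r1 = {r1:.3f}, band = ±{band:.3f}, significant: {abs(r1) > band}")
```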
Spectral Plot
The spectral plot shows the frequency-domain structure of the calibration data.
The spectrum shows modest low-frequency content consistent with the mild autocorrelation, but no sharp peaks that would indicate periodic structure. The process is essentially well-behaved with a minor departure from strict independence.
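A spectral check can be sketched with a raw periodogram built from the FFT. Here a sinusoid is deliberately planted so a peak is visible; for the calibration data no such peak appears:

```python
import numpy as np

# 200 observations: a cycle at frequency 0.1 cycles/observation plus noise.
rng = np.random.default_rng(3)
n = 200
t = np.arange(n)
y = np.sin(2 * np.pi * 0.1 * t) + 0.1 * rng.normal(size=n)

# Raw periodogram: squared FFT magnitudes at the Fourier frequencies.
power = np.abs(np.fft.rfft(y - y.mean())) ** 2
freqs = np.fft.rfftfreq(n, d=1.0)

peak = freqs[np.argmax(power[1:]) + 1]  # skip the zero frequency
print(f"dominant frequency = {peak:.3f} cycles/observation")
```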
Quantitative Output and Interpretation
Summary Statistics
| Statistic | Value |
|---|---|
| Sample size | 195 |
| Mean | 9.261460 |
| Std Dev | 0.022789 |
| Median | 9.261952 |
| Min | 9.196848 |
| Max | 9.327973 |
| Range | 0.131126 |
The mean and median are nearly identical (9.261460 vs. 9.261952), confirming symmetry. The standard deviation of 0.023 represents approximately 0.25% relative precision, typical for heat flow meter calibrations.
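These statistics follow directly from numpy. A sketch using the ten preview values from above (running it on the full 195-value series would reproduce the table):

```python
import numpy as np

y = np.array([9.206343, 9.299992, 9.277895, 9.305795, 9.275351,
              9.288729, 9.287239, 9.260973, 9.303111, 9.275674])

summary = {
    "n": y.size,
    "mean": y.mean(),
    "std": y.std(ddof=1),   # sample standard deviation (N-1 denominator)
    "median": np.median(y),
    "min": y.min(),
    "max": y.max(),
    "range": y.max() - y.min(),
}
for name, value in summary.items():
    print(f"{name:>6}: {value}")
```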
Location Test
The location test fits a linear regression of the response against the run-order index and tests whether the slope is significantly different from zero.
| Parameter | Estimate | Std Error | t-Value |
|---|---|---|---|
| (intercept) | 9.26699 | 0.003253 | 2849 |
| (slope) | −0.000056412 | 0.00002878 | −1.960 |
Residual standard deviation: 0.022624 with 193 degrees of freedom.
Conclusion: The slope t-value of −1.960 sits exactly at the boundary of the critical value of ±1.96. While technically borderline, the slope estimate of −0.000056 is so small that it can essentially be considered zero — the estimated drift over the entire 195-observation run is only 0.000056 × 195 ≈ 0.011, less than half a standard deviation. The fixed-location assumption is satisfied.
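The location test is an ordinary least-squares fit against run order. A sketch with scipy.stats.linregress on a synthetic stable process (a genuinely drifting process would give a large |t|):

```python
import numpy as np
from scipy import stats

# Synthetic stable process: constant level plus noise, no real drift.
rng = np.random.default_rng(0)
n = 195
x = np.arange(1, n + 1)            # run-order index
y = 9.2615 + rng.normal(scale=0.0228, size=n)

fit = stats.linregress(x, y)
t_slope = fit.slope / fit.stderr   # t-statistic for H0: slope = 0
drifting = abs(t_slope) > 1.96     # two-sided test at alpha = 0.05
print(f"slope = {fit.slope:.2e}, t = {t_slope:.3f}, drift detected: {drifting}")
```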
Variation Test
Bartlett’s test divides the data into four equal-length intervals and tests whether their variances are homogeneous.
| Statistic | Value |
|---|---|
| Test statistic | 3.147 |
| Degrees of freedom | 3 |
| Critical value | 7.815 |
| Significance level | 0.05 |
Conclusion: The test statistic of 3.147 is well below the critical value of 7.815, so we do not reject the null hypothesis — the variances are not significantly different across the four quarters of the dataset. The constant-variation assumption is satisfied.
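Bartlett's test is available in scipy. A sketch that splits a series into four quarters, using equal-variance synthetic data in place of the measurements:

```python
import numpy as np
from scipy import stats

# Synthetic series with constant variance throughout.
rng = np.random.default_rng(1)
y = rng.normal(loc=9.2615, scale=0.0228, size=196)  # 196 splits evenly

# Split into four equal-length quarters and test variance homogeneity.
quarters = np.array_split(y, 4)
stat, p = stats.bartlett(*quarters)
print(f"Bartlett statistic = {stat:.3f}, p = {p:.3f}")
```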
Randomness Tests
Two complementary tests assess whether the observations are independent.
Runs test — tests whether the sequence of values above and below the median was produced randomly.
| Statistic | Value |
|---|---|
| Test statistic | −3.2306 |
| Critical value | 1.96 |
Conclusion: |Z| = 3.23 exceeds 1.96, so we reject the null hypothesis — the data show statistically significant non-randomness. The negative Z indicates fewer runs than expected, meaning the data tend to cluster in sequences above or below the median.
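The runs test has no dedicated scipy function, but the normal-approximation form is short to implement. A sketch of the standard above/below-median runs test, exercised on two toy sequences:

```python
import numpy as np

def runs_test_z(y):
    """Z statistic for the runs test about the median (normal approximation)."""
    med = np.median(y)
    signs = y[y != med] > med          # drop ties with the median
    n1 = int(signs.sum())              # count above the median
    n2 = signs.size - n1               # count below the median
    runs = 1 + int(np.count_nonzero(signs[1:] != signs[:-1]))
    mu = 2.0 * n1 * n2 / (n1 + n2) + 1.0
    var = (2.0 * n1 * n2 * (2.0 * n1 * n2 - n1 - n2)
           / ((n1 + n2) ** 2 * (n1 + n2 - 1)))
    return (runs - mu) / np.sqrt(var)

# Alternating data -> many runs -> large positive Z.
z_alt = runs_test_z(np.array([0.0, 1.0] * 10))
# Clustered data -> few runs -> large negative Z (as with the HFM data).
z_clu = runs_test_z(np.array([0.0] * 10 + [1.0] * 10))
print(f"alternating Z = {z_alt:.2f}, clustered Z = {z_clu:.2f}")
```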
Lag-1 autocorrelation — measures the linear dependence between consecutive observations.
| Statistic | Value |
|---|---|
| Lag-1 autocorrelation r₁ | 0.281 |
| Critical value | 0.140 |
Conclusion: The lag-1 autocorrelation of 0.281 exceeds the critical value of 0.140, confirming the non-randomness detected by the runs test. The randomness assumption is violated, but the violation is mild — the autocorrelation is modest compared to cases like a random walk, where r₁ approaches 1.
However, as the NIST handbook notes, the violation of the randomness assumption is mild and “not serious enough to warrant developing a more sophisticated model.” In practice, mild non-randomness is common in calibration data and requires a judgment call about whether the violation warrants a more complex model.
Distribution Test
The probability plot correlation coefficient is 0.999, exceeding the critical value of 0.987. The Anderson-Darling test provides a more sensitive formal test.
| Statistic | Value |
|---|---|
| PPCC | 0.999 |
| PPCC critical value | 0.987 |
| Anderson-Darling | 0.129 |
| Critical value (α = 0.05) | 0.787 |
Conclusion: Both the PPCC (0.999 > 0.987) and Anderson-Darling (A² = 0.129, well below 0.787) support the normality assumption. The normal distribution is a good model for these data.
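scipy provides the Anderson-Darling test with critical values at several significance levels. A sketch on synthetic normal data (the real measurements would be substituted for `y`):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
y = rng.normal(loc=9.2615, scale=0.0228, size=195)

res = stats.anderson(y, dist="norm")
# res.critical_values pairs with res.significance_level (15, 10, 5, 2.5, 1 %).
crit_5pct = res.critical_values[res.significance_level.tolist().index(5.0)]
print(f"A^2 = {res.statistic:.3f}, 5% critical value = {crit_5pct:.3f}")
print("reject normality:", res.statistic > crit_5pct)
```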
Outlier Detection
Grubbs’ test tests whether the most extreme observation is a statistical outlier.
| Statistic | Value |
|---|---|
| G | 2.918673 |
| Critical value (upper one-tailed, ) | 3.597898 |
Conclusion: G = 2.919 is less than the critical value of 3.598, so we do not reject the null hypothesis — no outliers are detected.
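Grubbs' test is also easy to implement from the t distribution. A sketch using the standard two-sided Grubbs critical-value formula (the handbook's quoted critical value may follow a slightly different tail convention, so values can differ in the second decimal); the four-point array is a toy example:

```python
import numpy as np
from scipy import stats

def grubbs_statistic(y):
    """G = max |y_i - ybar| / s."""
    return np.max(np.abs(y - y.mean())) / y.std(ddof=1)

def grubbs_critical(n, alpha=0.05):
    """Two-sided Grubbs critical value from the t distribution."""
    t2 = stats.t.ppf(1.0 - alpha / (2.0 * n), n - 2) ** 2
    return (n - 1) / np.sqrt(n) * np.sqrt(t2 / (n - 2 + t2))

y = np.array([1.0, 2.0, 3.0, 100.0])
g = grubbs_statistic(y)
print(f"G = {g:.4f}, critical (n=4) = {grubbs_critical(4):.4f}")
print(f"critical (n=195) = {grubbs_critical(195):.4f}")
```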
Test Summary
| Assumption | Test | Statistic | Critical Value | Result |
|---|---|---|---|---|
| Fixed location | Regression on run order | t = −1.960 | 1.96 | Borderline — do not reject |
| Fixed variation | Bartlett’s test | 3.147 | 7.815 | Do not reject |
| Randomness | Runs test | Z = −3.231 | 1.96 | Reject |
| Randomness | Autocorrelation lag-1 | 0.281 | 0.140 | Reject |
| Normality | Anderson-Darling | 0.129 | 0.787 | Do not reject |
| Outliers | Grubbs’ test | 2.919 | 3.598 | Do not reject |
The assumptions of fixed location, fixed variation, and normality are satisfied. The randomness assumption shows a mild violation (lag-1 autocorrelation of 0.281 is statistically significant), but the departure is not severe enough to require a more complex model.
Interpretation
The graphical and quantitative analyses are largely consistent: three of the four underlying assumptions are clearly satisfied, and the sole violation is mild. The run sequence plot shows a stable process with no visible trend, confirmed by the borderline location test (t = −1.960, with estimated drift of only 0.011 over the full run — less than half a standard deviation). Bartlett’s test confirms constant variation (test statistic 3.147, well below the critical value of 7.815). The normal probability plot is closely linear, and both the Anderson-Darling test (A² = 0.129, well below 0.787) and PPCC (0.999, well above 0.987) strongly support normality. Grubbs’ test (G = 2.919, below 3.598) confirms no outliers.
The only departure from the ideal is a mild violation of the randomness assumption. The runs test (Z = −3.23) and lag-1 autocorrelation (r₁ = 0.281) both reject independence at the 5% level. However, the autocorrelation plot shows that the dependence decays quickly beyond lag 1, and the spectral plot reveals no periodic structure — the modest low-frequency content is consistent with mild short-range correlation rather than a systematic pattern. As the NIST handbook notes, the departure “is not serious enough to warrant developing a more sophisticated model.”
The univariate model is appropriate for this dataset. The standard 95% confidence interval is approximately valid, though the true uncertainty is somewhat larger than s/√N suggests due to the mild autocorrelation. The process is considered to be in statistical control — the calibration factor is stable, the variation is consistent, and the data are well-described by a normal distribution.
Conclusions
The heat flow meter calibration data satisfy three of the four assumptions. The location is stable, the variation is constant across quarters of the dataset, and the data follow a normal distribution with no outliers. The randomness assumption shows a mild violation — the lag-1 autocorrelation of 0.281 and runs test statistic of −3.23 both reject the null hypothesis of independence at the 5% level. However, this violation is not severe enough to invalidate the univariate model.
The recommended model is:

Yᵢ = C + Eᵢ

where C is estimated by the sample mean Ȳ = 9.26146 and the random errors Eᵢ have estimated standard deviation s = 0.02279.

The 95% confidence interval for the calibration factor is:

Ȳ ± t × s/√N = 9.26146 ± 1.9723 × 0.02279/√195 = (9.25824, 9.26468)

The standard deviation of the mean is s/√N = 0.00163, and the 95% confidence interval for the standard deviation is (0.02073, 0.02531).
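The interval can be computed directly from the reported summary statistics:

```python
import numpy as np
from scipy import stats

n, ybar, s = 195, 9.261460, 0.022789  # from the summary statistics table

sem = s / np.sqrt(n)                  # standard deviation of the mean
t_crit = stats.t.ppf(0.975, n - 1)    # two-sided 95%, 194 degrees of freedom
lo, hi = ybar - t_crit * sem, ybar + t_crit * sem
print(f"sem = {sem:.5f}, 95% CI = ({lo:.5f}, {hi:.5f})")
```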
Because the randomness assumption has a mild violation, the true uncertainty is somewhat larger than the standard confidence interval suggests. Nevertheless, the NIST handbook considers this process to be in statistical control — the departure from randomness is not serious enough to warrant developing a more sophisticated model. This case study illustrates a well-behaved measurement process where the standard EDA methodology confirms that simple statistical inference based on the sample mean is appropriate.