Josephson Junction Cryothermometry
NIST/SEMATECH Section 1.4.2.4 Josephson Junction Cryothermometry
Background and Data
This case study applies exploratory data analysis to the NIST SOULEN.DAT dataset, which contains 700 voltage count measurements from a Josephson junction cryothermometry (low temperature) experiment. Bob Soulen of NIST collected the data in October 1971 as a sequence of observations, equally spaced in time, from a voltmeter used to ascertain the process temperature. The response variable is voltage counts, recorded as integers.
The dataset originates from NIST/SEMATECH Section 1.4.2.4. With n = 700 observations, this study illustrates the case where there is discreteness in the measurements (the data are integers with only a few distinct values and many repeats), but the underlying assumptions approximately hold. The discrete nature of the data creates artifacts in several plots and affects the interpretation of normality tests.
Dataset
Bob Soulen, NIST, Josephson junction voltage counts (Oct 1971)
NIST source description
Josephson Junction Cryothermometry Experiment (outlier-free). Bob Soulen, October 1971. Response variable = voltage counts. The Josephson Junction can be used as a cryothermometry device. Number of observations = 700.
Preview data
| # | Value |
|---|---|
| 1 | 2899 |
| 2 | 2898 |
| 3 | 2898 |
| 4 | 2900 |
| 5 | 2898 |
| 6 | 2901 |
| 7 | 2899 |
| 8 | 2901 |
| 9 | 2900 |
| 10 | 2898 |
| ... 690 more rows | |
Test Underlying Assumptions
Goals
The analysis has three primary objectives:
- Model validation: assess whether the univariate model $Y_i = C + E_i$ is an appropriate fit for the cryothermometry data.
- Assumption testing: evaluate whether the data satisfy the four standard assumptions for a measurement process in statistical control:
  - Random sampling: the data are uncorrelated
  - Fixed distribution: the data come from a fixed distribution
  - Fixed location: the distribution location (mean) is constant
  - Fixed variation: the distribution scale (standard deviation) is constant
- Confidence interval validity: determine whether the standard confidence interval formula $\bar{Y} \pm 2s/\sqrt{N}$ is appropriate, where $s$ is the standard deviation. This formula relies on all four assumptions holding; if they are violated, the confidence interval may not have the intended coverage.
If any assumptions are violated, identify the nature and severity of the violations and determine whether the violations are serious enough to invalidate the model.
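As a rough illustration of the confidence interval in the third goal, the sketch below evaluates the usual form, mean ± 2s/√N, from the summary statistics reported later in this case study. The function name and the `multiplier` parameter are illustrative assumptions, not part of the NIST analysis.

```python
import math


def mean_confidence_interval(mean, std_dev, n, multiplier=2.0):
    """Return (lower, upper) for mean +/- multiplier * std_dev / sqrt(n).

    Assumes independent observations; multiplier=2.0 is the usual
    approximation to a 95% interval.
    """
    half_width = multiplier * std_dev / math.sqrt(n)
    return mean - half_width, mean + half_width


# Summary statistics reported for the cryothermometry data.
lower, upper = mean_confidence_interval(mean=2898.562, std_dev=1.305, n=700)
print(f"approximate 95% interval: {lower:.3f} to {upper:.3f}")
```

The interval is only as trustworthy as the four assumptions behind it, which is why the rest of the analysis tests them one by one.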
Graphical Output and Interpretation
The graphical analysis uses the standard suite of EDA plots to assess the four underlying assumptions. The discrete integer nature of the cryothermometry data creates visual artifacts in several plots that must be interpreted carefully.
4-Plot Overview
The 4-plot is the primary graphical tool for testing all four assumptions simultaneously. For the cryothermometry data, it reveals a dataset that approximately satisfies the assumptions, with some caveats due to data discreteness.
The run sequence plot (upper left) shows 700 voltage count measurements fluctuating around a stable location of approximately 2899 with no visible shifts. The lag plot (upper right) does not indicate any significant non-random patterns, though the discrete integer values create a grid-like appearance. The histogram (lower left) is approximately symmetric and bell-shaped. The normal probability plot (lower right) shows a staircase-like pattern due to discrete integer values, but the overall trend is consistent with approximate normality.
The assumptions are addressed by the four diagnostic plots:
- The run sequence plot (upper left) shows 700 voltage counts fluctuating around a stable location of approximately 2899 with no visible shifts in location or scale — the fixed-location and fixed-variation assumptions appear satisfied.
- The lag plot (upper right) does not indicate significant non-random patterns, though the discrete integer values create a grid-like appearance rather than a smooth scatter — this is a measurement artifact rather than a process feature.
- The histogram (lower left) is approximately symmetric and bell-shaped, consistent with a normal distribution, though the discrete measurement resolution creates a slightly irregular profile.
- The normal probability plot (lower right) shows a staircase-like pattern due to discrete integer values, but the overall trend is consistent with approximate normality.
The integer data with only a few distinct values and many repeats accounts for the discrete appearance of several plots. The underlying assumptions appear approximately satisfied, with caveats due to data discreteness that will be further examined quantitatively.
Run Sequence Plot
The run sequence plot tests whether the location is constant over time. The data range from 2895 to 2902, fluctuating around a stable center near 2899. There are no visible shifts in location or scale over the course of the experiment.
Conclusion: The run sequence plot shows stable location and variation. The fixed-location assumption appears satisfied from graphical inspection.
Lag Plot
The lag plot tests whether the data are random by plotting $Y_i$ versus $Y_{i-1}$. If the data are random, the lag plot should show a structureless cloud.
The lag plot at lag 1 does not indicate any significant non-random patterns. The discrete integer nature of the data creates a grid-like appearance rather than a smooth scatter, but this is a measurement artifact rather than a process feature.
Conclusion: No strong non-random structure is visible, though the grid pattern from discrete data makes subtle patterns harder to detect.
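The grid effect is easy to see by building the lag pairs directly. The sketch below uses only the ten preview values listed earlier, not the full dataset, and the helper name `lag_pairs` is hypothetical.

```python
from collections import Counter


def lag_pairs(values, lag=1):
    """Return (previous, current) pairs for a lag plot; lag must be >= 1."""
    return list(zip(values[:-lag], values[lag:]))


# First ten observations from the data preview (not the full 700).
preview = [2899, 2898, 2898, 2900, 2898, 2901, 2899, 2901, 2900, 2898]

pairs = lag_pairs(preview)
counts = Counter(pairs)

# Integer data means many pairs land on exactly the same grid point.
print(f"{len(pairs)} pairs fall on {len(counts)} distinct points")
```

With at most 8 distinct integer values in the full series, the 699 lag-1 pairs can occupy no more than 64 grid points, so heavy overplotting is unavoidable.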
Histogram
The histogram assesses the shape of the underlying distribution. The histogram is approximately symmetric and bell-shaped, centered near 2899. The discrete measurement resolution (integer values) creates a slightly irregular profile, but the overall distributional shape is consistent with a normal distribution. An overlaid normal PDF with mean 2898.562 and standard deviation 1.305 fits the data reasonably well.
Normal Probability Plot
The normal probability plot shows an approximately linear pattern (fitted intercept 2898.562, slope 1.276), but the discrete integer values create a staircase-like appearance that makes interpretation more difficult than with continuous data. The overall shape is consistent with approximate normality.
Autocorrelation Plot
The autocorrelation plot quantifies serial dependence that the lag plot does not clearly reveal. With approximate 95% confidence bands at $\pm 2/\sqrt{N} = \pm 0.076$, any lag whose autocorrelation falls outside these bounds indicates significant autocorrelation.
The autocorrelation at lag 1 is 0.31, exceeding the significance bound. The autocorrelation decays relatively quickly, suggesting the non-randomness is mild and partially attributable to the discrete nature of the integer-valued measurements.
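For readers who want to reproduce this kind of check, here is a minimal sketch of the sample autocorrelation and the approximate ±2/√N band. It uses the common estimator that divides by the total sum of squares; whether the original software used exactly this estimator is an assumption.

```python
import math


def autocorrelation(values, lag):
    """Sample autocorrelation at a non-negative lag."""
    n = len(values)
    mean = sum(values) / n
    denom = sum((v - mean) ** 2 for v in values)
    numer = sum(
        (values[i] - mean) * (values[i + lag] - mean) for i in range(n - lag)
    )
    return numer / denom


def approximate_band(n):
    """Half-width of the approximate 95% band, 2 / sqrt(n)."""
    return 2.0 / math.sqrt(n)


band = approximate_band(700)
print(f"band for N = 700: +/-{band:.3f}")
print("reported r1 = 0.31 outside band:", abs(0.31) > band)
```

Applied to the full series, `autocorrelation(data, 1)` would be compared against the band in the same way.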
Spectral Plot
The spectral plot shows the frequency-domain structure. For the cryothermometry data, the spectrum should show modest low-frequency content consistent with the mild autocorrelation.
Quantitative Output and Interpretation
Summary Statistics
| Statistic | Value |
|---|---|
| Sample size | 700 |
| Mean | 2898.562 |
| Std Dev | 1.305 |
| Median | 2899.0 |
| Min | 2895.0 |
| Max | 2902.0 |
| Range | 7.0 |
| Normal PPCC | 0.975 |
The mean and median are close, consistent with the approximate symmetry seen in the histogram. The small range of 7 units reflects the precision of the voltage counting process.
Location Test
The location test fits a linear regression of the response against the run-order index and tests whether the slope is significantly different from zero.
| Parameter | Estimate | Std Error | t-Value |
|---|---|---|---|
| (intercept) | 2.898E+03 | 9.745E-02 | 29739.288 |
| (slope) | 1.071E-03 | 2.409E-04 | 4.445 |
Residual standard deviation: 1.288 with 698 degrees of freedom.
Conclusion: The slope t-value of 4.445 exceeds the critical value $t_{0.975,\,698} \approx 1.96$, so we reject the null hypothesis of zero slope; the slope is statistically significant. However, the slope value of 0.001071 is practically negligible: over the full 700 observations, the predicted drift is only about 0.75 voltage counts ($0.001071 \times 700 \approx 0.75$). The assumption of constant location is not seriously violated even though the test is statistically significant. This is a classic example of a statistically significant but practically unimportant effect.
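The location test can be sketched as an ordinary least-squares fit of the response on the run-order index 1..N. The function below is an illustrative implementation, not the NIST code; the last lines simply redo the arithmetic on the reported estimates.

```python
import math


def slope_t_test(values):
    """Fit values against run order 1..n; return (slope, std_error, t_value)."""
    n = len(values)
    xs = range(1, n + 1)
    x_mean = (n + 1) / 2
    y_mean = sum(values) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, values))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, values))
    residual_sd = math.sqrt(sse / (n - 2))  # n - 2 degrees of freedom
    std_error = residual_sd / math.sqrt(sxx)
    return slope, std_error, slope / std_error


# Arithmetic check on the reported regression table.
reported_t = 1.071e-3 / 2.409e-4   # slope estimate / its standard error
predicted_drift = 1.071e-3 * 700   # slope times number of observations
print(f"t = {reported_t:.3f}, drift over 700 points = {predicted_drift:.2f}")
```

The drift figure is what matters in practice: it is smaller than the one-count resolution of the measurements.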
Variation Test
The Levene test (median-based variant, used instead of Bartlett’s test because the discrete nature of the data makes the normality assumption questionable) divides the data into four equal-length intervals and tests whether their variances are homogeneous.
| Statistic | Value |
|---|---|
| Test statistic | 1.43 |
| Degrees of freedom | $k - 1 = 3$ and $N - k = 696$ |
| Critical value | 2.618 |
Conclusion: The test statistic of 1.43 is less than the critical value of 2.618, so we fail to reject the null hypothesis of equal variances; the variances are not significantly different across the four intervals. The constant-variation assumption is satisfied.
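A median-based Levene statistic is straightforward to compute by hand. The following is a minimal sketch under the assumption that the series is cut into contiguous, equal-length groups; the helper names are hypothetical and the toy groups are not the cryothermometry data.

```python
import statistics


def split_equal(values, k):
    """Split values into k contiguous groups of equal length.

    Any remainder at the end is dropped (700 / 4 divides evenly).
    """
    size = len(values) // k
    return [values[i * size:(i + 1) * size] for i in range(k)]


def levene_median(groups):
    """Median-based Levene statistic for a list of groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    # Absolute deviations of each value from its own group's median.
    deviations = []
    for g in groups:
        med = statistics.median(g)
        deviations.append([abs(v - med) for v in g])
    group_means = [sum(d) / len(d) for d in deviations]
    grand_mean = sum(sum(d) for d in deviations) / n_total
    between = sum(
        len(d) * (m - grand_mean) ** 2 for d, m in zip(deviations, group_means)
    )
    within = sum(
        (v - m) ** 2 for d, m in zip(deviations, group_means) for v in d
    )
    return (n_total - k) / (k - 1) * between / within


# Toy example: the second group is twice as spread out as the first.
w_stat = levene_median([[1, 2, 3], [2, 4, 6]])
print(f"W = {w_stat:.3f}")
```

For the case study, `levene_median(split_equal(data, 4))` would be compared against the F critical value with 3 and 696 degrees of freedom.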
Randomness Tests
Two complementary tests assess whether the observations are independent.
Runs test — tests whether the sequence of values above and below the median was produced randomly.
| Statistic | Value |
|---|---|
| Test statistic | −13.4162 |
| Critical value | 1.96 |
Conclusion: $|Z| = 13.4162$ far exceeds 1.96, so we reject the null hypothesis of randomness; the data are not random. The negative $Z$ indicates far fewer runs than expected, meaning consecutive observations tend to cluster at the same integer value.
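The runs test statistic can be sketched as follows. One assumption deserves emphasis: values exactly equal to the median are dropped here, which is a common convention but not necessarily the one used to obtain the statistic above. With heavily tied integer data the choice matters, so this sketch may not reproduce the reported value exactly.

```python
import math
import statistics


def runs_test_z(values):
    """Z statistic for the runs test above/below the median.

    Assumption: observations equal to the median are discarded.
    """
    med = statistics.median(values)
    above = [v > med for v in values if v != med]
    n1 = sum(above)            # count above the median
    n2 = len(above) - n1       # count below the median
    runs = 1 + sum(1 for a, b in zip(above, above[1:]) if a != b)
    expected = 2 * n1 * n2 / (n1 + n2) + 1
    variance = (2 * n1 * n2 * (2 * n1 * n2 - n1 - n2)) / (
        (n1 + n2) ** 2 * (n1 + n2 - 1)
    )
    return (runs - expected) / math.sqrt(variance)


# Clustered values give too few runs (negative Z);
# strictly alternating values give too many (positive Z).
z_clustered = runs_test_z([1, 1, 1, 1, 5, 5, 5, 5])
z_alternating = runs_test_z([1, 5, 1, 5, 1, 5, 1, 5])
print(f"clustered: {z_clustered:.3f}, alternating: {z_alternating:.3f}")
```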
Lag-1 autocorrelation — measures the linear dependence between consecutive observations.
| Statistic | Value |
|---|---|
| Lag-1 autocorrelation $r_1$ | 0.31 |
| Critical value | 0.087 |
Conclusion: The lag-1 autocorrelation of 0.31 exceeds the critical value of 0.087. There is statistically significant autocorrelation, indicating mild non-randomness. However, the non-randomness can at least partially be explained by the discrete nature of the data — when measurements are restricted to a few integer values, consecutive readings tend to repeat, inflating autocorrelation and reducing the number of runs. The randomness assumption is mildly violated.
Distribution Test
The distributional assumption is tested with two complementary methods.
Probability Plot Correlation Coefficient (PPCC) — measures the linearity of the normal probability plot.
| Statistic | Value |
|---|---|
| Normal PPCC | 0.975 |
| Critical value (5%) | 0.987 |
Conclusion: The PPCC of 0.975 is less than the critical value of 0.987, so we reject normality at the 5% level.
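To make the PPCC concrete, the sketch below correlates the sorted data with normal quantiles evaluated at Filliben's approximate order statistic medians. The choice of Filliben's plotting positions is an assumption about how the coefficient was computed, and the demonstration samples are synthetic.

```python
import math
from statistics import NormalDist


def normal_ppcc(values):
    """Correlation between sorted data and normal order statistic medians."""
    n = len(values)
    ordered = sorted(values)
    # Filliben's approximation to the uniform order statistic medians.
    medians = [(i - 0.3175) / (n + 0.365) for i in range(1, n + 1)]
    medians[-1] = 0.5 ** (1.0 / n)
    medians[0] = 1.0 - medians[-1]
    quantiles = [NormalDist().inv_cdf(p) for p in medians]
    # Pearson correlation, computed by hand to stay dependency-free.
    q_mean = sum(quantiles) / n
    y_mean = sum(ordered) / n
    sqq = sum((q - q_mean) ** 2 for q in quantiles)
    syy = sum((y - y_mean) ** 2 for y in ordered)
    sqy = sum((q - q_mean) * (y - y_mean) for q, y in zip(quantiles, ordered))
    return sqy / math.sqrt(sqq * syy)


# Synthetic check: evenly spaced normal quantiles versus a skewed transform.
near_normal = [NormalDist(10, 2).inv_cdf((i + 0.5) / 50) for i in range(50)]
skewed = [math.exp(v) for v in near_normal]
ppcc_normal = normal_ppcc(near_normal)
ppcc_skewed = normal_ppcc(skewed)
print(ppcc_normal > ppcc_skewed)
```

A value close to 1 indicates a nearly straight probability plot; ties from integer rounding pull the coefficient down even when the underlying process is close to normal.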
Anderson-Darling test — a weighted goodness-of-fit test that is particularly sensitive to deviations in the tails.
| Statistic | Value |
|---|---|
| Test statistic $A^2$ | 16.858 |
| Critical value (5%) | 0.787 |
Conclusion: The Anderson-Darling statistic of 16.858 far exceeds the critical value of 0.787, also rejecting normality. However, the violation of the normality assumption is not severe enough to conclude that the univariate model is unreasonable. At least part of the non-normality can be explained by the discrete nature of the data — when measurements can only take a few integer values, the resulting distribution cannot be perfectly normal.
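The Anderson-Darling statistic for normality can be sketched with the standard formula, using the sample mean and standard deviation as the fitted parameters. This is an unadjusted $A^2$; whether the reported value includes a small-sample adjustment is not stated, so treat the comparison as illustrative. The two-valued demonstration sample is synthetic.

```python
import math
from statistics import NormalDist, mean, stdev


def anderson_darling_normal(values):
    """Unadjusted Anderson-Darling A^2 against a fitted normal distribution."""
    n = len(values)
    fitted = NormalDist(mean(values), stdev(values))
    cdf = [fitted.cdf(v) for v in sorted(values)]
    total = sum(
        (2 * i - 1) * (math.log(cdf[i - 1]) + math.log(1.0 - cdf[n - i]))
        for i in range(1, n + 1)
    )
    return -n - total / n


# Smooth, near-normal sample versus a coarse two-valued sample.
smooth = [NormalDist(10, 2).inv_cdf((i + 0.5) / 50) for i in range(50)]
coarse = [0] * 50 + [1] * 50
a2_smooth = anderson_darling_normal(smooth)
a2_coarse = anderson_darling_normal(coarse)
print(a2_smooth < a2_coarse)
```

The coarse sample scores far worse even though nothing is "wrong" with it beyond its resolution, which is the same mechanism at work in the integer-valued voltage counts.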
Outlier Detection
Grubbs’ test tests whether the most extreme observation is a statistical outlier.
| Statistic | Value |
|---|---|
| Test statistic $G$ | 2.729 |
| Critical value (5%, one-tailed) | 3.951 |
Conclusion: The Grubbs statistic of 2.729 is less than the critical value of 3.951, so we fail to reject the null hypothesis of no outliers; no outliers are detected.
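The Grubbs statistic is the largest absolute deviation from the mean, in units of the sample standard deviation. The sketch below defines it for raw data and then checks the reported value against the summary statistics; the function name is illustrative.

```python
from statistics import mean, stdev


def grubbs_statistic(values):
    """Largest absolute deviation from the mean, divided by the sample SD."""
    center = mean(values)
    return max(abs(v - center) for v in values) / stdev(values)


# Check using the reported summary statistics (mean, SD, min, max).
g_from_min = (2898.562 - 2895.0) / 1.305
g_from_max = (2902.0 - 2898.562) / 1.305
g_summary = max(g_from_min, g_from_max)
print(f"G from minimum: {g_from_min:.3f}, G from maximum: {g_from_max:.3f}")
```

The minimum (2895) is the more extreme point, and it yields the reported statistic to within rounding of the summary values.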
Test Summary
| Assumption | Test | Statistic | Critical Value | Result |
|---|---|---|---|---|
| Fixed location | Regression on run order | $t = 4.445$ | 1.96 | Reject (but practically negligible) |
| Fixed variation | Levene test | $W = 1.43$ | 2.618 | Fail to reject |
| Randomness | Runs test | $Z = -13.4162$ | 1.96 | Reject (mild) |
| Randomness | Autocorrelation lag-1 | $r_1 = 0.31$ | 0.087 | Reject (mild) |
| Distribution | Normal PPCC | 0.975 | 0.987 | Reject (mild) |
| Distribution | Anderson-Darling | $A^2 = 16.858$ | 0.787 | Reject (mild) |
| Outliers | Grubbs’ test | $G = 2.729$ | 3.951 | No outliers |
The randomness, normality, and (technically) location assumptions are violated, but all violations are mild and at least partially attributable to the discrete integer nature of the data. The variation assumption is satisfied and no outliers are present.
Interpretation
The constant-variation assumption is cleanly satisfied (Levene test statistic $W = 1.43$, well below the critical value of 2.618). The location drift is statistically significant ($t = 4.445$) but practically negligible: the regression slope of 0.001071 implies a total predicted drift of only 0.75 voltage counts over 700 observations, consistent with the stable run sequence plot. Grubbs’ test detects no outliers ($G = 2.729 < 3.951$). The key diagnostic questions concern the randomness and normality violations.
The runs test ($Z = -13.4162$) and lag-1 autocorrelation ($r_1 = 0.31$) both reject independence, while the Anderson-Darling test ($A^2 = 16.858$) and normal PPCC (0.975) reject normality. However, these violations must be interpreted in the context of discrete integer data with only about 8 distinct values. Integer-valued measurements naturally produce long runs of identical readings, which inflates the autocorrelation and reduces the number of runs far below what continuous data would exhibit. Similarly, discrete data cannot conform to a continuous normal distribution; the staircase pattern in the normal probability plot is a measurement-resolution artifact, not evidence of a fundamentally non-normal process. The discrete nature of the data accounts for much of the observed violation in both the randomness and normality tests.
Despite the formal statistical rejections, the violations are mild and largely attributable to measurement discreteness rather than genuine process deficiencies. The univariate model remains appropriate for characterizing this measurement process. The standard confidence interval is approximately valid, though the mild autocorrelation means the effective sample size is smaller than 700 and the true confidence interval is somewhat wider than the nominal calculation suggests.
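One rough way to gauge how much the autocorrelation shrinks the effective sample size is the first-order autoregressive approximation $N_{\text{eff}} \approx N(1 - r_1)/(1 + r_1)$. This is a textbook rule of thumb introduced here for illustration; it assumes AR(1)-like dependence and is not claimed to be the method behind any adjusted interval quoted for these data.

```python
import math


def effective_sample_size(n, r1):
    """AR(1) rule of thumb: n * (1 - r1) / (1 + r1)."""
    return n * (1.0 - r1) / (1.0 + r1)


def half_width(std_dev, n, multiplier=2.0):
    """Half-width of the interval mean +/- multiplier * std_dev / sqrt(n)."""
    return multiplier * std_dev / math.sqrt(n)


n_eff = effective_sample_size(700, 0.31)
nominal = half_width(1.305, 700)
adjusted = half_width(1.305, n_eff)
print(f"effective N = {n_eff:.1f}")
print(f"half-width: nominal {nominal:.3f}, adjusted {adjusted:.3f}")
```

Under this approximation the interval widens by a factor of $\sqrt{(1 + r_1)/(1 - r_1)}$, roughly 1.38 for $r_1 = 0.31$.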
Conclusions
Although the randomness and normality assumptions are violated, the violations are mild and at least partially explained by the discrete nature of the data, so the data may be modeled as if the process were in statistical control. The recommended univariate model is:

$$Y_i = C + E_i$$

where $C$ is estimated by the sample mean, 2898.562, and $E_i$ is the random error.
Since the randomness assumption is only mildly violated (lag-1 autocorrelation $r_1 = 0.31$), the standard 95% confidence interval for the mean remains a useful first approximation:

$$\bar{Y} \pm \frac{2s}{\sqrt{N}}$$
However, this interval assumes independent observations. With autocorrelation present, the effective sample size is smaller than 700, and the true confidence interval is wider. The autocorrelation-adjusted 95% confidence interval is (2898.515, 2898.928), which accounts for the reduced effective degrees of freedom due to serial correlation.
This case study demonstrates that discrete integer data can create artifacts in graphical plots (particularly the lag plot and normal probability plot) and can affect the results of formal statistical tests, but careful interpretation considering the nature of the data supports valid conclusions about the measurement process. The key insight is distinguishing between statistically significant violations and practically important ones — the location drift is real but negligible, and the non-randomness is at least partly a consequence of measurement discreteness rather than a genuine process deficiency.