Josephson Junction Cryothermometry


Background and Data

This case study applies exploratory data analysis to the NIST SOULEN.DAT dataset, which contains 700 voltage count measurements from a Josephson junction cryothermometry (low temperature) experiment. The data were collected by Bob Soulen of NIST in October 1971 as a sequence of observations, equally spaced in time, read from a voltmeter to ascertain the process temperature. The response variable is voltage counts, recorded as integers.

The dataset originates from NIST/SEMATECH Section 1.4.2.4. With n = 700 observations, this study illustrates the case where there is discreteness in the measurements (the data are integers with only a few distinct values and many repeats), but the underlying assumptions approximately hold. The discrete nature of the data creates artifacts in several plots and affects the interpretation of normality tests.

Dataset

SOULEN.DAT

Observations: 700

Variable: Voltage counts

Bob Soulen, NIST, Josephson junction voltage counts (Oct 1971)

NIST source description

Josephson Junction Cryothermometry Experiment (outlier-free). Bob Soulen, October 1971. Response variable = voltage counts. The Josephson Junction can be used as a cryothermometry device. Number of observations = 700.

Preview data

#	Value
1	2899
2	2898
3	2898
4	2900
5	2898
6	2901
7	2899
8	2901
9	2900
10	2898
... 690 more rows


Test Underlying Assumptions

Goals

The analysis has three primary objectives:

Model validation: assess whether the univariate model is an appropriate fit for the cryothermometry data:

Y_i = C + E_i

Assumption testing: evaluate whether the data satisfy the four standard assumptions for a measurement process in statistical control:
- Random sampling: the data are uncorrelated
- Fixed distribution: the data come from a fixed distribution
- Fixed location: the distribution location (mean) is constant
- Fixed variation: the distribution scale (standard deviation) is constant

Confidence interval validity: determine whether the standard confidence interval formula is appropriate:

\bar{Y} \pm \frac{2s}{\sqrt{N}}

where $s$ is the standard deviation. This formula relies on all four assumptions holding; if they are violated, the confidence interval may not have the intended coverage.

If any assumptions are violated, identify the nature and severity of the violations and determine whether the violations are serious enough to invalidate the model.

Graphical Output and Interpretation

The graphical analysis uses the standard suite of EDA plots to assess the four underlying assumptions. The discrete integer nature of the cryothermometry data creates visual artifacts in several plots that must be interpreted carefully.

4-Plot Overview

The 4-plot is the primary graphical tool for testing all four assumptions simultaneously. For the cryothermometry data, it reveals a dataset that approximately satisfies the assumptions, with some caveats due to data discreteness.


Four-plot diagnostic layout for the cryothermometry dataset (run sequence, lag, histogram, normal probability). Discrete integer values create grid-like artifacts in the lag plot and staircase patterns in the probability plot.

The assumptions are addressed by the four diagnostic plots:

- The run sequence plot (upper left) shows 700 voltage counts fluctuating around a stable location of approximately 2899 with no visible shifts in location or scale, so the fixed-location and fixed-variation assumptions appear satisfied.
- The lag plot (upper right) does not indicate significant non-random patterns. The discrete integer values create a grid-like appearance rather than a smooth scatter, which is a measurement artifact rather than a process feature.
- The histogram (lower left) is approximately symmetric and bell-shaped, consistent with a normal distribution, though the discrete measurement resolution creates a slightly irregular profile.
- The normal probability plot (lower right) shows a staircase-like pattern due to discrete integer values, but the overall trend is consistent with approximate normality.

The integer data with only a few distinct values and many repeats accounts for the discrete appearance of several plots. The underlying assumptions appear approximately satisfied, with caveats due to data discreteness that will be further examined quantitatively.

Run Sequence Plot

The run sequence plot tests whether the location is constant over time. The data range from 2895 to 2902, fluctuating around a stable center near 2899. There are no visible shifts in location or scale over the course of the experiment.

Conclusion: The run sequence plot shows stable location and variation. The fixed-location assumption appears satisfied from graphical inspection.

Run sequence plot of 700 voltage count measurements fluctuating around a stable location of approximately 2899 with no visible shifts in location or scale.

Lag Plot

The lag plot tests whether the data are random by plotting $Y_i$ versus $Y_{i-1}$. If the data are random, the lag plot should show a structureless cloud.

The lag plot at lag 1 does not indicate any significant non-random patterns. The discrete integer nature of the data creates a grid-like appearance rather than a smooth scatter, but this is a measurement artifact rather than a process feature.

Conclusion: No strong non-random structure is visible, though the grid pattern from discrete data makes subtle patterns harder to detect.

Lag-1 plot showing a grid-like pattern due to discrete integer values. No strong non-random structure is visible, though the grid makes subtle patterns harder to detect.
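Because many lag-1 pairs land on the same grid point, a scatter plot hides how often each pair occurs. One way to recover that information is to tabulate the pairs. The following is a minimal sketch, not part of the original analysis; the function name `lag_pair_counts` is hypothetical, and the input is only the ten preview readings rather than the full 700-point series.

```python
from collections import Counter


def lag_pair_counts(y, lag=1):
    """Count how often each (Y[i-lag], Y[i]) pair occurs in the series."""
    return Counter(zip(y[:-lag], y[lag:]))


# First ten readings from the data preview (not the full dataset).
preview = [2899, 2898, 2898, 2900, 2898, 2901, 2899, 2901, 2900, 2898]

counts = lag_pair_counts(preview)
for (previous, current), n in sorted(counts.items()):
    print(previous, current, n)
```

On the full series, the counts could be mapped to marker size or colour so that heavily overplotted grid points stand out.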

Histogram

The histogram assesses the shape of the underlying distribution. The histogram is approximately symmetric and bell-shaped, centered near 2899. The discrete measurement resolution (integer values) creates a slightly irregular profile, but the overall distributional shape is consistent with a normal distribution. An overlaid normal PDF with mean 2898.562 and standard deviation 1.305 fits the data reasonably well.

Histogram with KDE overlay of the cryothermometry data. The distribution is approximately symmetric and bell-shaped, centered near 2899.

Normal Probability Plot

The normal probability plot shows an approximately linear pattern (fitted intercept 2898.562, slope 1.276), but the discrete integer values create a staircase-like appearance that makes interpretation more difficult than with continuous data. The overall shape is consistent with approximate normality.

Normal probability plot of the cryothermometry data. The staircase-like pattern is caused by discrete integer values, but the overall trend is consistent with approximate normality.

Autocorrelation Plot

The autocorrelation plot quantifies serial dependence that the lag plot, with its grid-like artifacts, cannot resolve visually. With 95% confidence bands at ±2/√N = ±0.076, any lag exceeding these bounds indicates significant autocorrelation.

Autocorrelation plot showing mild positive autocorrelation (r₁ = 0.31) that decays relatively quickly, partially attributable to discrete integer measurements.

The autocorrelation at lag 1 is 0.31, exceeding the significance bound. The autocorrelation decays relatively quickly, suggesting the non-randomness is mild and partially attributable to the discrete nature of the integer-valued measurements.
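The sample autocorrelation and its approximate significance band can be computed directly. The sketch below uses only the standard library; the function names `autocorr` and `band` are hypothetical, and the estimator shown (lagged cross-products divided by the total sum of squares) is one common convention, assumed here rather than taken from the source.

```python
import math


def autocorr(y, lag=1):
    """Sample autocorrelation at the given lag."""
    n = len(y)
    mean = sum(y) / n
    denom = sum((v - mean) ** 2 for v in y)
    numer = sum((y[i] - mean) * (y[i + lag] - mean) for i in range(n - lag))
    return numer / denom


def band(n):
    """Approximate 95% significance bound, 2 / sqrt(N)."""
    return 2 / math.sqrt(n)


print(round(band(700), 3))  # 0.076
```

With the full series loaded into a list `y`, `abs(autocorr(y, 1)) > band(len(y))` would flag a significant lag-1 autocorrelation.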

Spectral Plot

The spectral plot shows the frequency-domain structure. For the cryothermometry data, the spectrum shows modest low-frequency content consistent with the mild autocorrelation.

Spectral plot of the cryothermometry data showing modest low-frequency content consistent with the mild autocorrelation detected in the time domain.

Quantitative Output and Interpretation

Summary Statistics

Statistic	Value
Sample size $n$	700
Mean $\bar{Y}$	2898.562
Std Dev $s$	1.305
Median	2899.0
Min	2895.0
Max	2902.0
Range	7.0
Normal PPCC	0.975

The mean and median are close, consistent with the approximate symmetry seen in the histogram. The small range of 7 units reflects the precision of the voltage counting process.

Location Test

The location test fits a linear regression of the response $Y$ against the run-order index $X = 1, 2, \ldots, N$ and tests whether the slope is significantly different from zero.

H_0\!: B_1 = 0 \quad \text{vs.} \quad H_a\!: B_1 \neq 0

Parameter	Estimate	Std Error	t-Value
$B_0$ (intercept)	2.898E+03	9.745E-02	29739.288
$B_1$ (slope)	1.071E-03	2.409E-04	4.445

Residual standard deviation: 1.288 with 698 degrees of freedom.

Conclusion: The slope t-value of 4.445 exceeds the critical value $t_{0.975,\,698} = 1.96$, so we reject $H_0$: the slope is statistically significant. However, the slope of 0.001071 counts per observation is practically negligible; over the full 700 observations, the predicted drift is only about $0.001071 \times 700 \approx 0.75$ voltage counts. The assumption of constant location is therefore not seriously violated even though the test is statistically significant. This is a classic example of a statistically significant but practically unimportant effect.
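The regression behind this test is small enough to write out by hand. The following sketch fits an ordinary least squares line against the run-order index and returns the slope, its standard error, and the t-value. The function name `slope_t_test` is hypothetical, and the code is an illustration of the technique rather than the software used to produce the table.

```python
import math


def slope_t_test(y):
    """OLS fit of y against run order 1..N; returns (slope, std error, t-value)."""
    n = len(y)
    x = range(1, n + 1)
    x_mean = (n + 1) / 2
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    resid_sd = math.sqrt(sse / (n - 2))  # residual SD on n - 2 degrees of freedom
    se_slope = resid_sd / math.sqrt(sxx)
    return slope, se_slope, slope / se_slope


# Practical size of the reported slope over the whole run.
drift = 1.071e-3 * 700
print(round(drift, 2))  # 0.75
```

Comparing `drift` with the data range of 7 counts puts the statistically significant slope in perspective.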

Variation Test

The Levene test (median-based variant, used instead of Bartlett’s test because the discrete nature of the data makes the normality assumption questionable) divides the data into $k = 4$ equal-length intervals and tests whether their variances are homogeneous.

H_0\!: \sigma_1^2 = \sigma_2^2 = \sigma_3^2 = \sigma_4^2 \quad \text{vs.} \quad H_a\!: \text{at least one } \sigma_i^2 \text{ differs}

Statistic	Value
Test statistic $W$	1.43
Degrees of freedom	$k - 1 = 3$ and $N - k = 696$
Critical value $F_{0.05,\,3,\,696}$	2.618

Conclusion: The test statistic of 1.43 is less than the critical value of 2.618, so we fail to reject $H_0$ — the variances are not significantly different across the four intervals. The constant-variation assumption is satisfied.
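A median-based Levene statistic can be sketched with the standard library as follows. The function name `levene_median` is hypothetical, and dropping trailing observations so that the groups are equal length is an assumption of this sketch (it has no effect when $N = 700$ and $k = 4$).

```python
import statistics


def levene_median(y, k=4):
    """Median-based Levene statistic W for k consecutive equal-length groups."""
    size = len(y) // k
    n = size * k  # drop any remainder so that all groups have equal length
    groups = [y[i * size:(i + 1) * size] for i in range(k)]

    # Absolute deviations from each group's median.
    z = []
    for g in groups:
        med = statistics.median(g)
        z.append([abs(v - med) for v in g])

    z_means = [sum(zi) / len(zi) for zi in z]
    z_grand = sum(sum(zi) for zi in z) / n
    between = sum(len(zi) * (m - z_grand) ** 2 for zi, m in zip(z, z_means))
    within = sum((v - m) ** 2 for zi, m in zip(z, z_means) for v in zi)
    return (n - k) / (k - 1) * between / within
```

The returned value would be compared with the upper critical value of the $F$ distribution with $k - 1$ and $N - k$ degrees of freedom, as in the table above.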

Randomness Tests

Two complementary tests assess whether the observations are independent.

Runs test — tests whether the sequence of values above and below the median was produced randomly.

H_0\!: \text{sequence is random} \quad \text{vs.} \quad H_a\!: \text{sequence is not random}

Statistic	Value
Test statistic $Z$	−13.4162
Critical value $Z_{1-\alpha/2}$	1.96

Conclusion: $|Z| = 13.4162$ far exceeds 1.96, so we reject $H_0$ — the data are not random. The negative Z indicates far fewer runs than expected, meaning consecutive observations tend to cluster at the same integer value.
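A runs test above and below the median can be sketched as below. The function name `runs_test_z` is hypothetical, and discarding observations equal to the median is an assumption of this sketch. With heavily tied integer data the tie convention changes the result materially, so this code is not claimed to reproduce the statistic in the table.

```python
import math
import statistics


def runs_test_z(y):
    """Z statistic for the number of runs above and below the median."""
    med = statistics.median(y)
    # Assumed convention: observations equal to the median are discarded.
    above = [v > med for v in y if v != med]
    n1 = sum(above)
    n2 = len(above) - n1
    runs = 1 + sum(a != b for a, b in zip(above, above[1:]))
    expected = 2 * n1 * n2 / (n1 + n2) + 1
    variance = (2 * n1 * n2 * (2 * n1 * n2 - n1 - n2)
                / ((n1 + n2) ** 2 * (n1 + n2 - 1)))
    return (runs - expected) / math.sqrt(variance)


print(runs_test_z([1, 2, 3, 4, 5, 6, 7, 8]) < 0)  # trending data: too few runs
print(runs_test_z([1, 8, 2, 7, 3, 6, 4, 5]) > 0)  # alternating data: too many runs
```

A large negative value, as reported for this dataset, corresponds to the first case: like values cluster together.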

Lag-1 autocorrelation — measures the linear dependence between consecutive observations.

Statistic	Value
$r_1$	0.31
Critical value $2/\sqrt{N}$	0.076

Conclusion: The lag-1 autocorrelation of 0.31 exceeds the critical value of $2/\sqrt{700} \approx 0.076$. There is statistically significant autocorrelation, indicating mild non-randomness. However, the non-randomness can at least partially be explained by the discrete nature of the data: when measurements are restricted to a few integer values, consecutive readings tend to repeat, inflating autocorrelation and reducing the number of runs. The randomness assumption is mildly violated.

Distribution Test

The distributional assumption is tested with two complementary methods.

Probability Plot Correlation Coefficient (PPCC) — measures the linearity of the normal probability plot.

Statistic	Value
Normal PPCC	0.975
Critical value (5%)	0.987

Conclusion: The PPCC of 0.975 is less than the critical value of 0.987, so we reject normality at the 5% level.
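The PPCC is the correlation between the sorted data and theoretical normal quantiles. The sketch below uses Filliben's approximation for the plotting positions, which is an assumption about the convention rather than something stated in the source. The names `filliben_medians` and `normal_ppcc` are hypothetical.

```python
import math
from statistics import NormalDist


def filliben_medians(n):
    """Approximate medians of the uniform order statistics (Filliben)."""
    m = [(i - 0.3175) / (n + 0.365) for i in range(1, n + 1)]
    m[-1] = 0.5 ** (1 / n)
    m[0] = 1 - m[-1]
    return m


def normal_ppcc(y):
    """Correlation between sorted data and normal order statistic medians."""
    n = len(y)
    q = [NormalDist().inv_cdf(p) for p in filliben_medians(n)]
    ys = sorted(y)
    q_mean = sum(q) / n
    y_mean = sum(ys) / n
    sxy = sum((a - q_mean) * (b - y_mean) for a, b in zip(q, ys))
    sxx = sum((a - q_mean) ** 2 for a in q)
    syy = sum((b - y_mean) ** 2 for b in ys)
    return sxy / math.sqrt(sxx * syy)
```

Values close to 1 indicate a nearly straight probability plot. Heavy ties pull the value down because the sorted data form flat steps while the theoretical quantiles keep increasing.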

Anderson-Darling test — a weighted goodness-of-fit test that is particularly sensitive to deviations in the tails.

Statistic	Value
$A^2$	16.858
Critical value (5%)	0.787

Conclusion: The Anderson-Darling statistic of 16.858 far exceeds the critical value of 0.787, also rejecting normality. However, the violation of the normality assumption is not severe enough to conclude that the univariate model is unreasonable. At least part of the non-normality can be explained by the discrete nature of the data — when measurements can only take a few integer values, the resulting distribution cannot be perfectly normal.
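To see how ties inflate the statistic, the Anderson-Darling $A^2$ for normality can be computed as in the sketch below. It assumes the mean and standard deviation are estimated from the sample and applies no small-sample correction. The function name `anderson_darling_normal` is hypothetical, and the code is an illustration, not the routine that produced the table.

```python
import math
from statistics import NormalDist, mean, stdev


def anderson_darling_normal(y):
    """Anderson-Darling A^2 against a normal with estimated mean and SD."""
    n = len(y)
    dist = NormalDist(mean(y), stdev(y))
    u = [dist.cdf(v) for v in sorted(y)]
    total = sum(
        (2 * i - 1) * (math.log(u[i - 1]) + math.log(1 - u[n - i]))
        for i in range(1, n + 1)
    )
    return -n - total / n


# Evenly spaced normal quantiles versus a heavily tied two-value sample.
smooth = [NormalDist().inv_cdf((i - 0.5) / 50) for i in range(1, 51)]
tied = [0] * 25 + [1] * 25
print(anderson_darling_normal(smooth) < anderson_darling_normal(tied))  # True
```

The two-value sample produces a much larger statistic than the smooth one, which illustrates why data restricted to a few integers can fail the test decisively even when the underlying process is close to normal.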

Outlier Detection

Grubbs’ test checks whether the most extreme observation is a statistical outlier.

Statistic	Value
$G$	2.729
Critical value (5%, one-tailed)	3.951

Conclusion: The Grubbs statistic of 2.729 is less than the critical value of 3.951, so we fail to reject the null hypothesis that there are no outliers. No outliers are detected.
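The Grubbs statistic is the largest absolute deviation from the mean, in units of the sample standard deviation. The sketch below defines it for raw data (the function name `grubbs_statistic` is hypothetical) and, as a plausibility check, evaluates it from the rounded summary statistics reported earlier.

```python
from statistics import mean, stdev


def grubbs_statistic(y):
    """Largest absolute deviation from the mean, in sample SD units."""
    m, s = mean(y), stdev(y)
    return max(abs(v - m) for v in y) / s


# Check against the reported summary statistics (rounded inputs).
y_min, y_max, y_bar, s = 2895, 2902, 2898.562, 1.305
g_summary = max(y_bar - y_min, y_max - y_bar) / s
print(round(g_summary, 2))  # 2.73
```

The minimum (2895) is the more extreme value, and the result agrees with the reported 2.729 to within the rounding of the inputs.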

Test Summary

Assumption	Test	Statistic	Critical Value	Result
Fixed location	Regression on run order	$t = 4.445$	1.96	Reject (but practically negligible)
Fixed variation	Levene test	$W = 1.43$	2.618	Fail to reject
Randomness	Runs test	$Z = {-13.4162}$	1.96	Reject (mild)
Randomness	Autocorrelation lag-1	$r_1 = 0.31$	0.076	Reject (mild)
Distribution	Normal PPCC	0.975	0.987	Reject (mild)
Distribution	Anderson-Darling	$A^2 = 16.858$	0.787	Reject (mild)
Outliers	Grubbs’ test	$G = 2.729$	3.951	No outliers

The randomness, normality, and (technically) location assumptions are violated, but all violations are mild and at least partially attributable to the discrete integer nature of the data. The variation assumption is satisfied and no outliers are present.

Interpretation

The constant-variation assumption is cleanly satisfied (Levene test $W = 1.43$, well below the critical value of 2.618). The location drift is statistically significant ($t = 4.445$) but practically negligible: the regression slope of 0.001071 implies a total predicted drift of only 0.75 voltage counts over 700 observations, as confirmed by the stable run sequence plot. Grubbs’ test detects no outliers ($G = 2.729$). The key diagnostic questions concern the randomness and normality violations.

The runs test ($Z = -13.4162$) and lag-1 autocorrelation ($r_1 = 0.31$) both reject independence, while the Anderson-Darling test ($A^2 = 16.858$) and normal PPCC (0.975) reject normality. However, these violations must be interpreted in the context of discrete integer data with only about 8 distinct values. Integer-valued measurements naturally produce long runs of identical readings, which inflates the autocorrelation and reduces the number of runs far below what continuous data would exhibit. Similarly, discrete data cannot conform to a continuous normal distribution; the staircase pattern in the normal probability plot is a measurement-resolution artifact, not evidence of a fundamentally non-normal process. The discrete nature of the data accounts for much of the observed violation in both the randomness and normality tests.

Despite the formal statistical rejections, the violations are mild and largely attributable to measurement discreteness rather than genuine process deficiencies. The univariate model $Y_i = 2898.562 + E_i$ remains appropriate for characterizing this measurement process. The standard confidence interval $\bar{Y} \pm 2s / \sqrt{N}$ is approximately valid, though the mild autocorrelation means the effective sample size is smaller than 700 and the true confidence interval is somewhat wider than the nominal calculation suggests.

Conclusions

Although the randomness and normality assumptions are violated, the violations are mild and at least partially explained by the discrete nature of the data, so the data may be modeled as if the process were in statistical control. The recommended univariate model is:

Y_i = 2898.562 + E_i

If the mild violation of the randomness assumption (lag-1 autocorrelation $r_1 = 0.31$) is ignored, the standard 95% confidence interval for the mean is:

\bar{Y} \pm \frac{2s}{\sqrt{N}} = 2898.562 \pm \frac{2 \times 1.305}{\sqrt{700}} = (2898.463,\; 2898.661)
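The arithmetic can be checked in a few lines. The helper name `mean_ci` is hypothetical; the multiplier 2 is the rounded normal quantile used in the formula above.

```python
import math


def mean_ci(mean, sd, n, z=2.0):
    """Approximate confidence interval: mean +/- z * sd / sqrt(n)."""
    half_width = z * sd / math.sqrt(n)
    return mean - half_width, mean + half_width


lower, upper = mean_ci(2898.562, 1.305, 700)
print(f"({lower:.3f}, {upper:.3f})")  # (2898.463, 2898.661)
```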

However, this interval assumes independent observations. With autocorrelation present, the effective sample size is smaller than 700, and the true confidence interval is wider. The autocorrelation-adjusted 95% confidence interval is (2898.515, 2898.928), which accounts for the reduced effective degrees of freedom due to serial correlation.
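One common way to gauge how much serial correlation widens an interval is a first-order autoregressive, AR(1), approximation, in which the effective sample size is $N(1 - r_1)/(1 + r_1)$. This is an assumption introduced here for illustration. It is not claimed to be the procedure behind the adjusted interval quoted above, and it yields different endpoints. The function name `effective_sample_size` is hypothetical.

```python
import math


def effective_sample_size(n, r1):
    """AR(1) approximation to the effective number of independent observations."""
    return n * (1 - r1) / (1 + r1)


n_eff = effective_sample_size(700, 0.31)
half_width = 2 * 1.305 / math.sqrt(n_eff)

print(round(n_eff, 1))       # 368.7
print(round(half_width, 3))  # 0.136
```

Under this approximation the half-width grows from about 0.099 to about 0.136 voltage counts, which remains small relative to the spread of the data.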

This case study demonstrates that discrete integer data can create artifacts in graphical plots (particularly the lag plot and normal probability plot) and can affect the results of formal statistical tests, but careful interpretation considering the nature of the data supports valid conclusions about the measurement process. The key insight is distinguishing between statistically significant violations and practically important ones — the location drift is real but negligible, and the non-randomness is at least partly a consequence of measurement discreteness rather than a genuine process deficiency.