The Case of the Disembodied Heads

Background

A joint in the jaw can suffer from a variety of pathologies: displacement, perforation, and lateral caporsal fibrosis, to name a few. These three pathologies are not independent, which means that should you have one, you are more likely to have another. A team of researchers wished to study the diagnostic abilities of three techniques for detecting these pathologies. The techniques are MRI scan, arthroscopy, and dissection. Arthroscopy, which involves inserting a fiber-optic cable to take pictures, is inexpensive but invasive and therefore carries some risk for the patient. MRI is non-invasive, but sometimes prohibitively expensive. The third route, of course, applies only to those who are already dead.

One purpose of this study was to examine whether it was worthwhile to invest in improving arthroscopic techniques so that they could be made less invasive. Clearly such an investment would make sense only if arthroscopy were successful at diagnosing these conditions.

About The Data

The data come courtesy of 18 heads, each with a right and a left joint. Each technique was applied to each joint and a diagnosis was made. Diagnoses were made independently, in the sense that the person who performed the MRI, say, did not know the results of the arthroscopy diagnosis. The result of each diagnosis was recorded as a 0 (healthy), 1 (mild pathology), 2 (severe pathology), or 9 (missing or ambiguous).

Each row of the data set represents a head. The first column represents the diagnosis for the right joint using MRI. The second is for the right joint using arthroscopy, and the third is for the right joint using dissection. The next three columns give the diagnoses for the left joint, first for MRI, then arthroscopy, and finally dissection.
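
The column layout just described can be read into named fields along the following lines. This is only a sketch: the column names, the parse_rows helper, and the choice to store the code 9 as None are illustrative assumptions, not part of the original study. The three sample rows are the first three rows of the data set.

```python
# Hypothetical column names, in the order described in the text.
COLUMNS = ("mri_right", "arthro_right", "dissect_right",
           "mri_left", "arthro_left", "dissect_left")
VALID_CODES = {0, 1, 2, 9}
MISSING = 9  # "missing or ambiguous"


def parse_rows(text):
    """Turn whitespace-separated rows into dicts; code 9 becomes None."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != len(COLUMNS):
            raise ValueError(f"expected {len(COLUMNS)} values, got: {line!r}")
        codes = [int(f) for f in fields]
        if any(c not in VALID_CODES for c in codes):
            raise ValueError(f"unexpected diagnosis code in: {line!r}")
        values = [None if c == MISSING else c for c in codes]
        rows.append(dict(zip(COLUMNS, values)))
    return rows


# First three rows of the data set, used here as a small example.
sample = """\
1 1 1 0 1 0
9 1 0 9 1 0
9 1 1 1 1 1
"""
rows = parse_rows(sample)
print(rows[1])
```

Storing 9 as None keeps the ambiguous code from being mistaken for a value above "severe" in any later ordinal calculation.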

Research Questions

Assume that the dissection method is the “truth”. That is, if MRI agrees with dissection, it is correct. If it disagrees, MRI is wrong.

1) For a given method, are left-joint diagnoses correlated with right-joint diagnoses? What does correlation mean for discrete-valued data like these? If there is no correlation (or no “significant” correlation), then we have 18 × 2 = 36 independent measurements. Otherwise, we might have only 18 independent units, and the two diagnoses from each head should perhaps be treated as a pair.
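
One possible starting point for question 1, offered as a sketch rather than a recommended analysis, is to cross-tabulate right against left diagnoses for a single method and summarize the ordinal association with Goodman and Kruskal's gamma (concordant minus discordant pairs, over their sum). The helper names are illustrative, and the two lists are the dissection columns (third and sixth) of the data set. Gamma is only one of several ordinal measures, and it says nothing by itself about significance with 18 heads.

```python
from collections import Counter
from itertools import combinations

MISSING = 9  # "missing or ambiguous" code


def complete_pairs(xs, ys):
    """Keep only the pairs in which neither diagnosis is coded 9."""
    return [(x, y) for x, y in zip(xs, ys) if x != MISSING and y != MISSING]


def goodman_kruskal_gamma(pairs):
    """Return (concordant, discordant, gamma) over all pairs of heads."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(pairs, 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
        # ties on either side count toward neither total
    total = concordant + discordant
    gamma = (concordant - discordant) / total if total else float("nan")
    return concordant, discordant, gamma


# Dissection diagnoses, right and left joint, one entry per head.
diss_right = [1, 0, 1, 1, 1, 1, 0, 1, 0, 0, 0, 1, 2, 0, 1, 1, 2, 1]
diss_left = [0, 0, 1, 0, 0, 2, 0, 0, 0, 0, 0, 1, 0, 0, 1, 2, 2, 1]

pairs = complete_pairs(diss_right, diss_left)
table = Counter(pairs)  # (right, left) -> number of heads
concordant, discordant, gamma = goodman_kruskal_gamma(pairs)

for (right, left), count in sorted(table.items()):
    print(f"right={right} left={left}: {count}")
print(f"concordant={concordant} discordant={discordant} gamma={gamma:.3f}")
```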

2) Our general goal is to understand the extent to which MRI, say, “agrees” with dissection. How would you measure/quantify “agreement”? You should come up with several possible methods.
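
Two common ways to quantify agreement are the raw proportion of matching diagnoses and Cohen's kappa, which discounts the agreement expected by chance from the marginal frequencies. The sketch below computes both for MRI against dissection on the right joint (first and third columns of the data set). Dropping every pair that contains a 9 is an assumption made here for simplicity, and the function name is illustrative. Because the codes are ordered, a weighted kappa that penalizes a mild/severe mismatch less than a healthy/severe one is another candidate.

```python
from collections import Counter

MISSING = 9  # "missing or ambiguous" code


def cohen_kappa(ratings_a, ratings_b):
    """Return (n, agreements, kappa) for two raters, skipping code 9."""
    pairs = [(a, b) for a, b in zip(ratings_a, ratings_b)
             if a != MISSING and b != MISSING]
    n = len(pairs)
    if n == 0:
        raise ValueError("no complete pairs to compare")
    agreements = sum(a == b for a, b in pairs)
    observed = agreements / n
    count_a = Counter(a for a, _ in pairs)
    count_b = Counter(b for _, b in pairs)
    # Chance agreement: product of the two raters' marginal proportions.
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1:
        return n, agreements, float("nan")  # kappa undefined
    kappa = (observed - expected) / (1 - expected)
    return n, agreements, kappa


# Right joint: MRI and dissection diagnoses, one entry per head.
mri_right = [1, 9, 9, 0, 1, 1, 0, 0, 0, 1, 0, 1, 9, 0, 1, 2, 2, 1]
diss_right = [1, 0, 1, 1, 1, 1, 0, 1, 0, 0, 0, 1, 2, 0, 1, 1, 2, 1]

n, agreements, kappa = cohen_kappa(mri_right, diss_right)
print(f"{agreements} of {n} complete pairs agree; kappa = {kappa:.3f}")
```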

3) General question: for which pair is the agreement strongest: arthro/dissection or MRI/dissection? How does the agreement between arthro/MRI compare to these first two agreements?

4) Sensitivity is the probability that a technique diagnoses a pathology, given that the joint truly has it. Specificity is the probability that a technique reports no pathology, given that the joint truly does NOT have it. (The reverse conditionals, the probability of true pathology given a positive diagnosis and of no pathology given a negative diagnosis, are the positive and negative predictive values.) Estimate Sensitivity and Specificity for each technique. Can you rate which procedure is best? (It’s good to have high sensitivity and specificity.) Can you find confidence intervals for these estimates?
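
As a rough sketch of question 4, the code below treats dissection as the truth and uses the usual conventions: sensitivity conditions on the joint truly having a pathology, specificity on the joint truly being healthy, and the predictive values condition on the diagnosis instead. Several choices here are assumptions rather than anything the study prescribes: mild and severe are collapsed into "pathology present", pairs containing a 9 are dropped, only the right joint is used (which sidesteps the left/right independence issue), and the Wilson score interval stands in as one small-sample alternative to the normal approximation.

```python
from math import sqrt
from statistics import NormalDist

MISSING = 9  # "missing or ambiguous" code


def has_pathology(code):
    # Assumption: mild (1) and severe (2) both count as "pathology present".
    return code in (1, 2)


def confusion_counts(test_codes, truth_codes):
    """Return (tp, fp, tn, fn), skipping pairs that contain code 9."""
    tp = fp = tn = fn = 0
    for t, d in zip(test_codes, truth_codes):
        if t == MISSING or d == MISSING:
            continue
        if has_pathology(d):
            if has_pathology(t):
                tp += 1
            else:
                fn += 1
        elif has_pathology(t):
            fp += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def proportion(successes, n):
    return successes / n if n else float("nan")


def wilson_interval(successes, n, level=0.95):
    """Wilson score interval for a binomial proportion."""
    if n == 0:
        return float("nan"), float("nan")
    z = NormalDist().inv_cdf(0.5 + level / 2)
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return max(0.0, center - half), min(1.0, center + half)


# Right joint: arthroscopy and dissection diagnoses, one entry per head.
arthro_right = [1, 1, 1, 1, 1, 1, 0, 1, 0, 0, 1, 1, 2, 0, 1, 1, 2, 1]
diss_right = [1, 0, 1, 1, 1, 1, 0, 1, 0, 0, 0, 1, 2, 0, 1, 1, 2, 1]

tp, fp, tn, fn = confusion_counts(arthro_right, diss_right)
sensitivity = proportion(tp, tp + fn)  # P(diagnosed | truly diseased)
specificity = proportion(tn, tn + fp)  # P(cleared | truly healthy)
ppv = proportion(tp, tp + fp)          # P(truly diseased | diagnosed)
npv = proportion(tn, tn + fn)          # P(truly healthy | cleared)
sens_ci = wilson_interval(tp, tp + fn)
spec_ci = wilson_interval(tn, tn + fp)

print(f"sensitivity = {sensitivity:.3f}, 95% CI {sens_ci}")
print(f"specificity = {specificity:.3f}, 95% CI {spec_ci}")
print(f"PPV = {ppv:.3f}, NPV = {npv:.3f}")
```

With so few truly healthy joints in the denominator, the specificity interval is wide, which is worth keeping in mind before ranking the techniques.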

Issues

These data are fairly complex. First, they are ordinal, discrete data: roughly, categorical data whose categories (healthy, mild, severe) carry an order. However, it’s not clear where the “ambiguous” code fits into this hierarchy. Second, normal approximations do not apply, particularly with such a small sample size. (Perhaps the data could be coerced into a normal model, but only with care and skepticism.) Finally, if you decide that the left and right measurements are correlated, you are faced with an interesting challenge, because then the data must be treated as bivariate.

The Data

1 1 1 0 1 0
9 1 0 9 1 0
9 1 1 1 1 1
0 1 1 1 1 0
1 1 1 1 1 0
1 1 1 2 2 2
0 0 0 0 0 0
0 1 1 0 1 0
0 0 0 0 0 0
1 0 0 0 0 0
0 1 0 0 0 0
1 1 1 1 1 1
9 2 2 0 1 0
0 0 0 1 0 0
1 1 1 2 1 1
2 1 1 2 2 2
2 2 2 9 2 2
1 1 1 1 1 1