Accuracy is cheap, precision expensive – Data Engines Corporation

Dawid and Skene noted in their 1979 paper, which presented an unsupervised inference algorithm, that the data likelihood has degenerate maxima. The same degeneracy occurs in the unsupervised inference equations based on the frequencies of label voting patterns.

A simple example illustrates the degenerate solutions to the unsupervised inference problem. Consider three independent recognizers that are each 80% accurate on both labels of a two-label task. For ease of discussion, let us assume that the labels are equally prevalent in the data stream, $p(A)=50\%$ and $p(B)=50\%$.

The pattern $\{B,A,A\}$ occurs with frequency
$$\begin{split}
& p_1(B|A)\,p_2(A|A)\,p_3(A|A)\,p(A) + p_1(B|B)\,p_2(A|B)\,p_3(A|B)\,p(B) \\
&= 0.20 \times 0.80 \times 0.80 \times 0.5 + 0.80 \times 0.20 \times 0.20 \times 0.5 \\
&= 0.08
\end{split}$$
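The same frequency would be observed if the recognizers were all 20% accurate: replacing each accuracy with its complement makes the two terms of the sum trade places, $0.80 \times 0.20 \times 0.20 \times 0.5 + 0.20 \times 0.80 \times 0.80 \times 0.5 = 0.08$.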
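The degeneracy is easy to check numerically for every voting pattern, not just $\{B,A,A\}$. The following is a minimal sketch under the assumptions of the example above (independent recognizers, one shared accuracy on both labels, equal prevalence). The function name `pattern_frequencies` and its parameters are illustrative and not taken from any library.

```python
from itertools import product


def pattern_frequencies(accuracy, prevalence_a=0.5, n_recognizers=3):
    """Frequency of each voting pattern for independent recognizers
    that all share the same accuracy on both labels (an assumption
    taken from the example in the text)."""
    freqs = {}
    for pattern in product("AB", repeat=n_recognizers):
        total = 0.0
        # Sum over the two possible true labels, weighted by prevalence.
        for true_label, prior in (("A", prevalence_a), ("B", 1.0 - prevalence_a)):
            prob = prior
            for vote in pattern:
                prob *= accuracy if vote == true_label else 1.0 - accuracy
            total += prob
        freqs["".join(pattern)] = total
    return freqs


good = pattern_frequencies(0.80)  # recognizers that are 80% accurate
bad = pattern_frequencies(0.20)   # recognizers that are 20% accurate

# Frequency of the pattern {B, A, A} under each hypothesis.
print(round(good["BAA"], 6), round(bad["BAA"], 6))

# Both hypotheses predict the same frequency for every pattern.
print(all(abs(good[p] - bad[p]) < 1e-12 for p in good))
```

Both hypotheses give 0.08 for the pattern `BAA`, and the final check confirms that all eight pattern frequencies coincide, so the observed voting frequencies alone cannot distinguish the two solutions.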

I once gave a talk at a company on the subject of unsupervised inference, and an audience member raised exactly this objection: if degenerate solutions exist, how can one claim that unsupervised inference is completely possible? The answer is that it is not. Anybody claiming that such an algorithm exists would in effect be claiming that one can build a perfect detector.

The algorithms discussed here are useful and practical because in most situations we have prior knowledge about the quality of our recognizers. For example, one does not usually deploy recognizers that are worse than uniform guessers. A lot of research and development has gone into building generally good recognizers that seem to be robust to changes in the data stream. If that is the case, then one can deploy the algorithms discussed here and monitor the recognizers' operational swings.

Besides, knowing that the recognizers are either 80% or 20% accurate is a heck of a lot better than not knowing anything about their performance. This degeneracy is not too much of a problem because accuracy is cheap. Sporadic monitoring of the data stream would allow one to gauge which solution is correct. Since these two solutions are the only ones that explain the frequency of label voting patterns, a few hand-checked samples will quickly decide between them. This point about accuracy being cheap but precision being costly was made in our ICML 2008 paper on unsupervised inference for the regression task.
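To see how quickly a few samples settle the matter, here is a rough sketch that compares the two candidate accuracies using a handful of hand-checked outputs from one recognizer. It assumes the spot checks are independent and that both solutions are equally plausible beforehand. The function name `log_odds_high_vs_low` and its parameters are hypothetical, chosen only for this illustration.

```python
import math


def log_odds_high_vs_low(n_correct, n_checked, high=0.80, low=0.20):
    """Log-likelihood ratio favoring the `high` accuracy solution over
    the `low` one, given that n_correct of n_checked hand-labeled
    samples were classified correctly (spot checks assumed independent)."""
    n_wrong = n_checked - n_correct
    return (n_correct * math.log(high / low)
            + n_wrong * math.log((1.0 - high) / (1.0 - low)))


# Five spot checks, four of them correct.
odds = math.exp(log_odds_high_vs_low(n_correct=4, n_checked=5))
print(f"odds in favor of 80% accuracy: about {odds:.0f} to 1")
```

With these two candidate values, every correct spot check multiplies the odds in favor of the 80% solution by 4 and every incorrect one divides them by 4, so four correct answers out of five already give odds of roughly 64 to 1.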

An unsupervised inference algorithm together with light human monitoring is much cheaper and more precise than continual human monitoring.