Unsupervised inference for WER without any correct transcripts

One of the neat applications of having an algorithm for unsupervised inference in the labeling task is that you can estimate statistical quantities that are of interest to users of machine learning systems. One such statistic is the word error rate (WER) for sequential data. For example, you may be assessing the quality of DNA sequencers and want a single summary number, the word error rate (one minus the sequencer's average label accuracy over the data processed), rather than the conditional recognition probabilities for each of the labels.

Let us rephrase that last statement mathematically. For an $L$-label task, the WER of a recognizer combines the prevalence of the labels, $p(\ell)$, with the recognizer's conditional recognition probabilities, $p(\ell_r | \ell)$:
$$\text{WER} = \sum_{\ell \in \mathcal{L}} p(\ell) \left( \sum_{\ell_r \neq \ell} p(\ell_r | \ell) \right).$$
The average label accuracy would be one minus this expression.
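
To make the formula concrete, here is a minimal Python sketch that evaluates it for a toy 3-label task. The function name `word_error_rate`, the dictionary layout, and all the numbers are made up for illustration; in practice the prevalences and conditional probabilities would come from the unsupervised inference algorithm.

```python
def word_error_rate(prevalence, confusion):
    """Evaluate WER = sum_l p(l) * sum_{l_r != l} p(l_r | l).

    prevalence: dict mapping true label -> p(label)
    confusion:  dict mapping true label -> dict of recognized label -> p(recognized | true)
    (Both layouts are assumptions made for this sketch.)
    """
    wer = 0.0
    for true_label, p_true in prevalence.items():
        # Total probability of outputting anything other than the true label.
        error_given_true = sum(
            p for recognized, p in confusion[true_label].items()
            if recognized != true_label
        )
        wer += p_true * error_given_true
    return wer


# Toy numbers, invented for illustration only.
prevalence = {"A": 0.5, "C": 0.3, "G": 0.2}
confusion = {
    "A": {"A": 0.90, "C": 0.05, "G": 0.05},
    "C": {"A": 0.10, "C": 0.80, "G": 0.10},
    "G": {"A": 0.00, "C": 0.25, "G": 0.75},
}

wer = word_error_rate(prevalence, confusion)
accuracy = 1.0 - wer  # average label accuracy
print(f"WER = {wer:.3f}, average label accuracy = {accuracy:.3f}")
```

With these toy values the three labels contribute 0.05, 0.06 and 0.05, so the WER comes out at about 0.16.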

One can readily accommodate recognizers that make deletion and insertion errors by introducing a null label, denoted by $\mathcal{N}$. Deletion errors are associated with conditional probabilities of the form $p(\mathcal{N} | \ell \ne \mathcal{N})$: a real label was present but the recognizer output nothing. Insertion errors are associated with the conditional probabilities $p(\ell \ne \mathcal{N} | \mathcal{N})$: nothing was present but the recognizer output a label.
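
As a rough sketch of how the null label fits into the same sum, the Python snippet below splits the WER into substitution, deletion, and insertion contributions. The `NULL` marker, the function name `error_breakdown`, and the numbers are all hypothetical. Note also that this follows the formula above and sums over all $L+1$ labels, null included, which is an assumption about how you want to normalize.

```python
NULL = "N"  # hypothetical marker standing in for the null label


def error_breakdown(prevalence, confusion, null=NULL):
    """Split the WER sum into substitution, deletion and insertion parts.

    prevalence: dict mapping true label (including null) -> p(label)
    confusion:  dict mapping true label -> dict of recognized label -> p(recognized | true)
    """
    parts = {"substitution": 0.0, "deletion": 0.0, "insertion": 0.0}
    for true_label, p_true in prevalence.items():
        for recognized, p in confusion[true_label].items():
            if recognized == true_label:
                continue  # correct recognition, not an error
            if true_label == null:
                kind = "insertion"     # p(l != N | N)
            elif recognized == null:
                kind = "deletion"      # p(N | l != N)
            else:
                kind = "substitution"  # p(l_r | l), both real labels
            parts[kind] += p_true * p
    return parts


# Toy 2-label task plus the null label; numbers invented for illustration.
prevalence = {"A": 0.5, "C": 0.4, NULL: 0.1}
confusion = {
    "A": {"A": 0.90, "C": 0.06, NULL: 0.04},
    "C": {"A": 0.05, "C": 0.90, NULL: 0.05},
    NULL: {"A": 0.10, "C": 0.10, NULL: 0.80},
}

parts = error_breakdown(prevalence, confusion)
total_wer = sum(parts.values())
print(parts, total_wer)
```

The three parts add up to the same WER you would get from the single formula applied to the enlarged label set.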

By taking the outputs from a collection of sequencers (e.g. DNA sequencers, speech recognizers) and aligning them to each other, you turn the problem into an $(L+1)$-label task. Since we have an unsupervised inference algorithm for all the quantities needed to calculate the WER, you can then infer the WER of each of the transcripts.

I don’t know about you, but I think there is something magical about the idea that the average accuracy or WER of sequencers can be estimated even when we have no idea what the correct sequence is. This could be extremely useful in settings where one needs to monitor sequencers without human supervision.