Unsupervised inference for the ranking task

We want to take a detour on our discussions of the algorithms for unsupervised inference for the labeling task to point out a side benefit of this technology: the ability to measure the accuracy of rankers without any knowledge of the correct rank for the items!

Consider four ranking algorithms given the task of repeatedly ranking three items. The output of their ranking decisions can be represented as a table:

ranker 1 ranker 2 ranker 3 ranker 4 true rank
a,c,b b,c,a c,a,b c,b,a ?
f,e,d d,e,f f,e,d e,d,f ?
g,h,i h,g,i g,h,i g,h,i ?

We implicitly assume that this table extends downward for a large number of ranking decisions. The unsupervised inference problem for the ranking task is to estimate the performance of these rankers without any knowledge of the correct ranking for the items.

The way to solve this problem is to map it into an abstract labeling task using the permutation group. Select any of the rankers as your reference ranker. Replace its ranking decision with the identity element of the $S_n$ permutation group. In this case we have three items that are being repeatedly ranked so the group to consider is $S_3$. Let us arbitrarily pick the second ranker and transform the table accordingly:

ranker 1 ranker 2 ranker 3 ranker 4 true rank
a,c,b $\pi_1$ c,a,b c,b,a ?
f,e,d $\pi_1$ f,e,d e,d,f ?
g,h,i $\pi_1$ g,h,i g,h,i ?

where $\pi_1$ denotes the identity element for $S_n$.

We can continue in this manner for all the rankers. Using the ordering of the second ranker as the reference, we can replace another ranker’s ordering by the permutation group element that relates the two. This is always possible because the rank orderings form a group! There is always an operation that takes you from one ordering of three items to any other ranking of the same items.

ranker 1 ranker 2 ranker 3 ranker 4 true rank
$\pi_{\text{whatever}}$ $\pi_1$ $\pi_{\text{whatever}}$ $\pi_{\text{whatever}}$ ?
$\pi_{\text{whatever}}$ $\pi_1$ $\pi_{\text{whatever}}$ $\pi_{\text{whatever}}$ ?
$\pi_{\text{whatever}}$ $\pi_1$ $\pi_{\text{whatever}}$ $\pi_{\text{whatever}}$ ?

Where I have not bothered to fill out the table precisely but you get the idea. For each ranker other than the reference we can find the permutation element in $S_n$ that relates its ranking with that of the arbitrarily chosen ranker.

That is it. We are done. The unknown correct ranking would also have to be transformed into one of the elements of the permutation group. The fact that we don’t know the correct “label” does not matter! We have an unsupervised inference algorithm for the labeling task. By turning the ranking task to an abstract labeling problem involving the members of the permutation group, we can solve the problem.

In general, for the task of ranking $n$ items, you would have to solve a $n!$ label task. The prevalence of the labels are not really meaningful except as they allow us to calculate the overall accuracy of the rankers. The only difference is that you have to decide with how many rankers you are going to do it. Why?

The table of abstract labels that we have constructed has only one label type in the second column. Therefore, not all possible $(n!)^R$ labels voting patterns show up on that table. The elegant way to deal with this is to throw out that column. Then you have mapped the $n$-item ranking task with $R$ rankers to the labeling task with $n!$ labels using $R-1$ labelers.