Unsupervised inference for graph statistics

I’ve been reading about Machine Learning Reductions and trying to understand their similarities with the unsupervised inference transformations we use. One example is turning an inference problem on rankings into one on labels, as detailed in this post. The beauty of such transformations is that if you can turn an unsupervised inference problem into one involving labels, you can solve it. And given the generality of the concept of labels, you can solve a lot of problems. In this post we detail another application of this concept: how to carry out unsupervised inference for algorithms that return graphs.

Suppose you wanted to know how good those graphs were, but you had no ground truth to compare them against. If you settled for a statistic of the graphs, you could still do it. For example, let us ask how accurately the algorithms lay down a correct connection between two nodes. That corresponds to labeling node pairs with one of two labels: edge present in the graph, or edge absent from the graph. For every data instance on which the algorithms output a graph, you would have a collection of node pairs, each given one label or the other by each algorithm. In other words, you would have decision patterns for a binary labeling task. Modeling the pairwise correlations between the annotators' decisions takes at least 6 annotators.
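To make the reduction concrete, here is a minimal sketch in Python of turning graph outputs into binary decision patterns over node pairs. The function name and the toy graphs are illustrative assumptions, not part of the original post; each algorithm's graph simply "votes" present or absent on every pair.

```python
from itertools import combinations

def graphs_to_binary_labels(nodes, graphs):
    """Reduce graph outputs to a binary labeling task.

    Each graph is a set of undirected edges, represented as frozensets
    of two nodes. For every node pair, each algorithm's graph votes
    1 (edge present) or 0 (edge absent), yielding the decision
    patterns of the binary label task described above.
    """
    patterns = {}
    for pair in combinations(sorted(nodes), 2):
        edge = frozenset(pair)
        patterns[pair] = tuple(1 if edge in g else 0 for g in graphs)
    return patterns

# Hypothetical example: three algorithms annotate one 4-node instance.
nodes = {"a", "b", "c", "d"}
g1 = {frozenset(p) for p in [("a", "b"), ("b", "c")]}
g2 = {frozenset(p) for p in [("a", "b"), ("c", "d")]}
g3 = {frozenset(p) for p in [("a", "b"), ("b", "c"), ("c", "d")]}

patterns = graphs_to_binary_labels(nodes, [g1, g2, g3])
# All three algorithms agree the ("a", "b") edge is present,
# so that pair's decision pattern is (1, 1, 1).
```

With the pairs labeled this way, any unsupervised method for evaluating binary annotators can be applied directly to estimate each algorithm's edge accuracy.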