What is this?
You give us the decisions of your four binary classifiers. We tell you whether all of them are "healthy" (greater than 50% accurate on each label) or whether one of them is "sick." We can detect at most one sick classifier; its two modes of failure are explained below. We do this without knowing the true labels for your data. Because our method does not need the true labels, you can run the same check on your own unlabeled data.
How do I use this app?
Since we don't use the true labels for your data, just give us the number of times your four classifiers produced each of the 2^4 = 16 possible voting patterns (see form). For our method to work, every voting pattern must appear at least once. We recommend that you submit vote counts for a dataset of 10,000 items or more.
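If your classifier outputs are stored per item, the 16 vote counts can be tallied with a few lines of code. The sketch below is only an illustration: the function names, the 0/1 encoding of the two labels, and the tuple ordering are assumptions of this example, not requirements of the form.

```python
from collections import Counter
from itertools import product


def count_voting_patterns(decisions):
    """Tally how often each of the 16 four-classifier voting patterns occurs.

    `decisions` is an iterable with one entry per data item. Each entry holds
    the four classifiers' votes, encoded as 0 or 1 (an assumed encoding).
    """
    counts = Counter()
    for votes in decisions:
        pattern = tuple(votes)
        if len(pattern) != 4 or any(v not in (0, 1) for v in pattern):
            raise ValueError(f"expected four 0/1 votes, got {pattern!r}")
        counts[pattern] += 1
    # Report all 16 patterns, including any that never occurred (count 0).
    return {p: counts.get(p, 0) for p in product((0, 1), repeat=4)}


def all_patterns_present(pattern_counts):
    """Check the stated requirement: every pattern appears at least once."""
    return all(count > 0 for count in pattern_counts.values())
```

For example, `count_voting_patterns([(0, 1, 0, 1), (1, 1, 1, 1)])` returns a dictionary with 16 keys, and `all_patterns_present` on that result is `False` because most patterns have a count of zero.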
What do the answers mean?
- Healthy classifiers are those that are greater than 50% accurate on each label.
- Sick classifiers can be of two types:
  - Inverted classifiers are less than 50% accurate on both labels.
  - Skewed ones are less than 50% accurate on one label, but greater than 50% on the other.
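The definitions above can be written as a small decision rule. This sketch only restates the definitions; it assumes you already know a classifier's accuracy on each label (for instance, from a labeled test set), which is not how the app itself works. The function name `diagnose` and the "undefined" result for an accuracy of exactly 50% are choices made for this example.

```python
def diagnose(acc_label_a, acc_label_b):
    """Name a classifier's state from its accuracy on each of the two labels.

    Accuracies are fractions between 0 and 1.
    """
    for acc in (acc_label_a, acc_label_b):
        if not 0.0 <= acc <= 1.0:
            raise ValueError(f"accuracy must be in [0, 1], got {acc}")

    if acc_label_a > 0.5 and acc_label_b > 0.5:
        return "healthy"
    if acc_label_a < 0.5 and acc_label_b < 0.5:
        return "inverted"
    if (acc_label_a < 0.5 < acc_label_b) or (acc_label_b < 0.5 < acc_label_a):
        return "skewed"
    # An accuracy of exactly 50% is not covered by the definitions above.
    return "undefined"
```

Under these definitions, `diagnose(0.8, 0.7)` is healthy, `diagnose(0.3, 0.4)` is inverted, and `diagnose(0.9, 0.2)` is skewed.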
How is this useful?
Detecting a sick classifier can serve as a "warning light." When a sick classifier is identified, you can decide whether to retire it from production or flip its decisions. For example, if a classifier is inverted, flipping its decisions will immediately improve the performance of your ensemble. (See The Costanza Method.) If a classifier is skewed, whether to flip its decisions depends on the bias you want. You may be happy letting in mud if it means not letting gold slip through your fingers.
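To see why flipping helps an inverted classifier, note that for a binary classifier, flipping every decision turns an accuracy of p on a label into 1 - p on that label. The sketch below checks this on a tiny hand-made dataset. It uses true labels purely for illustration, and the helper names and the 0/1 label encoding are assumptions of this example.

```python
def per_label_accuracy(decisions, labels):
    """Return {label: accuracy} for labels 0 and 1.

    Assumes both labels occur at least once in `labels`.
    """
    accuracy = {}
    for label in (0, 1):
        votes = [d for d, y in zip(decisions, labels) if y == label]
        accuracy[label] = sum(v == label for v in votes) / len(votes)
    return accuracy


def flip(decisions):
    """Flip every binary decision (0 becomes 1, 1 becomes 0)."""
    return [1 - d for d in decisions]


# Toy data: four items of each label, and an inverted classifier that gets
# only one item right per label.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
inverted = [1, 1, 1, 0, 0, 0, 0, 1]

before = per_label_accuracy(inverted, labels)       # 25% on each label
after = per_label_accuracy(flip(inverted), labels)  # 75% on each label
```

The same arithmetic shows why the skewed case is a judgment call: flipping a classifier that is 90% accurate on one label and 20% on the other gives 10% and 80%, which is still skewed, only in the opposite direction.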
Do you think we can help you?
Contact us. We welcome questions and feedback.
Future versions of our algorithms will provide the actual accuracy values of each classifier, which will enable you to determine the best way to deploy them.