Confident Learning (CL)

Introduction

Confident Learning (CL) is a subfield of supervised learning and weak supervision aimed at characterizing label noise, finding label errors, learning with noisy labels, and identifying ontological issues in datasets. CL is based on three principles: pruning noisy data, counting to estimate noise, and ranking examples so that training proceeds with confidence. It generalizes Angluin and Laird's classification noise process to directly estimate the joint distribution between the given (noisy) labels and the unknown (true) labels.

CL requires only two inputs: out-of-sample predicted probabilities for each example, and the noisy labels themselves. Weak supervision with CL then proceeds in three steps: estimating the joint distribution of given and true labels, pruning the examples identified as noisy, and re-weighting the remaining examples by the estimated class prior.
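The counting step can be sketched in a few lines of NumPy. This is an illustrative implementation, not the API of any particular library: the function names are made up, and it assumes the standard CL thresholding rule in which each class's threshold is the average predicted probability of that class among examples given that label. Examples that confidently belong to a class other than their given label populate the off-diagonal counts and are flagged as likely label errors.

```python
import numpy as np

def confident_joint(pred_probs, labels):
    """Estimate the confident joint C, where C[i, j] counts examples
    with given label i that confidently belong to class j.
    Assumes every class appears at least once in `labels`."""
    n, k = pred_probs.shape
    # Per-class threshold: average self-confidence of examples labeled j.
    thresholds = np.array(
        [pred_probs[labels == j, j].mean() for j in range(k)]
    )
    C = np.zeros((k, k), dtype=int)
    issues = []  # indices of likely label errors (off-diagonal examples)
    for idx, (x, i) in enumerate(zip(pred_probs, labels)):
        passing = np.flatnonzero(x >= thresholds)  # classes over threshold
        if passing.size:
            j = passing[np.argmax(x[passing])]     # most confident of those
            C[i, j] += 1
            if j != i:
                issues.append(idx)
    return C, issues

def joint_from_counts(C, labels):
    """Calibrate the counts so each row matches the observed frequency
    of its given label, then normalize into an estimated joint
    distribution Q[i, j] = p(given label = i, true label = j)."""
    k = C.shape[0]
    label_counts = np.bincount(labels, minlength=k).astype(float)
    row_sums = C.sum(axis=1, keepdims=True).astype(float)
    C_cal = np.divide(
        C * label_counts[:, None], row_sums,
        out=np.zeros((k, k)), where=row_sums > 0,
    )
    return C_cal / C_cal.sum()
```

On a toy binary problem where one example labeled 0 has a confident predicted probability of 0.95 for class 1, that example lands in `C[0, 1]` and is returned in `issues`; pruning those indices and re-weighting the rest by the priors derived from `Q` completes the three steps above.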