Robust active learning strategies for model variability - Naver Labs Europe

Abstract

Active learning methods are useful when only a limited budget for data labelling is available. However, the most widely used method, uncertainty sampling, can depend excessively on the model learned during data acquisition.
This dependence yields datasets that are suboptimal for training models very different from the one used during data creation.
In this paper, we link this problem to the tendency of uncertainty sampling to select outliers, and show that methods favouring the selection of representative samples are more robust to changes in the model.
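
To make the contrast concrete, the following is a minimal sketch in Python, not the paper's implementation. It assumes a least-confidence criterion for uncertainty sampling and, as one simple stand-in for representative sampling, a density criterion that prefers points close to the rest of the pool. The function names, the toy pool and the predicted probabilities are all hypothetical and chosen only for illustration.

```python
import math


def least_confidence_query(probs, k):
    """Uncertainty sampling (least-confidence variant, assumed here):
    return the indices of the k pool items whose top class probability is lowest."""
    scores = [1.0 - max(p) for p in probs]
    order = sorted(range(len(probs)), key=lambda i: scores[i], reverse=True)
    return order[:k]


def density_query(points, k):
    """Representative sampling (one simple density-based variant, assumed here):
    return the indices of the k points with the smallest mean distance to the rest."""
    def mean_dist(i):
        others = [q for j, q in enumerate(points) if j != i]
        return sum(math.dist(points[i], q) for q in others) / len(others)

    order = sorted(range(len(points)), key=mean_dist)
    return order[:k]


# Hypothetical toy pool: four clustered points and one outlier at (5, 5).
pool = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.0)]
# Hypothetical model outputs: the model is least sure about the outlier.
probs = [(0.9, 0.1), (0.8, 0.2), (0.85, 0.15), (0.7, 0.3), (0.5, 0.5)]

uncertain = least_confidence_query(probs, 1)
representative = density_query(pool, 1)
print("uncertainty sampling picks index:", uncertain)
print("density-based sampling picks index:", representative)
```

In this toy setting the uncertainty criterion selects the outlier, because its choice depends entirely on the current model's probabilities, while the density criterion selects a point inside the cluster and does not consult the model at all.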