SuperLoss: a generic loss for robust curriculum learning - Naver Labs Europe


Recent works have shown that curriculum learning can be formulated dynamically in a self-supervised manner and thus effectively applied to deep learning. The key idea is to somehow estimate the importance (or weight) of each sample directly during training based on the observation that easy and hard samples show different patterns in their respective losses.
However, existing works are usually limited to a specific task (e.g. classification) and come at the cost of extra data annotations, extra learnable parameters, extra layers or loss functions, or a complex and onerous training procedure.
In this paper, we propose instead a novel loss function which simply needs to be plugged on top of the original task loss, hence the name SuperLoss.
In contrast to existing work, our proposed framework: (a) is generic and can be applied to a variety of losses and tasks, and (b) does not require extra learnable parameters, annotations, or layers, nor any changes to the learning framework.
Its main effect is to automatically downweight the contribution of samples that incur a large loss, i.e. hard samples, effectively mimicking the core principle of curriculum learning without the need to modify the sampling strategy. As a side effect, we show that our loss prevents the memorization of noisy samples, making it possible to train from noisy data with non-robust loss functions.
Experimental results of the SuperLoss applied to image classification, image retrieval and object detection demonstrate consistent gain on multiple tasks and datasets, in particular when they are noisy.
