Learning with label noise for image retrieval by selecting interactions



Learning with noisy labels is an active research area for image classification, but the effect of noisy labels on image retrieval has been studied less. In this work, we propose a noise-resistant method for image retrieval named Teacher-based Selection of Interactions (T-SINT). T-SINT identifies noisy interactions, i.e. elements in the distance matrix, and selects the correct positive and negative interactions to be considered in the retrieval loss, using a teacher-based training setup that contributes to stability. As a result, it consistently outperforms state-of-the-art methods at high noise rates across benchmark datasets with both synthetic and more realistic noise.

Image Retrieval Pipeline

  • Model f extracts descriptors.
  • A distance function computes a pairwise distance matrix from the descriptors.
  • Loss: Contrastive Margin Loss.
  • Positive interactions between clean samples are likely to have a small distance value.
  • For noisy interactions, the distance value will be larger.
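The pipeline above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function names, the Euclidean distance on L2-normalised descriptors, the squared form of the loss, the margin value and the optional `mask` argument are all assumptions made for the example.

```python
import numpy as np

def pairwise_distances(desc):
    """Euclidean distance matrix between L2-normalised descriptors (one per row)."""
    desc = desc / np.linalg.norm(desc, axis=1, keepdims=True)
    sq = np.sum(desc ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * desc @ desc.T
    return np.sqrt(np.maximum(d2, 0.0))  # clip tiny negatives from rounding

def contrastive_margin_loss(dist, labels, margin=0.5, mask=None):
    """Assumed loss form: pull positives together, push negatives past `margin`."""
    same = labels[:, None] == labels[None, :]
    off_diag = ~np.eye(len(labels), dtype=bool)
    pos = same & off_diag          # positive interactions (same label, i != j)
    neg = ~same                    # negative interactions (different labels)
    if mask is None:               # by default every interaction is kept
        mask = np.ones(dist.shape, dtype=bool)
    pos_term = (dist ** 2)[pos & mask]
    neg_term = (np.maximum(margin - dist, 0.0) ** 2)[neg & mask]
    n = pos_term.size + neg_term.size
    return (pos_term.sum() + neg_term.sum()) / max(n, 1)

# Three descriptors: the first two are identical, the third is orthogonal.
desc = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
dist = pairwise_distances(desc)
clean = contrastive_margin_loss(dist, np.array([0, 0, 1]))
noisy = contrastive_margin_loss(dist, np.array([0, 1, 0]))  # labels 1 and 2 swapped
print(clean, noisy)
```

With the clean labels the loss is zero, while the swapped labels create a positive interaction with a large distance and a negative interaction with a small one, both of which are penalised. A selection mask is meant to filter out exactly such interactions.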

Teacher-based Approach

  • Inspired by knowledge distillation (Mean Teacher).
  • Teacher trained on open-domain images: ViT backbone of CLIP model.
  • Used to estimate the distributions of positive and negative interactions.
  • A cutting value determines which percentile of interactions to keep and which to discard.
  • The resulting mask excludes noisy interactions from the loss function.
  • The teacher model is updated as an exponential moving average of the parameters of the main model.
  • The cutting value is updated with a moving average.
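The selection and update steps listed above could look roughly like the following NumPy sketch. It is an illustration under stated assumptions rather than the published procedure: the helper names, the use of separate cutting values for positives and negatives, the percentile settings and the momentum value are all hypothetical, and the batch is assumed to contain at least one positive and one negative pair.

```python
import numpy as np

def ema(old, new, momentum=0.99):
    """Exponential moving average of a scalar or an array."""
    return momentum * old + (1.0 - momentum) * new

def ema_update_params(teacher, student, momentum=0.99):
    """Move each teacher parameter towards the matching student parameter."""
    return {k: ema(teacher[k], student[k], momentum) for k in teacher}

def batch_cuts(teacher_dist, labels, pos_pct=50.0, neg_pct=50.0):
    """Per-batch cutting values taken from the teacher's distance distributions."""
    same = labels[:, None] == labels[None, :]
    off_diag = ~np.eye(len(labels), dtype=bool)
    pos_d = teacher_dist[same & off_diag]
    neg_d = teacher_dist[~same]
    # Keep the closest pos_pct % of positives and the farthest neg_pct % of negatives.
    return np.percentile(pos_d, pos_pct), np.percentile(neg_d, 100.0 - neg_pct)

def selection_mask(teacher_dist, labels, pos_cut, neg_cut):
    """Boolean mask: True for interactions that stay in the loss."""
    same = labels[:, None] == labels[None, :]
    off_diag = ~np.eye(len(labels), dtype=bool)
    keep_pos = same & off_diag & (teacher_dist <= pos_cut)
    keep_neg = ~same & (teacher_dist >= neg_cut)
    return keep_pos | keep_neg

# Teacher distances for four samples; sample 2 carries a wrong label (0 instead of 1).
s = 2 ** 0.5
dist = np.array([[0, 0, s, s], [0, 0, s, s], [s, s, 0, 0], [s, s, 0, 0]])
labels = np.array([0, 0, 0, 1])
pos_cut, neg_cut = batch_cuts(dist, labels, pos_pct=30.0, neg_pct=70.0)
mask = selection_mask(dist, labels, pos_cut, neg_cut)
print(mask.astype(int))
```

In a training loop, the per-batch cutting values would be smoothed with `ema` before being passed to `selection_mask`, and `ema_update_params` would be applied to the teacher after each step of the main model. In this toy batch, the mask drops every interaction involving the mislabelled sample 2.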

Take-home Messages:

T-SINT works on both realistic and simulated label noise.
T-SINT outperforms the state of the art at low and mid levels of noise (up to 50% uniform noise).
T-SINT is the only image retrieval method robust to high levels of label noise (70% uniform noise).


Paper accepted at Winter Conference on Applications of Computer Vision (WACV) 2022


author = {Ibrahimi, Sarah and Sors, Arnaud and Rezende, Rafael S. and Clinchant, St\'{e}phane},
title = {Learning with Label Noise for Image Retrieval by Selecting Interactions},
booktitle = {WACV},
year = {2022}
