No reason for no supervision: improved generalization in supervised models

Published by Diane Larlus on 1 May 2023

Mert Bulent Sariyildiz, Yannis Kalantidis, Karteek Alahari, Diane Larlus

International Conference on Learning Representations (ICLR), Kigali, Rwanda, 1–5 May, 2023

We consider the problem of training a deep neural network on a given classification task, e.g., ImageNet-1K (IN1K), so that it excels at both the training task and other (future) transfer tasks. These two seemingly contradictory properties impose a trade-off between improving the model’s generalization and maintaining its performance on the original task. Models trained with self-supervised learning tend to generalize better than their supervised counterparts for transfer learning; yet, they still lag behind supervised models on IN1K. In this paper, we propose a supervised learning setup that leverages the best of both worlds. We extensively analyze supervised training using multi-scale crops for data augmentation and an expendable projector head, and reveal that the design of the projector allows us to control the trade-off between performance on the training task and transferability. We further replace the last layer of class weights with class prototypes computed on the fly using a memory bank and derive two models: t-ReX, which achieves a new state of the art for transfer learning and outperforms top methods such as DINO and PAWS on IN1K, and t-ReX*, which matches the highly optimized RSB-A1 model on IN1K while performing better on transfer tasks.
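To make the idea of class prototypes computed on the fly from a memory bank more concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the `PrototypeMemory` class, the FIFO eviction policy, the use of a per-class mean of L2-normalized features, the cosine-similarity scoring, and the `temperature=0.1` default are all assumptions made for illustration.

```python
import math
from collections import deque


def l2_normalize(vec):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


class PrototypeMemory:
    """FIFO memory of (feature, label) pairs; hypothetical helper for illustration."""

    def __init__(self, capacity):
        # Assumption: once the bank is full, the oldest entries are dropped first.
        self.bank = deque(maxlen=capacity)

    def add(self, feature, label):
        self.bank.append((l2_normalize(feature), label))

    def prototypes(self):
        """Return {label: normalized mean of the stored features of that class}."""
        sums, counts = {}, {}
        for feat, label in self.bank:
            if label not in sums:
                sums[label] = [0.0] * len(feat)
                counts[label] = 0
            sums[label] = [s + f for s, f in zip(sums[label], feat)]
            counts[label] += 1
        return {c: l2_normalize([s / counts[c] for s in sums[c]]) for c in sums}

    def logits(self, feature, temperature=0.1):
        """Cosine similarity to each prototype, divided by an assumed temperature."""
        query = l2_normalize(feature)
        return {
            c: sum(q * p for q, p in zip(query, proto)) / temperature
            for c, proto in self.prototypes().items()
        }


# Toy 2-D features with made-up labels, purely for illustration.
memory = PrototypeMemory(capacity=4)
memory.add([1.0, 0.0], "cat")
memory.add([0.9, 0.1], "cat")
memory.add([0.0, 1.0], "dog")
scores = memory.logits([0.8, 0.2])
predicted = max(scores, key=scores.get)
```

In this sketch the prototypes take the place of a learned weight matrix in the last layer: the scores returned by `logits` could be fed to a softmax cross-entropy loss just as ordinary classifier outputs would be.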

ImageNet-1K (IN1K) vs transfer task performance for ResNet50. We report IN1K (Top-1 accuracy) and transfer performance (log odds) averaged over 13 datasets (the 5 ImageNet-CoG concept generalization datasets, Aircraft, Cars196, DTD, EuroSAT, Flowers, Pets, Food101 and SUN397) for a large number of our models trained with the supervised training setup we propose. Models on the convex hull are denoted by stars. We compare to public state-of-the-art (SotA) models: the supervised RSB-A1 and SupCon models, the self-supervised DINO, the semi-supervised PAWS, and a variant of LOOK using multi-crop.
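As a rough illustration of how a transfer score expressed in log odds and averaged over datasets could be computed, the sketch below assumes the standard logit transform, log(p / (1 - p)), applied to each dataset's accuracy before averaging. The exact formulation used for the figure is not stated on this page, and the function names and accuracy values are hypothetical.

```python
import math


def log_odds(accuracy):
    """Map an accuracy in (0, 1) to log(p / (1 - p))."""
    if not 0.0 < accuracy < 1.0:
        raise ValueError("accuracy must be strictly between 0 and 1")
    return math.log(accuracy / (1.0 - accuracy))


def mean_log_odds(accuracies):
    """Average the per-dataset log odds."""
    values = [log_odds(a) for a in accuracies]
    return sum(values) / len(values)


# Made-up accuracies, for illustration only; these are not results from the paper.
example_accuracies = [0.5, 0.8, 0.9]
score = mean_log_odds(example_accuracies)
```

Averaging in log-odds space rather than in raw accuracy gives more weight to gains on datasets where accuracy is already high, which is one common motivation for reporting transfer results this way.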

@inproceedings{sariyildiz2023improving,
title={No Reason for No Supervision: Improved Generalization in Supervised Models},
author={Sariyildiz, Mert Bulent and Kalantidis, Yannis and Alahari, Karteek and Larlus, Diane},
booktitle={International Conference on Learning Representations},
year={2023}
}
