Interaction losses [1] are defined between data examples rather than on a single example, as in classification or regression. They are becoming ubiquitous in machine learning, especially in text and image retrieval [2, 3] and self-supervised pre-training [4], and are starting to be considered in the context of unsupervised learning. Unfortunately, such losses are known to be difficult to optimize [5] because the jointly optimal setting of the variables that control training (batch size, learning rate, class-balanced sampling, and whether to use some or all interactions) usually does not scale in the same way as it does for single-example losses. Various strategies already exist to deal with this challenge [6, 7, 8]. The 6-month internship will be organized in two main parts. First, we will extend our existing benchmark of these approaches. Second, we will develop and evaluate extensions of them.
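To make the notion of an interaction loss concrete, here is a minimal sketch of a pairwise contrastive loss in the spirit of Hadsell et al., written in plain Python. The function name `contrastive_loss`, the default `margin` and the averaging over all pairs in the batch are assumptions made for illustration; this is not the benchmark code used in the internship.

```python
import math
from itertools import combinations


def contrastive_loss(embeddings, labels, margin=1.0):
    """Mean pairwise contrastive loss over all unordered pairs in a batch.

    Hypothetical helper for illustration: `embeddings` is a sequence of
    equal-length coordinate tuples, `labels` a sequence of class labels.
    """
    losses = []
    # The loss is defined on pairs (i, j), not on individual examples.
    for i, j in combinations(range(len(embeddings)), 2):
        dist = math.dist(embeddings[i], embeddings[j])
        if labels[i] == labels[j]:
            # Similar pair: pull the two embeddings together.
            losses.append(0.5 * dist ** 2)
        else:
            # Dissimilar pair: push apart until the margin is reached.
            losses.append(0.5 * max(0.0, margin - dist) ** 2)
    return sum(losses) / len(losses)


# Two examples of the same class at distance 1 give a loss of 0.5.
print(contrastive_loss([(0.0, 0.0), (1.0, 0.0)], [0, 0]))
```

A batch of n examples yields n(n-1)/2 pairs, so the number of loss terms grows quadratically with the batch size. This is one reason why batch size and the choice of using part or all of the interactions matter more here than for single-example losses.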
[1] Hadsell, R., Chopra, S., & LeCun, Y. (2006). Dimensionality reduction by learning an invariant mapping. In 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06) (Vol. 2, pp. 1735-1742). IEEE.
[2] Liu, T. Y. (2011). Learning to rank for information retrieval. Springer Science & Business Media.
[3] Gordo, A., Almazán, J., Revaud, J., & Larlus, D. (2016). Deep image retrieval: Learning global representations for image search. In European Conference on Computer Vision (pp. 241-257). Springer, Cham.
[4] Chen, T., Kornblith, S., Norouzi, M., & Hinton, G. (2020). A simple framework for contrastive learning of visual representations. arXiv preprint arXiv:2002.05709.
[5] Wu, C. Y., Manmatha, R., Smola, A. J., & Krahenbuhl, P. (2017). Sampling matters in deep embedding learning. In Proceedings of the IEEE International Conference on Computer Vision (pp. 2840-2848).
[6] Schroff, F., Kalenichenko, D., & Philbin, J. (2015). FaceNet: A unified embedding for face recognition and clustering. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 815-823).
[7] He, K., Fan, H., Wu, Y., Xie, S., & Girshick, R. (2020). Momentum contrast for unsupervised visual representation learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 9729-9738).
[8] Wang, X., Zhang, H., Huang, W., & Scott, M. R. (2020). Cross-batch memory for embedding learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 6388-6397).
NAVER LABS Europe has full-time positions as well as PhD and PostDoc opportunities throughout the year, which are advertised here and on international conference sites that we sponsor, such as CVPR, ICCV, ICML, NeurIPS, EMNLP, etc.
NAVER LABS Europe is an equal opportunity employer.
NAVER LABS Europe is located in Grenoble, in the French Alps. We have a multi- and interdisciplinary approach to research, with scientists in machine learning, computer vision, artificial intelligence, natural language processing, ethnography and UX working together to create next-generation ambient intelligence technology and services that deeply understand users and their contexts.