Learning to rank images with cross-modal graph convolutions

Published by Thibault Formal on 14 April 2020

Thibault Formal, Stéphane Clinchant, Jean-Michel Renders, Sooyeol Lee, Geun Hee Cho

European Conference on Information Retrieval (ECIR), Lisbon, Portugal (virtual event), 14-17 April, 2020

Abstract

We are interested in the problem of cross-modal retrieval for web image search, where the goal is to retrieve images relevant to a text query. While most current approaches to cross-modal retrieval revolve around learning how to represent text and images in a shared latent space, we take a different direction: we propose to generalize the cross-modal relevance feedback mechanism, a simple yet effective unsupervised method that relies on standard information retrieval heuristics and the choice of a few hyper-parameters. We show that this mechanism can be cast as a supervised representation learning problem on graphs, using graph convolutions that operate jointly over text and image features, which we call cross-modal graph convolutions. The proposed architecture directly learns how to combine image and text features for the ranking task, while taking into account the context given by all the other elements in the set of images to be (re-)ranked. We validate our approach on two datasets: a public dataset from a MediaEval challenge, and a small sample of proprietary image search query logs, referred to as WebQ. Our experiments demonstrate that our model improves over standard baselines.
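
To make the starting point concrete, here is a minimal sketch of the kind of unsupervised cross-modal relevance feedback heuristic the abstract refers to. It is an illustration under stated assumptions, not the formulation used in the paper: the function names, the cosine similarity, the additive update, and the hyper-parameters `k` (number of visual neighbours) and `alpha` (propagation weight) are all hypothetical choices. Each image starts with a text-based relevance score, and part of the score of its visually nearest neighbours is propagated to it. The learned graph-convolution model described above is not shown.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def cross_modal_feedback(text_scores, image_features, k=1, alpha=0.5):
    """Hypothetical heuristic: add to each image's text score a weighted
    sum of the text scores of its k most visually similar images."""
    n = len(text_scores)
    new_scores = []
    for i in range(n):
        # Visual similarity of image i to every other image in the set.
        sims = [(cosine(image_features[i], image_features[j]), j)
                for j in range(n) if j != i]
        sims.sort(reverse=True)
        # Propagate text relevance from the k nearest visual neighbours.
        propagated = sum(s * text_scores[j] for s, j in sims[:k])
        new_scores.append(text_scores[i] + alpha * propagated)
    return new_scores

# Toy data (made up): image 1 has a weak text score but looks like image 0.
text_scores = [0.9, 0.1, 0.3, 0.2]
image_features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]

reranked = cross_modal_feedback(text_scores, image_features, k=1, alpha=0.5)
ranking = sorted(range(len(reranked)), key=lambda i: reranked[i], reverse=True)
print(ranking)
```

In this toy example, image 1 is ranked last by its text score alone but moves up once the score of its visual neighbour, image 0, is propagated to it. The values of `k` and `alpha` are set by hand here; the abstract describes replacing such hand-tuned choices with a model trained for the ranking task.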
