An Empirical Study of Fusion Operators for Multi-Modal Image Retrieval

Published by NAVER LABS Europe on 7 April 2013

10th Workshop on Content-Based Multimedia Indexing, Annecy, France, June 27-29, 2012.

In this paper we present an empirical study of late fusion operators for multimodal image retrieval. We consider two experts, one based on textual and one on visual similarities between documents, and study the possibilities to go beyond simple score averaging. The main idea is to exploit the correlation between the two experts by encoding, explicitly or implicitly, an "and" and an "or" operator in an efficient way. We show through several experiments that operators combining both of these aspects generally outperform those that consider only one of them. Based on this observation, we propose several generalized versions of the most classical fusion operators and compare them on ImageCLEF benchmark datasets, in both an unsupervised and a supervised framework.
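As a rough illustration of what late fusion of two experts can look like, the sketch below combines a textual and a visual score vector with three generic operators: score averaging, a product (an "and"-like behaviour, high only when both experts agree) and a maximum (an "or"-like behaviour, high when either expert is confident). The function names, the min-max normalization and the toy scores are assumptions made for this example; they are not the operators or data studied in the paper.

```python
import numpy as np


def min_max_normalize(scores):
    """Rescale scores to [0, 1] so the two experts are comparable (assumed choice)."""
    scores = np.asarray(scores, dtype=float)
    span = scores.max() - scores.min()
    if span == 0:
        return np.zeros_like(scores)
    return (scores - scores.min()) / span


def fuse_average(text, visual):
    """Baseline: simple score averaging."""
    return 0.5 * (text + visual)


def fuse_and(text, visual):
    """'And'-like fusion: the product is high only if both experts score high."""
    return text * visual


def fuse_or(text, visual):
    """'Or'-like fusion: the maximum is high if at least one expert scores high."""
    return np.maximum(text, visual)


# Hypothetical similarity scores of four documents for one query.
text_scores = min_max_normalize([0.2, 0.9, 0.5, 0.1])
visual_scores = min_max_normalize([0.8, 0.7, 0.1, 0.3])

for name, fuse in [("average", fuse_average), ("and", fuse_and), ("or", fuse_or)]:
    fused = fuse(text_scores, visual_scores)
    ranking = np.argsort(-fused)  # document indices, best first
    print(name, np.round(fused, 3), ranking)
```

For scores in [0, 1], the product never exceeds the average and the average never exceeds the maximum, so the three operators differ in how strongly they reward agreement between the experts.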
