Semantic Combination of Textual and Visual Information in Multimedia Retrieval

Published by NAVER LABS Europe on 7 April 2013

Julien Ah-Pine, Stéphane Clinchant, Gabriela Csurka

ICMR - International Conference on Multimedia Retrieval (ACM) - Trento, Italy - April 17-20, 2011

This paper introduces a set of techniques, which we call semantic combination, for efficiently fusing text and image retrieval systems in the context of multimedia information access. These techniques stem from the observation that image and textual queries are expressed at different semantic levels and that a single image query is often ambiguous. Overall, the semantic combination techniques overcome a conceptual barrier rather than a technical one: they can be seen as a combination of late fusion and image reranking. Although simple, this approach has not been used before. We assess the proposed techniques against late and cross-media fusion on four different ImageCLEF datasets. Compared to late fusion, performance increases significantly on two datasets and remains similar on the other two.
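
As a rough illustration of what "a combination of late fusion and image reranking" could look like, the sketch below first restricts the image scores to the documents ranked highest by the text system (the reranking step) and then linearly combines them with the text scores (the late fusion step). The function names, the min-max normalization, the parameters `k` and `alpha`, and the toy scores are all assumptions made for this example; they are not taken from the paper, whose exact formulation may differ.

```python
import numpy as np

def min_max(scores):
    """Rescale scores to [0, 1]; a constant vector maps to all zeros."""
    scores = np.asarray(scores, dtype=float)
    span = scores.max() - scores.min()
    if span == 0:
        return np.zeros_like(scores)
    return (scores - scores.min()) / span

def late_fusion(text_scores, image_scores, alpha=0.5):
    """Plain late fusion: weighted sum of normalized text and image scores."""
    return alpha * min_max(text_scores) + (1 - alpha) * min_max(image_scores)

def semantic_combination(text_scores, image_scores, k=2, alpha=0.5):
    """Hypothetical sketch: late fusion applied after a reranking-style filter.

    Image scores are kept only for the k documents that the text system
    ranks highest; all other documents get an image score of zero.
    """
    text_scores = np.asarray(text_scores, dtype=float)
    top_k = np.argsort(-text_scores)[:k]          # indices of the k best text matches
    mask = np.zeros(text_scores.shape, dtype=bool)
    mask[top_k] = True
    filtered_image = np.where(mask, min_max(image_scores), 0.0)
    return alpha * min_max(text_scores) + (1 - alpha) * filtered_image

# Toy scores for four documents (assumed values, for illustration only).
# Document 2 looks very similar to the query image but barely matches the text.
text_scores = [0.9, 0.8, 0.1, 0.0]
image_scores = [0.2, 0.6, 1.0, 0.0]

fused = late_fusion(text_scores, image_scores)
combined = semantic_combination(text_scores, image_scores, k=2)
```

In this toy setting, the visually similar but textually irrelevant document keeps a high score under plain late fusion, whereas the filtered variant scores it on its text evidence alone. This mirrors the observation above that a single image query is often ambiguous.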
