Large-scale document image retrieval and classification with runlength histograms

Published by NAVER LABS Europe on 12 July 2016

Albert Gordo, Florent Perronnin, Ernest Valveny

Published in Pattern Recognition

We present a new document image descriptor based on multi-scale runlength histograms. This descriptor does not rely on layout analysis and can be computed efficiently. We show how this descriptor can achieve state-of-the-art results on two very different public datasets in classification and retrieval tasks. Moreover, we show how we can compress and binarize these descriptors to make them suitable for large-scale applications. We can achieve state-of-the-art results in classification using binary descriptors of as few as 16 to 64 bits.
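To make the idea of a runlength histogram concrete, the following is a minimal sketch rather than the authors' implementation. It assumes a binarized image given as rows of 0/1 pixels, counts only horizontal runs at a single scale, and quantizes run lengths into logarithmic bins. The function names, the bin layout, and the L1 normalization are illustrative assumptions, and the multi-scale aggregation, compression, and binarization mentioned in the abstract are not reproduced here.

```python
from itertools import groupby


def run_lengths(row):
    """Return (pixel_value, run_length) pairs for one row of a binary image."""
    return [(value, sum(1 for _ in group)) for value, group in groupby(row)]


def log_bin(length, num_bins):
    """Map a run length to a logarithmic bin (1, 2-3, 4-7, 8-15, ...).

    Assumption: this quantization is chosen for illustration only.
    Runs longer than the last bin's range are clamped into the last bin.
    """
    return min(length.bit_length() - 1, num_bins - 1)


def horizontal_runlength_histogram(image, num_bins=8):
    """Histogram of horizontal runs, with one block of bins per pixel value.

    `image` is a list of rows, each row a list of 0/1 pixels.
    The first `num_bins` entries count runs of 0s, the next `num_bins`
    count runs of 1s. The result is L1-normalized (an assumption) so that
    images of different sizes yield comparable vectors.
    """
    hist = [0] * (2 * num_bins)
    for row in image:
        for value, length in run_lengths(row):
            hist[value * num_bins + log_bin(length, num_bins)] += 1
    total = sum(hist)
    return [count / total for count in hist] if total else hist


# Toy 2x6 binary image, used only to exercise the functions above.
image = [
    [0, 0, 0, 1, 1, 0],
    [1, 1, 1, 1, 0, 0],
]
hist = horizontal_runlength_histogram(image, num_bins=4)
```

A descriptor in this spirit needs no layout analysis because it only scans pixel runs. A fuller version would presumably repeat the computation along other directions and at several image scales, then concatenate the resulting histograms.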
