To annotate or not? Predicting performance drop under domain shift

Published by NAVER LABS Europe on 16 September 2019

Hady Elsahar, Matthias Gallé

Conference on Empirical Methods in Natural Language Processing and 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), Hong Kong, China, 3-7 November 2019


@inproceedings{Elsahar_EMNLP_2019,
 author    = {Hady Elsahar and Matthias Gall{\'{e}}},
 title     = {To Annotate or Not? Predicting Performance Drop under Domain Shift},
 booktitle = {Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and 9th International Joint Conference on Natural Language Processing},
 year      = {2019},
}


Performance drop due to domain shift is an endemic problem for NLP models in production. It creates pressure to continuously annotate evaluation datasets in order to measure the expected drop in model performance, which can be prohibitively expensive and slow. In this paper, we study the problem of predicting the performance drop of modern NLP models under domain shift, in the absence of any target-domain labels. We investigate three families of methods (H-divergence, reverse classification accuracy and confidence measures), show how they can be used to predict the performance drop, and study their robustness to adversarial domain shifts. Our results on sentiment classification and sequence labelling show that our method is able to predict performance drops with an error rate as low as 2.15% for sentiment analysis and 0.89% for POS tagging.
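As a rough illustration of two of the method families named in the abstract, the sketch below computes a proxy A-distance (a common empirical stand-in for H-divergence, derived from the error of a classifier trained to separate source from target examples) and a simple confidence-based score. It is a minimal sketch, not the paper's implementation: the function names, the toy probabilities and the use of the mean top-class probability as the confidence measure are all assumptions made for illustration.

```python
from statistics import mean


def proxy_a_distance(domain_classifier_error):
    """Proxy A-distance: 2 * (1 - 2 * err).

    `err` is the held-out error of a classifier trained to tell source
    examples from target examples (hypothetical input, computed elsewhere).
    An error of 0.5 means the domains are indistinguishable (distance 0);
    an error of 0.0 means they are perfectly separable (distance 2).
    """
    return 2.0 * (1.0 - 2.0 * domain_classifier_error)


def mean_confidence(probabilities):
    """Average of the highest class probability over a set of predictions."""
    return mean(max(p) for p in probabilities)


def confidence_drop(source_probs, target_probs):
    """Difference in mean confidence between source and target predictions.

    Assumption: a fall in confidence on unlabeled target data is used here
    as a label-free signal of a possible performance drop.
    """
    return mean_confidence(source_probs) - mean_confidence(target_probs)


# Toy class-probability outputs of a sentiment model (made-up numbers).
source_probs = [[0.9, 0.1], [0.2, 0.8], [0.95, 0.05]]
target_probs = [[0.6, 0.4], [0.55, 0.45], [0.7, 0.3]]

print(proxy_a_distance(0.5))  # 0.0
print(proxy_a_distance(0.0))  # 2.0
print(confidence_drop(source_probs, target_probs))
```

Neither score is a performance drop by itself; each would still have to be related to drops actually observed on domain pairs where labels exist, a step this sketch leaves out.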


This article was first published on the 30th October 2019.
