12th February 2020, South England Natural Language Processing Meetup, London, UK
Speaker: Hady Elsahar
Abstract: In real-world applications where errors cannot be tolerated, ML models deployed in production require close monitoring of performance. This is usually done manually, by continuously annotating evaluation examples to measure model performance; such a process is prohibitively expensive and slow, making it unsuitable as an alerting mechanism at run time.
In my talk, I’ll present a method for predicting the performance drop of ML models on new examples seen at test time. In our experiments, this method was able to predict performance drops of a sentiment classifier with an error rate as low as 2.15%. At the end of the talk, I’ll leave you with a practical recipe for implementing an inexpensive runtime methodology for monitoring your ML model in production.
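The abstract does not spell out the method itself. As a purely illustrative sketch of the kind of inexpensive, annotation-free runtime monitor the talk alludes to, one common proxy signal is the model's own confidence: a sustained drop in the mean top-class probability on incoming batches, relative to a reference measured on in-domain data, can serve as a cheap drift alert. All names and thresholds below are hypothetical, not the speaker's actual recipe:

```python
import numpy as np

def average_confidence(probs: np.ndarray) -> float:
    """Mean top-class probability over a batch of softmax outputs
    (shape: [batch_size, num_classes])."""
    return float(np.max(probs, axis=1).mean())

def confidence_drop_alert(reference_conf: float,
                          live_probs: np.ndarray,
                          tolerance: float = 0.05):
    """Flag a live batch whose mean confidence falls more than
    `tolerance` below the reference confidence measured on
    annotated in-domain data. Returns (alert, live_conf)."""
    live_conf = average_confidence(live_probs)
    return (reference_conf - live_conf) > tolerance, live_conf

# Reference confidence from in-domain validation predictions.
in_domain = np.array([[0.90, 0.10], [0.85, 0.15]])
ref_conf = average_confidence(in_domain)

# A drifted live batch: the model is noticeably less sure.
live_batch = np.array([[0.55, 0.45], [0.60, 0.40]])
alert, live_conf = confidence_drop_alert(ref_conf, live_batch)
```

This requires no new labels at run time, which is what makes it viable as an alerting mechanism; the trade-off is that raw confidence can be miscalibrated, so in practice the threshold would be tuned against held-out shifted data.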