Seminars at NAVER LABS Europe are open to the public but space is limited. Please register here.
Date: 22nd October 2019
Christophe Gravier, associate professor, Université Jean-Monnet, Saint-Étienne, France
Many modern Natural Language Processing pipelines share a two-step approach: a first network learns word representations, while a second is dedicated to (deep) fine-tuning on downstream tasks. This talk is two-fold and focuses on the first step, since many of you may already know our work on the second step, in particular within the frame of Hady Elsahar's PhD; he now works in the NLP team at NAVER LABS Europe. First, we will discuss how the external knowledge held in dictionaries can help learn word embeddings in a simple and efficient way. We will then present how these embeddings can be binarized to reduce the memory footprint of word features by an order of magnitude while keeping the same level of accuracy on downstream tasks such as text classification, which is of the utmost practical interest for on-device NLP.
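To give an idea of the memory gain at stake, here is a minimal sketch of binarizing word embeddings. Note that this uses naive sign thresholding purely for illustration, not the near-lossless autoencoder-based method of the AAAI 2019 paper; the toy vocabulary size and dimensions are assumptions.

```python
import numpy as np

def binarize(embeddings):
    """Naive sign-based binarization: each float dimension becomes one bit.

    This is NOT the autoencoder method of the paper, only an illustration
    of the memory gain of binary word features for on-device NLP.
    """
    bits = (embeddings > 0).astype(np.uint8)  # shape (vocab, dim)
    return np.packbits(bits, axis=1)          # 8 dimensions per byte

def hamming_similarity(a, b):
    """Similarity between two packed binary vectors (higher = closer)."""
    distance = np.unpackbits(np.bitwise_xor(a, b)).sum()
    return 1.0 - distance / (len(a) * 8)

# Toy vocabulary: 3 words with 300-dim float32 embeddings.
rng = np.random.default_rng(0)
vecs = rng.standard_normal((3, 300)).astype(np.float32)

packed = binarize(vecs)
print(vecs.nbytes, "bytes ->", packed.nbytes, "bytes")  # 3600 -> 114, ~32x smaller
```

Binary codes also allow fast similarity computations with XOR and popcount instead of floating-point dot products, which is part of their practical appeal on constrained devices.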
This talk is based on the following publications: Julien Tissier, Christophe Gravier, Amaury Habrard: "Dict2vec: Learning Word Embeddings using Lexical Dictionaries". EMNLP 2017: 254-263, and Julien Tissier, Amaury Habrard, Christophe Gravier: "Near-lossless Binarization of Word Embeddings". AAAI 2019: pages to be announced.
Christophe Gravier, PhD, is an associate professor at Université Jean Monnet, Université de Lyon, France. His research interests are in Natural Language Processing, including text generation, chatbots, learning binary embeddings, and question answering. He teaches Computer Science at Télécom Saint-Étienne, a French engineering school (Master's degree), where he also serves as director of development and innovation.