Language models are systems that map sequences of tokens to the probability of their occurring in written corpora. Thanks to them, we can generate text that fits a given context by searching for the token sequences the model judges most likely.
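This probability factorizes over tokens via the chain rule. As a purely illustrative sketch, the hand-written bigram table below stands in for a real neural language model; the token set and probabilities are invented for this example.

```python
import math

# Toy locally normalized language model: P(next token | previous token).
# In practice these probabilities come from a neural LM; this hand-written
# bigram table is only a stand-in for illustration.
BIGRAM = {
    ("<s>", "this"): 0.5, ("<s>", "a"): 0.5,
    ("this", "parrot"): 0.4, ("this", "bird"): 0.6,
    ("parrot", "is"): 1.0, ("bird", "is"): 1.0,
    ("is", "dead"): 0.1, ("is", "resting"): 0.9,
}

def sequence_log_prob(tokens):
    """Score a token sequence with the chain rule:
    log P(t1..tn) = sum_i log P(t_i | t_{i-1})."""
    total = 0.0
    prev = "<s>"
    for tok in tokens:
        total += math.log(BIGRAM[(prev, tok)])
        prev = tok
    return total

print(sequence_log_prob(["this", "parrot", "is", "resting"]))
```

Search and sampling strategies differ only in how they use these local conditionals to pick the next token.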
Greedy (e.g. beam search) and stochastic (e.g. nucleus sampling) decoding strategies based on locally normalized probabilities have become the de facto standard for obtaining samples from neural language models.
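To make the stochastic side concrete, here is a minimal sketch of nucleus (top-p) sampling over a single next-token distribution, written in plain Python with no model attached; the distribution passed in is assumed to come from a language model.

```python
import random

def nucleus_sample(probs, p=0.9, rng=random):
    """Top-p (nucleus) sampling: keep the smallest set of tokens whose
    cumulative probability reaches p, renormalize, and sample from it."""
    # Sort token indices by probability, highest first.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    nucleus, cum = [], 0.0
    for i in order:
        nucleus.append(i)
        cum += probs[i]
        if cum >= p:
            break
    # Renormalize over the nucleus and draw one token from it.
    total = sum(probs[i] for i in nucleus)
    r = rng.random() * total
    for i in nucleus:
        r -= probs[i]
        if r <= 0:
            return i
    return nucleus[-1]
```

With `probs=[0.7, 0.2, 0.1]` and `p=0.5`, only the first token survives the cutoff, so sampling is deterministic; with `p=0.8` the nucleus grows to the top two tokens.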
Even though the search space grows exponentially as sequences get longer, this does not pose a big problem in open-ended NLG, where the space is relatively dense and one can find highly likely sequences, even if we have to give up finding the globally best model scores. However, it is considerably more challenging to find sequences of tokens that match a given control condition. For instance, consider the task of completing the phrase "This parrot is ____." with phrases that are most similar to "dead". While there are seemingly many phrases to express this concept ("off the twig", "gone to meet its maker", "pushing up the daisies", "pining for the fjords", "an ex-parrot", etc.), they are only a tiny fraction of the possible continuations of this phrase, and finding them becomes considerably harder. We run into similar problems if we search for fillers that are factually correct (e.g. "a bird", "mostly found in tropical and subtropical regions", etc.). To make matters worse, a more exhaustive search does not always lead to better results because of errors in the underlying model [1].
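One simple way to frame such controlled generation is to rerank candidate continuations by combining the model score with an external control score. The sketch below is a toy: both the log-probabilities and the similarity scores are hand-written stand-ins for a real language model and a real similarity model, and the weighting is arbitrary.

```python
LM_SCORE = {            # toy log-probabilities of each continuation
    "resting": -0.5, "a bird": -1.0, "an ex-parrot": -4.0,
    "pining for the fjords": -5.0, "blue": -1.5,
}
CONTROL_SCORE = {       # toy similarity to the target concept "dead"
    "resting": 0.2, "a bird": 0.0, "an ex-parrot": 0.9,
    "pining for the fjords": 0.8, "blue": 0.0,
}

def rerank(candidates, weight=10.0):
    """Rank by model score plus a weighted control score. The control term
    promotes rare continuations that actually express the target concept,
    which a pure LM ranking would bury under generic completions."""
    return sorted(candidates,
                  key=lambda c: LM_SCORE[c] + weight * CONTROL_SCORE[c],
                  reverse=True)

print(rerank(list(LM_SCORE)))
```

The difficulty described above is exactly that, in a real model, candidates like "an ex-parrot" are so improbable that naive search never surfaces them for reranking in the first place.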
In this internship, we will explore methods to solve this problem in an efficient and robust way by developing search [2] and sampling [3] techniques that can be applied to do controlled generation from the original pre-trained language models. We will apply these methods to target possible applications such as paraphrasing in context, style transfer, unsupervised machine translation and controlled NLG for factual correctness.
[1] Stahlberg, Felix, and Bill Byrne. "On NMT Search Errors and Model Errors: Cat Got Your Tongue?" Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP). 2019.
[2] Li, Jiwei, Will Monroe, and Dan Jurafsky. "Learning to Decode for Future Success." arXiv preprint arXiv:1701.06549 (2017).
[3] Miao, Ning, et al. "Do You Have the Right Scissors? Tailoring Pre-trained Language Models via Monte-Carlo Methods." Proceedings of ACL (2020).
NAVER LABS Europe has full-time positions and PhD and PostDoc opportunities throughout the year, which are advertised here and on the sites of international conferences we sponsor such as CVPR, ICCV, ICML, NeurIPS and EMNLP.
NAVER LABS Europe is an equal opportunity employer.
NAVER LABS Europe is located in Grenoble in the French Alps. We take a multi- and interdisciplinary approach to research, with scientists in machine learning, computer vision, artificial intelligence, natural language processing, ethnography and UX working together to create next-generation ambient intelligence technology and services that deeply understand users and their contexts.