NAVER LABS Europe seminars are open to the public. This seminar is virtual and requires registration.
Date: 21st November 2024, 11:00 am (CET)
Improving representations for language modeling
About the speaker: Nathan Godey is a final-year PhD student in the ALMAnaCH lab at Inria Paris, advised by Benoît Sagot and Éric de la Clergerie. He was recently a visiting student in Edoardo Ponti’s lab at the University of Edinburgh. He also teaches the Advanced NLP course in the SCIA MSc at EPITA.
Abstract: Generative models (e.g. Llama) have now largely replaced traditional predictive models (e.g. BERT) across a variety of tasks, driving language systems to prioritize broad generative capability over strong feature extraction. As a consequence, recent models tend to be treated as black-box systems, dissected only for explanation or interpretation purposes. In our work, we find that observing high-level characteristics of the representations these models produce can provide insights into the inherent limitations of the LLM paradigm, exposing biases and distortions that emerge both from the nature of the training data and from the inductive biases of model architectures.
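As a rough illustration (not code from the talk), one such high-level characteristic is anisotropy: the tendency of hidden states to cluster in a narrow cone of the embedding space. The minimal sketch below estimates it as the mean pairwise cosine similarity of token representations; the model choice (gpt2) and the example sentences are placeholder assumptions, not material from the seminar.

```python
# Sketch: estimate the anisotropy of an LM's hidden states as the mean
# pairwise cosine similarity of token vectors. Values near 1 indicate a
# degenerate, cone-shaped representation space.
import torch
from transformers import AutoModel, AutoTokenizer

model_name = "gpt2"  # placeholder; any causal LM with hidden states works
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # gpt2 has no pad token by default
model = AutoModel.from_pretrained(model_name)
model.eval()

texts = [
    "The quick brown fox jumps over the lazy dog.",
    "Language models map tokens to high-dimensional vectors.",
]

with torch.no_grad():
    batch = tokenizer(texts, return_tensors="pt", padding=True)
    hidden = model(**batch).last_hidden_state  # (batch, seq, dim)

# Keep only non-padding token vectors, normalize them, then average the
# off-diagonal cosine similarities (self-similarities are excluded).
mask = batch["attention_mask"].bool()
vecs = torch.nn.functional.normalize(hidden[mask], dim=-1)  # (n_tokens, dim)
sim = vecs @ vecs.T
n = sim.size(0)
anisotropy = (sim.sum() - n) / (n * (n - 1))
print(f"Mean pairwise cosine similarity: {anisotropy:.3f}")
```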
Our work not only reveals key bottlenecks but also motivates alternatives to standard modeling approaches, including a neural tokenization layer that enhances robustness and a contrastive LM objective that improves training efficiency, and it paves the way for compression schemes aimed at more memory-efficient generative modeling. Overall, this presentation shows how representation analysis can shed light on fundamental modeling limitations while inspiring new approaches to overcome them.