NAVER LABS Europe virtual seminars are open to the public. Please register here to attend this Zoom event.
Date: 18th February 2025, 11:00 am (CET)
Examining modularity in multilingual LMs via language-specialized subnetworks
About the speaker: Rochelle Choenni is a postdoctoral researcher in Natural Language Processing (NLP) working with Ivan Titov. Her main research interests include multilingual and cross-cultural NLP, modular deep learning, interpretability, and social biases in language models. She obtained her PhD in NLP at the University of Amsterdam (UvA) under the supervision of Prof. Ekaterina Shutova and Dr. Dan Garrette (Google Research). Before her PhD, she completed bachelor’s and master’s degrees in Artificial Intelligence at the UvA.
Abstract: Multilingual language models (MLMs) are jointly trained on data from many different languages, so that the representation of each individual language can benefit from the data of other languages. Impressive performance in zero-shot cross-lingual transfer shows that these models are able to exploit this property. Yet it remains unclear to what extent, and under which conditions, languages rely on each other’s data. To answer this question, we developed an approach to measure cross-language influence using a training data attribution method. Specifically, we test how much influence training examples from particular training languages exert cross-lingually on the predictions for individual test languages. This allows us to analyse the cross-lingual sharing mechanisms of MLMs from a new perspective. We find that MLMs rely on data from multiple languages, and that this reliance increases as fine-tuning progresses. Moreover, we use the proposed measure of cross-language influence to examine modularity in MLMs. Specifically, we study the emergence of language-specialized subnetworks in pretrained MLMs and the effect that sparse fine-tuning (SFT) has on the degree of language specialization of these subnetworks. Interestingly, our results suggest that the success of SFT cannot be attributed to stronger modularity in the form of language-specialized subnetworks.
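To make the notion of cross-language influence concrete, the sketch below illustrates one common family of training data attribution methods, a TracIn-style gradient dot product between a training example and a test example, averaged per source/target language pair. It is a minimal illustration under our own assumptions, not the speaker's implementation; the model, loss function, and data containers (train_by_lang, test_by_lang) are hypothetical placeholders.

# Minimal sketch of a gradient-dot-product training data attribution measure.
# Not the method presented in the talk; all names are illustrative placeholders.
import torch

def grad_vector(model, loss_fn, x, y):
    # Flattened gradient of the loss on a single example w.r.t. trainable parameters.
    params = [p for p in model.parameters() if p.requires_grad]
    loss = loss_fn(model(x), y)
    grads = torch.autograd.grad(loss, params)
    return torch.cat([g.reshape(-1) for g in grads])

def influence(model, loss_fn, train_example, test_example):
    # Approximate influence of one training example on one test prediction
    # as the dot product of their loss gradients (TracIn-style, single checkpoint).
    g_train = grad_vector(model, loss_fn, *train_example)
    g_test = grad_vector(model, loss_fn, *test_example)
    return torch.dot(g_train, g_test).item()

def cross_language_influence(model, loss_fn, train_by_lang, test_by_lang, src, tgt):
    # Average influence that training examples of language `src` exert on
    # test examples of language `tgt` (hypothetical dicts: lang -> list of (x, y)).
    scores = [influence(model, loss_fn, tr, te)
              for tr in train_by_lang[src]
              for te in test_by_lang[tgt]]
    return sum(scores) / len(scores)

Averaging such scores over all source languages for a fixed target language gives one way to quantify how much a test language draws on other languages' training data, which is the kind of cross-lingual sharing the abstract discusses.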