NAVER LABS Europe seminars are open to the public. Please register to participate.
Date: 2nd June 2026, 10:00 am (CEST)
About the speaker: After completing a PhD in theoretical computer science in 2011, Matthias Gallé joined Xerox Research and later Naver Labs – where he led all of research – until 2022, at which point he joined the startup world. He has been training LLMs for more than 5 years, not only for research but also through the BigScience initiative that he co-led (creating the BLOOM model), as well as the Command R & Command A families at Cohere and now the LagunaM series at Poolside.
Abstract: On the surface, little seems to have changed in how we post-train large language models since 2022. We still do some “mid-training” (arguably the worst misnomer in this process), followed by supervised fine-tuning, and finish with some form of reinforcement learning to align the model. Add a bit of context-length extension, and the recipe sounds familiar.
Under the surface, however, everything has changed. Data curation has become far more complex. Training now depends on carefully designed environments, LLM-as-a-judge systems, and increasingly sophisticated sequencing of reinforcement learning algorithms, objectives, and rewards. What once looked like a straightforward pipeline has evolved into a highly engineered production system.
Two broad approaches have emerged to manage this complexity: the artisan and the model factory.
In this talk, Matthias will draw on his experience leading post-training teams at Cohere and Poolside to explore the state of post-training in 2026: what has changed, what has scaled, and where the field might be heading next.
