Une approche à la traduction automatique statistique par segments discontinus (An Approach to Statistical Machine Translation Using Non-Contiguous Phrases)

Published by NAVER LABS Europe on 6 April 2013

Michel Simard, Nicola Cancedda, Bruno Cavestro, Marc Dymetman, Eric Gaussier, Cyril Goutte, Kenji Yamada, Arne Mauser

Traitement Automatique des Langues Naturelles (TALN 2005), Dourdan, France, 6-10 June 2005.

This paper presents a phrase-based statistical machine translation method built on non-contiguous phrases, i.e. phrases made of words that do not necessarily appear contiguously in the text (phrases with gaps). A method for producing such phrases from word-aligned corpora is proposed. A statistical translation model that can handle such phrases is also presented, along with a training method that sets the model parameters to maximize translation accuracy, as measured with the NIST evaluation metric. Translations are produced by a beam-search decoder. Experimental results show that the proposed method generalizes better from the training data.
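As an illustration only, and not the authors' implementation, a phrase with gaps can be sketched as a token list in which some positions are left open. The sketch below assumes that each gap covers exactly one word; the names `GAP` and `match_gapped_phrase` are hypothetical and do not come from the paper.

```python
from typing import List, Optional, Tuple

# Hypothetical placeholder: marks a position in the phrase that may hold any single word.
GAP = None


def match_gapped_phrase(
    pattern: List[Optional[str]], sentence: List[str]
) -> List[Tuple[int, ...]]:
    """Find every occurrence of a gapped phrase in a tokenized sentence.

    Assumption (for illustration): each GAP covers exactly one word.
    Returns, for each match, the sentence positions covered by the
    phrase's actual words (gap positions are left uncovered).
    """
    matches = []
    # Slide a window of the pattern's length over the sentence.
    for start in range(len(sentence) - len(pattern) + 1):
        window = sentence[start:start + len(pattern)]
        # A window matches if every non-gap token equals the word at that position.
        if all(p is GAP or p == w for p, w in zip(pattern, window)):
            covered = tuple(start + i for i, p in enumerate(pattern) if p is not GAP)
            matches.append(covered)
    return matches


sentence = "je ne veux pas danser".split()
print(match_gapped_phrase(["ne", GAP, "pas"], sentence))  # [(1, 3)]
```

Here the French negation "ne ... pas" is matched as a single phrase even though another word sits between its two parts, which is the kind of unit a contiguous phrase table cannot capture without memorizing every intervening word.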
