Learning latent syntactic representations for downstream tasks

Published by NAVER LABS Europe on 13 April 2015

Speaker: Jason Naradowsky, post-doctoral researcher at University College London, London, U.K.

Abstract: Solving complex NLP tasks, like question answering, requires the processing of many layers of linguistic information. In practice the most common method for accomplishing this is the NLP pipeline, in which individual pre-trained NLP components (taggers, parsers, etc.) are arranged such that the output of one component becomes input to the next. This design has clear drawbacks: errors propagate between components, training data must exist for each component, and the most relevant training signal, performance on the downstream task, is neglected.

Syntactic structure is a useful type of linguistic information, and is necessary for state-of-the-art performance on many NLP tasks, but the treebanks needed to train parsers are costly to produce and unavailable for a majority of the world’s languages. We propose a novel marginalization-based training method in which end task annotations are used to guide the induction of a constrained latent syntactic representation, with the resulting syntactic distribution being specially tailored for the desired end task, without the need for supervised training. This is implemented in a joint modeling framework using factor graphs, with combinatorial factors to provide efficient structural constraints over latent structure, and soft Boolean factors to coordinate between component models. Inference is performed using loopy belief propagation.
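
To make the idea of marginalization-based training concrete, the following is a minimal sketch of an objective in which only the end-task label is observed and the latent structure is summed out. It is not the implementation from the talk: the names `marginal_log_likelihood`, `toy_score`, `LABELS`, `LATENT` and `WEIGHTS` are hypothetical, the weights are made up, and the sum over latent values is done by brute-force enumeration, whereas the work described here relies on factor graphs and loopy belief propagation to handle structured latent variables efficiently.

```python
import math

def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def marginal_log_likelihood(score, labels, latent_space, y_observed):
    """Return log p(y_observed) with the latent variable z summed out.

    p(y, z) is proportional to exp(score(y, z)); only y is ever observed,
    so z receives no direct supervision.
    """
    numerator = log_sum_exp([score(y_observed, z) for z in latent_space])
    partition = log_sum_exp([score(y, z) for y in labels for z in latent_space])
    return numerator - partition

# Hypothetical toy problem: z says whether a syntactic arc links two words,
# y is the end-task label for that word pair. The weights are illustrative.
LABELS = ["ARG", "NONE"]
LATENT = [0, 1]  # 0 = no arc, 1 = arc
WEIGHTS = {("ARG", 1): 2.0, ("ARG", 0): -1.0, ("NONE", 1): -0.5, ("NONE", 0): 1.0}

def toy_score(y, z):
    return WEIGHTS[(y, z)]

if __name__ == "__main__":
    for y in LABELS:
        print(y, math.exp(marginal_log_likelihood(toy_score, LABELS, LATENT, y)))
```

Maximizing this quantity with respect to the weights rewards whichever latent configurations help predict the observed label, which is how end-task annotations can shape a latent representation in a model of this kind.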

We find that across a number of NLP tasks (semantic role labeling, named entity recognition, relation extraction) this approach not only offers performance comparable to the fully supervised training of the joint model (using syntactically annotated data), but in some instances even improves upon it by learning latent structures that are more appropriate for the task.

This is joint work with Mark Johnson, Sebastian Riedel, and David A. Smith.
