Linguistic Parsing of Lists in Structured Documents

Published by NAVER LABS Europe on 6 April 2013

Salah Ait-Mokhtar, Veronika Lux, Eva Banik

EACL Workshop on NLP and XML, Budapest, Hungary, April 12-17, 2003.

This paper shows how taking document structure into account helps to improve the performance of linguistic parsing. We restrict our study to one specific structure in a single markup language: lists in HTML documents. First, we establish a typology of lists based on a corpus study. Then, after describing a transformation process that creates documents with uniform list markup, we show how the list tags can be incorporated into a parsing system and how they enhance performance at every level of parsing.
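
To make the idea of "uniform list markup" concrete, the following is a minimal sketch and not the authors' transformation process. It assumes that the different HTML list elements (`ul`, `ol`, `dl`, and so on) are mapped onto a single hypothetical `<list>`/`<item>` vocabulary, and that all other tags are discarded. The tag names and the `normalize_lists` helper are illustrative assumptions; the abstract does not specify them.

```python
from html.parser import HTMLParser

# Hypothetical mapping: the paper's actual uniform markup is not given here.
LIST_TAGS = {"ul", "ol", "dl", "menu", "dir"}
ITEM_TAGS = {"li", "dt", "dd"}


class ListNormalizer(HTMLParser):
    """Rewrite HTML list variants into one <list>/<item> vocabulary."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag in LIST_TAGS:
            self.out.append("<list>")
        elif tag in ITEM_TAGS:
            self.out.append("<item>")
        # Any other tag is dropped: only list structure is kept.

    def handle_endtag(self, tag):
        if tag in LIST_TAGS:
            self.out.append("</list>")
        elif tag in ITEM_TAGS:
            self.out.append("</item>")

    def handle_data(self, data):
        # Collapse runs of whitespace and skip whitespace-only text nodes.
        text = " ".join(data.split())
        if text:
            self.out.append(text)


def normalize_lists(html: str) -> str:
    """Return the text of `html` with list markup made uniform."""
    parser = ListNormalizer()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


if __name__ == "__main__":
    print(normalize_lists("<p>We sell:</p><ul><li>apples</li><li> pears </li></ul>"))
```

A downstream parser could then treat `<list>` and `<item>` as explicit boundaries rather than guessing them from punctuation. This sketch assumes well-formed input with explicit closing tags; real HTML often omits `</li>`, which a fuller transformation would have to repair.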
