Corpora as networks: semantic analysis for knowledge-based content integration

Published by NAVER LABS Europe on 4 March 2015

Speaker: Remo Pareschi, associate professor at University of Molise, Campobasso, Italy

Abstract: I show how to integrate content from large corpora into semantically consistent knowledge bases. This objective is treated as a problem of community detection in a network (identifying the denser regions of a network), with one difference from the standard community detection algorithms: in this case the nodes initially lack explicit links, which are instead identified and made to emerge through semantic analysis. The detected communities correspond to topics (concepts) that group together text objects such as documents, Web pages, blogs, software modules, etc. Topics and objects are then structured and organized into “topic-topic” and “object-object” networks, thus providing the groundwork for navigable and semantically consistent knowledge bases. The applied methodologies rely on techniques for semantic analysis derived from probabilistic topic modelling through Latent Dirichlet Allocation. I also show how these methodologies generally outperform purely structural methods of community detection like Harel and Infomap, even when the corpus comes with the explicit structure of a network (e.g., the World Wide Web), if the specific contents to be analyzed and integrated originate from multiple independent sources and thus are not connected via hyperlinks or similar constructs. Finally, I illustrate a number of applications of the approach, such as ontology learning, knowledge discovery and knowledge-based document management.
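
To make the idea of links emerging from semantic analysis more concrete, here is a minimal sketch in Python. It is not the speaker's implementation. It assumes that a topic model such as Latent Dirichlet Allocation has already assigned each document a vector of topic weights. The document names, the weights, the function names and both thresholds are hypothetical values chosen for illustration. Documents are grouped by their dominant topic, an "object-object" edge links documents with similar topic mixtures, and a "topic-topic" edge links topics that some document weights heavily together.

```python
from itertools import combinations
from math import sqrt

# Hypothetical document-topic weights, of the kind an LDA model might produce.
# The values are made up for illustration; each row sums to 1.
doc_topics = {
    "doc_a": [0.90, 0.05, 0.05],
    "doc_b": [0.80, 0.15, 0.05],
    "doc_c": [0.10, 0.85, 0.05],
    "doc_d": [0.05, 0.50, 0.45],
    "doc_e": [0.05, 0.10, 0.85],
}


def cosine(u, v):
    """Cosine similarity between two topic-weight vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def communities(doc_topics):
    """Group documents by their highest-weighted topic."""
    groups = {}
    for doc, weights in doc_topics.items():
        top = max(range(len(weights)), key=weights.__getitem__)
        groups.setdefault(top, []).append(doc)
    return groups


def object_network(doc_topics, threshold=0.7):
    """Link two documents when their topic mixtures are similar enough.

    The threshold is an assumed value, not one taken from the talk.
    """
    return {
        (a, b)
        for a, b in combinations(sorted(doc_topics), 2)
        if cosine(doc_topics[a], doc_topics[b]) >= threshold
    }


def topic_network(doc_topics, min_weight=0.3):
    """Link two topics when at least one document gives both substantial weight."""
    edges = set()
    for weights in doc_topics.values():
        active = [t for t, w in enumerate(weights) if w >= min_weight]
        edges.update(combinations(active, 2))
    return edges


if __name__ == "__main__":
    print(communities(doc_topics))
    print(sorted(object_network(doc_topics)))
    print(sorted(topic_network(doc_topics)))
```

In this toy data no document carries an explicit link to any other, yet doc_d ends up connected to both doc_c and doc_e because its topic mixture overlaps with each of them. A real corpus would need the weights to come from a fitted topic model, and the thresholds would have to be tuned.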
