Distributional reinforcement learning for energy-based sequential models

Published by Claudia Heyer on 13 January 2020

Tetiana Parshakova, Marc Dymetman, Jean-Marc Andreoli

Workshop on the Optimization Foundations of Reinforcement Learning (OPTRL) at the Conference on Neural Information Processing Systems (NeurIPS), Vancouver, British Columbia, Canada, 8-14 December, 2019


@article{parshakova2019distributional,
  title={Distributional Reinforcement Learning for Energy-Based Sequential Models},
  author={Parshakova, Tetiana and Andreoli, Jean-Marc and Dymetman, Marc},
  journal={arXiv preprint arXiv:1912.08517},
  year={2019}
}


Abstract

Global Autoregressive Models (GAMs) are a recent proposal [Parshakova et al., CoNLL 2019] for exploiting global properties of sequences for data-efficient learning of seq2seq models. In the first phase of training, an Energy-Based Model (EBM) over sequences is derived. This EBM has high representational power, but is unnormalized and cannot be directly exploited for sampling. To address this issue, [Parshakova et al., CoNLL 2019] proposes a distillation technique, which can only be applied under limited conditions. By relating this problem to Policy Gradient techniques in RL, but from a distributional rather than an optimization perspective, we propose a general approach applicable to any sequential EBM. Its effectiveness is illustrated on GAM-based experiments.
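To make the distributional reading concrete, here is a minimal toy sketch in Python. It is not the authors' algorithm or code. It assumes a hypothetical EBM over binary sequences of length 3 and a tabular autoregressive policy, and it uses self-normalized importance sampling (a stand-in chosen for this illustration) to push the policy towards the normalized EBM instead of towards a single highest-scoring sequence. All names (`unnormalized_score`, `train`, `SEQ_LEN`, the energy function) are made up for the example.

```python
import itertools
import math
import random

SEQ_LEN = 3  # toy setting: binary sequences of length 3 (an assumption)


def unnormalized_score(x):
    """Hypothetical EBM score P(x) = exp(-E(x)).

    The energy depends on a global feature of the whole sequence
    (the number of ones), and favours sequences with exactly two ones.
    """
    energy = 2.0 * abs(sum(x) - 2)
    return math.exp(-energy)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def log_prob(logits, x):
    """Log-probability of x under the autoregressive policy.

    logits maps each prefix (a tuple) to the logit of emitting 1 next.
    """
    total = 0.0
    for t in range(SEQ_LEN):
        p_one = sigmoid(logits[x[:t]])
        total += math.log(p_one if x[t] == 1 else 1.0 - p_one)
    return total


def sample(logits, rng):
    """Draw one sequence from the policy, symbol by symbol."""
    x = ()
    for _ in range(SEQ_LEN):
        p_one = sigmoid(logits[x])
        x += (1 if rng.random() < p_one else 0,)
    return x


def normalized_target():
    """Exact normalized EBM; only feasible because the toy space is tiny."""
    seqs = list(itertools.product((0, 1), repeat=SEQ_LEN))
    z = sum(unnormalized_score(x) for x in seqs)
    return {x: unnormalized_score(x) / z for x in seqs}


def train(steps=2000, batch=64, lr=0.2, seed=0):
    rng = random.Random(seed)
    prefixes = [p for t in range(SEQ_LEN)
                for p in itertools.product((0, 1), repeat=t)]
    logits = {p: 0.0 for p in prefixes}  # start from the uniform policy
    for _ in range(steps):
        # Sample from the current policy, used here as the proposal.
        xs = [sample(logits, rng) for _ in range(batch)]
        # Importance weights P(x) / pi(x); Z is unknown, so self-normalize.
        weights = [unnormalized_score(x) / math.exp(log_prob(logits, x))
                   for x in xs]
        total = sum(weights)
        grad = {p: 0.0 for p in prefixes}
        for x, w in zip(xs, weights):
            for t in range(SEQ_LEN):
                prefix = x[:t]
                # d/dlogit of log pi(x) for a Bernoulli step: x_t - p_one
                grad[prefix] += (w / total) * (x[t] - sigmoid(logits[prefix]))
        # Gradient ascent on the estimated E_p[log pi], which is the same
        # as descending the cross-entropy from the normalized EBM p to pi.
        for p in prefixes:
            logits[p] += lr * grad[p]
    return logits
```

Calling `train()` and comparing `math.exp(log_prob(logits, x))` with `normalized_target()[x]` for each sequence shows how closely the sampled policy tracks the normalized EBM. The exact enumeration in `normalized_target` is there only to check the toy; for realistic sequence spaces the normalizer is intractable, which is the difficulty the abstract describes.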
