Improving the Generalization of Visual Navigation Policies using Invariance Regularization - Naver Labs Europe


Training agents to operate in a single environment often yields overfitted models that are unable to generalize to changes in that environment. Yet, because of the numerous variations that occur in the real world, an agent must be robust to such changes in order to be useful. This has not been the case for agents trained with reinforcement learning (RL) algorithms. In this paper, we investigate the overfitting of RL agents to their training environments in visual navigation tasks. Our experiments show that deep RL agents can overfit even when trained on multiple environments simultaneously. We also discuss the role of adding invariance to the input and what this implies for the notion of generalization. Finally, we propose a training procedure that combines RL with supervised learning methods to improve generalization to changes in the visual input. Experiments are conducted in the VizDoom environment, which contains hundreds of textures and is therefore well suited to studying generalization to changes in the visual observation.
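The abstract does not spell out the combined objective, but one common way to mix RL with a supervised invariance term is to add an auxiliary loss that penalizes the policy for producing different action distributions on an observation and on a visually altered copy of it (e.g. a texture change). The sketch below is a hypothetical illustration, not the paper's exact method: the toy linear `policy`, the noise-based `augment` stand-in for a texture change, and the weight `lam` are all our assumptions.

```python
import math
import random

random.seed(0)

def softmax(logits):
    # Numerically stable softmax over a list of action logits.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def policy(obs, weights):
    # Toy linear policy: action logits = W @ obs. A stand-in for the
    # deep network an actual visual-navigation agent would use.
    return [sum(w * x for w, x in zip(row, obs)) for row in weights]

def augment(obs, noise_scale=0.3):
    # Hypothetical stand-in for a visual change such as a new VizDoom
    # texture: perturb the observation with Gaussian noise.
    return [x + random.gauss(0.0, noise_scale) for x in obs]

def invariance_loss(obs, weights):
    # Supervised term: squared difference between the policy's action
    # distributions on the original and the altered observation.
    p = softmax(policy(obs, weights))
    q = softmax(policy(augment(obs), weights))
    return sum((a - b) ** 2 for a, b in zip(p, q))

def total_loss(rl_loss, obs, weights, lam=1.0):
    # Combined objective: the usual RL loss plus the invariance
    # regularizer, weighted by lam (an assumed hyperparameter).
    return rl_loss + lam * invariance_loss(obs, weights)
```

With `lam = 0` this reduces to plain RL; increasing `lam` pushes the policy toward producing the same behaviour regardless of the visual perturbation, which is the kind of input invariance the paper argues affects what "generalization" means for the agent.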