|Theo Cachet, Julien Perez, Seungsu Kim|
|3rd Workshop on Robot Learning, Thirty-fourth Conference on Neural Information Processing Systems (NeurIPS), virtual only conference, 6-12 December, 2020|
Imitation learning is considered one of the promising approaches for enabling a robot to acquire competencies. Recently, one-shot imitation learning has shown encouraging results for executing a given task under varying initial conditions without requiring task-specific engineering. However, it remains inefficient at generalizing to variations of tasks that involve different reward or transition functions. In this work, we aim to improve the generalization ability of demonstration-based learning to unseen tasks that are significantly different from the training tasks. First, we introduce the use of transformer-based sequence-to-sequence policy networks trained from limited sets of demonstrations. Then, we propose to meta-train our model from a set of training demonstrations by leveraging optimization-based meta-learning. Finally, we evaluate our approach and report encouraging results using the recently proposed Meta-World framework, which is composed of a large set of robotic manipulation tasks organized in various categories.
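To make the idea of optimization-based meta-learning from demonstrations concrete, the following is a minimal sketch under strong simplifying assumptions, not the method of this work. The abstract does not name a specific meta-learning algorithm, so the sketch assumes a Reptile-style first-order meta-update, and it replaces the transformer policy with a single scalar weight trained by behavior cloning. All function names, the toy tasks (expert action equals a task-specific gain times the state), and the hyperparameters are hypothetical.

```python
import random


def inner_adapt(w, demos, lr=0.1, steps=5):
    """Adapt the policy weight w to one task with a few gradient steps
    of behavior cloning (mean squared error on demonstrated actions)."""
    for _ in range(steps):
        grad = sum(2 * (w * s - a) * s for s, a in demos) / len(demos)
        w = w - lr * grad
    return w


def make_demos(gain, rng, n=10):
    """Hypothetical toy task: the expert action is gain * state."""
    states = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    return [(s, gain * s) for s in states]


def meta_train(task_gains, meta_lr=0.5, iterations=200, seed=0):
    """Assumed Reptile-style meta-training: move the shared initialization
    toward the weights obtained after adapting to a sampled task."""
    rng = random.Random(seed)
    w = 0.0
    for _ in range(iterations):
        gain = rng.choice(task_gains)           # sample a training task
        demos = make_demos(gain, rng)           # its demonstrations
        w_task = inner_adapt(w, demos)          # inner loop: adapt to the task
        w = w + meta_lr * (w_task - w)          # outer loop: first-order meta-update
    return w


# Meta-train on three tasks, then adapt to an unseen task (gain 2.5)
# from a handful of demonstrations.
w_meta = meta_train([1.0, 2.0, 3.0])
new_demos = make_demos(2.5, random.Random(1))
w_new = inner_adapt(w_meta, new_demos)
```

In this toy setting, the meta-trained weight serves as an initialization from which a few gradient steps on new demonstrations move the policy toward the unseen task. The same inner/outer loop structure would apply to a larger policy network, with the scalar update replaced by an update over all network parameters.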