Multi-Agent Learning and Coordination with Clustered Deep Q-Network - Naver Labs Europe


Existing decentralized learning methods scale poorly as the number of agents grows. In the Independent Q-Learning approach, each agent learns its own action-values. One drawback is that the non-stationarity this introduces limits the use of experience replay memory, which deep reinforcement learning methods such as Deep Q-Network rely on. This paper presents a multi-agent, multi-level solution named Clustered Deep Q-Network (CDQN) to overcome this issue.
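
To make the Independent Q-Learning idea concrete, here is a minimal tabular sketch in Python. It is not the paper's CDQN method and not taken from the paper: the two-agent coordination game, the class name `IndependentQLearner`, the functions `joint_reward` and `train`, and all hyperparameters are assumptions chosen for illustration. Each agent keeps a private table of action-values and updates it from the shared reward, treating the other agent as part of the environment.

```python
import random


class IndependentQLearner:
    """One agent's private action-value table (stateless game, illustrative only)."""

    def __init__(self, n_actions, alpha=0.1, epsilon=0.1, rng=None):
        self.q = [0.0] * n_actions          # this agent's own action-values
        self.alpha = alpha                  # learning rate (assumed value)
        self.epsilon = epsilon              # exploration rate (assumed value)
        self.rng = rng if rng is not None else random.Random()

    def act(self):
        # Epsilon-greedy choice over the agent's own action-values.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.q))
        return max(range(len(self.q)), key=lambda a: self.q[a])

    def update(self, action, reward):
        # Move the chosen action's value toward the observed reward.
        self.q[action] += self.alpha * (reward - self.q[action])


def joint_reward(a0, a1):
    # Hypothetical coordination game: reward 1 when both agents match.
    return 1.0 if a0 == a1 else 0.0


def train(steps=2000, seed=0):
    rng = random.Random(seed)
    agents = [IndependentQLearner(2, rng=rng) for _ in range(2)]
    for _ in range(steps):
        actions = [agent.act() for agent in agents]
        reward = joint_reward(*actions)
        # Each agent learns separately and never sees the other's action.
        for agent, action in zip(agents, actions):
            agent.update(action, reward)
    return agents
```

Because the reward each agent observes depends on the other agent's current, still-changing policy, a transition stored early in training may no longer reflect how the environment behaves later. That is the non-stationarity that makes replayed experience unreliable in this setting.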