Multi-agent deep reinforcement learning for large-scale traffic signal control

Chu, Tianshu; Wang, Jie; Codecà, Lara; Li, Zhaojian
IEEE Transactions on Intelligent Transportation Systems, 15 March 2019

Reinforcement learning (RL) is a promising data-driven approach for adaptive traffic signal control (ATSC) in complex urban traffic networks, and deep neural networks further enhance its learning power. However, centralized RL is infeasible for large-scale ATSC due to the extremely high dimension of the joint action space. Multi-agent RL (MARL) overcomes the scalability issue by distributing the global control to each local RL agent, but it introduces new challenges: the environment becomes partially observable from the viewpoint of each local agent due to limited communication among agents. Most existing studies in MARL focus on designing efficient communication and coordination among traditional Q-learning agents. This paper presents, for the first time, a fully scalable and decentralized MARL algorithm for the state-of-the-art deep RL agent, advantage actor critic (A2C), within the context of ATSC. In particular, two methods are proposed to stabilize the learning procedure, by improving the observability and reducing the learning difficulty of each local agent. The proposed multi-agent A2C is compared against independent A2C and independent Q-learning algorithms, in both a large synthetic traffic grid and a large real-world traffic network of Monaco city, under simulated peak-hour traffic dynamics. Results demonstrate its optimality, robustness, and sample efficiency over other state-of-the-art decentralized MARL algorithms.
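The abstract builds on the advantage actor-critic (A2C) agent. The sketch below is a generic single-agent A2C update on a toy one-state, two-action problem, illustrating the core advantage-weighted policy-gradient step; it is not the paper's multi-agent algorithm, and all hyperparameter names and values here are illustrative assumptions.

```python
import numpy as np

# Generic A2C update sketch (illustrative only, not the paper's MARL method).
# Toy continuing task: one state, two actions; action 0 yields reward 1.
rng = np.random.default_rng(0)
gamma, alpha_pi, alpha_v = 0.9, 0.1, 0.1  # assumed hyperparameters

theta = np.zeros(2)   # actor: logits over the 2 actions
v = 0.0               # critic: value estimate of the single state

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

for _ in range(500):
    probs = softmax(theta)
    a = rng.choice(2, p=probs)
    r = 1.0 if a == 0 else 0.0
    td_target = r + gamma * v        # bootstrap with the critic
    advantage = td_target - v        # A(s,a) = r + gamma*V(s') - V(s)
    v += alpha_v * advantage         # critic: move V toward the TD target
    grad_log_pi = -probs             # actor: grad of log pi(a|s) for softmax
    grad_log_pi[a] += 1.0
    theta += alpha_pi * advantage * grad_log_pi  # advantage-weighted step

print(softmax(theta)[0])  # probability assigned to the rewarded action
```

After training, the actor concentrates probability on the rewarded action; in the paper's setting each intersection would run such an actor-critic pair locally, with stabilization methods coupling neighboring agents.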

DOI
Type:
Journal
Date:
2019-03-15
Department:
Communication Systems
Eurecom Ref:
5813
Copyright:
© 2019 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.
PERMALINK : https://www.eurecom.fr/publication/5813