A Survey of Deep RL and IL for Autonomous Driving Policy Learning
Abstract
Autonomous driving (AD) agents generate driving policies from online perception results at multiple levels of abstraction, e.g., behavior planning, motion planning and control. Driving policies are crucial to realizing safe, efficient and harmonious driving behavior, yet AD agents still face substantial challenges in complex scenarios. Motivated by their successful application in fields such as robotics and video games, the use of deep reinforcement learning (DRL) and deep imitation learning (DIL) techniques to derive AD policies has attracted vast research effort in recent years. This paper is a comprehensive survey of this body of work, conducted at three levels. First, a taxonomy of the literature is constructed from the system perspective, identifying five modes of integrating DRL/DIL models into an AD architecture. Second, the formulations of DRL/DIL models for specific AD tasks are comprehensively reviewed, covering various designs of the model state and action spaces and the reinforcement learning rewards. Finally, an in-depth review examines how DRL/DIL models address the critical issues of AD applications: driving safety, interaction with other traffic participants, and uncertainty of the environment. To the best of our knowledge, this is the first survey to focus on AD policy learning using DRL/DIL that addresses the field simultaneously from the system, task-driven and problem-driven perspectives. We share and discuss findings that may motivate future investigation of various topics.