
Safe reinforcement learning for dynamical games

2020 · 106 citations
Yongliang Yang, Kyriakos G. Vamvoudakis, Hamidreza Modares

A novel actor-critic-barrier structure is presented for multiplayer safety-critical systems, in which non-zero-sum (NZS) games with full-state constraints are first transformed into unconstrained NZS games using a barrier function.

Abstract

This article presents a novel actor-critic-barrier structure for multiplayer safety-critical systems. Non-zero-sum (NZS) games with full-state constraints are first transformed into unconstrained NZS games using a barrier function. The barrier function can handle both symmetric and asymmetric constraints on the state. It is shown that the Nash equilibrium of the unconstrained NZS game is guaranteed to stabilize the original constrained multiplayer system. The barrier function is combined with an actor-critic structure to learn the Nash equilibrium solution online. It is shown that integrating the barrier function with the actor-critic structure guarantees that the constraints are not violated during learning. Boundedness and stability of the closed-loop signals are analyzed. The efficacy of the presented approach is finally demonstrated with a simulation example.
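The abstract does not reproduce the barrier function itself. One common choice in the safe-RL literature for mapping a state component constrained to an asymmetric open interval (a, A), with a < 0 < A, onto the whole real line is the logarithmic map sketched below. This is an illustrative sketch of the general technique, not necessarily the exact construction used in the paper; the function names and the specific formula are assumptions.

```python
import math

def barrier(x: float, a: float, A: float) -> float:
    """Map x in the open interval (a, A), with a < 0 < A, onto the real line.

    The map is zero at x = 0 and diverges to -inf as x -> a and to
    +inf as x -> A, so any finite value of the transformed state
    corresponds to a state strictly inside the constraint set.
    (Illustrative choice, not necessarily the paper's exact barrier.)
    """
    return math.log((A * (a - x)) / (a * (A - x)))

def barrier_inv(s: float, a: float, A: float) -> float:
    """Inverse map: recover the constrained state from the unconstrained one."""
    return a * A * (math.exp(s) - 1.0) / (a * math.exp(s) - A)

# Round trip: the transformation is a bijection between (a, A) and R,
# so the unconstrained game can be solved and mapped back safely.
x = 0.5
s = barrier(x, -1.0, 2.0)
assert abs(barrier_inv(s, -1.0, 2.0) - x) < 1e-9

# Near the bounds the transformed state blows up, which is what lets the
# actor-critic learn in the unconstrained coordinates without ever
# producing a constraint-violating state.
print(barrier(1.999, -1.0, 2.0))   # large positive value near upper bound
print(barrier(-0.999, -1.0, 2.0))  # large negative value near lower bound
```

Because asymmetric bounds (a ≠ -A) are allowed directly in the formula, the same map covers both the symmetric and asymmetric constraint cases mentioned in the abstract.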