I. Introduction
In recent years, autonomous driving (AD) has attracted considerable attention, driven by rapid advances in artificial intelligence [1], [2]. Within an AD system, the decision-making module receives surrounding traffic information from the perception module and makes safe and efficient behavioral decisions, which are then passed to the planning and control modules [3]. Finite state machines [4], [5] are currently the most widely used behavioral decision models because of their simplicity and ease of implementation, but they ignore the dynamics and uncertainty of the environment. In addition, dividing and managing states becomes tedious when a driving scene has many features. As a result, these models are mostly applicable to simple scenarios and struggle with behavioral decision tasks in complex road environments with rich structured features [6].