I. Introduction
Significant progress has been achieved in the cooperative platooning of connected autonomous vehicular systems [1]. Using onboard sensors and vehicle-to-vehicle (V2V) communication, vehicles can exchange information such as speed, acceleration, and position with one another. Cooperative platooning of autonomous vehicular systems aims to design a distributed control law such that all vehicles travel at the same speed while maintaining a desired inter-vehicle distance. By keeping vehicles closely and steadily spaced, platooning can reduce fuel consumption while simultaneously increasing road capacity. In practice, however, vehicles are subject to various constraints, such as safe inter-vehicle distances and input saturation. Cooperatively learning a distributed control strategy for a connected autonomous vehicular system subject to multiple constraints therefore remains a challenge.