Abstract: The driving decisions of human drivers exhibit social intelligence for handling complex conditions, in addition to driving correctness. However, existing autonomous driving strategies focus mainly on the correctness of the perception-control mapping, which deviates from the driving logic that human drivers follow. To address this problem, this paper proposes a human-like autonomous driving strategy within an end-to-end control framework based on the deep deterministic policy gradient (DDPG). By applying rule constraints to the agent's continuous actions, an end-to-end unmanned control strategy is established that outputs continuous, reasonable driving behavior consistent with human driving logic. To enhance the safety of the end-to-end decision-making scheme, the strategy uses posterior feedback on the policy output to reduce the rate at which dangerous behaviors are emitted. To cope with catastrophic events during training, a continuous reward function is proposed to improve the stability of the training algorithm. Results validated in different simulation environments show that the proposed human-like autonomous driving strategy achieves better control performance than the traditional DDPG algorithm, and that the improved reward shaping method models the catastrophic events of sparse rewards in a way better suited to the control strategy: the optimization expectation of the objective function increases by 85.57%. Compared with the traditional DDPG algorithm, the proposed human-like DDPG autonomous driving strategy improves training efficiency by 21%, the task success rate by 19%, and task execution efficiency by 15.45%, significantly reducing collision accidents.
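To illustrate the idea behind replacing a sparse catastrophic penalty with a continuous reward, the sketch below contrasts the two schemes. This is a minimal illustrative assumption, not the paper's actual reward formulation: the function names, the safe-distance threshold, the quadratic penalty ramp, and all weights are hypothetical.

```python
def sparse_reward(collided: bool, speed: float) -> float:
    """Sparse scheme: a large penalty fires only at the catastrophic
    event itself, giving the learner a rare, discontinuous signal."""
    return -100.0 if collided else 0.1 * speed


def continuous_reward(distance_to_obstacle: float, speed: float,
                      safe_distance: float = 10.0) -> float:
    """Continuous scheme (illustrative): the penalty ramps up smoothly
    as the vehicle approaches an obstacle, so the value function sees
    a gradient before the collision ever happens."""
    progress = 0.1 * speed  # reward forward progress
    if distance_to_obstacle >= safe_distance:
        return progress
    # Quadratic ramp: 0 at the safe distance, -100 at contact,
    # matching the sparse collision penalty at the limit.
    closeness = 1.0 - distance_to_obstacle / safe_distance
    return progress - 100.0 * closeness ** 2
```

Under this toy model, a state halfway inside the safe distance already incurs a quarter of the full collision penalty, which is the kind of dense signal that can stabilize value estimation compared with a reward that is zero until the crash.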