Beyond Conservative Automated Driving in Multi-Agent Scenarios via Coupled Model Predictive Control and Deep Reinforcement Learning
Saeed Rahmani, Gözde Körpe, Zhenlin Xu, Bruno Brito + 2 more
TLDR
This paper introduces an MPC-RL framework for automated driving that balances safety and efficiency in multi-agent scenarios and outperforms standalone MPC and end-to-end RL.
Key contributions
- Combines Model Predictive Control (MPC) and Deep Reinforcement Learning (RL) for automated driving.
- Reduces collision rate by 21% and improves success rate by 6.5% compared to pure MPC.
- Achieves better zero-shot transfer to new scenarios, highlighting MPC's role in robustness.
- Shows faster training stabilization than end-to-end RL, indicating a reduced learning burden.
Why it matters
This work addresses the critical challenge of balancing safety and efficiency in autonomous driving, especially in complex multi-agent environments. By integrating MPC's structured safety with RL's adaptability, it offers a more robust and generalizable solution than current methods, paving the way for safer and more efficient autonomous systems.
Original Abstract
Automated driving at unsignalized intersections is challenging due to complex multi-vehicle interactions and the need to balance safety and efficiency. Model Predictive Control (MPC) offers structured constraint handling through optimization but relies on hand-crafted rules that often produce overly conservative behavior. Deep Reinforcement Learning (RL) learns adaptive behaviors from experience but often struggles with safety assurance and generalization to unseen environments. In this study, we present an integrated MPC-RL framework to improve navigation performance in multi-agent scenarios. Experiments show that MPC-RL outperforms standalone MPC and end-to-end RL across three traffic-density levels. Collectively, MPC-RL reduces the collision rate by 21% and improves the success rate by 6.5% compared to pure MPC. We further evaluate zero-shot transfer to a highway merging scenario without retraining. Both MPC-based methods transfer substantially better than end-to-end PPO, which highlights the role of the MPC backbone in cross-scenario robustness. The framework also shows faster loss stabilization than end-to-end RL during training, which indicates a reduced learning burden. These results suggest that the integrated approach can improve the balance between safety performance and efficiency in multi-agent intersection scenarios, while the MPC component provides a strong foundation for generalization across driving environments. The implementation code is available open-source.
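The abstract does not spell out how the two components are coupled, but one common pattern for MPC-RL integration is to let the learned policy set high-level parameters (e.g., a reference speed and cost weights) that a low-level MPC then tracks under hard safety constraints. The sketch below illustrates that pattern only; the toy longitudinal MPC, the hand-coded `rl_policy` stand-in, and all numeric values are illustrative assumptions, not the paper's actual formulation.

```python
import numpy as np

def mpc_longitudinal(v0, gap, v_ref, accel_weight, horizon=10, dt=0.2):
    """Toy finite-horizon MPC: choose a constant acceleration that tracks
    v_ref while keeping the predicted gap to a lead obstacle above a hard
    safety margin. Grid search keeps the sketch dependency-free; a real
    implementation would use a QP/NLP solver."""
    best_a, best_cost = 0.0, float("inf")
    for a in np.linspace(-3.0, 2.0, 51):      # candidate accelerations [m/s^2]
        v, g, cost, feasible = v0, gap, 0.0, True
        for _ in range(horizon):
            v = max(0.0, v + a * dt)
            g -= v * dt                        # gap shrinks as the ego advances
            if g < 2.0:                        # hard safety margin [m]
                feasible = False
                break
            cost += (v - v_ref) ** 2 + accel_weight * a ** 2
        if feasible and cost < best_cost:
            best_a, best_cost = a, cost
    return best_a

def rl_policy(observation):
    """Stand-in for the learned component: maps an observation to the MPC's
    reference speed and acceleration penalty. A trained policy (e.g., PPO)
    would replace this hand-coded rule."""
    gap = observation["gap"]
    v_ref = 10.0 if gap > 30.0 else 5.0        # less conservative when clear
    return v_ref, 0.5

# One step of the coupled loop: RL proposes, MPC enforces constraints.
obs = {"gap": 40.0}
v_ref, w = rl_policy(obs)
a_cmd = mpc_longitudinal(v0=8.0, gap=obs["gap"], v_ref=v_ref, accel_weight=w)
```

The division of labor mirrors the paper's claim: the RL layer adapts behavior (how assertive to be), while the MPC backbone guarantees constraint satisfaction, which is what would carry over in zero-shot transfer to a new scenario.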