**Abstract.** As autonomous vehicles (AVs) increasingly incorporate Reinforcement Learning (RL) into their decision-making, ensuring both security and stability becomes paramount. This paper introduces the Optimism Induction Attack (OIA), a novel adversarial strategy targeting Deep Reinforcement Learning (DRL) agents. In contrast to traditional adversarial attacks such as the Fast Gradient Sign Method (FGSM), which primarily degrade overall performance, OIA exploits the agent's misperception of state safety, causing it to overestimate safety margins and consequently make suboptimal decisions in critical scenarios. While OIA is applicable to any actor-critic RL algorithm, we conduct a case study on a Proximal Policy Optimization (PPO)-trained Adaptive Cruise Control (ACC) agent protected by a Control Barrier Function (CBF). Our analysis evaluates system performance using metrics such as collision rate, jerk, and engine torque. The results show that OIA significantly undermines both safety and efficiency, emerging as a subtler and more effective adversarial threat than FGSM, as evidenced by increased collision rates despite the nominal safety guarantees provided by the CBF. This work advances adversarial machine learning for AVs by highlighting the urgent need for more robust defense mechanisms against sophisticated attacks such as OIA, particularly in safety-critical applications.