Motion Planning and Interactive Decision Making

Faizan M. Tariq, David Isele, John S Baras, and Sangjae Bae.

RCMS: Risk-Aware Crash Mitigation System for Autonomous Vehicles

IEEE International Conference on Intelligent Transportation Systems (ITSC) 2023.

September, 2023 - Abstract

We propose a risk-aware crash mitigation system (RCMS), to augment any existing motion planner (MP), that enables an autonomous vehicle to perform evasive maneuvers in high-risk situations and minimize the severity of collision if a crash is inevitable. In order to facilitate a smooth transition between RCMS and MP, we develop a novel activation mechanism that combines instantaneous as well as predictive collision risk evaluation strategies in a unified hysteresis-band approach. For trajectory planning, we deploy a modular receding horizon optimization-based approach that minimizes a smooth situational risk profile, while adhering to the physical road limits as well as vehicular actuator limits. We demonstrate the performance of our approach in a simulation environment.
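The hysteresis-band idea mentioned in this abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the threshold values, the class name `HysteresisActivator`, and the choice to fuse the instantaneous and predictive risk estimates by taking their maximum are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class HysteresisActivator:
    """Two-threshold switch between a nominal planner and a mitigation mode."""

    on_threshold: float = 0.7   # hypothetical risk level that activates mitigation
    off_threshold: float = 0.3  # hypothetical risk level that hands control back
    active: bool = False

    def update(self, instantaneous_risk: float, predictive_risk: float) -> bool:
        # Assumption: the two risk estimates are fused by taking the larger one.
        risk = max(instantaneous_risk, predictive_risk)
        if not self.active and risk >= self.on_threshold:
            self.active = True       # risk crossed the upper edge of the band
        elif self.active and risk <= self.off_threshold:
            self.active = False      # risk fell below the lower edge of the band
        # Inside the band the previous mode is kept, which avoids chattering.
        return self.active


activator = HysteresisActivator()
modes = [activator.update(r_now, r_pred)
         for r_now, r_pred in [(0.5, 0.2), (0.8, 0.1), (0.5, 0.4), (0.2, 0.1)]]
```

Because the mode only changes when the fused risk leaves the band, a risk value hovering near a single threshold does not cause rapid switching between the two planners.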

Kanghoon Lee, Jiachen Li, David Isele, Jinkyoo Park, Kikuo Fujimura, Mykel J. Kochenderfer

Robust Driving Policy Learning with Guided Meta Reinforcement Learning.

IEEE International Conference on Intelligent Transportation Systems (ITSC) 2023

September, 2023 - Abstract

Although deep reinforcement learning (DRL) has shown promising results for autonomous navigation in interactive traffic scenarios, existing work typically adopts a fixed behavior policy to control social vehicles in the training environment. This may cause the learned driving policy to overfit the environment, making it difficult to interact well with vehicles with different, unseen behaviors. In this work, we introduce an efficient method to train diverse driving policies for social vehicles as a single meta-policy. By randomizing the interaction-based reward functions of social vehicles, we can generate diverse objectives and efficiently train the meta-policy through guiding policies that achieve specific objectives. We further propose a training strategy to enhance the robustness of the ego vehicle's driving policy using the environment where social vehicles are controlled by the learned meta-policy. Our method successfully learns an ego driving policy that generalizes well to unseen situations with out-of-distribution (OOD) social agents' behaviors in a challenging uncontrolled T-intersection scenario.

Dustin Holley, Jovin D’sa, Hossein Nourkhiz Mahjoub, Gibran Ali, Behdad Chalaki, Ehsan Moradi-Pari

MR-IDM - Merge Reactive Intelligent Driver Model: Towards Enhancing Laterally Aware Car-following Models

IEEE International Conference on Intelligent Transportation Systems (ITSC) 2023

September, 2023 - Abstract

This paper discusses the limitations of existing microscopic traffic models in accounting for the potential impacts of on-ramp vehicles on the car-following behavior of main-lane vehicles on highways. We first surveyed U.S. on-ramps to choose a representative set of on-ramps and then collected real-world observational data from the merging vehicle’s perspective in various traffic conditions ranging from free-flowing to rush-hour traffic jams. Next, as our core contribution, we introduce a novel car-following model, called MR-IDM, for highway driving that reacts to merging vehicles in a realistic way. This proposed driving model can either be used in traffic simulators to generate realistic highway driving behavior or integrated into a prediction module for autonomous vehicles attempting to merge onto the highway. We quantitatively evaluated the effectiveness of our model and compared it against several other methods. We show that MR-IDM has the least error in mimicking the real-world data, while having

Haimin Hu, David Isele, Sangjae Bae, Jaime F Fisac

Active Uncertainty Reduction for Safe and Efficient Interaction Planning: A Shielding-Aware Dual Control Approach

The International Journal of Robotics Research (IJRR)

September, 2023 - Abstract

The ability to accurately predict others' behavior is central to the safety and efficiency of robotic systems in interactive settings, such as human-robot interaction and multi-robot teaming tasks. Unfortunately, robots often lack access to key information on which these predictions may hinge, such as other agents' goals, attention, and willingness to cooperate. Dual control theory addresses this challenge by treating unknown parameters of a predictive model as stochastic hidden states and inferring their values at runtime using information gathered during system operation. While able to optimally and automatically trade off exploration and exploitation, dual control is computationally intractable for general interactive motion planning, mainly due to the fundamental coupling between the robot's trajectory plan and its prediction of other agents' intent. In this paper, we present a novel algorithmic approach to enable active uncertainty reduction for interactive motion planning based on the implicit dual control paradigm. Our approach relies on sampling-based approximation of stochastic dynamic programming, leading to a model predictive control problem that can be readily solved by real-time gradient-based optimization methods. The resulting policy is shown to preserve the dual control effect for a broad class of predictive models with both continuous and categorical uncertainty. To ensure the safe operation of the interacting agents, we use a runtime safety filter (also referred to as a "shielding" scheme), which overrides the robot's dual control policy with a safety fallback strategy when a safety-critical event is imminent. We then augment the dual control framework with an improved variant of the recently proposed shielding-aware robust planning scheme, which proactively balances the nominal planning performance with the risk of high-cost emergency maneuvers triggered by low-probability agent behaviors. We demonstrate the efficacy of our approach with both simulated driving studies and hardware experiments using 1/10 scale autonomous vehicles.

Piyush Gupta, David Isele, Donggun Lee, Sangjae Bae

Interaction-Aware Trajectory Planning for Autonomous Vehicles with Analytic Integration of Neural Networks into Model Predictive Control.

International Conference on Robotics and Automation (ICRA) 2023, London

May, 2023 - Abstract

Autonomous vehicles (AVs) must share the driving space with other drivers and often employ conservative motion planning strategies to ensure safety. These conservative strategies can negatively impact AV’s performance and significantly slow traffic throughput. Therefore, to avoid conservatism, we design an interaction-aware motion planner for the ego vehicle (AV) that interacts with surrounding vehicles to perform complex maneuvers in a locally optimal manner. Our planner uses a neural network-based interactive trajectory predictor and analytically integrates it with model predictive control (MPC). We solve the MPC optimization using the alternating direction method of multipliers (ADMM) and prove the algorithm’s convergence. We provide an empirical study and compare our method with a baseline heuristic method.

Behdad Chalaki, Vaishnav Tadiparthi, Hossein Nourkhiz Mahjoub, Jovin D’sa, Ehsan Moradi-Pari, Andres S. Chavez Armijos, Anni Li, and Christos G. Cassandras.

Minimally Disruptive Cooperative Lane-Change Maneuvers

IEEE Control Systems Letters, vol. 7, pp. 1766-1771, 2023, Proceedings of the 2023 IEEE 62nd Conference on Decision and Control (CDC)

May, 2023 - Abstract

A lane-change maneuver on a congested highway could be severely disruptive or even infeasible without the cooperation of neighboring cars. However, cooperation with other vehicles does not guarantee that the performed maneuver will not have a negative impact on traffic flow unless it is explicitly considered in the cooperative controller design. In this letter, we present a socially compliant framework for cooperative lane-change maneuvers for an arbitrary number of CAVs on highways that aims to interrupt traffic flow as minimally as possible. Moreover, we explicitly impose feasibility constraints in the optimization formulation by using reachability set theory, leading to a unified design that removes the need for an iterative procedure used in prior work. We quantitatively evaluate the effectiveness of our framework and compare it against previously offered approaches in terms of maneuver time and incurred throughput disruption.

F. Tariq, D. Isele, J. Baras, S. Bae

SLAS: Speed and Lane Advisory System for Highway Navigation

IEEE Conference on Decision and Control (CDC), 2022

December, 2022 - Abstract

This paper proposes a hierarchical autonomous vehicle navigation architecture, composed of a high-level speed and lane advisory system (SLAS) coupled with low-level trajectory generation and trajectory following modules. Specifically, we target a multi-lane highway driving scenario where an autonomous ego vehicle navigates in traffic. We propose a novel receding horizon mixed-integer optimization based method for SLAS with the objective to minimize travel time while accounting for passenger comfort. We further incorporate various modifications in the proposed approach to improve the overall computational efficiency and achieve real-time performance. We demonstrate the efficacy of the proposed approach in contrast to the existing methods, when applied in conjunction with state-of-the-art trajectory generation and trajectory following frameworks, in a CARLA simulation environment.

K. Lee, D. Isele, E. Theodorou, S. Bae

Risk-sensitive MPCs with Deep Distributional Inverse RL for Autonomous Driving

IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2022, K

October, 2022 - Abstract

In robot learning from demonstration (LfD), a visual representation of a cost function inferred from Inverse Reinforcement Learning (IRL) provides an intuitive tool for humans to quickly interpret the underlying objectives of the demonstration. The inferred cost function can be used by controllers, for example, Model Predictive Controllers (MPCs). In this work, we improve the recently developed IRL-MPC framework, by enhancing it in a risk-sensitive formulation to be more applicable for safety-critical applications like autonomous driving. Our risk-sensitive MPCs together with the distributional costmap demonstrate lower collision rates in the CARLA simulator for autonomous driving tasks compared to other learning-based baseline methods.

A. Moradipari, S. Bae, M. Alizadeh, E. Moradi Pari, D. Isele

Predicting Parameters for Modeling Traffic Participants

IEEE International Conference on Intelligent Transportation Systems (ITSC) 2022

October, 2022 - Abstract

Accurately modeling the behavior of traffic participants is essential for safely and efficiently navigating an autonomous vehicle through heavy traffic. We propose a method, based on the intelligent driver model, that allows us to accurately model individual driver behaviors from only a small number of frames using easily observable features. On average, this method makes prediction errors that have less than 1 meter difference from an oracle with full information when analyzed over a 10-second horizon of highway driving. We then validate the efficiency of our method through extensive analysis against a competitive data-driven method such as Reinforcement Learning that may be of independent interest.
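For readers unfamiliar with the intelligent driver model (IDM) that this abstract builds on, the sketch below shows the standard IDM acceleration law from the traffic-modeling literature. It is not the parameter-prediction method of the paper. The function name and the default parameter values are illustrative assumptions, and the `max(0, ...)` clamp on the dynamic part of the desired gap is a commonly used variant rather than something stated in the abstract.

```python
import math


def idm_acceleration(v, v_lead, gap,
                     v0=30.0,    # desired speed [m/s] (illustrative value)
                     T=1.5,      # desired time headway [s] (illustrative value)
                     a_max=1.0,  # maximum acceleration [m/s^2] (illustrative value)
                     b=2.0,      # comfortable deceleration [m/s^2] (illustrative value)
                     s0=2.0,     # minimum standstill gap [m] (illustrative value)
                     delta=4.0): # free-road acceleration exponent
    """Standard IDM longitudinal acceleration of a follower vehicle.

    v      : follower speed [m/s]
    v_lead : leader speed [m/s]
    gap    : bumper-to-bumper distance to the leader [m], must be > 0
    """
    dv = v - v_lead  # approach rate (positive when closing in on the leader)
    # Desired dynamic gap; the clamp keeps it from dropping below s0.
    s_star = s0 + max(0.0, v * T + v * dv / (2.0 * math.sqrt(a_max * b)))
    free_road_term = (v / v0) ** delta
    interaction_term = (s_star / gap) ** 2
    return a_max * (1.0 - free_road_term - interaction_term)


# Closing fast on a slower leader at short range yields strong braking.
braking = idm_acceleration(v=20.0, v_lead=10.0, gap=10.0)
```

The parameters `v0`, `T`, `a_max`, `b`, and `s0` are the kind of driver-specific quantities that a parameter-prediction method would estimate from observed frames.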

Lilit Avetisyan, Chengxin Zhang, Sue Bai, Ehsan Moradi Pari, Fred Feng, Shan Bao & Feng Zhou

Design a sustainable micro-mobility future: trends and challenges in the US and EU

Journal of Engineering Design, 33:8-9, 587-606, 2022

October, 2022 - Abstract

Micro-mobility is promising to contribute to sustainable cities with its efficiency and low cost. To better design such a sustainable future, it is necessary to understand the trends and challenges. Thus, we examined people's opinions on micro-mobility in the US and the EU using Tweets. We used topic modelling based on advanced natural language processing techniques and categorised the data into seven topics: promotion and service, mobility, technical features, acceptance, recreation, infrastructure and regulations. Furthermore, using sentiment analysis, we investigated people's positive and negative attitudes towards specific aspects of these topics and compared the patterns of the trends and challenges in the US and the EU. We found that (1) promotion and service included the majority of Twitter discussions in both regions, (2) the EU had more positive opinions than the US, (3) micro-mobility devices were more widely used for utilitarian mobility and recreational purposes in the EU than in the US and (4) compared to the EU, people in the US had many more concerns related to infrastructure and regulation issues. These findings help us design and prioritise micro-mobility to improve their safety and experience across the two areas for designing a more sustainable micro-mobility future.

A. S. Chavez Armijos, R. Chen, C. G. Cassandras, Y. K. Al-Nadawi, H. N. Mahjoub and H. Araki

Sequential Cooperative Energy and Time-Optimal Lane Change Maneuvers for Highway Traffic

International Conference on Intelligent Transportation Systems (ITSC)

September, 2022 - Abstract

We derive optimal control policies for a Connected Automated Vehicle (CAV) and cooperating neighboring CAVs to carry out a lane change maneuver consisting of a longitudinal phase where the CAV properly positions itself relative to the cooperating neighbors and a lateral phase where it safely changes lanes. In contrast to prior work on this problem, where the CAV “selfishly” seeks to minimize its maneuver time, we seek to ensure that the fast-lane traffic flow is minimally disrupted (through a properly defined metric) and that highway throughput is improved by optimally selecting the cooperating vehicles. We show that analytical solutions for the optimal trajectories can be derived and are guaranteed to satisfy safety constraints for all vehicles involved in the maneuver. When feasible solutions do not exist, we include a time relaxation method trading off a longer maneuver time with reduced disruption. Our analysis is also extended to multiple sequential maneuvers. Simulation results where the controllers are implemented show their effectiveness in terms of safety guarantees and up to 35 % throughput improvement compared to maneuvers with no vehicle cooperation.

S. Bae, D. Isele, A. Nakhaei, P. Xu, A. Miranda Anon, C. Choi, K. Fujimura, S. Moura

Lane-Change in Dense Traffic With Model Predictive Control and Neural Networks

IEEE Transactions on Control Systems Technology

August, 2022 - Abstract

This article presents an online smooth-path lane-change control framework. We focus on dense traffic where intervehicle space gaps are narrow, and cooperation with surrounding drivers is essential to achieve the lane-change maneuver. We propose a two-stage control framework that harmonizes model predictive control (MPC) with generative adversarial networks (GANs) by utilizing driving intentions to generate smooth lane-change maneuvers. To improve performance in practice, the system is augmented with an adaptive safety boundary and a Kalman filter to mitigate sensor noise. Simulation studies are investigated at different levels of traffic density and cooperativeness of other drivers. The simulation results support the effectiveness, driving comfort, and safety of the proposed method.

K. Lee, D. Isele, E. A. Theodorou, S. Bae

Spatiotemporal Costmap Inference for MPC Via Deep Inverse Reinforcement Learning

IEEE Robotics and Automation Letters (RA-L)

April, 2022 - Abstract

It can be difficult to autonomously produce driver behavior so that it appears natural to other traffic participants. Through Inverse Reinforcement Learning (IRL), we can automate this process by learning the underlying reward function from human demonstrations. We propose a new IRL algorithm that learns a goal-conditioned spatiotemporal reward function. The resulting costmap is used by Model Predictive Controllers (MPCs) to perform a task without any hand-designing or hand-tuning of the cost function. We evaluate our proposed Goal-conditioned SpatioTemporal Zeroing Maximum Entropy Deep IRL (GSTZ)-MEDIRL framework together with MPC in the CARLA simulator for autonomous driving, lane keeping, and lane changing tasks in a challenging dense traffic highway scenario. Our proposed methods show higher success rates compared to other baseline methods including behavior cloning, state-of-the-art RL policies, and MPC with a learning-based behavior prediction model.

X. Ma, D. Isele, J. K. Gupta, K. Fujimura, M. J. Kochenderfer

Recursive Reasoning Graph for Multi-Agent Reinforcement Learning

AAAI Conference on Artificial Intelligence

February, 2022 - Abstract

Multi-agent reinforcement learning (MARL) provides an efficient way for simultaneously learning policies for multiple agents interacting with each other. However, in scenarios requiring complex interactions, existing algorithms can suffer from an inability to accurately anticipate the influence of self-actions on other agents. Incorporating an ability to reason about other agents’ potential responses can allow an agent to formulate more effective strategies. This paper adopts a recursive reasoning model in a centralized-training-decentralized-execution framework to help learning agents better cooperate with or compete against others. The proposed algorithm, referred to as the Recursive Reasoning Graph (R2G), shows state-of-the-art performance on multiple multi-agent particle and robotics games.

Ran Tian, Liting Sun, Masayoshi Tomizuka, David Isele.

Anytime Game-Theoretic Planning with Safe and Active Information Gathering on Humans’ Latent States for Human-Centered Robots

International Conference on Robotics and Automation (ICRA) 2021

May, 2021 - Abstract

A human-centered robot needs to reason about the cognitive limitations and potential irrationality of its human partner to achieve seamless interactions. This paper proposes a novel anytime game-theoretic planning framework that integrates iterative reasoning models, partially observable Markov decision process, and Monte-Carlo belief tree search for robot behavioral planning. Our planner equips a robot with the ability to reason about its human partner’s latent cognitive states (bounded intelligence and irrationality) and enables the robot to actively learn these latent states to better maximize its utility. Furthermore, our planner handles safety explicitly by enforcing chance constraints. We validate our approach in an autonomous driving domain where our behavioral planner and a low-level motion controller hierarchically control an autonomous car to negotiate traffic merges. Simulations and user studies are conducted to show our planner’s effectiveness.

Xiaobai Ma, Jiachen Li, Mykel J. Kochenderfer, David Isele, Kikuo Fujimura

Reinforcement Learning for Autonomous Driving with Latent State Inference and Spatial-Temporal Relationships

International Conference on Robotics and Automation (ICRA) 2021

May, 2021 - Abstract

Deep reinforcement learning (DRL) provides a promising way for learning navigation in complex autonomous driving scenarios. However, identifying the subtle cues that can indicate drastically different outcomes remains an open problem with designing autonomous systems that operate in human environments. In this work, we show that explicitly inferring the latent state and encoding spatial-temporal relationships in a reinforcement learning framework can help address this difficulty. We encode prior knowledge on the latent states of other drivers through a framework that combines the reinforcement learner with a supervised learner. In addition, we model the influence passing between different vehicles through graph neural networks (GNNs). The proposed framework significantly improves performance in the context of navigating T-intersections compared with state-of-the-art baseline approaches.

Dhruv Mauria Saxena, Sangjae Bae, Alireza Nakhaei, Kikuo Fujimura, Maxim Likhachev

Driving in Dense Traffic with Model-Free Reinforcement Learning

International Conference on Robotics and Automation (ICRA) 2020

November, 2020 - Abstract

Traditional planning and control methods could fail to find a feasible trajectory for an autonomous vehicle to execute amongst dense traffic on roads. This is because the obstacle-free volume in spacetime is very small in these scenarios for the vehicle to drive through. However, that does not mean the task is infeasible since human drivers are known to be able to drive amongst dense traffic by leveraging the cooperativeness of other drivers to open a gap. The traditional methods fail to take into account the fact that the actions taken by an agent affect the behaviour of other vehicles on the road. In this work, we rely on the ability of deep reinforcement learning to implicitly model such interactions and learn a continuous control policy over the action space of an autonomous vehicle. The application we consider requires our agent to negotiate and open a gap in the road in order to successfully merge or change lanes. Our policy learns to repeatedly probe into the target road lane while trying to find a safe spot to move in to. We compare against two model-predictive control-based algorithms and show that our policy outperforms them in simulation.

Huckleberry Febbo, Jiawei Huang, David Isele

A Comprehensive Trajectory Planner for a Person-Following ATV

International Conference on Intelligent Robots and Systems (IROS) 2020

October, 2020 - Abstract

This paper presents a trajectory planning algorithm for person following that is more comprehensive than existing algorithms. This algorithm is tailored for a front-wheel-steered vehicle, is designed to follow a person while avoiding collisions with both static and moving obstacles, simultaneously optimizing speed and steering, and minimizing control effort. This algorithm uses nonlinear model predictive control, where the underlying trajectory optimization problem is approximated using a simultaneous method. Results collected in an unknown environment show that the proposed planning algorithm works well with a perception algorithm to follow a person in uneven grass near obstacles and over ditches and curbs, and on asphalt over train tracks and near buildings and cars. Overall, the results indicate that the proposed algorithm can safely follow a person in unknown, dynamic environments.

Maxime Bouton, David Isele, Alireza Nakhaei, Kikuo Fujimura, and Mykel J. Kochenderfer

Reinforcement Learning with Iterative Reasoning for Merging in Dense Traffic

IEEE Intelligent Transportation Systems Conference (ITSC) 2020

September, 2020 - Abstract

Maneuvering in dense traffic is a challenging task for autonomous vehicles because it requires reasoning about the stochastic behaviors of many other participants. In addition, the agent must achieve the maneuver within a limited time and distance. In this work, we propose a combination of reinforcement learning and game theory to learn merging behaviors. We design a training curriculum for a reinforcement learning agent using the concept of level-k behavior. This approach exposes the agent to a broad variety of behaviors during training, which promotes learning policies that are robust to model discrepancies. We show that our approach learns more efficient policies than traditional training methods.

Huckleberry Febbo, David Isele

Accurate Trajectory Following for Automated Vehicles in Dynamic Environments

American Control Conference (ACC) 2020

July, 2020 - Abstract

This paper introduces an accurate nonlinear model predictive control-based algorithm for trajectory following. For accuracy, the algorithm incorporates both the planned state and control trajectories into its cost functional. Current following algorithms do not incorporate control trajectories into their cost functionals. Comparisons are made against two trajectory following algorithms, where the trajectory planning problem is to safely follow a person using an automated ATV with control delays in a dynamic environment while simultaneously optimizing speed and steering, minimizing control effort, and minimizing the time-to-goal. Results indicate that the proposed algorithm reduces collisions, tracking error, orientation error, and time-to-goal. Therefore, tracking the control trajectories with the trajectory following algorithm helps the vehicle follow the planned state trajectories more accurately, which ultimately improves safety, especially in dynamic environments.

Sangjae Bae, Dhruv Saxena, Alireza Nakhaei, Chiho Choi, Kikuo Fujimura, Scott Moura

Cooperation-aware Lane Change Control in Dense Traffic.

American Control Conference (ACC) 2020

July, 2020 - Abstract

This paper presents an online smooth-path lane-change control framework. We focus on dense traffic where inter-vehicle space gaps are narrow, and cooperation with surrounding drivers is essential to achieve the lane-change maneuver. We propose a two-stage control framework that harmonizes Model Predictive Control (MPC) with Generative Adversarial Networks (GAN) by utilizing driving intentions to generate smooth lane-change maneuvers. To improve performance in practice, the system is augmented with an adaptive safety boundary and a Kalman Filter to mitigate sensor noise. Simulation studies are investigated in different levels of traffic density and cooperativeness of other drivers. The simulation results support the effectiveness, driving comfort, and safety of the proposed method.

Jiachen Yang, Alireza Nakhaei, David Isele, Kikuo Fujimura, and Hongyuan Zha

CM3: Cooperative Multi-goal Multi-stage Multi-agent Reinforcement Learning

International Conference on Learning Representations (ICLR) 2020

January, 2020 - Abstract

A variety of cooperative multi-agent control problems require agents to achieve individual goals while contributing to collective success. This multi-goal multi-agent setting poses difficulties for recent algorithms, which primarily target settings with a single global reward, due to two new challenges: efficient exploration for learning both individual goal attainment and cooperation for others’ success, and credit-assignment for interactions between actions and goals of different agents. To address both challenges, we restructure the problem into a novel two-stage curriculum, in which single-agent goal attainment is learned prior to learning multi-agent cooperation, and we derive a new multi-goal multi-agent policy gradient with a credit function for localized credit assignment. We use a function augmentation scheme to bridge value and policy functions across the curriculum. The complete architecture, called CM3, learns significantly faster than direct adaptations of existing algorithms on three challenging multi-goal multi-agent problems: cooperative navigation in difficult formations, negotiating multi-vehicle lane changes in the SUMO traffic simulator, and strategic cooperation in a Checkers environment.

Athma Narayanan, Avinash Siravuru, Behzad Dariush

Gated Recurrent Fusion to Learn Driving Behavior from Temporal Multimodal Data

IEEE Robotics and Automation Letters 5 (2), 1287-1294. 2020

January, 2020 - Abstract

The Tactical Driver Behavior modeling problem requires an understanding of driver actions in complicated urban scenarios from rich multimodal signals including video, LiDAR and CAN signal data streams. However, the majority of deep learning research is focused either on learning the vehicle/environment state (sensor fusion) or the driver policy (from temporal data), but not both. Learning both tasks jointly offers the richest distillation of knowledge but presents challenges in the formulation and successful training. In this work, we propose promising first steps in this direction. Inspired by the gating mechanisms in Long Short-Term Memory units (LSTMs), we propose Gated Recurrent Fusion Units (GRFU) that learn fusion weighting and temporal weighting simultaneously. We demonstrate its superior performance over multimodal and temporal baselines in supervised regression and classification tasks, all in the realm of autonomous navigation. On tactical driver behavior classification using Honda Driving Dataset (HDD), we report 10% improvement in mean Average Precision (mAP) score, and similarly, for steering angle regression on TORCS dataset, we note a 20% drop in Mean Squared Error (MSE) over the state-of-the-art.

Maxime Bouton, Alireza Nakhaei, Kikuo Fujimura, Mykel J Kochenderfer

Cooperation-Aware Reinforcement Learning for Merging in Dense Traffic

IEEE Intelligent Transportation Systems Conference (ITSC) 2019

October, 2019 - Abstract

Decision making in dense traffic can be challenging for autonomous vehicles. An autonomous system only relying on predefined road priorities and considering other drivers as moving objects will cause the vehicle to freeze and fail the maneuver. Human drivers leverage the cooperation of other drivers to avoid such deadlock situations and convince others to change their behavior. Decision making algorithms must reason about the interaction with other drivers and anticipate a broad range of driver behaviors. In this work, we present a reinforcement learning approach to learn how to interact with drivers with different cooperation levels. We enhanced the performance of traditional reinforcement learning algorithms by maintaining a belief over the level of cooperation of other drivers. We show that our agent successfully learns how to navigate a dense merging scenario with fewer deadlocks than with online planning methods.

David Isele

Interactive Decision Making for Autonomous Vehicles in Dense Traffic

IEEE Intelligent Transportation Systems Conference (ITSC) 2019

October, 2019 - Abstract

Dense urban traffic environments can produce situations where accurate prediction and dynamic models are insufficient for successful autonomous vehicle motion planning. We investigate how an autonomous agent can safely negotiate with other traffic participants, enabling the agent to handle potential deadlocks. Specifically, we consider merges where the gap between cars is smaller than the size of the ego vehicle. We propose a game theoretic framework capable of generating and responding to interactive behaviors. Our main contribution is to show how game-tree decision making can be executed by an autonomous vehicle, including approximations and reasoning that make the tree-search computationally tractable. Additionally, to test our model we develop a stochastic rule-based traffic agent capable of generating interactive behaviors that can be used as a benchmark for simulating traffic participants in a crowded merge setting.

Maxime Bouton, Alireza Nakhaei, Kikuo Fujimura, Mykel J Kochenderfer.

Safe Reinforcement Learning with Scene Decomposition for Navigating Complex Urban Environments

IEEE Intelligent Vehicles Symposium (IV) 2019

June, 2019 - Abstract

Navigating urban environments represents a complex task for automated vehicles. They must reach their goal safely and efficiently while considering a multitude of traffic participants. We propose a modular decision making algorithm to autonomously navigate intersections, addressing challenges of existing rule-based and reinforcement learning (RL) approaches. We first present a safe RL algorithm relying on a model-checker to ensure safety guarantees. To make the decision strategy robust to perception errors and occlusions, we introduce a belief update technique using a learning based approach. Finally, we use a scene decomposition approach to scale our algorithm to environments with multiple traffic participants. We empirically demonstrate that our algorithm outperforms rule-based methods and reinforcement learning techniques on a complex intersection scenario.

Safa Cicek, Alireza Nakhaei, Stefano Soatto, Kikuo Fujimura.

MARL-PPS: Multi-agent Reinforcement Learning with Periodic Parameter Sharing

Autonomous Agents and Multi Agent Systems (AAMAS) 2019

May, 2019 - Abstract

We present a multi-agent reinforcement learning algorithm that is a simple yet effective modification of a known algorithm. External agents are modeled as a time-varying environment, whose policy parameters are updated periodically at a slower rate than the planner to make learning stable and more efficient. The replay buffer, which is used to store the experiences, is also reset with the same large period to draw samples from a fixed environment. This enables us to address challenging cooperative control problems in highway navigation. The resulting Multi-agent Reinforcement Learning with Periodic Parameter Sharing (MARL-PPS) algorithm outperforms the baselines in multi-agent highway scenarios we tested.
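
The two periodic operations described in the abstract (copying the planner's parameters to the external agents and resetting the replay buffer) can be sketched as follows. This is a minimal illustration under stated assumptions, not the published algorithm: the class name, the `update_fn` callback, and the dictionary representation of parameters are hypothetical.

```python
import copy
from collections import deque

class PeriodicSharing:
    """Hypothetical training-loop helper for periodic parameter sharing."""

    def __init__(self, planner_params, period, buffer_size=10000):
        self.planner_params = planner_params
        # External agents start from a frozen copy of the planner's parameters.
        self.external_params = copy.deepcopy(planner_params)
        self.period = period
        self.buffer = deque(maxlen=buffer_size)
        self.step_count = 0

    def step(self, transition, update_fn):
        """Store one transition, update the planner, and sync every `period` steps."""
        self.buffer.append(transition)
        # update_fn is an assumed learner: (params, buffer) -> new params.
        self.planner_params = update_fn(self.planner_params, self.buffer)
        self.step_count += 1
        if self.step_count % self.period == 0:
            # External agents adopt the planner's current parameters...
            self.external_params = copy.deepcopy(self.planner_params)
            # ...and old experience, gathered against the previous external
            # policy, is discarded so samples come from a fixed environment.
            self.buffer.clear()
```

Between syncs the external agents' parameters stay fixed, so from the planner's point of view the environment is stationary for `period` steps at a time.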

Maxime Bouton, Kyle D Julian, Alireza Nakhaei, Kikuo Fujimura, Mykel J Kochenderfer

Decomposition Methods with Deep Corrections for Reinforcement Learning

Autonomous Agents and Multi-Agent Systems (AAMAS) 2019

May, 2019 - Abstract

Decomposition methods have been proposed to approximate solutions to large sequential decision making problems. In contexts where an agent interacts with multiple entities, utility decomposition can be used to separate the global objective into local tasks considering each individual entity independently. An arbitrator is then responsible for combining the individual utilities and selecting an action in real time to solve the global problem. Although these techniques can perform well empirically, they rely on strong assumptions of independence between the local tasks and sacrifice the optimality of the global solution. This paper proposes an approach that improves upon such approximate solutions by learning a correction term represented by a neural network. We demonstrate this approach on a fisheries management problem where multiple boats must coordinate to maximize their catch over time as well as on a pedestrian avoidance problem for autonomous driving. In each problem, decomposition methods can scale to multiple boats or pedestrians by using strategies involving one entity. We verify empirically that the proposed correction method significantly improves the decomposition method and outperforms a policy trained on the full scale problem without utility decomposition.
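
A small sketch may help make the combination of utility decomposition and a correction term concrete. It assumes the local utilities are fused by a sum or a minimum over entities and that the correction is added to the fused value; the function names are hypothetical, and a plain array stands in for the neural network that the paper uses to represent the correction.

```python
import numpy as np

def decomposed_q(local_qs, fusion="sum"):
    """Fuse per-entity action values of shape (n_entities, n_actions)."""
    local_qs = np.asarray(local_qs, dtype=float)
    if fusion == "sum":
        return local_qs.sum(axis=0)
    if fusion == "min":
        return local_qs.min(axis=0)
    raise ValueError(f"unknown fusion: {fusion}")

def corrected_action(local_qs, correction, fusion="sum"):
    """Arbitrator: pick the action maximizing fused value plus correction.

    `correction` is a per-action array standing in for the output of a
    learned correction network (an assumption made for this sketch).
    """
    q = decomposed_q(local_qs, fusion) + np.asarray(correction, dtype=float)
    return int(np.argmax(q))
```

With a zero correction the arbitrator reduces to plain utility decomposition; a nonzero correction can change which action is selected when the independence assumption between local tasks does not hold.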

David Isele, Reza Rahimi, Akansel Cosgun, Kaushik Subramanian, Kikuo Fujimura

Navigating Occluded Intersections with Autonomous Vehicles using Deep Reinforcement Learning

International Conference on Robotics and Automation (ICRA) 2018

February, 2018 - Abstract

Providing an efficient strategy to navigate safely through unsignaled intersections is a difficult task that requires determining the intent of other drivers. We explore the effectiveness of Deep Reinforcement Learning to handle intersection problems. Using recent advances in Deep RL, we are able to learn policies that surpass the performance of a commonly used heuristic approach in several metrics, including task completion time and goal success rate, and that have a limited ability to generalize. We then explore a system’s ability to learn active sensing behaviors to enable navigating safely in the case of occlusions. Our analysis provides insight into the intersection handling problem: the solutions learned by the network point out several shortcomings of current rule-based methods, and the failures of our current deep reinforcement learning system point to future research directions.

David Isele, Akansel Cosgun

Selective Experience Replay for Lifelong Learning

AAAI Conference on Artificial Intelligence (AAAI) 2018

February, 2018 - Abstract

Deep reinforcement learning has emerged as a powerful tool for a variety of learning tasks; however, deep nets typically exhibit forgetting when learning multiple tasks in sequence. To mitigate forgetting, we propose an experience replay process that augments the standard FIFO buffer and selectively stores experiences in a long-term memory. We explore four strategies for selecting which experiences will be stored: favoring surprise, favoring reward, matching the global training distribution, and maximizing coverage of the state space. We show that distribution matching successfully prevents catastrophic forgetting, and is consistently the best approach on all domains tested. While distribution matching has better and more consistent performance, we identify one case in which coverage maximization is beneficial: when tasks that receive less training are more important. Overall, our results show that selective experience replay, when suitable selection algorithms are employed, can prevent catastrophic forgetting.
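
One standard way to keep a fixed-size memory whose contents match the distribution of everything seen so far is reservoir sampling. The sketch below illustrates that technique as a possible realization of the distribution-matching strategy; whether it mirrors the authors' implementation is an assumption, and the class and attribute names are hypothetical.

```python
import random

class ReservoirBuffer:
    """Fixed-size long-term memory filled by reservoir sampling."""

    def __init__(self, capacity, seed=None):
        self.capacity = capacity
        self.items = []
        self.seen = 0  # total number of experiences offered so far
        self.rng = random.Random(seed)

    def add(self, experience):
        """Offer one experience; each one seen is retained with equal probability."""
        self.seen += 1
        if len(self.items) < self.capacity:
            # Buffer not yet full: always store.
            self.items.append(experience)
        else:
            # Draw a slot index in [0, seen); overwrite only if it falls
            # inside the buffer, which happens with probability capacity/seen.
            j = self.rng.randrange(self.seen)
            if j < self.capacity:
                self.items[j] = experience
```

Unlike a FIFO buffer, which only holds the most recent experiences, this memory keeps early-task experiences with the same probability as recent ones, which is the property that distribution matching relies on.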

Motion Planning and Interactive Decision Making - Honda Research Institute USA

Related Publications

RCMS: Risk-Aware Crash Mitigation System for Autonomous Vehicles

Robust Driving Policy Learning with Guided Meta Reinforcement Learning.

MR-IDM - Merge Reactive Intelligent Driver Model: Towards Enhancing Laterally Aware Car-following Models

Active Uncertainty Reduction for Safe and Efficient Interaction Planning: A Shielding-Aware Dual Control Approach

Interaction-Aware Trajectory Planning for Autonomous Vehicles with Analytic Integration of Neural Networks into Model Predictive Control.

Minimally Disruptive Cooperative Lane-Change Maneuvers

SLAS: Speed and Lane Advisory System for Highway Navigation

Risk-sensitive MPCs with Deep Distributional Inverse RL for Autonomous Driving

Predicting Parameters for Modeling Traffic Participants

Design a sustainable micro-mobility future: trends and challenges in the US and EU

Sequential Cooperative Energy and Time-Optimal Lane Change Maneuvers for Highway Traffic

Lane-Change in Dense Traffic With Model Predictive Control and Neural Networks

Spatiotemporal Costmap Inference for MPC Via Deep Inverse Reinforcement Learning

Recursive Reasoning Graph for Multi-Agent Reinforcement Learning

Anytime Game-Theoretic Planning with Safe and Active Information Gathering on Humans’ Latent States for Human-Centered Robots

Reinforcement Learning for Autonomous Driving with Latent State Inference and Spatial-Temporal Relationships

Driving in Dense Traffic with Model-Free Reinforcement Learning

A Comprehensive Trajectory Planner for a Person-Following ATV

Reinforcement Learning with Iterative Reasoning for Merging in Dense Traffic

Accurate Trajectory Following for Automated Vehicles in Dynamic Environments

Cooperation-aware Lane Change Control in Dense Traffic.

CM3: Cooperative Multi-goal Multi-stage Multi-agent Reinforcement Learning

Gated Recurrent Fusion to Learn Driving Behavior from Temporal Multimodal Data

Cooperation-Aware Reinforcement Learning for Merging in Dense Traffic

Interactive Decision Making for Autonomous Vehicles in Dense Traffic

Safe Reinforcement Learning with Scene Decomposition for Navigating Complex Urban Environments

MARL-PPS: Multi-agent Reinforcement Learning with Periodic Parameter Sharing

Decomposition Methods with Deep Corrections for Reinforcement Learning

Navigating Occluded Intersections with Autonomous Vehicles using Deep Reinforcement Learning

Selective Experience Replay for Lifelong Learning

Related Research Areas

Robotics

Machine Learning & Computer Vision

Automated Vehicles