End-to-End Spatio-Temporal Attention-Based Lane-Change Intention Prediction from Multi-Perspective Cameras

Zhouqiao Zhao Zhensong Wei Danyang Tian Bryan Reimer Pnina Gershon Ehsan Moradi-Pari

2023 IEEE Intelligent Vehicles Symposium (IV)

Advanced Driver Assistance Systems (ADAS) with proactive alerts have been used to increase driving safety. Such systems' performance greatly depends on how accurately and quickly risky situations and maneuvers are detected. Existing ADAS provide warnings based on the vehicle's operational status, environment perception, and drivers' overt actions (e.g., activating turn signals or turning the steering wheel), which may not give drivers sufficient time to react. In this paper, we propose a spatio-temporal attention-based neural network to predict drivers' lane-change intention by fusing videos from both in-cabin and forward perspectives. A Convolutional Neural Network (CNN)–Recurrent Neural Network (RNN) architecture was leveraged to extract both spatial and temporal information. On top of this backbone, feature maps from different time steps and perspectives were fused using multi-head self-attention at each resolution of the CNN. The proposed model was trained and evaluated on a processed subset of the MIT Advanced Vehicle Technology (MIT-AVT) dataset, which contains synchronized CAN data, 11,058 seconds of video from 3 different views, and 548 lane-change and 274 non-lane-change events performed by 83 drivers. The results demonstrate that the model achieves an 87% F1-score within the 1-second validation window and a 70% F1-score within the 5-second validation window, with real-time performance.
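The following is a minimal sketch of the fusion mechanism the abstract describes, assuming a PyTorch-style implementation. All module names, dimensions, the class count, and the choice of GRU are illustrative assumptions, not the authors' code; in particular, the paper fuses features at each CNN resolution, whereas this sketch fuses only the final per-frame embeddings for brevity.

```python
import torch
import torch.nn as nn

class SpatioTemporalFusion(nn.Module):
    """Hypothetical sketch: fuse per-view CNN features across time steps
    and camera perspectives with multi-head self-attention, then aggregate
    temporally with a recurrent layer. Not the paper's implementation."""

    def __init__(self, feat_dim=256, num_heads=8, num_classes=2):
        super().__init__()
        # Shared CNN backbone producing one embedding per frame and view.
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, feat_dim),
        )
        # Multi-head self-attention over the (time x view) token sequence,
        # letting in-cabin and forward features attend to each other.
        self.attn = nn.MultiheadAttention(feat_dim, num_heads, batch_first=True)
        # Recurrent layer capturing the temporal order of fused features.
        self.rnn = nn.GRU(feat_dim, feat_dim, batch_first=True)
        self.head = nn.Linear(feat_dim, num_classes)

    def forward(self, clips):
        # clips: (batch, views, time, 3, H, W)
        b, v, t, c, h, w = clips.shape
        feats = self.backbone(clips.reshape(b * v * t, c, h, w))
        tokens = feats.reshape(b, v * t, -1)          # one token per view/time
        fused, _ = self.attn(tokens, tokens, tokens)  # cross-view/time fusion
        # Average the views at each time step before the temporal RNN.
        fused = fused.reshape(b, v, t, -1).mean(dim=1)
        _, hidden = self.rnn(fused)
        return self.head(hidden[-1])                  # intention logits

model = SpatioTemporalFusion()
logits = model(torch.randn(2, 2, 8, 3, 64, 64))  # 2 views, 8 time steps each
```

Treating every (view, time step) feature as a token lets a single self-attention layer model both cross-view and cross-time dependencies, which is the core idea behind fusing in-cabin and forward perspectives before the temporal recurrence.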
