EPOSH Dataset

The Egocentric Perspective and Overhead-View Segmentation from Honda (EPOSH) dataset consists of 560 egocentric road-scene video clips collected in the San Francisco Bay Area with a front-facing GoPro Hero 7 camera mounted on a vehicle.

Introduction

Most video clips are 10 to 30 seconds long and are recorded around construction zones. For each video, about 10 frames are manually selected and annotated, for a total of 5,630 annotated perspective images.

Using COLMAP, we reconstruct a dense 3D point cloud from each video clip. We then manually annotate the 3D points with semantic labels. In total, about 70,000 bird's-eye-view (BEV) image / ground-truth pairs are constructed.
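The exact procedure used to turn the labeled point cloud into BEV ground truth is not described here. As a rough illustration only, the sketch below rasterizes labeled 3D points onto a top-down grid. Everything in it is an assumption made for the example: the function name `rasterize_bev`, the axis convention (x lateral, y forward, z up), the metric ranges, the cell size, and the rule that the highest point in a cell determines its label.

```python
from typing import List, Sequence, Tuple

# Hypothetical point format: (x, y, z, label_id), with x lateral,
# y forward, and z up. This convention is assumed, not taken from EPOSH.
Point = Tuple[float, float, float, int]


def rasterize_bev(
    points: Sequence[Point],
    x_range: Tuple[float, float] = (-10.0, 10.0),
    y_range: Tuple[float, float] = (0.0, 20.0),
    cell_size: float = 0.5,
    unlabeled: int = -1,
) -> List[List[int]]:
    """Project labeled 3D points onto a top-down grid of label ids."""
    width = int(round((x_range[1] - x_range[0]) / cell_size))
    height = int(round((y_range[1] - y_range[0]) / cell_size))
    grid = [[unlabeled] * width for _ in range(height)]
    # Height of the point currently owning each cell.
    top = [[float("-inf")] * width for _ in range(height)]

    for x, y, z, label in points:
        col = int((x - x_range[0]) // cell_size)
        row = int((y - y_range[0]) // cell_size)
        # Skip points that fall outside the chosen BEV window.
        if 0 <= row < height and 0 <= col < width:
            # Assumed tie-breaking rule: the highest point in a cell wins.
            if z > top[row][col]:
                top[row][col] = z
                grid[row][col] = label
    return grid


# Toy input: two points share a cell, one is far outside the window.
points = [
    (0.2, 0.3, 0.0, 1),
    (0.4, 0.1, 0.5, 2),
    (-9.9, 19.9, 0.0, 3),
    (50.0, 5.0, 0.0, 4),
]
grid = rasterize_bev(points)
labeled_cells = sum(1 for row in grid for v in row if v != -1)
```

With the default window and cell size the grid is 40 by 40 cells. A real pipeline would also need to align the reconstruction with the ground plane and fix its metric scale, neither of which this sketch attempts.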

The figure below shows the distribution of classes and their corresponding attributes in the perspective EPOSH dataset. The left subplot shows the classes, and the right subplot shows the corresponding attribute and affordance classes.

 

The figure below shows a sample annotation from the EPOSH dataset.


Availability of code

The EPOSH dataset and the code used for this work will be available soon. If you would like to be notified upon release, please leave your email here.