Human Action Understanding in Long Videos - Honda Research Institute USA

Intern Positions

Currently, HRI-US is offering research and engineering internships to highly qualified and motivated students. Research interns will work closely with HRI scientists and are encouraged to publish results in academic forums. For research internship positions, we are looking for candidates with a strong publication record and excellent programming skills to join our team. The positions currently listed are for Spring/Summer 2024.

Human Action Understanding in Long Videos

Job Number: P23INT-01
Location: San Jose, CA
Duration: 3 Months
Position Introduction

The project focuses on research and development of computer vision and machine learning algorithms for understanding human actions and activities in instructional videos, with particular emphasis on weakly supervised online action segmentation and on detection of anomalies and errors during task execution.

Key Responsibilities

During the internship, you are expected to:

• Develop computer vision algorithms for action understanding and anomaly detection using weakly supervised methods

• Support development of a benchmark dataset for evaluation of results

• Develop and evaluate metrics to verify the reliability of the proposed algorithms

• Contribute to a portfolio of patents, academic publications, and prototypes to demonstrate research value

Minimum Qualifications

  • Ph.D. or highly qualified M.S. candidate in computer science, electrical engineering, or a related field
  • Strong research experience in computer vision and machine learning
  • Hands-on experience in video and action understanding
  • Experience in addressing problems at the intersection of language and vision, particularly the use of large language models
  • Experience with open-source deep learning frameworks such as TensorFlow or PyTorch
Bonus Qualifications

  • Experience with zero-shot learning
  • Experience with anomaly and out-of-distribution detection
  • Hands-on experience in long-range video understanding of instructional videos, e.g., action segmentation, action detection, or action anticipation
  • Publications in top-tier conferences (CVPR, ICCV, ECCV, ICML, NeurIPS, ICLR, etc.)

Years of Work Experience Required: 1 year
Position Keywords

Human action understanding, zero-shot learning, large language models (LLMs), vision-language models (VLMs), video understanding

Alternate Way to Apply

Send an e-mail to careers@honda-ri.com with the following:
- Subject line including the job number(s) you are applying for 
- Recent CV 
- A cover letter highlighting relevant background (Optional)

Please do not contact our office to inquire about your application status.