Research Intern: Semantics Detection for Robotics Perception - Honda Research Institute USA

Job Number: P25INT-20
Honda Research Institute USA (HRI-US) is seeking a self-motivated PhD student to build semantic feature detection for objects and environments—attributes, affordances, relations, and 6-DoF poses—that a vision-language model (VLM) will use to reason about tasks and synthesize robot motions and motion constraints closely tied to object operations. The successful candidate will conduct research on perception, representation, and robotics to turn raw sensor data into a scene description that safely guides planning and control of the robot.
San Jose, CA

 

Key Responsibilities

 

  • Design and develop software modules for semantic perception and constraint reasoning.
  • Deploy and demonstrate the task on robot hardware by integrating accelerated inference, ROS 1/2, and planning stacks.
  • Document findings and contribute to internal research reports.
  • Contribute to the portfolio of patents, and publish research results, when applicable, at top-tier conferences and journals in robotics and machine learning.

 

Minimum Qualifications

 

  • Currently pursuing a Ph.D. in machine learning, robotics, computer science or related fields.
  • Strong Python and deep learning fundamentals, solid computer-vision foundations.
  • Hands-on experience training a vision model (detection/segmentation/pose/…) on a custom dataset.
  • Basic 3D perception (camera models, calibration, transforms) and experience with RGB-D and point clouds.
  • Proficiency in Linux, Git, PyTorch, OpenCV, and reproducible experimentation.

 

Bonus Qualifications

  • 6-DoF pose & tracking: Experience with FoundationPose/DOPE/CenterPose/GenPose++ or similar; multi-object tracking (e.g., ByteTrack/BoT-SORT) and temporal filtering (EKF).
  • Affordances & parts detection: Familiarity or hands-on experience with methods for detecting graspable regions, handles, openings, containers, and tools; articulated objects (hinge/prismatic joints).
  • Open-vocabulary vision: Experience with CLIP/OWL-ViT/Grounding DINO; segmentation with SAM/Mask2Former; tuning with few-shot/adaptive labeling.
  • VLM/robotics interface: Familiarity with VLM prompting/tool-use and turning semantics into motion goals/constraints; basics of motion planning/control (MoveIt, constraint specs, diffusion/learned policies).

 

Years of Work Experience Required  0
Desired Start Date 5/11/2026
Internship Duration 3 Months
Position Keywords Semantic perception, affordance detection, Vision Language Model (VLM), teleoperation/robotic autonomy

Alternate Way to Apply

Send an e-mail to careers@honda-ri.com with the following:
- Subject line including the job number(s) you are applying for 
- Recent CV 
- A cover letter highlighting relevant background (Optional)

Please do not contact our office to inquire about your application status.