Research Intern: Tactile-Aware Multi-Modal Foundation Models for Dexterous Manipulation - Honda Research Institute USA


Job Number: P25INT-18
Honda Research Institute USA (HRI-US) is seeking a passionate individual to contribute to the development of tactile-aware multi-modal foundation models for robotic dexterous manipulation. The project investigates how large models can integrate tactile, vision, force, audio, and language modalities to build representations of robot-object interactions, and how those representations can be incorporated into action policies for multi-fingered dexterous manipulation. The successful candidate will explore novel approaches for sensor encoding, cross-modal alignment, and context-driven modality selection, and evaluate these methods in both simulated and real-robot environments.
San Jose, CA

 

Key Responsibilities

 

  • Collaborate with researchers to develop and evaluate novel architectures, and assist in multi-modal dataset generation.
  • Design and implement sensor encoders and cross-modal alignment strategies for heterogeneous data streams.
  • Develop action policies or control heads that directly leverage learned representations to perform dexterous manipulation using multi-fingered hands.
  • Conduct experiments in simulation and real-robot platforms, analyzing model performance and generalization.
  • Document findings and contribute to internal research reports.
  • Contribute to the portfolio of patents and, when applicable, publish research results at top-tier conferences and journals in robotics and machine learning.

 

Minimum Qualifications

 

  • Ph.D. or highly qualified M.S. candidate in computer science, robotics or a related field.
  • Strong background in machine learning and deep learning.
  • Proficiency in Python and at least one ML framework (PyTorch or TensorFlow).
  • Research experience in machine learning, robotics, or computer vision.
  • Familiarity with robotics software frameworks such as ROS.

 

Bonus Qualifications

  • Experience with multi-modal learning or representation learning (e.g., integrating visual, tactile, or language data).
  • Experience with tactile or force/torque sensing and dexterous manipulation.
  • Experience with optical tactile sensors such as DIGIT.
  • Experience with large models (e.g., VLMs, multi-modal transformers, or foundation models) for perception or control.
  • Familiarity with simulation platforms such as Isaac Sim, Isaac Lab, or MuJoCo.
  • Experience with robot learning, multi-modal perception, and large model adaptation.

 

Years of Work Experience Required  0
Desired Start Date 5/11/2026
Internship Duration 3 Months
Position Keywords Robotics, perception, representation learning, vision, tactile, visuotactile, vision-language models, VLM, VLA, VTLA

Alternate Way to Apply

Send an e-mail to careers@honda-ri.com with the following:
- Subject line including the job number(s) you are applying for 
- Recent CV 
- A cover letter highlighting relevant background (Optional)

Please do not contact our office to inquire about your application status.