Multimodal Contact Perception and Modeling

Our multimodal perception system learns to reliably estimate 6D object poses under occlusion during contact-rich interactions by tightly integrating vision and touch.

Determining an object’s precise position and orientation once it is already inside a robotic hand is one of the most difficult challenges in in-hand manipulation. Cameras can’t “see” through fingers. IRR is addressing this with a multimodal perception system that fuses vision, depth, and tactile sensing.

Using a deep neural network trained on combined sensory data, the hand can infer the object’s full 6D pose throughout manipulation, even for unseen grasps and object orientations. This enables reliable in-hand manipulation, fine object reorientation, and transitions between grasp types without requiring a clear visual line of sight.
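
For readers who want a concrete picture, below is a minimal sketch in Python (using PyTorch) of one way a vision-depth-tactile fusion network for 6D pose regression could be structured. The module names, feature sizes, input shapes, and quaternion output used here are illustrative assumptions, not the architecture described in the IROS 2024 publication.

# Hedged sketch: a minimal multimodal fusion network for 6D in-hand pose
# estimation. All module names, feature sizes, and sensor shapes here are
# illustrative assumptions, not the published architecture.
import torch
import torch.nn as nn

class PoseFusionNet(nn.Module):
    def __init__(self, tactile_dim=24, feat_dim=128):
        super().__init__()
        # Small CNN encoders for RGB and depth crops (assumed 64x64 inputs).
        self.rgb_enc = nn.Sequential(
            nn.Conv2d(3, 16, 5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, 5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, feat_dim))
        self.depth_enc = nn.Sequential(
            nn.Conv2d(1, 16, 5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, 5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, feat_dim))
        # MLP encoder for a flat vector of fingertip tactile readings.
        self.tactile_enc = nn.Sequential(
            nn.Linear(tactile_dim, feat_dim), nn.ReLU(),
            nn.Linear(feat_dim, feat_dim))
        # Fusion head regresses translation (3) plus a unit quaternion (4).
        self.head = nn.Sequential(
            nn.Linear(3 * feat_dim, 256), nn.ReLU(),
            nn.Linear(256, 7))

    def forward(self, rgb, depth, tactile):
        # Concatenate per-modality features, then regress the pose.
        f = torch.cat([self.rgb_enc(rgb),
                       self.depth_enc(depth),
                       self.tactile_enc(tactile)], dim=-1)
        out = self.head(f)
        t, q = out[..., :3], out[..., 3:]
        q = q / q.norm(dim=-1, keepdim=True)  # normalize to a valid rotation
        return t, q

# Usage with dummy batches of the assumed shapes:
net = PoseFusionNet()
t, q = net(torch.randn(2, 3, 64, 64),   # RGB crop
           torch.randn(2, 1, 64, 64),   # depth crop
           torch.randn(2, 24))          # tactile vector

A model along these lines would be trained on paired camera, depth, and fingertip-sensor data with ground-truth poses; normalizing the quaternion keeps the rotation output valid regardless of what the head produces.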

By closing this perception gap, we unlock more human-like dexterity and more capable remote operation workflows.

Learn more in the associated research publication (IROS 2024).
