Long-horizon contact-rich manipulation has long been a challenging problem, as it requires reasoning over both discrete contact modes and continuous object motion. We introduce Implicit Contact Diffuser (ICD), a diffusion-based model that generates a sequence of neural descriptors specifying a series of contact relationships between the object and the environment. This sequence then guides a model predictive control (MPC) method to accomplish a given task. The key advantage of this approach is that the latent descriptors provide more task-relevant guidance to MPC, helping it avoid local minima in contact-rich manipulation tasks. Our experiments demonstrate that ICD outperforms baselines on complex, long-horizon, contact-rich manipulation tasks, such as cable routing and notebook folding. They also indicate that ICD can generalize a target contact relationship to a different environment.
When planning with point cloud-based subgoals, MPC gets stuck in local minima, and the cable ends up on the wrong side of the fixtures. NDF point clouds better capture the contact relationships.
We transform the raw object and scene point clouds into a latent space using a contact-aware neural descriptor field (NDF) model. The resulting NDF point clouds capture the contact relationships between the object and the environment.
Then, conditioned on the current state, the goal state, and the scene, we train a latent point cloud diffusion model to generate a future contact sequence represented by NDF point clouds. We use sampling-based MPC to track the contact sequence and reach the goal specification.
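The tracking step can be illustrated with a minimal random-shooting sketch. It is not the paper's controller: the dynamics are a toy rigid translation of a 2D point set, `describe` is a hypothetical identity placeholder standing in for a learned descriptor encoder, the Chamfer distance is an assumed tracking cost, and the subgoals are hand-written rather than generated by a diffusion model. For simplicity, the planner executes the whole best action sequence before replanning instead of only its first action.

```python
import numpy as np

def describe(points):
    """Hypothetical stand-in for a descriptor encoder (assumption: identity map)."""
    return points

def chamfer(a, b):
    """Symmetric Chamfer distance between point sets a (N, D) and b (M, D)."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)  # pairwise (N, M)
    return d.min(axis=1).mean() + d.min(axis=0).mean()

def rollout(points, actions):
    """Toy dynamics (assumption): each action rigidly translates the object."""
    return points + actions.sum(axis=0)

def plan(points, subgoal, rng, horizon=3, num_samples=256, sigma=0.1):
    """Sample action sequences and return the one with the lowest terminal cost."""
    candidates = rng.normal(0.0, sigma, size=(num_samples, horizon, points.shape[1]))
    candidates[0] = 0.0  # keep "do nothing" so the chosen plan never raises the cost
    costs = [chamfer(describe(rollout(points, a)), subgoal) for a in candidates]
    return candidates[int(np.argmin(costs))]

def track(points, subgoals, replans_per_subgoal=10, seed=0):
    """Follow the subgoals in order, replanning after each executed sequence."""
    rng = np.random.default_rng(seed)
    for subgoal in subgoals:
        for _ in range(replans_per_subgoal):
            points = rollout(points, plan(points, subgoal, rng))
    return points

# A straight "cable" of 20 points and three hand-written subgoals (illustrative only).
cable = np.stack([np.linspace(0.0, 1.0, 20), np.zeros(20)], axis=1)
offsets = [(0.0, 0.4), (0.4, 0.8), (0.8, 0.8)]
subgoals = [describe(cable + np.array(o)) for o in offsets]
final = track(cable, subgoals)
print(chamfer(describe(cable), subgoals[-1]), chamfer(describe(final), subgoals[-1]))
```

Swapping `describe` for a contact-aware encoder would change only the space in which the cost is measured; the sampling loop itself would stay the same.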