Diffusion-PbD: Generalizable Robot Programming by Demonstration with Diffusion Features | IEEE Conference Publication | IEEE Xplore

Abstract:

Programming by Demonstration (PbD) is an intuitive technique for programming robot manipulation skills by demonstrating the desired behavior. However, most existing approaches either require extensive demonstrations or fail to generalize beyond their initial demonstration conditions. We introduce Diffusion-PbD, a novel approach to PbD that enables users to synthesize generalizable robot manipulation skills from a single demonstration by utilizing the representations captured by pre-trained visual foundation models. At demonstration time, hand and object detection priors are used to extract waypoints from the human demonstration, anchored to reference points in the scene. At execution time, features from pre-trained diffusion models are leveraged to identify the corresponding reference points in new observations. We validate this approach through a series of real-world robot experiments, showing that Diffusion-PbD is applicable to a wide range of manipulation tasks and generalizes well to unseen objects, camera viewpoints, and scenes. Code and supplementary videos can be found at https://diffusion-pbd.github.io
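
The two stages described in the abstract can be illustrated with a toy sketch. It is not the authors' implementation. It assumes that a reference point is matched in a new observation by cosine-similarity nearest-neighbor search over per-pixel feature vectors, and that waypoints are stored as 2D pixel offsets from the reference point; neither detail is specified in the abstract. The function names (`find_corresponding_point`, `reanchor_waypoints`) are hypothetical, and small hand-written vectors stand in for features from a pre-trained diffusion model.

```python
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]
Pixel = Tuple[int, int]  # (row, col)


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def find_corresponding_point(ref_feature: Vector,
                             feature_map: List[List[Vector]]) -> Pixel:
    """Return the pixel whose feature is most similar to the reference feature.

    Assumption: nearest neighbor under cosine similarity. The feature map is a
    stand-in for features extracted from the new observation.
    """
    best_pixel, best_score = (0, 0), -math.inf
    for r, row in enumerate(feature_map):
        for c, feature in enumerate(row):
            score = cosine_similarity(ref_feature, feature)
            if score > best_score:
                best_pixel, best_score = (r, c), score
    return best_pixel


def reanchor_waypoints(offsets: List[Pixel], new_ref: Pixel) -> List[Pixel]:
    """Re-express waypoints, stored as offsets from the demonstration's
    reference point, relative to the reference point found in the new scene."""
    return [(new_ref[0] + dr, new_ref[1] + dc) for dr, dc in offsets]


if __name__ == "__main__":
    # Toy data: the feature at the demonstrated reference point, and a 2x3
    # feature map for a new observation (hand-written, illustrative only).
    ref_feature = [1.0, 0.0, 0.2]
    new_feature_map = [
        [[0.0, 1.0, 0.0], [0.1, 0.9, 0.0], [0.0, 0.0, 1.0]],
        [[0.5, 0.5, 0.0], [0.9, 0.1, 0.2], [0.0, 1.0, 1.0]],
    ]
    new_ref = find_corresponding_point(ref_feature, new_feature_map)
    print(new_ref, reanchor_waypoints([(-1, 0), (0, 1)], new_ref))
```

A real system would work with robot poses in 3D rather than pixel offsets, so the re-anchoring step here only conveys the idea of expressing waypoints relative to a reference point that is re-located in each new scene.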
Date of Conference: 14-18 October 2024
Date Added to IEEE Xplore: 25 December 2024
Conference Location: Abu Dhabi, United Arab Emirates

I. Introduction

General-purpose robots promise to automate tasks in many human-centric environments such as homes and workplaces. However, programming robots to robustly perform behaviors with every possible object in every possible environment is extremely challenging. Programming by Demonstration (PbD) is a popular approach that enables end-users to program new robot capabilities by simply demonstrating the desired behavior [1]. For robots deployed in human-centric environments, demonstration provides an intuitive way for end-users to teach robots new skills without technical training or expertise in robotics. Yet this approach typically requires a large and diverse set of demonstrations for the programmed capabilities to generalize to new environments and objects, which is not feasible for an end-user to provide. Ideally, an end-user could program robot capabilities by providing just a single demonstration of the desired behavior, and those capabilities would generalize to new scenarios. For example, after a demonstration of how to put a mug into a coffee machine, the robot should be able to repeat this task with other mugs even if they are visually distinct. Additionally, if the coffee machine and mugs are rearranged or moved to an entirely different location, the robot should still be able to perform the demonstrated task.
