I. Introduction
Consider a robot deployed in a previously unseen kitchen, such as the one shown in Fig. 1. Since this is an unfamiliar environment, the robot may not know exactly how to operate all of the kitchen appliances: opening cabinets, running the microwave, or turning on the burners, especially if these steps must be performed in a specific sequence, resulting in a long-horizon task. Like many intelligent agents, the robot needs some practice, making reinforcement learning (RL) a promising paradigm. However, as we have learned from prior work, practicing long-horizon tasks with RL is difficult due to challenges such as exploration [1], environment resets [2], and credit assignment [3]. Building robotic learning systems (especially with RL) is never as simple as placing a robot in the environment and having it continually improve. Instead, human intervention is often needed during training, especially to provide resets to the learning agent. Ideally, we would provide the robot with a small number of examples to help with these challenges, and do so at the beginning of the practicing process, so that it can then continually practice a wide range of tasks in the kitchen on its own, with minimal human intervention.