I. Introduction
Robots are increasingly part of our daily lives, making it crucial that they can learn from users, especially non-expert users. Learning from Demonstration (LfD) enables robots to acquire new skills by observing demonstrations of expert behavior [1], [2], while Learning from Human Feedback (LfHF) allows robots to adapt to human preferences or correct undesired behaviors by learning or shaping a policy [3], [4], [5]. More recent work has shown that combining human feedback with demonstrations can make learning even more effective, both by reducing the amount of feedback required [6] and by relaxing the requirement that demonstrations be near-optimal [7]. However, despite strong interest in learning fully or partially from humans, there is relatively little research on which forms of human feedback are most effective, especially in combination with human demonstrations.