Loading [MathJax]/extensions/MathZoom.js
Neural Task Planning With AND–OR Graph Representations | IEEE Journals & Magazine | IEEE Xplore

Neural Task Planning With AND–OR Graph Representations


Abstract:

This paper focuses on semantic task planning, that is, predicting a sequence of actions toward accomplishing a specific task under a certain scene, which is a new problem...Show More

Abstract:

This paper focuses on semantic task planning, that is, predicting a sequence of actions toward accomplishing a specific task under a certain scene, which is a new problem in computer vision research. The primary challenges are how to model the task-specific knowledge and how to integrate this knowledge into the learning procedure. In this paper, we propose training a recurrent long short-term memory (LSTM) network to address this problem, that is, taking a scene image (including prelocated objects) and the specified task as input and recurrently predicting action sequences. However, training such a network generally requires large numbers of annotated samples to cover the semantic space (e.g., diverse action decomposition and ordering). To overcome this issue, we introduce a knowledge and-or graph (AOG) for task description, which hierarchically represents a task as atomic actions. With this AOG representation, we can produce many valid samples (i.e., action sequences according to common sense) by training another auxiliary LSTM network with a small set of annotated samples. Furthermore, these generated samples (i.e., task-oriented action sequences) effectively facilitate training of the model for semantic task planning. In our experiments, we create a new dataset that contains diverse daily tasks and extensively evaluates the effectiveness of our approach.
Published in: IEEE Transactions on Multimedia ( Volume: 21, Issue: 4, April 2019)
Page(s): 1022 - 1034
Date of Publication: 16 September 2018

ISSN Information:

Funding Agency:


I. Introduction

Automatically predicting and executing a sequence of actions given a specific task is an ability that is quite expected for intelligent robots [1], [2]. For example, to complete the task of “make tea” under the scene shown in Fig. 1, an agent needs to plan and successively execute a number of steps, e.g., “move to the tea box” and “grasp the tea box”. In this paper, we aim to train a neural network model to enable this capability, which has rarely been addressed in computer vision research.

Contact IEEE to Subscribe

References

References is not available for this document.