Understanding tools: Task-oriented object modeling, learning and recognition


Abstract:

In this paper, we present a new framework, task-oriented modeling, learning and recognition, which aims at understanding the underlying functions, physics and causality in using objects as “tools”. Given a task, such as cracking a nut or painting a wall, we represent each object, e.g. a hammer or brush, in a generative spatio-temporal representation consisting of four components: i) an affordance basis to be grasped by hand; ii) a functional basis to act on a target object (the nut); iii) the imagined actions with typical motion trajectories; and iv) the underlying physical concepts, e.g. force, pressure, etc. In a learning phase, our algorithm observes only one RGB-D video, in which a rational human picks up one object (i.e. tool) among a number of candidates to accomplish the task. From this example, our algorithm learns the essential physical concepts in the task (e.g. forces in cracking nuts). In an inference phase, our algorithm is given a new set of objects (daily objects or stones), and picks the best choice available together with the inferred affordance basis, functional basis, imagined human actions (sequence of poses), and the expected physical quantity that it will produce. From this new perspective, any object can be viewed as a hammer or a shovel, and object recognition is not merely memorizing typical appearance examples for each category but reasoning about the physical mechanisms in various tasks to achieve generalization.
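To make the four-component representation concrete, the sketch below expresses it as a simple data structure. This is an illustrative reconstruction from the abstract only; the field names, types, and units are hypothetical assumptions, not the authors' implementation.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical sketch of the task-oriented tool representation described in
# the abstract. Field names and types are illustrative assumptions, not the
# authors' code.

Pose = Dict[str, float]  # e.g. joint angles or a 6-DoF hand pose at one time step


@dataclass
class ToolRepresentation:
    affordance_basis: List[int]            # surface points/parts graspable by the hand
    functional_basis: List[int]            # parts that act on the target object (e.g. the nut)
    imagined_action: List[Pose]            # typical motion trajectory as a pose sequence
    physical_quantities: Dict[str, float]  # e.g. {"force": 230.0, "pressure": 1.2e6}
```

Under this reading, the learning phase would retain the physical quantities observed in the single demonstration as the essential physical concept for the task, while the other three components are imagined anew for each candidate object at inference time.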
Date of Conference: 07-12 June 2015
Date Added to IEEE Xplore: 15 October 2015
Conference Location: Boston, MA, USA

1. Introduction

In this paper, we rethink object recognition from the perspective of an agent: how objects are used as “tools” in actions to accomplish a “task”. Here a task is defined as changing the physical states of a target object by actions, such as cracking a nut or painting a wall. A tool is a physical object used in the human action to achieve the task, such as a hammer or brush; it can be any daily object and is not restricted to conventional hardware tools. This leads us to a new framework, task-oriented modeling, learning and recognition, which aims at understanding the underlying functions, physics and causality in using objects as tools in various task categories.

Task-oriented object recognition. (a) In a learning phase, a rational human is observed picking a hammer among other tools to crack a nut. (b) In an inference phase, the algorithm is asked to pick the best object (i.e. the wooden leg) on the table for the same task. This generalization entails physical reasoning.
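As a rough illustration of the inference phase sketched in the caption above, the snippet below scores each candidate object by how closely the physical quantities it is expected to produce match those learned from the single demonstration, then picks the best match. The candidate list, quantity names, and numbers are made up for illustration; the paper's actual formulation reasons over imagined actions and a spatio-temporal representation, not this simple comparison.

```python
# Minimal sketch of task-oriented object selection, assuming the learned
# "essential physical concept" is summarized as target physical quantities.
# All names and numbers below are hypothetical.

def pick_best_tool(candidates, learned_target):
    """Return the candidate whose expected physical output best matches the
    quantities learned from the demonstration (e.g. force needed to crack a nut)."""
    def mismatch(expected):
        return sum(abs(expected.get(k, 0.0) - v) for k, v in learned_target.items())
    return min(candidates, key=lambda c: mismatch(c["expected_physics"]))


if __name__ == "__main__":
    candidates = [
        {"name": "wooden leg", "expected_physics": {"force": 210.0}},
        {"name": "plastic cup", "expected_physics": {"force": 15.0}},
        {"name": "stone", "expected_physics": {"force": 180.0}},
    ]
    learned_target = {"force": 230.0}  # learned from watching the hammer demonstration
    print(pick_best_tool(candidates, learned_target)["name"])  # -> wooden leg
```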


Cites in Papers - IEEE (53)

1.
Zhenyu Lu, Ning Wang, Chenguang Yang, "A Dynamic Movement Primitives-Based Tool Use Skill Learning and Transfer Framework for Robot Manipulation", IEEE Transactions on Automation Science and Engineering, vol.22, pp.1748-1763, 2025.
2.
Carl Qi, Yilin Wu, Lifan Yu, Haoyue Liu, Bowen Jiang, Xingyu Lin, David Held, "Learning Generalizable Tool-use Skills through Trajectory Generation", 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp.2847-2854, 2024.
3.
Ram Ramrakhya, Aniruddha Kembhavi, Dhruv Batra, Zsolt Kira, Kuo-Hao Zeng, Luca Weihs, "Seeing the Unseen: Visual Common Sense for Semantic Placement", 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp.16273-16283, 2024.
4.
Asim Unmesh, Rahul Jain, Jingyu Shi, V. K. Chaithanya Manam, Hyung-Gun Chi, Subramanian Chidambaram, Alexander Quinn, Karthik Ramani, "Interacting Objects: A Dataset of Object-Object Interactions for Richer Dynamic Scene Representations", IEEE Robotics and Automation Letters, vol.9, no.1, pp.451-458, 2024.
5.
Zhongli Wang, Guohui Tian, "Task-Oriented Robot Cognitive Manipulation Planning Using Affordance Segmentation and Logic Reasoning", IEEE Transactions on Neural Networks and Learning Systems, vol.35, no.9, pp.12172-12185, 2024.
6.
Zeyu Zhang, Muzhi Han, Baoxiong Jia, Ziyuan Jiao, Yixin Zhu, Song-Chun Zhu, Hangxin Liu, "Learning a Causal Transition Model for Object Cutting", 2023 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp.1996-2003, 2023.
7.
Syed Afaq Ali Shah, Zeyad Khalifa, "Hierarchical Transformer for Visual Affordance Understanding using a Large-scale Dataset", 2023 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp.11371-11376, 2023.
8.
Xin Meng, Hongtao Wu, Sipu Ruan, Gregory S. Chirikjian, "Prepare the Chair for the Bear! Robot Imagination of Sitting Affordance to Reorient Previously Unseen Chairs", IEEE Robotics and Automation Letters, vol.8, no.10, pp.6515-6522, 2023.
9.
Dongpan Chen, Dehui Kong, Jinghua Li, Shaofan Wang, Baocai Yin, "A Survey of Visual Affordance Recognition Based on Deep Learning", IEEE Transactions on Big Data, vol.9, no.6, pp.1458-1476, 2023.
10.
Hongchen Luo, Wei Zhai, Jing Zhang, Yang Cao, Dacheng Tao, "Leverage Interactive Affinity for Affordance Learning", 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp.6809-6819, 2023.
11.
Tianqiang Zhu, Rina Wu, Jinglue Hang, Xiangbo Lin, Yi Sun, "Toward Human-Like Grasp: Functional Grasp by Dexterous Robotic Hand Via Object-Hand Semantic Representation", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.45, no.10, pp.12521-12534, 2023.
12.
Ziyuan Jiao, Yida Niu, Zeyu Zhang, Song-Chun Zhu, Yixin Zhu, Hangxin Liu, "Sequential Manipulation Planning on Scene Graph", 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp.8203-8210, 2022.
13.
Zeyu Zhang, Ziyuan Jiao, Weiqi Wang, Yixin Zhu, Song-Chun Zhu, Hangxin Liu, "Understanding Physical Effects for Effective Tool-Use", IEEE Robotics and Automation Letters, vol.7, no.4, pp.9469-9476, 2022.
14.
Jianjia Xin, Lichun Wang, Shaofan Wang, Yukun Liu, Chao Yang, Baocai Yin, "Recommending Fine-Grained Tool Consistent With Common Sense Knowledge for Robot", IEEE Robotics and Automation Letters, vol.7, no.4, pp.8574-8581, 2022.
15.
Zhenyu Lu, Ning Wang, Miao Li, Chenguang Yang, "A Novel Dynamic Movement Primitives-based Skill Learning and Transfer Framework for Multi-Tool Use", 2022 IEEE 17th International Conference on Control & Automation (ICCA), pp.1-8, 2022.
16.
Yining Hong, Kaichun Mo, Li Yi, Leonidas J. Guibas, Antonio Torralba, Joshua B. Tenenbaum, Chuang Gan, "Fixing Malfunctional Objects With Learned Physical Simulation and Functional Prediction", 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp.1403-1413, 2022.
17.
Hongtao Wu, Xin Meng, Sipu Ruan, Gregory S. Chirikjian, "Put the Bear on the Chair! Intelligent Robot Interaction with Previously Unseen Chairs via Robot Imagination", 2022 International Conference on Robotics and Automation (ICRA), pp.6276-6282, 2022.
18.
Xiaobai Sun, Takahiro Nozaki, Kouhei Ohnishi, Toshiyuki Murakami, "Object Detection in Motion Reproduction System with Segmentation Algorithm", 2022 IEEE 17th International Conference on Advanced Motion Control (AMC), pp.42-47, 2022.
19.
Zhengtao Hu, Weiwei Wan, Keisuke Koyama, Kensuke Harada, "A Mechanical Screwing Tool for Parallel Grippers—Design, Optimization, and Manipulation Policies", IEEE Transactions on Robotics, vol.38, no.2, pp.1139-1159, 2022.
20.
Shengheng Deng, Xun Xu, Chaozheng Wu, Ke Chen, Kui Jia, "3D AffordanceNet: A Benchmark for Visual Object Affordance Understanding", 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp.1778-1787, 2021.
21.
Mariem Mezghanni, Malika Boulkenafed, André Lieutier, Maks Ovsjanikov, "Physically-aware Generative Network for 3D Shape Modeling", 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp.9326-9337, 2021.
22.
Spyridon Thermos, Gerasimos Potamianos, Petros Daras, "Joint Object Affordance Reasoning and Segmentation in RGB-D Videos", IEEE Access, vol.9, pp.89699-89713, 2021.
23.
Kento Kawaharazuka, Kei Okada, Masayuki Inaba, "Adaptive Robotic Tool-Tip Control Learning Considering Online Changes in Grasping State", IEEE Robotics and Automation Letters, vol.6, no.3, pp.5992-5999, 2021.
24.
Muzhi Han, Zeyu Zhang, Ziyuan Jiao, Xu Xie, Yixin Zhu, Song-Chun Zhu, Hangxin Liu, "Reconstructing Interactive 3D Scenes by Panoptic Mapping and CAD Model Alignments", 2021 IEEE International Conference on Robotics and Automation (ICRA), pp.12199-12206, 2021.
25.
Zhenliang Zhang, Yixin Zhu, Song-Chun Zhu, "Graph-based Hierarchical Knowledge Representation for Robot Task Transfer from Virtual to Physical World", 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp.11139-11145, 2020.
26.
Kento Kawaharazuka, Toru Ogawa, Cota Nabeshima, "Tool Shape Optimization through Backpropagation of Neural Network", 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp.8387-8393, 2020.
27.
Zengyi Qin, Kuan Fang, Yuke Zhu, Li Fei-Fei, Silvio Savarese, "KETO: Learning Keypoint Representations for Tool Manipulation", 2020 IEEE International Conference on Robotics and Automation (ICRA), pp.7278-7285, 2020.
28.
Lin Shao, Toki Migimatsu, Jeannette Bohg, "Learning to Scaffold the Development of Robotic Manipulation Skills", 2020 IEEE International Conference on Robotics and Automation (ICRA), pp.5671-5677, 2020.
29.
Hongtao Wu, Deven Misra, Gregory S. Chirikjian, "Is That a Chair? Imagining Affordances Using Simulations of an Articulated Human Body", 2020 IEEE International Conference on Robotics and Automation (ICRA), pp.7240-7246, 2020.
30.
Pouria Chalangari, Thomas Fevens, Hassan Rivaz, "3D Human Knee Flexion Angle Estimation Using Deep Convolutional Neural Networks", 2020 42nd Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC), pp.5424-5427, 2020.

Cites in Papers - Other Publishers (31)

1.
Xinhang Song, Bohan Wang, Liye Dong, Gongwei Chen, Xinyun Hu, Shuqiang Jiang, "Object-to-Manipulation Graph for Affordance Navigation", CAAI Artificial Intelligence Research, pp.9150032, 2024.
2.
Hangxin Liu, Zeyu Zhang, Ziyuan Jiao, Zhenliang Zhang, Minchen Li, Chenfanfu Jiang, Yixin Zhu, Song-Chun Zhu, "A Reconfigurable Data Glove for Reconstructing Physical and Virtual Grasps", Engineering, 2023.
3.
Luyao Yuan, Song-Chun Zhu, "Communicative Learning: A Unified Learning Formalism", Engineering, 2023.
4.
Muzhi Han, Zeyu Zhang, Ziyuan Jiao, Xu Xie, Yixin Zhu, Song-Chun Zhu, Hangxin Liu, "Scene Reconstruction with Functional Objects for Robot Autonomy", International Journal of Computer Vision, vol.130, no.12, pp.2940, 2022.
5.
Wei Zhai, Hongchen Luo, Jing Zhang, Yang Cao, Dacheng Tao, "One-Shot Object Affordance Detection in the Wild", International Journal of Computer Vision, vol.130, no.10, pp.2472, 2022.
6.
Yinyu Nie, Angela Dai, Xiaoguang Han, Matthias Nießner, "Pose2Room: Understanding 3D Scenes from Human Activities", Computer Vision – ECCV 2022, vol.13687, pp.425, 2022.
7.
Mohammed Hassanin, Salman Khan, Murat Tahtali, "Visual Affordance and Function Understanding", ACM Computing Surveys, vol.54, no.3, pp.1, 2022.
8.
Aizreena Azaman, Husnir Nasyuha Abdul Halim, Muhammad Fariz Shafiq Abd. Aziz, Sagida M. A. Bilal, 11th Asian-Pacific Conference on Medical and Biological Engineering, vol.82, pp.135, 2021.
9.
Yongqi Zhang, Haikun Huang, Erion Plaku, Lap-Fai Yu, "Joint computational design of workspaces and workplans", ACM Transactions on Graphics, vol.40, no.6, pp.1, 2021.
10.
Hangxin Liu, Yixin Zhu, Song-Chun Zhu, "Patching interpretable And-Or-Graph knowledge representation using augmented reality", Applied AI Letters, vol.2, no.4, 2021.
11.
Ruizhen Hu, Manolis Savva, Oliver van Kaick, "Learning 3D functionality representations", SIGGRAPH Asia 2020 Courses, pp.1, 2020.
12.
Kim Wölfel, Dominik Henrich, "Affordance Based Disambiguation and Validation in Human-Robot Dialogue", Annals of Scientific Society for Assembly, Handling and Industrial Robotics, pp.307, 2020.
13.
Yixin Zhu, Tao Gao, Lifeng Fan, Siyuan Huang, Mark Edmonds, Hangxin Liu, Feng Gao, Chi Zhang, Siyuan Qi, Ying Nian Wu, Joshua B. Tenenbaum, Song-Chun Zhu, "Dark, Beyond Deep: A Paradigm Shift to Cognitive AI with Humanlike Common Sense", Engineering, vol.6, no.3, pp.310, 2020.
14.
Xiaomin Liu, Jun-Bao Li, Jeng-Shyang Pan, Shuo Wang, Xudong Lv, Shuanglong Cui, "Image-matching framework based on region partitioning for target image location", Telecommunication Systems, vol.74, no.3, pp.269, 2020.
15.
Spyridon Thermos, Georgios Th. Papadopoulos, Petros Daras, Gerasimos Potamianos, "Deep sensorimotor learning for RGB-D object recognition", Computer Vision and Image Understanding, vol.190, pp.102844, 2020.
16.
Kuan Fang, Yuke Zhu, Animesh Garg, Andrey Kurenkov, Viraj Mehta, Li Fei-Fei, Silvio Savarese, "Learning task-oriented grasping for tool manipulation from simulated self-supervision", The International Journal of Robotics Research, vol.39, no.2-3, pp.202, 2020.
17.
Mark Bugeja, Alexiei Dingli, Maria Attard, Dylan Seychell, "A Framework for Queryable Video Analysis", Proceedings of the 1st ACM Workshop on Emerging Smart Technologies and Infrastructures for Smart Mobility and Sustainability - SMAS '19, pp.21, 2019.
18.
R. Hu, M. Savva, O. van Kaick, "Functionality Representations and Applications for Shape Analysis", Computer Graphics Forum, vol.37, no.2, pp.603, 2018.
19.
Shuichi Akizuki, Masaki Iizuka, Kentaro Kozai, Manabu Hashimoto, "Integration Method of Local Evidence for Part-affordance Estimation of Everyday Objects", Journal of the Japan Society for Precision Engineering, vol.84, no.7, pp.658, 2018.
20.
Zhijian Liu, William T. Freeman, Joshua B. Tenenbaum, Jiajun Wu, "Physical Primitive Decomposition", Computer Vision – ECCV 2018, vol.11216, pp.3, 2018.
21.
Natsuki Yamanobe, Weiwei Wan, Ixchel G. Ramirez-Alpizar, Damien Petit, Tokuo Tsuji, Shuichi Akizuki, Manabu Hashimoto, Kazuyuki Nagata, Kensuke Harada, "A Brief Review of Affordance in Robotic Manipulation Research", Journal of the Robotics Society of Japan, vol.36, no.5, pp.327, 2018.
22.
Henry Muchiri, Ismail Ateya, Gregory Wanyembi, "Human Gait Indicators of Carrying a Concealed Firearm : A Skeletal Tracking and Data Mining Approach", International Journal of Scientific Research in Computer Science, Engineering and Information Technology, pp.368, 2018.
23.
Ruizhen Hu, Zihao Yan, Jingwen Zhang, Oliver Van Kaick, Ariel Shamir, Hao Zhang, Hui Huang, "Predictive and generative neural networks for object functionality", ACM Transactions on Graphics, vol.37, no.4, pp.1, 2018.
24.
Chenfanfu Jiang, Siyuan Qi, Yixin Zhu, Siyuan Huang, Jenny Lin, Lap-Fai Yu, Demetri Terzopoulos, Song-Chun Zhu, "Configurable 3D Scene Synthesis and 2D Image Rendering with Per-pixel Ground Truth Using Stochastic Grammars", International Journal of Computer Vision, vol.126, no.9, pp.920, 2018.
25.
Ruizhen Hu, Wenchao Li, Oliver Van Kaick, Ariel Shamir, Hao Zhang, Hui Huang, "Learning to predict part mobility from a single static snapshot", ACM Transactions on Graphics, vol.36, no.6, pp.1, 2017.
26.
Natsuki Yamanobe, Weiwei Wan, Ixchel G. Ramirez-Alpizar, Damien Petit, Tokuo Tsuji, Shuichi Akizuki, Manabu Hashimoto, Kazuyuki Nagata, Kensuke Harada, "A brief review of affordance in robotic manipulation research", Advanced Robotics, vol.31, no.19-20, pp.1086, 2017.
27.
Rogério Sales Gonçalves, Hermano Igo Krebs, "MIT-Skywalker: considerations on the Design of a Body Weight Support System", Journal of NeuroEngineering and Rehabilitation, vol.14, no.1, 2017.
28.
Philipp Zech, Simon Haller, Safoura Rezapour Lakani, Barry Ridge, Emre Ugur, Justus Piater, "Computational models of affordance in robotics: a taxonomy and systematic classification", Adaptive Behavior, vol.25, no.5, pp.235, 2017.
29.
R. Omar Chavez-Garcia, Mihai Andries, Pierre Luce-Vayrac, Raja Chatila, 2016 International Symposium on Experimental Robotics, vol.1, pp.679, 2017.
30.
Cornelia Fermuller, Fang Wang, Yezhou Yang, Konstantinos Zampogiannis, Yi Zhang, Francisco Barranco, Michael Pfeiffer, "Prediction of Manipulation Actions", International Journal of Computer Vision, 2017.