Understanding tools: Task-oriented object modeling, learning and recognition


Abstract:

In this paper, we present a new framework, task-oriented modeling, learning and recognition, which aims at understanding the underlying functions, physics and causality in using objects as “tools”. Given a task, such as cracking a nut or painting a wall, we represent each object, e.g. a hammer or brush, in a generative spatio-temporal representation consisting of four components: i) an affordance basis to be grasped by hand; ii) a functional basis to act on a target object (the nut); iii) the imagined actions with typical motion trajectories; and iv) the underlying physical concepts, e.g. force, pressure, etc. In a learning phase, our algorithm observes only one RGB-D video, in which a rational human picks up one object (i.e. tool) among a number of candidates to accomplish the task. From this example, our algorithm learns the essential physical concepts in the task (e.g. forces in cracking nuts). In an inference phase, our algorithm is given a new set of objects (daily objects or stones), and picks the best choice available together with the inferred affordance basis, functional basis, imagined human actions (sequence of poses), and the expected physical quantity that it will produce. From this new perspective, any object can be viewed as a hammer or a shovel, and object recognition is not merely memorizing typical appearance examples for each category but reasoning about the physical mechanisms in various tasks to achieve generalization.
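To make the four-component representation concrete, the sketch below expresses it as a simple data structure. This is an illustrative reconstruction from the abstract only; the field names, types, and units are hypothetical assumptions, not the authors' implementation.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical sketch of the task-oriented tool representation described in
# the abstract. Field names and types are illustrative assumptions, not the
# authors' code.

Pose = Dict[str, float]  # e.g. joint angles or a 6-DoF hand pose at one time step


@dataclass
class ToolRepresentation:
    affordance_basis: List[int]            # surface points/parts graspable by the hand
    functional_basis: List[int]            # parts that act on the target object (e.g. the nut)
    imagined_action: List[Pose]            # typical motion trajectory as a pose sequence
    physical_quantities: Dict[str, float]  # e.g. {"force": 230.0, "pressure": 1.2e6}
```

Under this reading, the learning phase would retain the physical quantities observed in the single demonstration as the essential physical concept for the task, while the other three components are imagined anew for each candidate object at inference time.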
Date of Conference: 07-12 June 2015
Date Added to IEEE Xplore: 15 October 2015
Conference Location: Boston, MA, USA

1. Introduction

In this paper, we rethink object recognition from the perspective of an agent: how objects are used as “tools” in actions to accomplish a “task”. Here a task is defined as changing the physical states of a target object by actions, such as cracking a nut or painting a wall. A tool is a physical object used in the human action to achieve the task, such as a hammer or brush; it can be any daily object and is not restricted to conventional hardware tools. This leads us to a new framework, task-oriented modeling, learning and recognition, which aims at understanding the underlying functions, physics and causality in using objects as tools in various task categories.

Task-oriented object recognition. (a) In a learning phase, a rational human is observed picking a hammer among other tools to crack a nut. (b) In an inference phase, the algorithm is asked to pick the best object (i.e. the wooden leg) on the table for the same task. This generalization entails physical reasoning.
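As a rough illustration of the inference phase sketched in the caption above, the snippet below scores each candidate object by how closely the physical quantities it is expected to produce match those learned from the single demonstration, then picks the best match. The candidate list, quantity names, and numbers are made up for illustration; the paper's actual formulation reasons over imagined actions and a spatio-temporal representation, not this simple comparison.

```python
# Minimal sketch of task-oriented object selection, assuming the learned
# "essential physical concept" is summarized as target physical quantities.
# All names and numbers below are hypothetical.

def pick_best_tool(candidates, learned_target):
    """Return the candidate whose expected physical output best matches the
    quantities learned from the demonstration (e.g. force needed to crack a nut)."""
    def mismatch(expected):
        return sum(abs(expected.get(k, 0.0) - v) for k, v in learned_target.items())
    return min(candidates, key=lambda c: mismatch(c["expected_physics"]))


if __name__ == "__main__":
    candidates = [
        {"name": "wooden leg", "expected_physics": {"force": 210.0}},
        {"name": "plastic cup", "expected_physics": {"force": 15.0}},
        {"name": "stone", "expected_physics": {"force": 180.0}},
    ]
    learned_target = {"force": 230.0}  # learned from watching the hammer demonstration
    print(pick_best_tool(candidates, learned_target)["name"])  # -> wooden leg
```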


Cites in Papers - IEEE (53)

1.
Zhenyu Lu, Ning Wang, Chenguang Yang, "A Dynamic Movement Primitives-Based Tool Use Skill Learning and Transfer Framework for Robot Manipulation", IEEE Transactions on Automation Science and Engineering, vol.22, pp.1748-1763, 2025.
2.
Carl Qi, Yilin Wu, Lifan Yu, Haoyue Liu, Bowen Jiang, Xingyu Lin, David Held, "Learning Generalizable Tool-use Skills through Trajectory Generation", 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp.2847-2854, 2024.
3.
Ram Ramrakhya, Aniruddha Kembhavi, Dhruv Batra, Zsolt Kira, Kuo-Hao Zeng, Luca Weihs, "Seeing the Unseen: Visual Common Sense for Semantic Placement", 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp.16273-16283, 2024.
4.
Asim Unmesh, Rahul Jain, Jingyu Shi, V. K. Chaithanya Manam, Hyung-Gun Chi, Subramanian Chidambaram, Alexander Quinn, Karthik Ramani, "Interacting Objects: A Dataset of Object-Object Interactions for Richer Dynamic Scene Representations", IEEE Robotics and Automation Letters, vol.9, no.1, pp.451-458, 2024.
5.
Zhongli Wang, Guohui Tian, "Task-Oriented Robot Cognitive Manipulation Planning Using Affordance Segmentation and Logic Reasoning", IEEE Transactions on Neural Networks and Learning Systems, vol.35, no.9, pp.12172-12185, 2024.
6.
Zeyu Zhang, Muzhi Han, Baoxiong Jia, Ziyuan Jiao, Yixin Zhu, Song-Chun Zhu, Hangxin Liu, "Learning a Causal Transition Model for Object Cutting", 2023 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp.1996-2003, 2023.
7.
Syed Afaq Ali Shah, Zeyad Khalifa, "Hierarchical Transformer for Visual Affordance Understanding using a Large-scale Dataset", 2023 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp.11371-11376, 2023.
8.
Xin Meng, Hongtao Wu, Sipu Ruan, Gregory S. Chirikjian, "Prepare the Chair for the Bear! Robot Imagination of Sitting Affordance to Reorient Previously Unseen Chairs", IEEE Robotics and Automation Letters, vol.8, no.10, pp.6515-6522, 2023.
9.
Dongpan Chen, Dehui Kong, Jinghua Li, Shaofan Wang, Baocai Yin, "A Survey of Visual Affordance Recognition Based on Deep Learning", IEEE Transactions on Big Data, vol.9, no.6, pp.1458-1476, 2023.
10.
Hongchen Luo, Wei Zhai, Jing Zhang, Yang Cao, Dacheng Tao, "Leverage Interactive Affinity for Affordance Learning", 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp.6809-6819, 2023.
11.
Tianqiang Zhu, Rina Wu, Jinglue Hang, Xiangbo Lin, Yi Sun, "Toward Human-Like Grasp: Functional Grasp by Dexterous Robotic Hand Via Object-Hand Semantic Representation", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.45, no.10, pp.12521-12534, 2023.
12.
Ziyuan Jiao, Yida Niu, Zeyu Zhang, Song-Chun Zhu, Yixin Zhu, Hangxin Liu, "Sequential Manipulation Planning on Scene Graph", 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp.8203-8210, 2022.
13.
Zeyu Zhang, Ziyuan Jiao, Weiqi Wang, Yixin Zhu, Song-Chun Zhu, Hangxin Liu, "Understanding Physical Effects for Effective Tool-Use", IEEE Robotics and Automation Letters, vol.7, no.4, pp.9469-9476, 2022.
14.
Jianjia Xin, Lichun Wang, Shaofan Wang, Yukun Liu, Chao Yang, Baocai Yin, "Recommending Fine-Grained Tool Consistent With Common Sense Knowledge for Robot", IEEE Robotics and Automation Letters, vol.7, no.4, pp.8574-8581, 2022.
15.
Zhenyu Lu, Ning Wang, Miao Li, Chenguang Yang, "A Novel Dynamic Movement Primitives-based Skill Learning and Transfer Framework for Multi-Tool Use", 2022 IEEE 17th International Conference on Control & Automation (ICCA), pp.1-8, 2022.
16.
Yining Hong, Kaichun Mo, Li Yi, Leonidas J. Guibas, Antonio Torralba, Joshua B. Tenenbaum, Chuang Gan, "Fixing Malfunctional Objects With Learned Physical Simulation and Functional Prediction", 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp.1403-1413, 2022.
17.
Hongtao Wu, Xin Meng, Sipu Ruan, Gregory S. Chirikjian, "Put the Bear on the Chair! Intelligent Robot Interaction with Previously Unseen Chairs via Robot Imagination", 2022 International Conference on Robotics and Automation (ICRA), pp.6276-6282, 2022.
18.
Xiaobai Sun, Takahiro Nozaki, Kouhei Ohnishi, Toshiyuki Murakami, "Object Detection in Motion Reproduction System with Segmentation Algorithm", 2022 IEEE 17th International Conference on Advanced Motion Control (AMC), pp.42-47, 2022.
19.
Zhengtao Hu, Weiwei Wan, Keisuke Koyama, Kensuke Harada, "A Mechanical Screwing Tool for Parallel Grippers—Design, Optimization, and Manipulation Policies", IEEE Transactions on Robotics, vol.38, no.2, pp.1139-1159, 2022.
20.
Shengheng Deng, Xun Xu, Chaozheng Wu, Ke Chen, Kui Jia, "3D AffordanceNet: A Benchmark for Visual Object Affordance Understanding", 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp.1778-1787, 2021.
21.
Mariem Mezghanni, Malika Boulkenafed, André Lieutier, Maks Ovsjanikov, "Physically-aware Generative Network for 3D Shape Modeling", 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp.9326-9337, 2021.
22.
Spyridon Thermos, Gerasimos Potamianos, Petros Daras, "Joint Object Affordance Reasoning and Segmentation in RGB-D Videos", IEEE Access, vol.9, pp.89699-89713, 2021.
23.
Kento Kawaharazuka, Kei Okada, Masayuki Inaba, "Adaptive Robotic Tool-Tip Control Learning Considering Online Changes in Grasping State", IEEE Robotics and Automation Letters, vol.6, no.3, pp.5992-5999, 2021.
24.
Muzhi Han, Zeyu Zhang, Ziyuan Jiao, Xu Xie, Yixin Zhu, Song-Chun Zhu, Hangxin Liu, "Reconstructing Interactive 3D Scenes by Panoptic Mapping and CAD Model Alignments", 2021 IEEE International Conference on Robotics and Automation (ICRA), pp.12199-12206, 2021.
25.
Zhenliang Zhang, Yixin Zhu, Song-Chun Zhu, "Graph-based Hierarchical Knowledge Representation for Robot Task Transfer from Virtual to Physical World", 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp.11139-11145, 2020.
26.
Kento Kawaharazuka, Toru Ogawa, Cota Nabeshima, "Tool Shape Optimization through Backpropagation of Neural Network", 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp.8387-8393, 2020.
27.
Zengyi Qin, Kuan Fang, Yuke Zhu, Li Fei-Fei, Silvio Savarese, "KETO: Learning Keypoint Representations for Tool Manipulation", 2020 IEEE International Conference on Robotics and Automation (ICRA), pp.7278-7285, 2020.
28.
Lin Shao, Toki Migimatsu, Jeannette Bohg, "Learning to Scaffold the Development of Robotic Manipulation Skills", 2020 IEEE International Conference on Robotics and Automation (ICRA), pp.5671-5677, 2020.
29.
Hongtao Wu, Deven Misra, Gregory S. Chirikjian, "Is That a Chair? Imagining Affordances Using Simulations of an Articulated Human Body", 2020 IEEE International Conference on Robotics and Automation (ICRA), pp.7240-7246, 2020.
30.
Pouria Chalangari, Thomas Fevens, Hassan Rivaz, "3D Human Knee Flexion Angle Estimation Using Deep Convolutional Neural Networks", 2020 42nd Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC), pp.5424-5427, 2020.

Cites in Papers - Other Publishers (31)

1.
Xinhang Song, Bohan Wang, Liye Dong, Gongwei Chen, Xinyun Hu, Shuqiang Jiang, "Object-to-Manipulation Graph for Affordance Navigation", CAAI Artificial Intelligence Research, pp.9150032, 2024.
2.
Hangxin Liu, Zeyu Zhang, Ziyuan Jiao, Zhenliang Zhang, Minchen Li, Chenfanfu Jiang, Yixin Zhu, Song-Chun Zhu, "A Reconfigurable Data Glove for Reconstructing Physical and Virtual Grasps", Engineering, 2023.
3.
Luyao Yuan, Song-Chun Zhu, "Communicative Learning: A Unified Learning Formalism", Engineering, 2023.
4.
Muzhi Han, Zeyu Zhang, Ziyuan Jiao, Xu Xie, Yixin Zhu, Song-Chun Zhu, Hangxin Liu, "Scene Reconstruction with Functional Objects for Robot Autonomy", International Journal of Computer Vision, vol.130, no.12, pp.2940, 2022.
5.
Wei Zhai, Hongchen Luo, Jing Zhang, Yang Cao, Dacheng Tao, "One-Shot Object Affordance Detection in the Wild", International Journal of Computer Vision, vol.130, no.10, pp.2472, 2022.
6.
Yinyu Nie, Angela Dai, Xiaoguang Han, Matthias Nießner, "Pose2Room: Understanding 3D Scenes from Human Activities", Computer Vision – ECCV 2022, vol.13687, pp.425, 2022.
7.
Mohammed Hassanin, Salman Khan, Murat Tahtali, "Visual Affordance and Function Understanding", ACM Computing Surveys, vol.54, no.3, pp.1, 2022.
8.
Aizreena Azaman, Husnir Nasyuha Abdul Halim, Muhammad Fariz Shafiq Abd. Aziz, Sagida M. A. Bilal, 11th Asian-Pacific Conference on Medical and Biological Engineering, vol.82, pp.135, 2021.
9.
Yongqi Zhang, Haikun Huang, Erion Plaku, Lap-Fai Yu, "Joint computational design of workspaces and workplans", ACM Transactions on Graphics, vol.40, no.6, pp.1, 2021.
10.
Hangxin Liu, Yixin Zhu, Song-Chun Zhu, "Patching interpretable And-Or-Graph knowledge representation using augmented reality", Applied AI Letters, vol.2, no.4, 2021.
11.
Ruizhen Hu, Manolis Savva, Oliver van Kaick, "Learning 3D functionality representations", SIGGRAPH Asia 2020 Courses, pp.1, 2020.
12.
Kim Wölfel, Dominik Henrich, "Affordance Based Disambiguation and Validation in Human-Robot Dialogue", Annals of Scientific Society for Assembly, Handling and Industrial Robotics, pp.307, 2020.
13.
Yixin Zhu, Tao Gao, Lifeng Fan, Siyuan Huang, Mark Edmonds, Hangxin Liu, Feng Gao, Chi Zhang, Siyuan Qi, Ying Nian Wu, Joshua B. Tenenbaum, Song-Chun Zhu, "Dark, Beyond Deep: A Paradigm Shift to Cognitive AI with Humanlike Common Sense", Engineering, vol.6, no.3, pp.310, 2020.
14.
Xiaomin Liu, Jun-Bao Li, Jeng-Shyang Pan, Shuo Wang, Xudong Lv, Shuanglong Cui, "Image-matching framework based on region partitioning for target image location", Telecommunication Systems, vol.74, no.3, pp.269, 2020.
15.
Spyridon Thermos, Georgios Th. Papadopoulos, Petros Daras, Gerasimos Potamianos, "Deep sensorimotor learning for RGB-D object recognition", Computer Vision and Image Understanding, vol.190, pp.102844, 2020.
16.
Kuan Fang, Yuke Zhu, Animesh Garg, Andrey Kurenkov, Viraj Mehta, Li Fei-Fei, Silvio Savarese, "Learning task-oriented grasping for tool manipulation from simulated self-supervision", The International Journal of Robotics Research, vol.39, no.2-3, pp.202, 2020.
17.
Mark Bugeja, Alexiei Dingli, Maria Attard, Dylan Seychell, "A Framework for Queryable Video Analysis", Proceedings of the 1st ACM Workshop on Emerging Smart Technologies and Infrastructures for Smart Mobility and Sustainability - SMAS '19, pp.21, 2019.
18.
R. Hu, M. Savva, O. van Kaick, "Functionality Representations and Applications for Shape Analysis", Computer Graphics Forum, vol.37, no.2, pp.603, 2018.
19.
Shuichi Akizuki, Masaki Iizuka, Kentaro Kozai, Manabu Hashimoto, "Integration Method of Local Evidence for Part-affordance Estimation of Everyday Objects", Journal of the Japan Society for Precision Engineering, vol.84, no.7, pp.658, 2018.
20.
Zhijian Liu, William T. Freeman, Joshua B. Tenenbaum, Jiajun Wu, "Physical Primitive Decomposition", Computer Vision – ECCV 2018, vol.11216, pp.3, 2018.
21.
Natsuki Yamanobe, Weiwei Wan, Ixchel G. Ramirez-Alpizar, Damien Petit, Tokuo Tsuji, Shuichi Akizuki, Manabu Hashimoto, Kazuyuki Nagata, Kensuke Harada, "A Brief Review of Affordance in Robotic Manipulation Research", Journal of the Robotics Society of Japan, vol.36, no.5, pp.327, 2018.
22.
Henry Muchiri, Ismail Ateya, Gregory Wanyembi, "Human Gait Indicators of Carrying a Concealed Firearm : A Skeletal Tracking and Data Mining Approach", International Journal of Scientific Research in Computer Science, Engineering and Information Technology, pp.368, 2018.
23.
Ruizhen Hu, Zihao Yan, Jingwen Zhang, Oliver Van Kaick, Ariel Shamir, Hao Zhang, Hui Huang, "Predictive and generative neural networks for object functionality", ACM Transactions on Graphics, vol.37, no.4, pp.1, 2018.
24.
Chenfanfu Jiang, Siyuan Qi, Yixin Zhu, Siyuan Huang, Jenny Lin, Lap-Fai Yu, Demetri Terzopoulos, Song-Chun Zhu, "Configurable 3D Scene Synthesis and 2D Image Rendering with Per-pixel Ground Truth Using Stochastic Grammars", International Journal of Computer Vision, vol.126, no.9, pp.920, 2018.
25.
Ruizhen Hu, Wenchao Li, Oliver Van Kaick, Ariel Shamir, Hao Zhang, Hui Huang, "Learning to predict part mobility from a single static snapshot", ACM Transactions on Graphics, vol.36, no.6, pp.1, 2017.
26.
Natsuki Yamanobe, Weiwei Wan, Ixchel G. Ramirez-Alpizar, Damien Petit, Tokuo Tsuji, Shuichi Akizuki, Manabu Hashimoto, Kazuyuki Nagata, Kensuke Harada, "A brief review of affordance in robotic manipulation research", Advanced Robotics, vol.31, no.19-20, pp.1086, 2017.
27.
Rogério Sales Gonçalves, Hermano Igo Krebs, "MIT-Skywalker: considerations on the Design of a Body Weight Support System", Journal of NeuroEngineering and Rehabilitation, vol.14, no.1, 2017.
28.
Philipp Zech, Simon Haller, Safoura Rezapour Lakani, Barry Ridge, Emre Ugur, Justus Piater, "Computational models of affordance in robotics: a taxonomy and systematic classification", Adaptive Behavior, vol.25, no.5, pp.235, 2017.
29.
R. Omar Chavez-Garcia, Mihai Andries, Pierre Luce-Vayrac, Raja Chatila, 2016 International Symposium on Experimental Robotics, vol.1, pp.679, 2017.
30.
Cornelia Fermuller, Fang Wang, Yezhou Yang, Konstantinos Zampogiannis, Yi Zhang, Francisco Barranco, Michael Pfeiffer, "Prediction of Manipulation Actions", International Journal of Computer Vision, 2017.