Abstract:
A challenging task in embodied artificial intelligence is enabling a robot to carry out a navigation task by following natural language instructions. In this task, the navigator needs to understand objects, directions, and room types, which serve as landmarks for navigation. Although objects and directions are easy to encode with an external encoder such as an object detector, current navigators struggle to encode room type information properly because existing classifiers offer low accuracy. This inadequacy causes confusion that navigators find difficult to overcome. Even humans may sometimes fail to determine the exact type of a room, since multiple room types may appear in one panorama. To mitigate this problem, we propose to encode room type information as prompts. Specifically, we first establish multi-modal, learnable prompt pools containing knowledge of room types. By querying the prompt pools, the navigator obtains room-type prompts for the current view and incorporates them using a prompt-based learning method. Experimental results on the REVERIE, R2R, and SOON datasets demonstrate the effectiveness of our approach.
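The abstract does not say how the prompt pools are queried, so the following is only a minimal sketch under stated assumptions: a key-value pool whose keys are matched to the current view feature by cosine similarity, with the top-k prompts prepended to the navigator's input tokens. The names `PromptPool`, `query`, and `prepend_prompts` are hypothetical, and the randomly initialized vectors stand in for parameters that would be learned in a real system.

```python
import math
import random


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class PromptPool:
    """Hypothetical pool of (key, prompt) pairs.

    Assumption: each key is compared with the view feature, and each
    prompt is a vector carrying room-type knowledge. Both would be
    trainable parameters; here they are fixed random values.
    """

    def __init__(self, num_prompts, dim, seed=0):
        rng = random.Random(seed)
        self.keys = [[rng.gauss(0, 1) for _ in range(dim)]
                     for _ in range(num_prompts)]
        self.prompts = [[rng.gauss(0, 1) for _ in range(dim)]
                        for _ in range(num_prompts)]

    def query(self, view_feature, top_k=2):
        """Return indices and prompts of the top_k best-matching keys."""
        scores = [cosine(view_feature, key) for key in self.keys]
        ranked = sorted(range(len(scores)),
                        key=lambda i: scores[i], reverse=True)
        chosen = ranked[:top_k]
        return chosen, [self.prompts[i] for i in chosen]


def prepend_prompts(prompts, tokens):
    """Place the selected prompts in front of the navigator's tokens."""
    return prompts + tokens


pool = PromptPool(num_prompts=4, dim=3)
view_feature = list(pool.keys[2])           # a view that matches key 2 exactly
indices, room_prompts = pool.query(view_feature, top_k=2)
tokens = [[0.0, 0.0, 0.0]]                  # placeholder navigator input
augmented = prepend_prompts(room_prompts, tokens)
```

Selecting several prompts rather than one is a design choice that fits the abstract's observation that a single panorama may contain multiple room types.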
Published in: IEEE Transactions on Circuits and Systems for Video Technology (Volume: 34, Issue: 10, October 2024)

School of Intelligent Systems Engineering, Shenzhen Campus of Sun Yat-sen University, Shenzhen, Guangdong, China
Zhaohuan Zhan received the B.S. degree from the College of Information Science and Engineering, Northeastern University, Shenyang, China, in 2016. He is currently pursuing the Ph.D. degree in computer science with the School of Intelligent Systems Engineering, Sun Yat-sen University, Shenzhen, China. His current research interests include vision and language understanding and image processing.

School of Information Engineering, Guangdong University of Technology, Guangzhou, Guangdong, China
Jinghui Qin received the B.S. and M.A.Eng. degrees from the School of Software, Sun Yat-sen University (SYSU), Guangzhou, China, in 2012 and 2014, respectively, and the Ph.D. degree from the School of Data and Computer Science, SYSU, in 2020. From 2020 to 2022, he was a Post-Doctoral Fellow with SYSU. He is currently a Lecturer with Guangdong University of Technology, Guangzhou. His research interests include natural language processing, machine learning, and computer vision.

School of Intelligent Systems Engineering, Shenzhen Campus of Sun Yat-sen University, Shenzhen, Guangdong, China
Wei Zhuo is currently pursuing the Ph.D. degree in computer science with the School of Intelligent Systems Engineering, Sun Yat-sen University, Shenzhen, China. He was a Visiting Student in computer science with the University of Helsinki. His research interests include machine learning, data mining, and graph representation learning.

School of Intelligent Systems Engineering, Shenzhen Campus of Sun Yat-sen University, Shenzhen, Guangdong, China
Guang Tan (Member, IEEE) received the Ph.D. degree in computer science from the University of Warwick, U.K., in 2007. Since 2018, he has been a Professor with the School of Intelligent Systems Engineering, Sun Yat-sen University (SYSU), where he works on the design and evaluation of mobile systems and networking. Before joining SYSU, he was a Professor with Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences. He is a member of CCF.