Video Saliency Prediction Using Spatiotemporal Residual Attentive Networks


Abstract:

This paper proposes a novel residual attentive learning network architecture for predicting dynamic eye-fixation maps. The proposed model addresses two essential issues: effective spatiotemporal feature integration and multi-scale saliency learning. For the first, appearance and motion streams are tightly coupled via dense residual cross connections, which integrate appearance information with multi-layer, comprehensive motion features in a residual and dense way. Unlike traditional two-stream models, which learn appearance and motion features separately, this design allows early, multi-path information exchange between the two domains, leading to a unified and powerful spatiotemporal learning architecture. For the second, we propose a composite attention mechanism that learns multi-scale local attentions and global attention priors end-to-end. It enhances the fused spatiotemporal features by emphasizing important features at multiple scales. A lightweight convolutional Gated Recurrent Unit (convGRU), which is well suited to settings with limited training data, is used to model long-term temporal characteristics. Extensive experiments on four benchmark datasets clearly demonstrate the advantage of the proposed video saliency model over other competitors and the effectiveness of each component of our network. Our code and all the results will be available at https://github.com/ashleylqx/STRA-Net.
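
To make the idea of enhancing features with attention in a residual way concrete, the following is a minimal single-scale sketch in plain Python. It assumes one common residual attention form, `out = features * (1 + attention)`, with the attention map obtained by a softmax over spatial positions. The abstract does not state the exact formulation used in the paper, so the function name `residual_attention`, the softmax normalization, and the toy inputs are all illustrative assumptions rather than the authors' method.

```python
import math


def residual_attention(features, scores):
    """Enhance a 2-D feature map with a residual spatial attention map.

    Assumed form (not taken from the paper): out = features * (1 + attention),
    where attention is a softmax over all spatial positions of `scores`.
    """
    flat = [s for row in scores for s in row]
    peak = max(flat)  # subtract the maximum for numerical stability
    exps = [[math.exp(s - peak) for s in row] for row in scores]
    total = sum(e for row in exps for e in row)
    attention = [[e / total for e in row] for row in exps]
    # Residual form: the original feature is kept and the attended part is added.
    enhanced = [
        [f * (1.0 + a) for f, a in zip(f_row, a_row)]
        for f_row, a_row in zip(features, attention)
    ]
    return enhanced, attention


# Toy 2x2 feature map and unnormalized attention scores (hypothetical values).
features = [[1.0, 2.0], [3.0, 4.0]]
scores = [[0.0, 0.0], [0.0, 2.0]]
enhanced, attention = residual_attention(features, scores)
```

Under this assumed form, positions with near-zero attention keep their original feature values instead of being suppressed, which is the usual motivation for the residual term. A multi-scale variant would apply the same step to feature maps at several resolutions.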
Published in: IEEE Transactions on Image Processing (Volume: 29)
Page(s): 1113 - 1126
Date of Publication: 23 August 2019

PubMed ID: 31449021


I. Introduction

Humans are able to rapidly orient attention to important areas in the visual field and filter out irrelevant information. This selective process, known as the visual attention mechanism, helps humans process huge amounts of visual information in real time. Visual attention has been studied in the computer vision community since the 1990s [1], and has found wide application in tasks such as object segmentation [2], [3], [77] and video summarization [4], to name a few. Integrating an attention mechanism into these tasks can direct limited computational resources to the most task-relevant targets and yield biologically inspired results.

Qiuxia Lai
Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong
Qiuxia Lai received the B.E. and M.S. degrees from the School of Automation, Huazhong University of Science and Technology, in 2013 and 2016, respectively. She is currently pursuing the Ph.D. degree with the Department of Computer Science and Engineering, The Chinese University of Hong Kong. Her research interests include image/video processing and deep learning.
Wenguan Wang
Beijing Laboratory of Intelligent Information Technology, School of Computer Science, Beijing Institute of Technology, Beijing, China
Wenguan Wang received the Ph.D. degree from the Beijing Institute of Technology in 2018. He was a joint Ph.D. student at the Department of Statistics, University of California, directed by Prof. S.-C. Zhu during 2016–2018. He is currently a Senior Scientist with the Inception Institute of Artificial Intelligence (IIAI), United Arab Emirates. His current research interests include visual relation understanding and graph neural networks.
Hanqiu Sun
Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong
Hanqiu Sun received the M.S. degree in electrical engineering from The University of British Columbia and the Ph.D. degree in computer science from the University of Alberta, Canada. She is currently an Associate Professor with The Chinese University of Hong Kong. Her research interests include virtual reality, interactive graphics/animation, real-time hypermedia, virtual surgery, mobile image/video synopsis and navigation, and touch-enhanced and dynamics simulations.
Jianbing Shen
Beijing Laboratory of Intelligent Information Technology, School of Computer Science, Beijing Institute of Technology, Beijing, China
Jianbing Shen (M’11–SM’12) is currently the Lead Scientist with the Inception Institute of Artificial Intelligence (IIAI), United Arab Emirates, and an Adjunct Honorary Professor with the Beijing Institute of Technology. He has published about 100 journal and conference articles, eight of which have been selected as ESI Highly Cited Papers. His research interests include computer vision, deep learning, autonomous driving, and medical image analysis. He is also an Associate Editor of the IEEE Transactions on Image Processing, IEEE Transactions on Neural Networks and Learning Systems, and other journals.