Describing the scene as a whole: Joint object detection, scene classification and semantic segmentation | IEEE Conference Publication | IEEE Xplore

Describing the scene as a whole: Joint object detection, scene classification and semantic segmentation


Abstract:

In this paper we propose an approach to holistic scene understanding that reasons jointly about regions, location, class and spatial extent of objects, presence of a class in the image, as well as the scene type. Learning and inference in our model are efficient as we reason at the segment level, and introduce auxiliary variables that allow us to decompose the inherent high-order potentials into pairwise potentials between a few variables with a small number of states (at most the number of classes). Inference is done via a convergent message-passing algorithm, which, unlike graph-cuts inference, has no submodularity restrictions and does not require potential-specific moves. We believe this is very important, as it allows us to encode our ideas and prior knowledge about the problem without the need to change the inference engine every time we introduce a new potential. Our approach outperforms the state-of-the-art on the MSRC-21 benchmark, while being much faster. Importantly, our holistic model is able to improve performance in all tasks.
Date of Conference: 16-21 June 2012
Date Added to IEEE Xplore: 26 July 2012
Conference Location: Providence, RI, USA

1. Introduction

While there has been significant progress in solving tasks such as image labeling [14], object detection [5] and scene classification [26], existing approaches could benefit from solving these problems jointly [9]. For example, segmentation should be easier if we know where the object of interest is. Similarly, if we know the type of the scene, we can narrow down the classes we expect to see, e.g., if we are looking at the sea, we are more likely to see a boat than a cow. Conversely, if we know which semantic regions (e.g., sky, road) and which objects are present in the scene, we can more accurately infer the scene type. Holistic scene understanding aims at recovering multiple related aspects of a scene so as to provide a deeper understanding of the scene as a whole.
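
The sea/boat/cow example can be made concrete with a toy sketch. It assumes one scene variable, one binary presence variable per class, and pairwise scene-class compatibilities. All names and scores (`scene_score`, `presence_score`, `compat`) are hypothetical and chosen only for illustration. Brute-force enumeration stands in for the message-passing inference mentioned in the abstract, so this is not the model from the paper.

```python
from itertools import product

SCENES = ["sea", "field"]
CLASSES = ["boat", "cow"]

# Hypothetical unary scores (higher is better); every number here is made up.
scene_score = {"sea": 1.0, "field": 0.2}

# Score for declaring a class absent (0) or present (1) from image evidence alone.
presence_score = {
    "boat": {0: 0.0, 1: -0.2},  # weak evidence against a boat
    "cow": {0: 0.0, 1: 0.3},    # weak evidence for a cow
}

# Pairwise scene-class compatibility, counted only when the class is present.
compat = {
    ("sea", "boat"): 1.0, ("sea", "cow"): -1.0,
    ("field", "boat"): -1.0, ("field", "cow"): 1.0,
}


def joint_score(scene, present):
    """Sum of unary and pairwise terms for one joint assignment."""
    total = scene_score[scene]
    for cls, b in zip(CLASSES, present):
        total += presence_score[cls][b]
        if b:
            total += compat[(scene, cls)]
    return total


def independent_decision():
    """Pick the scene and each class presence separately, ignoring compat."""
    scene = max(SCENES, key=scene_score.get)
    present = tuple(
        max((0, 1), key=lambda b: presence_score[c][b]) for c in CLASSES
    )
    return scene, present


def joint_decision():
    """Exhaustively search all joint assignments (feasible only for toy sizes)."""
    candidates = product(SCENES, product((0, 1), repeat=len(CLASSES)))
    return max(candidates, key=lambda sp: joint_score(*sp))


if __name__ == "__main__":
    print("independent:", independent_decision())
    print("joint:      ", joint_decision())
```

With these made-up scores, the independent decisions keep the cow and drop the boat, whereas the joint assignment prefers a sea scene with a boat and no cow. Every term couples at most two variables, each with few states, which is what keeps this kind of model tractable for general-purpose inference.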
