
Semantic Segmentation Using Transfer Learning on Fisheye Images


Abstract:

While semantic segmentation has been extensively studied in the realm of regular perspective images, its application to fisheye images remains relatively unexplored. Existing literature on fisheye semantic segmentation mostly revolves around multi-task or multi-modal models, which are computationally intensive. This motivated us to assess the performance of current segmentation methods specifically on fisheye images. Surprisingly, we discover that these methods do not yield satisfactory results when directly trained on fisheye datasets using a fully supervised approach. This can be attributed to the fact that the models are not designed to handle fisheye images, and the available fisheye datasets are not sufficiently large to effectively train complex models. To overcome these challenges, we propose a novel training method that employs Transfer Learning (TL) on existing semantic segmentation models focused on a single task and modality. To achieve this, we investigate six different fine-tuning configurations using the WoodScape fisheye image segmentation dataset. Furthermore, we introduce a pre-training stage that learns from perspective images by applying a fisheye transformation before employing transfer learning. As a result, our proposed training pipeline demonstrates a remarkable 18.29% improvement in mean Intersection over Union (mIoU) compared to directly adopting the best existing segmentation methods for fisheye images.
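
The paper provides no code in this abstract, but the general shape of one such transfer-learning configuration can be sketched as follows. This is a minimal illustration only: the choice of DeepLabV3-ResNet50 (used because torchvision ships pretrained weights for it), the 10-class WoodScape label count, and the hyperparameters are assumptions, not the paper's actual setup.

import torch
import torch.nn as nn
from torchvision.models.segmentation import deeplabv3_resnet50

NUM_WOODSCAPE_CLASSES = 10  # assumed label count, including void

# Start from weights learned on perspective images.
model = deeplabv3_resnet50(weights="DEFAULT")

# One possible fine-tuning configuration: freeze the pretrained backbone
# and train only the segmentation head on the fisheye data.
for param in model.backbone.parameters():
    param.requires_grad = False

# Replace the final 1x1 classifier so the output matches the fisheye label set.
model.classifier[4] = nn.Conv2d(256, NUM_WOODSCAPE_CLASSES, kernel_size=1)

# Optimize only the parameters left trainable.
optimizer = torch.optim.AdamW(
    (p for p in model.parameters() if p.requires_grad), lr=1e-4
)

The paper compares six such configurations (differing in which layers are frozen or re-initialized), together with a pre-training stage on fisheye-transformed perspective images.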
Date of Conference: 15-17 December 2023
Date Added to IEEE Xplore: 19 March 2024
Conference Location: Jacksonville, FL, USA

I. Introduction

Semantic segmentation, a popular task in computer vision, is gaining increasing attention in the domain of fisheye images, particularly in the context of autonomous driving. Fisheye images possess a wide field-of-view (FOV), ranging from 100° to 180°, allowing them to capture a larger amount of information from the surrounding environment. This characteristic makes fisheye images widely utilized in autonomous driving [1], surveillance [2], and augmented reality (AR) [3] applications. However, the advantage of fisheye images comes at the cost of significant optical distortion caused by the highly non-linear mapping of real-world scenes captured by fisheye lenses [4]. Correcting this radial distortion introduces various drawbacks, such as a reduced field-of-view and resampling of distorted features in the periphery [5]. Consequently, previous approaches that unwrapped fisheye images into rectilinear images for semantic segmentation have not yielded satisfactory performance [6], [7].
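To make the non-linear mapping concrete, the short sketch below compares the equidistant fisheye model (r = f·θ), a common approximation for fisheye lenses, with the rectilinear pinhole model (r = f·tan θ). The divergence of the pinhole radius near 90° illustrates why unwrapping a wide-FOV fisheye image stretches and resamples the periphery. The equidistant model and the focal length value are illustrative assumptions, not parameters of the WoodScape cameras.

import numpy as np

def fisheye_radius(theta, f):
    """Equidistant fisheye model: image radius r = f * theta
    for a ray at angle theta (radians) from the optical axis."""
    return f * theta

def perspective_radius(theta, f):
    """Pinhole (rectilinear) model: r = f * tan(theta).
    Diverges as theta approaches 90 degrees, so rectification heavily
    stretches and resamples content near the edge of the FOV."""
    return f * np.tan(theta)

focal = 300.0  # focal length in pixels, illustrative value only
for deg in (10, 45, 80):
    theta = np.deg2rad(deg)
    print(f"{deg:2d} deg: fisheye r = {fisheye_radius(theta, focal):7.1f} px, "
          f"perspective r = {perspective_radius(theta, focal):7.1f} px")

At 80° the rectilinear radius is roughly four times the fisheye radius, which is why rectified wide-angle images either lose FOV or suffer severe peripheral resampling.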

References
1. S. Yogamani, C. Hughes, J. Horgan, G. Sistu, P. Varley, D. O'Dea, M. Uricar, S. Milz, M. Simon, K. Amende et al., "WoodScape: A multi-task multi-camera fisheye dataset for autonomous driving", Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 9308-9318, 2019.
2. H. Kim, J. Jung and J. Paik, "Fisheye lens camera based surveillance system for wide field of view monitoring", Optik, vol. 127, no. 14, pp. 5636-5646, 2016.
3. S. Urban, J. Leitloff, S. Wursthorn and S. Hinz, "Self-localization of a multi-fisheye camera based augmented reality system in textureless 3d building models", ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences, vol. 2, pp. 43-48, 2013.
4. A. Saez, L. M. Bergasa, E. Romera, E. Lopez, R. Barea and R. Sanz, "CNN-based fisheye image real-time semantic segmentation", 2018 IEEE Intelligent Vehicles Symposium (IV), pp. 1039-1044, 2018.
5. V. R. Kumar, S. Yogamani, H. Rashed, G. Sistu, C. Witt, I. Leang, et al., "OmniDet: Surround view cameras based multi-task visual perception network for autonomous driving", IEEE Robotics and Automation Letters, vol. 6, no. 2, pp. 2830-2837, 2021.
6. A. R. Sekkat, Y. Dupuis, P. Honeine and P. Vasseur, "A comparative study of semantic segmentation of omnidirectional images from a motorcycle perspective", Scientific Reports, vol. 12, no. 1, p. 4968, 2022.
7. Y. Ye, K. Yang, K. Xiang, J. Wang and K. Wang, "Universal semantic segmentation for fisheye urban driving images", 2020 IEEE International Conference on Systems, Man and Cybernetics (SMC), pp. 648-655, 2020.
8. J. Chen, J. Lu, X. Zhu and L. Zhang, "Generative semantic segmentation", Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 7111-7120, 2023.
9. K. Li, Z. Wang, Z. Cheng, R. Yu, Y. Zhao, G. Song, et al., "ACSeg: Adaptive conceptualization for unsupervised semantic segmentation", Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 7162-7172, 2023.
10. F. Liang, B. Wu, X. Dai, K. Li, Y. Zhao, H. Zhang, et al., "Open-vocabulary semantic segmentation with mask-adapted CLIP", Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 7061-7070, 2023.
11. W. He, S. Jamonnak, L. Gou and L. Ren, "CLIP-S4: Language-guided self-supervised semantic segmentation", Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 11207-11216, 2023.
12. A. Kirillov, E. Mintun, N. Ravi, H. Mao, C. Rolland, L. Gustafson, T. Xiao, S. Whitehead, A. C. Berg, W.-Y. Lo et al., "Segment anything", arXiv preprint, 2023.
13. L. Deng, M. Yang, H. Li, T. Li, B. Hu and C. Wang, "Restricted deformable convolution-based road scene semantic segmentation using surround view cameras", IEEE Transactions on Intelligent Transportation Systems, vol. 21, no. 10, pp. 4350-4362, 2019.
14. E. Xie, W. Wang, Z. Yu, A. Anandkumar, J. M. Alvarez and P. Luo, "SegFormer: Simple and efficient design for semantic segmentation with transformers", Advances in Neural Information Processing Systems, vol. 34, pp. 12077-12090, 2021.
15. S. Paul, Z. Patterson and N. Bouguila, "Improved training for 3d point cloud classification", Joint IAPR International Workshops on Statistical Techniques in Pattern Recognition (SPR) and Structural and Syntactic Pattern Recognition (SSPR), pp. 253-263, 2022.
16. S. Roy and S. Paul, "Land-use detection using residual convolutional neural network", 2019 1st International Conference on Advances in Science, Engineering and Robotics Technology (ICASERT), pp. 1-6, 2019.
17. S. Paul, Z. Patterson and N. Bouguila, "CrossMoCo: Multi-modal momentum contrastive learning for point cloud", 2023 20th Conference on Robots and Vision (CRV), pp. 273-280, 2023.
18. L. Cui, X. Jing, Y. Wang, Y. Huan, Y. Xu and Q. Zhang, "Improved Swin transformer-based semantic segmentation of postearthquake dense buildings in urban areas using remote sensing images", IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 16, pp. 369-385, 2022.
19. S. Zheng, J. Lu, H. Zhao, X. Zhu, Z. Luo, Y. Wang, Y. Fu, J. Feng, T. Xiang, P. H. Torr et al., "Rethinking semantic segmentation from a sequence-to-sequence perspective with transformers", Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 6881-6890, 2021.
20. B. Zhang, Z. Tian, Q. Tang, X. Chu, X. Wei, C. Shen et al., "SegViT: Semantic segmentation with plain vision transformers", Advances in Neural Information Processing Systems, vol. 35, pp. 4971-4982, 2022.
21. A. Saez, L. M. Bergasa, E. Lopez-Guillen, E. Romera, M. Tradacete, C. Gomez-Huelamo, et al., "Real-time semantic segmentation for fisheye urban driving images based on ERFNet", Sensors, vol. 19, no. 3, p. 503, 2019.
22. C. Playout, O. Ahmad, F. Lecue and F. Cheriet, "Adaptable deformable convolutions for semantic segmentation of fisheye images in autonomous driving systems", arXiv preprint, 2021.
23. L. Deng, M. Yang, Y. Qian, C. Wang and B. Wang, "CNN based semantic segmentation for urban traffic scenes using fisheye camera", 2017 IEEE Intelligent Vehicles Symposium (IV), pp. 231-236, 2017.
24. G. Blott, M. Takami and C. Heipke, "Semantic segmentation of fisheye images", Proceedings of the European Conference on Computer Vision (ECCV) Workshops, 2018.
25. V. R. Kumar, M. Klingner, S. Yogamani, S. Milz, T. Fingscheidt and P. Mader, "SynDistNet: Self-supervised monocular fisheye camera distance estimation synergized with semantic segmentation for autonomous driving", Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 61-71, 2021.
26. E. Olivas, J. Guerrero, M. Martinez-Sober, J. Magdalena-Benedito and A. Lopez, "Transfer learning", Handbook of Research on Machine Learning Applications, ch. 11, 2009.
27. M. Akhand, S. Roy, N. Siddique, M. A. S. Kamal and T. Shimamura, "Facial emotion recognition using transfer learning in the deep CNN", Electronics, vol. 10, no. 9, p. 1036, 2021.
28. C. B. Do and A. Y. Ng, "Transfer learning for text classification", Advances in Neural Information Processing Systems, vol. 18, 2005.
29. M. Wurm, T. Stark, X. X. Zhu, M. Weigand and H. Taubenböck, "Semantic segmentation of slums in satellite images using transfer learning on fully convolutional neural networks", ISPRS Journal of Photogrammetry and Remote Sensing, vol. 150, pp. 59-69, 2019.
30. R. Dufour, C. Meurie, C. Strauss and O. Lezoray, "Instance segmentation in fisheye images", 2020 Tenth International Conference on Image Processing Theory, Tools and Applications (IPTA), pp. 1-6, 2020.