Temporally-Consistent Video Semantic Segmentation with Bidirectional Occlusion-guided Feature Propagation

Abstract:

Despite recent progress in static image segmentation, video segmentation remains challenging because it requires a model that is accurate, fast, and temporally consistent. Running static image segmentation on every frame of a video is not acceptable, since it is computationally prohibitive and prone to temporal inconsistency. In this paper, we present a bidirectional occlusion-guided feature propagation (BOFP) method that aims to improve the temporal consistency of segmentation results without sacrificing segmentation accuracy, while keeping the computation cost low. It exploits temporal coherence in the video by propagating features from keyframes to other frames along the motion paths in both the forward and backward directions. We propose an occlusion-based attention network that estimates the distorted areas from bidirectional optical flows and uses them as cues for correcting and fusing the propagated features. Extensive experiments on benchmark datasets demonstrate that the proposed BOFP method achieves superior temporal consistency while maintaining a comparable level of segmentation accuracy at a low computation cost, striking a good balance among the three performance metrics essential for evaluating video segmentation solutions.

Published in: 2024 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)

Date of Conference: 03-08 January 2024

Date Added to IEEE Xplore: 09 April 2024

DOI: 10.1109/WACV57701.2024.00074

Conference Location: Waikoloa, HI, USA

1. Introduction

Semantic segmentation is a fundamental problem in visual recognition systems. Despite tremendous progress in static image segmentation [3], [6], [8], [56], [63], video semantic segmentation (VSS) remains a challenging problem, mainly for two reasons. First, processing the sheer amount of video data in real time is non-trivial for some practical applications because of resource constraints. Second, and more importantly, the segmentation predictions need to be temporally consistent in order to avoid the so-called "flickering" problem [31], [41].
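
The abstract describes propagating keyframe features along optical flow in both directions and using bidirectional flows to locate distorted (occluded) areas before fusing. The sketch below illustrates only the generic ingredients behind that idea, under stated assumptions: nearest-neighbor backward warping, a forward-backward flow consistency check as a stand-in for occlusion estimation, and a hard-mask average as a stand-in for fusion. It is not the paper's occlusion-based attention network, which is learned. The function names (`warp_nearest`, `occlusion_mask`, `fuse`), the `(dx, dy)` flow layout, and the tolerance `tol` are all hypothetical choices made for illustration.

```python
import numpy as np


def warp_nearest(feat, flow):
    """Backward-warp feat (H, W, C) using flow (H, W, 2) stored as (dx, dy).

    out[y, x] = feat[y + dy, x + dx], rounded to the nearest pixel.
    Pixels whose source falls outside the image are zeroed and marked invalid.
    """
    h, w = flow.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    src_x = np.rint(xs + flow[..., 0]).astype(int)
    src_y = np.rint(ys + flow[..., 1]).astype(int)
    valid = (src_x >= 0) & (src_x < w) & (src_y >= 0) & (src_y < h)
    out = np.zeros_like(feat)
    out[valid] = feat[src_y[valid], src_x[valid]]
    return out, valid


def occlusion_mask(flow_tk, flow_kt, tol=1.0):
    """Assumed heuristic: forward-backward consistency check.

    flow_tk maps pixels of frame t to keyframe k; flow_kt maps k back to t.
    For a visible pixel the two flows should roughly cancel, so a large
    residual (or an out-of-bounds source) is treated as occluded.
    """
    flow_kt_at_t, valid = warp_nearest(flow_kt, flow_tk)
    residual = np.linalg.norm(flow_tk + flow_kt_at_t, axis=-1)
    return (~valid) | (residual > tol)


def fuse(feat_fwd, occ_fwd, feat_bwd, occ_bwd):
    """Average two propagated feature maps, ignoring occluded pixels.

    A hard 0/1 weighting used here only for illustration; pixels occluded
    in both directions come out as zeros.
    """
    w_f = (~occ_fwd).astype(float)[..., None]
    w_b = (~occ_bwd).astype(float)[..., None]
    total = np.maximum(w_f + w_b, 1.0)
    return (w_f * feat_fwd + w_b * feat_bwd) / total
```

In this sketch a non-keyframe would receive one warped feature map from the previous keyframe and one from the next, each with its own occlusion mask, and `fuse` would combine them. A learned attention module, as the abstract describes, would replace the hard masks with soft, data-dependent weights.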
