MemFlow: Optical Flow Estimation and Prediction with Memory | IEEE Conference Publication | IEEE Xplore

MemFlow: Optical Flow Estimation and Prediction with Memory


Abstract:

Optical flow is a classical task that is important to the vision community. Classical optical flow estimation uses two frames as input, whilst some recent methods consider multiple frames to explicitly model long-range information. The former cannot fully leverage temporal coherence along the video sequence, and the latter incur heavy computational overhead that typically rules out real-time flow estimation. Some multi-frame-based approaches even require unseen future frames for the current estimate, compromising real-time applicability in safety-critical scenarios. To this end, we present MemFlow, a real-time method for optical flow estimation and prediction with memory. Our method uses memory readout and update modules to aggregate historical motion information in real time. Furthermore, we integrate resolution-adaptive re-scaling to accommodate diverse video resolutions. Our approach also extends seamlessly to the future prediction of optical flow based on past observations. Leveraging effective historical motion aggregation, our method outperforms VideoFlow with fewer parameters and faster inference speed on the Sintel and KITTI-15 datasets in terms of generalization performance. At the time of submission, MemFlow also leads in performance on the 1080p Spring dataset. Code and models will be available at: https://dqiaole.github.io/MemFlow/.
Date of Conference: 16-22 June 2024
Date Added to IEEE Xplore: 16 September 2024
Conference Location: Seattle, WA, USA

1. Introduction

Optical flow, a critical area in computer vision, plays a key role in various real-world applications such as video in-painting [21], action recognition [58], and video prediction [23], [67]. In essence, it captures the displacement vector field for each pixel between successive video frames. Recent advances in optical flow estimation have been successful, as highlighted by works such as FlowNet [28], PWC-Net [57], RAFT [60], SKFlow [59], FlowFormer [26], and the rethought training approach of MatchFlow [17]. This success is attributed to advancements in model architectures [26], [57], [60] and dedicated datasets [17], [19], [42].
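To make the notion of a per-pixel displacement field concrete, the following is a minimal sketch in plain Python. It is not taken from MemFlow: the nested-list layout `flow[y][x] = (dx, dy)` and the helper name `displaced_coords` are assumptions chosen only for illustration.

```python
from typing import List, Tuple

# Hypothetical layout: flow[y][x] = (dx, dy), the displacement of pixel (x, y)
# from frame t to frame t+1, measured in pixels.
Flow = List[List[Tuple[float, float]]]


def displaced_coords(flow: Flow) -> List[List[Tuple[float, float]]]:
    """Return, for every pixel (x, y) of frame t, its position in frame t+1."""
    return [
        [(x + dx, y + dy) for x, (dx, dy) in enumerate(row)]
        for y, row in enumerate(flow)
    ]


# Toy 2x2 field: the top row moves one pixel to the right,
# the bottom row moves half a pixel up (negative y direction).
flow = [
    [(1.0, 0.0), (1.0, 0.0)],
    [(0.0, -0.5), (0.0, -0.5)],
]
coords = displaced_coords(flow)
```

Here `coords[y][x]` holds the location in the next frame that the flow assigns to pixel `(x, y)`. Practical implementations typically store the same information as a dense array with two channels.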

End-point-error on Sintel (clean) vs. inference time (ms) and model size (M). All models are trained on FlyingChairs and FlyingThings3D, and tested with one NVIDIA A100 GPU. MemFlow(-T) (x it) indicates running our network with only x iterations of GRU. Our MemFlow(-T) achieves significant reductions in computational overhead as well as substantial performance boosts over the state-of-the-art methods.
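For readers unfamiliar with the metric, end-point error is commonly defined as the Euclidean distance between predicted and ground-truth flow vectors, averaged over pixels. The sketch below follows that common definition in plain Python. The function name `end_point_error` and the nested-list flow layout are assumptions for illustration, and benchmark-specific details such as validity masks are not modeled.

```python
import math
from typing import List, Tuple

# Hypothetical layout: flow[y][x] = (u, v), the flow vector at pixel (x, y).
Flow = List[List[Tuple[float, float]]]


def end_point_error(pred: Flow, gt: Flow) -> float:
    """Mean Euclidean distance between predicted and ground-truth flow vectors."""
    total, count = 0.0, 0
    for row_pred, row_gt in zip(pred, gt):
        for (pu, pv), (gu, gv) in zip(row_pred, row_gt):
            total += math.hypot(pu - gu, pv - gv)
            count += 1
    if count == 0:
        raise ValueError("flow fields must contain at least one pixel")
    return total / count


# One pixel is off by the vector (3, 4), i.e. 5 pixels; the other is exact.
pred = [[(3.0, 4.0), (0.0, 0.0)]]
gt = [[(0.0, 0.0), (0.0, 0.0)]]
epe = end_point_error(pred, gt)  # (5 + 0) / 2 = 2.5
```

A lower value means the predicted displacements lie closer to the ground truth.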
