Abstract:
The detection of cardiac phase in ultrasound videos, i.e., identifying end-systolic (ES) and end-diastolic (ED) frames, is a critical step in assessing cardiac function, monitoring structural changes, and diagnosing congenital heart disease. Current popular methods use recurrent neural networks to track dependencies over long sequences for cardiac phase detection, but often overlook the short-term motion of cardiac valves that sonographers rely on. In this paper, we propose a novel optical flow-enhanced Mamba U-Net framework designed to exploit both short-term motion and long-term dependencies for cardiac phase detection in ultrasound videos. We use optical flow to capture the short-term motion of cardiac muscles and valves between adjacent frames, enhancing the input video, and employ a Mamba layer to track long-term dependencies across cardiac cycles. We then develop regression branches based on the U-Net architecture, which integrate short-term and long-term information while extracting multi-scale features. With this design, the network generates a regression score for each frame, from which the keyframes (i.e., ES and ED frames) are identified. Additionally, we design a keyframe-weighted loss function that guides the network to focus more on keyframes than on intermediate-period frames. Our method outperforms advanced baseline methods, achieving frame mismatches of 1.465 frames for ES and 0.842 frames for ED on the Fetal Echocardiogram dataset, where heart rates are higher and phase changes occur rapidly, and 2.444 frames and 2.072 frames on the publicly available adult EchoNet-Dynamic dataset. Its accuracy and robustness on both fetal and adult datasets highlight its potential for clinical application.
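The abstract mentions a keyframe-weighted loss but does not specify its exact form. Below is a minimal sketch of one plausible realization, assuming per-frame regression scores and a binary keyframe mask; the weighting scheme and the hyperparameters w_key and w_base are illustrative assumptions, not the authors' formulation.

```python
# Minimal sketch (assumed form) of a keyframe-weighted regression loss:
# frames marked as ES/ED keyframes receive a larger weight than
# intermediate-period frames. w_key and w_base are hypothetical values.
import torch

def keyframe_weighted_loss(pred, target, keyframe_mask, w_key=5.0, w_base=1.0):
    """
    pred, target:  (B, T) per-frame regression scores.
    keyframe_mask: (B, T) binary mask, 1 at ES/ED keyframes, 0 elsewhere.
    """
    weights = torch.where(keyframe_mask.bool(),
                          torch.full_like(pred, w_key),
                          torch.full_like(pred, w_base))
    per_frame = weights * (pred - target) ** 2   # weighted squared error per frame
    return per_frame.sum() / weights.sum()       # normalize by total weight

# Usage example with random tensors (batch of 2 clips, 32 frames each).
if __name__ == "__main__":
    B, T = 2, 32
    pred = torch.rand(B, T)
    target = torch.rand(B, T)
    mask = torch.zeros(B, T)
    mask[:, [5, 20]] = 1.0  # hypothetical ES/ED frame positions
    print(keyframe_weighted_loss(pred, target, mask).item())
```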
Published in: IEEE Transactions on Medical Imaging (Early Access)