I. Introduction
Human fall detection (HFD) aims to continuously and precisely identify human fall events, a task that is both fundamental and challenging. Research shows that HFD has a profound influence in various domains, such as healthcare [1], security surveillance [2], [3], and pose recovery [4], [5]. Although the fall detection community has witnessed encouraging progress in recent years, robust fall detection in complicated scenes still struggles to cope with diverse types of interference, such as posture variation, dim lighting, and background clutter. For example, human falls exhibit a wide range of variations, including the fall’s direction and velocity as well as the body’s posture throughout the descent. The unpredictability of human movements and changing environments therefore limits the accuracy with which detection systems can predict falls. Furthermore, the effectiveness of many detection methods largely depends on the quality of the learned feature representations, rendering them susceptible to cluttered backgrounds. Although several attempts have been made to tackle this problem, they typically struggle when similar-looking individuals interact frequently. To date, reliable and high-performing solutions for detecting daily human falls remain under active study.