3D human pose estimation aims to simultaneously localize humans and estimate their articulated 3D joint locations from 2D images, which facilitates substantial practical applications in human-computer interaction [1], [2], [3], [4], [5], [6], such as Virtual Reality (VR)/Augmented Reality (AR), human action recognition, telemedicine and telesurgery, and visual impairment assistance. Recently, great progress has been achieved thanks to the sophisticated design of models [7], [8], [9], [10] and the availability of large-scale datasets [11], [12], [13]. Nevertheless, these methods still perform poorly in crowded scenes, where severe overlapping of human body parts leads to incorrect detection or association of keypoints for multi-person 3D pose estimation.
Abstract:
The performance of existing methods for multi-person 3D pose estimation in crowded scenes is still limited due to the challenge of heavy overlapping among persons. To address this issue, we propose a progressive inference scheme, i.e., Articulation-aware Knowledge Exploration (AKE), to improve multi-person 3D pose models on samples with complex occlusions at the inference stage. We argue that it is beneficial to explore the underlying articulated knowledge of the human body, which helps to further correct the predicted poses in such samples. To exploit this knowledge, we propose an iterative scheme that achieves a self-improving loop for keypoint association. Specifically, we introduce a kinematic validation module for locating unreasonable articulations and an occluded-keypoint discovering module for recovering occluded articulations. Extensive experiments on two challenging benchmarks, under both weakly-supervised and fully-supervised settings, demonstrate the superiority and generalization ability of our proposed method for crowded scenes.
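The iterative validate-and-correct loop described above can be sketched as follows. This is a minimal illustrative example, not the paper's actual AKE implementation: the bone list, canonical bone lengths, deviation threshold, and the bone-length-based validation/correction rules are all simplifying assumptions made here for clarity.

```python
import numpy as np

# Illustrative 4-joint kinematic chain (e.g., hip -> knee -> ankle -> toe).
# Bone pairs and canonical lengths (in meters) are assumptions for this sketch.
BONES = [(0, 1), (1, 2), (2, 3)]
REF_LENGTHS = np.array([0.45, 0.45, 0.25])


def find_invalid_joints(pose, tol=0.3):
    """Kinematic validation (stand-in for the paper's module): flag child
    joints whose bone length deviates from the canonical length by > tol."""
    invalid = set()
    for (parent, child), ref in zip(BONES, REF_LENGTHS):
        length = np.linalg.norm(pose[child] - pose[parent])
        if abs(length - ref) / ref > tol:
            invalid.add(child)
    return invalid


def correct_joints(pose, invalid):
    """Occluded-keypoint recovery (stand-in): snap each flagged child joint
    back to the canonical bone length along its current bone direction."""
    pose = pose.copy()
    for (parent, child), ref in zip(BONES, REF_LENGTHS):
        if child in invalid:
            direction = pose[child] - pose[parent]
            norm = np.linalg.norm(direction)
            if norm > 1e-8:
                pose[child] = pose[parent] + direction / norm * ref
    return pose


def progressive_inference(pose, max_iters=5):
    """Self-improving loop: alternate validation and correction until
    no joint is flagged or the iteration budget is exhausted."""
    for _ in range(max_iters):
        invalid = find_invalid_joints(pose)
        if not invalid:
            break
        pose = correct_joints(pose, invalid)
    return pose
```

Note that corrections can propagate: fixing one joint may invalidate a downstream bone, which a later iteration then repairs, hence the loop rather than a single pass.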
Published in: IEEE Transactions on Multimedia ( Volume: 26)