Abstract:
The performance of existing methods for multi-person 3D pose estimation in crowded scenes remains limited because of heavy overlap among persons. To address this issue, we propose a progressive inference scheme, Articulation-aware Knowledge Exploration (AKE), that improves multi-person 3D pose models on samples with complex occlusions at the inference stage. We argue that it is beneficial to explore the underlying articulation knowledge of the human body, which helps to further correct the predicted poses in those samples. To exploit this knowledge, we propose an iterative scheme that forms a self-improving loop for keypoint association. Specifically, we introduce a kinematic validation module that locates unreasonable articulations and an occluded-keypoint discovering module that discovers occluded articulations. Extensive experiments on two challenging benchmarks under both weakly-supervised and fully-supervised settings demonstrate the superiority and generalization ability of the proposed method in crowded scenes.
Published in: IEEE Transactions on Multimedia (Volume: 26)