Towards Accurate Multi-person Pose Estimation in the Wild

Abstract:

We propose a method for multi-person detection and 2-D pose estimation that achieves state-of-the-art results on the challenging COCO keypoints task. It is a simple yet powerful top-down approach consisting of two stages. In the first stage, we predict the location and scale of boxes that are likely to contain people, using the Faster R-CNN detector. In the second stage, we estimate the keypoints of the person potentially contained in each proposed bounding box. For each keypoint type, we predict dense heatmaps and offsets using a fully convolutional ResNet. To combine these outputs, we introduce a novel aggregation procedure that yields highly localized keypoint predictions. We also use a novel form of keypoint-based Non-Maximum Suppression (NMS) instead of the cruder box-level NMS, and a novel form of keypoint-based confidence score estimation instead of box-level scoring. Trained on COCO data alone, our final system achieves an average precision of 0.649 on the COCO test-dev set and 0.643 on the test-standard set, outperforming the winner of the 2016 COCO keypoints challenge and other recent state-of-the-art methods. Further, by using additional in-house labeled data, we obtain an even higher average precision of 0.685 on the test-dev set and 0.673 on the test-standard set, more than 5% absolute improvement compared to the previous best performing method on the same dataset.
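
The abstract does not spell out how heatmaps and offsets are combined. As a minimal sketch only, assume that each pixel's offset points from that pixel toward the keypoint, and that each pixel casts a vote, weighted by its heatmap value, at the location it points to. The function names, the (dy, dx) offset layout, and the nearest-pixel voting below are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def aggregate_votes(heatmap, offsets):
    """Accumulate heatmap-weighted votes at the locations the offsets point to.

    heatmap: (H, W) array of per-pixel keypoint probabilities (assumed layout).
    offsets: (H, W, 2) array of (dy, dx) vectors from each pixel to the keypoint
             (assumed layout).
    """
    h, w = heatmap.shape
    ys, xs = np.mgrid[0:h, 0:w]
    # Target pixel of each vote, rounded to the nearest grid position.
    ty = np.rint(ys + offsets[..., 0]).astype(int)
    tx = np.rint(xs + offsets[..., 1]).astype(int)
    # Discard votes that land outside the image.
    valid = (ty >= 0) & (ty < h) & (tx >= 0) & (tx < w)
    votes = np.zeros((h, w), dtype=float)
    # np.add.at accumulates correctly when several votes hit the same pixel.
    np.add.at(votes, (ty[valid], tx[valid]), heatmap[valid])
    return votes


def localize(heatmap, offsets):
    """Return (y, x, score) of the strongest aggregated vote."""
    votes = aggregate_votes(heatmap, offsets)
    y, x = np.unravel_index(np.argmax(votes), votes.shape)
    return int(y), int(x), float(votes[y, x])


def synthetic_keypoint(h, w, ky, kx, radius):
    """Build a toy heatmap (disk around the keypoint) and exact offsets."""
    ys, xs = np.mgrid[0:h, 0:w]
    dy, dx = ky - ys, kx - xs
    heatmap = ((dy ** 2 + dx ** 2) <= radius ** 2).astype(float)
    offsets = np.stack([dy, dx], axis=-1).astype(float)
    return heatmap, offsets
```

On the toy input from `synthetic_keypoint`, every pixel inside the disk votes for the same location, so the aggregated map has a single sharp peak at the keypoint even though the heatmap alone is a flat disk. This illustrates why combining the two outputs can give a more localized prediction than the heatmap by itself.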

Published in: 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

Date of Conference: 21-26 July 2017

Date Added to IEEE Xplore: 09 November 2017

ISBN Information:

Print ISSN: 1063-6919

DOI: 10.1109/CVPR.2017.395

Conference Location: Honolulu, HI, USA

Contents

1. Introduction

Visual interpretation of people plays a central role in the quest for comprehensive image understanding. We want to localize people, understand the activities they are involved in, understand how people move for the purpose of Virtual/Augmented Reality, and learn from them to teach autonomous systems. A major cornerstone in achieving these goals is the problem of human pose estimation, defined as 2-D localization of human joints on the arms and legs, and of keypoints on the torso and the face.
