GLPose: Global-Local Representation Learning for Human Pose Estimation

Title:
GLPose: Global-Local Representation Learning for Human Pose Estimation
Journal Title:
ACM Transactions on Multimedia Computing, Communications, and Applications
Publication Date:
12 March 2022
Citation:
Jiao, Y., Chen, H., Feng, R., Chen, H., Wu, S., Yin, Y., & Liu, Z. (2022). GLPose: Global-Local Representation Learning for Human Pose Estimation. ACM Transactions on Multimedia Computing, Communications, and Applications. https://doi.org/10.1145/3519305
Abstract:
Multi-frame human pose estimation is at the core of many computer vision tasks. Although state-of-the-art approaches have demonstrated remarkable results for human pose estimation on static images, their performance inevitably falls short when applied to videos. A central issue lies in the visual degeneration of video frames induced by rapid motion and pose occlusion in dynamic environments. This problem, by nature, is insurmountable for a single frame. Therefore, incorporating complementary visual cues from other video frames becomes an intuitive paradigm. Current state-of-the-art methods usually leverage information from adjacent frames, which unfortunately places excessive focus on only the temporally nearby frames. In this paper, we argue that combining globally semantically similar information and local temporal visual context delivers more comprehensive and more robust representations for human pose estimation. Towards this end, we present an effective framework, namely the global-local enhanced pose estimation (GLPose) network. Our framework consists of a feature processing module that conditionally incorporates global semantic information and local visual context to generate a robust human representation, and a feature enhancement module that excavates complementary information from this aggregated representation to enhance keyframe features for precise estimation. We empirically find that the proposed GLPose outperforms existing methods by a large margin and achieves new state-of-the-art results on large benchmark datasets.
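The global-local fusion idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name, the cosine-similarity weighting, and the residual fusion coefficient `beta` are all assumptions used only to make the aggregation pattern concrete (keyframe features enhanced by similarity-weighted contributions from local adjacent frames and globally similar frames).

```python
import numpy as np

def softmax(x):
    # numerically stable softmax over a 1-D score vector
    e = np.exp(x - x.max())
    return e / e.sum()

def aggregate_global_local(key_feat, local_feats, global_feats, beta=0.5):
    """Illustrative fusion of a keyframe feature vector with local
    (adjacent-frame) and global (semantically similar frame) features.
    Supporting frames are weighted by cosine similarity to the keyframe;
    the fused context is added back residually. All design choices here
    are hypothetical, not taken from the paper."""
    def weighted_sum(feats):
        feats = np.asarray(feats)
        sims = np.array([
            f @ key_feat / (np.linalg.norm(f) * np.linalg.norm(key_feat) + 1e-8)
            for f in feats
        ])
        w = softmax(sims)
        return (w[:, None] * feats).sum(axis=0)

    local_ctx = weighted_sum(local_feats)    # local temporal visual context
    global_ctx = weighted_sum(global_feats)  # global semantically similar cues
    return key_feat + beta * (local_ctx + global_ctx)
```

In this sketch the keyframe representation keeps its own information and only borrows complementary context, mirroring the abstract's description of enhancing (rather than replacing) keyframe features before the final estimation.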
License type:
Publisher Copyright
Funding Info:
No specific funding was received for this research.
Description:
© Author | ACM 2022. This is the author's version of the work. It is posted here for your personal use. Not for redistribution. The definitive Version of Record was published in ACM Transactions on Multimedia Computing, Communications, and Applications, http://dx.doi.org/10.1145/3519305
ISSN:
1551-6857 (print)
1551-6865 (electronic)
Files uploaded:
glpose-v2.pdf (3.61 MB, PDF)