Multiple Human Identification and Cosegmentation: A Human-Oriented CRF Approach With Poselets

Title:
Multiple Human Identification and Cosegmentation: A Human-Oriented CRF Approach With Poselets
Other Titles:
IEEE Transactions on Multimedia
Keywords:
Publication Date:
01 August 2016
Citation:
H. Zhu, J. Lu, J. Cai, J. Zheng, S. Lu and N. M. Thalmann, "Multiple Human Identification and Cosegmentation: A Human-Oriented CRF Approach With Poselets," in IEEE Transactions on Multimedia, vol. 18, no. 8, pp. 1516-1530, Aug. 2016. doi: 10.1109/TMM.2016.2571629
Abstract:
Localizing, identifying, and extracting humans with consistent appearance jointly from a personal photo stream is an important problem and has wide applications. The strong variations in foreground and background and irregularly occurring foreground humans make this realistic problem challenging. Inspired by advancements in object detection, scene understanding, and image cosegmentation, we explore explicit constraints to label and segment human objects rather than other nonhuman objects and “stuff.” We refer to such a problem as multiple human identification and cosegmentation (MHIC). To identify specific human subjects, we propose an efficient human instance detector by combining an extended color line model with a poselet-based human detector. Moreover, to capture high-level human shape information, a novel soft shape cue is proposed. It is initialized by the human detector, then further enhanced through a generalized geodesic distance transform, and finally refined with a joint bilateral filter. We also propose to capture the rich feature context around each pixel by using an adaptive cross-region data structure, which gives a higher discriminative power than a single pixel-based estimation. The high-level object cues from the detector and the shape are then integrated with the low-level pixel cues and midlevel contour cues into a principled conditional random field (CRF) framework, which can be efficiently solved by using fast graph cut algorithms. We evaluate our method over a newly created NTU-MHIC human dataset, which contains 351 images with manually annotated groundtruth segmentation. Both visual and quantitative results demonstrate that our method achieves state-of-the-art performance for the MHIC task.
License type:
PublisherCopyrights
Funding Info:
Description:
(c) 2016 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works for resale or redistribution to servers or lists, or reuse of any copyrighted components of this work in other works.
ISSN:
1520-9210
1941-0077
Files uploaded:

mhic-preprint.pdf (PDF, 24.32 MB)