Zhang, Y., Ni, H., Sun, B., Chin, Z. Y., & Ang, K. K. (2025). Depth, Thermal and RGB-Segmented Silhouette Imaging in Human Pose Estimation for Activity Monitoring. Proceedings of the 2025 7th International Conference on Image, Video and Signal Processing, 16–24. https://doi.org/10.1145/3749859.3749862
Abstract:
Human Pose Estimation (HPE) is a computer vision task that uses deep learning algorithms to estimate the keypoints of human joints in images. The keypoints form a skeleton model of the human body, with applications in activity monitoring for healthcare, domotics and surveillance systems. However, HPE on RGB images deteriorates under inadequate lighting and raises privacy concerns because identifiable facial features are recorded. This study therefore proposes training an HPE deep learning model that predicts keypoints on lighting-invariant and privacy-preserving human silhouette images. The Silhouette Imaging Dataset (SID) was collated from numerous public datasets and consists of depth images, thermal images and simulated human silhouette images generated from RGB images using image segmentation; it was used to fine-tune the YOLO11s-pose HPE model. To classify the keypoints predicted by the HPE model into actions for activity monitoring, the study proposes a Residual Temporal Convolution Network with Stacked Bidirectional Gated Recurrent Unit (ResTCN-SBiGRU) Human Activity Recognition (HAR) deep learning model. The HAR model was trained on a public dataset to recognise 11 actions. The fine-tuned YOLO11s-pose yielded an mAP50-95(P) of 0.851, and ResTCN-SBiGRU achieved an F1-score of 0.933. Finally, the study developed a real-time activity monitoring application that combines HPE, HAR and an additional proposed motion detection algorithm. The application takes live video input from a KinectV2 sensor and a laptop webcam, and displays pose, action and motion visualisations. This allows the different models to be tested and demonstrates the feasibility of activity monitoring using silhouette-based HPE.
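For readers unfamiliar with the pose metric reported above, the following is a minimal sketch, not taken from the paper, of the Object Keypoint Similarity (OKS) on which pose mAP scores such as mAP50-95(P) are commonly built, assuming the COCO-style definition. The function names, the sigma values and the example coordinates are hypothetical and chosen only for illustration; the abstract does not state the study's actual evaluation settings.

```python
import math

# OKS thresholds conventionally averaged over for an "mAP50-95" score:
# 0.50, 0.55, ..., 0.95 (ten values).
OKS_THRESHOLDS = [round(0.50 + 0.05 * i, 2) for i in range(10)]


def oks(pred, truth, visible, area, sigmas):
    """COCO-style Object Keypoint Similarity for one predicted pose.

    pred, truth: lists of (x, y) keypoint coordinates in pixels.
    visible:     list of booleans, True where the ground-truth keypoint is labelled.
    area:        ground-truth object area in square pixels (the scale term).
    sigmas:      per-keypoint falloff constants (illustrative values here).
    """
    total, count = 0.0, 0
    for (px, py), (tx, ty), vis, sigma in zip(pred, truth, visible, sigmas):
        if not vis:
            continue  # unlabelled keypoints do not contribute
        d2 = (px - tx) ** 2 + (py - ty) ** 2   # squared pixel distance
        k2 = (2.0 * sigma) ** 2                # per-keypoint tolerance
        total += math.exp(-d2 / (2.0 * area * k2))
        count += 1
    return total / count if count else 0.0


def thresholds_cleared(score):
    """Return the OKS thresholds at which this single match would count as correct."""
    return [t for t in OKS_THRESHOLDS if score >= t]


# Hypothetical three-keypoint pose (for example head, left hand, right hand).
truth = [(50.0, 20.0), (30.0, 60.0), (70.0, 60.0)]
pred = [(51.0, 21.0), (33.0, 58.0), (70.0, 60.0)]
visible = [True, True, True]
sigmas = [0.03, 0.06, 0.06]   # assumed values, not the paper's
area = 100.0 * 100.0          # assumed 100 x 100 pixel person box

score = oks(pred, truth, visible, area, sigmas)
print(f"OKS = {score:.3f}, thresholds cleared: {thresholds_cleared(score)}")
```

A full mAP computation additionally ranks predictions by confidence and averages precision over recall at each threshold; the sketch covers only the per-pose similarity that decides whether a prediction counts as a match.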
License type:
Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0)
Funding Info:
This research received no specific funding.