Tan, H. L., Zhu, H., Lim, J.-H., & Tan, C. (2021). A comprehensive survey of procedural video datasets. Computer Vision and Image Understanding, 202, 103107. doi:10.1016/j.cviu.2020.103107
Abstract:
Procedural knowledge is crucial for understanding and performing concrete real-world tasks. Yet, despite the importance of procedural knowledge, research into procedural knowledge understanding is still under-developed. In particular, videos contain rich semantics that are important for understanding procedural knowledge, but have traditionally been less explored than natural language texts for understanding procedural knowledge. Motivated by harnessing procedural knowledge from videos for task assistance (i.e., assisting people in performing procedural tasks), we present the first comprehensive survey of procedural video datasets. Through systematically surveying 23 procedural video datasets, including both instructional and non-instructional videos, in a conceptual framework for task assistance, we seek to understand the trends and gaps in existing datasets, as well as to gain insights into the future of such datasets. This survey examines the current state of procedural video datasets, in terms of their data, content and annotation characteristics, as well as processing function and evaluation. The survey also identifies and suggests a number of possible directions to bring this area to the next level.
License type:
Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0)
Funding Info:
This research is supported by the National Research Foundation - NRF-ISF Joint Grant
Grant Reference no.: NRF2015-NRFISF001-2541
This research is supported by the Agency for Science, Technology and Research - AME Programmatic Funding Scheme
Grant Reference no.: A18A2b0046