Chen, C., Li, K., Zou, X., Cheng, Z., Wei, W., Tian, Q., & Zeng, Z. (2022). Hierarchical Semantic Graph Reasoning for Train Component Detection. IEEE Transactions on Neural Networks and Learning Systems, 33(9), 4502–4514. https://doi.org/10.1109/tnnls.2021.3057792
Abstract:
Recently, deep-learning-based approaches have achieved superior performance on object detection applications. However, object detection in industrial scenarios has not been well investigated yet. A case worth exploring is train component detection, in which the objects may also have structures (e.g., fixed relative positions, object relationships), and these structured patterns are normally presented in a hierarchical way. In this work, we propose a novel deep-learning-based method, Hierarchical Graphical Reasoning (HGR), which utilizes the hierarchical structures of trains for train component detection. HGR contains multiple graphical reasoning branches, each of which conducts graphical reasoning for one cluster of train components based on their sizes. Specifically, in each branch, the visual appearances and structures of train components, e.g., object relationships within a specific cluster or the scene context, are considered jointly by our proposed densely connected dual gated recurrent units (Dense-DGRU). Dense-DGRU combines hidden features of objects with encoded messages from the scene context and object relationships in a gated and recurrent manner. To the best of our knowledge, HGR is the first framework that explores hierarchical structures among objects for object detection. We have collected a dataset of 1,130 images captured from moving trains, in which 17,334 train components are manually annotated with bounding boxes. Based on this dataset, we carry out extensive experiments demonstrating that our proposed HGR significantly outperforms existing state-of-the-art baselines. The constructed dataset and corresponding source code will be released to facilitate future work.
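The abstract describes Dense-DGRU as fusing hidden object features with encoded messages from the scene context and object relationships in a gated, recurrent manner. The sketch below is only an illustration of that idea under assumed details: the paper's exact formulation, the densely connected recurrence, the message encoders, and the mean-pooling aggregation are not specified in the abstract, and all layer names and dimensions here are hypothetical.

```python
# Minimal sketch (not the authors' code) of a gated recurrent update that
# refines an object's feature with scene-context and relationship messages.
import torch
import torch.nn as nn


class GatedMessageFusion(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        # Two GRU cells ("dual", assumed split): one driven by scene-context
        # messages, one driven by object-relationship messages.
        self.context_cell = nn.GRUCell(dim, dim)
        self.relation_cell = nn.GRUCell(dim, dim)

    def forward(self, obj_feat, context_msg, relation_msgs):
        # obj_feat:      (N, dim) hidden features of N detected components
        # context_msg:   (N, dim) message encoded from the scene context
        # relation_msgs: (N, N, dim) pairwise messages between components
        h = self.context_cell(context_msg, obj_feat)  # gate in scene context
        rel = relation_msgs.mean(dim=1)               # aggregate relation messages (assumed mean pooling)
        h = self.relation_cell(rel, h)                # gate in object relationships
        return h                                      # refined object features


# Usage sketch: refine features for 5 proposals of dimension 256.
if __name__ == "__main__":
    fuse = GatedMessageFusion(256)
    obj = torch.randn(5, 256)
    ctx = torch.randn(5, 256)
    rel = torch.randn(5, 5, 256)
    refined = fuse(obj, ctx, rel)  # (5, 256)
```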
License type:
Publisher Copyright
Funding Info:
This work was supported in part by the National Natural Science Foundation of China under Grant 61902120, in part by the National Key Research and Development Program of China under Grant 2018YFB1003401, and in part by the Postdoctoral Science Foundation of China under Grant 2019M662768 and Grant 2019TQ0086.