Deep learning methods are becoming the de facto standard for generic visual recognition in the literature. However, their adaptation to industrial scenarios, such as visual recognition for machines and production lines that consist of countless components, has not yet been well investigated. Unlike generic object detection, these scenarios carry strong structural knowledge (e.g., fixed relative positions of components and relationships among components). A case worth exploring is automated visual inspection for trains, where many correlated components appear together. The dominant object detection paradigm, however, treats the visual features of each object region separately and does not consider commonsense knowledge among objects. In this article, we propose SKTCD, a novel automated visual inspection framework for trains that exploits structural knowledge for train component detection. SKTCD is an end-to-end trainable framework in which the visual features of train components and structural knowledge (including hierarchical scene contexts and spatial-aware component relationships) are jointly exploited for train component detection. We propose novel residual multiple gated recurrent units (Res-MGRUs) that fuse the visual features of train components with messages from the structural knowledge in a weighted-recurrent way. To verify the feasibility of SKTCD, we collected a dataset of high-resolution images captured from moving trains, in which 18 590 critical train components are manually annotated. Extensive experiments on this dataset and on the PASCAL VOC dataset demonstrate that SKTCD significantly outperforms competitive existing baselines. The dataset and the source code can be downloaded online (https://github.com/smartprobe/SKCD).
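The weighted-recurrent fusion attributed to the Res-MGRUs can be illustrated with a minimal sketch. The abstract does not give the unit's equations, so everything below is an assumption made for illustration: standard GRU update rules, one GRU per knowledge source (scene context and component relationships), fixed scalar mixing weights, and a residual connection back to the visual feature. The names `GRUCell` and `res_mgru_fuse` are hypothetical and are not taken from the SKTCD source code.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    """Multiply matrix W (a list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def rand_matrix(dim, rng, scale=0.1):
    """Small random square matrix; stands in for learned weights."""
    return [[rng.uniform(-scale, scale) for _ in range(dim)] for _ in range(dim)]


class GRUCell:
    """Standard GRU cell (assumed here; not the paper's exact formulation)."""

    def __init__(self, dim, rng):
        # W* act on the incoming message, U* act on the hidden state.
        self.Wz, self.Uz = rand_matrix(dim, rng), rand_matrix(dim, rng)
        self.Wr, self.Ur = rand_matrix(dim, rng), rand_matrix(dim, rng)
        self.Wh, self.Uh = rand_matrix(dim, rng), rand_matrix(dim, rng)

    def step(self, x, h):
        # Update gate z and reset gate r.
        z = [sigmoid(a + b) for a, b in zip(matvec(self.Wz, x), matvec(self.Uz, h))]
        r = [sigmoid(a + b) for a, b in zip(matvec(self.Wr, x), matvec(self.Ur, h))]
        rh = [ri * hi for ri, hi in zip(r, h)]
        # Candidate state computed from the message and the reset hidden state.
        cand = [math.tanh(a + b) for a, b in zip(matvec(self.Wh, x), matvec(self.Uh, rh))]
        return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]


def res_mgru_fuse(visual, messages, cells, weights, steps=2):
    """Hypothetical fusion: one GRU per message source, weighted sum, residual add."""
    h = list(visual)  # the visual feature initialises the hidden state
    for _ in range(steps):
        # Each GRU updates the shared state using its own knowledge message.
        updates = [cell.step(m, h) for cell, m in zip(cells, messages)]
        # Weighted combination of the per-source updates.
        h = [sum(w * u[i] for w, u in zip(weights, updates)) for i in range(len(h))]
    # Residual connection: add the fused state back onto the visual feature.
    return [v + hi for v, hi in zip(visual, h)]


rng = random.Random(0)
cells = [GRUCell(4, rng), GRUCell(4, rng)]
visual = [0.2, -0.1, 0.5, 0.0]          # toy component feature
scene_msg = [0.1, 0.3, -0.2, 0.4]       # toy scene-context message
relation_msg = [-0.3, 0.0, 0.2, 0.1]    # toy component-relationship message
fused = res_mgru_fuse(visual, [scene_msg, relation_msg], cells, weights=[0.6, 0.4])
print(fused)
```

In this sketch the mixing weights are fixed constants for readability; in a trainable framework they would more plausibly be learned, and the feature vectors would come from a detection backbone rather than hand-written lists.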