Recently, convolutional neural network (CNN) based methods have achieved superior results in generic object detection and have become the de facto standard in the domain. However, their adaptation to industrial applications has not yet been well studied. A case worth exploring is train component detection, in which components may have strong structural relationships and some components (e.g., screws and nuts) are very small. Yet the detection performance on small train components significantly affects the efficiency of overall train component detection. In this work, we propose a novel robust train component detection (RTCD) framework, built on cascaded CNNs and utilizing prior structural knowledge of the relationships between train components. The core idea of RTCD is to detect the large, easily detectable components first, and then find the areas that may contain small, hard-to-detect components for subsequent fine-grained analysis. Our proposed attention region mechanism finds regions deserving further analysis based on the regions of interest (ROIs) detected by the preceding CNNs together with the known structural knowledge. These regions are then cropped, zoomed in, and fed into the subsequent deep learning models for further detection. To verify the effectiveness of RTCD, 1,130 high-resolution images of moving trains were captured, from which 17,334 critical train components were manually annotated. Extensive experiments on this dataset demonstrate that RTCD significantly outperforms existing state-of-the-art baselines. The dataset and corresponding source code will be released to facilitate future work.
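
To make the crop-and-zoom step concrete, the following is a minimal sketch and not the RTCD implementation. It assumes that prior structural knowledge can be encoded as a box expressed in fractions of a detected parent ROI, and that zooming is plain nearest-neighbor upsampling; the function names (`attention_region`, `crop_and_zoom`, `to_image_coords`) and all numeric values are hypothetical.

```python
import numpy as np

def attention_region(roi, rel_box, image_shape):
    """Turn a relative box (fractions of the parent ROI) into pixel coordinates.

    roi:      (x1, y1, x2, y2) of a large component found by the first detector.
    rel_box:  (rx1, ry1, rx2, ry2), an assumed encoding of the structural prior.
    """
    x1, y1, x2, y2 = roi
    w, h = x2 - x1, y2 - y1
    rx1, ry1, rx2, ry2 = rel_box
    img_h, img_w = image_shape[:2]
    # Clip so the region never leaves the image.
    ax1 = int(np.clip(round(x1 + rx1 * w), 0, img_w))
    ay1 = int(np.clip(round(y1 + ry1 * h), 0, img_h))
    ax2 = int(np.clip(round(x1 + rx2 * w), 0, img_w))
    ay2 = int(np.clip(round(y1 + ry2 * h), 0, img_h))
    return ax1, ay1, ax2, ay2

def crop_and_zoom(image, region, scale):
    """Crop the region and enlarge it by an integer factor (nearest neighbor)."""
    x1, y1, x2, y2 = region
    crop = image[y1:y2, x1:x2]
    return np.repeat(np.repeat(crop, scale, axis=0), scale, axis=1)

def to_image_coords(box, region, scale):
    """Map a box detected in the zoomed crop back to original image coordinates."""
    ox, oy = region[0], region[1]
    bx1, by1, bx2, by2 = box
    return (bx1 / scale + ox, by1 / scale + oy,
            bx2 / scale + ox, by2 / scale + oy)

# Hypothetical example: a 200x100 image with one large component detected.
image = np.zeros((100, 200, 3), dtype=np.uint8)
parent_roi = (50, 20, 150, 80)
# Assumed prior: small parts lie in the lower-middle part of the parent ROI.
region = attention_region(parent_roi, (0.25, 0.5, 0.75, 1.0), image.shape)
zoomed = crop_and_zoom(image, region, scale=4)
# A second-stage detector (not shown) would run on `zoomed`; its boxes are
# mapped back with `to_image_coords`.
small_box = to_image_coords((40, 20, 80, 60), region, scale=4)
```

In this sketch the second-stage detector sees the small components at four times their original pixel size, which is the intuition behind cascading; a real system would likely use a learned or interpolating resize rather than pixel repetition.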