TIRDet: Mono-Modality Thermal InfraRed Object Detection Based on Prior Thermal-To-Visible Translation

Title:
TIRDet: Mono-Modality Thermal InfraRed Object Detection Based on Prior Thermal-To-Visible Translation
Journal Title:
Proceedings of the 31st ACM International Conference on Multimedia
Publication Date:
27 October 2023
Citation:
Wang, Z., Colonnier, F., Zheng, J., Acharya, J., Jiang, W., & Huang, K. (2023). TIRDet: Mono-Modality Thermal InfraRed Object Detection Based on Prior Thermal-To-Visible Translation. Proceedings of the 31st ACM International Conference on Multimedia. https://doi.org/10.1145/3581783.3613849
Abstract:
Cross-modality images that combine the visible and infrared spectra can provide complementary information for object detection. In particular, they are well suited to autonomous vehicle applications in dark environments with limited illumination. However, it is time-consuming to acquire a large number of pixel-aligned visible-thermal image pairs, and real-time alignment is challenging in practical driving systems. Furthermore, the quality of visible-spectrum images can be adversely affected by complex environmental conditions. In this paper, we propose a novel neural network called TIRDet, which utilizes only Thermal InfraRed (TIR) images for mono-modality object detection. To compensate for the missing visible-band information, we adopt a prior Thermal-To-Visible (T2V) translation model to obtain the translated visible images and the latent T2V codes. In addition, we introduce a novel attention-based Cross-Modality Aggregation (CMA) module, which augments the modality-translation awareness of TIRDet by preserving the T2V semantic information. Extensive experiments on the FLIR and LLVIP datasets demonstrate that TIRDet significantly outperforms all mono-modality detection methods based on thermal images, and that it even surpasses most State-Of-The-Art (SOTA) multispectral methods that use visible-thermal image pairs. Code is available at https://github.com/zeyuwang-zju/TIRDet
License type:
Publisher Copyright
Funding Info:
No specific funding was received for this research.
Description:
© Author | ACM 2023. This is the author's version of the work. It is posted here for your personal use. Not for redistribution. The definitive Version of Record was published in Proceedings of the 31st ACM International Conference on Multimedia, http://dx.doi.org/10.1145/3581783.3613849
ISBN:
978-1-4503-9203-7
Files uploaded:

tirdet-camera-ready-for-astar.pdf (PDF, 3.39 MB)
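
Illustrative note:
The abstract describes an attention-based Cross-Modality Aggregation (CMA) module but gives no implementation details. The sketch below is not the authors' CMA. It shows only generic scaled dot-product cross-attention, in which thermal features act as queries over translated-visible features, followed by a residual sum. All names (`cross_attention`, `aggregate`, the toy feature lists) and the residual fusion step are assumptions made for illustration. The authors' actual code is at the repository linked in the abstract.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Generic scaled dot-product cross-attention (hypothetical helper).

    queries: list of d-dim vectors (here: thermal features)
    keys, values: lists of vectors (here: translated-visible features)
    Returns one aggregated vector per query.
    """
    scale = math.sqrt(len(queries[0]))
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in keys]
        weights = softmax(scores)
        # Weighted sum of the value vectors, one output dimension at a time.
        outputs.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return outputs


def aggregate(thermal, translated):
    """Assumed fusion: thermal features plus attended translated features."""
    attended = cross_attention(thermal, translated, translated)
    return [
        [t + a for t, a in zip(t_vec, a_vec)]
        for t_vec, a_vec in zip(thermal, attended)
    ]


# Toy features: 2 thermal tokens, 3 translated-visible tokens, dimension 2.
thermal_feats = [[1.0, 0.0], [0.0, 1.0]]
translated_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
fused = aggregate(thermal_feats, translated_feats)
```

Each fused vector keeps the shape of its thermal input, so a sketch like this could in principle sit between a feature extractor and a detection head. How TIRDet actually combines the translated images and latent T2V codes is specified only in the paper and its code.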