Feng, Y., Zhu, H., Peng, D., Peng, X., & Hu, P. (2023, June). RONO: Robust Discriminative Learning with Noisy Labels for 2D-3D Cross-Modal Retrieval. 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). https://doi.org/10.1109/cvpr52729.2023.01117
Abstract:
Recently, with the advent of the Metaverse and AI-Generated Content, cross-modal retrieval has become popular amid a burst of 2D and 3D data. However, the problem is challenging given the heterogeneous structures and semantic discrepancies between modalities. Moreover, imperfect annotations are ubiquitous given the ambiguity of 2D and 3D content, inevitably producing noisy labels that degrade learning performance. To tackle this problem, this paper proposes a robust 2D-3D retrieval framework (RONO) that learns robustly from noisy multimodal data. Specifically, a novel Robust Discriminative Center Learning mechanism (RDCL) is proposed in RONO to adaptively distinguish clean from noisy samples and provide them with positive and negative optimization directions, respectively, thus mitigating the negative impact of noisy labels. In addition, we present a Shared Space Consistency Learning mechanism (SSCL) to capture the intrinsic information inside the noisy data by simultaneously minimizing the cross-modal and semantic discrepancies between the common space and the label space. Comprehensive mathematical analyses are given to theoretically prove the noise tolerance of the proposed method. Furthermore, we conduct extensive experiments on four 3D-model multimodal datasets, comparing our method with 15 state-of-the-art methods to verify its effectiveness. Code is available at https://github.com/penghu-cs/RONO.
License type:
Publisher Copyright
Funding Info:
This research/project is supported by the A*STAR - MTC Programmatic Grant (Reference no.: A18A2b0046).
This research/project is supported by the A*STAR - RobotHTPO Grant (Reference no.: C211518008).
This research/project is supported by the Singapore Economic Development Board (EDB) - Space Technology Development Grant (STDP) (Grant Reference no.: S22-19016-STDP).
This work is supported by the National Key R&D Program of China under Grant 2020YFB1406702, the National Natural Science Foundation of China (Grants No. 62102274, 62176171, and U19A2078), the Sichuan Science and Technology Planning Project (Grants No. 2021YFS0389, 2021YFG0317, 2021YFG0301, 2022YFQ0014, and 2022YFH0021), and the Fundamental Research Funds for the Central Universities.