DRSL: Deep Relational Similarity Learning for Cross-modal Retrieval

Title:
DRSL: Deep Relational Similarity Learning for Cross-modal Retrieval
Journal Title:
Information Sciences
Publication Date:
06 February 2021
Citation:
Wang X, Hu P, Zhen L, et al. DRSL: Deep Relational Similarity Learning for Cross-modal Retrieval[J]. Information Sciences, 2021, 546: 298-311.
Abstract:
Cross-modal retrieval aims to retrieve relevant samples across different media modalities. Existing cross-modal retrieval approaches rely on learning common representations for all modalities, implicitly assuming that different modalities carry an equal amount of information. However, because the quantity of information in cross-modal samples is unbalanced and unequal, it is inappropriate to directly match the obtained modality-specific representations across different modalities in a common space. In this paper, we propose a new method called Deep Relational Similarity Learning (DRSL) for cross-modal retrieval. Unlike existing approaches, DRSL aims to bridge the heterogeneity gap between modalities by directly learning the natural pairwise similarities instead of explicitly learning a common space. DRSL is a deep hybrid framework that integrates a relation network module for relation learning, capturing an implicit nonlinear distance metric. To the best of our knowledge, DRSL is the first approach to incorporate relation networks into the cross-modal learning scenario. Comprehensive experimental results show that the proposed DRSL model achieves state-of-the-art results on cross-modal retrieval tasks over four widely used benchmark datasets, i.e., Wikipedia, Pascal Sentences, NUS-WIDE-10K, and XMediaNet.
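Illustrative sketch (not from the record): the abstract describes scoring image-text pairs with a relation network rather than matching them in a shared common space. The following minimal PyTorch sketch shows that general idea; all module names, dimensions, and layer choices are assumptions for illustration, not the authors' implementation.

import torch
import torch.nn as nn

class RelationSimilarity(nn.Module):
    def __init__(self, img_dim=4096, txt_dim=300, hidden_dim=512):
        super().__init__()
        # Modality-specific encoders (assumed simple MLPs for this sketch).
        self.img_net = nn.Sequential(nn.Linear(img_dim, hidden_dim), nn.ReLU())
        self.txt_net = nn.Sequential(nn.Linear(txt_dim, hidden_dim), nn.ReLU())
        # Relation module: learns a nonlinear similarity from the concatenated pair,
        # playing the role of an implicit distance metric.
        self.relation = nn.Sequential(
            nn.Linear(2 * hidden_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, 1),
            nn.Sigmoid(),  # pairwise similarity score in [0, 1]
        )

    def forward(self, img_feat, txt_feat):
        pair = torch.cat([self.img_net(img_feat), self.txt_net(txt_feat)], dim=-1)
        return self.relation(pair).squeeze(-1)

# Usage: score a batch of image-text pairs; retrieval would rank candidates by this score.
model = RelationSimilarity()
scores = model(torch.randn(8, 4096), torch.randn(8, 300))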
License type:
http://creativecommons.org/licenses/by-nc-nd/4.0/
Funding Info:
This work is supported by the National Key Research and Development Project of China under contract No. 2017YFB1002201 and partially supported by the National Natural Science Foundation of China (Grants No. 61971296, U19A2078, 61625204), the Ministry of Education & China Mobile Research Foundation Project (No. MCM20180405), Sichuan Science and Technology Planning Project (No. 2019YFH0075), and Scu-Luzhou Corporation Sci&Tech Research Project (No. 2019CDLZ-07).
ISSN:
0020-0255