Yu, Y., Hu, P., Lin, J., & Krishnaswamy, P. (2021). Multimodal Multitask Deep Learning for X-Ray Image Retrieval. Lecture Notes in Computer Science, 603–613. doi:10.1007/978-3-030-87240-3_58
Content-based image retrieval (CBIR) is of increasing interest for clinical applications spanning differential diagnosis, prognostication, and indexing of electronic radiology databases. However, meaningful CBIR for radiology applications requires capabilities to address the semantic gap and assess similarity based on fine-grained image features. We observe that images in radiology databases are often accompanied by free-text radiologist reports containing rich semantic information. Therefore, we propose a Multimodal Multitask Deep Learning (MMDL) approach for CBIR on radiology images. Our proposed approach employs multimodal database inputs for training, learns semantic feature representations for each modality, and maps these representations into a common subspace. During testing, we use representations from the common subspace to rank similarities between the query and database. To enhance our framework for fine-grained image retrieval, we provide extensions employing deep descriptors and ranking loss optimization. We performed extensive evaluations on the MIMIC Chest X-ray (MIMIC-CXR) dataset with images and reports from 227,835 studies. Our results demonstrate strong performance gains over a typical unimodal CBIR strategy. Further, we show that the performance gains of our approach are robust even in scenarios where only a subset of database images are paired with free-text radiologist reports. Our work has implications for next-generation medical image indexing and retrieval systems.
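The retrieval step described above — ranking database entries by similarity to a query in a learned common subspace — can be sketched as follows. This is a minimal illustration, not the paper's implementation: the embeddings here are random stand-ins for the outputs of the modality-specific encoders, and the use of cosine similarity as the ranking metric is an assumption.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-12):
    """Normalize vectors to unit length so dot products equal cosine similarity."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def rank_database(query_emb, db_embs):
    """Rank database items by cosine similarity to the query.

    query_emb: (d,) embedding of the query in the common subspace.
    db_embs:   (n, d) embeddings of the database in the same subspace.
    Returns indices sorted from most to least similar, plus the similarities.
    """
    q = l2_normalize(query_emb)
    d = l2_normalize(db_embs)
    sims = d @ q                      # cosine similarity per database item
    order = np.argsort(-sims)         # descending similarity
    return order, sims

# Toy example: stand-in embeddings (in practice these would come from the
# trained image/report encoders after projection to the common subspace).
rng = np.random.default_rng(0)
db = rng.normal(size=(5, 8))
query = db[2] + 0.01 * rng.normal(size=8)  # a query nearly identical to item 2
order, sims = rank_database(query, db)
```

Because the query is a small perturbation of database item 2, that item is ranked first; the same ranking logic applies whether the query embedding comes from the image encoder or the report encoder, which is what makes the shared subspace useful for cross-modal retrieval.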
This research is supported by the Institute for Infocomm Research, Science and Engineering Research Council, A*STAR, Singapore, under the project "A suite of next generation deep learning tools for medical imaging" (Grant Reference No. EC-2018-046).