A deep learning based automatic report generator for retinal optical coherence tomography images

Please use this identifier to cite or link to this item: https://oar.a-star.edu.sg/communities-collections/articles/22559

Title:

A deep learning based automatic report generator for retinal optical coherence tomography images

Journal Title:

npj Digital Medicine

DOI:

10.1038/s41746-025-01988-2

Publication URL:

https://doi.org/10.1038/s41746-025-01988-2

Authors:

Xinjian Chen, Huazhu Fu, Jingtao Wang, Tian Lin, Qian Cheng, Cangxin Li, Meng Wang, Zhongyue Chen, Aidi Lin, Anlin Zhang, Weifang Zhu, Shirong Chen, Fei Shi, Dehui Xiang, Baoqing Nie, Yi Zhou, Yuanyuan Peng, Danqi Fang, Chao Guo, Ting Wang, Mingzhi Zhang, Chi Pui Pang, Haoyu Chen

Publication Date:

20 October 2025

Citation:

Chen, X., Fu, H., Wang, J., Lin, T., Cheng, Q., Li, C., Wang, M., Chen, Z., Lin, A., Zhang, A., Zhu, W., Chen, S., Shi, F., Xiang, D., Nie, B., Zhou, Y., Peng, Y., Fang, D., Guo, C., … Chen, H. (2025). A deep learning based automatic report generator for retinal optical coherence tomography images. npj Digital Medicine, 8(1). https://doi.org/10.1038/s41746-025-01988-2

Abstract:

Reading and summarizing insights from Optical Coherence Tomography (OCT) images is a routine yet time-consuming task that takes up the valuable time of experienced ophthalmologists. This paper introduces the Multi-label OCT Report Generation (MORG) model, a deep learning approach to assist in the interpretation of OCT images. MORG employs dual image encoders to extract features from OCT image pairs, fuses them through a multi-scale module with an attention mechanism, and passes the result to a sentence decoder that produces reports. Trained and tested on 57,308 retinal OCT image pairs, MORG achieved high classification accuracy for 16 pathologies with 37 descriptive types. It also excelled in a blind grading test against general large language models and other state-of-the-art image captioning models, scoring 4.55 compared to ophthalmologists’ 4.63 out of a maximum of 5. Furthermore, MORG has the potential to reduce the report drafting time for ophthalmologists by 58.9%, significantly alleviating their workload.
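
The fusion step described in the abstract (features from two encoders combined at several scales using attention) can be illustrated with a toy sketch. This is not the MORG implementation, which this record does not specify. The function names, the mean-activation scoring rule, and the toy feature values are all assumptions chosen only to show the general idea of attention-weighted fusion across scales.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_fuse(feat_a, feat_b):
    """Fuse two same-length feature vectors with attention weights.

    Assumption (hypothetical scoring rule): each view is scored by its
    mean activation, and the two scores are normalised with softmax.
    """
    score_a = sum(feat_a) / len(feat_a)
    score_b = sum(feat_b) / len(feat_b)
    w_a, w_b = softmax([score_a, score_b])
    return [w_a * a + w_b * b for a, b in zip(feat_a, feat_b)]


def multi_scale_fuse(scales_a, scales_b):
    """Fuse features scale by scale, then concatenate the results."""
    fused = []
    for feat_a, feat_b in zip(scales_a, scales_b):
        fused.extend(attention_fuse(feat_a, feat_b))
    return fused


# Toy stand-ins for the outputs of two image encoders at two scales.
scales_a = [[1.0, 1.0], [0.0, 2.0, 4.0]]
scales_b = [[1.0, 1.0], [2.0, 2.0, 2.0]]
fused = multi_scale_fuse(scales_a, scales_b)
print(fused)
```

In a real model the scores would come from learned projections rather than mean activations, and the fused vector would feed a sentence decoder. Both are omitted here.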

License type:

Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0)

Funding Info:

Agency for Science, Technology and Research (A*STAR) Career Development Fund

Agency for Science, Technology and Research (A*STAR) Central Research Fund (CRF)

ISSN:

2398-6352

Collections:

Institute of High Performance Computing

Files uploaded:

https://doi.org/10.1038/s41746-025-01988-2