Shi, W., Wu, J., Yang, X., Chen, N., Ho Mien, I., Kim, J.-J., & Krishnaswamy, P. (2021). Analyzing Code Embeddings for Coding Clinical Narratives. Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021. doi:10.18653/v1/2021.findings-acl.410
Abstract:
Medical professionals review clinical narratives to assign medical codes as per the International Classification of Diseases (ICD) for billing and care management. This manual process is inefficient and error-prone as it involves a nuanced one-to-many mapping. Recent works on automated ICD coding learn mappings between low-dimensional representations of the reports and the codes. While they propose novel neural networks for encoding varied types of information about the codes, it is unclear as to what information in the medical codes is helpful for performance improvement and why. Here, we compare different ways to represent, or embed, the codes based on their textual, structural and statistical characteristics, using a single deep learning baseline model in quantitative evaluations on discharge reports from the MIMIC-III Intensive Care Unit database. We also qualitatively analyse the nature of the cases that benefit most from the code embeddings and demonstrate that code embeddings are important for predicting ambiguous and oblique codes.
License type:
Attribution 4.0 International (CC BY 4.0)
Funding Info:
This research is supported by core funding from: Institute for Infocomm Research
Grant Reference no.: SC20-RV230
This research is supported by the Agency for Science, Technology and Research - Digital Health and Deep Learning
Grant Reference no.: A1818g0044