Analyzing Code Embeddings for Coding Clinical Narratives

Please use this identifier to cite or link to this item: https://oar.a-star.edu.sg/communities-collections/articles/17840

Title:

Analyzing Code Embeddings for Coding Clinical Narratives

Journal Title:

Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

DOI:

10.18653/v1/2021.findings-acl.410

Publication URL:

https://doi.org/10.18653/v1/2021.findings-acl.410

Authors:

Wei Shi, Jiewen Wu, Xiwen Yang, Nancy Chen, Ivan Mien Ho, Jung-jae Kim, Pavitra Krishnaswamy

Publication Date:

01 August 2021

Citation:

Shi, W., Wu, J., Yang, X., Chen, N., Ho Mien, I., Kim, J.-J., & Krishnaswamy, P. (2021). Analyzing Code Embeddings for Coding Clinical Narratives. Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021. doi:10.18653/v1/2021.findings-acl.410

Abstract:

Medical professionals review clinical narratives to assign medical codes as per the International Classification of Diseases (ICD) for billing and care management. This manual process is inefficient and error-prone as it involves a nuanced one-to-many mapping. Recent works on automated ICD coding learn mappings between low-dimensional representations of the reports and the codes. While they propose novel neural networks for encoding varied types of information about the codes, it is unclear as to what information in the medical codes is helpful for performance improvement and why. Here, we compare different ways to represent, or embed, the codes based on their textual, structural and statistical characteristics, using a single deep learning baseline model in quantitative evaluations on discharge reports from the MIMIC-III Intensive Care Unit database. We also qualitatively analyse the nature of the cases that benefit most from the code embeddings and demonstrate that code embeddings are important for predicting ambiguous and oblique codes.

License type:

Attribution 4.0 International (CC BY 4.0)

Funding Info:

This research is supported by core funding from: Institute for Infocomm Research
Grant Reference no. : SC20-RV230

This research is supported by the Agency for Science, Technology and Research - Digital Health and Deep Learning
Grant Reference no. : A1818g0044

URI:

https://oar.a-star.edu.sg/communities-collections/articles/17840

ISSN:

NIL

Collections:

Institute for Infocomm Research

Files uploaded:

Manuscripts in This Item:

File	Size	Format
2021findings-acl410.pdf	481.43 KB	PDF