Performance and Accuracy of Natural Language Processing to Identify Disease Aetiology from Non-Structured Cardiac MRI Electronic Medical Record Reports

Duygu Kocyigit; Alex Milinovich; Chan Mi Lee; Michael Silverman; Maleeha Ahmad; Mazen Hanna; Andrej Gabrovsek; Jian Jin; WH Wilson Tang; Richard Grimm; Leslie Cho; Brian Griffin; Scott Flamm; Deborah Kwon

doi:10.33590/emjcardiol/2009142

INTRODUCTION

The utility of cardiac MRI (CMR) in patients with heart failure has been well demonstrated and continues to expand as MRI techniques evolve. Its main advantages in this patient population include: accurate and reproducible quantification of ventricular systolic function; enhanced discrimination of abnormal myocardial tissue characteristics (i.e., oedema, interstitial fibrosis, and replacement fibrosis); and assessment of valvular function and morphology, the endocardium, and the pericardium in a single scan.^1,2

CMR is now an essential part of the diagnosis of various types of heart failure, including cardiac amyloidosis, cardiac sarcoidosis, myocarditis, arrhythmogenic right ventricular cardiomyopathy, and iron overload cardiomyopathy. CMR findings also have prognostic implications, such as in hypertrophic cardiomyopathy.^1,2 These developments have resulted in increasing demand for, and use of, CMR in routine clinical practice. However, the synthesis of imaging findings into a final or differential diagnosis is typically written as free text, which makes it difficult for generic query algorithms to categorise cardiomyopathy types accurately.

Natural language processing (NLP) is an analytical method used to develop computer-based algorithms that process and transform natural language so that the information it contains can be used for computation.³ It enables information extracted from various online databases to be gathered and combined, and helps create robust outputs that could serve as research endpoints, including sample identification and variable collection. In the field of imaging, NLP may also have several clinical applications, such as highlighting and classifying imaging findings, and generating follow-up recommendations, imaging protocols, and survival prediction models.⁴

METHODS AND RESULTS

Data on the utility of NLP in heart failure imaging are scarce, and existing work focusses on extraction of left ventricular ejection fraction from echocardiography reports.^5,6 In this study, the authors assessed the utility of NLP for extracting heart failure aetiology from CMR reports written in a free-text, non-structured format. For this purpose, CMR records at a single centre from May 1995–May 2019 were examined for reports favouring or excluding cardiac amyloidosis, cardiac sarcoidosis, and myocarditis diagnoses using NLP via cTAKES (clinical Text Analysis and Knowledge Extraction System). CMR reports of the extracted cases were reviewed manually (N=1262). Indeterminate cases, defined as having at least two differential diagnoses on the CMR report, were excluded (n=339). The accuracy of NLP was determined for cardiac amyloidosis, cardiac sarcoidosis, and myocarditis separately. This initial review was followed by five iterations to improve the accuracy of NLP. The final algorithm used a gradient boosting machine model with a word2vec representation of the sentences of interest, combined with indicators of the diagnosis identified, certainty, polarity, and section header.
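The abstract does not give implementation details, so the following is only an illustrative sketch of how the inputs to such a classifier might be assembled: a sentence embedding (here, the mean of per-word vectors) concatenated with indicator features for the diagnosis identified, polarity, certainty, and section header. The toy vectors, the section names, and all function names are hypothetical and are not taken from the study; in practice the embeddings would come from a trained word2vec model, and the resulting vector would be passed to a gradient boosting classifier.

```python
from typing import Dict, List

# Hypothetical 3-dimensional word vectors standing in for a trained word2vec
# model; real embeddings would be learned from a large corpus of reports.
TOY_VECTORS: Dict[str, List[float]] = {
    "findings": [0.1, 0.3, 0.0],
    "consistent": [0.7, 0.1, 0.2],
    "with": [0.0, 0.0, 0.1],
    "cardiac": [0.2, 0.8, 0.1],
    "amyloidosis": [0.9, 0.6, 0.4],
}

DIAGNOSES = ["amyloidosis", "sarcoidosis", "myocarditis"]
SECTIONS = ["findings", "impression"]  # assumed section headers


def mean_embedding(tokens: List[str], vectors: Dict[str, List[float]],
                   dim: int = 3) -> List[float]:
    """Average the vectors of known tokens; unknown tokens are skipped."""
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        return [0.0] * dim
    return [sum(column) / len(known) for column in zip(*known)]


def one_hot(value: str, categories: List[str]) -> List[float]:
    """Encode a categorical value as a 0/1 indicator list."""
    return [1.0 if value == c else 0.0 for c in categories]


def build_features(sentence: str, diagnosis: str, negated: bool,
                   uncertain: bool, section: str) -> List[float]:
    """Concatenate the sentence embedding with the indicator features."""
    tokens = sentence.lower().replace(".", "").split()
    return (
        mean_embedding(tokens, TOY_VECTORS)
        + one_hot(diagnosis, DIAGNOSES)                          # diagnosis identified
        + [1.0 if negated else 0.0, 1.0 if uncertain else 0.0]   # polarity, certainty
        + one_hot(section, SECTIONS)                             # section header
    )


features = build_features(
    "Findings consistent with cardiac amyloidosis.",
    diagnosis="amyloidosis", negated=False, uncertain=False,
    section="impression",
)
print(len(features))
```

Keeping polarity and certainty as explicit indicators, rather than relying on the embedding alone, is one way to let the classifier separate a report that favours a diagnosis from one that excludes it or merely raises it as a possibility.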

CONCLUSION

Overall, this study demonstrates that NLP can be used as an accurate method to extract cardiac amyloidosis, cardiac sarcoidosis, and myocarditis diagnoses from CMR reports in patients with heart failure. Adjustments to the algorithm are essential to improve its accuracy because CMR readers vary in how they phrase their findings. Application of this analytical method enables time-saving and accurate documentation of various heart failure aetiologies, with the potential to improve both heart failure care quality and performance, and to facilitate future heart failure research.
