Nat Commun DOI:10.1038/s41467-020-16378-3

Inferring multimodal latent topics from electronic health records.

Publication Type: Journal Article
Year of Publication: 2020
Authors: Li, Y, Nair, P, Lu, XHan, Wen, Z, Wang, Y, Dehaghi, AArdalan Ka, Miao, Y, Liu, W, Ordog, T, Biernacka, JM, Ryu, E, Olson, JE, Frye, MA, Liu, A, Guo, L, Marelli, A, Ahuja, Y, Davila-Velderrain, J, Kellis, M
Journal: Nat Commun
Date Published: 2020 May 21

Electronic health records (EHR) are rich heterogeneous collections of patient health information, whose broad adoption provides clinicians and researchers unprecedented opportunities for health informatics, disease-risk prediction, actionable clinical recommendations, and precision medicine. However, EHRs present several modeling challenges, including highly sparse data matrices, noisy irregular clinical notes, arbitrary biases in billing code assignment, diagnosis-driven lab tests, and heterogeneous data types. To address these challenges, we present MixEHR, a multi-view Bayesian topic model. We demonstrate MixEHR on MIMIC-III, Mayo Clinic Bipolar Disorder, and Quebec Congenital Heart Disease EHR datasets. Qualitatively, MixEHR disease topics reveal meaningful combinations of clinical features across heterogeneous data types. Quantitatively, we observe superior prediction accuracy of diagnostic codes and lab test imputations compared to the state-of-the-art methods. We leverage the inferred patient topic mixtures to classify target diseases and predict mortality of patients in critical conditions. In all comparisons, MixEHR confers competitive performance and reveals meaningful disease-related topics.


Alternate Journal: Nat Commun
PubMed ID: 32439869
Grant List: RGPIN-2019-0621 / / Canadian Network for Research and Innovation in Machining Technology, Natural Sciences and Engineering Research Council of Canada (NSERC Canadian Network for Research and Innovation in Machining Technology) /
NC-268592 / / Fonds de Recherche du Québec - Nature et Technologies (Quebec Fund for Research in Nature and Technology) /
G249591 / / Canada First Research Excellence Fund (Fonds d'excellence en recherche Apogée Canada) /
35223 / / Gouvernement du Canada | Canadian Institutes of Health Research (Instituts de Recherche en Santé du Canada) /