PLoS One DOI:10.1371/journal.pone.0078927

Modeling disease severity in multiple sclerosis using electronic health records.

Publication Type: Journal Article
Year of Publication: 2013
Authors: Xia, Z, Secor, E, Chibnik, LB, Bove, RM, Cheng, S, Chitnis, T, Cagan, A, Gainer, VS, Chen, PJ, Liao, KP, Shaw, SY, Ananthakrishnan, AN, Szolovits, P, Weiner, HL, Karlson, EW, Murphy, SN, Savova, GK, Cai, T, Churchill, SE, Plenge, RM, Kohane, IS, De Jager, PL
Journal: PLoS One
Date Published: 2013
Keywords: Algorithms; Electronic Health Records; Female; Humans; Male; Models, Biological; Multiple Sclerosis; Severity of Illness Index

OBJECTIVE: To optimally leverage the scalability and unique features of the electronic health records (EHR) for research that would ultimately improve patient care, we need to accurately identify patients and extract clinically meaningful measures. Using multiple sclerosis (MS) as a proof of principle, we showcased how to leverage routinely collected EHR data to identify patients with a complex neurological disorder and derive an important surrogate measure of disease severity heretofore only available in research settings.

METHODS: In a cross-sectional observational study, 5,495 MS patients were identified from the EHR systems of two major referral hospitals using an algorithm that includes codified and narrative information extracted using natural language processing. In the subset of patients who receive neurological care at an MS Center where disease measures have been collected, we used routinely collected EHR data to extract two clinically relevant aggregate indicators of MS severity: the multiple sclerosis severity score (MSSS) and brain parenchymal fraction (BPF, a measure of whole brain volume).

RESULTS: The EHR algorithm that identifies MS patients has an area under the curve of 0.958, 83% sensitivity, 92% positive predictive value, and 89% negative predictive value when a 95% specificity threshold is used. The correlation between EHR-derived and true MSSS has a mean R² = 0.38 ± 0.05, and that between EHR-derived and true BPF has a mean R² = 0.22 ± 0.08. To illustrate its clinical relevance, derived MSSS captures the expected difference in disease severity between relapsing-remitting and progressive MS patients after adjusting for sex, age of symptom onset and disease duration (p = 1.56 × 10⁻¹²).
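
The classification figures above are reported at a fixed 95% specificity threshold. As a rough illustration of what that means, the following Python sketch picks the lowest score cutoff that reaches a target specificity and then computes sensitivity, positive predictive value, and negative predictive value at that cutoff. It is not the authors' algorithm or data: the function names, the scores, and the labels are all hypothetical and were made up for this example.

```python
# Hypothetical illustration only: the names and the toy data below are
# assumptions for this sketch and do not come from the study.

def threshold_at_specificity(scores, labels, target_specificity=0.95):
    """Return the lowest cutoff whose specificity meets the target.

    A case is called positive when its score is >= the cutoff, so the
    lowest qualifying cutoff keeps sensitivity as high as possible.
    """
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    # Specificity never decreases as the cutoff rises; a cutoff of +inf
    # calls everything negative and therefore always qualifies.
    for cutoff in sorted(set(scores)) + [float("inf")]:
        true_neg = sum(1 for s in negatives if s < cutoff)
        if true_neg / len(negatives) >= target_specificity:
            return cutoff


def confusion_metrics(scores, labels, cutoff):
    """Compute sensitivity, specificity, PPV, and NPV at a given cutoff."""
    tp = fp = tn = fn = 0
    for s, y in zip(scores, labels):
        predicted_positive = s >= cutoff
        if predicted_positive and y == 1:
            tp += 1
        elif predicted_positive and y == 0:
            fp += 1
        elif y == 0:
            tn += 1
        else:
            fn += 1
    nan = float("nan")
    return {
        "sensitivity": tp / (tp + fn) if tp + fn else nan,
        "specificity": tn / (tn + fp) if tn + fp else nan,
        "ppv": tp / (tp + fp) if tp + fp else nan,
        "npv": tn / (tn + fn) if tn + fn else nan,
    }


# Toy data: 20 non-cases (label 0) and 6 cases (label 1).
negative_scores = [i / 40 for i in range(1, 20)] + [0.70]
positive_scores = [0.30, 0.60, 0.75, 0.80, 0.90, 0.95]
scores = negative_scores + positive_scores
labels = [0] * len(negative_scores) + [1] * len(positive_scores)

cutoff = threshold_at_specificity(scores, labels, target_specificity=0.95)
metrics = confusion_metrics(scores, labels, cutoff)
print(cutoff, metrics)
```

Note that predictive values depend on how common the condition is in the evaluated sample, so the toy numbers say nothing about the values reported in the abstract.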

CONCLUSION: Incorporation of sophisticated codified and narrative EHR data accurately identifies MS patients and provides estimation of a well-accepted indicator of MS severity that is widely used in research settings but not part of the routine medical records. Similar approaches could be applied to other complex neurological disorders.


Alternate Journal: PLoS ONE
PubMed ID: 24244385
PubMed Central ID: PMC3823928
Grant List: K08 AR 060257 / AR / NIAMS NIH HHS / United States
K08 NS079493 / NS / NINDS NIH HHS / United States
K25 AG041906 / AG / NIA NIH HHS / United States
K23 DK097142 / DK / NIDDK NIH HHS / United States
K24 AR052403 / AR / NIAMS NIH HHS / United States
U54 LM008748 / LM / NLM NIH HHS / United States