Deep Learning for Early Diagnosis of Chronic Conditions Using Electronic Health Records
Keywords:
Deep Learning, Electronic Health Records (EHR), Early Disease Diagnosis, Chronic Disease Prediction, Long Short-Term Memory (LSTM)
Abstract
Early diagnosis of chronic diseases remains a major challenge in healthcare, especially given the complexity and volume of longitudinal Electronic Health Records (EHR). This study proposes a deep learning framework based on Long Short-Term Memory (LSTM) networks enhanced with attention mechanisms to identify early-onset patterns of chronic conditions such as Type 2 Diabetes Mellitus (T2DM), Hypertension, Chronic Kidney Disease (CKD), and Congestive Heart Failure (CHF). Trained on a dataset of 72,593 patient records, the model achieved an overall F1-Score of 90.8% and an AUROC of 96.2%, significantly outperforming traditional models such as logistic regression, random forest, and XGBoost. Condition-wise analysis showed the strongest performance for T2DM (F1-Score: 92.0%), attributed to the model’s ability to track lab and medication sequences. The framework was robust across demographics, with F1-Scores exceeding 88% across age, gender, and ethnic groups, supporting its fairness and general applicability. Ablation studies validated the essential roles of the temporal learning and attention components, while visualization of attention weights provided interpretability aligned with clinical reasoning. Generalization experiments on the MIMIC-III and eICU datasets yielded F1-Scores of 88.8% and 86.5%, respectively, underscoring the model’s resilience to domain shifts. These results support the deployment of the proposed deep learning framework as a reliable, equitable, and interpretable tool for early chronic disease diagnosis. Future extensions will target integration
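The abstract does not specify how the attention mechanism is built, so the following is only a minimal sketch of one common form: dot-product attention pooling over per-visit hidden states, such as an LSTM might produce. The function name `attention_pool`, the `query` vector, and the toy numbers are assumptions made for illustration and are not taken from the study.

```python
import math

def attention_pool(hidden_states, query):
    """Return softmax attention weights and the weighted-sum context vector.

    hidden_states: list of equal-length vectors, one per time step (visit).
    query: vector of the same length (assumed learned in a real model;
           fixed here for illustration).
    """
    # Dot-product relevance score for each time step.
    scores = [sum(h_i * q_i for h_i, q_i in zip(h, query)) for h in hidden_states]
    # Numerically stable softmax over time steps.
    max_score = max(scores)
    exps = [math.exp(s - max_score) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Context vector: attention-weighted sum of the hidden states.
    dim = len(hidden_states[0])
    context = [sum(w * h[d] for w, h in zip(weights, hidden_states))
               for d in range(dim)]
    return weights, context

# Toy example: three visits with two-dimensional hidden states (made-up values).
states = [[0.1, 0.0], [0.2, 0.1], [0.9, 0.8]]
query = [1.0, 1.0]
weights, context = attention_pool(states, query)
```

In a sketch like this, the weights sum to one and can be plotted per visit, which is the kind of attention-weight visualization the abstract refers to.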
License

This work is licensed under a Creative Commons Attribution 4.0 International License.
You are free to:
- Share — copy and redistribute the material in any medium or format
- Adapt — remix, transform, and build upon the material for any purpose, even commercially.
Terms:
- Attribution — You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
- No additional restrictions — You may not apply legal terms or technological measures that legally restrict others from doing anything the license permits.