Deep Learning-Based Early Detection of Rare Diseases Using Electronic Health Records
DOI:
https://doi.org/10.63682/jns.v14i14S.3653
Keywords:
Rare Disease Detection, Electronic Health Records (EHR), Deep Learning in Healthcare, Hierarchical Temporal Transformer, Early Diagnosis
Abstract
Early detection of rare diseases remains a persistent challenge in clinical practice because their prevalence is low, their presentations are heterogeneous, and their diagnoses are complex. In this work, we introduce a new deep learning approach based on a Hierarchical Temporal Transformer (HTT) to detect rare diseases from multi-dimensional clinical patient data stored in electronic health records (EHR). The model is designed to recognize complex patterns in patient data over time and to address extreme class imbalance by using a focal loss function. We perform an extensive comparison with conventional machine learning and deep learning baselines on three large-scale real-world EHR datasets covering over 170,000 patients with multiple rare diseases, including Gaucher disease, ALS, and AADC deficiency. Our model outperforms the baselines by a significant margin, with an F1-score of 0.69 and an AUC of 0.89, and detects disease up to 12 days in advance of clinical detection. It generalizes well across datasets from other institutions and shows stable, balanced performance across demographic sub-populations. Besides predictive performance, the model provides clinical interpretability through its temporal attention mechanism, which flags medically relevant features (e.g., splenomegaly, thrombocytopenia) that are known to be associated with disease. An exhaustive ablation study also confirms the contribution of the key architectural elements, such as hierarchical embedding and positional encodings. Our work shows the potential of deep learning for diagnosing rare diseases early and indicates the utility of interpretable AI in real-world clinical care. The proposed model is a transferable and scalable solution that can facilitate early intervention, reduce diagnostic delay, and improve the treatment of people with rare diseases.
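The abstract names a focal loss as the mechanism for handling extreme class imbalance but does not give its form or hyperparameters. As a minimal sketch, assuming the standard binary focal loss, FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t), the idea can be illustrated as follows. The function name `binary_focal_loss` and the defaults `gamma=2.0` and `alpha=0.25` are illustrative assumptions, not values reported by the authors.

```python
import math


def binary_focal_loss(probs, labels, gamma=2.0, alpha=0.25, eps=1e-7):
    """Mean binary focal loss over a batch.

    probs  : predicted probabilities of the positive (rare disease) class
    labels : ground-truth labels, 1 for positive and 0 for negative
    gamma  : focusing parameter (assumed value, not from the paper)
    alpha  : weight on the positive class (assumed value, not from the paper)
    """
    total = 0.0
    for p, y in zip(probs, labels):
        # Clamp to avoid log(0).
        p = min(max(p, eps), 1.0 - eps)
        # p_t is the probability the model assigns to the true class.
        p_t = p if y == 1 else 1.0 - p
        alpha_t = alpha if y == 1 else 1.0 - alpha
        # (1 - p_t)^gamma shrinks the loss of well-classified examples,
        # so the many easy negatives contribute little to the total.
        total += -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
    return total / len(probs)


if __name__ == "__main__":
    # A confidently correct prediction is down-weighted far more than a
    # confidently wrong one.
    print(binary_focal_loss([0.9], [1]))
    print(binary_focal_loss([0.1], [1]))
```

With `gamma=0` the expression reduces to an alpha-weighted cross-entropy, which is why the focusing term is what distinguishes this loss on heavily imbalanced data.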
License

This work is licensed under a Creative Commons Attribution 4.0 International License.
You are free to:
- Share — copy and redistribute the material in any medium or format
- Adapt — remix, transform, and build upon the material for any purpose, even commercially.
Terms:
- Attribution — You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
- No additional restrictions — You may not apply legal terms or technological measures that legally restrict others from doing anything the license permits.