Contextualized embedding model for structured EHR data
Med-BERT provides pre-trained contextualized embeddings for structured Electronic Health Records (EHR) data, specifically diagnosis codes in ICD-9 and ICD-10 formats. It aims to improve disease prediction performance for researchers and practitioners working with large-scale EHR datasets.
How It Works
Med-BERT adapts the BERT framework to structured EHR data, pre-training embeddings on a dataset of over 28 million patients. The transformer architecture captures relationships and context within each patient's diagnosis history, and the project reports a significant performance boost over existing models on disease prediction tasks.
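To make the idea of feeding diagnosis histories to a BERT-style model more concrete, here is a minimal sketch of how a patient's visits might be flattened into fixed-length token sequences. It is not taken from the Med-BERT code: the helper names (`build_vocab`, `encode_patient`), the special tokens, the per-visit index channel, and the sample ICD-10 codes are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical special tokens; the real project may use different ones.
PAD, UNK = "[PAD]", "[UNK]"

Visit = List[str]            # diagnosis codes recorded in one visit
Patient = List[Visit]        # visits in chronological order


def build_vocab(patients: List[Patient]) -> Dict[str, int]:
    """Assign an integer id to every diagnosis code, in order of first appearance."""
    vocab = {PAD: 0, UNK: 1}
    for visits in patients:
        for visit in visits:
            for code in visit:
                if code not in vocab:
                    vocab[code] = len(vocab)
    return vocab


def encode_patient(
    visits: Patient, vocab: Dict[str, int], max_len: int
) -> Tuple[List[int], List[int], List[int]]:
    """Flatten visits into (input_ids, visit_ids, attention_mask), padded to max_len."""
    input_ids: List[int] = []
    visit_ids: List[int] = []
    for visit_index, visit in enumerate(visits, start=1):
        for code in visit:
            input_ids.append(vocab.get(code, vocab[UNK]))
            visit_ids.append(visit_index)  # which visit each code came from

    # Truncate, then pad on the right; visit id 0 marks padding positions.
    input_ids = input_ids[:max_len]
    visit_ids = visit_ids[:max_len]
    attention_mask = [1] * len(input_ids)
    pad = max_len - len(input_ids)
    input_ids += [vocab[PAD]] * pad
    visit_ids += [0] * pad
    attention_mask += [0] * pad
    return input_ids, visit_ids, attention_mask


# Illustrative data only: two made-up patients with ICD-10 codes.
patients = [
    [["E11.9", "I10"], ["I50.9"]],
    [["I10"], ["N18.3", "E11.9"]],
]
vocab = build_vocab(patients)
ids, visits, mask = encode_patient(patients[0], vocab, max_len=5)
print(ids, visits, mask)
```

In this sketch the visit index plays a role similar to BERT's segment ids: it tells the model which codes were recorded together. A transformer encoder would then consume these arrays to produce one contextualized embedding per code.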
Quick Start & Requirements
create_ehr_pretrain_FTdata.py
) and can be followed via a provided DHF prediction notebook.
Highlighted Details
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats
The pre-trained models are not available for download due to contractual limitations with data vendors. The code was tested primarily on GPU; CPU and TPU support is untested.