A large language model for Chinese medical applications
HuatuoGPT-II is an open-source Large Language Model specifically adapted for the medical domain, targeting researchers and developers in medical AI. It offers significant improvements in medical knowledge and dialogue capabilities, demonstrated by state-of-the-art performance on Chinese medical benchmarks and professional exams, even outperforming GPT-4 in expert evaluations.
How It Works
HuatuoGPT-II uses a one-stage domain-adaptation method: pre-training corpora are transformed into instruction-output pairs, a priority sampling algorithm orders the resulting data, and the model is then trained in a single stage rather than the usual two-stage pipeline of continued pre-training followed by fine-tuning. The aim is more efficient adaptation across languages and domains. A data-flow sketch follows.
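The sketch below illustrates the general shape of that pipeline under stated assumptions: the prompt template, the toy priority score, and the helper names (`corpus_to_pairs`, `priority_sample`) are illustrative, not the authors' actual implementation, which rewrites passages with an LLM and uses its own sampling scheme.

```python
# Illustrative sketch of one-stage adaptation data preparation.
# The template and priority heuristic below are assumptions for demonstration.
import random
from dataclasses import dataclass


@dataclass
class InstructionPair:
    instruction: str
    output: str
    priority: float  # higher priority -> sampled earlier / more often


def corpus_to_pairs(passages: list[str]) -> list[InstructionPair]:
    """Turn raw medical passages into (instruction, output) pairs.

    In the paper this rewriting is done with an LLM; a fixed template and a
    toy length-based priority score stand in for that step here.
    """
    pairs = []
    for text in passages:
        instruction = (
            "Answer from medical knowledge: what are the key points of the "
            f"following passage? {text[:60]}"
        )
        priority = min(len(text) / 1000.0, 1.0)  # toy score, not the paper's
        pairs.append(InstructionPair(instruction, text, priority))
    return pairs


def priority_sample(pairs: list[InstructionPair], k: int, seed: int = 0) -> list[InstructionPair]:
    """Weighted sampling without replacement, biased toward high-priority pairs."""
    rng = random.Random(seed)
    remaining = list(pairs)
    chosen = []
    for _ in range(min(k, len(remaining))):
        weights = [p.priority for p in remaining]
        pick = rng.choices(range(len(remaining)), weights=weights, k=1)[0]
        chosen.append(remaining.pop(pick))
    return chosen


if __name__ == "__main__":
    passages = [
        "Type 2 diabetes is a metabolic disorder characterized by high blood glucose...",
        "Hypertension is diagnosed when blood pressure persistently exceeds...",
    ]
    for pair in priority_sample(corpus_to_pairs(passages), k=2):
        print(pair.instruction)
```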
Quick Start & Requirements
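A minimal inference sketch with Hugging Face transformers is shown below. The model ID `FreedomIntelligence/HuatuoGPT2-7B`, the memory estimate, and the plain-prompt format are assumptions; check the official repository and model cards for the exact checkpoints, chat template, and hardware requirements.

```python
# Minimal inference sketch, assuming the 7B checkpoint is published as
# "FreedomIntelligence/HuatuoGPT2-7B" (verify the exact model ID and the
# recommended chat format in the official repository).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "FreedomIntelligence/HuatuoGPT2-7B"  # assumed ID; check the model card
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # a 7B model in fp16 needs roughly 15 GB of GPU memory
    device_map="auto",
    trust_remote_code=True,
)

prompt = "What are the common symptoms of type 2 diabetes?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False)
# Decode only the newly generated tokens.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```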
Highlighted Details
Maintenance & Community
The project is associated with the School of Data Science, CUHKSZ, and the Shenzhen Research Institute of Big Data. Updates are ongoing, with recent news including paper acceptance at COLM 2024 and public release of training data.
Licensing & Compatibility
The repository does not explicitly state a license. The models are distributed on Hugging Face, but availability there does not by itself imply a permissive license; verify the specific terms on the repository and model cards before use.
Limitations & Caveats
The project focuses on Chinese medical applications, and performance on other languages or medical contexts may vary. While code and data are being released, some aspects are still being organized. The evaluation benchmarks are specific to medical QA and professional exams.