Curated list of datasets, models, and papers for medical LLMs
This repository is a curated list of datasets, models, and research papers on Large Language Models (LLMs) for the medical and healthcare domain. It is intended as a central reference for researchers, developers, and practitioners working with AI in healthcare, pointing them to resources for building and evaluating medical LLMs.
How It Works
The repository categorizes resources into Datasets, Models, and Papers. Datasets are further broken down by language (Chinese and English) and type (dialogue, EHR, literature, etc.), with details on content, size, and access links. Models are listed with their base architecture, parameter count, key features, and availability. Papers are linked with their respective research contributions and often include code repositories.
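The categorization described above can be pictured as a small record type plus a filter. The following is a minimal sketch only: the repository is a plain list and ships no such code, and the `DatasetEntry` fields, the `filter_datasets` helper, and the sample entries (names and `example.org` URLs) are all hypothetical, chosen to mirror the language/type breakdown.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class DatasetEntry:
    """One row of the dataset list (hypothetical schema, not the repo's own)."""
    name: str
    language: str   # e.g. "Chinese" or "English"
    kind: str       # e.g. "dialogue", "EHR", "literature"
    size: str       # free-text size description, as a list entry would give it
    url: str        # access link


def filter_datasets(entries: List[DatasetEntry],
                    language: Optional[str] = None,
                    kind: Optional[str] = None) -> List[DatasetEntry]:
    """Return entries matching the given language and/or type; None means 'any'."""
    return [
        e for e in entries
        if (language is None or e.language == language)
        and (kind is None or e.kind == kind)
    ]


# Placeholder entries for illustration; these are not real datasets.
catalog = [
    DatasetEntry("sample-zh-dialogue", "Chinese", "dialogue", "unknown", "https://example.org/zh-dialogue"),
    DatasetEntry("sample-en-dialogue", "English", "dialogue", "unknown", "https://example.org/en-dialogue"),
    DatasetEntry("sample-en-ehr", "English", "EHR", "unknown", "https://example.org/en-ehr"),
]

english_dialogue = filter_datasets(catalog, language="English", kind="dialogue")
```

A flat record with optional filters keeps the sketch close to how a reader would scan the list: first by language, then by type.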
Quick Start & Requirements
This repository is a curated list and does not require installation or direct execution. Users can browse the links provided for datasets, models, and papers to access them. Requirements will vary based on the specific resources accessed.
Maintenance & Community
The repository is maintained by onejune2018 and lists several contributors. It provides a GitHub link for community engagement and code contributions.
Licensing & Compatibility
The repository itself is licensed under the MIT License. The datasets and models it lists, however, carry their own licenses; some, for example, are released under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License, which does not permit commercial use. Users must verify the license of each individual resource before use, especially for commercial applications.
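As a rough illustration of the per-resource license check, the sketch below flags license identifiers that rule out commercial use. It assumes each resource has been annotated by hand with an SPDX-style identifier (such as `MIT` or `CC-BY-NC-SA-4.0`); the repository does not provide such annotations, and the `allows_commercial_use` helper and its small allow-list are hypothetical. It is a first-pass screen, not a substitute for reading the license text.

```python
from typing import Optional

# Small, deliberately incomplete allow-list of identifiers that permit commercial use.
KNOWN_COMMERCIAL_OK = {"MIT", "Apache-2.0", "CC-BY-4.0"}


def allows_commercial_use(spdx_id: Optional[str]) -> Optional[bool]:
    """Return True or False for recognized identifiers, None when undetermined.

    None means "check the license text manually", not "allowed".
    """
    if not spdx_id:
        return None
    if spdx_id in KNOWN_COMMERCIAL_OK:
        return True
    # Creative Commons NonCommercial variants carry "-NC" in their SPDX identifier.
    if "-NC" in spdx_id.upper():
        return False
    return None


# Hypothetical resource-to-license annotations for illustration.
resources = {
    "repo-itself": "MIT",
    "some-dataset": "CC-BY-NC-SA-4.0",
    "unlabelled-model": None,
}

needs_review = [name for name, lic in resources.items()
                if allows_commercial_use(lic) is not True]
```

Treating unknown licenses as "needs review" rather than "allowed" matches the guidance above: anything not positively cleared is checked by hand.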
Limitations & Caveats
Because this is a curated list, the quality and availability of the linked resources depend on their original sources. Some links may become outdated, and the LLM field moves quickly enough that new resources may not yet be listed. Users should independently verify the suitability and licensing of any dataset or model before integrating it.
1 year ago
Inactive