Survey of scientific LLMs, focusing on biology and chemistry
This repository is a survey of Scientific Large Language Models (Sci-LLMs), focusing on their applications in biology and chemistry. It aims to consolidate research papers, datasets, and benchmarks in one structured overview for researchers and practitioners working in these domains.
How It Works
The survey categorizes Sci-LLMs by primary data modality and application area: Textual (medical, biology, chemistry), Molecular (property prediction, generation, reaction prediction), Protein, Genomic, and Multimodal (models that combine several data types). Within each category it lists the relevant papers, datasets, and benchmarks.
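As a rough illustration of this categorization, the sketch below models the categories as a small Python data structure and groups catalog entries under them. The category and subcategory names come from the description above; everything else (the `TAXONOMY` dict, the `Entry` class, the helper functions, and the placeholder titles) is hypothetical and is not part of the repository, which ships no code.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Optional

# Categories and subcategories as named in the survey description.
# The dict layout itself is an assumption made for illustration.
TAXONOMY: Dict[str, List[str]] = {
    "Textual": ["medical", "biology", "chemistry"],
    "Molecular": ["property prediction", "generation", "reaction prediction"],
    "Protein": [],
    "Genomic": [],
    "Multimodal": [],
}

# The three resource types the survey lists in each category.
KINDS = {"paper", "dataset", "benchmark"}


@dataclass(frozen=True)
class Entry:
    """One catalog item (hypothetical schema, not the repository's format)."""
    title: str
    kind: str                          # one of KINDS
    category: str                      # a key of TAXONOMY
    subcategory: Optional[str] = None  # must belong to the category if set


def is_valid(entry: Entry) -> bool:
    """Check that an entry uses a known kind, category, and subcategory."""
    if entry.kind not in KINDS or entry.category not in TAXONOMY:
        return False
    return entry.subcategory is None or entry.subcategory in TAXONOMY[entry.category]


def group_by_category(entries: List[Entry]) -> Dict[str, List[Entry]]:
    """Group valid entries by top-level category, skipping invalid ones."""
    grouped: Dict[str, List[Entry]] = defaultdict(list)
    for entry in entries:
        if is_valid(entry):
            grouped[entry.category].append(entry)
    return dict(grouped)


if __name__ == "__main__":
    # Placeholder titles only; these are not real papers or datasets.
    entries = [
        Entry("Placeholder paper A", "paper", "Molecular", "generation"),
        Entry("Placeholder dataset B", "dataset", "Protein"),
        Entry("Placeholder benchmark C", "benchmark", "Textual", "chemistry"),
    ]
    for category, items in group_by_category(entries).items():
        print(category, [item.title for item in items])
```

A flat mapping from category to subcategories is enough here because the description names only two levels; a deeper hierarchy would call for a tree instead.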
Quick Start & Requirements
This repository is a curated survey and does not have direct installation or execution requirements. It provides links to papers, code, and datasets for further exploration.
Maintenance & Community
The project is maintained by HICAI-ZJU and lists several contributors. Users are encouraged to recommend missing papers via issues or pull requests. Contact information for Xinda Wang is provided.
Licensing & Compatibility
The repository itself is a survey and does not impose licensing restrictions. Individual papers and code linked within the survey will have their own respective licenses.
Limitations & Caveats
As a survey, this repository does not provide executable code or models. Its value lies in cataloging existing research; users must access and evaluate the linked resources themselves.