Guide to scientific datasets and LLMs
Top 80.1% on SourcePulse
Summary
This repository collects papers, datasets, and models on Scientific Large Language Models (Sci-LLMs). It is structured around a survey paper and is aimed at researchers and practitioners tracking the use of LLMs in scientific domains. Its main benefit is a centralized, categorized overview of the field that makes relevant resources easier to find and adopt across scientific disciplines.
How It Works
The project organizes scientific datasets and LLM resources by domain (e.g., Life Sciences, Chemistry, Physics, Astronomy, Materials Science, Earth Science, General Science). It also presents key trends, historical development paradigms, and timelines of Sci-LLM evolution, giving a structured map of the field's progress.
Quick Start & Requirements
This repository functions as a curated knowledge base. There are no installation or execution requirements; users can directly browse the categorized lists of datasets, papers, and models.
Highlighted Details
Maintenance & Community
Contributions and suggestions are welcome by email (huming@pjlab.org.cn, clma24@m.fudan.edu.cn, litianbin@pjlab.org.cn). Citation details for the associated survey paper are provided. No dedicated community channels (e.g., Discord, Slack) are listed.
Licensing & Compatibility
The README does not specify a license for the collection or for the resources it curates.
Limitations & Caveats
Because this is a curated list, its coverage depends on how quickly it is updated relative to the rapidly evolving Sci-LLM field. While extensive, it may not include every new dataset or model.