Curated list of scientific LLMs for scientific discovery (EMNLP'24 survey)
This repository is a curated list of pre-trained language models built for scientific domains, including mathematics, physics, and biology. It is a survey-style reference for researchers and practitioners who want to apply LLMs to scientific discovery, and it covers models of various sizes and modalities.
How It Works
The project categorizes scientific LLMs by domain and modality (e.g., language-only, language+graph, language+vision). Each entry links to the paper, the GitHub repository, and the model weights where available. Entries are ordered chronologically within each subsection, and the list prioritizes papers that have an accessible pre-print and public code or model links.
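The organization described above can be sketched as a small data structure. This is a minimal illustration, not the repository's actual format: the `Entry` class, its field names, the `organize` helper, and the placeholder entries and URLs are all assumptions made for the example.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Entry:
    """One list entry. Field names are illustrative, not the repository's schema."""
    title: str
    domain: str                        # e.g. "mathematics", "biology"
    modality: str                      # e.g. "language", "language+graph"
    published: date
    paper_url: str
    code_url: Optional[str] = None     # None when no public repository is linked
    weights_url: Optional[str] = None  # None when no weights are linked


def organize(entries):
    """Group entries by (domain, modality) and sort each group oldest first."""
    groups = defaultdict(list)
    for entry in entries:
        groups[(entry.domain, entry.modality)].append(entry)
    return {
        key: sorted(group, key=lambda e: e.published)
        for key, group in sorted(groups.items())
    }


# Placeholder entries: hypothetical names, dates, and URLs, not real models.
ENTRIES = [
    Entry("Model B", "biology", "language+graph", date(2023, 5, 1),
          "https://example.org/b", code_url="https://example.org/b-code"),
    Entry("Model A", "biology", "language+graph", date(2022, 1, 15),
          "https://example.org/a"),
    Entry("Model C", "mathematics", "language", date(2024, 3, 9),
          "https://example.org/c", weights_url="https://example.org/c-weights"),
]

ORGANIZED = organize(ENTRIES)
```

Optional fields default to `None` because, as noted above, code and weights are linked only where they are available.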
Quick Start & Requirements
This repository is a curated list, so there is nothing to install or run. For setup and usage, follow the links to the individual model repositories.
Maintenance & Community
The repository is associated with the EMNLP'24 paper "A Comprehensive Survey of Scientific Large Language Models and Their Applications in Scientific Discovery." Contributions are welcomed via email or pull requests.
Licensing & Compatibility
The repository itself is a list and does not impose licensing restrictions. Individual models listed will have their own licenses, which users must consult.
Limitations & Caveats
The repository is a survey and ships no runnable models. To use a model, users must go to that model's own project, and not every entry has public code or weights.