galai by paperswithcode

Scientific language model API

created 2 years ago
2,733 stars

Top 17.8% on sourcepulse

Project Summary

GALAI provides a Python API for the GALACTICA family of large language models, specifically designed for scientific text and data. It enables users to perform a variety of scientific NLP tasks, including citation prediction, mathematical reasoning, molecular property prediction, and protein annotation, offering a powerful tool for researchers and developers working with scientific information.

How It Works

GALACTICA models are trained on a vast corpus of scientific literature, data, and knowledge bases. The API allows users to load different model sizes (from 125M to 120B parameters) and interact with them using specific prompt formats to elicit desired scientific outputs. This approach leverages the models' specialized training to achieve high performance on scientific tasks, outperforming general-purpose models on benchmarks like LaTeX equation generation and mathematical reasoning.
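The task-specific prompt formats can be illustrated with plain string construction (special tokens taken from the README; the helper functions are illustrative and not part of the galai API, and no model is invoked here):

```python
# Sketch of GALACTICA's prompt conventions: the model is steered toward a task
# by wrapping input in special tokens. These helpers only build the strings.

def citation_prompt(text: str) -> str:
    """End the prompt with [START_REF] so generation continues with a citation."""
    return f"{text} [START_REF]"


def smiles_prompt(smiles: str) -> str:
    """Wrap a SMILES string in the tokens GALACTICA uses for molecule inputs."""
    return f"[START_I_SMILES]{smiles}[END_I_SMILES]"


prompt = citation_prompt("The Transformer architecture was introduced in")
print(prompt)  # -> "The Transformer architecture was introduced in [START_REF]"

mol = smiles_prompt("C(C(=O)O)N")  # glycine
print(mol)  # -> "[START_I_SMILES]C(C(=O)O)N[END_I_SMILES]"
```

Each task uses its own token pair, so the same checkpoint can switch between citation prediction, molecule input, and other modes purely through the prompt.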

Quick Start & Requirements

  • Install via pip: pip install galai or pip install git+https://github.com/paperswithcode/galai.
  • For advanced usage with Hugging Face transformers: pip install transformers accelerate.
  • Requires Python. GPU acceleration is recommended for larger models.
  • Full introduction and examples are available as a PDF and Jupyter Notebook.
  • Model weights and cards are available on the Hugging Face Hub.

Highlighted Details

  • Offers five model sizes: mini (125M), base (1.3B), standard (6.7B), large (30B), and huge (120B).
  • Supports specialized tasks like citation prediction, LaTeX generation, molecule generation, and protein annotation via specific prompt tokens (e.g., [START_REF], [START_I_SMILES]).
  • Achieves state-of-the-art results on scientific benchmarks, outperforming models like GPT-3, Chinchilla, and PaLM on technical knowledge and reasoning tasks.
  • Can be used with Hugging Face transformers library for more control over inference.
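The size tiers above can be captured in a small lookup, which is handy when choosing a checkpoint for a given memory budget (parameter counts from the list above; `pick_model` is a hypothetical helper, not part of galai):

```python
# Model size tiers listed above, mapped to approximate parameter counts.
# pick_model() is a hypothetical convenience helper, not part of the galai API.

MODEL_SIZES = {
    "mini": 125_000_000,
    "base": 1_300_000_000,
    "standard": 6_700_000_000,
    "large": 30_000_000_000,
    "huge": 120_000_000_000,
}


def pick_model(max_params: int) -> str:
    """Return the largest tier whose parameter count fits under max_params."""
    fitting = [(p, name) for name, p in MODEL_SIZES.items() if p <= max_params]
    if not fitting:
        raise ValueError("no model fits the given budget")
    return max(fitting)[1]


print(pick_model(10_000_000_000))  # -> "standard" (6.7B is the largest under 10B)
```

The chosen tier name is what would be passed to the library's model-loading call.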

Maintenance & Community

The project is associated with PapersWithCode and the GALACTICA initiative. Further community engagement details are not explicitly provided in the README.

Licensing & Compatibility

The project's license is not explicitly stated in the README. Model licensing may also vary by checkpoint, and Meta AI took the public GALACTICA demo offline shortly after launch amid concerns about generated misinformation, so users should verify the specific license and availability of each model version they use.

Limitations & Caveats

GALACTICA models are not instruction-tuned, so eliciting good results requires the specific prompt formats described above. The README notes that some predictions, such as IUPAC name prediction for a given SMILES string, may be incorrect. The public GALACTICA demo was taken down by Meta AI shortly after release over concerns about generated misinformation.

Health Check

  • Last commit: 2 years ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 1
  • Issues (30d): 0
  • Star History: 23 stars in the last 90 days
