galai by paperswithcode

Scientific language model API

created 2 years ago
2,733 stars

Top 17.8% on sourcepulse

Project Summary

GALAI provides a Python API for the GALACTICA family of large language models, specifically designed for scientific text and data. It enables users to perform a variety of scientific NLP tasks, including citation prediction, mathematical reasoning, molecular property prediction, and protein annotation, offering a powerful tool for researchers and developers working with scientific information.

How It Works

GALACTICA models are trained on a vast corpus of scientific literature, data, and knowledge bases. The API allows users to load different model sizes (from 125M to 120B parameters) and interact with them using specific prompt formats to elicit desired scientific outputs. This approach leverages the models' specialized training to achieve high performance on scientific tasks, outperforming general-purpose models on benchmarks like LaTeX equation generation and mathematical reasoning.
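The task-specific prompt formats can be illustrated with plain string construction (special tokens taken from the README; the helper functions are illustrative and not part of the galai API, and no model is invoked here):

```python
# Sketch of GALACTICA's prompt conventions: the model is steered toward a task
# by wrapping input in special tokens. These helpers only build the strings.

def citation_prompt(text: str) -> str:
    """End the prompt with [START_REF] so generation continues with a citation."""
    return f"{text} [START_REF]"


def smiles_prompt(smiles: str) -> str:
    """Wrap a SMILES string in the tokens GALACTICA uses for molecule inputs."""
    return f"[START_I_SMILES]{smiles}[END_I_SMILES]"


prompt = citation_prompt("The Transformer architecture was introduced in")
print(prompt)  # -> "The Transformer architecture was introduced in [START_REF]"

mol = smiles_prompt("C(C(=O)O)N")  # glycine
print(mol)  # -> "[START_I_SMILES]C(C(=O)O)N[END_I_SMILES]"
```

Each task uses its own token pair, so the same checkpoint can switch between citation prediction, molecule input, and other modes purely through the prompt.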

Quick Start & Requirements

  • Install via pip: pip install galai or pip install git+https://github.com/paperswithcode/galai.
  • For advanced usage with Hugging Face transformers: pip install transformers accelerate.
  • Requires Python. GPU acceleration is recommended for larger models.
  • Full introduction and examples are available as a PDF and Jupyter Notebook.
  • Model weights and cards are available on the Hugging Face Hub.

Highlighted Details

  • Offers five model sizes: mini (125M), base (1.3B), standard (6.7B), large (30B), and huge (120B).
  • Supports specialized tasks like citation prediction, LaTeX generation, molecule generation, and protein annotation via specific prompt tokens (e.g., [START_REF], [START_I_SMILES]).
  • Achieves state-of-the-art results on scientific benchmarks, outperforming models like GPT-3, Chinchilla, and PaLM on technical knowledge and reasoning tasks.
  • Can be used with Hugging Face transformers library for more control over inference.
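The size tiers above can be captured in a small lookup, which is handy when choosing a checkpoint for a given memory budget (parameter counts from the list above; `pick_model` is a hypothetical helper, not part of galai):

```python
# Model size tiers listed above, mapped to approximate parameter counts.
# pick_model() is a hypothetical convenience helper, not part of the galai API.

MODEL_SIZES = {
    "mini": 125_000_000,
    "base": 1_300_000_000,
    "standard": 6_700_000_000,
    "large": 30_000_000_000,
    "huge": 120_000_000_000,
}


def pick_model(max_params: int) -> str:
    """Return the largest tier whose parameter count fits under max_params."""
    fitting = [(p, name) for name, p in MODEL_SIZES.items() if p <= max_params]
    if not fitting:
        raise ValueError("no model fits the given budget")
    return max(fitting)[1]


print(pick_model(10_000_000_000))  # -> "standard" (6.7B is the largest under 10B)
```

The chosen tier name is what would be passed to the library's model-loading call.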

Maintenance & Community

The project is associated with PapersWithCode and the GALACTICA initiative. Further community engagement details are not explicitly provided in the README.

Licensing & Compatibility

The project's license is not explicitly stated in the README. Model licensing may also vary by checkpoint, and Meta AI took the public GALACTICA demo offline shortly after launch amid concerns about generated misinformation, so users should verify the specific license and availability of each model version they use.

Limitations & Caveats

GALACTICA models are not instruction-tuned, so eliciting good results requires the specific prompt formats described above. The README notes that some predictions, such as IUPAC name prediction for a given SMILES string, may be incorrect. The public GALACTICA demo was taken down by Meta AI shortly after release over concerns about generated misinformation.

Health Check

  • Last commit: 2 years ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 1
  • Issues (30d): 0
  • Star History: 23 stars in the last 90 days
