awesome-open-source-lms by allenai

Curated list of open-source language models and resources

created 7 months ago
286 stars

Top 92.5% on sourcepulse

Project Summary

This repository is a curated list of open-source language models and associated resources for AI researchers and practitioners interested in fully reproducible LLM development. It aims to counter the trend toward proprietary models by pointing to openly available training code, data, and architectures, supporting scientific study and the development of truly open LMs.

How It Works

The project curates links to various components of the LLM development pipeline, including pretraining data, model architectures, training code, and adaptation techniques like instruction tuning and RLHF. It emphasizes models where more than just weights are open, prioritizing projects that offer the complete pipeline for transparency and scientific rigor.

Quick Start & Requirements

This repository is a curated list and does not have a direct installation or execution command. It links to external projects, each with its own requirements.

Highlighted Details

  • Focuses on "fully open-source" models, including training code, data, and architectures.
  • Covers the entire language model development pipeline from data processing to post-training.
  • Features contributions and models from prominent organizations like the Allen Institute for AI (Ai2), Databricks, EleutherAI, and Together.AI.
  • Includes links to papers, blog posts, and demos for many listed projects.

Maintenance & Community

The project is maintained by the Allen Institute for AI (Ai2) and encourages community contributions via pull requests. It was built for a NeurIPS 2024 tutorial.

Licensing & Compatibility

The repository itself is only a list of links. Licensing and compatibility vary across the linked projects and must be checked on a per-project basis.

Limitations & Caveats

This is a curated list, not a unified framework. Users must navigate to individual linked projects to assess their specific features, maturity, and usability. Some linked projects may be in early stages of development or have specific hardware requirements.

Health Check

  • Last commit: 7 months ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0

Star History

13 stars in the last 90 days
