open-llms  by eugeneyan

Curated list of commercially-usable open LLMs

created 2 years ago
12,230 stars

Top 4.1% on sourcepulse

Project Summary

This repository serves as a curated list of open-source Large Language Models (LLMs) that are explicitly licensed for commercial use. It targets developers, researchers, and businesses seeking to leverage LLMs without restrictive licensing, providing a centralized resource for model selection and comparison.

How It Works

The project compiles a comprehensive table of open LLMs, detailing their release date, available checkpoints, associated papers or blog posts, parameter counts, context lengths, and crucially, their licensing terms. This structured data allows users to quickly assess model suitability for their specific needs and commercial applications.
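Because the list is essentially a table keyed by attributes such as context length and license, a reader could shortlist candidates with a few lines of code once the rows are copied into a structured form. The following is a minimal sketch under stated assumptions: the `ModelEntry` fields, the `shortlist` helper, and every row in `ENTRIES` are hypothetical placeholders rather than data taken from the repository, and treating Apache 2.0 and MIT as the "permissive" set is an illustrative simplification, not legal guidance.

```python
from dataclasses import dataclass

# Licenses treated as permissive in this sketch (an assumption, not legal advice).
PERMISSIVE = frozenset({"Apache 2.0", "MIT"})


@dataclass(frozen=True)
class ModelEntry:
    """One row of a model table; the field names are hypothetical."""
    name: str
    params_billions: float
    context_length: int  # in tokens
    license: str


# Placeholder rows for illustration only; not copied from the actual list.
ENTRIES = [
    ModelEntry("example-model-a", 7.0, 2048, "Apache 2.0"),
    ModelEntry("example-model-b", 7.0, 84_000, "Apache 2.0"),
    ModelEntry("example-model-c", 15.5, 8192, "OpenRAIL-M"),
    ModelEntry("example-model-d", 13.0, 4096, "Custom"),
]


def shortlist(entries, min_context, licenses=PERMISSIVE):
    """Keep entries with at least `min_context` tokens of context and a
    license in `licenses`, sorted by context length, longest first."""
    matches = [
        e for e in entries
        if e.context_length >= min_context and e.license in licenses
    ]
    return sorted(matches, key=lambda e: e.context_length, reverse=True)


if __name__ == "__main__":
    for entry in shortlist(ENTRIES, min_context=4096):
        print(f"{entry.name}: {entry.context_length} tokens, {entry.license}")
```

A filter like this only narrows the field; the license text of each shortlisted model still has to be read before any commercial use.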

Quick Start & Requirements

This repository is a reference list; it does not require installation or execution. Users are directed to individual model repositories (e.g., Hugging Face) for download, setup, and usage instructions.

Highlighted Details

  • Comprehensive coverage of LLMs, including recent releases and models with extended context lengths (e.g., MPT-7B at 84k, LWM models up to 1M).
  • Categorization includes models specifically for code generation (e.g., StarCoder, Code Llama) and datasets for pre-training, instruction-tuning, and alignment.
  • Detailed explanations of common open-source licenses (Apache 2.0, MIT, CC BY-SA, OpenRAIL-M) and their implications for commercial use.
  • Links to various leaderboards and evaluation benchmarks (e.g., lmsys.org, Hugging Face Open LLM Leaderboard) for performance assessment.

Maintenance & Community

The repository is maintained by eugeneyan and welcomes community contributions for adding new models or updating existing entries.

Licensing & Compatibility

The repository itself is not licensed. The models it lists carry a range of licenses, including Apache 2.0, MIT, CC BY-SA, OpenRAIL-M, and custom licenses, so users must consult the specific license for each model. Apache 2.0 and MIT generally permit commercial use and modification. CC BY-SA requires attribution and share-alike licensing of derivative works, and some custom licenses impose additional restrictions.

Limitations & Caveats

The repository disclaims legal advice: users are responsible for consulting an attorney about commercial use. Some listed models carry usage restrictions or require user registration; the list notes these, but users need to investigate the details themselves. RWKV's context length is listed as "infinity" because the model is RNN-based.

Health Check

  • Last commit: 5 months ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0

Star History

315 stars in the last 90 days
