sealion  by aisingapore

Open-source LLM family for Southeast Asian languages

created 1 year ago
343 stars

Top 81.8% on sourcepulse

GitHubView on GitHub
Project Summary

SEA-LION is a family of open-source Large Language Models (LLMs) specifically designed to understand and cater to the diverse linguistic and cultural contexts of Southeast Asia. It targets researchers, developers, and organizations working with or within the region, aiming to improve representation for under-represented populations and low-resource languages.

How It Works

SEA-LION models are built through a combination of continued pre-training (CPT) and supervised fine-tuning (SFT) on foundational models like Llama 3.1 and Gemma2. This approach leverages existing powerful architectures while adapting them to the specific nuances of Southeast Asian languages and cultures, as evaluated by their custom SEA-HELM benchmark.

Quick Start & Requirements

  • Models are available via Hugging Face (links not provided in README).
  • Requires standard LLM inference hardware (GPU recommended).
  • Specific model variants may inherit licensing restrictions from base models (e.g., Llama 3.1, Gemma2).

Highlighted Details

  • Offers multiple model sizes (3B to 70B) and context lengths (up to 128K).
  • v3.5 models are optimized for reasoning tasks.
  • Evaluated using SEA-HELM, a custom benchmark focusing on English performance, SEA chat proficiency, instruction-following, and linguistic tasks.
  • Models are available in Base, Instruct, and GGUF formats.

Maintenance & Community

  • Anchored by AI Singapore's Products Pillar.
  • Welcomes community contributions for bug reporting, documentation, evaluation tasks, and model training.
  • Contact via GitHub issues or an inquiry form.

Licensing & Compatibility

  • Primarily licensed under MIT, but exact terms depend on the base model used.
  • Llama-based variants may be subject to the Llama 3 License, potentially restricting commercial use. Gemma-based variants may have different terms. Users must check individual model cards.

Limitations & Caveats

  • Commercial use restrictions may apply depending on the base model. Users must verify licensing for each specific SEA-LION model.
Health Check
Last commit

1 month ago

Responsiveness

1 week

Pull Requests (30d)
0
Issues (30d)
0
Star History
43 stars in the last 90 days

Explore Similar Projects

Starred by Chip Huyen Chip Huyen(Author of AI Engineering, Designing Machine Learning Systems), Woosuk Kwon Woosuk Kwon(Author of vLLM), and
11 more.

WizardLM by nlpxucan

0.1%
9k
LLMs built using Evol-Instruct for complex instruction following
created 2 years ago
updated 1 month ago
Feedback? Help us improve.