LaMini-LM by mbzuai-nlp

Small, efficient language models distilled from ChatGPT for research

created 2 years ago
821 stars

Top 44.1% on sourcepulse

Project Summary

LaMini-LM offers a diverse collection of small, efficient language models distilled from ChatGPT and trained on a 2.58M instruction dataset. Targeting researchers and developers seeking performant, compact LLMs, it provides a range of architectures and sizes for various NLP tasks.

How It Works

LaMini-LM employs offline distillation from GPT-3.5-turbo, generating 2.58M instruction-response pairs using prompts from existing resources like Self-Instruct, P3, Flan, and Alpaca. This approach allows for the creation of smaller, more manageable models that retain significant instruction-following capabilities, making them suitable for resource-constrained environments.
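The offline-distillation pipeline described above can be sketched as a simple collection loop: seed instructions go to the teacher model, and the returned responses are stored as instruction-response pairs for fine-tuning a small student. In this sketch, `teacher` is a stand-in for the real GPT-3.5-turbo API call, and the function names are illustrative, not from the LaMini-LM codebase.

```python
def teacher(instruction: str) -> str:
    # Placeholder for the actual teacher-model (GPT-3.5-turbo) API call.
    return f"Response to: {instruction}"

def build_distillation_set(seed_instructions):
    """Collect (instruction, response) pairs for student fine-tuning.

    LaMini-LM draws its seed instructions from resources such as
    Self-Instruct, P3, Flan, and Alpaca; here we pass any iterable.
    """
    return [
        {"instruction": ins, "response": teacher(ins)}
        for ins in seed_instructions
    ]

pairs = build_distillation_set(["Summarize this text.", "Translate to French."])
```

The resulting dataset (2.58M pairs in LaMini-LM's case) is then used as ordinary supervised fine-tuning data for the smaller student models.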

Quick Start & Requirements

  • Install via pip: pip install -q transformers
  • Models can be loaded with the HuggingFace pipeline() API.
  • Requires Python and the transformers library.
  • See HuggingFace Hub for model checkpoints.

Highlighted Details

  • Offers models based on T5, Flan-T5, Cerebras-GPT, GPT-2, and GPT-Neo architectures.
  • Evaluated on 15 diverse NLP tasks using lm-evaluation-harness.
  • Includes human evaluation results and qualitative analysis comparing LaMini-LM performance against Alpaca-7B.
  • Models are available in various sizes, from 61M to 1.5B parameters.

Maintenance & Community

  • The project is associated with mbzuai-nlp.
  • Citation details are provided in BibTeX format.

Licensing & Compatibility

  • Licensed under CC BY NC 4.0.
  • Intended for research use only; commercial use is restricted.

Limitations & Caveats

The CC BY NC 4.0 license prohibits commercial use. The README notes that reported LLaMA results are not directly comparable due to insufficient detail for reproducible evaluation.

Health Check
Last commit

2 years ago

Responsiveness

1 week

Pull Requests (30d)
0
Issues (30d)
0
Star History
4 stars in the last 90 days

Explore Similar Projects

Starred by Chip Huyen Chip Huyen(Author of AI Engineering, Designing Machine Learning Systems) and Georgios Konstantopoulos Georgios Konstantopoulos(CTO, General Partner at Paradigm).

LongLoRA by dvlab-research

0.1%
3k
LongLoRA: Efficient fine-tuning for long-context LLMs
created 1 year ago
updated 11 months ago
Feedback? Help us improve.