MobiLlama by mbzuai-oryx

Small language model for edge devices

Created 1 year ago
660 stars

Top 50.7% on SourcePulse

Project Summary

MobiLlama introduces Small Language Models (SLMs) designed for resource-constrained edge devices, addressing the limitations of larger models in terms of memory, energy, and response efficiency. It offers a fully transparent, open-source 0.5B-parameter SLM, catering to privacy, security, and sustainable deployment needs.

How It Works

MobiLlama builds upon the LLaMA-7B architecture design, employing a parameter-sharing scheme in which the feed-forward (FFN) layers are shared across transformer blocks to reduce pre-training and deployment costs. This cuts the model size substantially while aiming to preserve accuracy, making it suitable for edge computing environments.
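The parameter-sharing idea can be sketched in PyTorch. This is a minimal illustration, not MobiLlama's actual code: it assumes a design where a single FFN module is instantiated once and reused by every transformer block, and all class, dimension, and layer names here are hypothetical.

```python
import torch
import torch.nn as nn

class SharedFFNBlock(nn.Module):
    """A toy transformer block whose FFN is a shared module passed in from outside."""
    def __init__(self, d_model: int, n_heads: int, shared_ffn: nn.Module):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)
        self.ffn = shared_ffn  # same object in every block, so weights are shared

    def forward(self, x):
        attn_out, _ = self.attn(x, x, x)
        x = self.norm1(x + attn_out)
        x = self.norm2(x + self.ffn(x))
        return x

class TinySharedModel(nn.Module):
    """Illustrative model: one FFN reused by all blocks instead of n_layers separate FFNs."""
    def __init__(self, d_model: int = 64, n_heads: int = 4, n_layers: int = 8):
        super().__init__()
        shared_ffn = nn.Sequential(
            nn.Linear(d_model, 4 * d_model),
            nn.SiLU(),
            nn.Linear(4 * d_model, d_model),
        )
        self.blocks = nn.ModuleList(
            SharedFFNBlock(d_model, n_heads, shared_ffn) for _ in range(n_layers)
        )

    def forward(self, x):
        for block in self.blocks:
            x = block(x)
        return x

model = TinySharedModel()
x = torch.randn(2, 16, 64)  # (batch, sequence, d_model)
y = model(x)

# Every block holds a reference to the *same* FFN module, so its weights
# are counted (and trained) only once.
assert all(block.ffn is model.blocks[0].ffn for block in model.blocks)
```

Because PyTorch deduplicates shared submodules when counting `model.parameters()`, the FFN's weights contribute only once to the total, which is the kind of saving the parameter-sharing scheme targets for edge deployment.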

Quick Start & Requirements

  • Install: Clone the repository and install dependencies with pip install -r requirements.txt; PyTorch must be installed first.
  • Prerequisites: Python 3.10, PyTorch, CUDA (for GPU acceleration).
  • Demo: An Android APK is available for on-device testing.
  • Models: Pre-trained models (0.5B, 0.8B, 1B) and chat versions are available on HuggingFace.
  • Resources: Links to training code, data preparation, and metrics are provided.

Highlighted Details

  • Offers models in 0.5B, 0.8B, and 1B parameter sizes, including chat-tuned variants.
  • Achieves competitive performance on various LLM benchmarks, outperforming similarly sized models.
  • Pre-trained on the 1.2 trillion-token Amber dataset.
  • Provides intermediate checkpoints for research and fine-tuning.

Maintenance & Community

The project is associated with Mohamed Bin Zayed University of Artificial Intelligence (MBZUAI) and Linköping University. The repository is built using the LLM-360 framework.

Licensing & Compatibility

  • License: Apache 2.0.
  • Compatibility: Permissive license suitable for commercial use and integration into closed-source projects.

Limitations & Caveats

The project is presented as an arXiv preprint, so it may still be under active development or peer review. While benchmarks are provided, real-world performance on diverse edge devices may vary.

Health Check

  • Last Commit: 4 months ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 5 stars in the last 30 days

Starred by Stas Bekman (author of "Machine Learning Engineering Open Book"; research engineer at Snowflake), Elvis Saravia (founder of DAIR.AI), and 2 more.

Explore Similar Projects

YaFSDP by yandex
Top 0.1% on SourcePulse · 975 stars
Sharded data parallelism framework for transformer-like neural networks
Created 1 year ago · Updated 3 months ago
Starred by Lianmin Zheng (coauthor of SGLang, vLLM), Chip Huyen (author of "AI Engineering", "Designing Machine Learning Systems"), and 1 more.

MiniCPM by OpenBMB
Top 0.4% on SourcePulse · 8k stars
Ultra-efficient LLMs for end devices, achieving 5x+ speedup
Created 1 year ago · Updated 1 week ago
Starred by Phil Wang (prolific research paper implementer), Lianmin Zheng (coauthor of SGLang, vLLM), and 6 more.

Kimi-K2 by MoonshotAI
Top 1.7% on SourcePulse · 8k stars
State-of-the-art MoE language model
Created 2 months ago · Updated 1 week ago