MobiLlama by mbzuai-oryx

Small language model for edge devices

Created 2 years ago
667 stars

Top 50.1% on SourcePulse

Project Summary

MobiLlama introduces Small Language Models (SLMs) designed for resource-constrained edge devices, addressing the limitations of larger models in terms of memory, energy, and response efficiency. It offers a fully transparent, open-source 0.5B parameter SLM, catering to privacy, security, and sustainable deployment needs.

How It Works

MobiLlama starts from a larger model design based on the LLaMA-7B architecture and applies a parameter-sharing scheme that reduces both pre-training and deployment cost. Sharing parameters shrinks the model substantially while aiming to preserve accuracy, which makes it suitable for edge computing environments.
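
To make the effect of parameter sharing concrete, the following is a minimal back-of-the-envelope sketch, not the project's code. It assumes that the shared component is the feed-forward (FFN) block, reused by every transformer layer, and it uses illustrative LLaMA-style dimensions (22 layers, hidden size 2048, FFN size 5632) that are assumptions rather than values stated on this page. Embeddings, normalization weights, and biases are ignored.

```python
def transformer_block_params(n_layers: int, d_model: int, d_ffn: int,
                             share_ffn: bool) -> int:
    """Rough parameter count for a stack of LLaMA-style transformer blocks.

    Counts only the attention projections and the gated FFN; embeddings,
    norms, and biases are left out to keep the arithmetic simple.
    """
    attn = 4 * d_model * d_model   # Q, K, V, and output projections
    ffn = 3 * d_model * d_ffn      # gate, up, and down projections
    if share_ffn:
        # One FFN reused by every layer (assumed sharing scheme).
        return n_layers * attn + ffn
    # Every layer owns its own FFN.
    return n_layers * (attn + ffn)


# Illustrative dimensions (assumptions, not taken from this page).
unshared = transformer_block_params(22, 2048, 5632, share_ffn=False)
shared = transformer_block_params(22, 2048, 5632, share_ffn=True)

print(f"per-layer FFN: {unshared:,} parameters")
print(f"shared FFN:    {shared:,} parameters")
print(f"reduction:     {1 - shared / unshared:.0%}")
```

Under these assumptions the FFN dominates each block, so keeping a single copy removes well over half of the block parameters, which is the kind of saving the parameter-sharing scheme is aiming for.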

Quick Start & Requirements

  • Install: Clone the repository and install dependencies via pip install -r requirements.txt. PyTorch installation is a prerequisite.
  • Prerequisites: Python 3.10, PyTorch, CUDA (for GPU acceleration).
  • Demo: An Android APK is available for on-device testing.
  • Models: Pre-trained models (0.5B, 0.8B, 1B) and chat versions are available on HuggingFace.
  • Resources: Links to training code, data preparation, and metrics are provided.
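
As a rough illustration of using one of the HuggingFace checkpoints after installing the dependencies, the sketch below loads a model through the transformers library and generates a short completion. The repository id MBZUAI/MobiLlama-05B, the use of trust_remote_code=True, and the generate helper are assumptions made for this example; check the model page for the exact id and recommended loading options.

```python
# Assumed HuggingFace repository id; verify it on the model page.
MODEL_ID = "MBZUAI/MobiLlama-05B"


def generate(prompt: str, model_id: str = MODEL_ID,
             max_new_tokens: int = 64) -> str:
    """Load a causal LM from HuggingFace and return a generated completion.

    Hypothetical helper for illustration. Heavy imports happen inside the
    function so that importing this module does not require PyTorch.
    """
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # trust_remote_code is assumed to be needed for custom model code.
    tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)
    model.eval()

    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Example call (downloads the checkpoint on first use):
# print(generate("What is edge computing?"))
```

The call is left commented out because it downloads model weights; on a machine with a CUDA-capable GPU, moving the model and inputs to the GPU would be the usual next step.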

Highlighted Details

  • Offers models in 0.5B, 0.8B, and 1B parameter sizes, including chat-tuned variants.
  • Achieves competitive performance on various LLM benchmarks, outperforming similarly sized models.
  • Pre-trained on the Amber dataset (1.2 trillion tokens).
  • Provides intermediate checkpoints for research and fine-tuning.

Maintenance & Community

The project is associated with Mohamed Bin Zayed University of Artificial Intelligence (MBZUAI) and Linköping University. The repository is built using the LLM-360 framework.

Licensing & Compatibility

  • License: Apache 2.0.
  • Compatibility: Permissive license suitable for commercial use and integration into closed-source projects.

Limitations & Caveats

The project is presented as an arXiv preprint, so it may still be under active development or peer review. Benchmarks are provided, but real-world performance on diverse edge devices may vary.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: 1 day
  • Pull requests (30d): 0
  • Issues (30d): 0

Star History

  • 0 stars in the last 30 days
