MobiLlama by mbzuai-oryx

Small language model for edge devices

Created 1 year ago

669 stars

Top 50.5% on SourcePulse

1 Expert Loves This Project

Chip Huyen

Author of "AI Engineering", "Designing Machine Learning Systems"

Project Summary

MobiLlama introduces Small Language Models (SLMs) designed for resource-constrained edge devices, where larger models are limited by memory footprint, energy use, and response latency. It offers a fully transparent, open-source 0.5B-parameter SLM aimed at deployments that need privacy, security, and sustainability.

How It Works

MobiLlama builds on the LLaMA-7B architecture and applies a parameter-sharing scheme that lowers both pre-training and deployment costs. Sharing parameters shrinks the model substantially while aiming to preserve accuracy, which makes it suitable for edge computing environments.
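
To make the effect of parameter sharing concrete, here is a rough parameter count for a stack of transformer blocks. This is a sketch under stated assumptions, not the project's implementation: it assumes the shared component is the feed-forward network (FFN), that the FFN is a LLaMA-style gated block with three bias-free projections, and it uses hypothetical dimensions. Embeddings and normalization layers are ignored.

```python
def ffn_params(d_model: int, d_ff: int) -> int:
    # Assumed LLaMA-style gated FFN: gate, up, and down projections, no biases.
    return 3 * d_model * d_ff


def attn_params(d_model: int) -> int:
    # Query, key, value, and output projections, no biases.
    return 4 * d_model * d_model


def block_stack_params(n_layers: int, d_model: int, d_ff: int, share_ffn: bool) -> int:
    # Every layer keeps its own attention weights.
    attn = n_layers * attn_params(d_model)
    # With sharing, a single FFN is reused by all layers.
    ffn_copies = 1 if share_ffn else n_layers
    return attn + ffn_copies * ffn_params(d_model, d_ff)


# Hypothetical dimensions, chosen only for illustration.
N_LAYERS, D_MODEL, D_FF = 22, 2048, 5632

separate = block_stack_params(N_LAYERS, D_MODEL, D_FF, share_ffn=False)
shared = block_stack_params(N_LAYERS, D_MODEL, D_FF, share_ffn=True)

print(f"separate FFNs: {separate / 1e6:.0f}M parameters")
print(f"shared FFN:    {shared / 1e6:.0f}M parameters")
```

Under these assumptions the saving equals the cost of one FFN multiplied by the number of layers minus one, so it grows with model depth.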

Quick Start & Requirements

Install: Clone the repository, install PyTorch, then install the remaining dependencies with pip install -r requirements.txt.
Prerequisites: Python 3.10, PyTorch, CUDA (for GPU acceleration).
Demo: An Android APK is available for on-device testing.
Models: Pre-trained models (0.5B, 0.8B, 1B) and chat versions are available on Hugging Face.
Resources: Links to training code, data preparation, and metrics are provided.
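
Once the dependencies are installed, loading one of the published checkpoints could look like the sketch below. It uses the standard Hugging Face transformers API. The model ID "MBZUAI/MobiLlama-05B" is an assumption, as is the need for trust_remote_code=True; check the project's Hugging Face page for the exact identifier and loading instructions. The helper names load_model and generate are hypothetical.

```python
# Assumed Hub identifier; confirm it on the project's Hugging Face page.
MODEL_ID = "MBZUAI/MobiLlama-05B"


def load_model(model_id: str = MODEL_ID):
    # Imported lazily so this file can be imported without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # trust_remote_code=True is assumed to be needed for a custom architecture.
    tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)
    return tokenizer, model


def generate(prompt: str, max_new_tokens: int = 64) -> str:
    tokenizer, model = load_model()
    # Tokenize the prompt into PyTorch tensors and run on the model's default device (CPU).
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Example call (downloads the weights on first use):
# print(generate("What is an edge device?"))
```

Nothing is downloaded until generate or load_model is called, so the file can be imported safely on a machine without network access.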

Highlighted Details

Offers models in 0.5B, 0.8B, and 1B parameter sizes, including chat-tuned variants.
Achieves competitive performance on various LLM benchmarks, outperforming similarly sized models.
Pre-trained on the 1.2-trillion-token Amber dataset.
Provides intermediate checkpoints for research and fine-tuning.
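
For a sense of what these sizes mean on an edge device, the weight memory can be estimated as parameter count times bytes per parameter. The sketch below treats the nominal sizes (0.5B, 0.8B, 1B) as exact counts, which is an assumption, and the listed precisions are illustrative rather than formats the project is known to ship. Activations and the KV cache add to these figures at inference time.

```python
# Bytes needed to store one parameter at each (illustrative) precision.
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(n_params: float, dtype: str) -> float:
    # Weights only, in decimal gigabytes (1 GB = 1e9 bytes).
    return n_params * BYTES_PER_PARAM[dtype] / 1e9


# Nominal sizes from the list above, treated as exact parameter counts.
for name, n_params in [("0.5B", 0.5e9), ("0.8B", 0.8e9), ("1B", 1.0e9)]:
    row = ", ".join(
        f"{dtype}: {weight_memory_gb(n_params, dtype):.2f} GB" for dtype in BYTES_PER_PARAM
    )
    print(f"{name} -> {row}")
```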

Maintenance & Community

The project is associated with Mohamed Bin Zayed University of Artificial Intelligence (MBZUAI) and Linköping University. The repository is built using the LLM-360 framework.

Licensing & Compatibility

License: Apache 2.0.
Compatibility: Permissive license suitable for commercial use and integration into closed-source projects.

Limitations & Caveats

The project is presented as an arXiv preprint, so it may still be under active development or peer review. Benchmarks are provided, but real-world performance may vary across edge devices.

Health Check

Last Commit: 8 months ago
Responsiveness: 1 day

Star History

2 stars in the last 30 days