MobiLlama by mbzuai-oryx

Small language model for edge devices

created 1 year ago · 653 stars · Top 52.1% on sourcepulse

Project Summary

MobiLlama introduces Small Language Models (SLMs) designed for resource-constrained edge devices, addressing the memory, energy, and response-latency limitations of larger models. It offers a fully transparent, open-source 0.5B-parameter SLM aimed at deployments with privacy, security, and sustainability requirements.

How It Works

MobiLlama builds upon the LLaMA-7B architecture, employing a parameter-sharing scheme to reduce pre-training and deployment costs. Reusing weights across transformer blocks shrinks the parameter count substantially while aiming to preserve accuracy, which makes the model suitable for edge computing environments.
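To make the idea concrete, here is a minimal PyTorch sketch of cross-layer weight sharing. This is not the repository's actual model code: the assumption that the shared component is the feed-forward network (FFN) follows the paper's description, and the layer sizes are illustrative.

```python
import torch.nn as nn

class SharedFFNBlock(nn.Module):
    """Transformer block whose feed-forward network is a module shared
    with every other block, so its weights are stored only once."""
    def __init__(self, d_model: int, n_heads: int, shared_ffn: nn.Module):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)
        self.ffn = shared_ffn  # the same module object in every block

    def forward(self, x):
        h = self.norm1(x)
        attn_out, _ = self.attn(h, h, h, need_weights=False)
        x = x + attn_out
        return x + self.ffn(self.norm2(x))

# Illustrative sizes only, not the paper's exact configuration.
d_model, n_heads, n_layers = 1024, 16, 22
shared_ffn = nn.Sequential(  # one FFN instance reused by all layers
    nn.Linear(d_model, 4 * d_model),
    nn.SiLU(),
    nn.Linear(4 * d_model, d_model),
)
blocks = nn.ModuleList(SharedFFNBlock(d_model, n_heads, shared_ffn)
                       for _ in range(n_layers))

# parameters() deduplicates shared tensors, so the FFN is counted once
# no matter how many blocks reference it.
print(sum(p.numel() for p in blocks.parameters()))
```

Because every block points at the same FFN object, the optimizer updates a single set of FFN weights while attention and normalization stay per-layer; this is what lets the parameter count drop without reducing depth.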

Quick Start & Requirements

  • Install: Clone the repository and install dependencies via pip install -r requirements.txt (PyTorch must be installed first).
  • Prerequisites: Python 3.10, PyTorch, and CUDA for GPU acceleration.
  • Demo: An Android APK is available for on-device testing.
  • Models: Pre-trained models (0.5B, 0.8B, 1B) and chat-tuned versions are available on HuggingFace; a loading sketch follows this list.
  • Resources: Links to training code, data preparation, and metrics are provided.
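Once downloaded, the checkpoints load like any other Hugging Face causal LM. A minimal sketch, assuming the 0.5B base model is published under the MBZUAI/MobiLlama-05B model ID and needs trust_remote_code for its custom architecture; check the project's HuggingFace page for the exact names:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "MBZUAI/MobiLlama-05B"  # assumed ID for the 0.5B base model
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

prompt = "Small language models matter on edge devices because"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```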

Highlighted Details

  • Offers models in 0.5B, 0.8B, and 1B parameter sizes, including chat-tuned variants.
  • Achieves competitive performance on various LLM benchmarks, outperforming similarly sized models.
  • Pre-trained on 1.2 trillion tokens (the Amber dataset).
  • Provides intermediate checkpoints for research and fine-tuning; see the revision-loading sketch below.
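If the intermediate checkpoints are exposed as Git revisions on the HuggingFace repositories (an assumption; the repository's checkpoint listing has the actual identifiers), they can be pulled with the standard revision argument:

```python
from transformers import AutoModelForCausalLM

# "ckpt_100" is a hypothetical revision name used only for illustration.
model = AutoModelForCausalLM.from_pretrained(
    "MBZUAI/MobiLlama-05B",   # assumed model ID, as above
    revision="ckpt_100",      # hypothetical intermediate checkpoint
    trust_remote_code=True,
)
```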

Maintenance & Community

The project is associated with Mohamed Bin Zayed University of Artificial Intelligence (MBZUAI) and Linköping University. The repository is built using the LLM360 framework.

Licensing & Compatibility

  • License: Apache 2.0.
  • Compatibility: Permissive license suitable for commercial use and integration into closed-source projects.

Limitations & Caveats

The project is presented as an arXiv preprint, so it may still be under active development or peer review. While benchmarks are provided, real-world performance on diverse edge devices may vary.

Health Check

  • Last commit: 2 months ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 20 stars in the last 90 days

Explore Similar Projects

Starred by Patrick von Platen (Core Contributor to Hugging Face Transformers and Diffusers), Michael Han (Cofounder of Unsloth), and 1 more.

ktransformers by kvcache-ai

Framework for LLM inference optimization experimentation
Top 0.4% · 15k stars · created 1 year ago · updated 3 days ago