LLM inference on a microcontroller
This project demonstrates running a "large" language model on a microcontroller, specifically the Coral Dev Board Micro with 64MB of RAM. It targets embedded systems developers and researchers interested in pushing the boundaries of on-device AI, enabling generative text capabilities in resource-constrained environments.
How It Works
The project adapts the llama2.c implementation and tinyllamas checkpoints, trained on the TinyStories dataset, to run on the Coral Dev Board Micro's 800 MHz Arm Cortex-M7 CPU. For image input, it leverages the board's Edge TPU with a compiled YOLOv5 model for object detection. The detected object seeds the initial prompt for the LLM, and the generated text is streamed out over serial.
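The flow can be sketched as follows. This is a minimal, hypothetical illustration: the function names (detect_object_label, llm_generate_token, serial_write) are placeholders standing in for the project's actual coralmicro and llama2.c code.

```c
/*
 * Minimal sketch of the detect-then-generate flow described above.
 * All function names here are illustrative placeholders, not the
 * project's actual API (which builds on coralmicro and llama2.c).
 */
#include <stdio.h>

/* Placeholder for the Edge TPU + YOLOv5 step: returns a detected class label. */
static const char *detect_object_label(void) {
    return "cat";
}

/*
 * Placeholder for one llama2.c-style decoding step on the Cortex-M7.
 * The real project samples from a tinyllamas checkpoint trained on
 * TinyStories; this stub just returns canned tokens and then NULL.
 */
static const char *llm_generate_token(const char *prompt, int step) {
    (void)prompt;
    static const char *demo[] = {" who", " lived", " in", " a", " tiny", " house.", NULL};
    return step < 6 ? demo[step] : NULL;
}

/* Placeholder for the board's serial console output. */
static void serial_write(const char *text) {
    fputs(text, stdout);
}

int main(void) {
    /* The detected object seeds the story prompt. */
    char prompt[128];
    snprintf(prompt, sizeof prompt, "Once upon a time, there was a %s",
             detect_object_label());
    serial_write(prompt);

    /* Stream generated tokens over serial as they are produced. */
    for (int step = 0;; ++step) {
        const char *token = llm_generate_token(prompt, step);
        if (!token) {
            break;
        }
        serial_write(token);
    }
    serial_write("\n");
    return 0;
}
```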
Quick Start & Requirements
Build with cmake and make; flash by creating a Python virtual environment with python3 -m venv venv and running python ../coralmicro/scripts/flashtool.py.
Maintenance & Community
The project is a personal endeavor by maxbbraun. No specific community channels or roadmap are indicated in the README, and the repository has been inactive for about a year.
Licensing & Compatibility
The project itself appears to be MIT licensed, but it incorporates submodules from other projects (llama2.c, coralmicro, yolov5), which may carry different licenses. Compatibility for commercial use therefore depends on the licenses of these submodules.
Limitations & Caveats
The quality of the generated stories from the smaller model versions is described as "not ideal" but "somewhat coherent." The second Arm Cortex-M4 CPU core on the board is currently unused.