Nano by bd4sur

Transformer LLM for interactive, offline voice and text applications

Created 2 years ago
254 stars

Top 99.1% on SourcePulse

Project Summary

Summary: bd4sur/Nano is a toy language model project inspired by nanoGPT, targeting hobbyists and researchers for LLM experimentation. It facilitates training, fine-tuning, and efficient inference of Transformer models (including Qwen adaptations) across diverse hardware, from browsers and embedded devices to PCs, enabling offline, local LLM usage.

How It Works: Nano implements Transformer architectures (Llama2/nanoGPT-based) with features like RoPE and GQA. Its key strength is portable inference: a WASM engine for browsers and a pure C engine (derived from llama2.c) for resource-constrained devices such as Raspberry Pis and routers, enabling fully offline execution.
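
To make the RoPE feature mentioned above concrete, here is a minimal NumPy sketch of the idea: query/key features are rotated in pairs by position-dependent angles, so relative position falls out of the dot product. This is an illustration only, not Nano's actual code; the function name `rope` and the default base of 10000 are the conventional choices, not taken from the repository.

```python
import numpy as np

def rope(x, base=10000.0):
    """Apply rotary position embedding to x of shape (seq_len, head_dim).

    head_dim must be even: features are split into two halves and each
    pair (x1[j], x2[j]) is rotated by an angle that grows with position.
    """
    seq, dim = x.shape
    half = dim // 2
    # Per-pair rotation frequencies, geometrically spaced.
    freqs = base ** (-np.arange(half) / half)            # (half,)
    # Rotation angle for each (position, pair) combination.
    angles = np.outer(np.arange(seq), freqs)             # (seq, half)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, :half], x[:, half:]
    # Standard 2D rotation applied pairwise.
    return np.concatenate([x1 * cos - x2 * sin,
                           x1 * sin + x2 * cos], axis=-1)
```

Because each step is a pure rotation, vector norms are preserved and position 0 is left unchanged, which is a convenient sanity check when porting the kernel to C or WASM.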

Quick Start & Requirements:

  • Training: Requires Python 3.10+, Conda, PyTorch. GPU with CUDA recommended. Install dependencies via pip install -r requirements.txt.
  • Inference (C): Requires C/C++ toolchain (make).
  • Inference (WASM): Runs directly in browsers; build WASM module via infer/build_wasm.sh.
  • Models: Pre-trained models (e.g., Nano-168M) and Qwen conversion tools available.
  • Links: Bilibili Demo, Animac Playground, HuggingFace.

Highlighted Details:

  • Cross-Platform Inference: Supports local execution via browser WASM, C binaries for embedded systems, and PyTorch CPU/GPU.
  • Model Scale: Offers models from 1M to 168M parameters, plus Qwen support.
  • Non-NLP Tasks: Experimental scripts explore LLM use cases like sorting and logic evaluation.
  • PEFT: Includes support for LoRA fine-tuning.
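
The LoRA technique behind that fine-tuning support can be sketched as a frozen base weight plus a trainable low-rank update. This is a generic NumPy illustration, not Nano's implementation; the shapes and the `alpha` scaling follow the common LoRA convention.

```python
import numpy as np

def lora_forward(x, W, A, B, alpha=16.0):
    """Forward pass of a LoRA-adapted linear layer.

    x: (batch, d_in) input
    W: (d_out, d_in) frozen pretrained weight
    A: (r, d_in) trainable down-projection (random init)
    B: (d_out, r) trainable up-projection (zero init, so the
       adapter starts as a no-op and the base model is unchanged)
    """
    r = A.shape[0]  # LoRA rank
    # Base output plus scaled low-rank correction (alpha/r scaling).
    return x @ W.T + (alpha / r) * (x @ A.T) @ B.T
```

Only A and B (2 * r * d parameters per layer, with r typically 4-64) are trained, which is what makes LoRA practical on the modest hardware this project targets.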

Maintenance & Community: Appears to be a personal project by BD4SUR, lacking explicit community channels (Discord/Slack), formal roadmaps, or multiple listed maintainers.

Licensing & Compatibility: The repository states an MIT license, which is generally permissive for commercial use. However, a concurrent "all rights reserved" copyright notice creates ambiguity that adopters should clarify before relying on the project.

Limitations & Caveats: Positioned as a "toy" for learning/research; training large models is computationally expensive. Data preprocessing can be memory-intensive. The custom heuristic tokenizer differs from standard BPE. License ambiguity is a key adoption blocker.

Health Check

Last Commit: 2 days ago
Responsiveness: Inactive
Pull Requests (30d): 0
Issues (30d): 1
Star History: 24 stars in the last 30 days

Explore Similar Projects

Starred by Andrej Karpathy (Founder of Eureka Labs; formerly at Tesla, OpenAI; author of CS 231n), Gabriel Almeida (Cofounder of Langflow), and 2 more.

torchchat by pytorch

0.1%
4k
PyTorch-native SDK for local LLM inference across diverse platforms
Created 1 year ago
Updated 4 months ago
Starred by Yineng Zhang (Inference Lead at SGLang; Research Scientist at Together AI), Chip Huyen (Author of "AI Engineering" and "Designing Machine Learning Systems"), and 9 more.

LightLLM by ModelTC

0.3%
4k
Python framework for LLM inference and serving
Created 2 years ago
Updated 1 day ago
Starred by Lianmin Zheng (Coauthor of SGLang, vLLM), Chip Huyen (Author of "AI Engineering" and "Designing Machine Learning Systems"), and 1 more.

MiniCPM by OpenBMB

0.1%
8k
Ultra-efficient LLMs for end devices, achieving 5x+ speedup
Created 1 year ago
Updated 3 months ago