Low-resource LLM inference library
Top 19.5% on sourcepulse
JittorLLMs is a high-performance, low-resource large language model (LLM) inference library designed for broad accessibility, enabling users to run LLMs on standard hardware without dedicated GPUs. It supports popular models like ChatGLM, LLaMA, and Pangu, with a focus on low deployment costs and ease of use for Chinese language users.
How It Works
JittorLLMs leverages Jittor's dynamic swapping technology, which lets tensors move automatically between GPU memory, system RAM, and disk. Jittor claims this is the first such capability for dynamic-graph frameworks; it dramatically lowers hardware requirements, enabling LLM execution with as little as 2GB of RAM. The framework also employs zero-copy techniques and meta-operator auto-compilation for faster model loading and better computational performance than comparable frameworks.
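The swapping idea can be pictured as a tiered LRU cache: tensors are demoted from a fixed-size device pool to host RAM, then to disk, and promoted back transparently on access. The sketch below is a minimal illustration under those assumed semantics only; the class and method names are hypothetical and this is not Jittor's actual implementation or API.

```python
# Hypothetical sketch of tiered tensor swapping (not Jittor's real API).
# Least-recently-used tensors are demoted: "device" pool -> host RAM -> disk.
import os
import tempfile
from collections import OrderedDict

class SwappingPool:
    """LRU pool that demotes tensors from device to RAM to disk and back."""

    def __init__(self, device_slots, host_slots, spill_dir):
        self.device = OrderedDict()   # name -> values ("on GPU")
        self.host = OrderedDict()     # name -> values ("in RAM")
        self.device_slots = device_slots
        self.host_slots = host_slots
        self.spill_dir = spill_dir

    def put(self, name, values):
        # New or promoted tensors always land in the device pool first.
        self.device[name] = list(values)
        self.device.move_to_end(name)
        self._evict()

    def get(self, name):
        if name in self.device:
            self.device.move_to_end(name)     # refresh LRU order
            return self.device[name]
        if name in self.host:                 # promote RAM -> device
            vals = self.host.pop(name)
        else:                                 # promote disk -> device
            path = os.path.join(self.spill_dir, name + ".bin")
            with open(path) as f:
                vals = [float(x) for x in f.read().split()]
            os.remove(path)
        self.put(name, vals)
        return self.device[name]

    def _evict(self):
        while len(self.device) > self.device_slots:
            name, vals = self.device.popitem(last=False)  # LRU device tensor
            self.host[name] = vals                        # demote to RAM
        while len(self.host) > self.host_slots:
            name, vals = self.host.popitem(last=False)    # LRU host tensor
            path = os.path.join(self.spill_dir, name + ".bin")
            with open(path, "w") as f:
                f.write(" ".join(repr(v) for v in vals))  # demote to disk

# Usage: six tensors, room for two on "device" and two in "RAM".
pool = SwappingPool(device_slots=2, host_slots=2, spill_dir=tempfile.mkdtemp())
for i in range(6):
    pool.put(f"w{i}", [float(i)] * 4)
assert pool.get("w0") == [0.0] * 4  # transparently reloaded from disk
```

The point of the sketch is that callers never see where a tensor currently lives; `get` hides all promotion, which is what lets inference proceed on machines whose GPU (or even RAM) cannot hold the whole model at once.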
Quick Start & Requirements
After cloning the repository, install dependencies:

pip install -r requirements.txt -i https://pypi.jittor.org/simple -I

Then launch an interactive demo with a chosen model:

python cli_demo.py [model_name]
Maintenance & Community
Developed by Fitten Tech (非十科技) in collaboration with Tsinghua University's Visual Media Research Center. A developer community group is available (QQ group: 761222083). Future plans include model training/fine-tuning, MOSS support, and further performance optimizations.
Licensing & Compatibility
The README does not explicitly state a license. Compatibility for commercial use or closed-source linking is not specified.
Limitations & Caveats
The project is currently focused on inference; training and fine-tuning are listed as future plans. Chinese dialogue is supported only for specific models (ChatGLM, Atom7B, Pangu Alpha), while the others support English. The absence of a clearly stated license may impact commercial adoption.