Low-resource LLM inference library
Top 19.5% on sourcepulse
JittorLLMs is a high-performance, low-resource large language model (LLM) inference library designed for broad accessibility, enabling users to run LLMs on standard hardware without dedicated GPUs. It supports popular models like ChatGLM, LLaMA, and Pangu, with a focus on low deployment costs and ease of use for Chinese language users.
How It Works
JittorLLMs leverages Jittor's dynamic swapping technology, which lets tensors move automatically between GPU memory, system RAM, and disk. Jittor claims this is the first such capability for dynamic-graph frameworks; it dramatically lowers hardware requirements, enabling LLM execution with as little as 2GB of RAM. The framework also employs zero-copy techniques and meta-operator auto-compilation for faster model loading and better computational performance than comparable frameworks.
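The swapping idea can be pictured as a tiered LRU cache: tensors are demoted from a fixed-size device pool to host RAM, then to disk, and promoted back transparently on access. The sketch below is a minimal illustration under those assumed semantics only; the class and method names are hypothetical and this is not Jittor's actual implementation or API.

```python
# Hypothetical sketch of tiered tensor swapping (not Jittor's real API).
# Least-recently-used tensors are demoted: "device" pool -> host RAM -> disk.
import os
import tempfile
from collections import OrderedDict

class SwappingPool:
    """LRU pool that demotes tensors from device to RAM to disk and back."""

    def __init__(self, device_slots, host_slots, spill_dir):
        self.device = OrderedDict()   # name -> values ("on GPU")
        self.host = OrderedDict()     # name -> values ("in RAM")
        self.device_slots = device_slots
        self.host_slots = host_slots
        self.spill_dir = spill_dir

    def put(self, name, values):
        # New or promoted tensors always land in the device pool first.
        self.device[name] = list(values)
        self.device.move_to_end(name)
        self._evict()

    def get(self, name):
        if name in self.device:
            self.device.move_to_end(name)     # refresh LRU order
            return self.device[name]
        if name in self.host:                 # promote RAM -> device
            vals = self.host.pop(name)
        else:                                 # promote disk -> device
            path = os.path.join(self.spill_dir, name + ".bin")
            with open(path) as f:
                vals = [float(x) for x in f.read().split()]
            os.remove(path)
        self.put(name, vals)
        return self.device[name]

    def _evict(self):
        while len(self.device) > self.device_slots:
            name, vals = self.device.popitem(last=False)  # LRU device tensor
            self.host[name] = vals                        # demote to RAM
        while len(self.host) > self.host_slots:
            name, vals = self.host.popitem(last=False)    # LRU host tensor
            path = os.path.join(self.spill_dir, name + ".bin")
            with open(path, "w") as f:
                f.write(" ".join(repr(v) for v in vals))  # demote to disk

# Usage: six tensors, room for two on "device" and two in "RAM".
pool = SwappingPool(device_slots=2, host_slots=2, spill_dir=tempfile.mkdtemp())
for i in range(6):
    pool.put(f"w{i}", [float(i)] * 4)
assert pool.get("w0") == [0.0] * 4  # transparently reloaded from disk
```

The point of the sketch is that callers never see where a tensor currently lives; `get` hides all promotion, which is what lets inference proceed on machines whose GPU (or even RAM) cannot hold the whole model at once.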
Quick Start & Requirements
After cloning the repository, install dependencies:

pip install -r requirements.txt -i https://pypi.jittor.org/simple -I

Then launch an interactive demo with a chosen model:

python cli_demo.py [model_name]
Maintenance & Community
Developed by Fitten Tech (非十科技) in collaboration with Tsinghua University's Visual Media Research Center. A developer community group is available (QQ group: 761222083). Future plans include model training/fine-tuning, MOSS support, and further performance optimizations.
Licensing & Compatibility
The README does not explicitly state a license. Compatibility for commercial use or closed-source linking is not specified.
Limitations & Caveats
The project is currently focused on inference; training and fine-tuning are listed as future plans. Chinese dialogue is supported only for specific models (ChatGLM, Atom7B, Pangu Alpha), while the others support English. The absence of a clearly stated license may impact commercial adoption.