JittorLLMs by Jittor

Low-resource LLM inference library

created 2 years ago
2,432 stars

Top 19.5% on sourcepulse

Project Summary

JittorLLMs is a high-performance, low-resource inference library for large language models (LLMs), designed to run on commodity hardware without a dedicated GPU. It supports popular models such as ChatGLM, LLaMA, and Pangu, with a focus on low deployment cost and ease of use, particularly for Chinese-language users.

How It Works

JittorLLMs builds on Jittor's dynamic swap technology, which automatically exchanges tensors between GPU memory, system RAM, and disk as capacity runs out. The project claims this is the first such mechanism for dynamic-graph frameworks, and it sharply reduces hardware requirements, allowing LLM inference with as little as 2GB of RAM. The framework also uses zero-copy techniques and meta-operator auto-compilation, which the project credits for faster model loading and better computational performance than comparable frameworks.
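
The README surfaces this swap mechanism through a handful of environment variables. The sketch below mirrors the variables described there for memory-constrained runs (JT_SAVE_MEM, cpu_mem_limit, device_mem_limit); treat the exact names and semantics as subject to change and verify them against the current repository docs.

    # Enable Jittor's memory-saving mode so tensors can spill from GPU
    # memory to RAM and disk (variable names per the project README;
    # verify against the current docs).
    export JT_SAVE_MEM=1
    # Cap system RAM usage at roughly 16GB (value is in bytes).
    export cpu_mem_limit=16000000000
    # Cap device (GPU/TPU) memory usage at roughly 8GB (value is in bytes).
    export device_mem_limit=8000000000
    # Run inference as usual; data beyond the caps is swapped out.
    python cli_demo.py chatglm

Windows users can set the same variables via PowerShell (e.g. $env:JT_SAVE_MEM="1"), at the cost of slower inference when swapping kicks in.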

Quick Start & Requirements

  • Clone the repository, then install dependencies with pip install -r requirements.txt -i https://pypi.jittor.org/simple -I (the -I flag forces reinstallation so the pinned versions override anything already installed).
  • Requires Python >= 3.8 (Python >= 3.7 on Linux), at least 2GB of RAM (32GB recommended), and about 40GB of disk space. A GPU is optional; 16GB of GPU memory is recommended when one is used.
  • Supports Windows, macOS, and Linux.
  • Run inference with python cli_demo.py [model_name] (a consolidated example follows this list).
  • Official Docs: https://cg.cs.tsinghua.edu.cn/jittor/assets/docs/index.html
  • Forum: https://discuss.jittor.org/
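
Taken together, a minimal end-to-end run looks like the sketch below. The clone URL is an assumption based on the project's GitHub organization; substitute the actual repository address if it differs.

    # Fetch the code (repository URL assumed; adjust if needed).
    git clone https://github.com/Jittor/JittorLLMs.git --depth 1
    cd JittorLLMs
    # Install pinned dependencies from Jittor's own package index;
    # -I forces reinstallation so the pinned versions take precedence.
    pip install -r requirements.txt -i https://pypi.jittor.org/simple -I
    # Start an interactive command-line chat with a chosen model
    # (model weights are fetched automatically on first use).
    python cli_demo.py chatglm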

Highlighted Details

  • Claims up to 80% lower deployment cost than comparable frameworks through reduced hardware requirements.
  • Claims over 20% computational performance improvement from meta-operator auto-compilation.
  • Claims model loading roughly 40% faster than PyTorch.
  • Existing PyTorch code can be migrated via JTorch, Jittor's PyTorch-compatible interface.

Maintenance & Community

Developed by Fitten Tech in collaboration with Tsinghua University's Visual Media Research Center. A developer community group is available (QQ group: 761222083). Future plans include model training and fine-tuning, MOSS support, and further performance optimizations.

Licensing & Compatibility

The README does not explicitly state a license. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

The project currently focuses on inference; training and fine-tuning are listed as future work. Chinese dialogue is supported only by specific models (ChatGLM, Atom7B, Pangu Alpha); the others support English. The absence of a clearly stated license may hinder commercial adoption.

Health Check

  • Last commit: 5 months ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 0

Star History

  • 18 stars in the last 90 days
