torchchat by pytorch

PyTorch-native SDK for local LLM inference across diverse platforms

Created 1 year ago
3,609 stars

Top 13.5% on SourcePulse

Project Summary

torchchat enables running PyTorch Large Language Models (LLMs) locally across servers, desktops, and mobile devices. It targets developers and power users seeking a flexible, PyTorch-native solution for LLM deployment, offering Python, C++, and mobile (iOS/Android) interfaces with performance optimizations.

How It Works

torchchat leverages PyTorch's native capabilities, including eager execution, compilation via AOT Inductor for optimized desktop/server deployment, and ExecuTorch for mobile optimization. This PyTorch-centric approach emphasizes simplicity, extensibility, and correctness, allowing for modular integration and customization of LLM execution.
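The three execution paths can be sketched with torchchat's CLI. The subcommands (`generate`, `export`) follow the project README, but the exact flag names vary between versions, so treat the output-path flags below as illustrative and check the current docs:

```shell
# Eager / torch.compile execution in Python (server/desktop)
python3 torchchat.py generate llama3.1 --prompt "Hello, world"

# AOT Inductor: export a compiled artifact consumed by the C++ runner
# (output flag is illustrative; older versions used a DSO path flag)
python3 torchchat.py export llama3.1 --output-aoti-package-path ./llama3_1.pt2

# ExecuTorch: export a .pte program for mobile (iOS/Android)
python3 torchchat.py export llama3.1 --output-pte-path ./llama3_1.pte
```

The same model definition backs all three paths, which is what the "PyTorch-centric" design buys: one source of truth, multiple deployment targets.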

Quick Start & Requirements

  • Install: Clone the repo, create a virtual environment, and run ./install/install_requirements.sh.
  • Prerequisites: Python 3.10+, Hugging Face account and CLI login for model downloads.
  • Resources: Requires sufficient RAM for the chosen LLM (e.g., 8GB+ for some models).
  • Docs: Customization Guide, Multimodal Guide
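The install steps above can be sketched as follows (commands mirror the repository README; verify paths against the current docs):

```shell
# Clone the repo and enter it
git clone https://github.com/pytorch/torchchat.git
cd torchchat

# Create and activate a virtual environment (Python 3.10+)
python3 -m venv .venv
source .venv/bin/activate

# Install dependencies via the provided script
./install/install_requirements.sh

# Log in to Hugging Face so gated model weights can be downloaded
huggingface-cli login
```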

Highlighted Details

  • Multimodal support for Llama 3.2 11B Vision.
  • Command-line interface for popular LLMs (Llama 3, Mistral, etc.).
  • Support for various data types (FP32, FP16, BF16) and quantization schemes.
  • Native execution via AOT Inductor (for C++ runner) and ExecuTorch (for mobile).
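As a sketch of the dtype and quantization options, the commands below assume torchchat's documented `--dtype` and `--quantize` flags; the quantization config path is hypothetical, so substitute one of the JSON configs shipped with the repo:

```shell
# Run with an explicit data type (supported values include fp32, fp16, bf16)
python3 torchchat.py generate llama3.1 --dtype bf16 --prompt "Hello"

# Apply a quantization scheme from a JSON config
# (path/to/quant_config.json is a placeholder for a config from the repo)
python3 torchchat.py generate llama3.1 --quantize path/to/quant_config.json --prompt "Hello"
```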

Maintenance & Community

  • Active development with recent updates for DeepSeek R1 Distill and Llama 3.2 multimodal support.
  • Community engagement via Discord for support and contributions.
  • CONTRIBUTING guide available.

Licensing & Compatibility

  • BSD 3-Clause license for torchchat, with MIT and Apache licenses for additional code.
  • The licenses are generally permissive for commercial use, but users must comply with the terms of service of any third-party models they download.

Limitations & Caveats

The eval feature is noted as a work in progress. Some model access requires requesting permission via Hugging Face. The README includes a disclaimer about potential performance and compatibility differences compared to original model versions.

Health Check

  • Last Commit: 1 week ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 2
  • Issues (30d): 0
  • Star History: 12 stars in the last 30 days
