LLM-VM by anarchy-ai

Open-source AGI server for LLMs

Created 2 years ago
487 stars

Top 63.2% on SourcePulse

Project Summary

The Anarchy LLM-VM project aims to provide an open-source, highly optimized backend for running Large Language Models (LLMs) locally. It targets developers and researchers seeking to accelerate AGI development, reduce costs, and gain flexibility by running various open-source LLMs with advanced features like tool usage, memory, and data augmentation.

How It Works

The LLM-VM acts as a virtual machine for human language, orchestrating data, models, prompts, and tools. It employs a multi-level optimization strategy, from agent-level coordination down to assembly code, incorporating techniques like state-of-the-art batching, sparse inference, quantization, distillation, and multi-level colocation. This approach aims to deliver high performance and efficiency for local LLM execution, supporting model and architecture agnosticism.
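Quantization, one of the optimizations named above, trades numeric precision for memory. As a toy illustration only (not the project's implementation), symmetric 8-bit weight quantization can be sketched in pure Python:

```python
# Toy sketch of symmetric 8-bit quantization -- illustrative only,
# not the LLM-VM's actual code.

def quantize(weights, bits=8):
    """Map float weights to signed ints in [-(2**(bits-1)-1), 2**(bits-1)-1]."""
    qmax = 2 ** (bits - 1) - 1              # 127 for 8 bits
    scale = max(abs(w) for w in weights) / qmax or 1.0
    return [round(w / scale) for w in weights], scale

def dequantize(quantized, scale):
    """Recover approximate float weights from the int representation."""
    return [v * scale for v in quantized]

weights = [0.12, -1.5, 0.73, 0.0]
q, scale = quantize(weights)
restored = dequantize(q, scale)
# Each restored value lies within one quantization step of the original.
assert all(abs(a - b) <= scale for a, b in zip(weights, restored))
```

The stored integers take one byte each instead of four (float32), which is why RAM, not compute, is often the limiting factor noted in the requirements below.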

Quick Start & Requirements

  • Installation: pip install llm-vm or clone the repository and run ./setup.sh (macOS/Linux) or .\windows_setup.ps1 (Windows).
  • Prerequisites: Python >= 3.10. System requirements vary by model, with RAM being a common limiting factor (16GB recommended). OpenAI models require an LLM_VM_OPENAI_API_KEY environment variable.
  • Links: anarchy.ai
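After installation, the Python interface can be exercised with a short script. The snippet below follows the `Client` interface shown in the project's README; model names and method signatures may differ by version, so treat it as a sketch rather than a definitive example:

```python
# Sketch based on the project's README; assumes `pip install llm-vm`
# has succeeded and the chosen model fits in local RAM.
from llm_vm.client import Client

# Pick any supported model; OpenAI-backed models additionally require
# the LLM_VM_OPENAI_API_KEY environment variable.
client = Client(big_model='chat_gpt')
response = client.complete(prompt='What is anarchy?', context='')
print(response)
```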

Highlighted Details

  • Supports tool usage via agents (FLAT, REBEL).
  • Features inference optimization through batching, quantization, and distillation.
  • Enables task auto-optimization via student-teacher distillation and data synthesis.
  • Offers a library-callable Python interface and HTTP endpoints.

Maintenance & Community

  • Development Status: DEVELOPMENT ON PAUSE.
  • Community: Active Discord community for contributors.
  • Contributors: Notable contributors include Matthew Mirman (CEO) and Victor Odede.

Licensing & Compatibility

  • License: MIT License.
  • Compatibility: Permissive for commercial use and closed-source linking.

Limitations & Caveats

The project is currently in BETA with development on pause. Several advanced features like live data augmentation, web playground, load-balancing, output templating, and persistent stateful memory are listed on the roadmap and not yet implemented.

Health Check

  • Last Commit: 1 year ago
  • Responsiveness: 1 week
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 2 stars in the last 30 days

Explore Similar Projects

LitServe by Lightning-AI

  • AI inference pipeline framework
  • 0.3% · 4k stars
  • Created 1 year ago · Updated 1 day ago
  • Starred by Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), Luis Capelo (Cofounder of Lightning AI), and 3 more.

LightLLM by ModelTC

  • Python framework for LLM inference and serving
  • 0.5% · 4k stars
  • Created 2 years ago · Updated 14 hours ago
  • Starred by Yineng Zhang (Inference Lead at SGLang; Research Scientist at Together AI), Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), and 9 more.

mistral.rs by EricLBuehler

  • LLM inference engine for blazing fast performance
  • 0.3% · 6k stars
  • Created 1 year ago · Updated 1 day ago
  • Starred by Jason Knight (Director AI Compilers at NVIDIA; Cofounder of OctoML), Omar Sanseviero (DevRel at Google DeepMind), and 11 more.