aici by microsoft

AICI constrains LLM output using WebAssembly (Wasm) programs

Created 2 years ago
2,051 stars

Top 21.7% on SourcePulse

View on GitHub
Project Summary

AICI (Artificial Intelligence Controller Interface) provides a framework for real-time control and constraint of Large Language Model (LLM) output. It enables developers to build flexible "Controllers" that dictate token-by-token generation, manage state, and integrate custom logic, targeting researchers and developers seeking fine-grained control over LLM responses.

How It Works

Controllers are implemented as WebAssembly (Wasm) modules, allowing them to run efficiently on the CPU in parallel with the LLM's GPU-based token generation. This approach minimizes overhead and allows controllers to be written in various languages that compile to Wasm, such as Rust, C++, or interpreted languages like Python and JavaScript. AICI abstracts LLM inference details, aiming for portability across different inference engines.
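
For intuition, the per-token handshake can be pictured as follows. This is a conceptual sketch with illustrative names, not the real AICI interface: the engine asks the controller for a mask of currently-allowed tokens while the GPU works, then applies that mask to the logits before sampling.

    # Conceptual sketch only: illustrative names, not the real AICI interface.
    # The engine alternates steps: the GPU produces logits for the next token
    # while the controller, running on the CPU, computes a mask of allowed
    # tokens; the engine applies the mask before sampling.

    VOCAB_SIZE = 8  # toy vocabulary for illustration

    class ToyController:
        """Stands in for a Wasm controller; here it only allows even token ids."""
        def mid_process(self) -> list[bool]:
            return [tok % 2 == 0 for tok in range(VOCAB_SIZE)]

    def sample_step(logits: list[float], mask: list[bool]) -> int:
        # The engine rules out masked tokens, then samples (greedy here).
        masked = [l if ok else float("-inf") for l, ok in zip(logits, mask)]
        return max(range(VOCAB_SIZE), key=masked.__getitem__)

    ctrl = ToyController()
    logits = [0.1, 0.9, 0.3, 0.8, 0.2, 0.7, 0.4, 0.6]
    print(sample_step(logits, ctrl.mid_process()))  # -> 6, the best even-id token

In the real system the controller runs inside a Wasm sandbox and talks to the engine through AICI's binary interface; the point of the sketch is only the division of labor between CPU-side masking and GPU-side logit computation.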

Quick Start & Requirements

  • Installation: Requires Rust toolchain, Python 3.11+, and specific system dependencies (e.g., build-essential, cmake, clang).
  • LLM Backend: Integrates with llama.cpp (via rllm-llamacpp) and libtorch/CUDA (via rllm-cuda). The CUDA backend requires NVIDIA GPUs with compute capability 8.0+.
  • Setup: Detailed setup instructions are provided for WSL/Linux/macOS, with a recommended devcontainer for easier CUDA setup.
  • Running: Use ./server.sh to start the rLLM server and ./aici.sh run <script> to execute a controller (a minimal controller script is sketched after this list).
  • Documentation: the "QuickStart: Example Walkthrough" guide in the repository.
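
To make the run step concrete, here is a minimal controller script of the kind ./aici.sh run <script> expects. It is modeled on the pyctrl samples in the repository; treat the exact names (pyaici.server, FixedTokens, gen_text, regex, store_var) as approximate rather than a verified API reference.

    # sample.py: a minimal pyctrl controller, modeled on the repository's
    # pyctrl samples; exact names are approximate, not a verified API reference.
    import pyaici.server as aici

    async def main():
        # Append a fixed prompt prefix to the generation.
        await aici.FixedTokens("The answer to the ultimate question of life is")
        # Constrain the next tokens with a regex and store the match.
        await aici.gen_text(regex=r" \d+", max_tokens=5, store_var="answer")

    aici.start(main())

With the rLLM server started via ./server.sh, this would be launched as ./aici.sh run sample.py.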

Highlighted Details

  • Controllers are sandboxed Wasm modules, enhancing security by restricting filesystem, network, and other resource access.
  • Supports multiple controller implementations (e.g., pyctrl for Python, jsctrl for JavaScript) and aims to support higher-level libraries like Guidance and LMQL.
  • Performance claims indicate minimal overhead (0.2-2.0 ms per token for common constraints) on an AMD EPYC 7V13 with an NVIDIA A100 GPU.
  • Offers flexibility for complex control strategies, including backtracking the KV-cache, forking generations, and inter-fork communication (see the sketch below).
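
As an illustration of the forking strategy, the sketch below (again modeled on the pyctrl samples, with approximate names) splits one generation into three branches that share their KV-cache prefix and then diverge; stored variables give the branches a channel for exchanging results.

    # fork_demo.py: sketch of forked generation, modeled on pyctrl samples;
    # names (fork, FixedTokens, gen_text, store_var) are approximate.
    import pyaici.server as aici

    async def main():
        await aici.FixedTokens("The word 'hello' in")
        # Split into three branches; each fork resumes from the shared
        # KV-cache prefix and receives its own id (0, 1, or 2).
        id = await aici.fork(3)
        lang = ["French", "German", "Spanish"][id]
        await aici.FixedTokens(f" {lang} is")
        # Each branch generates independently; stored variables are one
        # channel the branches can use to share results.
        await aici.gen_text(max_tokens=5, store_var=f"translation_{id}")

    aici.start(main())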

Maintenance & Community

  • Actively maintained by Microsoft Research.
  • Contributions are welcome via pull requests, requiring agreement to a CLA.
  • Follows the Microsoft Open Source Code of Conduct.

Licensing & Compatibility

  • Licensed under the MIT License.
  • Compatible with commercial use and closed-source linking.

Limitations & Caveats

  • AICI is described as a prototype.
  • The vLLM integration is noted as out-of-date; rllm-cuda or rllm-llamacpp is recommended instead.
  • Native Windows support is tracked as a future enhancement.
Health Check

  • Last Commit: 7 months ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 0

Star History

  • 9 stars in the last 30 days

Explore Similar Projects

Starred by Tobi Lutke (Cofounder of Shopify), Eric Zhu (Coauthor of AutoGen; Research Scientist at Microsoft Research), and 41 more.

guidance by guidance-ai

Guidance is a programming paradigm for steering LLMs

Created 2 years ago
Updated 1 day ago
21k stars

Top 0.1% on SourcePulse