AICI constrains LLM output using WebAssembly (Wasm) programs
AICI (Artificial Intelligence Controller Interface) provides a framework for real-time control and constraint of Large Language Model (LLM) output. It lets developers build flexible "Controllers" that dictate token-by-token generation, manage state, and integrate custom logic; it targets researchers and developers who need fine-grained control over LLM responses.
How It Works
Controllers are implemented as WebAssembly (Wasm) modules, allowing them to run efficiently on the CPU in parallel with the LLM's GPU-based token generation. This approach minimizes overhead and allows controllers to be written in various languages that compile to Wasm, such as Rust, C++, or interpreted languages like Python and JavaScript. AICI abstracts LLM inference details, aiming for portability across different inference engines.
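For instance, a pyctrl controller is a short async Python script that interleaves fixed text with constrained sampling. The sketch below is modeled on the project's sample scripts; the names FixedTokens, gen_text, and start are taken from those samples and may differ between versions.

```python
# Sketch of a pyctrl controller, modeled on the project's samples.
# The aici.* calls below are assumptions based on pyctrl's examples.
import pyaici.server as aici

async def main():
    # Force a fixed prefix into the output, then generate a
    # continuation constrained to digits only.
    await aici.FixedTokens("How much is 2 + 2? ")
    answer = await aici.gen_text(regex=r"\d+", max_tokens=5)
    print("model answered:", answer)  # printed to the controller log

aici.start(main())
```

Because the constraint runs inside the controller, the digits-only regex is enforced as tokens are sampled rather than by post-processing the finished completion.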
Quick Start & Requirements
- Building requires standard native toolchain packages (build-essential, cmake, clang).
- Two backends are supported: llama.cpp (via rllm-llamacpp) and libtorch/CUDA (via rllm-cuda). The CUDA backend requires NVIDIA GPUs with compute capability 8.0+.
- Use ./server.sh to start the rLLM server and ./aici.sh run <script> to execute controllers; a sample session follows this list.
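A minimal session might look like this; the backend directory, model name, and script path are illustrative assumptions rather than values taken verbatim from the repository:

```sh
# Terminal 1: start the rLLM server from a backend directory
# (directory layout and model name "phi2" are assumptions).
cd rllm/rllm-llamacpp
./server.sh phi2

# Terminal 2: upload and run a controller script against the server
# (the script path is hypothetical).
./aici.sh run samples/hello.py
```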
Highlighted Details
- Ships with ready-made controllers for scripting languages (pyctrl for Python, jsctrl for JavaScript) and aims to support higher-level libraries like Guidance and LMQL.
Maintenance & Community
- Last commit 6 months ago; the repository is marked inactive.
Licensing & Compatibility
- MIT licensed.
Limitations & Caveats
- The vLLM integration is deprecated; use rLLM-cuda or rLLM-llama.cpp instead.