WilmerAI by SomeOddCodeGuy

AI inference router for specialized workflows

created 1 year ago
733 stars

Top 48.2% on sourcepulse

View on GitHub
Project Summary

WilmerAI acts as a sophisticated intermediary for Large Language Models (LLMs), enabling users to route prompts to specialized LLM workflows based on domain or persona. It allows multiple LLMs to collaborate on generating a single response, enhancing output quality and enabling complex AI assistant configurations. The project is targeted at users who want to orchestrate multiple LLMs for advanced tasks, including RAG and iterative response refinement.

How It Works

WilmerAI routes prompts through user-defined workflows, which are sequences of LLM calls. These workflows can incorporate custom Python scripts, external APIs such as the Offline Wikipedia API, and conditional logic. The system supports distributing LLM inference across multiple machines and can leverage Ollama's model hotswapping to run multi-model workflows on systems with limited VRAM by loading models on demand. It exposes OpenAI- and Ollama-compatible API endpoints for seamless integration with a variety of front-end applications.
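
Because WilmerAI exposes an OpenAI-compatible endpoint, any standard chat-completions client can talk to it. The minimal sketch below uses the requests library (already a project dependency); the port (5006), the model name, and the response shape are assumptions that depend on your user and endpoint configuration rather than fixed values of the project.

    import requests

    # Sketch: send a chat request to a locally running WilmerAI instance through
    # its OpenAI-compatible endpoint. The port and model name are assumptions;
    # adjust them to match your own user/endpoint JSON configuration.
    WILMER_URL = "http://localhost:5006/v1/chat/completions"

    payload = {
        "model": "wilmer",  # routing is decided by Wilmer's workflows, not this field
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Summarize the plot of Hamlet in two sentences."},
        ],
        "stream": False,
    }

    response = requests.post(WILMER_URL, json=payload, timeout=300)
    response.raise_for_status()
    # Assumes the standard OpenAI-style response shape.
    print(response.json()["choices"][0]["message"]["content"])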

Quick Start & Requirements

  • Installation: Clone the repository, run pip install -r requirements.txt, and start the server with python server.py. Alternatively, use the provided .bat (Windows) or .sh (macOS) scripts.
  • Prerequisites: Python 3.10 or 3.12.
  • Dependencies: Flask, requests, scikit-learn, urllib3, jinja2.
  • Configuration: Requires JSON configuration files for endpoints, users, and workflows (an illustrative sketch follows this list).
  • Documentation: Extensive setup and workflow configuration details are available in the README.
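
The configuration files are ordinarily hand-edited JSON, but they can also be generated programmatically, as in the sketch below. Every field name and the directory layout shown here are illustrative assumptions rather than WilmerAI's actual schema; consult the README for the real keys and file locations.

    import json
    from pathlib import Path

    # Illustrative only: the key names and path below are assumptions, not
    # WilmerAI's real configuration schema. See the README for the actual
    # endpoint config format and where the file should live.
    endpoint_config = {
        "endpoint": "http://localhost:11434",  # e.g. a local Ollama instance
        "apiType": "ollamaApiChat",            # hypothetical backend type identifier
        "maxContextTokens": 8192,
    }

    config_path = Path("configs/endpoints/example-endpoint.json")  # assumed layout
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(endpoint_config, indent=2))
    print(f"Wrote {config_path}")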

Highlighted Details

  • Workflow Orchestration: Define complex, multi-step LLM interactions with conditional logic and custom scripting (a conceptual sketch follows this list).
  • Distributed Inference: Distribute LLM workloads across multiple machines.
  • Memory & Summarization: Features for generating and managing conversation memories and summaries to maintain context over long interactions.
  • Multi-Modal Support: Experimental support for image processing via Ollama.
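
To make the orchestration idea concrete, the sketch below shows the routing pattern in plain Python: classify an incoming prompt, then hand it to a specialized chain of model calls. This is a conceptual illustration only, not WilmerAI's actual workflow engine or its JSON workflow format, and the model names are placeholders.

    from typing import Callable

    # Conceptual sketch only: NOT WilmerAI's engine or API. It illustrates the
    # pattern the project implements via JSON workflow files: route a prompt to
    # a domain-specific sequence of model calls.

    def call_llm(model: str, prompt: str) -> str:
        """Placeholder for a real backend call (Ollama, an OpenAI-compatible API, etc.)."""
        return f"[{model}] response to: {prompt}"

    def classify_domain(prompt: str) -> str:
        """In WilmerAI this step would be a small routing LLM call; a keyword check stands in here."""
        return "coding" if "code" in prompt.lower() else "general"

    def coding_workflow(prompt: str) -> str:
        draft = call_llm("coder-model", prompt)
        return call_llm("reviewer-model", f"Review and improve:\n{draft}")

    def general_workflow(prompt: str) -> str:
        return call_llm("generalist-model", prompt)

    WORKFLOWS: dict[str, Callable[[str], str]] = {
        "coding": coding_workflow,
        "general": general_workflow,
    }

    def route(prompt: str) -> str:
        return WORKFLOWS[classify_domain(prompt)](prompt)

    print(route("Write code that reverses a string in Python."))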

Maintenance & Community

This is a personal project under heavy development, maintained by "Socg" (SomeOddCodeGuy) in their free time, so updates and responses may take a week or two. Contact is available via WilmerAI.Project@gmail.com.

Licensing & Compatibility

WilmerAI is licensed under the GNU General Public License v3.0 or later. The license permits redistribution, modification, and commercial use, but requires derived works to be distributed under the same license, which can complicate linking with or embedding in closed-source applications.

Limitations & Caveats

WilmerAI does not currently track or report token usage, requiring users to monitor costs via their LLM API dashboards. The project is in heavy development, and the README explicitly states it may contain bugs or incomplete code. The quality of WilmerAI's output is highly dependent on the connected LLMs and the user's configuration of prompts and presets. Linux support is not provided due to a lack of testing.

Health Check

  • Last commit: 4 days ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 6
  • Issues (30d): 1

Star History

73 stars in the last 90 days
