forge  by antoinezambelli

Self-hosted LLM agentic workflows and tool-calling framework

Created 3 months ago
1,850 stars

Top 22.9% on SourcePulse

GitHubView on GitHub
Project Summary

A Python framework for self-hosted LLM tool-calling and multi-step agentic workflows, Forge enhances the reliability and performance of local models. It targets engineers and researchers building or integrating self-hosted LLM agents, enabling them to achieve top-tier results on complex tasks through robust guardrails and intelligent context management.

How It Works

Forge acts as a reliability layer for self-hosted LLMs, employing guardrails such as rescue parsing, retry nudges, and step enforcement, alongside VRAM-aware context management with tiered compaction. This approach significantly boosts the capabilities of local models on multi-step agentic workflows. Users can leverage Forge via three main patterns: WorkflowRunner for defining and executing structured agent loops; SlotWorker for managing priority-queued inference slots in multi-agent architectures; and Guardrails middleware for integrating Forge's validation and error-handling stack into custom orchestration loops. A notable feature is its OpenAI-compatible proxy server, which transparently applies Forge's guardrails to local model interactions.

Quick Start & Requirements

  • Installation: Core: pip install forge-guardrails. With Anthropic client: pip install "forge-guardrails[anthropic]". For development: git clone https://github.com/antoinezambelli/forge.git, cd forge, pip install -e ".[dev]".
  • Prerequisites: Python 3.12+, a running LLM backend (Ollama, llama-server, Llamafile, or Anthropic API). llama-server is recommended for optimal performance.
  • Documentation: User Guide, Model Guide, Backend Setup, Eval Guide.

Highlighted Details

  • Achieves 86.5% on its 26-scenario eval suite with a Ministral-3 8B Instruct Q8 model on llama-server, positioning it as a top performer among self-hosted configurations.
  • Provides an OpenAI-compatible proxy server that transparently adds guardrails to local LLM interactions for clients like opencode, Continue, and aider.
  • The proxy automatically injects a synthetic respond tool to guide smaller models towards reliable tool-calling, essential for models that struggle with mixed text/tool output.

Maintenance & Community

No specific details on maintainers, community channels (e.g., Discord, Slack), or roadmap were found in the provided README.

Licensing & Compatibility

Licensed under the MIT License, Forge is permissive for commercial use and integration with closed-source projects.

Limitations & Caveats

The effectiveness of Forge is contingent on the chosen LLM backend and model configuration. Smaller models may still require explicit guidance, such as the synthetic respond tool, to reliably execute tool calls. The evaluation harness primarily focuses on tool-calling and multi-step reasoning capabilities.

Health Check
Last Commit

1 day ago

Responsiveness

Inactive

Pull Requests (30d)
22
Issues (30d)
5
Star History
1,852 stars in the last 30 days

Explore Similar Projects

Feedback? Help us improve.