LLMCompiler by SqueezeAILab

LLM compiler for parallel function calling

created 1 year ago
1,722 stars

Top 25.3% on sourcepulse

View on GitHub
Project Summary

LLMCompiler is a framework designed to optimize Large Language Model (LLM) interactions by enabling parallel function calling. It addresses the latency, cost, and accuracy issues of sequential function execution by automatically identifying and orchestrating tasks that can be performed concurrently, benefiting researchers and developers working with complex LLM-driven applications.

How It Works

LLMCompiler decomposes a user query into a Directed Acyclic Graph (DAG) of tasks, allowing LLM function calls without mutual dependencies to execute in parallel. A planner uses the LLM's reasoning capabilities to identify the required function calls and their dependencies, a task-fetching unit dispatches each task as soon as its inputs are available, and an executor runs the dispatched tasks concurrently. This yields significant latency and cost reductions compared to sequential approaches such as ReAct.
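A minimal sketch of the scheduling idea, assuming nothing about the project's actual API: `run_dag`, the task names, and the `search`/`compare` tools below are hypothetical stand-ins for planner-generated tool calls.

```python
import asyncio

async def run_dag(tasks):
    """Run a DAG of tasks. `tasks` maps a task name to
    (dependency_names, async_fn); each task blocks only on its own
    dependencies, so independent tasks execute concurrently."""
    loop = asyncio.get_running_loop()
    futures = {name: loop.create_future() for name in tasks}

    async def run(name):
        deps, fn = tasks[name]
        dep_results = [await futures[d] for d in deps]   # wait on own deps only
        futures[name].set_result(await fn(*dep_results))

    await asyncio.gather(*(run(name) for name in tasks))
    return {name: futures[name].result() for name in tasks}

async def search(query):             # stand-in for a real tool / LLM call
    await asyncio.sleep(1.0)
    return f"result({query})"

async def compare(a, b):             # joins the two independent results
    return f"compare({a}, {b})"

# A plan the planner might emit for "compare the populations of NY and SF":
# t1 and t2 share no dependencies and run in parallel; t3 waits for both.
plan = {
    "t1": ((), lambda: search("population of New York")),
    "t2": ((), lambda: search("population of San Francisco")),
    "t3": (("t1", "t2"), compare),
}
print(asyncio.run(run_dag(plan)))    # ~1s total rather than ~2s sequentially
```

Because `t1` and `t2` overlap, total latency is bounded by the longest dependency chain rather than by the number of calls.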

Quick Start & Requirements

  • Installation: Clone the repository and install dependencies via pip install -r requirements.txt within a Python 3.10 conda environment.
  • Prerequisites: OpenAI API key (or Azure/Friendli credentials), Python 3.10. vLLM is supported for custom models.
  • Running Benchmarks: python run_llm_compiler.py --benchmark {benchmark-name} --store {store-path} (a consolidated example follows this list).
  • Resources: Requires API access to LLMs. Detailed setup for vLLM serving is available in the vLLM documentation.
  • Links: Paper, vLLM, LangGraph, LlamaIndex
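
Putting the steps together, a typical first run might look like the following. The environment name, benchmark name, and store path are illustrative placeholders; consult the README for the exact benchmark names.

```bash
# Assuming a fresh conda install and an OpenAI API key; the benchmark
# name and store path below are illustrative placeholders.
conda create -n llmcompiler python=3.10 -y
conda activate llmcompiler
git clone https://github.com/SqueezeAILab/LLMCompiler.git
cd LLMCompiler
pip install -r requirements.txt
export OPENAI_API_KEY=<your-key>
python run_llm_compiler.py --benchmark hotpotqa --store results/
```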

Highlighted Details

  • Achieves up to 3.7× latency speedup, up to 6.7× cost saving, and ~9% accuracy improvement over ReAct across the paper's benchmarks.
  • Supports both open-source (LLaMA via vLLM) and closed-source (OpenAI, Azure) LLM models.
  • Integrates with LangChain's LangGraph and LlamaIndex frameworks.
  • Offers streaming capabilities for improved responsiveness (see the sketch after this list).
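
The streaming capability can be sketched in the same style: because the planner emits tasks in dependency order, the executor can dispatch each task the moment it is parsed, overlapping planning with execution. Again a hypothetical illustration, not the project's API; `stream_plan` stands in for a planner LLM streaming its output.

```python
import asyncio

async def stream_plan():
    """Stand-in for a planner LLM streaming one task spec at a time."""
    for name, deps in [("t1", ()), ("t2", ()), ("t3", ("t1", "t2"))]:
        await asyncio.sleep(0.2)        # simulated token-generation latency
        yield name, deps

async def execute(name, deps, started):
    await asyncio.gather(*(started[d] for d in deps))  # wait on own deps only
    await asyncio.sleep(1.0)                           # stand-in tool call
    return name

async def main():
    started = {}                        # name -> running asyncio.Task
    async for name, deps in stream_plan():
        # Dispatch immediately: t1 is already executing while the
        # planner is still emitting t2 and t3.
        started[name] = asyncio.ensure_future(execute(name, deps, started))
    await asyncio.gather(*started.values())

asyncio.run(main())
```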

Maintenance & Community

The project is associated with SqueezeAILab and has been integrated into popular LLM orchestration frameworks like LangChain and LlamaIndex. Updates include support for Friendli endpoints and vLLM.

Licensing & Compatibility

The repository does not explicitly state a license in the README. Users should verify licensing for commercial use or integration into closed-source projects.

Limitations & Caveats

Logging is not yet supported for vLLM. Default prompts are tailored for LLaMA-2 70B and may require adjustments for other models. The roadmap indicates planned Tree-of-Thoughts evaluation.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: 1+ week
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 58 stars in the last 90 days
