LLMCompiler: An LLM Compiler for Parallel Function Calling
LLMCompiler is a framework that optimizes Large Language Model (LLM) interactions by enabling parallel function calling. Rather than executing function calls one at a time, it automatically identifies tasks that can run concurrently and orchestrates them, addressing the latency, cost, and accuracy issues of sequential execution. It is aimed at researchers and developers building complex LLM-driven applications.
How It Works
LLMCompiler decomposes complex problems into a Directed Acyclic Graph (DAG) of tasks, allowing for parallel execution of LLM function calls. This approach leverages the LLM's reasoning capabilities to determine task dependencies and optimize the execution order, leading to significant speedups and cost reductions compared to traditional sequential methods.
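To make the idea concrete, below is a minimal sketch of DAG-style parallel execution using plain Python asyncio. The task names, dependencies, and functions are hypothetical placeholders, not LLMCompiler's actual API; in the real framework, a planner LLM produces the DAG and each task is a tool or function call.

import asyncio

# Hypothetical DAG: each task lists the tasks whose outputs it depends on.
# Tasks with no unmet dependencies run concurrently, mirroring
# LLMCompiler's parallel function calling.
TASKS = {
    "search_a": {"deps": [], "fn": lambda: "result A"},
    "search_b": {"deps": [], "fn": lambda: "result B"},
    "combine":  {"deps": ["search_a", "search_b"], "fn": lambda: "merged results"},
}

async def run_task(name, done_events, results):
    # Wait for every dependency to finish before executing this task.
    for dep in TASKS[name]["deps"]:
        await done_events[dep].wait()
    results[name] = TASKS[name]["fn"]()  # stand-in for a tool/LLM call
    done_events[name].set()

async def main():
    done_events = {name: asyncio.Event() for name in TASKS}
    results = {}
    # Schedule all tasks at once; the dependency events enforce the DAG order.
    await asyncio.gather(*(run_task(n, done_events, results) for n in TASKS))
    print(results)

asyncio.run(main())

Here search_a and search_b run concurrently, and combine starts only once both have finished, which is the scheduling behavior that yields LLMCompiler's latency savings over sequential calls.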
Quick Start & Requirements
Install dependencies within a Python 3.10 conda environment:
pip install -r requirements.txt
Then run a benchmark with:
python run_llm_compiler.py --benchmark {benchmark-name} --store {store-path}
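As an illustration, assuming hotpotqa is an accepted value for --benchmark (benchmarks discussed in the LLMCompiler paper include HotpotQA and Movie Recommendation) and that results should be written under a local results/ directory:
python run_llm_compiler.py --benchmark hotpotqa --store results/hotpotqa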
Highlighted Details
Maintenance & Community
The project is associated with SqueezeAILab and has been integrated into popular LLM orchestration frameworks like LangChain and LlamaIndex. Updates include support for Friendli endpoints and vLLM.
Licensing & Compatibility
The repository does not explicitly state a license in the README. Users should verify licensing for commercial use or integration into closed-source projects.
Limitations & Caveats
Logging is not yet supported for vLLM. Default prompts are tailored for LLaMA-2 70B and may require adjustments for other models. The roadmap indicates planned Tree-of-Thoughts evaluation.