ToolkenGPT by Ber666

Research code for augmenting frozen LLMs with tools via embeddings

created 2 years ago
264 stars

Top 97.5% on sourcepulse

Project Summary

ToolkenGPT augments frozen large language models (LLMs) with a large set of tools through lightweight tool embeddings, enabling them to perform complex tasks that require external functionality. Presented as a NeurIPS 2023 oral and winner of the Best Paper Award at SoCalNLP 2023, it targets researchers and developers aiming to extend LLM capabilities beyond the model's inherent knowledge.

How It Works

ToolkenGPT introduces a "tool embedding" mechanism that lets a pre-trained, frozen LLM interact with external tools. Each tool is represented as a token embedding (a "toolken") appended to the model's output vocabulary, so the LLM can predict a tool call just like an ordinary word token. When a toolken is predicted, decoding switches to an argument-generation mode, the tool is executed, and its result is inserted back into the context before generation continues. Because only the small toolken embedding matrix is trained, the approach avoids costly fine-tuning of the entire LLM.
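The prediction side of this mechanism can be sketched in a few lines: word logits and toolken logits are scored jointly, and any predicted id beyond the word vocabulary is treated as a tool call. This is a toy illustration with made-up dimensions and tool names, not the repository's actual code:

```python
import numpy as np

rng = np.random.default_rng(0)

VOCAB_SIZE, N_TOOLS, DIM = 100, 3, 8
TOOL_NAMES = ["add", "multiply", "sqrt"]  # hypothetical toolkens

# Frozen LM head rows and the small toolken matrix (unit-normalized here
# only so the toy example is deterministic).
W_vocab = rng.normal(size=(VOCAB_SIZE, DIM))
W_vocab /= np.linalg.norm(W_vocab, axis=1, keepdims=True)
W_tool = rng.normal(size=(N_TOOLS, DIM))
W_tool /= np.linalg.norm(W_tool, axis=1, keepdims=True)

def next_token(hidden):
    """Score word tokens and toolkens jointly; ids >= VOCAB_SIZE are tools."""
    logits = np.concatenate([W_vocab @ hidden, W_tool @ hidden])
    return int(np.argmax(logits))

# A hidden state that aligns with the "multiply" toolken embedding.
hidden = W_tool[1]
tok = next_token(hidden)
if tok >= VOCAB_SIZE:
    print("toolken predicted:", TOOL_NAMES[tok - VOCAB_SIZE])
```

In the real system, predicting a toolken would pause normal decoding, generate the tool's arguments, run the tool, and splice the result back into the text.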

Quick Start & Requirements

  • Installation: Requires acquiring LLaMA checkpoints from MetaAI and installing dependencies.
  • Prerequisites: LLaMA-13B/33B checkpoints, Python, PyTorch, and CUDA-capable GPUs (>= 24GB memory each, e.g., 2-4 GPUs for LLaMA-33B). Specific datasets (GSM8K-XL, FuncQA, VirtualHome, KAMEL) need to be downloaded separately.
  • Setup: Requires obtaining LLaMA weights and preparing datasets, which can be time-consuming.
  • Resources: Training and inference demand significant GPU resources.
  • Documentation: Links to LLaMA official repo and VirtualHome repo are provided for data acquisition and setup.

Highlighted Details

  • NeurIPS 2023 (oral) and Best Paper Award at SoCalNLP 2023.
  • Augments frozen LLMs, reducing computational cost compared to full fine-tuning.
  • Supports multiple tool-use scenarios including GSM8K-XL, FuncQA (1-hop and multi-hop), VirtualHome, and KAMEL.
  • Provides specific training and inference commands for various datasets and configurations.
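The cost saving from keeping the LLM frozen can be illustrated with a single training step in which gradients flow only into the toolken matrix. This is a minimal numpy sketch under that assumption (toy dimensions, not the repository's training code):

```python
import numpy as np

rng = np.random.default_rng(1)
VOCAB_SIZE, N_TOOLS, DIM, LR = 50, 2, 4, 0.1

W_vocab = rng.normal(size=(VOCAB_SIZE, DIM))  # frozen LM head, never updated
W_tool = rng.normal(size=(N_TOOLS, DIM))      # trainable toolken embeddings

def step(hidden, target_id):
    """One cross-entropy SGD step; gradients flow only into W_tool."""
    global W_tool
    logits = np.concatenate([W_vocab @ hidden, W_tool @ hidden])
    p = np.exp(logits - logits.max())
    p /= p.sum()
    p[target_id] -= 1.0                       # dL/dlogits for cross-entropy
    W_tool -= LR * np.outer(p[VOCAB_SIZE:], hidden)
    return float(-np.log(p[target_id] + 1.0))  # loss before the update

hidden = rng.normal(size=DIM)
frozen_before = W_vocab.copy()
# Supervise the first toolken (id VOCAB_SIZE) on a fixed hidden state.
losses = [step(hidden, VOCAB_SIZE) for _ in range(20)]
```

After these steps the loss on the supervised toolken drops while `W_vocab` is bit-for-bit unchanged, which is the point: the trainable state is a handful of embedding rows rather than billions of LLM parameters.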

Maintenance & Community

The project is associated with the authors of the NeurIPS 2023 paper. No specific community channels (Discord, Slack) or roadmap are mentioned in the README.

Licensing & Compatibility

The README does not explicitly state the license for the ToolkenGPT code. However, it relies on LLaMA checkpoints, which have their own usage terms. Compatibility with commercial or closed-source applications would depend on the LLaMA license and the ToolkenGPT license once clarified.

Limitations & Caveats

The project requires access to LLaMA model checkpoints, which are not provided directly and have specific distribution terms. The setup process involves downloading large datasets and potentially fixing bugs in external repositories (e.g., VirtualHome), indicating a non-trivial setup effort.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 3 stars in the last 90 days
