NexusRaven by nexusflowai

Evaluation framework for NexusRaven-13B, a function-calling LLM

created 1 year ago
316 stars

Top 86.7% on sourcepulse

Project Summary

NexusRaven-13B is an open-source LLM built specifically for function calling, with the stated goal of surpassing existing state-of-the-art models in this domain. It targets developers and researchers who need LLMs to interact with APIs reliably and efficiently. Because it is trained without proprietary LLM data, it can be used commercially.

How It Works

NexusRaven-13B is trained for function calling: it accepts Python function signatures and docstrings and generates the appropriate API call. It is designed to generalize to unseen tools and is compatible with frameworks like LangChain. The model's output often includes a "Reflection" step after the "Initial Call". The authors recommend cutting generation off with the stop criterion ["\nReflection:"] so that only the "Initial Call" is produced, which is more efficient and can be executed directly.
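The two ideas above (describing a tool by its signature and docstring, and discarding everything after the reflection marker) can be sketched in plain Python. This is a minimal illustration, not the project's own code: `render_tool`, `keep_initial_call`, the `get_weather` tool, and the sample `raw_output` string are all hypothetical, and the exact prompt template the model expects is not reproduced here.

```python
import inspect


def render_tool(func) -> str:
    """Render a function's signature and docstring as plain text for a prompt."""
    signature = f"def {func.__name__}{inspect.signature(func)}:"
    doc = inspect.getdoc(func) or ""
    return f'{signature}\n    """{doc}"""'


def keep_initial_call(generated: str, stop: str = "\nReflection:") -> str:
    """Drop everything from the first stop marker onward."""
    return generated.split(stop, 1)[0].strip()


# Hypothetical tool, used only for illustration.
def get_weather(city: str, unit: str = "celsius") -> str:
    """Return the current weather for a city."""
    raise NotImplementedError


# Hypothetical model output; real output may be formatted differently.
raw_output = (
    "Initial Call: get_weather(city='Paris')\n"
    "Reflection: The call matches the question."
)

tool_text = render_tool(get_weather)
initial_call = keep_initial_call(raw_output)
print(tool_text)
print(initial_call)
```

Truncating after generation, as done here, only cleans up the text. Passing the same marker as a stop sequence to the inference backend would also avoid spending tokens on the reflection, which is the efficiency gain the authors describe.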

Quick Start & Requirements

  • Install: pip install transformers accelerate
  • Usage: Requires Hugging Face transformers library. GPU recommended for inference.
  • Demo: Nexusflow HF
  • Documentation: NexusRaven blog post

Highlighted Details

  • Achieves a 95% success rate in using cybersecurity tools (CVE/CPE Search, VirusTotal) with a retrieval system, outperforming GPT-4 (64%).
  • Generalizes to unseen tools in a zero-shot setting, outperforming other open-source LLMs of similar size.
  • Trained without proprietary LLM data, enabling commercial use.
  • Evaluation framework and data processing code are Apache 2.0 licensed.

Maintenance & Community

Licensing & Compatibility

  • Code: Apache 2.0
  • Evaluation Data: CC-BY-NC-4.0 (Non-commercial due to use of GPT-generated data from ToolLLM and ToolAlpaca datasets).

Limitations & Caveats

The model may generate reflections that are not always helpful; using a stop criterion is recommended. It performs best with a retriever when dealing with many functions, because a large number of function definitions can saturate the context window. The model can also generate incorrect calls, so generated calls need guardrails before they are executed.
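One possible guardrail, sketched below under stated assumptions, is to parse the generated call with Python's `ast` module and execute it only if it is a single call to an allowlisted function with literal arguments. The `parse_call` helper, the `search_cve` tool, and the `ALLOWED` table are hypothetical names for illustration, and the sketch assumes the call text has already been extracted from the model output.

```python
import ast


def parse_call(text: str, allowed: dict):
    """Parse 'name(arg, kw=literal)' and return (function, args, kwargs).

    Raises ValueError for anything that is not a single call to an
    allowlisted function with literal-only arguments.
    """
    try:
        tree = ast.parse(text.strip(), mode="eval")
    except SyntaxError as exc:
        raise ValueError(f"not a valid expression: {exc}") from exc
    call = tree.body
    if not isinstance(call, ast.Call) or not isinstance(call.func, ast.Name):
        raise ValueError("expected a plain function call")
    if call.func.id not in allowed:
        raise ValueError(f"function {call.func.id!r} is not allowlisted")
    if any(kw.arg is None for kw in call.keywords):
        raise ValueError("'**' argument unpacking is not permitted")
    try:
        # literal_eval accepts only literals, so nested calls are rejected.
        args = [ast.literal_eval(a) for a in call.args]
        kwargs = {kw.arg: ast.literal_eval(kw.value) for kw in call.keywords}
    except (ValueError, TypeError) as exc:
        raise ValueError("arguments must be literals") from exc
    return allowed[call.func.id], args, kwargs


# Hypothetical tool, used only for illustration.
def search_cve(keyword: str, limit: int = 5) -> str:
    return f"searching {keyword!r} (limit={limit})"


ALLOWED = {"search_cve": search_cve}

func, args, kwargs = parse_call("search_cve(keyword='openssl', limit=3)", ALLOWED)
result = func(*args, **kwargs)
print(result)
```

This checks only the shape of the call. Whether the argument values make sense for the tool (ranges, formats, permissions) still has to be validated separately.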

Health Check

  • Last commit: 1 year ago
  • Responsiveness: 1+ week
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 4 stars in the last 90 days
