NexusRaven-V2 by nexusflowai

Open-source LLM for function calling, outperforming GPT-4 in some cases

created 1 year ago
415 stars

Top 71.7% on sourcepulse

Project Summary

NexusRaven-V2 is a 13B parameter open-source LLM designed for advanced function calling, including single, parallel, and nested calls. It aims to provide a commercially viable alternative to proprietary models, offering superior performance on complex function calling tasks and detailed, optional explanations for its generated calls.

How It Works

NexusRaven-V2 leverages a transformer architecture, fine-tuned on a proprietary dataset that does not include outputs from other large language models. Its core strength lies in its ability to interpret natural language queries and translate them into executable Python function calls, including intricate nested and parallel sequences, by understanding function signatures and docstrings.
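Concretely, the model is prompted with each candidate function's signature and docstring, followed by the user's natural-language query; it then emits the corresponding Python call. A minimal sketch of that prompt assembly (the template and the `<human_end>` marker follow the Hugging Face model card; the function name and query here are illustrative):

```python
def build_prompt(functions: list[str], query: str) -> str:
    """Assemble a NexusRaven-style prompt: each candidate function's
    signature and docstring, then the user's query. The exact template
    is illustrative; consult the model card for the canonical format."""
    blocks = [f"Function:\n{fn}\n" for fn in functions]
    return "\n".join(blocks) + f"\nUser Query: {query}<human_end>"

# Hypothetical tool the model may choose to call.
weather_fn = '''def get_weather(city: str, unit: str = "celsius"):
    """Return the current weather for a city."""
'''

prompt = build_prompt([weather_fn], "What's the weather in Paris in fahrenheit?")
print(prompt)
```

Because the model reads the docstring rather than executing anything, adding a new tool is just a matter of appending another signature-plus-docstring block to the prompt.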

Quick Start & Requirements

  • Install: pip install transformers accelerate
  • Requirements: a GPU is recommended for optimal performance.
  • Usage: see the provided Colab prompting notebook for detailed examples.

Highlighted Details

  • Surpasses GPT-4 by up to 7% in function calling success rates on human-generated use cases involving nested and composite functions.
  • Demonstrates generalization to unseen functions not present in its training data.
  • Offers an OpenAI-compatible RESTful client for seamless integration.
  • Benchmarked on a curated set of 9 real-world API tasks, categorized into single, nested, and parallel calls.
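Because the bundled client speaks the OpenAI chat-completions wire format, existing OpenAI tooling can target a self-hosted NexusRaven-V2 server by swapping the base URL. A sketch of the request body such a client would send (the endpoint, port, and temperature choice are assumptions; the model id matches the Hugging Face repository):

```python
import json

# Hypothetical local endpoint; the actual host/port depend on how the
# OpenAI-compatible server is launched (see the repo for details).
ENDPOINT = "http://localhost:8000/v1/chat/completions"

payload = {
    "model": "Nexusflow/NexusRaven-V2-13B",
    "messages": [
        {"role": "user", "content": "What's the weather in Paris?"}
    ],
    # Greedy decoding keeps generated function calls reproducible.
    "temperature": 0.0,
}

body = json.dumps(payload)  # POST this to ENDPOINT with Content-Type: application/json
print(body)
```

Any HTTP client (or the official `openai` Python package pointed at a custom `base_url`) can then post this body unchanged.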

Maintenance & Community

  • Active community support via Nexusflow Discord.
  • Model weights and evaluation data are available on Hugging Face.

Licensing & Compatibility

  • Code, evaluation notebooks, and data are licensed under Apache 2.0.
  • The model's training data is free from proprietary LLM outputs, ensuring commercial viability.

Limitations & Caveats

  • While benchmarks show strong performance, the README notes that GPT-4's accuracy can have "wild swings" due to non-deterministic outputs, which may affect direct comparisons when rerunning evaluations.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: 1 week
  • Pull Requests (30d): 0
  • Issues (30d): 0

Star History

  • 10 stars in the last 90 days
