octopus-v4  by NexaAI

Graph of language models for connecting specialized AI

created 1 year ago
272 stars

Project Summary

This project aims to construct the world's largest graph of specialized language models, so that the combined system performs competitively with closed-source alternatives. It targets AI researchers and developers who want to integrate diverse LLMs for stronger domain-specific capabilities.

How It Works

The project builds a graph where nodes represent specialized language models and edges represent trained "Octopus" models that facilitate efficient information distribution between nodes. This approach allows the system to identify the most appropriate model for a given task and reformat queries for optimal processing by the selected "worker" model.
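
The routing idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the project's code: WorkerNode, toy_router, and answer are hypothetical names, the worker models are stubbed as plain functions, and simple keyword matching stands in for a trained Octopus model.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass
class WorkerNode:
    """A graph node: one specialized language model (stubbed as a function)."""
    name: str
    generate: Callable[[str], str]


def toy_router(query: str) -> Tuple[str, str]:
    """Hypothetical stand-in for a trained Octopus model.

    Returns (worker_name, reformatted_query). A real router would be a
    trained model; keyword matching is used here purely for illustration.
    """
    keywords = {
        "biology": ("cell", "protein", "gene"),
        "law": ("contract", "statute", "liability"),
    }
    lowered = query.lower()
    for domain, words in keywords.items():
        if any(word in lowered for word in words):
            # Reformat the query by tagging it with the chosen domain.
            return domain, f"[{domain}] {query.strip()}"
    # No domain matched: hand the query to a general-purpose worker.
    return "general", query.strip()


def answer(query: str, workers: Dict[str, WorkerNode], router=toy_router) -> str:
    """Pick a worker with the router, then send it the reformatted query."""
    worker_name, reformatted = router(query)
    return workers[worker_name].generate(reformatted)


# Stub workers that echo what they received (assumption: no real models loaded).
workers = {
    name: WorkerNode(name, lambda q, name=name: f"{name} model received: {q}")
    for name in ("biology", "law", "general")
}

print(answer("How does a protein fold?", workers))
# prints "biology model received: [biology] How does a protein fold?"
```

Keeping the router separate from the workers mirrors the description above: the component that chooses and reformats can be replaced without touching the specialized models.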

Quick Start & Requirements

  • Install: Create a conda environment with conda create -n octopus4 python=3.10, then run pip3 install torch torchvision torchaudio transformers datasets accelerate peft. Alternatively, use Docker: build the image with docker build -t octopus4 . and start it with docker run --gpus all -p 8700:8700 octopus4.
  • Prerequisites: Linux environment, NVIDIA GPU.
  • Resources: Requires PyTorch and Hugging Face libraries.
  • Docs: Hugging Face, Domain LLM Leaderboard, YouTube Setup.

Highlighted Details

  • Aims to unify all open-source LLMs into a single, powerful graph.
  • The initial v4 model scores 74.6% on the MMLU benchmark in the 5-shot setting, ahead of GPT-3.5 at 70.0%.
  • Supports training specialized models using Hugging Face TRL, with plans for LoRA, larger models, and distributed training.
  • Integrates various specialized models (e.g., biology, physics, law) and uses generic models like Llama3-8b where no specialized model exists.
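
The fallback behavior in the last bullet can be pictured as a lookup with a default. The sketch below is an assumption about how such a lookup might be written, not the project's implementation: select_model and SPECIALIZED_MODELS are hypothetical names, and the specialized model identifiers are placeholders rather than real checkpoints. Only the Llama3-8b fallback comes from the summary above.

```python
GENERIC_MODEL = "Llama3-8b"  # generic fallback named in the summary

# Hypothetical registry: the values are placeholders, not real checkpoints.
SPECIALIZED_MODELS = {
    "biology": "example-biology-model",
    "physics": "example-physics-model",
    "law": "example-law-model",
}


def select_model(domain: str) -> str:
    """Return the specialized model for a domain, or the generic fallback."""
    return SPECIALIZED_MODELS.get(domain.strip().lower(), GENERIC_MODEL)


print(select_model("Law"))      # prints "example-law-model"
print(select_model("history"))  # prints "Llama3-8b"
```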

Maintenance & Community

  • Developed by Nexa AI, which has committed to dedicating resources to the project.
  • Encourages community contributions for model suggestions and training.

Licensing & Compatibility

  • No explicit license is mentioned in the README. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

The project is described as being in its early stages, with only the initial Octopus model included. Future support for multimodal AI agents is planned but not yet implemented.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: 1 week
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 10 stars in the last 90 days
