octopus-v4 by NexaAI

Graph of language models for connecting specialized AI

Created 1 year ago

278 stars

Top 93.5% on SourcePulse

3 Experts Love This Project

Zack Li

Cofounder of Nexa AI

Alex Atallah

Cofounder of OpenRouter, OpenSea

Alex Chen

Cofounder of Nexa AI

Project Summary

This project aims to construct the world's largest graph of specialized language models, so that the combined system performs competitively with closed-source alternatives. It targets AI researchers and developers who want to integrate diverse LLMs for stronger domain-specific capabilities.

How It Works

The project builds a graph where nodes represent specialized language models and edges represent trained "Octopus" models that facilitate efficient information distribution between nodes. This approach allows the system to identify the most appropriate model for a given task and reformat queries for optimal processing by the selected "worker" model.
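The routing idea above can be sketched in a few lines. This is only an illustration under stated assumptions, not the project's implementation: the real router is a trained "Octopus" model, whereas the keyword matcher below is a hypothetical stand-in, and the names WorkerNode, NODES, route, and answer are invented for this example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple


@dataclass
class WorkerNode:
    """A graph node: one specialized 'worker' model (stubbed here)."""
    name: str
    keywords: Tuple[str, ...]
    run: Callable[[str], str]


def make_stub(name: str) -> Callable[[str], str]:
    """Return a fake model call that just echoes which worker handled the query."""
    def run(query: str) -> str:
        return f"[{name}] {query}"
    return run


# Hypothetical nodes; the domains are illustrative only.
NODES: Dict[str, WorkerNode] = {
    "physics": WorkerNode("physics", ("force", "energy", "quantum"), make_stub("physics")),
    "law": WorkerNode("law", ("contract", "liability", "statute"), make_stub("law")),
}


def route(query: str) -> Optional[Tuple[str, str]]:
    """Stand-in for the trained router: pick a node, then reformat the query.

    Returns (node_name, reformatted_query), or None if no node matches.
    """
    words = set(query.lower().replace("?", " ").split())
    best, best_hits = None, 0
    for key, node in NODES.items():
        hits = len(words & set(node.keywords))
        if hits > best_hits:
            best, best_hits = key, hits
    if best is None:
        return None
    return best, f"Answer as a {best} expert: {query.strip()}"


def answer(query: str) -> str:
    """Route the query and forward the reformatted text to the chosen worker."""
    routed = route(query)
    if routed is None:
        # Fallback handling is out of scope for this sketch.
        raise LookupError("no specialized node matched the query")
    node_name, reformatted = routed
    return NODES[node_name].run(reformatted)
```

Calling answer("What force acts on a falling apple?") sends the reformatted query to the physics stub. In the actual system, a trained model makes both the selection and the rewriting decisions.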

Quick Start & Requirements

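Install: create a conda environment with conda create -n octopus4 python=3.10, activate it with conda activate octopus4, then run pip3 install torch torchvision torchaudio transformers datasets accelerate peft. Alternatively, use Docker: first docker build -t octopus4 . (the trailing dot is the build context), then docker run --gpus all -p 8700:8700 octopus4.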
Prerequisites: Linux environment, NVIDIA GPU.
Resources: Requires PyTorch and Hugging Face libraries.
Docs: Hugging Face, Domain LLM Leaderboard, YouTube Setup.

Highlighted Details

Aims to unify all open-source LLMs into a single, powerful graph.
The initial v4 model scores 74.6% on the MMLU benchmark in the 5-shot setting, ahead of GPT-3.5 (70.0%).
Supports training specialized models using Hugging Face TRL, with plans for LoRA, larger models, and distributed training.
Integrates various specialized models (e.g., biology, physics, law) and uses generic models like Llama3-8b where no specialized model exists.
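The fallback behavior in the last point, where a generic model covers domains that have no specialist, could look roughly like the lookup below. This is a minimal sketch: the specialist identifiers are placeholders rather than real checkpoints, and resolve_model is a hypothetical helper name; only Llama3-8b comes from the text above.

```python
# Generic model named in the text above.
GENERIC_MODEL = "Llama3-8b"

# Placeholder identifiers for illustration; these are not real checkpoints.
SPECIALISTS = {
    "biology": "example/biology-specialist",
    "physics": "example/physics-specialist",
    "law": "example/law-specialist",
}


def resolve_model(domain: str) -> str:
    """Return the specialist for a domain, or the generic model if none exists."""
    return SPECIALISTS.get(domain.strip().lower(), GENERIC_MODEL)
```

With this mapping, resolve_model("Law") picks the law placeholder, while an uncovered domain such as "cooking" resolves to Llama3-8b.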

Maintenance & Community

Developed by Nexa AI, which states a commitment to dedicating resources to building the model graph.
Encourages community contributions for model suggestions and training.

Licensing & Compatibility

No explicit license is mentioned in the README. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

The project is described as being in its early stages, with only the initial Octopus model included. Future support for multimodal AI agents is planned but not yet implemented.

Health Check

Last Commit

1 year ago

Responsiveness

Inactive

Star History

1 star in the last 30 days