octopus-v4  by NexaAI

Graph of language models for connecting specialized AI

Created 1 year ago
275 stars

Top 94.1% on SourcePulse

Project Summary

This project aims to construct the world's largest graph of specialized language models, so that the combined system performs competitively with closed-source alternatives. It targets AI researchers and developers who want to integrate diverse LLMs for domain-specific tasks.

How It Works

The project builds a graph where nodes represent specialized language models and edges represent trained "Octopus" models that facilitate efficient information distribution between nodes. This approach allows the system to identify the most appropriate model for a given task and reformat queries for optimal processing by the selected "worker" model.
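The routing idea described above can be sketched in a few lines. This is a minimal illustration, not the project's implementation: the `WorkerNode` class, the `route` and `dispatch` functions, the keyword lists, and the prompt templates are all hypothetical, and a simple keyword count stands in for the trained Octopus model that makes the real routing decision.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class WorkerNode:
    """A graph node: one specialized model plus the prompt format it expects."""
    name: str
    keywords: Tuple[str, ...]        # hypothetical stand-in for learned routing
    prompt_template: str             # how queries are reformatted for this worker
    generate: Callable[[str], str]   # stub for the real model call


def make_stub(name: str) -> Callable[[str], str]:
    # Assumption: a real worker would run an LLM here; the stub echoes the prompt.
    return lambda prompt: f"[{name}] {prompt}"


def route(query: str, nodes: List[WorkerNode], fallback: WorkerNode) -> WorkerNode:
    """Pick the node whose keywords match the query most often, else the fallback."""
    words = [w.strip("?.,!").lower() for w in query.split()]
    best, best_score = fallback, 0
    for node in nodes:
        score = sum(1 for w in words if w in node.keywords)
        if score > best_score:
            best, best_score = node, score
    return best


def dispatch(query: str, nodes: List[WorkerNode], fallback: WorkerNode) -> Tuple[str, str]:
    """Route the query, reformat it for the chosen worker, and call the worker."""
    node = route(query, nodes, fallback)
    prompt = node.prompt_template.format(query=query)
    return node.name, node.generate(prompt)


NODES = [
    WorkerNode("biology", ("cell", "ribosome", "protein"),
               "Answer as a biology expert: {query}", make_stub("biology")),
    WorkerNode("law", ("contract", "liability", "statute"),
               "Answer as a legal expert: {query}", make_stub("law")),
]
GENERAL = WorkerNode("general", (), "{query}", make_stub("general"))

if __name__ == "__main__":
    print(dispatch("What does a ribosome do in a cell?", NODES, GENERAL))
```

Keeping the routing decision (`route`) separate from the query reformatting (`dispatch`) mirrors the two jobs the text assigns to the Octopus model: choosing a worker and rewriting the query for it.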

Quick Start & Requirements

  • Install: create a conda environment, then install the dependencies:
      conda create -n octopus4 python=3.10
      pip3 install torch torchvision torchaudio transformers datasets accelerate peft
    Alternatively, use Docker:
      docker build -t octopus4 .
      docker run --gpus all -p 8700:8700 octopus4
  • Prerequisites: Linux environment, NVIDIA GPU.
  • Resources: Requires PyTorch and Hugging Face libraries.
  • Docs: Hugging Face, Domain LLM Leaderboard, YouTube Setup.
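After installing, a quick check that the listed packages are importable can save a failed first run. The sketch below uses only the Python standard library; the `missing_packages` helper is a hypothetical name, not part of the project, and the package list is copied from the install command above.

```python
import importlib.util

# Top-level package names taken from the pip3 install command above.
REQUIRED = [
    "torch", "torchvision", "torchaudio",
    "transformers", "datasets", "accelerate", "peft",
]


def missing_packages(names):
    """Return the names that cannot be found in the current environment."""
    return [n for n in names if importlib.util.find_spec(n) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are installed.")
```

This only confirms that the packages can be located; it does not verify versions or GPU availability.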

Highlighted Details

  • Aims to unify all open-source LLMs into a single, powerful graph.
  • The initial v4 model scores 74.6% on the MMLU benchmark (5-shot), ahead of GPT-3.5 at 70.0%.
  • Supports training specialized models using Hugging Face TRL, with plans for LoRA, larger models, and distributed training.
  • Integrates various specialized models (e.g., biology, physics, law) and uses generic models like Llama3-8b where no specialized model exists.
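The fallback behavior in the last bullet amounts to a lookup with a default. A minimal sketch, assuming a plain dictionary as the registry: the `SPECIALIZED` mapping, the `example/...` model identifiers, and `resolve_model` are hypothetical placeholders, and only the Llama3-8b fallback comes from the text above.

```python
# Generic model named in the project summary, used when no specialist exists.
GENERIC_MODEL = "Llama3-8b"

# Hypothetical registry: these identifiers are placeholders, not real model names.
SPECIALIZED = {
    "biology": "example/biology-llm",
    "physics": "example/physics-llm",
    "law": "example/law-llm",
}


def resolve_model(domain: str) -> str:
    """Return the specialized model for a domain, or the generic model if none is registered."""
    return SPECIALIZED.get(domain.strip().lower(), GENERIC_MODEL)


if __name__ == "__main__":
    for d in ("Biology", "astronomy"):
        print(d, "->", resolve_model(d))
```

Under this design, adding a new specialized model is a single registry entry, and domains without one keep working through the generic model.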

Maintenance & Community

  • Developed by Nexa AI, which states that it will commit resources to the project.
  • Encourages community contributions for model suggestions and training.

Licensing & Compatibility

  • No explicit license is mentioned in the README. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

The project is described as being in its early stages, with only the initial Octopus model included. Future support for multimodal AI agents is planned but not yet implemented.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 2 stars in the last 30 days
