OpenChatKit  by togethercomputer

Open-source toolkit for building specialized/general-purpose chat models

created 2 years ago
9,009 stars

Top 5.8% on sourcepulse

GitHubView on GitHub
Project Summary

OpenChatKit provides an open-source toolkit for building and deploying conversational AI models. It offers pre-trained chat models, fine-tuning scripts, and an extensible retrieval system, targeting developers and researchers aiming to create specialized or general-purpose dialogue agents.

How It Works

The kit leverages instruction-tuned large language models, including a 20B parameter GPT-NeoXT-Chat-Base-20B and a 7B parameter Pythia-Chat-Base-7B. It supports fine-tuning on custom datasets and integrates a retrieval-augmented generation (RAG) capability using a Faiss index for incorporating external knowledge.

Quick Start & Requirements

  • Install: Create a Conda environment using mamba env create -f environment.yml and activate it with conda activate OpenChatKit.
  • Prerequisites: Miniconda, Git LFS, PyTorch.
  • Chatting: Run python inference/bot.py --model togethercomputer/Pythia-Chat-Base-7B.
  • Fine-tuning: Scripts are provided for Llama-2-7B-32K-beta and Pythia-Chat-Base-7B.
  • Docs: GPT-NeoXT-Chat-Base-20B.md

Highlighted Details

  • Supports fine-tuning of Llama-2-7B-32K-beta with a 32K context window.
  • Includes scripts for reproducing Pythia-Chat-Base-7B training.
  • Experimental retrieval augmentation for context-aware responses.
  • Offers options for 8-bit Adam optimization during training.

Maintenance & Community

  • Developed by Together Computer.
  • Models are fine-tuned versions of Eleuther AI models.
  • Collaborations with LAION, Ontocord.ai, and Stanford's CRFM/HazyResearch.

Licensing & Compatibility

  • Code licensed under Apache 2.0.
  • Compatible with commercial use and closed-source linking.

Limitations & Caveats

The retrieval augmentation feature is experimental and requires significant time to load the index. Specific hardware requirements for running the larger 20B parameter model are not detailed.

Health Check
Last commit

1 year ago

Responsiveness

Inactive

Pull Requests (30d)
0
Issues (30d)
0
Star History
25 stars in the last 90 days

Explore Similar Projects

Feedback? Help us improve.