openchat by imoneoi

Open-source LLM fine-tuned with C-RLFT, inspired by offline reinforcement learning

created 2 years ago
5,390 stars

Top 9.5% on sourcepulse

Project Summary

OpenChat provides a suite of open-source large language models fine-tuned using C-RLFT, a strategy inspired by offline reinforcement learning. This approach lets the models learn from mixed-quality data without explicit preference labels; even the 7B-parameter model, which runs on consumer GPUs, achieves performance competitive with ChatGPT. The project targets developers and researchers seeking high-performance, commercially viable LLMs.

How It Works

OpenChat models are fine-tuned using C-RLFT, which leverages mixed-quality data and offline reinforcement learning principles. This method enables the models to learn effectively from data that lacks explicit preference labels, leading to robust performance across various tasks. The library supports different "conditions" (e.g., "GPT4 Correct," "Math Correct") to tailor model behavior for specific use cases.
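For illustration, here is a minimal sketch of condition-prefixed prompting, assuming the openchat/openchat_3.5 checkpoint and the prompt format documented on its Hugging Face model card (special tokens and condition prefixes may differ between releases):

```python
# Minimal sketch of condition-prefixed prompting with an OpenChat checkpoint.
# Assumes the openchat/openchat_3.5 weights and the prompt format from its
# Hugging Face model card; exact tokens may vary between model versions.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "openchat/openchat_3.5"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# The condition ("GPT4 Correct", "Math Correct", ...) is a role prefix that
# selects the behavior learned for that condition during C-RLFT fine-tuning.
prompt = (
    "GPT4 Correct User: Explain offline reinforcement learning in one paragraph."
    "<|end_of_turn|>GPT4 Correct Assistant:"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```

Swapping the prefix to "Math Correct" selects the mathematical-reasoning behavior instead of general chat.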

Quick Start & Requirements

  • Installation: pip3 install ochat or via Conda; see the serving sketch after this list.
  • Prerequisites: PyTorch and CUDA are required. Specific models may have additional requirements.
  • Demo: An online demo is available at https://openchat.team.
  • Documentation: model cards and weights are on Hugging Face at https://huggingface.co/openchat.
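For a quick end-to-end check, the sketch below queries the project's OpenAI-compatible server with the openai Python client. The launch command, port (18888), and model name follow the project README and are assumptions that may differ across releases:

```python
# Sketch: query OpenChat's OpenAI-compatible server with the openai client.
# Assumes the server was started roughly as in the project README, e.g.
#   python -m ochat.serving.openai_api_server --model openchat/openchat_3.5
# and listens on localhost:18888; adjust the model name and port as needed.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:18888/v1",  # local OpenChat server, not api.openai.com
    api_key="not-needed",                  # assumption: the local server does not validate keys
)

response = client.chat.completions.create(
    model="openchat_3.5",
    messages=[{"role": "user", "content": "Summarize C-RLFT in two sentences."}],
)
print(response.choices[0].message.content)
```

Because the server follows the OpenAI ChatCompletion API, existing OpenAI client code can be pointed at it by changing only the base URL.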

Highlighted Details

  • OpenChat 3.6 (8B) outperforms official Llama 3 8B Instruct and other open-source fine-tunes.
  • OpenChat 3.5 (7B) demonstrates performance on par with or exceeding ChatGPT on various benchmarks, including MT-Bench and HumanEval.
  • The project offers an OpenAI-compatible API server optimized with vLLM for production deployment.
  • Training utilizes padding-free techniques and Multipack Sampler for significant speedups.

Maintenance & Community

  • Development has been active, with releases including Llama 3-based versions (e.g., OpenChat 3.6).
  • Community support is available via Discord.
  • Key contributors and sponsors include Tsinghua University, 01.AI, and RunPod.

Licensing & Compatibility

  • Code is distributed under the Apache License 2.0.
  • Models based on Llama 2 are explicitly stated to be free for commercial use.

Limitations & Caveats

OpenChat models inherit limitations from their foundation models, potentially affecting performance in complex reasoning, mathematical tasks, and coding. Like other LLMs, they are susceptible to hallucination and may generate unsafe or biased content, requiring careful implementation of safety measures.

Health Check

  • Last commit: 10 months ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0

Star History

  • 67 stars in the last 90 days
