openchat by imoneoi

Open-source LLM fine-tuned with C-RLFT, inspired by offline reinforcement learning

created 2 years ago
5,390 stars

Top 9.5% on sourcepulse

Project Summary

OpenChat provides a suite of open-source large language models fine-tuned using C-RLFT, a strategy inspired by offline reinforcement learning. This approach lets the models learn from mixed-quality data without explicit preference labels; even the 7B-parameter model, which runs on consumer GPUs, achieves performance competitive with ChatGPT. The project targets developers and researchers seeking high-performance, commercially viable LLMs.

How It Works

OpenChat models are fine-tuned using C-RLFT, which leverages mixed-quality data and offline reinforcement learning principles. This method enables the models to learn effectively from data that lacks explicit preference labels, leading to robust performance across various tasks. The library supports different "conditions" (e.g., "GPT4 Correct," "Math Correct") to tailor model behavior for specific use cases.
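For illustration, here is a minimal sketch of condition-prefixed prompting, assuming the openchat/openchat_3.5 checkpoint and the prompt format documented on its Hugging Face model card (special tokens and condition prefixes may differ between releases):

```python
# Minimal sketch of condition-prefixed prompting with an OpenChat checkpoint.
# Assumes the openchat/openchat_3.5 weights and the prompt format from its
# Hugging Face model card; exact tokens may vary between model versions.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "openchat/openchat_3.5"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# The condition ("GPT4 Correct", "Math Correct", ...) is a role prefix that
# selects the behavior learned for that condition during C-RLFT fine-tuning.
prompt = (
    "GPT4 Correct User: Explain offline reinforcement learning in one paragraph."
    "<|end_of_turn|>GPT4 Correct Assistant:"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```

Swapping the prefix to "Math Correct" selects the mathematical-reasoning behavior instead of general chat.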

Quick Start & Requirements

  • Installation: pip3 install ochat or via Conda; see the serving sketch after this list.
  • Prerequisites: PyTorch and CUDA are required. Specific models may have additional requirements.
  • Demo: An online demo is available at https://openchat.team.
  • Documentation: model cards and weights are on Hugging Face at https://huggingface.co/openchat.
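For a quick end-to-end check, the sketch below queries the project's OpenAI-compatible server with the openai Python client. The launch command, port (18888), and model name follow the project README and are assumptions that may differ across releases:

```python
# Sketch: query OpenChat's OpenAI-compatible server with the openai client.
# Assumes the server was started roughly as in the project README, e.g.
#   python -m ochat.serving.openai_api_server --model openchat/openchat_3.5
# and listens on localhost:18888; adjust the model name and port as needed.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:18888/v1",  # local OpenChat server, not api.openai.com
    api_key="not-needed",                  # assumption: the local server does not validate keys
)

response = client.chat.completions.create(
    model="openchat_3.5",
    messages=[{"role": "user", "content": "Summarize C-RLFT in two sentences."}],
)
print(response.choices[0].message.content)
```

Because the server follows the OpenAI ChatCompletion API, existing OpenAI client code can be pointed at it by changing only the base URL.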

Highlighted Details

  • OpenChat 3.6 (8B) outperforms official Llama 3 8B Instruct and other open-source fine-tunes.
  • OpenChat 3.5 (7B) demonstrates performance on par with or exceeding ChatGPT on various benchmarks, including MT-Bench and HumanEval.
  • The project offers an OpenAI-compatible API server optimized with vLLM for production deployment.
  • Training utilizes padding-free techniques and Multipack Sampler for significant speedups.

Maintenance & Community

  • Development has been active, with releases including Llama 3-based versions (e.g., OpenChat 3.6).
  • Community support is available via Discord.
  • Key contributors and sponsors include Tsinghua University, 01.AI, and RunPod.

Licensing & Compatibility

  • Code is distributed under the Apache License 2.0.
  • Models based on Llama 2 are explicitly stated to be free for commercial use.

Limitations & Caveats

OpenChat models inherit limitations from their foundation models, potentially affecting performance in complex reasoning, mathematical tasks, and coding. Like other LLMs, they are susceptible to hallucination and may generate unsafe or biased content, requiring careful implementation of safety measures.

Health Check

  • Last commit: 10 months ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0

Star History

  • 67 stars in the last 90 days
