pykoi-rlhf-finetuned-transformers by CambioML

Python library for reinforcement learning with human feedback (RLHF)

created 2 years ago
411 stars

Top 72.2% on sourcepulse

Project Summary

pykoi is a Python library designed to streamline the process of improving Large Language Models (LLMs) through Reinforcement Learning from Human Feedback (RLHF) and Retrieval-Augmented Generation (RAG). It offers a unified interface for collecting user feedback, fine-tuning models with RL, reward modeling, and comparing LLM performance, targeting researchers and developers seeking to enhance LLM capabilities with real-time user input.

How It Works

pykoi takes a modular approach to LLM improvement. For RAG, users upload documents to build context-aware chatbots, choose which sources a response draws on, and save edited responses as data for RLHF. For RLHF, it fine-tunes LLMs on the collected datasets, using human evaluative feedback to guide model learning. The library also supports comparing multiple LLMs, either interactively or on a fixed set of prompts.
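
To make the feedback-collection step concrete, here is a minimal sketch of how votes on chatbot answers could be stored and turned into preference pairs for reward modeling. It uses only the Python standard library and does not call pykoi; the table schema and the names `open_feedback_store`, `record_feedback`, and `preference_pairs` are hypothetical, chosen for illustration rather than taken from the library.

```python
import sqlite3


def open_feedback_store(path=":memory:"):
    """Open (or create) a local SQLite store for question/answer votes.

    Hypothetical schema for illustration; not pykoi's actual storage format.
    """
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS feedback ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "question TEXT NOT NULL, "
        "answer TEXT NOT NULL, "
        "vote TEXT NOT NULL CHECK (vote IN ('up', 'down')))"
    )
    return conn


def record_feedback(conn, question, answer, vote):
    """Store one human judgment ('up' or 'down') about an answer."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO feedback (question, answer, vote) VALUES (?, ?, ?)",
            (question, answer, vote),
        )


def preference_pairs(conn):
    """Pair every up-voted answer with every down-voted answer
    to the same question, yielding (prompt, chosen, rejected) records."""
    rows = conn.execute(
        "SELECT u.question, u.answer, d.answer "
        "FROM feedback AS u JOIN feedback AS d ON u.question = d.question "
        "WHERE u.vote = 'up' AND d.vote = 'down' "
        "ORDER BY u.id, d.id"
    ).fetchall()
    return [{"prompt": q, "chosen": c, "rejected": r} for q, c, r in rows]
```

Questions that have only up-votes or only down-votes produce no pairs in this sketch, since a reward model trained on comparisons needs both a preferred and a rejected answer for the same prompt.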

Quick Start & Requirements

  • Installation: pip3 install "pykoi[rag]" or pip3 install "pykoi[rlhf]". For GPU RAG with HuggingFace, use pip3 install "pykoi[rag, huggingface]".
  • Prerequisites: Python 3.10+, Conda recommended. GPU installations require specific PyTorch versions compatible with CUDA (e.g., cu121). RAG can run on CPU with OpenAI or Anthropic Claude2 APIs. RLHF requires a GPU.
  • Resources: GPU instances (e.g., EC2 g5.2xlarge) with at least 100GB storage are recommended for RLHF and GPU RAG.
  • Demos: CPU Demo | GPU Demo (Note: Links appear to be broken or mislabeled in the README).
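
Before installing the GPU extras, it can help to confirm that the environment meets the prerequisites listed above. The following is a small sketch, not part of pykoi: the function name `check_environment` and the shape of its report are assumptions made for illustration. It checks the interpreter version and, if PyTorch happens to be installed, whether a CUDA device is visible.

```python
import importlib.util
import sys


def check_environment(min_python=(3, 10)):
    """Report whether the interpreter and (optionally) PyTorch look usable.

    Hypothetical helper for illustration; pykoi does not ship this function.
    """
    report = {
        # Compare (major, minor) against the minimum version.
        "python_ok": sys.version_info[:2] >= min_python,
        # find_spec returns None when the package is not installed.
        "torch_installed": importlib.util.find_spec("torch") is not None,
        "cuda_available": False,
    }
    if report["torch_installed"]:
        import torch  # imported lazily so the check also works without PyTorch

        report["cuda_available"] = bool(torch.cuda.is_available())
    return report


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(f"{key}: {value}")
```

If `cuda_available` is false on a GPU machine, the installed PyTorch build may not match the local CUDA version.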

Highlighted Details

  • Unified interface for RLHF/RLAIF data collection, fine-tuning, and model comparison.
  • Offers a sharable UI and stores chat history locally for privacy.
  • Supports RAG implementation with custom document uploads for context-aware responses.
  • Enables direct comparison of multiple LLMs via interactive sessions or prompt sets.
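
Comparing models on a prompt set amounts to running each prompt through each model and lining up the outputs. The sketch below shows that idea in plain Python; it does not use pykoi's own comparison interface, and `compare_models` plus the stand-in models are hypothetical. Each model is assumed to be any callable that maps a prompt string to a response string.

```python
from typing import Callable, Dict, List


def compare_models(
    models: Dict[str, Callable[[str], str]],
    prompts: List[str],
) -> List[Dict[str, str]]:
    """Run every prompt through every model and return one row per prompt.

    Hypothetical helper for illustration; not pykoi's API.
    """
    if "prompt" in models:
        # The row uses "prompt" as its own key, so a model cannot share it.
        raise ValueError('model name "prompt" is reserved')
    rows = []
    for prompt in prompts:
        row = {"prompt": prompt}
        for name, generate in models.items():
            row[name] = generate(prompt)
        rows.append(row)
    return rows


if __name__ == "__main__":
    # Stand-in "models": real use would wrap API or local model calls.
    demo_models = {
        "shout": lambda p: p.upper(),
        "reverse": lambda p: p[::-1],
    }
    for row in compare_models(demo_models, ["hello", "pykoi"]):
        print(row)
```

Keeping one row per prompt makes it easy to show the responses side by side or to attach a human ranking to each row later.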

Maintenance & Community

The project is maintained by CambioML. Links to community channels or roadmaps are not explicitly provided in the README.

Licensing & Compatibility

The README does not specify a license. Compatibility for commercial use or closed-source linking is not detailed.

Limitations & Caveats

The demo links in the README appear to be broken or incorrectly labeled. The README describes the library as under active development, with some features marked "building now."

Health Check

  • Last commit: 1 year ago
  • Responsiveness: 1 day
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 2 stars in the last 90 days
