OpenTinker by open-tinker

RL-as-a-Service infrastructure for foundation models

Created 2 months ago

631 stars

Top 52.5% on SourcePulse

7 Experts Love This Project

Elvis Saravia

Founder of DAIR.AI

Maxime Labonne

Head of Post-Training at Liquid AI

Yaowei Zheng

Author of LLaMA-Factory

Wing Lian

Founder of Axolotl AI

and 3 more!

Summary

OpenTinker provides an RL-as-a-Service infrastructure designed to democratize agentic reinforcement learning for foundation models. It offers a streamlined platform for researchers and developers to implement, train, and deploy RL agents, simplifying complex setups and accelerating development.

How It Works

The core innovation is a flexible environment design framework that classifies scenarios along two dimensions: Data Source (Data-Dependent vs. Data-Free) and Interaction Mode (Single-Turn vs. Multi-Turn). This 2x2 paradigm yields four distinct training approaches, covering learning objectives that range from simple QA tasks to complex game-playing agents.
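The 2x2 paradigm can be pictured as the cross product of two small enumerations. The following is a minimal illustrative sketch only: the names DataSource, InteractionMode, EnvironmentKind, and all_environment_kinds are hypothetical and are not taken from the OpenTinker codebase.

```python
from dataclasses import dataclass
from enum import Enum
from itertools import product


class DataSource(Enum):
    """First axis of the 2x2 paradigm (hypothetical names)."""
    DATA_DEPENDENT = "data-dependent"
    DATA_FREE = "data-free"


class InteractionMode(Enum):
    """Second axis of the 2x2 paradigm (hypothetical names)."""
    SINGLE_TURN = "single-turn"
    MULTI_TURN = "multi-turn"


@dataclass(frozen=True)
class EnvironmentKind:
    """One cell of the 2x2 grid: a data source paired with an interaction mode."""
    data_source: DataSource
    interaction_mode: InteractionMode

    def label(self) -> str:
        return f"{self.data_source.value} / {self.interaction_mode.value}"


def all_environment_kinds() -> list[EnvironmentKind]:
    """Enumerate the four combinations in definition order."""
    return [EnvironmentKind(d, m) for d, m in product(DataSource, InteractionMode)]


if __name__ == "__main__":
    for kind in all_environment_kinds():
        print(kind.label())
```

Modeling each axis as its own enumeration keeps the two dimensions independent, so adding a value to either axis extends the grid without touching the other.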

Quick Start & Requirements

Installation involves cloning the repository with its submodules (git clone --recurse-submodules), then installing the core package (pip install -e .) and the verl component (cd verl; pip install -e .). Docker is the recommended way to set up the server and requires GPU access (docker run ... --gpus all). Manual server installation is possible but may lead to version conflicts. Authentication is configured in opentinker/scheduler/config/scheduler.yaml. The README links to examples, the Project Page, DeepWiki, and Slack.

Highlighted Details

The project supports various agentic RL tasks, including LLM and VLM applications for mathematical problem-solving (single and multi-turn, with LoRA options), and multi-turn agents for games like Gomoku and AlfWorld. Performance tracking is integrated via Weights & Biases (wandb).

Maintenance & Community

Community support is available via a Slack channel. The provided README does not detail core contributors, development activity, or a public roadmap.

Licensing & Compatibility

The provided README does not specify a software license. Until one is stated, suitability for commercial use or closed-source integration cannot be assessed.

Limitations & Caveats

The client currently depends on a subset of verl functions. This dependency is transitional: the client is planned to be decoupled from verl in the future to keep it lightweight. Installing server dependencies manually carries a risk of version conflicts, so the Docker approach is preferable for stability.

Health Check

Last Commit

5 days ago

Responsiveness

Inactive

Star History

30 stars in the last 30 days