trainable-agents by choosewhatulike

Trainable agent for role-playing, learning from experiences, characteristics, and emotions

created 1 year ago
568 stars

Top 57.5% on sourcepulse

Project Summary

This repository provides the code and datasets for Character-LLM, a trainable agent designed for role-playing. It enables LLMs to embody specific historical figures or fictional characters with distinct personalities and knowledge, offering a more authentic role-playing experience than prompt-based methods. The target audience includes researchers and developers working on conversational AI, character simulation, and creative LLM applications.

How It Works

Character-LLMs are trained using a novel "Experience Reconstruction" process. This involves generating detailed experience data for target characters, including profiles, scenes, and multi-turn interactions. This data is then used to fine-tune base LLMs (like Llama 1) to imbue them with character-specific traits, knowledge, and emotional nuances, allowing them to respond authentically without requiring external prompts during inference.
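The experience data described above has a natural nested shape: a character profile, a set of reconstructed scenes, and multi-turn dialogue within each scene. A minimal sketch of that structure (field and class names here are illustrative, not the repository's actual schema):

```python
from dataclasses import dataclass, field

@dataclass
class Interaction:
    """One turn in a scene: who speaks and what they say."""
    speaker: str
    utterance: str

@dataclass
class Scene:
    """A reconstructed scene the character might have lived through."""
    location: str
    background: str
    dialogue: list[Interaction] = field(default_factory=list)

@dataclass
class Experience:
    """A full experience record used as fine-tuning data."""
    character: str
    profile: str
    scenes: list[Scene] = field(default_factory=list)

# A single hypothetical reconstructed experience for one character
exp = Experience(
    character="Ludwig van Beethoven",
    profile="German composer and pianist of the Classical/Romantic era.",
    scenes=[
        Scene(
            location="Vienna, 1802",
            background="Beethoven confronts his worsening hearing loss.",
            dialogue=[
                Interaction(speaker="Doctor",
                            utterance="How is your hearing these days?"),
                Interaction(speaker="Beethoven",
                            utterance="It fades, yet the music within me does not."),
            ],
        )
    ],
)
print(exp.character, len(exp.scenes))
```

Records like this, serialized as training conversations, are what the fine-tuning step consumes in place of runtime prompts.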

Quick Start & Requirements

  • Model Application: Requires applying weight differences to base Llama 1 models using python3 -m fastchat.model.apply_delta.
  • Prerequisites: Python, PyTorch, Transformers, FastChat, Hugging Face Hub. Base Llama 1 model weights are necessary.
  • Inference: Uses FastChat for controller, OpenAI-compatible API server, and model workers.
  • Training: Requires 8 A100 GPUs for ~30-45 minutes.
  • Resources: Pre-trained model weights (as deltas) and training datasets are available on Hugging Face.
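The steps above can be sketched as a shell session (the FastChat subcommands are real; all paths and the model name are placeholders you must replace with your own):

```shell
# 1. Recover full weights by applying the released deltas to base Llama 1
python3 -m fastchat.model.apply_delta \
    --base-model-path /path/to/llama-7b \
    --target-model-path /path/to/character-llm-7b \
    --delta-path /path/to/downloaded-delta

# 2. Serve the model with FastChat: controller, model worker, API server
python3 -m fastchat.serve.controller &
python3 -m fastchat.serve.model_worker --model-path /path/to/character-llm-7b &
python3 -m fastchat.serve.openai_api_server --host localhost --port 8000 &
```

Once all three processes are up, the model answers requests through the OpenAI-compatible endpoint on port 8000.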

Highlighted Details

  • Offers pre-trained models for nine distinct characters (e.g., Cleopatra, Beethoven, Voldemort).
  • Includes a detailed data generation pipeline using GPT-3.5 Turbo for creating character experiences.
  • Training leverages FastChat with distributed training capabilities (FSDP).
  • Provides examples for both single-turn and multi-turn interviews for model evaluation.
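For the single-turn interview evaluation mentioned above, a request to the FastChat OpenAI-compatible server is an ordinary chat-completion payload. A minimal sketch (the model name is hypothetical; adjust it to whatever name your model worker registers):

```python
import json

def build_interview_request(model: str, question: str,
                            max_tokens: int = 256) -> dict:
    """Build a chat-completion payload for an OpenAI-compatible
    server such as FastChat's (parameter choices are illustrative)."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": question}],
        "max_tokens": max_tokens,
        "temperature": 0.7,
    }

payload = build_interview_request(
    model="character-llm-beethoven-7b",  # hypothetical worker name
    question="How do you feel about your Ninth Symphony?",
)
print(json.dumps(payload, indent=2))
# To send: POST this JSON to http://localhost:8000/v1/chat/completions
# (requires the controller, model worker, and API server to be running).
```

Because the character knowledge is baked in by fine-tuning, the user message is just the interview question; no persona prompt is prepended.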

Maintenance & Community

The project is associated with the EMNLP 2023 paper "Character-LLM: A Trainable Agent for Role-Playing." No specific community channels (like Discord/Slack) or active maintenance signals are mentioned in the README.

Licensing & Compatibility

  • License: Model weights are based on Llama 1, inheriting its license restrictions. The README explicitly states resources are restricted for academic research purposes only and cannot be used for commercial purposes.
  • Compatibility: Not compatible with commercial use due to licensing.

Limitations & Caveats

The project's resources are strictly for academic research and prohibit commercial use. Output quality and accuracy are subject to uncontrollable variables like randomness, and the authors disclaim responsibility for any consequences arising from resource usage.

Health Check

  • Last commit: 9 months ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 32 stars in the last 90 days

Explore Similar Projects

Starred by Andrej Karpathy (Founder of Eureka Labs; formerly at Tesla, OpenAI; author of CS 231n), George Hotz (author of tinygrad; founder of the tiny corp, comma.ai), and 10 more.

TinyLlama by jzhang38 (0.3%, 9k stars) — Tiny pretraining project for a 1.1B Llama model. Created 1 year ago; updated 1 year ago.