trainable-agents by choosewhatulike

Trainable agent for role-playing, learning from experiences, characteristics, and emotions

created 1 year ago
568 stars

Top 57.5% on sourcepulse

Project Summary

This repository provides the code and datasets for Character-LLM, a trainable agent designed for role-playing. It enables LLMs to embody specific historical figures or fictional characters with distinct personalities and knowledge, offering a more authentic role-playing experience than prompt-based methods. The target audience includes researchers and developers working on conversational AI, character simulation, and creative LLM applications.

How It Works

Character-LLMs are trained using a novel "Experience Reconstruction" process. This involves generating detailed experience data for target characters, including profiles, scenes, and multi-turn interactions. This data is then used to fine-tune base LLMs (like Llama 1) to imbue them with character-specific traits, knowledge, and emotional nuances, allowing them to respond authentically without requiring external prompts during inference.
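The experience data described above has a natural nested shape: a character profile, a set of reconstructed scenes, and multi-turn dialogue within each scene. A minimal sketch of that structure (field and class names here are illustrative, not the repository's actual schema):

```python
from dataclasses import dataclass, field

@dataclass
class Interaction:
    """One turn in a scene: who speaks and what they say."""
    speaker: str
    utterance: str

@dataclass
class Scene:
    """A reconstructed scene the character might have lived through."""
    location: str
    background: str
    dialogue: list[Interaction] = field(default_factory=list)

@dataclass
class Experience:
    """A full experience record used as fine-tuning data."""
    character: str
    profile: str
    scenes: list[Scene] = field(default_factory=list)

# A single hypothetical reconstructed experience for one character
exp = Experience(
    character="Ludwig van Beethoven",
    profile="German composer and pianist of the Classical/Romantic era.",
    scenes=[
        Scene(
            location="Vienna, 1802",
            background="Beethoven confronts his worsening hearing loss.",
            dialogue=[
                Interaction(speaker="Doctor",
                            utterance="How is your hearing these days?"),
                Interaction(speaker="Beethoven",
                            utterance="It fades, yet the music within me does not."),
            ],
        )
    ],
)
print(exp.character, len(exp.scenes))
```

Records like this, serialized as training conversations, are what the fine-tuning step consumes in place of runtime prompts.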

Quick Start & Requirements

  • Model Application: Requires applying weight differences to base Llama 1 models using python3 -m fastchat.model.apply_delta.
  • Prerequisites: Python, PyTorch, Transformers, FastChat, Hugging Face Hub. Base Llama 1 model weights are necessary.
  • Inference: Uses FastChat for controller, OpenAI-compatible API server, and model workers.
  • Training: Requires 8 A100 GPUs for ~30-45 minutes.
  • Resources: Pre-trained model weights (as deltas) and training datasets are available on Hugging Face.
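The steps above can be sketched as a shell session (the FastChat subcommands are real; all paths and the model name are placeholders you must replace with your own):

```shell
# 1. Recover full weights by applying the released deltas to base Llama 1
python3 -m fastchat.model.apply_delta \
    --base-model-path /path/to/llama-7b \
    --target-model-path /path/to/character-llm-7b \
    --delta-path /path/to/downloaded-delta

# 2. Serve the model with FastChat: controller, model worker, API server
python3 -m fastchat.serve.controller &
python3 -m fastchat.serve.model_worker --model-path /path/to/character-llm-7b &
python3 -m fastchat.serve.openai_api_server --host localhost --port 8000 &
```

Once all three processes are up, the model answers requests through the OpenAI-compatible endpoint on port 8000.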

Highlighted Details

  • Offers pre-trained models for nine distinct characters (e.g., Cleopatra, Beethoven, Voldemort).
  • Includes a detailed data generation pipeline using GPT-3.5 Turbo for creating character experiences.
  • Training leverages FastChat with distributed training capabilities (FSDP).
  • Provides examples for both single-turn and multi-turn interviews for model evaluation.
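For the single-turn interview evaluation mentioned above, a request to the FastChat OpenAI-compatible server is an ordinary chat-completion payload. A minimal sketch (the model name is hypothetical; adjust it to whatever name your model worker registers):

```python
import json

def build_interview_request(model: str, question: str,
                            max_tokens: int = 256) -> dict:
    """Build a chat-completion payload for an OpenAI-compatible
    server such as FastChat's (parameter choices are illustrative)."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": question}],
        "max_tokens": max_tokens,
        "temperature": 0.7,
    }

payload = build_interview_request(
    model="character-llm-beethoven-7b",  # hypothetical worker name
    question="How do you feel about your Ninth Symphony?",
)
print(json.dumps(payload, indent=2))
# To send: POST this JSON to http://localhost:8000/v1/chat/completions
# (requires the controller, model worker, and API server to be running).
```

Because the character knowledge is baked in by fine-tuning, the user message is just the interview question; no persona prompt is prepended.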

Maintenance & Community

The project is associated with the EMNLP 2023 paper "Character-LLM: A Trainable Agent for Role-Playing." No specific community channels (like Discord/Slack) or active maintenance signals are mentioned in the README.

Licensing & Compatibility

  • License: Model weights are based on Llama 1, inheriting its license restrictions. The README explicitly states resources are restricted for academic research purposes only and cannot be used for commercial purposes.
  • Compatibility: Not compatible with commercial use due to licensing.

Limitations & Caveats

The project's resources are strictly for academic research and prohibit commercial use. Output quality and accuracy are subject to uncontrollable variables like randomness, and the authors disclaim responsibility for any consequences arising from resource usage.

Health Check

  • Last commit: 9 months ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 32 stars in the last 90 days

Explore Similar Projects

Starred by Andrej Karpathy (Founder of Eureka Labs; formerly at Tesla, OpenAI; author of CS 231n), George Hotz (author of tinygrad; founder of the tiny corp, comma.ai), and 10 more.

TinyLlama by jzhang38 (0.3%, 9k stars) — Tiny pretraining project for a 1.1B Llama model. Created 1 year ago; updated 1 year ago.