CharacterGLM-6B by thu-coai

Character customization research paper using LLMs for Chinese conversational AI

Created 2 years ago

488 stars

Top 63.2% on SourcePulse

Project Summary

CharacterGLM-6B is an open-source conversational AI model designed to create lifelike Chinese AI characters by focusing on attributes and behaviors. It targets researchers and developers aiming to build engaging, consistent, and anthropomorphic conversational agents, offering a foundation for academic exploration in character-driven dialogue.

How It Works

CharacterGLM-6B is fine-tuned from the ChatGLM2 series, incorporating character attributes (identity, interests, opinions, etc.) and behaviors (language style, emotional expression) into its training. This approach aims to imbue AI characters with human-like traits, enhancing consistency, anthropomorphism, and engagement in conversations. The model was trained on a large, crowdsourced dataset of character descriptions and dialogues, with further refinement using online interaction data for iterative improvement.
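The split between attributes and behaviors described above can be pictured as a small data structure. The sketch below is illustrative only: `CharacterProfile`, its field names, the `describe` method, and the sample character are hypothetical and are not taken from the project's code or its training data format. It shows one way a character description could be assembled from the two trait groups before being handed to a model.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CharacterProfile:
    """Hypothetical container for the two trait groups named above."""

    name: str
    # Attributes: relatively static facts about the character.
    identity: str = ""
    interests: List[str] = field(default_factory=list)
    opinions: List[str] = field(default_factory=list)
    # Behaviors: how the character speaks and expresses emotion.
    language_style: str = ""
    emotional_expression: str = ""

    def describe(self) -> str:
        """Flatten the profile into a plain-text description, skipping empty fields."""
        parts = [f"Name: {self.name}"]
        if self.identity:
            parts.append(f"Identity: {self.identity}")
        if self.interests:
            parts.append("Interests: " + ", ".join(self.interests))
        if self.opinions:
            parts.append("Opinions: " + "; ".join(self.opinions))
        if self.language_style:
            parts.append(f"Language style: {self.language_style}")
        if self.emotional_expression:
            parts.append(f"Emotional expression: {self.emotional_expression}")
        return "\n".join(parts)


# Made-up example character, for illustration only.
profile = CharacterProfile(
    name="Lin",
    identity="a 28-year-old bookstore owner",
    interests=["poetry", "tea"],
    language_style="warm and slightly teasing",
)
print(profile.describe())
```

Keeping attributes and behaviors as separate fields makes it easy to vary one group (for example, the language style) while holding the character's identity fixed, which is a simple way to probe consistency.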

Quick Start & Requirements

Install dependencies: pip install -r requirements.txt "transformers>=4.36.2" "torch>=2.1.0"
Clone the Hugging Face model repository: git lfs install && git clone https://huggingface.co/thu-coai/CharacterGLM-6B
Run a web demo: cd basic_demo && streamlit run web_demo_streamlit.py
Run a CLI demo (from the repository root): python basic_demo/cli_demo.py
Official documentation: CharacterGLM-6B Technical Documentation
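Before launching either demo, it can help to confirm that the installed packages meet the minimum versions in the install line above. The following is a minimal sketch, not part of the project: `parse_version`, `meets_minimum`, and `check_environment` are hypothetical helper names, and the comparison looks only at the leading numeric components, so a pre-release such as `4.36.2rc1` is treated the same as `4.36.2`.

```python
import re
from importlib import metadata

# Minimum versions taken from the install line above.
MINIMUMS = {"transformers": "4.36.2", "torch": "2.1.0"}


def parse_version(text: str) -> tuple:
    """Return the leading numeric components, e.g. '2.1.0+cu118' -> (2, 1, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", text)
    if match is None:
        return ()
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(installed: str, required: str) -> bool:
    """Compare numeric version tuples; suffixes such as '+cu118' are ignored."""
    return parse_version(installed) >= parse_version(required)


def check_environment(minimums=MINIMUMS) -> dict:
    """Map each package name to 'ok', 'too old (...)', or 'missing'."""
    report = {}
    for package, required in minimums.items():
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            report[package] = "missing"
            continue
        if meets_minimum(installed, required):
            report[package] = "ok"
        else:
            report[package] = f"too old ({installed} < {required})"
    return report


if __name__ == "__main__":
    for package, status in check_environment().items():
        print(f"{package}: {status}")
```

The check reads installed package metadata rather than importing `torch` or `transformers`, so it runs quickly and works even when one of them is missing.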

Highlighted Details

Fine-tuned from ChatGLM2 series, retaining its strengths in conversational fluency and low deployment threshold.
Evaluated against 10 mainstream Chinese LLMs and commercial models like GPT-3.5/GPT-4, showing competitive performance in consistency, anthropomorphism, and engagement.
Supports model fine-tuning using the LlamaFactory framework.
Offers both a Streamlit-based web demo and a command-line interface for interaction.

Maintenance & Community

Developed jointly by Tsinghua University's CoAI Lab and Lingxin Intelligence.
Community engagement encouraged via WeChat (link provided in README).
Model fine-tuning guidance available through LlamaFactory.

Licensing & Compatibility

The open-source model and code are for academic research use only.
Commercial use, dissemination, and any purpose that could harm society are explicitly prohibited.
Users must comply with the open-source agreement and avoid misuse.

Limitations & Caveats

The open-source model is small, and its output is generated by probabilistic sampling, so accuracy is not guaranteed. User input can also easily mislead the model. The project disclaims responsibility for data security, public opinion risks, or any issues arising from the model being misled or misused.

Health Check

Last Commit

3 months ago

Responsiveness

Inactive

Star History

2 stars in the last 30 days