whyzhow
AI personalization framework for emulating user style
Top 77.5% on SourcePulse
PersonalStyleAI-Framework- provides a complete pipeline for making an AI model emulate a user's communication style. It targets individuals who want personalized AI interactions and covers three steps: cleaning raw chat data, adapting models such as GPT-4, Claude, and Llama 3, and consolidating stylistic habits through lightweight fine-tuning.
How It Works
The framework operates in three core stages. "Data Alchemy" uses regular expressions to clean messy chat records (emojis, links, spam) into JSONL conversation pairs. The "Adaptor Centre" uses a factory pattern to expose a unified interface, so that switching between backends (e.g., OpenAI, Ollama, Claude, Llama 3) does not require changes to business logic. "Style Evolution" applies local lightweight fine-tuning (LoRA via PEFT) to embed language habits directly into model weights, which makes the personalization more persistent than prompt engineering alone.
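The first two stages can be illustrated with a minimal sketch. It is not the project's code: the regular expressions, the prompt/response field names, the assumption that messages alternate between two speakers, and the names clean_line, to_jsonl_pairs, ChatAdaptor, and create_adaptor are all hypothetical. The two stand-in adaptors make no API calls, so the example runs without keys.

```python
import json
import re
from abc import ABC, abstractmethod

# Hypothetical patterns; the project's actual regular expressions are not shown.
URL_RE = re.compile(r"https?://\S+")
EMOJI_RE = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF]")


def clean_line(text: str) -> str:
    """Strip links and emojis, then collapse leftover whitespace."""
    text = URL_RE.sub("", text)
    text = EMOJI_RE.sub("", text)
    return " ".join(text.split())


def to_jsonl_pairs(lines: list[str]) -> str:
    """Pair consecutive cleaned messages into JSONL records.

    Assumes messages alternate between two speakers; lines that are
    empty after cleaning are dropped before pairing.
    """
    cleaned = [c for c in (clean_line(line) for line in lines) if c]
    pairs = zip(cleaned[0::2], cleaned[1::2])
    return "\n".join(
        json.dumps({"prompt": p, "response": r}, ensure_ascii=False)
        for p, r in pairs
    )


class ChatAdaptor(ABC):
    """Unified interface that business logic depends on (hypothetical)."""

    @abstractmethod
    def reply(self, prompt: str) -> str: ...


class EchoAdaptor(ChatAdaptor):
    """Stand-in backend; a real adaptor would call a model API here."""

    def reply(self, prompt: str) -> str:
        return f"echo: {prompt}"


class UpperAdaptor(ChatAdaptor):
    """Second stand-in backend, to show that callers do not change."""

    def reply(self, prompt: str) -> str:
        return prompt.upper()


_REGISTRY: dict[str, type[ChatAdaptor]] = {
    "echo": EchoAdaptor,
    "upper": UpperAdaptor,
}


def create_adaptor(name: str) -> ChatAdaptor:
    """Factory: look up a backend by name and instantiate it."""
    try:
        return _REGISTRY[name]()
    except KeyError:
        raise ValueError(f"unknown adaptor: {name!r}") from None
```

Calling create_adaptor("echo").reply("hi") and create_adaptor("upper").reply("hi") goes through the same interface, which is the point of the factory: the caller selects a backend by name and the rest of the code stays unchanged.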
Quick Start & Requirements
Basic installation involves cloning the repository, creating a Python virtual environment, and running pip install -e . from the repository root. API keys must be configured in a .env file created by copying .env.example. Data preprocessing is run with python preprocess_data.py, which reads chat records from data/raw/chat.txt; dialogue testing follows with python main.py. Advanced local fine-tuning requires a CUDA-enabled GPU: install the extra components with pip install -e ".[train]", then run python run_train.py. Key dependencies are Python and Git; CUDA is needed only for training.
Maintenance & Community
The project welcomes contributions via Pull Requests and Issues for feature requests and adding new AI adaptors. Specific community channels or contributor details are not provided in the README.
Licensing & Compatibility
The repository's license is not specified in the provided README, making its terms for commercial use or distribution unclear.
Limitations & Caveats
Local fine-tuning necessitates a CUDA-compatible GPU. The absence of a specified license poses a significant adoption blocker for commercial or closed-source integration. The data cleaning is optimized for chat records, potentially requiring adjustments for other text formats.
1 month ago
Inactive