Streamer-Sales is an LLM-based system designed to act as a virtual sales streamer, generating compelling product descriptions and engaging with potential customers. It aims to revolutionize the e-commerce experience by creating dynamic and persuasive sales pitches, targeting online sellers and marketing professionals.
How It Works
The core of Streamer-Sales is an InternLM2 model fine-tuned using xtuner. It leverages Retrieval-Augmented Generation (RAG) to incorporate up-to-date product information, ensuring accurate and relevant sales copy. For enhanced realism, it integrates Text-to-Speech (TTS) for emotive voice generation and a digital human module for video output. Agent capabilities allow for real-time information retrieval, such as checking delivery status, further enriching the interactive sales experience.
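The RAG step described above can be sketched as a minimal retrieve-then-prompt loop. The product records, scoring function, and prompt wording below are illustrative placeholders, not Streamer-Sales' actual schema or retriever:

```python
# Minimal RAG sketch: retrieve product facts, then ground the LLM prompt
# in them so the generated pitch stays accurate. All data here is made up.

PRODUCT_DB = [
    {"name": "wireless earbuds", "facts": "Bluetooth 5.3, 30h battery, IPX5"},
    {"name": "smart kettle", "facts": "1.7L, app control, keep-warm mode"},
]

def retrieve(query: str, db=PRODUCT_DB) -> dict:
    """Return the product whose name shares the most words with the query."""
    words = set(query.lower().split())
    return max(db, key=lambda item: len(words & set(item["name"].split())))

def build_prompt(query: str) -> str:
    """Assemble the final prompt with the retrieved facts inlined."""
    product = retrieve(query)
    return (
        "You are a live-stream sales host. Using only these facts:\n"
        f"{product['facts']}\n"
        f"Write a short, persuasive pitch answering: {query}"
    )

print(build_prompt("tell me about the wireless earbuds"))
```

A production system would replace the keyword overlap with embedding similarity over a vector store, but the prompt-grounding pattern is the same.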
Quick Start & Requirements
- Installation: Docker Compose is recommended for deployment. Alternatively, deploy directly on the host by cloning the repository, creating a Conda environment (environment.yml), and installing the Python dependencies (requirements.txt).
- Prerequisites: Python 3.10+, CUDA 12.2, NVIDIA GPU (RTX 3090/4090 or A100 recommended), 64GB+ RAM. VRAM requirements vary by service (e.g., the 7B LLM needs ~16GB; its 4-bit quantized version ~6.5GB). API keys for external services (delivery, weather) may be needed.
- Resources: Fine-tuning requires 24GB-80GB VRAM depending on batch size. Deployment VRAM needs are detailed for each service.
- Links: Demo, Architecture Diagram, Video Explanation
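The two installation routes above can be sketched as shell commands. The repository URL, environment name, and exact file names are assumptions; check the project's README for the authoritative steps:

```shell
# Host deployment (assumed commands -- names and paths are illustrative)
git clone https://github.com/PeterH0323/Streamer-Sales.git
cd Streamer-Sales
conda env create -f environment.yml   # create the project's Conda env
conda activate streamer-sales         # env name is an assumption
pip install -r requirements.txt

# Or the recommended route: bring all services up with Docker Compose
docker compose up -d
```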
Highlighted Details
- End-to-end solution: Includes LLM, RAG, TTS, Digital Human generation, Agent, ASR, and a Vue/FastAPI frontend.
- Performance: LMDeploy's TurboMind engine delivers 3x+ faster inference; 4-bit quantization pushes this to roughly 5x over the original inference path.
- Data Generation: Detailed pipeline for generating fine-tuning datasets using LLMs, with open-sourced scripts and example data.
- Deployment: Supports Docker Compose for easy deployment and a decoupled frontend/backend architecture for scalability.
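The VRAM figures quoted in this document are consistent with back-of-the-envelope weight-size arithmetic. The sketch below is an illustration, not a measurement: it counts weight storage only, ignoring KV cache and activations, which is why real serving needs more than the raw weight size:

```python
# Rough weight-size arithmetic for a 7B-parameter LLM at two precisions.
# Real VRAM usage is higher (the README quotes ~16 GB for FP16 serving
# and ~6.5 GB for the 4-bit variant, which includes runtime overhead).

PARAMS = 7e9  # 7 billion parameters (assumed round figure)

def weight_gib(bits_per_param: float) -> float:
    """Size of the model weights alone, in GiB, at the given precision."""
    return PARAMS * bits_per_param / 8 / 2**30

fp16 = weight_gib(16)  # FP16: 2 bytes per parameter
w4 = weight_gib(4)     # 4-bit: half a byte per parameter
print(f"FP16 weights: {fp16:.1f} GiB, 4-bit weights: {w4:.1f} GiB")
```

The 4x reduction in weight storage is what makes the ~6.5 GB 4-bit deployment fit on consumer GPUs, and smaller weights also mean less memory traffic, which contributes to the inference speedup.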
Maintenance & Community
- The project is actively developed by PeterH0323.
- Recent updates include API refactoring, PostgreSQL integration, and a complete frontend rewrite.
- The project won 1st place in the 2024 Puyu Large Model Challenge (Summer Competition) - Innovative Creativity Track.
Licensing & Compatibility
- Code licensed under AGPL-3.0.
- The "Lelemiao" model uses Apache License 2.0.
- Users must comply with the licenses of all used models and datasets (e.g., InternLM2, GPT-SoVITS). Commercial use requires careful review of all component licenses.
Limitations & Caveats
- The online demo has Agent and ASR features disabled due to API costs and VRAM limitations.
- The project is still at an early stage; features and APIs remain under active development.
- Fine-tuning requires significant VRAM and technical expertise.