Emotional generative dialog system backend
Top 24.8% on SourcePulse
CakeChat is a generative dialog system backend for building chatbots that can express emotions in conversation. It targets developers and researchers who want more engaging, nuanced conversational agents. Its main benefit is that responses can be conditioned on a specific emotional state or persona, which makes a chatbot more expressive.
How It Works
CakeChat uses a Hierarchical Recurrent Encoder-Decoder (HRED) architecture to handle deep dialog context. It is built from multilayer RNNs with GRU cells and includes a bidirectional encoder. A "thought vector" summarizing the context is fed into the decoder at every decoding step, and the decoder can be conditioned on an arbitrary categorical label, such as an emotion or a persona ID. Four response generation algorithms are supported: sampling, beamsearch, sampling-reranking, and beamsearch-reranking. The reranking variants order the generated candidates by either log-likelihood or a maximum mutual information (MMI) criterion.
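The sampling-reranking idea described above can be illustrated with a minimal sketch. This is not CakeChat's code: the function names, the `reverse_weight` parameter, and the toy scores are hypothetical, and the MMI score shown is one common formulation (a weighted sum of the forward log-probability of the response given the context and the reverse log-probability of the context given the response), assumed here for illustration only.

```python
import math
import random


def sample_token(probs, temperature=1.0, rng=random):
    """Sample a token index from `probs` after temperature scaling.

    `temperature` must be > 0; values below 1 sharpen the distribution.
    """
    # Work in log space; zero-probability tokens get -inf and are never drawn.
    logits = [math.log(p) / temperature if p > 0 else float("-inf") for p in probs]
    peak = max(logits)
    weights = [math.exp(logit - peak) for logit in logits]
    return rng.choices(range(len(probs)), weights=weights, k=1)[0]


def mmi_score(forward_logp, reverse_logp, reverse_weight=0.5):
    """Hypothetical MMI-style score (assumed formulation, not CakeChat's).

    forward_logp: log p(response | context)
    reverse_logp: log p(context | response)
    reverse_weight = 0 reduces to plain log-likelihood reranking.
    """
    return (1.0 - reverse_weight) * forward_logp + reverse_weight * reverse_logp


def rerank(candidates, reverse_weight=0.5):
    """Sort (text, forward_logp, reverse_logp) tuples, best score first."""
    return sorted(
        candidates,
        key=lambda c: mmi_score(c[1], c[2], reverse_weight),
        reverse=True,
    )


if __name__ == "__main__":
    # Toy candidates with made-up scores: a generic reply is likely under
    # the forward model but says little about the context.
    candidates = [
        ("i don't know", -2.0, -9.0),
        ("cake sounds great!", -3.0, -2.0),
    ]
    print(rerank(candidates, reverse_weight=0.0)[0][0])
    print(rerank(candidates, reverse_weight=0.5)[0][0])
```

With `reverse_weight=0.0` the ranking is pure log-likelihood, so the generic reply wins. Giving the reverse model some weight penalizes replies that fit any context, which is the usual motivation for MMI reranking.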
Quick Start & Requirements
The recommended installation method is Docker, with separate CPU and GPU images available. Manual installation requires Python 3.5.2, TensorFlow 1.12.2, and Keras 2.2.4. Pre-trained models can be fetched by running python tools/fetch.py. The pre-trained model was trained on a large, preprocessed Twitter corpus (approximately 50 million dialogs).
Highlighted Details
Maintenance & Community
The project explicitly states it is unmaintained, recommending Transformer-based dialog models (e.g., Microsoft's DialoGPT) as superior alternatives. It was developed by the Replika team. Issues and feature requests can be tracked on GitHub Issues.
Licensing & Compatibility
CakeChat is licensed under the Apache License, Version 2.0. This license generally permits commercial use and modification.
Limitations & Caveats
The project is unmaintained, and its RNN-based architecture is considered less effective than modern Transformer models. It depends on old versions of TensorFlow (1.x) and Keras. The original Twitter training corpus is not publicly available because of privacy policies, so users must supply their own data for training or fine-tuning.